Category Archives: HAMMER

HAMMER and Threads Redux – The adventure continues…

Sometimes, like Scott of the Antarctic, your data shaping activities may need to “go outside” of Excel and it ” may be some time” before you return. That’s primarily why the HAMMER.exe command-line version exists; data can be packed up via the DELEGATE or SUBMIT commands; picked up by external HAMMER processes (I often use JSDB’s Grid micro-messaging JavaScript toolkit to orchestrate such workflows, but perhaps now I’ll use this – Amazon’s latest gem - AWS SWF !); then returned safely (unlike Scott) to Excel for further processing.

But, when “going outside” is not desirable or feasible, another solution is required, namely threads. There are those who regard threads as evil and I too would have to admit I generally would not use them without good reason. But sometimes they’re needed, usually to keep a controlling UI (in this case Excel) responsive and capable of monitoring and managing background long-running tasks.

I’ve already added threading capability to HAMMER via the hammerThreadEnabled function and via “internal HAMMER threads” (using NET4.0′s latest parallel programming additions).

Both of these “lock” the Excel main thread until complete, with hammerThreadEnabled being also dependent on Excel’s dependency graph to determine if threading will happen or not. So I figured I needed to bite the bullet and add a new set of threading commands to better handle long-running tasks, be they tasks that eventually return data or a status to the main Excel thread (such as a long running save to ( or fetch from) a high latency web service end point), or those that run continuously (such as a Excel as a web service). The main new commands are:

  • PYTHONTHREAD – like PYTHON but spawns a thread to run to Python script. (NET4.0 runtime required)
  • JAVASCRIPTTHREAD – as above, but using JAVASCRIPT plus no need for NET 4.0.

Both commands “take ownership” of the HAMMER session’s SQLite database  and return the new thread’s ID (in the form PYTHREAD1,PYTHREAD2, JSTHHREAD3 etc), and are likely to be the last command in a HAMMER function call’s sequence.

The thread ID can then be used by the following helper commands to monitor, re-join and clean-up such threads:

  • SQLFROMTHREAD – like SQL but takes two arguments, 1st is the thread id, and the 2nd is the SQL to apply against that thread’s SQLite database. e.g. = … “JSTHREAD1″,”Select * from messages”,”SQLFROMTHREAD” … In general, only Select statements would be issued against this database, but SQLite is thread-safe so data updates are possible but with obvious potential for SQLITE_BUSY locks and time outs. (Likewise, the thread itself can communicate back to main thread using an APPDB connection, as APPDB is always attached uniquely to the main Excel thread).
  • THREADRETURN – will report on whatever return values (be that an error, or on success, a string or a table value) once the thread completes. Again, it uses the thread ID as its argument. If the thread is still running, the command returns “Waiting”.
  • DELETETHREAD – will attempt to terminate an already running thread. It will then, or if already terminated, clean-up and close any resources associated with the thread, mainly its database, but also its Python and/or JavaScript engine instances.

It’s also possible to issue a HAMMER call that will re-join a completed thread’s database, this will happen if a HAMMER function call’s 1st argument is a valid thread-name e.g. =HAMMER(“JSTHREAD2″,”Select * from results”,”SQL”). If the thread is still running, the HAMMER function will exit with a return value of “Waiting”. With this, it’s possible to not only access the thread’s database but also its Python and/or JavaScript engine context. Multiple such HAMMER calls can be made until a DELETETHREAD command is issued or the Excel session is terminated.

As SQLite is thread-safe, SQL can be used as a simple and robust method of inter-thread communication. If the pattern of threads only updating their “own” database while the main Excel thread updates only “APPDB” is followed, very little locking or timeout issues should arise (ha, famous last words – okay, not as famous as Scott’s). There’s also nothing to stop more adventurous inter-thread single database usage if the need or impulse arises (“Do ya feel lucky Punk, well do ya!”). And don’t forget, normal disk-based databases can also be used alongside, or instead of, the more usual in-memory ones.

This use of SQLite falls neatly into my view of SQLite not so much as a database but as means of mapping and managing  shared memory using SQL as an  API.

It’s often said that SQLite should be viewed not as an alternative to Oracle/MySQL/MSAccess/whateverRDBMS but as an alternative to a file-open statement. But for me, I primarily use SQLite as an alternative to rolling-my-own in-memory shared data structures  (be that across threads or application stacks).

Two other new commands added with this version are:

  • VERSION – will return the version of the internal HAMMER (XLite) engine is use.
  • LISTCOMMAND – returns the list of commands available.

There’s also been a small breaking change to the JAVASCRIPT command. Scripts will now only return a table if the return value is a JavaScript Array of Arrays (List of Lists) object or a JSON string representing a LOL. All other objects will be converted to a JSON string representation or scalar string value.

A few extra JavaScript focused helper methods have also been added to the passed-in “db” object:

  • db.jsCreateTable(tableName, JSONaLOLorLOOobject) where the 2nd argument can be a “JSON table” in either List of Lists or List of Objects format. A table will be created using either the first “line” of the LOL or the “names” of the objects fields in the LOO to provide the column names (no types applied, as no need most of the time).
  • db.jsInsertRows(tableName,JSONaLOLorLOOobject) – as above but will insert/append the pass’d “rows” to the specified table.
  • db.jsonSQLQuery(SQLSelectStatement,typeOfJSON) will return a table in List of Lists format if typeOfJSON = “LOL” or as a List of Objects if “LOO”.

There’s also a new JavaScript command: loadAssembly(dotNetAssemblyName) – which will load a .NET assembly allowing the use of any .NET library within JAVASCRIPT.

Both the JAVASCRIPT and threading commands are at a very early stage, only minimal testing so far, so use with care.

To download the latest version of HAMMER, use this link.


JavaScript as an Excel scripting language via HAMMER

I’ve demonstrated in the past how to embed JavaScript in  Excel using the C based JSDB toolset and having already provided Python as a HAMMER scripting language, I felt the time had come to do the same for JavaScript. With this, the latest release of HAMMER, JavaScript’s back!

There’s three main reason’s why I’ve added JavaScript as a partner scripting language to HAMMER’s existing Python (and VBA) integration.

  • JavaScript is everywhere, a lot more people know it and use it than have ever used, or likely to use Python, (or VBA). JavaScript is now the undisputed “glue language”.
  • I needed a scripting language that would work in all versions of HAMMER (the current IronPython implementation requires NET 4.0). For that reason I decided to use the NET2.0 based JINT rather than IronJS as my choice of JavaScript engine. JINT, is simpler, and probably slower that IronJS, but being C# based (IronJS is being developed in F#) I can more easily dig as deep as I need into JINT’s codebase without exhausting my technical skills.
  • Increasingly I’m coming across JSON as data transfer protocol and although I have inbuilt JSON processing commands in HAMMER, JavaScript and JSON are natural partners.

The main use-case I see for JavaScript scripts will be the manipulation of data, which is obviously equally possible with Python, but I see IronPython’s main use as a scripting language for .NET facilities (IronPython’s integration with .NET is complete, the JINT engine’s integration is much less advanced). And as I pointed out above, removing the NET4.0 requirement (for both Excel and the HAMMER.exe command line version) is very useful when building HAMMER Inside bespoke applications.

So my scripting language choices are:

  • VBA, when manipulating the Excel Object Model
  • IronPython when accessing .NET internals, or for complex coding requirements when not  using C# or VB.NET.
  • SQL , for tabular set manipulations, and for the fact “that SQL  does exactly what it says on the tin“.
  • JavaScript for everyday data-focused IF-THEN-ELSE’ing or JSON’ing.

Calling JavaScript from HAMMER is similar to Python. So …

=HAMMER(“myVal=’cat’; return myVal;”,”JAVASCRIPT”) will return “cat”.

Like PYTHON, a object “db” is passed into the script, this allows access to the SQLite instance via several functions:

  • db.SQL(arg) will take a string arg holding a SQL command and execute it.
  • db.SQL(arg,callback(arg)) as above but expects a SELECT statement and will return a JSON encoded object string via the callback’s arg for each row returned.
  • myJSONTable  = db.jsSQLQuery(arg) expects a SQL SELECT statement and will return the table as a single JSON encoded string as a List of Objects.
  • myJSONReturn = db.JSONhammer(arg) will make a call as an internal HAMMER command where arg is a JSON encoded string of a List of commands. Will return a JSON encoded table or a string value.
  • argValue = db.getArgNoArg(argNo) will get the value of a HAMMER argument, where argNo=1 is the last argument before the JavaScript script argument, argNo=2 is the argument prior to that, and so on.
  • tableName  =  db.getArgNoTable(argNo) like above but returns a table name if the argument position holds a table rather than a string value.

A predefined JSON object is also passed in, offering the usual JSON.parse(arg)  and JSON.stringify(obj) methods.

There’s also an alert(arg) command which will display a Window’s message box and a println(arg) command which will append to a _JSSTDOUT specified log file if one has been specified.

Use the “return” command to return either a table (a JavaScript object containing either a List of Lists or a List of Objects) or a string value.

I’ve not fully tested this, or settled completely on the interface between HAMMER and its new JAVASCRIPT engine, but I’ll do so over the next few weeks and will then include a proper examples of its use.  So use with caution.

Here’s a list of the HAMMER commands implemented so far …

Download  the latest version of HAMMER from here …

PowerPivot VBA Refresh Code – Bug Fix

Just a quick post to alert those of you using my PowerPivot Refresh code to a bug in its “refresh a single table” logic. Under certain circumstances, linked tables (i.e. those on the “many” side of a relationship) will fail to refresh if specified individually (fine when part of a refresh all). See here for the details behind the bug, and thanks again to Rob Parker for bringing it to my attention. I’ve updated the sample code with the fix, download it here.

I’ve also tested the code against the just released SQLServer 2010 RC0 version of PowerPivot and it appears to work.

I’ve updated the equivalent PPREFRESH code in HAMMER and this is now part of the tool’s latest release (V1.3.4(Beta)). My previous post, Excel as a Book of Record, previewed the most important new commands available in this release. Alongside those, I’ve also added the following:

  • ISTABLE, if the previous COMMAND’s result  or the previous argument is not a table, this will abend the command sequence.
  • ISARG, as above, but this time checks for an argument (a HAMMER parameter may either be a table, a command or an argument- aka, an ARG).
  • ISOK, previous argument must be the string value “OK”.
  • TABLESARETHESAME, will fail if the last two tables are not identical. Intended mainly for automated regression testing.
  • ARGSARETHESAME, as above. but this time for ARGS.
  • _GUID, will return a globally unique identifier.

Download the latest version of HAMMER from here …

Excel as a book of record.

In the past I’ve talked about Excel as a tool to develop Really Simple Systems. Such “systems” usually occupy the middle ground between continuing to do a task by hand or  investing time/money in using a packaged/bespoke “proper system”.

When such systems are primarily reporting in emphasis, the justification for using Excel is usually straight forward and compelling (and getting even more compelling with the appearance on the scene of Excel 2010′s  PowerPivot.) But, alarm bells sound across the world of professional IT when Excel is proposed as a “book of record” i.e. when it’s to be used to store and maintain a business critical dataset. And, with some considerable justification, the nightmare that is linked (or even worse, shared) workbooks is very real indeed. But yet, businesses continue to do so, and do so quite successfully.

I myself record my business as a series of Excel documents (Invoices, Timesheets, Expenses) in a set of folders (Financial Years subdivided into major customers).  Essentially a document-oriented database.

In the past I simply then used a VBA powered workbook to open the required “documents” and extracted the data necessary for whatever report I required (VAT, year-end etc.).  To better automate (i.e. less bespoke VBA) this task I’ve have made changes to HAMMER to help with this and with similar tasks for clients.

The following list of commands will be added to the next release of HAMMER. (In the meantime these new functions can be previewed here …)

LISTOFFILES

This command takes a single argument, the folder to search, and will return a table of files in that folder and in any sub-folders. The result can then be used to select a list of files for further processing.

Example:

=HAMMER(“C:\a\rss”,”LISTOFFILES”)

_XLTOJSONDOC

This command takes a list of workbooks, opens each one, checks for a list of named ranges and generates a JSON document. The command is intended to be called from within a VBA macro (as opening and closing workbook breaks the “no side effects” rule of UDFs). Most “_” commands such as “_MD5″ etc. are likewise intended for “programming use”, but any command beginning with “_XL” must be restricted to macro (i.e. non-UDF) use.

See the example workbook FetchInventory for an example of this function in action. The function takes one argument (the name of the document to load) and expects a table where the last column is the full name of the workbook to open. Any columns in the source table will be copied to the new “JSON document” table with an additional column called “Document” which will hold a JSON document representing the key-name pairs and table(s) extracted from the workbook.

On opening a workbook, it is searched for a named range with the name of document concatenated with “_fields” (e.g. INVENTORY_fields). The value of this range is expected to be a CSV list of fields and tables to load. A single Excel “document” could contain multiple logical documents (each specified by its own “_fields” list) .

See the PartsInventory_bin4 for an example of a multi-document workbook (INVENTORY and EXAMPLE). The EXAMPLE document in this workbook also demonstrates the various types of tables handled.

Example:

lJSONObjects = oHammer.HAMMER(“C:\a\rss\StockTake1″,”LISTOFFILES”,”Select name,fullname from table2 limit 1″,”SQL”,”INVENTORY”,”_XLTOJSONDOC”)

lReturn = oHammer.HAMMERToRange(lJSONObjects,”Sheet2!A27″)

… will output

JSONDOCVIEW

This command is where the previous commands are leading to, i.e. extracting some real information value from your documents. It converts JSON documents into Excel friendly tables. It is, in essence, a Map function as in MapReduce. In a previous example I used a Python Map and a SQL Reduce, here, both Map and Reduce are via SQL (the command uses a series of SQL commands to perform its task).

Before I describe the function let me explain why I use an intermediate JSON format. I could just extract the data directly from each document and either store directly in Excel or create tables in SQLite of Access to hold this data. And in fact, that’s what I would have done in the past (seeExcel as a document-oriented NoSQL database). Now , however, I tend to favour using a free-format (i.e. no need for a fixed database schema) structure like a JSON document, so as the source documents evolve over time (which tends to happen not just during design stages but as the system matures) this will not break older documents.

So, for example, original Invoice workbooks might not have a backing time-sheet while newer Invoices do. As long as new and old documents share a core sub-set of data fields they can continue to be analysed together.

The command takes 5 arguments and a driving table (a record so far for HAMMER commands, most have a max of two arguments). The driving table’s last column is assumed to contain the JSON document to process, columns prior to this (if any) will be output unchanged for each resulting row.

The first argument specifies the name of the “inner” table to fetch (if any). Most real life documents consist of header details (the “outer”document) and one or more tables (“inner” details). Invoices, time-sheets,  stock-takes, all tend to follow this pattern. This command will effectively join each document’s outer details to a single inner table (if more than 1 inner table, a call for each one is required).

The second (field list in SQL format) and third (SQL where predicate format) arguments specify what inner fields to extract (if blank, then all) and what restrictions to impose (if any). So “InvNo, Date”,”InvNo > 12″ would only fetch documents where the InvNo > 12 and only include the InvNo and Date fields.

The fourth and fifth arguments do the same for the outer table (i.e. Header data).

If any of the columns specified  (inner or outer) can not be found, or if the predicates (inner or outer) result in no selection, no error is returned, the document simply returns no rows. Likewise if an inner table is specified and no such table exists, then no rows are returned for that document – in other words this is not an outer join, which is not usually a problem as in most cases a “header” without detail lines is meaningless. If an outer join is required, then extract the headers (outers) and details (inner tables(s)) separately and join using SQL.

Example:

=HAMMER(“Select Name,FullName,Document from invoice_docs”,”SQL”,”table_2″, “[PART NUMBER],QTY”, “QTY >30″, “Bin_Number”, “Bin_Number > 1″, “JSONDOCVIEW”)

would result in:

For more complex JSON objects use the JSON command to incrementally parse the text or use the VBA JSON module within microETL.  [UPDATE: Feb 2012 - ... or use the JAVASCRIPT command]. But for most situations (especially if you control the expected format) JSONDOCVIEW should handle it.

As JSON is fast becoming the preferred transport format for web and mobile applications having the ability to parse and produce JSON form within Excel is very useful. It is possible, for example, to use a simple web technology such as http://robla.net/jsonwidget/ to craft another type of Really Simple System. This time with the collection happening on the web (most likely using AWS S3 pre-signed forms, so no HTML server required – keep it simple) but with the control and reporting remaining within Excel (a variation on my Steam Powered Server idea).

For an example of a really simple system  download this.

Latest version of HAMMER including the above commands now released …

HAMMER Alongside, as a COM Server

Although it has always been possible to call HAMMER from within VBA via the Application.Run method, this is a somewhat clunky way of doing so and it can also be very inefficient, particularly for tight loops. But now, with this release (V 1.2.0 (Beta) ) core functionality is exposed as a COM Server, which means easier and more efficient interfacing between VBA and the .NET multi-threaded enabled world of the datasmith’s HAMMER.

I call this “HAMMER Alongside”, to differentiate it from HAMMER Inside where I use HAMMER internals to craft stand alone XLLs. With this COM Server method, the standard HAMMER add-in (installed or just-in-time registered) is needed alongside either a VBA add-in or a VBA enabled workbook. The extra “moving parts” are easily justified by the extra flexibility that this method allows (particularly to those with a reasonable grasp of VBA, but perhaps lacking any familiarity with the .NET world).

By learning a small amount of IronPython (here’s a good starting place) it’s relatively easy to hook up VBA to any .NET library through HAMMER, without having to invest time and money in learning heavy duty development environments such as Visual Studio. Using “Internal Threads, it’s also possible to take advantage of .NETs multi-threading capability from VBA. (Note: both these options require .NET4).

So with VBA becoming a first-class language with regards to HAMMER, why offer Python as an alternative scripting choice? Well, one reason is to offer a means to access .NET’s power as per the last paragraph. But, the primary reason, is related to the ability of HAMMER transformations to be “detached” from Excel and run via HAMMER’s command line tools. The .NET 4 version of the command line tool supports Python, so it is possible to initially build out a micro-ETL transformation within Excel (using Python to perform the business logic alongside SQL) and then transfer that logic to the command-line tools with a minimum of modification.

This could be useful , for example, when a PowerPivot model moves to the server, its related HAMMER-powered micro-ETL processing could do likewise (most enterprise ETL tools support call-outs to command-line processes).

The COM server can be accessed from VBA only via late-binding like so:

Set comServ = CreateObject(“hammerCOMServerV1″)

The server’s methods are:

  • hammer(…) – works the same way as the UDF version.
  • hammerVersion() – returns HAMMER version, again the same as UDF version.
  • hammerVersionOK(version) – e.g.  isOK= comServ.hammerVersionOK(120) will return TRUE if the current version is >= V1.2.0
  • arrayResize(anArray) enables the creation of “toFit” UDFs.
  • arrayToSheet(anArray) likewise for “toSheet” UDFs.
  • arrayToRange(anArray,pasteToWhereString) enables “toRange” UDFs.
  • hammer_ppRefresh_inline(optional table,optional timeout) – refreshes a PowerPivot model, again the same as its UDF equivalent.

Download the latest version of HAMMER from here …