Category Archives: Ruby

LightSwitch & Hobo – the return of the 4GL?

Those of us of a certain age have fond memories of the golden era of 4GLs. These simple, but at the time revolutionary, tools enabled business-aware programmers (usually termed analyst-programmers) to quickly build & deploy line-of-business apps. They (both tools and devs) were primarily data-driven: data begat screens, screens begat more data and so on. The resulting apps were server delivered, using either green-screen terminals or client-side delivery apps (a bit like current-day client-side RIAs). The resulting UIs could best be described as “plain but functional”.

These halcyon days were replaced by a combination of 2-tier Windows apps and 3-tier enterprise platforms. The increasing sophistication & complexity of the new platforms forced programmers to become highly specialised, often losing their once close links with the business, and even losing sight of the value of business data as a resource in itself (it still amazes me when I come across business application programmers with poor or non-existent SQL/RDBMS skills). The analyst-programmer (AP) was no more.

Many of those APs, like myself, managed to stay close to the business by either becoming business/data analysts or datawarehousing/BI specialists. ERP platforms such as SAP, with their complex configuration requirements, also created a welcoming home for business-focused IT refugees.

The need for quick’n’dirty line-of-business apps has not disappeared, but this service is now often being provided by tools such as MS Access and, above all, by Excel. This is both good and bad; the good is the expansion of development skills outside of IT; the bad is the effective dis-arming of a large proportion of professional IT folks. To paraphrase Marvin the Paranoid Android: “Brain the size of a planet, and all they give me is Excel!”

Most of us managed to make the best of the situation, learning to respect, if not love, Excel; becoming SQL wizards; MDX magicians and dimensional modellers par excellence. But the call of the 4GLs remained and we veterans continue to keep a watchful eye for something that will match and hopefully surpass them.

Oracle’s Application Express (aka HTML DB) offered hope to those with Oracle skills (it is very similar in concept to SQL*Forms V3), and the open source tool WaveMaker is also excellent, styling itself, with good reason, the PowerBuilder for Web Enterprise.

In the last month I’ve come across two modern-day descendants of these 4GL data-driven tools: Hobo and LightSwitch.

Hobo is an open source extension to the Ruby on Rails platform. It uses the same MVC architecture but takes the concept further, not just allowing you to build a relationship model for your data, but also enabling the easy specification of a lifecycle model and, here’s the biggy, automatically building (and re-building on change) a very respectable UI to present to the outside world. It also offers a starter authentication framework and lots of other useful helpers. All without writing a line of Ruby code, or having any idea of how RoR works! The end result is a tool that can not only quickly, and iteratively, build CRUD type applications, but can also handle simple workflow apps out of the box.

LightSwitch is similar, yet very different. The same, in that it follows the traditional 4GL data-to-screen approach, presenting the user with a graphical tool to build data tables, to link to existing data sources and to create relationships between entities. Screens can then be generated that, like Hobo’s, are professional looking and easy to use. Again like Hobo, new ‘skins’ can be applied to change the look and feel.

If a more sophisticated solution is required, code can be added (VB.NET or C#) at predefined events and indeed the resulting project, being fully VS 2010 compliant, can be opened in VS Professional and built out from there. (A similar ability to get under the bonnet exists for Hobo as it is essentially Rails).

Where LightSwitch differs is in the deployment methods used. The end result is a Silverlight app which can either run client-side (with full access to the client’s environment, e.g. interacting with Office etc.) or as a sandboxed IIS browser app or via the Azure cloud. The same code and the same project can easily migrate back & forth between all three options. (Hobo, being a Rails app, could also be sort-of-localised using RubyScript2Exe, but it could very easily be cloud deployed using the EC2-based, dead-simple to use, http://heroku.com/)

The LightSwitch data modeller also allows for relationships between local databases, network databases, SQL Azure cloud databases and web service datasets to be built and maintained within the application. The need for mashups between local & central/remote data is a constant requirement for LOB developers and LightSwitch appears to have made it very easy to implement.

This data mash-up ability and the option to interact with the client will be a major attraction, at least for corporate devs working largely with MS tools. I say will, as alas the current beta is “molasses-in-January” slow. I thought initially it was just my 5-year-old laptop hitting the wall, but others with more modern & powerful hardware also found it so.

So do tools like Hobo and LightSwitch herald the return of the IT analyst/programmer? Probably not; these are different times: outsourcing, SaaS and packaged software have reduced, and will continue to reduce, the number of business-facing IT staff. But their places are being taken by IT-aware business folks, citizen programmers, creators of time-assets, and it is they that will likely be the beneficiaries of such tools.

Python the new VBA?

These last two weeks, Python has been on my mind. First off, last week I decided to make time to fully investigate Picalo, an open-source Python-based data analysis tool, and then, this week, Google announced their long awaited cloud-computing offering, Google App Engine, with the language at its core.

Python was the first of the “LAMP generation” scripting languages that I decided to learn in any detail (I had used Perl before that, but only on a per-task basis, similar to how I’d used AWK). I then invested time in learning PHP, then Ruby and finally JavaScript. And here I am, back where I started, with Python.

But it’s not the same Python I learned three years ago; not that it has changed that much, but my appreciation of the language has, largely due to my deep dives into other languages. For example, JavaScript’s treatment of functions as first-class objects highlighted the same functionality in Python, something I’d missed (or rather, not fully understood) the first time I encountered the language. Likewise, Ruby’s RoR introduced me to a “best of breed” approach to web application design, something that can be used as a comparison aid when approaching new web frameworks such as Django.
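For anyone who, like me, missed it first time round, here is a minimal sketch of what “functions as first-class objects” means in Python. The names (apply_twice, make_multiplier and so on) are made up purely for illustration.

```python
def apply_twice(func, value):
    """Call func on value, then call it again on the result."""
    return func(func(value))


def make_multiplier(factor):
    """Build and return a new function that multiplies its argument by factor."""
    def multiply(x):
        return x * factor
    return multiply


# A function can be stored in a variable...
double = make_multiplier(2)

# ...kept in a data structure alongside built-in functions...
transforms = {"double": double, "upper": str.upper}

# ...and passed as an argument to another function.
result_number = apply_twice(double, 5)
result_text = transforms["upper"]("vba")

print(result_number)
print(result_text)
```

The same three tricks (assign, store, pass) are what make callbacks and closures work in JavaScript, which is why the penny dropped for me there first.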

But of course the scripting language that continues to power most of my datasmithing activities is Excel VBA. That’s why I was so excited to see a tool such as Proto utilise VBA as its scripting language. But, Microsoft has abandoned VBA, there will be no more Protos.

Also, Excel VBA is now a Windows only language. Windows, however, is no longer the ‘only’ business client OS (see how many Apple laptops you can spot the next time you’re in a business-class airport lounge, a few years ago it would have been zero, not any more), and is currently nowhere to be seen as a cloud computing platform (but that’ll change).

I’m at heart a table-oriented programmer, and I, like Picalo’s author Conan Albrecht, believe “data analysis is best done through scripting”; but not just data analysis, the T in ETL (Extract, Transform and Load) and the I in DI (Data Integration) and SI (Systems Interfacing) also benefit from a scripting approach.

So, what to adopt as a successor/companion-in-her-old-age to VBA, will it be Ruby, JavaScript, Python, Perl, even PHP?

It looks like it’ll be Python because it’s …

The runner up is of course Ruby, but its poor integration with Windows is a major problem and the datasmithing “prior art” of Picalo and Resolver makes Python hard to beat.

UPDATE Jan 2010:

To experience the best of both worlds, VBA & Python, my xLite (Excel combined with SQLite) datasmithing platform now allows Python to be used in conjunction with VBA.  Check it out here

 

UPDATE July 2011:

For another method of integrating Python (this time .NET’s IronPython) with Excel/VBA see http://blog.gobansaor.com/2011/07/18/vba-multithreading-net-integration-via-hammer/
UPDATE:

Also, as Dan pointed out in the comments below, I’d not included Jython in my list of reasons for embracing Python. I must add it to my list of things to try out particularly as both my “classic” ETL tools, Talend and Kettle are JVM based.

Another thing to add to the (ever growing) list is Mike Pittaro’s SnapLogic Python-based ETL tool. They have …

…just released a 2.0 Beta version with some major architectural enhancements. The SnapLogic model is very different from traditional ETL systems. It takes an approach that’s more like the web, based on loose coupling and HTTP interactions. We model data sources, sinks, and transformations as URI addressable endpoints, and have a model where they can be chained together in pipelines to build transformation logic. We use a plugin architecture to make it easy to add custom components.
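To get a feel for the “addressable endpoints chained into pipelines” idea, here is a rough sketch in plain Python. It is not SnapLogic’s API: the registry, the endpoint decorator, the URI-style names and the sample records are all assumptions of mine, and everything runs in-process rather than over HTTP.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]
Stage = Callable[[Iterable[Record]], Iterator[Record]]

# Hypothetical registry: every component is looked up by a URI-like key.
REGISTRY: Dict[str, Stage] = {}


def endpoint(uri: str):
    """Register a source or transformation under the given URI-like key."""
    def register(func: Stage) -> Stage:
        REGISTRY[uri] = func
        return func
    return register


@endpoint("/source/customers")
def customers(_upstream: Iterable[Record]) -> Iterator[Record]:
    # A source ignores its upstream and simply emits records.
    yield {"name": "alice", "country": "ie"}
    yield {"name": "bob", "country": "us"}


@endpoint("/transform/uppercase-country")
def uppercase_country(upstream: Iterable[Record]) -> Iterator[Record]:
    for record in upstream:
        yield {**record, "country": record["country"].upper()}


@endpoint("/transform/only-ie")
def only_ie(upstream: Iterable[Record]) -> Iterator[Record]:
    for record in upstream:
        if record["country"] == "IE":
            yield record


def run_pipeline(uris: List[str]) -> List[Record]:
    """Chain the named endpoints so each one consumes the previous one's output."""
    stream: Iterable[Record] = iter(())
    for uri in uris:
        stream = REGISTRY[uri](stream)
    return list(stream)


rows = run_pipeline([
    "/source/customers",
    "/transform/uppercase-country",
    "/transform/only-ie",
])
print(rows)
```

Because each stage only knows about the stream it is handed, stages can be re-ordered or swapped by changing the list of names, which is the loose coupling the quote is getting at.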

Zimki – the spirit lives on …

Although Zimki is to shut down on Christmas Eve, the ideas behind the service live on. Two new offerings, Heroku and AppJet, offer variations on the idea of hosted application development/deployment.

AppJet, funded by Paul Graham’s Y Combinator, is very similar to Zimki, being a server-side JavaScript platform. No details yet as to what sort of paid options will be offered (all accounts are free at the moment). Unlike Zimki, there are no plans to create an open-source version. I like the easy “build a Facebook app” feature, and I guess this is the sort of light-weight application that they hope to attract.

Although Heroku uses Ruby-on-Rails technology, rather than JavaScript, it is closer to the original Zimki idea; but rather than take the hard (and ultimately unsuccessful in Zimki’s case) road of building an open-source platform from scratch, Heroku takes an already popular open-source project and offers it wrapped in a full on-line development and deployment environment. Again, being in beta, there’s no indication as to what pricing model it will operate under, but I would think that it will attract more “serious” projects than AppJet since anything developed under Heroku is pure Rails which means it can be migrated to any other Rails hosting environment; so no lock-in. The online editor is excellent and whatever about its merits as a hosting service it’s by far the easiest way to learn and explore Ruby and Rails, even easier than this…

If Facebook apps are your goal but you wish to use Ruby rather than AppJet’s JavaScript then not to panic, as being Ruby some bright young spark (no, not me I’m afraid) will already have done a lot of the hard graft for you…

Ruby plus Amazon S3 – Document Centric Database

I’ve said it before and I’m going to repeat myself; learning Ruby has proven to be a great investment, not so much for the language itself but for the insights it gives into other technologies. As soon as a new ‘cool’ technology or idea hits the street some smart Rubyist is bound to attack it, dice it up and serve it back up as easy to digest Ruby code.

Today, it’s the turn of Document Centric Databases done in the style of CouchDB, but replacing JavaScript/Erlang with Ruby and the bespoke data store with Amazon’s S3 service.

Anthony Eden’s RDDB project is still very much alpha, but looking through the code it looks like it has lots of good ideas, including using EC2 instances as “map reduce workers” listening on Amazon SQS Queues; so the whole Amazon AWS stack might yet get starring roles. The actual data store can be varied, with both partitioned file system and RAM based options currently available alongside S3.
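For those who haven’t met the CouchDB style of document database, the core idea is that you store schema-less documents and query them through “views” defined by a map function (and optionally a reduce function). The sketch below shows that idea using a plain in-memory dictionary in Python; it is not RDDB’s or CouchDB’s API, and the class, method names and sample documents are my own inventions.

```python
from collections import defaultdict


class DocumentStore:
    """A toy in-memory document store with map/reduce views."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        """Store (or overwrite) a document under doc_id."""
        self._docs[doc_id] = doc

    def view(self, map_func, reduce_func=None):
        """Run map_func over every document, group emitted values by key,
        then optionally collapse each group with reduce_func."""
        grouped = defaultdict(list)
        for doc_id, doc in self._docs.items():
            for key, value in map_func(doc_id, doc):
                grouped[key].append(value)
        if reduce_func is None:
            return dict(grouped)
        return {key: reduce_func(values) for key, values in grouped.items()}


store = DocumentStore()
store.put("post-1", {"title": "Hobo", "tags": ["ruby", "4gl"]})
store.put("post-2", {"title": "RDDB", "tags": ["ruby", "aws"]})
store.put("post-3", {"title": "Picalo", "tags": ["python"]})


def tags_map(doc_id, doc):
    """Emit one (tag, 1) pair for each tag on a document."""
    for tag in doc.get("tags", []):
        yield tag, 1


tag_counts = store.view(tags_map, reduce_func=sum)
print(tag_counts)
```

Swap the dictionary for S3 objects and farm the map step out to worker machines and you have, in spirit at least, the sort of architecture described above.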

In other Amazon AWS related news, today saw the announcement of an option to use European data centres to store S3 data (with a slightly higher charge than using North American locations, and with the transfer of data between EU based S3 buckets and US based EC2 instances no longer being free). I’m guessing that the option to fire up European based EC2 servers can’t be far behind. Also, one piece of news I’d missed was that EC2 is now in unlimited beta, i.e. it’s now open to all developers. So developers everywhere can, for less than the cost of a mobile text message, fire up their own dedicated and powerful Linux server. The day of a production ready, SLA backed, EC2 service is around the corner.

Google Spreadsheets – ETL tool

Although I’m a total Excel fanboy, I must admit I rarely use it any longer for personal stuff such as home budgets, tax calculations, what-ifs, to-do lists etc.; I now tend to use Google Spreadsheets. Likewise, personal notes, drafts and useful bits of code are stored using Google Docs rather than MS Word. Three main reasons for this shift to the cloud:

  • Google Docs & Spreadsheets are ‘good enough’ for most of the trivial lists and calculations I require in my personal life and indeed for most business purposes as well, at least those that don’t require a pivot table.
  • These spreadsheets and documents are important, but not necessarily on the ‘state secret/I-could-tell-but-then-I’d-have-to-kill-you’ scale of things; by building them in Google Apps they are securely backed-up and easily accessible.
  • A lot of the spreadsheets are collaborative in nature, and in the collaboration field, Google Spreadsheets just gets better and better.

Today, Google announced further additions to their spreadsheet product. The AutoFill feature adds functionality I’ve come to expect from Excel, but with a twist: integration with Google Sets. But the additions that really caught my eye were the new data import functions. Now again, Excel has had web queries since Excel 97, and it always amazed me that online pretenders to the throne tended to ignore the most common source of tabular data on the web, the HTML table; something to do with the great XML/Tables divide I guess!

Google now not only fixes this omission, providing access to HTML tables and comma/tab separated files, but also provides access to RSS/ATOM and generic XML sources. All that’s missing now are functions that can read other common online data file formats such as Excel, MSAccess, XBase and of course SQLite.

This addition of HTML import support and the AutoFill feature will further reduce the number of times I’ll need to fire up Excel for personal tasks, but the RSS/ATOM/XML import feature also has potential as a tool in my micro-ETL toolbox. Using Excel as my only micro-ETL tool is possible when the data is either already in Excel/CSV or accessible via a COM API or via ODBC drivers; otherwise I can call in Ruby, Talend, Kettle or even RSSBus. But now I’ve another option: if the data is public and published as RSS/ATOM or some other variation on XML, I can use Google Spreadsheets to fetch the data and import the resulting tabular dataset into Excel via a Web Query or via the GData API.
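The scripting route to the same result, turning a feed into a table that Excel can swallow, looks something like the following Python sketch. It uses only the standard library and an inline sample feed so that it runs without a network connection; the feed content, field list and function names are assumptions for illustration, and a real feed would need fetching and more defensive parsing.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Inline sample RSS 2.0 document (made-up content) standing in for a fetched feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <pubDate>Mon, 01 Oct 2007 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
      <pubDate>Tue, 02 Oct 2007 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def rss_to_rows(xml_text, fields=("title", "link", "pubDate")):
    """Flatten each <item> in the feed into one row of field values."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        rows.append([item.findtext(field, default="") for field in fields])
    return rows


def rows_to_csv(rows, header):
    """Serialise the rows as CSV text, header line first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


rows = rss_to_rows(SAMPLE_RSS)
csv_text = rows_to_csv(rows, ["title", "link", "pubDate"])
print(csv_text)
```

The resulting CSV text can be saved to a file and opened directly in Excel, which is the “T” step of a micro-ETL job in a dozen or so lines.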

One other thing. While researching this post, looking up links etc., I used another new feature Google added today, Google Reader’s new Search facility. As most of my references are discovered via the blogs I subscribe to, the ability to restrict searches to that subset of the web is fantastic; I even used it to search through my own blog posts! If del.icio.us offered the same option it would make re-finding stuff even easier. I did try to use Google Co-Op to build a search engine restricted to my del.icio.us links but it didn’t seem to like the volume of links (4000 odd) I sent it.