Category Archives: Ruby

Facebook Apps using Ruby on Rails

This is what I love about Ruby and Ruby on Rails: once you learn the basics of Ruby and how a RoR app is put together, you can use this knowledge to learn about other technologies, in this case Facebook Applications. The reason is that as soon as a new technology hits the street, somebody is bound to build either a Ruby library or a RoR plugin targeting the new (and presumably cool) platform. In two excellent articles Stuart Eccles shows how to build a Facebook app using the rFacebook Rails extension. I guess I could have looked at a PHP or Java example, but I chose (as I nearly always do) the Rails route, as the layers of abstraction and the standard infrastructure of RoR apps allow me to quickly get an overview of the new technology in action and also, if I so desire, to easily deep dive into anything that requires more detailed investigation.

In the late nineties I learned Java (and later .NET) for a similar purpose: not as a primary development tool but because of its role as the “language of account” of most new technologies at the time. And although I still come across environments where the only examples are in Java (or .NET or PHP), I know that if I wait a few weeks some bright Rubyist will eventually “document” it in either Ruby or Rails.

Using the sample application built by Stuart I’ve added one of my favourite salad recipes,
Banana & Tomato in a Mustard Dressing
..enjoy.

Like Excel macros? You’ll love this..

One of the most used (and abused) features of Excel is its macro recording facility. How many mundane and repetitive actions have been automated using this feature? How many people found the courage to program in VBA by using the recorder as their training wheels? Well, now iMacros (from German company iOpus GmbH), a Firefox extension and an IE add-on, brings macro recording to the web browser. Although Greasemonkey already enables JavaScript programmers to automate Firefox, iMacros offers the same power to non-programmers.

The macros can be saved as bookmarks and can also be shared with others (although I couldn’t get the sharing feature to work). I’ve already automated a number of tedious tasks and had hoped to use it with Google Spreadsheets; alas, either the nature of Google’s JavaScript or inbuilt protection within Google Apps stopped the recorder from working. Pity, as there are three things missing from Google Spreadsheets as it stands: pivot table support, offline ability and macro support. Google Gears will undoubtedly solve the offline problem; charts are essentially a graphical pivot, so I guess a table pivot must be a possibility; and iMacros or something like it could perform the duties that VBA provides to Excel. As with Office macros, security issues may well dampen the parade – iMacros allows access to the PC’s file system and a macro can be invoked from a bookmarklet camouflaged as a standard link – but I’ll not worry until (or if) the product goes mainstream.

From an ETL point of view iMacros can act as a powerful web scraping tool and as an automated form-filler. There’s also a commercial version of the product ($499.00) that exposes the tool via an ActiveX API, which means Excel/VBA can be used as a web scraping/form filling environment. If the price tag is too steep, then the excellent scRUBYt! is both free and open source and is ideal if you’re scraping a lot of data on a frequent basis, while for small or once-off tasks this equally free and open source Firefox extension is good enough.

RoR Data Warehouse on EC2

If you’ve been putting off evaluating Ruby on Rails and you’re lucky enough to have an Amazon EC2 beta account then it’s your lucky day. Paul Dowman has just made a public AMI (think of it like a virtual machine spec from which you can create a running EC2 instance) with various Ruby on Rails goodies preloaded.

Features:

  • Automatic backup of MySQL database to S3 every 10 minutes.
  • Mongrel_cluster behind Apache 2.2, configured according to Coda Hale’s excellent guide, with /etc/init.d startup script
  • Ruby on Rails 1.2.3
  • Ruby 1.8.5
  • MySQL 5
  • Ubuntu 7.04 Feisty with Xen versions of standard libs (libc6-xen package).
  • All EC2 command-line tools installed
  • MySQL and Apache configured to write logs to /mnt/log so you don’t fill up EC2’s small root filesystem
  • Hostname set correctly to public hostname
  • NTP
  • A script to re-bundle, save and register your own copy of this image in one step (if you want to).

I’ve been meaning to try out Anthony Eden’s RoR based data warehousing tool for some time; no more excuses, as I can now fire up an EC2 instance based on Paul’s AMI, install the ActiveWarehouse plugin and away I go. As ActiveWarehouse primarily uses techniques described in The Data Warehouse Toolkit, it’s also a good learning tool for those new to data warehousing. All I need now is a sizeable publicly accessible dataset to populate the warehouse to get a true feel for its capabilities. There’s only so much you can do with the venerable Northwind database. Does anybody know of a ‘beefier’ alternative?

Talend vs. Kettle (Pentaho PDI)

Over the last few weeks I’ve received a lot of traffic from Google searches comparing Talend and Kettle and also from Vincent McBurney’s ITtoolbox article comparing the two products, so where do I stand?

As ETL tools they take different approaches: Kettle is a metadata-driven framework (which is in turn tightly integrated into an even larger BI framework, the Pentaho BI Project), while Talend is at the end of the day a code generator. The code generator nature of Talend does impart certain capabilities, such as the ability to integrate easily into other BI platforms (such as JasperSoft and SpagoBI) and to serve as a micro-ETL tool. On the other hand, Kettle’s increasing integration into the Pentaho family makes it a natural choice for larger BI implementations. So which one would I recommend? It’s got to be Kettle, and the main reason for that choice is the Pentaho PDI community.

My experience so far with the Talend community has not been good. Admittedly I’ve only asked two questions, one looking for more details on a statement regarding ERP and CRM connectors mentioned in a press release and the other a technical query about a new tSQLite connector. Neither was answered. OK, I’m a big boy and can handle the rejection, and I’m also technically savvy, so I can eventually fix most of my own problems. But I’m sure that if I had posted similar questions on the Pentaho forums I would have received an answer, probably from Matt Casters if he thought the general community would not be able to provide one.

So will this experience stop me from using Talend? Maybe not; the code generation capability may come in handy as a micro-ETL tool (but I do have other options). However, through an offline conversation with Samatar (an active member of the Kettle community) and Matt Casters I learned of a new XML input step and of a PDI alternative to the Talend tJavaFlex component, the “Modified Java Script Value” step. This allows for the creation of START, TRANSFORM (i.e. for-each-row) and END JavaScript scripts. As Kettle uses the Rhino scripting engine I have access to any Java API (e.g. Palo, S3, Google Spreadsheets) but with the flexibility of JavaScript; brilliant!
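The START / TRANSFORM / END split can be pictured with a small sketch. This is plain Java, not Kettle's step API and not what the “Modified Java Script Value” step does internally; the `RowScript` interface, the `run` helper and the `UpperCaseName` class are hypothetical names made up purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RowScriptLifecycle {

    /** Hypothetical stand-in for a three-part script; not a Kettle interface. */
    interface RowScript {
        void start();                      // runs once, before the first row
        String[] transform(String[] row);  // runs once for each row
        void end();                        // runs once, after the last row
    }

    /** Example script: upper-cases column 1 and counts the rows it sees. */
    static class UpperCaseName implements RowScript {
        int rowCount;
        boolean finished;

        public void start() {
            rowCount = 0;
            finished = false;
        }

        public String[] transform(String[] row) {
            rowCount++;
            String[] out = row.clone();     // leave the input row untouched
            out[1] = out[1].toUpperCase();
            return out;
        }

        public void end() {
            finished = true;
        }
    }

    /** Drives a script over the rows in START, TRANSFORM, END order. */
    static List<String[]> run(RowScript script, List<String[]> rows) {
        List<String[]> result = new ArrayList<>();
        script.start();
        for (String[] row : rows) {
            result.add(script.transform(row));
        }
        script.end();
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"1", "banana"});
        rows.add(new String[] {"2", "tomato"});

        UpperCaseName script = new UpperCaseName();
        for (String[] row : run(script, rows)) {
            System.out.println(row[0] + "," + row[1]);
        }
        System.out.println(script.rowCount + " rows processed");
    }
}
```

The point of the three hooks is that per-run state (counters, connections, lookup tables) is set up and torn down once, while the per-row work stays small.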

Not only that, since my experience with Talend tJavaFlex reintroduced me to the world of Java I thought I might as well look into creating my own Kettle plugin and, guess what, it looks easy enough. So I’m going to develop my first plugin, based on the ScriptValueMod step, replacing the Rhino engine with the Java 6 ScriptEngineManager. This will give me access to the latest version of Rhino, which supports E4X (which will make handling XML much easier), and also opens the possibility of using JavaFX or indeed JRuby as Kettle scripting languages.
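For a feel of what the ScriptEngineManager route looks like, here is a minimal sketch using only the standard `javax.script` API. It is not Kettle plugin code; the class name `ScriptEngineSketch` and its two helper methods are hypothetical. One assumption to flag loudly: whether a JavaScript engine is bundled depends on the JDK in use, so the lookup can come back empty and the code treats that as a normal outcome rather than an error. (It also uses `Optional`, so it needs Java 8 or later rather than Java 6.)

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import javax.script.Bindings;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class ScriptEngineSketch {

    /** Lists the script engines this JVM can discover (may be empty). */
    static List<String> availableEngines() {
        List<String> names = new ArrayList<>();
        for (ScriptEngineFactory f : new ScriptEngineManager().getEngineFactories()) {
            names.add(f.getEngineName() + " (" + f.getLanguageName() + ")");
        }
        return names;
    }

    /**
     * Evaluates a script with the given variables bound by name.
     * Returns empty if no engine is registered under engineName
     * (or if the script itself evaluates to null).
     */
    static Optional<Object> eval(String engineName, String script,
                                 Map<String, Object> vars) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return Optional.empty();       // engine not present on this JVM
        }
        Bindings bindings = engine.createBindings();
        bindings.putAll(vars);
        try {
            return Optional.ofNullable(engine.eval(script, bindings));
        } catch (ScriptException e) {
            throw new IllegalArgumentException("script failed: " + script, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Engines: " + availableEngines());

        Map<String, Object> row = new HashMap<>();
        row.put("name", "tomato");
        Optional<Object> result = eval("JavaScript", "name.toUpperCase()", row);
        System.out.println(result.isPresent()
                ? String.valueOf(result.get())
                : "no JavaScript engine on this JVM");
    }
}
```

Because the engine is looked up by name, swapping in another language is in principle just a different `engineName`, provided an engine for that language is on the classpath.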

UPDATE:

I’ve received a reply to my tSQLite question!

UPDATE: July 2008

This post is now over a year old. Talend as a product and as a community has improved enormously in the last 12 months; so much so that I now use Talend (Java) in preference to PDI for most of my datasmithing needs.

JavaFX – a GUI DSL

Having mastered JavaScript (OK, master is too strong a word – having become comfortable with both its syntax and usage patterns) my next port of call is JavaFX, the recently announced Flash/Silverlight competitor. What led me to JavaFX Script was not its role in this Flash/AJAX alternative platform (which, unless Sun improves the JRE download experience, is dead in the water; they may have a chance with the promised consumer JRE) but its status as the 2nd scripting language to be supported by the Java 6 ScriptEngineManager – the first being JavaScript. Although the JavaFX platform is still in alpha, the important elements of the scripting language appear to be usable.

Why do I need another scripting language? Well, JavaFX Script is a DSL (domain-specific language), the domain in question is the GUI, and as a GUI DSL JavaFX is impressive.

In my quest for a micro-ETL environment the lack of a fast and powerful GUI tool has been a problem. Admittedly, in my Excel/SQLite xLite suite GUI generation is not a problem, as VBA forms are both fast to develop in and feature rich, but the other two options (Java and Ruby) both lack a cost-effective GUI tool (I know there’s Swing and Tk, but for somebody who’s used to the speed of development of tools such as VB forms or Oracle’s Application Express, neither appealed). JavaFX changes all that: it’s elegant, powerful, fast to develop in and portable. This pushes me further down the road of using Talend-generated Java as my micro-ETL environment of choice.

It’s easy to dismiss JavaFX as hype, but the technology behind the hype looks sound. The language borrows powerful features from other languages: object literals from JavaScript, list comprehensions from Python/Erlang/Haskell and first-class functions from JavaScript/Lisp, all combined with the full power and glory of Java. The end result:

..is a declarative and statically typed programming language. It has first-class functions, declarative syntax, list-comprehensions, and incremental dependency-based evaluation. The JavaFX Script language makes intensive use of the Java2D swing GUI components and allows for easy creation of GUIs.

Go to the community page for FAQs, tutorials and reference information. Timothy O’Brien has a good first pass at calling JavaFX from a Java program, and this article by John Smart explains the basics of using ScriptEngineManager to call JavaScript.