<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Gobán Saor &#187; Ruby</title>
	<atom:link href="http://blog.gobansaor.com/category/ruby/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.gobansaor.com</link>
	<description>A country datasmith.</description>
	<lastBuildDate>Tue, 27 Jul 2010 17:23:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='blog.gobansaor.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/67e164f5d51c2b3115a7819b84505c13?s=96&#038;d=http://s2.wp.com/i/buttonw-com.png</url>
		<title>Gobán Saor &#187; Ruby</title>
		<link>http://blog.gobansaor.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://blog.gobansaor.com/osd.xml" title="Gobán Saor" />
	<atom:link rel='hub' href='http://blog.gobansaor.com/?pushpress=hub'/>
		<item>
		<title>Python the new VBA ?</title>
		<link>http://blog.gobansaor.com/2008/04/11/python-the-new-vba/</link>
		<comments>http://blog.gobansaor.com/2008/04/11/python-the-new-vba/#comments</comments>
		<pubDate>Fri, 11 Apr 2008 13:35:10 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Palo]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[VBA]]></category>
		<category><![CDATA[Web2.0]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[news]]></category>
		<category><![CDATA[appengine]]></category>
		<category><![CDATA[AWK]]></category>
		<category><![CDATA[Perl]]></category>
		<category><![CDATA[Picalo]]></category>
		<category><![CDATA[Resolver]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=359</guid>
		<description><![CDATA[These last two weeks, Python has been on my mind. First off, last week I decided to make time to fully investigate Picalo, an open-source Python-based data analysis tool, and then, this week, Google announced their long-awaited cloud-computing offering, Google App Engine, with the language at its core. Python was the first of the [...]]]></description>
			<content:encoded><![CDATA[<p>These last two weeks, <a href="http://www.python.org">Python</a> has been on my mind. First off, last week I decided to make time to fully investigate <a href="http://www.picalo.org/">Picalo</a>, an open-source Python-based data analysis tool, and then, this week, Google announced their long-awaited cloud-computing offering, <a href="http://googleblog.blogspot.com/2008/04/developers-start-your-engines.html">Google App Engine</a>, with the language at its core.</p>
<p>Python was the first of the &#8220;<a href="http://en.wikipedia.org/wiki/LAMP_%28software_bundle%29">LAMP generation</a>&#8221; scripting languages that I decided to learn in any detail. (I had used Perl before that, but only on a per-task basis, much as I&#8217;d used <a href="http://en.wikipedia.org/wiki/AWK_(programming_language)">AWK</a>.) I then invested time in learning PHP, then Ruby and finally JavaScript. And here I am, back where I started, with Python.</p>
<p>But it&#8217;s not the same Python I learned three years ago. Not that the language has changed that much, but my appreciation of it has, largely due to my deep dives into other languages. For example, JavaScript&#8217;s treatment of <a href="http://www.joelonsoftware.com/items/2006/08/01.html">functions as first-class objects</a> highlighted the same functionality in Python, something I&#8217;d missed (or rather, not fully understood) the first time I encountered the language. Likewise, Ruby&#8217;s <a href="http://en.wikipedia.org/wiki/Ruby_on_Rails">RoR</a> introduced me to a &#8220;best of breed&#8221; approach to web application design, something that can be used as a comparison aid when approaching new web frameworks such as <a href="http://en.wikipedia.org/wiki/Django_%28web_framework%29">Django</a>.</p>
<p>But of course the scripting language that continues to power most of my datasmithing activities is Excel VBA. That&#8217;s why I was so excited to see a tool such as <a href="http://www.protosw.com">Proto</a> <a href="http://blog.gobansaor.com/2007/09/09/proto-desktop-bi-tool/">utilise VBA</a> as its scripting language. But <a href="http://msdn2.microsoft.com/en-gb/isv/bb190538.aspx">Microsoft has abandoned VBA</a>, so there will be no more Protos.</p>
<p>Also, Excel VBA is <a href="http://www.schwieb.com/blog/2006/08/08/saying-goodbye-to-visual-basic/">now a Windows-only language</a>. Windows, however, is no longer the &#8216;only&#8217; business client OS (see how many Apple laptops you can spot the next time you&#8217;re in a business-class airport lounge; a few years ago it would have been zero, not any more), and is currently nowhere to be seen as a cloud computing platform (<a href="http://blogs.zdnet.com/microsoft/?p=1324">but that&#8217;ll change</a>).</p>
<p><a href="http://blog.gobansaor.com/2007/03/03/tables-vs-xml-the-data-lingua-franca-debate/">I&#8217;m at heart</a> a <a href="http://www.geocities.com/tablizer/top.htm">table-oriented programmer</a>, and I,  like Picalo&#8217;s author <a href="http://warp.byu.edu/site/">Conan Albrecht</a>, believe &#8220;data analysis is best done through scripting&#8221;; but not just data analysis, the T in ETL  (Extract, Transform and Load) and the I in DI (Data Integration) and SI (Systems Interfacing) also benefit from a scripting approach.</p>
<p>So, what to adopt as a successor/companion-in-her-old-age to VBA? Will it be Ruby, JavaScript, Python, Perl, even PHP?</p>
<p>It looks like it&#8217;ll be Python because it&#8217;s &#8230;</p>
<ul>
<li> Windows friendly (via <a href="http://www.py2exe.org/">Py2Exe</a>),</li>
<li>&#8230;but also runs and is installed by default on MacOS and most Linux distros,</li>
<li>&#8230;and being Linux friendly means I can use it to power Amazon EC2 hosted &#8220;<a href="http://en.wikipedia.org/wiki/Batch_processing">batch processing</a>&#8221;.</li>
<li>Cross-platform &#8220;native&#8221; GUI support using <a href="http://www.wxwindows.org/">wxWidgets</a> via <a href="http://www.wxpython.org/">wxPython</a>.</li>
<li>Google App Engine support; I now have a replacement for the late <a href="http://blog.gobansaor.com/2007/12/13/zimki-the-spirt-lives-on/">Zimki</a>.</li>
<li>It&#8217;s table-oriented thanks to Picalo&#8217;s Table Object which provides a &#8220;good enough&#8221; alternative to Excel VBA&#8217;s powerful Range Object.</li>
<li>Picalo&#8217;s Table Object, being an extension of a Python &#8220;list of lists&#8221;, is of course memory-bound, but large-scale memory mapped datasets <a href="http://blog.gobansaor.com/2007/09/05/in-memory-olap/">are no longer a problem, they&#8217;re an opportunity &#8230;</a></li>
<li><a href="http://www.codeplex.com/Wiki/View.aspx?ProjectName=IronPython">Iron Python</a> offers powerful .NET integration, whether through the <a href="http://blog.jonudell.net/2007/09/27/first-look-at-resolver-an-ironpython-based-spreadsheet/">innovative Resolver Spreadsheet product</a> or via <a href="http://blog.gobansaor.com/2007/10/04/javascript-as-an-excel-scripting-language-via-exceldna/">ExcelDNA</a>.</li>
<li>Via Google App Engine, it looks like becoming the <a href="http://blog.gobansaor.com/2007/07/17/like-excel-macros-youll-love-this/">Macro language</a> for <a href="http://en.wikipedia.org/wiki/Google_Spreadsheets">Google Spreadsheets</a>.</li>
<li>It&#8217;s <a href="http://xkcd.com/353/">powerful, fun to use</a>, easy to learn, function orientated and I even like the use of human-eye-friendly indentation rather than braces and delimiters as a code grouping mechanism.</li>
<li>It&#8217;s easy to wrap C/C++ libraries to make them usable from Python <a href="http://en.wikipedia.org/wiki/Pyrex_(programming_language)">via Pyrex</a> and <a href="http://python.net/crew/theller/ctypes/">ctypes</a>.</li>
<li>Python 2.5 now has embedded SQLite3 (<a href="http://blog.gobansaor.com/2008/04/26/sqlite-the-ultimate-data-smithing-tool/">ultimate datasmithing tool!</a>) support (note, <a href="http://blog.gobansaor.com/2007/07/27/php5-and-sqlite3/">PHP5 developers, SQLite3 not SQLite2!</a>).</li>
<li>Oh, and did I mention the Google App Engine service and its equally important <a href="http://blog.gardeviance.org/2008/04/amazon-vs-google.html">but generally overlooked SDK</a>?</li>
</ul>
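<p>As a rough illustration of the embedded SQLite3 point in the list above, here is a minimal sketch of loading a few rows into an in-memory database and summarising them with SQL, using only the standard library <code>sqlite3</code> module. The table name, column names and sample rows are made up for the example; they are assumptions, not taken from any real dataset.</p>

```python
import sqlite3

# Hypothetical sample rows; in practice these might come from a CSV extract.
rows = [
    ("north", "widgets", 120.0),
    ("north", "gadgets", 80.5),
    ("south", "widgets", 95.0),
]


def load_and_summarise(records):
    """Load records into an in-memory SQLite table and total the amounts per region."""
    conn = sqlite3.connect(":memory:")  # no server and no file: the database lives in RAM
    try:
        # Illustrative schema (an assumption for this sketch).
        conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
        # Parameterised bulk insert of the Python tuples.
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for region, total in load_and_summarise(rows):
        print(region, total)
```

<p>The same pattern should work against a file-backed database by passing a path instead of <code>":memory:"</code>, which is what makes SQLite handy as a scratch area for small transform jobs.</p>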
<p>The runner-up is of course Ruby, but <a href="http://rubyonwindows.blogspot.com/2008/03/windows-rubys-red-headed-stepchild.html">its poor integration with Windows</a> is a major problem and the datasmithing &#8220;prior art&#8221; of Picalo and Resolver makes Python hard to beat.</p>
<p><strong>UPDATE Jan 2010:</strong></p>
<p>To experience the best of both worlds, VBA &amp; Python, my xLite (Excel combined with SQLite) datasmithing platform now allows Python to be used in conjunction with VBA.  <strong>Check it out here </strong><a href="http://www.gobansaor.com/xlite"><strong>http://www.gobansaor.com/xlite</strong></a></p>
<p><strong>UPDATE:</strong></p>
<p>Also, as Dan pointed out in the comments below, I&#8217;d not included <a href="http://www.jython.org">Jython</a> in my list of reasons for embracing Python. I must add it to my list of things to try out particularly as both my &#8220;classic&#8221; ETL tools, <a href="http://blog.gobansaor.com/2007/05/27/talend-vs-kettle-pentaho-pdi/">Talend and Kettle</a> are JVM based.</p>
<p>Another thing to add to the (ever growing) list is Mike Pittaro&#8217;s <a href="http://www.snaplogic.org">SnapLogic</a> Python-based ETL tool.  They have &#8230;</p>
<blockquote><p>&#8230;just released a 2.0 Beta version with some major architectural enhancements. The <a href="http://www.snaplogic.com">SnapLogic</a> model is very different from traditional ETL systems.  It takes an approach that&#8217;s more like the web, based on loose coupling and HTTP interactions.  We model data source, sinks, and transformations as URI addressable endpoints, and have a model where they can be chained together in pipelines to build transformation logic. We use a plugin architecture to make it easy to add custom components.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/04/11/python-the-new-vba/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		<georss:point>53.204039 -6.574340</georss:point>
		<geo:lat>53.204039</geo:lat>
		<geo:long>-6.574340</geo:long>
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

	</item>
		<item>
		<title>Zimki &#8211; the spirit lives on &#8230;</title>
		<link>http://blog.gobansaor.com/2007/12/13/zimki-the-spirt-lives-on/</link>
		<comments>http://blog.gobansaor.com/2007/12/13/zimki-the-spirt-lives-on/#comments</comments>
		<pubDate>Thu, 13 Dec 2007 17:36:24 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Web2.0]]></category>
		<category><![CDATA[news]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[AppJet]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Horuku]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[Rails]]></category>
		<category><![CDATA[RoR]]></category>
		<category><![CDATA[zimki]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/2007/12/13/zimki-the-spirt-lives-on/</guid>
		<description><![CDATA[Although Zimki is to shut down on Christmas Eve, the ideas behind the service live on. Two new offerings, Heroku and AppJet, offer variations on the idea of hosted application development/deployment. AppJet, funded by Paul Graham&#8217;s Y-Combinator, is very similar to Zimki, being a server-side JavaScript platform. No details yet as to what sort of [...]]]></description>
			<content:encoded><![CDATA[<p>Although <a href="http://blog.gobansaor.com/2007/09/25/zimki-rip/">Zimki is to shut down on Christmas Eve</a>, the ideas behind the service live on. Two new offerings, Heroku and AppJet, offer variations on the idea of hosted application development/deployment.</p>
<p><a href="http://appjet.com/">AppJet</a>, funded by <a href="http://www.paulgraham.com/">Paul Graham</a>&#8217;s Y-Combinator, is very similar to Zimki, being a server-side JavaScript platform. No details yet as to what sort of paid options will be offered (all accounts are free at the moment). Unlike Zimki, there are no plans to create an open-source version. I like the easy &#8220;<a href="http://appjet.com/docs/guide/facebook">build a Facebook app</a>&#8221; feature, and I guess these are the sort of lightweight applications that they hope to attract.</p>
<p>Although <a href="http://heroku.com/">Heroku</a> uses Ruby-on-Rails technology, rather than JavaScript, it is closer to the original Zimki idea; but rather than take the hard (and ultimately unsuccessful in Zimki&#8217;s case) road of building an open-source platform from scratch, Heroku takes an already popular open-source project and offers it wrapped in  a full on-line development and deployment environment.  Again, being in beta, there&#8217;s no indication as to what pricing model it will operate under, but I would think that it will attract more &#8220;serious&#8221; projects than AppJet since anything developed under Heroku is pure Rails which means it can be migrated to any other Rails hosting environment; so no lock-in.    The online editor is excellent and whatever about its merits as a hosting service it&#8217;s by far the easiest way to learn and explore Ruby and Rails, even easier than <a href="http://blog.gobansaor.com/2007/06/25/ror-data-warehouse-on-ec2/">this&#8230;</a></p>
<p>If Facebook apps are your goal but you wish to use Ruby rather than AppJet&#8217;s JavaScript, there&#8217;s no need to panic: this being Ruby, some bright young spark (no, not me I&#8217;m afraid) will already have done a lot of the <a href="http://blog.gobansaor.com/2007/07/19/facebook-apps-using-ruby-on-rails/">hard graft for you&#8230;</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/12/13/zimki-the-spirt-lives-on/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Ruby plus Amazon S3 &#8211; Document Centric Database</title>
		<link>http://blog.gobansaor.com/2007/11/06/ruby-plus-amazon-s3-document-centric-database/</link>
		<comments>http://blog.gobansaor.com/2007/11/06/ruby-plus-amazon-s3-document-centric-database/#comments</comments>
		<pubDate>Tue, 06 Nov 2007 20:32:36 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[EC2]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[S3]]></category>
		<category><![CDATA[Web2.0]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[CouchDb]]></category>
		<category><![CDATA[map reduce]]></category>
		<category><![CDATA[EU]]></category>
		<category><![CDATA[RDDB]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/2007/11/06/ruby-plus-amazon-s3-document-centric-database/</guid>
		<description><![CDATA[I&#8217;ve said it before and I&#8217;m going to repeat myself; learning Ruby has proven to be a great investment, not so much for the language itself but for the insights it gives into other technologies. As soon as a new &#8216;cool&#8217; technology or idea hits the street some smart Rubyist is bound to attack it, [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve <a href="http://blog.gobansaor.com/2007/07/19/facebook-apps-using-ruby-on-rails/">said it before</a> and I&#8217;m going to repeat myself; learning <a href="http://blog.gobansaor.com/2007/01/16/why-ruby/">Ruby</a> has proven to be a great investment, not so much for the language itself but for the insights it gives into other technologies.  As soon as a new &#8216;cool&#8217; technology or idea hits the street some smart Rubyist is bound to attack it, dice it up and serve it back up as easy to digest Ruby code.</p>
<p>Today, it&#8217;s the turn of  Document Centric Databases done in the style of <a href="http://blog.gobansaor.com/2007/09/14/couchdb-doucument-centric-ods/">CouchDB</a>, but replacing JavaScript/Erlang with Ruby and the bespoke data store with Amazon&#8217;s S3 service.</p>
<p><a href="http://www.anthonyeden.com/">Anthony Eden</a>&#8217;s <a href="http://rddb.rubyforge.org/">RDDB project</a> is still very much alpha, but looking through the code it appears to have lots of good ideas, including using EC2 instances as &#8220;<a href="http://labs.google.com/papers/mapreduce.html">map reduce workers</a>&#8221; listening on <a href="http://www.amazon.com/Simple-Queue-Service-home-page/b?ie=UTF8&amp;node=13584001">Amazon SQS Queues</a>; so the whole <a href="http://www.amazon.com/AWS-home-page-Money/b/ref=sc_fe_l_1/105-7856153-5728460?ie=UTF8&amp;node=3435361&amp;no=3435361&amp;me=A36L942TSJ2AJA">Amazon AWS stack</a> might yet get starring roles. The actual data store can be varied, with both partitioned file system and RAM based options currently available alongside S3.</p>
<p>In other Amazon AWS related news, today saw the announcement of an option to use European data centres to store S3 data (at a slightly higher charge than for North American locations, and with transfers of data between EU based S3 buckets and US based EC2 instances no longer free). I&#8217;m guessing that the option to fire up European based EC2 servers can&#8217;t be far behind. Also, one piece of news I&#8217;d missed was that EC2 is now in unlimited beta, i.e. it&#8217;s now open to all developers. So developers everywhere can, for less than the cost of a mobile text message, fire up their own dedicated and powerful Linux server. The day of a production ready, SLA backed, EC2 service is around the corner.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/11/06/ruby-plus-amazon-s3-document-centric-database/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Google Spreadsheets &#8211; ETL tool</title>
		<link>http://blog.gobansaor.com/2007/09/06/google-spreadsheets-etl-tool/</link>
		<comments>http://blog.gobansaor.com/2007/09/06/google-spreadsheets-etl-tool/#comments</comments>
		<pubDate>Thu, 06 Sep 2007 12:49:37 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[GoogleApps]]></category>
		<category><![CDATA[RSSBus]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[VBA]]></category>
		<category><![CDATA[Web2.0]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[google]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/2007/09/06/google-spreadsheets-etl-tool/</guid>
		<description><![CDATA[Although I&#8217;m a total Excel fanboy, I must admit I rarely use it any longer for personal stuff such as home budgets, tax calculations, what-ifs, to-do lists etc.; I now tend to use Google Spreadsheets. Likewise, personal notes, drafts and useful bits of code are stored using Google Docs rather than MS Word. Three main [...]]]></description>
			<content:encoded><![CDATA[<p>Although I&#8217;m a total Excel fanboy, I must admit I rarely use it any longer for personal stuff such as home budgets, tax calculations, what-ifs, to-do lists etc.; I now tend to use Google Spreadsheets. Likewise, personal notes, drafts and useful bits of code are stored using Google Docs rather than MS Word. Three main reasons for this shift to the cloud:</p>
<ul>
<li>Google Docs &amp; Spreadsheets are &#8216;good enough&#8217; for most of the trivial lists and calculations I require in my personal life and indeed for most business purposes as well, at least those that don&#8217;t require a pivot table.</li>
<li>These spreadsheets and documents are important, but not necessarily on the &#8216;state secret/I-could-tell-but-then-I&#8217;d-have-to-kill-you&#8217; scale of things; by building them in Google Apps they are securely backed-up and easily accessible.</li>
<li>A lot of the spreadsheets are collaborative in nature, and in the collaboration field, Google Spreadsheets just gets <a href="http://google-d-s.blogspot.com/2007/08/peek-boo-i-see-you-on-this-spreadsheet.html">better and better</a>.</li>
</ul>
<p>Today, <a href="http://jrsays.com/2007/09/bloggers-do-it-better-than-me.html">Google announced further additions</a> to their spreadsheet product. The <a href="http://docs.google.com/support/spreadsheets/bin/answer.py?answer=75509&amp;query=auto+fill&amp;topic=&amp;type=">AutoFill</a> feature adds functionality I&#8217;ve come to expect from Excel, but with a twist: integration with <a href="http://labs.google.com/sets">Google Sets</a>. But the additions that really caught my eye were the new <a href="http://docs.google.com/support/spreadsheets/bin/answer.py?answer=75507&amp;query=googlereader&amp;topic=&amp;type=">data import functions</a>. Now again, Excel has had web queries <a href="http://support.microsoft.com/kb/157482">since Excel97</a>, and it always amazed me that online pretenders to the throne tended to ignore the most common source of tabular data on the web, the HTML table; something to do with the great <a href="http://blog.gobansaor.com/2007/03/03/tables-vs-xml-the-data-lingua-franca-debate/">XML/Tables divide</a> I guess!</p>
<p>Google now not only fixes this omission, providing access to HTML tables and comma/tab separated files, but also provides access to RSS/ATOM and generic XML sources. All that&#8217;s missing now are functions that can read other common online data file formats such as Excel, MSAccess, <a href="http://en.wikipedia.org/wiki/Xbase">XBase</a> and of course SQLite.</p>
<p>This addition of HTML import support and the AutoFill feature will further reduce the number of times I&#8217;ll need to fire up Excel for personal tasks, but the RSS/ATOM/XML import feature also has potential as a tool in my micro-ETL toolbox. Using Excel as my only micro-ETL tool is possible when the data is either already in Excel/CSV or accessible via <a href="http://sapass.metro.client.jp/Sap_Active_X/SapFunction1.htm">a COM API</a> or via ODBC drivers; otherwise I can call in either <a href="http://blog.gobansaor.com/2006/12/28/ruby-and-sqlite-a-micro-etl-environment/">Ruby</a>, <a href="http://www.talend.com">Talend</a>, <a href="http://kettle.pentaho.org">Kettle</a> or even <a href="http://www.rssbus.com">RSSBus</a>. But now I&#8217;ve another option: if the data is public and published as RSS/ATOM or some other variation on XML, I can use Google Spreadsheets to fetch the data and import the resulting tabular dataset into Excel via a Web Query or <a href="http://blog.gobansaor.com/2007/04/18/whats-up-docs-spreadsheets/">via the GData API.</a></p>
<p><a href="http://gobansaor.files.wordpress.com/2007/09/google-reader-search.jpg" title="New Google Reader Search facility"><img src="http://gobansaor.files.wordpress.com/2007/09/google-reader-search.jpg?w=500" alt="New Google Reader Search facility" align="left" /></a>One other thing.  While researching this post, looking up links etc. I used another <a href="http://googlesystem.blogspot.com/2007/09/google-reader-adds-search.html">new feature Google added today</a>,  Google Reader&#8217;s new Search facility.  As most of my references are discovered via the blogs I subscribe to, the ability to restrict searches to that subset of the web is fantastic; I even used it to search through my own blog posts!  If <a href="http://del.icio.us">del.icio.us</a> offered the same option it would make re-finding stuff even easier.  I did try to use <a href="http://www.google.com/coop/">Google Co-Op</a> to build a search engine restricted to my <a href="http://del.icio.us/gobansaor">del.icio.us links</a> but it didn&#8217;t seem to like the volume of links (4000 odd) I sent it.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/09/06/google-spreadsheets-etl-tool/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

		<media:content url="http://gobansaor.files.wordpress.com/2007/09/google-reader-search.jpg" medium="image">
			<media:title type="html">New Google Reader Search facility</media:title>
		</media:content>
	</item>
		<item>
		<title>Facebook Apps using Ruby on Rails</title>
		<link>http://blog.gobansaor.com/2007/07/19/facebook-apps-using-ruby-on-rails/</link>
		<comments>http://blog.gobansaor.com/2007/07/19/facebook-apps-using-ruby-on-rails/#comments</comments>
		<pubDate>Thu, 19 Jul 2007 12:15:03 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Web2.0]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Rails]]></category>
		<category><![CDATA[recipes]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/2007/07/19/facebook-apps-using-ruby-on-rails/</guid>
		<description><![CDATA[This is what I love about Ruby and Ruby on Rails, once you learn the basics of Ruby and how a RoR app is put together you can use this knowledge to learn about other technologies, in this case Facebook Applications. The reason for this is, as soon as a new technology hits the street [...]]]></description>
			<content:encoded><![CDATA[<p>This is what I love about <a href="http://www.ruby-lang.org/en/">Ruby</a> and <a href="http://www.rubyonrails.org/">Ruby on Rails</a>: once you learn the basics of Ruby and how a RoR app is put together, you can use this knowledge to learn about other technologies, in this case <a href="http://developers.facebook.com/anatomy.php">Facebook Applications</a>.  The reason is that as soon as a new technology hits the street, somebody is bound to build either a Ruby library or a RoR plugin targeting the new (and presumably cool) platform.  In two excellent articles <a href="http://www.liverail.net/pages/about-the-author">Stuart Eccles</a> shows <a href="http://www.liverail.net/articles/2007/6/29/tutorial-on-developing-a-facebook-platform-application-with-ruby-on-rails">how to build a Facebook app</a> using <a href="http://rubyforge.org/projects/rfacebook/">the rFacebook Rails extension</a>.  I guess I could have looked at a PHP or Java example, but I chose (as I nearly always do) the Rails route: the layers of abstraction and the standard infrastructure of RoR apps let me quickly get an overview of the new technology in action and also, if I so desire, dive deep into anything that requires more detailed investigation.</p>
<p>In the late nineties I learned Java (and later .NET) for a similar purpose, not as a primary development tool but because of its role as the &#8220;language of account&#8221; of most new technologies at the time.  And although I still come across environments where the only examples are in Java (or .NET or PHP), I know that if I wait a few weeks some bright Rubyist will eventually &#8220;document&#8221; it in either Ruby or Rails.</p>
<p>Using the <a href="http://apps.facebook.com/socialrecipe/">sample application</a> built by Stuart I&#8217;ve added one of my favourite salad recipes,<br />
<a href="http://apps.facebook.com/socialrecipe/recipes/show/7">Banana &amp; Tomato in a Mustard Dressing</a><br />
..enjoy.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/07/19/facebook-apps-using-ruby-on-rails/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Like Excel macros? You&#8217;ll love this..</title>
		<link>http://blog.gobansaor.com/2007/07/17/like-excel-macros-youll-love-this/</link>
		<comments>http://blog.gobansaor.com/2007/07/17/like-excel-macros-youll-love-this/#comments</comments>
		<pubDate>Tue, 17 Jul 2007 13:57:18 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[GoogleApps]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[VBA]]></category>
		<category><![CDATA[excel]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/2007/07/17/like-excel-macros-youll-love-this/</guid>
		<description><![CDATA[One of the most used (and abused) features of Excel is its macro recording facility. How many mundane and repetitive actions have been automated using this feature? How many people found the courage to program in VBA by using the recorder as their training-wheels? Well now iMacros (from German company iOpus GmbH) a Firefox extension [...]]]></description>
			<content:encoded><![CDATA[<p>One of the most used (and abused) features of Excel is its macro recording facility.  How many mundane and repetitive actions have been automated using this feature?  How many people found the courage to program in VBA by using the recorder as their training wheels?  Well, now <a href="http://www.iopus.com/imacros/firefox/">iMacros</a> (from German company iOpus GmbH), available as a <a href="http://www.iopus.com/download/imacros-firefox/">Firefox extension</a> and an <a href="http://www.iopus.com/download/imacros-ie/">IE add-on</a>, brings macro recording to the web browser.  Although <a href="http://www.greasespot.net/">Greasemonkey</a> already enables JavaScript programmers to automate Firefox, iMacros offers the same power to non-programmers.</p>
<p>The macros can be saved as bookmarks and can also be shared with others (although I couldn&#8217;t get the sharing feature to work).  I&#8217;ve already automated a number of tedious tasks and had hoped to use it with Google Spreadsheets; alas, either the nature of Google&#8217;s JavaScript or inbuilt protection within Google Apps stopped the recorder from working.  Pity, as there are three things missing from Google Spreadsheets as it stands: pivot table support, offline ability and macro support.  <a href="http://gears.google.com/">Google Gears</a> will undoubtedly solve the offline problem; charts are essentially a graphical pivot, so I guess a table pivot must be a possibility; and iMacros or something like it could perform the duties that VBA provides to Excel.  As with Office macros, security issues may well dampen the parade &#8211; iMacros allows access to the PC&#8217;s file system and a macro can be invoked from a bookmarklet camouflaged as a standard link &#8211; but I&#8217;ll not worry until (or if) the product goes mainstream.</p>
<p>From an ETL point of view iMacros can act as a powerful web scraping tool and as an automated form-filler.  There&#8217;s also a commercial version of the <a href="http://www.iopus.com/store/">product ($499.00)</a> that exposes the tool via an ActiveX API, which means <a href="http://www.iopus.com/imacros/excel.htm">Excel/VBA can be used as a web scraping/form filling environment</a>.  If the price tag is too steep, the excellent <a href="http://scrubyt.org/">scRUBYt!</a> is free and open source and is ideal if you&#8217;re scraping a lot of data on a frequent basis, while for small or one-off tasks this equally free and open source Firefox extension is good enough.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/07/17/like-excel-macros-youll-love-this/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>RoR Data Warehouse on EC2</title>
		<link>http://blog.gobansaor.com/2007/06/25/ror-data-warehouse-on-ec2/</link>
		<comments>http://blog.gobansaor.com/2007/06/25/ror-data-warehouse-on-ec2/#comments</comments>
		<pubDate>Mon, 25 Jun 2007 17:35:23 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[EC2]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[Rails]]></category>
		<category><![CDATA[AmazonAWS]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/2007/06/25/ror-data-warehouse-on-ec2/</guid>
		<description><![CDATA[If you&#8217;ve been putting off evaluating Ruby on Rails and you&#8217;re lucky enough to have an Amazon EC2 beta account then it&#8217;s your lucky day. Paul Dowman has just made a public AMI (think of it like a virtual machine spec from which you can create a running EC2 instance) with various Ruby on Rails [...]]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;ve been putting off evaluating <a href="http://www.rubyonrails.org/">Ruby on Rails</a> and you&#8217;re lucky enough to have an <a href="http://aws.amazon.com/ec2">Amazon EC2</a> beta account then it&#8217;s your lucky day.  <a href="http://pauldowman.com/projects/ruby-on-rails-ec2/">Paul Dowman has just made a public AMI</a> (think of it like a virtual machine spec from which you can create a running EC2 instance) with various Ruby on Rails goodies preloaded.</p>
<blockquote><p>Features:</p>
<ul>
<li>Automatic backup of MySQL database to S3 every 10 minutes.</li>
<li>Mongrel_cluster behind Apache 2.2, configured according to <a href="http://blog.codahale.com/2006/06/19/time-for-a-grown-up-server-rails-mongrel-apache-capistrano-and-you/">Coda Hale’s excellent guide</a>, with /etc/init.d startup script</li>
<li>Ruby on Rails 1.2.3</li>
<li>Ruby 1.8.5</li>
<li>MySQL 5</li>
<li>Ubuntu 7.04 Feisty with <a href="http://wiki.xensource.com/xenwiki/XenSpecificGlibc">Xen versions of standard libs</a> (<a href="http://packages.ubuntu.com/feisty/libs/libc6-xen">libc6-xen</a> package).</li>
<li>All EC2 command-line tools installed</li>
<li>MySQL and Apache configured to write logs to /mnt/log so you don’t fill up EC2’s small root filesystem</li>
<li>Hostname set correctly to public hostname</li>
<li>NTP</li>
<li>A script to re-bundle, save and register your own copy of this image in one step (if you want to).</li>
</ul>
</blockquote>
<p><a href="http://gobansaor.wordpress.com/2007/02/04/build-a-data-warehouse-using-ruby-on-rails/">I&#8217;ve been meaning to try out Anthony Eden&#8217;s RoR based data warehousing tool</a> for some time; no more excuses, as I can now fire up an EC2 instance based on Paul&#8217;s AMI, install the <a href="http://activewarehouse.rubyforge.org/">ActiveWarehouse</a> plugin and away I go.  As ActiveWarehouse primarily uses techniques described in <a href="http://www.amazon.com/Data-Warehouse-Toolkit-Complete-Dimensional/dp/0471200247" title="Second Edition">The Data Warehouse Toolkit</a>, it&#8217;s also a good learning tool for those new to data warehousing.  All I need now is a sizeable, publicly accessible dataset to populate the warehouse and get a true feel for its capabilities.  There&#8217;s only so much you can do with the <a href="http://www.microsoft.com/downloads/details.aspx?FamilyID=c6661372-8dbe-422b-8676-c632d66c529c&amp;DisplayLang=en">venerable Northwind</a> database.  Does anybody know of a &#8216;beefier&#8217; alternative?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/06/25/ror-data-warehouse-on-ec2/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Talend vs. Kettle (Pentaho PDI)</title>
		<link>http://blog.gobansaor.com/2007/05/27/talend-vs-kettle-pentaho-pdi/</link>
		<comments>http://blog.gobansaor.com/2007/05/27/talend-vs-kettle-pentaho-pdi/#comments</comments>
		<pubDate>Sun, 27 May 2007 16:40:22 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[JavaFX]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Palo]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[xLite]]></category>
		<category><![CDATA[update]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/2007/05/27/talend-vs-kettle-pentaho-pdi/</guid>
		<description><![CDATA[Over the last few weeks I&#8217;ve received a lot of traffic from Google searches comparing Talend and Kettle and also from Vincent McBurney&#8217;s ITtoolbox article comparing the two products, so where do I stand? As ETL tools they take different approaches, Kettle is a metadata-driven framework (which is in turn tightly integrated into [...]]]></description>
			<content:encoded><![CDATA[<p>Over the last few weeks I&#8217;ve received a lot of traffic from Google searches comparing Talend and Kettle, and also from <a href="http://www.ittoolbox.com/profiles/vmcburney">Vincent McBurney&#8217;s</a> <a href="http://blogs.ittoolbox.com/bi/websphere/archives/wiki-wednesday-comparing-talend-and-pentaho-kettle-open-source-etl-tools-16294">ITtoolbox article</a> comparing the two products, so where do I stand?</p>
<p>As ETL tools they take different approaches: Kettle is a metadata-driven framework (which is in turn tightly integrated into an even larger BI framework, the <a href="http://blogs.ittoolbox.com/bi/websphere/archives/wiki-wednesday-comparing-talend-and-pentaho-kettle-open-source-etl-tools-16294">Pentaho BI Project</a>), while Talend is, at the end of the day, a code generator.  The code generator nature of Talend does impart certain capabilities, such as the ability to integrate easily into other BI platforms (such as <a href="http://www.jaspersoft.com/">JasperSoft</a> and <a href="http://spagobi.eng.it">SpagoBI</a>) and to serve as a <a href="http://gobansaor.wordpress.com/2006/12/28/ruby-and-sqlite-a-micro-etl-environment/">micro-ETL</a> tool.  On the other hand, Kettle&#8217;s increasing integration into the Pentaho family makes it a natural choice for larger BI implementations.  So which one would I recommend?  It&#8217;s got to be Kettle, and the main reason for that choice is the Pentaho PDI <a href="http://forums.pentaho.org/forumdisplay.php?f=69">community</a>.</p>
<p>My experience so far with the <a href="http://www.talendforge.org/forum/">Talend community</a> has not been good.  Admittedly I&#8217;ve only asked two questions: one looking for more details on a statement regarding <a href="http://www.talendforge.org/forum/viewtopic.php?id=660">ERP and CRM connectors</a> mentioned in <a href="http://www.talend.com/press/java-data-integration.php">a press release</a>, and the other a technical query about a new <a href="http://www.talendforge.org/forum/viewtopic.php?id=721">tSQLite connector</a>.  Neither was answered.  OK, I&#8217;m a big boy, I can handle the rejection, and I&#8217;m also technically savvy; I can eventually fix most of my own problems.  But I&#8217;m sure that if I had posted similar questions on the Pentaho forums I would have received an answer, probably from <a href="http://www.ibridge.be/">Matt Casters</a> if he thought the general community would not be able to provide one.</p>
<p>So will this experience stop me from using Talend?  Maybe not; the code generation capability may come in handy as a micro-ETL tool (but I do have <a href="http://gobansaor.wordpress.com/projects/xlite/">other options</a>).  However, through an offline <a href="http://gobansaor.wordpress.com/2007/04/26/talend-etl-a-new-contender/#comment-1384">conversation</a> with <cite>Samatar</cite> (an <a href="http://forums.pentaho.org/search.php?searchid=74414">active member of the Kettle community</a>) and Matt Casters I learned of a new XML input step and of a <a href="http://kettle.pentaho.org/">PDI</a> alternative to the Talend tJavaFlex component, the &#8220;Modified Java Script Value&#8221; step.  This allows for the creation of START, TRANSFORM (i.e. for-each-row) and END JavaScript scripts.  As Kettle uses the Rhino scripting engine, I have access to any Java API (e.g. <a href="http://www.jpalo.net">Palo</a>, <a href="http://www.bluishcoder.co.nz/2006/05/amazon-s3.html">S3</a>, <a href="http://code.google.com/apis/spreadsheets/developers_guide_protocol.html">Google Spreadsheets</a>) but with the flexibility of JavaScript; <a href="http://www.bbc.co.uk/comedy/fastshow/characters/brilliant.shtml">brilliant</a>!</p>
<p>Not only that: since my experience with Talend&#8217;s tJavaFlex reintroduced me to the world of Java, I thought I might as well look into creating my own Kettle plugin and, guess what, it looks easy enough.  So I&#8217;m going to develop my first plugin, based on the ScriptValueMod step but replacing the Rhino engine with the Java 6 <a href="http://java.sun.com/javase/6/docs/api/javax/script/ScriptEngineManager.html">ScriptEngineManager</a>.  This will give me access to the latest version of Rhino, which supports E4X (<a href="http://www.devx.com/webdev/Article/33393/0/page/1">will make handling XML much easier</a>), and also opens the possibility of using <a href="http://gobansaor.wordpress.com/2007/05/19/javafx-a-gui-dsl/">JavaFX</a> or indeed <a href="http://jruby.codehaus.org/">JRuby</a> as Kettle scripting languages.</p>
<p>UPDATE:</p>
<p>I&#8217;ve received <a href="http://talendforge.org/forum/viewtopic.php?pid=3146#p3146">a reply</a> to my tSQLite question!</p>
<p>UPDATE: July 2008</p>
<p>This post is now over a year old.  Talend, as a product and as a community, has improved enormously in the last 12 months; so much so that I now use Talend (Java) in preference to PDI for most of my datasmithing needs.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/05/27/talend-vs-kettle-pentaho-pdi/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>JavaFX &#8211; a GUI DSL</title>
		<link>http://blog.gobansaor.com/2007/05/19/javafx-a-gui-dsl/</link>
		<comments>http://blog.gobansaor.com/2007/05/19/javafx-a-gui-dsl/#comments</comments>
		<pubDate>Sat, 19 May 2007 15:12:26 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[JavaFX]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[VBA]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/2007/05/19/javafx-a-gui-dsl/</guid>
		<description><![CDATA[Having mastered JavaScript (OK master is too strong a word &#8211; having become comfortable with both its syntax and usage patterns) my next port of call is JavaFX the recently announced Flash/Silverlight competitor. What led me to JavaFX Script was not its role in this Flash/AJAX alternative platform (which unless Sun improves the JRE download [...]]]></description>
			<content:encoded><![CDATA[<p>Having <a href="http://gobansaor.wordpress.com/2007/05/12/javascript-101/">mastered JavaScript</a> (OK, master is too strong a word &#8211; having become comfortable with both its syntax and usage patterns), my next port of call is <a href="http://www.sun.com/software/javafx">JavaFX</a>, the recently announced Flash/<a href="http://blogs.zdnet.com/microsoft/?p=426">Silverlight competitor</a>.  What led me to JavaFX Script was not its role in this Flash/AJAX alternative platform (<strike>which unless Sun improves the JRE download experience is dead in the water</strike> they may have a chance with the <a href="http://weblogs.java.net/blog/enicholas/archive/2007/05/announcing_the.html">promised consumer JRE</a>) but its status as the second scripting language to be supported by the Java 6 <a href="http://java.sun.com/javase/6/docs/api/javax/script/ScriptEngineManager.html">ScriptEngineManager</a> &#8211; the first being JavaScript.  Although the JavaFX platform is still in alpha, the important elements of the scripting language appear to be usable.</p>
<p>Why do I need another scripting language?  Well, JavaFX Script is a DSL (<a href="http://en.wikipedia.org/wiki/Domain-specific_language">domain-specific language</a>), the domain in question is the GUI, and as a GUI DSL JavaFX is impressive.</p>
<p>In my quest for a micro-ETL environment, the lack of a fast and powerful GUI tool has been a problem.  Admittedly, in my Excel/SQLite <a href="http://gobansaor.wordpress.com/projects/xlite/">xLite</a> suite GUI generation is not a problem, as VBA forms are both fast to develop in and feature rich, but the other two options (Java and Ruby) both lack a cost-effective GUI tool (I know there&#8217;s Swing and <a href="http://www.rubycentral.com/book/ext_tk.html">Tk</a>, but for somebody who&#8217;s used to the speed of development of tools such as VB forms or <a href="http://www.oracle.com/technology/products/database/application_express/index.html">Oracle&#8217;s Application Express</a>, neither appealed).  JavaFX changes all that: it&#8217;s elegant, powerful, fast to develop in and portable.  It pushes me further down the road of <a href="http://gobansaor.wordpress.com/2007/04/30/ive-got-talend-and-im-going-to-use-it/">using Talend</a> generated Java as my <a href="http://gobansaor.wordpress.com/2006/12/28/ruby-and-sqlite-a-micro-etl-environment/">micro-ETL environment</a> of choice.</p>
<p>It&#8217;s easy to dismiss JavaFX as hype, but the technology behind the hype looks sound.  The language borrows powerful features from other languages: object literals from JavaScript, list comprehensions from Python/Erlang/Haskell and first-class functions from JavaScript/Lisp, all combined with the full power and glory of Java.  The end result:</p>
<blockquote><p>..is a declarative and statically typed programming language. It has first-class functions, declarative syntax, list-comprehensions, and incremental dependency-based evaluation. The JavaFX Script language makes intensive use of the Java2D swing GUI components and allows for easy creation of GUIs.</p></blockquote>
<p>Go to the <a href="https://openjfx.dev.java.net/">community page</a> for FAQs, tutorials and <a href="https://openjfx.dev.java.net/JavaFX_Programming_Language.html">reference information</a>.  <a href="http://www.oreillynet.com/pub/au/1738">Timothy O&#8217;Brien</a> has a good <a href="http://www.oreillynet.com/onjava/blog/2007/05/javafx_first_steps_hello_onjav_1.html">first pass</a> at calling JavaFX from a Java program, and <a href="http://www.onjava.com/pub/a/onjava/2006/04/26/mustang-meets-rhino-java-se-6-scripting.html">this article</a> by <a href="http://www.oreillynet.com/pub/au/2429">John Smart</a> explains the basics of using ScriptEngineManager to call JavaScript.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/05/19/javafx-a-gui-dsl/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>VBx &#8211; the future VBA?</title>
		<link>http://blog.gobansaor.com/2007/05/02/vbx-the-future-vba/</link>
		<comments>http://blog.gobansaor.com/2007/05/02/vbx-the-future-vba/#comments</comments>
		<pubDate>Wed, 02 May 2007 17:55:48 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[VBA]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/2007/05/02/vbx-the-future-vba/</guid>
		<description><![CDATA[With the future of VBA being a concern for many Office professionals, some of the  MIX07 announcements around dynamic language support in Silverlight may shed some light on a potential replacement for the VBA, VBx. There&#8217;s life in the old dog yet, it may yet get to share the limelight with its cooler cousins Ruby [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=275&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<p>With the future of VBA a <a href="http://smurfonspreadsheets.wordpress.com/2007/05/01/more-vba-wonderings/">concern for many Office professionals</a>, some of the <a href="http://visitmix.com/">MIX07</a> announcements around <a href="http://www.hanselman.com/blog/PuttingMixSilverlightTheCoreCLRAndTheDLRIntoContext.aspx">dynamic language support in Silverlight</a> may shed some light on a potential replacement for VBA: <strong><a href="http://www.panopticoncentral.net/archive/2007/05/01/20383.aspx">VBx</a></strong>.</p>
<p>There&#8217;s life in the old dog yet; it may even get to share the limelight with its cooler cousins, Ruby and Python!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/05/02/vbx-the-future-vba/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
	</channel>
</rss>