<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Gobán Saor</title>
	<atom:link href="http://blog.gobansaor.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.gobansaor.com</link>
	<description>A country datasmith.</description>
	<lastBuildDate>Sun, 05 Jul 2009 16:49:15 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<image>
		<url>http://www.gravatar.com/blavatar/67e164f5d51c2b3115a7819b84505c13?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>Gobán Saor</title>
		<link>http://blog.gobansaor.com</link>
	</image>
			<item>
		<title>Palo HTTP API via Excel/VBA</title>
		<link>http://blog.gobansaor.com/2009/06/22/palo-http-api-via-excelvba/</link>
		<comments>http://blog.gobansaor.com/2009/06/22/palo-http-api-via-excelvba/#comments</comments>
		<pubDate>Mon, 22 Jun 2009 16:43:18 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[Palo]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[VBA]]></category>
		<category><![CDATA[Palo HTTP API]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/?p=719</guid>
		<description><![CDATA[As a result of a request on the Palo support forum last week looking for a VBA tool to directly access the Palo OLAP server via its native HTTP API, I realised I had such a tool. I had built it about a year ago (to use alongside Fiddler Web Debugger and .NET Reflector) to help me understand [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>As a result of a request on the Palo support forum last week looking for a VBA tool to directly access the <a href="http://www.jedox.com/en/home/overview.html">Palo OLAP server</a> via its native HTTP API, I realised I had such a tool. I had built it about a year ago (to use alongside <a href="http://www.fiddler2.com/fiddler2/">Fiddler Web Debugger</a> and <a href="http://www.red-gate.com/products/reflector/">.NET Reflector</a>) to help me understand in detail the interaction between the Excel client and Palo. (Remember this was before <a href="http://blog.gobansaor.com/2009/06/12/palo-bi-suite-community-edition/">Jedox released the Excel add-in as open source</a>)</p>
<p>I removed the SQLite-dependent parts (which allowed me to load meta-data into tables and analyse it using SQL) and used a pure-VBA MD5 hash routine to reduce the number of moving parts. I also added a few VBA helper routines (including an array-formula UDF to allow direct calling of the API from the Excel UI).</p>
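<p>To illustrate why an MD5 routine is needed at all, here is a minimal Python sketch (not the VBA code itself) of building a login request with a hashed password. The <code>/server/login</code> path, the <code>user</code> and <code>password</code> parameter names, port 7777 and the semicolon-delimited reply format are assumptions about the Palo HTTP API and should be checked against the server documentation; the function names are hypothetical.</p>

```python
import hashlib
from urllib.parse import urlencode


def palo_login_url(host, port, user, password):
    """Build a login URL with the password sent as an MD5 hex digest.

    Assumption: the server exposes /server/login and expects
    'user' and 'password' query parameters.
    """
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    query = urlencode({"user": user, "password": digest})
    return f"http://{host}:{port}/server/login?{query}"


def parse_palo_line(line):
    """Split one semicolon-delimited response line into its fields.

    Assumption: fields end with ';' and contain no quoted semicolons.
    """
    return line.rstrip(";\n").split(";")


url = palo_login_url("localhost", 7777, "admin", "admin")
fields = parse_palo_line("AbCd;3600;\n")  # made-up reply: session id, timeout
```

<p>The request itself could then be sent with <code>urllib.request.urlopen(url)</code>; it is left out here so the sketch runs without a server.</p>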
<p>As I said, I originally built it as a learning aid but I&#8217;ve started to look at it again in the context of an Excel/SQLite Palo ETL add-on.  Having the flexibility of the native API and my own meta-data cache in SQLite might prove very useful.</p>
<p><a href="http://blog.gobansaor.com/projects/excel-vba-palo-http-api/">The code may be found here</a>.  Enjoy!</p>
<p style="text-align:right;"><em>Why not join me on Twitter at </em><a style="text-decoration:none;color:#265e15;border-bottom-color:#996633;border-bottom-width:1px;border-bottom-style:dashed;margin:0;padding:0;" href="http://www.twitter.com/gobansaor"><em>gobansaor</em></a><em>?</em></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2009/06/22/palo-http-api-via-excelvba/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Palo BI Suite Community Edition</title>
		<link>http://blog.gobansaor.com/2009/06/12/palo-bi-suite-community-edition/</link>
		<comments>http://blog.gobansaor.com/2009/06/12/palo-bi-suite-community-edition/#comments</comments>
		<pubDate>Fri, 12 Jun 2009 14:08:18 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[GoogleApps]]></category>
		<category><![CDATA[Groovy]]></category>
		<category><![CDATA[Palo]]></category>
		<category><![CDATA[olap]]></category>
		<category><![CDATA[Community Edition]]></category>
		<category><![CDATA[Palo BI Suite]]></category>
		<category><![CDATA[Worksheet Server]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/?p=709</guid>
		<description><![CDATA[Jedox have finally published a roadmap for the Palo BI Suite Community Edition, having caused considerable confusion by pre-announcing its availability last April. See here for the details.  The headline dates are, a beta version due 1st of July with a Release Candidate due 1st September.
Although the announcement in April was essentially vapour-ware (no Worksheet [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Jedox have finally published a roadmap for the Palo BI Suite Community Edition, having <a href="http://www.jedox.com/community/palo-forum/thread.php?threadid=1589&amp;hilight=community+edition">caused considerable confusion</a> by pre-announcing its availability last April. <a href="http://www.jedox.com/community/palo-forum/thread.php?threadid=1671">See here for the details</a>.  The headline dates are a beta version due on the 1<sup>st</sup> of July, with a Release Candidate due on the 1<sup>st</sup> of September.</p>
<p>Although the announcement in April was essentially vapour-ware (no Worksheet Server V3, no Amazon EC2 images), one very significant actual deliverable was the addition to <a href="http://palo.svn.sourceforge.net/">SourceForge</a> of the Palo Excel Add-in sources (GPL licence).  This, at least for me, is very welcome as it now means that Palo is truly open source.  Prior to this, the server sources were GPL’d but not the main front-end tool used by the vast majority of end-users. In fact, the deep Excel integration offered by the Add-In is Palo’s main attraction to the business-focused “datasmiths” who make up the bulk of the product&#8217;s user-base.</p>
<p>The GPL’ing of the Add-in has removed the last barrier that had stopped me, as an independent consultant, from committing to the platform.  While the ‘freeness’ of open source is nice, it’s the source that really attracts me.  With access (and rights) to the source, I have no worries that the terms of use or indeed the product’s core functionality can be arbitrarily changed.  Having the source also means I can delve as deep or as shallow as I need to into the inners of the product, improving my understanding of the technology (both bugs and functionality) as needs dictate.</p>
<p>What has Palo BI Suite to offer, besides being open source?  Well, even if Jedox’s offering consisted of mediocre products, being open source as I explained above is in itself a huge advantage. Having an agnostic FOSS pivot engine that can be shared across many technologies, from Excel to Open Office to a PHP based website, is extremely useful.</p>
<p>However, Jedox’s BI suite is far from mediocre.  Palo is now a very polished and powerful in-memory MOLAP server with excellent integration with Excel (through the Add-In, or if you take out a support contract, via ODBO/MDX powered Pivot Tables).  The addition of a browser delivered spreadsheet (Worksheet Server V3) will add significantly to the product’s street appeal.  Version 3 differs significantly from previous WSS versions; being open source is one, but the entire product was also completely redesigned to meet the challenge posed by web-based products from the likes of <a href="http://sheet.zoho.com/">Zoho</a>, <a href="http://www.editgrid.com/">EditGrid</a> and of course Google Docs (not to mention the ever-present threat of a MS response). Web-based spreadsheets are becoming a commodity, so Jedox’s response was to open source the product but at the same time make it more usable for real-world business analytics.</p>
<p>Current browser-delivered spreadsheets suffer from two shortcomings:</p>
<ul>
<li>Spreadsheets with large numbers of inter-related cells (typical of business models) tend to perform poorly, in many cases being unusable compared with Excel or Open Office.</li>
<li>They are only available as hosted SaaS; this is not a major problem for some businesses, but for others, services outside the corporate firewall, especially for sensitive information such as what-if, budgeting and sales analysis models, are a no-no.</li>
</ul>
<p>WSS V3 gets around both problems.  Performance is improved by the use of Palo as the spreadsheet’s pivot engine but also by the “lazy calculation” of related cells, i.e. a cell that’s not visible, and itself not yet referenced by other visible cells, remains uncalculated, saving on the constant churning that can affect large models.  This, combined with a DynaRange concept, means templates and models react dynamically and efficiently to the ever-changing datasets being presented to the sheets from the Palo OLAP server.   The look’n’feel is very similar to Excel, with even array formulae being fully supported.</p>
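<p>The lazy-calculation idea can be sketched in a few lines of Python. This is a toy model of the general technique only, not Worksheet Server code; the <code>LazySheet</code> class and its methods are hypothetical names, and there is no cycle detection or fine-grained invalidation.</p>

```python
class LazySheet:
    """Toy spreadsheet: a formula runs only when its value is requested."""

    def __init__(self):
        self.formulas = {}    # cell name -> function taking the sheet
        self.cache = {}       # cell name -> computed value
        self.evaluations = 0  # how many formulas have actually run

    def set(self, name, formula):
        self.formulas[name] = formula
        self.cache.clear()    # crude invalidation: forget every cached value

    def get(self, name):
        if name not in self.cache:
            self.evaluations += 1
            self.cache[name] = self.formulas[name](self)
        return self.cache[name]

    def render(self, visible):
        """Compute only the visible cells and whatever they reference."""
        return {name: self.get(name) for name in visible}


sheet = LazySheet()
sheet.set("A1", lambda s: 10)
sheet.set("A2", lambda s: s.get("A1") * 2)
sheet.set("Z99", lambda s: sum(range(1_000_000)))  # off-screen and expensive
shown = sheet.render(["A2"])
```

<p>Rendering <code>A2</code> pulls in <code>A1</code> because it is referenced, while the off-screen <code>Z99</code> is never evaluated; that skipped work is where the saving on large models would come from.</p>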
<p>The second problem, the need for behind-the-firewall hosting, is solved by the open source GPL licence and by the front-end being coded in PHP (very approachable to most in-house support staff and even the <a href="http://twitter.com/martynshiner/">odd accountant</a>).  The core (the bit not yet released) is, as far as I know, C++, so is likely to join Palo Server as being highly efficient and well engineered but perhaps beyond the technical skills of most.</p>
<p>The other elements to the BI Suite are the web-based OLAP-centric ETL Server (now, I see, with Groovy and Javascript scripting support) and the Supervision Server (only in paid Enterprise version) which offers fine-tuned access control and monitoring, plus drill-through functionality from Palo cells, back to the ETL fact tables. The Enterprise Edition also offers a multi-core version of the Palo server along with SAP and ODBO/MDX connectivity.</p>
<p>If multi-dimensional analysis and budgeting could benefit your business and spreadsheets are your preferred method of communicating and working with such analysis, you need to check <a href="http://www.jedox.com/en/products/Overview.html">this out</a>.  Palo is a well kept secret (at least outside of Germany), hardly ever mentioned by the mainstream BI community, but don’t let that put you off; this is one of the best solutions out there, it&#8217;s open source but also comes with the backup of a professional company that can offer not just technical support but also implementation know-how (Jedox eats its own dog-food, being both a BI consultancy and development house).</p>
<p><strong>Update July 4th 2009:</strong></p>
<p>The beta Community Edition <a href="http://www.jedox.com/en/products/palo_olap_server/download.html">is now available</a>.   I downloaded and installed WSS V3 and gave it a quick test-drive; it looks good: the Palo interface has the look&#8217;n&#8217;feel of the Excel Add-in and the general spreadsheet functionality is very Excel-like, including CTRL-Shift-Enter to assign array formulae.  Overall, the Palo BI suite offers an intuitive, end-user-friendly interface; from download to effective use in less than 60 minutes, how many BI tools could you say that about?</p>
<p>Also, <a href="http://www.paloinsider.com/?p=72">in two weeks&#8217; time</a> a pivot-table friendly <a href="http://en.wikipedia.org/wiki/OLE_DB_for_OLAP">ODBO driver</a> will be included for free with the Palo Excel Add-in (previously only available to those with a Jedox support contract).</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2009/06/12/palo-bi-suite-community-edition/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>LiteBI, Heavy ETL</title>
		<link>http://blog.gobansaor.com/2009/04/24/litebi-heavy-etl/</link>
		<comments>http://blog.gobansaor.com/2009/04/24/litebi-heavy-etl/#comments</comments>
		<pubDate>Fri, 24 Apr 2009 12:27:57 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[olap]]></category>
		<category><![CDATA[LiteBI]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/?p=686</guid>
		<description><![CDATA[Although my major BI interest is in micro-BI (or is that workgroup-BI?), i.e. data, perhaps cleansed and packaged elsewhere, available locally on a datasmith&#8217;s PC, with most likely an in-memory OLAP as the analysis tool, the possibilities of the &#8220;cloud&#8221; as a BI platform have not escaped me.
From a micro-BI perspective, the ability to act as a [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Although my major BI interest is in micro-BI (or is that workgroup-BI?), i.e. data, perhaps cleansed and packaged elsewhere, available locally on a datasmith&#8217;s PC, with most likely an in-memory OLAP as the analysis tool, the possibilities of the &#8220;cloud&#8221; as a BI platform have not escaped me.</p>
<p>From a micro-BI perspective, the cloud&#8217;s ability to act as a backup/mirroring tool or as an ETL/marshaling tool (anybody for Hadoop and SQLite?) is attractive. I&#8217;ve yet to make up my mind on BI delivered as a cloud <a href="http://en.wikipedia.org/wiki/Platform_as_a_service">PaaS</a> but obviously <a href="http://jeromepineau.blogspot.com/2009/04/on-demand-bi-beyond-smb.html">many others believe it has a future</a>.</p>
<p>My main worry with PaaS is not lock-in (which exists equally for in-house proprietary solutions) but the dangers of a <a href="http://www.accmanpro.com/tag/coghead/">Coghead-like lock-out</a>.  My other doubts are more technical: I believe that in-memory offers significant advantages over traditional ROLAP (simplicity being the main one) and that multi-tenant in-memory architectures are not yet a runner.  But last week I had a demo of a new Spanish BI PaaS service, <a href="http://www.litebi.com/"><strong>LiteBI</strong></a>, which might just change my mind.</p>
<p>Javier Giménez Aznar and his team previously worked on delivering Pentaho-based data warehouses to large Spanish corporations and government agencies, so they have a deep understanding of <a href="http://mondrian.pentaho.org/">Mondrian ROLAP</a> and are using that knowledge to build the LiteBI service, but this time with SMBs as the target customers rather than corporates. Pricing starts at €145 per month and is based on the number of concurrent users, the number of analytical spaces and the data volumes, so it&#8217;s not for very small firms; it&#8217;s more for the Medium in SMB.</p>
<p>Impressions? The cube designer, dashboard builders and the general UI are all very good and would, I think, appeal to end-user datasmiths; as such, they will be a major up-front aid to selling this product.  But it was LiteBI&#8217;s approach to the thorny issue of ETL and data loading that impressed me and also helped ease some of my Coghead-induced fears.</p>
<p>BI technology stacks consist of three elements:</p>
<ul>
<li>The &#8220;fancy&#8221; front-end; graphs, animated dashboards and so on.</li>
<li>The pivot engine; ROLAP or MOLAP or both.</li>
<li>The ETL process.</li>
<li>(Many would say there&#8217;s an important 4th, the data-warehouse, but not every BI effort requires one, but that&#8217;s another issue)</li>
</ul>
<p>LiteBI is continuing to build yet more functionality into their UI and this &#8220;fancy&#8221; front-end is essential as it&#8217;s their &#8220;shop window&#8221;.</p>
<p>Mondrian provides their pivot engine, and again they continue to work on optimisations such as column-based datastores to increase speed and automate responsiveness tuning (end-users are very unforgiving of slow pivots).</p>
<p>But it&#8217;s in the 3rd area, that of the ETL process, that you realise the LiteBI team has real-world BI experience.  Data is loaded into LiteBI via an API, but with the ETL process itself happening on the customer side.</p>
<p>&#8220;Well, so what?&#8221; you may ask. The extraction of data obviously has to happen customer-side (though not in the case of data sourced from the likes of SalesForce.com). Yes, but it&#8217;s the transformations and data cleansing that add true value to the ETL process and subsequently determine the quality and usefulness (as opposed to the speed or the &#8220;prettiness&#8221; of delivery) of the solution.</p>
<p>Part of the process of adopting LiteBI is an ETL consultancy stage where a LiteBI partner company will provide on-site services to build this ETL layer, handling not just transformations but also the initial load and the automation of subsequent delta uploads.</p>
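<p>As a rough illustration of what customer-side cleansing followed by a delta upload involves, here is a Python sketch. It says nothing about LiteBI&#8217;s actual API, which I have not seen documented; every name in it (<code>clean</code>, <code>row_fingerprint</code>, <code>compute_delta</code>) and the sample data are hypothetical, and deleted rows are not handled.</p>

```python
import hashlib
import json


def clean(row):
    """Example transformation: trim and title-case names, normalise amounts."""
    return {
        "id": row["id"],
        "customer": row["customer"].strip().title(),
        "amount": round(float(row["amount"]), 2),
    }


def row_fingerprint(row):
    """Stable hash of a row's contents, independent of key order."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def compute_delta(rows, key, previous):
    """Return (rows that are new or changed, fingerprints for the next run)."""
    changed, state = [], {}
    for row in rows:
        fingerprint = row_fingerprint(row)
        state[row[key]] = fingerprint
        if previous.get(row[key]) != fingerprint:
            changed.append(row)
    return changed, state


# Initial load: nothing has been uploaded before, so every row is sent.
first = [clean(r) for r in [
    {"id": 1, "customer": " acme ", "amount": "10.5"},
    {"id": 2, "customer": "bolt", "amount": "3"},
]]
initial, state = compute_delta(first, "id", {})

# Later run: only rows whose cleaned content differs are sent.
second = [clean(r) for r in [
    {"id": 1, "customer": "acme", "amount": "10.50"},
    {"id": 2, "customer": "bolt", "amount": "4"},
    {"id": 3, "customer": "cog", "amount": "1"},
]]
delta, state = compute_delta(second, "id", state)
```

<p>Because the fingerprint is taken after cleansing, a row that only differs in raw formatting (row 1 above) is not re-sent, which is one reason the transformation step belongs before the upload.</p>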
<p>So the cost mounts up, but in reality you can&#8217;t do BI without this investment; there&#8217;s no ETL magic bullet.  Even still, Javier says the typical go-live time for a LiteBI project would be in the order of 3-4 weeks rather than the 3-4 months of similar on-site Pentaho projects.</p>
<p>The end-user &#8216;owning&#8217; the ETL process makes the prospect of a service lock-out slightly less worrying as, at least, one would still have a good starting point for moving to another provider or back in-house. What I would really like to see would be the option to self-host LiteBI, which I guess would involve open sourcing large parts of the service (the automated optimisation strategies could, for example, be excluded from this open source version).</p>
<p>The load API comes packaged as a plugin to <a href="http://kettle.pentaho.org/">Kettle</a> (aka PDI) and the intention is to offer a similar add-on for <a href="http://www.talend.org">Talend</a> in the near future. LiteBI also has a white-label option whereby 3rd party OLTP solution providers can use the service as their product&#8217;s BI suite.</p>
<p>Like the <a href="http://www.everything2.org/title/Skibbereen%2520Eagle">Skibbereen Eagle keeping its eye on the Czar of Russia</a>, I too will be keeping a watchful eye on <a href="http://www.litebi.com/">LiteBI</a> and the march of on-demand BI in general.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2009/04/24/litebi-heavy-etl/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Project Gemini &#8211; XXL, Excel on Steroids</title>
		<link>http://blog.gobansaor.com/2009/04/01/project-gemini-xxl-excel-on-steroids/</link>
		<comments>http://blog.gobansaor.com/2009/04/01/project-gemini-xxl-excel-on-steroids/#comments</comments>
		<pubDate>Wed, 01 Apr 2009 11:47:52 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[Project Gemini]]></category>
		<category><![CDATA[Workgroup BI]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/?p=662</guid>
		<description><![CDATA[In my last post about why I use SQLite in combination with Excel for datasmithing tasks, I listed the more traditional backends (Excel itself, MS Access, RDBMs &#38; MOLAP cubes) that one would expect to &#8220;compete&#8221; with such an idea.   But I suspect that if that same post appeared  two years or so into [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>In my <a href="http://blog.gobansaor.com/2009/03/14/sqlite-as-the-mp3-of-data/">last post</a> about why I use SQLite in combination with Excel for datasmithing tasks, I listed the more traditional backends (Excel itself, MS Access, RDBMs &amp; MOLAP cubes) that one would expect to &#8220;compete&#8221; with such an idea.   But I suspect that if that same post appeared  two years or so into the future, there would be a fifth contender, Project Gemini cubes.</p>
<p>Project Gemini is due to be delivered as a free add-in to the next version of Excel (2010), like the Analysis ToolPak or the Data Mining add-ins for Excel 2003.  (See this OLAP Report <a href="http://www.olapreport.com/Comment_Gemini.htm">Project Gemini, Microsoft&#8217;s Brilliant Trojan Horse</a> for a good overview of the tool).</p>
<p><a href="http://twitter.com/donalddotfarmer">Donald Farmer</a>, who works on the project, having seen the <a href="http://blog.gobansaor.com/2009/03/14/sqlite-as-the-mp3-of-data/">SQLite as the MP3 of data</a> post and recognising that the <a href="http://en.wikipedia.org/wiki/Use_case">use cases</a> behind combining SQLite with Excel were similar to those of Project Gemini, kindly offered me a demo of the product.  Well, the phrase &#8220;Excel on steroids&#8221; has been much used in the past (in particular of add-ins such as Essbase, Palo or TM1) but this one &#8220;ya gotta see&#8221;; Donald likes to call it XXL.</p>
<p>Millions of rows of data in-memory on a 4GB PC being &#8220;modeled&#8221; using a &#8220;user-friendly&#8221; pivot-table-like interface. And when I say modeled, the user isn&#8217;t being confronted with concepts such as dimensions, levels, attributes, facts and so on, but a classic star schema model is nevertheless being built behind the scenes.  And it&#8217;s this model that allows Gemini to escape some of the inadequacies of pivot tables, e.g. by allowing rules and hierarchies to be defined.  The resulting model can then be saved and shared as a file (keeping to the document-centric ethos of Excel) but it can also be posted to and managed by SharePoint.</p>
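<p>For readers who haven&#8217;t met the term, a star schema simply splits flat rows into small dimension tables plus a fact table of keys and measures. The Python sketch below shows the idea using an in-memory SQLite database; it is in no way a description of Gemini&#8217;s internals, and the table names, column names and sample rows are invented for illustration.</p>

```python
import sqlite3

# Flat rows as they might arrive from a worksheet: day, product, region, amount.
flat_rows = [
    ("2009-03-01", "Widgets", "North", 120.0),
    ("2009-03-01", "Gadgets", "North", 80.0),
    ("2009-03-02", "Widgets", "South", 50.0),
]

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_product (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE dim_region  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE fact_sales  (day TEXT, product_id INTEGER,
                              region_id INTEGER, amount REAL);
""")


def dim_id(table, name):
    """Return the surrogate key of a dimension member, creating it if needed."""
    con.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = con.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    return row[0]


for day, product, region, amount in flat_rows:
    con.execute(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        (day, dim_id("dim_product", product), dim_id("dim_region", region), amount),
    )

# A pivot-style question answered by joining the fact table to a dimension.
totals = con.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f JOIN dim_product p ON p.id = f.product_id
    GROUP BY p.name ORDER BY p.name
""").fetchall()
```

<p>The end user of a tool like Gemini never sees these tables; the point is only that hierarchies and rules become easy to attach once each dimension lives in its own table.</p>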
<p>SharePoint will be extended to allow the IT function to manage and audit shared models to whatever degree the organisation requires, but the single file format will also allow smaller groups to share without the need for IT involvement (essential if bottom-up adoption is to be encouraged).  SharePoint will also add the &#8220;Web2.0 collaboration layer&#8221;.</p>
<p>How will MS make money from this if it&#8217;s free?  The first clue is the SharePoint backend: more functionality means more reasons to purchase and use MS&#8217;s server stack, and the same applies to Excel itself. I, like many others, am very happy using Excel 2003 and look on Excel 2007 the same way the market in general has looked on Vista, i.e. pretty, but lacking a strong enough reason to upgrade unless forced to do so. (Excel 2007 also has the ribbon issue, not one I find a major problem myself, <a href="http://smurfonspreadsheets.wordpress.com/2009/03/13/the-ribbon-bet/">but others do</a>).  But I would upgrade to a version of Excel that offered Project Gemini capabilities and I&#8217;m sure others would follow (and more importantly to MS&#8217;s revenues, thousands of corporate accounts would too).</p>
<p>Project Gemini offers proof that MS realises what those of us on the ground have known for years: that BI projects are, in the main, Excel-centric; all the &#8216;hard sums&#8217; and awkward decisions end up back on the desktop.  MS has decided to publicly recognise that fact and profit from it. The timing is both economically and technically opportune; PC speed and cheap memory mean that a huge chunk of even a large corporation&#8217;s datasets can be analysed by a PC (<a href="http://www.b-eye-network.com/view/9752">according to this</a>, the median size of original data in OLAP datasets is about 5GB); and there are obvious cost benefits for companies facing difficult times, requiring more to be done with fewer resources.</p>
<p>What will the effect be on tools such as Essbase, TM1, Palo etc.?  Well, let me put it this way: if their owners are making strategic plans for 2010 onwards and they&#8217;re not taking account of the Gemini effect, perhaps they should.  Most likely Gemini will help increase the overall market for OLAP tools, with the incumbents tending to specialise in their existing niches (e.g. <a href="http://www.jedox.com/en/Sample-Uses-of-Palo/Budget-and-Corporate-Planning.html">Palo in Budgeting</a>, with the added value of being free and open source, which has a premium over just being &#8216;free&#8217;).</p>
<p>So will I put away my Excel-SQLite fixation then? No, for two reasons:</p>
<ul>
<li>Project Gemini is not here yet, and the proof of the pudding will be in the eating. Also, when it does appear it will only apply to Excel 2010 (or whatever) and as many companies are still on Office 2000 (and a few on 97!), it&#8217;ll be at least  5 years before a significant percentage of sites upgrade.</li>
<li>The SQLite addition to Excel offers not just BI capabilities but also makes a nimble ETL and data integration engine. I&#8217;m also experimenting with Amazon S3 integration to enable simple work-flows for small distributed teams (or even same-office groups where <a href="http://blog.gobansaor.com/2007/12/17/the-wan-is-the-new-lan/">the WAN is the new LAN</a>).</li>
</ul>
<p>Whether or not you agree with the validity of &#8220;<a href="http://esj.com/Articles/2009/03/25/Workgroup-BI-Poised-for-a-Comeback.aspx?Page=1">workgroup BI</a>&#8221;, be aware that MS does, and it thinks that BI is about to enter a new phase; for proof see MS&#8217;s <a href="http://twitter.com/nicfish">Nic Smith&#8217;s</a> <a href="http://blogs.msdn.com/bi/archive/2009/03/22/history-of-business-intelligence.aspx">The History of Business Intelligence</a> video.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2009/04/01/project-gemini-xxl-excel-on-steroids/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>SQLite as the MP3 of data</title>
		<link>http://blog.gobansaor.com/2009/03/14/sqlite-as-the-mp3-of-data/</link>
		<comments>http://blog.gobansaor.com/2009/03/14/sqlite-as-the-mp3-of-data/#comments</comments>
		<pubDate>Sat, 14 Mar 2009 19:13:25 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Palo]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[VBA]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[olap]]></category>
		<category><![CDATA[MP3]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/?p=622</guid>
		<description><![CDATA[&#8230; and Excel as its &#8220;mixing desk&#8221;.
When I tell people that I use SQLite in combination with Excel as my datasmithing platform, many ask why SQLite? (Many others ask why Excel?  but &#8220;sin scéal eile&#8221;, that&#8217;s another discussion.) Those that question my use of SQLite tend to cluster into four camps:

Pure Excel jocks.
MS Access fans.
The [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=622&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>&#8230; and Excel as its &#8220;mixing desk&#8221;.</p>
<p>When I tell people that I use SQLite in combination with Excel as my datasmithing platform, many ask why SQLite? (Many others ask why Excel?  but &#8220;sin scéal eile&#8221;, that&#8217;s another discussion.) Those that question my use of SQLite tend to cluster into four camps:</p>
<ul>
<li>Pure Excel jocks.</li>
<li>MS Access fans.</li>
<li>The client-server database brigade (SQL Server, Oracle; or, for FOSS fans, MySQL, PostgreSQL).</li>
<li>The MOLAP folks (Essbase, TM1, Palo).</li>
</ul>
<p>Now while I have used and will continue to use/encounter all four &#8216;approaches&#8217;, I&#8217;ve come to believe over the last couple of years that SQLite brings something special to the datasmithing game. When I look back over nearly 30 years in the data handling business I keep thinking &#8211; &#8220;If only I had SQLite then, how much easier/quicker/cheaper that task would have been!&#8221;.</p>
<p>Just as &#8220;fractional horsepower&#8221; electrical motors revolutionised manufacturing and eventually all our lives (car starter-motors, fridge motors, washing machines etc.), &#8220;fractional horsepower&#8221; databases can do the same for data. Distributing data to where it is needed.</p>
<p>As operational local caches, this use of SQLite is already far advanced. SQLite is embedded in lots of <a href="http://www.sqlite.org/famous.html">everyday software tools</a>, everything from McAfee anti-virus to <a href="http://www.tweetdeck.com/beta/">TweetDeck</a> Twitter clients (best one IMHO). But my interest is more in SQLite&#8217;s potential as a micro-BI (or maybe more correctly a distributed-BI) platform. A sort of MP3 format for distributed structured data, if you like.</p>
<p>But why SQLite (and in particular SQLite in combination with Excel) as my datasmithing tool rather than the four other approaches?  First, what&#8217;s a datasmith?</p>
<p>Managing and manipulating datasets has become an integral part of many people’s jobs, not just accountants (the original of the species) but marketing executives, sales staff, pricing analysts, process engineers; different job titles, different roles but using a skill that they’ve likely never been formally trained in, a skill without a name; a skill I call datasmithing. I like to think of myself as a master datasmith, or a datasmith&#8217;s datasmith.</p>
<p>If you consider yourself a datasmith then most likely the tool you use to manage your datasets is Excel. And before you apologise, don’t. Excel is by far the best and most flexible end-user data manipulation tool out there. Everything from the current financial crisis downwards has at some stage been blamed on Excel, but you know and I know that many tasks would remain undone or under-done were it not for end-user generated spreadsheets.</p>
<p>Spreadsheets are not however optimal for some tasks, linked spreadsheets in particular are data disasters in waiting. While fantastic for data transformations and presentations, as books-of-record they’re rarely suitable. Other tools such as SQL based relational databases and in-memory OLAP offer much better and potentially much more cost-effective data modelling functionality, but also at a cost of extra complexity and ongoing technical support.</p>
<p>MS Access (which like SQLite, is a document-centric, non-client-server database; but unlike it, is also a forms/reporting development environment) would appear to be the natural local store database. My problem with MS Access has been its tendency to try to be all things to all men, ending up not fully satisfying anybody. Professional developers think it&#8217;s too limiting, non-techs find it too intimidating; even in reporting, where it once showed promise, it left a big enough opening for Crystal Reports to evolve. It is also limited to Windows, which might not seem to be a problem when combining with Excel, but as it&#8217;s often necessary, due to the scale or complexity of the data, to use &#8216;proper&#8217; ETL tools such as Talend, having an OS-agnostic database format that can act as a distribution medium (think MP3s again) between &#8220;mixing desks&#8221; can be very useful.</p>
<p>The big difference to MS Access for me is SQLite&#8217;s open source code; code that&#8217;s a pleasure to browse and with an approachable API that even I, with my very rusty C skills, can manipulate. Having access to that code allows me to tightly integrate it with Excel, so much so, that I can use Excel functions (built-in functions, VBA user-defined functions and 3rd party add-in functions) directly from SQLite&#8217;s SQL; and vice-versa, access SQL functionality via Excel &#8220;formula&#8221; calls. It is  also possible to  load most datasets into memory using SQLite&#8217;s in-memory mode enabling very fast processing  and near zero-latency when passing data to and from Excel/VBA. In the near future, cheap, large <a href="http://en.wikipedia.org/wiki/Solid-state_drive">SSDs </a>will enable non-memory databases to offer similar speed but also handle extremely large datasets (<a href="http://i.gizmodo.com/5166798/24-solid-state-drives-open-all-of-microsoft-office-in-5-seconds">see this for a glimpse of that future</a>).</p>
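<p>To make the idea concrete, here is a minimal sketch in Python&#8217;s standard sqlite3 module (not the Excel/VBA integration itself): an in-memory database with a host-language function registered so that SQL can call it. The table, columns and the margin_pct function are hypothetical stand-ins for a dataset and an Excel or VBA function.</p>

```python
import sqlite3

# In-memory database: nothing touches disk, so round trips stay fast.
conn = sqlite3.connect(":memory:")


def margin_pct(revenue, cost):
    """Host-side function (a stand-in for an Excel/VBA function)."""
    if not revenue:
        return None
    return round((revenue - cost) * 100.0 / revenue, 1)


# Register the function so SQL statements can call it by name.
conn.create_function("margin_pct", 2, margin_pct)

conn.execute("CREATE TABLE sales (product TEXT, revenue REAL, cost REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("stout", 200.0, 150.0), ("lager", 100.0, 90.0)],
)

rows = conn.execute(
    "SELECT product, margin_pct(revenue, cost) FROM sales ORDER BY product"
).fetchall()
print(rows)
```

<p>The same pattern works in the other direction: the host application runs a SELECT and reads the rows back, which is roughly what a spreadsheet &#8220;formula&#8221; call into SQL would do.</p>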
<p>What about the big beasts of the data world, the client-server databases? Having spent most of my professional life working with such tools I&#8217;m aware of the power of a well designed relational database. If SQLite is the MP3, then these are the master tapes, the DDD recordings. Most of the data that eventually ends up in SQLite for analysis and/or transformation will have originated in data-warehouses or be directly sourced from OLTP systems built using relational technology. But for close-up analysis and transformation, the pure simplicity and convenience of SQLite is hard to beat. That simplicity is primarily due to its Excel-like &#8216;document&#8217; nature, all code and data can be housed in a single folder (or <a href="http://www.truecrypt.org/">true-crypt container</a> for added security), ensuring that the &#8216;problem domain&#8217; can be easily archived and/or shared with others without the need for professional IT resources.</p>
<blockquote><p>And yes, I hear you, isn&#8217;t that the basis of Excel-hell? Yes it is, but over the years I&#8217;ve found that this is rarely a problem for datasmiths, they deal day-in day-out with document work-flows, they understand the risks and the benefits (mainly the simplicity) of the approach. Where the nightmare truly happens is when this approach is used as an alternative to an OLTP system i.e. using Excel and other document-like datastores as books-of-record in large multi-user environments &#8211; &#8220;there be monsters for sure&#8221;.</p></blockquote>
<p>How about MOLAP? Wasn&#8217;t Essbase&#8217;s name derived from &#8220;extended spreadsheet database&#8221; and doesn&#8217;t Palo offer a truly Excel-friendly multi-user database back-end? Again, having worked with Essbase for many years and now being a big fan of the open source <a href="http://www.palo.net">Palo</a> MOLAP tool, I fully appreciate the power that such tools bring to analysis and multi-user planning tasks. But for many situations, an Excel Pivot Table is &#8220;good enough&#8221; and even when it&#8217;s not, it is possible by utilising what I call a tOLAP cube (essentially, a fact table indexed via tags enabled by Google&#8217;s great addition to SQLite, the <a href="http://dotnetperls.com/Content/SQLite-FTS3.aspx">FTS3</a> virtual table) to build and access powerful, yet simple, cube-like data structures.</p>
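<p>One way such a tag-indexed fact table could look is sketched below in Python&#8217;s sqlite3 module. This is an illustration only, assuming the SQLite build has FTS3 compiled in; the table names, tags and the cube_value helper are hypothetical, and the real tOLAP layout may well differ.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Fact table: one row per measure value.
conn.execute("CREATE TABLE facts (id INTEGER PRIMARY KEY, amount REAL)")
# FTS3 virtual table holding the space-separated dimension tags of each fact.
conn.execute("CREATE VIRTUAL TABLE fact_tags USING fts3(tags)")

data = [
    (1, 120.0, "y2008 ireland stout"),
    (2, 80.0, "y2009 ireland stout"),
    (3, 50.0, "y2009 ireland lager"),
    (4, 70.0, "y2009 uk stout"),
]
for fact_id, amount, tags in data:
    conn.execute("INSERT INTO facts (id, amount) VALUES (?, ?)", (fact_id, amount))
    # docid ties each tag row to its fact row.
    conn.execute("INSERT INTO fact_tags (docid, tags) VALUES (?, ?)", (fact_id, tags))


def cube_value(*tags):
    """Sum the facts that carry every requested tag (MATCH ANDs the terms)."""
    sql = (
        "SELECT COALESCE(SUM(amount), 0) FROM facts "
        "WHERE id IN (SELECT docid FROM fact_tags WHERE tags MATCH ?)"
    )
    return conn.execute(sql, (" ".join(tags),)).fetchone()[0]


print(cube_value("y2009", "ireland"))
print(cube_value("stout"))
```

<p>Each call to cube_value behaves like reading one cell of a cube: the tags pick the slice and SUM aggregates whatever facts fall inside it.</p>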
<p>By integrating SQLite with Excel, datasmiths can have the best of both worlds, familiar <a href="http://smurfonspreadsheets.wordpress.com/2007/02/20/accel-or-excess/">spreadsheet front-end combined with a fast and powerful SQL engine and datastore</a>, in fact, everything that MS Access should have been.</p>
<p style="text-align:right;"><em>Why not join me on Twitter at </em><a href="http://www.twitter.com/gobansaor"><em>gobansaor</em></a><em>?</em></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2009/03/14/sqlite-as-the-mp3-of-data/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Talend ETL Excel report generator</title>
		<link>http://blog.gobansaor.com/2009/02/13/talend-etl-excel-report-generator/</link>
		<comments>http://blog.gobansaor.com/2009/02/13/talend-etl-excel-report-generator/#comments</comments>
		<pubDate>Fri, 13 Feb 2009 14:43:34 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[excel]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/?p=608</guid>
		<description><![CDATA[Hugo, who you may remember from his OLAP Cube as a Mind Map project, has struck again.  This time something really useful, a component  for the Talend ETL platform that generates Excel reports using templates and a JSP style TAG language to control the output.
I&#8217;ve in the past used the excellent Xlsgen to [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=608&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><a href="http://twitter.com/hugoetl">Hugo,</a> who you may remember from his <a href="http://blog.gobansaor.com/2008/07/30/olap-cube-as-a-mind-map/">OLAP Cube as a Mind Map project,</a> has struck again.  This time something really useful, <a href="http://hugoworld.wordpress.com/2009/02/08/taking-the-pain-out-of-excel-reporting/">a component  for the Talend ETL platform that generates Excel reports</a> using templates and a JSP style TAG language to control the output.</p>
<p>I&#8217;ve in the past used the excellent <a href="http://xlsgen.arstdesign.com/">Xlsgen</a> to automate the production of Excel reports, but Hugo&#8217;s component has the benefit of being free (xlsgen now costs €390!) and open source and it also taps into the vast world of existing <a href="http://www.talend.com">Talend ETL</a> components.</p>
<p>Well done Hugo.</p>
<p style="text-align:right;"><em>Why not join me on Twitter at </em><a href="http://www.twitter.com/gobansaor"><em>gobansaor</em></a><em>?</em></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2009/02/13/talend-etl-excel-report-generator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>SQL &#8211; does exactly what it says on the tin</title>
		<link>http://blog.gobansaor.com/2008/12/18/sql-does-exactly-what-it-says-on-the-tin/</link>
		<comments>http://blog.gobansaor.com/2008/12/18/sql-does-exactly-what-it-says-on-the-tin/#comments</comments>
		<pubDate>Thu, 18 Dec 2008 20:51:30 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[AmazonAWS]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[DSL]]></category>
		<category><![CDATA[SimpleDB]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/?p=590</guid>
		<description><![CDATA[SQL how unloved it must feel sometimes, constantly being maligned, accused of being on the wrong side of the object-relational impedance mismatch,  lacking the glamour of OO programming languages that claim the moral high ground. Yet at the same time hewing and hauling most of the world&#8217;s structured data on its old but well fashioned [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=590&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><strong>SQL</strong> how <a href="http://codehappy.wordpress.com/2007/07/30/databases-need-a-new-language/?referer=sphere_related_content/">unloved it must feel sometimes</a>, constantly being maligned, accused of being on the wrong side of the <a href="http://en.wikipedia.org/wiki/Object-Relational_impedance_mismatch">object-relational impedance mismatch</a>,  lacking the glamour of OO programming languages that claim the moral high ground. Yet at the same time hewing and hauling most of the world&#8217;s structured data on its old but well fashioned back.</p>
<p><strong>SQL</strong> is perhaps the <a href="http://en.wikipedia.org/wiki/Domain-specific_programming_language">world&#8217;s most popular DSL</a>, a <a href="http://en.wikipedia.org/wiki/Declarative_language">declarative language</a> for the manipulation of tabular data, easy to learn yet capable of powerful (and sometimes complex) expressions.  And like <a href="http://ronanfitzgerald.net/everythingelse/?p=8">the Ronseal ad</a>, a SQL statement, no matter how simple or complex, does exactly what it says; all the complexity of loops and iterations, and the attendant errors, is abstracted away. It just works!</p>
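<p>A small illustration of that point, sketched in Python with its standard sqlite3 module; the orders table and its figures are made up for the example. The hand-written loop and the single GROUP BY statement produce the same totals, but only the loop makes the programmer manage iteration and accumulation.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
orders = [("east", 10.0), ("west", 5.0), ("east", 2.5)]
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)

# Hand-written loop: the programmer manages iteration and accumulation.
totals_loop = {}
for region, amount in orders:
    totals_loop[region] = totals_loop.get(region, 0.0) + amount

# Declarative SQL: state the result wanted and let the engine do the looping.
totals_sql = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
)

print(totals_loop == totals_sql)
```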
<p><strong>SQL</strong> is both a programmer and an end-user tool; after Excel formulas, it&#8217;s the language most likely to be understood and used by &#8220;civilians&#8221;.  There are few enough such cross-over tools, so think twice before building a datastore that doesn&#8217;t offer a SQL API.  And I guess that&#8217;s what Amazon did. Although SimpleDB is not a relational database, they&#8217;ve <a href="http://docs.amazonwebservices.com/AmazonSimpleDB/latest/DeveloperGuide/">decided to add a SQL API</a>, following Google&#8217;s lead with its <a href="http://code.google.com/appengine/docs/datastore/gqlqueryclass.html">SQL front-end</a> to the non-relational, Bigtable-backed Google App Engine datastore.</p>
<p><strong>SQL</strong> is also the reason why I&#8217;ve integrated SQLite with Excel, leveraging SQL to manipulate tabular data with greater efficiency and fewer errors while still keeping the touchy-feely power of Excel.   I expose SQLite to Excel via <a href="http://www.ozgrid.com/VBA/Functions.htm">UDFs</a> rather than menu options or wizards, so that the transformation logic is visible and approachable (at least to those comfortable with Excel formula &#8220;programming&#8221; and with basic SQL).</p>
<p><strong>SQL</strong> is my weapon of choice because of my belief in the primacy of data. It is data that matters in the long run, not the algorithms or GUIs that temporarily use (and abuse) it.  In my time in Guinness Ireland I had the task of transferring master and historical transactional data from &#8220;legacy systems&#8221; into SAP, Siebel and a new data warehouse; data that had a decade and a half earlier been transferred by me into those same legacy systems from even older systems. In fact, the data&#8217;s electronic lineage could be traced back to a 1960s-era ICL mainframe (I have the original spec!) and I&#8217;m sure it existed in <a href="http://encyclopedia2.thefreedictionary.com/accounting+machine">accountancy machine</a> punch-cards prior to that. Understand a business&#8217;s data and you&#8217;ll not just understand the business as it currently operates but also how it operated in the past and its future potential.</p>
<p><strong>SQL</strong> abú.</p>
<p style="text-align:right;"><em>Why not join me on Twitter at </em><a href="http://www.twitter.com/gobansaor"><em>gobansaor</em></a><em>?</em></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/12/18/sql-does-exactly-what-it-says-on-the-tin/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Pentaho Data Integration (Kettle) V Talend Benchmark</title>
		<link>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/</link>
		<comments>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/#comments</comments>
		<pubDate>Thu, 04 Dec 2008 17:56:22 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[benchmark]]></category>
		<category><![CDATA[Matt Casters]]></category>
		<category><![CDATA[PDI.TOS]]></category>
		<category><![CDATA[Pentaho]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=587</guid>
		<description><![CDATA[Pentaho&#8217;s Matt Casters has just published a benchmarking exercise comparing Kettle and Talend.  In it he admits he&#8217;s not a Talend expert and he advises that people should perform their own benchmarks where possible as requirements differ.  Nevertheless, unlike most other benchmarks we&#8217;ve seen on the subject he publishes not just the results but the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=587&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><a href="http://www.ibridge.be/?p=150">Pentaho&#8217;s Matt Casters has just published a benchmarking exercise comparing Kettle and Talend</a>.  In it he admits he&#8217;s not a Talend expert and he advises that people should perform their own benchmarks where possible as requirements differ.  Nevertheless, unlike most <a href="http://blog.gobansaor.com/2008/10/30/open-source-metrics/">other benchmarks we&#8217;ve seen on the subject </a>he publishes not just the results but the actual transformation &#8220;code&#8221; used in the tests. </p>
<p><a href="http://www.nicholasgoodman.com/bt/blog/2008/11/26/an-arms-race-my-customers-dont-care-about/"><span style="color:#000000;text-decoration:none;">For </span></a><a href="http://www.nicholasgoodman.com/bt/blog/2008/11/26/an-arms-race-my-customers-dont-care-about/">many people these benchmarks are of no real interest</a>; as long as the product does what is required within the time and resources available, they&#8217;re content.  But it would be a mistake to think that benchmarks don&#8217;t matter. They do; people have made and will make that final decision based on them.  Remember, ETL is not life and death; the decision on which tool (if any) to go with may not get the level of investigation that the developers behind such products expect of their potential clientele, and this is particularly true of open source.  Busy people will use such reports to direct them down a path or to confirm their existing prejudices. So I&#8217;m really glad to see Matt responding and, in particular, responding in the manner he has.</p>
<p>Database vendors have for years played the benchmarking game, setting and breaking records either via real technological advances or simply by gaming the process.  We as purchasers and users knew in many cases to take the results with a large dose of salt, but purchasing decisions were nevertheless made on the backs of these surveys.</p>
<p style="text-align:right;"><em>Why not join me on Twitter at </em><a href="http://www.twitter.com/gobansaor"><em>gobansaor</em></a><em>?</em></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Spending time on Excel-SQLite, C, VBA Callbacks &amp; Twitter</title>
		<link>http://blog.gobansaor.com/2008/11/20/spending-time-on-excel-sqlite-c-vba-callbacks-twitter/</link>
		<comments>http://blog.gobansaor.com/2008/11/20/spending-time-on-excel-sqlite-c-vba-callbacks-twitter/#comments</comments>
		<pubDate>Thu, 20 Nov 2008 12:44:52 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Palo]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[VBA]]></category>
		<category><![CDATA[Web2.0]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[xLite]]></category>
		<category><![CDATA[c#]]></category>
		<category><![CDATA[Twitter]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=573</guid>
		<description><![CDATA[Haven&#8217;t posted here in a while as my spare time has been soaked up programming, well actually refactoring would be more exact.  My xLite &#8220;SQLite empowered Excel&#8221; codebase has grown over the years and required a serious makeover to get rid of stuff I no longer use and to generally make it more robust.  I [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=573&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Haven&#8217;t posted here in a while as my spare time has been soaked up programming, well actually <a href="http://en.wikipedia.org/wiki/Refactoring">refactoring</a> would be more exact.  My xLite &#8220;SQLite empowered Excel&#8221; codebase has grown over the years and required a serious makeover to get rid of stuff I no longer use and to generally make it more robust.  I also decided to add some extra functionality to my VBA friendly C wrapper for SQLite (based on Pivotal Solutions&#8217; pssqlite.dll) which meant I had to re-acquaint myself with my long lost C skills; doing so reminded me how much I like C. Close-to-the-metal programming, if not exactly super-productive, is nevertheless super-powerful.</p>
<p>The new improved xLiteSQLite.dll now has a built-in CSV loader (both file based and string based &#8211; handy for loading <a href="http://en.wikipedia.org/wiki/Palo_(OLAP_database)">Palo</a> HTTP API responses into a table). It also returns a one columned variant array of CSV values for quick rendering via &#8220;text-to-columns&#8221; code (by far the quickest way of handling large dataset pasting into Excel).</p>
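<p>The two ideas in that paragraph can be sketched in Python&#8217;s standard csv and sqlite3 modules. This is not the xLiteSQLite.dll code; the function names are hypothetical, and the sample text only assumes a semicolon-delimited response with a trailing delimiter, it is not real Palo output.</p>

```python
import csv
import io
import sqlite3


def load_csv_text(conn, table, text, columns, delimiter=";"):
    """Load delimited text into a new table; returns the number of rows loaded."""
    col_defs = ", ".join('"%s"' % c for c in columns)
    marks = ", ".join("?" for _ in columns)
    conn.execute('CREATE TABLE "%s" (%s)' % (table, col_defs))
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Pad short rows and drop surplus fields (e.g. from a trailing delimiter).
    rows = [(r + [None] * len(columns))[: len(columns)] for r in reader if r]
    conn.executemany('INSERT INTO "%s" VALUES (%s)' % (table, marks), rows)
    return len(rows)


def rows_as_csv_lines(conn, sql):
    """Return each result row as one comma-separated string (a one-column list)."""
    lines = []
    for row in conn.execute(sql):
        buf = io.StringIO()
        csv.writer(buf, lineterminator="").writerow(row)
        lines.append(buf.getvalue())
    return lines


conn = sqlite3.connect(":memory:")
response = '0;"Demo";3;\n1;"Sales";6;\n'  # hypothetical sample text
count = load_csv_text(conn, "dbs", response, ["id", "name", "cubes"])
lines = rows_as_csv_lines(conn, "SELECT id, name FROM dbs ORDER BY id")
print(count, lines)
```

<p>Handing back one string per row keeps the hand-off to the spreadsheet down to a single column, leaving the split into cells to a text-to-columns step.</p>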
<p>I&#8217;ve also added the ability to create SQLite UDFs (user defined functions) in VBA (thanks to <a href="http://stackoverflow.com/users/4007/rpetrich">http://stackoverflow.com/users/4007/rpetrich</a>).  This is a very powerful feature as it allows SQLite selects to act as a &#8220;loop controller&#8221; calling back to Excel/VBA functions to process each row, really useful for ETL tasks. And not just <a href="http://www.sqlite.org/c3ref/create_function.html">scalar UDFs but aggregating (aka group-by) functions too</a>, allowing the use of Excel&#8217;s powerful array functions in SQLite statements.</p>
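<p>For readers without the VBA wrapper to hand, the same callback pattern can be sketched with Python&#8217;s standard sqlite3 module, which exposes the underlying SQLite facility through create_function and create_aggregate. The clean_code function, the Product class and the growth table are hypothetical examples, with Product loosely standing in for a spreadsheet-style array function.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def clean_code(value):
    """Scalar callback: SQLite calls this once per row."""
    return value.strip().upper()


class Product:
    """Aggregate callback: step() runs per row, finalize() once per group."""

    def __init__(self):
        self.result = 1.0

    def step(self, value):
        self.result *= value

    def finalize(self):
        return self.result


conn.create_function("clean_code", 1, clean_code)
conn.create_aggregate("product", 1, Product)

conn.execute("CREATE TABLE growth (code TEXT, factor REAL)")
conn.executemany(
    "INSERT INTO growth VALUES (?, ?)",
    [(" ie ", 1.5), ("IE", 2.0), ("uk ", 4.0)],
)

# The SELECT drives the loop; the callbacks do the per-row and per-group work.
rows = conn.execute(
    "SELECT clean_code(code), product(factor) FROM growth "
    "GROUP BY clean_code(code) ORDER BY 1"
).fetchall()
print(rows)
```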
<p>All in all, the changes to the xLite VBA code and the C wrapper make Excel backed by SQLite a seriously good micro-ETL tool. Combined with <a href="http://www.palo.net">Palo</a>, the result is a truly wonderful micro-BI platform; a cost-effective toolset for these recessionary times.</p>
<p>Of course I&#8217;d be lying if I said code was the only reason I&#8217;ve been neglecting my blogging duties, I&#8217;m afraid I&#8217;ve a confession to make, Twitter has hooked yet another sucker, <a href="http://twitter.com/gobansaor">me!</a> </p>
<p>I&#8217;ve found I&#8217;ve settled in to the whole micro-blogging thing with ease, and have managed to make contact with people I would not have encountered otherwise, as well as reconnecting with others that I&#8217;d lost contact with.  So if you too are all-a-twitter then do please follow <a href="http://twitter.com/gobansaor">gobansaor-on-twitter</a>.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/11/20/spending-time-on-excel-sqlite-c-vba-callbacks-twitter/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Open Source Metrics and Benchmarks</title>
		<link>http://blog.gobansaor.com/2008/10/30/open-source-metrics/</link>
		<comments>http://blog.gobansaor.com/2008/10/30/open-source-metrics/#comments</comments>
		<pubDate>Thu, 30 Oct 2008 12:24:20 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[ETL benchmarks]]></category>
		<category><![CDATA[PDI 3.0]]></category>
		<category><![CDATA[WaveMaker]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=557</guid>
		<description><![CDATA[Marc Russel&#8217;s blog links to a Manapps ETL benchmark report comparing the performance of several leading ETL tools, both proprietary (DataStage and Informatica) and OS (Talend and PDI, aka Kettle).  As would be expected, each tool has its own strengths and weaknesses, but one thing stands out: the venerable Kettle ETL aka PDI 3.0 is now [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><a href="http://marcrussel.wordpress.com/">Marc Russel&#8217;s blog</a> links to a Manapps ETL benchmark report comparing the performance of several leading ETL tools, both proprietary (DataStage and Informatica) and OS (Talend and PDI, aka Kettle).  <span style="text-decoration:line-through;">As would be expected each tool has their own strengths and weaknesses, but one thing stands out, the venerable </span><a href="http://www.ibridge.be/?page_id=122"><span style="text-decoration:line-through;">Kettle ETL</span></a><span style="text-decoration:line-through;"> aka PDI 3.0 is now a serious contender for handling very large datasets</span>.  Oops, that&#8217;s what I get for wishing for a result and (mis-)reading the report early in the morning with a cold and a bad sore throat. Sadly, PDI is still very much slower than its OS cousin Talend. In fact, Talend continues to play on the strength that comes from a code-generated solution, i.e. raw speed.  As a pure ETL play, Talend is well capable of playing on the same pitch as the &#8220;big kids&#8221;. </p>
<p>Interestingly, the report is also &#8220;open source&#8221; as it&#8217;s released under a <a href="http://creativecommons.org/licenses/by/3.0/us/"><span style="text-decoration:none;">Creative Commons License</span></a>, so I can link to it <span style="text-decoration:none;"><span style="text-decoration:line-through;">here</span></span><span style="color:#000000;"><a href="http://marcrussel.files.wordpress.com/2008/10/etlbenchmarks_manappsc221008.pdf">.</a></span></p>
<p><span style="color:#000000;"><strong>UPDATE: </strong></span></p>
<p><span style="color:#000000;">There&#8217;s now a new version of the report available (<a href="http://www.manapps.com/" target="_blank">www.manapps.com</a>, Topic Benchmark), it seems the original was just a work-in-progress and was not meant for public release.  The main difference appears to be a significant improvement in Informatica&#8217;s &#8217;score&#8217;, but I&#8217;m not sure as I was really only interested in comparing the two OSS products, Talend and Pentaho PDI, in that &#8216;battle&#8217; Pentaho still comes out &#8217;slower&#8217;. </span></p>
<p><span style="color:#000000;"> The original Marc Russel blog entry and a subsequent one reporting the new updated report appear to have both been removed.  </span></p>
<p><span style="color:#000000;">Also, I was informed of the &#8216;updated&#8217; report via this email from manapps, which assures vendors that they are happy to rerun any tests and provide any information re the running of such tests &#8230; </span></p>
<blockquote><p><span style="color:#000000;">Dear Sir,</span></p>
<p>You referred on your web site to the report called “Benchmark ETL” by Manapps, from November 2008. This draft report was not intended to be publicly released since just a working document.<br />
We would like you (i) publish Asap the modified version (or its related link) that supersedes the former one (on our web site (<a href="http://www.manapps.com/" target="_blank">www.manapps.com</a>, Topic Benchmark), (ii) state that Manapps had no intend to release the former report and accordingly takes no responsibility on its content, (iii) state that Manapps holds all necessary elements at the disposal of all vendors so that they can rerun some tests if wished that will then be published.</p>
<p>Regards,<br />
Philippe THOMAS</p>
<p>Time: Thursday March 5, 2009 at 5:10 pm</p></blockquote>
<p> </p>
<p><a href="http://www.keeneview.com/2008/10/open-source-marketing-metrics-from-0-to.html"><span style="text-decoration:none;">Another analysis of OSS in the wild</span></a>, this time from Chris Keene, <a href="http://www.wavemaker.com/"><span style="text-decoration:none;">WaveMaker</span></a> CEO, on OSS as a marketing tool. Bottom line: 1% conversion rate, 700 paying customers in 9 months &#8230;   </p>
<div class="wp-caption aligncenter" style="width: 510px"><a href="http://www.keeneview.com/uploaded_images/metrics-783392.jpg"><img title="WaveMaker, from click to paying customer." src="http://www.keeneview.com/uploaded_images/metrics-783392.jpg" alt="WaveMaker OSS as a marketing tool" width="500" height="400" /></a><p class="wp-caption-text">WaveMaker OSS as a marketing tool</p></div>
<p> </p>
<p style="text-align:right;"><em>Why not join me on Twitter at </em><a href="http://www.twitter.com/gobansaor"><em>gobansaor</em></a><em>?</em></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/10/30/open-source-metrics/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c55040b07850424d301e2e3c32e35b88?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

		<media:content url="http://www.keeneview.com/uploaded_images/metrics-783392.jpg" medium="image">
			<media:title type="html">WaveMaker, from click to paying customer.</media:title>
		</media:content>
	</item>
	</channel>
</rss>