<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Gobán Saor &#187; kettle</title>
	<atom:link href="http://blog.gobansaor.com/category/kettle/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.gobansaor.com</link>
	<description>A country datasmith.</description>
	<lastBuildDate>Tue, 27 Jul 2010 17:23:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='blog.gobansaor.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/67e164f5d51c2b3115a7819b84505c13?s=96&#038;d=http://s2.wp.com/i/buttonw-com.png</url>
		<title>Gobán Saor &#187; kettle</title>
		<link>http://blog.gobansaor.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://blog.gobansaor.com/osd.xml" title="Gobán Saor" />
	<atom:link rel='hub' href='http://blog.gobansaor.com/?pushpress=hub'/>
		<item>
		<title>LiteBI, Heavy ETL</title>
		<link>http://blog.gobansaor.com/2009/04/24/litebi-heavy-etl/</link>
		<comments>http://blog.gobansaor.com/2009/04/24/litebi-heavy-etl/#comments</comments>
		<pubDate>Fri, 24 Apr 2009 12:27:57 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[olap]]></category>
		<category><![CDATA[LiteBI]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/?p=686</guid>
		<description><![CDATA[Although my major BI interest is in micro-BI (or is that workgroup-BI?), i.e. data, perhaps cleansed and packaged elsewhere, available locally on a datasmith&#8217;s PC, most likely with an in-memory OLAP as the analysis tool, the possibilities of the &#8220;cloud&#8221; as a BI platform have not escaped me. From a micro-BI perspective, the ability to act as [...] ]]></description>
			<content:encoded><![CDATA[<p>Although my major BI interest is in micro-BI (or is that workgroup-BI?), i.e. data, perhaps cleansed and packaged elsewhere, available locally on a datasmith&#8217;s PC, most likely with an in-memory OLAP as the analysis tool, the possibilities of the &#8220;cloud&#8221; as a BI platform have not escaped me.</p>
<p>From a micro-BI perspective, the cloud&#8217;s ability to act as a backup/mirroring tool or as an ETL/marshaling tool (anybody for Hadoop and SQLite?) attracts. I&#8217;ve yet to make up my mind on BI delivered as a cloud <a href="http://en.wikipedia.org/wiki/Platform_as_a_service">PaaS</a>, but obviously <a href="http://jeromepineau.blogspot.com/2009/04/on-demand-bi-beyond-smb.html">many others believe it has a future</a>.</p>
<p>My main worry with PaaS is not lock-in (which exists equally for in-house proprietary solutions) but the dangers of a <a href="http://www.accmanpro.com/tag/coghead/">Coghead-like lock-out</a>.  My other doubts are more technical: I believe that in-memory offers significant advantages over traditional ROLAP (simplicity being the main one) and that multi-tenant in-memory architectures are not yet a runner.  But last week I had a demo of a new Spanish BI PaaS service, <a href="http://www.litebi.com/"><strong>LiteBI</strong></a>, which might just change my mind.</p>
<p>Javier Giménez Aznar and his team previously worked on delivering Pentaho-based data warehouses to large Spanish corporations and government agencies, so they have a deep understanding of <a href="http://mondrian.pentaho.org/">Mondrian ROLAP</a> and are using that knowledge to build the LiteBI service, this time with SMBs as the target customers rather than corporates. Pricing starts at €145 per month and is based on the number of concurrent users, the number of analytical spaces and the data volumes, so it&#8217;s not for very small firms; it&#8217;s more for the Medium in SMB.</p>
<p>Impressions? The cube designer, dashboard builders and the general UI are all very good; I think they would appeal to end-user datasmiths and, as such, will be a major up-front aid to selling this product.  But it was LiteBI&#8217;s approach to the thorny issue of ETL and data loading that impressed me and also helped ease some of my Coghead-induced fears.</p>
<p>BI technology stacks consist of three elements:</p>
<ul>
<li>The &#8220;fancy&#8221; front-end: graphs, animated dashboards and so on.</li>
<li>The pivot engine: ROLAP or MOLAP or both.</li>
<li>The ETL process.</li>
<li>(Many would say there&#8217;s an important 4th, the data warehouse, though not every BI effort requires one; that&#8217;s another issue.)</li>
</ul>
<p>LiteBI is continuing to build yet more functionality into their UI and this &#8220;fancy&#8221; front-end is essential as it&#8217;s their &#8220;shop window&#8221;.</p>
<p>Mondrian provides their pivot engine, and again they continue to work on optimisations such as column-based datastores to increase speed and automate responsiveness tuning (end-users are very unforgiving of slow pivots).</p>
<p>But it&#8217;s in the 3rd area, that of the ETL process, that you realise the LiteBI team has real-world BI experience.  Data is loaded into LiteBI via an API, but with the ETL process itself happening on the customer side.</p>
<p>&#8220;Well, so what?&#8221; you may ask. The extraction of data obviously has to happen customer-side (though not when data is sourced from the likes of SalesForce.com). Yes, but it&#8217;s the transformations and data cleansing that add true value to the ETL process and subsequently determine the quality and usefulness (as opposed to the speed or the &#8220;prettiness&#8221; of delivery) of the solution.</p>
<p>Part of the process of adopting LiteBI is an ETL consultancy stage, where a LiteBI partner company will provide on-site services to build this ETL layer, handling not just transformations but also the initial load and the automation of subsequent delta uploads.</p>
<p>So the cost mounts up, but in reality you can&#8217;t do BI without this investment; there&#8217;s no ETL magic bullet.  Even so, Javier says the typical go-live time for a LiteBI project would be in the order of 3-4 weeks rather than the 3-4 months of similar on-site Pentaho projects.</p>
<p>The end-user &#8216;owning&#8217; the ETL process makes the prospect of a service lock-out slightly less worrying as, at least, one would still have a good starting point for moving to another provider or back in-house. What I would really like to see would be the option to self-host LiteBI, which I guess would involve open sourcing large parts of the service (the automated optimisation strategies could, for example, be excluded from this open source version).</p>
<p>The load API comes packaged as a plugin to <a href="http://kettle.pentaho.org/">Kettle</a> (aka PDI) and the intention is to offer a similar add-on for <a href="http://www.talend.org">Talend</a> in the near future. LiteBI also provides a white-label option whereby 3rd party OLTP solution providers can use the service as their product&#8217;s BI suite.</p>
<p>Like the <a href="http://www.everything2.org/title/Skibbereen%2520Eagle">Skibbereen Eagle keeping its eye on the Czar of Russia</a>, I too will be keeping a watchful eye on <a href="http://www.litebi.com/">LiteBI</a> and the march of on-demand BI in general.</p>
<p style="text-align:right;"><em>Why not join me on Twitter at </em><a href="http://www.twitter.com/gobansaor"><em>gobansaor</em></a><em>?</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2009/04/24/litebi-heavy-etl/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Pentaho Data Integration (Kettle) V Talend Benchmark</title>
		<link>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/</link>
		<comments>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/#comments</comments>
		<pubDate>Thu, 04 Dec 2008 17:56:22 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[benchmark]]></category>
		<category><![CDATA[Matt Casters]]></category>
		<category><![CDATA[PDI.TOS]]></category>
		<category><![CDATA[Pentaho]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=587</guid>
		<description><![CDATA[Pentaho&#8217;s Matt Casters has just published a benchmarking exercise comparing Kettle and Talend.  In it he admits he&#8217;s not a Talend expert and he advises that people should perform their own benchmarks where possible, as requirements differ.  Nevertheless, unlike most other benchmarks we&#8217;ve seen on the subject, he publishes not just the results but the [...] ]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.ibridge.be/?p=150">Pentaho&#8217;s Matt Casters has just published a benchmarking exercise comparing Kettle and Talend</a>.  In it he admits he&#8217;s not a Talend expert and he advises that people should perform their own benchmarks where possible, as requirements differ.  Nevertheless, unlike most <a href="http://blog.gobansaor.com/2008/10/30/open-source-metrics/">other benchmarks we&#8217;ve seen on the subject</a>, he publishes not just the results but the actual transformation &#8220;code&#8221; used in the tests. </p>
<p><a href="http://www.nicholasgoodman.com/bt/blog/2008/11/26/an-arms-race-my-customers-dont-care-about/">For many people these benchmarks are of no real interest</a>; as long as the product does what is required within the time and resources available, they&#8217;re content.  But it would be a mistake to think that benchmarks don&#8217;t matter. They do; people have made, and will make, that final decision based on them.  Remember, ETL is not life and death, so the decision on which tool (if any) to go with may not get the level of investigation that the developers behind such products expect of their potential clientele, and this is particularly true of open source.  Busy people will use such reports to direct them down a path or to confirm their existing prejudices. So I&#8217;m really glad to see Matt responding and, in particular, responding in the manner he has.</p>
<p>Database vendors have for years played the benchmarking game, setting and breaking records either via real technological advances or by simply gaming the process.  We as purchasers and users knew in many cases to take the results with a large dose of salt, but purchasing decisions were nevertheless made on the backs of these surveys.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Open Source Metrics and Benchmarks</title>
		<link>http://blog.gobansaor.com/2008/10/30/open-source-metrics/</link>
		<comments>http://blog.gobansaor.com/2008/10/30/open-source-metrics/#comments</comments>
		<pubDate>Thu, 30 Oct 2008 12:24:20 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[ETL benchmarks]]></category>
		<category><![CDATA[PDI 3.0]]></category>
		<category><![CDATA[WaveMaker]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=557</guid>
		<description><![CDATA[Marc Russel&#8217;s blog links to a Manapps ETL benchmark report comparing the performance of several leading ETL tools, both proprietary (DataStage and Informatica) and OS (Talend and PDI (aka Kettle)).  As would be expected each tool has its own strengths and weaknesses, but one thing stands out, the venerable Kettle ETL aka PDI 3.0 is now [...] ]]></description>
			<content:encoded><![CDATA[<p><a href="http://marcrussel.wordpress.com/">Marc Russel&#8217;s blog</a> links to a Manapps ETL benchmark report comparing the performance of several leading ETL tools, both proprietary (DataStage and Informatica) and OS (Talend and PDI (aka Kettle)).  <span style="text-decoration:line-through;">As would be expected each tool has their own strengths and weaknesses, but one thing stands out, the venerable </span><a href="http://www.ibridge.be/?page_id=122"><span style="text-decoration:line-through;">Kettle ETL</span></a><span style="text-decoration:line-through;"> aka PDI 3.0 is now a serious contender for handling very large datasets</span>.  Oops, that&#8217;s what I get for wishing for a result and (mis-)reading the report early in the morning with a cold and a bad sore throat. Sadly, PDI is still very much slower than its OS cousin Talend. In fact, Talend continues to play on the strength that comes from a code-generated solution, i.e. raw speed.  As a pure ETL play, Talend is well capable of playing on the same pitch as the &#8220;big kids&#8221;. </p>
<p>Interestingly, the report is also &#8220;open source&#8221; as it&#8217;s released under a <a href="http://creativecommons.org/licenses/by/3.0/us/"><span style="text-decoration:none;">Creative Commons License</span></a>, so I can link to it <span style="text-decoration:none;"><span style="text-decoration:line-through;">here</span></span><span style="color:#000000;"><a href="http://marcrussel.files.wordpress.com/2008/10/etlbenchmarks_manappsc221008.pdf">.</a></span></p>
<p><span style="color:#000000;"><strong>UPDATE: </strong></span></p>
<p><span style="color:#000000;">There&#8217;s now a new version of the report available (<a href="http://www.manapps.com/" target="_blank">www.manapps.com</a>, Topic Benchmark); it seems the original was just a work-in-progress and was not meant for public release.  The main difference appears to be a significant improvement in Informatica&#8217;s &#8216;score&#8217;, but I&#8217;m not sure, as I was really only interested in comparing the two OSS products, Talend and Pentaho PDI. In that &#8216;battle&#8217; Pentaho still comes out &#8216;slower&#8217;. </span></p>
<p><span style="color:#000000;"> The original Marc Russel blog entry and a subsequent one reporting the new updated report appear to have both been removed.  </span></p>
<p><span style="color:#000000;">Also, I was informed of the &#8216;updated&#8217; report via this email from manapps, which assures vendors that they are happy to rerun any tests and provide any information re the running of such tests &#8230; </span></p>
<blockquote><p><span style="color:#000000;">Dear Sir,</span></p>
<p>You referred on your web site to the report called “Benchmark ETL” by Manapps, from November 2008. This draft report was not intended to be publicly released since just a working document.<br />
We would like you (i) publish Asap the modified version (or its related link) that supersedes the former one (on our web site (<a href="http://www.manapps.com/" target="_blank">www.manapps.com</a>, Topic Benchmark), (ii) state that Manapps had no intend to release the former report and accordingly takes no responsibility on its content, (iii) state that Manapps holds all necessary elements at the disposal of all vendors so that they can rerun some tests if wished that will then be published.</p>
<p>Regards,<br />
Philippe THOMAS</p>
<p>Time: Thursday March 5, 2009 at 5:10 pm</p></blockquote>
<p> </p>
<p><a href="http://www.keeneview.com/2008/10/open-source-marketing-metrics-from-0-to.html"><span style="text-decoration:none;">Another analysis of OSS in the wild</span></a>, this time from Chris Keene, <a href="http://www.wavemaker.com/"><span style="text-decoration:none;">WaveMaker</span></a> CEO, on OSS as a marketing tool. Bottom line: 1% conversion rate, 700 paying customers in 9 months &#8230;</p>
<div class="wp-caption aligncenter" style="width: 510px"><a href="http://www.keeneview.com/uploaded_images/metrics-783392.jpg"><img title="WaveMaker, from click to paying customer." src="http://www.keeneview.com/uploaded_images/metrics-783392.jpg" alt="WaveMaker OSS as a marketing tool" width="500" height="400" /></a><p class="wp-caption-text">WaveMaker OSS as a marketing tool</p></div>
<p> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/10/30/open-source-metrics/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

		<media:content url="http://www.keeneview.com/uploaded_images/metrics-783392.jpg" medium="image">
			<media:title type="html">WaveMaker, from click to paying customer.</media:title>
		</media:content>
	</item>
		<item>
		<title>New universal SQLite JDBC library.</title>
		<link>http://blog.gobansaor.com/2008/07/21/new-universal-sqlite-jdbc-library/</link>
		<comments>http://blog.gobansaor.com/2008/07/21/new-universal-sqlite-jdbc-library/#comments</comments>
		<pubDate>Mon, 21 Jul 2008 16:54:42 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[news]]></category>
		<category><![CDATA[JDBC]]></category>
		<category><![CDATA[universal]]></category>
		<category><![CDATA[zentus.com]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=417</guid>
		<description><![CDATA[Both Talend (Java) and Kettle distribute the Zentus.com pure-Java SQLite JDBC driver and for most purposes this run-anywhere version is fine. But, if you really need to take advantage of SQLite&#8217;s speed then connecting using the native JNI version is a must.  Doing this was easy enough, just change over to using a generic JDBC [...] ]]></description>
			<content:encoded><![CDATA[<p>Both Talend (Java) and Kettle distribute the <a href="http://www.zentus.com/sqlitejdbc/">Zentus.com pure-Java SQLite JDBC driver</a> and for most purposes this run-anywhere version is fine. But if you really need to take advantage of SQLite&#8217;s speed, then connecting using the native JNI version is a must.  Doing this was easy enough: just change over to a generic JDBC connection, specifying the required native jar and placing the associated dll/so on your system path.</p>
<p>But now there&#8217;s an easier way: the latest version (V052; in fact, any version from V050 on) is a universal jar that contains native JNI libraries for Windows, Linux and MacOS alongside the pure-Java version.  It will automatically pick the correct lib for the platform and fall back to the pure-Java version if required.  You can tell if it&#8217;s picked up the native lib by calling conn.getMetaData().getDriverVersion(); it&#8217;ll return &#8220;native&#8221; if it has.</p>
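<p>As a rough sketch of that check, assuming the driver class is <code>org.sqlite.JDBC</code>, that the jar is on the classpath and that an in-memory database will do (the class name <code>SqliteDriverCheck</code> and the helper <code>isNative</code> are made up for illustration), something like this should report which flavour was loaded via the standard JDBC metadata call:</p>

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class SqliteDriverCheck {

    // True when the reported driver version mentions the native library.
    static boolean isNative(String driverVersion) {
        if (driverVersion == null) {
            return false;
        }
        return driverVersion.toLowerCase().contains("native");
    }

    public static void main(String[] args) {
        try {
            // Assumption: the SQLite JDBC jar is on the classpath.
            Class.forName("org.sqlite.JDBC");
        } catch (ClassNotFoundException e) {
            System.out.println("SQLite JDBC driver not found on the classpath");
            return;
        }
        // An in-memory database is enough to ask the driver about itself.
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite::memory:")) {
            String version = conn.getMetaData().getDriverVersion();
            String flavour = isNative(version) ? "native JNI" : "pure Java";
            System.out.println("Driver version: " + version + " (" + flavour + ")");
        } catch (SQLException e) {
            System.out.println("Could not open a connection: " + e.getMessage());
        }
    }
}
```

<p>The string comparison is deliberately loose, since the exact version text may differ between driver releases.</p>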
<p>To upgrade to this jar in Kettle see <a href="http://blog.gobansaor.com/2007/10/05/using-the-latest-pure-java-sqlite-jdbc-driver-in-kettle/">this</a>, this time replacing the nested jar with <strong>sqlitejdbc-v052.jar</strong>.</p>
<p>For Talend:</p>
<ul>
<li>Either rename the new V052 jar to <strong><em>sqlitejdbc_v037_nested.jar</em></strong> and replace the existing V037 jar in the ../lib/java folder with this renamed file.</li>
<li>Or, you could edit the Java-specific XML files in the various tSQLite component folders, replacing the references to the old nested V037 jar.</li>
<li>Or, and this is what I would do, don&#8217;t use the tSQLite components; replace them with generic tJDBC components. Then you can pick whatever version of the driver you require; you could even change to a different database provider!</li>
</ul>
<p>The Talend tradition of a separate set of components for each type of database seems to be a hangover from its Perl-generating roots. It&#8217;s true that database-specific components are required for certain tasks such as bulk-loading, ELTs and so on, but JDBC was designed to be generic and, as long as the SQL syntax is compatible, it makes switching database providers in and out very easy.  So unless there&#8217;s a good reason, stick to using tJDBC.</p>
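<p>To illustrate the generic-JDBC point outside of Talend, here is a minimal sketch in plain Java. The property names <code>jdbc.driver</code> and <code>jdbc.url</code>, the class <code>GenericJdbcRunner</code> and the helper <code>vendorOf</code> are all hypothetical, and <code>SELECT 1</code> is assumed to be valid for the target database (it is for SQLite, but not for every provider):</p>

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

class GenericJdbcRunner {

    // Returns the vendor part of a JDBC URL, e.g. "sqlite" for "jdbc:sqlite:sample.db".
    static String vendorOf(String jdbcUrl) {
        String[] parts = jdbcUrl.split(":", 3);
        if (parts.length != 3 || !parts[0].equals("jdbc")) {
            throw new IllegalArgumentException("Not a JDBC URL: " + jdbcUrl);
        }
        return parts[1];
    }

    public static void main(String[] args) {
        // Hypothetical settings: only these two values change when the provider does.
        String driverClass = System.getProperty("jdbc.driver", "org.sqlite.JDBC");
        String url = System.getProperty("jdbc.url", "jdbc:sqlite::memory:");
        System.out.println("Connecting to a " + vendorOf(url) + " database");
        try {
            Class.forName(driverClass);
            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT 1")) {
                rs.next();
                System.out.println("Query returned " + rs.getInt(1));
            }
        } catch (Exception e) {
            // Missing driver jar or unreachable database.
            System.out.println("Connection failed: " + e);
        }
    }
}
```

<p>Nothing in the code names a particular database beyond the two defaults, which is the same property that makes tJDBC the more flexible choice.</p>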
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/07/21/new-universal-sqlite-jdbc-library/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Regular Expressions as an end-user programming tool?</title>
		<link>http://blog.gobansaor.com/2008/07/01/regular-expressions-as-an-end-user-programming-tool/</link>
		<comments>http://blog.gobansaor.com/2008/07/01/regular-expressions-as-an-end-user-programming-tool/#comments</comments>
		<pubDate>Tue, 01 Jul 2008 12:24:25 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[regex]]></category>
		<category><![CDATA[regular expressions]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=382</guid>
		<description><![CDATA[&#8220;What? Have you completely lost the plot, Gleeson?&#8221;, I hear you scream.  Jamie Zawinski&#8217;s famous quote is intoned once more .. Some people, when confronted with a problem, think “I know, I&#8217;ll use regular expressions.”   Now they have two problems. Of course the above quote could be (and probably has been) changed to&#8230; Most business [...] ]]></description>
			<content:encoded><![CDATA[<p>&#8220;What? Have you completely lost the plot, Gleeson?&#8221;, I hear you scream.  <a href="http://www.jwz.org/">Jamie Zawinski&#8217;s</a> famous quote is intoned once more ..</p>
<blockquote><p><strong>Some people, when confronted with a problem, think<br />
“I know, I&#8217;ll use regular expressions.”   Now they have two problems.</strong></p></blockquote>
<p>Of course the above quote could be (and probably has been) changed to&#8230;</p>
<blockquote><p><strong>Most business people, when confronted with a problem, think<br />
“I know, I&#8217;ll use a spreadsheet.”   Now they have two problems.</strong></p></blockquote>
<p>They are dense, single-line, single-purpose, self-contained mini-programs.  The previous statement applies to <a href="http://en.wikipedia.org/wiki/Regular_expression">regular expressions</a> but could equally be used to describe the single most popular end-user programming tool, spreadsheet formulae (particularly in their nested form!).</p>
<p>As somebody with the &#8220;<a href="http://blogs.msdn.com/alfredth/archive/2006/12/21/is-there-a-programming-gene.aspx">programming gene</a>&#8221; (something most, but not all, IT professionals possess, as do a significant proportion of &#8220;civilians&#8221;), such compressed logic somewhat grates compared with the power and elegance of more expressive programming languages, but that hasn&#8217;t stopped me using both spreadsheet formulae and regex to quickly and effectively solve problems when the need arose.</p>
<p>Those without the programming gene (the vast majority of business users) find traditional programming languages next to impossible to get their heads around, yet find spreadsheet formulae approachable and useful.  It <a href="http://hackety.org/2007/08/15/oneLinersAreCrucial.html">seems to be something to do with approaching problems as a series of simple problems</a> and not loading the whole problem domain into one&#8217;s brain at one sitting (as <a href="http://www.paulgraham.com/head.html">most programmers and system designers are capable of</a>).</p>
<p>In the past, non-programmers would rarely come into contact with regex, as its use was possible only within the realms of professional programming or Unix sys-admin toolsets (sed, <a href="http://en.wikipedia.org/wiki/Awk">awk</a>, etc.).  But now, ETL tools such as <a href="http://blog.gobansaor.com/2007/05/27/talend-vs-kettle-pentaho-pdi/">Kettle and Talend</a> allow end-users to use regular expressions without the need to understand the underlying programming language.  Taking this to the next step, Talend&#8217;s <a href="http://www.talend.com/products-data-quality/talend-open-profiler.php">new data profiling product</a> uses regular expressions as its main discovery language. They could, I guess, have invented yet another XML dialect and/or <a href="http://en.wikipedia.org/wiki/Query_by_Example">query-by-example</a> dialogue, but instead they&#8217;ve taken the sensible (and cheaper) option and exposed the full power of raw regular expressions.</p>
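<p>To give a flavour of regex as a profiling language, here is a small sketch in plain Java. It is not how Talend&#8217;s profiler is implemented; the class <code>RegexProfile</code>, the helper <code>countMatches</code>, the sample column and the date pattern are all invented for illustration:</p>

```java
import java.util.regex.Pattern;

class RegexProfile {

    // Counts how many values match the pattern in full.
    static int countMatches(String regex, String[] values) {
        Pattern pattern = Pattern.compile(regex);
        int matches = 0;
        for (String value : values) {
            if (pattern.matcher(value).matches()) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical sample column with mixed date formats.
        String[] column = {"2008-07-01", "01/07/2008", "2008-7-1", "2008-12-04"};
        // Four digits, dash, two digits, dash, two digits.
        String isoDate = "\\d{4}-\\d{2}-\\d{2}";
        int ok = countMatches(isoDate, column);
        System.out.println(ok + " of " + column.length + " values look like ISO dates");
    }
}
```

<p>The one-line pattern is the only part an end-user would need to write; everything around it is plumbing that an ETL or profiling tool would supply.</p>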
<p>Will the great unwashed embrace regex in the same way they took to nested Excel functions, embarrassing their professional colleagues with yet more amateurish and often unmaintainable messy solutions, that just work? I think they just might&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/07/01/regular-expressions-as-an-end-user-programming-tool/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>SQLite &#8211; the ultimate data-smithing tool!</title>
		<link>http://blog.gobansaor.com/2008/04/26/sqlite-the-ultimate-data-smithing-tool/</link>
		<comments>http://blog.gobansaor.com/2008/04/26/sqlite-the-ultimate-data-smithing-tool/#comments</comments>
		<pubDate>Sat, 26 Apr 2008 16:21:16 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[AmazonAWS]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[Microsoft Access]]></category>
		<category><![CDATA[Amazon SimpleDB]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=363</guid>
		<description><![CDATA[Although my data-smithing toolbox is full to the brim with powerful tools such as Talend, Kettle PDI, Picalo and Excel, all backed by the cloud infrastructure of Amazon&#8217;s S3, SimpleDB and EC2, there&#8217;s one simple yet powerful tool that I always seem to gravitate back to: SQLite. Now [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=363&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="float:right;margin:1em;"><a href="http://commons.wikipedia.org/wiki/Image:SQLite_Logo_4.png" target="_blank"><img style="border:medium none;display:block;" src="http://upload.wikimedia.org/wikipedia/commons/thumb/1/19/SQLite_Logo_4.png/202px-SQLite_Logo_4.png" alt="SQLite logo as of 2007-12-15" /></a>Image via <a href="http://commons.wikipedia.org/wiki/Image:SQLite_Logo_4.png" target="_blank">Wikipedia</a></div>
<p>Although my data-smithing toolbox is full to the brim with powerful tools such as <a href="http://www.talend.com">Talend</a>, <a href="http://blog.gobansaor.com/2007/05/27/talend-vs-kettle-pentaho-pdi/">Kettle PDI</a>, <a href="http://blog.gobansaor.com/2008/04/11/python-the-new-vba/">Picalo</a> and Excel, all backed by the cloud infrastructure of <a href="http://www.amazonaws.com/">Amazon&#8217;s S3, SimpleDB and EC2</a>, there&#8217;s one simple yet powerful tool that I always seem to gravitate back to: <a href="http://blog.gobansaor.com/2007/02/01/whats-so-good-about-sqlite/">SQLite</a>.</p>
<p>Now obviously, being a hewer of data, I need a SQL-compliant database for data manipulation, and SQLite performs that task with speed and ease.  But it&#8217;s not just in the hewing; it&#8217;s in the hauling of data where SQLite also shines.</p>
<p>I use SQLite as the container for passing tabular datasets between (and within) my various tools. That data doesn&#8217;t even need to be clean (thanks to SQLite&#8217;s liberal <a href="http://www.sqlite.org/different.html">manifest typing</a> rules), just so long as it can be expressed as a table.</p>
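<p>As a small illustration of what manifest typing means for dirty data, here is a sketch using Python&#8217;s built-in <code>sqlite3</code> module. The <code>staging</code> table and its values are hypothetical. A column declared INTEGER still accepts text that is not a number, and SQLite records the type per value rather than rejecting the row.</p>

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical staging table: the column is declared INTEGER ...
conn.execute("CREATE TABLE staging (amount INTEGER)")

# ... but numeric text, junk text, a float and a NULL are all accepted.
conn.executemany(
    "INSERT INTO staging (amount) VALUES (?)",
    [("123",), ("n/a",), (45.5,), (None,)],
)

# typeof() reports the storage class SQLite chose for each individual value.
rows = conn.execute(
    "SELECT amount, typeof(amount) FROM staging ORDER BY rowid"
).fetchall()

for value, storage_class in rows:
    print(repr(value), storage_class)

conn.close()
```

<p>The numeric-looking text is converted to an integer, while the junk value is kept as text, so cleaning can be left to a later step in the chain.</p>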
<p>For example, a Talend job could store an extracted dataset in a SQLite file and pass that file on to a Python script for some special processing (for example, extracting further data from a source not directly supported by Talend, such as <a href="http://pypi.python.org/pypi/sapnwrfc/">SAP</a> or <a href="http://blog.gobansaor.com/2008/01/03/couchdb-ibms-simpledb-and-s3/">SimpleDB</a>). The script could then pass the resulting SQLite database on to Excel or <a href="http://blog.gobansaor.com/2007/04/13/proto/">a similar tool</a> to allow a business user to view and perhaps modify the data. Finally, Talend picks up the file again to load it into a corporate data warehouse.</p>
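<p>The hand-off described above can be sketched in a few lines of Python, with each function standing in for one tool in the chain. This is a toy under stated assumptions: the file name <code>handoff.db</code>, the <code>orders</code> and <code>customer_totals</code> tables and the three stage functions are all invented for the example, and a real Talend or Excel step would of course not be a Python function.</p>

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def extract(db_path):
    """Stage 1 (standing in for the ETL job): write the extracted dataset."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(1, "acme", 10.0), (2, "acme", 5.5), (3, "bolt", 7.25)],
        )
        conn.commit()


def enrich(db_path):
    """Stage 2 (standing in for the script): add a derived table to the same file."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(
            "CREATE TABLE customer_totals AS "
            "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
        )
        conn.commit()


def load(db_path):
    """Stage 3 (standing in for the warehouse load): read the final result."""
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(
            "SELECT customer, total FROM customer_totals ORDER BY customer"
        ).fetchall()


# The only thing passed between stages is the path to one SQLite file.
with tempfile.TemporaryDirectory() as workdir:
    handoff = os.path.join(workdir, "handoff.db")
    extract(handoff)
    enrich(handoff)
    totals = load(handoff)

print(totals)
```

<p>Each stage opens and closes the file independently, which is what lets different tools (or different machines, if the file is copied) take turns with it.</p>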
<p>Now you could use flat files to transport the data, or store the intermediate results in a corporate database, but SQLite is as easy as, if not easier than, flat files and offers the SQL processing capabilities of big-iron databases without the hassle of getting write access to an existing server or setting one up from scratch.</p>
<p>And I know there are other similar file-based database offerings, such as MS Access and the Java-only <a href="http://hsqldb.org/">HSQLDB</a>, but neither matches SQLite&#8217;s ubiquity, sheer simplicity and powerful data-processing ability.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/04/26/sqlite-the-ultimate-data-smithing-tool/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

		<media:content url="http://upload.wikimedia.org/wikipedia/commons/thumb/1/19/SQLite_Logo_4.png/202px-SQLite_Logo_4.png" medium="image">
			<media:title type="html">SQLite logo as of 2007-12-15</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixie.png?x-id=7ccd587a-6011-4f41-b9f1-f3fb3f5821f5" medium="image" />
	</item>
		<item>
		<title>Dublin Bus and PALO ETL &#8211; the connection!</title>
		<link>http://blog.gobansaor.com/2008/01/26/dublin-bus-and-palo-etl-the-connection/</link>
		<comments>http://blog.gobansaor.com/2008/01/26/dublin-bus-and-palo-etl-the-connection/#comments</comments>
		<pubDate>Sat, 26 Jan 2008 20:17:36 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[AmazonAWS]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Palo]]></category>
		<category><![CDATA[S3]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[SimpleDB]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[VBA]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[olap]]></category>
		<category><![CDATA[Dublin]]></category>
		<category><![CDATA[Dublin Bus]]></category>
		<category><![CDATA[hmac]]></category>
		<category><![CDATA[sha1]]></category>
		<category><![CDATA[sha1hmac]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/?p=351</guid>
		<description><![CDATA[Dublin buses, as is the norm with most road-based public transport systems in our increasingly car-choked cities, tend to operate on the basis of &#8220;no sign of a bus for ages, then two or three arrive at the same time&#8221;. Palo MOLAP ETL options appear to be following the same pattern; we&#8217;ve been waiting for [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=351&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.dublinbuses.com/">Dublin buses</a>, as is the norm with most road-based public transport systems in our increasingly car-choked cities, tend to operate on the basis of &#8220;no sign of a bus for ages, then two or three arrive at the same time&#8221;. <a href="http://www.palo.net">Palo MOLAP</a> ETL options appear to be following the same pattern; we&#8217;ve been waiting for ETL support for ages and now we see three of them heading down the road towards us. There&#8217;s <a href="http://blog.gobansaor.com/2008/01/05/palo-etl-server-and-sap/">Palo&#8217;s own offering</a>, then came <a href="http://www.stratebi.com/">Stratebi</a>&#8217;s <a href="http://sourceforge.net/projects/palokettleplug/">Kettle Plugin</a> and now <a href="http://www.talend.com/download.php">Talend <b>Version 2.3.0RC2</b> is offering a Palo output component</a>.</p>
<p>Mind you, the Talend offering is very basic and I&#8217;ve not managed to get the Stratebi plugin to work, leaving Palo&#8217;s ETL Server as the front runner at the moment (drill-through capability is a winner in my book).</p>
<p>I&#8217;ve also been busy re-factoring my <a href="http://blog.gobansaor.com/projects/xlite/">VBA SQLite</a> and Amazon S3 code with the intention of publishing them as an Excel-based micro-ETL platform.  While cleaning up the Amazon AWS modules I&#8217;ve been playing with <a href="http://www.amazon.com/b?ie=UTF8&amp;node=342335011">SimpleDB</a>, and I&#8217;m impressed: Excel combined with SimpleDB rocks!</p>
<p>I&#8217;ve also wrapped the open source <a href="http://xyssl.org/code/source/sha1/">XySSL SHA1 HMAC</a> C code in a VBA-friendly DLL, as my search for a VBA HMAC-SHA1 hash implementation (essential for Amazon AWS access) has proved fruitless.</p>
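<p>For readers who have not met HMAC-SHA1 signing before, here is a minimal sketch of the operation in Python, using only the standard <code>hmac</code>, <code>hashlib</code> and <code>base64</code> modules. It is not the VBA/DLL code mentioned above. The <code>sign</code> function, the secret key and the string being signed are placeholders, and the exact string that a given AWS service expects to be signed is not shown here.</p>

```python
import base64
import hashlib
import hmac


def sign(secret_key, string_to_sign):
    """Return the base64-encoded HMAC-SHA1 of string_to_sign under secret_key."""
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder inputs for illustration only; a real request would build the
# string to sign according to the rules of the service being called.
signature = sign("hypothetical-secret-key", "hypothetical string to sign")
print(signature)
```

<p>An SHA-1 digest is 20 bytes, so the base64 signature is always 28 characters long; a quick way to check any implementation (VBA wrapper included) is to compare its output against the published HMAC-SHA1 test vectors in RFC 2202.</p>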
<p>I hope to release the lot by the end of next month.</p>
<p>UPDATE:</p>
<p>Thanks to Javier and Jorge from <a href="http://www.stratebi.com/">Stratebi</a> I&#8217;ve managed to get the new Kettle Palo plugin to work.  It seems that the TEST facility in the Kettle database connection dialogue throws an exception for Palo connections but the connections work fine in the actual Palo input/output steps.   Did a quick test and it looks very easy to use and fits in well with the Kettle &#8220;way of doing things&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/01/26/dublin-bus-and-palo-etl-the-connection/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>PALO ETL-Server, first sighting &#8230;</title>
		<link>http://blog.gobansaor.com/2007/12/06/palo-etl-server-first-sighting/</link>
		<comments>http://blog.gobansaor.com/2007/12/06/palo-etl-server-first-sighting/#comments</comments>
		<pubDate>Thu, 06 Dec 2007 16:50:05 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Palo]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[HSQLDB]]></category>
		<category><![CDATA[IMPPalo]]></category>
		<category><![CDATA[Palo ETL-Server]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/2007/12/06/palo-etl-server-first-sighting/</guid>
		<description><![CDATA[I was wrong. I figured Jedox would build their new ETL server on one of the existing open source ETL project code-bases, either Talend or Pentaho&#8217;s Kettle. Instead, the new alpha ETL server code which has just been uploaded to SourceForge is based on neither and appears to have been developed by another German company [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=341&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<p>I was wrong.  I figured Jedox would build their <a href="http://www.jedox.com/en/news/216/palo_etl-server_enterprise_data_integration_for_palo.html">new ETL server</a> on one of the existing open source ETL project code-bases, either <a href="http://www.talend.com">Talend</a> or <a href="http://kettle.pentaho.org">Pentaho&#8217;s Kettle</a>.  Instead, the new alpha ETL server code, which has just been <a href="http://sourceforge.net/projects/palo-importer">uploaded to SourceForge</a>, is based on neither and appears to have been developed by another German company, <a href="http://www.proclos.com">Proclos</a>.</p>
<p>Rather than a full-featured all-things-to-all-men ETL tool, it&#8217;s a specialist MOLAP cube import tool, like an XML-driven version of <a href="http://www.imppalo.com/">IMPPalo</a>.  Being Java based, it should be easy enough to combine with Kettle to offer the best of both worlds; let Kettle do the heavy lifting and the management of conformed dimensions and fact tables, then use Palo ETL-Server to build the hierarchies and load the cubes from these tables.</p>
<p>There&#8217;s no documentation as yet, but there are two demo projects: importRelDB.xml, which loads data into a cube from an HSQLDB in-memory database and a CSV file; and importOLAP.xml, which copies data from one Palo cube to another.</p>
<p>To run the importRelDB.xml project  &#8230;</p>
<p>java -jar importer.jar -p importRelDB</p>
<p>&#8230; each project is broken up into Jobs (such as Initdata, MasterData, CubeData, again like IMPPalo) and these can be run separately by using the -j option.</p>
<p>The tool is controlled via XML configuration files and lacks a GUI (which is fine by me, I&#8217;m more of a command-line guy).  I&#8217;ve checked out the <a href="https://palo-importer.svn.sourceforge.net/svnroot/palo-importer">SVN code</a> and am slowly working my way through it. There&#8217;s no sign as yet as to how drill-back from PALO cubes will be enabled; as this project is called <strike>Importer</strike> ETLCore, perhaps that&#8217;s yet to come.</p>
<p>So far, I like what I see.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/12/06/palo-etl-server-first-sighting/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>New ETL platform for PALO OLAP</title>
		<link>http://blog.gobansaor.com/2007/11/28/new-etl-platform-for-palo-olap/</link>
		<comments>http://blog.gobansaor.com/2007/11/28/new-etl-platform-for-palo-olap/#comments</comments>
		<pubDate>Wed, 28 Nov 2007 10:45:56 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[BI]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Palo]]></category>
		<category><![CDATA[Talend]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[olap]]></category>
		<category><![CDATA[drill-back]]></category>
		<category><![CDATA[Drill-down]]></category>
		<category><![CDATA[Jedox]]></category>
		<category><![CDATA[Mondrian]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/2007/11/28/new-etl-platform-for-palo-olap/</guid>
		<description><![CDATA[Jedox have announced that they intend to ship a Palo centric ETL open source server product early next year. This is excellent news and is on top of the new rules engine that was added to Palo this summer. Open source MOLAP has suddenly taken off the training wheels and is getting ready to mess [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=338&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.jedox.com/en/news/216/palo_etl-server_enterprise_data_integration_for_palo.html">Jedox have announced</a> that they intend to ship a <a href="http://www.palo.net">Palo</a> centric ETL open source server product early next year.  This is excellent news and is on top of the <a href="http://blog.gobansaor.com/2007/08/04/palo-20-beta-released/">new rules engine that was added</a> to Palo this summer.  Open source <a href="http://en.wikipedia.org/wiki/MOLAP">MOLAP</a> has suddenly taken off the training wheels and is getting ready to mess with the big kids.  The two things I really like about the proposed new Palo-ETL server are that it&#8217;s open-source and that it&#8217;s designed to enable drill-down from the analysis cube back to the data source.</p>
<p>Drill back is the 2nd most common reason for continuing IT staff involvement in the day-to-day running of BI projects (the 1st is of course the <a href="http://en.wikipedia.org/wiki/Extract,_transform,_load">ETL</a> process).   The &#8220;Where did that figure come from?&#8221; question is one of the reasons that <a href="http://www.cpearson.com/excel/pivots.htm">the Excel pivot table function</a> is so popular: double-click on a data cell and the rows used to generate that figure are displayed on a new sheet; simple but powerful.</p>
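<p>In relational terms, drill back is just the difference between an aggregate and the detail rows behind it. The sketch below shows that idea with Python&#8217;s built-in <code>sqlite3</code> module; the <code>sales</code> table, its contents and the two function names are hypothetical, and this is not how Palo or Excel implement the feature.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical fact table feeding a two-dimensional summary.
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Widgets", 100),
        ("North", "Widgets", 50),
        ("North", "Gadgets", 70),
        ("South", "Widgets", 30),
    ],
)


def summary_cell(region, product):
    """The figure a user sees in one cell of the summary."""
    return conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ? AND product = ?",
        (region, product),
    ).fetchone()[0]


def drill_back(region, product):
    """The detail rows that were added up to produce that cell."""
    return conn.execute(
        "SELECT region, product, amount FROM sales "
        "WHERE region = ? AND product = ? ORDER BY rowid",
        (region, product),
    ).fetchall()


cell = summary_cell("North", "Widgets")
detail = drill_back("North", "Widgets")
print(cell)
print(detail)
```

<p>The same filter drives both queries, which is why drill back is only possible when the tool that loaded the cube still knows where the source rows live.</p>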
<p>As to what platform the new server is to be built on, I&#8217;m guessing <a href="http://blog.gobansaor.com/2007/05/27/talend-vs-kettle-pentaho-pdi/">Talend or Kettle</a>. Now it is possible that Jedox have rolled their own product from scratch, but with two superb open-source products already out there it would seem pointless.  <a href="http://www.talend.com">Talend</a> actively seeks out technology licensing agreements with <a href="http://www.talend.com/partners/talend-alliance-partners.php">other companies</a> and has just opened <a href="http://www.talend.com/press/talend-open-a-new-office-in-germany.php">an office in Germany</a> (Jedox is based in <a href="http://en.wikipedia.org/wiki/Freiburg">Freiburg</a>), so it would be the most likely contender.  But <a href="http://www.pentaho.com/products/data_integration/">Pentaho&#8217;s Kettle</a> may also be in the running, as there&#8217;s already <a href="http://blog.gobansaor.com/2007/02/07/palo-plugin-for-kettle-pentaho-etl/">prior-art</a> and this <a href="http://forums.pentaho.org/showthread.php?t=55335">Pentaho forum thread</a>.  Also, the one major gap in the comprehensive <a href="http://www.pentaho.com">Pentaho</a> stable is the lack of an <a href="http://blog.gobansaor.com/2007/09/05/in-memory-olap/">in-memory OLAP</a> tool (their current OLAP offering is the <a href="http://en.wikipedia.org/wiki/ROLAP">ROLAP</a> <a href="http://mondrian.pentaho.org/">Mondrian project</a>).</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/11/28/new-etl-platform-for-palo-olap/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Using the latest Pure Java SQLite JDBC driver in Kettle</title>
		<link>http://blog.gobansaor.com/2007/10/05/using-the-latest-pure-java-sqlite-jdbc-driver-in-kettle/</link>
		<comments>http://blog.gobansaor.com/2007/10/05/using-the-latest-pure-java-sqlite-jdbc-driver-in-kettle/#comments</comments>
		<pubDate>Fri, 05 Oct 2007 10:45:01 +0000</pubDate>
		<dc:creator>Tom Gleeson</dc:creator>
				<category><![CDATA[ETL]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[kettle]]></category>
		<category><![CDATA[JDBC]]></category>
		<category><![CDATA[out of memory]]></category>

		<guid isPermaLink="false">http://blog.gobansaor.com/2007/10/05/using-the-latest-pure-java-sqlite-jdbc-driver-in-kettle/</guid>
		<description><![CDATA[The bug in the pure Java SQLiteJDBC driver that caused an &#8220;out of memory&#8221; error when trying to connect to a SQLite database using standard windows drive letters (e.g. c:\kettle\mydata.db) is now fixed. The current version (V037) has also been updated to SQLite version 3.4.2. To use the latest driver within Kettle, download the file [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.gobansaor.com&blog=110633&post=331&subd=gobansaor&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<p>The bug in the <a href="http://www.zentus.com/sqlitejdbc/index.html">pure Java SQLiteJDBC driver</a> that caused an <a href="http://blog.gobansaor.com/2007/01/20/kettle-and-sqlite/">&#8220;out of memory&#8221; error</a> when trying to connect to a SQLite database using standard Windows drive letters (e.g. c:\kettle\mydata.db) is now fixed.  The current version (V037) has also been updated to SQLite version 3.4.2.  To use the latest driver within <a href="http://kettle.pentaho.org">Kettle</a>, download the file from <a href="http://www.zentus.com/sqlitejdbc/dist/sqlitejdbc-v037-nested.tgz">here</a>, go to the ../libext/JDBC folder, delete the included sqlitejdbc-v023-nested.jar and replace it with sqlitejdbc-v037-nested.jar.</p>
<p><strong>Update:</strong></p>
<p>Matt Casters of the Pentaho Kettle team left a comment to say version 2.5.2 and version 3.0.0-RC2 will ship with the sqlitejdbc-v037-nested.jar.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2007/10/05/using-the-latest-pure-java-sqlite-jdbc-driver-in-kettle/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/b714f82b5e24beb3b74779615b6ad969?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
	</channel>
</rss>