
Archive for the ‘kettle’ Category

Marc Russel’s blog links to a Manapps ETL benchmark report comparing the performance of several leading ETL tools, both proprietary (DataStage and Informatica) and open source (Talend and PDI, aka Kettle). As would be expected, each tool has its own strengths and weaknesses, but one thing stands out: the venerable Kettle ETL, aka PDI 3.0, is now [...]


Both Talend (Java) and Kettle distribute the Zentus.com pure-Java SQLite JDBC driver, and for most purposes this run-anywhere version is fine. But if you really need to take advantage of SQLite’s speed, then connecting using the native JNI version is a must. Doing this was easy enough: just change over to using a generic JDBC [...]


“What? Have you completely lost the plot, Gleeson?”, I hear you scream. Jamie Zawinski’s famous quote is intoned once more:
Some people, when confronted with a problem, think
“I know, I’ll use regular expressions.”   Now they have two problems.
Of course the above quote could be (and probably has been) changed to…
Most business people, when confronted with [...]


Although my data-smithing tool box is full to the brim with powerful tools such as Talend, Kettle PDI, Picalo and Excel, all backed by the cloud infrastructure of Amazon’s S3, SimpleDB and EC2, there’s one simple yet powerful tool that I always seem to gravitate back to: SQLite.
Now obviously being [...]


Dublin buses, as is the norm with most road-based public transport systems in our increasingly car-choked cities, tend to operate on the basis of “no sign of a bus for ages, then two or three arrive at the same time”. Palo MOLAP ETL options appear to be following the same pattern; we’ve been waiting for [...]


I was wrong. I figured Jedox would build their new ETL server on one of the existing open source ETL project code-bases, either Talend or Pentaho’s Kettle. Instead, the new alpha ETL server code which has just been uploaded to SourceForge is based on neither and appears to have been developed by another [...]


Jedox have announced that they intend to ship a Palo-centric open source ETL server product early next year. This is excellent news and comes on top of the new rules engine that was added to Palo this summer. Open source MOLAP has suddenly taken off the training wheels and is getting ready [...]


The bug in the pure-Java SQLiteJDBC driver that caused an “out of memory” error when trying to connect to a SQLite database using standard Windows drive letters (e.g. c:\kettle\mydata.db) is now fixed. The current version (V037) has also been updated to SQLite version 3.4.2. To use the latest driver within Kettle, download [...]


Although I’m a total Excel fanboy, I must admit I rarely use it any longer for personal stuff such as home budgets, tax calculations, what-ifs, to-do lists etc.; I now tend to use Google Spreadsheets. Likewise, personal notes, drafts and useful bits of code are stored using Google Docs rather than MS Word. [...]


I’ve been meaning to try out the Apatar ETL/Mashup tool for some time, and today being yet another rainy day in this, the worst Irish summer that I can remember (and Irish summers are not renowned for their lack of rainfall), I decided to give it a try. Not impressed, I’m afraid; comes up [...]


The announcement of Google Gears is of course a game changer for those working in the development of online apps; its addition to Google Reader alone would make it worthwhile for me, and I’m sure we’ll see it integrated into Google Docs and GMail in the near future. If you had any plans [...]


Over the last few weeks I’ve received a lot of traffic from Google searches comparing Talend and Kettle, and also from Vincent McBurney’s ITtoolbox article comparing the two products, so where do I stand?
As ETL tools they take different approaches: Kettle is a metadata-driven framework (which is in turn tightly integrated into an [...]


For the last few months I’ve been looking for my ideal ETL platform. That ideal would be open source, platform independent (well, at least Windows and Linux), flexible, and easily deployable. It had looked like a combination of Kettle and my micro-ETL combinations of Ruby/SQLite and Excel/SQLite would be the eventual “winners”. [...]


Talend have released a new version of their Open Studio ETL tool. It is not as full-featured as Pentaho Kettle and only supports a limited number of databases and file formats: no SQLite support, shock-horror! The press release promises “More than 100 Native Connectors” as well as connectors to ERP and CRM tools but [...]


I’ve spent a few hours trying out the latest Kettle 2.5.0 RC1 release candidate, which has a new UI and lots of new features. It looks like the PALO code developed by 3a-strategy will not make it into this release, but I see Cubeware have released IMP:PALO cube loading software, offering both a free and a premium [...]


What have JavaScript and VBA in common? Not much on the surface, and their respective user bases rarely if ever overlap. What they do share are their roles as the imperative (the if-then-else-loop-etc) programming languages of the “I’m not a programmer” programmers, the great unwashed, the “normal” people out there who [...]


The much awaited Palo plugin for the Kettle ETL tool has been released. Oh happy days!
Palo is an open source MOLAP database developed by the German company Jedox. Although it doesn’t match the power of established OLAP engines such as Essbase and many simple cross-tab/pivot requirements can be handled by an Excel [...]


Kettle and SQLite

Matt Casters has added SQLite support to Pentaho’s Kettle ETL tool in the latest development release. I’ve tested it under Windows using JRE 1.5.0_09 and it worked fine, but having upgraded to JRE 1.5.0_10 I’m now getting “out of memory” errors. It appears to be a problem with the “pure Java” JDBC driver [...]


I’ve been a big fan of SQLite for several years now. Although I come from an Oracle database background, I find that for day-to-day data-smithing SQLite is ideal. Combine it with the expressive power of Ruby and you have a very powerful micro-ETL environment.
I’m also [...]
