Feed on
Posts
Comments

Archive for the ‘Talend’ Category

Marc Russel’s blog links to a Manapps ELT benchmark report comparing the performance of several leading ETL tools both proprietary (DataStage and Informatica) and OS (Talend and PDI (aka Kettle)).  As would be expected each tool has their own strengths and weaknesses, but one thing stands out, the venerable Kettle ETL aka PDI 3.0 is now [...]

Read Full Post »

… well, at least for me.  Let me explain.
For most of my datasmithing career, I’ve had access to corporate Oracle databases and now with the availability of  Oracle10g  Express I can even run my own Oracle instances at home or on EC2.  The combination of a powerful SQL engine, expressive scripting language (PL/SQL) ,OS independence, [...]

Read Full Post »

Both Talend (Java) and Kettle distribute the Zentus.com pure-Java SQLite JDBC driver and for most purposes this run-anywhere version is fine. But, if you really need to take advantage of SQLite’s speed then connecting using the native JNI version is a must.  Doing this was easy enough, just change over to using a generic JDBC [...]

Read Full Post »

Although I had decided to use Talend (Java version) as my primary datasmithing tool I still had one major problem with it, its lack of a scripting tool.  Kettle (Pentaho PDI) has Javascript, Excel has VBA, Picalo has (well OK, is) Python and Talend in its Perl version has Perl.  I could have gone (and [...]

Read Full Post »

“What? Have you completely lost the plot, Gleeson?”, I hear you scream.  Jamie Zawinski’s famous quote is intoned once more ..
Some people, when confronted with a problem, think
“I know, I’ll use regular expressions.”   Now they have two problems.
Of course the above quote could be (and probably has been) changed to…
Most business people, when confronted with [...]

Read Full Post »

If you’ve done any significant amount of work with Talend you’ll undoubtedly have experienced situations where either the generated code/JETemitters or the GUI representation of a job become unstable like so…

The usual advice is to backup your projects (workspace/projectName) , delete the workspace/.Java (or .Perl) and workspace/.JETEmitters folders and restart Talend to force a [...]

Read Full Post »

Image via Wikipedia
Although my data-smithing tool box is full to the brim with powerful tools such as Talend, Kettle PDI, Picalo and Excel, all backed by the cloud infrastructure of Amazon’s S3, SImpleDB and EC2, there’s one simple yet powerful tool that I always seem to gravitate back to, that tool is SQLite.
Now obviously being [...]

Read Full Post »

Dublin buses, as is the norm with most road-based public transport systems in our increasingly car-choked cities, tend to operate on the basis of “no sign of a bus for ages, then two or three arrive at the same time”. Palo MOLAP ETL options appear to be following the same pattern; we’ve been waiting for [...]

Read Full Post »

I was wrong. I figured Jedox would build their new ETL server on one of the existing open source ETL project code-bases, either Talend or Pentaho’s Kettle. Instead, the new alpha ETL server code which has just been uploaded to SourceForge is based on neither and appears to have been developed by another [...]

Read Full Post »

Jedox have announced that they intend to ship a Palo centric ETL open source server product early next year. This is excellent news and is on top of the new rules engine that was added to Palo this summer. Open source MOLAP has suddenly taken off the training wheels and is getting ready [...]

Read Full Post »

Although I’m a total Excel fanboy, I most admit I rarely use it any longer for personal stuff such as home budgets, tax calculations, what-ifs, to-do lists etc.; I now tend to use Google Spreadsheets. Likewise, personal notes, drafts and useful bits of code are stored using Google Docs rather than MS Word. [...]

Read Full Post »

Talend and Perl

I’ve downloaded V2.1 of the open source Talend ETL tool; lots of new connectors added and the Java SQLite connector no longer requires a JNI adapter. I’ve evaluated Talend in the past mainly concentrating on its Java code generating capability, this time I revisited the original Perl generator. Why? Well I know Perl, [...]

Read Full Post »

I’ve been meaning to try out the Apatar ETL/Mashup tool for sometime and today being yet another rainy day in this the worst Irish summer that I can remember (and Irish summers are not renowned for the lack of rainfall) I decided to give it a try out. Not impressed I’m afraid; comes up [...]

Read Full Post »

Over the last few weeks I’ve received a lot of traffic from Goggle searches comparing Talend and Kettle and also from Vincent McBurney’s ITtoolbox article comparing the two products, so where do I stand?
As ETL tools they take different approaches, Kettle is a meta data driven framework (which is in turn tightly integrated into an [...]

Read Full Post »

Having mastered JavaScript (OK master is too strong a word - having become comfortable with both its syntax and usage patterns) my next port of call is JavaFX the recently announced Flash/Silverlight competitor. What led me to JavaFX Script was not its role in this Flash/AJAX alternative platform (which unless  Sun improves [...]

Read Full Post »

For the last few months I’ve being looking for my ideal ETL platform. That ideal would be open source, platform independent (well at least Windows and Linux), flexible, and easily deployable. It had looked like a combination of Kettle and my micro-ETL combinations of Ruby/SSQLite and Excel/SQLite would be the eventual “winners”. [...]

Read Full Post »

Talend have released a new version of their Open Studio ETL tool. Not as full featured as Pentaho Kettle; only supports a limited number of databases and file formats - no SQLite support shock-horror! The press release promises More than 100 Native Connectors and promises connectors to ERP and CRM tools but [...]

Read Full Post »