Archive for the ‘ETL’ Category
Spending time on Excel-SQLite, C, VBA Callbacks & Twitter
Posted in BI, ETL, Palo, SQLite, VBA, Web2.0, excel, xLite, tagged c#, Twitter on November 20, 2008 | No Comments »
I haven’t posted here in a while, as my spare time has been soaked up by programming; well, actually, refactoring would be more exact. My xLite “SQLite empowered Excel” codebase has grown over the years and required a serious makeover to get rid of stuff I no longer use and to generally make it more robust. I [...]
Open Source Metrics and Benchmarks
Posted in ETL, Talend, kettle, tagged ETL benchmarks, PDI 3.0, WaveMaker on October 30, 2008 | 11 Comments »
Marc Russel’s blog links to a Manapps ETL benchmark report comparing the performance of several leading ETL tools, both proprietary (DataStage and Informatica) and open source (Talend and PDI, aka Kettle). As would be expected, each tool has its own strengths and weaknesses, but one thing stands out: the venerable Kettle ETL, aka PDI 3.0, is now [...]
Why Larry hates the cloud, and my data trinity.
Posted in AmazonAWS, ETL, Palo, SQLite, cloud, excel, olap, tagged cloud bursting, Oracle on October 4, 2008 | No Comments »
Last week Oracle certified Amazon EC2 as a supported platform; that same week Larry Ellison attacked the concept of cloud computing as pure hype. Obviously, Larry is not happy with this whole cloud thing, and I think it’s not just the threat it poses to the software industry’s traditional licensing model that worries him; rather, as Robert X. Cringely [...]
Clouds no longer pass by Windows.
Posted in AmazonAWS, EC2, ETL, RSSBus, Web2.0, data, news, tagged cloud, cloud burst, SQLServer on EC2, Windows on EC2 on October 1, 2008 | 4 Comments »
Amazon today announced that later this year Windows Server would be available on EC2. No details yet on cost, licensing, etc., but this is major. Up until now, that portion of the business world made up of pure MS shops (a very large percentage, especially amongst SMEs) was excluded from taking advantage of Amazon’s amazing (and [...]
Cloudy skies, cloudy apps…
Posted in BI, ETL, Ireland, Palo, Web2.0, cloud, data, excel, news, olap, tagged Freiburg, Jedox, WaveMaker, Worksheet Server on August 28, 2008 | 4 Comments »
Just back from a break in Clifden, Connemara. Summer is nearly over and the kids return to school today, so it’s back to work.
Counties Galway and Mayo were like the rest of the country last week, a tad wet, but unlike the developed east of the island, flooding was not a problem; a problematic drainage area is called [...]
Talend + SQLite + Groovy the new Oracle …
Posted in BI, EC2, ETL, Groovy, Palo, SQLite, Talend, data, excel, olap, tagged Oracle, Oracle 10g Express on August 2, 2008 | 5 Comments »
… well, at least for me. Let me explain.
For most of my datasmithing career, I’ve had access to corporate Oracle databases, and now, with the availability of Oracle 10g Express, I can even run my own Oracle instances at home or on EC2. The combination of a powerful SQL engine, expressive scripting language (PL/SQL), OS independence, [...]
New universal SQLite JDBC library.
Posted in ETL, Java, SQLite, Talend, kettle, news, tagged JDBC, universal, zentus.com on July 21, 2008 | No Comments »
Both Talend (Java) and Kettle distribute the Zentus.com pure-Java SQLite JDBC driver, and for most purposes this run-anywhere version is fine. But if you really need to take advantage of SQLite’s speed, then connecting using the native JNI version is a must. Doing this was easy enough: just change over to using a generic JDBC [...]
Groovy as Talend’s scripting language
Posted in ETL, Groovy, Java, Palo, SQLite, Talend, data, tagged Jetty, SQLite user defined functions on July 20, 2008 | 5 Comments »
Although I had decided to use Talend (Java version) as my primary datasmithing tool, I still had one major problem with it: its lack of a scripting tool. Kettle (Pentaho PDI) has JavaScript, Excel has VBA, Picalo has (well OK, is) Python and Talend in its Perl version has Perl. I could have gone (and [...]
Regular Expressions as an end-user programming tool?
Posted in ETL, Talend, excel, kettle, tagged regex, regular expressions on July 1, 2008 | 2 Comments »
“What? Have you completely lost the plot, Gleeson?” I hear you scream. Jamie Zawinski’s famous quote is intoned once more…
Some people, when confronted with a problem, think
“I know, I’ll use regular expressions.” Now they have two problems.
Of course the above quote could be (and probably has been) changed to…
Most business people, when confronted with [...]
What to do when Talend gets its knickers in a twist?
Posted in ETL, Talend, tagged .item, .JETEmtiters, Java on June 30, 2008 | 2 Comments »
If you’ve done any significant amount of work with Talend you’ll undoubtedly have experienced situations where either the generated code/JETEmitters or the GUI representation of a job becomes unstable, like so…
The usual advice is to back up your projects (workspace/projectName), delete the workspace/.Java (or .Perl) and workspace/.JETEmitters folders and restart Talend to force a [...]
Palo OLAP and sparse dimensions.
Posted in BI, ETL, Palo, excel, tagged drill-through, drill-thru, essbase, ETL-Server, Palo 2.5, pivot, sparse dimension on May 26, 2008 | 9 Comments »
Last week I tried out both the latest Palo 2.5 release and its sister product, ETL-Server. Although I’ve not done any proper benchmarks, 2.5 does appear to be faster than the previous release and the Excel add-in also behaves better when co-habiting with other add-ins and macros (the previous release’s use of, and response to, [...]
Python to replace VB6 …
Posted in ETL, Python, VBA, excel, tagged Microsoft Visual Studio, VB6 on May 5, 2008 | 2 Comments »
… well at least for me. As I discussed previously I’ve been seriously investigating using Python as my primary datasmithing scripting language, in effect a new VBA. I also currently use VBA’s compiled cousin, VB6, for certain tasks such as building Excel RTD servers. The problem with VB6 is it depends on [...]
Palo ETL Server - Not for me …
Posted in BI, ETL, Palo, SQLite, excel, tagged MOLAP, Pivot Table on May 1, 2008 | 2 Comments »
Jedox have just released V1.0 of their Palo-centric ETL Server. I had been looking forward to this, not so much for its ETL ability (which is somewhat limited when compared to the likes of Pentaho PDI or Talend) but for the drill-through capability it would add to Palo. Alas, there’s a catch, you [...]
SQLite - the ultimate data-smithing tool!
Posted in AmazonAWS, ETL, SQLite, Talend, data, excel, kettle, tagged Amazon SimpleDB, Microsoft Access on April 26, 2008 | 1 Comment »
Although my data-smithing tool box is full to the brim with powerful tools such as Talend, Kettle PDI, Picalo and Excel, all backed by the cloud infrastructure of Amazon’s S3, SimpleDB and EC2, there’s one simple yet powerful tool that I always seem to gravitate back to: SQLite.
Now obviously being [...]
Python the new VBA ?
Posted in BI, ETL, Palo, Ruby, SQLite, Web2.0, data, excel, news, tagged appengine, AWK, Perl, Picalo, Resolver on April 11, 2008 | 6 Comments »
These last two weeks, Python has been on my mind. First off, last week I decided to make time to fully investigate Picalo, an open-source Python-based data analysis tool, and then, this week, Google announced their long-awaited cloud-computing offering, Google App Engine, with the language at its core.
Python was the first of [...]
Postgres Plus Cloud Edition is boring …
Posted in AmazonAWS, BI, EC2, ETL, S3, SQLite, SimpleDB, olap, tagged Elastra, EnterpriseDB, Oracle, PostgreSQL on March 27, 2008 | 2 Comments »
… and that’s good. That’s how I like my databases, boring, reliable, consistent, easy to use.
SimpleDB, on the other hand, is not boring; it’s an exciting new shiny thing that opens up a myriad of new possibilities. But first I, and the rest of the developer community, need to tool up and cast aside [...]
Java - at the eye of a perfect storm
Posted in ETL, Java, Palo, cloud, tagged Dojo, Hibernate, J2EE, Jetty, Oracle APEX, Palo ETL-Server, Spring, Tomcat, WaveMaker on March 11, 2008 | 3 Comments »
The “perfect storm” of ubiquitous broadband, powerful and cheap laptops, virtual machines, cloud-based services/infrastructure and open-source software is changing the nature of IT in a way that’s reminiscent of the revolution started by the IBM PC. Although a lot of emphasis has been put on the influence of consumer-focused services on [...]
Dublin Bus and PALO ETL - the connection!
Posted in AmazonAWS, BI, ETL, Palo, S3, SQLite, SimpleDB, Talend, VBA, excel, kettle, olap, tagged Dublin, Dublin Bus, hmac, sha1, sha1hmac on January 26, 2008 | 5 Comments »
Dublin buses, as is the norm with most road-based public transport systems in our increasingly car-choked cities, tend to operate on the basis of “no sign of a bus for ages, then two or three arrive at the same time”. Palo MOLAP ETL options appear to be following the same pattern; we’ve been waiting for [...]
PALO ETL-Server and SAP
Posted in ETL, Palo, olap, tagged SAP, SAP-BW, XMLA on January 5, 2008 | 1 Comment »
Jedox have just published a roadmap for their open-source ETL-Server, with a release date of March 2008, the same date as the next release of the Palo OLAP Server. In a future release they intend to offer SAP RFC/BAPI and SAP-BW XMLA support; as an old SAP hand, I find this very interesting.
There’s also a features page [...]
PALO ETL Server, more sightings …
Posted in ETL, Java, Palo, olap, tagged Apache Axis, Jetty, Palo ETL-Server, SOAP, WSDL on January 3, 2008 | 1 Comment »
First day back after Christmas, snow falling outside.
More additions to the PALO ETL-Server SourceForge project: a new version of the core and a new web server, built using Jetty and Apache Axis. Axis is a SOAP handler, so I looked around for the WSDL file to see what services are to be exposed and [...]