<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Gobán Saor</title>
	<atom:link href="http://blog.gobansaor.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.gobansaor.com</link>
	<description>A country datasmith.</description>
	<pubDate>Sun, 20 Jul 2008 19:29:27 +0000</pubDate>
	<generator>http://wordpress.org/?v=MU</generator>
	<language>en</language>
			<item>
		<title>Groovy as Talend&#8217;s scripting language</title>
		<link>http://blog.gobansaor.com/2008/07/20/groovy-as-talends-scripting-language/</link>
		<comments>http://blog.gobansaor.com/2008/07/20/groovy-as-talends-scripting-language/#comments</comments>
		<pubDate>Sun, 20 Jul 2008 18:04:57 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[ETL]]></category>

		<category><![CDATA[Groovy]]></category>

		<category><![CDATA[Java]]></category>

		<category><![CDATA[Palo]]></category>

		<category><![CDATA[SQLite]]></category>

		<category><![CDATA[Talend]]></category>

		<category><![CDATA[data]]></category>

		<category><![CDATA[Jetty]]></category>

		<category><![CDATA[SQLite user defined functions]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=387</guid>
		<description><![CDATA[Although I had decided to use Talend as my primary datasmithing tool, I still had one major problem with it: its lack of a scripting tool.  Kettle (Pentaho PDI) has JavaScript, Excel has VBA, Picalo has (well OK, is) Python.  I could have gone (and did experiment) with calling JavaScript, Jython or JRuby via JSR223, [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Although I had decided to use <a href="http://www.talend.org">Talend</a> as my primary datasmithing tool, I still had one major problem with it: its lack of a scripting tool.  Kettle (Pentaho PDI) has JavaScript, Excel has VBA, <a href="http://www.picalo.org/">Picalo</a> has (well OK, is) Python.  I could have gone (and did experiment) with calling JavaScript, Jython or JRuby via <a href="http://www.jcp.org/en/jsr/detail?id=223">JSR223</a>, but I wasn&#8217;t happy with the level of integration this afforded, opting instead to make command line calls to Python (using SQLite as a data carrier).</p>
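<p>A minimal sketch of the Python side of such a hand-off, assuming the ETL job writes its rows to a SQLite file and then calls a script with that file's path as the only argument. The table names <code>staging</code> and <code>result</code> and the <code>transform</code> function are hypothetical, chosen only for illustration:</p>

```python
import sqlite3
import sys


def transform(conn):
    """Copy rows from the hypothetical 'staging' table into 'result', upper-casing names."""
    conn.execute("CREATE TABLE IF NOT EXISTS result (id INTEGER, name TEXT)")
    conn.execute("DELETE FROM result")
    rows = conn.execute("SELECT id, name FROM staging ORDER BY id").fetchall()
    conn.executemany(
        "INSERT INTO result (id, name) VALUES (?, ?)",
        [(row_id, name.upper()) for row_id, name in rows],
    )
    conn.commit()
    return len(rows)


if __name__ == "__main__" and len(sys.argv) == 2:
    # The calling job passes the path of the SQLite carrier file as the only argument.
    carrier = sqlite3.connect(sys.argv[1])
    try:
        print(transform(carrier))
    finally:
        carrier.close()
```

<p>The calling job would then read the <code>result</code> table back from the same file, so the only contract between the two sides is the file path and the table layout.</p>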
<p>Then I discovered <a href="http://groovy.codehaus.org/">Groovy</a>, or I should say rediscovered it, as I&#8217;d come across it many years ago when it was far less developed than it is now; I liked it then but couldn&#8217;t see a use for it at the time and promptly forgot about it.  Then it appeared wrapped in a Talend component, prompting a quick visit to the Groovy website, which turned into a deep dive into the language; I&#8217;d found my scripting tool!</p>
<p>Groovy (by the way, what a terrible name for a language, or is that just me?) is not really a stand-alone language but more an extension to Java itself, offering the full power of Java with the addition of closures, builders and dynamic types.  In fact, over time Groovy has become more and more Java-like (the biggest remaining gap being the lack of support for anonymous inner classes).</p>
<p>To underline this convergence, Groovy is being developed under its own <a href="http://www.jcp.org/en/jsr/detail?id=241">JSR 241</a> rather than under JSR 223.  There&#8217;s full interoperability between the two languages: Groovy compiles down to <a href="http://en.wikipedia.org/wiki/Java_bytecode">JVM bytecode</a> and can use Java classes and objects, and Java can likewise use Groovy-generated bytecode.  This allows for fast prototyping and development without compromising access to Java&#8217;s vast collection of libraries.</p>
<p>Here, for example, is a piece of code to try out the <a href="http://www.jpalo.com/en/news.html">JPalo library&#8217;s</a> ability to access a <a href="http://www.palo.net">Palo</a> cube &#8230;</p>
<pre name="code" class="java">

import org.palo.api.Connection;
import org.palo.api.ConnectionFactory;
import org.palo.api.Cube;
import org.palo.api.Database;
import org.palo.api.Element;
connection = ConnectionFactory.getInstance().newConnection(&quot;localhost&quot;,&quot;7777&quot;,&quot;admin&quot;,&quot;admin&quot;);
database = connection.getDatabaseByName(&quot;Demo&quot;);
cube = database.getCubeByName(&quot;Sales&quot;);
rowElements = cube.getDimensionAt(0).getElementsInOrder();
columnElements = cube.getDimensionAt(1).getElementsInOrder();
dataSet = [rowElements,columnElements,]
dataSet &lt;&lt; cube.getDimensionAt(2).getElementAt(0)
dataSet &lt;&lt; cube.getDimensionAt(3).getElementAt(0)
dataSet &lt;&lt; cube.getDimensionAt(4).getElementAt(0)
dataSet &lt;&lt; cube.getDimensionAt(5).getElementAt(0)
// fetch data set
datas=cube.getDataArray(dataSet as Element[][])
connection.disconnect();
// parse the return string
rowcount = rowElements.length;
columncount = columnElements.length;
data=[]
heading=[]
// first row set to the row names (i.e. &quot;Product name&quot; followed by the country names)
heading &lt;&lt; &quot;Product&quot;
for (i in 0..columncount-1) {
    heading &lt;&lt; columnElements[i].getName()
   }
data &lt;&lt; heading
// Now output each line
for (i in 0..rowcount-1) {
    row = []
    row &lt;&lt; rowElements[i].getName()
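    for (j in 0..columncount-1) {
         // NOTE: the stride used here is columncount, which in general walks a
         // rowcount-by-columncount result correctly only when the two counts match;
         // check the cell order returned by getDataArray before relying on this index
         row &lt;&lt; datas[((i + (j*columncount)))]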
         }
    data &lt;&lt; row.flatten()
   }

//output to csv file
def csvOut= new FileOutputStream(&#039;c:/data/File.csv&#039;);
for (lines in data) {
         lines.eachWithIndex{col,i -&gt;
                             if (i &gt; 0) {
                                 csvOut &lt;&lt; &quot;,&quot;
                             }
         csvOut &lt;&lt; col
     }
     csvOut &lt;&lt; &quot;\n&quot;
}
csvOut.close()
</pre>
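<p>The nested loops above turn the flat array returned by <code>getDataArray</code> into rows. As a language-neutral illustration of that index arithmetic, here is a small Python sketch. Which dimension varies fastest in the returned array is an assumption to verify against the JPalo documentation, so the hypothetical <code>to_grid</code> helper supports both orders:</p>

```python
def to_grid(flat, row_count, column_count, first_dimension_fastest=True):
    """Reshape a flat list of cell values into a list of rows."""
    if len(flat) != row_count * column_count:
        raise ValueError("flat array does not match the requested shape")
    grid = []
    for i in range(row_count):
        if first_dimension_fastest:
            # Row elements vary fastest, so consecutive columns are row_count apart.
            row = [flat[i + j * row_count] for j in range(column_count)]
        else:
            # Column elements vary fastest, so each row is one contiguous slice.
            row = [flat[i * column_count + j] for j in range(column_count)]
        grid.append(row)
    return grid
```

<p>Either way, every cell is visited exactly once, and the stride depends on the length of whichever dimension varies fastest.</p>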
<p>This was done in the Groovy console as a proof of concept; it was then transferred to a tGroovy component, where it was parametrised and, instead of outputting to a CSV file, used to fill the globalBuffer structure (the structure used by the tBufferOutput component).</p>
<p>Other things I managed to do with Talend tGroovy over a few days:</p>
<ul>
<li>Extended <a href="http://www.sqlite.org">SQLite</a> with my own user-defined Palo functions.</li>
<li>Set up a Talend job as an Excel-accessible RESTful web service using Jetty.</li>
<li>Interfaced with Amazon S3.</li>
</ul>
<p>Although I was very familiar with the S3 and JPalo APIs, both <a href="http://files.zentus.com/sqlitejdbc/api/org/sqlite/Function.html">SQLite UDFs</a> and <a href="http://www.mortbay.org/jetty-6/">Jetty</a> were new to me, and that&#8217;s where scripting proves its worth, giving the developer the maximum support with the minimum of background noise.  But scripting doesn&#8217;t just help expose weird and wonderful new APIs; as datasmithing tools, languages such as Groovy give analysts the ability to quickly deconstruct and model datasets (for example, see <a href="http://groovy.codehaus.org/Tutorial+6+-+Groovy+SQL">Groovy&#8217;s SQL database support</a> and <a href="http://groovy.codehaus.org/Collections">collections&#8217; functionality</a>).</p>
<p>As a <span style="text-decoration:line-through;">in</span>famous Irish farming-pharma TV ad of my youth put it, &#8220;<span style="font-style:italic;">It&#8217;s a queer name but great stuff</span>&#8221;.</p>
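<p>To give a flavour of what a SQLite user-defined function looks like, here is a rough sketch using Python&#8217;s built-in <code>sqlite3</code> module rather than the Java sqlitejdbc <code>Function</code> class linked above. The <code>cube_value</code> function and the dictionary standing in for a Palo cube are hypothetical; a real version would call the Palo API instead:</p>

```python
import sqlite3

# Stand-in for a Palo cube: a plain dict keyed by (product, region).
FAKE_CUBE = {("Desktop", "Ireland"): 120.0, ("Laptop", "Ireland"): 75.5}


def cube_value(product, region):
    """Return the cell value for the given coordinates, or None if absent."""
    return FAKE_CUBE.get((product, region))


def connect_with_udf(db_path=":memory:"):
    """Open a connection and register cube_value as a two-argument SQL function."""
    conn = sqlite3.connect(db_path)
    conn.create_function("cube_value", 2, cube_value)
    return conn


conn = connect_with_udf()
row = conn.execute("SELECT cube_value('Desktop', 'Ireland')").fetchone()
print(row[0])
```

<p>Once registered, the function can be used anywhere an SQL expression is allowed on that connection, which is what makes UDFs a handy bridge between a database and an external data source.</p>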
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/07/20/groovy-as-talends-scripting-language/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Boy scratches Python&#8230;</title>
		<link>http://blog.gobansaor.com/2008/07/05/boy-scratches-python/</link>
		<comments>http://blog.gobansaor.com/2008/07/05/boy-scratches-python/#comments</comments>
		<pubDate>Sat, 05 Jul 2008 18:42:00 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[Python]]></category>

		<category><![CDATA[education]]></category>

		<category><![CDATA[programming]]></category>

		<category><![CDATA[MIT]]></category>

		<category><![CDATA[Scratch]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=386</guid>
		<description><![CDATA[I&#8217;ve written before about Scratch, a teaching platform developed by MIT to introduce kids to the art of programming. My son has been playing around with Scratch for over a year and although he still enjoys it, he&#8217;s showing signs of needing to move to the next level, a &#8216;real&#8217; programming language.  I decided that [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ve <a href="http://blog.gobansaor.com/2007/02/19/want-your-kids-to-have-the-programming-itch-then-scratch-it/">written before about Scratch</a>, a teaching platform developed by MIT to introduce kids to the art of programming. My son has been playing around with Scratch for over a year and although he still enjoys it, he&#8217;s showing signs of needing to move to the next level, a &#8216;real&#8217; programming language.  I decided that Python, being <a href="http://blog.gobansaor.com/2008/04/11/python-the-new-vba/">one</a> of my own <a href="http://blog.gobansaor.com/2008/05/05/python-to-replace-vb6/">favourite languages</a>, would be an ideal next step, particularly when I discovered <a href="http://www.pygame.org/wiki/about">PyGame</a>, a Python library based on <a href="http://www.libsdl.org/">SDL</a>.</p>
<p>Using Pygame, with its similar problem domain to that of Scratch, would, I figured, make the transition to a grown-up platform easier, and so it has; concepts such as sprites, coordinates and animation are common to both.  I took him through the <a href="http://www.pygame.org/docs/tut/chimp/ChimpLineByLine.html">&#8220;Pummel the Chimp&#8221; tutorial</a>, expecting his young eyes to glaze over within 10 minutes, but no, an hour later he was still engaged and learning.  Why? He already has a deep understanding of programming, particularly object-oriented programming, all thanks to Scratch.</p>
<p>Most of this knowledge he acquired without any help from me; I simply introduced him to Scratch and explained one or two concepts (variables and messages/method calls) which he initially had trouble with. The rest he picked up from looking at other Scratch projects and from writing his own.</p>
<p>So if your kids (or even you) have an itch to learn the essence of programming in a fun and effective way, then <a href="http://scratch.mit.edu/">Scratch</a> it.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/07/05/boy-scratches-python/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Regular Expressions as an end-user programming tool?</title>
		<link>http://blog.gobansaor.com/2008/07/01/regular-expressions-as-an-end-user-programming-tool/</link>
		<comments>http://blog.gobansaor.com/2008/07/01/regular-expressions-as-an-end-user-programming-tool/#comments</comments>
		<pubDate>Tue, 01 Jul 2008 12:24:25 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[ETL]]></category>

		<category><![CDATA[Talend]]></category>

		<category><![CDATA[excel]]></category>

		<category><![CDATA[kettle]]></category>

		<category><![CDATA[regex]]></category>

		<category><![CDATA[regular expressions]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=382</guid>
		<description><![CDATA[&#8220;What? Have you completely lost the plot, Gleeson?&#8221;, I hear you scream.  Jamie Zawinski&#8217;s famous quote is intoned once more ..
Some people, when confronted with a problem, think
“I know, I&#8217;ll use regular expressions.”   Now they have two problems.
Of course the above quote could be (and probably has been) changed to&#8230;
Most business people, when confronted with [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>&#8220;What? Have you completely lost the plot, Gleeson?&#8221;, I hear you scream.  <a href="http://www.jwz.org/">Jamie Zawinski&#8217;s</a> famous quote is intoned once more ..</p>
<blockquote><p><strong>Some people, when confronted with a problem, think<br />
“I know, I&#8217;ll use regular expressions.”   Now they have two problems.</strong></p></blockquote>
<p>Of course the above quote could be (and probably has been) changed to&#8230;</p>
<blockquote><p><strong>Most business people, when confronted with a problem, think<br />
“I know, I&#8217;ll use a spreadsheet.”   Now they have two problems.</strong></p></blockquote>
<p>They are dense, single-line, single-purpose, self-contained mini-programs.  That statement applies to <a href="http://en.wikipedia.org/wiki/Regular_expression">regular expressions</a> but could equally describe the single most popular end-user programming tool, spreadsheet formulae (particularly in their nested form!).</p>
<p>As somebody with the &#8220;<a href="http://blogs.msdn.com/alfredth/archive/2006/12/21/is-there-a-programming-gene.aspx">programming gene</a>&#8221; (something most, but not all, IT professionals possess, as do a significant proportion of &#8220;civilians&#8221;), I find that such compressed logic grates somewhat compared with the power and elegance of more expressive programming languages, but that hasn&#8217;t stopped me using both spreadsheet formulae and regex to solve problems quickly and effectively when the need arose.</p>
<p>Those without the programming gene (the vast majority of business users) find traditional programming languages next to impossible to get their heads around, yet find spreadsheet formulae approachable and useful.  It <a href="http://hackety.org/2007/08/15/oneLinersAreCrucial.html">seems to be something to do with approaching problems as a series of simple problems</a> rather than loading the whole problem domain into one&#8217;s brain at one sitting (as <a href="http://www.paulgraham.com/head.html">most programmers and system designers are capable of</a>).</p>
<p>In the past, non-programmers would rarely come into contact with regex, as its use was possible only within the realms of professional programming or Unix sys-admin toolsets (sed, <a href="http://en.wikipedia.org/wiki/Awk">awk</a> etc.).  But now, ETL tools such as <a href="http://blog.gobansaor.com/2007/05/27/talend-vs-kettle-pentaho-pdi/">Kettle and Talend</a> allow end-users to use regular expressions without the need to understand the underlying programming language.  Taking this a step further, Talend&#8217;s <a href="http://www.talend.com/products-data-quality/talend-open-profiler.php">new data profiling product</a> uses regular expressions as its main discovery language. They could, I guess, have invented yet another XML dialect and/or <a href="http://en.wikipedia.org/wiki/Query_by_Example">query-by-example</a> dialogue, but instead they&#8217;ve taken the sensible (and cheaper) option and exposed the full power of raw regular expressions.</p>
<p>Will the great unwashed embrace regex in the same way they took to nested Excel functions, embarrassing their professional colleagues with yet more amateurish, messy and often unmaintainable solutions that just work? I think they just might&#8230;</p>
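<p>To show roughly what regex-driven profiling amounts to, here is a small Python sketch. It is not how Talend&#8217;s profiler is implemented; the pattern names, the patterns themselves and the <code>profile</code> function are assumptions made up for illustration. Each value in a column is classified by the first pattern that matches it in full:</p>

```python
import re
from collections import Counter

# Hypothetical patterns; a real profile would be tuned to the data at hand.
PATTERNS = {
    "iso_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "integer": re.compile(r"[+-]?\d+"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def profile(values):
    """Count how many values fully match each pattern (first match wins)."""
    counts = Counter()
    for value in values:
        for label, pattern in PATTERNS.items():
            if pattern.fullmatch(value):
                counts[label] += 1
                break
        else:
            # No pattern matched this value.
            counts["unmatched"] += 1
    return dict(counts)


print(profile(["2008-07-01", "42", "a@b.ie", "n/a", "-7"]))
```

<p>The appeal for end-users is that each pattern is a one-liner that can be tried against the data and adjusted on its own, much like a spreadsheet formula.</p>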
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/07/01/regular-expressions-as-an-end-user-programming-tool/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>What to do when Talend gets its knickers in a twist?</title>
		<link>http://blog.gobansaor.com/2008/06/30/what-to-do-when-talend-gets-its-knickers-in-a-twist/</link>
		<comments>http://blog.gobansaor.com/2008/06/30/what-to-do-when-talend-gets-its-knickers-in-a-twist/#comments</comments>
		<pubDate>Mon, 30 Jun 2008 12:38:57 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[ETL]]></category>

		<category><![CDATA[Talend]]></category>

		<category><![CDATA[.item]]></category>

		<category><![CDATA[.JETEmitters]]></category>

		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=380</guid>
		<description><![CDATA[If you&#8217;ve done any significant amount of work with Talend you&#8217;ll undoubtedly have experienced situations where either the generated code/JETemitters or the GUI representation of a job becomes unstable like so&#8230;

The usual advice is to back up your projects (workspace/projectName), delete the workspace/.Java (or .Perl) and workspace/.JETEmitters folders and restart Talend to force a [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>If you&#8217;ve done any significant amount of work with <a href="http://www.talend.com">Talend</a> you&#8217;ll undoubtedly have experienced situations where either the generated code/<a href="http://www.eclipse.org/articles/Article-JET/jet_tutorial1.html">JETemitters</a> or the GUI representation of a job becomes unstable like so&#8230;</p>
<p><a href="http://gobansaor.files.wordpress.com/2008/06/knickers-in-twist.jpg"><img class="aligncenter size-medium wp-image-381" src="http://gobansaor.files.wordpress.com/2008/06/knickers-in-twist.jpg?w=300&h=280" alt="Talend loses the plot!" width="300" height="280" /></a></p>
<p>The <a href="http://www.talendforge.org/forum/viewtopic.php?id=2389">usual advice</a> is to back up your projects (workspace/projectName), delete the workspace/.Java (or .Perl) and workspace/<a href="http://www.eclipse.org/articles/Article-JET/jet_tutorial1.html">.JETEmitters</a> folders and restart Talend to force a rebuild of the generated code.  I&#8217;ve had varying degrees of success with this approach, from it fixing the problem to the other extreme of the project raising the dreaded &#8220;null pointer&#8221; error, never to load again!  Also, in such a situation, don&#8217;t depend on the in-built backup facility (particularly if it&#8217;s showing errors on the export); make a copy of the project folder directly (in particular, the process folder).</p>
<p>I&#8217;ve found the most reliable way to untwist such messes is to create a brand new Talend installation (unzip the download file into another folder) and create a new project in it (don&#8217;t use the front screen&#8217;s import functionality).  Within this new project, use the import item facility to import the required jobs from the corrupted project; the resulting project should now work OK.</p>
<p>If a job won&#8217;t compile because it can&#8217;t find standard Talend classes (this applies to Java projects; switch to the code view and click on the red spots on the right-hand side to see the errors), just saving the job, exiting and restarting Talend should fix the problem.</p>
<p>Sometimes, however, as with the screenshot above, none of this will work.  Remedying this required me to edit the XML of the job&#8217;s <em>.item</em> file directly (the bug that caused this was seen in a 2.4 RC release; it thankfully appears to have been fixed in the production release).  Editing a job&#8217;s XML is not for the faint-hearted, but it is possible, and the knowledge gained from studying the XML structure may prove useful (as it did for me) in the mass-editing of jobs.  In fact, I&#8217;ve used Talend&#8217;s own data integration functionality to mass-generate Talend jobs, demonstrating again the power of this continually improving product!</p>
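<p>The back-up-then-delete routine can be scripted. The following Python sketch assumes Talend is closed while it runs and that the workspace is laid out as described above; the <code>reset_generated_code</code> function and its parameters are hypothetical names, not part of Talend:</p>

```python
import shutil
from pathlib import Path


def reset_generated_code(workspace, project_name, backup_dir):
    """Back up one project folder, then delete the generated-code folders."""
    workspace = Path(workspace)
    backup_target = Path(backup_dir) / project_name
    # Copy the project folder first; copytree fails if the target already exists,
    # which protects an earlier backup from being overwritten.
    shutil.copytree(workspace / project_name, backup_target)
    removed = []
    for name in (".Java", ".Perl", ".JETEmitters"):
        folder = workspace / name
        if folder.is_dir():
            shutil.rmtree(folder)
            removed.append(name)
    return removed
```

<p>Because the copy happens before anything is deleted, a failed backup stops the function before it touches the generated folders.</p>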
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/06/30/what-to-do-when-talend-gets-its-knickers-in-a-twist/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

		<media:content url="http://gobansaor.files.wordpress.com/2008/06/knickers-in-twist.jpg?w=300" medium="image">
			<media:title type="html">Talend loses the plot!</media:title>
		</media:content>
	</item>
		<item>
		<title>NX rather than VNC for EC2 Desktop</title>
		<link>http://blog.gobansaor.com/2008/06/11/nx-rather-than-vnc-for-ec2-desktop/</link>
		<comments>http://blog.gobansaor.com/2008/06/11/nx-rather-than-vnc-for-ec2-desktop/#comments</comments>
		<pubDate>Wed, 11 Jun 2008 18:20:10 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[AmazonAWS]]></category>

		<category><![CDATA[EC2]]></category>

		<category><![CDATA[cloud]]></category>

		<category><![CDATA[Centos]]></category>

		<category><![CDATA[NX]]></category>

		<category><![CDATA[Ubuntu]]></category>

		<category><![CDATA[VNC]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=376</guid>
		<description><![CDATA[The various Amazon EC2 AMIs that I&#8217;ve built over the last few years are getting a bit long in the tooth. Most are based on Fedora 4 and nearly all are over-burdened with software I no longer use nor require.  Time for some rationalisation.
I figure I need two &#8216;template&#8217; AMIs, one containing the bare [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>The various <a href="http://en.wikipedia.org/wiki/Amazon_ec2">Amazon EC2 AMIs</a> that I&#8217;ve built over the last few years are getting a bit long in the tooth. Most are based on Fedora 4 and nearly all are over-burdened with software I no longer use nor require.  Time for some rationalisation.</p>
<p>I figure I need two &#8216;template&#8217; AMIs: one containing the bare minimum of software (EC2 tools, Python, Perl and Java); the second loaded with the likes of <a href="http://blog.gobansaor.com/2007/05/27/talend-vs-kettle-pentaho-pdi/">Kettle, Talend</a>, <a href="http://www.hamachi.org">Hamachi VPN</a>, <a href="http://www.oracle.com/technology/products/database/xe/index.html">OracleXE</a>, <a href="http://www.jedox.com/en/enterprise-spreadsheet-server/excel-olap-server/palo-server.html">Palo MOLAP Server</a> and <a href="http://www.jedox.com/en/enterprise-spreadsheet-server/etl-server/introduction.html">Palo ETL Server</a>, plus a Gnome desktop accessible <a href="http://en.wikipedia.org/wiki/Vnc">via VNC</a>.</p>
<p>I&#8217;m deciding whether to use <a href="http://en.wikipedia.org/wiki/Centos">Centos</a> or <a href="http://en.wikipedia.org/wiki/Ubuntu">Ubuntu</a> as the basis for one or both templates.  I&#8217;m more familiar with Centos&#8217;s RedHat heritage, but Ubuntu&#8217;s design goals of ease-of-use and ease-of-update appeal.  Since I was in the process of re-evaluating my EC2 builds, I decided to also check out <a href="http://www.nomachine.com/">NX as an alternative to VNC</a>.  I had tried to install <a href="http://en.wikipedia.org/wiki/NX_technology">NX Server</a> on a Fedora 4 instance a few years back, but had abandoned the effort having spent the best part of a day on it, reverting to my VNC comfort zone.</p>
<p>This time I was able to use <a href="http://groups.google.com/group/ec2ubuntu">one of Eric Hammond&#8217;s Ubuntu AMIs</a> with <a href="http://ec2hardy-desktop.notlong.com/">NX pre-installed</a>.  Wow, what a difference!  It&#8217;s much more responsive, even over my <a href="http://blog.gobansaor.com/2008/02/23/a-tale-of-two-services/">temperamental fixed wireless broadband</a> connection.  I also tried it using my backup ISDN line; again, a huge improvement compared to using VNC. If you&#8217;re still using VNC to remotely access EC2 or any other remote server, you&#8217;ve got to check out NX.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/06/11/nx-rather-than-vnc-for-ec2-desktop/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>New Banner, New Logo, New Disk and a new S3 Firefox extension.</title>
		<link>http://blog.gobansaor.com/2008/05/31/new-banner-new-logo-new-disk-and-a-new-s3-firefox-extension/</link>
		<comments>http://blog.gobansaor.com/2008/05/31/new-banner-new-logo-new-disk-and-a-new-s3-firefox-extension/#comments</comments>
		<pubDate>Sat, 31 May 2008 17:35:45 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[AmazonAWS]]></category>

		<category><![CDATA[S3]]></category>

		<category><![CDATA[Ballinafagh]]></category>

		<category><![CDATA[Blessington]]></category>

		<category><![CDATA[Firefox V3]]></category>

		<category><![CDATA[logo]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=374</guid>
		<description><![CDATA[I&#8217;ve just uploaded a new banner image based on a photo of Ballinafagh Lake at dusk with my new logo layered over it using Paint .NET.
The previous banner was based on this photo of willow &#8216;down&#8217; covering a lake-side tree at Russeltown on Blessington Lake.
Both lakes are in fact man-made.  Blessington is a [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ve just uploaded a new banner image based on a photo of Ballinafagh Lake at dusk with my new logo layered over it using <a href="http://www.getpaint.net/index.html">Paint .NET.</a><a href="http://gobansaor.files.wordpress.com/2008/05/snowinjune.jpg"><img class="alignright alignnone size-full wp-image-373" style="float:right;" src="http://gobansaor.files.wordpress.com/2008/05/snowinjune.jpg?w=144&h=108" alt="" width="144" height="108" /></a></p>
<p>The previous banner was based on this photo of willow &#8216;down&#8217; covering a lake-side tree at Russeltown on Blessington Lake.</p>
<p>Both lakes are in fact man-made.  <a href="http://www.southdublintourism.ie/index.php?option=com_content&amp;task=view&amp;id=167&amp;Itemid=210">Blessington</a> is a reservoir for the Pollaphuca hydroelectric plant and is now the major water source for most of North Kildare and large parts of Dublin City.  <a href="http://www.wikisandbox.com/page/Ballinafagh+Lake?t=anon">Ballinafagh</a> is an abandoned reservoir for the <a href="http://www.sip.ie/sip070/A%20History%20of%20the%20Grand%20Canal.html">Grand Canal system</a> and is a magical spot, particularly when visited on a summer&#8217;s evening or at dusk in winter.</p>
<p>I&#8217;ve also discovered <a href="http://overstimulate.com/projects/s3">s3://</a>, a new Firefox extension for accessing Amazon S3.  It&#8217;s really simple to use: rather than an FTP-type approach, it uses the URL bar.  By going to s3:// you can add your Amazon S3 credentials and then manage your buckets, upload new files, or delete existing files.</p>
<p>Big plus, it works with Firefox 3; <a href="http://overstimulate.com/projects/s3">S3Fox</a> has not yet made the leap (even using the <a href="http://www.oxymoronical.com/web/firefox/nightly">Nightly Tester Tools</a> extension to force compatibility wouldn&#8217;t work - same applies to the EC2 management extension, <a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=609">Elasticfox</a>).  In fact a whole heap of extensions are not Firefox 3 RC compatible, so much so that, when I rebuilt my machine this week after a disk failure, I reverted to V2.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/05/31/new-banner-new-logo-new-disk-and-a-new-s3-firefox-extension/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

		<media:content url="http://gobansaor.files.wordpress.com/2008/05/snowinjune.jpg" medium="image" />
	</item>
		<item>
		<title>Palo OLAP and sparse dimensions.</title>
		<link>http://blog.gobansaor.com/2008/05/26/palo-olap-and-sparse-dimensions/</link>
		<comments>http://blog.gobansaor.com/2008/05/26/palo-olap-and-sparse-dimensions/#comments</comments>
		<pubDate>Mon, 26 May 2008 15:24:09 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[BI]]></category>

		<category><![CDATA[ETL]]></category>

		<category><![CDATA[Palo]]></category>

		<category><![CDATA[excel]]></category>

		<category><![CDATA[Add new tag]]></category>

		<category><![CDATA[drill-through]]></category>

		<category><![CDATA[drill-thru]]></category>

		<category><![CDATA[essbase]]></category>

		<category><![CDATA[ETL-Server]]></category>

		<category><![CDATA[Palo 2.5]]></category>

		<category><![CDATA[pivot]]></category>

		<category><![CDATA[sparse dimension]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=367</guid>
		<description><![CDATA[Last week I tried out both the latest Palo 2.5 release and its sister product, ETL-Server.  Although I&#8217;ve not done any proper benchmarks, 2.5 does appear to be faster than the previous release and the Excel add-in also behaves better when co-habiting with other add-ins and macros (the previous release&#8217;s use of, and response to, [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Last week I tried out both the latest <a href="http://www.jedox.com/en/enterprise-spreadsheet-server/excel-olap-server/palo-server.html">Palo 2.5</a> release and its sister product, <a href="http://www.jedox.com/en/enterprise-spreadsheet-server/etl-server/introduction.html">ETL-Server</a>.  Although I&#8217;ve not done any proper benchmarks, 2.5 does appear to be faster than the previous release and the Excel add-in also behaves better when co-habiting with other add-ins and macros (the previous release&#8217;s use of, and response to, application level events meant it often caused the entire Excel session to grind to a halt when other macros were processing data).</p>
<p>There is, however, a major memory leak problem when using cubes with <a href="http://www.dba-oracle.com/oracle_news/2005_6_27_What_Is_Sparsity.htm">sparse dimensions</a> (such as the Biker database Invoice dimension); at one stage my Excel session had a working set size of 750M!</p>
<p>The failure of MOLAP cubes to effectively handle sparse datasets has always been something that ROLAP star schema advocates (myself included) have pointed to as a major short-coming.  Over time, products such as <a href="http://en.wikipedia.org/wiki/Essbase">Essbase</a> have managed to get around this limitation (but often only with careful up-front cube design by skilled staff).  Palo hasn&#8217;t quite made it to that level yet.  But then, maybe it shouldn&#8217;t; perhaps simplicity of setup and use should be Palo&#8217;s goal, not the ability to handle a telco&#8217;s &#8216;grain-at-the-level-of-call&#8217; fact table.</p>
<p>Two of the new features in Palo 2.5 that I&#8217;d been looking forward to, Zero Suppression and Drill-through, are both relevant to the handling of cube sparsity.</p>
<ul>
<li><a href="http://businessintelligence.ittoolbox.com/documents/popular-q-and-a/differences-between-drill-through-drill-down-and-slice-dice-in-cognos-2289">Drill-through</a>; to reduce the sparseness of a dimension, by moving the &#8216;grain&#8217; up a level in the consolidation hierarchy, e.g. have a base level of Invoice Type instead of Invoice Number or indeed, dropping the dimension entirely.</li>
<li>Zero Suppress; to filter out excessive elements, e.g. only show Invoice Numbers that were raised this month rather than every invoice number ever raised, relying on a non-zero value to indicate that an invoice belongs to this month.</li>
</ul>
<p>It was my testing of Zero Suppression using the Biker database that appears to have caused Excel to go on a memory binge.  It doesn&#8217;t worry me that much as the right design can reduce the need for zero suppression e.g. in a customer dimension, &#8216;hide&#8217;  customers who trade with you infrequently under a &#8220;Others&#8221; consolidation.  Also, the use of drill-through would eliminate the need for many sparse dimensions, but, alas, the drill-through functionality is severely nobbled in Palo 2.5.</p>
<p>Firstly, the functionality is only available in Excel if you purchase the €8,000 <a href="http://www.jedox.de/en/enterprise-spreadsheet-server/excel-supervision-server/introduction.html">Palo Supervision Server</a>, which is all fine and dandy if you have a need for all the good things that Supervision Server offers.  Having said that, I did manage to bypass this requirement by calling the ETL-Server&#8217;s SOAP API directly from Excel with the help of Simon Fell&#8217;s superb <a href="http://www.pocketsoap.com/">PocketSoap</a> library, so all is not lost on that front. But &#8230;</p>
<p>&#8230; the other reason I&#8217;m not overly impressed with the feature is its implementation, particularly for something Jedox is asking you to pay €8,000 for!  Here&#8217;s what you get.</p>
<p>Within the ETL-Server&#8217;s cube &#8216;export&#8217; you can specify &#8220;&lt;info&gt;&#8221; fields, such as invoice number, that will not form part of the final cube coordinate list.  These coordinates and the info fields, along with the &#8216;value&#8217; field, are then written to a Derby (aka JavaDB) database table, where the schema is the ETL project name and the table name matches that of the cube.  So you end up with, in essence, a &#8216;fact table&#8217; at a finer &#8216;grain&#8217; than the cube, with the role of <a href="http://en.wikipedia.org/wiki/Degenerate_dimension">degenerate dimensions</a> provided by the &#8220;info&#8221; fields.  The resulting table is not indexed, so large datasets will be a problem, and for it to work a cube can only be &#8216;owned&#8217; by a single ETL project.</p>
<p>The other thing to note is that drill-thru only works for &#8216;N&#8217; coordinates i.e. no consolidation elements.  This is unlike Excel&#8217;s Pivot table which allows you to double-click at any cell, to reveal the underlying dataset.</p>
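<p>To make the idea concrete, here is a minimal sketch of a drill-through table at a finer grain than the cube.  It is an illustration only: SQLite stands in for Derby, and the table name, column names and sample rows are all hypothetical rather than anything ETL-Server actually generates.</p>

```python
import sqlite3

# SQLite stands in for Derby here; table, columns and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (month TEXT, product TEXT, invoice_no TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2008-05", "Bikes", "INV-001", 500.0),
        ("2008-05", "Bikes", "INV-002", 250.0),
        ("2008-05", "Helmets", "INV-003", 40.0),
    ],
)


def cube_cell(month, product):
    """Value the cube would hold for one base-level ('N') coordinate."""
    row = conn.execute(
        "SELECT SUM(value) FROM sales WHERE month = ? AND product = ?",
        (month, product),
    ).fetchone()
    return row[0]


def drill_through(month, product):
    """Rows behind that cell, including the 'info' field (invoice_no)."""
    return conn.execute(
        "SELECT invoice_no, value FROM sales "
        "WHERE month = ? AND product = ? ORDER BY invoice_no",
        (month, product),
    ).fetchall()


print(cube_cell("2008-05", "Bikes"))
print(drill_through("2008-05", "Bikes"))
```

<p>The cube only ever sees month and product; invoice_no rides along as a degenerate dimension and is visible only when you drill through.  With no index on the coordinate columns, each drill-through is a full table scan, which is why large datasets would hurt.</p>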
<p>On the plus side, the new ETL-Server is actually very good and well put together.  It&#8217;s better thought of not as a generic ETL tool but as a specialist Palo loader, like <a href="http://www.orafaq.com/wiki/SQL*Loader_FAQ">Oracle&#8217;s SQL*Loader</a>.  And like SQL*Loader&#8217;s role on many Oracle projects, for a large percentage of Palo projects ETL-Server will be the only ETL tool needed.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/05/26/palo-olap-and-sparse-dimensions/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>
	</item>
		<item>
		<title>Oracle in the cloud &#8230;</title>
		<link>http://blog.gobansaor.com/2008/05/06/oracle-in-the-cloud/</link>
		<comments>http://blog.gobansaor.com/2008/05/06/oracle-in-the-cloud/#comments</comments>
		<pubDate>Tue, 06 May 2008 21:09:44 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[AmazonAWS]]></category>

		<category><![CDATA[EC2]]></category>

		<category><![CDATA[cloud]]></category>

		<category><![CDATA[Oracle]]></category>

		<category><![CDATA[Oracle 10g Express]]></category>

		<category><![CDATA[Oracle Application Express]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=366</guid>
		<description><![CDATA[Image via Wikipedia

&#8230; not yet, but Bill Hodak from Oracle has just opened a thread over on the Amazon AWS developer forums, looking for feedback on the use of Oracle in AWS projects.  First there was Red Hat, then this week&#8217;s announcement from Sun and now Oracle; has Amazon managed to turn itself into [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><div class="zemanta-img" style="float:right;margin:1em;"><a href="http://en.wikipedia.org/wiki/Image:Oracle_logo.svg" target="_blank"><img style="border:medium none;display:block;" src="http://upload.wikimedia.org/wikipedia/en/thumb/5/50/Oracle_logo.svg/202px-Oracle_logo.svg.png" alt="Oracle Corporation" /></a>Image via <a href="http://en.wikipedia.org/wiki/Image:Oracle_logo.svg" target="_blank">Wikipedia</a>
</div>
<p>&#8230; not yet, but Bill Hodak from Oracle has just <a href="http://developer.amazonwebservices.com/connect/thread.jspa?messageID=88483&amp;tstart=0#88483">opened a thread over on the Amazon AWS developer forums</a>, looking for feedback on the use of Oracle in AWS projects.  First there was Red Hat, then this week&#8217;s <a href="http://www.sun.com/aboutsun/pr/2008-05/sunflash.20080505.3.xml">announcement from Sun</a> and now Oracle; has Amazon managed to turn itself into the cloud provisioner not just for the hungry masses of start-ups and independent developers but for the technology elites?</p>
<p>As for using Oracle on EC2, yes please.  Most of my datasmithing career has been spent behind the wheel of an Oracle database, the front-ends might have been Excel or some BI package, the end results might have been SAP master data take-ons or an Essbase cube, but the blood and guts were always Oracle.  And this was before Oracle Apex - think what wonders could have been achieved if I had access to such a product in the past.</p>
<p>When EC2 first appeared I enthusiastically  installed  Oracle 10g Express, using a Hamachi VPN to tunnel the Apex front-end back to my PC (don&#8217;t ever expose an Oracle 10g server to the public internet, its architects assumed it would be used solely within the corporate firewall).  I even used the power of Oracle&#8217;s redo logs to partially protect against the ephemeral nature of EC2&#8217;s disk storage.</p>
<p>It looked to me back then that EC2 could be an ideal hosting environment for <a href="http://www.oracle.com/technology/products/database/application_express/index.html">Oracle Application Express</a> (aka Apex, aka HTML DB), but for a few wee problems:</p>
<ul>
<li>It&#8217;s not absolutely clear whether the Oracle 10g Express database licence covers its use in a virtual environment (sometimes the restriction of one database per server is stated as one per machine); a few attempts to look for a definitive yea or nay on the product&#8217;s support forums elicited no response.  I&#8217;m guessing it&#8217;s fair usage, but confirmation would be nice.</li>
<li>Oracle doesn&#8217;t appear to know what to do with Apex, you get the impression they&#8217;re afraid it&#8217;ll cannibalise its lucrative J2EE business.</li>
<li>10g Express is severely hobbled as a database, not just the 4GB per server (or is that machine), it&#8217;s lacking any sort of updating service, serious security flaws remain unpatched and username/passwords are sent in plain text; making it suitable (and then only barely) for use within a firewall or VPN.</li>
<li>Once you outgrow Express,  you&#8217;re into big money and even worse you might have to talk to a sales rep!</li>
</ul>
<p>So what would I like to see Oracle offering on EC2? <a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=867">A paid AMI</a>, preloaded with a variation of Express, minus the 4GB limit, with a &#8220;hardened&#8221; public internet facade, along with regular patches automatically applied.  Optional add-ons&#8230;</p>
<ul>
<li>Various levels of support, fixed monthly charge perhaps.</li>
<li>Ability to upgrade to the full Enterprise Editions, but again paid for via a combination of  AMI hourly charges and optional month-to-month support charges.</li>
<li>Ability to purchase once-off consultancy, both from Oracle and third-party suppliers.</li>
</ul>
<p>I&#8217;m not holding my breath though&#8230;</p>
<p>Oh, if you&#8217;re confused over the various &#8220;Express&#8221; terms used in the above, don&#8217;t blame me, blame Oracle; I think the poor branding profile (constant name changes, copy-cat names) is an indication of Oracle&#8217;s lack of commitment to both products.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/05/06/oracle-in-the-cloud/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

		<media:content url="http://upload.wikimedia.org/wikipedia/en/thumb/5/50/Oracle_logo.svg/202px-Oracle_logo.svg.png" medium="image">
			<media:title type="html">Oracle Corporation</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixie.png?x-id=5d817014-c45b-4be4-8386-0914bcd221a4" medium="image" />
	</item>
		<item>
		<title>Python to replace VB6 &#8230;</title>
		<link>http://blog.gobansaor.com/2008/05/05/python-to-replace-vb6/</link>
		<comments>http://blog.gobansaor.com/2008/05/05/python-to-replace-vb6/#comments</comments>
		<pubDate>Mon, 05 May 2008 11:59:05 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[ETL]]></category>

		<category><![CDATA[Python]]></category>

		<category><![CDATA[VBA]]></category>

		<category><![CDATA[excel]]></category>

		<category><![CDATA[Add new tag]]></category>

		<category><![CDATA[Microsoft Visual Studio]]></category>

		<category><![CDATA[VB6]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=365</guid>
		<description><![CDATA[&#8230; well at least for me.  As I discussed previously I&#8217;ve been seriously investigating using Python as my primary datasmithing scripting language, in effect a new VBA.  I also currently use VBA&#8217;s compiled cousin, VB6, for certain tasks such as building Excel RTD servers.  The problem with VB6 is it depends on [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>&#8230; well at least for me.  As I <a href="http://blog.gobansaor.com/2008/04/11/python-the-new-vba/">discussed previously</a> I&#8217;ve been seriously investigating using Python as my primary datasmithing scripting language, in effect a new <a class="zem_slink" title="Visual Basic for Applications" rel="wikipedia" href="http://en.wikipedia.org/wiki/Visual_Basic_for_Applications" target="_blank">VBA</a>.  I also currently use VBA&#8217;s compiled cousin, <a class="zem_slink" title="Visual Basic" rel="wikipedia" href="http://en.wikipedia.org/wiki/Visual_Basic" target="_blank">VB6</a>, for certain tasks such as building <a href="http://msdn.microsoft.com/en-us/library/aa140059(office.10).aspx">Excel RTD servers</a>.  The problem with VB6 is it depends on Visual Studio 6, which is <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&amp;articleId=9076558&amp;source=rss_topic63">no longer supported by MS</a> and is increasingly next-to-impossible to purchase.  I have a copy which I picked up at a charity auction (for €50, which also included Windows 2000, <a class="zem_slink" title="Microsoft Visio" rel="homepage" href="http://office.microsoft.com/en-us/FX010857981033.aspx" target="_blank">Visio</a> and Office 2000 Professional!) but I am aware any code I develop in VB6 is in effect tending towards &#8220;closed source&#8221; as far as many others are concerned as VS6 continues its journey into history. (I also use Visual C/C++ but such code is future-proof as the latest versions of VS continue to support C/C++ unaltered).</p>
<p>So what are my alternatives?</p>
<ul>
<li>Do nothing, continue to use VS6 (I&#8217;ve already made an &#8220;off-site&#8221; copy in case the house burns down!).</li>
<li>Use .NET, <a href="http://blog.gobansaor.com/2007/10/04/javascript-as-an-excel-scripting-language-via-exceldna/">ExcelDNA</a> is an easy way to integrate Excel with .NET, but it doesn&#8217;t handle the creation of COM servers.  In general, accessing COM from .NET is a total PITA.</li>
<li>Use Python.</li>
</ul>
<p>I&#8217;m going down the Python route, especially now that I&#8217;ve figured out how to create in-process (DLL) COM servers (via <a href="http://pyinstaller.python-hosting.com/">pyInstaller</a>) and how to manage sub-processes (via <a href="http://www.oreillynet.com/onlamp/blog/2007/08/pymotw_subprocess_1.html">import subprocess</a>).  Not just because it&#8217;s future-proof, but because:</p>
<ul>
<li>It&#8217;s easy to build against and to manipulate <a href="http://en.wikipedia.org/wiki/Component_Object_Model">COM interfaces</a>, nearly as easy as VB6, much, much easier than the .NET alternatives.</li>
<li>I can then use the same platform to handle general datasmithing (on Windows, MacOS and Linux), do Excel integration &#8220;stuff&#8221; and also build web front-ends (via <a href="http://googleblog.blogspot.com/2008/04/developers-start-your-engines.html">Google App Engine</a>).</li>
<li>I really like Python and I find I&#8217;m much more productive with it than with any other language I have ever used (with the possible exception of <a href="http://en.wikipedia.org/wiki/MUMPS">MUMPS</a>, but otherwise &#8220;Where were you Python, when I were a lad, down coding pit, slaving over hot Cobol &#8230;&#8221;).</li>
</ul>
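<p>As a rough illustration of the sub-process management mentioned above, here is a minimal sketch using the standard library&#8217;s subprocess module.  It uses the present-day subprocess.run call (Python 3.7 or later for capture_output), which is an assumption on my part rather than the API available when this was written, and the run_child helper is a made-up name for the example.</p>

```python
import subprocess
import sys


def run_child(code):
    """Run a snippet in a separate Python process and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # same interpreter as the parent
        capture_output=True,           # collect stdout and stderr
        text=True,                     # decode bytes to str
        check=True,                    # raise if the child exits non-zero
    )
    return result.stdout.strip()


print(run_child("print(6 * 7)"))
```

<p>Passing the command as a list rather than a single string avoids shell quoting problems, and check=True means a failed child raises CalledProcessError instead of failing silently.</p>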
<p>Enough work, the sun is shining; the <a href="http://www.rspb.org.uk/wildlife/birdguide/name/c/chiffchaff/index.asp">chiffchaff</a> is, well, chiff-chaffing; spring has sprung, I&#8217;m off out to the garden&#8230;</p>
<p><strong>Update</strong>:</p>
<p>As Prashant pointed out in the comments below, <a href="http://vb2py.sourceforge.net/">http://vb2py.sourceforge.net/</a> looks like a very useful tool &#8230;</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/05/05/python-to-replace-vb6/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixie.png?x-id=86eb7846-ec1c-4b5f-bc4d-5bd1bae0fbec" medium="image" />
	</item>
		<item>
		<title>Palo ETL Server - Not for me &#8230;</title>
		<link>http://blog.gobansaor.com/2008/05/01/palo-etl-server-not-for-me/</link>
		<comments>http://blog.gobansaor.com/2008/05/01/palo-etl-server-not-for-me/#comments</comments>
		<pubDate>Thu, 01 May 2008 12:13:58 +0000</pubDate>
		<dc:creator>gobansaor</dc:creator>
		
		<category><![CDATA[BI]]></category>

		<category><![CDATA[ETL]]></category>

		<category><![CDATA[Palo]]></category>

		<category><![CDATA[SQLite]]></category>

		<category><![CDATA[excel]]></category>

		<category><![CDATA[MOLAP]]></category>

		<category><![CDATA[Pivot Table]]></category>

		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=364</guid>
		<description><![CDATA[Jedox have just released V1.0 of their Palo-centric ETL Server.  I had been looking forward to this, not so much for its ETL ability (which is somewhat limited when compared to the likes of Pentaho PDI or Talend) but for the drill-through capability it would add to Palo.  Alas, there&#8217;s a catch, you [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Jedox have just released V1.0 of their <a href="http://www.jedox.com/en/enterprise-spreadsheet-server/etl-server/introduction.html">Palo-centric ETL Server</a>.  I had been looking forward to this, not so much for its ETL ability (which is somewhat limited when compared to the likes of <a class="zem_slink" title="Pentaho" rel="homepage" href="http://www.pentaho.com/" target="_blank">Pentaho</a> PDI or <a href="http://www.talend.com">Talend</a>) but for the drill-through capability it would add to Palo.  Alas, there&#8217;s a catch, you must purchase  <a href="http://www.jedox.de/en/enterprise-spreadsheet-server/excel-supervision-server/introduction.html">Palo Supervision Server</a> (€8,000) to enable the Excel add-in to avail of this feature!</p>
<p>The thing that attracted me to Palo in the first place was its simplicity of approach and the primacy of Excel as the end-user view of the product, a modern day <a href="http://en.wikipedia.org/wiki/Essbase">Essbase</a>.  The fact that the Excel Add-in is closed source always worried me, as I felt that it would inhibit the thing that really sets open source apart (no, not the cost), the formation of an active and innovative developer community.  The sort of developers who have a need for, and an interest in, <a class="zem_slink" title="MOLAP" rel="wikipedia" href="http://en.wikipedia.org/wiki/MOLAP" target="_blank">MOLAP</a> tools tend to be more familiar with VBA, .NET, SQL, SAP Config etc. than with C/C++ or even Java development.  The one area where such a community could add value, the Excel front-end, is closed to them.  And I know there&#8217;s some non-Jedox involvement in the form of <a href="http://www.jpalo.com/en/">JPalo</a> and the <a href="http://sourceforge.net/projects/palooca/">OO-Calc Add-in</a>, but Excel is the key to Palo&#8217;s widespread adoption.</p>
<p>Also, the choice of .NET as the main client-side development platform was a mistake in my opinion; a VBA-accessible object model would have been much more useful, and it would also have removed the need for the current painful installation process.</p>
<p>This partial &#8220;open-source&#8221; model and the increasing complexity of the platform make Palo, at least for me, a less attractive Micro-BI option.  And remember, Excel already has a very powerful in-memory OLAP tool, the humble <a class="zem_slink" title="Pivot table" rel="wikipedia" href="http://en.wikipedia.org/wiki/Pivot_table" target="_blank">PivotTable</a>, which is &#8220;good enough&#8221; for most analytical needs.  So why use Palo rather than a PivotTable?</p>
<p>Palo&#8217;s advantages:</p>
<ul>
<li>Can handle very large data sets (limited by the free memory available to the server).</li>
<li>Allows write-back and splash-down, both very useful for planning/budgeting applications.</li>
<li>Allows for ragged hierarchies.</li>
<li>Server-side MOLAP rules e.g. [Budget],[2008] = [Actual],[2007]*0.035</li>
</ul>
<p>Excel&#8217;s PivotTable advantages:</p>
<ul>
<li>Pure Excel, with the object model available for VBA scripting.</li>
<li>Drill-through as standard.</li>
<li>Excel 2007 can now handle 1,000,000 rows; for earlier versions, use an Access/SQLite local database or an enterprise database to first &#8220;group&#8221; and summarise the data to be pivoted.</li>
<li>Can be used against SQL Server Analysis Services (SSAS) cubes (and those of other providers such as <a href="http://www.pentaho.com/products/analysis/spreadsheet_services/">Pentaho&#8217;s Mondrian</a>).</li>
<li>Much easier to set-up and use.</li>
</ul>
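<p>The &#8220;group and summarise first&#8221; step for older Excel versions can be sketched as follows.  This is only an illustration under assumed names: the invoices table, its columns and the sample rows are hypothetical, and an in-memory SQLite database stands in for whatever local or enterprise database holds the detail.</p>

```python
import sqlite3

# Hypothetical detail table; in practice this would hold far more rows
# than an older Excel worksheet could take.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (region TEXT, month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [
        ("North", "2008-04", 100.0),
        ("North", "2008-04", 50.0),
        ("North", "2008-05", 75.0),
        ("South", "2008-04", 20.0),
    ],
)


def summarise():
    """Collapse the detail to one row per region and month."""
    return conn.execute(
        "SELECT region, month, SUM(amount), COUNT(*) FROM invoices "
        "GROUP BY region, month ORDER BY region, month"
    ).fetchall()


for row in summarise():
    print(row)
```

<p>The summarised rows, not the raw detail, are what would then be handed to the PivotTable, which keeps the row count within the worksheet limit at the cost of losing drill-through to individual invoices.</p>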
<p><strong>Update:</strong> May 1st 13:30</p>
<p>Beta Version of Palo 2.5 (which you&#8217;ll need to use the Palo ETL Server&#8217;s drill-through functionality) <a href="http://www.jedox.com/download/palo/beta/palo_addin_server_win32_2_5_0_0.zip">has just been released</a> &#8230;</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://blog.gobansaor.com/2008/05/01/palo-etl-server-not-for-me/feed/</wfw:commentRss>
	
		<media:content url="http://a.wordpress.com/avatar/gobansaor-128.jpg" medium="image">
			<media:title type="html">gobansaor</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixie.png?x-id=42364e71-9912-43e9-a469-91db0e3a1474" medium="image" />
	</item>
	</channel>
</rss>