Category Archives: cloud

Why Larry hates the cloud, and my data trinity.

Last week Oracle certified Amazon EC2 as a supported platform; that same week Larry Ellison attacked the concept of cloud computing as pure hype. Obviously, Larry is not happy with this whole cloud thing, and I think it’s not just the threat it poses to the software industry’s traditional licensing model that worries him. Rather, as Robert X. Cringely points out in his “Cloud computing will change the way we look at databases” post, it’s the likelihood that it sounds the death knell for large-scale traditional databases.

This new database paradigm is memory-centric rather than disk-centric, with the disk-based element acting as an archive/backup/restore mechanism which can easily be stored on commodity SAN devices (e.g. Amazon’s EBS). Using MapReduce technology, Google effectively holds the whole Internet in memory, not in one big supercomputer but in lots of cheap commodity servers.
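To make the MapReduce idea a little more concrete, here is a toy sketch in Python. It is only an illustration of the map, shuffle and reduce steps over data held in memory; the `shards` lists stand in for separate commodity servers, and every name in it is my own invention rather than anything from Google’s implementation.

```python
from collections import defaultdict

# Each inner list stands in for the documents held in RAM on one cheap server
# (hypothetical data, purely for illustration).
shards = [
    ["cloud database cloud"],
    ["memory database"],
]

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(all_shards):
    # In a real cluster the map calls would run in parallel, one per server;
    # here they simply run one after another in a single process.
    pairs = [
        pair
        for shard in all_shards
        for document in shard
        for pair in map_phase(document)
    ]
    return reduce_phase(shuffle(pairs))

counts = word_count(shards)
print(counts)
```

Because each map call only looks at its own document, the work can be spread over as many machines as you have data for; only the shuffle step needs to move anything between them.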

But it’s not just in the realm of mega datasets that RAM-based databases threaten traditional models. Excel is a memory-based database engine, and so too are in-memory OLAP tools such as Palo. Such products’ ability to handle large volumes of data has increased over the years, with the decrease in RAM costs and the appearance of cheap 64-bit machines (which are no longer limited to 2GB/3GB process working sets).

That doesn’t mean that we’ll throw away SQL databases in their entirety; SQL and the relational model will continue to be useful, but perhaps of greater use in local datastores/caches than as the building blocks for large-scale datastores. For such local caches, less will be more: fewer features, easier to configure, more flexibility. That’s why I like SQLite; long after the dinosaurs of the database world have disappeared, I imagine SQLite databases will continue to survive, embedded in mobile phones, browsers, wherever a local datastore is required. And more than likely operating in memory rather than off disk.
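As a minimal sketch of what “operating in memory” looks like in practice, the snippet below uses Python’s standard `sqlite3` module with the special `:memory:` database name. The `sales` table and its rows are made up for the example.

```python
import sqlite3

# ":memory:" asks SQLite for a database that lives entirely in RAM;
# nothing is written to disk and it vanishes when the connection closes.
conn = sqlite3.connect(":memory:")

# Hypothetical table and data, purely for illustration.
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("West", "Widgets", 120.0),
        ("West", "Gadgets", 80.0),
        ("East", "Widgets", 200.0),
    ],
)

# Ordinary SQL works exactly as it would against a disk-based file.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)

conn.close()
```

No server, no configuration, no files: the whole datastore is one connection object, which is the “less is more” point made above.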

By combining Excel with an in-memory SQLite database, linked to a Palo OLAP in-memory server, it’s possible to take advantage of three powerful data-processing technologies (spreadsheets, SQL, multi-dimensional cubes) all within your PC’s RAM. You could do serious datasmithing with such a combination on a pretty mediocre laptop: most modern machines provide an excess of CPU power, and there’s no need for super-fast disks, just as much memory as you can muster. And, with Windows on EC2, these three amigos will soon be capable of being used as a cloud-bursting platform.

Excel, SQLite and Palo, my data trinity.

Cloudy skies, cloudy apps…

Just back from a break in Clifden, Connemara, summer is nearly over, the kids return to school today, back to work.

Aasleagh Falls, Co. Mayo

Counties Galway and Mayo were like the rest of the country last week, a tad wet, but unlike the developed east of the island, flooding was not a problem; a problematic drainage area is called a lake in the west.

This August has been the wettest and dullest I’ve ever experienced, but at least I saw some sunshine earlier in the month thanks to Kristian Raue, CEO of Jedox, who kindly invited me to visit the company’s offices in Freiburg, Germany. Freiburg is very green in both senses of the word, surrounded as it is by the Black Forest and with its well-deserved “eco-city” status. It’s also known as the warmest city in Germany, a reputation it thankfully lived up to for this visitor from a rain-soaked Atlantic isle.

August morning, Freiburg im Breisgau

If Freiburg left a positive impression on my mind, so too did Jedox.  The overall impression is of a company which intends to use a combination of quality, vision and the judicious use of open-source to build the Jedox brand into one associated with best-of-breed products and consultancy.  This vision can be seen in the evolution of Palo, from its “good enough” beginnings to its current near-best-of-breed 2.5 version, and from talking to some of those working on the product, best-of-breed status is not that far off.

Likewise, ETL-Server, which is currently a Palo-only “loader”, is to be further developed into a true ETL tool, while continuing to offer MOLAP-centric specialisms.

I also got a glimpse of the next version of Worksheet Server. “Wow!”, is all I can say.

Existing web-based spreadsheet products are fine for simple data analysis or basic data capture purposes but cannot compete with their client-based elder cousins when serious datasmithing is required. Well, from the demo I saw of Worksheet Server in action, that’s about to change. The look and, more importantly, the feel is similar to that of traditional spreadsheets, its interface with Palo is identical to that of the existing Excel add-in and, here’s the big one, it’s open source! Game-changing or what?

But …

That might enable me to move a lot of my spreadsheet applications to the cloud, but what about those applications that are more suited to an MS Access type solution?

Then try out WaveMaker. It’s open source and built on industry standards (Hibernate, Spring and the JavaScript Dojo framework) but has the ease of GUI database development more usually associated with MS tools. The resulting applications are packaged as a WAR file which can be hosted by any standards-based Java server (e.g. Tomcat or Jetty). The latest version makes developing Ajax-fronted database applications even easier with the addition of layout templates. Its existing ability to automatically bind interfaces to SOAP web services has been extended to REST web services by means of a new WSDL auto-discover tool. And Chris Keene, CEO of WaveMaker, also informs me that …

We are also releasing a cloud-based IDE in October with Amazon – stay tuned…

We launched in February and will be announcing our first 7 figure deal this month. We run on Mac, Linux and Windows and are currently the #1 developer download on Apple.com (http://www.apple.com/downloads/macosx/development_tools/)

Our goal is to make it easy to build rich internet applications without complex coding – kind of a MS Access for the Web.

Jedox and WaveMaker: the new breed of open-source businesses.

NX rather than VNC for EC2 Desktop

The various Amazon EC2 AMIs that I’ve built over the last few years are getting a bit long in the tooth. Most are based on Fedora 4 and nearly all are over-burdened with software I no longer use nor require. Time for some rationalisation.

I figure I need two ‘template’ AMIs: one containing the bare minimum of software (EC2 tools, Python, Perl and Java); the second loaded with the likes of Kettle, Talend, Hamachi VPN, Oracle XE, Palo MOLAP Server and Palo ETL Server, plus a Gnome desktop accessible via VNC.

I’m deciding whether to use CentOS or Ubuntu as the basis for one or both templates. I’m more familiar with CentOS’s Red Hat heritage, but Ubuntu’s design goals of ease-of-use and ease-of-update appeal. Since I was in the process of re-evaluating my EC2 builds, I decided to also check out NX as an alternative to VNC. I had tried to install NX Server on a Fedora 4 instance a few years back, but had abandoned the effort having spent the best part of a day on it, reverting to my VNC comfort zone.

This time I was able to use one of Eric Hammond’s Ubuntu AMIs with NX pre-installed. Wow, what a difference! It’s much more responsive, even over my temperamental fixed wireless broadband connection. I also tried it using my backup ISDN line; again, a huge improvement compared to using VNC. If you’re still using VNC to remotely access EC2 or any other remote server, you’ve got to check out NX.

Oracle in the cloud …

… not yet, but Bill Hodak from Oracle has just opened a thread over on the Amazon AWS developer forums, looking for feedback on the use of Oracle in AWS projects. First there was Red Hat, then this week’s announcement from Sun and now Oracle; has Amazon managed to turn itself into the cloud provisioner not just for the hungry masses of start-ups and independent developers but for the technology elites?

As for using Oracle on EC2, yes please. Most of my datasmithing career has been spent behind the wheel of an Oracle database; the front-ends might have been Excel or some BI package, the end results might have been SAP master data take-ons or an Essbase cube, but the blood and guts were always Oracle. And this was before Oracle Apex; think what wonders could have been achieved if I’d had access to such a product in the past.

When EC2 first appeared I enthusiastically installed Oracle 10g Express, using a Hamachi VPN to tunnel the Apex front-end back to my PC (don’t ever expose an Oracle 10g server to the public internet; its architects assumed it would be used solely within the corporate firewall). I even used the power of Oracle’s redo logs to partially protect against the ephemeral nature of EC2’s disk storage.

It looked to me back then that EC2 could be an ideal hosting environment for Oracle Application Express (aka Apex, aka HTML DB), but for a few wee problems:

  • It’s not absolutely clear whether the Oracle 10g Express database licence covers its use in a virtual environment (sometimes the restriction of one database per server is stated as one per machine); a few attempts to look for a definitive yea or nay on the product’s support forums elicited no response. I’m guessing it’s fair usage, but confirmation would be nice.
  • Oracle doesn’t appear to know what to do with Apex, you get the impression they’re afraid it’ll cannibalise its lucrative J2EE business.
  • 10g Express is severely hobbled as a database, and not just by the 4GB limit per server (or is that machine?): it lacks any sort of updating service, serious security flaws remain unpatched and usernames/passwords are sent in plain text, making it suitable (and then only barely) for use within a firewall or VPN.
  • Once you outgrow Express, you’re into big money and even worse you might have to talk to a sales rep!

So what would I like to see Oracle offering on EC2? A paid AMI, preloaded with a variation of Express, minus the 4GB limit, with a “hardened” public internet facade, along with regular patches automatically applied. Optional add-ons…

  • Various levels of support, fixed monthly charge perhaps.
  • Ability to upgrade to the full Enterprise Editions, but again paid for via a combination of AMI hourly charges and optional month-to-month support charges.
  • Ability to purchase once-off consultancy, both from Oracle and third-party suppliers.

I’m not holding my breath though…

Oh, if you’re confused over the various “Express” terms used in the above, don’t blame me, blame Oracle. I think the poor branding profile (constant name changes, copycat names) is an indication of Oracle’s lack of commitment to both products.

UPDATE Sept. 22nd 2008

Looks like the Oracle Cloud has arrived.

Java – at the eye of a perfect storm

The “perfect storm” of ubiquitous broadband, powerful and cheap laptops, virtual machines, cloud-based services/infrastructure and open-source software is changing the nature of IT in a way that’s reminiscent of the revolution started by the IBM PC. Although a lot of emphasis has been put on the influence of consumer-focused services on the enterprise (the Web 2.0 effect), there’s also traffic in the other direction.

Tools that were once the preserve of large multi-national or governmental organisations are now becoming available to a much larger audience at a fraction of the cost (either free via open source or pay-as-you-go via on-line services such as Amazon EC2 or salesforce.com).

As a result of this leakage from the enterprise, I’m more and more using a skill that I thought I’d left behind in the hallowed halls of “big business”: my knowledge of enterprise Java. Two of the tools at the centre of my datasmithing arsenal, Talend ETL and WaveMaker, are Java to the core, and a third, Palo’s new ETL Server, is built on a Java stack.

Talend

My first impressions of the “Java Project” version of Talend were, as they say in Texas, “all hat and no cattle”. I’ve stuck with it though, and have had the opportunity over the last few weeks to re-visit the product. Initially, my attitude to Talend was coloured by my experience with Kettle (aka Pentaho PDI), which under the direction of Matt Casters and the patronage of Pentaho has gone from strength to strength. But once I attuned myself to the idea that Talend is, in essence, a code generator, generating code in a language I know well, I became more comfortable with it.

What I like about Talend, is the ability to convert an ETL process into either a POJO or a WAR file representation of the solution, both stand-alone and fixed-in-time. Talend as a company or as a product could disappear in the morning, as could I, but the solution, cast in Java, will continue on regardless, a solution expressed in a standards based language, widely used and understood by a large number of IT professionals.

This is really important when you consider the ETL/BI products that have been swallowed by the rent-book collectors of the IT business, e.g. Essbase/Hyperion by Oracle, Business Objects by SAP (I might be a bit unfair to SAP, who continue to show real commitment to software R&D). The new owners of these once ground-breaking products will continue to milk the licence holders for “rent” long into the future, a situation that would be acceptable if they continued to offer value above a “protection” service more associated with that provided by your local hoodlum; but alas, most were bought to strengthen the purchaser’s control of their market, so I wouldn’t hold my breath.

I’ve always been a lazy programmer, so over the years I’ve developed numerous productivity aids to help automate development (i.e. reduce the boredom factor for me and the cost for my customers/employers) and to reduce errors (a cost to both sides). Code generation has been at the heart of many of these efforts, so I find myself at home with both Talend’s and WaveMaker’s approach.
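For anyone who hasn’t met code generation before, the idea can be sketched in a few lines of Python. This is a toy of my own devising, not how Talend or WaveMaker work internally (they generate Java), and the names `JOB_TEMPLATE`, `generate_job` and `big_orders` are hypothetical. A small job description is turned into plain source code, and that source then runs with no dependency on the generator.

```python
from string import Template

# Template for a tiny filter-and-project "job". The $placeholders are
# filled in by generate_job(); everything else is emitted verbatim.
JOB_TEMPLATE = Template('''\
def $name(rows):
    """Generated job: keep rows where $column $op $value."""
    return [
        {field: row[field] for field in $fields}
        for row in rows
        if row["$column"] $op $value
    ]
''')

def generate_job(name, column, op, value, fields):
    """Return Python source code for one job as a plain string."""
    return JOB_TEMPLATE.substitute(
        name=name,
        column=column,
        op=op,
        value=repr(value),
        fields=repr(list(fields)),
    )

# Generate the source once; it could just as easily be saved to a file.
source = generate_job("big_orders", "amount", ">", 100, ["id", "amount"])
print(source)

# The generated code is ordinary Python and runs without the generator.
namespace = {}
exec(source, namespace)
big_orders = namespace["big_orders"]

result = big_orders([{"id": 1, "amount": 50}, {"id": 2, "amount": 150}])
print(result)
```

The point is the same one made about Talend above: once the source has been generated, the generator can vanish and the solution carries on regardless.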

WaveMaker

I’d not heard of Chris Keene’s WaveMaker until a few weeks ago. WaveMaker (previously known as ActiveGrid) belongs, alongside Talend, Pentaho and Jedox, to a new breed of open-source businesses, and the product brings new life to Java web development. I was lucky to escape the worst of the J2EE nonsense and could never understand why an easy-to-use GUI builder like this never existed in the Java world. No wonder .NET continues to outgun J2EE in the marketplace.

Those of you with a background in VB6/VBA or Visual Studio will feel right at home here, but instead of desktop GUIs you’ll be building AJAX web applications. The resulting application is packaged as a WAR file which can be hosted by any standards-based Java server (e.g. Tomcat or Jetty). It’s open source and built on industry standards: Hibernate, Spring and the JavaScript Dojo framework.

Not only can WaveMaker act as a front-end to traditional databases, it’s designed to be equally at home with data served up by web services or POJOs. And, as Talend WAR projects and the Palo ETL Server (Jetty-based) both expose Axis-based web services, these three products are a match made in Java heaven.

Oracle Application Express

So, if you only came across WaveMaker recently, what did you intend to use as a Web/GUI front-end before this?

Well, for many tasks, Excel backed by web-service-aware VBA will continue to be an option, and my xlAWS library and xLite code base will continue to be useful. Obviously, Excel is a natural front-end to Palo itself, and VB6 is always there for quick and dirty Windows GUI apps. I may also use Proto if circumstances warrant it.

For web front-ends, in the past I toyed with using Oracle Apex, then Rails, then JotSpot, then Zimki.

  • Rails taught me a lot about “good web app design” and introduced me to Ruby (and SQLite) but it didn’t offer me the speed of development and ease of deployment I’ve become accustomed to.
  • JotSpot got swallowed up by Google and spat out again as Google Sites minus the innovative “wiki database” capability.
  • Zimki, alas, is no more.
  • That left me with Oracle’s impressive Application Express (aka APEX aka HTML DB).

Application Express is excellent if your background is in Oracle databases and/or Oracle Forms. If you work in an Oracle shop and have not checked it out, then do so; you’ll be impressed. It’s standard with all versions since 10g and can be installed on V9 databases. It’s also the front-end for Oracle XE, the free edition of the database server.

So why jump ship to WaveMaker?

APEX is Oracle-specific, closed source, costs a fortune once you outgrow Oracle XE and is a bit “odd” to configure. And I’m not sure Oracle knows what to do with the product (afraid it might cannibalise its lucrative J2EE business!).

WaveMaker embraces the world, is open source and is really easy to use, while still allowing access to the underlying code (both Java and JavaScript) and CSS styling. And it can be easily deployed.

No contest, I’m afraid.