Category Archives: cloud

LightSwitch & Hobo – the return of the 4GL?

Those of us of a certain age have fond memories of the golden era of 4GLs. These simple, but at the time revolutionary, tools enabled business-aware programmers (usually termed analyst-programmers) to quickly build & deploy line-of-business apps. They (both tools and devs) were primarily data-driven: data begat screens, screens begat more data and so on. The resulting apps were server delivered, using either green-screen terminals or client-side delivery apps (a bit like current-day client-side RIAs). The resulting UIs could best be described as “plain but functional”.

These halcyon days were replaced by a combination of 2-tier Windows apps and 3-tier enterprise platforms. The increasing sophistication & complexity of the new platforms forced programmers to become highly specialised, often losing their once close links with the business, and even losing sight of the value of business data as a resource in itself (it still amazes me when I come across business application programmers with poor or non-existent SQL/RDBMS skills). The analyst-programmer (AP) was no more.

Many of those APs, like myself, managed to stay close to the business by either becoming business/data analysts or datawarehousing/BI specialists. ERP platforms such as SAP, with their complex configuration requirements, also created a welcoming home for business-focused IT refugees.

The need for quick’n’dirty line-of-business apps has not disappeared, but this service is now often being provided by tools such as MS Access and, above all, by Excel. This is both good and bad; the good is the expansion of development skills outside of IT; the bad is the effective dis-arming of a large proportion of professional IT folks. To paraphrase Marvin the Paranoid Android: “Brain the size of a planet, and all they give me is Excel!”

Most of us managed to make the best of the situation, learning to respect, if not love, Excel; becoming SQL wizards; MDX magicians and dimensional modellers par excellence. But the call of the 4GLs remained and we veterans continue to keep a watchful eye for something that will match and hopefully surpass them.

Oracle’s Application Express (aka HTML DB) offered those with Oracle skills hope (very similar in concept to SQL*Forms V3) and the open source tool WaveMaker is also excellent, styling itself, with good reason, the Powerbuilder for Web Enterprise.

In the last month I’ve come across two modern-day descendants of these 4GL data-driven tools: Hobo and LightSwitch.

Hobo is an open source extension to the Ruby on Rails platform. It uses the same MVC architecture but takes the concept further, not just allowing you to build a relationship model for your data but also enabling the easy specification of a lifecycle model and, here’s the biggy, automatically building (and re-building on change) a very respectable UI to present to the outside world. It also offers a starter authentication framework and lots of other useful helpers. All without writing a line of Ruby code, or having any idea of how RoR works! The end result is a tool that can not only quickly, and iteratively, build CRUD type applications, but can also handle simple workflow apps out of the box.
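To make the idea of a lifecycle model concrete, here is a minimal sketch of one as a table of states and allowed transitions. It is written in Python purely for illustration; it is not Hobo's syntax (Hobo is Ruby), and the `LIFECYCLE` table, the `Request` class and the state names are all hypothetical.

```python
# Hypothetical lifecycle for a purchase request: state -> {action: next_state}.
LIFECYCLE = {
    "draft": {"submit": "submitted"},
    "submitted": {"approve": "approved", "reject": "draft"},
    "approved": {},
}


class Request:
    """A record whose allowed actions depend on its current state."""

    def __init__(self):
        self.state = "draft"

    def available_actions(self):
        """Actions a UI could offer for the current state."""
        return sorted(LIFECYCLE[self.state])

    def fire(self, action):
        """Apply an action, refusing any transition the lifecycle does not list."""
        transitions = LIFECYCLE[self.state]
        if action not in transitions:
            raise ValueError(f"{action!r} is not allowed from state {self.state!r}")
        self.state = transitions[action]
        return self.state


request = Request()
request.fire("submit")
actions_when_submitted = request.available_actions()
request.fire("approve")
```

A tool that knows this table can work out which buttons to show on each screen, which is roughly what makes simple workflow apps feasible without hand-written code.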

LightSwitch is similar, yet very different. It is the same in that it follows the traditional 4GL data-to-screen approach, presenting the user with a graphical tool to build data tables, to link to existing data sources and to create relationships between entities. Screens can then be generated that, like Hobo’s, are professional looking and easy to use. Again like Hobo, new ‘skins’ can be applied to change the look and feel.

If a more sophisticated solution is required, code can be added (VB.NET or C#) at predefined events and indeed the resulting project, being fully VS 2010 compliant, can be opened in VS Professional and built out from there. (A similar ability to get under the bonnet exists for Hobo as it is essentially Rails).

Where LightSwitch differs is in the deployment methods used. The end result is a Silverlight app which can run client-side (with full access to the client’s environment, e.g. interacting with Office), as a sandboxed IIS-hosted browser app, or via the Azure cloud. The same code and the same project can easily migrate back & forth between all three options. (Hobo, being a Rails app, could also be sort-of-localised using RubyScript2Exe, but it could very easily be cloud deployed using the EC2-based, dead-simple to use http://heroku.com/.)

The LightSwitch data modeller also allows for relationships between local databases, network databases, SQL Azure cloud databases and web service datasets to be built and maintained within the application. The need for mashups between local & central/remote data is a constant requirement for LOB developers and LightSwitch appears to have made it very easy to implement.

This data mash-up ability and the option to interact with the client will be a major attraction, at least for corporate devs working largely with MS tools. I say will, as alas the current beta is “molasses-in-January” slow. I thought initially it was just my 5-year-old laptop hitting the wall, but others with more modern & powerful hardware also found it so.

So do tools like Hobo and LightSwitch herald the return of the IT analyst/programmer? Probably not; these are different times, and outsourcing, SaaS and packaged software have reduced, and will continue to reduce, the number of business-facing IT staff. But their places are being taken by IT-aware business folks, citizen programmers, creators of time-assets, and it is they who will likely be the beneficiaries of such tools.

LiteBI, Heavy ETL

Although my major BI interest is in micro-BI (or is that workgroup-BI?), i.e. data, perhaps cleansed and packaged elsewhere, available locally on a datasmith’s PC, with most likely an in-memory OLAP as the analysis tool, the possibilities of the “cloud” as a BI platform have not escaped me.

From a micro-BI perspective, the ability to act as a backup/mirroring tool or as an ETL/marshaling tool (anybody for Hadoop and SQLite?) attracts. I’ve yet to make up my mind on BI delivered as a cloud PaaS but obviously many others believe it has a future.

My main worry with PaaS is not lock-in (which exists equally for in-house proprietary solutions) but the dangers of a Coghead-like lock-out. My other doubts are more technical; I believe that in-memory offers significant advantages over traditional ROLAP (simplicity being the main one) and that multi-tenant in-memory architectures are not yet a runner. But last week I had a demo of a new Spanish BI PaaS service, LiteBI, which might just change my mind.

Javier Giménez Aznar and his team previously worked on delivering Pentaho based datawarehouses to large Spanish corporations and government agencies, so they have a deep understanding of Mondrian ROLAP and are using that knowledge to build the LiteBI service, but this time with SMBs as the target customers rather than corporates. Pricing starts at €145 per month and is based on number of concurrent users, number of analytical spaces and the data volumes, so it’s not for very small firms more for the Medium in SMB.

Impressions? The cube designer, dashboard builders and the general UI are all very good and I would think would appeal to end-user datasmiths and, as such, will be a major up-front aid to selling this product. But it was LiteBI’s approach to the thorny issue of ETL and data loading that impressed me and also helped ease some of my Coghead-induced fears.

BI technology stacks consist of three elements:

  • The “fancy” front-end; graphs, animated dashboards and so on.
  • The pivot engine; ROLAP or MOLAP or both.
  • The ETL process.
  • (Many would say there’s an important 4th, the data-warehouse, but not every BI effort requires one, but that’s another issue)

LiteBI is continuing to build yet more functionality into their UI and this “fancy” front-end is essential as it’s their “shop window”.

Mondrian provides their pivot engine, and again they continue to work on optimisations such as column-based datastores to increase speed and automate responsiveness tuning (end-users are very unforgiving of slow pivots).

But it’s in the 3rd area, that of the ETL process, that you realise the LiteBI team has real-world BI experience.  Data is loaded into LiteBI via an API, but with the ETL process itself happening on the customer side.

“Well, so what?” you may ask. The extraction of data obviously has to happen customer-side (though not in the case of data being sourced from the likes of SalesForce.com). Yes, but it’s the transformations and data cleansing that add true value to the ETL process and subsequently determine the quality and usefulness (as opposed to the speed or the “prettiness” of delivery) of the solution.

Part of the process of adopting LiteBI, is an ETL consultancy stage where a LiteBI partner company will provide on-site services to build this ETL layer, handling not just transformations but initial load and automating the subsequent delta uploads.
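As a rough illustration of what such a customer-side layer does, the sketch below cleans a couple of records and then works out which rows are new or changed since the last upload. It is a minimal Python sketch under my own assumptions: the field names, the `clean`, `fingerprint` and `delta` helpers and the sample data are all hypothetical, and the actual upload call is left out because LiteBI's load API is not described here.

```python
import hashlib


def clean(record):
    """Transform step: trim whitespace, normalise case, coerce the amount."""
    return {
        "customer_id": record["customer_id"].strip(),
        "country": record["country"].strip().upper(),
        "amount": round(float(record["amount"]), 2),
    }


def fingerprint(record):
    """Stable hash of a cleaned record, used to spot changed rows."""
    payload = "|".join(f"{key}={record[key]}" for key in sorted(record))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def delta(cleaned, previous):
    """Return rows that are new or changed since the last upload.

    `previous` maps customer_id -> fingerprint recorded on the prior run.
    """
    return [
        record
        for record in cleaned
        if previous.get(record["customer_id"]) != fingerprint(record)
    ]


raw = [
    {"customer_id": " C001 ", "country": "ie", "amount": "100.50"},
    {"customer_id": "C002", "country": "es ", "amount": "75"},
]
cleaned = [clean(r) for r in raw]

# Initial load: nothing has been sent before, so every row is in the delta.
initial = delta(cleaned, {})
sent = {r["customer_id"]: fingerprint(r) for r in cleaned}

# Next run: only C002 has changed, so only C002 needs uploading.
raw[1]["amount"] = "80"
to_upload = delta([clean(r) for r in raw], sent)
```

The point of the sketch is that the cleansing rules and the change tracking live with the customer, so they survive a change of BI provider.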

So the cost mounts up, but in reality you can’t do BI without this investment; there’s no ETL magic bullet.  Even still, Javier says the typical go-live time for a LiteBI project would be in the order of 3-4 weeks rather than the 3-4 months of similar on-site Pentaho projects.

The end-user ‘owning’ the ETL process makes the prospect of a service lock-out slightly less worrying as, at least, one would still have a good starting point for moving to another provider or back in-house. What I would really like to see would be the option to self-host LiteBI, which I guess would involve open sourcing large parts of the service (the automated optimisation strategies could, for example, be excluded from this open source version).

The load API comes packaged as a plugin to Kettle (aka PDI) and the intention is to offer a similar add-on for Talend in the near future. LiteBI also offers a white-label offering whereby 3rd party OLTP solution providers can use the service as their product’s BI suite.

Like the Skibbereen Eagle keeping its eye on the Czar of Russia, I too will be keeping a watchful eye on LiteBI and the march of on-demand BI in general.

Why not join me on Twitter at gobansaor?

Why Larry hates the cloud, and my data trinity.

Last week Oracle certified Amazon EC2 as a supported platform; that same week Larry Ellison attacked the concept of cloud computing as pure hype. Obviously, Larry is not happy with this whole cloud thing, and I think it’s not just the threat it poses to the software industry’s traditional licensing model that worries him. Rather, as Robert X. Cringely points out in his “Cloud computing will change the way we look at databases” post, it’s the likelihood that it sounds the death-knell for large-scale traditional databases.

This new database paradigm is memory rather than disk centric, with the disk-based element acting as an archive/backup/restore mechanism which can easily be stored on commodity SAN devices (e.g. Amazon’s EBS). Using MapReduce technology Google effectively holds the whole Internet in memory, not in one big super computer but in lots of cheap commodity servers.

But it’s not just in the realm of mega datasets that RAM based databases threaten traditional models. Excel is a memory-based database engine, and so too are in-memory OLAP tools such as Palo. Such products’ ability to handle large volumes of data has increased over the years, with the decrease in RAM costs and the appearance of cheap 64-bit machines (which are no longer limited to 2GB/3GB process working sets).

That doesn’t mean that we’ll throw away SQL databases in their entirety; SQL and the relational model will continue to be useful, but perhaps of greater use in local datastores/caches than as the building blocks for large scale datastores. For such local caches, less will be more; fewer features, easier to configure, more flexibility. That’s why I like SQLite; long after the dinosaurs of the database world have disappeared, I imagine SQLite databases will continue to survive, embedded in mobile phones, browsers, wherever a local datastore is required. And more than likely operating in memory rather than off disk.

By combining Excel with an in-memory SQLite database, linked to a Palo OLAP in-memory server, it’s possible to take advantage of three powerful data-processing technologies (spreadsheets, SQL, multi-dimensional cubes) all within your PC’s RAM. You could do serious datasmithing with such a combination on a pretty mediocre laptop, with most modern machines providing an excess of CPU power, no need for super fast disks, just as much memory as you can muster. And, with Windows on EC2, these three amigos will soon be capable of being used as a cloud bursting platform.
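The SQL leg of that combination is easy to try out. The sketch below uses Python's standard `sqlite3` module to hold a small table entirely in RAM and run a pivot-style aggregation over it. The table layout and the sample rows are hypothetical, and Excel and Palo are not involved; it only shows the in-memory SQLite part.

```python
import sqlite3

# Hypothetical sales rows, standing in for data pulled from a spreadsheet.
ROWS = [
    ("2008-07", "North", "Widgets", 120.0),
    ("2008-07", "South", "Widgets", 80.0),
    ("2008-08", "North", "Widgets", 150.0),
    ("2008-08", "North", "Gadgets", 40.0),
    ("2008-08", "South", "Gadgets", 60.0),
]


def build_store(rows):
    """Load rows into a SQLite database that lives only in RAM."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (month TEXT, region TEXT, product TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    return conn


def pivot_by_region(conn):
    """Return {month: {region: total}}, a cube-like slice built with plain SQL."""
    query = (
        "SELECT month, region, SUM(amount) FROM sales "
        "GROUP BY month, region ORDER BY month, region"
    )
    result = {}
    for month, region, total in conn.execute(query):
        result.setdefault(month, {})[region] = total
    return result


conn = build_store(ROWS)
pivot = pivot_by_region(conn)
```

Nothing here touches the disk; the database disappears when the connection is closed, which is exactly the local-cache role described above.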

Excel, SQLite and Palo, my data trinity.

Cloudy skies, cloudy apps…

Just back from a break in Clifden, Connemara, summer is nearly over, the kids return to school today, back to work.

Aasleagh Falls, Co. Mayo


Counties Galway and Mayo were like the rest of the country last week, a tad wet, but unlike the developed east of the island, flooding was not a problem; a problematic drainage area is called a lake in the west.

This August has been the wettest and dullest I’ve ever experienced but at least I saw some sunshine earlier in the month thanks to Kristian Raue, CEO of Jedox, who kindly invited me to visit the company’s offices in Freiburg, Germany. Freiburg is very green in both senses of the word, surrounded as it is by the Black Forest and with its well deserved “eco-city” status. It’s also known as the warmest city in Germany, a reputation it thankfully lived up to for this visitor from a rain-soaked Atlantic isle.


August morning, Freiburg im Breisgau

If Freiburg left a positive impression on my mind, so too did Jedox. The overall impression is of a company which intends to use a combination of quality, vision and the judicious use of open-source to build the Jedox brand into one associated with best-of-breed products and consultancy. This vision can be seen in the evolution of Palo, from its “good enough” beginnings to its current near-best-of-breed 2.5 version, and from talking to some of those working on the product, best-of-breed status is not that far off.

Likewise, ETL-Server, which is currently a Palo-only “loader”, is to be further developed into a true ETL tool, while continuing to offer MOLAP-centric specialisms.

I also got a glimpse of the next version of Worksheet Server. “Wow!”, is all I can say.

Existing web based spreadsheet products are fine for simple data analysis or basic data capture purposes but cannot compete with their client-based elder cousins when serious datasmithing is required. Well, from the demo I saw of Worksheet Server in action, that’s about to change. The look and, more importantly, the feel is similar to that of traditional spreadsheets, its interface with Palo is identical to that of the existing Excel add-in, and here’s the big one, it’s open source! Game-changing or what?

But …

That might enable me to move a lot of my spreadsheet applications to the cloud, but what about those applications that are more suited to an MS Access type solution?

Then try out WaveMaker. It’s open source and built on industry standards (Hibernate, Spring and the JavaScript Dojo framework) but has the ease of GUI database development more usually associated with MS tools. The resulting applications are packaged as a WAR file which can be hosted by any standards based Java server (e.g. Tomcat or Jetty). The latest version makes developing Ajax-fronted database applications even easier with the addition of layout templates. Its existing ability to automatically bind interfaces to SOAP web services has been extended to REST web services by means of a new WSDL auto-discover tool. And Chris Keene, CEO of WaveMaker, also informs me that …

We are also releasing a cloud-based IDE in October with Amazon – stay tuned…

We launched in February and will be announcing our first 7 figure deal this month. We run on Mac, Linux and Windows and are currently the #1 developer download on Apple.com (http://www.apple.com/downloads/macosx/development_tools/)

Our goal is to make it easy to build rich internet applications without complex coding – kind of a MS Access for the Web.

Jedox and WaveMaker: the new breed of open-source businesses.

NX rather than VNC for EC2 Desktop

The various Amazon EC2 AMIs that I’ve built over the last few years are getting a bit long in the tooth. Most are based on Fedora 4 and nearly all are over-burdened with software I no longer use nor require. Time for some rationalisation.

I figure I need two ‘template’ AMIs: one containing the bare minimum of software (EC2 tools, Python, Perl and Java); the second loaded with the likes of Kettle, Talend, Hamachi VPN, OracleXE, Palo MOLAP Server and Palo ETL Server, and a Gnome desktop accessible via VNC.

I’m deciding whether to use CentOS or Ubuntu as the basis for one or both templates. I’m more familiar with CentOS’s Red Hat heritage but Ubuntu’s design goals of ease-of-use and ease-of-update appeal. Since I was in the process of re-evaluating my EC2 builds I decided to also check out NX as an alternative to VNC. I had tried to install NX Server on a Fedora 4 instance a few years back, but had abandoned the effort having spent the best part of a day on it, reverting to my VNC comfort zone.

This time I was able to use one of Eric Hammond’s Ubuntu AMIs with NX pre-installed. Wow, what a difference! It’s much more responsive, even over my temperamental fixed wireless broadband connection. I also tried it using my backup ISDN line, again a huge improvement compared to using VNC. If you’re still using VNC to remotely access EC2 or any other remote server, you’ve got to check out NX.

Oracle in the cloud …


… not yet, but Bill Hodak from Oracle has just opened a thread over on the Amazon AWS developer forums, looking for feedback on the use of Oracle in AWS projects. First there was Red Hat, then this week’s announcement from Sun and now Oracle; has Amazon managed to turn itself into the cloud provisioner not just for the hungry masses of start-ups and independent developers but for the technology elites?

As for using Oracle on EC2, yes please. Most of my datasmithing career has been spent behind the wheel of an Oracle database, the front-ends might have been Excel or some BI package, the end results might have been SAP master data take-ons or an Essbase cube, but the blood and guts were always Oracle. And this was before Oracle Apex – think what wonders could have been achieved if I had access to such a product in the past.

When EC2 first appeared I enthusiastically installed Oracle 10g Express, using a Hamachi VPN to tunnel the Apex front-end back to my PC (don’t ever expose an Oracle 10g server to the public internet, its architects assumed it would be used solely within the corporate firewall). I even used the power of Oracle’s redo logs to partially protect against the ephemeral nature of EC2’s disk storage.

It looked to me back then that EC2 could be an ideal hosting environment for Oracle Application Express (aka Apex, aka HTML DB), but for a few wee problems:

  • It’s not absolutely clear whether the Oracle 10G Express database licence covers its use in a virtual environment (sometimes the restriction of one database per server is stated as one per machine); a few attempts to look for a definitive yea or nay on the product’s support forums elicited no response. I’m guessing it’s fair usage, but confirmation would be nice.
  • Oracle doesn’t appear to know what to do with Apex, you get the impression they’re afraid it’ll cannibalise its lucrative J2EE business.
  • 10g Express is severely hobbled as a database, not just the 4GB per server (or is that machine), it’s lacking any sort of updating service, serious security flaws remain unpatched and username/passwords are sent in plain text; making it suitable (and then only barely) for use within a firewall or VPN.
  • Once you outgrow Express, you’re into big money and even worse you might have to talk to a sales rep!

So what would I like to see Oracle offering on EC2? A paid AMI, preloaded with a variation of Express, minus the 4GB limit, with a “hardened” public internet facade, along with regular patches automatically applied. Optional add-ons…

  • Various levels of support, fixed monthly charge perhaps.
  • Ability to upgrade to the full Enterprise Editions, but again paid for via a combination of AMI hourly charges and optional month-to-month support charges.
  • Ability to purchase once-off consultancy, both from Oracle and third-party suppliers.

I’m not holding my breath though…

Oh, if you’re confused over the various “Express” terms used in the above, don’t blame me, blame Oracle. I think the poor branding profile (constant name changes, copy cat names) is an indication of Oracle’s lack of commitment to both products.

UPDATE Sept. 22nd 2008

Looks like the Oracle Cloud has arrived..

Java – at the eye of a perfect storm

The “perfect storm” of ubiquitous broadband, powerful and cheap laptops, virtual machines, cloud-based services/infrastructure and open-source software is changing the nature of IT in a way that’s reminiscent of the revolution started by the IBM PC. Although a lot of emphasis has been put on the influence of consumer-focused services on the enterprise, the Web2.0 effect; there’s also traffic in the other direction.

Tools that were once the preserve of large multi-national or governmental organisations are now becoming available to a much larger audience at a fraction of the cost (either free via open source or pay-as-you-go via on-line services such as Amazon EC2 or salesforce.com).

As a result of this leakage from the enterprise, I’m more and more using a skill that I thought I’d left behind in the hallowed halls of “big business”: my knowledge of enterprise Java. Two of the tools at the centre of my datasmithing arsenal, Talend ETL and WaveMaker, are Java to the core, and a third, Palo’s new ETL Server, is built on a Java stack.

Talend

My first impressions of the “Java Project” version of Talend were, as they say in Texas, “all hat and no cattle”. I’ve stuck with it though, and have had the opportunity over the last few weeks to re-visit the product. Initially, my attitude to Talend was coloured by my experience with Kettle (aka Pentaho PDI), which under the direction of Matt Casters and the patronage of Pentaho has grown from strength to strength, but once I attuned myself to the idea that Talend is, in essence, a code generator, generating code in a language I know well, I became more comfortable with it.

What I like about Talend, is the ability to convert an ETL process into either a POJO or a WAR file representation of the solution, both stand-alone and fixed-in-time. Talend as a company or as a product could disappear in the morning, as could I, but the solution, cast in Java, will continue on regardless, a solution expressed in a standards based language, widely used and understood by a large number of IT professionals.

This is really important when you consider the ETL/BI products that have been swallowed by the rent-book collectors of the IT business, e.g. Essbase/Hyperion by Oracle, Business Objects by SAP (I might be a bit unfair to SAP, who continue to show real commitment to software R&D). The new owners of these once ground-breaking products will continue to milk the licence holders for “rent” long into the future, a situation that would be acceptable if they continued to offer value above a “protection” service more associated with that provided by your local hoodlum; but alas, most were bought to strengthen the purchaser’s control of their market, so I wouldn’t hold my breath.

I’ve always been a lazy programmer, so over the years I’ve developed numerous productivity aids to help automate development (i.e. reduce the boredom factor for me and the cost for my customers/employers) and to reduce errors (a cost to both sides). Code generation has been at the heart of many of these efforts, so I find myself at home with both Talend’s and WaveMaker’s approach.

WaveMaker

I’d not heard of Chris Keene’s WaveMaker until a few weeks ago. WaveMaker (previously known as ActiveGrid) belongs, alongside Talend, Pentaho and Jedox, to a new breed of open-source businesses, and the product brings new life to Java web development. I was lucky to escape the worst of the J2EE nonsense and could never understand why an easy to use GUI builder like this never existed in the Java world; no wonder .NET continues to outgun J2EE in the market place.

Those of you with a background in VB6/VBA or Visual Studio will feel right at home here, but instead of desktop GUIs you’ll be building AJAX web applications. The resulting application is packaged as a WAR file which can be hosted by any standards based Java server (e.g. Tomcat or Jetty). It’s open source and built on industry standards: Hibernate, Spring and the JavaScript Dojo framework.

Not only can WaveMaker act as a front-end to traditional databases, it’s designed to be equally at home with data served up by web services or POJOs. And, as Talend WAR projects and the Palo ETL Server (Jetty based) both expose Axis based web services, these three products are a match made in Java heaven.

Oracle Application Express

So, if you only came across WaveMaker recently, what did you intend to use as a Web/GUI front-end before this?

Well, for many tasks, Excel backed by web-service aware VBA will continue to be an option, and my xlAWS library and xLite code base will continue to be useful. Obviously, Excel is a natural front-end to Palo itself and VB6 is always there for quick and dirty Windows GUI apps. I may also use Proto if circumstances warrant it.

For web front-ends, in the past I toyed with using Oracle Apex, then Rails, then JotSpot, then Zimki.

  • Rails taught me a lot about “good web app design” and introduced me to Ruby (and SQLite) but it didn’t offer me the speed of development and ease of deployment I’ve become accustomed to.
  • JotSpot got swallowed up by Google and spat out again as Google Sites minus the innovative “wiki database” capability.
  • Zimki, alas, is no more.
  • That left me with Oracle’s impressive Application Express (aka APEX aka HTML DB).

Application Express is excellent if your background is in Oracle databases and/or Oracle Forms, and if you work in an Oracle shop and have not checked it out, then do so; you’ll be impressed. It’s standard with all versions since 10G and can be installed on V9 databases. It’s also the front-end for Oracle XE – the free edition of the database server.

So why jump ship to WaveMaker?

APEX is Oracle specific, closed source, costs a fortune once you outgrow Oracle XE, is a bit “odd” to configure, and I’m not sure Oracle knows what to do with the product (afraid it might cannibalise its lucrative J2EE business!).

WaveMaker embraces the world, is open source and is really easy to use, while still allowing access to the underlying code (both Java and Javascript) and CSS styling. And, it can be easily deployed.

No contest, I’m afraid.

CouchDB = IBM’s SimpleDB and S3 ?

What if you’re a major player in the IT world and suddenly the internet’s equivalent of your local bookshop releases a mould-breaking cloud-based database service, SimpleDB? And this on top of Amazon’s highly acclaimed document data store service, S3!

Well, if you’re IBM you hire Damien Katz, the person behind CouchDB. I think 2008 could be the year that cloud-based database services really take off.