<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Pentaho Data Integration (Kettle) V Talend Benchmark</title>
	<atom:link href="http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/</link>
	<description>A country datasmith.</description>
	<lastBuildDate>Tue, 02 Mar 2010 17:49:09 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: David Pavlis</title>
		<link>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/#comment-4895</link>
		<dc:creator>David Pavlis</dc:creator>
		<pubDate>Tue, 03 Mar 2009 22:12:09 +0000</pubDate>
		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=587#comment-4895</guid>
		<description>First, I have to confess that I am responsible for the CloverETL (www.cloveretl.org) tool, a competitor of both Talend &amp; Pentaho.
We have recently conducted a fairly comprehensive performance test of our tool, Clover, alongside Talend and Pentaho (it can be downloaded from: http://www.cloveretl.org/_upload/clover-etl/Comparison%20CloverETL%20vs%20Talend%20and%20Pentaho.pdf)

The reason for conducting this test was a simple one: we wanted to know where we stand. We took the TPC-H Q1&amp;Q3 tests and implemented them using the aforementioned tools. We used the TPC dbgen utility to generate 1GB and 10GB of data.
The results we obtained are quite interesting, and I would like anyone experienced with Talend or Kettle to comment on them. All tools were able to cope with the 1GB data set. When it came to 10GB, Talend failed completely and Kettle just took too long: approximately 3x slower than Clover.

We had a chance to do the same tests with Informatica and DataStage. They both scored quite well; they were actually faster than our tool, but not by much.

David.</description>
		<content:encoded><![CDATA[<p>First, I have to confess that I am responsible for the CloverETL (www.cloveretl.org) tool, a competitor of both Talend &amp; Pentaho.<br />
We have recently conducted a fairly comprehensive performance test of our tool, Clover, alongside Talend and Pentaho (it can be downloaded from: <a href="http://www.cloveretl.org/_upload/clover-etl/Comparison%20CloverETL%20vs%20Talend%20and%20Pentaho.pdf" rel="nofollow">http://www.cloveretl.org/_upload/clover-etl/Comparison%20CloverETL%20vs%20Talend%20and%20Pentaho.pdf</a>)</p>
<p>The reason for conducting this test was a simple one: we wanted to know where we stand. We took the TPC-H Q1&amp;Q3 tests and implemented them using the aforementioned tools. We used the TPC dbgen utility to generate 1GB and 10GB of data.<br />
The results we obtained are quite interesting, and I would like anyone experienced with Talend or Kettle to comment on them. All tools were able to cope with the 1GB data set. When it came to 10GB, Talend failed completely and Kettle just took too long: approximately 3x slower than Clover.</p>
<p>We had a chance to do the same tests with Informatica and DataStage. They both scored quite well; they were actually faster than our tool, but not by much.</p>
<p>David.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex's Personal Blog &#187; Revisiting Kettle Performance Issues</title>
		<link>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/#comment-4820</link>
		<dc:creator>Alex's Personal Blog &#187; Revisiting Kettle Performance Issues</dc:creator>
		<pubDate>Sat, 13 Dec 2008 08:50:45 +0000</pubDate>
		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=587#comment-4820</guid>
		<description>[...] 1. http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/ [...]</description>
		<content:encoded><![CDATA[<p>[...] 1. http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/ [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tom Gleeson</title>
		<link>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/#comment-4813</link>
		<dc:creator>Tom Gleeson</dc:creator>
		<pubDate>Thu, 04 Dec 2008 19:17:04 +0000</pubDate>
		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=587#comment-4813</guid>
		<description>Matt, 

I, like Nick, but from the other end of the scale (micro-ETL), care little about marginal differences between tools. And even when a product under-performs for me, throwing more hardware (or more time) at the problem usually solves it.

As for people making blanket statements about things they know little about: end users and purchasers of IT products do it all the time, which is why they use benchmarks (biased or otherwise) and reports (Gartner comes to mind) to help them choose between offerings.

Benchmarking is not a scientific discipline, it&#039;s a marketing one.

Tom</description>
		<content:encoded><![CDATA[<p>Matt, </p>
<p>I, like Nick, but from the other end of the scale (micro-ETL), care little about marginal differences between tools. And even when a product under-performs for me, throwing more hardware (or more time) at the problem usually solves it.</p>
<p>As for people making blanket statements about things they know little about: end users and purchasers of IT products do it all the time, which is why they use benchmarks (biased or otherwise) and reports (Gartner comes to mind) to help them choose between offerings.</p>
<p>Benchmarking is not a scientific discipline, it&#8217;s a marketing one.</p>
<p>Tom</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Casters</title>
		<link>http://blog.gobansaor.com/2008/12/04/pentaho-data-integration-kettle-v-talend-benchmark/#comment-4812</link>
		<dc:creator>Matt Casters</dc:creator>
		<pubDate>Thu, 04 Dec 2008 18:36:35 +0000</pubDate>
		<guid isPermaLink="false">http://gobansaor.wordpress.com/?p=587#comment-4812</guid>
		<description>FYI, I&#039;m not opposed at all to putting the source files and transformations into an open source benchmarking project.

Remember that the enemy here is neither Talend nor Pentaho. We are still fighting the proprietary ETL tools out there. If benchmarking can improve the situation for all open source tools involved, then that&#039;s good for all, customers included. If we can expose the transformations, jobs, mappings, source files, etc. that were used, then that strengthens us all.

Any other approach is IMHO counterproductive, and that includes making blanket statements concerning things you know very little about... if you know what I mean.

By the way, Nick is kinda biased.  He has seen Kettle process massive amounts of data on clustered SAN systems.  As such I think his requirements for &quot;time and resources&quot; are a bit different from the average user. :-)

Matt</description>
		<content:encoded><![CDATA[<p>FYI, I&#8217;m not opposed at all to putting the source files and transformations into an open source benchmarking project.</p>
<p>Remember that the enemy here is neither Talend nor Pentaho. We are still fighting the proprietary ETL tools out there. If benchmarking can improve the situation for all open source tools involved, then that&#8217;s good for all, customers included. If we can expose the transformations, jobs, mappings, source files, etc. that were used, then that strengthens us all.</p>
<p>Any other approach is IMHO counterproductive, and that includes making blanket statements concerning things you know very little about&#8230; if you know what I mean.</p>
<p>By the way, Nick is kinda biased.  He has seen Kettle process massive amounts of data on clustered SAN systems.  As such I think his requirements for &#8220;time and resources&#8221; are a bit different from the average user. <img src='http://s.wordpress.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Matt</p>
]]></content:encoded>
	</item>
</channel>
</rss>
