IBM recently put out a video with a few quotes from customers who’ve had a chance to try DB2 9.7 in production. At times the editing is a bit cheesy, but what our customers are saying is factual. DB2 9.7 is quite a breakthrough and it can save millions of dollars for large companies. If your needs are more modest, like a demanding Web application or a point of sale server, DB2 Express-C can still give you most of the same benefits at zero the cost. Wanna take it for a spin? You can download it here or sign up at RightScale.com to try it on the Cloud.

This is a great day for those of us who love DB2, as DB2 Express-C 9.7 has just been released. As mentioned before, this is the best DB2 ever, and an extremely important release.

To learn more about what’s new in this release, please check out the recording of our latest webinar:


If you run Linux, Unix or Windows, download it while it’s hot.

DB2 9.7 on the Cloud

Another great aspect of this release is that for the first time ever, DB2 has been released both as a product and as a deployment on the Cloud. If you pop over to RightScale, you can get a trial account for free and should see DB2 Express-C 9.7 on both CentOS and Ubuntu within the partner catalog. RightScale has been an amazing partner and they really do wonders to simplify Cloud Computing. In ten minutes time you can be up and running on the Cloud, thanks to the templates provided.

DB2 on the Cloud

DB2 support for Django

But the good times don’t stop there, we are also announcing the first official release of the Django adapter for DB2. It sounded crazy when I first proposed the idea within IBM back in 2006, but now it’s a reality.

You can download the .tar.gz archive from the Google Code homepage for the project, or simply by clicking here. This version fully supports the Django 1.0.2 API. For instructions on how to install it, please read the Getting started with the IBM DB Django adapter guide. The current version supports DB2 for Linux, Unix, Windows and MAC OS X, version 8.2 or higher (9.5 FP2 or higher for MAC OS X). In the future, IBM Cloudscape, Apache Derby, Informix (IDS) and both System i & z/OS will be supported.

ibm_db gem updated to 1.1

I’ll conclude this DB2-centric post with a smaller, but still interesting announcement. The ibm_db gem has been updated to version 1.1. This release includes support for ActiveRecord’s QueryCache mechanism, enhanced support for BigInt (and BigSerial), support for rename_column (requires DB2 9.7), parametrization of the timestamp datatype (requires DB2 9.7), and a few fixes and performance enhancements as well. It is recommended that you upgrade to this version.

Counting rows is an ubiquitous operation on the web, so much so that it’s often overused. Regardless of misuse, there is no denying that the performance of counting operations has an impact on most applications. In this post I’ll discuss my findings about the performance of DB2 9.5 and MySQL 5.1 regarding counting records.

For those of you who are not into science fiction, let me clarify that the odd title of this post is a tongue-in-cheek reference to the great novel, Do Androids Dream of Electric Sheep?.

I connected to the database, created the table, imported the data and benchmarked counting operations using ActiveRecord in a standalone script. Here is the code I used:

#!/usr/bin/env ruby
require "rubygems"
require "active_record"
require 'benchmark'

ActiveRecord::Base.establish_connection(
  :adapter  => :mysql,
  :username => "myuser",
  :password => "mypass",
  :database => "mydb")

ActiveRecord::Schema.define do
  create_table :people, :force => true do |t|
    t.string :name, :null => false
    t.string :fbid, :null => false
    t.string :gender
    t.string :profession
  end
end

class Person < ActiveRecord::Base
end

# This can be sped up by performing an import instead
Person.transaction do
  File.open("person.tsv").each_line do |line|
    line = line.split(/\t/)
    p = Person.new
    p.name = line[0]
    p.fbid = line[1]
    p.gender = line[6]
    p.profession = line[17]
    p.save!
  end
end

n = 100
Benchmark.bm(26) do |x|
  x.report("Count all:") { n.times { Person.count } }
  x.report("Count profession:") { n.times { Person.count(:profession) } }

  x.report("Count females:") do
    n.times { Person.count(:conditions => "gender = 'Female'") }
  end

  x.report("Count males w/ profession:") do
    n.times { Person.count(:profession, :conditions => "gender = 'Male'") }
  end
end

Please note that importing records in a huge transaction containing hundreds of thousands of INSERT operations is far from the most efficient way to import. Massive imports of data using the load/import facilities provided by each database is the way to go (also see the ar-extensions plugin). The lengthy import wasn’t benchmarked here though, so it isn’t determinant for this article.

people.tsv is a 92.7 MB tab separated values file that contains 875,857 records from the Freebase project (in my file I removed the header line, leaving only records).

For those who are not familiar with ActiveRecord, the queries executed behind the scenes are (in order):

SELECT count(*) AS count_all FROM people

SELECT count(people.profession) AS count_profession FROM people

SELECT count(*) AS count_all FROM people WHERE (gender = 'Female')

SELECT count(people.profession) AS count_profession FROM people WHERE (gender = 'Male')

While the table definition (for MySQL) is:

CREATE TABLE `people` (
	`id` int(11) DEFAULT NULL auto_increment PRIMARY KEY,
	`name` varchar(255) NOT NULL,
	`fbid` varchar(255) NOT NULL,
	`gender` varchar(255),
	`profession` varchar(255)
) ENGINE=InnoDB

As easily verified by enabling logging with:

ActiveRecord::Base.logger = Logger.new(STDOUT)

Without much further ado, here are the times I obtained on my last generation MacBook Pro 2.66 GHz with 4 GB DDR3 RAM, and 320 GB @ 7200 rpm hard disk, running Mac OS X Leopard:

MySQL:

  Count all:                  42.467522
  Count profession:           52.130935
  Count females:              54.575469
  Count males w/ profession:  64.046631

DB2:

  Count all:                  5.818485
  Count profession:           7.714391
  Count females:              8.556377
  Count males w/ profession:  9.656739

Or in graph form:

COUNT performance graph

That’s an impressive difference. To be exact, in this example DB2 was between 6 and 7 times faster than MySQL. In the case of COUNT(*), DB2 counted almost a million records in 58 milliseconds, or in about the blink of an eye according to Wolfram Alpha.

For those who are skeptical, please note that DB2 was not manually fine-tuned in any way. The client codepage was set to 1252 to allow Greek letters, and the log size was increased to permit such a huge transaction during the import. That’s it, no optimizations were attempted. This is DB2 Express-C out of the box. It looks like smart androids count electric sheep with DB2 after all. :-P

The advantages of DB2 over MySQL when dealing with a massive volume of traffic are well known (and not limited to performance either), but DB2 can dramatically improve performance even for your average web application. And DB2 9.7, which will be released this month, increases the performance and the ability to self-tune itself to the available resources and required workload even further. If you’d like to try DB2 Express-C for yourself, you can download it here. It doesn’t cost you a dime to obtain and can be used for development, testing and production absolutely free of charge.

Next Page →