Essential guide to the Ruby driver for DB2

This article is obsolete. Please refer to the following articles for up do date instructions: Ruby/Rails and DB2 | Python/Django and DB2. Thank you!

This guide is a basic tutorial that’ll quickly get you started with the ibm_db Ruby driver. You’ll be querying DB2 from Ruby in no time, but please bear in mind that this document is not meant to be thorough (use the API reference for this purpose). Also, this is a guide for the IBM provided driver and not the ruby-db2 community provided one, and it introduces you to the Ruby driver only not the Rails adapter.

I strongly encourage you to use the IBM one since, while still open source, it is constantly developed and maintained by IBM and it has an active community of people using it in production. The ruby-db2 driver provided by the community hasn’t been updated for over 2 years and I’m not sure whether it’s abandoned or not.

The latest release of ibm_db is only 2 months old (at the time of writing); it’s very stable and fast, and utilizes the common API adopted by IBM for dynamic language drivers (such as PHP, Perl and Python). This means that if you learn one, you’ll easily be able to adapt your skills to another language. My advice is to stick with IBM’s driver.

Please follow these instructions for installing DB2 and IBM’s Ruby driver. Should you encounter any issues with the installation process, please refer to the DB2 Express-C forum and the Rubyforge forum.

When facing difficulties with the installation of DB2, use the former, while if you have problems with the ibm_db driver, use the latter which is monitored by, besides a group of helpful members of the community, myself, Alex Pitigoi (who is the current maintainer of the Rails adapter) and some other members of the team who developed the driver. We’ll carefully listen to your questions and usually provide help or a resolution in a timely fashion.

Before we get started let me clarify one point about the name of the driver. It was formerly known as IBM_DB2, but IBM decided to drop the 2 at the end, in order to reflect the intention to provide a driver that not only works with DB2, but also with other selected IBM Data Servers, like Informix Dynamic Server version 11 (Cheetha), IBM Cloudscape or Apache Derby.

Loading the IBM_DB driver

In order to connect to DB2 LUW from Ruby, we first need to “require” the driver. Linux and Unix users can run:

require 'rubygems'
require 'ibm_db'

On Windows, use:

require 'rubygems'
require 'mswin32/ibm_db'

Please note that you won’t need the first line if you haven’t installed the driver through RubyGems or if you already have the RUBYOPT environment variable set for rubygems. The execution of the second line will return true if the driver has been successfully “required”, false if it was already loaded, and it will raise an exception otherwise. From now on I’ll use ‘ibm_db’, so if you are using Windows, change it accordingly to ‘mswin32/ibm_db’.

Connecting to a DB2 database

The IBM_DB module contains a series of methods and classes that can be used to interact with DB2 from Ruby. The first method that we’ll use is IBM_DB.connect. This allows us to connect to both local and remote databases. Those who have DB2 Connect installed will even be able to connect to DB2 on iSeries and z/OS.

More technically, connecting to a local database means establishing a cataloged connection by specifying a database alias that exists in the DB2 client catalog. For the first few examples we’ll use the SAMPLE local database/alias. In case you don’t have it already, you can create it by running db2sampl.

The following snippet connects to the SAMPLE local database:

require 'rubygems'
require 'ibm_db'

conn = IBM_DB.connect("sample", "db2inst1", "mypassword")

if conn
  puts "We're connected!"
  IBM_DB.close(conn)
else
  puts "There was an error in the connection: #{IBM_DB.conn_errormsg}"
end

The following is the equivalent code for connecting through TCP/IP to a remote database (in reality it obviously connects to local databases too, but locally it’s usually better to use an alias):

require 'rubygems'
require 'ibm_db'

conn = IBM_DB.connect("DRIVER={IBM DB2 ODBC DRIVER};DATABASE=sample;\
                       HOSTNAME=localhost;PORT=50000;PROTOCOL=TCPIP;\
                       UID=db2inst1;PWD=mypassword;", "", "")
if conn
  puts "We're connected!"
  IBM_DB.close(conn)
else
  puts "There was an error in the connection: #{IBM_DB.conn_errormsg}"
end

For an “uncataloged” connection to a database through TCP/IP, the complete connection string is passed to the connect method. Beside the usual arguments (database, username and password), in this case we also need to specify the hostname or IP address of the dataserver, and the port number (defaulted to 50000 on Linux/Unix).

In both cases (local or remote), the connect method returns a connection handle resource when the connection is successfully established, while it returns false for any failed attempts. This implies a couple of important things. The connection outcome should be verified with a conditional if statement, because it won’t raise any exceptions for you if there are problems connecting. Luckily you can use IBM_DB.conn_errormsg to obtain a string containing the error message returned by DB2. For example, in our snippet, if we were to use the wrong password, we would receive an error message similar to the following:

There was an error in the connection: [IBM][CLI Driver] SQL30082N Security processing failed with reason "24" ("USERNAME AND/OR PASSWORD INVALID"). SQLSTATE=08001 SQLCODE=-30082"

Error messages are usually descriptive enough to let you understand what the problem is. Should you find yourself stuck though, you can first try searching through the Information Center, then Google it and if you still can’t solve the issue, ask directly in the DB2 forum.

The following is a common error with a message that is rather puzzling:

"[IBM][CLI Driver] CLI0133E Option type out of range. SQLSTATE=HY092 SQLCODE=-99999"

If you get this error, it usually means that you are using a version of DB2 that is too old. Install the latest FixPack or the latest version of DB2 (currently 9.5) to resolve the problem.

Please note that while it’s a good idea to use the method IBM_DB.close to close connections that are no longer needed, the driver would have closed the connection automatically for you upon finishing the execution of the script.

The IBM_DB.connect method also accepts an optional parameter that affects the behavior of the connection. For example it’s possible to turn on or off the autocommit, specify the case for the column names in the resultset, or even define a scrollable cursor. As usual, the API reference has all the details, but here is an example that turns off autocommit and specifies the option for retrieving lowercase columns:

require 'rubygems'
require 'ibm_db'

conn = IBM_DB.connect("sample", "db2inst1", "mypassword",
                      {IBM_DB.SQL_ATTR_AUTOCOMMIT => IBM_DB.SQL_AUTOCOMMIT_OFF, 
                       IBM_DB.ATTR_CASE => IBM_DB.CASE_LOWER})
if conn
  puts "We're connected!"
  IBM_DB.close(conn)
else
  puts "There was an error in the connection: #{IBM_DB.conn_errormsg}"
end

Finally, before moving on, note that there is a similar method called IBM_DB.pconnect that establishes persistent connections.

Executing queries

The following snippet executes a query and prints the retrieved records:

require 'rubygems'
require 'ibm_db'

if conn = IBM_DB.connect("sample", "db2inst1", "mypassword")
  sql = "SELECT * FROM EMPLOYEE"
  begin
    if stmt = IBM_DB.exec(conn, sql)
      while row = IBM_DB.fetch_assoc(stmt)
        puts "#{row['FIRSTNME']} #{row['LASTNAME']}: #{row['EMPNO']}"
      end
      IBM_DB.free_result(stmt)
    else
      puts "Statement execution failed: #{IBM_DB.stmt_errormsg}"
    end
  ensure
    IBM_DB.close(conn)
  end
else
  puts "Connection failed: #{IBM_DB.conn_errormsg}"
end

It may seem complicated because we added some extra “padding” to the code, but executing queries with the driver is very straightforward. The reason why we have these extra conditions and error checks, is because both the connect and the exec methods don’t raise errors when they fail. Let’s analyze in detail the code above.

We first attempt to establish a connection, placing the result in the conn variable. The returned value can either be a valid connection handle or false. If it’s a false value, it means that the connection attempt failed, therefore we print the error message retrieved with IBM_DB.conn_errormsg.

If the connection was successful we then proceed to execute a query through the method IBM_DB.exec, which accepts two arguments: the connection handle stored in conn, and a string representing the SQL query. This, just like the method connect, will return false if DB2 returns an error and therefore the execution of the query has failed. In that case we print the error message with the IBM_DB:stmt_errormsg method. If the SQL statement was issued successfully the exec method returns a statement resource that we use to fetch the records (of course, for SELECT queries).

In order to fetch the results, we loop using the IBM_DB.fetch_assoc method that retrieves one record at a time. The loop will be interrupted when there’s no more rows left in the resultset. The IBM_DB.fetch_assoc method returns a hash whose keys correspond to the columns. By default, SQL columns are case insensitive, so unless you specified otherwise in a connection attribute, the hash will have uppercase keys.

The table EMPLOYEE has several columns, amongst others FIRSTNME (not a typo on my part), LASTNAME and EMPNO. In this snippet we print all the employees’ first and last names, along with their employee numbers. Finally, we clean up the resources associated with the resultset by executing IBM_DB.free_result. This operation, just like closing the connection, would have happened automatically if not specified, when the script ends.

A series of alternative (and similar) methods exist: fetch_array, fetch_both, fetch_object and fetch_row. Again, the API reference has all the details. They are very easy to use if you understand the manner in which we used fetch_assoc in the snippet above.

Parameterized queries

In the previous section we showed how to run an SQL query to interrogate a DB2 database from Ruby. While the approach adopted so far is perfectly acceptable, in some instances there is a better way. Chances are that sometimes you’ll find yourself repeating the same query over and over.

In some other scenarios, you may also run a series of queries which follow the same pattern except that the values provided vary from query to query. Consider for a moment the following queries:

SELECT * FROM USERS WHERE AGE > 20;
SELECT * FROM USERS WHERE AGE > 30;
SELECT * FROM USERS WHERE AGE > 40;
SELECT * FROM USERS WHERE AGE > 50;
SELECT * FROM USERS WHERE AGE > 60;
SELECT * FROM USERS WHERE AGE > 70;

These are identical except for the cut off value for the column AGE. In such cases, the best approach is to use parameterized queries. For those who aren’t too familiar with the expression, parameterized queries are queries where the SQL statement contains parameters whose values will be passed to the database without having to dynamically create the complete SQL query within your program.

This approach is not only less error prone and much more secure (e.g. in avoiding SQL Injections), but it can also drastically improve performance.

The following snippet shows an example of their usage:

require 'rubygems'
require 'ibm_db'

if conn = IBM_DB.connect("sample", "db2inst1", "mypassword")

  sql = " SELECT FIRSTNME, LASTNAME FROM EMPLOYEE WHERE SALARY BETWEEN ? AND ?"
  
  stmt = IBM_DB.prepare(conn, sql)
  values = [15000, 50000]
  
  begin
    if IBM_DB.execute(stmt, values)
      while row = IBM_DB.fetch_array(stmt)
        puts "#{row[0]} #{row[1]}"
      end
    else
      puts "Execution failed: #{IBM_DB.stmt_errormsg(stmt)}"
    end
  ensure
    IBM_DB.close(conn)
  end
else
  puts "Connection failed: #{IBM_DB.conn_errormsg}"
end

Let’s take a look at what’s going on here. We connect as usual but this time around we place question marks in the SQL query rather than the actual values. Then we prepare the statement through the method IBM_DB.prepare which accepts the connection handle and the parameterized SQL query as required arguments. The API reference for the driver has details for an optional third parameter that changes the way the cursor operates.

Behind the scenes, DB2 creates an optimized access plan for retrieving data with that statement. Subsequently issuing the prepared statement with IBM_DB.execute (not IBM_DB.exec) enables the statements to reuse that access plan and avoids the overhead of dynamically creating a new access plan for every statement we issue. Please note that we could bind the parameters explicitly by using the method IBM_DB.bind_param, which allows us to specify the parameter type (input, output, input-output or file), data type, precision, scale and so on. However this is usually only necessary for stored procedures where we intend to obtain information through output parameters.

If all the parameters are input ones, we can simply pass an array of values (aptly named ‘values’ in our snippet) to the IBM_DB.execute method. Notice that in the code above, we arbitrarily decided to use the IBM_DB.fetch_array method rather than IBM_DB.fetch_assoc as used in the previous one. Unlike the latter, with the former we use the numeric index 0 and 1 to access the values contained in the first two columns of the current record, since the method returns an array and not an associative array (a hash).

Stored Procedures

The ibm_db ruby driver fully supports stored procedures. Let’s start with an extremely simple example that is technically no different than running any other query.

require 'rubygems'
require 'ibm_db'

if conn = IBM_DB.connect("books", "db2inst1", "mypassword")
  
  drop_column_sql = "ALTER TABLE USERS DROP SAMPLE_COLUMN" 
  reorg_sql = "CALL ADMIN_CMD('REORG TABLE USERS')"
  
  if IBM_DB.exec(conn, drop_column_sql)
    puts "Column dropped"
  else
    puts IBM_DB.stmt_errormsg
  end
 
  if IBM_DB.exec(conn, reorg_sql)
    puts "Reorg successful"
  else
    puts IBM_DB.stmt_errormsg
  end
  
else
  puts "Connection failed: #{IBM_DB.conn_errormsg}"
end

The snippet above drops a column from a table. After dropping a column, DB2 requires the issuing of a REORG command. Unfortunately, IBM_DB.exec can only execute valid SQL statements and not DB2 command line processor (CLP) commands. Luckily though, the stored procedure ADMIN_CMD allows us to execute certain administrative commands, including REORG.

So, as you can see, in simple cases like this we can just use pass an SQL statement in which we call a stored procedure or user function to IBM_DB.exec. In practice, things are not always this straightforward. If we are using stored procedures, chances are that we need to handle output parameters too.

Let’s assume that we have a stored procedure called SALES_BY_TITLE which requires three parameters. The first parameter is the name of the author, the second one is the title of a book. Imagine that the second parameter is an input-output one, because we can provide a partial title for the book, and the procedure will provide us with the complete title if it exists.

The third (output) parameter will tell us the number of copies sold to date. From Ruby, we can handle this scenario quite easily:

require 'rubygems'
require 'ibm_db'

conn = IBM_DB.connect("books", "db2inst1", "mypassword")

if conn
  sql = "CALL SALES_BY_TITLE(?, ?, ?)"
  stmt = IBM_DB.prepare(conn, sql)
  
  author = "Allen Ginsberg"
  title = "Cosmo%"
  copies = 0
  
  IBM_DB.bind_param(stmt, 1, "author", IBM_DB.PARAM_IN)
  IBM_DB.bind_param(stmt, 2, "title", IBM_DB.PARAM_INOUT)
  IBM_DB.bind_param(stmt, 3, "copies", IBM_DB.PARAM_OUT)
  
  if IBM_DB.execute(stmt)
    puts "Title: #{title}"
    puts "Copies: #{copies}"
  else
    puts IBM_DB.stmt_errormsg(stmt)
  end
else
  puts "Connection failed: #{IBM_DB.conn_errormsg}"
end

As shown before, we CALL the stored procedure and use question marks as placeholders for the parameters in the SQL statement. Immediately after having prepared the statement, we need to initialize the three variables author, title and copies, and bind them to the prepared statement through the IBM_DB.bind_param method. The second argument of this method is the position of the parameter (starting from 1) and the third one is the type of parameter. Again, the API reference has more details about its usage.

After executing the statement (with the IBM_DB.execute method) the two variables, title and copies, which where respectively bound as INOUT and OUT parameters, will contain the full title of the book and the number of copies which we can print for the user to see.

While the IBM_DB Ruby driver provides us with many more methods, understanding this tutorial will give you enough material to get started with using the driver. Above all, I hope it will make the process of looking up specific methods in the online documentation much easier.

Get more stuff like this

Subscribe to my mailing list to receive similar updates about programming.

Thank you for subscribing. Please check your email to confirm your subscription.

Something went wrong.

13 Comments

  1. Ermar February 8, 2008
  2. Ronald February 8, 2008
  3. Gerardo Santana February 8, 2008
  4. Antonio Cangiano February 8, 2008
  5. Nathan February 8, 2008
  6. Gerardo Santana February 8, 2008
  7. Antonio Cangiano February 8, 2008
  8. Gerardo Santana February 8, 2008
  9. Michele March 6, 2008
  10. Frank June 11, 2008
  11. Pingback: express reorg July 25, 2008
  12. Jeff July 31, 2009
  13. Antonio Cangiano July 31, 2009
  14. Aman March 26, 2020

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.