Oct
21
What is DB2 pureScale?
Filed Under DB2 | 3 Comments
There are two main types of scalability: vertical and horizontal. Vertical scalability consists of potentiating the hardware specs of a given server. This is typically done by increasing the number (and to a minor extent, speed) of the processors, adding more RAM, and so on. Commodity hardware tends to impose a strong limit on the resources that can be augmented. There is only so much that can be stuffed into your typical x64_86 Dell server, even replacing the entire machine with the top of the line model (having the same architecture). For this very reason, many customers opt to purchase more expensive hardware that has the advantage of facilitating vertical scalability, by allowing much more headroom for additional resources. This is why Unix-based solutions like IBM System p + AIX and Sun’s Solaris + SPARC are common choices in demanding environments.
For all their merits, even premium hardware solutions can only do so much in terms of vertical scalability. They may postpone the problem, by allowing a large number of processors and a huge amount of RAM to be installed, but they are still not limitless. And that’s when it becomes cheaper and much more feasible to attempt horizontal scalability, or the process of adding extra servers that share data and the overall system load.
Commercial databases, particularly DB2, excel at vertical scalability, which incidentally is one of the weakest aspects of open source RDBMS like MySQL. When it comes to horizontal scalability though, even commercial databases tend do struggle a little. Scalability is achieved, but a lot of work is required to configure complex setups involving masters, slaves, replication, failover servers, and so on. Until now.

IBM has just introduced a new technology for DB2 called pureScale, which accomplishes three incredible feats:
- Horizontal scalability: Virtually unlimited capacity by simply adding new nodes to the cluster.
- Application transparency: No code changes needed.
- Continuos data availability: regardless of which server fails, the system will continue to operate albeit with a proportionally reduced throughput.
DB2 pureScale applies to DB2 running on Unix (IBM System p running AIX only, for the moment), the same principles and rock solid technology employed by DB2 for z/OS Data Sharing and SYSPLEX. This technology happens to be employed by some of the largest companies in the world and it’s considered the golden standard when it comes to easily create OLTP database clusters, providing virtually unlimited scalability capabilities.
You can read more about it here and here. If you are going to IOD this year, don’t forget to attend this overview by one the architects of DB2 pureScale. And even if you can’t head over to Las Vegas next week, you can still attend online our Chat with the Lab today at 11:30am EST. Database scalability is a hot topic and this is going to be one of the most popular chats with the lab ever, so register for free now.
Sep
10
DB2 Express-C 9.7 can easily be installed on Ubuntu 9.04 Desktop edition by simply issuing sudo ./db2setup. This will open up a launchpad and you’ll be able to install the product through a wizard. But what about setting up DB2 Express-C 9.7 on Ubuntu 9.04 Server edition? When you are ssh-ing into your VPS or dedicated sever, there are no GUIs to help you out.
IBM provides all the information on how to proceed through a series of detailed documents, which explain why each command is required and what your options are. But if you are looking for a single document that is slightly more concise in nature, you can read this excellent step by step guide that was published today on the blog bits.of.info.
Sep
9
Improve the speed and security of your SQL queries
Filed Under DB2, Python, Ruby | 5 Comments
An easy way to improve the performance and security of SQL queries is to replace literals with parameters. By replacing literal values with parameters, advanced relational databases will be able to compile your queries and have their execution plans cached. This saves time and precious resources when the same query (minus the actual values) is executed over and over.
Consider the following series of queries:
SELECT * FROM users WHERE karma BETWEEN 100 AND 499;
SELECT * FROM users WHERE karma BETWEEN 500 AND 999;
SELECT * FROM users WHERE karma BETWEEN 1000 AND 1999;
SELECT * FROM users WHERE karma BETWEEN 2000 AND 4999;
SELECT * FROM users WHERE karma BETWEEN 5000 AND 9999;
SELECT * FROM users WHERE karma BETWEEN 10000 AND 50000;
These each represent the same query and can be transformed into a single parameterized query:
SELECT * FROM users WHERE karma BETWEEN ? AND ?;
Trying to use clever tricks with quotes in order to inject arbitrary SQL code becomes futile. Parameters are considered values, and have no effect on the structure of the query itself.
Parameterized queries are therefore efficient and go a long way towards preventing SQL injection attacks in your applications. They have virtually no downside.
Newbie developers often ignore the existence of this feature and end up irritating seasoned DBAs who have to deal with the consequences of their incompetence. Leon Katsnelson argues that this is such an important matter, that every DBA should forward this Computerworld article to their developers. I tend to agree with how important of an issue that is.
That article provides the following example in Java:
String lastName = req.getParameter("lastName");
String query = "select * from customers where last_name = ?"
PreparedStatement pstmt = connection.prepareStatement(query);
pstmt.setString(1, lastName);
try { ResultSet results = pstmt.execute(); }
Here I’ll show you an example of how to work with parameterized queries from Ruby and Python. I’ll use the Ruby and Python drivers for DB2.
Ruby first:
require 'ibm_db'
conn = IBM_DB.connect("mydb", "db2inst1", "mypassword")
query = "SELECT * FROM users WHERE karma BETWEEN ? AND ?"
pstmt = IBM_DB.prepare(conn, query)
values = [500, 999]
IBM_DB.execute(pstmt, values)
while row = IBM_DB.fetch_array(pstmt)
puts "#{row[0]}:#{row[1]}"
end
We load the driver (use mswin32/ibm_db on Windows, and ibm_db.bundle on Mac), create a prepared statement, and then bind the two parameter values to it through the execute method. We then fetch the resultset one row at a time and print the value of the first two fields for each record. For fine-tuned control we could have used the IBM_DB::bind_param method.
The Python version is very similar:
import ibm_db
conn = ibm_db.connect("mydb", "db2inst1", "mypassword")
query = "SELECT * FROM users WHERE karma BETWEEN ? AND ?"
pstmt = ibm_db.prepare(conn, query)
values = (500, 999)
ibm_db.execute(pstmt, values)
tuple = ibm_db.fetch_tuple(pstmt)
while tuple:
print tuple[0] + ":" + tuple[1]
tuple = ibm_db.fetch_tuple(pstmt)
As you can see, working with parameterized queries is not any harder than dynamically generating SQL queries. Yet the benefits of doing so are huge.
Unfortunately, despite being a very sound choice to base an Object-Relational Mapper (ORM) on, ActiveRecord does not use parameterized queries. Even when it looks like you are passing parameters to a given method, these are actually used to dynamically form an SQL query. Of course you are still free to use parameterized queries in your Rails applications by employing the driver directly. But I really think this is something ActiveRecord should be built upon.
Luckily for Django developers, Django’s ORM uses parameterized queries, thus improving both performance and security with a single design choice. In the Python world you couldn’t get away with ignoring parameterized queries.
For those of you using Rails, all is not lost. DB2 Express-C 9.7 has a killer feature known as the Statement Concentrator, which caches similar queries allowing them to use a shared access plan. It’s not as efficient as using prepared statements in your code, but it’s the best you can do when, as in the case of ActiveRecord, you can’t use parameterized queries directly. Leon’s article explains in greater detail how this feature actually works.





















