Archive for February, 2007

Ruby Implementations Shootout: Ruby vs Yarv vs JRuby vs Gardens Point Ruby .NET vs Rubinius vs Cardinal

Antonio Cangiano February 19th, 2007

Many brilliant developers are working on improving the current implementation of Ruby and on creating alternatives. I was curious about their current respective speeds, so I installed and ran some benchmarks for the most popular implementations. In this article, I’m sharing the results for the community to see.

Disclaimer

  • Don’t read too much into this and don’t draw any final conclusions. Each of these exciting projects has its own reason for being, as well as different pros and cons, which are not considered in this post. They each have a different level of stability and completeness. Furthermore, some of them haven’t been optimized for speed yet. Take this post for what it is: an interesting experiment;
  • The results may entirely change in the next 3, 6, 12 months… I’ll be back!
  • The scope of the benchmarks is limited because they can’t stress every single feature of each implementation. It’s just a sensible set of benchmarks that give us a general idea of where we are in terms of speed;
  • These tests were run on my machine; your mileage may vary.

Benchmark Environment

My tests were conducted on an AMD Athlon™ 64 3500+ processor, with 1 GB of RAM.

The tested Ruby implementations were:

The operating system that was used for all – but Ruby.NET – is Ubuntu 6.10 (for x86). Ruby.NET currently runs on Microsoft Windows only, therefore I’ve used Vista with the .NET Framework 2.0 and have also run Ruby 1.8.5-p12 on Windows as a means of having a more direct comparison with Ruby.NET.

Ruby 1.9, JRuby, Rubinius and Cardinal were all installed using their respective latest development versions from trunk.

Tests used

The 41 tests used to benchmark the various Ruby implementations can be found within the benchmark folder in the repository of Ruby 1.9. The following is a list of the tests with a direct link to the source code for each of them:

Results

The following table shows the execution time expressed in seconds for Ruby 1.8.5 on Linux, Ruby 1.8.5 on Windows, Ruby 1.9 (Yarv/Rite) on Linux, JRuby on Linux, Gardens Point Ruby.NET on Windows, Rubinius on Linux and finally Cardinal on Linux.



LEGEND:

  • A blue bold font indicates that the given Ruby implementation was faster than the current stable, mainstream one was (Ruby 1.8.5 on Linux);
  • The baby blue background indicates that the given Ruby implementation was the fastest of the lot for the given test;
  • ‘Error’ indicates an abnormal interruption of the program. ‘Too long’ indicates that the execution took longer than 15 minutes and was manually interrupted;
  • Average and Median values take into consideration only the tests that completed (they exclude both ‘Error’ and ‘Too long’ programs).
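
To make the last point of the legend concrete, here is a minimal sketch of how an average and a median can be computed while skipping the tests that did not complete. The timings, the extra test names and the helper names (completed_times, average, median) are made up for illustration; they are not taken from the results table or from the scripts I used.

```ruby
# Hypothetical timings in seconds for a single implementation.
# The values are invented; :error and :too_long stand in for the
# 'Error' and 'Too long' entries of the table.
timings = {
  "so_sieve"  => 2.0,
  "so_matrix" => 4.0,
  "vm1_swap"  => :error,
  "example_a" => :too_long,
  "example_b" => 9.0
}

# Keep only the tests that produced a numeric time.
def completed_times(timings)
  timings.values.select { |t| t.is_a?(Numeric) }
end

# Arithmetic mean of a non-empty list of numbers.
def average(values)
  values.sum / values.size.to_f
end

# Middle value of a non-empty list (mean of the two middle values
# when the list has an even number of elements).
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

times = completed_times(timings)
puts "average: #{average(times)}"  # prints "average: 5.0"
puts "median:  #{median(times)}"   # prints "median:  4.0"
```

Keep in mind that an implementation which fails the slowest tests is averaged over fewer (and possibly easier) tests, so these two numbers are not directly comparable across columns.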

Below is a chart which shows the average and median values, visually:



You may also be interested in visualizing a direct comparison of how many times a given implementation was faster or slower than Ruby 1.8.5 on Linux:



Of course, the bold green values indicate a positive performance, so for example Cardinal was 4 times faster than Ruby 1.8.5 on Linux for the test vm1_swap, but it was also 18 times slower for so_matrix (therefore in red).
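
For readers who want to reproduce this kind of comparison on their own timings, the following is a small sketch of one way to express "N times faster" or "N times slower" relative to a baseline. The method name compare_to_baseline and the timings are assumptions made for this example; they only mirror the 4x and 18x figures quoted above and are not the measured values.

```ruby
# Compares a timing against the baseline (both in seconds) and returns
# a pair: whether it was faster or slower, and by what factor.
def compare_to_baseline(baseline_seconds, other_seconds)
  if other_seconds <= baseline_seconds
    [:faster, baseline_seconds.to_f / other_seconds]
  else
    [:slower, other_seconds.to_f / baseline_seconds]
  end
end

# Made-up timings chosen to give round factors.
p compare_to_baseline(8.0, 2.0)   # prints [:faster, 4.0]
p compare_to_baseline(1.0, 18.0)  # prints [:slower, 18.0]
```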

I won’t provide too many personal considerations but rather let you enjoy the numbers. Generally speaking though, Ruby on Windows was about 1.5 times slower than on Linux. Yarv (merged in the development version of Ruby) is clearly the fastest by a long shot. This is good news (there are hopes for a fast Ruby 2.0), and it is not an unexpected result.

Ruby.NET and JRuby had similar performances and were able to execute most of the tests. It is clear though that they will need to focus on improving their individual speeds in the coming future, in order to be ready for prime time.

Cardinal wasn’t able to complete most tests, and was extremely slow in some others. However, on a few occasions it also showed decent results (beating Ruby 1.8.5 in 3 tests). Rubinius was extremely slow too but correctly handled a larger number of tests than Cardinal was able to (and it was significantly faster in executing so_sieve.rb).

I’d like to conclude by saying that all the people involved with these projects are doing an amazing job. And while some implementations show that they are in an early stage of development, this in no way detracts from the great effort and work done by their developers, nor is it an attempt to predict their future success or failure. So once again, great job guys, all of this is nothing short of exciting!

UPDATE 02/21/07: Wow, it looks like this article received a lot of attention and naturally I’m glad it did. Slashdot linked to this and traffic skyrocketed, giving major exposure to all these projects.

Most importantly, I initially thought I’d run another batch of tests in 3 months’ time, but given the amount of feedback that I’ve received, I’ll carry out another test run fairly soon to incorporate many of the insightful suggestions and requests.

By the way, Ruby 1.8.6 is out in preview, and some of you sent me emails asking to test it out. Running the tests shows that it’s usually slightly faster than 1.8.5 and it seems to notably speed up recursion-based tests. The next test run will have details for Ruby 1.8.6 as well.

Top 10 Ruby on Rails performance tips

Antonio Cangiano February 10th, 2007

The performance of Ruby on Rails is influenced by many factors, particularly the configuration of your deployment server(s). However, the application code can make a big difference and determine whether your site is slow or highly responsive. This short article covers some tips and coding best practices for improving performance in Rails only, and won’t attempt to cover the server configuration improvements for the various deployment options.

  1. Optimize your Ruby code: this may seem obvious, but a Rails application is essentially Ruby code that will have to be run. Make sure your code is efficient from a Ruby standpoint. Take a look at your code and ask yourself if some refactoring is in order, keeping in mind performance considerations and algorithmic efficiency. Profiling tools are, of course, very helpful in identifying slow code, but the following are some general considerations (some of them may admittedly appear obvious to you):
    • When available use the built-in classes and methods, rather than rolling your own;
    • Use Regular Expressions rather than costly loops, when you need to parse and process all but the smallest text;
    • Use Libxml rather than the slower REXML if you are processing XML documents;
    • Sometimes you may want to trade off just a bit of elegance and abstraction for speed (e.g. define_method and yield can be costly);
    • The best way to deal with slow loops is to remove them, if possible. Not always, but in a few cases you can avoid loops by restructuring your code;
    • Simplify and reduce nested if/unless as much as you can and remember that the operator ||= is your friend;
    • Hashes are expensive data structures. Consider storing the value for a given key in a local variable if you need to recall the value a few times. More generally, it’s a good idea to store in a variable (local, instance or class variable) any frequently accessed data structure.
  2. Caching is good: caching can significantly speed up your application. In particular:
  3. Use your database to the full extent of the law :) : don’t be afraid of using the cool features provided by your database, even if they are not directly supported by Rails and doing so means bypassing ActiveRecord. For example, define stored procedures and functions, knowing that you can use them by communicating directly with the database through driver calls, rather than through ActiveRecord’s high level methods. This can hugely improve the performance of a data-bound Rails application.
  4. Finders are great but be careful: finders are very pleasant to use, enable you to write readable code and they don’t require in-depth SQL knowledge. But the nice high level abstraction comes with a computational cost. Follow these rules of thumb:
    • Retrieve only the information that you need. A lot of execution time can be wasted by running selects for data that is not really needed. When using the various finders, make sure to provide the right options to select only the fields required (:select), and if you only need a numbered subset of records from the resultset, specify a limit (with the :limit and :offset options).
    • Don’t kill your database with too many queries, use eager loading of associations through the include option:
      # This will generate only one query,
      # rather than Post.count + 1 queries
      for post in Post.find(:all,
                            :include => [ :author, :comments ])
        # Do something with post
      end
    • Avoid dynamic finders like MyModel.find_by_*. While using something like User.find_by_username is very readable and easy, it can also cost you a lot. ActiveRecord dynamically generates these methods within method_missing, and this can be quite slow. Moreover, once the method is defined and invoked, the mapping with the model attribute (username in our example) is ultimately achieved through a select query which is built before being sent to the database. Using MyModel.find_by_sql directly, or even MyModel.find, is much more efficient;
    • Be sure to use MyModel.find_by_sql whenever you need to run an optimized SQL query. Needless to say, even if the final SQL statement ends up being the same, find_by_sql is more efficient than the equivalent find (there is no need to build the actual SQL string from the various options passed to the method). If you are building a plugin that needs to be cross-platform though, verify that the SQL queries will run on all Rails supported databases, or just use find instead. In general, using find is more readable and leads to more maintainable code, so before starting to fill your application with find_by_sql, do some profiling and identify the slow queries which may need to be customized and optimized manually.
  5. Group operations in a transaction: ActiveRecord wraps the creation or update of a record in a single transaction. Multiple inserts will then generate many transactions (one for each insert). Grouping multiple inserts in one single transaction will speed things up.

    Instead of:

     my_collection.each do |q|
       Quote.create({:phrase => q})
     end

    Use:

    Quote.transaction do
     my_collection.each do |q|
       Quote.create({:phrase => q})
     end
    end

    or for rolling back the whole transaction if any insert fails, use:

    Quote.transaction do
     my_collection.each do |q|
       quote = Quote.new({:phrase => q})
       quote.save!
     end
    end
  6. Control your controllers: filters are expensive, don’t abuse them. Also, don’t create instance variables that are not actually required by your views (they are not light).
  7. Use HTML for your views: in your view templates don’t overuse helpers. Every time you use form helpers you are introducing an extra step. Do you really need a helper to write the HTML for a link, a textbox or a form for you? (You may even make your designer, who doesn’t know Ruby, happy!)
  8. Logging: configure your applications so that they log only the information that is absolutely vital to you. Logging is an expensive operation and an inappropriate level (e.g. Logger::DEBUG) can cripple your production application.
  9. Patch the GC: OK, not really a coding issue, but patching Ruby’s Garbage Collection is strongly advised and will improve the speed of your Ruby and Rails applications significantly.
  10. A final note: I don’t advocate premature optimization, but if you can, work on your code with these principles in mind (but don’t overdo it either). Last minute changes and tweaks are possible but less desirable than a “performance aware” style of coding. Profile your applications, benchmark them and have fun experimenting.
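
As a starting point for that kind of experimenting, here is a minimal sketch that uses Ruby’s standard Benchmark module to compare two of the styles mentioned in tip 1: looking a value up in a hash on every iteration versus storing it in a local variable first. The SETTINGS hash, the method names and the data are all hypothetical, and the size of the difference (if any) will depend on your Ruby version and machine, so treat the output as something to measure rather than a result to expect.

```ruby
require "benchmark"

# Hypothetical configuration hash, invented for this example.
SETTINGS = { :tax_rate => 0.2 }

# Looks the rate up in the hash on every iteration.
def total_with_repeated_lookup(prices)
  prices.inject(0.0) { |sum, price| sum + price * (1 + SETTINGS[:tax_rate]) }
end

# Reads the rate once and keeps it in a local variable.
def total_with_local(prices)
  rate = SETTINGS[:tax_rate]
  prices.inject(0.0) { |sum, price| sum + price * (1 + rate) }
end

# Made-up input data: 200,000 prices between 1.0 and 100.0.
prices = Array.new(200_000) { |i| (i % 100) + 1.0 }

# Prints user, system and real time for each variant.
Benchmark.bm(16) do |x|
  x.report("repeated lookup:") { total_with_repeated_lookup(prices) }
  x.report("local variable:")  { total_with_local(prices) }
end
```

Both methods return the same total, so the only thing being compared is the time. Run the benchmark a few times before drawing conclusions, since a single run can be noisy.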

Acts As Suggest plugin

Antonio Cangiano February 8th, 2007

When searching for the word “honnolullu”, Google will promptly suggest “Did you mean: honolulu”. This feature is quite useful because it drastically improves usability. The lack of this exact function makes Wikipedia a pain in the neck as far as searching goes. So much so that I, and I suspect most people, just google for “wikipedia searched_term”.

Hence, I decided to create a small plugin. The acts_as_suggest Rails plugin allows developers to easily add this functionality to any ActiveRecord class, basing the suggestions on the existing values in the table. For example, if you have a table Articles with a column Title, you will be able to retrieve a list of possible corrections for a misspelled article title. The suggestions are not coming from a dictionary, making it very flexible.

Installation



Make sure you have the required Text rubygem. If you don’t, please install it: gem install Text

Download acts_as_suggest.zip, extract it and copy the acts_as_suggest folder into the vendor/plugins directory of your application. You’ll also need to restart your web server.

Usage



Assuming you have a model for a table that contains a list of articles, you would start by including acts_as_suggest in your model:

class Article < ActiveRecord::Base
  acts_as_suggest
end

This will provide the model with the suggest method, which is the core of the plugin. At this point you can retrieve suggestions as follows:

Article.suggest(:title, 'honnolullu')
Article.suggest([:title, :author], 'David Copperffelds')

These will return an array of records if the searched string is not misspelled and matches an existing value in the database. If no matches are found, the method will attempt to find similar existing values and return an array of strings (your suggestions). If none of the strings in the given column(s) are close enough, an empty suggestion array is returned.

The return values reflect a fairly common scenario. With a single method call I can retrieve all the records that match a user search, or I can retrieve suggestions to correct the spelling of his/her search if no records are found.
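
To illustrate the idea behind this behavior, here is a simplified, self-contained sketch that works on a plain array of strings rather than on ActiveRecord models. It is not the plugin’s actual code: the levenshtein and suggest methods below, the max_distance threshold of 2 and the sample titles are all assumptions made for this example, and the edit distance is implemented by hand instead of relying on the Text gem.

```ruby
# Levenshtein (edit) distance between two strings, computed row by row
# with the classic dynamic-programming recurrence.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1,         # deletion
                  current[j - 1] + 1,      # insertion
                  previous[j - 1] + cost   # substitution
                 ].min
    end
    previous = current
  end
  previous.last
end

# Returns the exact (case-insensitive) matches if there are any;
# otherwise returns the values within max_distance edits of the query,
# closest first. An empty array means nothing was close enough.
def suggest(values, query, max_distance = 2)
  needle = query.downcase
  exact = values.select { |v| v.downcase == needle }
  return exact unless exact.empty?

  values.map { |v| [levenshtein(v.downcase, needle), v] }
        .select { |distance, _| distance <= max_distance }
        .sort
        .map { |_, v| v }
end

# Hypothetical article titles.
titles = ["Honolulu", "Honduras", "Hong Kong"]

p suggest(titles, "honolulu")    # prints ["Honolulu"] (exact match)
p suggest(titles, "honnolullu")  # prints ["Honolulu"] (2 edits away)
p suggest(titles, "zzzzzz")      # prints [] (nothing close enough)
```

Unlike the plugin, this sketch returns strings in both cases, so the caller cannot tell an exact match from a suggestion just by looking at the element type.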

Please check the documentation for further details.

Documentation



You can read the short documentation online here or access the local copy in the /doc folder.

Bugs and Feedback



Here comes a little disclaimer. I wrote this plugin in half an hour, including the basic documentation and a few tests. Therefore, I’ve not yet set up an SVN repository for it; there are just a few test cases (I’m not even publishing them at this time) and it is very likely that there will be bugs, issues or possible improvements required. I’m releasing this .zip file as early as possible, while the code is in its infancy. This way I can gauge user interest and, if there is enough of it, develop the plugin further. So please, send in any feedback and bug reports by email.

UPDATE (02/14/07): I’ve fixed a vulnerability regarding the unsafe handling of tainted data in a finder method (spotted by Alex Wayne). Please download version 0.1.1.

Given the enthusiasm and interest shown by many people, I’m now considering the possibility of starting an actual small project on Rubyforge.