Ruby on Rails: An experience report

© 2011 David Röthlisberger david@rothlis.net
2 March 2011.

These are my thoughts on the Ruby on Rails web framework after developing, deploying, and supporting an internal web app for a small business with several hundred users spread across the world. 23k lines of code (excluding tests & 3rd party plugins); 10 man-months spread over 3 years (80% in the first year).

Relative sizes of components of my application's source code

LoC [1]   Changes [2]   Component
31%       33%           Models
16%       17%           Controllers
8%        7%            Config & infrastructure code [3]
10%       7%            Database migrations
4%        4%            View helpers
31%       32%           View templates

1. Lines of Code.

2. Total lines added or modified across the entire history of the repository.

3. Excludes Rails itself and 3rd party plugins.

After significant initial frustration and pain I now feel quite productive in Rails. It allows you to develop a lot of functionality in a small amount of time, but, as with all frameworks, when you encounter a problem you’ll need to acquire a thorough understanding of the framework in order to proceed. This may negate the development speed advantage, but only for your first (significant) project.

I would use Rails again, for certain classes of projects, if I was in a hurry and I wasn’t overly concerned with —or had plenty of resources to dedicate to— maintainability.

What I mean by that: If you build your company’s primary product on Rails, and have full-time developers dedicated to it over its lifetime, then the maintainability issues won’t be significant. But if you want to write something and leave it running without having to worry about it, even small security updates can become huge time sinks. The Rails team is concerned with innovation far more than with backwards compatibility. At least you will get security updates — Rails is a very active project.

If you want a CMS, there are much more efficient technologies for the task. For very small projects (an email-gathering page or a simple webservice) I’d go with a micro-framework like Sinatra (or its equivalent in your favourite language). For complex workflows or rich client applications, Rails wasn’t designed for your needs — you’ll really want a client-side framework. For all other CRUD, Rails could be a good fit for you.

What follows is a high-level view of the framework’s features I have used, plus anecdotes from the most significant pain points I have encountered. If I seem to focus too much on the latter, keep in mind that other frameworks have similar problems, or different ones of their own. My criticisms are only “arrows of longing for a better shore”.

Documentation

Rails documentation is a lot better than it used to be. Three years ago the free documentation was fragmented across the blogosphere, much of it was out of date, and you pretty much had to buy a book. Today, the Rails Guides are fairly comprehensive and probably enough to get you started.

Rails has a very active community and there are thousands of tutorials, screencasts and blogposts on common tasks. Do make sure that the tutorial’s major version of Rails matches yours.

For more sysadminny stuff, e.g. how best to manage 3rd party gems (ruby packages), you’re more dependent on Google and hoping that the blog post you found is still current. Anything older than about a year is quite likely obsolete.

The Rails API reference (also at apidock.com with user annotations) is fairly complete, though you need to know where to look: Many of the methods you’ll use most frequently are actually generated dynamically, so you won’t find them directly in the API reference.

For example, you might have a Student model, in which you declare belongs_to :school. Rails then defines several methods on the Student class for you: school, school=, build_school, and create_school. You’ll find these documented under belongs_to. Other methods, such as name, name=, name?, and find_by_name, are generated automatically based on the schema of the students table; their existence is documented under ActiveRecord::Base. Similarly, name_changed?, name_was, and name_change are under ActiveModel::Dirty. At the beginning this is confusing but you learn to find your way around.
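To make the idea of schema-derived methods concrete, here is a minimal plain-Ruby sketch. It is not how ActiveRecord is implemented: the TinyRecord class and its columns macro are names I made up for illustration, and the column list is hard-coded rather than read from a database.

```ruby
# Hypothetical stand-in for a model base class; not ActiveRecord's real code.
class TinyRecord
  # Class macro: generate reader, writer, query and dirty-tracking methods
  # for each column name, the way a framework might after reading the schema.
  def self.columns(*names)
    names.each do |col|
      define_method(col) { @attributes[col] }
      define_method("#{col}=") do |value|
        # Remember the original value the first time the attribute changes.
        @original[col] = @attributes[col] unless @original.key?(col)
        @attributes[col] = value
      end
      define_method("#{col}?") { !@attributes[col].nil? && @attributes[col] != '' }
      define_method("#{col}_changed?") do
        @original.key?(col) && @original[col] != @attributes[col]
      end
      define_method("#{col}_was") do
        @original.key?(col) ? @original[col] : @attributes[col]
      end
    end
  end

  def initialize(attributes = {})
    @attributes = attributes.dup
    @original = {}
  end
end

class Student < TinyRecord
  columns :name, :email   # in Rails this list would come from the students table
end

student = Student.new(:name => 'Ada')
student.name = 'Grace'
puts student.name            # Grace
puts student.name_was        # Ada
puts student.name_changed?   # true
puts student.email?          # false
```

Note that nothing in the Student class body mentions name_changed?, which is exactly why neither grep nor an API reference organised by method name will find it.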

The same limitations apply to navigation in your IDE/editor: It won’t find the definition of name_changed? etc. (It gets worse with Rails’s monkey-patching and method-chaining; see the section on the Rails source code for details.)

For Ruby the language: The first edition of Programming Ruby, which documents Ruby 1.8, is available free online (missing some figures & tables). Rails 3 uses Ruby 1.9, for which you’ll want to buy the book’s latest edition. ruby-doc.org has the Ruby API reference.

Models

ActiveRecord

Models, the “M” in “MVC”, are where your business logic goes. ActiveRecord is Rails’s Object-Relational Mapping library (Rails 3 allegedly allows you to use other ORMs instead).

ActiveRecord doesn’t require you to specify a model’s attributes: It takes them directly from the database schema (a “Student” model expects a “students” table). This is painless for greenfield projects, and adapting ActiveRecord to an existing schema that doesn’t follow the required conventions is also possible (with explicit annotations in your model).

Not having the table definition in your model source code might seem annoying, but in practice it’s not a big deal: The full schema is in the generated schema.rb file and I tend to have a MySQL session open in a terminal anyway. There is a gem to annotate your model source files with the corresponding schema information in a ruby comment, but relying on this sounds like a bad idea to me as the information in the comment is not canonical.

ActiveRecord ships with support for MySQL, Postgres, and SQLite. There is an Oracle adapter.

Your model source code will contain:

In my experience ActiveRecord is pretty good at letting you drop down to raw SQL when you need it. You can write
Student.find(:all, :conditions => {:name => n})
or
Student.find(:all, :conditions => 'arbitrary where clause'),
and similarly for the :include and :joins parameters to find, as long as you understand how ActiveRecord generates its SQL. Even arbitrary subqueries work, as long as you use table aliases that don’t conflict with the aliases generated by ActiveRecord.

If you say School.find(:all, :include => :students), ActiveRecord will try to generate two separate queries: One for the schools, and one for all the matching students. This is to avoid a cartesian product where all the data for each school is returned multiple times, once for each student. This is all handled transparently: To access the data you simply call students on any of the returned school objects. Very rarely, with complicated nested includes, ActiveRecord just refuses to load associated data. If this happens your options are to write a separate query for the associated data, and manage the two (or more) datasets manually in your Ruby code; or dig around the ActiveRecord source code, find the protected (but documented) method preload_associations, call it explicitly, find that it works, and hope it doesn’t break in the next version of Rails.
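The two-query strategy is easy to picture with plain arrays standing in for tables. This is only a sketch of the idea, not ActiveRecord's code; the data and the SCHOOLS and STUDENTS constants are invented for the example.

```ruby
# In-memory stand-ins for two tables (illustrative data, not from my app).
SCHOOLS  = [{ :id => 1, :name => 'North' }, { :id => 2, :name => 'South' }]
STUDENTS = [
  { :id => 10, :school_id => 1, :name => 'Ada' },
  { :id => 11, :school_id => 1, :name => 'Grace' },
  { :id => 12, :school_id => 2, :name => 'Linus' }
]

# "Query" 1: fetch the schools.
schools = SCHOOLS.map(&:dup)

# "Query" 2: fetch every student whose school_id is in the set just loaded,
# roughly: SELECT * FROM students WHERE school_id IN (1, 2)
school_ids = schools.map { |s| s[:id] }
students   = STUDENTS.select { |st| school_ids.include?(st[:school_id]) }

# Stitch the two result sets together in Ruby, so each school row is
# transferred once no matter how many students it has.
by_school = students.group_by { |st| st[:school_id] }
schools.each { |s| s[:students] = by_school.fetch(s[:id], []) }

schools.each do |s|
  puts "#{s[:name]}: #{s[:students].map { |st| st[:name] }.join(', ')}"
end
```

This is also the manual fallback described above for the rare cases where the include is not honoured: run the second query yourself and group the rows by foreign key.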

The logic around creating/saving associated records can get complicated, so you have to be careful; but to be fair it’s not a trivial problem.

ActiveRecord supports optimistic and pessimistic locking out-of-the-box, though with optimistic locking you’ll have to explicitly handle conflicts in your controller code.
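For readers who haven’t met optimistic locking, here is a rough sketch of the mechanism using a hash as a one-row table. ActiveRecord does the equivalent with a lock_version column and raises ActiveRecord::StaleObjectError; the StaleObjectError class and save method below are simplified stand-ins I wrote for the example.

```ruby
# Hypothetical error class standing in for ActiveRecord::StaleObjectError.
class StaleObjectError < StandardError; end

# A one-row "table": the stored record carries a version counter.
ROW = { :name => 'Ada', :lock_version => 0 }

# Save only succeeds if nobody else has bumped the version since we read it,
# roughly: UPDATE ... SET lock_version = 1 WHERE id = ? AND lock_version = 0
def save(copy)
  if copy[:lock_version] != ROW[:lock_version]
    raise StaleObjectError, 'record changed since it was loaded'
  end
  ROW.update(copy)
  ROW[:lock_version] += 1
  copy[:lock_version] = ROW[:lock_version]
end

# Two users load the same record, then both try to save.
first  = ROW.dup
second = ROW.dup

first[:name] = 'Grace'
save(first)                      # succeeds, version becomes 1

second[:name] = 'Linus'
begin
  save(second)                   # stale: still holds version 0
rescue StaleObjectError => e
  # This is the conflict the controller has to handle explicitly,
  # e.g. by re-rendering the form with a warning.
  puts "Conflict: #{e.message}"
end

puts ROW[:name]                  # Grace
```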

There are 3rd party plugins for versioning database records, although they either don’t support associations or haven’t been updated since Rails version 1.

Database migrations

Rails “migrations” are a way of specifying database schema changes in Ruby code like change_table(:students) {|t| t.string :new_attribute}. Using raw SQL in a migration is easy. Deploying and rolling back migrations is easy with Capistrano, but if an error occurs during a migration you will need to manually roll back any changes it did manage to make.

Rails uses a table to keep track of which migrations have been applied, though there’s no explicit support for ordering migrations other than a timestamp in the migration’s filename. I have no experience with Rails migrations in the context of multiple-developer teams and the ordering of migrations from merged development streams, so I won’t comment further.
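As a rough illustration of the bookkeeping, the sketch below works out which migrations are still pending from a list of applied versions and a list of filenames. The filenames and versions are invented, and this is my simplification rather than the code Rails runs.

```ruby
# Versions already recorded as applied (Rails keeps these in a
# schema_migrations table; the values here are made up).
applied = ['20100105093000', '20100211140500']

# Migration files as they might appear in db/migrate (hypothetical names).
files = [
  '20100211140500_add_email_to_students.rb',
  '20100105093000_create_students.rb',
  '20100120080000_create_schools.rb',   # merged in later from another branch
  '20100301101500_add_city_to_addresses.rb'
]

# The leading timestamp is the version, and also the only ordering key.
pending = files
  .map { |f| [f[/\A\d+/], f] }
  .reject { |version, _| applied.include?(version) }
  .sort_by { |version, _| version }

pending.each { |version, file| puts "#{version}  #{file}" }
```

In this sketch the late-arriving create_schools migration ends up running after a migration with a later timestamp has already been applied, which is the sort of merge-ordering question I can’t speak to from experience.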

Views

ERB

Rails’s default templating language is ERB, where you write literal HTML interspersed with tags like <%= ... %>, whose content is interpreted as ruby code and the result inserted into the template. Predictably, this results in messy tag soup. Fortunately it’s easy to use 3rd party templating engines instead.
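For a feel of what ERB looks like, this small example uses Ruby’s standard erb library directly, outside Rails. The template and the students variable are made up; inside Rails the template would live in its own file and the framework would do the rendering for you.

```ruby
require 'erb'

# A template mixing literal HTML with embedded Ruby.
template = <<-HTML
<ul>
<% students.each do |student| %>
  <li><%= ERB::Util.html_escape(student) %></li>
<% end %>
</ul>
HTML

students = ['Ada', 'Grace <admin>']

# result(binding) evaluates the embedded Ruby with access to local variables.
html = ERB.new(template).result(binding)
puts html
```

Even in this tiny template the control-flow tags and the markup compete for attention, and it only gets busier as conditionals and nested loops are added.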

Rails provides many helper functions for generating the HTML for forms, links, and validation messages, and for formatting numbers and dates, pluralising words, etc. I can’t comment on internationalization, but there’s a Rails Guide on it.

Rails’s XSS countermeasures are discussed in the section on Security.

Haml

Haml is a 3rd party templating engine, popular among Rails developers. It is beautiful but flawed.

I like Haml’s block-structure-by-indentation (Python-style significant whitespace). In general Haml results in much less text on screen, making the block structure very obvious and highlighting css selectors.

In my project, converting view templates from ERB to Haml resulted in 27% fewer lines and 10% fewer characters. (But my sample size is small: 21 templates totalling 1200 lines of ERB. Many templates I never converted to Haml, and others I created in Haml to start with.)

Now Haml’s flaws:

Haml’s newline-escaping is plain annoying: It’s like normal newline-escaping, except that you also have to escape the last newline. So how do you know that the next line doesn’t belong to the previous line? Because it doesn’t end in an escape character. But what if you also want to wrap the next line, without it merging into the previous one? You have to do this:

Logical line 1 wrapped |
across 2 physical lines |
-# Must insert this comment to separate lines
Logical line 2 wrapped |
across 2 physical lines |

The official line on this is “Using multiline declarations in Haml is intentionally awkward. This is designed to discourage people from putting lots and lots of Ruby code in their Haml templates.” Make of that what you will.

Haml inserts unexpected whitespace inside textareas unless the developer remembers to use different syntax for textareas compared to everything else. Supposedly this has been fixed, but still strikes when a Haml template inserts another partial template (maybe only a non-haml partial template?) with a textarea in it.

In general, Haml isn’t good at not producing whitespace. There are < and > specifiers for whitespace removal, but they only work with tags, not with filters or other Haml elements.

I found the source code for Haml’s parser difficult to modify: It’s a hand-rolled agglomeration of global variables. (But maybe that’s the nature of parsers.)

View templates represent 1/3 of all changes over my application’s development history, so it’s important to have good tools! If you don’t mind the PHP-style <%= %> tags (and your editor handles the mix of html and ruby sufficiently well) then you’re better off sticking with ERB. There are Haml look-alikes, which I haven’t used, but I don’t know whether they solve Haml’s shortcomings.

I have always been intrigued by the idea of generating html programmatically, like Smalltalk’s Seaside does. Markaby looks like the closest Ruby equivalent, but I have no first-hand experience with it.

Controllers

Rails best practice says “fat models, skinny controllers”: Put as much logic as possible into the models, and keep the controllers to the minimum amount of glue code.

Typically your controller will contain declarative statements like

before_filter :login_required

or (with the rolerequirement plugin)

require_role :admin, :for => [:new,:create,:edit,:update,:delete]

This is followed by code specific to individual actions. Typically this is as simple as:

def update
  @student = Student.find(params[:id])
  @student.attributes = params[:student]
  if @student.save
    redirect_to(@student)
  else
    render :action => 'edit'
  end
end

though in practice your controllers will often contain far more complex code, in spite of the “skinny controllers” mantra.

The mapping of URLs to a controller and action is configured centrally in a routes.rb file. It is straightforward and flexible.

Much has been said about Rails’s “magic”, but state is handled the old-fashioned way: Query string parameters and hidden form inputs. Rails tries to make this painless: In the code example above, params[:student] contains form parameters in a format, generated by Rails’s view helper functions, that is directly assignable to the student object.
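The “directly assignable” format is just a nested hash built from bracketed field names such as student[name]. The sketch below is a deliberately simplified version of that parsing; nest_params is a hypothetical helper, whereas the real work is done by Rails and Rack, which also handle arrays, URL decoding, and symbol-or-string key access.

```ruby
# Simplified sketch of turning bracketed form field names into a nested hash.
def nest_params(pairs)
  pairs.each_with_object({}) do |(key, value), params|
    # "student[address][city]" -> ["student", "address", "city"]
    path = key.scan(/[^\[\]]+/)
    leaf = path.pop
    target = path.inject(params) { |hash, part| hash[part] ||= {} }
    target[leaf] = value
  end
end

# Field names as a form helper might emit them for a Student form.
submitted = [
  ['id', '42'],
  ['student[name]', 'Ada'],
  ['student[email]', 'ada@example.com']
]

params = nest_params(submitted)
p params['student']   # a hash of attribute names to submitted values
```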

But for more complex interactions, spanning more than a single request/response cycle, you will need to manage session data yourself. Again, Rails provides easy-to-use mechanisms for storing data in browser cookies (with the associated size limitations), or on a memcached or database server. Still, it’s a manual process spanning multiple controller actions and view templates, tied together by hidden form fields, and my feeling is that complex interactions are best handled by a continuation-based framework, or perhaps on the client (browser) side.

The new RailsAdmin plugin offers an automatic CRUD interface similar to Django’s admin interface, for Rails 3 only. There are several other attempts of varying quality.

For an advanced framework aimed at CRUD applications, I am slightly disappointed by Rails in some regards: Repetition of boiler-plate in the controllers; until as recently as 2009 Rails didn’t directly support forms for nested models (e.g. updating a Student object and a list of associated Telephones from the same form; you had to buy an “advanced” book to learn how to do that); lack of discrete components/widgets. Due to Ruby’s extremely dynamic nature, disparities between a model and its views and controller will only be found by explicit tests or live users (more on this later).

Some of the above points may be unfair to Rails, better put down to a lack of developer discipline (i.e. my own) in developing application-specific abstractions.

Javascript

Rails 2.3 has built-in view helpers for generating inline Prototype and Scriptaculous javascript code, but you’ll be better off ignoring them and writing unobtrusive javascript by hand, perhaps using jQuery.

Rails 3 allegedly embraces unobtrusive javascript although trying to find out about it reminds me of last decade’s dark times of Rails documentation. There’s no mention of it in the Rails Guides.

Rails controllers can recognize an XHR request and deliver the appropriate content, be it HTML snippets or JSON data.

Debugging

The framework provides an API, concise and available from any model, view or controller, for writing to the logfile at various levels (info, debug, etc). It is possible to swap out the default logger with, for example, a syslog logger, without changing your application code.
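The swappable-logger idea can be shown with Ruby’s standard Logger class on its own. The enrol method and the two destinations below are invented for the example; in a Rails app you would call logger from your model or controller instead of passing one around.

```ruby
require 'logger'
require 'stringio'

# Application code only talks to "a logger"; it does not care where lines go.
def enrol(logger, student)
  logger.debug("enrol called with #{student.inspect}")
  logger.info("Enrolled #{student}")
end

# Destination 1: an in-memory buffer (a log file would work the same way).
buffer = StringIO.new
file_logger = Logger.new(buffer)
file_logger.level = Logger::INFO     # debug lines are dropped at this level

enrol(file_logger, 'Ada')
puts buffer.string

# Destination 2: swap in a different logger without touching enrol.
stderr_logger = Logger.new($stderr)
enrol(stderr_logger, 'Grace')
```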

Step-by-step debugging is easy to set up and just works, at least for basic inspection of variables, stepping through instructions, and navigating up and down stack frames. There is likely to be IDE support for the ruby debugger, but I find the command-line interface perfectly sufficient.

I should point out here that in development mode any changes to your model/view/controller code (including debugger breakpoints) take effect immediately — just reload the page.

Because of the dynamic nature of the Rails framework, don’t expect the debugger to step into framework functions (or at least, don’t expect what you see there to make any sense).

There is also script/console which loads an IRB (interactive ruby) shell with all your framework & application classes loaded. Very very handy for quick testing / prototyping / figuring-things-out.

Maintainability

My application is only 23k lines of code, not at all large as projects go (though there are unsubstantiated claims that equivalent applications in less dynamic frameworks might be 10 to 100 times larger). As a one-man team I can’t comment on the maintainability of Ruby/Rails code across large teams, but I do have the benefit of maintaining the application for over two years, and adding small pieces of functionality years after (my own) original development.

Application source code

Rails enforces the MVC pattern which results in fairly cleanly separated concerns — it’s pretty difficult for the developer to fight against it, when doing the right thing is just so easy (assuming competent data modeling).

Comprehensive test coverage is extremely important, given Ruby’s dynamic nature. The Rails Community’s emphasis on unit testing and test-driven development is no surprise. Rails folk love testing so much they write tests in plain English, then write regular expressions to convert the English into ruby tests.

As with most things, test coverage is a compromise. Non-critical or slow-changing applications may get away with only testing at the controller level (functional and integration tests). I certainly do, for the most part; but when I need to refactor I regret not having greater test coverage. (Actually, what I regret is not having statically-typed compile-time checking, but now we’re talking religion.) Upgrading the framework without perfect test coverage is downright scary.

Try to add features or behaviours in a declarative style. Ruby makes this very easy. For example, it’s roughly 100 lines of code, and a day to write even if you’re completely new to Ruby metaprogramming as I was, for an ActiveRecord extension whereby you can declare in one of your models:

searchable_by :name, :school => :name, :address => :city

which creates a search method on that model that will return records with a matching name, or with an associated school with a matching name, or with an associated address with a matching city.
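To show the shape of such a macro without needing a database, here is a toy version that filters an array of in-memory objects. It is not my actual extension (which builds SQL through ActiveRecord); the Searchable module, the two-argument search signature, and the sample data are all invented for this sketch.

```ruby
# Hypothetical sketch of a declarative search macro over in-memory objects.
module Searchable
  # searchable_by :name, :school => :name
  # Symbols name the record's own attributes; hash entries name an
  # associated object and the attribute to match on it.
  def searchable_by(*fields)
    own = fields.grep(Symbol)
    associated = fields.grep(Hash).inject({}) { |all, h| all.merge(h) }

    define_singleton_method(:search) do |records, term|
      needle = term.downcase
      records.select do |record|
        values = own.map { |attr| record.send(attr) }
        associated.each do |assoc, attr|
          other = record.send(assoc)
          values << other.send(attr) if other
        end
        values.compact.any? { |v| v.downcase.include?(needle) }
      end
    end
  end
end

School  = Struct.new(:name)
Address = Struct.new(:city)

class Student < Struct.new(:name, :school, :address)
  extend Searchable
  searchable_by :name, :school => :name, :address => :city
end

students = [
  Student.new('Ada',   School.new('Northfield'), Address.new('London')),
  Student.new('Grace', School.new('Ada Lovelace High'), nil),
  Student.new('Linus', School.new('Southgate'), Address.new('Helsinki'))
]

puts Student.search(students, 'ada').map(&:name).inspect       # ["Ada", "Grace"]
puts Student.search(students, 'helsinki').map(&:name).inspect  # ["Linus"]
```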

The advantage here is twofold: In the model itself, the search logic is entirely described by a single line of code, making for very compact, maintainable models; and you only really need to test the “searchable_by” framework, not the individual search functionality for each model that uses it.

As another data point, a declarative mechanism for specifying access to individual attributes of a model, depending on the logged-in user’s role, took me 2-3 days of initial work, based on a skeleton by Bruce Perens. This included extensions to the Rails FormBuilder helpers (text_field, check_box, etc.) to automatically show the attribute in a non-editable form for users lacking write permissions, and not to show it at all if lacking read permissions. And of course the same security rules enforced server-side, to protect against craftily-crafted forms.

With this level of meta-programming, documentation becomes very important (more on meta-programming in the section on the Rails source code).

Environment

Rails depends on various external packages such as the Ruby i18n library and rack (an interface for Ruby web frameworks). Problems in these packages, while rare, can be difficult to track down and fix. At least everything’s open source; but understanding the rack internals (or an i18n library when your app isn’t even internationalised) is going to take some time.

For system administration tasks, things are pretty easy and there are good, reasonably well-documented tools available:

RVM (Ruby Version Manager) makes it easy to switch between different ruby interpreters and gemsets in your development environment — useful if you work on more than one app at a time.

Bundler packages your application’s dependencies (as long as they’re ruby gems) so you really don’t have to worry about installing gems on production servers or synchronising developers' environments.

Pretty much all plugins and gems are on github, so once you figure out the process it’s easy to fork on github, clone your fork into your app’s vendor/plugins directory, and make your own changes, pulling from upstream as necessary (and of course contributing your changes back if at all appropriate).

The stability of the Rails stable branch

As of this writing, 2.3 and 3.0 are the official stable branches of Rails (where 3.0, released August 2010, has major infrastructural changes and the upgrade is not trivial).

The latest commits on the 2-3-stable branch indicate a certain cavalierness:

Ideally a stable branch would contain only security fixes and well-tested bugfixes, but in my experience you can’t safely upgrade along the 2.3 stable branch.

For example, at version 2.3.2 Rails had a bug handling unicode characters in multipart mime emails. Upgrading to 2.3.5 fixed the problem, but introduced a new one: Validation error messages went from “address must be specified” to (literally) “{{attribute}} {{message}}”.

This was fixed in 2.3.9, but 2.3.8 introduced another problem where form parameters are seemingly-randomly truncated (so sometimes, half the text a user types into a form input is lost). At version 2.3.11, 8 months after the cause and solution became known, this still hasn’t been fixed.

Rails source code

In order to work its magic, Rails makes heavy use of Ruby’s monkey patching, method aliasing, and meta-programming.

Monkey patching means changing existing classes. In Ruby you can add methods to any existing class, including built-in classes like String and Array. You can even overwrite existing methods in a class. This adds a ton of flexibility, but makes it impossible to determine the behaviour of a class by inspecting the source code in a single file.

Method aliasing is a particular monkey-patching technique where you take an existing method, such as ActiveRecord’s update, and you rename it to update_without_callbacks; then you define your own version of update, which at some point may call update_without_callbacks.

Some other code then comes along and renames update (i.e. the new one defined by the callbacks implementation) to update_without_lock, and instates its own update (which calls update_without_lock, which calls update_without_callbacks).

The advantages of this are extreme flexibility and power for the framework; the downside is it becomes rather tricky to trace the full path of execution for a simple call to update. In this particular example, there are at least two more levels of indirection in the core Rails framework: update_without_dirty and update_without_timestamps. Plugins can, and do, chain on extra aliasing. So can your application.
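The chain is easier to follow in miniature. The sketch below rebuilds two levels of it on a toy Record class; the class and the trace array are made up for illustration, and only the naming pattern is taken from Rails.

```ruby
# Illustrative only: a toy Record class, not ActiveRecord's implementation.
class Record
  attr_reader :trace

  def initialize
    @trace = []
  end

  def update
    @trace << 'write row'
  end
end

# First extension: wrap update with callbacks.
class Record
  alias_method :update_without_callbacks, :update

  def update
    @trace << 'before_update callback'
    update_without_callbacks
  end
end

# Second extension, loaded later: wrap whatever update is by now with a lock check.
class Record
  alias_method :update_without_lock, :update

  def update
    @trace << 'check lock_version'
    update_without_lock
  end
end

record = Record.new
record.update
puts record.trace
# check lock_version
# before_update callback
# write row
```

The order of the output depends entirely on the order in which the three class bodies were loaded, which is why reading any one file in isolation tells you so little.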

Meta-programming means code that generates other classes or methods. For example, ActionController::Request has code like this:

[ 'SERVER_NAME', 'SERVER_PROTOCOL', 'HTTP_ACCEPT', 'HTTP_USER_AGENT' ].each do |env|
  define_method(env.sub(/^HTTP_/n, '').downcase) do
    @env[env]
  end
end

This defines methods server_name (which returns @env['SERVER_NAME']) and accept (returning @env['HTTP_ACCEPT']), among others. The advantages are less code duplication, and, again, flexibility and power: This type of meta-programming is what makes it possible for ActiveRecord to automatically generate setter & getter methods based on your database schema. One disadvantage is that you have no hope of finding the definition of accept should you need to (unless you already know the answer).
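Here is the same pattern in a self-contained form you can run outside Rails. FakeRequest is a made-up class. The last few lines use Method#source_location, which as far as I know is only available from Ruby 1.9 onwards, so treat that part as an aside rather than something that helps on 1.8.7.

```ruby
# Toy class reproducing the pattern above outside Rails (FakeRequest is a
# made-up name; the real code lives in ActionController::Request).
class FakeRequest
  def initialize(env)
    @env = env
  end

  ['SERVER_NAME', 'SERVER_PROTOCOL', 'HTTP_ACCEPT', 'HTTP_USER_AGENT'].each do |env|
    define_method(env.sub(/^HTTP_/, '').downcase) do
      @env[env]
    end
  end
end

request = FakeRequest.new('SERVER_NAME' => 'example.org', 'HTTP_ACCEPT' => 'text/html')
puts request.server_name   # example.org
puts request.accept        # text/html

# Searching the source for "def accept" finds nothing, but the runtime can
# report where the generated method's body was defined.
file, line = request.method(:accept).source_location
puts "accept defined at #{file}:#{line}"
```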

All of this may make it quite difficult to understand the codebase well enough to contribute back; but contributing to the source-level documentation couldn’t be easier. Clone the docrails repository on github, ask for commit permission, and commit away. docrails is periodically reviewed and merged back into the master rails repository (you will even be credited for your contribution).

Security

SQL injection is protected against in the usual manner. In fact ActiveRecord provides many ways of constructing queries, all of them safe from injection.

Before Rails 3, XSS was a different matter: Developers were expected to escape unsafe data with the h() view helper. Rails 3 automatically keeps track of a string’s “html safe” status, and this functionality has been back-ported to Rails 2.3 in the rails_xss plugin (but you’ll need to test existing apps carefully to avoid double-escaping your HTML output).
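The “html safe” bookkeeping can be sketched in a few lines. This is a hypothetical miniature, not Rails 3’s implementation: SafeString and render_value are names I invented, and the escaping is delegated to Ruby’s standard ERB::Util.html_escape.

```ruby
require 'erb'

# Marker subclass: a string that has already been escaped or is trusted markup.
class SafeString < String; end

# Escape anything not marked safe; pass safe strings through untouched.
def render_value(value)
  return value if value.is_a?(SafeString)
  SafeString.new(ERB::Util.html_escape(value.to_s))
end

user_input = '<script>alert(1)</script>'
trusted    = SafeString.new('<em>hello</em>')

puts render_value(user_input)   # &lt;script&gt;alert(1)&lt;/script&gt;
puts render_value(trusted)      # <em>hello</em>

# Rendering twice is harmless here because the first result is marked safe.
# Without the marker the ampersands would be escaped a second time, which is
# the double-escaping to watch for when retrofitting an existing app.
once  = render_value(user_input)
twice = render_value(once)
puts once == twice              # true
```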

In the controllers section above I showed a code sample assigning form parameters to a model object. Sensitive attributes (such as “role” on a User model) can be marked as “protected”, so they can’t be changed with a mass-assignment statement. You can also switch the defaults around, so you have to specify which attributes aren’t protected (more sensible).
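A rough sketch of what attribute protection does during mass assignment follows. In Rails you would declare this with the attr_protected or attr_accessible macros; the User class, the PROTECTED constant, and the attributes= method here are simplified stand-ins written for the example.

```ruby
# Hypothetical sketch of attribute protection; the implementation is invented.
class User
  PROTECTED = [:role]            # blacklist: everything else is assignable
  attr_accessor :name, :email, :role

  def initialize
    @role = 'member'
  end

  # Mass assignment, as in user.attributes = params[:user]
  def attributes=(params)
    params.each do |key, value|
      next if PROTECTED.include?(key.to_sym)   # silently skip protected keys
      send("#{key}=", value)
    end
  end
end

user = User.new
# A crafted form submission that tries to escalate privileges.
user.attributes = { 'name' => 'Mallory', 'role' => 'admin' }

puts user.name   # Mallory
puts user.role   # member

# Protected attributes can still be set deliberately, one at a time.
user.role = 'admin'
```

The whitelist variant simply inverts the test: keep a list of assignable attributes and skip any key that is not on it, so a newly added column is protected until you say otherwise.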

CSRF protection: Rails’s RESTful routes strongly guide you towards the correct use of GET and POST methods, and Rails automatically includes a security token in all forms which it then checks for any non-GET requests.
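The token check amounts to something like the following sketch. The function names and the session hash are assumptions made for illustration, not Rails’s actual code, and a real implementation should compare tokens in constant time.

```ruby
require 'securerandom'

# Hypothetical sketch of per-session form tokens; not Rails's actual code.
def form_token(session)
  session[:csrf_token] ||= SecureRandom.hex(16)
end

def request_allowed?(session, http_method, submitted_token)
  return true if http_method == 'GET'          # GET must be side-effect free
  expected = session[:csrf_token]
  !expected.nil? && submitted_token == expected
end

session = {}
token = form_token(session)    # embedded in every form as a hidden input

puts request_allowed?(session, 'GET',  nil)       # true
puts request_allowed?(session, 'POST', token)     # true
puts request_allowed?(session, 'POST', 'forged')  # false
```

The scheme only protects you if GET requests really are free of side effects, which is where the RESTful routing conventions do their part.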

See the Ruby on Rails security guide for full information on these and other security measures.

The Ruby and Rails teams have been criticized in the past for their handling of security vulnerabilities. Rails’s current security policy seems to address these concerns.

Performance

Ruby is slow, but it mostly doesn’t matter because Rails is inherently scalable. What limited context is stored across requests lives in hidden form inputs or in a browser cookie (or on a memcached or database server if you really want to). So subsequent requests from the same user can perfectly well be handled by different servers sitting behind a load balancer. Setting up separate servers for the application (Ruby) code, database, and static content delivery is straightforward (though I haven’t needed to do so; my performance requirements are extremely modest).

However, Ruby is really slow. I generate a page of reports showing various pieces of information for 200+ separate entities, in total almost 400KB of HTML. The various database queries total 400ms (<10ms with the cache hot) but the Ruby code to generate the HTML (nothing particularly complex) takes 4.5 seconds. This time is linear in the number of entities displayed, and profiling doesn’t identify any obvious hot spots.

That was with MRI (the reference Ruby implementation) version 1.8.7. My informal tests with MRI 1.9.2 indicated a 30% speedup, but parts of my app stopped working (Rails 2.3 is now compatible with Ruby 1.9 but you need to check your application code and 3rd party plugins). Rubinius (1.2.2) was twice as slow as MRI 1.8.7 (but crashed so often that Rubinius’s JIT compiler didn’t really get a chance to optimize things).

I don’t think ActiveRecord specifically has anything to do with the poor performance (except in that it’s written in Ruby). If you’ve heard of the Rails “N+1 queries problem”, then before you bring it up: Despite all the noise on the blogosphere, it has never been a real problem. It’s pretty easy to diagnose and fix, so it’s really a “programmer problem” not a “Rails problem”.

Rails offers easy-to-use mechanisms for caching entire pages and fragments of pages. Of course, this doesn’t help when the content you deliver is customised for the logged-in user.

Deployment

Deployment used to be painful but now it couldn’t be easier, thanks to Phusion Passenger (an Apache module, aka mod_rails).

Passenger isn’t terribly flexible. For instance, you can specify the maximum number of Rails processes, and if you’re running multiple sites on the same server, the maximum number of processes per app, but not the maximum processes for a specific app. But Passenger is extremely easy to install and configure, is well documented, and has proved stable. Passenger is the official preferred deployment setup for Rails.

Some people prefer Nginx to Apache. It’s simpler to configure, is supposed to be faster, and Passenger is also available for Nginx. It’s less flexible: A couple of years ago I found it impossible to configure Nginx to both report on file upload progress and serve static files directly (this was before Passenger), so I switched to Apache. YMMV.

Automated deployment of changes to your application code, database schema and static assets is done with Capistrano, which can target multiple environments (staging, production) and server types (application, database, static assets). The initial setup is not too difficult and after that it works very smoothly.

Automatically packaging any 3rd party gems (ruby packages) with your app is handled smoothly by Bundler. Capistrano ties in nicely with Bundler.

Community & ecosystem

Rails is very popular, and it is rare to come across a problem that hasn’t already been found and fixed by somebody else. Form inputs truncated? There’s a patch for that. Rails processes spinning indefinitely at 100% CPU after a failed file upload? Yep, there’s a patch for that too.

There are zillions of Ruby packages (“gems”) and Rails plugins for common tasks such as user authentication and authorization, file uploads (including an Apache module for upload progress reporting), pagination, you name it.

Traditionally Rails plugins work by generating database migrations, controller code, and view templates, which you copy into your application and customise as necessary; this makes upgrading the plugins non-trivial. Some plugins are more self-contained.

The quality and documentation of plugins vary widely, but many are very highly polished. In general though, the promise of self-contained, pluggable widgets or packages (à la Django’s “apps”, of which I admittedly have no experience) has not been realised. Hopefully with the integration of “Rails Engines” into the Rails core, and the more modular design of Rails 3, we will start seeing more ambitious plugins. I have a feeling that Rails changes so fast that the more complex plugins would require a lot of maintenance.

One plugin I had a favourable experience with was Comatose, a micro-CMS I use to manage the help pages within my webapp. With less than a day’s work I had a fully functioning CMS tied into my existing authentication and authorization mechanisms, complete with Textile processing, ajaxy previews, and page versioning/history.

One downside to the community is that it is very trend-driven, and the plugin that was the de-facto standard for a given task might be all but abandoned a year later. It can be quite hard to find the right information; it’s essential to stay in the loop with RSS feeds and newsgroups. Comparisons between various plugins are especially lacking. Back when github first came on the scene, projects migrated their source code to github but rarely updated their original project pages to indicate that fact.