Writing Fast Ruby: Learning from Merb & Rails 3 – Carl Lerche

April 18, 2009 Chad Woolley

Intro

Carl works for Engine Yard. Getting to Rails 2.3, will be at RailsConf. Engine Yard is hiring.

GoGaRuCo '09 - Carl Lerche

Does Ruby Scale?

Yes. So does LOLCODE.

Scaling != Speed

Is Ruby Fast?

Ruby/JRuby are around 23-30 in the Great Language Shootout.

In reality, Ruby is fast enough for the vast majority of use cases. Odds are slow code is your fault.

How do you write fast code?

Step 1. Write Slow Code

Don’t worry about performance the first time around. Odds are you don’t know what will be slow. Just write the codebase.

Step 2. Use Science

Don’t stab in the dark. Use the scientific method. It is the most important tool science gives us, so we should use it.

Scientific Method

Step 1. Define the Question (It needs to be specific)

Step 2. Gather Information

Step 3. Form a Hypothesis

Step 4. Perform experiment and collect data

Step 5. Analyze and interpret

Step 6. Publish results and retest

Scientific Method reworded for code

  1. Why is my code so slow?
  2. Where is the time/memory being spent?
  3. Why is the chunk of code slow / a memory hog?
  4. Change code. Collect before/after metrics
  5. Compare metrics
  6. Deploy
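
Steps 4 and 5 can be sketched with nothing more than Ruby's standard Benchmark library. This is a minimal sketch, not code from the talk: the two methods and the input size are made up for illustration, and the only point is that both versions are timed against identical input and checked for identical results before comparing numbers.

```ruby
require 'benchmark'

# Hypothetical "before" and "after" implementations of the same task.
def sum_of_squares_before(numbers)
  numbers.map { |n| n * n }.inject(0) { |sum, n| sum + n }
end

def sum_of_squares_after(numbers)
  total = 0
  numbers.each { |n| total += n * n }
  total
end

SAMPLE_NUMBERS = (1..10_000).to_a
ITERATIONS = 100

# A change that alters the result is not an optimization, so check that first.
unless sum_of_squares_before(SAMPLE_NUMBERS) == sum_of_squares_after(SAMPLE_NUMBERS)
  raise "before/after results differ"
end

# Step 4: collect before/after metrics on identical input.
before_time = Benchmark.realtime { ITERATIONS.times { sum_of_squares_before(SAMPLE_NUMBERS) } }
after_time  = Benchmark.realtime { ITERATIONS.times { sum_of_squares_after(SAMPLE_NUMBERS) } }

# Step 5: compare metrics.
puts format("before: %.4fs  after: %.4fs", before_time, after_time)
```

Which version wins depends on your Ruby and your data, which is the reason to measure rather than guess.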

Defining the Question

“My app feels slow” is not specific enough.

“Why is action X taking more than 100ms on average?” is a better question.

“Why is 60% of the Merb dispatch cycle spent in content negotiation?” is a good one too.

“Why are my Mongrel processes growing to 300MB of memory?” (Gets laughs)

Our Scenario and Question

QUESTION: “Is route generation as fast in rack-router as it is in Merb and Rails?”

Gather Information

Tools: Rbench, ruby-prof/kcachegrind, EXPLAIN ANALYZE, log files, New Relic/Fiveruns, memory_usage_logger, bleak_house

He shows some benchmark data comparing rack-router, Merb routing, and Rails routing. This answers the question: rack-router route generation is NOT as fast.

So, rephrase the question: how do you make it fast? Benchmarks aren’t good for this. He used ruby-prof, which provides a lot of ways to set up tests and output data. He uses the “call graph” output, which he can open in kcachegrind. It is available via MacPorts.

He then shows the graphical output of kcachegrind. The top left shows the methods that take longest; the lower left is the call stack, showing aggregate (‘incl’) and individual (‘self’) time spent in methods.

Sort by “self”, and it turns out Array::map is the one taking most individual time. Most of the calls occur in Rack::Router::Condition::generate_from_segments. This is a good place to look and spend time trying to speed up.

Hypothesis

Most of this logic can be removed and moved somewhere else. You can check Git logs to see how he did it.

Perform Experiment, collect data, analyze, interpret

Rewrote the code; it was faster.

Publish and retest

Tweet about it and let everyone know.

More Examples

Shows more kcachegrind examples. Shows how it can show source code annotated with performance data.

Remember – you don’t have to go through all the steps, or feel bad because you didn’t in the past. Just keep them in mind so you can figure out more quickly where the time is going.

The Garbage Collector

Conservative mark and sweep. Every time it runs, none of your Ruby gets executed. The goal is to get the garbage collector to run as little as possible.

Object allocation works like this: Ruby boots and gives you 8 MB of memory. When your code runs, it allocates memory from that. If it can’t, it runs the garbage collector.
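
One rough way to see this for yourself is to count collector runs around a piece of code. The sketch below is not from the talk; the helper name `gc_runs_during` is made up, and it relies only on `GC.count`, which reports how many times the collector has run so far in the process.

```ruby
# Returns how many times the garbage collector ran while the block executed.
def gc_runs_during
  runs_before = GC.count
  yield
  GC.count - runs_before
end

# Allocate a large number of short-lived strings and see whether that
# was enough to make the collector kick in.
runs = gc_runs_during do
  200_000.times { "temporary string" + "!" }
end

puts "GC ran #{runs} time(s)"
```

The exact number depends on the interpreter, its version, and its heap settings, so treat it as a relative signal rather than an absolute one.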

Avoid creating unnecessary objects

You don’t need to do this (values already returns a new Array, so the dup only allocates a throwaway Hash):

records.dup.values # records is a Hash
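
To check how many objects a snippet like this allocates, you can diff an allocation counter around it. This is a sketch under assumptions: `allocations_during` is a made-up helper, the sample hash is invented, and the `:total_allocated_objects` key of `GC.stat` is specific to reasonably recent versions of MRI.

```ruby
# Counts objects allocated while the block runs.
# Assumes MRI, where GC.stat exposes :total_allocated_objects.
def allocations_during
  allocated_before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - allocated_before
end

records = { 1 => "alice", 2 => "bob", 3 => "carol" } # hypothetical data

with_dup = allocations_during { records.dup.values } # copies the Hash, then builds the Array
direct   = allocations_during { records.values }     # builds only the Array

puts "with dup: #{with_dup} object(s), direct: #{direct} object(s)"
```

Both expressions return the same Array contents; the only difference is the extra Hash that gets created and immediately thrown away.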

Use DataMapper’s identity map. It will not create a new object if it doesn’t need to. This will drastically reduce the number of objects created.

Beware of modifying large strings
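
The notes don't record the details of this point, so the following is only one plausible illustration, not necessarily the speaker's example. In Ruby, `+=` on a String builds a brand-new String each time, while `<<` appends to the existing object in place. The method names are made up.

```ruby
# `+=` allocates a new String on every iteration and discards the old one,
# which gets more expensive as the accumulated string grows.
def build_with_plus(parts)
  result = ""
  parts.each { |part| result += part }
  result
end

# `<<` mutates the same String object, so no intermediate copies are made.
def build_with_append(parts)
  result = "".dup # dup so the buffer is mutable even with frozen string literals
  parts.each { |part| result << part }
  result
end

parts = %w[alpha beta gamma]
puts build_with_plus(parts)
puts build_with_append(parts)
```

Both methods produce the same text; the difference is in how many throwaway String objects are created along the way.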

Don’t do parse time operations

For example, a backslash at the end of a line to split up code.

Beware of closures

Be lazy

“No code is faster than no code” – Merb motto:

  • Cookie handling
  • memoize in the reader
  • Procs as method arguments (instead of just arguments)
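
The first two bullets can be sketched together: don't parse cookies until somebody asks for them, and remember the result once you have. This is an illustration only. `LazyRequest`, `parse_count`, and the parsing logic are made up and are not Merb's actual API.

```ruby
# Hypothetical request object; names are illustrative, not Merb's API.
class LazyRequest
  attr_reader :parse_count

  def initialize(raw_cookie_header)
    @raw_cookie_header = raw_cookie_header
    @parse_count = 0 # only here so the laziness is observable
  end

  # Memoize in the reader: parse on first access, reuse the Hash afterwards.
  def cookies
    @cookies ||= parse_cookies
  end

  private

  def parse_cookies
    @parse_count += 1
    @raw_cookie_header.split(/;\s*/).each_with_object({}) do |pair, jar|
      name, value = pair.split("=", 2)
      jar[name] = value
    end
  end
end

request = LazyRequest.new("session=abc123; theme=dark")
puts request.parse_count        # 0, nothing has been parsed yet
puts request.cookies["theme"]   # dark
puts request.cookies["session"] # abc123
puts request.parse_count        # 1, parsed once despite two reads
```

A request that never touches `cookies` pays nothing for cookie handling, which is the "no code" case from the motto.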

Lambdas

Sexy, but slow. Sometimes you need them, but keep in mind they are more expensive than method dispatch.

def my_method
  yield
end

vs.

def my_method(&block)
  yield
end
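
A quick way to compare the two on your own interpreter is to time them with the standard Benchmark library. In this sketch the methods are renamed so both definitions can coexist, and a third variant that calls the captured Proc explicitly is added for comparison; that third variant and the iteration count are additions, not something from the talk.

```ruby
require 'benchmark'

# The two definitions from above, renamed so they can live side by side.
def call_with_yield
  yield
end

def call_with_block_param(&block)
  yield
end

# Added for comparison: actually uses the captured Proc object.
def call_with_block_call(&block)
  block.call
end

CALLS = 200_000

yield_time = Benchmark.realtime { CALLS.times { call_with_yield { 1 + 1 } } }
param_time = Benchmark.realtime { CALLS.times { call_with_block_param { 1 + 1 } } }
call_time  = Benchmark.realtime { CALLS.times { call_with_block_call { 1 + 1 } } }

puts format("yield: %.4fs  &block + yield: %.4fs  &block + call: %.4fs",
            yield_time, param_time, call_time)
```

The size of the gap varies by Ruby implementation and version, so measure on the one you deploy rather than relying on numbers from elsewhere.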

“Compiling” your code

  • Iterating is slow
  • Ruby’s AST is fast

class_eval For The Win.
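
The idea, roughly: instead of walking a data structure on every call, walk it once, generate Ruby source for exactly the work needed, and let `class_eval` turn that into a real method. The sketch below is a guess at the shape of the technique, not rack-router's or Merb's code; `SketchRoute` and its segment format (Strings for static parts, Symbols for parameters) are invented for illustration.

```ruby
# Hypothetical route: Strings are static segments, Symbols are parameters.
class SketchRoute
  def initialize(segments)
    @segments = segments
    compile!
  end

  # Runtime version: iterates over the segments on every single call.
  def generate_by_iterating(params)
    @segments.map { |seg| seg.is_a?(Symbol) ? params[seg].to_s : seg }.join
  end

  private

  # "Compiled" version: iterate once, emit Ruby source that concatenates
  # the pieces directly, and define it as a method on this one object.
  def compile!
    pieces = @segments.map do |seg|
      # inspect produces a valid Ruby literal for both Strings and Symbols
      seg.is_a?(Symbol) ? "params[#{seg.inspect}].to_s" : seg.inspect
    end
    source = pieces.empty? ? '""' : pieces.join(" + ")

    singleton_class.class_eval <<-RUBY, __FILE__, __LINE__ + 1
      def generate(params)
        #{source}
      end
    RUBY
  end
end

route = SketchRoute.new(["/posts/", :id, "/comments/", :comment_id])
params = { :id => 7, :comment_id => 42 }

puts route.generate_by_iterating(params) # /posts/7/comments/42
puts route.generate(params)              # /posts/7/comments/42
```

For this route the generated method body is just `"/posts/" + params[:id].to_s + "/comments/" + params[:comment_id].to_s`, with no loop left at call time. Whether that is worth the extra complexity is something to confirm with a profiler first.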
