Intro
Works for Engine Yard. Getting to Rails 2.3, will be at RailsConf. Engine Yard is hiring.
Does Ruby Scale?
Yes. So does LOLCODE.
Scaling != Speed
Is Ruby Fast?
Ruby/JRuby come in around 23-30 in the Great Language Shootout.
In reality, Ruby is fast enough for the vast majority of use cases. Odds are slow code is your fault.
How do you write fast code?
Step 1. Write Slow Code
Don’t worry about performance the first time around. Odds are you don’t know what will be slow. Just write the codebase.
Step 2. Use Science
Don’t stab in the dark. Use the scientific method. It is the most important tool science gives us, so we should use it.
Scientific Method
Step 1. Define the Question (It needs to be specific)
Step 2. Gather Information
Step 3. Form a Hypothesis
Step 4. Perform experiment and collect data
Step 5. Analyze and interpret
Step 6. Publish results and retest
Scientific Method reworded for code
- Why is my code so slow?
- Where is the time/memory being spent?
- Why is the chunk of code slow / a memory hog?
- Change code. Collect before/after metrics
- Compare metrics
- Deploy
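The "collect before/after metrics" step can be sketched with Ruby's standard Benchmark library. This is a minimal illustration, not code from the talk: join_before, join_after and measure are hypothetical names, and the string-joining task only stands in for whatever code you are actually changing.

```ruby
require "benchmark"

# Hypothetical "before" version: builds a new String on every pass.
def join_before(words)
  result = ""
  words.each { |w| result = result + w + "," }
  result
end

# Hypothetical "after" version: lets Array#join do the work once.
def join_after(words)
  words.join(",") + ","
end

# Run the block several times and return the best wall-clock time in seconds.
def measure(runs = 5)
  Array.new(runs) { Benchmark.realtime { yield } }.min
end

words  = Array.new(2_000) { |i| "word#{i}" }
before = measure { join_before(words) }
after  = measure { join_after(words) }
puts format("before: %.6fs  after: %.6fs", before, after)
```

Taking the best of several runs is one simple way to reduce noise; the numbers only mean something when both versions are measured on the same machine with the same input.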
Defining the Question
“My app feels slow” is not specific enough.
“Why is action X taking more than 100ms on average?” is a better question.
“Why is 60% of the merb dispatch cycle in content negotiation?” is a good one too.
“Why are my Mongrel processes growing to 300MB of memory?” (Gets laughs)
Our Scenario and Question
QUESTION: “Is route generation as fast in rack-router as it is in Merb and Rails?”
Gather Information
Tools: Rbench, ruby-prof/kcachegrind, EXPLAIN ANALYZE, log files, New Relic/Fiveruns, memory_usage_logger, bleak_house
He shows some data from the benchmarks, comparing rack-router with Merb routing and Rails routing. This answers the question: Merb route generation is NOT as fast.
So, rephrase the question. How do you make it fast? Benchmarks aren’t good for this. He used ruby-prof, which provides a lot of ways to set up tests and output data. He uses the “call graph” output, which he can open in kcachegrind. It is available via MacPorts.
He then shows the graphical output of kcachegrind. Top left shows the methods which take longest, lower left is the call stack, showing aggregate (“incl”) and individual (“self”) time spent in methods.
Sort by “self”, and it turns out Array::map is the one taking most individual time. Most of the calls occur in Rack::Router::Condition::generate_from_segments. This is a good place to look and spend time trying to speed up.
Hypothesis
Most of this logic can be removed and moved somewhere else. You can check Git logs to see how he did it.
Perform Experiment, collect data, analyze, interpret
Rewrote the code. It was faster.
Publish and retest
Tweet it and let everyone know about it.
More Examples
Shows more kcachegrind examples. Shows how it can show source code annotated with performance data.
Remember – you don’t have to go through all the steps, or feel bad because you didn’t in the past. Just keep them in mind so you can figure out more quickly where the time is going.
The Garbage Collector
Conservative mark and sweep. Every time it runs, none of your Ruby code gets executed. The goal is to get the garbage collector to run as little as possible.
The way object allocation works is that Ruby boots and gives you 8 MB of memory. When your code runs, it allocates memory out of that. If it can’t, it will run the garbage collector.
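One rough way to see this happening is to count collector runs around a piece of code. The sketch below uses GC.count from the standard library; gc_runs_during is a hypothetical helper name, and the exact number of runs will vary with the Ruby version and heap settings.

```ruby
# Return how many times the garbage collector ran while the block executed.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

# Allocate a large number of short-lived strings to put pressure on the heap.
runs = gc_runs_during do
  500_000.times { "row " + "data" }
end
puts "GC ran #{runs} time(s) while allocating temporary strings"
```

Fewer temporary objects means fewer collector runs, which is the motivation for the advice that follows.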
Avoid creating unnecessary objects
Don’t need to do this:
records.dup.values # records is a Hash
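A small sketch of why the dup is wasteful, assuming a Ruby recent enough to support GC.stat(:total_allocated_objects) (2.2 or later). The allocations helper is a hypothetical name introduced for illustration. Hash#values already returns a fresh Array, so copying the Hash first only creates an extra object for the collector to clean up.

```ruby
# Count how many objects were allocated while the block ran.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

records = { 1 => "alice", 2 => "bob", 3 => "carol" }

with_dup    = allocations { records.dup.values } # copies the Hash, then builds the Array
without_dup = allocations { records.values }     # builds only the Array

puts "with dup: #{with_dup} objects, without dup: #{without_dup} objects"
```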
Use DataMapper’s identity map. It will not create a new object if it doesn’t need to. This will drastically reduce the number of objects created.
Beware of modifying Large Strings
Don’t do parse time operations
For example, using a backslash at the end of a line to split up code.
Beware of closures
Be lazy
“No code is faster than no code” – merb motto:
- Cookie handling
- memoize in the reader
- Procs as method arguments (instead of just arguments)
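The "memoize in the reader" idea can be sketched as below. This is an illustration under assumptions, not Merb's actual cookie code: the Request class, its cookie_header argument and the parse_count counter are all hypothetical. The point is that the parsing work is skipped entirely unless someone asks for the cookies, and is done at most once.

```ruby
class Request
  attr_reader :parse_count

  def initialize(cookie_header)
    @cookie_header = cookie_header
    @parse_count = 0 # only here to show how often parsing really happens
  end

  # Lazy reader: parse on first access, reuse the cached Hash afterwards.
  def cookies
    @cookies ||= parse_cookies
  end

  private

  def parse_cookies
    @parse_count += 1
    @cookie_header.split(/;\s*/).each_with_object({}) do |pair, jar|
      name, value = pair.split("=", 2)
      jar[name] = value
    end
  end
end

request = Request.new("session=abc123; theme=dark")
puts request.cookies["theme"]
puts "parsed #{request.parse_count} time(s)"
```

A request that never touches its cookies pays nothing for them, which is the "no code is faster than no code" idea in practice.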
Lambdas
Sexy, but slow. Sometimes you need them, but keep in mind they are more expensive than method dispatch.
def my_method
  yield
end
vs.
def my_method(&block)
  yield
end
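To check the difference on your own interpreter, the two forms can be timed side by side. This is a rough sketch with hypothetical method names (with_yield, with_block_arg) and an arbitrary iteration count. How large the gap is depends on the Ruby version, since capturing the block in a named parameter may or may not create a Proc object, so treat the output as a measurement rather than a rule.

```ruby
require "benchmark"

# Plain yield: the block is never turned into an object.
def with_yield
  yield
end

# Named block parameter: the interpreter may have to build a Proc for it.
def with_block_arg(&block)
  yield
end

iterations = 200_000

yield_time = Benchmark.realtime { iterations.times { with_yield { 1 } } }
block_time = Benchmark.realtime { iterations.times { with_block_arg { 1 } } }

puts format("yield: %.4fs  &block: %.4fs", yield_time, block_time)
```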
“Compiling” your code
- Iterating is slow
- Ruby’s AST is fast
class_eval For The Win.
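A minimal sketch of what "compiling" can look like, loosely modelled on route generation. The Route class and its segment format are hypothetical, not rack-router's real implementation, and the sketch assumes segments are plain words or symbols with no quotes in them. Instead of walking the segment list on every call, the list is walked once and turned into a method whose body is a single interpolated string.

```ruby
class Route
  def initialize(segments)
    @segments = segments
    compile!
  end

  # Uncompiled version: iterates over the segments on every call.
  def generate_slow(params)
    "/" + @segments.map { |s| s.is_a?(Symbol) ? params[s].to_s : s }.join("/")
  end

  private

  # Walk the segments once and define #generate on this route's singleton class.
  def compile!
    body = @segments.map do |s|
      s.is_a?(Symbol) ? "\#{params[#{s.inspect}]}" : s
    end.join("/")

    singleton_class.class_eval <<-RUBY, __FILE__, __LINE__ + 1
      def generate(params)
        "/#{body}"
      end
    RUBY
  end
end

route = Route.new(["users", :id, "posts", :post_id])
puts route.generate(id: 42, post_id: 7)
```

For the route above, the generated method body is just the string "/users/#{params[:id]}/posts/#{params[:post_id]}", so each call does one interpolation and no iteration.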