We made a code change and deployed it to demo, and all of a sudden some of our Ruby processes were eating a ton of CPU against our full dataset.
In the Java world you can send a SIGQUIT to any running Java process and get a thread dump. Go ahead, run a Java process and kill -3 it.
You can get the same thing in the Ruby world by using xray:
sudo gem install xray
Drop this into your code:
require "xray/thread_dump_signal_handler"
Now:
kill -3 <ruby pid>
Look in the log file where you’re sending stdout. You’ll see something like:
=============== XRay - Done ===============
/usr/lib64/ruby/gems/1.8/gems/eventmachine-0.12.0/lib/eventmachine.rb:224:in `call'
_ /usr/lib64/ruby/gems/1.8/gems/eventmachine-0.12.0/lib/eventmachine.rb:224:in `run_machine'
_ /usr/lib64/ruby/gems/1.8/gems/eventmachine-0.12.0/lib/eventmachine.rb:224:in `run'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/backends/base.rb:57:in `start'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/server.rb:150:in `start'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/controllers/controller.rb:80:in `start'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:173:in `send'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:173:in `run_command'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:139:in `run!'
_ /usr/lib64/ruby/gems/1.8/gems/thin-1.0.0/bin/thin:6
_ /usr/bin/thin:19:in `load'
_ /usr/bin/thin:19
(that’s thin patiently waiting to service the next request)
A one-line code drop-in results in a powerful new inspection tool. Pretty neat.
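If you're curious what a drop-in like this might be doing, here is a minimal sketch of a SIGQUIT handler in plain Ruby. This is not xray's actual implementation, just an illustration of the idea. It assumes Ruby 1.9 or later (where Thread#backtrace exists), and the helper name format_thread_dump is made up for the example.

```ruby
# Illustrative sketch only; NOT how xray is actually implemented.
# Assumes Ruby 1.9+ so that Thread#backtrace is available.

# Build a text dump of every live thread's current backtrace.
# (format_thread_dump is a hypothetical helper name.)
def format_thread_dump
  lines = ["=== Thread dump (pid #{Process.pid}) ==="]
  Thread.list.each do |thread|
    lines << "Thread #{thread.object_id} status=#{thread.status.inspect}"
    # Thread#backtrace returns nil for a dead thread, so fall back to [].
    (thread.backtrace || []).each { |frame| lines << "  #{frame}" }
  end
  lines.join("\n")
end

# SIGQUIT is the signal that `kill -3` sends.
Signal.trap("QUIT") do
  $stdout.puts format_thread_dump
  $stdout.flush
end
```

With this loaded, kill -3 against the process prints one backtrace per thread to stdout instead of killing it.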
For bonus points:
ps ax | grep "thin server" | grep -v grep | awk '{print $1}' | xargs kill -3
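If you'd rather do that from Ruby than from a shell pipeline, something like the following should work. It's a rough sketch: the helper names pids_matching and dump_threads are invented for this example, and it assumes the PID is the first column of the ps ax output, the same assumption the awk command above makes.

```ruby
# Sketch of the shell one-liner above in Ruby.
# pids_matching and dump_threads are hypothetical helper names.

# Return the PIDs (first column) of `ps ax` lines that mention pattern,
# skipping this script's own process.
def pids_matching(ps_output, pattern)
  ps_output.each_line
           .select { |line| line.include?(pattern) }
           .map { |line| line.split.first.to_i }
           .reject { |pid| pid.zero? || pid == Process.pid }
end

# Send SIGQUIT (the same signal as kill -3) to each pid.
def dump_threads(pids)
  pids.each { |pid| Process.kill("QUIT", pid) }
end

# Usage, against the live process table:
#   dump_threads(pids_matching(`ps ax`, "thin server"))
```

Keeping the parsing separate from the signalling means you can check which PIDs it would hit before actually sending anything.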
For more bonus points, stick this in a Capistrano task and grab the thread dumps from your logs, and you’ll have a cluster-wide snapshotting tool.
We kill -3’d our CPU-eating thins and discovered a directory scan problem introduced by a recent code change – totally obvious from the thread dump. Now we’re nailing it down with a failing perf unit test and fixing the problem.
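For flavor, a perf check along those lines could look roughly like this. Everything here is a stand-in: scan_directory is a hypothetical placeholder for the real code path (which isn't shown in this post), time_scan is an invented helper, and the file count and time budget are numbers you'd pick for your own data.

```ruby
require "benchmark"
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for the slow code path; the real scan is not shown here.
def scan_directory(root)
  Dir.glob(File.join(root, "**", "*")).select { |path| File.file?(path) }
end

# Time one scan of a temp directory populated with file_count empty files.
# Returns the elapsed wall-clock time in seconds.
def time_scan(file_count)
  Dir.mktmpdir do |root|
    file_count.times { |i| FileUtils.touch(File.join(root, "file_#{i}")) }
    Benchmark.realtime { scan_directory(root) }
  end
end

# Usage inside a test (budget_seconds is an assumed threshold):
#   elapsed = time_scan(10_000)
#   raise "scan too slow: #{elapsed}s" if elapsed > budget_seconds
```

Wall-clock thresholds are noisy, so a budget with plenty of headroom is probably the safer choice.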