Marshal.dump vs YAML::dump

August 16, 2009 Colin Shield

We find ourselves on a project with a very large dataset: more than 2 million items. This dataset changes frequently, and the changes need to be transported to their respective servers, ready to be served out to clients.
We decided to use a queuing architecture to distribute data. Objects are serialized and pushed to a queue. The large size of the dataset requires us to optimize as much as possible. There are only so many hours in a day and there is a lot of data to transport.
A question was raised in standup: which serialization method is faster, YAML::dump or Marshal.dump? It seemed worth writing a quick script to work out which would suit our particular situation.
The objects we are serializing are simple hashes. I thought I’d write something that was representative of our situation in order to present a nice clear decision.
Here’s some code:

require 'yaml'

# A small hash representative of the objects we serialize.
obj = {:a => "hello", :b => "goodbye", :c => "new string", :d => {:da => 1, :db => 2}, :e => 1}

# Time 10,000 YAML round trips (dump, then load).
start = Time.now
10_000.times do
  ser_obj = YAML::dump(obj)
  new_obj = YAML::load(ser_obj)
end
puts "YAML::dump time"
puts Time.now - start

# Time 10,000 Marshal round trips (dump, then load).
start = Time.now
10_000.times do
  ser_obj = Marshal.dump(obj)
  new_obj = Marshal.load(ser_obj)
end
puts "Marshal.dump time"
puts Time.now - start

I think we all knew how the results would look. It was nice to see that for our particular case there was a clear winner (times are in seconds):

YAML::dump time
5.397909
Marshal.dump time
0.280292

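Seems fairly cut and dried to me: for this hash, the Marshal round trip (dump plus load) was roughly 19 times faster than the YAML one.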
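For anyone repeating the measurement, here is a minimal sketch using Ruby's standard Benchmark library that times dump and load separately instead of as one round trip. It reuses the sample hash from the script above; the iteration count and the labels are assumptions, and the numbers it prints will depend on your Ruby version and machine.

```ruby
require 'benchmark'
require 'yaml'

# Same sample hash as in the script above.
obj = {:a => "hello", :b => "goodbye", :c => "new string", :d => {:da => 1, :db => 2}, :e => 1}
iterations = 10_000 # assumed count, adjust to taste

# Serialize once up front so the load timings do not include dump time.
yaml_str    = YAML.dump(obj)
marshal_str = Marshal.dump(obj)

# Benchmark.realtime returns the elapsed wall-clock time of the block in seconds.
timings = {
  "YAML.dump"    => Benchmark.realtime { iterations.times { YAML.dump(obj) } },
  "YAML.load"    => Benchmark.realtime { iterations.times { YAML.load(yaml_str) } },
  "Marshal.dump" => Benchmark.realtime { iterations.times { Marshal.dump(obj) } },
  "Marshal.load" => Benchmark.realtime { iterations.times { Marshal.load(marshal_str) } }
}

timings.each { |label, seconds| puts "#{label}: #{seconds}" }
```

Splitting the two directions shows whether the cost sits on the producer side (dump) or the consumer side (load) of the queue.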
I personally prefer YAML for test result comparison. Maybe we’ll put something in our spec_helper to use YAML for testing and Marshal for production.
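A rough sketch of what that switch might look like. The Serializer module and the APP_ENV variable are hypothetical names made up for illustration; nothing like this exists in our project yet.

```ruby
require 'yaml'

# Hypothetical helper, not from our codebase: pick the serializer based on
# an environment variable (the name APP_ENV is an assumption).
module Serializer
  # YAML output is human readable, which makes test failures easier to compare.
  # Marshal was the faster option in the timings above.
  def self.backend
    ENV['APP_ENV'] == 'test' ? YAML : Marshal
  end

  def self.dump(obj)
    backend.dump(obj)
  end

  def self.load(data)
    backend.load(data)
  end
end

# Round trip a sample hash through whichever backend is active.
obj = {:a => "hello", :d => {:da => 1, :db => 2}}
data = Serializer.dump(obj)
puts Serializer.load(data) == obj
```

Both ends of the queue would have to agree on the backend, and Marshal.load should only ever be given data from a trusted source.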
