Fighting test pollution with an RSpec custom ordering strategy

April 25, 2013 Andrew Bruce

Test pollution manifests itself as seemingly false negatives or false positives in a test suite. It occurs when some shared state is unintentionally modified, or unintentionally read and used in a test.

When test pollution builds up, it can mean that a project’s build fails unpredictably, which can stop a whole team from shipping code regularly. This is an expensive way to not build software.

Here’s an example of test pollution. You can save it and run it with nothing but a recent version of Ruby. If you run it several times, it will sometimes fail and sometimes pass:

[gist id=5449597 file=test_pollution.rb]

Why is it so unreliable? MiniTest orders tests randomly by default. When the first test runs before the second test, the test case fails, because the first test has the effect of setting the @logged_in instance variable to true, and the second test is effectively expecting the value of @logged_in to be false. The code under test has global state: the class instance variable, @logged_in.

The problem with the first test is that it’s a bad citizen: it sets global state and doesn’t clean itself up. The problem with the second test is that it’s presumptuous: it relies on global state being something in particular. As an aside: the code is terrible, and the flakiness of these tests should prompt you to change it, but I used a contrived example for the purpose of demonstration.
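To make the mechanism concrete without a test framework, here is a minimal sketch. The Session class and the two lambdas are assumptions made up for illustration, not the code from the gist. They only mimic its shape: one "test" sets class-level state and another assumes that state is untouched.

```ruby
# Hypothetical code under test: @logged_in is a class instance variable,
# so it is shared by every caller for the life of the process.
class Session
  def self.log_in!
    @logged_in = true
  end

  def self.logged_in?
    !!@logged_in
  end
end

# Two stand-in "tests". Each lambda returns true when the test passes.
TESTS = {
  polluter: -> { Session.log_in!; Session.logged_in? }, # bad citizen: never cleans up
  polluted: -> { !Session.logged_in? }                  # presumptuous: assumes a clean slate
}.freeze

# Reset the shared state (as a fresh process would), then run the named
# tests in the given order and collect their pass/fail results.
def run_in_order(names)
  Session.instance_variable_set(:@logged_in, nil)
  names.map { |name| [name, TESTS.fetch(name).call] }.to_h
end

puts run_in_order([:polluted, :polluter]).inspect # both pass
puts run_in_order([:polluter, :polluted]).inspect # :polluted now fails
```

Neither test changes between the two runs. Only the order does, which is why a randomized runner reports the failure intermittently.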

Fighting pollution with existing tools

I mentioned that MiniTest orders tests randomly by default. This is a Good Thing: it’s a deliberate ploy to flush out test pollution. If you ran the above code and it failed, it would give you a ‘seed’ number to pass back to the test runner, so that you could consistently re-run the tests in the failing order. From there, you could hopefully work out why that particular order of tests failed. Both RSpec and MiniTest allow you to run tests in a random order, and both allow you to re-use the order from a previous failed run.
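The reason a seed is enough to replay an order is that a seeded shuffle is deterministic. The sketch below shows that property using plain Ruby. The test names and the order_for helper are hypothetical, and this is a model of the idea rather than either runner’s actual code.

```ruby
TEST_NAMES = %w[test_login test_logout test_signup test_profile].freeze

# Return the tests in an order that is fully determined by the seed.
def order_for(seed, names = TEST_NAMES)
  names.shuffle(random: Random.new(seed))
end

seed      = Random.new_seed % 100_000 # the number a runner would report
first_run = order_for(seed)
replay    = order_for(seed)           # re-running with the reported seed

puts "seed #{seed}: #{first_run.join(', ')}"
puts "replay matches: #{first_run == replay}"
```

In practice you pass the reported number back on the command line: both MiniTest and RSpec accept a --seed option for this.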

These aspects of MiniTest and RSpec are useful. They allow you to fix the order, so that you can find the dirty polluters of your test suite. This is often a quest to find two items: a polluter and a polluted test. Often you’ll find more than two, or a strange combination of tests that, when run in a particular order, cause a failed build.

On larger projects, these tools aren’t enough. Large codebases tend to have correspondingly large, slow test suites. Finding a source of test pollution in such beasts can involve looking at a lot of code, and can take a very long time.

Why can’t we automate this?

We’ve seen above that tests are reorderable chunks of code. So, you’d think that digging through reams of code just to reduce a pass/fail situation down to two examples would be an automatable process. You’d be right, but it’s not as straightforward as it ought to be.

When my pair and I first set out to write a tool to reduce tests down to their polluting / polluted components, we thought we could get RSpec to output a list of files. We could then just feed the list of files back into RSpec in the order that produced the failure, and use a binary search to find the offending files.
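The binary search we had in mind looks roughly like the sketch below. It assumes a single polluter and assumes that order can be preserved when a subset is run, which is exactly the assumption that turns out not to hold for RSpec. The find_polluter name, the file names and the simulated runner are all hypothetical. A real runner would shell out to the test suite instead.

```ruby
# Narrow a list of candidate files down to one polluter by bisection.
# `victim_fails` is a callable: given the files to run before the victim
# test, it returns true when the victim then fails.
def find_polluter(candidates, victim_fails)
  return nil unless victim_fails.call(candidates)

  while candidates.size > 1
    half = (candidates.size + 1) / 2
    first_half, second_half = candidates.each_slice(half).to_a
    # Keep whichever half still reproduces the failure.
    candidates = victim_fails.call(first_half) ? first_half : second_half
  end
  candidates.first
end

files = %w[a_spec.rb b_spec.rb c_spec.rb d_spec.rb e_spec.rb]

# Simulated suite run: the victim fails whenever d_spec.rb ran before it.
simulated_run = ->(before) { before.include?("d_spec.rb") }

puts find_polluter(files, simulated_run)
```

Each round halves the candidates, so a suite of a thousand files needs about ten runs rather than a thousand.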

Unfortunately, it’s not that easy. RSpec isn’t file-centric, but groups-and-examples-centric. For the uninitiated, an RSpec test is composed of groups and examples:

[gist id=5449597 file=rspec.rb]

Once test files are processed, they’re loaded into memory and randomized regardless of file (but keeping group hierarchy intact). RSpec’s randomization algorithm is pretty simplistic:

Programmer: Hey RSpec, run the suite with this seed: 123
RSpec: I found a set of sibling groups. What should I do?
Randomizer: ‘Randomize’ them according to the number of items in the set, with this seed: 123
RSpec: OK, I’ve got a set of sibling examples inside one of the groups. What should I do?
Randomizer: ‘Randomize’ them according to the number of items in the set, with this seed: 123

And so on. This approach is great so long as you don’t want to reduce the problem set. If you do reduce the set of examples that RSpec is running, the order is lost, because the number of items in the list changes.
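You can see the effect with a simplified model of seeded ordering. This is a sketch of the behaviour described above, not RSpec’s actual implementation; seeded_order, the item names and the range of seeds are assumptions for illustration. It shuffles a full list and a reduced list with the same seed, then checks whether the surviving items kept their relative order.

```ruby
# Simplified model: the result depends on the seed and on the list itself,
# including how many items it contains.
def seeded_order(items, seed)
  items.shuffle(random: Random.new(seed))
end

FULL    = %w[a b c d e f g h].freeze
REDUCED = (FULL - %w[b e g]).freeze # the same suite with three examples removed

# Does shuffling the reduced list reproduce the relative order that the
# surviving examples had in the full shuffled run?
def order_preserved?(seed)
  expected = seeded_order(FULL, seed) & REDUCED
  seeded_order(REDUCED, seed) == expected
end

seeds = (1..200).to_a
kept  = seeds.count { |seed| order_preserved?(seed) }
puts "#{kept} of #{seeds.size} seeds kept the original relative order"
```

Any seed that does preserve the order does so by coincidence, so a failing order cannot be relied on to survive once examples are removed.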

Enter The Scrubber

I’d ideally like to tell a computer to go and find my test pollution. I’m some of the way there. So far, I’ve managed to create a semi-automatic solution: get RSpec to output the order of a test run to a human-readable file, so that it can be edited and fed back into RSpec to order the next run.

Scrubber is a project I started last week that allows you to persist RSpec run orders, edit them, and replay them. It relies on a relatively new feature of RSpec that allows you to define custom ordering strategies. These strategies are just blocks of code that take as an input the list of groups or examples in a particular section of your suite, and return the groups or examples you want, in the order you want them.
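As a rough illustration of the idea (not Scrubber’s actual code), an ordering strategy that replays a saved order can be written as a block that takes a list of items and returns the ones named in the saved list, in the saved order. The Item struct stands in for an RSpec group or example, and replay_strategy is a hypothetical helper.

```ruby
# Stand-in for an RSpec group or example; only #description is used here.
Item = Struct.new(:description)

# Build an ordering strategy from a saved list of descriptions.
# Items missing from the list are dropped, so deleting a line from the
# saved order removes that group or example from the next run.
def replay_strategy(saved_descriptions)
  position = saved_descriptions.each_with_index.to_h
  lambda do |items|
    items.select { |item| position.key?(item.description) }
         .sort_by { |item| position[item.description] }
  end
end

# One description per line, as it might be read from an edited order file.
saved    = "logs out\nlogs in\n".lines.map(&:chomp)
strategy = replay_strategy(saved)

items = [Item.new("logs in"), Item.new("signs up"), Item.new("logs out")]
puts strategy.call(items).map(&:description).inspect
```

How the block gets registered depends on your RSpec version. In RSpec 3 the hook is config.register_ordering(:global), which takes a block of this shape; the releases available when this post was written exposed it under a different name, so check the documentation for the version you use.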

The main stumbling block when writing the utility has been deriving a unique ID for each example or example group. The ID needs to be human-readable and also reproducible across runs. So far I have a simplistic solution that just dumps the group or example description and its file location. This isn’t reliably unique, especially considering that RSpec places no restriction on groups or examples having duplicate descriptions.
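Here is a sketch of how such an ID can collide. The naive_id format, the Example struct and the file paths are assumptions for illustration, not what Scrubber writes to disk. Examples defined by a loop are one way to end up with the same description at the same location.

```ruby
# Hypothetical ID format: "<description> (<file>:<line>)".
def naive_id(description, location)
  "#{description} (#{location})"
end

Example = Struct.new(:description, :location)

examples = [
  Example.new("logs in", "./spec/session_spec.rb:4"),
  Example.new("works",   "./spec/shared_spec.rb:9"),
  Example.new("works",   "./spec/shared_spec.rb:9") # e.g. defined twice by a loop
]

ids = examples.map { |example| naive_id(example.description, example.location) }

# Any ID that appears more than once cannot identify a single example.
duplicates = ids.group_by(&:itself).select { |_id, group| group.size > 1 }.keys
puts duplicates.inspect
```

An edited order file containing a duplicated ID would be ambiguous: there is no way to tell which of the two examples a line refers to.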

In future, I hope to use a more robust ID, perhaps a checksum of the example/group’s metadata. Again, this isn’t straightforward as some of the metadata are Proc objects. Perhaps the RSpec team would be interested in persistable suite runs as a core feature?
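One possible shape for that checksum, sketched under loud assumptions: the metadata is treated as a plain nested Hash, Proc values are simply skipped, and the keys shown are illustrative rather than a complete picture of RSpec’s metadata. The stable_view and metadata_checksum helpers are hypothetical.

```ruby
require "digest/sha1"

# Reduce metadata to nested arrays of strings: drop Proc values and sort
# hash keys so that the result does not depend on insertion order.
# Note: objects whose #to_s embeds an object id would still be unstable.
def stable_view(value)
  case value
  when Hash
    value.reject { |_key, v| v.is_a?(Proc) }
         .map { |key, v| [key.to_s, stable_view(v)] }
         .sort_by(&:first)
  when Array
    value.reject { |v| v.is_a?(Proc) }.map { |v| stable_view(v) }
  else
    value.to_s
  end
end

# A short, reproducible digest of the Proc-free view of the metadata.
def metadata_checksum(metadata)
  Digest::SHA1.hexdigest(stable_view(metadata).inspect)[0, 12]
end

meta_a = { description: "logs in", file_path: "./spec/session_spec.rb",
           line_number: 4, block: -> { :first } }
meta_b = meta_a.merge(block: -> { :second }) # same example, different Proc object

puts metadata_checksum(meta_a)
puts metadata_checksum(meta_a) == metadata_checksum(meta_b)
```

This trades readability for reproducibility, and two examples with genuinely identical metadata would still share a checksum, so it narrows the uniqueness problem rather than solving it.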

Anyway, here’s an example of Scrubber in use. If you’re interested, I suggest you clone the repo and have a play with the example to see how editing a file might work. It’s very rough around the edges right now, but serves as a proof of concept. Hopefully I’ll get enough time, or contributors, to make this into something more automated and user-friendly.
