Ensure the contents of one array are in another array

January 17, 2013 Josh Bourgeois

Humans seem to be built so that there’s a point when, while learning a complex system, we begin to forget some of the more basic fundaments of the machines building that system. When I was taking my college calculus course, I could fly through things like finding the limit to a function, calculating sigma notation, or doing whatever it was that we did with those crazy fx/dx things, but if you asked me to add 55+137, I’d have to stop and think about the answer for (let’s see, two goes into the ones column, carry the minute…) a minute and two seconds.

As it is with mathematical systems, sometimes we also forget the simpler functions in our programming language of choice. Tracker does a lot of work with arrays, and in testing that one array has absorbed the contents of another, we tend to go through a loop that looks like this:

origin_array.each { |cargo| destination_array.should include?(cargo) }

It makes sense to iterate over the contents of an array to verify its contents. It’s how we first get taught to inspect an array. But the simpler way to do this in Ruby is to use Array’s intersection function, #&.

(origin_array & destination_array).should == origin_array.uniq

Whereas the iterative test’s fail case would only show you one object and that it was or wasn’t expected in the array, This code returns the array you expected next to the array you got. You might think that the iterative loop is more robust, as it would tell us if any duplicates from origin_array didn’t make it to destination_array. You’re wrong, sorry to say: Unless you’re deleting objects from destination_array as you find them, any duplicates in origin_array are just going match against the same object in destination_array as the object that got compared before it.

Of course, changing your testing paradigm isn’t worth a lick if it doesn’t make your failures more descriptive and/or more performant. It’s a matter of personal opinion whether you prefer RSpec’s array comparison or its more human-readable “expected x to include y, but didn’t”. Performance, though, is pretty cut-and-dry, so I decided I’d make a couple of arbitrarily large arrays and compare the time it took to run these different comparison methods.

I got my degree in English, so to get my arbitrarily large arrays, I made two arrays of all of the words used in The Tale of Two Cities and Great Expectations by Charles Dickens (arrays of 136,237 and 184,418 words, respectively). Then I ran them through this bit of code:

require 'benchmark' Benchmark.bm do |x| x.report("intersection") { two_cities & expectations } x.report("subtraction") { two_cities - expectations } x.report(" iteration") { two_cities.each{|word| expectations.include?(word)} } end

I compared against array subtraction as well because I figured that while it returns the inverse of array intersection, it does it in a completely different way, and as such it deserved to be benchmarked too. To give some disclosure, this was performed on a late-2006 iMac with a 2GHz Intel Core 2 Duo processor and 2.5 GB of RAM.

user system total real intersection 0.090000 0.010000 0.100000 ( 0.141857) subtraction 0.080000 0.000000 0.080000 ( 0.079905) #it was the best of times iteration 242.670000 0.570000 243.240000 (259.153025) #it was the worst of times

Anyway yeah, array intersection is a pretty cool guy. Also, it turns out putting things in code blocks means nothing on this WordPress deploy.

About the Author

Biography

It's business time!

Helps Anyone know how to make capybara-webkit be quiet? When we run our request specs we see this every tim...

Ethics and the Data-Driven Enterprise

Despite the prodigious business value, civic innovation, and predictive insight yielded from the cascading ...

Ensure the contents of one array are in another array

About the Author

Previous

Next

Ensure the contents of one array are in another array

About the Author

Previous

Next

Related content in this Stream

How VMware Tanzu CloudHealth helps customers uncover spiraling AWS Extended Support charges.

VMware Tanzu enhances Spring development with simplified operations, accelerated innovation, seamless microservices transition, increased security, and effortless scaling.

This 7-part blog series provides a roadmap for architecting a data science platform using VMware Tanzu. We'll delve into the building blocks of a successful platform that drives data-driven insights.

This 7-part blog series provides a roadmap for architecting a data science platform using VMware Tanzu. We'll delve into the building blocks of a successful platform that drives data-driven insights.

This 7-part blog series provides a roadmap for architecting a data science platform using VMware Tanzu. We'll delve into the building blocks of a successful platform that drives data-driven insights.

Bitnami-packaged open source software is loved by developers for its ease of use, which enables developers to directly pull a Bitnami package and seamlessly start using it with little effort.

VMware Tanzu announces the General Availability of AWS Commitment Discount Recommendations, which provides recommendations for all reservable services in AWS through VMware Tanzu CloudHealth.

Introducing VMWare Tanzu Data Hub, a self-managed Database as a Service (DBaaS) Platform, providing enterprises a way to host their internal DBaaS offering for internal business users.

In the cloud-native landscape, MCAs drive seamless compliance integration. Their expertise ensures proactive security measures align with regulatory standards for sustained innovation & collaboration.

Tanzu Application Platform brings innovation faster with more frequent feature updates. With 1.9, take advantage of enhanced DORA metrics visibility and improved compliance options for companies.

We’re excited to share some great news! Spring Academy Pro content is now free. It will be available to everyone who registers a work, vocational, or educational email address.

March 28, 2024, marks the official minor release date of Spring Cloud Gateway for K8s version 2.2, and it's set to optimize how developers protect access to their GraphQL services.

We are excited to announce that VMware Tanzu Application Service 6.0 is now generally available!

Get a clear picture of your OSS supply chain, and the risks you face from your open source software dependencies, using the all-new Tanzu OSS Health Assessment.

Trivy can now utilize CSAF VEX data to filter out false positives in CVE reports, maximizing the value of VEX documents in VMware Tanzu Application Catalog.

Bitnami-packaged open source software container images available in DockerHub are now signed by Notation, an implementation of the Notary Project specifications and a CNCF-incubating project.

There’s never been a better time to be a Java and Spring developer! Let me show you why with a sneak peak into JD Conference 2024.