Forget about AI taking our jobs; let's worry about the attackers aiming to weaponize it.
Most IT security specialists readily admit that the future almost certainly contains the union of artificial intelligence and automated cyber attack systems. While that conjures up potential Terminator-style "Skynet" disasters, the reality is at once more mundane and more surprising.
Despite what the marketplace suggests, few defensive security solutions actually use any effective form of AI. Many vendors use the term loosely: because they can automate or orchestrate their detection, rules, or scans, they claim to have achieved some near-version of AI. But this isn't true; it is simplistic pattern matching masquerading as, and marketed as, AI. True AI uses data to make decisions, then branches out from those decisions to conclusions that weren't immediately obvious or predictable.
So we have a situation where many vendors talk about AI without actually using the technology, or without demonstrating its true potential. Furthermore, there have been few efforts where vendors have collaborated on defending against potential AI-based adversaries; the best examples are the 24 Information Sharing and Analysis Centers that have been created around specific vertical markets.
With the ethos of open source in mind, I wanted to share a serendipitous discovery my team at Pivotal made not too long ago that illustrates how AI can be used in a novel way to solve a very real problem. Our hypothesis borrowed from an education theory called transfer of learning, in which knowledge gained in one context is applied to a completely different one.
Rohit Khera on my team hypothesized that we could teach Google's TensorFlow to detect when data is confidential or private, like a password, and when it isn't. The foundational idea was to interpret lines of code as images of grayscale pixels. He wrote a blog post on the topic in 2016. There seems to be some mysterious contextual link between a neural network's ability to recognize a picture of a flower and its ability to recognize a password rendered as pixels of text.
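Khera's post doesn't spell out the encoding, but the core trick of treating text as grayscale pixels can be sketched in a few lines. Everything below (the function names, the fixed width of 80) is illustrative, not the team's actual pipeline:

```python
def line_to_pixels(line, width=80):
    """Render one line of text as a fixed-width row of grayscale intensities.

    Each character's byte value (0-255) becomes one pixel, scaled to [0, 1].
    Lines are truncated or zero-padded to `width` so every sample has the
    same shape, just like the images fed to a vision model.
    """
    data = line.encode("utf-8", errors="replace")[:width]
    data = data.ljust(width, b"\x00")  # pad short lines with "black" pixels
    return [b / 255.0 for b in data]

def snippet_to_image(lines, width=80):
    """Stack several lines into a 2-D grayscale 'image' of a text snippet."""
    return [line_to_pixels(line, width) for line in lines]

image = snippet_to_image(["password = 'hunter2'", "print('hello')"])
print(len(image), len(image[0]))  # 2 rows of 80 pixels each
```

Once text is in this form, the resulting matrices can be fed to the same convolutional architectures that classify photographs, which is where the flower-to-password analogy comes from.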
Google's DeepMind found a similar link between Go and chess. Its AlphaGo program used AI to teach a machine how to play Go, and in 2016 it beat one of the world's top Go players. A year later, its successor AlphaZero won a match against Stockfish, the top-rated computer chess program. What's remarkable about the latter feat is that AlphaZero taught itself in just a few hours to play at this grandmaster level. No one programmed the machine with the typical opening moves of chess, or famous games of the past. The same self-play approach that mastered one game, Go, mastered another, chess. Khera presumed that a similar notion could apply to the very real problem of leaked credentials in log files and source repositories.
To understand his approach, which I'll describe in a moment, let's first consider a typical IT security engineer's day. Alerts arrive constantly about all sorts of anomalies: many of them are false positives, while others could be signs of a system breach. Screening these alerts is a complex and tedious human task, and it is what many defensive AI-type tools are trying to help with, so that a person doesn't have to evaluate so many alerts and can focus on the ones that really matter. Given that the average IT operation runs hundreds or thousands of different applications, this triage is very difficult, and it is one of the reasons why so many breaches occur.
But many vulnerabilities happen because of human errors too, and these situations rarely trigger an alert. An engineer copies a sensitive file to a cloud storage bucket and sets access rights to “anyone.” Or a set of personal data is accidentally copied to a log file and stored as plain text on some external server. These situations aren’t always obvious but are very problematic, and could threaten the entire enterprise. If an attacker finds this information, they can harm our business.
We already use a variety of automated tools to facilitate patching our systems and to manage common infrastructure tasks such as spam handling and system configuration, so why not take things to the next level with more advanced AI tools? AI isn't yet the answer to everything, and it requires joining machine and human learning to be effective, but with the right applications it can be far more useful in the security space, particularly in defending against adversarial AI-based attacks.
For our research, we began feeding TensorFlow a series of matrices representing textual data, such as private encryption keys or lists of passwords, and used the software to learn to recognize when data was confidential and when it wasn't. We connected this to a live stream of logs containing both private and public kinds of data and trained it accordingly. We found that the program detected confidential data much better than the regular expressions we had been using.
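For comparison, the regular-expression approach being replaced looks something like the sketch below. The patterns are illustrative, not the ones our team actually used, and the weakness is built in: a rule-based scanner only catches what a pattern author anticipated.

```python
import re

# Illustrative patterns only -- the article does not list the team's actual
# regular expressions. Anything a pattern author did not anticipate slips
# through, which is the gap the trained model closed.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*\S+"),
    re.compile(r"(?i)\bapi[_-]?key\s*[:=]\s*[A-Za-z0-9]{16,}"),
]

def looks_confidential(line):
    """Return True if any hand-written pattern flags the line."""
    return any(p.search(line) for p in SECRET_PATTERNS)

print(looks_confidential("password=hunter2"))     # True
print(looks_confidential("GET /index.html 200"))  # False
```

A trained classifier, by contrast, generalizes from examples rather than enumerated rules, which is why it could flag confidential data the regex authors never thought to write a pattern for.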
What's compelling about the findings: just as AlphaZero was able to beat the best computer chess program in less than a day, our AI showed promise of outperforming our regular-expression authors.
Granted, transfer of learning as it relates to AI is still mysterious. We are still building a basic vocabulary even to discuss what's possible. But as we learn how to apply AI in the defensive context, you had better believe attackers are also trying to figure out how to weaponize it.
My hope is that our industry can carve out the necessary time to experiment with AI and learn. Engineers need the freedom to occasionally experiment and fail; attackers do it all the time. If we experiment and innovate, I'm hopeful that transfer of learning will help us better understand data relationships and successfully defend our enterprises.
Combatting Adversarial AI — Which Side Are You On? was originally published in Built to Adapt.