What is artificial intelligence?
Artificial intelligence (AI) is a broad term that means different things to different people, and in different contexts. It evokes images of robots and trite jokes about Skynet. Some very serious people are worried about artificial general intelligence (often referred to as AGI) that will overtake human intelligence and become an existential threat to humanity. Other people hear “AI” and, thanks to any number of marketing efforts guilty of putting form over substance, assume they’re being sold a bill of goods.
For the purposes of this guide, though, AI can be thought of as a broad category of computing techniques that can be trained to perform certain functions with relatively minimal human oversight. These include techniques and terms that have gained quite a bit of popularity over the past several years, such as machine learning, deep learning (and neural networks), and reinforcement learning to name just a few.
We’ll use the terms AI, machine learning, and deep learning somewhat interchangeably, because (as explained below) deep learning is the technique that really catapulted AI back into the public consciousness over the past several years. However, deep learning itself is just one method under the broader umbrella of machine learning (which also encompasses a wide variety of techniques for clustering, classification, and regression), which itself is just one technique under the broader umbrella of AI (which also includes non-machine-learning approaches such as knowledge graphs). For a more detailed take on each of these terms, and AI in general, Coursera’s “AI for Everyone” course is well worth checking out, as is this glossary from Google.
This guide doesn’t go into detail on how each method works or how they’re similar/different—those are details best left to the developers and engineers tasked with building AI systems—but it’s safe to say that many of them are well understood, production-tested (although research continues to push the cutting edge), and have proven themselves valuable for any number of commercial tasks. Also, as opposed to the hypothetical AGI noted above, all of these techniques are considered narrow AI—meaning they’re focused on human- or even superhuman-level performance on specific tasks.
Fig. 1: AI is a broad term that encompasses many things.
Why is it so popular?
There are many reasons why AI is so popular right now, although the big two (arguably) are mentioned in the previous section: (1) AI is driving real revenue growth and/or cost-savings at organizations that have applied it to meaningful use cases; and (2) approaches such as deep learning have made it simpler to automate tasks (computer vision and natural language processing, for example) that were typically quite intensive in terms of time, effort, and expertise using previously available machine learning techniques. Basically, modern algorithms are making AI more accurate, efficient, scalable, and feasible for a broader population of organizations.
If you’re wondering why AI is popular now, as opposed to taking off during previous decades, that is largely a function of available resources. Access to relatively cheap computing power (especially GPUs) and mountains of data made it possible to start training deep learning models with high accuracy and within reasonable timeframes. It was the success of these models circa 2012 that helped reignite public excitement around AI. With each passing year, AI algorithms mature thanks to armies of researchers and production implementations; data becomes even more available; computing hardware gets even cheaper; and new processor architectures drive performance even higher.
It also didn’t hurt that, as explained in more detail below, the world had already been indoctrinated by big data in the years leading up to the emergence of deep learning and the reintroduction of AI into public consciousness. In some ways, deep learning is like big data on steroids—if you feed them enough data, deep learning algorithms can achieve markedly better results for certain tasks (and do some entirely different things) with arguably less effort.
(However, it’s notable that there’s a growing movement among AI researchers to develop techniques that require less data and fewer computing resources than many of today’s popular techniques require. This would potentially make best-in-class AI techniques and models available to organizations without massive datasets and massive budgets for buying and running hardware.)
Finally, AI is popular because it inspires imagination. It’s hard to not think about the possibilities when you witness applications like Siri, Alexa, and Google Assistant go from zero to near ubiquity over the course of several years, or when we see AI models dominating ever-more-complex games and outperforming doctors on analyzing medical images. For all the concerns about dangerous AI and biased algorithms, there’s also the promise of tackling very difficult problems in science, civil engineering, business, and more.
What can it actually do today?
It’s probably best to start a discussion about what AI can do with a big caveat: Just because a shiny new technique can be used to solve a problem, that doesn’t mean it should be used to solve that problem. There are many cases where data and problems lend themselves to more traditional machine learning and/or statistical analysis techniques that will work fine—if not just as well or better—and require significantly less investment. There also are areas where humans should continue to carry out actions or decisions because either they’re better than machines, or because any incremental gains from automation aren’t worth the resources to make it happen.
In fact, it’s a spate of products (both enterprise IT and consumer-facing) needlessly and/or inaccurately marketed as “AI” that has made a running joke of the term in some circles. Perhaps you’ve heard of “smart” toothbrushes. Or have seen Bob Dylan talking to a laptop in a TV commercial. Or have been pitched a new product that will use AI to make sure you never lose another sale or miss another network-intrusion attempt.
Where recent advances in AI really do shine, however, is in the area of machine perception—computer vision, speech recognition, natural language processing, and the like. This is where some of the biggest breakthroughs have come over the past several years, and why we’re now inundated with “smart” capabilities in any device that can house a microphone or camera (and, of course, connect to the internet). It’s also why researchers and entrepreneurs alike are applying AI across all sorts of fields, trying to determine how accurately they can diagnose diseases, predict crop health, regulate online content, and do any number of other things “simply” by analyzing images, sounds, and/or text.
AI models are also proving adept at analyzing other types of data, from structured data to time-series data streaming from sensors. A typical application in this type of environment might include finding latent patterns in large, complex datasets that human experts can then analyze to determine their significance (or not). For example, some companies apply AI to areas like patient data or financial fraud, where the models identify correlated features and experts examine the results to see if they’ve uncovered something new or significant.
Fig. 2: A model trained on landscapes is fed a new image and correctly predicts it's a mountain.
Most successful AI implementations involve supervised learning. This means that the model is trained on specific, labeled things—from pictures to spam messages to fraudulent transactions—which it is then able to recognize when it sees new, unlabeled examples. In the oversimplified image above, for example, the model might have been trained by a social media company to recognize any number of common landscapes. Now, when a user uploads a photo (even without posting a comment about it), the AI model is able to detect in which type of geography it was taken and the company can use that information to help fill out its profile of that user.
Although it seems simple at such a high level, under the covers the model is actually keying in on common features it recognized while analyzing all those labeled images during the training phase. This is why having quality data is important: too much noise might result in a model that focuses too heavily, for example, on a blue sky in the background rather than on the features that make up a mountain, beach, or forest.
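At a very high level, that train-on-labeled-examples, then-predict loop can be sketched in a few lines. The following is an illustrative toy, not a real vision model: the “images” are invented three-number feature vectors, and “training” is just averaging each label’s examples into a centroid (a nearest-centroid classifier).

```python
from statistics import mean

# Toy labeled training data. Each "image" is an invented three-number feature
# vector (say, blue-sky ratio, greenery ratio, slope texture); the labels and
# numbers are made up purely for illustration.
training_data = {
    "mountain": [(0.4, 0.2, 0.9), (0.5, 0.1, 0.8), (0.3, 0.2, 0.95)],
    "beach": [(0.7, 0.05, 0.1), (0.8, 0.1, 0.2), (0.75, 0.0, 0.15)],
    "forest": [(0.2, 0.8, 0.3), (0.1, 0.9, 0.4), (0.15, 0.85, 0.35)],
}

def train(data):
    """'Training' here just averages each label's examples into a centroid."""
    return {
        label: tuple(mean(ex[i] for ex in examples) for i in range(len(examples[0])))
        for label, examples in data.items()
    }

def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], features))

centroids = train(training_data)
print(predict(centroids, (0.45, 0.15, 0.85)))  # a new, unlabeled "image"
```

Real supervised models learn far richer features from far more data, but the shape of the workflow—fit on labeled examples, then classify unseen ones—is the same.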
There’s also promise in the world of unsupervised learning. In fact, one of the first public demonstrations of deep learning was Google’s work using an unsupervised model to analyze YouTube videos; the model saw enough cats and human faces to recognize the common characteristics of each (it knew they were things), despite not having any labels to apply. Many of the reinforcement learning models that currently generate headlines (typically for mastering games of some sort) are also unsupervised, in a manner of speaking. Although there are pre-defined goals, the “agents” are free to experiment and learn from mistakes as they seek to achieve those goals.
You can imagine letting a model loose on a massive dataset to identify outliers in online transactions, or to cluster customers into granular cohorts based on hundreds or thousands of data points. Taken outside of video games, reinforcement learning has the potential to do things like accelerating the discovery of new drugs or materials (such as this newly discovered antibiotic)—the types of workloads traditionally performed on supercomputers—by letting models run simulations and devise novel ways to achieve specific outcomes. If there’s a major drawback to reinforcement learning at the moment, it’s that it can be computationally expensive.
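The outlier-spotting idea above can be sketched with a classical statistical stand-in (z-scores) rather than a learned model; production systems typically use learned techniques trained on far richer features. The amounts and the 2.5 threshold below are invented for illustration.

```python
from statistics import mean, stdev

# Invented transaction amounts; one is clearly anomalous.
amounts = [25.0, 30.0, 27.5, 22.0, 31.0, 26.0, 29.5, 480.0, 28.0, 24.5]

def zscore_outliers(values, threshold=2.5):
    """Flag values more than `threshold` standard deviations from the mean.

    The threshold is modest because a large outlier inflates the standard
    deviation itself, compressing every z-score.
    """
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

print(zscore_outliers(amounts))
```

Clustering customers into cohorts follows the same spirit: let an algorithm group the data by similarity, then have humans interpret what the groups mean.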
If there’s a thread connecting all of these different use cases, it’s a combination of scale and speed that’s beyond human capability. Whereas studies suggest people are capable of putting names to thousands of faces, trained AI models can put names to millions or more. Machine listening models can suggest similar songs across a database the size of Spotify’s. And no human can fully analyze hundreds of thousands of records each containing dozens of data points, or run millions of simulations over the course of days in order to attack a goal from every possible angle.
That being said, human judgment and expertise still play a valuable role in evaluating AI models and their findings, and deciding how to act based on the predictions those models generate. The goal for organizations deploying AI either internally or as a product feature is to find the sweet spot where they can take advantage of the sheer scale, speed, and processing power of AI models, as well as the ability of humans to apply context and make wise decisions based on the data they’re shown. Today, AI is still best viewed as a tool to augment or inform human decision-making, or to automate certain tasks while still giving users the freedom to take over the reins if need be.
What are some good examples of useful AI applications?
With all that in mind, let’s quickly run through some examples (that are not related to voice assistants such as Amazon Alexa or Google Home) of AI being applied successfully—and sometimes in novel ways—by enterprises across various industries:
Automating important but mundane tasks
Oftentimes, the immediate and tangible benefits of AI are around automation rather than revolution. Simply cutting the manual effort required to perform menial tasks can save organizations a lot of time and energy, and allow them to put human brains to work on more valuable—or at least more complex—endeavors. A down-to-earth example (that doesn’t technically involve AI) in many homes might be a Roomba robotic vacuum cleaner: While it’s doing the relatively menial job of vacuuming, you can perform more complex household chores (such as dusting), or whatever else would be a more valuable use of your time.
Some examples at industrial scale include:
Uber uses computer vision to scan millions of drivers’ licenses, restaurant menus, and other documents every year. It also uses other machine learning techniques for many tasks, including optimizing pickup times for riders.
Swedish paper manufacturer BillerudKorsnas AB uses AI to determine how long wood chips must cook before they become usable pulp. The company says that monitoring the relevant diagrams and charts all day is too boring for humans, who are needed for more valuable work.
Canadian trucking firm Polaris Transportation Group uses AI to sort, read, and analyze shipping documents faster and more accurately than human workers can. The project proved so successful that the company spun out a consulting arm to help other trucking firms do the same thing.
Chinese food producers are using AI to taste recipes and production batches to ensure consistency and quality. The AI systems are faster, can work around the clock, and aren’t subject to variances in human judgment.
Some organizations are adopting software that automatically transcribes meetings, so attendees can focus more on substance and less on note-taking.
Many software vendors are incorporating AI into their developer tools in order to minimize the toil of writing quality code. These tools can do things such as auto-completing code as developers are typing (think about how your text message application predicts the next word you’ll type), scanning for vulnerabilities, and generally suggesting ways to improve the quality of code.
Finding patterns in “high-dimensional” spaces
AI models are also very useful for monitoring and analyzing large and/or complex datasets, or high-dimensional data, in order to identify patterns that humans cannot detect without extraordinary effort. Typically, these applications use AI not to automate something that humans previously did, but rather to help inform human decision-making and understanding of a problem. Applications of this type of analysis range from cybersecurity to data center optimization, and from medical records to fraud detection.
Specific examples include:
Google used machine learning to optimize data center efficiency and, as of 2016, reduce cooling costs by 40 percent. The models, trained on a large historical dataset of sensor outputs, could predict usage by recognizing patterns that experts could not identify.
We’re not there yet, but many experts believe AI will improve cybersecurity by detecting anomalous behavior. This is more than merely flagging attempts to access corporate networks; the vision is that models will be able to eliminate passwords by understanding what is normal behavior for users across many different aspects.
Researchers are using AI models to identify disease more accurately than doctors—and sometimes years in advance. MIT, for example, has demonstrated a model that can detect breast cancer years in advance by recognizing subtle patterns in breast tissue from medical images.
Mastercard uses AI to personalize fraud detection, resulting in fewer false positives. By analyzing myriad data points about individuals’ behavior—including things such as the angle at which a phone is held during mobile purchases—the company is able to increase accuracy beyond what legacy rule-based methods could achieve.
Virtually every email provider and mobile carrier utilizes AI to power spam filters and/or identify scam calls. These machine-learning-powered techniques are especially critical as spammers and scammers develop more advanced methods for defeating rule-based systems, and as the sheer volume of digital communications continues to skyrocket.
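Spam filtering is one of the oldest production uses of machine learning, and a stripped-down version fits in a few lines. The sketch below is a toy naive Bayes unigram classifier with add-one smoothing and equal class priors; the tiny corpus is invented, and real filters train on millions of messages and many more signals than word counts.

```python
import math
from collections import Counter

# A tiny invented corpus; real filters train on millions of labeled messages.
spam_msgs = ["win cash now", "free prize claim now", "claim free cash"]
ham_msgs = ["meeting at noon", "lunch tomorrow", "see notes from the meeting"]

def word_counts(messages):
    return Counter(word for msg in messages for word in msg.split())

def log_score(counts, total, vocab_size, message):
    """Log-likelihood of the message under a unigram model with add-one smoothing."""
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in message.split()
    )

def classify(message):
    spam_counts, ham_counts = word_counts(spam_msgs), word_counts(ham_msgs)
    vocab_size = len(set(spam_counts) | set(ham_counts))
    spam_score = log_score(spam_counts, sum(spam_counts.values()), vocab_size, message)
    ham_score = log_score(ham_counts, sum(ham_counts.values()), vocab_size, message)
    # Equal class priors assumed (same number of training messages per class).
    return "spam" if spam_score > ham_score else "ham"
```

The arms race described above is precisely why providers keep layering more sophisticated models on top of this basic statistical idea.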
Defining new markets
This use case for AI is less mature than those around automating or optimizing existing tasks, but the promise is immense for organizations that can harness AI to open up entirely new markets or perhaps even create new products. We’ve already seen the success that companies like Amazon and Google have had in creating a new market with voice assistants—a market that is fast expanding into nothing less than total home automation. And although Pinterest was founded in 2009 (before the deep learning revolution of circa 2012-13), it has expertly used AI-powered computer vision to create a platform centered on recommending similar items and allowing users to buy products that resemble ones in which they’ve expressed interest.
The big challenge here is identifying those opportunities where AI can power a new thing, whatever that is, that’s actually useful. Simply tacking “artificial intelligence” onto an existing product or workflow doesn’t necessarily make it smarter or better (it can sometimes make it worse), and it certainly does not help establish a new market where you have first-mover advantage.
How does it relate to other things I care about?
Big data
In many ways, AI is the epitome of “big data” as it was advertised during the heyday of Hadoop et al in the early-mid 2010s. A general idea fueling that movement was that running analyses over larger datasets is better than running analyses over smaller samples. There was also much talk about finding the proverbial needle in the haystack—those hidden business insights that only come from analyzing new data in new ways.
For the most part, this is how many modern AI algorithms work. Deep learning, in particular, is predicated on the notion that more, better data equals more accurate models. It’s why any reasonable advice on getting started with AI almost certainly includes something about the importance of having the right volume of the right data.
Hence, doing AI in-house requires a robust data infrastructure. You’ll probably need to store a lot of data and pre-process it—possibly using a system like Apache Hadoop or Apache Spark—in order to get it into shape for modeling. Many organizations also train models directly on Apache Spark, thanks to its support for TensorFlow and other popular deep learning frameworks. Essentially, the data infrastructure required for AI can look very similar to, if not the same as, what’s required for doing business intelligence, data science, and other data analysis at scale.
However, there are at least a couple of major differences between analyzing data using modern AI algorithms and more traditional methods. One is that AI as defined today excels at machine perception tasks, as well as more traditional data analysis on structured or unstructured data. Another is that while machine learning was always an option for analyzing data, today’s advances allow organizations to do it on more data and with less of the cumbersome, manual, and error-prone process of “feature engineering.” Today’s AI systems can identify relevant variables and features automatically, saving engineers lots of time on the front end of the process and leaving them “only” to tune the model—its architecture and hyperparameters—in order to improve accuracy.
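To make “feature engineering” concrete, here is a hedged sketch of the manual work being described: hand-crafting numeric features from a raw record so a classical model can consume them. The record fields and feature choices are invented for illustration; deep learning’s appeal is that it can learn comparable representations from rawer inputs.

```python
from datetime import datetime

# A raw record as it might arrive from a transaction log; the field names and
# values are invented for illustration.
raw = {"amount": "482.50", "timestamp": "2020-03-07T02:14:00", "merchant": "ELEC-9921"}

def engineer_features(record):
    """Hand-crafted features of the sort a human designs for a classical model."""
    ts = datetime.fromisoformat(record["timestamp"])
    return {
        "amount": float(record["amount"]),
        "is_night": 1 if ts.hour < 6 else 0,  # late-night purchases flagged
        "is_weekend": 1 if ts.weekday() >= 5 else 0,  # Saturday=5, Sunday=6
        "merchant_category": record["merchant"].split("-")[0],
    }

print(engineer_features(raw))
```

Multiply this by hundreds of candidate features, each requiring domain judgment and maintenance, and the time savings of automatic feature identification become clear.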
Data science
AI can be part of the data science toolkit, but you would be wise to not conflate the two. The major distinction—which Andrew Ng does a great job explaining in his “AI for Everyone” course—is that data scientists typically do their best work in terms of applying various analytic techniques to answer specific business problems, and might end up presenting their results via slide deck or some other sort of presentation. If a company is trying to improve engagement on its website, for example, a data scientist might devise the right tests, analyze the data, and present her findings and recommendations to a broader set of stakeholders.
AI techniques can be useful tools for data science work, but their utility starts and stops with analyzing the data put in front of them. That’s why many successful AI applications focus not on trying to solve or assess specific business problems and opportunities, but rather on carrying out a specific task as a continuously running system doing just that one thing: automating data center cooling, handling speech inputs from smart speakers, or serving relevant ads to site visitors.
Analytics / business intelligence
The discussion about data science and AI very much applies to the discussion about business intelligence (BI) and AI, as well. BI is a tried and true practice with many tried and true products for helping organizations view their most-important data. For the purposes of pure BI, AI will probably be consumed as features inside the commercial applications already in use—whether that’s new ways of analyzing data or capabilities such as natural-language search and queries that make it easier to find relevant data and ask questions.
However, for folks tasked with deeper analytics jobs, some databases now support AI models. Greenplum, for example, utilizes the Apache MADlib project to support deep learning and other machine learning operations inside the database, using SQL as the interface. While these integrations tend to be early in terms of maturation, they could provide some serious value without requiring organizations to build new systems dedicated to AI or to buy specialized AI products and transfer data to them from the database.
Internet of Things
AI and the Internet of Things (IoT) are often discussed in the same breath because the expectation for many IoT devices is that they’re “smart.” In the consumer world, users want devices that respond to voice commands, recognize objects (in the case of security cameras), and even “learn” to control certain tasks (as happens with smart thermostats, for example). Device-level expectations might differ in industrial IoT settings, where sensors and other connected devices feed data to systems that can identify, for example, potential machine failures or system outages.
Generally speaking, most IoT devices themselves are not too smart without an internet connection, because most computation still takes place on servers (cloud-based or local, depending on the setting). This places added importance on application architecture and even the underlying infrastructure in order to minimize latency and ensure reliability, especially as smart devices take on more responsibility inside homes, factories, and elsewhere. Advances at the hardware and algorithm levels—and privacy concerns, as well—will eventually push more AI computation (specifically, inference) onto devices, but save for some specific instances (high-end smartphones, for example) we’re not there yet.
Cloud computing
For all the reasons mentioned in the above section on IoT, cloud computing is a critical component of the AI story today. However, the cloud is not just a place to rent dumb servers to run models that power smart devices. The major cloud providers—Google, Microsoft, Amazon Web Services, and even Alibaba—are also some of the major centers of AI research and, increasingly, AI services. These services run the gamut from tools for tuning homegrown deep learning models to fully managed, just-add-data products for tasks such as computer vision and natural speech recognition.
The wide spectrum of cloud services is important for a number of reasons. A big one is that organizations using cloud-based AI services might not need to hire in-house experts; whether or not they do depends on a number of factors, including whether turnkey services can deliver on performance and reasonable cost at enterprise scale. At any rate, though, these services can definitely be a good solution for testing new AI-powered products and building proofs of concept.
Advances in AI-specific hardware are another reason the cloud could end up being very important for the development of mainstream AI. Training deep learning models, in particular, has historically been a compute-heavy task carried out across large numbers of GPUs. Especially if you’re not regularly training new models—or even managing your own data centers anymore—renting that capacity from a cloud provider could very well prove the most efficient, and possibly most cost-effective, way of procuring it. What’s more, some cloud providers are also developing and renting their own specialized AI hardware, which promises lower costs and better performance, but which also might increase lock-in at the processor and/or development levels.
Cloud computing—or, more specifically, cloud-native computing—also comes into play for organizations looking to host AI systems locally. Generally speaking, the most popular frameworks needed to train production AI models (including databases and data-processing engines) can all run on Kubernetes, which provides a modern platform for managing containerized software packages. Given that Kubernetes looks increasingly like a default substrate for hosting many applications, systems, and microservices, it probably makes sense to future-proof the management of AI systems by planning to run them on Kubernetes, as well.
However, training and running AI models are two very different things, and what makes sense for training (for example, utilizing a cloud service or running inside a datacenter) might not make a lot of sense in production. In large part, this is because latency and usability concerns are pushing more AI workloads to run locally, either on-device or at least on-site. Going forward, data privacy concerns and regulation will likely play a big role in influencing where AI workloads run, as will the advent of ubiquitous 5G networks. The connective tissue among all these factors is that they blow up the cloud-centric AI architecture that requires a constant, high-speed connection to achieve any sort of real-time experience.
What should I be concerned about?
Staffing and organization
As with so many new technologies, one of the major topics of conversation (and concern) around AI has to do with finding employees who possess the right skills to successfully implement it. On the one hand, it’s a fair concern: AI techniques and systems are still sufficiently novel so as to have relatively few skilled practitioners—database management or Java programming they are not. And during the peak of AI hype only a few years ago, it wasn’t uncommon to see large web companies hiring up seemingly all of the world’s AI experts and paying them enormous salaries.
On the other hand, the situation is probably much less dire today than it was just a few years ago. After all, most organizations don’t need AI research divisions with master computer scientists to run them, and access to quality information and training on fundamental AI techniques is now easy to come by (via courses such as Deeplearning.ai and Fast.ai, but also any number of blog posts and instructional videos). All of the most popular machine learning frameworks are open source, as well. For engineers and developers interested in practical AI, picking up many of the skills they need to be useful is faster, and involves fewer barriers, than many people might assume.
Of course, other aspects of deploying AI in production—such as data architecture and engineering, application reliability, and decisions around whether to use cloud services or go the DIY route—will require different sets of skills than just being able to write a good AI model.
Another tactic that some enterprises use to advance their AI efforts is to partner with universities. As just one example, insurance giant Liberty Mutual is partnering with MIT on research into topics and applications that will be important to the insurance industry as AI adoption increases. While these types of partnerships probably aren’t akin to outsourcing product development for an individual company, they can help advance the industry as a whole and, more importantly, provide valuable insights into opportunities to jump on or costly risks to avoid.
Ethics and privacy
The chances are that anybody remotely interested in AI has already heard and read numerous concerns about AI as it relates to data privacy and ethics. This is no wonder, considering how easy it is to fall into traps any time you’re relying heavily on data to inform decisions. On the privacy front, there is always risk associated with storing personal data and analyzing it in ways that might be used to invade individuals’ privacy, or at least in ways that give people the willies. The infamous Target-teen-pregnancy case is a prime example of this, years before AI came back into the limelight.
In addition to existing concerns, what makes AI even riskier—especially with customer-facing products—is the types of additional data often being collected—photos, faces, voices, biometrics, and other unstructured, non-traditional data types. Many people are put off by the idea of strangers getting access to these intimate and sometimes deeply personal aspects of their lives (think medical images or surreptitiously recorded voice clips, for example). Amazon, Google, Apple, Facebook, and others have already faced backlash over what they’re collecting, when they’re collecting it, and who has access to it, but it’s unclear what types of standards might emerge around handling this data.
The ethics issues around AI are dicier yet. Everyone has heard the saying “garbage in, garbage out” to explain how poorly prepared or poorly sourced data can lead to inaccurate or biased analysis, but AI adds some new wrinkles. A big one is the subconscious desire to treat AI systems as infallible, or at least as truly intelligent, and to overlook data quality in the process. This is how you end up with hiring algorithms that just reinforce the existing experiences and skill sets within the organization, or with criminal-sentencing algorithms that reinforce existing stereotypes. Or voice-powered systems that don’t recognize different accents, languages, or pitches.
Doing AI correctly requires training models on inclusive datasets that cover more ground than organizations might have internally, or can easily get online. Getting additional data ethically might require some creativity (there’s also research on algorithms to debias AI models), but training models on a broader set of examples can deliver more representative results, while also potentially attracting a larger user base and minimizing the chance of a PR or even legal situation.
Overreliance and explainability
Overreliance on AI systems, or putting too much trust in them, can also lead to problems—sometimes ethical and sometimes legal. The situations mentioned above are examples of what happens when organizations implementing AI put too much stock in the models and too little effort into data integrity and human oversight. But AI also presents risks for users who rely too heavily on what they’re being told or what they believe the system can do.
Fig. 3: A rough spectrum of risk associated with AI applications.
In some cases, a degree of responsibility definitely falls on the user. This applies strongly to applications where AI is acting as an assistant (giving users suggestions they need to assess) or where there’s no real likelihood of significant injury if the application isn’t working correctly. Most people retain the ability to manually control the stereo, laptop, or light switch if their smart homes are acting up.
However, other scenarios have actual life-or-death consequences. In these cases, the question is around how much automation is acceptable, and to what degree algorithms should be allowed to act rather than to inform. The answer might depend on the specifics of any given situation (for example, where an autonomous car is driving and at what speed) or the field in which AI is being applied (health care and medical diagnosis, for example).
Cybersecurity probably falls somewhere in the middle. Companies are sold software promising to keep them safe from threats by utilizing AI, but if the software’s “brain” is essentially a “black box” of proprietary algorithms and AI models, how much faith should customers put in its ability to do what’s advertised? Here, a false positive or a missed threat might not cost anybody’s life, but it could cost an organization and/or its customers a lot of time and money.
A lot of this comes down to product design, specifically around user experience (UX) and explainability. UX is about assessing the consequences of various mistakes and making sure users have the level of control necessary to overrule a system and take over if need be. A very simple example is an email service sending messages to a spam folder—where users can verify whether a message is truly spam—instead of automatically deleting them.
Explainability is the idea that an AI system should be able to explain to users why it made the decision it made, and is strongly connected to the "black box" problem mentioned above. It can be easier to trust an application that explains why it did something—and that level of transparency might even be necessary in heavily regulated industries like financial services, or in areas like health care and transportation where a court might have to assess who’s liable for damages. Of course, explainability, too, comes down to balance: knowing why your streaming music service suggested a particular song probably isn’t too important.
How do I get started?
This document is a lot to digest, and anyone who starts digging deeper will soon find a seemingly bottomless pit of information on how to do AI correctly. Thankfully, however, executive-level guidance can be boiled down to some relatively simple steps:
Identify potential applications of AI within your business or industry. If something can be easily automated or optimized via data analysis, and if there’s little risk associated with mistakes, it might be a viable candidate for an early AI project.
Determine what data you have and will need in order to power those AI models. Data defines how accurate and how useful an AI model can be, so there’s no benefit to skimping here. Budget spent on cleaning or acquiring data might not be sexy, but it’s probably budget well-spent.
Find the right team(s) internally to build out technical AI strategy. As long as it fits into broader IT and corporate mandates, you probably don’t need to weigh in on how an AI system is built and trained. But do make sure that the teams building it out have adequate time and resources to make informed decisions.
Start experimenting. Going from zero to production AI model is rarely a good idea. Start small and prove efficacy incrementally, then grow organically as in-house knowledge, skills, and comfort increase.
Like deciding whether to undergo major surgery, you’ll probably want some additional opinions before rushing into your first AI project or earmarking millions of dollars for building an AI team. Here’s a list of valuable resources from elsewhere to help you understand what’s possible and what to expect: