Aaron Levie, CEO of Box, talks with Michal Lev-Ram, Senior Writer at Fortune, about how advanced technologies like machine learning will revolutionize the enterprise.
“We looked at our R&D budget and compared it to numbers that were about 100 to 500 times larger and decided that it would be way better to partner for this one. So, that was a really easy math exercise.”
Michal Lev-Ram: Everyone familiar with Box? Do we need an introduction? No, okay.
Aaron Levie: I didn’t see a single hand, so, uh, it’s hard to know, actually.
Michal Lev-Ram: They nodded. They very quietly nodded. (laughing) Okay, I’ll assume you’re all familiar. Aaron doesn’t know a lot about avocados, but …
Aaron Levie: But, shampoo, I’m working on so I have to get some Brandless shampoo now so I’m excited for that.
Michal Lev-Ram: It’s a gateway product to coke then…
Aaron Levie: …To other three dollar products.
Michal Lev-Ram: Yes. Alright, now we’re gonna talk about AI, not avocados with this conversation.
Aaron Levie: Okay.
Michal Lev-Ram: So, start us off…with just where you see AI in a lot of consumer applications, trickling out. Where do you see things on the enterprise side?
Aaron Levie: Sure, and do we have official confirmation that this is the first AI conversation in a chapel that’s ever happened? Are we on Wikipedia for that?
Michal Lev-Ram: There’s gotta be some chapel in Silicon Valley that’s dedicated to this.
Aaron Levie: I think there is one, right? The auto guy or something that has religion going on. Alright, so just very briefly just in case you are confusing us with Dropbox, so Box is this company that we started about twelve and a half years ago. It’s a really simple idea, we wanted to make it really easy to share and collaborate and access files from anywhere. We were in college at the time and eventually decided to focus 100% on the enterprise market so people inside of your organizations were starting to install the application and starting to use it to share files and collaborate. So, we pivoted to the enterprise, dropped out of college, moved to the Valley. I did that whole thing. All the required sort of ways of building a startup we tried to execute.
Michal Lev-Ram: You guys were actually working right next to a Fry’s for a long time …
Aaron Levie: Yes, yes I, which now …
Michal Lev-Ram: Which we just found out is owned by Kroger. I didn’t realize.
Aaron Levie: Yes, so Kroger will make its way throughout this story I’m sure. So, we will make sure that happens. So, we worked in a Fry’s warehouse next to Fry’s. Very good logistics by the way, so congrats Kroger for that. But, so we built up the company, we now have about 17, 18 hundred employees. 65% of the Fortune 500 use Box to securely manage, share, collaborate around content. So, the whole mission of the company is we wanna make it incredibly simple to manage any amount of files and documents and important information at enterprise scale, and let companies collaborate and share securely in and outside their organization. And so, as you can imagine, we sit squarely at the center of this whole massive ecosystem of… Okay, how do we start to take all of the data in our business and be able to derive more information, more insights, and more intelligence from that content?
So, going back 5, 6, 7, 8 years, we started trying to figure out how machine learning was going to apply to our service. That was very basic probabilistic things of how we could continue to improve the service and what content we were showing people, whether it’s search results or activity feeds. That was really the, sort of, level of vision we had at the time. And then, we saw this big boom occur in mostly deep learning and some narrow AI use cases, especially around computer vision, where you can take any amount of photos, any amount of video, any amount of audio information and begin to glean more and more insights and context from that content. So, for us, what we’ve been working on is a product called Box Skills, which takes the machine learning and artificial intelligence services from Microsoft, Google, and IBM, and we’re working on Amazon as well, and plugs those into Box so we can take the computer vision technology from Google Cloud or the audio intelligence technology from Microsoft Azure, or Watson’s sentiment analysis and be able to work against all of the files that we have in Box when a customer wants to be able to take more context and more insights from their information. So, that’s what we’ve been up to.
Michal Lev-Ram: And totally interoperable.
Aaron Levie: Totally interoperable and the idea is, for us, we had to make this decision about a year and a half ago now where we either were gonna go build effectively all of the things that all of the big companies were building. And we looked at our R&D budget and compared it to numbers that were about 100 to 500 times larger and decided that it would be way better to partner for this one. So, that was a really easy math exercise. And we decided that, okay, Google, Microsoft, IBM, Amazon and many others were gonna be building up pretty significant AI initiatives and that we’d be better off focusing more on architecture as opposed to proprietary AI services. So, we’ve been working on an architecture that lets us, sort of, effectively plug and play. I mean, as everyone in this room knows, it’s way easier to say plug and play and interoperability than it is to actually do it. So…
Michal Lev-Ram: …Just press a button.
Aaron Levie: It’s just a button. It’s a checkbox in an admin console. That’s all it is. And it doesn’t ever work that way, but that’s what you hope. The idea is how to build an abstraction layer from all of the content that we manage. We have about 35 billion files that we store. And an abstraction layer between the content and all these different third party AI services that we can basically do individual or mass transactions on so we can begin to process data. And so, if you’re a retailer and you have, let’s say 100 million photos of all these different products, and you wanna instantly be able to tag them with all the objects that you see inside of the photos, that’s what things like Google Computer Vision are really really good at now. Accuracy is still constantly improving, but we think in the next couple of years it’ll certainly exceed what a human can do from especially a time standpoint and cost standpoint. So, that’s our kind of forte so far in some of the AI space, and then we’re doing a bunch of work ourselves with more proprietary machine learning work, just to better connect the dots between how people are using information and content within Box and then build a bunch of functionality on top of that. So, we’re super pumped. We think that it’s gonna completely change enterprise software. I think it’s a requirement that I say that, obviously, but we deeply believe this. I mean, when we look at, I mean, I don’t even remember what the question is anymore but…
Michal Lev-Ram: …yeah, I’ll go back to it. Let’s reel you back here.
Aaron Levie: Yeah, please reel me back.
Michal Lev-Ram: So, take a step back and rejoin us…
Aaron Levie: Yeah where are we? We’re in a chapel. Okay.
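The abstraction layer Levie describes, sitting between stored content and interchangeable third-party AI services that an admin can switch on, might look roughly like the sketch below. This is only an illustration under stated assumptions, not Box's implementation: `SkillRegistry`, `fake_vision`, and `fake_sentiment` are hypothetical names, and the two "providers" are local stand-ins rather than calls to any real vendor API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A "skill" is any callable that takes file bytes and returns metadata tags.
Skill = Callable[[bytes], List[str]]


@dataclass
class SkillRegistry:
    """Hypothetical layer between stored files and pluggable AI providers."""
    skills: Dict[str, Skill] = field(default_factory=dict)
    enabled: Dict[str, bool] = field(default_factory=dict)

    def register(self, name: str, skill: Skill) -> None:
        self.skills[name] = skill
        self.enabled[name] = False  # off until an admin turns it on

    def enable(self, name: str) -> None:
        if name not in self.skills:
            raise KeyError(f"unknown skill: {name}")
        self.enabled[name] = True

    def process(self, content: bytes) -> Dict[str, List[str]]:
        """Run every enabled skill on one file and collect its tags."""
        return {
            name: skill(content)
            for name, skill in self.skills.items()
            if self.enabled[name]
        }

    def process_many(self, files: Dict[str, bytes]) -> Dict[str, Dict[str, List[str]]]:
        """The 'mass transaction' case: run enabled skills over many files."""
        return {file_id: self.process(content) for file_id, content in files.items()}


# Stand-in providers (assumptions): placeholders where real vendor calls would go.
def fake_vision(content: bytes) -> List[str]:
    return ["shoe"] if b"shoe" in content else ["unknown"]


def fake_sentiment(content: bytes) -> List[str]:
    return ["positive"] if b"great" in content else ["neutral"]


registry = SkillRegistry()
registry.register("vision", fake_vision)
registry.register("sentiment", fake_sentiment)
registry.enable("vision")  # the "checkbox in an admin console"

results = registry.process_many({"f1": b"a great shoe photo", "f2": b"a hat"})
print(results)
```

Because callers only see `process` and `process_many`, swapping one provider for another would mean registering a different callable under the same name, which is the plug-and-play property the conversation is about.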
“How many times are we using computers and software to do things where we are basically just working on behalf of the software?”
Aaron Levie: Yeah, we are ubiquitous in our collective messaging of AI, so we have reached complete saturation in our ability to talk about AI. So that’s, we’re there on the messaging side. I think in terms of where I think the industry right now is still working its way through; and it’d be cool to kind of get a survey maybe at some point. I’m not good at administering surveys so if you have a good way of doing that, that’d be great. But, I think we’re very early in deep, pragmatic, high leverage, high impact business use cases. I think we’re collectively as an industry very early in that.
Michal Lev-Ram: How many of you agree with that? Show of hands. That’s kind of how I do surveys.
Aaron Levie: Okay, that’s how you do it? Okay, cool. Okay, you don’t stratify it by like …. Okay, raise your hand if you’re less than 50% using AI. Okay.
Michal Lev-Ram: That’s too complicated. We need AI for that.
Aaron Levie: Yeah, we will. Yes. So, I think we have this feeling where we know that everything’s gotta change because if you just look at your daily tasks, I look at my daily tasks, everybody here looks at their daily tasks, and we’re just like, “how many times are we using computers and software to do things where we are basically just working on behalf of the software?” and we’re trying to coordinate things and we just all know that if semantic analysis was good enough, if natural language was good enough, if you could connect all the dots of all the data in my company and my organization and that was good enough, that the amount of times I sent an e-mail asking somebody for, “Hey, what was the revenue from this set of customers last quarter?”, and I can string that query together really really easily, then somebody else has to spend two hours doing all of the analysis to pull it together. What if you could string that query together and the AI would just return the answer? We have access to all of the underlying data, what we don’t know is how to join it, make sense of it, and make sure that our natural language is what retrieves it. So, I think what we’re excited about is the raw technology is there in many cases. Our data sets aren’t necessarily all collectively there. We’re not all working with our information in consistent enough ways that horizontal technology’s been able to solve our problems. And this is why I think there’s, to some extent, even fatigue sometimes where we’re hearing so much about AI, but the level of actual market penetration of these technologies is still very early.
So I think what we’re focused on, and overall with Box we’ve just done this for now 12 years, I think what we’d like to do is take all of the complex technology that’s out there, all of the amazing innovation, and bring it to customers and bring it to large enterprises in highly, highly applied, specific ways. So, if you have a lot of photos and you wanna be able to search those photos or build workflows off those photos and not have to have humans manually enter all of the categories of those, we wanna do that at scale in an automated way. If you have a lot of video and you wanna be able to search within that video. If you have documents that you wanna understand the different categorization of those documents. Those are the use cases that we’d like to bring to bear first. And we’re not promising a complete revolution within the next 6 months, but I think over time, when we look at all the tasks that we’re doing, when we look at the work that goes on inside of our organizations, the amount of data that we’re working with, I think that we’re gonna start to… just, every one of these daily experiences will become more efficient. The daily tasks that we’re doing over time just become a lot simpler. So, we’re pretty excited. I think we’re very early, but I think in certainly 3, 5, 10 years from now we will not have predicted most of the change that will come, even though we’re talking about it so much right now.
Michal Lev-Ram: By the way, sorry to call you out over here. You’re furiously taking notes. But I saw you nodding. On the photo side, are you guys doing that in your organization?
Audience #1: Really resonates on the search.
Aaron Levie: Oh, thank God. Okay.
Audience #1: I’m part of AIG, and between all the number of claims, data, underwriting… We’re seeing a lot of searches and it’s all inconsistent.
Aaron Levie: And this is what we find, what most organizations have, especially with unstructured content. Your documents, your files, your claims. You are only working actively maybe on 1 or 2 percent, so I made up a number, but certainly with your scale. Yeah, it sounds great.
Michal Lev-Ram: Sounds good.
Aaron Levie: And so, you have 90 plus percent of your data is this cold data that is effectively archived, but what if there was actually value in that? That you could retrieve at a moment’s notice. What would that look like? We go to media customers and they have all of this intellectual property that they can’t use because they don’t know what’s in their archives. You go to insurance companies or banks and they have an unbelievable wealth of information, but there’s simply no way to retrieve it because everybody is relying on humans to say, “okay, what was the last thing that I worked on? Where could that be?” And then we can only remember a small percentage of the total amount of data that we have access to. And so that amount of redundant work that happens, the amount of miscalculations that occur, the amount of times I’m searching for the same thing over and over again. Computers should be actually helping us with that and we should be able to kind of move along to the next task in a more productive way. So, thank you for taking notes.
“If all you’re trying to do with your cloud is match what Amazon can already do or what a major public cloud can do, then is your time really going into the most valuable areas of your business? Because your customer doesn’t care where the compute is.”
Michal Lev-Ram: (laughs) Please jump in with questions by the way. Feel free at any point. But, what do you think … you talked about how you looked at your own R&D budget and decided let’s partner, not build … of your customers and people in the room … I mean, are you seeing … are most people building this themselves at large enterprises? Or, what approach are they taking?
Aaron Levie: I think we’re all probably collectively figuring out what is the proprietary advantage in the world of machine learning and AI. And I think that this is just such an early space that it’s pretty hard to know exactly where the line of demarcation is from my service and my data and algorithms that I could leverage from third parties. And we struggle with this internally. Two or three years ago, before we made the decision we said “Okay, if what we’re doing is relying on partners like Google, or IBM, or Microsoft for some of this advanced technology, do we feel we have enough value add to the customer?” And so, then we just inverted it and said, “Well, the reality is, us building it ourselves is not gonna be inherently valuable to the customer either, because at best we’re gonna match the technology from those vendors.” It’s the same argument for cloud computing. If all you’re trying to do with your cloud is match what Amazon can already do or what a major public cloud can do, then is your time really going into the most valuable areas of your business? Because your customer doesn’t care where the compute is. Maybe the regulator does, but we’ll work through that. So, if your time and energy is being spent matching what the market can sell to you at basically maybe a fraction more margin maybe, then your R&D budget and your time and your innovation and your IP’s not going to the right area.
And so we had to ask ourselves the same question, I think so many customers are doing the same thing. Which is … Okay, so if I can send some workloads to public clouds or private clouds or work with partners on this, then what probably matters more than anything is the data, and the experience around how we’re gonna use that data in our products. And, I think that’s what the twenty-first century is gonna be defined by. In the digital age, it’s not gonna be… it’s gonna be very hard to keep your customers captive. It’s very hard to control all distribution in the world. So, what you really have to care about is customer experience and engaging experiences and building the best products and that’s gonna come down, we believe, to how you’re leveraging data, how you’re facilitating those experiences. And so, I don’t think it’s gonna be someone’s gonna have a better algorithm necessarily, but someone’s gonna have a better data set that’s gonna train an algorithm for their data better, and that will hopefully compound more and more over time. So, whether you’re banking, or insurance, or healthcare, or life sciences we think it’s gonna come down to the data. And we think it’s gonna come down to how do you manage that data, how do you extract the most amount of insights and value from that, and the algorithms are gonna be not fungible, but certainly there’s gonna be so many and you’re gonna wanna be able to pull them together for the experiences that you’re trying to deliver.
Michal Lev-Ram: Any other suggestions for where people in the room should be investing their resources and time and energy when it comes to…
Aaron Levie: …Yeah, Box. I’d say that should be high on the list. Yeah.
Michal Lev-Ram: Where do you work again? Yes. When it comes to AI though. Broadly. Strategy.
Aaron Levie: Still Box. The answer’s still relevant, I think …. again this is… and we have to actually, honestly, we have to always remind ourselves of this, but we sort of … every new major innovation in the market, whether it’s mobile, whether it’s cloud, whether it’s AI, we just always have to step back and say “Okay, where is the value created in this new era?” What are the … and for us, we think about it as … Okay, for our customers, what are the challenges going to be in this new world order and what can we do to make their lives as simple as possible? And so, instead of having to have your data in all these different places and running all these different AI algorithms, how can we give you one platform where the AI comes to the content as opposed to the opposite of that?
I would mostly give the same advice to customers, which is try and figure out where is the value going to be in the future? What are the customer experiences that need to be solved? And importantly, the idea … and we are guilty of this on a daily basis, but trying to step back and remember that technology for technology’s sake is not gonna solve many problems. Really, ultimately, working back from what is the customer problem, what is the actual customer experience that this technology can solve? More often than not, when we try and just throw technology into the market and we actually haven’t worked back from what the customer problem is, we don’t get surprised by amazing results. We mostly have a technology that nobody is utilizing because it’s not fundamentally solving a problem. So, I would say, whatever business you are in, work back from: what is the customer experience fundamentally that is deficient, and where does some form of automation, some form of intelligent experience begin to fundamentally change that? And in many cases, it will mean rewriting major parts of your underlying business processes.
It will not be possible that you can take today’s software system, add AI to it and all of a sudden you’re going to have some new, magical experience. Probably, the underlying business process itself is the thing that needs to be retooled. We see this all the time with our large customers. More digital experiences, more technology is not going to make you more competitive with an Amazon or a Google. It’s going to be going back to sort of first principles. What ultimately was the customer trying to do? Why did they come into our store? Why do they shop with us in the first place? Why do they want our service? Then let’s redesign the customer experience. AI will, naturally, have some component, where we can automate some process but, let’s not just throw AI at today’s architecture and expect that we are going to revolutionize that experience. That would be the rough advice but, that would work irrespective of whether we are talking about AI, cloud computing or mobile apps.
Michal Lev-Ram: Okay. So any question that I ask, that would pretty much be the answer?
Aaron Levie: Yeah, pretty much. That covers all three major topic areas.
Audience #2: What kind of interface do you have for human-in-the-loop training, for example, to help make the models better? It seems to me that’s going to be an important part for everyone.
Aaron Levie: Right now, we are just relying on the existing services from Google, Microsoft, IBM. The customer is not doing training through our interface. There are APIs that you can use from all of those players that can do custom training datasets using their interfaces. For our use case, in particular, what we have designed but not launched or released is, within our user interface, when you look at a file, whether it’s an image or a video or a document, you being able to effectively tell us what that piece of content is, and it gets better and better over time. To the point where that feedback loop is able to disappear. So, that’s effectively how it will show up over time.
Michal Lev-Ram: Other questions for Aaron? Back there?
Audience #3: What are you guys thinking about adversarial machine learning, particularly, and how that will appear in a couple years? And what effect do you think that’ll have on growth?
Aaron Levie: Wow. I was told we would only talk about positive subjects today. (laughs) When you say adversarial, like, cybersecurity attacks? Or …
Audience #3: Yeah, so, if I put three dots on the stop sign, I could trick your car into thinking it’s a yield sign.
Aaron Levie: Right. Yeah, great question. So, fortunately, and probably knock on wood, I’m sure Facebook said this like three years ago, so, because we are primarily working with, effectively, only corporate customers and they are the ones working with their data, most of the time we’re not having to worry as much about people manipulating the training sets, cause they are the ones … it’s their content and their training set. Certainly, if one of our external partners had some adversarial use cases or negatively trained data, that would be a problem and we’d have to make sure that we’re working through that. I mean, in general I’m scared shitless of AI. So…
Michal Lev-Ram: As a human.
Aaron Levie: As a human. Absolutely. Related to Box’s use cases I’m pretty optimistic. (laughing) So, the way that we are gonna use AI, I am genuinely … I think is gonna be fine. I think when you start to think about some of the more scary use cases in the world, fortunately, we’re hoping we won’t be a part of. This is … We are in for a couple decades of very interesting policy and regulations and sort of AI battling AI and … it’s interesting cause I’m actually, I’m extremely optimistic about the dimension of jobs, for instance, on AI. I actually think that we are far too cynical and skeptical on the jobs front. I actually think jobs will just continue to change like they always have over the past hundred some odd years. And what we’re really bad at is imagining how the jobs are gonna change, because we can only… like, we’re reflecting on today’s work and what if you remove this thing, well then I’d be on a beach for the other seven hours of the day, but we’ll probably just move on to the next set of tasks. So I’m very optimistic on jobs and reasonably suspicious of some of the more negative aspects of AI. Even with very very limited cyber security attacks you can see the destruction that’s already occurred in many cases. You know, layering on AI on that is pretty gnarly. So, I think that I’m on the side of “we gotta regulate it” and also on the “I don’t think this is gonna end the world from jobs and the economy”. So, I’m hopeful there.
Michal Lev-Ram: We have other things that are gonna end the world most likely.
Aaron Levie: Oh, totally. Yeah, it won’t be our first problem.
Michal Lev-Ram: Other questions for Aaron?
Audience #4: When are you planning to take your release to the market so that we can use it and try it out?
Aaron Levie: So, we just announced the technology about a month and a half ago at our conference, in mid October. It’ll be … it’s in extreme private beta for a few select customers that have use cases.
Michal Lev-Ram: There are two people in the whole world using it.
Aaron Levie: About four. So, it’s extreme beta and we’ll be rolling it out in probably the next quarter or so for more broader availability and then full general availability probably about two quarters out.
“We recognize that probably 1 percent of your employee base is gonna tell us anything about their files, so can we take even that 1 percent and learn enough from that 1 percent that we can connect the dots to a much more significant part of the population and how data is being used. Clearly the only way to do that is machine learning.”—Aaron Levie
Michal Lev-Ram: Okay, I think there was a question over there.
Audience #5: So, in the space on structured data. A little bit may be taking this and applying it to risk models as well, so these files are more risky?
Aaron Levie: Yeah, so this one’s really exciting. So, I’m gonna switch to a different technology. So Box Skills, as if everybody cares about really memorizing our architecture, but I’ll go for it. So, Box Skills is basically where we’re taking in the machine learning and AI services from all the different public clouds and putting it into the Box and making it so customers choose which ones they wanna turn on. That’s one technology. Box Graph is our own effective machine learning model that takes all the events and behaviors that people are doing on Box and then we build features on top of that. So, one thing that we’re thinking of trying to solve is sort of this age old problem that you’re at least, your CSO has always had, which is: I have a hundred thousand employees and I want to make sure that confidential data doesn’t leak from the organization. It isn’t shared with the wrong entities. It’s not shared with accidentally a competitor or at least somebody who’s not supposed to know. And so the theory has been, for twenty, thirty years, is… Okay, well this is fine, what’s gonna happen is every user is just gonna tag their content and they’ll classify the data. And I think as we know, basically .001 percent of people do that and they’re mostly just in the legal team, or the security team, but certainly no knowledge worker has ever tagged a file as something.
So, we basically have to find other ways to interpret and come up with implicit signals about security issues around content. So, what we’re trying to do is figure out, okay, what are all the different patterns and behavior that … the easy thing is just user behavior analytics. So, let’s see what a normal set of things that a user does is and then when they do something anomalous let’s alert a security team and call it out. That one’s pretty straightforward. Many technologies do that. CASBs and others. It’s still really important and it can mean that we’re focusing on more signal versus noise. But, given that we’re a sharing platform also, and given that we manage content, we can learn other signals from how people behave and we can actually begin to look at content and see its similarity to other content. So, maybe we know because of either a generic data set or what your specific enterprise has trained us on, we know what … let’s say in Pharma we know what a life sciences regulatory application looks like. Right, we just have seen it and you’ve trained it. Now we can know that all other content that looks exactly like that or has any similarity or has other content that is within a regulatory application and it shows up in another document, we know that that also is pretty similar. And so, we can begin to weight the risk level of that file being shared with certain entities outside the organization higher than if it’s a marketing asset.
And so, what we wanna be able to do is … We recognize that probably 1 percent of your employee base is gonna tell us anything about their files, so can we take even that one percent and learn enough from that one percent that we can connect the dots to a much more significant part of the population and how data is being used. Clearly the only way to do that is machine learning. You would never be able to do it in any other way. And so we’re working a lot of technology around that.
So, we want to make sure that whether it’s compliance, or privacy, or security, that we can make sure that the user is doing as little to nothing as possible, the system is doing as much as possible, and we try and train it with as many high leverage activities as we can.
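The idea in this answer, learning from the small fraction of files that someone has labeled and weighting sharing risk for look-alike files accordingly, can be illustrated with a deliberately simple sketch. Everything here is an assumption made for illustration: the Jaccard word-overlap similarity stands in for whatever model Box actually uses, and the names `infer_label`, `sharing_risk`, `RISK_WEIGHT`, and the threshold value are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical risk weights per document class (not figures from the talk).
RISK_WEIGHT = {"regulatory": 0.9, "marketing": 0.1}
DEFAULT_RISK = 0.5  # used when no labeled example is similar enough


def tokens(text: str) -> Set[str]:
    """Lowercased word set; a crude stand-in for a real content representation."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity in [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def infer_label(text: str, labeled: List[Tuple[str, str]],
                threshold: float = 0.3) -> Optional[str]:
    """Borrow the label of the most similar labeled example, if close enough."""
    doc = tokens(text)
    best_label, best_score = None, 0.0
    for example, label in labeled:
        score = jaccard(doc, tokens(example))
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


def sharing_risk(text: str, labeled: List[Tuple[str, str]]) -> float:
    """Weight external-sharing risk by the inferred document class."""
    return RISK_WEIGHT.get(infer_label(text, labeled), DEFAULT_RISK)


# The "1 percent" of content that someone actually labeled.
labeled_examples = [
    ("clinical trial regulatory application for drug approval", "regulatory"),
    ("spring campaign brochure for new product launch", "marketing"),
]

for doc in [
    "draft regulatory application for drug approval submission",
    "brochure for product launch event",
    "lunch menu friday",
]:
    print(doc, "->", sharing_risk(doc, labeled_examples))
```

A production system would presumably use learned representations and many more signals (such as the user behavior analytics mentioned above), but the shape is the same: a few explicit labels, a similarity measure, and a risk weight that follows the inferred class.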
Michal Lev-Ram: Well, the great thing about Aaron is that you talk so fast that even in a 20 minute conversation you can cover what most normal people cover in like an hour.
Aaron Levie: Oh, we’re done?
Michal Lev-Ram: Yes. (laughing) Thank you so much.
Aaron Levie: Alright, thank you. Thank you.
This video was filmed at the Built to Adapt conference in Sausalito, California. The transcript was edited for clarity.