Preventing Catastrophes from Generative AI
A conversation with thought leader Randol Aikin
Generative AI has sparked our imaginations. It has also terrified us. Bing’s chatbot declaring its love for a New York Times columnist, CNET’s saga with AI-generated content, and many other incidents have shown that AI can generate horrific outcomes. Moreover, the nature of the adverse outcome can be totally unpredictable.
Randol Aikin helped build safety systems for self-driving cars at Uber and elsewhere. He was particularly affected by a fatality caused by a self-driving car and began to envision how he could prevent other types of AI from causing catastrophes like this. He is now exploring how to create guardrails for generative AI more broadly.
In this conversation, we discussed:
How are safety systems for self-driving cars instructive for preventing adverse outcomes from generative AI?
What are some examples of adverse outcomes that would be hard to predict?
What are best practices for avoiding them?
What should be the role of boards in monitoring generative AI used by their companies?
You can listen to the podcast or read the lightly edited transcript below. Let’s dive in!
Who are the startup founders you admire most? I’d love to interview them on this podcast. Reply to this newsletter email with your recommendations.
Here are a few leadership roles and other key roles at companies that I’m excited about.
Airplane: Solutions Engineer
Arketa: Head of Demand Gen
Explo: Head of Marketing
Jasper: Lots of roles across engineering and product!
Meez: Head of Growth
Pocus: Founding AE and other roles in Customer Success, Biz Ops, and Demand Gen
Seso: Enterprise AE, Enterprise CSM, and other roles
A: Randol, I am so excited to have you on the podcast today. Thank you so much for joining.
R: Thank you, Allison. I really appreciate it.
A: You are working in a very interesting space, generative AI. Obviously it's gotten a lot of news attention recently. Everyone is talking about it. There's a lot of buzz, but there are also some challenges that I think you've anticipated with generative AI as a technology and the way it might be used in the future.
I don't think you're looking to fear-monger or talk about the singularity or anything like that. You're thinking about the practical considerations that would make generative AI useful in a business context and in a way that would help companies that are using it to best serve their customers.
So, I think you're anticipating some really interesting things that might happen in the world. I’m excited for you to share them with our audience. To start out, can you tell us what brought you to where you are today?
R: I've had a pretty non-linear career. I've worked in big bang cosmology and I've worked in humanitarian aid and disaster relief. I also had the opportunity to participate in the last big AI hype cycle, which was the world of autonomous vehicle development. And in that world, I spent a number of years focused on this problem of autonomous vehicle safety, which as you can imagine, is a really challenging problem. And also, in my opinion, it’s the first broad experiment in applied AI deployment — figuring out how to ship safe AI algorithms that can have real impacts in the world. I did this at Apple, then at Uber, and then at a company called Ike.
I had a number of formative experiences during my time in the field. One of those was when I was at Uber in 2018, when an Uber self-driving car was involved in a fatal accident. Elaine Herzberg was struck and killed by a vehicle that was driving itself.
For me this was a really important moment of recognizing that while we're sitting behind our computer screens developing technologies, these technologies have real consequences in the real world, and we need to be really thoughtful and deliberate about how we deploy these things.
A big lesson for me coming out of that experience is to refer to safety and risk as a systems problem. It's not a technology problem. It involves company leadership, engineering, the design of the technology itself, product management, and go-to-market. And that job is iterative and never complete.
As you develop a new technology, you need to put a lot of work into making that technology better. But you also need a parallel effort of building the systems and processes that are going to allow it to be successful and safe in the world. This has been true in the world of aviation and automotive and most recently in autonomy.
This world of generative AI has a lot in common with these examples. There’s a big hype cycle with a ton of investment and a ton of different players coming into the field and putting technology out in the world. It's incumbent on us as a group of technologists to develop this parallel effort of systems, safety, and guardrails to help make sure that deployment is done responsibly out in the world.
A: You obviously have a lot of experience with self-driving cars, and certainly the fatality as a result of a self-driving car is a very emotionally gripping example of AI gone wrong. I think you're of the opinion that there's actually a much more horizontal, generally applicable type of problem that we need to solve for when ensuring that AI produces safe outcomes for people. How do you go from preventing fatalities to preventing adverse outcomes from generative AI?
R: The first thing to note is that generative AI actually is a huge technology breakthrough. I think we've all had our “holy crap” moments with generative AI. For me there have been two. One was when I was working on a side project last year. I fed GPT-3 the prompt “good morning,” and what came back was a perfectly crafted scam email. Many people have had examples like this filling up their Twitter feeds.
The second one, which is perhaps more interesting and compelling, is what the OpenAI team showed in their Codex demonstration. This is the tool that developers can use to turn natural language into code. It demonstrates the ability of GPT-3 to actually learn APIs. This is a really big deal, because it means that in the very near future we're going to see generative AI not just confined to a dialogue chat box, but out in the world making real decisions and taking real actions, such as making purchases or booking travel.
There's a real technology breakthrough, and along with that is this adoption curve where there are going to be a lot of different business use cases that are attempted. And the bet is that there are going to be a bunch of bad outcomes that accompany them. It's going to take a while for us to get to the point of having a mature technology. We can learn from other industries about how we get there. Autonomy is one, aviation is another.
In a lot of respects, these system problems are not new problems. We have really good ideas about how to do safe deployments, how to build guardrails that we sometimes call runtime monitors, which are pieces of software that help ensure that a particular AI deployment or a particular technology is doing what we think it should do and not stepping outside of the lines.
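To make the idea of a runtime monitor concrete, here is a minimal Python sketch of one. It assumes the model proposes actions as simple records before anything is executed; the `Action` type, the allow-list, and the spending cap are all hypothetical and chosen only for illustration, not drawn from any system discussed in the conversation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Action:
    """An action proposed by a model, checked before it is executed."""
    kind: str          # hypothetical action name, e.g. "book_travel"
    amount_usd: float  # cost of the action, 0.0 if it costs nothing


# Assumptions for illustration: which actions are in bounds, and a spend cap.
ALLOWED_KINDS = {"search", "book_travel"}
MAX_AMOUNT_USD = 500.0


def monitor(action: Action) -> Tuple[bool, str]:
    """Return (allowed, reason) for a proposed action."""
    if action.kind not in ALLOWED_KINDS:
        return False, f"action kind '{action.kind}' is not on the allow-list"
    if action.amount_usd > MAX_AMOUNT_USD:
        return False, f"amount {action.amount_usd:.2f} exceeds the cap"
    return True, "ok"
```

The monitor sits outside the model and never tries to make the model smarter. It only checks that each proposed action stays inside lines that were drawn ahead of time, which is the "not stepping outside of the lines" role described above.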
A: Do you think that companies will anticipate that they need to put these guardrails in place in order to prevent bad outcomes? Or do you think we're going to have to wait to see whether there are a whole series of bad outcomes for people to realize how important it is that they take preventative steps?
R: Great question, and I really hope it's the former. I hope that we skate to the puck a little bit here where we're anticipating bad outcomes and developing guardrails and policies to prevent them.
In the short term we see plenty of examples. Within the last 24 hours, there have been a ton of examples of Bing chat saying all kinds of awful things. The week before, we saw this Bard moment from Google, with an embarrassing example of their Bard chatbot producing some factual inaccuracies that wiped over $100 billion of market cap off of Alphabet's value.
These consequences are already very real. But they tend to be more or less embarrassing rather than catastrophic in the way that we think of in typical system safety engineering, which is bodily harm, loss of life, or an existential threat to a company.
But those moments are coming, and my hope is that we can prevent some of those very worst outcomes by implementing appropriate guardrails. In a lot of ways we're waiting for our Three Mile Island moment or our Hindenburg moment, which is the terrible outcome that shapes the regulatory environment around the technology for the foreseeable future.
So, to answer your question, there are a bunch of very practical, very reasonable steps that we can take in the short term for companies that are going to try to build this into their businesses to prevent both the embarrassing outcomes and the big ones.
A: We started out the conversation by talking about fatalities from self-driving cars, and you're obviously identifying many other types of problems that might result from AI. Is there a particular term that you're finding that people are using for this type of adverse outcome? Or are they just calling it “bad outcomes” or “embarrassing outcomes”? Is there a name for this yet?
R: In the AI community you find a few different examples. Folks will talk about “hallucinations,” which is an AI algorithm generating facts out of thin air. Other folks talk about “AI alignment,” which is how we make sure that AI is broadly aligned with our long-term interests and goals. In the world of autonomy and automotive, we talk about “safety of the intended functionality.”
There are a few different terms here, but there's a broader category of AI accidents if you will, and I'd be really curious to hear if other folks have terminology around this idea of making sure that AI is not participating in things that it shouldn’t. That it's not departing the lane, if you like.
The kinds of ways in which it can do harm in the world are really, really broad and not just confined to misinformation, moderation, privacy or copyright. It's going to be a huge number of potential bad outcomes. This is going to be a major, major effort both for businesses and for the research community in the coming years.
A: If any members of the audience have thoughts on whether there should be a generalizable term for these kinds of bad outcomes, or if you are using a particular term in your own business, I'm sure Randol would be curious to hear your ideas.
Thinking about category creation, which of course you're doing, is there a name for the category of software that you're building?
R: Leaning on my autonomy experience, I think of this as “lane departure warning for AI.”
Let’s say there are the OpenAIs of the world that are building foundation models that are broadly capable of accomplishing a huge number of tasks, from writing code to participating in dialogue. While those models get better and better, it's still important to have these additional safeguards on top of them.
In the world of autonomy, we would say that we need to make the driver better, we need to make the autonomy itself better. But we also need lane departure and emergency braking. We still need airbags even in a world in which we have very good self-driving vehicles.
This is because there are so many difficult-to-predict ways in which the core autonomy piece can misbehave that we really want to make sure that we have additional safety precautions and risk management solutions in place.
If we fast-forward five or 10 years, there is going to be a whole category of companies that are focused on risk management frameworks for AI, just as there are today for privacy and SOC 2 compliance. The question is when and not if, at least from my perspective.
A: It's interesting that you mentioned these other categories like SOC 2 compliance. It makes me wonder whether there might be federal regulations that inspire or require companies to adopt safety tools, or certain types of industry recommendations that spur people to adopt tools like this. Do you think that will be the case?
R: I think it will be the case in the long run. But I think that we're going to have to wait for that. And there's an opportunity for a lot of harm to happen between now and then.
I do think it's the case that standards will come before regulation, and there are plenty of examples of this, such as ISO standards. I think folks will work to implement voluntary participatory standards that are largely about process and not necessarily about what a particular AI is or is not allowed to do. For example, when you have an AI out in the world, you can collect data from your users to improve the customer experience in a way that actually introduces some really challenging privacy issues.
Likewise, I think there are already examples of businesses deploying generative tools for text and image that present some really thorny copyright issues.
While we wait for the regulation and for the case law to settle out, we likely will see standards sooner rather than later to help companies at least have a story about how they're making sure that these deployments are responsibly driven and responsibly managed through the entire lifecycle of an AI deployment.
A: What would you say to folks who might call you a Luddite or say that you're fear-mongering?
R: For one thing, the signal is pretty clear that in the research community, the folks that are closest to this technology are spending the most time on these safety and risk themes. There is excellent research that comes out of OpenAI, Anthropic, and Google and others that are really on this theme.
Where there may be a bit of a blind spot is that a lot of the work being done in the research community is focused on two themes. One is moderation: how to prevent these largely chat-based AI interactions from spewing hateful, toxic, violent speech. That's super important. The ethics and bias issues attached to that kind of interaction are really important too.
And then there's a lot of research on long-term AI alignment and existential, human-level threats.
There's a broad category in the middle: what happens when generative AI is involved in expense management, is participating in clinical care or other medical settings, or is writing legal contracts. That's not really a moderation problem. It's certainly not AI robots roaming the streets. It's this medium-term challenge of what happens when the generative AI tools that we've already started using begin to do web automation and take actions out in the real world.
Those risks are very real. Companies that are thinking about deploying generative AI, especially now that there's so much attention and so much investment going into the field, really need to think carefully about those risks.
A: What are some examples of companies that are taking this problem seriously and how are they approaching it?
R: They fall into three categories, broadly. The first is the model developers themselves, such as OpenAI or Anthropic. The way that they're tackling this problem is by making the model itself better and more robust. Anthropic has some really interesting work around what they call constitutional AI, which is a means by which you can turn natural language into policy about how you want your AI to behave.
The second category is businesses that are incorporating generative AI today. TripActions is a recent example of this. They recently announced that they are integrating GPT across their product set. In the short term, that may mean things like telling you the weather in the location where you're traveling or referring you to a good restaurant near your hotel. But you can imagine that over time it begins to let you make travel plans, change your travel plans, or ask for opinions in a way that could pose real challenges if the model is not behaving as you expect.
The third category is folks and businesses that are hosting generative content. A recent example of this is Epic Games. One of their businesses is ArtStation, where they host creator content. There was a huge outcry from their creator community when generative AI images began showing up on their boards. And I think Epic is particularly aware of the challenges around these things, having recently been hit with roughly half a billion dollars in FTC penalties and refunds, including for COPPA violations.
I think that they're trying to come up with a broader policy about how to think about generative content both in game production, as well as the case of ArtStation or these other moderation cases, to make sure that they're getting a handle on generative content coming on to their platform.
A: Why wouldn't OpenAI make models that are good enough to solve for this problem?
R: As I mentioned earlier, it’s absolutely the case that the most important safety research in the field is coming from these big model developers. But taking OpenAI as an example, it's pretty clear that their north star is generalized intelligence, which means that I think they are going to continue to focus on making underlying models broadly capable of accomplishing a wide variety of tasks.
But let’s think about AI deployments that are going to happen in regulated environments, such as businesses that are trying to get models to serve some pretty narrow tasks. It could be the case, for example, that a healthcare company is thinking about incorporating a chat agent to help book appointments. Well, OpenAI's model will happily provide clinical advice. And unless you have the right guardrails in place, even though OpenAI has developed a “safe solution,” it may not be safe in a particular context or a particular business.
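The appointment-booking scenario can be sketched as an output guardrail that sits between the model and the patient. The following minimal Python example is an illustration under loud assumptions: the pattern list and the fallback message are hypothetical, and simple keyword matching stands in for whatever classifier or review step a real healthcare deployment would need.

```python
import re

# Hypothetical patterns that suggest a reply has drifted into clinical advice.
# Keyword matching is a stand-in here; it is far too crude for real use.
CLINICAL_PATTERNS = [
    r"\bdosage\b",
    r"\bdiagnos\w*",
    r"\bprescri\w*",
    r"\byou should take\b",
]

# Hypothetical fallback text returned instead of an out-of-scope reply.
FALLBACK = (
    "I can help with scheduling only. "
    "Please contact your care team for medical questions."
)


def guard_reply(model_reply: str) -> str:
    """Pass scheduling replies through; replace apparent clinical advice."""
    lowered = model_reply.lower()
    for pattern in CLINICAL_PATTERNS:
        if re.search(pattern, lowered):
            return FALLBACK
    return model_reply
```

The point of the sketch is the placement rather than the matching logic: the underlying model may be safe in general, while the business-specific rule ("this agent books appointments and nothing else") is enforced by a separate layer the deploying company controls.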
A: If I'm a company that's looking to incorporate generative AI into my product and I want to ensure that there are some kind of guardrails for safety purposes, is there a certain best practice framework that you would recommend?
R: Again, this is an area where we can learn a lot from adjacent industries. There are lots of good examples of various frameworks. NIST (National Institute of Standards and Technology) actually recently released an AI risk management framework guide that's hundreds of pages long. MIT has a really great framework called STPA that was really built around aviation accident analysis, but these tend to be pretty engineering-centric.
From an organizational perspective, I think about these as “the four Ps.” The first P is establishing policies, which really means writing it down. If you're going to deploy a model out into the world, state what that model can do and what the model must not do. Also state what requires human action or human intervention.
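One way to picture "write it down" is a policy captured as data that software can consult. This is a minimal Python sketch, assuming a hypothetical assistant whose work can be named as discrete tasks; every task name below is invented for illustration, and the deny-by-default rule is a design assumption rather than something stated in the conversation.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"  # hand the task to a human


# Hypothetical written-down policy; all task names are made up.
POLICY = {
    "may_do": {"answer_faq", "suggest_restaurant"},
    "must_not_do": {"give_legal_advice", "share_customer_data"},
    "needs_human": {"issue_refund", "change_booking"},
}


def decide(task: str) -> Decision:
    """Map a task to a decision using the written policy."""
    if task in POLICY["must_not_do"]:
        return Decision.DENY
    if task in POLICY["needs_human"]:
        return Decision.ESCALATE
    if task in POLICY["may_do"]:
        return Decision.ALLOW
    # Assumption: anything the policy does not mention is denied by default.
    return Decision.DENY
```

The three sets mirror the three things the policy should state: what the model can do, what it must not do, and what requires human intervention.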
The second P is running what's called a pre-mortem, which is anticipating the catastrophic outcomes and by doing so identifying risks. And that really needs to span product, engineering, leadership, legal and compliance.
The third P is protections. A lot of what we've been talking about is implementing guardrails that keep models in line with the policies that we established in step one. Today, this is all about moderation. In the future this is going to be inclusive of misuse, compliance, copyright, and other risks as well.
The fourth P is all about process. Process here means that when you deploy a model, you're not done. In some ways the work is just beginning. There are very real risks that are introduced specifically by this technology, for example by in-context learning and reinforcement learning from human feedback, where a model evolves over time.
Just because you ensured out of the gate that it obeyed a bunch of different rules doesn't mean that it will continue to do so. Process also includes things like monitoring and incident response.
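The ongoing-process idea can be sketched as a recurring audit: the same fixed set of checks that passed at launch is re-run against the live model, and any failure is handed to incident response. The Python below is a minimal sketch under stated assumptions. The model is represented as a plain function from prompt to reply, and the two stub models, the single test case, and the `no_clinical_advice` check are hypothetical stand-ins.

```python
from typing import Callable, List, Tuple

# Each case pairs a prompt with a predicate the reply must satisfy.
Case = Tuple[str, Callable[[str], bool]]


def audit(model: Callable[[str], str], cases: List[Case]) -> List[str]:
    """Re-run policy checks and return the prompts whose replies failed."""
    failures = []
    for prompt, is_acceptable in cases:
        if not is_acceptable(model(prompt)):
            failures.append(prompt)
    return failures


def no_clinical_advice(reply: str) -> bool:
    """Hypothetical check: the reply must not mention dosage."""
    return "dosage" not in reply.lower()


CASES: List[Case] = [("What dose should I take?", no_clinical_advice)]


def model_at_launch(prompt: str) -> str:
    """Stub for the model as it behaved on day one."""
    return "I can only help with scheduling."


def model_after_drift(prompt: str) -> str:
    """Stub for the same deployment after its behavior has changed."""
    return "A typical dosage is two tablets."
```

Running `audit(model_at_launch, CASES)` returns an empty list, while `audit(model_after_drift, CASES)` returns the failing prompt. In practice a check like this would run on a schedule, with failures routed to whoever owns incident response.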
Turning it back to you: what do you think are examples of other areas that we may look to for precedent here?
A: This has to be a board-level concern. Boards have to be aware of the risk of a company adopting generative AI, and they need to monitor the guardrails for ensuring safety from generative AI in the way that they would monitor processes related to cybersecurity or financial fraud.
Using cybersecurity as an example, we've seen many catastrophic breaches over the years that have affected millions of people. It's often hard to quantify the impact of a breach like that, because it could be company-destroying. It could destroy your brand, your customer relationships, and potentially other things.
Because there have been enough of these cybersecurity breaches, boards now realize that they need to have a cybersecurity expert on the board. That’s somewhat of a standard requirement when boards are looking to fill open roles or think about refreshing their board members.
You could imagine a future in which boards are required to include a director who is experienced in generative AI, understands the risks from generative AI, and can help monitor the systems that ensure safety.
So, maybe we do have some precedent actually for issues like this. And then it’s a matter of people like you intelligently designing these software programs and systems that allow companies to really manage the risk.
R: Do you see potentially some risk here that this actually creates more space for startups to be first to the punch in generative AI deployments? Maybe this is because they have a different risk profile and may not have to worry about the reputational risk that a large incumbent does.
A: That's very interesting. Definitely in talking with a lot of startups, I know that many of them are thinking that using generative AI in their products is going to become table stakes. At the beginning, it might be a way to carve out a space, but over time it'll just become a requirement for selling your software, because your customers will expect it.
You're right that maybe they adopt it because compared to large companies, the risk of something going wrong is less of an issue. You can move fast and break things, as we've seen startups do before. On the other hand, you might have some large companies that see it as an opportunity for innovation and to build market share versus their competitors in adopting generative AI. And if it does become table stakes, then it will force large companies to do that as well.
R: Yeah, it's really interesting. And your example, moving fast and breaking things, I think that really gets to the underlying mission here, which is there are just too many examples of technology deployments happening really, really quickly, and then we come to understand the deleterious effects of those deployments only after they've reached scale.
Social media is a great example of this, and this technology adoption curve feels faster than certainly anything I've seen in my career. So, as a business and a technology community, it's really important to be deliberate about anticipating what we want that future to look like and trying to put the right pieces and technologies in place to ensure that that happens.
A: Randol, thank you so much for joining us today. This was an awesome conversation, and I'm really excited for all the buzz that this will create.
R: Thank you so much, Allison. I really appreciate the conversation as well.