Allison Pickens' Newsletter
Allison Pickens' Podcast
Mining The Hidden Gold In Your Visual Content

A conversation with Coactive AI's CEO Cody Coleman

Every time your users post a product review with a photo or video, they’re contributing to the vast treasure trove of user-generated content. But companies have barely scratched the surface in leveraging that visual content for marketing purposes. Why aren’t we mining the data across those images and videos?

In my latest podcast episode, I discuss how to bring structure to this kind of unstructured data, in a conversation with Cody Coleman, CEO of Coactive AI. Coactive AI recently announced its combined Series A and Seed funding from a16z, Bessemer Venture Partners, and others.

  • What are the best practices for data teams to work with visual data?

  • What mistakes are data teams making?

  • Is this kind of unstructured data a “data problem” or a “marketing problem”?

You can listen to the podcast or read the lightly edited transcript below. Let’s dive in!

Transcript

Allison: Cody, I'm so excited to have you on the podcast today to talk about unstructured data and visual data. Thanks so much for joining us today.

Cody: Thanks for having me. I'm super excited to chat.

A: Let’s start out with your personal experiences working with visual data. How were you inspired to tackle this problem? What challenges have you faced before?

C: I've spent the past decade working at the intersection of data systems and AI, before I even knew it. I’ve been working with all forms of data from tracking log data to image and video data to finance data to educational data. But oddly enough, there was a through-line of visual data in all of that.

I worked as a PM, an associate product manager at YouTube Analytics. I had done analysis on educational videos for edX and massive open online courses. Most recently, I was doing my PhD at Stanford as part of the DAWN Project. The mission of the DAWN Project was trying to democratize machine learning. We had the fortunate opportunity of working with Pinterest and Meta. We were able to see firsthand what the state-of-the-art looks like for systems and tools. And we worked with unstructured data, images, and videos at scale and actually made it useful.

A: You've had many years of experience working through these challenges. Why do you think now is the right time for most data teams to tap into their unstructured data?

C: You need AI and systems in order to be able to work with unstructured data like this.

Speaking first about tabular data: 10 million rows of tabular data is about 40 megabytes. For the purposes of an analogy, let’s say that's like the size of Lake Tahoe, which is about 400 square kilometers.

If you have 10 million documents, that's about 40 gigabytes. That's three orders of magnitude larger, and would be roughly equivalent to the Caspian Sea, which I believe is close to 400,000 square kilometers.

For visual data, 10 million images is about 20 terabytes, if you look at a dataset like Open Images. That is equivalent to the size of the Pacific Ocean—roughly 150 million square kilometers.
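Editor's sketch (not part of the conversation): the size comparison above can be checked with a little arithmetic. The snippet assumes decimal units (1 MB = 10^6 bytes) and takes the three totals exactly as quoted; the per-item figures are derived from those totals, not measured.

```python
# Back-of-the-envelope check of the sizes quoted in the conversation.
# Assumption: decimal units (1 MB = 10**6 bytes), 10 million items each.
MB, GB, TB = 10**6, 10**9, 10**12
ITEMS = 10_000_000

datasets = {
    "tabular rows": 40 * MB,
    "documents": 40 * GB,
    "images": 20 * TB,
}

def bytes_per_item(total_bytes, items=ITEMS):
    """Average size of one item, given the total size of the collection."""
    return total_bytes / items

def ratio(larger, smaller):
    """How many times bigger one quoted collection is than another."""
    return datasets[larger] / datasets[smaller]

for name, total in datasets.items():
    print(f"{name}: {bytes_per_item(total):,.0f} bytes per item")

print("documents vs tabular:", ratio("documents", "tabular rows"))
print("images vs documents:", ratio("images", "documents"))
print("images vs tabular:", ratio("images", "tabular rows"))
```

On these figures, the documents are 1,000 times larger than the tabular data, and the images are 500 times larger again, or 500,000 times the tabular data.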

To be able to process that, you need to have the infrastructure and tooling to scale up to that amount of data. I was able to see this be developed firsthand with benchmarks that I created at DAWNBench and MLPerf. We saw the industry really start thinking about high performance deep learning. That systems component is now possible and has become a bit more standardized to be able to work at this scale with large amounts of unstructured data.

The other key component is that with AI, we have this massive movement around foundation models. Foundation models can fundamentally transform this raw analog pixel data that humans can understand into a format that's easy to understand for computers and for machines. We're seeing these great models that come out pre-trained, whether it be from OpenAI or Hugging Face or others. They operate as something of an analog-to-digital converter for unstructured data. They make it into a more compressed, easier-to-work-with format for computer systems and machines.

Bringing together the systems component, the advances in high-performance deep learning, and the advances in foundation models in AI creates a perfect trifecta to actually make this data useful.

Then, very broadly as a macro trend, unstructured data is taking up more and more of our everyday life. You and I are talking on a Zoom call right now. With eCommerce, we're making purchasing decisions based off of images and videos. Then you think about the way that we communicate, we've gone from text to things like Instagram and TikTok.

Right now 80% of internet traffic is video data. That's only going to grow as this whole wave of generative AI makes content really easy to generate, whether it's Stable Diffusion or things like ChatGPT.

A: Switching gears: Let's say I'm leading a data team and I'm really interested in leveraging Coactive, your product. Are there certain other products that I need to have purchased and started using well in order to take advantage of your product?

C: Part of what we wanted to do with Coactive is to make it super easy for existing data teams to be able to work with their data. We wanted to meet people where they are. A lot of organizations store their image and video data in the cloud on an object store like S3 or Google Cloud Storage.

As long as you have your data in the cloud, we make it super easy with Coactive. You can just point us to an S3 bucket, and we can ingest all of that data. We'll take care of the modeling piece to generate a representation of embeddings and vectors for you, so that you don't have to think about it.
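Editor's sketch (not part of the conversation): once images and text are represented as vectors in the same embedding space, finding images for a text query reduces to ranking by similarity. The example below is a minimal illustration under stated assumptions, not Coactive's API. The file names and three-dimensional vectors are made up, and a real system would get its embeddings from a pre-trained model and use an approximate nearest-neighbor index rather than a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, image_vecs, top_k=2):
    """Return the names of the top_k images most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in image_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Hypothetical embeddings; real ones have hundreds of dimensions.
image_embeddings = {
    "red_sneaker.jpg": [0.9, 0.1, 0.0],
    "office_chair.jpg": [0.1, 0.9, 0.1],
    "running_shoe.jpg": [0.8, 0.2, 0.1],
}

# Stand-in for the embedding of a text query such as "sneaker".
query_embedding = [1.0, 0.0, 0.0]

results = search(query_embedding, image_embeddings)
print(results)
```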

Right out of the box, you can do multimodal search. With text, you can just search all of your image catalog. But you may want to get to something that's more domain-specific. A lot of companies have concepts that really matter to them. So, we make it really easy to define domain-specific concepts by using this procedure of active learning. We suggest examples of what is best for you to label.
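Editor's sketch (not part of the conversation): the conversation does not say how Coactive picks which examples to suggest. One common active-learning strategy is uncertainty sampling, where the examples the current model is least sure about are sent to a human for labeling first. The scores and file names below are hypothetical.

```python
def uncertainty(prob):
    """1.0 when the model is undecided (prob = 0.5), 0.0 when it is certain."""
    return 1.0 - abs(prob - 0.5) * 2.0

def suggest_to_label(scores, budget=2):
    """Pick the `budget` examples whose predicted probability is closest to 0.5."""
    ranked = sorted(scores, key=lambda name: uncertainty(scores[name]), reverse=True)
    return ranked[:budget]

# Hypothetical model scores: probability that each image shows the
# domain-specific concept being defined.
model_scores = {
    "img_001.jpg": 0.98,  # confidently positive
    "img_002.jpg": 0.52,  # nearly undecided
    "img_003.jpg": 0.05,  # confidently negative
    "img_004.jpg": 0.40,  # fairly uncertain
}

suggestions = suggest_to_label(model_scores)
print(suggestions)
```

Labeling the uncertain examples and retraining tends to sharpen the concept faster than labeling images at random, which is why the labeling budget is spent there first.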

We really try to make it an end-to-end process, from where the data is stored all the way to using it for search and analytics, deriving insights, and plugging into the tools you already have. If you're already using SQL through a tool like Spark SQL (or really any form of SQL), we can plug right into it.

A: So it sounds like there's not a particular tech stack that you are endorsing across a number of different companies. You're, as you said, just happy to meet data teams where they are.

C: Exactly. One thing that's unique about Coactive is that we really wanted to make it easy for data teams to leverage this data, rather than having to spin out a whole complicated ML team or extra infrastructure just to be able to work with that data.

A: Are there particular industries that you think would benefit most from leveraging unstructured data?

C: At the beginning of Coactive, we saw a wide variety of different industries, from autonomous vehicles to medical imaging. But we found that people really resonate with this in consumer retail, media, and media technologies.

You have the traditional media companies, such as Paramount or Comcast. But there are also user-generated content platforms or community platforms like Reddit, Discord, and Fandom. All of these platforms have a tremendous amount of image and video data that's being uploaded. They have no clue what's in this data. Being able to unlock it and understand whether this is good or bad content, or whether it’s appropriate for this community, is a really hair-on-fire problem for ensuring trust and safety.

On the other side of consumer retail, you have eCommerce platforms, which also have an increasing amount of visual content. Even more traditional brands like Nike or Steelcase (the chair company) have a tremendous amount of image data from just taking photos of their products. Trying to find that right photo of that awesome Air Jordan or that Steelcase chair is really hard for those companies.

A: I know that today you announced a big milestone in the development of your company. I think also in your category, it was a big funding announcement. Can you talk a little bit about that and also maybe indicate why this is an important milestone, generally, for the market that you're operating in?

C: Today is a huge day for us, coming out of stealth and announcing our Series A and seed funding. First off, we're extremely fortunate and grateful for our partners, Elliott Robinson at Bessemer Venture Partners and Martin Casado at a16z. They're just phenomenal people who have been able to help us along the way.

We developed the product and actually built out the infrastructure, but now we're shifting gears to think about how we create a repeatable sales motion and a go-to-market motion.

We had some initial users and we developed a very high quality experience for them. Again, we made it super easy for existing data teams to leverage the data. So, we're trying to figure out how do we actually scale that up to more industries and make it a much more repeatable motion, while also maintaining that high level of quality and experience that our customers already know and love.

I think this will ultimately help establish the category as we figure out that language and how we connect this new category to the problems that businesses have now.

We're really in a state of category creation. A lot of people haven't had the tools to even think about working with image and video data before. We are figuring out that messaging and how we take customers from where they are today to what's possible with Coactive, in terms of unlocking the power of unstructured data.

A: You're a very mission-driven company. You've written a lot about algorithmic bias. Certainly in dealing with visual data in particular, you have to be thinking about this. A lot of people are concerned about the ways in which AI is being informed by the data sets they're collecting with respect to images. How are you looking to tackle algorithmic bias as a company? And more tactically, what best practices would you recommend to companies that are trying to avoid algorithmic bias?

C: Combating algorithmic bias is super important to Coactive. More broadly than that, we want to make sustainable AI. We want to make AI that actually will benefit society and the environment for the long-term and minimize any kind of detrimental effects.

With all the advances and the amazing things like ChatGPT, there are also a whole bunch of different concerns that come up. They are very valid and honest concerns that more people need to be thinking about. Fundamentally what we're doing at Coactive is trying to serve as a role model for how to think about and bring that broader perspective of sustainable AI into everything that we do.

One of the first things that we did was we found this amazing project called Dollar Street, which was created by the nonprofit Gapminder, to mitigate human bias. It took photos of household items from families around the world and organized them on a street, based off of socioeconomic income so that you could see actually the diversity of everyday life for people around the world. It had a profound impact.

Because we're in the US, we might think we're middle class. In actuality we're at the upper 1% globally. Our daily view is not representative of the entire global population. This isn't a problem that just happens with humans, this is the same thing that is happening with machine learning models.

We've seen models show bias, such as the classic example of image models labeling Black people as gorillas.

The opportunity to convert that dataset into one for the machine learning community that would allow them to identify, evaluate, and improve their models was so compelling. My co-founder and our team were spending nights and weekends working on this dataset, bringing it together and continuing to support it. We all want to try to reduce the barrier, reduce the cost, reduce the energy for being able to use state-of-the-art machine learning.

We need to be aware and take accountability that as ML people or people using artificial intelligence in practice, we have a responsibility to think about what the repercussions of scaling out AI are and how that can potentially go awry.

There's been amazing work in the academic area around this, as far as best practices and ethics. As organizations, we need to do the same thing because there's a tendency to just push it onto the side and claim we're not intentionally doing anything bad. But I think the first steps are just understanding the problem, then mitigating the problem and being able to measure it.

Then we need to use datasets like Dollar Street, update models, and follow the best practices. These things are improving quite rapidly as AI develops. We saw this with OpenAI's GPT-3 versus ChatGPT and GPT-4. As these models have improved, they've not just gotten better in accuracy, they’ve also reduced bias substantially. Being adaptive and keeping up with the best practices can help mitigate problems and vulnerabilities, and reduce bias overall.
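Editor's sketch (not part of the conversation): one simple way to "measure it" is to break a model's accuracy down by group and look at the gap. The records below are invented purely for illustration. They are not drawn from Dollar Street or from any real model, and the group names and labels are assumptions.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, label, prediction in records:
        total[group] += 1
        correct[group] += int(label == prediction)
    return {group: correct[group] / total[group] for group in total}

# Invented example: household-item predictions, grouped by income bucket.
records = [
    ("low income", "stove", "stove"),
    ("low income", "stove", "campfire"),
    ("low income", "toilet", "toilet"),
    ("low income", "soap", "food"),
    ("high income", "stove", "stove"),
    ("high income", "toilet", "toilet"),
    ("high income", "soap", "soap"),
    ("high income", "stove", "stove"),
]

per_group = accuracy_by_group(records)
gap = max(per_group.values()) - min(per_group.values())
print(per_group, "gap:", gap)
```

Tracking a gap like this across model versions gives a concrete number to check whether an update actually reduced the disparity.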

A: That's really great advice. What other mistakes do you think data teams tend to be making with respect to their visual data?

C: Visual data and content in general is super powerful and can be useful for lifting sales. We've seen it in the percent of users who say that user-generated content influences their decisions to make purchases. But where data teams go wrong is that they don't do anything with the data. It just ends up stored on S3.

If they just have it on S3, that makes it really hard to work with. That’s because it’s on a network file system, which is super slow for processing large collections of small files. The first thing that you can do is coalesce all of these small files together into something that's easier to work with and process.
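Editor's sketch (not part of the conversation): one common way to coalesce many small files is to bundle them into a smaller number of tar archives ("shards"), so that each read fetches a large sequential chunk instead of thousands of tiny objects. The function name, shard naming scheme, and shard size below are assumptions for illustration. It works on local directories, and uploading the resulting shards to an object store is left out.

```python
import tarfile
from pathlib import Path

def pack_into_shards(src_dir, out_dir, files_per_shard=1000):
    """Bundle the files in src_dir into numbered .tar shards in out_dir.

    Returns the list of shard paths that were written.
    """
    files = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    shards = []
    for start in range(0, len(files), files_per_shard):
        shard_path = out / f"shard-{start // files_per_shard:05d}.tar"
        with tarfile.open(shard_path, "w") as tar:
            for f in files[start:start + files_per_shard]:
                # Store only the file name, not the full local path.
                tar.add(f, arcname=f.name)
        shards.append(shard_path)
    return shards

# Example usage (paths are hypothetical):
# pack_into_shards("product_photos/", "product_photo_shards/", files_per_shard=5000)
```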

Another odd problem that I see with data teams is them trying to do everything themselves and reinvent the wheel. It just ends up taking a lot of time and resources. They under-appreciate how difficult it is to actually work with visual content.

Going back to the sheer scale of it, if you're used to working with tabular data, that's a data lake. But unstructured data is a data ocean. It’s foolhardy to assume that the same vehicles that would get you across a lake would also work for an ocean.

Data teams sometimes don't appreciate the scale and how difficult it is to work with. It's partially because there aren’t tools out there. But at Coactive, our whole mission is to actually fix that problem and to make it easier, so that people aren't trying to reinvent the wheel. They don't have to get the same battle scars that we did from working with this data. They can just focus on deriving value for their business.

A: I'd love to talk about how you derive value from unstructured data. Certainly I think much of what you said would naturally appeal to data teams, but I imagine that marketing teams would also benefit from visual data being leveraged more. What do you think is the argument exactly for a marketing team in terms of why they should use your product or generally tap into their visual data more?

C: Maybe it’s indicative of companies working with visual data overall, but in terms of marketing teams, the process right now is fairly manual. If you're a marketing team for a sports team and you want to do a highlight reel of three-pointers or something like that, you're spending a tremendous amount of time and human resources combing through all of this data to find the right clips for the campaign. By using a tool like Coactive, you can replace the human work of toiling through all this data and get the information in an instant.

Maybe you're a grocery store, Easter's coming up and you want to get your marketing campaign ready. Easter happens every year. You've probably taken photos of Easter for previous campaigns. Rather than having to go out and do a completely new photo shoot, if you have the ability to quickly search through it, you could search and find those relevant photos.

Users and customers tell us that it's so hard to do that search through their archive of data that they end up spending thousands of dollars to redo a photo shoot. They redo this process and waste time. So, reducing the time that it takes to deliver these results is a massive win for marketing teams and larger enterprises. We see the same thing for content moderation and trust and safety use cases as well, which are very human-driven.

Especially in the trust and safety case, there's material out there that should never see the light of day. Subjecting human beings to that is just horrible. By automating that away, you end up saving content moderators a lot of mental stress and anguish.

A: It can be hard to quantify the impact of reducing risk, which I think is what you're speaking to. It also sounds like part of the ROI equation is reducing cost from, as you said, people dedicated to doing photo shoots or otherwise manually combing through data.

Do you think there's a revenue argument as well for optimizing the way you use visual data? And if so, I'd be curious to know if you've seen any case studies about this or developed any ROI analyses yourself?

C: In eCommerce, it's quite interesting right now. A lot of big eCommerce companies and big brands at the scale of Nike or West Elm will see a sales lift by having the right video, such as user content, on the product page. Seeing another human being wear the product gives a much more human feel for what the product is going to be like.

Studies have shown the lift is somewhere around 20% to 30% in terms of click-through rate and purchase by having the right user content next to a product. As I mentioned earlier, 80% of customers say that user-generated content impacts their decision to buy a product. So by having the right content for the right person at the right time, that can directly improve your revenue by higher sales conversions.

A: Cody, to wrap up, is there one tip that you would give to data teams for how they should better use their visual data?

C: I'm biased. I think that Coactive will be that one tip to save you time and effort. But invest in and look at AI.

AI and unstructured data go hand-in-hand for being able to unlock value. Data and AI together are the way forward, rather than being separate topics. We're seeing this across the board with more conferences, data systems and databases bringing AI and data together. On the AI side, we see data-centric AI.

Think about your data and your AI strategy together, holistically and comprehensively. That’s the one tip that I would give data teams.

A: Thank you so much for joining us today, Cody. This was awesome.

C: Thank you so much.
