What does it mean to be a "data person" these days? "Data scientist" conjures up images of PhDs hunkering down on advanced statistical analyses. There's an eliteness to the title. Folks who are merely spreadsheet whizzes, have learned some SQL, and are analytically minded would never consider themselves to be data scientists. But their work is tremendously valuable. And they need tools to help them.
I sat down with Barry McCardel, the founder of Hex (which I invested in), to learn how he's supporting the large population of the "analytically technical."
Why did you decide to start Hex?
Hex is the product that I wish I had at every point in my career -- in the research lab where I worked in college, in consulting, at Palantir, and then at another startup. I'd often find myself building insane Excel spreadsheets with dozens of tabs, macros, and buttons. They were essentially data products, and I even released them to my colleagues as though they were products.
Later I joined a startup, where I worked closely with our data team. We had a modern data stack that included Snowflake, dbt, and Looker, but our data team was still spending a lot of their time on one-off SQL queries and local Python notebooks. We were building what were essentially brittle Rube Goldberg machines in order to get data from point A, run the model, then move the results to location B and let someone look at it. A lot of insights were being communicated through slides, screenshots, and Slack. The data-sharing process was fragmented -- a total mess.
So I actually started this journey looking for a product that solved the problem, but I couldn't find it. I couldn't stop thinking about it and wound up starting Hex with my co-founders Glen and Caitlin, to build the product that we thought data teams deserved to have.
Why does Hex improve how data teams work?
Data teams want their work to be impactful. They're expensive teams to hire and retain, so they need to show ROI to the business. If you're delivering your work as a one-off CSV uploaded to a Google Sheet once a week, and it's inflexible and not live, your stakeholders are more likely to be frustrated than delighted. Data teams want to show they're making an impact, and it's our role to help them do that.
As an analogue, Design teams used to have the same problem. They were all working in local Photoshop or Sketch files, collaborating through shared drives and attachments. And then they would share their work with stakeholders like Product or Marketing through emailed PDFs. Those people would then put stickies on the PDFs and send the file back, and the cycle would start anew. The process was full of friction. But then Figma emerged as a platform for Design as a function, both to collaborate among themselves and also, crucially, to interface with and make an impact on the rest of the organization.
The analogy with data teams today is painfully clear. It’s super common to see data teams generate PDFs of their Python notebooks and email them around. But Hex enables data teams to publish their work in an app, with live multiplayer, comments, and real versioning. It’s a much more effective way to collaborate and communicate, and ultimately have a real impact.
How do you see Hex fitting into the modern data stack?
The rise of the cloud data warehouse enabled people to store data flexibly, at large scale, and make it super accessible to a variety of tools. Redshift, BigQuery, and Snowflake were absolute game changers. Then the reinvention of the ETL / ELT stack with Fivetran and others made it easy to get data from the source into your warehouse. And dbt made it easy to transform and refine that data and make it usable. From there, people would historically use legacy BI tools like Looker or Tableau to build traditional dashboards.
We're complementary to those tools, filling the gap for the deeper, more exploratory, advanced analysis work that data teams want to do. Data teams are saying, "Great. We've got all this data at our fingertips. How do we leverage it to get insights and drive impact?" We see Hex as the answer here: as a collaborative data workspace, it is built to make the data in your warehouse actually useful and impactful to the rest of the organization.
How are data teams evolving?
First, from an organizational perspective, data teams used to be organized as ivory tower R&D groups. They'd sit by themselves and do interesting work. But often it was hard to connect their work back to the business.
Today it's becoming more common for individual data scientists and analysts to be aligned with and deeply embedded within the functional groups that they support. The data folks are part of the planning cycle for those functions, and their OKRs are linked. The data team's day-to-day work is driven by the needs of the functional group, even as the data folks work together to design their infrastructure and best practices.
Second, we're seeing an evolution in how companies perceive the value of data teams. "Data science" can mean many different things. The term typically refers to two different disciplines: analytics, where the end deliverable is an insight, and machine learning, where it’s a trained model or API endpoint.
Both disciplines can be really valuable, although a lot of the fetishism around machine learning has cooled, and we've come to a deeper appreciation for thoughtful data analysis; 80% of the time, it gets you where you want to go. In high-functioning data teams, the most impactful work that data scientists or analysts produce has nothing to do with bleeding-edge deep-learning models. And if you talk to 10 people with the title "Data Scientist," I'd bet that six or seven of them are probably doing analytics work.
Finally, outside of data teams, there's a growing data literacy. We call this population the “analytically technical.” These are people who don’t have CS or stats backgrounds, but they are highly data literate. They can write some SQL or Python, or often they’re writing “code” in a spreadsheet because that’s the best environment they have.
In general, we're finding that many descriptions of jobs outside of data teams require analytical skills. For example, a friend of mine just started in a finance role, and he had to do a SQL test to get the job. A few years ago, it would have been an Excel test instead. Some of our most ardent users are actually product managers or finance folks who use Hex to do really interesting data work, despite not having "data" in their job title.
As someone who did a stint in finance early in my career, I find this fascinating. How do you serve this kind of broad population?
We’re betting on this population continuing to grow. One of our goals with Hex is to build a product with a “low bar and high ceiling”: something that empowers and connects users across the technicality spectrum, and helps them learn and collaborate together.
This ties into what we announced recently with Hex 2.0. We rewrote the traditional code notebook execution engine to be both more approachable for newer users and very powerful for more advanced folks. It actually works more like a spreadsheet, where the data updates “reactively”; it’s more natural and intuitive for everyone.
There’s a lot more on our minds here, and we'll continue to focus on building a product to empower this growing population.
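For readers curious what spreadsheet-style "reactive" execution means in practice, here is a minimal sketch in Python. It is an illustration only, not Hex's engine: the `ReactiveNotebook` class, its method names, and the example cells are all hypothetical. The idea is that each computed cell declares which cells it reads, and changing any cell automatically recomputes everything downstream of it, in dependency order. It assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter


class ReactiveNotebook:
    """Toy model of spreadsheet-style reactive cells (hypothetical, not Hex's engine)."""

    def __init__(self):
        self.values = {}    # cell name -> current value
        self.formulas = {}  # cell name -> (function, names of input cells)

    def set_input(self, name, value):
        """Set a plain input cell, then recompute everything downstream of it."""
        self.values[name] = value
        self._recompute(changed=name)

    def define(self, name, func, inputs):
        """Define a computed cell whose value is func(*input values)."""
        self.formulas[name] = (func, tuple(inputs))
        self._recompute(changed=name)

    def _recompute(self, changed):
        # Dependency graph: each computed cell maps to the cells it reads.
        graph = {name: set(inputs) for name, (_, inputs) in self.formulas.items()}
        stale = {changed}
        # static_order() yields each cell only after the cells it depends on.
        for name in TopologicalSorter(graph).static_order():
            if name not in self.formulas:
                continue  # plain input cell, nothing to compute
            func, inputs = self.formulas[name]
            needs_update = name in stale or stale.intersection(inputs)
            inputs_ready = all(i in self.values for i in inputs)
            if needs_update and inputs_ready:
                self.values[name] = func(*(self.values[i] for i in inputs))
                stale.add(name)  # cells that read this one must update too


nb = ReactiveNotebook()
nb.set_input("price", 10)
nb.set_input("quantity", 3)
nb.define("revenue", lambda p, q: p * q, ["price", "quantity"])
nb.define("total", lambda r: r + 5, ["revenue"])  # revenue plus flat shipping

nb.set_input("quantity", 4)  # "revenue" and "total" update on their own
print(nb.values["revenue"], nb.values["total"])
```

In a traditional notebook, changing `quantity` would leave `revenue` and `total` stale until someone re-ran those cells by hand, in the right order. In the reactive sketch, the dependency graph does that bookkeeping, which is the spreadsheet behavior described above.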
***
If you’d like to learn more about Hex, check them out here.
If you’d like to learn more about what I’m up to, see my website here.