June 12, 2023

How Weights & Biases found product-market fit

Sandhya Hegde
Editor's note: 

SFG 23: Lukas Biewald on dev-focused ML ops

In this episode of the Startup Field Guide podcast, Sandhya Hegde chats with Lukas Biewald, CEO and Co-founder of Weights & Biases about the company's path to product-market fit. Weights & Biases is a developer-focused MLOps platform last valued at over $1 billion. Their platform helps developers streamline their ML workflow from end to end. Weights & Biases currently has over 700 customers using their product to manage their ML models.

Be sure to check out more Startup Field Guide Podcast episodes on Spotify, Apple, and YouTube. Hosted by Unusual Ventures General Partner Sandhya Hegde (former EVP at Amplitude), the SFG podcast uncovers how the top unicorn founders of today really found product-market fit.

If you are interested in learning more about the topics we discuss in this episode, please check out our resources on defining an ideal customer profile, building a product-led growth strategy, and working with design partners.

TL;DR

  • The founding insight: Lukas and his co-founders wanted to build a product for ML practitioners because the available options at the time were largely top-down. Lukas and his founding team had to convince investors that there was a need for new machine-learning tools that could be delivered using a product-led growth strategy. 
  • Early design partners: The founding team partnered with OpenAI and Toyota who were willing to meet with them weekly to help them iterate on the early version of the product. 
  • Early use cases: The first pain point that resonated with early customers was experiment tracking. When people logged their visualizations to TensorBoard, they couldn’t share them easily because everything was logged locally. The Weights & Biases product helped solve this pain point. 
  • Iterating to product-market fit: The founding team tested their product with a group of 40 people in a class to test out and iterate the onboarding process for the product. There was also early user feedback that the product’s UX was not good enough, so the team spent a week cleaning up the product’s UX. The team slowly saw a steady increase in users, and they saw the same users coming back over and over again.
  • The core ICP: Weights & Biases built their product for a persona titled “ML practitioner” whose main job is building and deploying ML models. The team stayed focused on this persona and made it their “true north” to guide their product development efforts. 
  • Growth tactics: The team increased awareness and growth by building a “reports” feature for the product, and by creating SEO-optimized, technical, long-tail content that has become popular among their core audience.

Episode transcript

Sandhya Hegde:

Welcome to the Startup Field Guide, where we learn from successful founders of unicorn startups how their companies truly found product-market fit. I'm your host, Sandhya Hegde. And today, we'll be diving into the story of Weights & Biases. Started in early 2018, it is a developer-focused MLOps platform, last valued at over $1 billion. Now, if you're trying to put an ML model into production as a part of your core product experience in a company, you need to be able to manage it effectively, right? You need to be able to experiment with it, track different versions of it, and collaborate over it across all these different people in the company who want to make sure that the model is doing what it's intended to do.

This is actually a really hard problem given the complex nature of machine learning models for both data scientists and developers. So over half a million people today from 700 companies use Weights & Biases to manage their ML models. And our guest today is their CEO, Lukas Biewald. Lukas, welcome to the Field Guide.

Lukas Biewald:

Thanks. Thanks for that nice introduction. That was a great overview of Weights & Biases.

Why Lukas Biewald started Weights & Biases

Sandhya Hegde:

I'm glad! And I think the reason I'm so excited to have you on the Field Guide and talking about this is that you've been in this space for over 20 years now, which makes you one of the veterans. You started working in Stanford's AI Lab back in 2003. You started your first company, which also helped people get ML models into production, in 2007. And so I'm curious if you could share a little bit more about your journey so far. What inspired you to start CrowdFlower and then Weights & Biases, and how have you looked at the big picture of AI evolving in the past couple of decades?

Lukas Biewald:

Sure. Well, it's funny, when I was a little kid, I was thinking about AI and I was thinking what maybe we all think of just like, "Wow, what a cool thing that computers can learn to do stuff and you don't have to program them." But I thought in my head, I was imagining, "Okay, this is probably the last job that people will do." I'm actually not sure if that's true anymore, but that just seemed so interesting and exciting. And then I went to Stanford and I was studying AI at a really different time when it wasn't so exciting. Things weren't really working. And when I was in the AI lab, it's funny because so many people there went on to be really famous and successful, but it definitely didn't feel like that at the time. It just felt really frustrating. Nothing was working. I was trying to do these toy problems and I couldn't even get the toy problems to work very well.

And so I actually left to start a company really out of frustration that it felt like all the problems we were working on were just problems where there happened to be data available. And so what CrowdFlower did was it collected training data for people trying to build machine learning models. And it was really early. I really started it with this idea that this would just be useful. Back then, Y Combinator had just started, but there wasn't the same sense that lots of people should start companies. I remember in my first decks being advised to remove any mention of AI from the pitch deck, which is so funny compared to now.

Sandhya Hegde:

Right, because there were these waves of skepticism where people were like, "Oh, yeah, this is finally going to work and Lukas is perfect. He has a background in math. He has a background in CS. He'll figure it all out," but then lots of companies were struggling, like you said, to even make the toy demo an effective demo.

Lukas Biewald:

Yeah, totally. So yeah, at the time I had no idea the difference between sales and marketing or how that might work. It took me over a year to raise the initial money for CrowdFlower. We raised 100k and I was like, "What are we going to do with all this?" We had 100k at a $2 million valuation. So that was a really long journey and I learned a lot and changed a lot, got married, had a kid. And when CrowdFlower sold, I was thinking, "What's the company that I really want to do?" And actually, one of the things that had happened to me was I'd gotten interested again in machine learning and I was feeling like I'd spent so much time running companies that I was becoming a dinosaur. Some of my core assumptions were being challenged.

For a long time, we just thought training data was the most important thing and the different kinds of models don't really matter. And people were constantly claiming, "I have a better model that works way better," and they kept being wrong, and so there was a lot of skepticism about neural networks and deep learning. But then I remember when AlphaGo beat the best Go player, it was a real wake-up call to me where I was just like, "You know what? This is real and this is totally different and it's so exciting." And so I actually worked with a new co-founder and my old co-founder, and we really wanted to make a product for the machine learning practitioners building models.

At the time, we were looking around and there were really good products like Domino Data Lab and DataRobot, but they were so top down, they weren't really things that we wanted to use ourselves. And it's so funny because all these things were actually contrarian in 2018 when we started it. I remember we had to convince investors that new tools were needed for machine learning. We had to convince investors that machine learning was an interesting market to actually work in. And we had to convince investors, can you believe it, that a bottom-up strategy or product-led growth was actually a good idea. All those things went quickly from contrarian to incredibly mainstream, but I guess that's how those things go.

Sandhya Hegde:

Yeah, I have a few follow-up questions. So I think in 2018, in some ways, the future was already here, right?

Lukas Biewald:

Yeah.

How Weights & Biases found early design partners for MLOps

Sandhya Hegde:

The first LLM, the transformer methodology, that was already here, only no one had paid true attention. So I'm curious, what was that aha moment for you? How did you dig in when you first read about the transformer architecture and "Attention Is All You Need"? How did you respond to diving back into that world? And then in terms of industry adoption, there was very little adoption of deep learning at the time. Some people were using ML in production, but very few companies were. So I'm curious, in your mind, have you thought about that dichotomy of, "Okay, there's this really advanced technique that a very small handful of people are working on and understand and then most people don't know how to leverage it or deploy it"? I'm curious who your early design partners were for your vision of Weights & Biases and what were they trying to do with ML models?

Lukas Biewald:

I think our starting place was not super strategic or something that a McKinsey consultant would come up with. I think that our real logic was we love this technology and we love what it looks like it can do and it's so exciting. We just want to be part of this. I had actually done an unofficial internship at OpenAI where I just said, "Hey, guys, I'll work for free and I'll do whatever you want," and I actually worked with, I think it was a 24-year-old grad student just being his minion, coding up random stuff for him, which was actually really a helpful experience to just get back in the flow of making stuff, but also quickly get up to speed on what was going on in deep learning.

I'm the kind of person who can only really learn by doing and so that was a really core design partner in a way. They used our stuff really early. And then also we did a lot of work with Toyota in the early days. We obviously begged everyone to take a look at what we were doing and no one was interested. Well, actually, what we said to Toyota was like, "Look, we'll do whatever you want," and we were even asking them, "Hey, we'll just come sit in your office and literally do whatever you want." And they actually wouldn't let us badge in and do random work for them, but they were willing to meet with us weekly. And I think we proved to them that we'd really listen to them and iterate with them.

And so it was actually the combination, Toyota and OpenAI were two really early design partners. We wanted more, but just people weren't willing to play with our nascent libraries. And again, it took a long time and it was probably a year of just mostly working with them, not as a strategy, but just as literally we couldn't get anyone else interested in what we were doing.

Sandhya Hegde:

And one thing that I've observed is often when you are building a dev tool, you want to go bottom up. It's all the startups that adopt you first and try you out for prototyping. Maybe they don't have a lot of customers. They're less risk-averse. They'll try the new tool really quickly. However, if I think about 2018, training data was still a blocker for people to actually deploy models. So small startups can't really do it; most small startups can't do much. It's the big companies. Which were the big companies that you saw actually using ML in production well? Who was on your list of, "This is what best in class looks like," outside of OpenAI and Toyota? Were there other companies that you would call on the bleeding edge?

Lukas Biewald:

Yeah, so my last company, CrowdFlower, we also sold into ML companies. So I did know the space. "Who was doing it well" is an interesting question. I guess, for me, doing it well means you have a thing deployed that's doing something useful for someone. And it's industry by industry. I think, at that time, autonomous vehicles were a really big emphasis for every car company and then there were a lot of startups doing it. And I remember Cruise seemed like they were really doing a good job and Waymo seemed really exciting. Back then, there was a lot of emphasis from eCommerce and search, and so you'd see that Amazon and eBay and the big commerce companies really cared about this and had good methods, but I think this was before the explosion of applications and it was pretty vertical-specific.

That's one of the problems we had at CrowdFlower. When we started the company, we got Google and eBay as customers and then there wasn't that much more of the market to go to. It took a long time before other folks caught up. And I think, at the end of the day, machine learning is pretty enterprise-heavy, right? Enterprises tend to have the most data, but I think a lot of startups, they're more scared to sell to enterprises than they should be. And I think VCs often are nervous about advising companies to sell to enterprises. I think, if you're going to be in machine learning, you have to sell to enterprises. There's not really much of a startup market for ML.

The earliest use cases for the Weights & Biases platform

Sandhya Hegde:

Yeah, outside of prototyping, which you'll find hundreds of, but I think that's a fabulous point. I'm curious, what were some of the early use cases that you could see were just resonating a lot more with people as opposed to just the whole platform approach that you eventually have every successful customer adopt? What was the first thing people just desperately wanted to do with Weights & Biases?

Lukas Biewald:

Yeah, I think our approach has always been to ... In contrast to a lot of our competitors, who want to build the full platform first, we've been much more narrow in scope. I don't think we even now have a complete set of offerings. We try to nail the specific pain points. And the first pain point that really resonated, which still resonates today, is logging what's happening during training. So we call that experiment tracking. And it's so funny because investors thought that was really niche; they hated that use case. And I actually think there's a lesson there. Every investor has these workflow diagrams and they're really well researched, they're reasonable, and they're reasonably accurate, but I look for products that don't live in those diagrams, yet customers want them.

Because of course, like Amazon and Microsoft, they'll build the stuff that's in those workflow diagrams, but it's the stuff that doesn't show up there where there are real pain points that I think everybody misses. So, experiment tracking, now it's so funny, people inject it into the workflows, but at the time, what we knew was ... It's actually as simple as this. People were logging to a thing called TensorBoard at the time and then they were having trouble sharing it because TensorBoard was hosted locally. And so people would get some result and then they would literally screenshot the TensorBoard and email it to their colleagues. And then when the colleagues wanted to pull it up, they couldn't because it ran locally.

That was actually the pain point we saw. Really, really specific, feels niche, but I don't know, I think we just really wanted to solve it. We were desperate to just get people to use our tools by any means necessary. So it's like, "Okay, we know one person would use this, right?" So we got our friend at Toyota to use it and he was happy, and then we got our friend at OpenAI to use it and they were happy. It wasn't a really big top-down scheme. It was just like we were desperate to get people to use anything.
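For readers newer to MLOps, here is a minimal sketch of what the experiment tracking Lukas describes looks like in code, using the publicly documented wandb Python library; the project name, hyperparameters, and metric values below are illustrative placeholders, not taken from the conversation.

    import wandb

    # Start a tracked run; the project name and config values are made-up examples.
    run = wandb.init(project="demo-experiments", config={"lr": 0.001, "epochs": 3})

    for epoch in range(run.config["epochs"]):
        # In real training these metrics would come from your model; here they are placeholders.
        train_loss = 1.0 / (epoch + 1)
        wandb.log({"epoch": epoch, "train_loss": train_loss})

    run.finish()

Because the logged metrics live in a hosted dashboard rather than a locally hosted TensorBoard instance, a run can be shared with a colleague as a URL instead of a screenshot, which is exactly the sharing pain point described above.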

How Weights & Biases iterated to product-market fit

Sandhya Hegde:

And so walk me through maybe the first 12 months of Weights & Biases. At what point did you have enough people using that first feature or two that you felt comfortable saying, "Okay, let's open this up. Let's see who else is interested"? What was your approach to doing a GA launch in your first year?

Lukas Biewald:

Well, we might need to go to the first two years to actually get to someone using our product, but what we would do, so me and my co-founders, we would go and we would rent a cheap Airbnb in the woods or something and we would just spend a week nonstop working together. And then we would take it and we'd show it to whoever. Literally, we would beg people to look at it and try our library and stuff. And it's funny, Shawn, one of my co-founders, called it demo-driven development because we were really just trying to make demos that were convincing enough that someone would even take a longer meeting with us. And it was really hard. We just kept showing people …

I remember we spent a whole week building this thing that actually turned out to be the experiment tracking product. And we built it for Toyota because this guy, Adrian, hopefully he's listening to this, I remember Adrian was like, "I need this product," and then we built it. We built a whole product in a week. It's three good engineers just cranking together. And then Friday afternoon, we drove up from Santa Cruz, where we were staying, to their office and showed it to them. And he was like, "I don't know. I'm not that interested in this." Of course, we were pretending we had that product the whole time. We showed it to him and he was like, "I don't know. It's not that cool. I want to do this other thing called hyperparameter optimization."

And it was so deflating, but you pretend like, "Oh, yeah. Oh, we also have hyperparameter optimization. We can show that to you next week."

Sandhya Hegde:

"It's on the roadmap, coming up."

Lukas Biewald:

Yeah, exactly. And then, the traction was really slow. I remember having conversations with my co-founders like, "Does anyone actually want this?" And I don't know, we had a few people using it. It's not even like they loved it. And then we did this thing, I'm telling you the true story, maybe you should edit this for brevity.

Sandhya Hegde:

Perfect.

Lukas Biewald:

But the other thing we were doing is I was teaching classes in machine learning at the time and so I was just desperate to use my stuff. So I was like, "Maybe we could modify this to work for the classes that we're doing." And there, we would get 40 people to use it all at once because they were just in the class. And that was interesting because you would really feel the onboarding. We would try to get 40 people to use it, it would break for 10 of them, and 10 other people would be confused. It's actually really stressful because I'd be standing there like, "Oh no." And then my co-founder Chris would be just sitting with people's laptops quickly trying to get them to use it.

So that actually got our onboarding, I think, way better than it would have been otherwise. And then we got a little bit of traction. It was pretty flat. I'm talking about 20 weekly active users. And then there was really this moment with this guy Hamel. I hope people won't mind me naming them. I say this all with love. Hamel Husain, he was working at GitHub at the time and we were really trying to get him to use it, and I remember it was Thanksgiving, he called me and he was basically like, "Lukas, I got to tell you, I think your product is good, but it is so ugly. The UX is so unbelievably bad that I'm going to use your competitor that has worse scalability and worse features just because their UX isn't so awful." And I felt so bad, and I went into a panic with my two co-founders. We did what we called a bug bash, where we went into a room for a week, and instead of building new functionality, which is what we'd always done before, we just cleaned up the UI everywhere.

And it was funny, I think that was the moment where the product started growing and it was slow growth, but the thing that really felt promising to me was it was so steady. We had 20 people one week and then we had 23 people the next week and 26 people the next week. And I remember showing our small angel investors at the time. I was like, "I think this has legs. It's taking off." And they were like, "Man, Lukas, these numbers are so small." I remember they were like, "Call me when you have thousands of weekly active users. This is ridiculous. You have 25 users and you're proud of it." But it was the same ones coming back and each week it was a little more. And I think I'd just been doing startups for so long that I'd never really seen that before. Each week, we had more users than the week before.

And that just felt so good. We just felt so much promise because it really just felt like we were building in a way that we'd never experienced. So we got really excited about finding other ways to get more users in the product and using the product and that actually ... Ever since then, it's felt like we had some product-market fit. So I do actually think it was this Hamel complaint that really ... I think that's the moment where it started to take off, but I don't know. I was actually going back and looking at this because I was thinking this would be really useful to know as we launch more products, "What was the thing that made experiment tracking take off?" Because it was definitely flat for 18 months, I think, and then started to grow.

It wasn't like there was a major functionality change. I really think it was that UX cleanup, inspired by one person saying they were going to go to the competitor's product.

Sandhya Hegde:

It speaks to the value of having design partners who give you brutal feedback and just say what they're thinking and feeling, right?

Lukas Biewald:

Yeah.

How Weights & Biases defined their ICP

Sandhya Hegde:

It's incredibly valuable. So this is 2020 now, you started with a nice cohort of 20 weekly active users who were happy. They're coming back. They're telling other people about your product. What was the persona from the early days? Were you just always focused on developers? Especially given it was still 2020, was it a mix of data scientists, statisticians, and developers, that is, people who are coding versus people who are just focused on the math behind the model at the time?

Lukas Biewald:

Well, I think one thing that's really worked for us really well has been ... We started the company with a user profile in mind and that was basically what we call an ML practitioner. And we picked a word for that person that's not a title because I actually think the titles are confusing. What makes somebody in our ICP is that their job is building and deploying ML models. I feel like I've been fighting this fight with investors from the very first day where people-

Sandhya Hegde:

I apologize on behalf of my entire profession.

Lukas Biewald:

No, no, I get it. I actually totally get it. And maybe it's stupid, but it's just so clarifying to have a single ICP. I really think that's one of the things that's really at the core of why our product is so loved, because if you have two profiles, and you could argue that we should, there's an MLOps persona, there's an executive persona. These personas do matter, but I think when you try to please a lot of people, obviously you don't end up really pleasing anybody, and I just say, "Look, the product is made for this particular persona. I could tell you all about her." And that's just been our true north since the company started. I've just been obsessed with keeping the focus on that, and there have been tons of objections where people were like, "Okay, this is too narrow. It's too specific. The market's bigger. You can make more money selling into executives." Maybe all true, but at least I think we have a cohesive strategy and a cohesive product.

Sandhya Hegde:

So I have a follow-up question there. When you talk about really building only for this ML practitioner, the person who is going to deploy and manage the models, what are examples of adjacent needs or use cases that so far you have said no to, which someday, as the company grows, you might think about? What are the obvious adjacencies you've had requests for that you've said no to so far to maintain your focus?

Lukas Biewald:

Oh, there's so many because the space is so exciting and growing so fast. There's a lot of demand for different things. So the obvious place where we could expand into and I reserve the right to do this, but-

Sandhya Hegde:

I'll implement it. I'll implement it.

Lukas Biewald:

Oh, thank you, thank you. So there's this MLOps persona that's really important inside companies and they're actually not ML practitioners typically, right? They're usually people with more of a DevOps background who are trying to make a reliable internal platform to deploy the models, and I have a lot of empathy for them, right? That's a hard job. It's like they're doing what we do, but internally, trying to get ML practitioners to follow a cohesive process and make their stuff reliable. And so one thing that they really want is infrastructure. And we've really decided not to do infrastructure, even though I think it's a huge pain point. The place where you run your models and making that reliable is, of course, a huge problem, but it just felt like a big distraction and we want to integrate with other people's infrastructure. We don't want to compete with all the infrastructure providers.

And so that's been one of these things where it's like, "Wow, there's so much money to be made in that, but we want to maintain our focus." And the ML practitioner, I think, cares less about this than the MLOps persona. So if you were serving them, you'd definitely orient towards infrastructure. And then there's the executives that care about all kinds of different stuff than the ML practitioner. They often feel like they just want a checklist of features. So the practitioner wants tools that solve their problems that they face daily, right? And they actually want flexibility to know that, "Hey, if there's a different tool that's better, I can let go and use that."

For some reason, executives really seem to want to buy one product that will solve all their problems. And I actually think it's a terrible idea, but it is very pervasive in the market. So I'm at the point where it's like: look, if you pull out a feature checklist, we just lose. We lose to every competitor. We have fewer features than all our competitors, we just do. And if you're just looking for checking every box in an MLOps platform, we're going to lose. We do have good integrations with other tools that will check the boxes where we're missing, but we're just not going to make crappy software.

Sandhya Hegde:

I think that is very strategically aligned to also pursuing a bottom-up motion, right?

Lukas Biewald:

Totally, yeah.

How Weights & Biases created awareness of their product

Sandhya Hegde:

Where, yes, the exec cares about, "Oh, I need to buy software that solves the 17 problems. If you could do all 17, it makes my work easier, so could you ..." But your end user's like, "No, I only work on two of those 17 and I want the best solution for those two." So if you're focusing on the customer who's actually using you, it makes more sense to be narrow and not build the others until you have the right scale. I'm curious, so going back to 2020 when you first started taking off, your UX is now better, thanks to Hamel, and the word is spreading. So I have two questions. One, what were the tactics you were using to grow awareness about Weights & Biases? What worked or didn't work in terms of just growing that early user base? And two, what were the surprises for you in terms of what people were saying about it and how people were using the product? Do you remember any nuggets from the early days where you were like, "Oh, I did not see that coming"?

Lukas Biewald:

I think both of these ideas were due to our now head of growth. So we had this woman, Lavanya, who was an engineer, but she really had a knack for growth. And so she really ended up driving a lot of our growth. Of course, it was a team effort, but I feel like she had these just clever ideas that were surprising. So one thing that we did was we made this feature called reports. And actually it's funny, there's this engineer, John, who built reports and he told me, he was like, "This is so stupid, Lukas. People don't want to share their stuff inside of the application. They want to put it in Notion or some nice place to paste it."

And I was like, "I think they do want to share it inside of Weights & Biases," but I actually had some self-doubt and I remember really doubting. I was thinking John is running a project that he thinks is stupid like, "This seems like it's invested for failure," and then actually, he made such a beautiful reports feature, he really wanted to use it. He did actually just a killer job with it. I couldn't believe it because usually that's bad management, right? People should believe in their task. And then what Lavanya did was she got our users to publish reports publicly and then she worked with this other guy, Axel, to actually SEO it really well, which I was also really skeptical of. I was like, "I don't know, does SEO really work?" For us, it really works.

So people were publishing lots of content, very long-tail content. I'm talking super technical, maybe there's a hundred or a thousand people interested in it, but those people are very interested and so they keep searching for it. And we're the only place where it exists, comparisons of just one gradient descent method versus another. And so that became our best way of getting users on our platform, because the cool thing is people were using our platform to publish their work and then people are actually seeing how the product works through SEO first.

And so a lot of people, they know us as this place where they learn about machine learning for a long time before they ever even go in and use it. So that's actually our main way that we find users and I really like it because it's very organic and it also just grows steadily. So at first it was small, really small. For a long time, it was like, "Oh, we got five new users last week from this strategy," but Lavanya really believed in it, and I think she knew, or she saw, that you could grow it over time. Versus, yeah, if you get press once, you might get a huge influx of users, but it's just that one time and it doesn't sustain. This is much more steady.

Sandhya Hegde:

Makes sense. And there was no confidential data in the things they were publicly publishing, so users felt comfortable doing that?

Lukas Biewald:

Well, it's funny, right? Because reports is used for confidential data internally and then also there's a ... In our market, machine learning, there's a lot of academics and a lot of academics use our product. So we've made it free for them and they especially like to publish things, but we love it actually. Every so often, we can convince a customer to publish something publicly because they want to share, they want to recruit or something like that, and those tend to work really well also.

Sandhya Hegde:

Yeah, it's funny, one of the things that really helped Amplitude spread within the companies once we were adopted was a very similar feature where you could share a public URL link for some specific dashboard or chart with everyone in the company and people could click and interact with it even if they weren't signed up to use Amplitude at all or didn't have access to Amplitude. Because in big enterprises, no one really has true data democracy in terms of who has access to what data for good and bad reasons. And you couldn't really do that with any existing analytics tool. Either you had access to the whole tool or it was screenshots. You couldn't really have anything interactive.

So if there are any founders working in data listening to this, here's one guaranteed tip that you better try as well. No, that's a really awesome story. And were you getting any surprises in the customer feedback once people started using Weights & Biases, discovering it in the wild? What were some good and bad surprises for you?

Lukas Biewald:

Well, I'll say one really positive surprise has been how passionate people are about their experiment dashboards. I still don't feel like I quite understand the level of enthusiasm that people have. And I've been doing this a long time, so at my last company, we collected training data and it was very useful, but with Weights & Biases, people will sometimes just post videos of how much they love the product or they'll write in and say, "Hey, I found this bug and it's a bad bug," but they'll be so nice about it. They'll be like, "But I love your product and don't worry about it. It's awesome." This is a podcast for founders. I don't want to make anyone feel bad, but that really has been an interesting surprise, how passionate people have gotten about the product.

I think the other surprise maybe has been that the requests people have have been really interesting and they've led us in interesting directions. We have this new product called Launch, which was really pulled from tons of user feedback of wanting to basically launch jobs from the Weights & Biases interface. And that's a funny one because I think I'm a little bit older than our audience and I'm just really comfortable inside a terminal. I just SSH into a machine and run stuff. And I think that our audience seems very interested in web UIs to do more. That's been a real learning. In general, notebooks are really pervasive among our audience, more than we knew.

I think me and my co-founders often joke, I'm the one in our founding team that understands infrastructure the least, and I think they sometimes talk about how they build for me because I'm a little hazy on how Docker works and Git works. And so I think a lot of our product is abstracting away some of these concepts for machine learning people who are like me. I don't know, I can Google stuff, but I'm a little bit confused about ... Yeah.

Sandhya Hegde:

Right, so, "You have to pressure test it by seeing if Lukas can make it work."

Lukas Biewald:

Yeah, yeah. It's like, "What does Lukas think about it?" Also it's funny, there's another joke that I can't read. I'm illiterate. And so what they do is they make the quick starts really short, but I actually think it's really important. I don't know why everyone makes their quick starts too long. And it's funny, when I'm looking at a new product to consider using it, I feel like when the quick start is too many steps, I really don't want to use it.

Sandhya Hegde:

Yeah, no, this is the big challenge for all product-led companies, companies that aspire to have great self-serve experiences: "How long is your customer's attention span?" right? And, "How do you keep their attention every five seconds?" I think that is a really hard thing to get right no matter how easy or complex the underlying product is. And maybe those student classes you did were a really, really formative experience to help you push that bar really high.

Lukas Biewald:

Yeah, those were amazing because what that really showed me is, if people get confused at any step, then they'll just go away. It's just really interesting ... People always get confused, and we thought it was so obvious what to do, but that's worse than a bug. It's showstopping. If someone can't figure out how to just paste the authentication token into the right place in their code, you'll never see them again to complain about it. So I felt lucky that we had those classes to torture students with in the early days.

Sandhya Hegde:

Maybe the question I really want to ask you is, if you were starting Weights & Biases now versus when you started in 2018, in hindsight, would you do anything differently? Is there a different set of use cases or a different way that you would build Weights & Biases, the product itself, not the long-term vision, which I think is super well aligned to the future that we are all probably going to live in?

Lukas Biewald:

Yeah, I think the biggest thing that I would do now, and that I'm trying to do now, is I think that software developers now really can do a lot of ML, and that's different than a few years ago. So I think that's the big change. And I guess for me, I view it as an expansion of our user persona, but they have different needs and are coming from a little bit of a different place. I think when you're a startup, you can adapt so much faster than a big company to what's going on. And so I would be all in, I think, on LLMs and, "How do we support this?" And of course, you don't see a lot of production use cases yet, but I think it's obvious that this stuff works super well. It's unclear what tools need to be made to make those work. But we're also trying to do that, so please don't compete with me.

How Weights & Biases built its early team

Sandhya Hegde:

Right, we won't. So switching gears a little bit to how you built your team, I think you grew a lot during the pandemic. I think you went from about 20 people to the team you are now. What's been your approach to leadership or what are lessons from the CrowdFlower days that you brought to the new company with you and how are you trying to create a strong, connected culture?

Lukas Biewald:

Well, I think one thing that I really, really am obsessed with is goals and orienting around goals. And that was something I was obsessed with even when we were four people, but it's just more and more important as we grow. You have alignment naturally with four people, but even then, there are these questions like, "Should we be oriented towards optimizing revenue or optimizing active users or optimizing retention?" And that clarity on what we're trying to do is something I'm really obsessed with, because I've just watched people get confused and get misaligned. I think especially since we're a remote team, we try to have real clarity.

And then also, I think, as I get older, I'm more obsessed with writing things down because it just feels like a much more scalable way to communicate than talking face to face. So we create a lot of docs, which I actually just think is a better way to run a business. When you talk, it can be a little bit sloppy and people forget, whereas written documentation sticks. Probably people that work for me are laughing like, "We should do more of that," but definitely, I do way, way, way more of that than I used to. And actually, as we've gotten bigger ... CrowdFlower got to about a hundred people and stalled out. I think we were 130 or 140 when we sold it, but it was a much slower run.

And so I thought a lot of the best practices that executives coming from a Facebook or something would bring were just unbelievably stupid. I just thought, when people go to big companies, something happens to their brain where they just get all this dumb orientation. But now I realize why it happens. When you're scaling fast, you really do have this different set of challenges. But I think it's important, when you're small, not to just copy what a really high-growth company is doing. It's so bad because you're doing exactly the wrong thing in a lot of cases.

Lukas Biewald's advice for early-stage founders building ML products

Sandhya Hegde:

I'm curious, what would be your advice for early-stage founders and developers that are trying to build their first product, build prototypes that are AI native today? What would be your advice for how to think about focusing on the end customer, the product, and the use cases versus figuring out all of this infrastructure, and what are the right choices to make around infrastructure and tools? If you were writing the definitive guide to building ML products, sponsored by Weights & Biases, what would the intro say?

Lukas Biewald:

Well, I don't know if this is ML-specific, but I just feel like everyone gets this wrong, so it's worth saying: I would just make something for one person or two people or three people and go from there. Everybody's worried that they're going to end up doing consulting or something, but I just don't think that really happens that much. When we launch a new product, for example, we don't, from the beginning, try to grow it as fast as possible. We try to build it for specific people that we know and talk to, and try to make them really happy.

And I think it's even okay to prioritize the weird things that they want out of respect for their time, where you're just quickly iterating on, "Okay, you want that weird thing? Fine, we'll do it." Because actually, even now, we have a huge amount of reach, I think we have a good brand, we have all these customers, and still, when we launch something new, nobody wants to use it. I think one thing that people don't realize, and I think even YC doesn't tell you well enough, is how much people are not interested in what you're doing and how much you have to beg people to use your software. I still have all these tricks, essentially SDR tricks, of sending tons of reminders and emailing people over and over.

I expect to email people a million times before they'll meet with me, and then when you get feedback ... I think once you actually show somebody that you really will implement the stuff they're asking for, I think then you have a friend for life too. So that's my big advice, is make a really small number of people really happy. It's harder to do than you think.

Sandhya Hegde:

That's really great advice. And I actually have a hot take from a previous podcast episode where the co-founder of Amplitude, Curtis, and I were talking about exactly this: "What is the approach you take to idea validation, message testing, and prototyping for different types of products?" And where the two of us landed was: if the promise of the product, the idea you're trying to validate, is something that plenty of people have promised for a long time, your customers are not paying attention anymore. "I have been pitched that snake oil again and again. So until you have something that is truly working, I don't have to imagine it's real, don't bother me, right?"

As opposed to if you're doing something where the problem is new and no one's really made promises about it before, it's way easier to do idea validation just with message testing that, "Okay, this is something people are looking for. They'll pay attention," as opposed to for problems that are more evergreen, even if the problem is big and deep, people are paying less attention because they're jaded, right? They've already seen it, heard it too often, so they're tuning it out now. So that was-

Lukas Biewald:

Interesting. I think I have to say, I don't know if I agree with that. I think people always tune you out. I've never had that experience. I feel like I'm usually making a new product in a new space and I just feel like no one pays attention to me in any scenario. You might just be thinking, look at a different market maybe. I think it's just the universal experience. And I'll say my co-founder, Chris, is a genius at this. I think the more you can make your demo look like it's real in all the little details ... People show me demos and it's like, "Don't put demo in the URL." You know what I mean? And our engineers, with the demos, they want to put all these warnings on it and it's like, "You can't do that, right? You got to show somebody something and really be like, 'This is the thing.' You can't just be hedging, 'Oh, this is a weird demo.' You got to really show them something real to get any feedback," because if somebody thinks they're looking at a demo or worse, a PowerPoint, they just don't actually really engage with it.

Sandhya Hegde:

Yeah, yeah, "Show me the thing working."

Lukas Biewald:

Yeah, yeah, yeah, totally.

Sandhya Hegde:

Yeah. Well this is awesome. So listeners, if you love this, there is more of it. Weights & Biases has an awesome podcast hosted by Lukas called Gradient Descent. It's one of my favorite ML podcasts. Please go check it out. And, Lukas, thank you so much for joining us today. I enjoyed this immensely.

All posts

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse varius enim in eros elementum tristique. Duis cursus, mi quis viverra ornare, eros dolor interdum nulla, ut commodo diam libero vitae erat. Aenean faucibus nibh et justo cursus id rutrum lorem imperdiet. Nunc ut sem vitae risus tristique posuere.

All posts
June 12, 2023
Portfolio
Unusual

How Weights & Biases found product-market fit

Sandhya Hegde
No items found.
How Weights & Biases found product-market fitHow Weights & Biases found product-market fit
Editor's note: 

SFG 23: Lukas Biewald on dev-focused ML ops

In this episode of the Startup Field Guide podcast, Sandhya Hegde chats with Lukas Biewald, CEO and Co-founder of Weights & Biases about the company's path to product-market fit. Weights & Biases is a developer-focused MLOps platform last valued at over $1 billion. Their platform helps developers streamline their ML workflow from end to end. Weights & Biases currently has over 700 customers using their product to manage their ML models.

Be sure to check out more Startup Field Guide Podcast episodes on Spotify, Apple, and YouTube. Hosted by Unusual Ventures General Partner Sandhya Hegde (former EVP at Amplitude), the SFG podcast uncovers how the top unicorn founders of today really found product-market fit.

If you are interested in learning more about the topics we discuss in this episode, please check out our resources on defining an ideal customer profile, building a product-led growth strategy, and working with design partners.

TL;DR

  • The founding insight: Lukas and his co-founders wanted to build a product for ML practitioners because the available options at the time were largely top-down. Lukas and his founding team had to convince investors that there was a need for new machine-learning tools that could be delivered using a product-led growth strategy. 
  • Early design partners: The founding team partnered with OpenAI and Toyota who were willing to meet with them weekly to help them iterate on the early version of the product. 
  • Early use cases: The first pain point that resonated with early customers was experiment tracking. When people logged their visualizations to Tensor Board, they couldn’t share them easily because it was logged locally. The Weights & Biases product helped solve this pain point. 
  • Iterating to product-market fit: The founding team tested their product with a group of 40 people in a class to test out and iterate the onboarding process for the product. There was also early user feedback that the product’s UX was not good enough, so the team spent a week cleaning up the product’s UX. The team slowly saw a steady increase in users, and they saw the same users coming back over and over again.
  • The core ICP: Weights & Biases built their product for a persona titled “ML practitioner” whose main job is building and deploying ML models. The team stayed focused on this persona and made it their “true north” to guide their product development efforts. 
  • Growth tactics: The team increased awareness and growth by building a “reports” feature for the product, and by creating SEO-optimized, technical, long-tail content that has become popular among their core audience.

Episode transcript

Sandhya Hegde:

Welcome to the Startup Field Guide, where we learn from successful founders of unicorn startups, how did companies truly found product-market fit . I'm your host, Sandhya Hegde. And today, we'll be diving into the story of Weights & Biases. Started in early 2018. This is a developer-focused MLOps platform, last valued over $1 billion. Now, if you're trying to put an ML model into production as a part of your core product experience in a company, you need to be able to manage it effectively, right? You need to be able to experiment with it, track different versions of it, and collaborate over it across all these different people in the company who want to make sure that the model is doing what it's intended to do.

This is actually a really hard problem given the complex nature of machine learning of models for both data scientists and developers. So over half a million people today from 700 companies use Weights & Biases to manage their ML models. And our guest today is their CEO, Lukas Biewald. Lukas, welcome to the Field Guide.

Lukas Biewald:

Thanks. Thanks for that nice introduction. That was a great overview of Weights & Biases.

Why Lukas Biewald started Weights & Biases

Sandhya Hegde:

I'm glad! And I think the reason I'm so excited to have you on the Field Guide and talking about this is that you've been in this space for over 20 years now, which makes you one of the veterans. You started working in Stanford's AI Lab back in 2003. You started your first company, which also helped people get ML models into production in 2007. And so I'm curious if you could share a little bit more about your journey so far. What inspired you to start CrowdFlower and then Weight & Biases and how have you looked at the big picture of AI evolving in the past couple of decades?

Lukas Biewald:

Sure. Well, it's funny, when I was a little kid, I was thinking about AI and I was thinking what maybe we all think of just like, "Wow, what a cool thing that computers can learn to do stuff and you don't have to program them." But I thought in my head, I was imagining, "Okay, this is probably the last job that people will do." I'm actually not sure if that's true anymore, but that just seemed so interesting and exciting. And then I went to Stanford and I was studying AI at a really different time when it wasn't so exciting. Things weren't really working. And when I was in the AI lab, it's funny because so many people there went on to be really famous and successful, but it definitely didn't feel like that at the time. It just felt really frustrating. Nothing was working. I was trying to do these toy problems and I couldn't even get the toy problems to work very well.

And so I actually left to start a company really out of frustration that it felt like all the problems we were working on were just problems where there happened to be data available. And so what CrowdFlower did was it collected training data for people trying to build machine learning models. And it was really early. I've really started it with this idea that this would just be useful. Back then, Y Combinator had just started, but there wasn't the same sense of, lots of people should start companies. I remember in my first decks being advised to remove any mention of AI from the pitch deck, which is so funny compared to now.

Sandhya Hegde:

Right, because there were these waves of skepticism where people were like, "Oh, yeah, this is finally going to work and Lukas is perfect. He has a background in math. He has a background in CS. He'll figure it all out," but then lots of companies struggling. Like you said, to even make the toy demo be an effective demo.

Lukas Biewald:

Yeah, totally. So yeah, at the time I had no idea the difference between sales and marketing or how that might work. It took me over a year to raise the initial money for CrowdFlower. We raised 100k and I was like, "What are we going to do with all this?" We had a 100k at a $2 million valuation. So that was a really long journey and I learned a lot and changed a lot, got married, had a kid and I was thinking, "When CrowdFlower sold, what's the company that I really want to do?" And actually, one of the things that had happened to me was I'd gotten interested again in machine learning and I was feeling like I spent so much time running companies that I'm becoming a dinosaur. Some of my core assumptions were being challenged.

For a long time, we just thought training data was the most important thing and the different kinds of models don't really matter. And people were constantly claiming, "I have a better model that works way better," and they kept being wrong and so there's a lot of skepticism about neural networks and deep learning. But then I remember when AlphaGo beat the best Go player, it was a real wake up call to me where I was just like, "You know what? This is real and this is totally different and it's so exciting." And so I actually worked with the new co-founder and my old co-founder and we really wanted to make a product for the machine learning practitioners building models.

At the time, we were looking around and there was really good products like Domino Data Lab and DataRobot, but they were so top down, they weren't really things that we wanted to use ourselves. And it's so funny because all these things were actually contrarian in 2018 when we started it. I remember we had to convince investors that new tools are needed for machine learning. We had to convince investors that machine learning was an interesting market to actually work in. And we had to convince investors, can you believe it? We had to convince investors that a bottom up strategy or product-led growth was actually a good idea. All those things went from quickly from contrarian to incredibly mainstream, but I guess that's how those things go.

Sandhya Hegde:

Yeah, I have a few follow-up questions. So I think 2018, while in some ways the future was already here, right?

Lukas Biewald:

Yeah.

How Weights & Biases found early design partners for MLOps

Sandhya Hegde:

The first LLM, the transformer methodology, that's already here, only no one has paid true attention. So I'm curious, what was that aha moment for you? How did you dig in when you first read about other transformer's architecture and attention is all you need? How did you respond to diving back into that world? And then in terms of industry adoption, there was very little adoption of deep learning at the time. People weren't using ML in production, but very few companies were. So I'm curious, in your mind, have you thought about that dichotomy of, "Okay, there's this really advanced technique that a very small handful of people are working on and understand and then most people don't know how to leverage it or deploy it"? I'm curious who your early design partners were for your vision of Weights & Biases and what were they trying to do with ML models?

Lukas Biewald:

I think our starting place was not super strategic or something that a McKinsey consultant would come up with. I think that our real logic was we love this technology and we love what it looks like it can do and it's so exciting. We just want to be part of this. I had actually done an unofficial internship at OpenAI where I just said, "Hey, guys, I'll work for free and I'll do whatever you want," and I actually worked with, I think it was a 24-year-old grad student just being his minion, coding up random stuff for him, which was actually really a helpful experience to just get back in the flow of making stuff, but also quickly get up to speed on what was going on in deep learning.

I'm the kind of person I can only really learn by doing and so that was a really core design partner in a way. They used our stuff really early. And then also we did a lot of work with Toyota in the early days. We obviously begged everyone to take a look at what we were doing and no one was interested. Well, actually, what we said to Toyota was like, "Look, we'll do whatever you want," and we were even asking them, "Hey, we'll just come sit in your office and literally do whatever you want." And they actually wouldn't let us badge in and do random work for them, but they were willing to meet with us weekly. And I think we proved to them that we'd really listened to them and iterate with them.

And so it was actually the combination, Toyota and OpenAI were two really early design partners. We wanted more, but just people weren't willing to play with our nascent libraries. And again, it took a long time and it was probably a year of just mostly working with them, not as a strategy, but just as literally we couldn't get anyone else interested in what we were doing.

Sandhya Hegde:

And one thing that I've observed is, often when you are building a dev tool, you want to go bottom up. It's the startups that adopt you first and try you out for prototyping. Maybe they don't have a lot of customers, they're less risk-averse, they'll try the new tool really quickly. However, if I think about 2018, training data was still a blocker for people to actually deploy models. So small startups couldn't really do it; most small startups couldn't do much. It was the big companies. Which were the big companies that you saw actually using ML in production well? Who was on your list of, "This is what best in class looks like," outside of OpenAI and Toyota? Were there other companies that you would call on the bleeding edge?

Lukas Biewald:

Yeah, so my last company, CrowdFlower, we also sold into ML companies. So I did know the space. "Who was doing it well" is an interesting question. I guess, for me, doing it well means you have a thing deployed that's doing something useful for someone. And it's industry by industry. I think, at that time, autonomous vehicles were a really big emphasis for every car company, and then there were a lot of startups doing it. And I remember Cruise seemed like they were really doing a good job and Waymo seemed really exciting. Back then, there was a lot of emphasis from eCommerce and search, and so you'd see Amazon and eBay and the big commerce companies really caring about this, and they had good methods, but I think this was before the explosion of applications and it was pretty vertical-specific.

That's one of the problems we had at CrowdFlower. When we started the company, we got Google and eBay as customers and then there wasn't that much more of the market to go to. It took a long time before other folks caught up. And I think, at the end of the day, machine learning is pretty enterprise-heavy, right? Enterprises tend to have the most data, but I think a lot of startups are more scared to sell to enterprises than they should be. And I think VCs are often nervous about advising companies to sell to enterprises. I think, if you're going to be in machine learning, you have to sell to enterprises. There's not really much of a startup market for ML.

The earliest use cases for the Weights & Biases platform

Sandhya Hegde:

Yeah, outside of prototyping, of which you'll find hundreds. But I think that's a fabulous point. I'm curious, what were some of the early use cases that you could see were just resonating a lot more with people, as opposed to the whole platform approach that you eventually have every successful customer adopt? What was the first thing people just desperately wanted to do with Weights & Biases?

Lukas Biewald:

Yeah, I think our approach has always been to ... In contrast to a lot of our competitors, who want to build the full platform first, we've been much more narrow in scope. I don't think we even now have a complete set of offerings. We try to nail the specific pain points. And the first pain point that really resonated, which still resonates today, is logging what's happening during training. So we call that experiment tracking. And it's so funny because investors thought that was really niche; they hated that use case. And I actually think there's a lesson there. Every investor has these workflow diagrams and they're really well researched, they're reasonable, and they're reasonably accurate, but I look for products that don't live in those diagrams, yet customers want them.

Because of course, like Amazon and Microsoft, they'll build the stuff that's in those workflow diagrams, but it's the stuff that doesn't show up there where there are real pain points that I think everybody misses. So, experiment tracking, it's so funny, now people put it into those workflow diagrams, but at the time, what we knew was ... It's actually as simple as this. People were logging to a thing called TensorBoard at the time and then they were having trouble sharing it because TensorBoard was hosted locally. And so people would get some result and then they would literally screenshot the TensorBoard and email it to their colleagues. And then when the colleagues wanted to pull it up, they couldn't because it ran locally.

That was actually the pain point we saw. Really, really specific, feels niche, but I don't know, I think we just really wanted to solve it ... We were desperate to just get people to use our tools by any means necessary. So it's like, "Okay, we know one person would use this, right?" So we got our friend at Toyota to use it and he was happy, and then we got our friend at OpenAI to use it and they were happy. It wasn't a really big top-down scheme. It was just that we were desperate to get people to use anything.
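(Editor's aside: for readers who haven't seen experiment tracking in practice, here is a minimal sketch of the hosted-logging pattern Lukas describes, using the wandb Python client. The project name, config values, and metrics are placeholders for illustration, not anything from the interview.)

```python
import wandb

# Start a run in a hosted project (names and values here are illustrative only).
run = wandb.init(project="demo-experiments", config={"lr": 1e-3, "epochs": 3})

for epoch in range(run.config.epochs):
    # Stand-in for a real training loop.
    train_loss = 1.0 / (epoch + 1)
    wandb.log({"epoch": epoch, "train_loss": train_loss})

# Finish the run; its charts live at a shareable hosted URL rather than on localhost.
run.finish()
```

Because the run is logged to a hosted project instead of a local TensorBoard instance, a colleague can open the run's URL directly rather than receiving a screenshot by email.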

How Weights & Biases iterated to product-market fit

Sandhya Hegde:

And so walk me through maybe the first 12 months of Weights & Biases. At what point did you have enough people using that first feature or two that you felt comfortable saying, "Okay, let's open this up. Let's see who else is interested"? What was your approach to doing a GA launch in your first year?

Lukas Biewald:

Well, we might need to go to the first two years to actually get to someone actually using our product. But what we would do, so me and my co-founders, we would go and rent a cheap Airbnb in the woods or something and we would just spend a week nonstop working together. And then we would take it and show it to whoever. Literally, we would beg people to look at it and try our library and stuff. And it's funny, Shawn, one of my co-founders, called it demo-driven development because we were really just trying to make demos that were convincing enough that someone would even take a longer meeting with us. And it was really hard. We kept showing people ...

I remember we spent a whole week building this thing that actually turned out to be the experiment tracking product. And we built it for Toyota because this guy, Adrian, hopefully he's listening to this, I remember Adrian was like, "I need this product," and then we built it. We built a whole product in a week. It's three good engineers just cranking together. And then Friday afternoon, we drove up from Santa Cruz, where we were staying, to their office and showed it to him. Of course, we were pretending we'd had that product the whole time. And he's like, "I don't know. It's not that cool. I'm not that interested in this. I want to do this other thing called hyperparameter optimization."

And it was so deflating, but you pretend like, "Oh, yeah. Oh, we also have hyperparameter optimization. We can show that to you next week."

Sandhya Hegde:

"It's on the roadmap, coming up."

Lukas Biewald:

Yeah, exactly. And then, the traction was really slow. I remember having conversations with my co-founders like, "Does anyone actually want this?" And I don't know, we had a few people using it. It's not even like they loved it. And then we did this thing, I'm telling you the true story, maybe you should edit this for brevity.

Sandhya Hegde:

Perfect.

Lukas Biewald:

But the other thing we were doing is I was teaching classes in machine learning at the time, and so I was just desperate to use my stuff. So I was like, "Maybe we could modify this to work for the classes that we're doing." And there, we would get 40 people to use it all at once because they were just in the class. And that was interesting because you would really feel the onboarding. So you would try to get 40 people to use it. It would break for 10 of them, and 10 other people would be confused. It was actually really stressful because I'd be standing there like, "Oh no." And then my co-founder Chris would be sitting with people's laptops, quickly trying to get them to use it.

So that actually got our onboarding, I think, way better than it would have been otherwise. And then we got a little bit of traction. It was pretty flat. I'm talking about 20 weekly active users. And then there was really this moment with this guy, Hamel. I hope people don't mind me naming them. I say this all with love. Hamel Husain, he was working at GitHub at the time and we were really trying to get him to use it, and I remember it was Thanksgiving, he called me and he was basically like, "Lukas, I got to tell you, I think your product is good, but it is so ugly. The UX is so unbelievably bad that I'm going to use your competitor that has worse scalability and worse features just because their UX isn't so awful." And I felt so bad, and then I went into a panic with my two co-founders. We did what we called a bug bash, where we went in a room for a week and, instead of building new functionality, which is what we'd always done before, we just cleaned up the UI everywhere.

And it was funny, I think that was the moment when the product started growing. It was slow growth, but the thing that really felt promising to me was that it was so steady. We had 20 people one week and then we had 23 people the next week and 26 people the next week. And I remember showing our small angel investors at the time. I was like, "I think this has legs. It's taking off." And they were like, "Man, Lukas, these numbers are so small." I remember they were like, "Call me when you have thousands of weekly active users. This is ridiculous. You have 25 users and you're proud of it." But it was the same ones coming back and each week it was a little more. And I think I'd just been doing startups for so long that I'd never really seen that before. Each week, we had more users than the week before.

And that just felt so good. We just felt so much promise because it really felt like we were building in a way that we'd never experienced. So we got really excited about finding other ways to get more users into the product and using the product. Ever since then, it's felt like we had some product-market fit. So I do actually think it was this Hamel complaint that really ... I think that's the moment where it started to take off, but I don't know. I was actually going back and looking at this because I was thinking this would be really useful to know as we launch more products: "What was the thing that made experiment tracking take off?" Because it was definitely flat for 18 months, I think, and then started to grow.

It wasn't like there was a major functionality change. I really think it was that UX cleanup, inspired by one person saying they were going to go to the competitor's product.

Sandhya Hegde:

It speaks to the value of having design partners who give you brutal feedback and just say what they're thinking and feeling, right?

Lukas Biewald:

Yeah.

How Weights & Biases defined their ICP

Sandhya Hegde:

It's incredibly valuable. So this is 2020 now. You started with a nice cohort of 20 weekly active users who were happy. They're coming back, they're telling other people about your product. What was the persona from the early days? Were you always just focused on developers? Especially given it was still 2020, was it a mix of data scientists, statisticians, developers, people who were coding versus people who were just focused on the math behind the model at the time?

Lukas Biewald:

Well, I think one thing that's worked really well for us has been ... We started the company with a user profile in mind, and that was basically what we call an ML practitioner. And we picked a word for that person that's not a title because I actually think the titles are confusing. What makes somebody in our ICP is that their job is building and deploying ML models. I feel like I've been fighting this fight with investors from the very first day where people-

Sandhya Hegde:

I apologize on behalf of my entire profession.

Lukas Biewald:

No, no, I get it. I actually totally get it. And maybe it's stupid, but it's just so clarifying to have a single ICP. I really think that's one of the things at the core of why our product is so loved. You could argue that we should have two profiles: there's an MLOps persona, there's an executive persona. These personas do matter, but when you try to please a lot of people, you obviously don't end up really pleasing anybody, and I just said, "Look, the product is made for this particular persona. I could tell you all about her." And that's just been our true north since the company started. I've been obsessed with keeping the focus on that through tons of objections where people were like, "Okay, this is too narrow. It's too specific. The market's bigger. You can make more money selling into executives." Maybe all true, but at least I think we have a cohesive strategy and a cohesive product.

Sandhya Hegde:

So I have a follow-up question there, which is: when you talk about building only for this ML practitioner, the person who is going to build, deploy, and manage the models, what are examples of adjacent needs or use cases you've had requests for but have so far said no to in order to maintain your focus, even if someday, as the company grows, you might think about them?

Lukas Biewald:

Oh, there's so many because the space is so exciting and growing so fast. There's a lot of demand for different things. So the obvious place where we could expand into and I reserve the right to do this, but-

Sandhya Hegde:

I won't implement it. I won't implement it.

Lukas Biewald:

Oh, thank you, thank you. So there's this MLOps persona that's really important inside companies, and they're actually not ML practitioners typically, right? They're usually people with more of a DevOps background who are trying to make a reliable internal platform to deploy the models, and I have a lot of empathy for them, right? That's a hard job. They're doing what we do, but internally, trying to get ML practitioners to follow a cohesive process and make their stuff reliable. And so one thing that they really want is infrastructure. And we've really decided not to do infrastructure, even though I think it's a huge pain point. The place where you run your models and making that reliable is, of course, a huge problem, but it just felt like a big distraction and we want to integrate with other people's infrastructure. We don't want to compete with all the infrastructure providers.

And so that's been one of these things where it's like, "Wow, there's so much money to be made in that, but we want to maintain our focus." And the ML practitioner, I think, cares less about this than the MLOps persona. So if you were serving them, you'd definitely orient towards infrastructure. And then there's the executives that care about all kinds of different stuff than the ML practitioner. They often feel like they just want a checklist of features. So the practitioner wants tools that solve their problems that they face daily, right? And they actually want flexibility to know that, "Hey, if there's a different tool that's better, I can let go and use that."

For some reason, executives really seem to want to buy one product that will solve all their problems. And I actually think it's a terrible idea, but it is very pervasive in the market. So I'm at the point where it's like: look, if you pull out a feature checklist, we just lose. We lose to every competitor. We have fewer features than all our competitors, and we just do. And if you're just looking to check every box in an MLOps platform, we're going to lose. We do have good integrations with other tools that will check the boxes where we're missing, but we're just not going to make crappy software.

Sandhya Hegde:

I think that is very strategically aligned to also pursuing a bottom-up motion, right?

Lukas Biewald:

Totally, yeah.

How Weights & Biases created awareness of their product

Sandhya Hegde:

Where, yes, the exec cares about, "Oh, I need to buy software that solves the 17 problems. If you could do all 17, it makes my work easier, so could you ..." But your end user's like, "No, I only work on two of those 17 and I want the best solution for those two." So if you're focusing on the customer who's actually using you, it makes more sense to be narrow and not build the others until you have the right scale. I'm curious, going back to 2020 when you first started taking off, your UX is now better, thanks to Hamel, and the word is spreading. So I have two questions. One, what were the tactics you were using to grow awareness about Weights & Biases? What worked or didn't work in terms of growing that early user base? And two, what were the surprises for you in terms of what people were saying about it and how people were using the product? Do you remember any nuggets from the early days where you're like, "Oh, I did not see that coming"?

Lukas Biewald:

I think both of these ideas were due to our now head of growth. So we had this woman, Lavanya, who was an engineer, but she really had a knack for growth. And so she ended up driving a lot of our growth. Of course, it was a team effort, but I feel like she had these clever ideas that were actually surprising. So one thing that we did was we made this feature called reports. And actually it's funny, there's this engineer, John, who built reports, and he told me, "This is so stupid, Lukas. People don't want to share their stuff inside of the application. They want to put it in Notion or some nice place to paste it."

And I was like, "I think they do want to share it inside of Weights & Biases," but I actually had some self-doubt and I remember really doubting. I was thinking John is running a project that he thinks is stupid like, "This seems like it's invested for failure," and then actually, he made such a beautiful reports feature, he really wanted to use it. He did actually just a killer job with it. I couldn't believe it because usually that's bad management, right? People should believe in their task. And then what Lavanya did was she got our users to publish reports publicly and then she worked with this other guy, Axel, to actually SEO it really well, which I was also really skeptical of. I was like, "I don't know, does SEO really work?" For us, it really works.

So people were publishing lots of content, very long-tail content. I'm talking super technical, maybe there's a hundred or a thousand people interested in it, but those people are very interested and so they keep searching for it. And we're the only place where it exists, just comparing one gradient descent method versus another. And so that became our best way of getting users onto our platform, because the cool thing is people were using our platform to publish their work and then other people are actually seeing how the product works through SEO first.

And so a lot of people, they know us as this place where they learn about machine learning for a long time before they ever even go in and use it. So that's actually our main way of finding users, and I really like it because it's very organic and it also just grows steadily. So at first it was small, really small. For a long time, it was like, "Oh, we got five new users last week from this strategy," but Lavanya really believed in it and I think she saw that you could grow it over time, versus if you get press once, you might get a huge influx of users, but it's just that one time and it doesn't sustain. This is much more steady.

Sandhya Hegde:

Makes sense. And there was no confidential data in the things they were publicly publishing, so users felt comfortable doing that?

Lukas Biewald:

Well, it's funny, right? Because reports is used for confidential data internally. And then also, in our market, machine learning, there are a lot of academics, and a lot of academics use our product. So we've made it free for them, and they especially like to publish things, which we love actually. Every so often, we can convince a customer to publish something publicly because they want to share, they want to recruit or something like that, and those tend to work really well also.

Sandhya Hegde:

Yeah, it's funny, one of the things that really helped Amplitude spread within the companies once we were adopted was a very similar feature where you could share a public URL link for some specific dashboard or chart with everyone in the company and people could click and interact with it even if they weren't signed up to use Amplitude at all or didn't have access to Amplitude. Because in big enterprises, no one really has true data democracy in terms of who has access to what data for good and bad reasons. And you couldn't really do that with any existing analytics tool. Either you had access to the whole tool or it was screenshots. You couldn't really have anything interactive.

So if there are any founders working in data listening to this, here's one guaranteed tip that you better try as well. No, that's a really awesome story. And were you getting any surprises in the customer feedback once people started using Weights & Biases, discovering it in the wild? What were some good and bad surprises for you?

Lukas Biewald:

Well, I'll say one really positive surprise has been how passionate people are about their experiment dashboards. I still don't feel like I quite understand the level of enthusiasm that people have. And I've been doing this a long time, so at my last company, we collected training data and it was very useful, but with Weights & Biases, people will sometimes just post videos of how much they love the product, or they'll write in and say, "Hey, I found this bug and it's a bad bug," but they'll be so nice about it. They'll be like, "But I love your product and don't worry about it. It's awesome." This is a podcast for founders. I don't want to make anyone feel bad, but that really has been an interesting surprise, how passionate people have gotten about the product.

I think the other surprise maybe has been that the requests people have have been really interesting, and they've led us in interesting directions. We have this new product called Launch, which was really pulled from tons of user feedback of wanting to basically launch jobs from the Weights & Biases interface. And that's a funny one because I think I'm a little bit older than our audience and I'm just really comfortable inside a terminal. I just SSH into a machine and run stuff. And I think that our audience seems very interested in web UIs to do more. That's been a real learning. In general, notebooks are really pervasive among our audience in a way we didn't expect.

I think me and my co-founders often joke that I'm the one on our founding team who understands infrastructure the least, and they sometimes talk about how they build for me, because I'm a little hazy on how Docker works and how Git works. And so I think a lot of our product is abstracting away some of these concepts for machine learning people who are like me. I don't know, I can google stuff, but I'm a little bit confused about ... Yeah.

Sandhya Hegde:

Right, so, "You have to pressure test it by seeing if Lukas can make it work."

Lukas Biewald:

Yeah, yeah. I'm like, "What Lukas thinks about it." Also it's funny, there's another joke that I can't read. I'm illiterate. And so what they do is they make the quick starts really short, but I actually think it's really important. I don't know why everyone makes their quick starts too long. And it's funny when I'm looking at a new product to consider using it, I feel like when the quick start is too many steps, I really don't want to use it.

Sandhya Hegde:

Yeah, no, this is the big challenge for all product-led companies, companies that aspire to have great self-serve experiences: "How long is your customer's attention span?" right? And, "How do you keep their attention every five seconds?" I think that is a really hard thing to get right no matter how easy or complex the underlying product is. And maybe those student classes you did were a really, really formative experience to help you push that bar really high on ...

Lukas Biewald:

Yeah, those were amazing because what that really showed me is, if people get confused at any step, then they'll just go away. It's just really interesting ... People always get confused, and we thought it was so obvious what to do, but that's worse than a bug. It's showstopping. If someone can't figure out how to just paste the authentication token into the right code, you'll never even see them complain about it. So I felt lucky that we had those classes to torture students with in the early days.
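(Editor's aside: to make the onboarding point concrete, here is a minimal sketch of the kind of short, one-token quick start Lukas is describing, again using the wandb client. The API key value and project name are placeholders, and the exact steps in Weights & Biases' own quick start may differ.)

```python
import os
import wandb

# Option 1: provide the API key via an environment variable before the run starts.
os.environ["WANDB_API_KEY"] = "YOUR-API-KEY-HERE"  # placeholder, not a real key

# Option 2: call login() explicitly; it reads the key from the environment
# (or prompts for one) and stores it for future runs.
wandb.login()

# A couple of lines of actual usage is all a quick start needs.
run = wandb.init(project="quickstart-demo")  # project name is illustrative
wandb.log({"sanity_check": 1})
run.finish()
```

If the token-pasting step fails silently, the user simply leaves, which is exactly the failure mode the classes exposed.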

Sandhya Hegde:

Maybe the question I really want to ask you is, if you were starting Weights & Biases now versus in 2018, in hindsight, would you do anything differently? Is there a different set of use cases or a different way that you would build Weights & Biases, the product itself, not the long-term vision, which I think is super well aligned to the future that we are all probably going to live in?

Lukas Biewald:

Yeah, I think the biggest thing that I would do now, and that I'm trying to do now, is ... I think software developers now really can do a lot of ML, and that's different than a few years ago. So I think that's the big change. And I guess for me, I view it as an expansion of our user persona, but they have different needs and they're coming from a little bit of a different place. I think when you're a startup, you can adapt so much faster than a big company to what's going on. And so I would be all in, I think, on LLMs and, "How do we support this?" And of course, you don't see a lot of production use cases yet, but I think it's obvious that this stuff works super well. It's unclear what tools need to be made to make those work. But we're also trying to do that, so please don't compete with me.

How Weights & Biases built its early team

Sandhya Hegde:

Right, we won't. So switching gears a little bit to how you built your team: I think you grew a lot during the pandemic, going from about 20 people to the team you have now. What's been your approach to leadership? What are lessons from the CrowdFlower days that you brought to the new company with you, and how are you trying to create a strong, connected culture?

Lukas Biewald:

Well, I think one thing that I really, really am obsessed with is goals and orienting around goals. And that's something I was obsessed with even when we were four people, but it's just more and more important as we grow. You have alignment naturally with four people, but even then, there are these questions like, "Should we be oriented towards optimizing revenue or optimizing active users or optimizing retention?" And that clarity on what we're trying to do is something I'm really obsessed with, because I've just watched people get confused and get misaligned. Especially since we're a remote team, we try to have real clarity.

And then also, I think, as I get older, I'm more obsessed with writing things down because it just feels like a much more scalable way to communicate than talking face to face. So we create a lot of docs, which I actually just think is a better way to run a business. When you talk, it can be a little bit sloppy and people forget. It's the written documentation that sticks, and probably people that work for me are laughing like, "We should do more of that," but definitely, I do way, way, way more of that than I used to. And actually, as we've gotten bigger ... CrowdFlower got to about a hundred people and stalled out. I think we were 130 or 140 when we sold it, but it was a much slower run.

And so I used to think a lot of the best practices that executives coming from a Facebook or something would bring were just unbelievably stupid. I just thought, when people go to big companies, something happens to their brain where they get all this dumb orientation. But now I realize why it happens. When you're scaling fast, you really do have a different set of challenges. But I think it's important, when you're small, not to just blindly copy what a really high-growth company is doing. It's so bad because you're doing exactly the wrong thing in a lot of cases.

Lukas Biewald's advice for early-stage founders building ML products

Sandhya Hegde:

I'm curious, what would be your advice for early-stage founders and developers that are trying to build their first product, build prototypes that are AI-native today? What would be your advice for how to think about focusing on the end customer, the product, and the use cases versus figuring out all of the infrastructure, and what are the right choices to make around infrastructure and tools? And if you were writing the definitive guide to building ML products, sponsored by Weights & Biases, what would the intro say?

Lukas Biewald:

Well, I don't know if this is ML-specific, but I just feel like everyone gets this wrong, so it's worth saying, which is like, I would just make something for one person or two people or three people and go from there. Everybody's worried that they're going to end up doing consulting or something, but I just don't think that really happens that much. When we launch a new product, for example, we don't, from the beginning, try to grow it as fast as possible. We try to build it for specific people that we know and we talk about and try to make them really happy.

And I think it's even okay to prioritize the weird things that they want, out of respect for their time, where you're just quickly iterating on, "Okay, you want that weird thing? Fine, we'll do it." Because actually, even now, we have a huge amount of reach, I think we have a good brand, we have all these customers, and still, when we launch something new, nobody wants to use it. They still don't want to use it and you just have to push through it. I think one thing that people don't realize, and I think even YC doesn't tell you well enough, is how much people are not interested in what you're doing and how much you have to beg people to use your software. I still have all these tricks, essentially SDR tricks, of sending tons of reminders and emailing people over and over.

I expect to email people a million times before they'll meet with me, and then when you get feedback ... I think once you actually show somebody that you really will implement the stuff they're asking for, I think then you have a friend for life too. So that's my big advice, is make a really small number of people really happy. It's harder to do than you think.

Sandhya Hegde:

That's really great advice. And I actually have a hot take from a previous podcast episode where the co-founder of Amplitude, Curtis, and I were talking about exactly this: "What is the approach you take to idea validation, message testing, and prototyping for different types of products?" And where the two of us landed was: if the promise of the product, the idea you're trying to validate, is something that plenty of people have promised for a long time, your customers are not paying attention anymore. "I have been pitched that snake oil again and again. Until you have something that is truly working, something I don't have to imagine is real, don't bother me."

As opposed to if you're doing something where the problem is new and no one's really made promises about it before, it's way easier to do idea validation just with message testing that, "Okay, this is something people are looking for. They'll pay attention," as opposed to for problems that are more evergreen, even if the problem is big and deep, people are paying less attention because they're jaded, right? They've already seen it, heard it too often, so they're tuning it out now. So that was-

Lukas Biewald:

Interesting. I have to say, I don't know if I agree with that. I think people always tune you out. I've never had that other experience. I feel like I'm usually making a new product in a new space and I just feel like no one pays attention to me in any scenario. Maybe you're just thinking of a different market. I think it's just the universal experience. And I'll say my co-founder, Chris, is a genius at this. The more you can make your demo look like it's real in all the little details ... People show me demos and it's like, "Don't put demo in the URL." You know what I mean? And our engineers, with demos, they want to put all these warnings on them, and it's like, "You can't do that, right? You've got to show somebody something and really be like, 'This is the thing.' You can't be hedging, 'Oh, this is a weird demo.' You've got to really show them something real to get any feedback," because if somebody thinks they're looking at a demo, or worse, a PowerPoint, they just don't actually engage with it.

Sandhya Hegde:

Yeah, yeah, "Show me the thing working."

Lukas Biewald:

Yeah, yeah, yeah, totally.

Sandhya Hegde:

Yeah. Well, this is awesome. So listeners, if you loved this, there is more of it. Weights & Biases has an awesome podcast hosted by Lukas called Gradient Dissent. It's one of my favorite ML podcasts. Please go check it out. And, Lukas, thank you so much for joining us today. I enjoyed this immensely.

