Dec. 24, 2025

#558 AI Is Easy to Build, Hard to Deploy: Data, Evaluation, and ROI with Bryan Wood


AI models are becoming commoditized, but deploying AI systems that deliver real ROI remains hard. In this episode, Mehmet sits down with Bryan Wood, Principal Architect at Snorkel AI, to unpack why data-centric AI, evaluation, and domain expertise are now the true differentiators.

Bryan shares lessons from working with frontier AI labs and highly regulated enterprises, explains why most AI projects stall before production, and breaks down what it actually takes to deploy AI safely and at scale.

👤 About the Guest

Bryan Wood is a Principal Architect at Snorkel AI, where he works closely with frontier AI labs and enterprises to design high-quality, AI-ready datasets and evaluation frameworks.

He brings over 20 years of experience in financial services, with a unique background spanning banking, engineering, and fine art. Bryan specializes in data-centric AI, programmatic labeling, AI evaluation, and deploying AI systems in high-compliance environments.

LinkedIn: https://www.linkedin.com/in/bryanmwood/

🧠 Key Takeaways

• Why AI success is less about models and more about data and evaluation

• How enterprises misunderstand ROI and why most projects stall before production

• The difference between benchmark performance and real-world trust

• Why evaluation must be bespoke, not off-the-shelf

• How frontier labs approach data as true R&D

• Why partnering beats building AI entirely in-house today

• What’s realistic (and unrealistic) about autonomous agents in the near term

🎯 What You’ll Learn

• How to move from AI experimentation to production deployment

• How to design data that reflects real enterprise workflows

• How to identify where AI systems actually fail, and why

• Why regulated industries are proving grounds, not laggards

• How startups can overcome data and talent constraints

• Where AI is heading beyond today’s LLM plateau

⏱️ Episode Highlights & Timestamps

00:00 – Introduction & Bryan’s background

02:30 – Why data is now the real AI bottleneck

05:00 – Models are commoditized. So what actually matters?

07:45 – Why AI evaluation is harder than building AI

11:30 – Enterprise misconceptions about AI readiness

15:10 – Hallucinations, RAG failures, and finding the real problem

18:40 – Why most AI projects fail to show ROI

22:30 – Partnering vs building AI in-house

26:00 – AI in regulated industries: myth vs reality

30:10 – Startups, cold start problems, and data moats

33:40 – Scaling data operations with small teams

36:00 – What’s next: agents, data complexity, and AI timelines

39:00 – Final thoughts and where AI is really heading

📌 Resources Mentioned

• Snorkel AI – Data-centric AI and programmatic labeling: https://snorkel.ai/

• Enterprise AI evaluation frameworks

• Frontier AI lab research practices

• MIT studies on AI ROI and enterprise adoption

[00:00:00] 

Mehmet: Hello and welcome back to a new episode of the CTO Show with Mehmet. Today I'm very pleased to have joining me Bryan Wood, who's a Principal Architect at Snorkel AI. Bryan, the way I love to do it with all my guests is I leave it to them to introduce [00:01:00] themselves. So, a little bit about you, your background, your journey, and what you're currently doing at Snorkel.

And then, you know, we're gonna discuss a couple of topics, of course related to AI, as the audience can probably guess from the company's name. Uh, we're gonna talk about data labeling, AI labs, uh, you know, and compliance. Some, some use cases as well that you've been doing. I'll not speak a lot.

I'll leave it to you. So the floor is yours, Bryan.

Bryan: Cool. Um, thanks for having me. Um, Mehmet, uh, really excited to be here. Uh, like you said, my name's Bryan Wood. I'm a forward deployed principal, uh, engineer at Snorkel AI. Um, prior to this, I, I spent about, um, 20 years in the banking industry and my background before that was, was in fine art.

So I have a bit of a meandering career arc there. Um, the main focus I have currently is working with frontier AI labs to design, um, basically, uh, data sets for experiments that they're trying to run. So, you know, kind of figuring out [00:02:00] like how to partner with their research teams, bring in our research team and, you know, help them kind of build the state-of-the-art models.

Mehmet: Great. And thank you again, uh, Bryan, for being here with me today. Um, let's start with, you know, the basics of, of what you currently do, Bryan, and the need of, uh, having, you know, the data. So what kind of, you know, value do you bring, uh, when we talk about programmatic data labeling and data-centric AI?

Mm-hmm. So, what are enterprises, or even maybe startups, missing today, and where does what you're doing currently at Snorkel make sense to them?

Bryan: Yeah, sure. Um, I would say that, you know, the, the main focus we have is on designing these data sets because, you know, as AI has become more capable and more complex, um, the type of data needed to both benchmark it [00:03:00] and tune it,

um, has gotten, you know, significantly more complex. And, and I think, you know, maybe the, the best way to frame this up is that, you know, we really think about data as like R&D. So we're working with R&D, uh, partners in, in labs that are building frontier models and, you know, we're doing R&D and we're doing experiments that, that really focus on the, the shape of the data.

And you know, how that data is gonna model an AI agent or LLM's behavior. Um, and, and so it's really more of like a, a research sort of angle on data, which, which is really interesting. Um, I, I think the main difference, you know, between what I'm doing with labs and how enterprises are managing data is, um, you know, enterprises

tend to treat data as like a byproduct of a process. You know, it's something that kind of moves between systems, you know, maybe lands in a data lake or something. And um, you know, it'll go through all these data quality processes. [00:04:00] Um, and you know, that's kind of table stakes for data management, but it's not really

putting the data in a form that's gonna guide or evaluate AI. Um, and there's, there's quite a bit of, you know, research and art that goes into, um, how to actually leverage that data. And that's something these labs are like really, really good at. And, um, so yeah, my main focus is basically collaborating with those labs on what these data sets need to look like, um, you know, to push the frontier.
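[Editor's note: Snorkel's name comes from programmatic labeling, which Mehmet asks about above. As a rough, editorial illustration only, and not Snorkel's actual platform API, here is a minimal sketch of the idea in plain Python: several noisy, hand-written labeling functions vote on each example, and the votes are combined into a single training label. Every name in the sketch is hypothetical.]

```python
# Minimal programmatic-labeling sketch (illustrative only, hypothetical names).
from collections import Counter

ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

def lf_mentions_refund(text: str) -> int:
    # Weak heuristic: refund requests are treated as the positive class.
    return POSITIVE if "refund" in text.lower() else ABSTAIN

def lf_mentions_thanks(text: str) -> int:
    # Another weak heuristic voting for the negative class.
    return NEGATIVE if "thank" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [lf_mentions_refund, lf_mentions_thanks]

def label_example(text: str) -> int:
    """Combine noisy labeling functions by majority vote, ignoring abstains."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    return Counter(votes).most_common(1)[0][0] if votes else ABSTAIN

print([label_example(t) for t in ["I want a refund now", "Thanks, all resolved"]])
```

[In practice a learned label model, rather than a simple majority vote, would weigh each function by its estimated accuracy, but the core idea is the same: encode domain expertise as code once, then label at scale.]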

Mehmet: Got you. So Bryan, is there a misconception here? 'cause you mentioned the, the model. So is there a misconception you see still, uh, with the teams that you work with about, um, you know why it's not mainly about the model all the time, it's also about, you know, the, the quality of this data. Is this something that maybe even some leaders are misunderstanding or they have like maybe misconception about it?

Uh, when, when it comes [00:05:00] to, to data quality, when starting any AI project. 

Bryan: Um, y yeah, I think so in the enterprise space, um, y you know, in the research lab space, I think there's a huge focus on data, very clear understanding of, of the importance of it and its connection with R&D. Um, I, I think in the enterprise space though, you do still see some focus on the models.

Um, you know, I think that a lot of these models are kind of, um, you know, sort of harmonizing to, to sort of have a lot of the same capabilities. I mean, I, I use all of the models a lot of times interchangeably, I'll find, you know, certain strengths and weaknesses. Um, you, you know, over the past few years though, I had heard several enterprise leaders, you know, that had the position of, you know, the models are gonna keep getting better and better and, you know, we're gonna hold off on, you know, tackling these hard problems because, you know, we think in six to 12 months, um, these problems might go away.

Uh, and, and so I think I don't see that [00:06:00] as much now. I, I think more of the misconception recently is on evaluating the models and what it actually takes to get these into production. There had been a lot of focus on just the performance of the models, how does it do on benchmarks. Um, there had been a lot of focus on like agentic infrastructure, which is like, how do you put all this stuff together?

I think what we're, you know, people are starting to realize is the, the models are kind of commoditized. I mean, you should mm-hmm. Pick, pick the ones you want for, for whatever reasons you want. Um, probably a good idea to not be locked into a specific family. Um, you know, maybe have ones you can run locally, you know, so d different enterprises just have different needs there.

Um. And so I think what you're seeing in the enterprises that are actually getting this stuff into production is they really are starting to focus on evaluation, which is, you know, we have to do this safely, right? Like building the thing is, is not necessarily that hard. The models are available, but how do you build trust in it?

Um, which, [00:07:00] which kind of comes down to evaluation. And there's, there's, I've seen misconceptions in evaluation where, you know, people just kind of buy a, a package of evaluations or software and, you know, that might, that evaluator might say, okay, your, your attribution is at 85%. And it's like, okay, well, well what does that even mean?

You know, does it mean anything for the process that we're running? Do we, do we agree with it? And I, I think, you know, one of the things that people are maybe starting to realize is that evaluation, for an enterprise, is probably almost always going to need to be bespoke. You're going to have to define what is actually important to your process and, you know, not just have humans annotate whether it's good or bad, but actually have humans annotate the whole workflow

in a way that guides AI so that you have like real data for that eval. And you know, all that's a lot of hard work, which is why I think you sometimes see this sort of lag [00:08:00] in AI deployment. You know, you see a lot of headlines about, you know, mm-hmm, AI projects not leading to ROI and things like that.

And I think it's because, you know, and I realize it's getting to be long answer, but like, I think it's because you see these projects slowed down at like stage three or four, right? It's like we got access to the models, we got the agent set up, we got some evaluators, and it all seemed really easy up until that point.

Then you sort of hit this wall where, okay, are we evaluating the right things? How do we even set up the data to do that evaluation? And that's where a lot of expertise and, you know, art is required to do it correctly. And I think that's where you're seeing things stall out right now. 
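[Editor's note: as a rough sketch of the bespoke evaluation Bryan describes, assuming you already have expert-annotated reference workflows. The criteria, field names, and checks below are hypothetical placeholders, not an off-the-shelf evaluator.]

```python
# Sketch of a workflow-specific evaluation harness (hypothetical names throughout).
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class AnnotatedCase:
    prompt: str
    reference_steps: List[str]   # what the expert actually did, step by step
    reference_answer: str

Criterion = Callable[[AnnotatedCase, str], bool]

def covers_expert_steps(case: AnnotatedCase, output: str) -> bool:
    # Did the output reflect every step the expert marked as essential?
    return all(step.lower() in output.lower() for step in case.reference_steps)

def final_answer_correct(case: AnnotatedCase, output: str) -> bool:
    return case.reference_answer.lower() in output.lower()

CRITERIA: Dict[str, Criterion] = {
    "covers_expert_steps": covers_expert_steps,
    "final_answer_correct": final_answer_correct,
}

def evaluate(cases: List[AnnotatedCase], generate: Callable[[str], str]) -> Dict[str, float]:
    """Report a pass rate per criterion instead of one opaque score."""
    totals = {name: 0 for name in CRITERIA}
    for case in cases:
        output = generate(case.prompt)
        for name, criterion in CRITERIA.items():
            totals[name] += int(criterion(case, output))
    return {name: count / len(cases) for name, count in totals.items()}

# Example with a stubbed model call standing in for the real system:
cases = [AnnotatedCase("Is the borrower eligible?", ["debt service coverage"], "eligible")]
print(evaluate(cases, lambda p: "Eligible: debt service coverage ratio is 1.8x"))
```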

Mehmet: Right Now you mentioned, uh, agents, you mentioned, you know, the, the models, um.

And, you know, the applications today are, you know, like literally limitless, right? So, so we can apply, um, to a lot [00:09:00] of domains, but for example, in areas Bryan, like let's say, which they have high compliance, high risk, uh. Probably maybe financial services that you, I know you have a, a background from there.

So how, how do you ensure also the domain expertise is properly codified when using, you know, the labeling, for example, and are there any limits to this approach? Because again, you know, what I have heard from, uh, experts is that one of the main things, like people are still, um, kind of, um, I would say, you know, they, they think about it a lot or maybe they are hesitant.

You know, it's, it's about this piece, like, okay, like how, how good would it be in something that is, you know, maybe simulating transactions or. I don't know, like if, if you're trying to, to create data sets for, uh, I'm, I'm simplifying it, of course, but you know, like for, for detecting frauds, right. So, mm-hmm.

[00:10:00] Uh, tell me more about, about your approach at snorkel and, you know, of course, based also on, on your experience as well. 

Bryan: Um, yeah, I think the, the main thing is gonna be like, you have to actually look at, at what sort of workflow you're trying to automate with these agents, um, and, and really build your data sets around that.

So, you know, for example, if you're trying to, like, my background is, is largely commercial banking. So, you know, like lending to large businesses, um, you know, if you were looking to automate a workflow with that, like what an underwriter or analyst might do in a commercial bank. Um, a lot of times, you know, financial services places, they have tons of data and it's just kind of like this data that's a, a byproduct.

It's, it's in like a transactional database or something like that. And where they struggle is translating that into data that's going to guide the behavior of an AI, um, you know, 'cause you can't just say like, well, we have everything in a data warehouse. It's, [00:11:00] it's past the quality checks. You know, we've got metadata, our data's in good shape, and, and expect it to really drive an AI workflow. Um, you really need to set up data sets that look like what that person is doing in that job and, and that's gonna provide the patterns that the AI learns.

And so I think that's one of the places, a, again, that's a very hard problem to solve. And so I think that's one of the places where, um, you see some projects, um, you know, get hung up, because it's like, what does that data need to look like and, and how do you evaluate it? Um, and a lot of times the gap there is, is simply expertise on, you know,

how to manipulate that data into something that's useful. Um, for the most part, you know, it's something you're gonna have to create. Um, you could use synthetic methods to, um, you know, automate some of that. Um, but you know that it's not gonna be just simply [00:12:00] backing up, you know, uh, uh, a tuning process into a data warehouse.

That's what I'm trying to get at. I think. 
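[Editor's note: a minimal sketch of the reshaping Bryan describes, turning a "byproduct" warehouse row into a workflow-shaped record that mirrors what an underwriter actually does. All field names, steps, and thresholds below are hypothetical.]

```python
# Sketch: from a raw warehouse row to a workflow-shaped training/eval record.
raw_warehouse_row = {
    "loan_id": "L-1042",
    "net_operating_income": 1_250_000,
    "debt_service": 410_000,
    "collateral_value": 900_000,
}

# The same facts restructured as the steps an underwriter takes, with the
# intermediate reasoning an AI agent would be tuned or evaluated on.
workflow_example = {
    "task": "Assess commercial loan request",
    "inputs": raw_warehouse_row,
    "steps": [
        {"action": "compute_dscr",
         "result": round(raw_warehouse_row["net_operating_income"]
                         / raw_warehouse_row["debt_service"], 2)},
        {"action": "check_collateral_coverage",
         "result": raw_warehouse_row["collateral_value"] >= 750_000},
        {"action": "draft_recommendation",
         "result": "Approve with covenant on minimum DSCR of 1.5"},
    ],
    "expert_label": "approve_with_conditions",
}

print(workflow_example["steps"][0])  # {'action': 'compute_dscr', 'result': 3.05}
```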

Mehmet: Right. Now you, you mentioned also something, uh, interesting. Um, and, and here it's about, you know, the, um, the amount of time, probably, you know, I think teams put some time, uh, to, to get things to work. So, so there are a lot of things that might go wrong there.

And, you know, we've heard also about like, you know, how data sets, if, if they are not like, um, like properly, I would say, cleaned up, I dunno what's the right word, you're the expert in this, you know, so, so they might look biased, the data might look biased. So sometimes, you know, um, we've heard about also the, the, the hallucination problem, and this is where the RAG thing comes into play, so.

Do, do you help somehow also in isolating, you know, where the problem is exactly coming from? Is it [00:13:00] like from the data labeling logic? Is it like the model architecture? Like how does the workflow work, if, if that makes sense, when, when you engage with the, with your customers?

Bryan: Yeah. Yeah, a hundred percent. 'Cause you know, the, the whole purpose of, of a lot of these evals and the data sets that drive them is to find those error modes, um, like, like you're discussing. Because, um, you know, a lot of times, like the, the capabilities of AI are, you know, described as like a jagged edge, because there's gonna be certain things where it's, you know, captured this capability and it's very good at it.

And then there's gonna be things that sort of look very similar to that. But the AI hasn't quite solved yet. Um, so, you know, the, and, and this is, you know, in some ways kind of a traditional software development loop, right? Where you are, you're deploying things, you're finding bugs, you're finding error modes, and you're fixing those.

And that's a, that's a very iterative process. And on the enterprise side, it's, it's one of the main things we do with our customers is [00:14:00] we help them set up benchmarks, evaluations. And then basically act on those because each time, you know, we run some experiment, update the models, it's, and, and then rebenchmark things, it's going to reveal something about that model or, or systems behavior, um, you know, that needs to be fixed or, or maybe it's additional slices of data that you need to look at.

Um. Uh, for, for creating your benchmark. And, and so it, it does in some ways, you know, there's, there's the art to setting up the benchmarks. Um, but once you have good benchmarks, it does become a, a very iterative process of simply fixing the bugs. And, and that could be done through, you know, some of the things you mentioned, like are there issues with the data labels, right?

Like, is the data just wrong or, or poor quality? Um. Or, you know, is it a hallucination issue? Is, is there something wrong with like a RAG pipeline? Things like that. And, and obviously, like, we're not just talking about like a model where you need to [00:15:00] retrain it. These are usually like full systems with a lot of different integrations,

API calls, RAG setups, and so a lot of the work on the evaluation is really pinpointing where it went wrong. Um, 'cause these processes tend to be like stepwise. So, you know, if it gets through steps 1, 2, 3, and then it fails, you know, kind of, the whole thing fails. And if you're just looking at that end result and you're seeing failure, it, it's informative, but it's not telling you where your problem is.

You need to know, okay, it failed at step four and it's failed for this very specific reason. And it's finding those specific reasons that can be very difficult if you're trying to use, you know, what I would say is like off-the-shelf benchmarks or off-the-shelf evaluations, because they're not likely to tell you very much about your, your actual data or, um, you know, the behavior of your system.

These things are gonna tend to be very generic. Um, not saying they're not useful, but you know, you'll have to really [00:16:00] set up a, a bespoke evaluation for your own system, I think, to get a good signal.
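[Editor's note: a small sketch of the stepwise evaluation idea, where each stage of a pipeline is checked separately so a failure names the stage (retrieval, drafting, citations) rather than just producing one opaque end-to-end score. The step names and checks are toy placeholders.]

```python
# Sketch: run a multi-step pipeline with per-step checks to pinpoint failures.
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[dict], dict], Callable[[dict], bool]]

def run_with_checks(state: dict, steps: List[Step]) -> dict:
    """Run each step, validate it, and stop with a precise failure reason."""
    for name, run_step, check in steps:
        state = run_step(state)
        if not check(state):
            return {"passed": False, "failed_step": name, "state": state}
    return {"passed": True, "failed_step": None, "state": state}

# Toy pipeline: retrieve -> draft -> verify citations.
steps: List[Step] = [
    ("retrieve",  lambda s: {**s, "docs": ["policy_doc_7"]},
                  lambda s: len(s["docs"]) > 0),
    ("draft",     lambda s: {**s, "answer": "Yes, it is covered."},
                  lambda s: bool(s["answer"])),
    ("citations", lambda s: s,
                  lambda s: all(d in s["answer"] for d in s["docs"])),
]

# The draft never cites the retrieved document, so the report names the
# "citations" step as the failure point instead of a generic fail.
print(run_with_checks({"question": "Is water damage covered?"}, steps))
```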

Mehmet: Makes a lot of sense. Now, one thing you mentioned, Bryan, also minutes ago, is about the ROI and figuring the ROI.

So, um, maybe without mentioning names of customers, I can understand, you know, the sensitivity here. But you know, if you can give us an example on, on the ROI, because you know, we are seeing sometimes also these reports coming out in, in the media saying like, hey, based on these studies, like, X percent of companies, they tried to deploy AI,

they didn't see the ROI, or like, for example, they're struggling. So when, when you speak to, to, uh, customers after a successful, uh, engagement with them, like what, what would be like the major ROIs that they can immediately see? And maybe some of them, I understand, might be on the short run,

a few would be on the long run. So what are like these ROI use [00:17:00] cases that you have seen working?

Bryan: Yeah. Most of what we still see is, is what I would call like traditional opex or operational excellence. It's, it's largely, you know, cost cutting, like find manual, repetitive processes. Um, which, you know, companies have always automated processes with software, whether it's, you know, just system integrations, robotics, whatever.

AI is giving you a new capability to move, you know, up the value chain on that automation. And I think that's, that's largely the theme. And, you know, ROI can be in the, you know, tens to hundreds of millions for, for enterprises. So these are super valuable use cases. Um, when you think about the ROI for the frontier labs, you know, that could potentially approach the billions of dollars because, you know, they're competing for, um, you know, a much larger kind of like long-term ambition.

Um, and, and so we, we do see, you know, healthy ROIs, [00:18:00] um, with the use cases we're engaged in. I think, you know, one thing I'd wanna call out, 'cause you know, you did mention, um, some of the, the studies and, and articles around companies struggling with ROI. So I, I'll get the numbers wrong 'cause I don't remember 'em off the top of my head, but you know that, that MIT study that came out earlier this year is probably the most well known.

It said something like 85, 90, whatever percentage of companies, you know, fail to realize ROI. And to me, you know, the most interesting thing in that article was the breakdown between companies that tried to do this themselves versus companies that partnered. And, you know, basically the, the punchline was that the companies that are getting ROI are partnering. They're bringing in, you know, whether it's

consultants, you know, vendors, startups, um, you know, companies that deploy forward deployed engineers like myself. Those are the companies that are getting value out of AI. Um, whereas the companies that are trying to do this all homegrown, you know, hire their own [00:19:00] talent, build their own systems, are, are largely not realizing ROI.

So I think what that really points to is: one of the bottlenecks on this is, is expertise. And the market is super competitive for people, um, that can build and deploy AI systems that have ROI. And it's largely, I would say, like more practical to try and partner with a company that has a track record of doing this, um, versus trying to build it yourself.

I think that'll change over time. Right now, this is new and there's not that, you know, there's just relatively few people with the muscle memory of how to do this. And again, those, those people are gonna be highly sought after. So, you know, those are the people, the companies that you need to probably try and reach out to and partner with.

If you try and do this yourself, you're gonna be in a very competitive place trying to get the, the people that know how to do this. Um. Again, I think that's because AI is so new right now. As with any [00:20:00] technology, there's gonna be a small group of people pushing the frontier. As AI becomes just part of the stack and everyone knows how to use it, this, this problem will go away.

Um, but I think what it points to in the short term is that, you know, companies are better off partnering, um, if, if they wanna achieve ROI. Th this is something I did when I was in financial services. So, you know, before I joined Snorkel, I was a customer of Snorkel. Hmm. I was a customer of Snorkel because they were stacked with, you know, top-tier research PhDs, people that I wasn't gonna be able to hire or attract for the, the group that I ran.

Um, but that was the expertise that I needed to, you know, come in, think about my problems, come up with solutions, and, and deploy those solutions. Um, we didn't have that talent and we, we weren't gonna be able to hire it because it's scarce. So, I, I think that's the right pattern, and that's kind of like what that MIT, you know, study really [00:21:00] revealed to me was that, you know, my instincts around looking for experts and partnering versus just trying to build your own pool of experts is, is probably a lot more effective.

Mehmet: I think this makes a lot of sense because you know when, when people feel the urgency to jump on a new technology without proper preparation, and of course what you mentioned, the expertise is the number one here. So I think this might have reflected on the study of MIT. Uh, but as you said, like, uh, it's very logical to me.

Like the more we start to understand how we can use this, we build, you know, kind of this practitioner mindset of how to utilize the technology, I think we, we'll be able to solve, um, you know, these issues that, you know, the, the study unveiled. Now, because you came from, from the financial services,

are you seeing, and I know now you work with, with like other verticals as well, but there are like the usual [00:22:00] suspects as we call them, like financial services, mm-hmm, healthcare, you know, the, again, the highly regulated, uh, domains. Are we still seeing this dilemma of innovation versus compliance, specifically when we try to implement

AI in such environments, or is the myth busted, is it not there anymore? Like what, what are you seeing in this domain, Bryan?

Bryan: I I think you're, you're definitely still seeing that. Um, I would maybe want to like reframe it though. Like I don't. I don't really think of it as a dilemma or, or a problem like, 'cause you know, the, the narrative or the, the punchline you hear all the time is that like, you know, places like financial services, um, healthcare, you know, they move slow and they, you know, are, are kind of like late adopters of this type of stuff.

Um. I think that's true. Um, but it's not due to, you know, [00:23:00] lack of, of effort or vision or, or talent. It's generally because, you know, that's the pace they need to move at. They're optimized for, you know, minimizing downside, right? Like minimizing risk. And, so, and, and which is, you know, one of the things I really enjoyed learning at places like that was it requires a lot of rigor to deploy things at, at a place like a large financial services company.

Um, you, you know, at a startup, the pace is, is completely different. You know, we can move really fast. The labs that I work with right now move really fast. They can just blaze the trail. Um, and, and financial services just are, are set up differently. You know, they, they deal with people's money, their life savings, and, you know, all the regulation, you know, for the most part is there for a good reason.

Um, so I, I don't necessarily think it's a dilemma. I think it's more of just the, the nature of it. But, you know, the way that manifests is [00:24:00] that the companies are sometimes perceived as not being innovative. And, you know, the way I would maybe try and reframe that is, is they, they do innovate, they do the experiments.

You don't necessarily see the results because it, it simply takes more effort to deploy things there. So I tend to think of a place like financial services or healthcare as not where you're gonna see the innovation. It's, it's not gonna appear to be that innovative. It's, it's not gonna be a place that this stuff is invented.

Um, so I think of it as more of like a proving ground. So, you know, once AI is deployed at scale in financial services, I think that's, that's the hardest environment to do it in. Um, and it's not hard 'cause of the technology, it's, it's hard because it's a complicated place with complicated regulations.

And so I, I look at that as like the innovation happens elsewhere. That's just the nature of it. Things get hardened in places like financial services because they have to be [00:25:00] perfect to be deployed there. Um, and, and so I don't think it's necessarily that the banks are, are slow or lagging innovation. Um, you know, I think they're, they're, they're doing what, what they're supposed to do at the pace they're supposed to do it.

Um. And, you know, if the banks invested more in innovation, maybe hired different type of talent to do research in places like this, I just don't know that it would bear any, any real value because at, at the end of the day, you know, it's the hardening that is really the trick at the banks rather than the actual new technology.

Mehmet: Right. And I think we're seeing more like, uh, both financial institutions, banks, insurance, and even like healthcare, they're trying to create these kind of sandbox environments or labs where, where they can like try new ideas before taking it to production. Make sure, you know, and then, as you mentioned, like you put the safeguards there, like, uh, or the guardrails as they call them, and then try to, to, to take [00:26:00] it to, to the mass production, which is, yeah.

I, I, I think we, we are seeing a, a paradigm shift here, uh, because again. The AI is pushing everyone to try to, uh, to change their behavior. And, and yeah, I think this, this, this is good for the consumer, I would say like for the end user. Um, because, you know, it's, it's, it's pushing everyone to try to, to find like how we can leverage this technology to serve our customers in, in a better way.

Now, moving from enterprise a bit more to the startup side, Bryan, so, uh, I follow this domain a lot. And, you know, of course as we discussed, like now the models are commoditized. Like it's not about the model. Uh, we know like also the infrastructure, kind of, it's, it's like there's abundance, let's, let's call it this way. Now, the moat.

And a lot of my guests, they said like, the moat today is the data, right? And, and the amount, and the quality, of the data. Now, when it comes to startups, startups, they always have this problem [00:27:00] called the cold start problem, right? Which can be applied actually to the data, because at the beginning they might not have, you know, enough data to build, you know, a, let's call it a, a, a minimum viable AI model, right?

So, anything, you know, at Snorkel where you work with startups and help them with this challenge?

Bryan: Y Yeah, that's actually a huge focus of my job, um, because there's a lot of startups, you know, in the AI space trying to build competitive models now, and, and a lot of that work is catching up to the state of the art.

Um, you know, it's just kind of like by definition, startups are. Competing with bigger players that have more invested and, and have more money. And so that's where, you know, having like novel research, both within snorkel and within the labs that we partner with is, is really critical because you, you, you know, you just, you have to have novel ideas if you're gonna be competitive in this space.

Um, you know, we've [00:28:00] definitely heard a lot about data being the moat. You know, catching up to the state of the art though has actually, uh, I'm not gonna say it's like simple, but it's, i i, if we had talked a few years ago, you know, I think the narrative was AI is gonna be won by the big players because it takes hundreds of millions to billions of dollars to create a state-of-the-art model.

So. It's gonna be won by the players that, that have that, that money to invest. And you know, one of the things we've been seeing a lot of innovation in, you know, especially like out of China, is like how to create a state-of-the-art model. You know, that that catches up very quickly, you know, without those large investments and.

So a, a lab coming out today could probably use some of those models to create synthetic data and train their model and, you know, do some pretty meaningful hill climbing on a benchmark until they hit a threshold of where they can get, you know, with synthetic data. [00:29:00] And, and then it becomes about, you know, getting those more kind of like expert curated.

R and d design data sets, um, that start to push the edge into the frontier and beyond. So I, I, I do think that like data is, is a moat. Um, and that's why we see such a focus on it from the labs in building these, these more advanced data sets. Um. But I think that, you know, as things like open source models, you know, continue to keep pace and catch up with the state of the art, the ability to synthetically create that data, um, continues to advance.
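[Editor's note: as a rough, editorial illustration of the synthetic-data bootstrapping Bryan describes, here is a minimal sketch: a stronger "teacher" model generates candidate examples from seed prompts, and a cheap quality filter keeps only the usable ones before any expert curation. `call_teacher_model` and the filter are hypothetical placeholders, not a real API or the actual workflow.]

```python
# Sketch: bootstrap a dataset with a teacher model, then filter for quality.
from typing import Callable, Dict, List

def call_teacher_model(prompt: str) -> str:
    # Placeholder for a call to an open-weight or API model.
    return f"Synthetic answer for: {prompt}"

def quality_filter(example: Dict[str, str]) -> bool:
    # Cheap automatic checks; in practice expert review or reward models
    # would augment or replace this step.
    return len(example["output"]) > 20 and "I don't know" not in example["output"]

def build_synthetic_dataset(seed_prompts: List[str],
                            generate: Callable[[str], str] = call_teacher_model
                            ) -> List[Dict[str, str]]:
    dataset = []
    for prompt in seed_prompts:
        candidate = {"input": prompt, "output": generate(prompt)}
        if quality_filter(candidate):
            dataset.append(candidate)
    return dataset

print(build_synthetic_dataset(["Summarize the covenant terms in plain English."]))
```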

Um, I will caveat that by saying like, I don't. I, there's not really any such thing as like purely synthetic data, right? So it, it's not like you hit a switch and you have data. You still have to have the people that know how to use those models to create the data that's designed to train the model. Um, but I do think it.

It's making it easier for [00:30:00] newer, lesser funded players to catch up. Mm-hmm. And is, is contributing to that idea that, um, you know, models are, you know, kind of being commoditized. Um, I think the other place that like data as a moat is becoming interesting, and I've seen some kind of shifts in this, is, um, again, if we had talked four or five years ago, I would've said.

Hey, these financial services companies, they have all this data. Like I remember saying it in my old job, like, we have the best commercial banking data set in the world. Mm-hmm. So we can build the best commercial banking ai. Um, and, and I would've thought that data is a moat. But, but where we're seeing this go now is that these frontier labs are, you know, basically figuring like they're creating that data now.

So, you know, being a bank and having a bunch of banking data is not. Uh, at this point, I don't think really providing you a moat in building, you know, AI that does banking workflows. Um, and I [00:31:00] don't think that's necessarily a bad thing for the bank. I think it means that the labs are gonna take the lead in developing AI that will be deployed at banks, and the banks are gonna, you know, probably be customers rather than builders in that scenario.

Mehmet: Right now, Bryan, and another thing which, um, startups usually they, they would, they would have a challenge in is maybe small engineering team. Like, is there anything that usually you can help them with like to, to be like extended, I would say data team for them. Like is there such, uh, you know, offering or services from, from your side?

Bryan: Yeah. On the, um, you know, on the enterprise side, you know, we, we have teams of forward deployed engineers like myself that, you know, kind of go in, provide expertise, build, build these systems. Um, and, and it's very similar, you know, working with, with the labs that are building, um, you know, models and AI systems, is, you know, we integrate with them as both [00:32:00] research and data partners. And, y you know, so we're not only helping them define that data and define the experiments and how to, you know, measure the output of their models.

We're also helping them scale their data operations. Um. You know, because getting, getting, you know, a network of experts to manually curate or, you know, augment data that's synthetic is, um, you know, it's, it's a pretty big operation and something that, you know, I wouldn't expect all labs or, you know, startups in this space to want to tackle themselves.

Um. So a big part of the service we offer is, is, you know, in addition to that research partnership, it's like, how do we scale up those data operations so that, you know, customers are getting data that's, that's the highest quality and at the volume that they need. 

Mehmet: Right. Now, I know like it's, it's hard to predict the future, Bryan.

Mm-hmm. Especially in the age of AI. But, uh, maybe some patterns that you're seeing that [00:33:00] might affect the way we are gonna continue doing this, um, you know, regarding, uh, what you currently do at Snorkel. So where are we heading, in your opinion? Like, what, what would be the next frontier, if that makes sense,

when, when it comes to, to the use cases? Like, I know we spoke about, for example, agentic AI. We spoke about like, uh, you know, the. But like something you, you, you see is gonna be big for that specific use case. And when I go to the Snorkel AI website, I see a bunch of use cases, but the ones that attracted my attention, of course, it's like the agentic AI, the other one is coding.

So, what, what is next? Let, let me simplify it this way.

Bryan: Yeah, sure. I, I think, you know, just, just, I'll focus on a couple different spaces. First, in the data space, um, I, I think we're not gonna see any slowdown in the demand for data. Um, you know, because as the capabilities get more [00:34:00] advanced and the systems get more complex,

the data that's gonna drive the frontier will, will get more complex and require, you know, more expert input and more complicated workflows to get right. Um, so, you know, if we want to actually deploy agents that do end-to-end enterprise workflows, um, which I think we're still a ways off from, there's, there's not gonna be any slowdown in the demand for, you know, what I'll call like AI-ready data.

Um. So, so I think that's gonna continue to be a big focus. Um, I, I think another thing, like, just looking ahead, you know, one of my perspectives on a lot of these things is I always like to focus on like, what's practical, what can I actually get value out of? And, and, you know, six to 18 months. Um, and, and so one of the things I think is maybe a little like overhyped right now is the timeline on some of this stuff.

Um, you know, I think, you know, we al, we always have this prediction that there's gonna be massive disruption over the next one to two [00:35:00] years. And, you know, we've been living with that for a couple of years now, and you know, to be sure there has been disruption, but I don't think things are, um, you know, I, I don't think we're gonna, you know, when we see some of these calls for like massive displacement of jobs, you know, due to agents and things like that, I, I don't think it's gonna happen at the pace that, that other people are calling for.

I think it's more of, you know, just to be clear, like all the ideas we're seeing, like end-to-end workflows, it'll happen. I just think it's gonna be more in the five to 10 year range. Um, when we actually see autonomous agents doing the work of a professional in financial services, I don't think it's gonna be in, in, in two years.

Um, but that said, I am very excited about agents. I think the, the foundation is being laid. Um, the expertise to deploy this stuff at scale is being built up. And I think we'll get a, a ton of value out of that. Um, I think the other place we, you know, hopefully we'll see some innovation is just in model architecture and capabilities. [00:36:00]

Um, you know, there's been a lot of studies on there being a bit of a plateau in capabilities. You know, models continue to kind of hill climb benchmarks, but it's, it's more money being invested, more compute being invested, for fewer incremental gains. And so I think, you know, we're gonna continue to see progress in LLMs, but

it might slow down and plateau a bit. So another place that I find really interesting, you know, over the next couple of years is in alternatives to LLMs. So, you know, and, and actually seeing some of those deployed with practical applications. So, you know, I'm talking about like models that have like world views, long-term memory, you know, ability to do continuous learning.

Um, you know, solving some of those problems I think is gonna lay the foundation for the end-to-end agents that we want to get all this value out of. But I don't necessarily think, um, we're gonna do it with just the, the, the tools we have today.

Mehmet: Right. And, um, [00:37:00] I, I agree with you, and the more I delve into, you know, this theory about replacing jobs and all this, the more I understand, like, actually humans would be needed more than any time before. Because, you know, like you need the coordinator, you need the supervisor role, I would say, because at the end of the day, you still need to check, like, are the agents working

properly as we are, uh, you know, expecting them to, to be, um, you know, probably enhancing the way they work and keep optimizing the way these agents work. And it's a mix with, you know, funny enough, it's like, I'm not sure if, if I might say this, I know quite some people they would not like it, of course, LLMs, AI, machine learning, all this.

But there's a lot of things which are automation, which, you know, like I think it's a big chunk of automation that it's helping us using the AI of course, to, to, to streamline these processes, to, to streamline, you know, how we do things and this is where us as human will be needed. I agree with you, Bryan, on this a hundred percent.

Yeah. [00:38:00] Um, as of course, like, time will, will, will reveal to us, you know, what's, what's next. But, uh, it's interesting also you mentioned, you know, how the plateau of LLMs would happen, and I'm expecting, you know, some other breakthrough in maybe something parallel to LLMs, which we might not be, uh, much aware of today.

There are like other ways of, of also like doing things. And by the way, the whole LLM comes from like one branch of the whole machine learning field. So let's see what's happening in the others. And you know, I'm following papers, not academic by any, by any means. But I mean, there's a lot of things happening in, in, in a parallel world, a parallel universe, if I might say, which also are exciting.

And I'm, I'm waiting to see like how this will become like practical, similar to how the LLMs are practical in the enterprises and for startups today. Uh, Bryan, really, I enjoyed the conversation. Before I let you go, um, final words from you and where people can get in touch. 

Bryan: Um, yeah, so I, yeah, I totally agree with everything you're saying there.

I think, um, [00:39:00] you know, I'm really excited about the potential breakthroughs coming up. Um, definitely bullish on AI overall, and, and thank you for having me. But, um, yeah, if people want to get in touch, um, just, you know, uh, look me up on LinkedIn, happy to, you know, connect. Um, you know, of course, uh, Snorkel's website is, is a good way to find us too.

Um, and you know, so whether it's enterprises or frontier labs, um, you know, we, we partner with kind of all the, you know, leading companies in the AI space and are, are always happy to have conversations about it.

Mehmet: Great, and thank you again, Bryan. So for the folks who are listening or watching us, so you'll find the links in the show notes or on the description.

And again, thank you Bryan, for your time. And as usual, this is how I end my episode. This is for the audience. If you just discovered us, thank you for, you know, listening or watching. I hope you enjoyed it. If you did, do me a favor, subscribe and share it with your friends and colleagues. And if you are one of the people who keeps coming back and you are like loyal fans of the show, thank you very much for all that you did during [00:40:00] this year, 2025. And you know,

we, we, we are just, you know, about to, to finish the year and, you know, we're finishing on a high bar, like we are trending as usual in multiple countries on the top 200, uh, Apple Podcasts charts. And this is something that cannot happen without you. And as I say, always stay tuned for a new episode very soon.

Thank you. Bye-bye.