May 21, 2026

#600 AI Reliability Is a Business Risk. Not Just an Engineering Problem | Helen Gu

Show Notes

In this episode of The CTO Show with Mehmet, Mehmet sits down with Helen Gu, Founder and CEO of InsightFinder AI. Helen brings decades of research in distributed system reliability, anomaly detection, and AI-driven operations. The conversation focuses on why AI reliability is becoming a business risk, not just an engineering issue.

The conversation reframes AI observability as a production control layer for enterprises deploying AI agents. Helen explains why traditional DevOps and SRE practices are not enough when systems are probabilistic, model behavior changes, data shifts, prompts evolve, and agents begin taking actions across workflows.

If you are building, investing in, operating, or leading AI systems inside enterprise environments, this conversation gives you a practical frame for reliability, drift, runtime monitoring, and accountability.

About the Guest

Helen Gu is the Founder and CEO of InsightFinder AI, and a professor at North Carolina State University. InsightFinder AI was founded from her research in distributed system reliability using AI technology.

Helen has worked on anomaly detection, prediction, diagnosis, and system reliability since the late 1990s. She also spent a sabbatical year at Google evaluating anomaly detection algorithms, which later helped shape the foundation for InsightFinder AI.

LinkedIn: https://www.linkedin.com/in/helen-gu-b1aa42b6/

Website: https://insightfinder.com/

Key Takeaways

AI systems can fail silently while still returning confident answers.

AI reliability is becoming a business risk, not only an engineering concern.

Multi-agent systems can spread upstream mistakes across business workflows quickly.

Traditional SRE practices do not fully cover model behavior, prompts, and data drift.

Runtime monitoring matters more once AI moves from sandbox testing to production.

Observability alone is not enough without diagnosis, recommendations, and remediation.

Model drift can change business outcomes even when infrastructure appears healthy.

Human review shifts from doing work to supervising AI decisions and guardrails.

What You Will Learn

Why probabilistic AI systems require different reliability practices than traditional software systems.

How model drift and data drift change production behavior over time.

What silent AI failure looks like inside enterprise workflows.

The reason sandbox testing misses real production AI failure cases.

How runtime monitoring helps detect hallucinations, bias, leakage, and accuracy issues.

Why AI observability must connect infrastructure, data, prompts, models, and business outcomes.

What leadership teams need to consider before AI agents begin taking actions.

Episode Highlights

00:00 — Helen Gu frames AI reliability from research

02:30 — AI systems answer confidently even when wrong

04:30 — SRE lessons do not fully transfer to AI

07:00 — AI reliability needs fine-grained runtime metrics

08:30 — Silent failure creates hidden business damage

10:00 — Multi-agent mistakes propagate faster than humans

12:00 — Model drift changes outcomes without warning

15:00 — Sandboxes miss production AI behavior

18:00 — Observability must become actionable control

21:30 — AI reliability becomes a leadership responsibility

24:30 — AI Labs test prompts, models, and datasets

28:30 — AI agents become part of enterprise workflows

31:30 — Responsible AI starts with accepting failure risk

Listen Now

Available on all major podcast platforms and YouTube

Connect with the Show

Follow The CTO Show with Mehmet for more conversations at the intersection of technology, startups, and venture capital.

[00:00:00]

Mehmet: Hello, and welcome back to a new episode of The CTO Show With Mehmet. Today I'm very pleased, joining me from the US, Helen Gu. She's the founder and CEO of InsightFinder AI. The way I love to do it all the time, my audience know by now, is I keep it to my guests to introduce themselves. So Helen, thank you again for joining me today.

Tell us a little bit more about you, your background, your journey, and then we're gonna deep dive into an, I think an important topic, I didn't discuss it much, like we, we touched base on it, which is all about, you know, reliability of the AI e- for the AI era. So but before this, welcome again and the floor is yours.

Helen: Thank you so much for inviting me. Hi, everyone, my name's Helen Gu. I'm the founder and CEO for InsightFinder. And, uh, also I'm a professor at the North Carolina State University. So InsightFinder is founded based on my research, uh, in the area of distributed [00:01:00] system reliability using AI technology. Uh, so I started my, uh, research, uh, back into 1999, so it's almost like 30 years ago.

And so we start to looking to how to enhance distributed system reliability using automatic technologies. Uh, so back then people try the control theory, and then basically I started to use, uh, neural network technol- technologies to actually perform anomaly detection, prediction and diagnosis. And since then have been focusing on this area for many years.

And so, uh, first, uh, publish research papers in the space and then basically, uh, launched the InsightFinder after actually I spent my sabbatical year at Google, uh, evaluating our anomaly detection algorithms. Uh, so fast-forward today, and so InsightFinder AI has been basically providing, [00:02:00] uh, reliable, uh, reliability services to Fortune 500 companies.

Mehmet: Great. And thank you again, Helen, for being here with me today. Let's dive in immediately talking about, you know, um, this important topic, which is reliability. So we're talking a lot about AI, and we know like AI systems, I started this when I was, you know, maybe more than 25 years ago when I was still in college.

So AI systems are probabilistic by nature. Yes. So what breaks first when companies try to operate them with traditional software engineering and DevOps practices?

Helen: Yeah, I think that you, you, you, you basically touch about very important, uh, aspect of, uh, AI, right? So most of the AI is under the hood is basically statistical machine learning.

So because it's statistic, so inevitably it will make mistakes, right? So, uh, the other thing is that very interesting, AI doesn't have [00:03:00] conscious. And so most of the time they don't actually tell you they don't know. They always give you answer, right? Just as a simple example, if you go to ChatGPT or go to Anthropic, you ask, uh, okay, what is Windows error code 0x14F?

And they will always try to give you answer. And you was... You know, if you are not a domain expert, and you probably know, uh, you probably don't know the answer they give you are all wrong. And so the, the reason is that the, you know, those, uh, large language models are trained based on, um, domain knowledge, like, uh, trained on public knowledge, not domain knowledge.

And so, uh, when they actually trying to do the inference, the AI technology will always actually pick the, you know, the, the inference path has highest probability, but that doesn't mean it's actually logical or it's actually, uh, real, right? So that's the fundamental reason why the AI, uh, models will always have, m- you know, all kinds of mistakes.

Um, [00:04:00] and the, uh, although the mistakes can be, uh, very rare, um, but the key thing here is that if you don't catch that, and so it will have a big, um, business impact if you don't know that, right?

Mehmet: Right. And of course, like, there is no system which can be 100% accurate, and mistakes can happen. Mm. And this brings me to something maybe my audience would be familiar with, and I know, like, you have done this comparison about, you know, how the rise of site reliability engineering, known as SRE, so a lot of, uh, people know it maybe as SRE.

During the cloud era, like, you know, like, everyone start to look, yeah, how we can make sure that we are 99.999%- Mm ... you know, available. Right. What is the equivalent wake-up call moment for AI reliability in your opinion, Helen?

Helen: Yeah, so, uh, so this is actually, uh, go back to my, uh, research area, right? So w- early days we look at distributed systems, and so everybody was saying, "Oh, you know, can we achieve [00:05:00] 100% reliability?"

And then basically I always teach this in my class, and so we, we have this famous proof, um, so, so 100% reliability is, is impossible in, in kind of internet-based, uh, distributed system. And so, uh, so that's why we have this 99.999, right? And so, uh, but still it's very useful, right? It's, you know, distributed system, cloud computing is extremely useful and it basically becomes a backbone for all the computations.

So, um, so it doesn't mean, like, you cannot achieve 100% reliability, you don't use it, but you know, you have the right, you know, um, kind of, uh, uh, prevention, detection, diagnose and detection te- techniques in place to basically handle that, right? So I think a similar thing here in, uh, AI side- Um, so we probably need, like, you know, uh, kind of operation people, uh, to actually, you know, [00:06:00] manage and watch those AI agents, whether they are doing, uh, like, the, the correct thing.

Uh, now the challenge of course is different, right? So, like, you know, uh, you know, cloud is about infrastructure, but, you know, AI agents is about model data and, you know, also there's more implication on the, uh, user, uh, interactions and also i- on the, on the sense of like, you know, security, uh, sensitivity, uh, sensitive information leakage, things like that, right?

So there's actual, uh, kind of challenges. Yeah.

Mehmet: Right. Now, if I want to think about it, Helen, and if I'm an executive listening or maybe watching us today, from a metrics perspective, you know, how do that looks like? Like, how you... we can describe the, um, operation reliability i- in, in the AI for the enterprises?

Helen: Yeah. So, uh, so now [00:07:00] basically the monitoring has to become really, really fine-grained. Uh, so this basically in, uh, at InsightFinder AI, we provide this real time, uh, evaluation technologies, right? So for any, every conversation, every prompt, every response, you need to actually evaluate in real time whether it's accurate, whether it's actually have like sensitive information leakage, whether it has bias, whether it has basically, uh, factually inaccurate information, right?

So those basically you need to do all those evaluations. And so, uh, you know, at InsightFinder, we have very com- comprehensive suite of evaluations. We also allow the customer to customize those evaluations for their specific use cases. Then for all those evaluations, you, you, you basically you can pass evaluation or you fail the evaluation, right?

The metrics will be actually how many evaluations you passed, right? And so, uh, so as you can imagine, this is a [00:08:00] much finer granularity of measurements, accurate measurements than traditional way of saying like availability, performance, those kind of things. And so, uh, so we also partition this evaluation into different aspects, right?

So like, you know, in terms of hallucinations, accuracy or, um, you know, fairness, bias, or basically the security trust, uh, scores, right? So this is basically, um, you, you, you, you are looking at different metrics measuring different aspects. Yeah.
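The pass/fail metric Helen describes can be illustrated with a minimal sketch. It assumes each evaluation is recorded as a hypothetical `(category, passed)` pair; the function name and category labels are illustrative only and are not InsightFinder's API.

```python
from collections import defaultdict


def pass_rates(results):
    """Return {category: fraction of evaluations passed}.

    `results` is an iterable of (category, passed) pairs, a made-up
    record format chosen for this illustration.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in results:
        totals[category] += 1
        if passed:
            passes[category] += 1
    return {category: passes[category] / totals[category] for category in totals}


# Hypothetical per-response evaluation outcomes.
sample = [
    ("hallucination", True),
    ("hallucination", False),
    ("bias", True),
    ("leakage", True),
]
rates = pass_rates(sample)
print(rates)
```

Grouping by category keeps hallucination, bias, and leakage scores separate, which matches the idea of measuring different aspects with different metrics.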

Mehmet: Right. Now, always when we try to explain any, any solution, any technology solution to anyone, we try to, you know, mention...

Because I worked as a solution consultant before, so there's something we call the cost of doing nothing, right? Which is basically the risk of not solving this problem. So how dangerous is silent failure in AI systems where the model technically works because it's, it's spitting out, you know, something [00:09:00] to us, but, you know, business outcomes quietly degrade over time.

So how dangerous is this? And sorry to make it, like, a little bit long question because I think this is related. And- Mm. ... is that related to the model drift and data drift also?

Helen: Yeah, absolutely. So, um, so first of all, I think, uh, the, the impact, you asked for the impact, right? The impact can be really, really huge because, um, so now it's, uh, directly, a- AI agents, uh, uh, is expected to not only provide basically, uh, pas- passive information, they are expected very soon to take actions and, uh, um, change things and then basically, uh, provide basically immediate impact to your business, right?

So any, uh, errors during this, like, uh, uh, process is basically could be detrimental. And the other thing is that now basically we are looking at [00:10:00] multi-agent systems, right? So, uh, not just a single agent. And so, uh, a lot of, you know, agents, AI agents, they will collaborate to perform some complex tasks. And so if you have some, you know, upstream AI agents make mistakes, and they will actually propagate very quickly throughout all the, um, business units, right?

And so this is basically, at a high level, it could be re- extremely fast propagation, uh, compared to traditional, more like human to human, um, uh, kind of interaction. Uh, so first of all, they are doing things much faster. And so, uh, the other thing is that if you don't have this real-time evaluation, uh, control in place, and so, you know, unlike human, right?

So we are basically, uh, we have conscious, and so we have responsibility, we have ownership, right? And so, uh, you can hold people accountable, but how can [00:11:00] you hold a AI agent accountable, right? So it, it deletes, basically spend all your money in your bank and buy something useless for you, and how do you blame, right?

So there's a very important thing, who do you blame? You, you blame Anthropic, you blame ChatGPT or you blame your people, but you don't have people actually in place to, to supervise them, right? So I think this is very important that you need to have control in place, you need to have observability in place, and you need to have human to actually review those things in place as well.

So, um, so it's a, it's a very, um... You know, uh, for the AI, uh, system is evolved so quickly and people can really, uh, become, uh, you know, much more productive than before, and you can see, like, the productivity dramatically increases, but at the same time, the risk also increases dramatically as well. So if you don't have the, like, the- the- the right basically, uh, observability or control in [00:12:00] place, and so this will be actually, um, disastrous, right?

And so, um, so o- obviously the, uh, you- the second question you have is model drift, right? So, uh, because if you actually make the, the AI system based on the existing foundational models, like say you, you based on Anthropic or ChatGPT and the Gemini, and they, they will update the models, right? From time to time, and their model- Right ... behavior will change, right? So let's say yesterday you asked, "Okay, should I buy this stock?" And maybe tomorrow they will tell you, you know, "No," and then the day after tomorrow they will change mind to say yes, right? So the model behavior will change based on the training data they have, but it doesn't mean that actually is accurate for your use case, for your scenario.

So, um, so if you don't have the model drift detection in place, you don't know your behavior changes, right? So let's say ask the same [00:13:00] prompt and you get a different answers, right? Than from yesterday, and you should be aware, right? You should be aware, okay, is this model problem or is my data has problem or is basically, uh, my business has changes, right?

So now it becomes much more complicated kind of, uh, you know, at the core of the InsightFinder is to do anomaly detection. And so, you know, we, we used to focus on, like, a cloud infrastructure application, but now we also have the anomaly detection for model, for data features, right? And so all those things needs to be actually correlate together and to see, okay, when you have the changes in your business decision, is this coming from your model behavior change, coming from your data change or coming from your v- infrastructure change?

So this is basically needs to be, uh, examined together. Yeah.
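The "same prompt, different answer" check can be sketched in a few lines. This is a simplified illustration under stated assumptions: answers to a fixed set of probe prompts are stored as plain strings, and a change is detected by normalized exact match. The function name and the sample data are hypothetical, and a real system would likely need a semantic comparison rather than string equality.

```python
def drift_rate(baseline, current):
    """Fraction of shared probe prompts whose answer differs from the baseline.

    Both arguments map prompt -> answer. Comparison is a case-insensitive
    exact match after trimming whitespace (an assumption of this sketch).
    """
    shared = baseline.keys() & current.keys()
    if not shared:
        return 0.0
    changed = sum(
        1
        for prompt in shared
        if baseline[prompt].strip().lower() != current[prompt].strip().lower()
    )
    return changed / len(shared)


# Hypothetical answers recorded yesterday and today for the same prompts.
yesterday = {"Should I buy this stock?": "No", "Capital of France?": "Paris"}
today = {"Should I buy this stock?": "Yes", "Capital of France?": "paris"}
rate = drift_rate(yesterday, today)
print(rate)
```

A rising rate only says that behavior changed. Deciding whether the cause is the model, the data, or the infrastructure still requires correlating it with the other signals mentioned above.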

Mehmet: Right. Helen, we hear sometime also about, um, [00:14:00] organizations and especially if they are in the- into regulated environments, so where they start, you know, their projects in sandboxes, right? So it's like, you know, isolated environment, they would be, like, dealing with less amount of data and, you know, everything works perfect.

But once they go and deploy on a large, um, scale, things goes south. So why in your opinion this happen?

Helen: Yeah, that's a very canonical problem, right? So testing, uh, like y- you have so much coverage, right? So like especially in a traditional distributed system, you're testing all kinds of like corner cases in the system level.

And so maybe we understand how system behaves, right? They have limited dimensions you can actually monitor. And, uh, so that's, that's an easier problem, right? But now we are looking at AI, what more important AI is actually interacting with human. And so there's a lot of [00:15:00] unexpected things, um, could happen.

And so, um, so most of the time, like, you know, offline analysis, offline training always have limitations, right? So when you go to the production deployment, you are encounter, you know, unexpected data input, unexpected model changes, unexpected prompt, uh, kind of interactions between different agents. So, uh, all those things, basically you need to have basically production, uh, time, runtime monitoring and control, right?

So that's why at InsightFinder we laser focus on the production runtime, uh, monitoring and the, and the management. Um, so that's, that's where basically the, the real problem you can catch. And more importantly, like we started this conversation, is that no AI models is perfect because it's statistic.

Fundamentally it will make mistakes. And more importantly, what we saw is that all those foundational models are [00:16:00] trained based on general knowledge, not your specific business. So inevitably when they use this, uh, kind of like generic models in your specific business, they will make mistakes, right? So they will tell you something that is doesn't apply to your business.

So it's very important to catch those, uh, specific failure case and then basically transform that into a training data that you can iteratively fine-tune the model and to fit your environments, right? That's what we basically do at InsightFinder, is that- Mm-hmm ... we provide this end-to-end basic monitoring, uh, data training, data co-collection and correction, and then basically fine-tuning.

This kind of like very, uh, automatic closed feedback loop to continuously enhance the models that work for your cases, right? So, um, so I think the failure is inevitable and so very important thing is to catch failure first, and second is to actually [00:17:00] fine-tune your models and to adjust your system to actually correct that, right?

And so if you have both measures in place and then, um, you know, it, it... you, you, you have basically minimized the, the risk, right? So you, you know how to control, uh, the risk. Yeah.

Mehmet: Right. So Helen, I think w- w- you know, you repeated and, of course, I agree with you, couple of themes, you know, during this conversation.

So first is the risk, and we discussed why it's a risk, and the second thing you kept mentioning is observability. Um, and this is reminds me a lot of, you know, cybersecurity because in cybersecurity, of course, we talk about business risks, um, and we talk about, of course, observability. We, we always used to tell people you cannot protect what you cannot see.

Mm-hmm. Now, the reason I'm asking you this, um, are you considering yourself with what you're doing at InsightFinder [00:18:00] AI as a, you know, category s- uh, you know, a new category setter where, you know, you aim to have this AI observability part of a mandatory infrastructure?

Helen: Yeah, so, uh, so we wa- we, we basically tell our customer is that we are not just observability tool, right?

So, uh, so for us, like, you know, uh, whatever you do, like monitoring, observability, it's a tool, right? But I think what we observe from our customer is that they, they don't need a tool, they need a, a solution. And so a lot of, I think the pitfall happened in the past is that, like, people use observability tools like, you know, OpenTelemetry monitoring like Grafana or basically using, uh, Datadog or using any other monitoring tools, and they can get the data, they can get visibility, but they don't know how to react to it and how to extract important, uh, actionable insights, right?

So that's what we do at [00:19:00] InsightFinder is to say, uh, we don't dictate what kind of tool you use and we are tool agnostic. You can use whatever tools you have, and we help you using AI technology. That's another thing, like InsightFinder from day one is AI company because we develop AI, AI models that are suitable for all kinds of observability data, right?

So whether it's metric data, log data, trace data, and so we help our customer to extract those important, uh, anomalies, right? Those are basically risk, uh, induced points you need to capture in real time, right? And so we provide this real time actionable insights to our customers so they can actually capture those important moments.

And then more importantly, we're not just observe, we actually help a customer, uh, we give recommendations what they need to do. We also have the automation in place. They can trigger those automation to fix the problem, and more importantly, we also [00:20:00] have prediction ca- capability to predict not, you know, not just detect, but predict when something bad will happen, right?

And then basically we can help customer to prevent the problem. So I think this is basically the flow you needed for any kind of, uh, mission critical systems. Um, so, um, you need to be able to actually not just actually monitor, but also, uh, react and control, uh, the damage and then to basically re-re-recover system.

Yeah.

Mehmet: All right. Um, you know, a question that also I'm, I'm wondering a lot about, um, because when, you know, back in the days we knew where to go and who is the proper leadership responsible, you know, for, for protecting things, right? Like whether it was in the cloud era, site reliability, whether, you know, from cybersecurity.

Now, what are you seeing like in organizations, uh, with [00:21:00] this AI, you know, revolution that is upon us? Um, what's, what's being, you know, who, who's being in charge for this? Like, or are we seeing like new kind of roles created within the maybe engineering, um, organization? Uh, are there like any new leadership that is responsible for this kind of, uh, reliability for in the AI era?

Helen: Yeah. So I think it's, it's very interesting question because we see basically this becomes a very, uh, important area. A lot of, uh, companies, especially leadership teams start to actually, uh, form like new organizational new roles on that. And so we have been working with like, for example, uh, like leaders in AI platform or AI as a service, uh, divisions.

And so, um, so a lot of companies start to actually have like chief AI officer. Uh, so, um, [00:22:00] now basically I think AI becomes like pretty much integral part for every, uh, business units. And so they, they become kind of like instead of the just isolated area doing like data analysis, we are seeing like things kind of, uh, uh, interact with all business units, right?

So then we have basically people, um, have traditional role of like CDO or basic CIO. Uh, they are more kind of like, uh, uh, tap into the, the, the chief AI- uh, kind of, uh, kind of AI domain or they hire another chief AI o- uh, AI officer to help them. Um- So the key thing here is that I think now the, uh, the AI reliability, uh, is basically become, uh, we will say like interdisciplinary because it's not just infrastructure, it's not just the data, it's not just the AI models, right?

So, [00:23:00] uh, I think very soon people will realize this is actually become a kind of, uh, uh, foundation, um, piece, you know, uh, for, for all the kind of across units functions. Yeah.

Mehmet: Right. It, it's interesting, you know, how, uh, this also changed, um, the way we look at adopting new technologies and, you know, back in the days, as you said, like there will be one department responsible for this, but AI touches IT and beyond IT and, you know, like it- Mm-hmm ... it goes even to, you know, even, uh, uh, compliance, risk and, and, and, and so on. And the reason I'm bringing up this is, you know, there's always this, um, I would say race between the speed of AI and deploying safely, uh, especially in, you know, regulated environments. Um, anything you can share or maybe something or a use case where you can help leadership teams who are [00:24:00] under pressure to move fast, uh, and they want to stay, you know, in compliance and to make sure that they don't break things, um, where, you know, you provide them this because of what you are building and what you have built.

Um, this, this, I would say balance between deploying fast, at the same time having it safe. And where are you seeing like the most, uh, uh, industries where they are the m- more exposed to, to AI reliability failures maybe rather than other industries?

Helen: Yeah, I think, uh, that's a very interesting point. Like, that's why when we develop our product, right, so like I said, our focus is on production runtime and so, you know, once you deploy into production, we perform real time monitoring, evaluation, and basically remediation, uh, for you.

So that's basically at the first place have the safety net. But we, we also have basically another, uh, set of [00:25:00] product called the AI Labs or AI Labs, and so we basically allow the, uh, the AI leaders to extensively, uh, play with all kinds of, uh, AI models they have, and also fine-tune those models, right?

Using real, uh, data. So essentially the labs is connected with their real data set, is connected with real, real workflow pipeline, and they can extensively experiment with these different AI models and different prompts, right? So, so we support like prompt versioning, prompt comparisons, and so we also support like, you know, three-dimensional kind of analysis, including data set, prompt version and the models.

And so this way allow people to see like comprehensively under, you know, what scenario, right? So we do that at a very fine granularity for different use cases. So like they can have a clear idea, okay, for this use case, this model, this prompt, [00:26:00] this dataset actually works the best. So, um, so we basically give them comprehensive, basically, uh, the evaluation, um, suites.

They can actually understand how different, uh, like tuning knobs works and their impact. So once we have that, we basically can, um, can, uh, store them in a centralized place. They can actually have a very detailed view and history on, you know, kind of, uh, just like GitHub, right? So you have all the code, and so now we have basically this kind of, uh, GitHub like of kind of versioning control for all the, all their prompts and mo- and models and data.

So, uh, so, uh, so that's basically give them the kind of first level kind of, uh, uh, confidence, right? So you, you know and which part is working, and then basically you, you, you send this to basically the, the production run, and then you, you, you, you continue to monitor, [00:27:00] and then we basically collect the back, right?

So that's the tracing capability is very important because you need to understand how your model behaves or how your AI agents behaves, and then we continuously tracing all the prompts, responses and performance, and then we feed back to the lab, and then the lab can continue to do the fine-tuning. So this, again, this closed loop or feedback loop is very important, um, to basically have this kind of trust in place, right?

And so it just like, you know, we rely on like, you know, auto testing GitHub to actually track our code base, right? So it's the same thing, and here you need basically kind of the versioning control, and you need basically extensive, um, you know, uh, testing and validation before you release to production, and then you continuously monitor that, right?

So, so you need to establish this kind of like workflow, uh, for the AI agents as well. Yeah.
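The three-dimensional comparison across data set, prompt version, and model can be sketched as a simple grid evaluation. This is only an illustration: `best_combination`, the `score` callback, and all names in the example are hypothetical, and the scoring function stands in for whatever evaluation suite is actually used.

```python
from itertools import product


def best_combination(datasets, prompts, models, score):
    """Score every (dataset, prompt, model) triple and return the best one.

    `score` is a caller-supplied function returning a number where
    higher is better (an assumption of this sketch).
    """
    results = {
        combo: score(*combo) for combo in product(datasets, prompts, models)
    }
    best = max(results, key=results.get)
    return best, results


def toy_score(dataset, prompt, model):
    # Made-up scorer for demonstration; not a real evaluation.
    base = 1.0 if prompt == "v2" else 0.5
    bonus = 0.25 if model == "model-b" else 0.0
    return base + bonus


best, results = best_combination(
    ["tickets"], ["v1", "v2"], ["model-a", "model-b"], toy_score
)
print(best, results[best])
```

Keeping the full `results` table, not just the winner, is what makes a versioned history possible: each triple and its score can be stored and compared again after the next prompt or model change.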

Mehmet: Right. If we want like to [00:28:00] imagine that we are having a look from the future, what would separate, in your opinion, a company that successfully, you know, operationalize the AI, right, from those who treated AI as just, you know, like, yeah, it's just another software layer?

Helen: Yeah. So, uh, it, it will be very, very different, right? Just like today, you can see like, uh, from the kind of software engineering field, right? So like, you know, we, we used to like just human write code and then basically we might have tools to monitor the code. Now basically this thing's completely changed, right?

So we are using AI to write code. And then humans start to actually just doing reviewing, architecturing, and then basically, um, you know, uh, be responsible to actually, uh, monitor things, right? So this, this, this, this role change is happening and so, uh, so for a lot of business, right, so y- you will see, like, human becomes more kind of [00:29:00] re- um, control and guardrail reviewers rather than doing the work, you know, uh, directly because AI can do a lot of those, like, you know, uh, tedious, repetitive, and time-consuming work.

And so this going to be actually change, right? So, uh, so I think AI is no longer just a tool, right? So AI becomes, uh, part of your workflow, right? And it... So you, you, you can treat AI, it, it's really becomes one of the team members, right? Just like at InsightFinder we have this AI agent called Ari, and so Ari is doing basically this, uh, SRE work for our IT infrastructure, so monitor all the systems and then when something wrong, we can interact with Ari, ask Ari, tell us, "Okay, what's happening?

What's the root cause of those problem, how to fix it?" And similar thing here, right? So y- the... you, you will soon see, like, AI agent becomes one of your team members, and now the question is that how [00:30:00] you actually, uh, smooth streamline this interaction between the human operators and the AI agents, and how they actually interact, and what's the role changes.

Um, so if you don't do this kind of switch in your business workflow, and so you are lagging behind, right? Because, uh, obviously for those tedious, like, highly repetitive workload, AI will have much more, uh, higher productivity than human, right? And also you don't want a human to do those work because they could be actually released to do more intelligent, more innovative work, right?

So how to optimize the workflow for, for the company, and how to actually introduce basically, um, actual, um, kind of, uh, reliability services to the business, right? So all those things will basically, uh, augment eventually your brand, your, your, your... basically your business, um, kind of productivity, right? And so this is basically where we see the, the [00:31:00] changes.

Yeah.

Mehmet: Right. Um, a- Helen, you know, as we are coming close to an end, maybe final thoughts and where people can get in touch.

Helen: Yeah, so yeah, absolutely. So I think this is just beginning, and so we are in a very exciting time and I, you know, for me, I always, like, you know, focus on reliability and, uh- And, uh, you know, uh, kind of like how to make the services trustable, right?

So that's my theme, that's my passion. At the same time, we are very excited about new technology, and it definitely changes how we work and the way we live. Just like 30 years ago, when I talked about AI and machine learning to people, very few people even knew about AI, right?

So they often laughed at it because they thought, oh, that's science fiction, right? It was true back then: you didn't have [00:32:00] enough data, you didn't have the computing power to do the sophisticated inference and training. But now it's changed, right? Now AI has so much data, and we have so much computing power to do that.

But then I see people go to extremes, right? They think, oh, AI can replace humans to do everything. At this point, AI can do certain things, but it doesn't mean AI can do everything. And the other thing, as a professor and also as a mother, I want to remind people that as we develop AI models, as we rush to become the leader in the AI race, the most important thing is that you need to educate people, right?

I still don't think AI can replace the creative side of humans. So we need to educate [00:33:00] our kids to learn how to use AI, and educate ourselves, right? Everybody is new to AI technology. We need to learn how to use it, but we need to use it responsibly, right?

So it's very important to be aware that AI will make mistakes. These are statistical models, right? As we said right at the beginning, don't trust the AI 100%, right? I saw people go to extremes: they used to think, oh, AI is all science fiction. Now they think AI is like, you know, God, right?

So we need to be aware, right? There's risk here, there's a reliability concern here. We need to put that in place before we fully embrace AI technology. On the other hand, I do think it's very important to [00:34:00] be aware of how to educate our people to learn how to evolve and how to work together with AI, right?

So I don't believe it's AI replacing humans. It's more about how to collaborate with AI to become more productive and make our lives better, right? So if you're interested in this topic, feel free to reach out to us. You can contact us through insightfinder.com, and we will be happy to share our insights, our experiences with our customers, and similar use cases in this domain.

So we definitely see many, uh, interesting real world use cases. Yeah.

Mehmet: Right. I want to thank you for the way you concluded things, Helen, especially the point about [00:35:00] how we look at AI. As you said, people think it's the wizard, the thing that is super intelligent and never makes mistakes, but we know that it still makes mistakes.

I advise people to go read books, non-technical books about AI. Although I come from a technical background, this was an eye-opener for me to understand bias and why these LLMs actually make these mistakes. Understand, a little bit on a high level, why they hallucinate, the logic behind it.

You don't have to be too technical, and I think this is important. Even myself, I fell into this at the beginning. I thought, "Oh, yeah, the AI is 100% gonna give me right answers." But of course I figured out quickly that no, I need to go and check what it's giving me.

And think about this on a personal side. Now think about a company that maybe manages people's portfolios in a financial institution, or maybe they are a [00:36:00] hospital and it's people's lives on the... what we're discussing, right? So I agree with you on this. We need to educate ourselves. Of course, creativity comes from humans.

And of course, AI can help us maybe put our ideas together, but they're still our ideas, right? So I really want to thank you, Helen, for sharing this. And of course, the whole conversation with you was very impressive to me, learning about this new field, and it's a very important field.

And of course, again, I put the links in the show notes, so people who want to can go and check the website. And this is how I end my episodes. This is for the audience. If you just discovered this podcast by luck, thank you for passing by. I hope you enjoyed it. If you did, do me a small favor: share it with as many people as you can.

And if you're one of the people who keep coming again and again, thank you very much for the support, for tuning in. I hope I'm doing a good job, [00:37:00] and I want to thank everyone who tunes in. Because of you, we are able to be in the Apple Top 200 podcast chart across multiple countries.

I'm hoping to see new countries also as soon as possible. But yeah, I do this because, you know, I believe we can share knowledge, we can share information, and make this world a better place. So as I say always, stay tuned for a new episode very soon. Thank you. Bye-bye.
