Mozilla CTO:
Open Source AI Agents and the Fight for Control
Thank you to Cornerstone onDemand for supporting CXOTALK.
AI agents have become essential enterprise tools. The platforms companies choose now will decide whether they own their AI or depend on big tech. Mozilla CTO Raffi Krikorian argues that open-source AI agents are the key to real independence.
Key points:
- AI agents act inside your systems, and most enterprises have limited visibility into whose interests they serve
- Building on closed, proprietary platforms means the vendor controls your agent's behavior, data access, and roadmap
- Krikorian explains the open-source alternative and what enterprise control of AI actually requires
AI agents operating today inside enterprise systems access data, execute workflows, and make decisions. But the organizations deploying them often have limited visibility into what those agents are doing or whom they serve. It’s a security and privacy nightmare waiting to happen.
Raffi Krikorian, Chief Technology Officer at Mozilla, argues that the platform choices executives make today will determine whether their organizations own their AI infrastructure or become permanently dependent on a small number of technology companies that do.
Krikorian brings an unusual mix of credentials to this conversation. He rebuilt Twitter's core infrastructure at scale, led Uber's first commercial deployment of autonomous vehicles, and served as CTO of the Democratic National Committee before joining Mozilla, the organization that built Firefox and challenged Microsoft's control of the early web. Mozilla is now applying the same open-source strategy to AI agents that it used to break open the browser market two decades ago.
What we will cover:
- Why AI agents create a different category of enterprise risk: unlike every prior software tool, agents initiate action rather than wait for instructions
- The strategic cost of building agent infrastructure on closed, proprietary platforms, where the vendor controls behavior, memory, guardrails, and roadmap
- What Claude Mythos revealed by finding vulnerabilities in critical open-source software, and why the most exposed organizations are not always the best equipped to respond
- The governance and permissions gap: no mature standard yet exists for defining what an agent can access, what actions it can take, and how those boundaries get enforced and audited
- Why the frontier model race may be the wrong question: how small, locally deployable models running on enterprise hardware change the build-versus-rent calculus for CXOs
- What Mozilla is building and what a credible open-source alternative to Big Tech's agent platforms looks like today
Episode Participants
Raffi Krikorian is the Chief Technology Officer at Mozilla, where he leads the organization's open-source AI strategy. His career spans some of the most consequential technology deployments of the past two decades: VP of Platform Engineering at Twitter, Director of Uber's Advanced Technologies Center, where he launched the first commercial self-driving fleet, and first-ever CTO of the Democratic National Committee. He also writes Owners Not Renters, a Substack on open source AI, and contributes regularly to The Atlantic.
Michael Krigsman is a globally recognized analyst, strategic advisor, and industry commentator known for his deep expertise in business transformation, AI, and leadership. He has presented at industry events worldwide and written extensively on the reasons for IT failures. His work has been referenced in the media over 1,000 times and in more than 50 books and journal articles; his commentary on technology trends and business strategy reaches a global audience.
In This Episode
Renters versus owners in AI
Raffi Krikorian: Pinterest deployed open models instead of closed and saved something in the order of $10 million that quarter alone.
Michael Krigsman: Open source software built the internet, but it's losing the AI war. Raffi Krikorian is CTO of Mozilla, which makes the Firefox browser.
Raffi Krikorian: Most enterprises are renters, not owners. They're either talking to big platform companies or making API calls that are off-prem. You're basically turning your destiny over to a different system that you might be making a call to. So in that definition, yes, most of them are renters, and we're starting to see a bunch of this even inside of Mozilla, how that renters versus owners situation is playing out. You know, I was just looking at one of our Claude Code leaderboards within Mozilla.
We're using it on one of our subsidiary companies, Moz VC, and one of the engineers effectively could have racked up a $10,000 API bill this month, but thankfully we're paying for the $200 a month subscription. But you know that subscription cost could change at any moment. It will probably change post-IPO. So all those things, I think, in my mind collapse into this renters-not-owners framing. But Michael, I want to touch on that first one.
I think the question of whether open source is losing the AI war is more a question of what you think the war is. Is it losing on a bunch of leaderboards? Then sure. It's losing on a bunch of demo cases or the extreme cases. The question for me is really, is it going to win on the everyday use cases? Is it going to win on the things that most enterprises care about, most companies care about, most regular users care about?
It's probably losing, or definitely losing, on the frontier battle, but in the same way that, you know, we would've said at some point that Linux lost the desktop, but at the same time Linux effectively runs every computer on the planet, and then there's a few iPhones lying around kind of thing. And so I think we could be in a similar situation when it comes to open source, open weights, and the AI world.
And on top of that, I think there's a lot of evidence that shows the open models are closing the gap with the closed models. So I think this battle is far from over, this war is far from over. But I think you're right, in today's parlance it might be losing, but I think the curves are starting to compress.
The risks of depending on closed models
Michael Krigsman: Walk us through the issues, the problems when an enterprise is essentially beholden to the major model makers, to closed systems.
Raffi Krikorian: A lot of this comes down to just a question of what does it mean to control your destiny? What does it mean to control the optionality that you're working in? So if I were to rewind the clock, I think that when we were all talking about cloud and cloud deployments, we all got into this position where we said, "Well, we can't just have one cloud provider." In the enterprise case, or for anyone trying to deploy large systems, we need the ability to hedge.
We need the ability to choose and be able to play vendors off each other, do all the things that we want in order to make sure we're getting the right value for our cash, we're getting the right features that we want. So you know, in a lot of ways we didn't build directly against providers like AWS and others. You use tools like Terraform or something else in order to build an abstraction layer so you can quickly move between them.
So when AWS shows up one day and says, "I want to give you a huge bill this month," we can say, "Well, fine, we'll pay it, but next month we're moving," and then that starts a negotiation. And I think we're in the same place right now when it comes to all these AI tools in the enterprise.
You know, I mentioned this in my newsletter this week, Zapier just did a survey of enterprises about whether or not they feel like they could switch off their AI provider. Something like 85, 86% of them thought they could, but when they actually tried to do it only about 30% of them could actually pull it off.
So there's already so much lock-in happening, and you get into a situation where these things aren't under my control, and then the providers can just change model behavior at any given moment. If they're having capacity issues, they could shuttle you off to a different model that might behave differently, so I've lost control of even just how I'm programming against this. And then that thing I mentioned earlier, which is costs: right now I feel they are both kind of inflated and a little out of control. I would've predicted a few months ago that token costs were going to go down. They kind of have, but token usage has skyrocketed, and a lot of that could also be because of model change and how new models and new harnesses just need more tokens, which is to the benefit of the people I'm buying from. So if I were an enterprise buyer, all these situations would make me say, "This seems a little out of my control right now," and I think that's what I really mean.
Michael Krigsman: So would it be accurate to say the issues that you've brought up are loss of control of the model, which is ultimately all-important here, and loss of control on pricing, and obviously not token usage because that's going to be driven by the user, but essentially you're at the mercy of the large platform providers?
Raffi Krikorian: Precisely. I think the models could be changing at any given moment, so writing code against them, or trying to codify your behavior against them, is not under your control, it's under the control of the provider. That seems not great. It might be fine in a prototyping sense, but not in a large-scale deployment sense. What we mentioned on the costs, I think that's definitely one thing that's a little bit out of control. And I also do think token usage is not fully in your control.
Sure, you can decide how much or how complicated a task you're asking, but once I set off my initial prompt, all the different things that happen in the harness could just exponentially grow, and that is also a little bit out of my control.
Solving lock-in through choice
Michael Krigsman: Folks, we're talking with Raffi Krikorian, who is the CTO of Mozilla. You can ask questions. If you're watching on LinkedIn, just pop your question into the chat. So Raffi, one of the points that you mentioned was lock-in. How big an issue is that? And how does open source solve it? And what are the solutions in general to this platform lock-in problem that you were describing?
Raffi Krikorian: I do think it's a big problem in the sense that if I'm running software within my business, I want to know that I have optionality. I want to know that I can move between different people depending on feature sets, depending on exactly what I want to go do. I don't think it makes sense for people to assume that one provider is going to do everything for me. I want to be able to grain and tune it so I can use the features of a provider that match the things I need to get done.
So in some ways, I do think lock-in's a big deal. I think the Zapier survey is a good example of just how people are implicitly getting locked in. I don't think they willingly walked into that situation. In fact, they believed they could walk out of it, but it turns out to be quite hard to go do for a variety of different reasons. And so I think open source is one of the solutions. It's clearly not the only one, but I think it's one of the different solutions.
You know, the open source models are getting really quite good at what they can do in enterprise work cases. We've been experimenting with the open models within our Mozilla.ai group, and that group is building developer-facing tooling and enterprise-facing tooling, so they've been understanding and better using what those models look like. I'm using an open model on my day-to-day coding tasks. I run Qwen 30B on my laptop, especially when I'm on United Wi-Fi and that thing doesn't work. So it provides me a really good experience on my laptop.
So I can pick and choose the kinds of things I want to go do. So if you already live in a world where your assumption is, "I should be able to pick and choose," then the open models give you that ability to do it. And then we've been working on a set of products, one of them that we call Otari, it used to be called Any LLM, and it's basically a router that you can install on your desktop or run in a cloud environment.
If you point to it, it can then switch rapidly between different models depending on exactly what you want to do or what your particular workload looks like right now. So just even embracing the idea that you should be talking to multiple vendors, multiple models, and just going into that mentality, I think gets you into a more stable spot, and then you can start really having a conversation of just this model for this use case, this model for this use case, this provider for that use case. It just seems like a more resilient place to be. And if you're in that space, then open source has clearly got a part of that solution. Might not be the entire thing, or for some people it might be the entire thing, but you can then be the one who chooses.
Michael Krigsman: The focus then for you is on the user's ability to make those choice decisions, as opposed to leaving it to the third-party vendor, essentially.
Raffi Krikorian: Correct. I mean, if you were to abstract away my entire position here, it's basically some question of choice and competition. How can you actually have the end user be the one that's making the decisions, and how can they have a vibrant ecosystem that they can be choosing from, and actually selecting the things that they want to be using from that menu of different things? So if we can get to that world of open systems combined with open source that allows for choice and competition, I think my job here is done.
We are very far away from that world right now. There are lots of things between here and there, but that's the world I'm trying to get to.
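The "this model for this use case" routing idea described above can be sketched in a few lines of Python. This is a minimal illustration of the general pattern only, not Otari's or Any LLM's actual interface: the `ModelRouter` class, the workload labels, and the two stand-in backends are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A backend is any callable that takes a prompt and returns text.
# Real backends would wrap a local model or a hosted API; these are stand-ins.
Backend = Callable[[str], str]


@dataclass
class ModelRouter:
    """Maps a workload label to a backend, with a default fallback."""
    routes: Dict[str, Backend]
    default: Backend

    def complete(self, workload: str, prompt: str) -> str:
        # Unknown workloads fall through to the default backend.
        backend = self.routes.get(workload, self.default)
        return backend(prompt)


def local_small_model(prompt: str) -> str:
    # Hypothetical stand-in for an open model running on a laptop.
    return f"[local] {prompt}"


def hosted_frontier_model(prompt: str) -> str:
    # Hypothetical stand-in for a call to a hosted provider.
    return f"[hosted] {prompt}"


router = ModelRouter(
    routes={
        "autocomplete": local_small_model,
        "architecture-review": hosted_frontier_model,
    },
    default=local_small_model,
)

print(router.complete("autocomplete", "finish this function"))
print(router.complete("architecture-review", "critique this design"))
```

Because application code only talks to the router, swapping a provider means changing one entry in `routes` rather than every call site, which is the kind of optionality the conversation is about.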
Running your own evaluations
Michael Krigsman: We have some questions that are coming in, and so let's jump to those. The first question is on Twitter/X from Anthony Scriffignano, who is a prominent data scientist. He does a lot of consulting work for various agencies with three letters and things like that. He's been a guest on CXO Talk a number of times. And Anthony says this: "Drilling down on your comments about model change without notice, please talk a bit about the challenge of regression testing, simply proving that things still work the way they used to work. Are we losing the battle?"
Raffi Krikorian: Last year there was a really big push into building evaluation suites, building tooling to do evaluations, et cetera. Evaluations are incredibly important. The academic community does it all the time. And I think what we've lost the muscle of, and we need to figure out how to bring back, is for all of us to be doing our own evaluations on the workloads that we really care about. So, you know, in some ways, we look at evals all the time.
We look at the coding bench, we look at all those things to understand how well models are doing, but those are very much in the abstract cases. And frankly, they can be, I mean, I'm hesitant to be saying gamed, but they can be tuned for. You can tune yourself to be able to pass a particular benchmark.
But if instead what we were doing is recording all the different prompts and all the different tasks that we're doing locally, and then creating tooling that allows for model evaluation ourselves, then we can easily figure out, or maybe not easily, but at least have a framework to figure out, how a model selection choice or an agentic harness choice is going to perform on my actual personal workload. So I've actually been thinking a lot about this. You can find it on GitHub.
I built this thing I called Morph, which is basically Git, but for prompt-based workflows. So if you think about what Git does, Git allowed me to take computer syntax that was then compiled down to bytes and stuff those bytes into a version control system. I don't think that really works anymore on these LLM-based coding tasks, and what you instead need to do is record the prompt, and then record the output of the prompt, which in that case is code. Bytes might be the wrong abstraction level.
So if you start doing that, if you start using version control to manage what our prompts look like (what's the prompt, what's the harness, what's the model, when I ran it, all the settings and stuff like that), then we can have reproducible use cases for understanding what my real workload looks like, and then I can start playing that back against other models or other versions. For example, Opus 4h just came out. Everyone is super excited by it.
I'm super excited by it, but I've been recording all the prompts of all the different pieces of code that I've been tinkering on, and one of the first things I'm going to do this weekend is run those prompts against Opus 4h so I can start doing comparisons of just, oh, actually, if I had this model six months ago, this is the direction this piece of code would've gone in.
So if we can start thinking about how to build those kinds of workflows, both in the enterprise case and in the personal coding case, then I think we can get to a more intelligent conversation about whether this model is actually better. Because, you know, at some point you might need the Opus 4h, or in some cases, you know, Qwen 353B might be just good enough. And so understanding what that balance looks like, we can only really do that with data, so we just need the right tools to record that data, I think.
Michael Krigsman: I'll say amen to that. I mean, this whole situation is very ad hoc right now, choosing models and figuring it out. I personally right at the moment pay for the $200 plan for Anthropic and the $200 plan for Perplexity.
Raffi Krikorian: Sure.
Michael Krigsman: And I say to myself, "Who really knows?" Anyway, yeah.
Raffi Krikorian: Exactly. I mean, if you look at me, I'm spending more money on different AI plans than I spend on my cable bill or my television streaming bill right now. Because to your point, I'm just, "I guess I'll try them all and just see which one works best," but we need a better definition of what "works best" means for Raffi, what "works best" means for Michael, so that we can actually then have the tooling that can figure it out for us, stuff like that.
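The record-and-replay workflow discussed in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not Morph's actual storage format or interface: the `PromptRecord` fields mirror the items listed in the conversation (prompt, harness, model, when it ran, settings, output), while the JSON Lines encoding, the function names, the model and harness labels, and the sample timestamp are hypothetical choices made for this sketch.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List


@dataclass
class PromptRecord:
    """One recorded interaction: what was asked, where, with what, and the result."""
    prompt: str
    model: str
    harness: str
    settings: Dict[str, object]
    recorded_at: str
    output: str


def to_jsonl(records: List[PromptRecord]) -> str:
    # One JSON object per line keeps the log append-friendly and easy to diff.
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


def from_jsonl(text: str) -> List[PromptRecord]:
    return [PromptRecord(**json.loads(line)) for line in text.splitlines() if line.strip()]


def replay(records: List[PromptRecord], candidate_name: str,
           candidate: Callable[[str], str]) -> List[dict]:
    """Run each recorded prompt against a candidate model and pair old and new outputs."""
    return [
        {
            "prompt": r.prompt,
            "baseline_model": r.model,
            "baseline_output": r.output,
            "candidate_model": candidate_name,
            "candidate_output": candidate(r.prompt),
        }
        for r in records
    ]


# Sample data is invented for illustration.
log = to_jsonl([
    PromptRecord(
        prompt="write a function that reverses a string",
        model="model-a",
        harness="harness-x",
        settings={"temperature": 0.2},
        recorded_at="2025-01-01T00:00:00Z",
        output="def rev(s): return s[::-1]",
    )
])


def candidate_model(prompt: str) -> str:
    # Hypothetical stand-in for sending the same prompt to a different model.
    return "def reverse(s): return ''.join(reversed(s))"


for row in replay(from_jsonl(log), "model-b", candidate_model):
    print(row["baseline_model"], "->", row["candidate_model"])
```

The paired outputs are only raw material. Deciding which output is better for a given workload still needs a scoring step (tests, a rubric, or human review), which this sketch leaves out.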
Will tiered pricing survive
Michael Krigsman: We have some more questions coming in, and this is from Chris Peterson, and I encourage you guys, ask your questions. Take advantage of this opportunity. All right. Chris Peterson, again on Twitter/X, says, "Do you think that tiered usage plans will continue long-term, or will all the big AI providers be forced into per-token pricing as venture capital wants to exit and not keep funding?"
Raffi Krikorian: There is no clear, exact answer, but I try to look for equivalent analogies. So I used to work at Uber. I ran the self-driving car division there, and you could see it even in Uber that post-IPO, the prices changed. All the subsidization that we did of Gen Z just basically disappeared, and the real true pricing showed up in that situation.
And I'm inclined to believe that this is a similar marketplace, with all that subsidization. Take that $200 plan, where my Moz AI engineer did the analysis and said, "I would have spent $10,000 on API calls." I think it won't go to the $10,000 case, because I think that sort of limits the market. But I do think it's not going to be these $200 plans.
I think that price either floats up significantly or we go into this per-API model, at which point you need to then start making the decision of do I use Opus, do I use Sonnet? Do I run it locally? Do I do something else? So I think the best situation is to start thinking about those kinds of questions now, to try to find the right model for the tasks that you're doing, because I do think the price will change. I think you're right. It's just unclear exactly how it's going to change.
And if you think that I'm wrong, and I might be, then you can make other arguments of just, do I really need to spin up an entire data center's worth of GPUs if I'm just asking, "Write me a piece of code that says 'Hello, world'"? Right? Then just as good engineers, that feels kind of icky. And so trying to figure out what we think is the right engineering size for the tasks I'm doing could be another angle to look at it.
Michael Krigsman: I always want to use the best possible model because if I'm asking an LLM for a solution, I have some intention, some reason behind it. I want the best possible result.
Raffi Krikorian: I don't disagree with you. You know, I was speaking to an engineer the other day who was making the argument that his bosses only allowed them to use the frontier models because they have a belief that it'll make them more efficient, that the frontier model will probably give them the right answer the first time, and they're not going to sit there and whittle it away. And I guess my response to that is just, it's a good belief, but where's the data behind it?
So this is why I'm really harping on this: we need to be recording our prompts. We need to record what models they ran on. We need to personally do our own evals of this, so we need good tooling to do that, so we can actually then make these real logical trade-off decisions. I think I agree with you, Michael.
I would like to be using Opus 48 the entire time, until I get the bill, at which point I'm just, "Okay, but was that really worth $1,500 this month, or could I have done that with $500?" That kind of thing. So I think when the rubber meets the road, fitting the right size is going to be the thing that we all need to do.
Michael Krigsman: Here's a question on LinkedIn from Tim Crawford, who is a major CIO advisor, and he says, "On the topic of pricing, it seems that outcome-based pricing is not granular enough, but token-based pricing is harder to understand."
Raffi Krikorian: Sure.
Michael Krigsman: "Is there a good alternative? It seems this is part of what is causing the drive to locally sourced AI platforms."
Raffi Krikorian: We've been facing this challenge as we figure out pricing on our different products as well. So the Moz AI team has something that they've released that they call Octonous. It's a fully open source, agentic workflow system. So, you know, think of it as a really good agent with really good UI that helps you inside your enterprise connect your email to Notion, to Zapier, et cetera. And right now their pricing model is a bit wonky, because we're trying to figure out how to do outcome-based pricing as well.
It's just, what does it cost to price an action as opposed to all the thought and thinking needed in order to make that action go, and what if the action is super complicated, and stuff like that? So I would love an answer to that. If you understand a better way to go do that, I'm fully open to it. We're just facing the exact same problem. But token-based pricing is so crazy in my opinion.
I was actually having this conversation. You know, Mozilla has a partnership with Mila, the Canadian research lab, and I was up in Montreal just last week, and I was having this conversation with a bunch of AI researchers, and I asked, "Why do the models, when they're doing their internal thinking, effectively do it in English? Why am I looking at a train of thought in English, and then it talks to me in English or whatever is the native language of the prompt?
That clearly can't be the most efficient way for it to be doing it." And they all looked at me and said, "No, that makes a lot of sense. You can probably come up with a better thing that you're doing for inspectability." But then I said, "But then why am I paying for the model thought to be happening in English if you told me there was a better way to do it?" So I think this whole per-token pricing is just frankly too confusing.
It's probably not going to be the end run, but I don't think anyone has solved how do you tier actions, how do you tier impacts? I don't think anyone has solved that just yet, but I'm very excited if someone does because I'm going to copy it, just open source it.
The token explosion and harnesses
Michael Krigsman: One of the things I find most fascinating about all of this token discussion is that although the cost per token is coming down, the expense of tokens is going up because we realize, hey, we can do all of these new things, and so I'm willing to spend more money. I mean, it's kind of a crazy set of economics that way.
Raffi Krikorian: And I don't think a lot of people understand the impact of the different harnesses in which you might choose to run your models. OpenCode is going to have a slightly different behavior than Cursor is going to have, which is going to have a slightly different behavior than Terminex is going to have. I don't think most people understand that. But in a lot of ways those harnesses are also part of that.
Well, I'm not going to call it a problem, but part of what's causing this token explosion is the way that they're going to do their internal calls. Are they going to allow for a call depth of 20 or a call depth of 100, and stuff like that? MCP is part of the issue, of just using natural language to figure out should I even make this call in the first place, and having to do all that thinking.
We've built a whole almost Jenga pyramid of tools that's all trying to do communication via English, and so that token cost is getting a little crazy. There must be better solutions. I'm not smart enough to know what they are, but there must be better solutions.
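One way to see why call depth matters for token usage is a back-of-the-envelope model. The sketch below assumes a simple sequential agent loop in which every model call re-sends the original prompt plus everything appended by earlier steps, with no caching or context trimming. That assumption, the function name, and the token figures are illustrative only and are not measurements of any particular harness.

```python
def total_input_tokens(steps: int, base_tokens: int, tokens_added_per_step: int) -> int:
    """Input tokens sent across `steps` sequential model calls when each call
    re-sends the original prompt plus everything appended by earlier steps."""
    total = 0
    context = base_tokens
    for _ in range(steps):
        total += context                  # this call pays for the whole context so far
        context += tokens_added_per_step  # tool output and reasoning get appended
    return total


# Illustrative figures only: a 2,000-token starting prompt, 500 tokens added per step.
for depth in (20, 100):
    print(depth, total_input_tokens(depth, base_tokens=2_000, tokens_added_per_step=500))
# prints:
# 20 135000
# 100 2675000
```

Under these assumptions, raising the step limit fivefold (20 to 100) raises input tokens roughly twentyfold, because the total grows with the square of the number of steps. Real harnesses differ in how they cache, summarize, or branch, so actual growth may be milder or steeper.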
Automating prompt tracking and privacy
Michael Krigsman: On the subject of tools and the tool that you were describing to help keep track of and audit prompts and results, it seems to me that it needs to be automated in some way because otherwise it simply becomes too complicated, and then you end up with an entire set of privacy issues.
Raffi Krikorian: If you look under my username on GitHub, which is just R, there is a project called Morph. The way Morph works is that when you init your directory, you then tell it which agentic framework you're going to be working in. So you can do, you know, Morph set up Cursor or Morph set up OpenCode, so it'll hook itself in and then just start recording all the prompts as you put them out there. And the nice thing of doing that is that you get it in a standardized format. If you're working in a team scenario and someone uses Cursor and someone uses OpenCode, it all normalizes. You get them all in a single database format, stuff like that. But you're right. I mean, the problem is that a lot of people are, I'm not going to call them lazy, but they don't switch out of the tool depending on what they're doing.
So at one point they might be actually typing in a prompt to do a programming task, but then they might also type in a prompt because they just want to ask a question, of just, "Well, I'm already here, so I'm going to keep on going." So yeah, there's a massive privacy question that comes out of that. The issue there is that you can't really scrub it, because if you scrub it, then you're fundamentally changing what the prompt is, which might make the eval a bit problematic.
So I haven't really thought about how this works outside of personal context, but we clearly need to figure this out in the enterprise context, et cetera. So maybe it's a little more of a social question: in an enterprise, you can promote the prompts that we should be using as part of the eval set, as opposed to just doing it across everything that happened on my computer. So maybe there are things like that.
But, you know, I feel like this will fall into the same problem of, you know, people use their work computers all the time to surf the internet, and they have bookmarks on them. So we're going to have a similar problem. The obvious issue being that people ask more detailed questions in prompts than they do on the web, but we'll just have to solve that.
IT as HR for AI agents
Michael Krigsman: This would be an excellent time, too, to subscribe to the CXO Talk newsletter. Go to cxotalk.com so we can notify you of upcoming shows. Okay. So this is from Arsalan Khan, and he says, "All these AI agents are creating confusion for the employees. Should there be an AI agent that addresses this confusion? Should we have a limit of how many AI agents can be created in an organization?" I like this, a very enterprise type of question.
Raffi Krikorian: There's an analogy that I'm very much a fan of, which is that the IT team is slowly becoming the HR team for agents, effectively. So I do think that if I were inside one of these enterprise environments right now, I would be putting a lot of weight onto the IT team to actually figure out: What are the standards that we want to go with? What are the permission models we want to work with? How do we do observability on all the different things that are happening?
And so I think if we can figure out how to centralize a bunch of those inside the enterprise environment, then I think we can sort of have good policies that then can control the way that things are being rolled out. I mean, this is a new frontier that we're all sort of experiencing, not just on the builder side, but like you were saying, on the deployment side. So understanding and figuring out how to open source those policies would actually be a really good thing.
The question I keep having people ask me is just how do you manage write versus read permissions, like the dangerous triangle that sort of happens within agentic frameworks, where they can read things and they can actuate things? And if there aren't some transactional boundaries across my entire enterprise, how do I make sure I can roll back? Or how do I make sure these are actually idempotent, so if an agent does it three times I don't accidentally bill myself $1,000 each time? That kind of thing. These are all open questions.
I think we're all racing into this because we see what the potential power looks like, but, you know, it is not just a coding problem. This is a whole socio-technical economic commercial thing that we need to be solving together.
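The idempotency concern raised above (an agent repeating an action three times and billing $1,000 each time) is commonly handled with an idempotency key: the caller attaches a stable identifier to each logical action, and the system applies any given key at most once. The sketch below is a toy in-memory illustration of that general technique. The `BillingLedger` class, its method names, and the key format are hypothetical, and a real system would need durable storage and concurrency control.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BillingLedger:
    """Toy in-memory ledger that applies each idempotency key at most once."""
    charges: Dict[str, int] = field(default_factory=dict)

    def charge(self, idempotency_key: str, amount_usd: int) -> bool:
        """Return True if the charge was applied, False if the key was already seen."""
        if idempotency_key in self.charges:
            return False  # repeat of an action that already happened: do nothing
        self.charges[idempotency_key] = amount_usd
        return True

    def total(self) -> int:
        return sum(self.charges.values())


ledger = BillingLedger()

# The agent retries the same logical action three times with the same key.
results = [ledger.charge("invoice-1234:charge", 1000) for _ in range(3)]

print(results)         # [True, False, False]
print(ledger.total())  # 1000
```

The design choice that matters is where the key comes from: it has to be derived from the logical action (here, a hypothetical invoice identifier), not generated fresh on each retry, or every retry looks like a new action.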
Who owns token costs in the C-suite
Michael Krigsman: On that topic, Arsalan Khan comes right back and he says, "Who should care about these token prices in the enterprise? The CFO, the CIO, nobody, everyone?"... Raffi Krikorian: The responsible thing to do is to sort of everyone sort of needs to care about it. But I think you need to be a little bit open-minded about it. So I think that people going on their whole token maxing spree, obviously that could be problematic on my, my end, my end bill. But I think we need to understand. You know, I think of myself, if we just look at myself, I think of myself as a pretty decent engineer. Maybe not the best engineer, I'm maybe a little out of date.
But these systems have made me incredibly productive. A lot of it's because I have a lot of experience in software architecture already, and these systems are amplifying my work. I don't think I'm token maxing like crazy, but there is a cost to doing it. So I do think the CFO is a part of the conversation, in a similar way that he or she is looking at what the headcount costs look like: What is the value we're actually getting out of the employee workforce?
Are we getting the right value for the number of employees, given what the outputs of the organization look like? But given that the world is moving so quickly, I don't think it's purely the CFO. I think it's the CFO in partnership with whatever the right CXO function is, the CIO, the CTO, whoever it is, understanding where we are today, how the trend lines are going to look tomorrow, and what the money impact looks like.
So I think it almost requires a full C-suite conversation, potentially even at some point a board-level conversation, of just how do we want this to work inside the organization?
Michael Krigsman: Yeah, and to your point, we don't have great tools for aligning the inputs with the data that we need in order to make those evaluations.
Raffi Krikorian: No. I mean, I think there's going to be a whole cottage industry that can potentially pop up here of just building HR-like tools, but for agentic workflows. So just how much are we actually spending here? What's actually coming out? How many are we hiring every single quarter? How many are we turning off every single quarter? Who are they all reporting to? I think there are some actually pretty good metaphors that we can use.
We shouldn't obviously follow them to the T, but there's some pretty good starting points, and you can imagine a whole cottage industry of building those tools so that CIOs, CFOs, et cetera, can manage their new agentic.
A hybrid model strategy
Michael Krigsman: Let's grab another question, again from LinkedIn, and this is from Swami Vaidyanathan. He has an interesting question. He says, "Would enterprises develop or should enterprises develop a hybrid approach between renting commercial models and also managing locally hosted models and leverage them based on usage case complexity, cost, risk, and other factors? How do you envision the target operating model?" And I think, Raffi, you were getting to this earlier, describing deploying multiple models based on cost, use case, and so on.
Raffi Krikorian: I definitely think that a responsible organization should get to that. I think when you are just starting off, like if you're a startup, then I think you should do the most inefficient thing you can, but move very quickly to figure it out. I like this analogy all the time of, you know, I used to run a big part of the Twitter engineering team, the entire infrastructure group back in the day, and my team was the team that did the transition from Ruby on Rails to the JVM for a lot of the code base.
Let's call this 2010, 2011. And a lot of what happened after that migration, when we were giving public talks about it, is we'd have new startup engineers coming to us all the time saying, "I want to start writing code on the JVM today. Can you help me?" And I was like, "No, no, no. You should be lucky to get to the place where we had this conversation about how to optimize everything.
You should do the thing that you can do the fastest." I'm not saying Ruby on Rails was a bad thing. I'm saying Ruby on Rails was a bad thing at that moment of Twitter's evolution. So back in the beginning stages, you should be doing the thing you can do the fastest and that you can learn the most from. And in those cases, it probably is just use the frontier models.
Figure out what you're up to, figure out what you're doing, and then when you've reached some point of scaling or efficiency or stuff like that, for me that is the point where we start thinking about what's the right mixture of models? What's the right mixture of providers? Can I do something local or not? And you know, a good place to start, frankly, is that if you're buying your engineers MacBook M4 or MacBook M5 laptops, they can probably run a pretty good local model to do coding.
I live on planes, and I do it all the time because United WiFi never works. So I'm always running a local model on my laptop, and that's usually my first resort. My first interaction when I'm doing my prompt-based stuff is with this Qwen model that I probably have on my laptop running right now.
But then if I have something more complex, or if I seem stuck, and I have a good network connection, I might then make a callout to Claude, to Opus 4.7, Opus 4.8, to just be, "It's stuck. Help me out here for a second," and then I just go back to my day-to-day workflow. So I think you can find small ways that it's showing up already, and that you can make it possible to use already.
But I also, just being pragmatic about it, I really encourage that when you're off starting, just choose something that lets you move quickly, and then figure it out from there.
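The local-first habit described above (try the model on the laptop, escalate only when stuck and online) can be sketched as a small routing function. Everything here is an assumption for illustration: the `Model` callable type, the convention that a model returns `None` when it gives up, and the toy stand-in models, which perform no real inference.

```python
from typing import Callable, Optional

# A model is anything that takes a prompt and returns an answer,
# or None when it cannot produce one (hypothetical convention).
Model = Callable[[str], Optional[str]]


def route(
    prompt: str, local: Model, remote: Model, online: bool
) -> tuple[str, Optional[str]]:
    """Try the local model first; escalate to the remote model only when
    the local one gives up and a network connection is available."""
    answer = local(prompt)
    if answer is not None:
        return ("local", answer)
    if online:
        return ("remote", remote(prompt))
    return ("local", None)  # Stuck and offline: nothing more to try.


# Stand-in models for illustration only; no real inference happens here.
def toy_local(prompt: str) -> Optional[str]:
    # Pretend the local model only copes with short prompts.
    return f"local: {prompt}" if len(prompt) < 40 else None


def toy_remote(prompt: str) -> Optional[str]:
    return f"remote: {prompt}"
```

For example, `route("rename this variable", toy_local, toy_remote, online=True)` stays local, while a long prompt escalates to the remote stand-in only when `online` is true. In practice the "stuck" signal is the hard part and would need a real heuristic.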
Governance shaped by organizational maturity
Michael Krigsman: That's very interesting. You're bringing in an element of governance that we don't usually think about. CIOs typically think about governance as being cost control, security, privacy, things like that. But you're really bringing in another dimension: based on organizational maturity, size, and goals, thinking about the type of model, which gets to results, efficiency, and cost savings.
Raffi Krikorian: All those things matter a lot, and it is actually one of the things to think about as you're deploying on-prem versus using a platform provider. I want my data to be protected by architecture, not legal handshakes. So, you know, the platform provider's privacy policy might say, "We will never read your code," but I want to know for sure it's not going to happen, so a locally hosted model could be a path in that situation.
I don't think it's necessarily a unique thing to say that we should be working along this evolutionary curve. I just generally think of it as what an engineer would do. You shouldn't prematurely optimize your code base. You want to get something working to an MVP as quickly as possible, because that'll help you understand the shape of the problem. It'll help you understand the bounds. So get to that MVP that you can ship quickly, and then we figure out how to optimize from there.
Owning the full AI stack
Michael Krigsman: Let's get to another question, and here's an interesting one from Noah Crowe on LinkedIn, who says, "What has to happen before an individual or a small organization can realistically own and operate a capable AI system end to end, including the model, memory, data, orchestration, and inference, without dependence on a major cloud provider?"
Raffi Krikorian: The analogy I like to talk about all the time is when I'm just doing my weekend hacking, and I just want to do a quick prototype of something, I'm probably calling ChatGPT, or calling, you know, GPT 5.5, as part of the API call, because it's just the simplest thing to do so I can make sure something's up and running. I don't have to debug anything else. And I think that's the vast majority of developers out there. It sounds like it's a question for you too.
And right now it's because there are no good, credible alternatives to the entire system that someone like an OpenAI or an Anthropic provides to you. It's not just the API. If you really think about it, they already have the GPUs. They already know how to orchestrate them. They've built the RAG layer. They have tool calling. They've done all the stuff that they've just abstracted behind the scenes for you. So just having the model is not enough. Now, if just having the model was enough, you could do a lot.
You know, MLX on Apple Silicon works really great. Ollama works really great. There are ways to go do this. But if you want to really think about this in an SMB or an enterprise situation, which is like, I have a rack of servers in my closet. Fun fact, it turns out that something like 70% of enterprise GPUs sit idle most of the time, so I think there's a big opportunity to do something here.
But if you have a rack of GPUs sitting in your closet, and I'm pointing to my closet right now, or you have a rack sitting somewhere inside the building, and you want all your developers to be able to make use of it, it's not easy to set up. Just being very brutally honest. You're probably going to do Kubernetes, but then you've got to choose what's going to be your RAG layer. You've got to choose a model. Maybe it's DeepSeek.
You've got to choose all the different components, and then you've got to do a weekend or a week's worth of SRE work in order to get this all working properly. And so this is actually one of the problems I'm working on right now, of just how can Mozilla help figure out how to get this going. You know, I don't know what's a good analogy for it. The LAMP stack for inference, or a J2EE stack for inference.
How do we actually build an opinionated system that could be installed so that someone can just get off the ground and start using it? And when there are small things they want to change, you know, you just apt-get install something else, and just swap out that layer. So it doesn't really exist right now, but I think it could.
You know, if you think about what was the brilliance of Ubuntu, you know, there is a world of code out there in order to make a Linux distribution, but Ubuntu had, you know, a little bit of taste-making of what are the packages that we can fit into a CD that someone can pop into their machine and just get up and running? And I think someone needs to build something similar to that for an open source stack, of just let's just choose the right components that make up this LAMP stack for AI.
Let's just bundle it together. It's something someone can just pip install, apt-get install, whatever, onto their server, and it's up and running. And then if we want to tweak it, customize it, et cetera, then you can do that next. This is one of the things that Mozilla's working on right now: how do we just build that container code base that someone can just apt-get install? But I wish it was done already. Not done yet.
Filling the gaps in the open stack
Michael Krigsman: Chris Petersen on Twitter says, "Is Mozilla providing support for open source model and harness developers or organizations that are building open standards to empower agents?"
Raffi Krikorian: We are not playing as much in the open model world as we are in the rest of the stack, and only because a lot of people are already playing in the open model world. So like I really view that our job right now is to try filling the gaps in so we can rapidly get to that thing that you can apt-get install and have it just working inside your rack in your data center. And so I don't think models are currently the problem there.
I think the problems are the other things you mentioned, of just like how do we build an open system so we have interoperability between all these different layers of the stack? What truly are the missing layers that need to get invested in? I think the agentic harness is this new thing that we're now all talking about that's an interesting opportunity, but then that still needs a fully open stack underneath it for you to have real trustworthiness around what's truly going on inside that system.
So I think it's that open system, the interoperability things, that we're going to be leaning in on the most, and trying to make sure the community is doing the rest of the pieces. If you look at one of my newsletters recently, I write a newsletter called Owners Not Renters, we did a whole analysis of what an open stack could look like. So we took, you know, everything from bare metal all the way up to, at that point, providing an OpenAI-compatible API. I think we just released another version of the newsletter that also had the agentic harness.
But we took what that layering looked like from hardware all the way to an OpenAI-compatible API, broke it down into all the individual components, and then literally did a catalog of all the open source projects that fit into every single one of those layers. And it turns out almost all the layers are covered. Like, you could do this today. I mean, the map is kind of red and screaming at us when it comes to enterprise readiness, but all the pieces exist.
What's missing right now is making sure they're all compatible with each other, that we can tie it all together in a very simple and easy-to-use kind of way, and then some battle hardening so that an IT team wouldn't, you know, vomit when they saw this thing show up on their doorstep. And so I think that's the kind of stuff that is Mozilla's unique secret sauce. Like, can we play an orchestrator in the ecosystem to pull that together?
Whose side is the model on
Michael Krigsman: Anthony Scriffignano comes back and he says that using the best model to get, quote-unquote, "the best result" can lead to confirmation bias. He always tries to get a dissenting opinion and also provenance: Why is this the best answer?
Raffi Krikorian: I wrote this piece in The Atlantic a few months ago called The Validation Machines, and it was specifically targeting what you were just saying, of just like you get lulled into believing these are the best results because of the confidence in the way that they talk about it. The main analogy here is that we've all underestimated, I think, what the 10 blue links on Google have actually done.
Like, my mom doesn't really understand what PageRank is, but when she sees the 10 blue links, she's just like, "I kind of understand what's going on." And we are potentially falling into that world of, when you ask the best model, it'll give you the best result. Like, how did that best result come out? There was actually just a paper recently. It's maybe some minor divergence from the enterprise use case, but I think it's good illustratively.
There's a paper that recently came out, I think a joint one between Princeton and UW up in Seattle, where they did an analysis of asking all these agentic systems about purchasing decisions, like I need to buy X, Y, Z. And it turns out, maybe unsurprisingly in retrospect, that a lot of the platform systems recommend products which have been sponsored more often than not.
And in some situations there seems to be a lot of evidence, and again, I defer to very smart researchers at Princeton and UW, that these systems also do a little bit of socioeconomic status polling of you to try to figure out how to maximize the number of dollars they can extract from you when you're asking this question. So you're right. It's not just confirmation bias. It's a clear example that these systems are not on your side, they're on someone else's side.
So if you bring that back to coding or if you bring that back to workflows, now I think really the question is just like how do you have trust that it's done the right thing? And so like on the philosophical level, do you have trust because you're using an open system that you're running yourself? You know, maybe someone else's biases were embedded, but it's not like making a network call to figure out which is the promoted thing that might get some kind of dividend or cut out of.
So there's a little bit of trust you can gain by running it locally. There's a little bit of trust that you can gain by it being open source. But I think, Anthony, you're right. Provenance, being backed up with data, understanding the sourcing of the information, that's where true trust is going to come in, and I just don't believe that we're going to get to a place where the larger providers are going to be incentivized to give that information to you. But the open systems could be.
So I want to figure out how to make that happen.
Control beyond cost savings
Michael Krigsman: Lisbeth Shaw, who says, "How does creating your own open source AI infrastructure address the issues you've talked about, except for avoiding token pricing and cost?"
Raffi Krikorian: In a lot of ways it's not just token pricing and cost. I'm not sure I'm advocating for create, but when you use one of the open systems, then at least you have full control over what's going on and what's happening. We saw this in some of the ChatGPT migrations where, you know, they bumped a version number on ChatGPT. A lot of people had their code pinned to latest as opposed to being pinned to a specific version, and all of a sudden the behavior just fundamentally changed.
And so yes, pinning it to a version number might have alleviated some of that problem, but there's also a lot of evidence that when they're under capacity load, the behavior of these systems changes. Who knows what's going on behind the scenes? I have many theories, but none of them are necessarily backed up. But we've noticed that behavior changes have happened to these systems when they're under load versus when they're not.
And so I think having control over your own destiny is one of the biggest gains that you can get by running one of these systems yourself. Now, you know, there are obviously going to be trade-offs when you do it. When I was doing a lot of work in politics, a lot of the reason why emails got hacked is because people were running on self-hosted servers versus running on the hardened servers that the big platforms provide.
So there are trade-offs that have to be made in these systems, but I believe that at least for most CIOs, CTOs, et cetera, we can navigate the trade-offs if we're just told what they are, and if we know there are viable alternatives that we can be working with. And I think in a lot of situations, especially in this world of AI that we're working under, we just assume that the only option is to give our credit cards to one of the platform providers.
It's a good option, but then you should just understand what all the trade-offs are.
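The "pinned to latest" failure mode mentioned above is easy to check for mechanically. The sketch below is an assumption-heavy illustration: it treats a model identifier as pinned only if it ends in a `YYYY-MM-DD` date stamp, which is a made-up convention for this example, and the identifiers and the `unpinned` helper are hypothetical rather than taken from any provider's API.

```python
import re

# Assumed convention for this sketch: a pinned identifier ends in a
# date stamp such as "-2025-01-15". Real providers use varying schemes.
_DATE_STAMP = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def is_pinned(model_id: str) -> bool:
    """Return True when the identifier ends in a date stamp."""
    return _DATE_STAMP.search(model_id) is not None


def unpinned(config: dict[str, str]) -> list[str]:
    """Return the names of config entries whose model is not pinned."""
    return sorted(name for name, model_id in config.items() if not is_pinned(model_id))


# Hypothetical service configuration mapping a task to a model identifier.
config = {
    "summarizer": "example-model-latest",
    "coder": "example-model-2025-01-15",
}
floating = unpinned(config)
```

A check like this could run in CI to flag the `summarizer` entry before a provider-side version bump changes its behavior. As noted in the conversation, pinning does not address behavior that shifts under provider load.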
Building human-centered AI
Michael Krigsman: This is from Nate Angell, and Nate Angell says, "How can Mozilla help ensure that AI is always human-centered, always in human-centered loops, rather than the cliché we keep hearing about having humans in the AI loops?" I'm so glad he asked that, because human in the loop has become this meaningless buzzword. So it's an interesting question.
Raffi Krikorian: I do actually think Mozilla is one of the organizations that's uniquely able to do this. It's one of the reasons I gave up my board seat in order to do this work as an operator within this company. I think it's this combination of our nonprofit status, which means that we can do things that benefit people and benefit humanity as opposed to a bottom line. We need to be sustainable, don't get me wrong on that.
But the fact that my top line every day is driven by a mission and not by a bottom line, I think, makes us a unique organization that can go tackle this problem, and we have a history of doing things like this. Firefox was effectively created as a way to push back against the monopolization of the web by Microsoft in the '90s.
And because of Firefox's existence, and because of the way it was deployed and used throughout the world, and because of the way that people came to trust it, it forced the web to become open in a bunch of different ways. So I'm looking to do a very similar play when it comes to how we think about AI and how we get to human-centered AI. So our Mozilla Foundation, the foundation side of the organization, is doing all this programmatic work around right-sizing AI, about making sure AI is built on the side of humanity.
How do we think about augmentation instead of automation? All that kind of thinking and programmatic work trickles back down to the products that we're building. Now, the real thing is it's harder than just an ethos thing. Most companies don't have the ethos we have, so I think we already have a leg up there. But it comes down to actual product decisions that are just hard, especially in this agentic world.
You can imagine a world where we build all these agentic systems, Octonous being one example from the Moz AI team, the workflow builder, and what it could do is it could literally ask you for your permission every single time it wanted to take an action.
The problem with that is we're just going to get into cookie banners all over again, where for a while people will think about it and eventually just hit approve, approve, and then they'll put, you know, a rubber duck on the keyboard that just hits the approve button over and over and over again. So I think we are in this position that we want to make this ethos work, and now we are actually transitioning to really actively thinking about the UI, UX, the actual human interaction in order to pull that off.
Do we want to baseline the way agents work and then try to flag a human down when an anomaly happens? Do we need to figure out what transactional boundaries look like so you can actually roll back things in case a mistake was made? There are huge computational, architectural, UI, and UX problems that need to be solved. But I like to believe that Mozilla is one of the few places that are really incentivized to lean in and solve it, whereas everyone else might just go do the willy-nilly thing. So I don't know.
I mean I like to believe we're the trustworthy people, that we've demonstrated over the past 25 years that we're trying to do things on the side of humans, and using technology to make us better, trying to push for openness in a bunch of systems. We're just going to have to do it again. It's just really freaking hard, but we're going to just have to do it again.
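One way to read the "baseline the agent, flag a human on anomaly" idea raised above is as a simple statistical check in place of an approve-everything prompt. The sketch below is only an illustration under stated assumptions: the `needs_human_review` function, the minimum history of five observations, the three-standard-deviation threshold, and the refund amounts are all hypothetical choices, not anything Mozilla has described.

```python
from statistics import mean, pstdev


def needs_human_review(
    history: list[float], value: float, threshold: float = 3.0
) -> bool:
    """Flag a value that deviates from the agent's baseline by more than
    `threshold` standard deviations. With too little history, always flag."""
    if len(history) < 5:
        return True  # No baseline yet: keep a human in the loop.
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return value != mu  # Perfectly constant baseline: any change is unusual.
    return abs(value - mu) / sigma > threshold


# Hypothetical baseline: dollar amounts of refunds an agent issued previously.
baseline = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0]
```

Here a $23 refund passes silently while a $1,000 refund would be routed to a person. The design trade-off is the one from the conversation: fewer prompts means less approval fatigue, but the baseline itself has to be trustworthy.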
Can open source beat big tech
Michael Krigsman: As you said, Mozilla beat Microsoft in the browser wars. Can open source beat big tech in AI? And what does winning look like?
Raffi Krikorian: I think, just being really honest, it's going to be hard to win on the frontier-level stuff. On the frontier-level stuff, I think we get to a world where the amount of compute needed, the amount of data needed, the amount of money needed just becomes a little out of reach.
I like to jokingly say to people that Mozilla only has $1 billion to deploy in the grand scheme of things, but $1 billion is a drop in the bucket when you look at the amount of money that the Anthropics or the OpenAIs are raising out there in the world. So I think in the frontier world, the battle's going to be very hard.
But I think in the everyday use case world of I want to just do things, I want to do actions in the world, I want to do web queries, I want to synthesize information, I want to add things to my calendar, I want to figure out how to coordinate the summer camps between my sons and my sons' best friends, I think that the open source models are already proving that they can do those types of things.
So I think that it's possible that we get into a world where the closed big frontier models are doing scientific breakthroughs, are being the ones that are the co-pilots inside a wet lab, are doing the things that are trying to invent new forms of mathematics for us. For the everyday use, you know, I want to live in a world where we have seven billion AI agents, one for each of us, who are doing things on our behalf.
I think open source can get us there pretty fast and pretty well if we focus on it. So I'm optimistic about those kinds of things.
The rounded corners moment
Michael Krigsman: So you're optimistic that we can eventually live in a world where enterprises don't have agentic lock-in?
Raffi Krikorian: 100%. I think we could get there today if we really wanted to. But, you know, these things take time to play out, but I'm 100% optimistic we can get to that world. You know, we're sort of in this rounded corners moment. When cell phones first came out, they all had sharp corners, and they looked like clunky boxes. And then Steve Jobs polished the corners and made them really usable for the rest of us, and now they're indispensable in our day-to-day life.
I think open source AI is kind of in this rounded corners moment. It's just a little too hard to use. That small amount of friction deters most developers from actually tinkering with it, but if we can deliver them a rounded corners version of it, then it can be a viable alternative, and I think that will change the curve in a similar way to Firefox. The vast majority of the web sadly does not use Firefox, but enough of the web does that it's caused an entire industry to move. I think that's the play.
Michael Krigsman: But I do have to say that it's not just an issue of cost and control and ease of use, but also of demonstrating the caliber of result that you can get from the major frontier models, because if you don't have that, then you will be relegated to relatively low-value problems.
Raffi Krikorian: Sure. I think we do need a bunch of, what's the HBS case study of how these things are being deployed and actually made useful inside organizations. I agree we need things like that. Mozilla sometime in July will be releasing what we're calling the State of Open Source AI report, and part of it will have some case studies of just how this stuff is being used in enterprises, and how it's actually making a difference. Pinterest being the canonical example that everyone throws around.
Q3 of last year, Pinterest deployed open models instead of closed, and saved something on the order of $10 million that quarter alone by just making that switch. So I think if we can tell those stories and find other stories like that, if companies like that are willing to do it on those types of workflows, and it makes economic sense for them to do it, it becomes a no-brainer. The rest of us should be doing it too. But the difference is that Pinterest has a bunch of A+ engineers that can make it happen.
The question is how do we make it so that every SMB, every enterprise, every Fortune 500 can make use of the exact same transformation?
Inside Mozilla.AI's products
Michael Krigsman: You've spoken a little bit about Mozilla.ai. We have only a few minutes left. Tell us briefly what you're building there, and also what is the gap that Mozilla.ai is filling that no one else will?
Raffi Krikorian: Mozilla has undergone a pretty big restructuring itself, and so now we're a portfolio of companies, all with slightly different business models, all tackling different parts of the internet and AI ecosystem. There's the Mozilla Corporation, which works on Firefox, that everyone knows and loves. Mozilla AI is now one of these newer companies, where we're specifically focusing on the developer ecosystem. So we actually have a few products that are open right now.
We are in this mode of rapidly creating ideas and figuring out which ones are actually needed in the space, and then doing GTM after that. But Mozilla AI has a few things, Otari being one of them. That's their locally deployable, OpenRouter-like system, where you can point your code at Otari, and it can then figure out the best model to use, the right model to use.
It'll track token counts, track costing for you, give you gorgeous dashboards, and then we can then do so much more after that point once we're sort of installed in that way. So it becomes your model routing layer. We have something that we call CQ that I jokingly call Stack Overflow for agents. So imagine a world where Raffi is using his agent in an enterprise work case, and Michael's using his agent in an enterprise workplace. The question is: How do we get institutional learning?
So in the regular enterprise case, we have whole departments just figuring out how to share knowledge between people, and how do we bring new training in, and stuff like that. CQ is meant to do that for the agentic workflow. So Raffi's agent and Michael's agent can then be talking to CQ as the common place to share the tasks that they've been developing, share the way that they've been manipulating the code base, so that everyone's agent is rising. It's like a rising tide all together across the entire enterprise. So that's CQ.
And then we have a last one that we call Octonous, which is our open source workflow development system. So we've built really good tools to be able to orchestrate a whole bunch of IT-friendly things, so you know, your Slacks, your Notions, your emails, et cetera. We have some really good open source tools there that can then be tied into either a CQ or an Otari to make it work really well. So we're really going after the developer niche on Mozilla AI, so that's one of our AI bets.
Another one is, you know, on the Thunderbird team, the team that makes the email client, they've just created something that they're calling Thunderbolt, which is basically the open source chat client that can be working, again, in an enterprise use case. Because in a lot of situations, a lot of these companies don't want to just have all their documents indexed and put up into Claude for then Claude to serve.
So if you use a locally hosted model inside your environment, you can then run Thunderbolt on top of it, which will then start figuring out how to access all the different documents, build RAG databases off of all the internal company knowledge, and then expose it through a Thunderbolt interface so that you can actually have really contextually aware chats with the entire knowledge system that's going inside of your enterprise.
So those are just two examples of ways that we're trying to make a dent into being locally hosted, human-centered systems that can really work and still bring these benefits of AI technology up, but under a way that's more under your control.
The rebel alliance for open AI
Michael Krigsman: David Quirke on LinkedIn says, "Mark Surman from the Mozilla Foundation has put out the idea of creating a rebel alliance," which he says is such a great message. How is the whole Mozilla portfolio ecosystem approaching this? And very quickly, please.
Raffi Krikorian: We have a whole Moz VC team, a ventures team, where we're doing a bunch of investing across the space, but we're also just building partnerships. We're starting to work with the Mistrals of the world. We're starting to work with the Hugging Faces of the world, which is actually in the VC portfolio, and trying to figure out ways for us to all work together as we build that single open source AI stack. That rebel alliance, I think, can then sum up to something greater than the big platform providers.
Michael Krigsman: And with that, we are out of time. A huge thank you to Raffi Krikorian. He is the Chief Technology Officer of Mozilla. Raffi, thank you so much for being with us. I'm very grateful to you.
Raffi Krikorian: No, thank you, Michael. I hope you have me back one day.
Michael Krigsman: I hope you'll come back. Everybody, thank you for watching. You guys, as always, you guys are amazing, the questions you ask.
Before you go, subscribe to the CXOTALK newsletter. Go to cxotalk.com. We want you to join us. We have truly incredible shows coming up. I always say that, and you know what? It's really true. We have great guests.
Okay, folks. Have a great day, and we'll see you again next time. Take care.

