Video: AI Governing AI: Mastering Real-Time Control with SAGE | Duration: 2708s | Summary: AI Governing AI: Mastering Real-Time Control with SAGE | Chapters: Introduction to Sage (35.870003s), Understanding AI Agents (88.8s), Agent Risks (156.275s), Agent Policy Governance (242.885s), Policies vs Guardrails (308.47s), Policies vs Guardrails (397.97998s), Agent Security Risks (471.81s), Agent Control Systems (561.32s), Rubrik's Resilience Platform (614.82s), SAGE Architecture (700.56s), Live Policy Demo (997s), Real-Time Tool Blocking (1168.9s), Real-Time Tool Blocking (1309.45s), Wrap Up & Next Steps (1394.1901s)
Transcript for "AI Governing AI: Mastering Real-Time Control with SAGE":
Hello, everyone. Thank you for joining us for our webinar where we will dive into our brand new policy engine, Sage, the AI that governs AI. My name is Varun Grover. I lead product marketing for AI here at Rubrik. Joining me are Dev Rishi, general manager of AI, and Arnav, technical lead manager of AI here at Rubrik. Static tools are not meant to govern probabilistic agents. You need a solution that uses AI. With that, we are excited to dive into Sage. Throughout this entire webinar, we will have the Q&A tab live. So please drop your questions in, and we will try to answer them as you drop those in. And I'll pass it over to Arnav to dive deeper into our capability.
Awesome. Thanks, Varun. So today's webinar is focused on governing AI agents using AI. So I wanna start by making sure that we have a good shared understanding of what an agent is. At least to me, an agent is an LLM, which serves as the brain, and a set of connected tools, which we can think of as the hands. Together, this looks like a model like Claude Opus or Gemini Pro that has been given API access to your most sensitive systems, so Gmail, SAP, Salesforce, or Slack, by a set of tools that you define or install. And with these tools, it is able to take actions on your behalf, such as reading emails or customer support tickets, writing marketing emails, or even responding to your Slack messages for you. If you think about this, giving it both the brains and the tools to take actions on our behalf makes agents uniquely superhuman in their capability, but not always in a good way. Humans are generally thoughtful and slow and very good at making decisions and getting work done. However, they can be slightly unpredictable. Over decades, we have built enforcement frameworks to limit the scope of damage that may come from irresponsible human behavior, particularly in the enterprise, such as strong RBAC systems with clear roles and permissions.
However, agents are autonomous and rapid, acting at scale. They're often weakly supervised, which makes them quite unpredictable and, frankly, irresponsible when compared to deterministic traditional software and even humans. And this aspect poses a set of new challenges that we still need to solve for as we start to adopt AI agents. AI agents are incredibly powerful, but they cannot be trusted. And this is really just by design. It is a crowning feature and also a bug. Oftentimes, we tend to attribute mistakes to LLM hallucinations. And hallucinations by themselves are not a big deal when we just think about the LLM. But when you think about AI agents, where the LLMs are connected to tools, these hallucinations become executed actions, which can have many damaging consequences. So what happens in the wild when you combine inherently unlimited autonomy and difficulty in tracing, all at this really massive scale? The best way to understand the implications is by talking about two case studies, Replit and Air Canada's AI chatbot. So in July 2025, there was this post where an indie hacker who had built a pretty big business, and was using Replit to vibe code their AI app, had the Replit agent drop their entire production database during a code freeze, when it shouldn't have done that. So that had really big implications. And similarly, Air Canada built a chatbot for customer support, and the chatbot invented a refund rule, which Air Canada had to honor because the court required the airline to honor it. So, financial implications. And traditionally, most organizations have added input and output guardrails or a set of static rules to their AI agent workflows to prevent these kinds of mistakes. However, these are actually insufficient because they don't govern the intent of the actions that the agent is trying to take. So how can we prevent our agents from making these kinds of decisions on our behalf?
We at Rubrik believe that the right way to govern agents is by defining a set of policies that can be enforced in real time through a semantic AI engine across your organization's fleet of agents, and we're calling this engine Sage. And I wanna take a moment to explain what a policy is and how it differs from guardrails, which is a term that is more popularly used in the agent landscape and one that you might be familiar with. A policy is a safety envelope for nondeterministic agents. And unlike traditional rules, these are machine and human readable and enforced in real time. They define exactly who can do what, when. And this concept is, essentially, factored into things like agent identity, the set of tools, and at what time of day, under what conditions, and in what specific scenarios agents are allowed to operate and perform these actions. And policies differ from guardrails in three main ways: what they focus on, the level they operate at, and the type of risk mitigation they help with. Guardrails typically serve as a politeness layer. They keep the chat window clean. They focus on speech and content to prevent toxicity or bad words. They control what the user sees in the chat window, and they effectively avoid offensive or biased outputs, or reputational harm from the agent saying things that it shouldn't. However, policies differ in that they operate at the permission layer. They control the system back end. So they prevent bad outcomes like deleting a database. The advantage of a policy governance layer is that it avoids things like data loss, data exfiltration, financial liability, and so much more. It helps mitigate operational risk across your agent fleet. And to see why we believe policies would have saved the day in the examples we showed you earlier, it's worth revisiting what guardrails would do and how a policy would actually prevent the issue from happening.
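The "who can do what, when" framing above can be sketched as a small data structure. This is a minimal illustration only, not Sage's policy format: the `Policy` class, its field names, and the example agent and tool names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record: who may do what, and when."""
    agent_id: str              # who: the agent identity the policy applies to
    allowed_tools: frozenset   # what: tools this agent may call
    window_start: time         # when: start of the permitted time window
    window_end: time           # when: end of the permitted time window

    def permits(self, agent_id: str, tool: str, at: time) -> bool:
        # All three dimensions must match for the action to be permitted.
        return (
            agent_id == self.agent_id
            and tool in self.allowed_tools
            and self.window_start <= at <= self.window_end
        )


# Example values are assumptions chosen for illustration.
support_policy = Policy(
    agent_id="support-agent",
    allowed_tools=frozenset({"read_ticket", "reply_to_ticket"}),
    window_start=time(9, 0),
    window_end=time(17, 0),
)
```

Unlike a content guardrail, a check such as `support_policy.permits("support-agent", "delete_database", time(10, 30))` looks at the action itself rather than the wording of the agent's reply.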
So for example, in the case of Replit, where the agent deleted an entire company's database during a code freeze, would a guardrail have helped? Not really. The agent wasn't being told to not be rude or to respond in the company's style or anything to that effect. Right? It was actually being too efficient in solving the issues. A policy could have helped instead. So imagine you had a tool-blocking policy that intercepts the npm run db:push command that your agent ran, because the code freeze condition was active. Similarly, in the case of Air Canada, where the chatbot invented a refund rule, it was a hallucination. Right? But the language itself was professional, so a guardrail wouldn't have stopped it. Maybe it would have helped catch that it was a hallucination, but that depends on many things, like context. However, in the world of policies, you could have had a compliance policy that flags that a financial commitment like a refund is being made without verifying the official policy via a trusted API. These are just two examples of how all of this could work, but you can imagine why policies are so much more powerful compared to guardrails. And with that, I'm gonna hand it over to Dev to talk to you a bit more about AI agent risks and what we're building at Rubrik.
Yeah. Thanks for that, Arnav. I think, in really doubling down on the examples that Arnav has given, one of the key things that stands out is that existing infrastructure and IT and security departments haven't really been created to manage this new class of risks that agents bring about.
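The code-freeze scenario described above can be sketched as a check that runs before a shell command reaches the system. This is a minimal sketch under stated assumptions: the command spelling `npm run db:push` is assumed from the spoken "NPM run DB push", and `DESTRUCTIVE_PREFIXES` and `check_command` are hypothetical names, not part of any Rubrik product.

```python
import shlex

# Hypothetical list of command prefixes treated as destructive during a freeze.
DESTRUCTIVE_PREFIXES = [
    ["npm", "run", "db:push"],
    ["dropdb"],
]


def check_command(command: str, code_freeze_active: bool) -> str:
    """Return "block" or "allow" for a shell command the agent wants to run."""
    tokens = shlex.split(command)
    if code_freeze_active:
        for prefix in DESTRUCTIVE_PREFIXES:
            # Block when the command starts with a destructive prefix.
            if tokens[: len(prefix)] == prefix:
                return "block"
    return "allow"
```

The condition (an active code freeze) is part of the decision, so the same command is allowed outside the freeze and blocked during it. A static prefix list like this one is exactly the kind of rule the talk argues is too narrow on its own; it only illustrates where the interception happens.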
So whether it was the invented refund rule, the dropping of a production database, or the consistent set of tools like Open Call and others, what we find is that the feature of agents, their ability to be creative and inventive in solving problems, oftentimes also becomes a security nightmare with respect to how you actually secure these platforms from doing anything that may go wrong. Now the way that we think about these is that, many times, these are not necessarily hallucinations. They're not the agent just coming up with a piece of content, an incorrect fact, that it's surfacing to a user. Increasingly, as agents get access to tools and API calls via MCP, or direct access to APIs via REST, what we're seeing is that they're actually becoming executed actions. Which is why one thing that we've really been seeing firsthand is that agents now have the opportunity to cause 10 times the damage in one tenth of the time compared to their human counterparts. So what are the core tenets of the solution that we think we need in order to solve agentic risk? We really think about it in terms of three key elements. And so if we get to the next slide, we'll see that the first is visibility. How do you see agents in action? What we consistently hear is, how do you understand what an agent is doing and automatically populate an agent inventory everywhere agents might run? The second is a system to place those agents under control. And Arnav has given a sense of what types of controls we might want to put on an agent, in terms of both guardrails and policies that make sure that they act safely.
And then there's one thing that I think we probably don't talk about enough, which is that at some point, these systems of controls are going to be 90, 95, 98% effective, but there will be cases where agents actually do end up making a mistake, and the question is how we allow people to recover from those mistakes when that happens. At Rubrik, we believe that the key to delivering these core elements is resilience. We think about resilience as a connected control plane across the three key areas where AI enterprises are actually storing and maintaining critical parts of their internal infrastructure. Rubrik has its core background in data security: because we provide backup resilience for some of the world's largest organizations, we understand where the data is actually being stored and what the metadata and data associated with those backups are. That's given us an understanding of things like where sensitive data is inside of your organization. Rubrik also has a key understanding of identity, which gives us a sense of how to help organizations recover their identity systems after an attack, and how to get preventative before an attack as well, to know things like indicators of exposure for any given identity that may be compromised. And then finally, what we're bringing to the market with our AI agent operations platform, called Rubrik Agent Cloud, is a core understanding of the agents and the models themselves. The key insight that we believe in is that with this combination of data, identity, and agent metadata, we have the most compelling signals and context to actually secure and govern the AI future that organizations are gonna be diving into. And on the next slide I wanna show you the reference architecture we think about for the solution to this problem.
We call this method SAGE, which stands for our semantic AI governance engine. It's a new way of thinking about AI security and governance that we think will become the way that organizations look to secure and lock down their agents. The core intuition behind Sage is that you can no longer rely on old, conventional rules-based policies to govern and secure the way agents are operating today and tomorrow. The combination of what an agent can do oftentimes extends past what you would be able to enforce in a given rule. A great example is that my agent might have access to data in Salesforce, and it may have the ability to write an email. And that is completely intended. There's no posture limitation on what that agent should do. But what I wanna make sure the agent doesn't do is take sensitive data from Salesforce and then output it via an email to an external audience. These sets of combinations, I think, prevent us from being able to use any core static rule. And instead, we've decided to take the approach where an organization or customer can define the policies that they wanna enforce directly in natural language. So in something as simple as English, you can say, do not share sensitive data externally via email. What we allow you to do is define that policy and then convert that semantically into what we consider our policy configuration. We automatically populate key parts of the definition and allow you to edit those, and then distill your policy into a proprietary small language model that can run on every input and every output of an agent. This small language model looks through every prompt, response, and tool call that an agent might be making, and determines whether the action that the agent is taking, or a prompt that the user has provided to that agent, is something that we would want to allow, alert on, or block altogether.
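The Salesforce-then-email example above is about a combination of actions within one session, with each event resolved to allow, alert, or block. The sketch below illustrates that shape only. The `judge` function is a hand-written stand-in, not a small language model, and every name in it (`Verdict`, `Session`, `INTERNAL_DOMAIN`, the tool names, and the choice of which case alerts) is a hypothetical assumption rather than Sage's behavior.

```python
from dataclasses import dataclass, field
from enum import Enum

INTERNAL_DOMAIN = "example.com"  # assumed internal mail domain


class Verdict(Enum):
    ALLOW = "allow"
    ALERT = "alert"
    BLOCK = "block"


@dataclass
class Session:
    """Tracks what an agent has touched so far in one session."""
    read_sensitive_data: bool = False
    events: list = field(default_factory=list)


def judge(session: Session, tool: str, args: dict) -> Verdict:
    """Stand-in for the policy model: a hand-written check, not an SLM."""
    session.events.append(tool)
    if tool == "salesforce_query":
        # Simplification: treat every Salesforce read as touching sensitive data.
        session.read_sensitive_data = True
        return Verdict.ALLOW
    if tool == "send_email":
        external = not args.get("to", "").endswith("@" + INTERNAL_DOMAIN)
        if external and session.read_sensitive_data:
            # The combination (sensitive read, then external send) is blocked.
            return Verdict.BLOCK
        if external:
            return Verdict.ALERT
    return Verdict.ALLOW
```

Each tool is permitted on its own; only the sequence within a session triggers the block. That is the property a single per-tool rule cannot express, and it is why the verdict depends on session state rather than on the current call alone.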
There are two additional components of Sage that I think make it a really powerful governance engine. The first is that it can also pick up on anomalies over time. So if your agents have been acting in a way that's relatively consistent and we start to notice deviations from that, that is something that is built into the platform in an intelligent way, so you get the adaptive intelligence out of the box. And then finally, we think that policies have always suffered from false negative and false positive challenges. And one of the hardest things about false positives and negatives is that, when it comes to rules, you need to wait for the vendor to update the rule to catch the cases that it's currently missing or currently over-triggering on. With Sage, this is no longer an issue. Instead, we have an adaptive learning loop where organizations can feed in example data for how they wanna adjust our governance models directly in real time. So put very simply, Sage's models actually get better the more that they're used. If we go to the next slide, I can show you what I think are some of the most compelling things about how Sage is actually architected and why small language models are really critical for being able to do that. As we've said, we need AI models to help us secure and govern AI. But what we found is that you actually need to use small language models, purpose-built and tailored for AI governance, in order to do this effectively at scale. Small language models allow you to enforce your AI governance, security, and guardrails at a much higher accuracy threshold than if you were to use some of the leading large language models, and at a significantly lower latency.
So what you actually get is models that are more accurate in enforcing your policies and that are much faster, which is critical because in a lot of cases, our customers are interested in using this for real-time blocking and prevention. And then finally, this entire system runs at a fraction of the cost versus trying to enforce those same types of guardrails using a conventional large language model. Sage is also unified rather than fragmented across the platforms that these agents get built in. And I think that one of the most compelling pieces is that, because of Rubrik's unique position in understanding the data and identity context in addition to the LLMs and the agents, we can augment the signals that Sage is using with the same signals that Rubrik has an underlying understanding of, in terms of sensitive data and identities that have been compromised, and really feed that into our dynamic and adaptive governance engine. So that's one of the reasons we're the most excited about how Sage is gonna help shape the way that the world secures and governs AI agents. And what I'd love to do from here is hand it over to Varun so that he can show us how this actually works and runs in practice.
Thank you, Dev. Let's dive in to the components that Arnav and Dev just talked through. So as I share this demo here, you can see the Rubrik Agent Cloud dashboard view. And here you see different policies that have already been created. One of the things that Dev referred to in the architecture was our natural language policy interface. This is what it looks like. While we do have predefined policies that you can configure very easily, the natural language policy interface allows you to configure custom policies. Let's say one of the policies that we configure is: agents should not give financial advice.
Then what happens next is the semantic converter takes the text that you have inputted and converts that into policy logic. So here you see the output of that logic in the form of these instructions and the goal for the policy. And then in the background, the secret sauce, which is our proprietary SLM judge, evaluates how strong your policy is and gives you a strength score. So here you can see that this is a moderately strong policy. And what is really critical to note here is that it actually provides you recommendations on how to improve the policy. So, for example, one of the recommendations is to add more specificity and clarity around how the lack of certification is determined, and to clarify whether budgeting advice counts as financial advice for all scenarios. You can then go about incorporating these suggestions, and you will see a direct increase in the strength score of the policy. Now I want to walk you through a live example of what it would look like if the agent were to give a recommendation that sounded like financial advice. So let's run this test here. What you can see is that, immediately, it found a violation, because this recommendation to invest in index funds was a clear violation of the natural language policy that we just created, that agents should not give financial advice. On top of that, it can also give you a confidence score. And in this case, that confidence is 99%. So as we go through this process, I want you to take note of a couple of these critical capabilities. First, we were able to define a policy using natural language. Then we were able to make it stronger based on the recommendations that we got from our SLM judge. And then, lastly, you see two different modes here, monitor and blocking. Today, we'll be configuring this in the monitor mode, but I'll give you a demonstration of the blocking mode for some of the predetermined policies that we already have.
And we are going to make this blocking mode available for the custom policies very soon as well. So let me walk you through what real-time tool blocking looks like in Rubrik Agent Cloud. So here I have a customer support agent that I have configured using Rubrik Agent Cloud. And the key thing to note here is that when we configured this customer support agent, we had determined certain policies, such as it not being able to use certain unauthorized tools, but that was in monitor mode. What we then did was change it to tool blocking mode. And with tool blocking mode, we now have the ability to block a risky action from happening in real time. So as you can see, when I issued the customer support order request, the agent used a tool called refund order. Now let me try the same thing again with Rubrik Agent Cloud being used for tool blocking. So when I send this request, this time it has to route through the Rubrik gateway, and you will see real-time tool blocking in action. And I have just sent in the same refund request, and you will get a clear response that says the refund order tool was blocked by Rubrik Agent Cloud. This is a critical capability because you're not trying to clean up a disaster after the fact. You are preventing it from happening in real time. And this is something that we look to enable for custom policies, which is going to be a game changer moving forward. So we walked you through the critical capabilities of Sage, from the natural language policy creation to the semantic converter to the SLM judge to real-time tool blocking. And there's a lot more to come. So we just walked you through what the monitor mode looks like within Rubrik Agent Cloud. Now let me walk you through tool blocking. So here, we have a customer support agent that we had created using GPT-5, and we issue a simple refund request. I'm gonna show you two versions.
In the first version, we don't have Rubrik Agent Cloud enabled for real-time tool blocking. And the agent issues a refund request, and it uses a tool called refund order. As you can see, that goes through and the refund is initiated. Then what we did next was set up real-time tool blocking in Rubrik Agent Cloud, which is the block mode that I'd walked you through initially. Now when you do that and you route through Rubrik Gateway, you get a clear response that says the tool was blocked by Rubrik Agent Cloud. This is a game changer because now you're not looking to remediate after the fact. You're actually able to prevent a disaster, or a destructive action, from taking place in real time. And we've not only looked to do this for predetermined policies; we will soon be able to enable this for custom policies, which is what Dev and Arnav walked you through as a part of Sage. With that, let's wrap up. So today, we walked you through Sage. And the way for you to get started is very simple. Scan this QR code to learn more about how you can assess your resilience and get a comprehensive inventory of identities and agents in the environment. Over the next ninety days, understand which entities are highest risk, define the appropriate controls, and have a mitigation plan in place. And over the next six months, drive an implementation project to protect critical ID systems, data, and AI agent operations. Rubrik is here to help you on your journey of resilience, and Sage is your solution to agent control. If you have any questions, reach out to your Rubrik rep, and we would be more than happy to dive deeper into what Rubrik Agent Cloud can do for you. Thank you for joining, and we appreciate the time.
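The two versions of the refund demo, monitor mode and block mode, can be sketched as a gateway that sits between the agent and its tools. This is an illustrative sketch only: `ToolGateway`, its parameters, and the `refund_order` function signature are hypothetical and do not describe the actual Rubrik gateway or its API.

```python
from typing import Callable, Dict, Set


class ToolGateway:
    """Hypothetical gateway that every tool call is routed through."""

    def __init__(self, tools: Dict[str, Callable[..., str]],
                 restricted: Set[str], mode: str = "monitor"):
        if mode not in ("monitor", "block"):
            raise ValueError("mode must be 'monitor' or 'block'")
        self.tools = tools
        self.restricted = restricted
        self.mode = mode
        self.violations = []  # record of restricted tool calls seen

    def call(self, name: str, **kwargs) -> str:
        if name in self.restricted:
            # Both modes record the violation; only block mode stops the call.
            self.violations.append(name)
            if self.mode == "block":
                return f"Tool '{name}' was blocked by the gateway."
        return self.tools[name](**kwargs)


def refund_order(order_id: str) -> str:
    # Stand-in for the real refund tool used in the demo.
    return f"Refund initiated for order {order_id}."
```

Constructing the gateway with `mode="monitor"` lets `refund_order` run while logging the violation, which matches the first version of the demo. With `mode="block"` the tool is never invoked and the caller gets the blocked message instead, so prevention happens before the action rather than as cleanup afterwards.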