Video: The New Standard of Data Resilience: Securing Google Workspace from Identity Access to AI Innovation | Duration: 1964s | Summary: The New Standard of Data Resilience: Securing Google Workspace from Identity Access to AI Innovation | Chapters: Welcome and Introductions (51.460003s), Workspace Evolution (150.125s), AI Workspace Integration (332.26s), Threat Landscape Evolution (408.16498s), Native Google Tools (619.62s), Recovery vs Compliance (794.455s), Architecture & Recovery (928.39s), Real-World Data Risks (1171.7s), Data Loss Vectors (1387.115s), Governance and Recovery (1565.495s), Recovery Path Partnership (1767.7451s), Flexible Cloud Management (1858.36s), Closing Remarks (1892.7849s)
Transcript for "The New Standard of Data Resilience: Securing Google Workspace from Identity Access to AI Innovation":
Alright. Welcome, everyone. Thanks for being here. I'm Sohil Sheff, go-to-market tech lead here at Rubrik, and I'm joined today by McAlvin Romaine, product lead at Google. So excited to have an engaging conversation today. If you've been watching what's happened in the last year at this intersection of data, identity, and AI, you already sense something has fundamentally shifted. So we're gonna unpack what that shift means today for organizations running on Google Workspace. McAlvin, why don't you kick us off? Tell folks a bit about who you are and what you work on at Google. I appreciate it. Thanks so much for having me. As you mentioned, I am a product lead at Google. My focus is primarily on harnessing the potential of customer data safely, with control. And so a lot of my work intersects building the bleeding edge of AI with exactly what enterprise leaders are looking for. So we're talking about CIOs, IT administrators, security admins, you name it, who are primarily trying to adopt and prove ROI with their tools and be able to adopt those tools with confidence. Great. Great. And on my side, I focus on the go-to-market motion for Google Workspace protection at Rubrik. I came up through consulting and product management, working with enterprise customers primarily across the space of data security and, increasingly, AI. So I've been living in this world where the technology is moving fast and the risk conversation hasn't quite caught up. That gap is really what we'll talk about today. So let's get into it. Yeah. So, McAlvin, I wanna start where I think the real story begins. It's not about security just yet, but about what Google Workspace is in 2026 and why it matters so much, because I still hear "emails and documents" from lots of organizations, and that framing's a bit out of date from where you sit at Google. How has Workspace evolved, and what's driving the urgency around it at the business level? Yeah. So it's a great question.
So today, we're seeing just a fundamental reset in how companies are valued and how they work, and primarily it goes beyond day-to-day work. The major question we're seeing today is around how companies are using AI to transform their unit economics. And so a lot of what we're seeing, outside of just day-to-day use of email and documentation tools, is really around how we use AI to add velocity and then, ultimately, how you go about deploying these tools, iterating on them, and monetizing these workflows. And so adoption of AI is a big piece of what we're looking for. But what comes with that is making sure there's clear governance and resilience, and making sure that we're addressing those concerns, since that's a big part of the valuation penalty for folks that are adopting, or are fearful of adopting, those products. Yeah. So what I heard there is some level of business pressure. Right? McAlvin described it exactly as it is, the change in the stakes on the resilience side. I think when Workspace was primarily emails and documents, data loss incidents were painful but ultimately bounded. And now Workspace is increasingly this AI operating layer. You have agents running workflows, institutional knowledge is concentrated there, and your competitive position depends on how fast you can operate inside it. So the cost of incidents will scale with the platform's importance. I think Google has done a great job handling availability, infrastructure uptime, reliability, and performance at scale. What Rubrik is focused on here is the resilience question. Right?
When something goes wrong inside the tenant, whether it's a human, an attacker, or an autonomous agent, can you recover with the full context intact? Can you prove what happened? Those are ultimately different questions than availability. So I think the bigger Google's footprint grows inside an organization's operations, the higher the stakes are on the resilience side. The two scale together. Yeah. No. I think you're hitting the nail on the head. Especially, as I was mentioning, with AI being used day to day, Workspace in particular isn't just the suite that people are using to write their memos or put in general queries. We have Gemini embedded natively across your entire enterprise. So it's really heightened the amount of data that enterprises are able to amass over time. The idea of user-generated content is more than just the documents you're creating. It's actually the way you're working, especially as we move from this AI assistant era progressively into an agentic era. So you're introducing things like Workspace Studio, where you're allowing agents to operate and act as digital coworkers. And the more of that you see, with these complex multi-step tasks being incorporated, the more we see this level of autonomy when it comes to the productivity multiplication I was mentioning earlier. And so the general stakes for recovery, and for observability and governance, are super high and super important. It's a big reason why we are really championing this idea of having a very fortified resilience layer, to ensure that adoption isn't something that gets in the way of folks being able to maximize these tools. Okay. Yeah.
So why don't we move a little bit to the threat landscape and talk about why this is urgent right now. I think the threat conversation used to center almost entirely on ransomware hitting on-prem infrastructure. That's still real, but the surface area has expanded. So can you talk a little bit about what patterns you're seeing in some of these customer conversations? Yeah. I think organizations that are set up for success are generally thinking about data sovereignty and digital sovereignty through a pretty equivalent lens, for global organizations and even regulated industries. The main focus is really: is their data safe? Do they know where it is? Who has the ability to access their data? And what type of observability and audit trails do these enterprise leaders have to make sure the data is being used in a way that they can track and, ultimately, govern across their own enterprise safely? Generally speaking, I think that's how we look at it from a content generation perspective. I would love to know how you all are thinking about it from a cloud-first environment. Yeah. So if you look at the underlying data, I think it surprises people when they see it. Accidental deletion and human misconfiguration are still the top categories of cloud data loss. That hasn't really changed. What has changed is that more than half of incidents are now intentional: malicious insiders, external attackers, credential theft. And the attack volume really reflects that. Hundreds of millions of identity-based attempts every single day. And that number isn't growing linearly. It's growing exponentially, partly because attackers have AI on their side now too.
So you get automated credential stuffing, AI-generated phishing that can bypass filters, and social engineering scripts that are getting more convincing every month. Even if you look at the latest models (Anthropic's been in the news for this lately, of course), it's really inverted how we think about cybersecurity and the real risk surface area, with these powerful models able to find and exploit vulnerabilities in completely new ways at superhuman scale. And the method in almost every high-profile case isn't a technical exploit against Google's infrastructure. Right? It's identity. Someone logs in with stolen credentials or a hijacked session. They bypass MFA through social engineering, and they show up as a legitimate user. And once they're in with a privileged account, they can move at machine speed. A compromised super admin can wipe a shared drive and its archive simultaneously because they're operating inside that same tenant. So recovery capabilities living outside that tenant can certainly help, and that's the way we're looking at things. Right? Recovery living outside that blast radius. We'll get into that more in a few minutes. Got it. Got it. Definitely. So I wanna move to tackling something. It's a question I get quite a bit in conversations with customers who leverage Google Workspace heavily. When we get into data protection, the first response is typically some version of, "We try to use Google to handle this natively." So I wanted to see, from your perspective, how would you describe what the native tool set covers and what it's designed for? Yeah. And I'll go through it at a high level. I can nerd out on this for a long period of time, but I'll keep it to a tight window. I'll summarize it in three native data pillars. We have the first, which is out-of-the-box domain-wide takeout.
And this is the run-of-the-mill mass extraction tool. The primary use case is bulk exports of all Workspace data. This would be pulling a full enterprise, domain, or tenant's worth of data. It's typically used for cold archiving; the data is stored in a Google Cloud bucket, and the user can download and, ultimately, save the files. The second one we have is Google Vault. This one is more of a compliance and eDiscovery product, typically for legal teams to handle retention, audit trails, holds, things of that nature. Those are very specific use cases, when you're really focusing on the eDiscovery or compliance lens. And then the third, and honestly one of my favorites, is local data storage, affectionately called LDS. Local data storage is more like the enterprise-grade export product. This is a product used to move petabytes of data, and it allows you to filter down from a sovereignty perspective: you're able to move exports into specific regions. And there are a couple of other products associated with LDS that allow you to convert unstructured data into valuable insights and information. So a big value driver I consider for LDS is enabling organizations to scale AI securely, with full oversight. And since it covers all of Workspace, the idea is that it's a central point for all customer data, key for unlocking and managing customers' most valuable data, and for providing the asset for monetizing and for defensible AI across the org. So those are the three.
And specifically, the Google Vault and local data storage pieces are part of the Assured Controls SKU, which is a big asset for being able to back up, specifically, the data and, ultimately, lends a big hand to what we're trying to build with you all in the future. Yeah. Yeah. And I hear these all the time on calls. Vault especially comes up. These are really great products, each focused on different use cases. So one thing I try to talk about with customers is: what is the use case, and how do something like compliance, let's take that as an example, and recovery have fundamentally different requirements? If you think about a compliance tool, you need to optimize for preservation, searchability, and chain of custody over long time periods. When you think about recovery, you're really optimizing for speed, fidelity, and scope: restoring thousands of files with their folder structures, permissions, and metadata all intact, ideally in minutes. The other thing that comes up is having recovery capabilities in an isolated tenant. So if you have data loss, or someone gets access, maliciously or accidentally, to the native tool sets, you want to make sure from an architecture standpoint that things are external and isolated where they need to be, depending on the use case. Yeah. No. I think that's been something prevalent that we've seen with customers especially, and we've alluded to this a ton, with AI being so prevalent and native across the entire ecosystem. The idea is being able to have resilient backups of not only Gmail and Drive, but also clear backups for the other services that hold very clear enterprise knowledge, when we're talking about the Gemini app, side panels, NotebookLM, Chat. More and more business decisions are happening there.
And so it's a big reason why, at least, we're thinking about ensuring that the LDS offering, as far as having this resilience story, is tightly fortified with what we're building with Rubrik. Great. Great. So let's move on to architecture, specifically operational recovery. I'll take a pass and explain how Rubrik thinks about this today for Google Workspace specifically, and then I'd love to hear how you think about it on your side. From the Rubrik perspective, one of our core design principles is what we call a logical air gap. Rubrik doesn't just take a copy of your Workspace data. It moves it into a Rubrik-managed vault that's entirely isolated from your Google tenant: different security domain, different access controls, no inheritance from tenant-level events or admin credentials. So if a super admin, for example, is compromised and they wipe a shared drive or something like that, they can't touch the Rubrik vault. Immutability, from our perspective, is about how that data is treated. Once it lands in the vault, nothing can modify or delete it, whether that's a compromised admin, an AI agent that goes rogue, ransomware, whatever it may be. So when you put all that together and then think about the recovery piece, in terms of fidelity, when you restore, you're getting everything back exactly how it was. You're not getting raw files. You're essentially getting back the full context, which, as you alluded to, is more and more important in this age of AI with these tools: the context around the data, and having it back as it existed at the point in time that you're recovering to.
So I'd be curious to get your take on how, across Workspace and the products you work on, some of these things get built in, or how you think about them from Google's perspective. This is definitely discussed quite often, and I'll be succinct. I think there are a couple of ways of thinking about it. First, specifically in the Rubrik case. Right? You have customers that are operating in multi-cloud environments, with Workspace being one of the many software services they're using within their stack. And what we've seen is that a lot of customers are looking for ways to mitigate their risk and avoid single points of failure. So the idea of having sound architecture and optionality with a partner like Rubrik makes a ton of sense. And then there are other cases where customers have a real focus and emphasis on digital sovereignty. Data sovereignty has been a very big theme of the past year or so. And so providing the option to have their data stored in customer-owned buckets or other services like Rubrik allows them to have duplicate environments set up in case anything does occur. And then, ultimately, if anything does occur, they wanna have a very clear story around how that data can be restored, utilized, and leveraged to ensure business continuity. That's something we've seen as pretty prevalent. And how do we tie that story into the different use cases or combinations, depending on the specific user's needs, multi-cloud or single cloud?
The real focus is how we provide that full-circle story for survivability, failover, and business continuity, and, ultimately, having those options available. Partnering with you all is a big part of making sure we can provide customers the value they need and meet them in the moment. Yeah. Absolutely. So let's make this a little bit more concrete with some scenarios. Sometimes the threat landscape conversation can feel a little abstract, but the reality is you need resilience against risk, whether it's accidental or malicious. So when you're talking to enterprise Workspace customers, where are you seeing data risk show up in practice? The everyday things that maybe aren't making the news but really matter for businesses. You know, funny enough, some of the most mundane and not-so-nefarious situations are where you see some of the most risk, or even just impact, to businesses. We're thinking of things like employee transitions: customers leaving, data moving from one user to the next, or not moving at all. How does that ultimately impact the organization? You have M&A events. One customer purchases another and effectively absorbs their content within a domain. How is that managed? Ultimately, some of those very basic transactions, or even general use cases, are chronically underestimated as far as their impact from a data perspective. Often you'll see data getting orphaned, or accidental deletion as part of the chaos related to a merger. But ultimately, these events lead to institutional knowledge and enterprise intelligence disappearing, which is really difficult to quantify for these business users, especially in the M&A situation.
You're making a purchase for a reason, and so you don't wanna lose value right out of the gate. And so being able to prevent that, and avoiding having to reconstruct, revisit, and ultimately work backwards to figure out how to rectify these situations, is a big reason why we see these risks show up in probably the most common ways, outside of someone intentionally trying to destroy an organization. It's more often some of these very simple tasks that are likely overlooked. Yeah. Yeah. I mean, the human error scenario. I always keep coming back to that. Right? It's still the number one driver of cloud data loss, ahead of credential attacks and ransomware. Someone moves a folder to the wrong location and discovers it weeks later, or they restore it and the ACLs or permissions don't come back with it. So you end up spending days reconstructing things. These aren't exotic events. They're, as you said, routine at every organization. And the question is really whether the recovery path gives back that full environment, or whether it's just dumping data somewhere and the team has to reassemble it. So that's what we're thinking about here. Yeah. Absolutely. We're also seeing that even within regulated industries, in a much more proactive and preventative way, there are a lot of questions around chain of custody and strict requirements. And so being able to have clear observability and oversight of these things is something that a lot of customers wanna get ahead of. One of the things I do wanna chat with you about, since we talked about agentic AI a little bit earlier: we're seeing that a lot of folks are thinking about how identity and AI factor into this general conversation around resilience.
I would love to hear, from your perspective, what you're seeing in that space, how that evolves your product thinking, and, ultimately, the problems you're addressing for customers. Yeah. Good question. I think this is one area where the picture has gotten meaningfully more complicated. If you look at how Workspace data can get lost or compromised, I think it breaks down into two buckets, and both are being amplified right now. The first is, let's call it, intentional: attackers, malicious insiders. The attack method in almost every high-profile case is not a technical exploit, as we said. It's identity. Someone logs in with a phished credential, something like that. The threat actor shows up as a legitimate user. What makes this scary is that, if you look at the latest data, 82% of attacks have no malware whatsoever. So just think about that: no ransomware payload, no encryption event, nothing that's triggering EDR or SIEM. The attacker becomes a legitimate user. Their goal is simply to persist, stay undetected, understand the environment, and move laterally. Now think about what that means for your data. A persistent attacker with legitimate credentials can make changes: sharing files externally, modifying access controls, potentially exfiltrating data, corrupting documents. So by the time detection comes in, the integrity of your data over that window is in question. You don't just have an identity problem. You have a data integrity problem, and it can span a really long time. So that was the first bucket. The second bucket is unintentional. I would classify it as what we talked about: accidental deletions, misconfigurations. But that category is about to get dramatically larger, and is as we speak, because AI is supercharging all of these failure modes.
Ernst and Young actually just published research showing that almost 65% of billion-dollar enterprises lost more than a million dollars due to AI agent failures in the past year alone, and we're still in the early stages. If you think about AI agents, they're operating at machine speed, and they're not making just one mistake. They can make thousands of mistakes before an alert fires. So we don't really see this as a separate risk per se. It's all the same attack surface, but from different angles. Identity is the vector for intentional attacks, while AI is expanding the blast radius of the unintentional ones. Got it. Got it. Yeah. I think that makes a lot of sense, especially as we're talking about certain industries, particularly regulated industries. A big aspect is just being proactive and ensuring they have clear governance and requirements, so there's traceability and auditability of those tasks. That's something we've seen be pretty consistent on our end as well. It's definitely something we think about as we build out these tools: to ensure adoption happens in a smooth manner, we need to ensure folks are clear on the rollback and governance tools available to them. It's also a big part of why leveraging some of the tools I mentioned earlier, like LDS, is a big part of day-to-day AI management. It's being able to have these reliable and resilient backups as part of your core service for foundational data needs, as well as having full coverage across your enterprise. And so I think a lot of what we discussed here is not meant to cast aspersions or generate more fear.
It's much more about how we provide best-in-class resilience offerings and, ultimately, how we partner here to give customers the option to diversify where their data sits and how it's ultimately used to power enterprise intelligence. Yeah. You mentioned governance. Right? It ultimately comes down to that. And we see some hesitation there. I think any hesitation in governance is a huge opportunity right now. For organizations that solve it first, you're not just mitigating risk. At this point, you're unlocking AI adoption, as you mentioned, ahead of everyone else. So setting up that resilience layer before hitting the wall is going to lead to that adoption. As for how we're thinking about both of those vectors, identity and AI, from a Rubrik perspective, and how we solve those challenges: on the identity side, if a super admin account gets compromised, the first thing you need to do is restore control of the identity plane. So air-gapped protection for things like Okta credentials, super admin configurations, MFA policies, and group memberships, just regaining control of who has access before recovering the data, or even thinking about how you recover the data. And on the AI side, I don't think the answer is slowing down agent adoption, as you mentioned. It's really the ability to identify which objects an agent modified, for example, and rewind precisely those actions to a known good state if you need to, and do that without rolling back an entire environment. So having that surgical recovery ability. Yep. That definitely makes sense, and it still aligns with what we're seeing on our side as well. Yeah. And, honestly, I think this macro context, when you think about identity and AI, makes all of this more urgent right now. Right?
Every enterprise is being benchmarked against this AI adoption curve, and we're seeing boards ask why growth doesn't look like Google's or OpenAI's trajectory on some of these AI initiatives. The companies that execute AI at scale are the ones that are gonna pull ahead. So I don't think the constraint is necessarily technology. Right? Look at Gemini. It's extraordinary. It keeps getting better, shipping things extremely fast. The constraint is confidence in governance. Can the security team say yes? And I think everything we're talking about today is what makes that possible. So I'd love to close with a question that I think is actually the right one for every organization on this webinar to sit with and think about. And it's not, "Do I need backup?" That's very narrow. I think the real question is: if something went wrong in your Workspace environment today, whether it's identity compromise, agent error, or bulk deletion, do you have a recovery path that survives the event itself? That's the conversation Rubrik and Google are partnering to make possible. Absolutely. It's a big and important question that a lot of leaders are asking themselves. And I think a goal of ours is to make sure that, whether you're operating in a multi-cloud setup or you're primarily within the Workspace environment, we give you the best options to manage based on the governance and general requirements of your organization, and then provide you the flexibility to move freely and leverage your enterprise intelligence in a way that allows you to get to the future a lot faster, and safely. Yeah. Well said. So with that, I wanna thank you, McAlvin, for joining us. Thank you, everyone, for spending time with us.
And, again, the conversation doesn't have to stop here. If you want to go deeper on any of this for your specific environment, reach out to us. I think it's worth the follow-up, and these are really the kinds of conversations that Rubrik and Google want to continue having. Appreciate it, man. Thanks for having me.