Name: More Ships, More Problems? How Mixpanel Is Building Experiments for the AI Era
Uploaded: 2026-05-14T19:30:34.331Z
Duration: 1 h 29 min 40 s
Description: More Ships, More Problems? How Mixpanel Is Building Experiments for the AI Era

Transcript for "More Ships, More Problems? How Mixpanel Is Building Experiments for the AI Era": Hello, everyone. Welcome. Thank you so much for joining. We're gonna give everyone a couple of minutes to trickle in. In the meantime, we would love to hear from you in the chat. Where is everyone tuning in from today around the world? You should see the chat to the right of your screen. Please let us know. Very nice. Oh, New York. New York. Very cool. Okay. We got both coasts, all over North America. That's great. And we got Egypt. Oh, I see an SF. Very nice. Awesome. Alright, everyone. Well, we're a minute over now. Why don't we go ahead and kick it off? Emma, I'll turn it over to you to get things started. Absolutely. Thank you, Russell. So I'm Emma. I'm a solutions engineer here at Mixpanel, and I'm so excited to have you all here. So as I'm sure you are all acutely aware, AI has made it faster and cheaper to build than ever before. And most teams have responded by shipping more. But a lot of those teams are still operating in this ship-and-pray cycle where you launch a feature, you wait a couple of weeks for the data to trickle in, and you realize a little bit too late that it didn't move the needle the way that you anticipated. And that's because shipping faster doesn't always mean that you're building the right things. And that gap between how fast you can build and how fast you can actually learn is something we call the velocity gap, and it's what we're here to close. So to set the stage, here is our quick plan. I'll set some context. We'll do a quick live demo. And then Russell Loeb, our senior PM who owns experiments, will go through a fireside chat on product thinking and trade-offs and some honest lessons from building this. And then, of course, some live Q&A as well. 
So for some quick live intros here, again, like I said, I'm Emma and I sit on our solutions engineering team, which means that I work with all types of teams who are trying to move faster without losing signal. And experimentation is a thing that I get asked a lot about right now, for obvious reasons. But I wanna make sure that Russell can introduce himself as well. Yeah. Hey, everyone. It's Russell here from the Mixpanel team, part of the product team focusing on experimentation and feature management. I've been with Mixpanel for six months. So while I'm somewhat new to the team, I'm definitely not new to the experimentation space. I previously came from a company called Optimizely. I've been working for a few years within experimentation and the broader MarTech space. And I'm really excited to be at Mixpanel, building the next generation of experimentation and our broader product intelligence system to help you make more confident decisions based on your data. Yeah. Perfect. And so before we get into the demo, I wanna set the stage on what we're talking about here. So for a long time, running an experiment meant something linear. Right? Having an idea, setting up a test, waiting for results, shipping the winner. That process was largely manual, and the tools used to do it were oftentimes separate. And that model made sense when the pace of development was slower, but it's not that slow anymore. AI has compressed the time from idea to implementation so dramatically that the bottleneck has completely shifted. Writing code is no longer the hard part, but knowing whether you're making the right decisions is. And so speed is now table stakes, and insight is the edge. And that's what pushed us to rethink Mixpanel's role, not just in how experimentation is done, but really how it fits into the entire product development process. 
Because if you're shipping more, faster, every feature without a hypothesis is a bet that you're running blind, and blind bets compound very quickly. And so that's why we launched experiments and feature flags in Mixpanel last year. What we see with teams now is that tests are running, but they're siloed in all of these different tools, because analytics is in one and experiments is in another and customer feedback is somewhere else. And somebody has to manually stitch it all together. And by that time, the team has likely already moved on because that feedback loop is still a little bit too slow. And so we think that the teams that figure out how to consolidate it all, where testing isn't an occasional step but something that is continuous and connected with the rest of the product cycle, are building something that is genuinely really hard to copy. And so let me show you what that can look like. One innovation that we're really, really excited about is our MCP server. So if MCP is new to you, it acts like a universal adapter that lets an AI tool like Claude or GPT or Cursor connect directly to your Mixpanel data and act on it in natural language, with no exports or copy-paste or context switching. When we first launched it, we were focused on analytics, but now we're also introducing support for experiments and feature flags. And what that unlocks is the ability to run the entire experimentation cycle, from identifying where to focus, to drafting the experiment brief, all the way through wiring up and shipping the test, from a single conversation grounded in your real Mixpanel data. And so that means that experimentation can finally catch up to the pace of your building. And so I'm going to show you part of that first cycle live here, and then we'll dive into this more in the Q&A and fireside chat with Russell. But let me go ahead and set the stage here. 
Let's say that I'm a PM and I own the checkout flow. Right? Traditionally, and let me go ahead and begin sharing my screen here. One second. Perfect. So traditionally, I would go in and I would view a Mixpanel funnels report and I would say, hey, it seems like we're losing a lot of users right here between adding to cart and viewing the cart. And I would go in and do a lot of manual slicing and dicing. I would watch these replays one by one to try to figure out what could be happening here. But I could also do something entirely different now with MCP. So I'm gonna stop sharing here and show you what that looks like. Alright. So, alternatively, I could go on and simply do this. Now one thing worth orienting you on before this goes too deep is that what I'm using here is a Mixpanel skill. Skills are prebuilt workflows that run on top of the MCP connection. So somebody's already done the hard work of figuring out the right sequence of queries, the right signals to surface, and the right format to hand back. So instead of prompting from scratch every time, you simply invoke this skill. And so this one is called brainstorm. It's going to run a funnel analysis. It's going to go through behavioral flows and session replay patterns. And I'll narrate as this goes along a little bit further. But as this is running, this is a little bit funny here. Let me go ahead and start one more. We'll do brainstorm and then purchase funnel, last thirty days. Let's see if it wants to behave now. Perfect. It'll go ahead and go through this. So this skill is gonna narrate a couple of things here. Perfect. So it's gonna go through a funnel analysis. 
So it's gonna pull conversion rates from every step and segment by device and persona and platform, all while comparing this window to the prior thirty days, because a sixty percent drop-off that's stable might be a design problem, but one that's been declining for three weeks is an alert. Perfect. It'll continue to go through here. And that distinction really matters before you decide what to test. It'll also go through flows. So it's gonna be looking at our biggest drop-off points, because when users don't convert, what are they actually doing? Are they going back to browse? Are they going straight off the site? Are they going to a support page? That answer tells you whether this is confusion, missing information, or friction, and it shapes the hypothesis completely. You'll see it's continuing to run through here. Perfect. And lastly, it's gonna go in and run through some session replays. So for the single biggest drop-off step, it'll pull a sample set of replays from users who abandoned, scanning for hesitation over thirty seconds, rage clicks, dead clicks, and those last three actions before they left, because qualitative signal from 100 users without watching a single replay becomes incredibly, incredibly valuable. So now what we can see here is that we have ranked opportunities that are super specific, with numbers behind them. Not just, hey, checkout is declining, but the exact step and segment and behavioral pattern that explains why. And at the bottom of each, you'll see a suggested hunch. So if we do x, then y, because z. And this format hands itself exactly to the next step. So we can see here that it looks like there are actually zero purchases in this data. There must be something going on really dramatically here, and I'm glad that it caught this. It also looks like the add-to-cart conversion is weak. We have some mobile degradation. 
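The window comparison described above can be sketched in a few lines. This is only an illustration of the idea, not the brainstorm skill's actual logic: the step names, user counts, and the 5-point threshold are all hypothetical, and the functions `step_rates` and `compare_windows` are made up for this example.

```python
def step_rates(counts):
    """Step-to-step conversion: users reaching a step / users at the previous step.

    `counts` is an ordered list of (step_name, user_count) pairs.
    """
    rates = {}
    for (prev_name, prev_n), (name, n) in zip(counts, counts[1:]):
        rates[f"{prev_name} -> {name}"] = n / prev_n if prev_n else 0.0
    return rates


def compare_windows(current, prior, threshold=0.05):
    """Label each step as 'declining' or 'stable' versus the prior window.

    The threshold (in absolute conversion-rate points) is an arbitrary
    assumption chosen for illustration.
    """
    cur, pri = step_rates(current), step_rates(prior)
    report = []
    for step, cur_rate in cur.items():
        delta = cur_rate - pri[step]
        label = "declining" if delta < -threshold else "stable"
        report.append((step, cur_rate, pri[step], label))
    return report


if __name__ == "__main__":
    # Hypothetical counts for the last thirty days and the thirty days before.
    current = [("View Product", 1000), ("Add to Cart", 400), ("View Cart", 160)]
    prior = [("View Product", 1000), ("Add to Cart", 400), ("View Cart", 240)]
    for step, cur_rate, prior_rate, label in compare_windows(current, prior):
        print(f"{step}: {cur_rate:.0%} now vs {prior_rate:.0%} prior ({label})")
```

A step with a low but unchanged rate is labelled stable, which matches the point that a steady drop-off and a worsening one call for different responses.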
It goes through this decision log and then, of course, next steps here. So what I can do from here is say, hey, now that I have these ranked opportunities, do I want a solution brief? Do I want to go in and write an experiment brief to then hand off to my engineers? Do I want to wire this up directly? Or do I actually even just want to ship this all within the LLM of my choice? And let me fast forward a little bit here. Alright. So if I show you that end state, I wanna show you a little bit of what living the dream is in that full loop. So if we not only do the brainstorm, but go through, say, an experiment brief, and we wire it up and hand it off to an engineer, now we can see what it looks like in Mixpanel, which is fully configured with all of the metrics that we care most about, with the same data in the same context, all with that same signal. And that means that you're really abbreviating what that entire life cycle looks like. We're going to continue going into this a little bit further, but I just wanted to show you what that beginning cycle looks like and how powerful MCP could be. But before I dive in a little bit further here, I want to go ahead and stop for just a second. Let's go back to our slides here. I want to get a pulse on the room. I'm curious, how has your testing process kept pace with your shipping speed? We'll give it just a moment for everybody's answers to flow in here. Alright. Seeing some votes come in. Looks like a lot of folks are actively trying to close that gap, which is exciting. Definitely what we have our eyes on as well. Alright. Some folks have evolved both together. A lot of folks are actively trying to close that. Some folks are saying, hey, testing hasn't changed at all. 
I'll give this maybe ten or fifteen more seconds here, and then we'll go ahead and pivot. I'll talk a little bit about what Mixpanel is launching right now, and then we'll bring in Russell for this fireside chat. Alright. So it seems pretty clear here that most folks in the room are actively trying to close that gap right now. Alright. So what you have seen today, the brainstorms and those behavioral signals and that replay clustering, all of that in experimentation, that's not just one feature. It's a few specific bets that we've been making about where we want product intelligence to go. And let me just frame this really quickly because it's the context for what Russell and I are going to be talking about. We've been building towards three things. First, agents that understand you and your business context, not just your data schema, but your metric definitions, your product launches, your taxonomy. Really, the more you use Mixpanel, the smarter it gets about what matters to your team. Now the second is moving teams from what's happening to here's the fix. That's things like root cause analysis that doesn't just surface the drop, but traces the behavioral sequences that caused it and tells you where to act. And third, flexible access wherever you're working, because an AI-first team doesn't just live in one tool. You're in Claude and in Cursor and in Slack and in GitHub. Your product data needs to be there too. And that brainstorm that we just saw really sits at the intersection of all three of these. And now with MCP support for experiments and replays, which we are just launching this week, it's the piece that really closes that gap, because you're not just analyzing faster, but you're acting on what you find in the same place. 
And that's really the loop where you're finding an opportunity, you're building a case, you're briefing the team, and you're shipping the test, all connected, all in one place, and all grounded in real data. So I wanna shift and bring in the person who is actively building this. You heard from him a little bit earlier, but Russell is our senior PM for experiments, and he's in a really cool position because he's a product builder that's building for other product builders. Now I would be curious, Russell, because you're in that cool position, how have you started to see the product cycle change, both inside Mixpanel and for our customers? What are the facets that are changing here? Yeah. What we observe changing, and what we feel changing ourselves, is primarily the emerging technologies we have available today. Some of the stuff you showed, MCP is a great example, the underlying models that we have available to us and our ability to build on top of them. It's really compressing the overhead it takes to go from analyzing and deciding what to do about, as an example, an anomaly you're seeing in your data, to actually taking some sort of concrete action to have an impact with your end users. If I think back to a year ago, if I saw an anomaly in a dashboard for a metric that I'm responsible for, it might have taken me a few days, with some AI augmentation but also often manually, looking through a lot of different data sources to figure out and explain the movement behind that metric. That could be secondary metrics that I might have defined to help explain that movement. It could have been qualitative data, like replays of user sessions to see if you've introduced friction. It could be other places as well, things like support tickets or NPS scores, a lot of different data sources to look at to triangulate whatever finding or recommendation I wanna bring forth. 
But looking through all that data and gathering and curating the data I need just to begin my decision making process, that was quite a lot of overhead. I think today with some of what you just showed, MCP server as an example, Mixpanel Agent as another example, we can really compress the overhead it takes to gather and curate the data you need to actually exercise judgment and give your team clarity. I think experimentation is a very interesting part of that because as the cost of shipping is going down, it makes it more expedient, especially for smaller bets. When you have a hunch about something, you've done some scrappy quick research to come up with a hypothesis. To validate the hypothesis, often it's more expedient to ship and run an experiment and observe how people are reacting to that. One example I can share from us internally at Mixpanel: one of my teammates, Tyler, on our growth team, had a hunch based on some scrappy research about being able to speed up the time it takes for our customers to activate, and he was able to go from some quick research to a prototype. He got pretty far along, with agent-augmented instrumentation of Mixpanel, to getting something live in front of our customers with an experiment pretty quickly. That's something that I think wasn't really possible a year ago. Yeah. Totally makes sense. And, you know, speaking of a year ago, we decided to launch experiments and feature flags last year. Right? So what triggered that decision to invest in building these in Mixpanel? And why did we choose to go that route? Yeah. What triggered our investment in experimentation was, firstly, we were seeing a lot of product managers and product builders more generally using Mixpanel every day for directional, correlational analysis. 
We did offer some basic legacy experimentation features that were intentionally pretty lightweight, and we kept hearing more and more from our customers that they wanted to do more causal inference within Mixpanel as well. Some of the specific challenges we were hearing were, first, a lot of dependencies on people like data scientists or data analysts to run this type of analysis. And secondly, fragmentation of their tool stack, having to navigate a bunch of different tools to do this type of causal analysis. To answer the second part of your question, why did we think we're uniquely positioned to deliver on this? One was that we do provide the industry-leading product analytics platform. Product analytics and experimentation go hand in hand, whether it's ideating, figuring out the next thing you want to build and test in an experiment, or results interpretation, figuring out the why behind your top line results. We felt that bringing experimentation into the fold in our ecosystem was a very unique opportunity for us to drive that growth loop for our customers. And importantly as well, as we looked outward to other existing platforms, it's a mature space. It's also a very technical activity, and we observed a lot of existing platforms out there with a lot of features, which we felt in some ways might be a bit overwhelming, especially for people just getting started with experimentation. Just as with our product analytics, where people love us for balancing simplicity and approachability with a lot of power under the hood, we've taken the same sort of design approach with experimentation, and I really do think it shows in the product and the experience. Yeah. Totally. You know, and I'm curious. You mentioned that you came from one of the big experimentation players before. 
Coming into Mixpanel, where you are building this from the ground up, do you think there's anything unique that you were able to bring in, having had so much exposure to that space before? Yeah. That's a great question. Some unique things I think I was able to bring in relate especially to that last point around simplicity versus power. I think having some experience in this space helped give me some intuition around capabilities that would be very powerful but might be a little intimidating for people just getting started. It's similar to product analytics, where there are a lot of sophisticated things you can do with Mixpanel under the surface, but we intentionally design our system such that they feel intuitive. We focus on what we feel are the most critical core capabilities, surfacing those, making those very discoverable, but also allowing you to dig deeper into some of the more advanced capabilities. We are taking the same approach with experimentation, and we provide a lot of statistical rigor for those who want to dig into the nuts and bolts. And for those who are just looking for the top line results, we try to make that very front and center as well. Yeah. Totally. You know, one thing that I have heard you bring up a little bit, and something that I talked a little bit about as well, is this idea of connecting your experiments with that whole product cycle. Can you expand a little bit on how you're thinking about connecting that behavioral data and session replay, really that whole picture, and maybe how other people should be thinking about it too? Yeah. It comes down to there being certain phases in the experimentation life cycle where this is really important. And to be clear, I would say it's the bookends of the experimentation life cycle where this is most relevant. 
When you're starting off ideating new experiment ideas, as I was speaking to a little bit earlier, when you're looking at an anomaly in your product analytics and trying to understand the why behind the movement there, the breadth of data we can look at now in a short amount of time is pretty incredible with things like our MCP server. As an example, as we on the product team at Mixpanel are using our MCP server, we're connecting it across our tool stack. We're heavily using our MCP server to analyze Mixpanel data for quantitative analysis and looking at sessions, but also connecting to other sources such as Notion, where we have a lot of things like our PRDs, NPS feedback, and product feedback we get from customer support tickets, and other areas like pointing to our GitHub code repo to get a better understanding of the product from a technical perspective and how we've instrumented events. That's been pretty huge for, as I mentioned before, reducing the overhead it takes to make decisions and take action. And then for interpretation as well, it's been really powerful to glean insights and new questions that we might not have thought of without some of this AI-augmented synthesis and analysis. I think both ends of the spectrum have been pretty powerful for us and for our customers as well, as they start to adopt things like Mixpanel Agent and MCP server. Yeah. Totally. You know, and like we mentioned, experiments and feature flags were launched last year. So they've been in our customers' hands for a couple of quarters now. I'm curious if there were any assumptions that you and your team had that turned out to be a little bit different than reality, or perhaps just anything surprising that you've seen in how folks are really using the tools. Yeah. 
One assumption we had was that product builders and product managers especially, people who were not highly specialized data scientists and data analysts, would be most concerned with top line results. From the start, we still built the platform with the deep statistical rigor that you would expect from any modern experimentation platform, built in for the data scientists and the data analysts that these PMs, and more generally builders, were working with. What we were pleasantly surprised by was that there were people on teams outside of those data scientists and data analysts who did wanna engage with some deeper dimensions of their experiments, things like, can we control for false positives? Can we control for variance? Can we control for outliers to reach statistical significance faster, to learn faster? That was a really pleasant surprise, and it's inspired us to invest in those areas. We recently released a lot of health check features to make sure that your assignments for your variants are not getting skewed, or that there's no preexisting bias in the different treatment groups in your experiment. There are other controls too, like Bonferroni correction for false positives, CUPED for variance, and winsorization to control outliers. Those are some more sophisticated, powerful features that we've been able to ship quickly, and we're gonna continue investing in that statistical foundation as a through line in our roadmap. Yeah. Totally. I've seen so many folks come in and ask about that. So seeing all of those changes getting shipped, and shipped so quickly, I know it's been exciting for me, but maybe more importantly it has been so exciting to all of the folks that I talk to. Certainly. You know, we've been talking a lot about MCP. Right? 
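For readers who want the intuition behind the controls named above, here is a minimal sketch of each in plain Python. It is an illustration under stated assumptions, not Mixpanel's implementation: the function names are hypothetical, the percentile cutoffs are arbitrary, and the "skewed assignment" check is assumed here to be a sample ratio mismatch test (a chi-square test on variant counts).

```python
import math
from statistics import mean


def bonferroni_alpha(alpha, num_comparisons):
    """Per-comparison significance level that keeps the family-wise rate at alpha."""
    return alpha / num_comparisons


def winsorize(values, lower_pct=0.05, upper_pct=0.95):
    """Clamp extreme values to the chosen percentiles (cutoffs are illustrative)."""
    ordered = sorted(values)
    lo = ordered[int(lower_pct * (len(ordered) - 1))]
    hi = ordered[int(upper_pct * (len(ordered) - 1))]
    return [min(max(v, lo), hi) for v in values]


def cuped_adjust(metric, covariate):
    """CUPED: subtract the part of the metric explained by a pre-experiment covariate.

    theta is the regression slope of metric on covariate; the adjusted values
    keep the same mean but have lower variance when the two are correlated.
    """
    m_bar, c_bar = mean(metric), mean(covariate)
    cov = sum((m - m_bar) * (c - c_bar) for m, c in zip(metric, covariate))
    var = sum((c - c_bar) ** 2 for c in covariate)
    theta = cov / var if var else 0.0
    return [m - theta * (c - c_bar) for m, c in zip(metric, covariate)]


def srm_p_value(n_control, n_treatment, expected_ratio=0.5):
    """Chi-square (1 degree of freedom) p-value for a sample ratio mismatch.

    A very small p-value suggests assignment is skewed away from the intended split.
    """
    total = n_control + n_treatment
    exp_c = total * expected_ratio
    exp_t = total - exp_c
    chi2 = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    return math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    print(bonferroni_alpha(0.05, 5))           # stricter threshold for 5 metrics
    print(winsorize([1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]))
    print(srm_p_value(5000, 5500))             # intended split was 50/50
```

Each function addresses one of the three questions raised in the discussion: Bonferroni for false positives across several metrics, CUPED for variance, and winsorization for outliers, with the ratio check guarding the assignment itself.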
You know, can you give us some of the behind the scenes of what it was like to build that in conjunction with experiments and feature flags, especially with what you've seen with our customers who are using it right now, and maybe some advice for folks that wanna get started with this but don't particularly know how? Yeah. It's been really exciting as we've built MCP. We, in general, heavily use early versions of what we're building, and it's been really cool. Some of the things that stood out to me: one was that being able to connect your code base to your analytics and your experiments has been a really powerful unlock. I think the second thing has been being able to couple our MCP server with extending your LLM using skills, like you showed off a little earlier, Emma, and plugins. That's been really cool. The experimentation life cycle is a multistage, somewhat sophisticated workflow that requires a lot of different steps. That's a great opportunity to introduce a plugin, which, if you're not familiar with that, is a bundle of skills like Emma was showing off. That exact thing you were showing was a plugin that our growth function developed to run experiments more quickly. Something that I don't think is talked about enough is that these skills and plugins can be really molded and personalized to your workflows. As a PM myself, I've grabbed plugins from my teammates and molded them to my workflows, because the experiments I run and my optimization focus are gonna be a little different than some of my teammates'. That's been really powerful because these things are so malleable. In SF on Tuesday, we unveiled Mixpanel AI. I got to speak with some of our customers firsthand and get them set up with our MCP server. There's been a ton of excitement about it. One theme that has come up a lot is governance, and I'm excited to say that our MCP server comes with so much of that out of the box. 
When you authenticate with MCP, it inherits all of your user roles and permissions. Even though you can do so much with MCP, including a lot of bulk manipulation, there still are inherently those guardrails in place: for any resource that you cannot edit or cannot delete, the same restrictions apply through the MCP server. That's been an important theme that's come up, and we're really well positioned for it. As for advice to get started, one is on that point about skills and plugins. At the end of our session today, we're gonna share some resources. One of them is a public repository of different skills and plugins that we are building out live in the world for all to see. We're sharing generalized versions of the ones we use with all of you. I would strongly encourage you to make those your own. Use them out of the box, but also pull them into Claude. Use them as reference and chat with Claude about how you can mold them to your tool stack and your workflow. That would be one piece of advice I would strongly recommend for the audience here. Totally. So you mentioned these skills and plugins, which, obviously, we've talked about a little bit today. But I'm just curious, what skills and plugins are you using on the daily? What have you found to be most cool for you? Yeah. What has been really exciting for me is a few things. One is I find using session replays with skills and plugins really powerful. I think that's one source of data that is more time consuming than others to glean insights out of. Looking both at our team and speaking with some of our customers, we've observed how much time investment that can require. Now with MCP, or, if you prefer to work within Mixpanel, with Mixpanel Agent, having a natural language conversation where you can interrogate your LLM about insights from your replays has been very, very powerful. I think a second thing that comes to mind is on the topic of molding skills and plugins to my own workflow. 
There are certain frameworks I've come across in the past to grow existing features that I really like, that I brought from previous roles. I've been able to infuse those same frameworks into the skills and plugins I'm using. As a core PM, when it comes to experimentation, I am often thinking about how I can grow features we've shipped and how I can understand if they're working or not. That can be a very different focus. If you're a growth PM thinking about monetization, as an example, that can be a very different sort of ballgame, and the types of ideas and the types of frameworks you might be using could look very different. Those are a couple examples that come to mind, Emma. Totally. So you have mentioned Mixpanel Agent a little bit, and this is something that was just launched this week. So do you wanna give a little bit of an explanation about when you might use Mixpanel Agent versus when you might use MCP for thinking about experiments? Yeah. That's a very good question. It will depend. A core belief of our AI product vision and roadmap is that the interface is changing, and as a result, we're gonna meet you where you are. If you prefer to work out of Claude Cowork or Claude Code, and that is your central control plane for everything you do, we will meet you there. Or if you prefer to work within Mixpanel Agent, we will meet you there as well. There are some intrinsic pros and cons to each. One of the things we can bring in Mixpanel Agent that is not possible in MCP is, I think, a deep level of interactivity. You'll notice, as Mixpanel Agent is rolling out to all of our customers in the next couple of weeks, that right out of the gate we're packing a lot of rich user experience within the chat. So it's really not just a back-and-forth, multi-turn chat. 
We are embedding components that link to first class citizens and primitives across the Mixpanel ecosystem to make it really easy to understand our agent's chain of thought and the data it's using. You can do that all right within the Mixpanel ecosystem. So I do think that's a big benefit of the agent: you are already in our Mixpanel ecosystem, and you can view these things, the data it's using, really firsthand in an unparalleled way. Yeah. Totally. And I have to assume, for folks who are experiencing challenges in getting MCP connections, whether that's for security reasons or otherwise, that being able to have something in platform is a huge plus there. Yeah. Absolutely. It's something that you can use right away. Again, I would strongly encourage everyone here to give that a try in a couple of weeks when we roll it out more broadly, but that's at your fingertips as soon as you log in to Mixpanel. I am so excited for everybody to be able to get their hands on this. So, you know, one thing is that AI is really making experiments more accessible than ever. Right? It's becoming everywhere. Everybody is getting access to it. So instead of setting up all of this statistical stuff manually, AI can now handle it, so you don't have to become that expert. So I'm curious, what kind of knowledge and expertise do people still really need right now? And how can that be offloaded to AI, or perhaps supplemented? How are you thinking about that? Are there any fundamentals that won't go away? I think there are some fundamentals that won't go away. I do think it's important to understand the intuition of experiments and some of these statistical paradigms to understand the value. 
Things like understanding the value of causal inference as a whole, firstly. If you are doing directional analysis in your product analytics and you're seeing interesting trends, you're starting to form hunches about why a metric might be moving, and maybe these things are correlated, that's very powerful. But at the end of the day, the only way to make a causal inference is by running an A/B test. I think that alone is a pretty powerful thing for teams to digest and internalize: the way to make very confident, data-informed decisions is to run experiments. And in Mixpanel, we are consistently providing new statistical models. We've heard from our customers that certain teams find a frequentist model very much fits their data and fits how they think about experimentation. An upcoming area we're investing in is Bayesian, where, as opposed to frequentist, some teams find it more intuitive to think about the probability of whether or not something is moving the needle. That's something we're actively building right now. Ultimately, understanding the intuition behind the value of A/B testing is not something that will go away. We're going to be continuously adding more augmentation to help you design experiments in a statistically sound way, but that intuition is always gonna be important, in my opinion.

Totally. And for folks that are maybe starting at ground zero, where they don't have that intuition and don't really know where to start, what resources or ideas would you recommend? Where do they go? How do they begin developing this? What have you found to be helpful for, say, somebody who's coming in with near-zero knowledge?

At Mixpanel, and we'll mention this at the end too, we provide some educational resources, one of which is an experiments university course.
We also provide an ebook on how we feel experimentation is changing. We also think things like Mixpanel Agent and our MCP server are great ways to ramp up on this domain in general. Both have full access to our knowledge base and docs, and they are, I think, an easier and more fluid way to ramp up on some of the concepts in the product that would be worthwhile getting familiar with.

Totally. Mixpanel University is such a great resource. We have courses on quite literally anything you can think of, and absolutely experimentation is there too. You know, we've talked a lot about the tech side of this. Right? But the people and culture part is often the hardest. And so I'm curious, what have you seen in teams that really kill it when it comes to experimentation? And what does good culture and process look like today?

Thinking about both us internally at Mixpanel as we scale our growth program and some of our most advanced customers, one characteristic of a strong team that comes to mind is crowdsourcing ideas. That holds regardless of whether you are running a centralized experimentation model, where one team is driving your growth initiatives; a completely decentralized one, where experimentation is embedded in each team; or a hybrid, a center of excellence where you have an enabling team helping different pods scale. At the end of the day, the more you can democratize and promote participation from everyone across the org, the more momentum and excitement you'll drive. One concrete example I'll share for us internally at Mixpanel is that we have a company-wide Slack channel where literally anyone and everyone posts growth ideas. How can we grow our product? This spans all different teams, and the focus can vary across all different units of the business. That's been a huge source of excitement.
And to see ideas go from just ideas to shipped things, especially now as our MCP server and Mixpanel Agent enable more and more people to get up and running with tests, has been a huge source of momentum for us. The second thing I'll mention is, on the flip side, sharing out your results as well, and not being fixated on sharing only successes. It always feels so good to take a big swing in an experiment and learn that it was a success, or that your hypothesis was validated beyond any doubt. And to share out that accomplishment is great. It is equally important to share out experiments that were inconclusive or negative, because you learn some very valuable things from those experiments. Keep in mind the common rule of thumb that most experiments do fail, but all of them are lessons to be learned from. The more you can share out those types of experiment results as well, I think the more it encourages others to get started and to understand that this is how we learn with confidence: by failing, and by understanding that failure will raise new questions that we can turn into new things to test and ship. The final thing I'll mention is that regular metric reviews, as part of your product analytics practice, are also a great forcing function to drive experiment ideas. As you are looking at your product analytics, seeing movement in metrics, and trying to understand the root cause of those movements, that is great fodder for experiment ideas: I have a hunch that this is the thing that's driving this. Going from that idea to a full-fledged spec and getting it out in front of your end users is a great source of ideas. Those are a few things I'd recommend, that we try to live by at Mixpanel, and that we see with our customers as well.

Totally. So thinking about that sharing piece, that seems to be really instrumental here.
Is there an easy way to share those results within Mixpanel for the tests that you're running?

Yeah. There is. In Mixpanel, first and foremost, we have a dedicated view for results that you can share across your team, and we're consistently working on making that experience more digestible. Those already using Mixpanel for experiments may have noticed that we recently released some changes to that experience. What we heard from a lot of our customers was that, for a lot of people on their team, the reports we provide serve as an executive summary of results: I want to see at a glance how my experiment has gone. And then, optionally, there's going to be a minority of people, maybe the person who ran the experiment, maybe certain specialists, who want to dig deeper into the configuration and the diagnostics behind that experiment: understanding the settings that were configured, understanding how the health checks are looking for the experiment. We follow that same mental model in the product, and you're gonna see more changes soon. Right now, we've provided a very clean, simplified, approachable view for your results, and we've provided dedicated views for those other concerns as well. We're gonna continue building on this. An exciting opportunity I'm looking forward to with the team is making that experience more customizable in the product. We're actively shaping some work to add more color and commentary to your experiment results within the product. I also think some of the things that you showed off, Emma, with MCP are a great resource to help you mold your results and make them more digestible for your team.
And then with Mixpanel Agent as well, we already shipped v1 of our evaluation agent, which helps you get a plain-language summary of your experiment results, especially for those who are less familiar with some of the statistical concepts involved. We're gonna continue building on that as well.

Yeah. You read my mind. That's exactly where I was gonna go next, which is, how does MCP enable this? So thank you for jumping ahead of me there.

Sure thing.

You know, there's so much great engagement in this chat and Q&A, and I wanna make sure that we have time to answer some of those really cool customer questions that are coming in. But before I do that, I just have one big-picture question to ask you. And that's, where do you think the biggest evolutions are coming in?

I think about this in terms of a shorter-term time horizon and a longer-term time horizon. Shorter term, one of the things I'm most excited about that we're executing on is no-code experimentation: democratizing, from a technical perspective, who can run experiments. There are existing solutions out there. We are actively shaping and building our own no-code experimentation solution. With the explosion of AI tools for generating code, we are in a very exciting position to build something that is much more AI native and that we feel is going to stand out and feel a bit different from existing legacy tools out there. So there'll be more to come soon on that, but building this next generation of experimentation in the age of AI is really exciting. The other thing I'll call out that's more near term is predictive simulation. We already have things like our metric trees, which visualize relationships between your lower-level tactical metrics and the longer-term, more lagging North Star metrics that ultimately constitute success.
Today, if you are running experiments for a metric in that tree, we already show you how those experiments are affecting that metric. What we are moving to is more predictive extrapolation, where we can help you answer the question of: okay, yes, this experiment drove x uplift in this metric, but, one, what is the impact of extrapolating that over the full fiscal year? And two, what is the impact on the upstream metrics that we really care about and that my leadership would really care about? Things like pipeline and bookings and retention. Those are a couple of, in the grand scheme of things, shorter-term evolutions we're building towards and excited about. Longer term, one thing we're really excited about is simulating experiments with synthetic users. We at Mixpanel have over 29,000 customers. We process billions upon billions of events for our trusted customers. That is such a great foundation of data on which to start creating synthetic users that accurately represent how your real users behave, so you can simulate experiments and get really early feedback on your hypothesis before even launching a live experiment to your actual users. Another aspect that we're excited about is the opportunity to fully automate certain aspects of optimization. There may be certain allow-listed surfaces that you want to let Mixpanel fully automate, with strong guardrails in place: control only this part of our site or application, or even just these parameters of this part of our site, and maybe a closed set of values that we can automatically optimize for. At the end of the day, being able to optimize any single surface of your product, whether it is higher stakes and requires human intervention or lower stakes and can be fully automated, is what we're really excited about working towards.

Yeah.
Well, I surely can't wait to see that vision, and you have been so helpful in explaining what this looks like, not only at Mixpanel, but for all of the customers and prospects that are listening in right now. There are so many fantastic questions that have come in, and I wanna make sure that we have time to get to them. But I do think that we have a poll for our audience that we want to launch really quickly, just to understand where folks are at before we jump into the Q&A. So how would you describe your team's current experimentation maturity? Seeing some votes come in here. Experimenting consistently. Fantastic. A lot of running tests ad hoc with no real system. You know, as these are coming in, Russell, is this what you anticipated the results would look like?

I think it can vary quite a bit. It can be hard to get started, but once you build that momentum, it can really take off. So I wouldn't say it necessarily surprises me. For everyone here who is in that bucket of running tests ad hoc with no real system and trying to systematize your efforts, I'd strongly encourage you to try out some of the things we talked about today, MCP and Mixpanel Agent, because they can really reduce the overhead it takes to simply get started, build momentum, and start showing people the results of your testing efforts. That's, I think, a very actionable next step that I'd recommend to anyone in that camp.

Absolutely. Okay. Well, why don't we go ahead and get to some of the Q&A that I see here. Let me go ahead and stop sharing my screen. You know, one of the big ones here is a lot of questions around, why would we use a Mixpanel feature flag? What is the advantage of using a Mixpanel feature flag over one of the other tools, like LaunchDarkly, etcetera? Can you speak to that a little bit?

Yeah.
At Mixpanel, something we provide that no other platform does is an unparalleled depth of targeting, most notably with behavioral targeting. What's really unique about Mixpanel's feature flags is that they use the same behavioral cohorts that you, if you are a Mixpanel customer, are already using today to run your analytics, and what we see from our customers is that those can get pretty deep in sophistication. You can target things that are pretty unparalleled, in my opinion, compared to other platforms, having built other platforms in this space as well. You can use those same cohorts to target your feature rollouts and your experiments, all without the overhead of having to manage everything in your code. If you look externally, one of the things we noticed is that it takes a lot of engineering overhead and work to do even simple targeting. That's, I think, one of the things that sets Mixpanel flags apart the most, in my opinion.

Totally. And, you know, leading on from that, I see somebody asking here about onboarding experiments. For those users that are landing on their site for the first time, is there a way to show a simple A/B test to those first-time users? Is that something that you can configure?

There is. Not all that long ago, we released runtime events, which allow you to target first-time events with Mixpanel cohorts for that exact use case: we want to run an onboarding flow, and after this event is triggered, we want to show a different experience. That's something that is supported today in Mixpanel. It is still using Mixpanel cohorts, so it's all going to be synced up with your Mixpanel cohorts throughout the ecosystem, but you get the benefit of low-latency targeting with your existing Mixpanel cohorts. That is available and tailor-made for that use case today in the platform.

Yeah. Awesome.
And, you know, again, talking about how we incorporate all of these other facets of Mixpanel into experimentation, particularly session replays. Right? How do you recommend avoiding any sort of summarization issues while still being effective about token use? What is the trade-off between, say, analyzing all of your session replays versus maybe just a subset for the specific tests that you wanna run?

Yeah. The trade-off is about breadth versus depth. Going very broad can really be helpful for high-level direction. If you are thinking at a higher level as a product manager, trying to define direction for your roadmap and which problem spaces you might want to tackle, that's where going really broad, and inevitably shallower, can help you get some higher-level signals. If there's a specific problem space, say you are a PM responsible for a conversion funnel and you're seeing friction in a certain phase of the funnel, and you want to identify some specific subproblems within that problem space, that's where I think going deeper can help. In both cases, things are augmented by our ability to do this through natural language and then look at the underlying data ourselves. I do think, at the end of the day, it is important that, as product managers, we still stay close to the data. In Mixpanel Agent, as an example, when you curate session replays and start to converse with Mixpanel Agent about them, we will, by default, surface timestamps: hey, we found these insights for you, and here are some excerpts of raw data for you to look at, so you can still empathize with your users, see the actual friction they're hitting, and see the actual frustration signals. The way we've designed Mixpanel Agent is that we wanna give you both. We wanna give you the speed to synthesize. We also want to make sure that you are staying close to your customers and staying close to the data.

Totally.
And I think that this leads into the next question a little bit, which is that, you know, traditional experimentation was often built for really high-volume consumer products. But we're seeing that shift a little bit, particularly in the age of AI. In what way should PMs adapt their thinking around statistical significance, directional signal, or just general product judgment in lower-volume but faster-learning environments?

Yeah. We are already shipping some capabilities to help here. Things like CUPED, controlled experiments using pre-experiment data. This helps you control for underlying variance within your users that is independent of your experiment. For example, there may be a certain group of users, maybe power users, who just naturally have more engagement, and we can adjust their results for that. By accounting for that variance, we can reach statistical significance faster. And we're gonna continue to build upon this. Something on our roadmap is stratified sampling, to ensure that your treatment groups are balanced on the dimensions that are important for your business and to reduce any kind of skew between those treatment groups. I do think it is also worth embracing shipping and observing, even in lower-traffic environments. It is still a valuable exercise even if, in a certain scenario, you can't reach statistical significance. Ultimately, the confidence levels you choose are arbitrary. If it is a lower-risk change you're making, you might be more comfortable lowering the confidence level if that helps you reach stat sig faster. We give you full control over the confidence level you're using in Mixpanel. You can also pretty seamlessly roll out a feature and observe how people are behaving in response to that feature in Mixpanel as well.
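The CUPED adjustment mentioned above can be illustrated with a minimal sketch. This is the textbook form of the technique, not Mixpanel's implementation; the function name `cuped_adjust` and the simulated data are assumptions made for illustration. Each user's in-experiment metric `y` is adjusted using the same metric measured before the experiment, `x`, with theta = cov(x, y) / var(x).

```python
import random
import statistics


def cuped_adjust(y, x):
    """Return CUPED-adjusted outcomes: y_i - theta * (x_i - mean(x)).

    y: metric values observed during the experiment (one per user)
    x: the same users' metric values from before the experiment
    """
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    # Sample covariance between the pre-experiment and in-experiment metric.
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / statistics.variance(x)
    return [yi - theta * (xi - mean_x) for xi, yi in zip(x, y)]


# Simulated users (illustrative only): engagement during the experiment is
# strongly driven by each user's engagement before the experiment.
rng = random.Random(0)
pre = [rng.gauss(10, 3) for _ in range(2000)]
post = [p + rng.gauss(0, 1) for p in pre]

adjusted = cuped_adjust(post, pre)
raw_var = statistics.variance(post)
adj_var = statistics.variance(adjusted)
print(f"variance before adjustment: {raw_var:.2f}, after: {adj_var:.2f}")
```

The adjustment leaves the mean of the metric unchanged while removing the part of the variance explained by pre-experiment behavior, which is why the same effect can reach significance with fewer users. In a real experiment, theta would typically be estimated on data pooled across control and treatment and then applied to both groups.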
Even if you don't think it makes sense to run a formal experiment, you can still act, observe, and see how people react from multiple angles: not just the metrics, but also things like session replays. How are people experiencing firsthand the things we've shipped? So I would say don't be closed off to all the different tools and methods you have available to get data. Some data is always going to be better than no data and opinions.

Yeah. Absolutely. We're also seeing some questions here about context. We've been talking so much about MCP and how it allows you to connect all of these different systems, and that's inherently super valuable. But for folks that are going to be leaning more towards Mixpanel Agent, where it's entirely in product, is there any additional way to add context that might otherwise be living outside of it? What's the best method, when using an internal agent, to still get some of that data that might otherwise be living outside?

Yeah. First of all, we are already going to be providing the ability to connect to other external data sources with MCP, and we're gonna be rapidly adding to that. Whether you are working in the MCP server or with Mixpanel Agent, we will provide you the ability to stitch together data from external sources. Additionally, the business context engine we're building is going to be, I think, a really powerful way for you to start to incorporate data from outside sources. As you continue to work in Mixpanel and continue to pull in external data sources, our system will continuously learn from all the context you're providing through your interactions with Mixpanel Agent. And that will be something that we can automatically leverage to improve our responses and our recommendations.

Yeah. Absolutely. So I think that is nearly all of the questions that I see here.
The last one is just, we've been talking about MCP. We've been talking about Mixpanel Agent. Are folks able to get their hands on them right now? And if not, what does that timeline look like?

The MCP server is available for all of our customers today. Support for experimentation and feature flagging has been in open beta since Tuesday, at MXP San Francisco. We have a slide that links to some of the resources you can try out, and that is one of them. You can try that out today. As for Mixpanel Agent, we've already rolled it out to a broad spectrum of customers, especially those who attended MXP here in San Francisco a couple of days ago. In a couple of weeks, we're gonna be rolling that out to all of our customers. So do expect that by the end of the month.

Fantastic. Well, thank you so much, Russell. This has been so informative, and the questions have been so targeted. I'm so excited to see those rolling in here. I know that we have this last takeaway slide here. Did you want to mention anything in particular for folks to walk away with?

In conclusion, please give some of the things we've been talking about a try. I'm confident they're going to help you move from analyzing and gathering data, and a lot of the, candidly, tediousness that comes with that, to making decisions, taking action, and giving your team clarity, whether that is our MCP server or Mixpanel Agent. Give them a try. We have a QR code that links to a lot of resources on how to get started with those things. We also offer educational resources, like our university course, for those who want to get a better idea of some of the concepts behind experimentation that you will see in our platform. And also check out our ebook for similar content on how we view testing shifting in this AI era.

So exciting. I've seen this ebook myself. It's very cool. But, yeah, thank you again so much.
I really appreciate all the time that you took today. Your deep expertise is absolutely felt, and there are some really great resources for everybody to check out. And thank you, everyone, for coming today. I really appreciate it. We will have a recording of this, so if anybody wants to take a deeper look a little bit later, it will be available for you. Otherwise, I know folks are joining in from all over the world, so I hope that folks have a good rest of their mornings and afternoons and evenings, and we'll all talk soon. Thanks so much again.

Thanks, everyone. Bye bye.