“Experiments in Chaos” is a Steadybit podcast that brings together experts in software systems reliability to discuss how to incorporate resilience practices like chaos engineering to foster a culture of reliability.
In this episode, Benjamin Wilms sat down with Jessica Kerr, an Engineering Manager of Developer Relations at Honeycomb, to discuss the impact of new AI agents on software systems. They explore why organizations are not ready to handle this new complexity and how they will need to shift their thinking for this new paradigm. Lastly, Ben and Jess discuss how AI will likely impact observability, the role of chaos engineering in validating agents, and how iterative learning can benefit both team members and agents.
Benjamin Wilms: All right. And yeah, today I'm very glad to welcome Jess. We already recorded a podcast session together some years ago; at that time it was for a Honeycomb session. And with a lot of luck, we just reconnected at KubeCon in Atlanta a couple of weeks ago, and now it's my pleasure to welcome you to my podcast. So happy to have you here.
Jessica Kerr: Thank you.
Benjamin Wilms: Maybe you can just introduce yourself for the audience.
Jessica Kerr: Great. My name is Jessica Kerr. I go by Jessitron online, jessitron.com, et cetera, and what I care about is systems, learning systems, and I think software systems include the team in that. So I care about software learning from our teams, which is like, we code it, and I care about our teams learning from our software, and that's why I now work in observability at Honeycomb.
Benjamin Wilms: It's very nice that you mentioned your definition of a system, because in previous episodes I already talked with other people like Russ Miles, but also Casey Rosenthal. Normally, when people from our space talk about a system, their definition is just technical, but a system by design is a socio-technical system. You need, of course, technical components, the software, but there are so many other important elements like you mentioned: the people that are building and running those systems. Is there any change you can identify in today's systems, in the complexity of those systems?
Jessica Kerr: Oh yeah, of course. Because now we have the running software, we have the team that modifies it, and we have the AI agents in between, so there's a complete third component in the sociotechnical system. It's now a socio-agentic-technical… Yeah.
Benjamin Wilms: Yeah, we need to find a new term for it.
Jessica Kerr: Right, yeah, there are these three completely different, using the term loosely, intellects. There's the deterministic software, there's the people, the human team that decides what it should do, and then there are the coding agents that we use. So the people are communicating to the coding agents, which are changing the code, among other things. And then sometimes you have agents involved in the running software, and that's a completely different part, so now our software is not fully deterministic anymore.
Benjamin Wilms: Yeah,
Jessica Kerr: Yeah, it's a very different system, and it's more powerful than what we had before, but we're even less familiar with how to work with it.
Benjamin Wilms: Exactly, and there are a lot of opportunities coming up with these newly designed systems, and they are very dynamic. But, how to say it right? Are we ready for this change as organizations?
Jessica Kerr: Of course not.
Benjamin Wilms: Why? Yeah, we cannot change it, it's already there, but why are we not ready? What do you think?
Jessica Kerr: Ah, why we're not. Well, I don't think we ever got to figure out how software systems work in the first place, right?
Benjamin Wilms: Yes.
Jessica Kerr: Right? Like how to design right into software, that's still new, because software as a medium, compared to paint or bricks or metal, is completely different. You can kind of design right into it.
And when you do that, you encode your design into what it should do, and then the software, in theory, does that or something close, and then you have to find out how it interacts with the real world, because a piece of the system that we haven't added in yet is the people using the software, or the other software using the software.
That's a different boundary, and it's not what we're mostly talking about as developers, and yet the system that we care about is how our software impacts the rest of the world. That's our real objective. Okay, so we were still working on figuring out how to really craft software, grow it, improve that interaction with the rest of the world, care about it, and keep learning ourselves, et cetera.
We didn't get to figure that out. I don't think we've mastered it, and now you add agents into the mix and we don't know what we're doing, which is fun.
Benjamin Wilms: Mm-hmm.
Jessica Kerr: My friend Eric Evans says that the introduction of LLMs is the most exciting thing since the internet.
Benjamin Wilms: Right, he was around when the internet was new, and that was very exciting, and people were exploring what they could do with it. And this time reminds him of that.
And now, with the first steps taken in this new direction of agent-driven systems, is there a gap coming up? A gap between the behavior the engineers designing those new systems expect and what those new systems are really doing? Is there a gap?
Jessica Kerr: Oh, definitely. Yeah,
Benjamin Wilms: A good one or a bad one.
Jessica Kerr: Oh, I mean, the part where we don't know what we're doing is disconcerting, but there's very high potential, right? We don't know what we can do yet. So in that sense, it's good, but we are used to writing deterministic software. I mean, I got into coding because I was studying physics in college.
Okay, just bachelor's level, but once we got to quantum physics, I was like, well, this is ruined. Now I can't make nice predictions about how the system is going to go. What fun is this? It's all just probabilities.
Benjamin Wilms: Yeah, yeah,
Jessica Kerr: And coding was more fun because the computer did what you told it, and that's gone now. It's gone.
Benjamin Wilms: Yeah, it's partially gone, that's right. And, hmm, I don't want to sound like this paranoid guy, but I need to ask this question again, or more in this direction: most people, when they talk about AI, just assume, hey, there will be a prompt.
I will enter a prompt, something will come back, and then this new AI system does something. But how do engineers really interact with it nowadays? Have you seen it in production, or are we still in this prompting approach?
Jessica Kerr: Well, there are two aspects. There's using AI for coding, and there's using AI as part of the software. When we're talking about AI as part of the software, there's that black box of: we send it something, it sends us something back. But it's never just the prompt. It's also what tools you give it and what context you give it, and when it's an agent, not just what context you actively give it, but what context you give it access to through tools.
So we get to control what it knows and what it can know, and all of that influences its output, and then we can also change how we react to its output. Like, what do we do when we get a parsing error, and what different checks and corrections do we apply to it? That's a thing. But there's still a black box in there.
And that black box, we can't just test. For an API, you can write contract tests, and you can say: I believe when I send it this, I get this back. Then you can run that against the API, and you can run it again when you think maybe someone on the other side has introduced a bug. You don't get contract tests with LLMs, you get evals. And that's a whole 'nother space that we have to learn how to do.
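As a rough illustration of the difference: an eval harness scores responses against criteria and tracks a pass rate, instead of asserting exact outputs the way a contract test would. This Python sketch stubs the model with a deterministic function; the criteria, the cases, and the 80% threshold are all invented for illustration, not a real eval framework.

```python
# Eval sketch: instead of asserting an exact response (as a contract test
# would), score each response against criteria and track a pass rate,
# because LLM outputs vary from run to run.

def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call; deterministic here for illustration.
    return f"Summary: {prompt.split('.')[0]}."

def criteria(response: str, case: dict) -> bool:
    # Example criteria: non-empty, short enough, and mentions the
    # keyword this case requires.
    return (
        len(response) > 0
        and len(response) < 200
        and case["must_mention"].lower() in response.lower()
    )

def run_eval(cases: list, threshold: float = 0.8) -> dict:
    passed = sum(criteria(fake_model(c["prompt"]), c) for c in cases)
    rate = passed / len(cases)
    return {"pass_rate": rate, "ok": rate >= threshold}

cases = [
    {"prompt": "Checkout failed for user 42. Retry advised.",
     "must_mention": "checkout"},
    {"prompt": "Latency spiked in eu-west. Cache cold.",
     "must_mention": "latency"},
]
result = run_eval(cases)
```

The point of the shape is that "good enough" is a rate over many cases, not a single exact match.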
Benjamin Wilms: Yeah, and those responses, the context the response comes out of, is different every time; it will change. In unit testing and integration testing, yes, you've got a lot of control, compared to the new way of dynamic systems where agents are part of them. But how do we really make sure those systems are working as expected, how much should we control them, and what is now the new way of observability?
Jessica Kerr: We can't define correct anymore. We can only define good enough,
Benjamin Wilms: It’s, it’s good enough for now. Yes.
Jessica Kerr: Right? Right, can we define good enough and can we define better? And usually, I mean, that’s really hard. But if you can get there, you can improve.
Benjamin Wilms: And in the old days of observability, you maybe installed some agents, they scanned your systems, they collected data. Maybe you were using some OpenTelemetry information. Now, with those AI agents in place, what needs to change in the way we observe our systems?
Jessica Kerr: We have to care more. We have to be more deliberate about it. So, you mentioned installing agents for observability. That's level one, the bare minimum. But if you care about your software and you really want to know what's going on, not just in your software, but in the software-plus-users system, then you want your software to emit telemetry about the business domain; nowadays people use OpenTelemetry for this. You want it to tell you, through its telemetry, what it's doing, who it's doing it for, like user IDs, and what it's using to make decisions. Whenever I see an IF statement, whatever that clause is, I like to stick it on as a span attribute, so it shows up in my distributed trace and I can know which path the code went down, that kind of thing.
You want your software to tell you what it's doing and to spy on your users for you. Which features are they using? How are they using them? Is it what you expected? And now we need it to spy on the agents as well. What did we send them?
What did they send back? What tools did they invoke? What came back?
And then what came out at the end, and how did our application react to what came out at the end? All of that is part of your application tracing, and the less we can predict what the black box is doing, the more we need all of that information: how did we get to this context, what happened in the interaction, and how did our application react to it?
What resulted, and, if we can get some sort of feedback or indication from the user of whether that was good enough, that's really valuable. So we need way more observability.
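The IF-clause-as-span-attribute habit Jess describes might look like the sketch below. The toy `Span` class just collects attributes in a dict, standing in for a real OpenTelemetry span's `set_attribute`; the checkout logic, attribute names, and discount rule are invented for illustration.

```python
# Toy span that collects attributes, standing in for an OpenTelemetry span.
class Span:
    def __init__(self, name: str):
        self.name = name
        self.attributes: dict = {}

    def set_attribute(self, key: str, value) -> None:
        self.attributes[key] = value

def apply_discount(span: Span, user_id: str, cart_total: float) -> float:
    # Business-domain telemetry: who is this work for?
    span.set_attribute("app.user_id", user_id)
    span.set_attribute("app.cart_total", cart_total)

    # Whenever there's an IF, stick its clause on the span so the
    # distributed trace shows which path the code went down.
    eligible = cart_total >= 100.0
    span.set_attribute("app.discount_eligible", eligible)
    if eligible:
        return cart_total * 0.9
    return cart_total

span = Span("checkout")
total = apply_discount(span, "user-42", 120.0)
```

With the real OpenTelemetry SDK the calls look the same; the attributes then ride along on the exported trace instead of sitting in a local dict.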
Benjamin Wilms: And as of now, observability mostly shows what happened in the system. How close do we need to get to a real-time approach? How fast should we be able to see what happened in the system?
Jessica Kerr: Oh, like how much time do you need between "this happened" and "I can see it"?
Benjamin Wilms: Yeah. For example, if there are only humans in the loop and no agents doing stuff automatically in the system, so they are not part of the system, do I need less time?
Jessica Kerr: Ah, that's a good question. I mean, we all like real-time observability. At Honeycomb we have an internal SLO that your data should be available in about five seconds from when you send it to us; we can't control the delays before that. And it's usually less than that. So you want to be able to know quickly.
Now, pre-AI, the reason Honeycomb has always been designed this way, so that you can know quickly, is so that you can deploy something and see very quickly whether it worked. And that cycle of do-something-and-look can be even faster if an agent is involved. Agents can take in a lot of information and react faster than we can in a lot of cases, but for that, they need to be able to gather the information quickly. So the availability of the information can be a limit on how quickly your agents can work, on the speed of that feedback loop, and also on how quickly a query returns. This is one reason the things Honeycomb has always focused on are even more important with AIs, because we put a lot of emphasis on making every query fast. We did that so that people could run multiple queries in a minute, and now it's the agent running tens of queries in a minute.
Benjamin Wilms: Yeah, so what is the solution now? We need to be even faster than before?
Jessica Kerr: It's still about enough, you know, about fast enough, because honestly, do I want my agents making changes to my system at the rate of one every 10 seconds? Not sure about that, actually.
Benjamin Wilms: It really depends on the use case or context we are talking about. If you are in a trading system, maybe, but if you are in a healthcare system, ah, take your time.
Jessica Kerr: Right, right, right. Because you do want humans in the loop if anything really interesting is going on. If it's not interesting, agents are just somewhere on the scale of: we know this happens, we automate it. Like in Kubernetes: we know the service gets busy sometimes, we scale it up. We know it goes down sometimes, we restart it.
There's a level of predictability you can fully automate, and then there's "we don't know what's going on and we'd better look at it, and at the wider implications"; that's where people are involved.
And agents can do something in between, right? They can do: well, we don't have an automated response for this, but we can make a really good guess.
Benjamin Wilms: Yeah, as long as we are controlling the context, the data used to produce a response. If this is not controlled by a human in the loop, those responses by an AI agent could get very ugly. If I remember correctly, there was a company trying to replace all the support people on call.
And after a couple of days, the agent was responding in a way that was not good for the company, because it was getting out of control. It was collecting more and more data from all the conversations, and then, don't get me wrong, it suddenly changed its mind about the company it should work for.
Jessica Kerr: That’s hilarious.
Benjamin Wilms: It’s, yeah. That’s why, you need boundaries for your system, for your agents, and yeah, don’t get everything out of control. It could, it could hurt you as well.
Jessica Kerr: Yeah. Too much information is too much information. More is not always better, because when I'm working with an agent for coding, a lot of what I do is clear information out of its context. It's time to start over and give it just the stuff that was right, so it doesn't get hung up on the stuff it was wrong about earlier.
Benjamin Wilms: That's one of the very important concepts of AI. AI will not generate something new; it will always build on top of something that was done in the past. So it will not create a new idea or a new way of solving an…
Jessica Kerr: We like to think that, but I think combining multiple things from the past is how we come up with something new, and I think it does that.
Benjamin Wilms: Yes, but it’s, Hmm. Yeah. Okay.
Jessica Kerr: Yeah, I mean, I don't know whether it's intelligence or not. I don't think that word has a definition that's useful.
Benjamin Wilms: Yeah,
Jessica Kerr: But I do observe that it, it can surprise us.
Benjamin Wilms: That's true. And if you see it as something that can support you, help you, maybe give you a little bit of guidance, but the final decision, the last step, needs to be approved or done by a human, then it's a strong way of building new systems or solving very complex issues.
Jessica Kerr: So do you think agents have a place in automatically responding to some system conditions, software system conditions?
Benjamin Wilms: Yes, if they are running within some kind of boundaries, and if the context is defined very precisely, then yes, and then they can act way faster than anyone else. But it would be a horrible solution to, let's imagine, take an old system that has been running for 10 years and put in a freaking awesome new agent; it's now part of the system, and the system is not designed to handle this. It's the same issue as in the old days: there were monolithic applications, and then new companies like Netflix showed up, microservice-driven, everything fine. And Netflix was very successful with the microservice approach.
And then the old companies still running on a monolithic approach tried to be the same, like the new stack of Netflix, and that's not working. If you can build a system from the ground up prepared for agents, this will work.
Jessica Kerr: Like cloud-native, but now sort of agent-native, and we don't know what it is yet.
Benjamin Wilms: Yeah.
Jessica Kerr: Yeah, yeah. So part of building your software to be ready for agents to help is going to be good observability, because observability feeds the agent's context, right? It feeds what it can know. Say the error rate has increased. What people do in this situation is go look at the traces until they find a service that's failing more than normal. And the first thing they're gonna do is restart that service, and the second thing they're gonna do is look for a recent change in that service and roll it back.
Benjamin Wilms: Yep.
Jessica Kerr: And if that's always what people do, and I think that's pretty common, an agent could identify the service that's failing, identify changes, and do that rollback while contacting people. And then the people start out a little bit ahead, already knowing whether the rollback helped.
Benjamin Wilms: More like, let's say, an agent on call. The agent is observing the system; it's able to see all the data from the observability tools. Maybe some error rates are popping up, and it was trained to take the decision: okay, I could try to solve it with just a restart. That's based on old incident data, on all the experience from my team, and they taught me, as an AI agent, to do exactly that. But again, as I'm on call, I will inform my team that I did exactly that, one time, not multiple times.
Jessica Kerr: Yes, yes. Don't do this on a ten-second loop and try different things every 10 seconds. That's gonna be chaos.
The bad kind.
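The guardrails Ben and Jess are describing, act once, tell the humans, and never retry in a tight loop, might be sketched like this. The thresholds, the cooldown window, and the restart and notify stubs are all hypothetical, not a real remediation system.

```python
# Sketch of a guarded auto-remediation step: the agent may restart a
# failing service at most once per cooldown window, and must notify
# humans every time it acts.

COOLDOWN_SECONDS = 3600.0
actions_log: list = []
_last_action_at: dict = {}

def restart_service(service: str) -> None:
    actions_log.append(f"restarted {service}")  # stand-in for a real restart

def notify_team(message: str) -> None:
    actions_log.append(f"notified: {message}")  # stand-in for Slack/pager

def maybe_remediate(service: str, error_rate: float, baseline: float,
                    now: float) -> bool:
    if error_rate <= 3 * baseline:
        return False  # within normal variation: do nothing
    last = _last_action_at.get(service)
    if last is not None and now - last < COOLDOWN_SECONDS:
        # Already acted recently: escalate to humans instead of looping.
        notify_team(f"{service} still unhealthy after restart, need a human")
        return False
    _last_action_at[service] = now
    restart_service(service)
    notify_team(f"restarted {service} once (error rate {error_rate:.1%})")
    return True

acted = maybe_remediate("checkout", error_rate=0.20, baseline=0.01, now=0.0)
acted_again = maybe_remediate("checkout", error_rate=0.20, baseline=0.01, now=10.0)
```

The second call falls inside the cooldown, so the agent escalates instead of restarting again: one action, then a human.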
Benjamin Wilms: Absolutely. And now you mentioned chaos, so let's zoom in on exactly that topic. Why? Because you mentioned, some seconds ago, that you need to know what normal looks like. So if a service is getting out of control, you need to know exactly that, say, on a Tuesday at 10:00 AM, this is normal for this service, because the traffic goes up for whatever reason. But this knowledge needs to be part not only of your teams, of the experts building that software; it also needs to be part of the agent. And how should you train it? How should you teach and educate an agent in how it should react? You need to put it in not-normal conditions, and the best way to do that, from my perspective, is to introduce some chaos: to change the system in a very controlled way, to inject some errors into the system, so that the human can detect whether the agent is doing the right steps.
Jessica Kerr: Right, right. So the question of whether agents are going to be useful, whether you want to put them on call in production, is potentially going to be answered by chaos engineering experiments
Benjamin Wilms: Yeah.
Jessica Kerr: Of while we’re sitting here watching it, does it screw it up?
Benjamin Wilms: Correct. And you need to do it proactively. It's not a good idea to install the agent, go into production, ah,
Jessica Kerr: and then wait for and then wait for an incident.
Benjamin Wilms: Exactly,
Jessica Kerr: Then the incident comes and you're like, oh crap, that agent just did something. Oh right, we turned that on. What were we thinking?
Benjamin Wilms: Yeah. How to shut it off, how to turn it off. And now the agent is restarting all its friends, because they're all on call now. That's where it gets complicated. You need to be prepared for those moments, even more now with this new way of building systems with agents inside.
Jessica Kerr: Because you need to test, does it react appropriately?
Benjamin Wilms: Yeah.
Jessica Kerr: And then does it stop when, when it should?
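The two checks in this exchange, does the agent react appropriately, and does it stop when it should, can be sketched as a tiny chaos-style validation: inject a fault into a toy system, let a stubbed agent react, and assert on both behaviors. The toy system, the fault, and the agent policy are all invented for illustration.

```python
# Chaos-experiment sketch: inject a fault, let the agent react, and check
# (1) it reacts appropriately and (2) it stops once the system recovers.

class ToySystem:
    def __init__(self):
        self.healthy = True

    def inject_fault(self):
        self.healthy = False  # the chaos experiment's controlled failure

    def restart(self):
        self.healthy = True   # in this toy, a restart always recovers

def agent_step(system: ToySystem, log: list) -> bool:
    """One observe-act cycle; returns True if the agent took an action."""
    if system.healthy:
        return False  # nothing wrong: the agent must stay quiet
    log.append("restart")
    system.restart()
    return True

system, log = ToySystem(), []
system.inject_fault()                  # proactively break the system
acted = agent_step(system, log)        # agent should remediate once
acted_again = agent_step(system, log)  # and then do nothing
```

A real experiment would run against a staging cluster with a tool like Steadybit injecting the fault, but the assertions stay the same shape: it acted, and then it stopped.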
Benjamin Wilms: Absolutely. And something we have created at Steadybit: of course you can run those experiments in a running system, but there's a ton of data coming out of those executions and tests and all that stuff. At some level, it's very hard to figure out what the root cause is and how to solve it.
And that's, for me, a strong use case for using this data in combination with observability data to really identify, with AI's help, what was going on here. What is the root cause? How should I address this issue?
Jessica Kerr: Yeah. What, like, what really happened? What did the agent see? What did the agent do?
And we have to introduce instruments, in a scientific sense,
right, little spyware to see what's going on, like thermometers or telescopes or whatever. These are all instruments. With agents, our instruments are things like recording the ins and outs, and the agent reporting its thought process. And in the software system, observability gives you all those instruments so you can see what happened on that side, and the agent is telling you stuff. And then Slack is really useful, now that we're mostly remote, and if it's the middle of the night in an incident, I certainly hope we're dialing in; then we have some record of what the people were thinking and saying.
Benjamin Wilms: Hmm.
Benjamin Wilms: That's where I can see, let's imagine you're an SRE on call: wouldn't it be nice if, let's say, your AI SRE buddy has already done a first review or an audit of the situation? It has collected all the information, and you will get a nice
Jessica Kerr: And it's made you a custom dashboard just for this. Now you still need to go in and check that those graphs are querying what you think they are querying,
Benjamin Wilms: Absolutely, and…
Jessica Kerr: But it’s a starting point.
Benjamin Wilms: More like helping you to be more focused, to focus on the right elements. But then it's still your final decision whether you would like to restart the service again after the 100th time.
Jessica Kerr: Oh, right, right. Because maybe the problem is actually in the configuration data and there’s some discount that’s defined in an invalid way, and you could put that on the list of things for the AI to check next time, but next time it’s gonna be something new.
Benjamin Wilms: Yes, absolutely. That’s true. That’s absolutely true.
Jessica Kerr: Yeah, so there's even more. You mentioned incident review, I think, but we get all this data, and the question is: how do we learn from it?
Because when you have a chaos engineering experiment, voluntary or involuntary, incidents count too, I guess it's planned or unplanned. When we're looking at learning from this, we want our agents to learn from it too. So we specifically want to be able to improve their prompts or runbooks or whatever you want to call them: what to check, what to look for, who to inform. And I think this might be a promising advance in incident review, because there are companies that are like, oh yeah, we could learn something, blah blah, and don't want to invest in that. If you tell them, oh, but we need to improve our agent prompts so our agents can be more effective, that they believe in.
Benjamin Wilms: Yes, yes. And also, a fun fact: we are working with some students from a university, and they have implemented their own SRE agent, and they are using our tool. There are two main use cases. First of all, they want to identify and learn whether the agent is really doing the right stuff, because they are the SRE experts; they have trained the agent, and it's now part of the system and making changes in the system.
But the second, and for them more important, use case is that people who are getting this agent in their hands for the first time don't trust the system. They don't trust the SRE agent.
Jessica Kerr: Oh Yeah.
Benjamin Wilms: So they want to see a validation upfront, and that’s where Chaos Engineering is again coming in.
So they install the agent, let's say in a Kubernetes cluster. It has been trained to react under specific conditions and it's able to do that, and then, out of nowhere, they run chaos experiments with Steadybit to really show the potential customer: hey, our agent is doing the stuff as desired.
Jessica Kerr: So you can build that trust.
Benjamin Wilms: Yes, correct.
Jessica Kerr: That’s so important.
Benjamin Wilms: It is for both ends.
Jessica Kerr: Yeah, yeah. So the agent has to be able to work with its context, and the people can improve that, but the people have to believe the agent is useful. They have to look at its output. And then an exercise for us, one of the ways we have to change to work with these agents, is, when its output is wrong, to not say "stupid agent" and throw it away, but to say: all right, time to give it better context.
Benjamin Wilms: It always gets back to starting with the why first: why was this result so bad? There needs to be a reason. Is the data not correct? Is the
Jessica Kerr: And it's the same problem as with root cause. Is there really a why? I don't even think "why" is a healthy word. It's so vague and it's so blamey. But if we ask, how could this be different,
Benjamin Wilms: Mm-hmm.
Jessica Kerr: then we can change the input, we could potentially change the tools or the model, whatever. There are lots of ways that it could be different, and you don’t have to assign any of them to the root cause in order to change some of them and get a different result.
Benjamin Wilms: In other words, what do I need to change to fulfill my expectation? Because, I mean, same for…
Jessica Kerr: or even, what can I change? It’s not your fault, right?
Benjamin Wilms: no, no, no.
Jessica Kerr: You're not the only one who could change, either. I mean, there are other parts of the system that could change, but: what is within my scope that I can do,
Benjamin Wilms: Yeah, that’s,
Jessica Kerr: Such that my expectations are met. And sometimes that means changing your expectations of, well, the agent’s not gonna solve it, but it’s gonna give me some information that’s better than nothing.
Benjamin Wilms: that’s true.
Jessica Kerr: Or it doesn’t have to be right to be useful. It can still give me ideas or clues.
Benjamin Wilms: Then again, like I say, the human is the last responsible one in the loop. They need to take the final decision.
Jessica Kerr: Yeah.
Benjamin Wilms: All right, anything from your side you, you would like to talk about?
Jessica Kerr: Hmm, I'm happy this conversation has gotten me excited about what we can do with these agents and how they can fit in as helpers, because they're never going to replace your SREs, but they can make your SREs even more effective. I love that part about building trust, and chaos engineering becomes the evals for the SRE agent. Yeah, totally. And I think chaos engineering is wonderful, and I think incident reviews and learning from incidents are wonderful, and they're wonderful for humans. If people will invest in them because they're also good for agents, then both the agents and the people will get the benefit.
Benjamin Wilms: That’s true. Absolutely, and, but therefore,
Jessica Kerr: That’s true for observability too.
Benjamin Wilms: yeah, but I mean, therefore there needs to be a shift in this mindset. There needs to be a change in the practices, how people are doing it. I mean, so many times in so many conversations. I’m still identifying this strategy: hope. We are, just hope everything will be all right. But hope is not a strategy and your system…
Jessica Kerr: Hope is a cop-out.
Benjamin Wilms: Yeah, and your system, especially your sociotechnical system, will not react as you expected in that moment.
Jessica Kerr: Yeah, the agent’s not gonna get better with thoughts and prayers, but it will get better with better context.
Benjamin Wilms: Yes, over time. But for that, even an agent needs to be living in a system that tolerates faults, that is able to handle failures, and that doesn't treat them as something bad. That's the mindset shift I'm talking about: if a fault or a failure is a learning opportunity, we will react with, hey, we can identify something new that we were not expecting at that point in time. That's a big change in mindset.
Jessica Kerr: Yeah, yeah. And then we can fix it at lots of layers.
Benjamin Wilms: Yeah.
Jessica Kerr: You could be like, okay, there’s a bug in the code. You could be like, okay, somebody put in some bad admin data. Let’s make that not possible.
Benjamin Wilms: Correct.
Jessica Kerr: Or you could be like, hey, agent, next time this happens, just restart it and we'll deal with it the next day, or roll back the configuration and we'll deal with it the next day, or whatever it is. Yeah, it gives us another layer at which to fix or mitigate problems.
Benjamin Wilms: Yeah. That’s cool.
Jessica Kerr: And that gets back to how we don't yet know how to work in these systems. And the fun part is, some of the stuff we learned about software but haven't fully adopted, like chaos engineering, like test-driven development: if you're writing durable code with agents, test-driven development helps. And you can get them to write the tests, so it's a lot less work, and you can get them to make the test output really pretty, and you can refactor the tests, and you can write generators to create the data, which I love to do but was super time-consuming.
But now I can shave all those yaks, because all of that code is disposable code and I don't have to look too hard at it. I can just let the agent do it, and if it works, it works. So we have these new tools to use practices that were always useful but were not widespread, and are now even more important.
Benjamin Wilms: Yeah.
Jessica Kerr: Yeah, there's a lot of growing to do.
Benjamin Wilms: Absolutely. I was just smiling a bit, because with test-driven development, if I now assume the agent is creating the code and it's also writing the test cases, wasn't there a rule that you shouldn't do that? Like even with the old way…
Jessica Kerr: I used to write the tests and write the code. The trick is that you look at them, but I also require seeing a test fail before I will believe it when it passes.
Benjamin Wilms: That’s true. Yeah.
Jessica Kerr: And there are things like approval tests. Approval tests are just: here's what the HTML was before, here's what it is now; is it the same?
Benjamin Wilms: Mm-hmm.
Jessica Kerr: I really like those because it shows me exactly what changed in the HTML.
Benjamin Wilms: Yeah.
Jessica Kerr: And it's not that the agent wrote a sneaky test; it's a straight comparison.
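An approval test in this sense can be a few lines: a straight comparison of the current output against a stored, approved snapshot, with a readable diff when they differ. In this sketch the approved snapshot lives in a string rather than a file, and the HTML and the `render_nav` function are invented for illustration.

```python
import difflib

# Approval-test sketch: compare current output against a previously
# approved snapshot, and produce a readable diff when they differ.
# (A real version would read/write the approved snapshot from a file.)

APPROVED = "<ul>\n<li>Home</li>\n<li>Docs</li>\n</ul>\n"

def render_nav() -> str:
    # Stand-in for whatever produces the HTML under test.
    return "<ul>\n<li>Home</li>\n<li>Docs</li>\n<li>Blog</li>\n</ul>\n"

def approve(current: str, approved: str) -> tuple:
    if current == approved:
        return True, ""
    diff = "\n".join(difflib.unified_diff(
        approved.splitlines(), current.splitlines(),
        fromfile="approved", tofile="current", lineterm=""))
    return False, diff

ok, diff = approve(render_nav(), APPROVED)
```

When the change is intentional, "approving" it just means replacing the stored snapshot with the new output, which is exactly the show-me-what-changed workflow being described.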
Benjamin Wilms: Okay, good.
Jessica Kerr: So there are considerations, but absolutely, I have the agent write the tests and the code. I just look at it.
Benjamin Wilms: That's good. And yeah, thank you very much for joining me. What is the best way to reach out to you, if someone wants to get to know you or would like to get in contact with you?
Jessica Kerr: Jessitron.com, and I have a newsletter if you want to hear from me less than once a month. And if you want to chat, especially about observability and/or OpenTelemetry or whatever you're trying to get working, honeycomb.io/office-hours has a Calendly, and I get to have half-hour meetings with random people and find out what they're thinking about, and it's super fun.
Benjamin Wilms: It is, absolutely. Okay then, thank you very much for joining me in this session. It was my pleasure to talk with you, and I hope to see you soon, maybe at the next conference.
Jessica Kerr: Thank you.
Benjamin Wilms: Alright, thank you very much.
Jessica Kerr: Ah, this was great.