Start Experimenting

Create and run experiments that provide real insights on your systems

Put your systems to the test and improve your operational readiness with a wide variety of experiments. Identify all of your reliability gaps and system limitations.

  • Simulate network-level outages, latency, and traffic issues
  • Run actions to stress resources like CPU, disk space, and memory
  • Change instance states and test database failover processes
  • Inject application-level faults with delays and method exceptions

See Recommendations

Our Advice feature provides you with a list of recommended experiments

Start with Experiment Templates

Create new experiments fast by selecting from over 50 pre-built templates

Create Custom Actions

Build experiments from scratch and add your own custom faults

Get started fast with Advice and recommended experiments

Our built-in Advice feature automatically detects if you have any common reliability issues in your Kubernetes clusters. You will also see recommendations for which experiments to run, based on potential issues that would benefit from validation.

For each recommendation, just click “Create Experiment” and you’ll see the full step-by-step experiment already built in the editor.

You can also customize Advice to check for your own requirements.

Use experiment templates for popular scenarios

You don’t have to start from scratch. When you create a new experiment in Steadybit, you can pick from a library of 50+ experiment templates to import all the steps you need. Make final adjustments to fit your specific use case and then you’re ready to run.

You can also save an experiment as a custom template and reuse it across your organization’s applications and teams.

Build with 100+ no-code actions in a timeline-based editor

Designing experiments should be easy. With a library of drag-and-drop actions at your fingertips, you can build out the exact experiment you want in minutes.

For example, you can select an action like “Inject Latency into AWS Lambdas” and adjust parameters like duration, rate, minimum latency, and maximum latency.
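To make those four parameters concrete, here is a minimal sketch of how they could interact. It is an illustration only, not Steadybit's implementation: the `LatencyFault` class, its field names, and the per-call logic are all hypothetical, and it assumes that "rate" means the percentage of calls that get delayed.

```python
import random
from dataclasses import dataclass


@dataclass
class LatencyFault:
    """Hypothetical model of a latency fault (not a Steadybit type)."""
    duration_s: float      # how long the fault stays active
    rate_percent: float    # share of calls that are delayed (assumed meaning)
    min_latency_ms: int    # lower bound of the injected delay
    max_latency_ms: int    # upper bound of the injected delay

    def delay_for_call(self, elapsed_s, rng):
        """Return the extra delay in milliseconds for a single call."""
        # Once the configured duration has passed, the fault is over.
        if elapsed_s >= self.duration_s:
            return 0
        # Only the configured share of calls is affected.
        if rng.random() * 100 >= self.rate_percent:
            return 0
        # Affected calls get a delay between the two bounds (inclusive).
        return rng.randint(self.min_latency_ms, self.max_latency_ms)


rng = random.Random(7)
fault = LatencyFault(duration_s=60, rate_percent=50,
                     min_latency_ms=200, max_latency_ms=500)
delays = [fault.delay_for_call(elapsed_s=10, rng=rng) for _ in range(10)]
print(delays)  # a mix of zeros and values between 200 and 500
```

Thinking about a fault this way helps when you choose values: duration bounds how long the fault lasts, rate bounds how many calls are affected, and the latency range bounds how severe each delay is.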

With our timeline-based editor, you have full control of each step of your experiment.

Add your own custom faults and checks with ActionKit

Missing an action that would take your experiment to the next level? You can add your own custom actions to Steadybit using our language-agnostic ActionKit.

Define targets with granular precision and set a safe blast radius

Follow the principle of starting small, then expanding. It’s easy to select the exact targets you want an action to impact and configure a blast radius. For example, you could specify that a given action should affect only 10% of the pods in a cluster.
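The idea behind a percentage-based blast radius can be sketched in a few lines. This is a hypothetical illustration, not how Steadybit selects targets: the `select_targets` function, the pod names, and the choice to round up are all assumptions made for the example.

```python
import math
import random


def select_targets(targets, percentage, seed=None):
    """Pick a random subset covering `percentage` percent of the targets."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if not targets or percentage == 0:
        return []
    # Assumption: round up, so any non-zero percentage hits at least one target.
    count = math.ceil(len(targets) * percentage / 100)
    # A seeded generator makes the selection reproducible between runs.
    rng = random.Random(seed)
    return rng.sample(list(targets), count)


pods = [f"pod-{i}" for i in range(20)]
affected = select_targets(pods, percentage=10, seed=42)
print(len(affected))  # 2 of the 20 pods
```

Starting with a small percentage and raising it only after the system behaves as expected is what "start small, then expand" looks like in practice.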

If you need it, there is always an emergency stop button close by to stop all running experiments and prevent new ones from starting.

Explore chaos experiment examples in Steadybit

If you want to dive deeper, these video tutorials show common experiments designed in the editor and ready to run.

  • Simulating an Availability Zone Outage
  • Stopping or Rebooting EC2 Instances
  • Rebooting RDS Database Instances
  • Limiting Egress Network Bandwidth
  • Testing Behavior by Filling Disks
  • Injecting Latency into Java-Based Apps
  • Testing High 3rd Party Service Latency
  • Validating Monitoring Alerts

Watch experiments run in real time and validate monitoring alerts

When you start an experiment, you can watch it run in real time as each step is executed and review a summary of your system’s behavior. If your target is a Kubernetes cluster, for example, you’ll see the Kubernetes event log, which shows each change and the results of health checks.

You can also check whether your observability tool raises an alert when expected. Just install the relevant extension and view these real-time events in Steadybit.

Schedule experiments or automate tests with the Steadybit API and CLI

You can run experiments manually, on a schedule, or with automation. Many teams incorporate Steadybit experiments into their CI/CD workflows so they can rerun them continually and confirm that new deployments meet a defined reliability standard.

With the Steadybit API and CLI, it’s easy to incorporate experiments into your development lifecycle on your terms.

Browse Actions and Templates in the Reliability Hub

See what types of actions, targets, and templates are waiting for you and your team in our open source library.