Boost your GitOps practices by integrating Chaos Engineering with Steadybit



Learn how to integrate Chaos Engineering into your GitOps practices using Steadybit. In this blog post, we briefly cover what GitOps is and where it benefits from Chaos Engineering. Finally, we integrate Chaos Engineering hands-on using the Steadybit CLI and a GitHub Actions workflow.

What is GitOps?


In the past, infrastructure consisted of rigid components maintained manually by highly specialized teams. With the rise of microservices, distributed systems, and especially cloud computing, demand has shifted toward infrastructure that scales dynamically and supports frequent deployments. GitOps emerged by applying concepts from software development to infrastructure. It consists of three concepts:

  • Infrastructure as Code (IaC): With IaC, all infrastructure configurations are stored as code in a version control system like a Git repository. Thus, you always have one single source of truth for your infrastructure that can easily be applied repeatedly to generate a new environment.

  • Pull Requests (PRs): To update infrastructure components, teams create PRs and collaborate via reviews, comments, and approval processes. These PRs also serve as a form of audit log.

  • Continuous Integration / Continuous Deployment (CI/CD): Using a CI/CD pipeline, GitOps automates infrastructure updates whenever a PR is merged. Any configuration drift, such as a manual change, is overwritten to prevent diverging environments and ensure consistent infrastructure.

So, just as in software development, where the same application source code produces the same binaries every time it is built, GitOps produces the same infrastructure environment every time it is deployed.
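To make the three concepts more tangible, here is a minimal sketch of a GitHub Actions workflow that applies the infrastructure code whenever a PR is merged, and re-applies it on a schedule so that manual drift is overwritten. The workflow name, the branch name main, the hourly schedule, and the script ./scripts/apply.sh are all assumptions for illustration; substitute the apply step of whatever IaC tool you use.

```yaml
# Hypothetical file: .github/workflows/apply-infrastructure.yml
name: Apply infrastructure
on:
  push:
    branches: [main]        # a merged PR arrives here as a push (branch name assumed)
  schedule:
    - cron: "0 * * * *"     # assumed hourly re-apply, so manual drift is overwritten
jobs:
  apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Apply the declared state
        # Placeholder command: replace with your IaC tool's apply step.
        run: ./scripts/apply.sh
```

Because the repository is the single source of truth, running this workflow twice against the same commit should leave the environment unchanged.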

Uniting Chaos Engineering with GitOps

To improve the reliability of your infrastructure and verify it continuously, we can now integrate Chaos Engineering with GitOps. This is especially useful when adding new critical components to your infrastructure. For instance, you could verify that a newly added message broker is highly available in case of an availability zone outage or that your downstream applications can cope with the short unavailability of the broker. Another example is to check that applications using an in-memory cache like Redis don't have availability issues when the in-memory cache is gone.

Given that infrastructure changes only happen via PRs, a PR is the natural point at which to run a set of Chaos Engineering experiments. As with GitOps, we benefit from automation by using a CI/CD pipeline, and we can even keep the experiments themselves under version control. Let's see how this works with Steadybit and GitHub Actions.

Integrating Chaos Engineering with Steadybit into your CI/CD pipeline

To integrate Steadybit into your CI/CD pipeline, we use the Steadybit Command Line Interface (CLI) and pick GitHub Actions as the CI/CD tool. The steps are similar for other CI/CD tools, e.g., Jenkins, TeamCity, Bamboo, or CircleCI.

1. Create a new GitHub Action workflow

First, we have to create a new GitHub Actions workflow. To do so, create a new file named run-experiments.yml in the .github/workflows directory with the following content:

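name: Run Chaos Engineering experiments
on:
  workflow_dispatch:
  pull_request:
    types: [opened, reopened]
jobs:
  experiments:  # job id is an assumption
    runs-on: ubuntu-latest
    name: Run Experiments
    steps:
      - uses: actions/checkout@v4  # version tag is an assumption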

This creates a workflow with just one job, which checks out the current Git repository. The workflow will always run when a PR is (re)opened or someone triggers the workflow manually.
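If the repository contains more than infrastructure code, the trigger can be narrowed so that experiments only run for PRs that touch the relevant files. The sketch below uses the paths filter of the pull_request event; the directory names infrastructure/ and experiments/ are hypothetical and need to match your repository layout.

```yaml
on:
  workflow_dispatch:
  pull_request:
    types: [opened, reopened]
    paths:
      - "infrastructure/**"   # hypothetical directory holding the IaC files
      - "experiments/**"      # hypothetical directory holding versioned experiments
```

Note that with types limited to opened and reopened, pushing further commits to an open PR does not start the workflow again; the manual trigger covers that case.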

2. Install Steadybit CLI in the pipeline

The next step is to install the Steadybit CLI in the workflow. The Steadybit CLI is publicly available as an npm package and can be installed via npm install -g steadybit.

      - name: Install Steadybit CLI
        run: npm install -g steadybit
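The install step relies on whatever Node.js version the runner image happens to ship. If you prefer a reproducible setup, one option is to pin Node.js with the actions/setup-node action before installing the CLI. The version number below is an assumption, as this post does not state which Node.js versions the Steadybit CLI requires.

```yaml
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20   # assumed version; pick one the Steadybit CLI supports

      - name: Install Steadybit CLI
        run: npm install -g steadybit
```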

3. Configure a Steadybit CLI profile

Now we need to configure the Steadybit CLI to connect to your Steadybit tenant and team. To do so, we create a new team API access token in the Steadybit platform (see Steadybit docs: How to add a new API access token) and, for security reasons, store it as an encrypted secret in GitHub (see GitHub Docs: How to create an encrypted secret).

Before adding the Steadybit CLI profile, we add a mask to prevent the access token from being printed accidentally in log files:

      - name: Add Mask
        run: echo "::add-mask::${{ secrets.STEADYBIT_API_ACCESS_TOKEN }}"

      - name: Add Steadybit Profile with a Team Token
        run: steadybit config profile add --name "CI/CD" --token "${{ secrets.STEADYBIT_API_ACCESS_TOKEN }}"
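As a variation, the token can be handed to the step through an environment variable rather than being interpolated directly into the command line, which keeps the secret out of the generated shell script. This is only a sketch: the variable name STEADYBIT_TOKEN is a hypothetical choice, and the CLI command is the same one used in the step above.

```yaml
      - name: Add Steadybit Profile with a Team Token
        env:
          # Hypothetical variable name; the secret name matches the one used above.
          STEADYBIT_TOKEN: ${{ secrets.STEADYBIT_API_ACCESS_TOKEN }}
        run: steadybit config profile add --name "CI/CD" --token "$STEADYBIT_TOKEN"
```

The shell expands $STEADYBIT_TOKEN at runtime, so the workflow expression is evaluated only in the env block.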

4. Run experiments via Steadybit CLI

Now we can run an experiment. We can use an existing one from the platform and reference it via the experiment key or run an experiment stored in a .yml file. We use both approaches as the former has the advantage of always running the latest experiment version and the latter benefits from versioned experiments next to your application.

      - name: Run experiment via Experiment Key
        run: steadybit experiment run -k GITHUB-1 --recursive

      - name: Run versioned experiments

That is already everything we need! You can find the full workflow in our demo online shop example. The workflow automatically verifies the infrastructure changes whenever we create a new pull request.


Boosting your GitOps practices by integrating Chaos Engineering is especially useful when a high-availability infrastructure is required. By using the Steadybit CLI, this is pretty straightforward. You could even use the Steadybit CLI to version your experiments whenever there is a change; this will be covered in a future blog post.

Start integrating Chaos Engineering into your GitOps practices today by signing up for a free Steadybit trial.