Notes on Autonomous Development – Prevent AI Reward Hacking in GitLab CI/CD

“When a measure becomes a target, it ceases to be a good measure.” – Goodhart’s Law

I frequently run autonomous agents in the background to handle development tasks. I use a Trunk-Based Development model where the agents can merge into the trunk via auto-merge when the CI/CD pipeline passes. However, I have observed agents attempting to “reward hack”: skipping tests or bypassing checks to satisfy their goal of completing a task.

To protect the integrity of my CI/CD pipeline, I have implemented the following guardrails.

Enforcing “Pipeline Must Succeed”

This is a baseline requirement, but it is insufficient on its own. There is nothing stopping an agent from removing or editing a test suite to ensure the pipeline passes, thereby triggering an undeserved merge.

Utilizing CODEOWNERS

The CODEOWNERS file assigns ownership to specific files or directories. By combining this with a protected branch that requires Code Owner approval, you can ensure that any change to a critical file needs manual approval from a human owner.

It is vital to include self-protection rules. You must prevent the agent from modifying the CODEOWNERS file itself, as well as any related CI configuration files. If an agent can edit your CI YAML, it can simply “silence” the test steps.

Inside your CODEOWNERS file, add:
CODEOWNERS @your-username
very_important_tests.py @your-username
.gitlab-ci.yml @your-username
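# Include any other nested CI includes or scripts.
# A trailing slash covers every file under the directory, at any depth;
# a trailing /* would only cover files directly inside it.
/ci/scripts/ @your-username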
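
The self-protection rule is easy to lose in a later edit, so it can be worth verifying. The sketch below is a hypothetical Python check (the names `parse_codeowners`, `missing_protections`, and `REQUIRED_PATTERNS` are mine, not GitLab's) that reads the CODEOWNERS text and reports required patterns that have no owner. It compares patterns as literal strings and does not implement GitLab's path-matching rules, so treat it as a sanity check rather than a full parser.

```python
# Assumption: the minimum set of entries that must always have an owner.
REQUIRED_PATTERNS = frozenset({"CODEOWNERS", ".gitlab-ci.yml"})


def parse_codeowners(text: str) -> dict[str, list[str]]:
    """Map each pattern to its owners.

    Skips blank lines, comments, and section headers such as
    "[Section]" or "^[Optional section]". Patterns containing
    escaped spaces are not handled in this sketch.
    """
    entries: dict[str, list[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "[", "^[")):
            continue
        pattern, *owners = line.split()
        entries[pattern] = owners
    return entries


def missing_protections(text: str, required=REQUIRED_PATTERNS) -> set[str]:
    """Return required patterns that are absent or have no owner listed."""
    entries = parse_codeowners(text)
    return {pattern for pattern in required if not entries.get(pattern)}
```

An empty set means every required pattern has at least one owner; anything else names the entries that need attention.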

Using Pipeline Execution Policies

If we look at the industries with the highest stakes for software engineering (e.g. aviation and space), external verification becomes very important. Relying on a single repository to guard itself starts to feel like a loop that is too easy to break: it is either insufficient or simply inefficient to manage.

GitLab’s Pipeline Execution Policies allow teams to enforce mandatory, immutable CI/CD jobs across specific projects. These policies ensure that critical validation gates cannot be bypassed or modified by an autonomous agent, as the configuration lives outside the agent’s reach.

Furthermore, pipeline execution policy jobs can be assigned to one of two reserved stages:

  • .pipeline-policy-pre: Runs at the very beginning of the pipeline (before the .pre stage). This is ideal for security scans or IaC (Infrastructure as Code) validation to prevent unwanted code from executing.
  • .pipeline-policy-post: Runs at the very end (after the .post stage). This is the place for integration tests, checks that test coverage levels are maintained, and guards against “spec drift.”
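
As one example of what a job in `.pipeline-policy-post` might run, here is a minimal Python sketch of a coverage gate. It assumes the test job produces a Cobertura-style XML report (the format written by `coverage xml`), whose root element carries a `line-rate` attribute, and that a baseline rate is stored somewhere the agent cannot modify. The function names, the baseline, and the tolerance are hypothetical.

```python
import xml.etree.ElementTree as ET


def line_rate(cobertura_xml: str) -> float:
    """Read the overall line coverage (0.0 to 1.0) from a Cobertura report."""
    root = ET.fromstring(cobertura_xml)
    return float(root.attrib["line-rate"])


def coverage_regressed(
    cobertura_xml: str, baseline_rate: float, tolerance: float = 0.005
) -> bool:
    """Return True if coverage fell below the baseline by more than `tolerance`.

    `baseline_rate` is assumed to come from outside the repository,
    for example from the project that holds the policy configuration.
    """
    return line_rate(cobertura_xml) < baseline_rate - tolerance
```

A policy job could exit with a non-zero status when `coverage_regressed` returns True, which fails the pipeline and blocks the auto-merge. The small tolerance is a design choice to avoid failing on rounding noise; set it to zero for a strict gate.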

Other Mechanisms and Conclusion

There are several other tools to enhance CI/CD verification that are worth exploring:

  • External Status Checks: Requiring a “green light” from an external service.
  • Webhooks: Triggering secondary validation layers.
  • Scan Result Policies (renamed Merge Request Approval Policies in newer GitLab versions): Blocking merges if new vulnerabilities are detected.
  • Push Rules: Rejecting pushes that touch prohibited files or violate naming conventions.

Software development is evolving, and our CI/CD practices must evolve with it. We have moved from simple “build and test” routines to a world where we are governing autonomous intelligence. It is a challenging, yet incredibly exciting time to be a software engineer.
