Advanced Multi-Agent DevOps Agents: Improving Autonomy, Model Control, and Tool Coverage
Overview: This thesis aims to improve Knowit's existing multi-agent DevOps automation framework by enhancing its autonomy, accuracy, and tool coverage. The current system, developed in a prior master's thesis project at Knowit (available in KTH DiVA; see the link at the bottom), interprets natural language input, plans actions using ReAct-style reasoning, generates DevOps artifacts (e.g., GitHub Actions workflows, Dockerfiles), and executes shell-level commands while leaving the user in full control. However, it lacks long-term memory and retrieval-augmented generation (RAG), and it does not support additional tooling such as Jenkins or Terraform. This project will develop and evaluate the next version of the system, integrating domain-specialized AI, session memory, and support for additional tools to create a more intelligent and reliable DevOps assistant.
Description
Key Components
1. Fine-Tuning or Retrieval-Augmented Generation (RAG):
- Use domain-specific datasets (e.g., GitHub Actions, Jenkins docs, Terraform examples) to improve generation accuracy.
- Evaluate trade-offs between fine-tuned models and a RAG pipeline.
2. Memory-Enhanced Agent Framework:
- Integrate a persistent memory layer (e.g., Redis or SQLite) to store task history, user corrections, and tool context.
- Enable reasoning across multiple sessions and user feedback cycles.
3. Tool Coverage Expansion:
- Extend automation logic to support Jenkins pipelines, Terraform scripts, and Kubernetes manifests.
- Ensure agent orchestration works consistently across tools.
4. Improved User Interaction:
- Maintain and optionally extend the user-in-the-loop interface.
- Let users review, edit, approve, or reject AI-generated steps.
5. Evaluation Suite:
- Benchmark the new system on multiple metrics (e.g., automation success and user trust) and compare the results with the previous version.
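To make the persistent memory layer in component 2 more concrete, the following is a minimal sketch using SQLite from the Python standard library. It is not the existing system's design: the class name `SessionMemory`, the `events` table, and its columns are hypothetical names chosen for illustration, and a Redis-backed variant would look different.

```python
import json
import sqlite3
import time


class SessionMemory:
    """Hypothetical store for task history, user corrections, and tool context."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to persist across sessions.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, "
            "kind TEXT NOT NULL, "      # assumed kinds: 'task', 'correction'
            "tool TEXT, "               # e.g. 'docker', 'terraform'
            "payload TEXT NOT NULL, "   # JSON-encoded details
            "created_at REAL NOT NULL)"
        )
        self.conn.commit()

    def record(self, session_id, kind, payload, tool=None):
        """Append one event to the history."""
        self.conn.execute(
            "INSERT INTO events (session_id, kind, tool, payload, created_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (session_id, kind, tool, json.dumps(payload), time.time()),
        )
        self.conn.commit()

    def recall(self, tool=None, kind=None, limit=5):
        """Return recent events from all sessions, newest first."""
        query = "SELECT session_id, kind, tool, payload FROM events"
        clauses, params = [], []
        if tool is not None:
            clauses.append("tool = ?")
            params.append(tool)
        if kind is not None:
            clauses.append("kind = ?")
            params.append(kind)
        if clauses:
            query += " WHERE " + " AND ".join(clauses)
        query += " ORDER BY id DESC LIMIT ?"
        params.append(limit)
        rows = self.conn.execute(query, params).fetchall()
        return [
            {"session_id": s, "kind": k, "tool": t, "payload": json.loads(p)}
            for s, k, t, p in rows
        ]
```

In such a design, an agent could call `recall(tool="docker")` before generating a new Dockerfile and include earlier user corrections in its prompt, which is one way to reason across sessions.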
Challenges
- Avoiding overfitting or irrelevant generations during fine-tuning.
- Ensuring reliable document retrieval and context relevance in RAG.
- Handling multi-step, tool-spanning workflows.
- Balancing automation with human control in critical tasks.
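One way to picture the balance between automation and human control is an approval gate in front of every executed step. The sketch below is an assumption-laden illustration, not the current interface: `Step`, `Review`, `Decision`, and `run_with_review` are hypothetical names, and stopping at the first rejected step is just one possible policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List, Optional


class Decision(Enum):
    APPROVE = "approve"
    EDIT = "edit"
    REJECT = "reject"


@dataclass
class Step:
    tool: str      # e.g. "docker", "terraform"
    command: str   # shell-level command proposed by the agent


@dataclass
class Review:
    decision: Decision
    edited_command: Optional[str] = None  # required when decision is EDIT


def run_with_review(
    steps: Iterable[Step],
    reviewer: Callable[[Step], Review],
    execute: Callable[[Step], None],
) -> List[Step]:
    """Execute only approved or edited steps; stop at the first rejection."""
    executed: List[Step] = []
    for step in steps:
        review = reviewer(step)
        if review.decision is Decision.REJECT:
            # Assumed policy: later steps may depend on this one, so halt.
            break
        if review.decision is Decision.EDIT:
            if not review.edited_command:
                raise ValueError("EDIT requires an edited_command")
            step = Step(step.tool, review.edited_command)
        execute(step)
        executed.append(step)
    return executed
```

Here `reviewer` stands in for the user-facing prompt and `execute` for the shell runner, so the same loop can be exercised in tests with stub functions before it is wired to real tools.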
Impact
An enhanced, domain-aware, multi-agent DevOps system that:
- Produces more reliable and context-aware DevOps artifacts
- Works with more tools and platforms
- Remembers task history and prior decisions
- Shows measurable improvements over the original architecture in both automation success and user trust
About Knowit Sweden
Curious to learn more about Knowit? Visit www.knowit.se