Advanced Multi-Agent DevOps Agents: Improving Autonomy, Model Control, and Tool Coverage
Overview: This thesis aims to improve Knowit's existing multi-agent DevOps automation framework by enhancing its autonomy, accuracy, and tool coverage. The current system, developed in a prior master's thesis project at Knowit (KTH DiVA), interprets natural language input, plans actions using ReAct-style reasoning, generates DevOps artifacts (e.g., GitHub Actions workflows, Dockerfiles), and executes shell-level commands while giving the user full control. However, it lacks long-term memory and retrieval-augmented generation (RAG), and it does not support additional tooling such as Jenkins or Terraform. This project will develop and evaluate the next version of the system, integrating domain-specialized models (via fine-tuning or RAG), persistent memory across sessions, and support for additional tools to create a more intelligent and reliable DevOps assistant.
Description
Key Components
1. Fine-Tuning or Retrieval-Augmented Generation (RAG):
- Use domain-specific datasets (e.g., GitHub Actions, Jenkins docs, Terraform examples) to improve generation accuracy.
- Evaluate trade-offs between fine-tuned models and a RAG pipeline.
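
To make the RAG option concrete, the retrieval step can be sketched as below. This is a minimal illustration, not the framework's implementation: it ranks documents by simple word overlap as a stand-in for embedding-based search, and the corpus contents and the names `DOCS`, `retrieve`, and `build_prompt` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical in-memory corpus; a real pipeline would index tool documentation.
DOCS = {
    "github_actions": "GitHub Actions workflow files live in .github/workflows and define jobs and steps",
    "jenkins": "A Jenkins pipeline is declared in a Jenkinsfile with stages and steps",
    "terraform": "Terraform configuration declares providers and resources in .tf files",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, docs=DOCS, k=1):
    """Return the ids of the k documents sharing the most tokens with the query."""
    query_counts = Counter(tokenize(query))
    scores = {}
    for doc_id, text in docs.items():
        doc_counts = Counter(tokenize(text))
        scores[doc_id] = sum(min(query_counts[t], doc_counts[t]) for t in query_counts)
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ranked[:k]


def build_prompt(query, docs=DOCS, k=1):
    """Prepend the retrieved context to the task before it is sent to a model."""
    context = "\n".join(docs[doc_id] for doc_id in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nTask: {query}"
```

A fine-tuned model would skip the retrieval step and rely on knowledge stored in its weights, which is the trade-off the evaluation would need to measure.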
2. Memory-Enhanced Agent Framework:
- Integrate a persistent memory layer (e.g., Redis or SQLite) to store task history, user corrections, and tool context.
- Enable reasoning across multiple sessions and user feedback cycles.
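
A persistent memory layer of this kind could look roughly like the following sketch, which assumes SQLite as the store. The `TaskMemory` class, the `task_history` table, and its columns are hypothetical names chosen for illustration.

```python
import sqlite3


class TaskMemory:
    """Stores task records so that a later session can recall them."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to persist across sessions.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS task_history ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "tool TEXT NOT NULL, "
            "request TEXT NOT NULL, "
            "artifact TEXT NOT NULL, "
            "user_correction TEXT)"
        )

    def record(self, tool, request, artifact, user_correction=None):
        """Save one completed task, with the user's correction if there was one."""
        with self.conn:  # commits on success, rolls back on error
            cursor = self.conn.execute(
                "INSERT INTO task_history (tool, request, artifact, user_correction) "
                "VALUES (?, ?, ?, ?)",
                (tool, request, artifact, user_correction),
            )
        return cursor.lastrowid

    def recall(self, tool, limit=5):
        """Return the most recent tasks for a tool, newest first."""
        rows = self.conn.execute(
            "SELECT request, artifact, user_correction FROM task_history "
            "WHERE tool = ? ORDER BY id DESC LIMIT ?",
            (tool, limit),
        ).fetchall()
        return [
            {"request": request, "artifact": artifact, "user_correction": correction}
            for request, artifact, correction in rows
        ]
```

Recalled corrections could then be added to the agent's prompt so that feedback given in one session influences the next.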
3. Tool Coverage Expansion:
- Extend automation logic to support Jenkins pipelines, Terraform scripts, and Kubernetes manifests.
- Ensure agent orchestration works consistently across tools.
4. Improved User Interaction:
- Maintain and optionally extend the user-in-the-loop interface.
- Let users review, edit, approve, or reject AI-generated steps.
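
The review step described above can be modelled as a small gate between planning and execution. The sketch below is one possible shape under that assumption; `Decision`, `ProposedStep`, and `review` are hypothetical names, not the existing interface.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    EDIT = "edit"
    REJECT = "reject"


@dataclass
class ProposedStep:
    description: str
    command: str


def review(step, decision, edited_command=None):
    """Return the command to execute, or None if the user rejected the step."""
    if decision is Decision.REJECT:
        return None
    if decision is Decision.EDIT:
        if not edited_command:
            raise ValueError("an edit decision requires a replacement command")
        return edited_command
    return step.command
```

Keeping the gate as a pure function makes it easy to test and to log every decision, which also feeds the memory layer with user corrections.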
5. Evaluation Suite:
- Benchmark the new system on multiple metrics (e.g., automation success and user trust) and compare the results with the previous version.
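
As one example of such a comparison, automation success can be computed as the fraction of benchmark tasks each version completes. The sketch below assumes each task outcome is recorded as a boolean; `success_rate` and `compare` are hypothetical helper names.

```python
def success_rate(outcomes):
    """Fraction of tasks that succeeded, given one boolean per task."""
    if not outcomes:
        raise ValueError("at least one task outcome is required")
    return sum(1 for succeeded in outcomes if succeeded) / len(outcomes)


def compare(baseline_outcomes, candidate_outcomes):
    """Summarise the success rates of the previous and the new version."""
    baseline = success_rate(baseline_outcomes)
    candidate = success_rate(candidate_outcomes)
    return {"baseline": baseline, "candidate": candidate, "delta": candidate - baseline}
```

Running both versions on the same task set keeps the comparison fair; a metric such as user trust would need a separate instrument, for example a questionnaire.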
Challenges
- Avoiding overfitting or irrelevant generations during fine-tuning.
- Ensuring reliable document retrieval and context relevance in RAG.
- Handling multi-step, tool-spanning workflows.
- Balancing automation with human control in critical tasks.
Impact
An enhanced, domain-aware, multi-agent DevOps system that:
- Produces more reliable and context-aware DevOps artifacts
- Works with more tools and platforms
- Remembers task history and prior decisions
- Shows measurable improvements over the original architecture in both automation success and user trust