Stop Sharing Context: How to Let Grafana Assistant Pre-Study Your Infrastructure for Faster Fixes

Introduction

When an unexpected alert fires, most engineers instinctively turn to their AI assistant for help. But without pre-loaded knowledge, the assistant requires extensive context sharing—what data sources are connected, which services are running, how they depend on each other. Every conversation starts from scratch, eating into valuable troubleshooting time. Grafana Assistant eliminates this friction by automatically building and maintaining a persistent knowledge base of your infrastructure before you ever ask a question. This guide walks you through setting up and leveraging that capability, so you can jump straight into fixing issues instead of wasting minutes explaining your environment.

What You Need

  • A Grafana Cloud stack with administrator access (to enable features and manage data sources).
  • Prometheus, Loki, and Tempo data sources connected to that stack.
  • A few minutes for the initial background scan to complete.

Step-by-Step Guide

  1. Enable Grafana Assistant
    Navigate to Administration → General → Grafana Assistant in your Grafana Cloud stack. Toggle the feature on if it isn't already enabled. This activates the background AI agents that will scan your infrastructure. (In some plans it's on by default; verify the status.)
  2. Connect All Relevant Data Sources
    Ensure your Prometheus, Loki, and Tempo data sources are properly configured in Configuration → Data Sources. The assistant automatically discovers every connected data source in your stack. For maximum context, include all Prometheus instances (metrics), Loki logs, and Tempo traces. No additional configuration or API keys are needed—the assistant uses existing connections.
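If you manage data sources as code rather than through the UI, Grafana's file-based provisioning covers the same step. The sketch below is illustrative only: the names and URLs are placeholders, and in Grafana Cloud the managed Prometheus, Loki, and Tempo data sources are typically pre-provisioned for you.

```yaml
# Illustrative provisioning file, e.g. provisioning/datasources/observability.yaml.
# All names and URLs are placeholders; substitute your stack's endpoints.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: https://prometheus.example.grafana.net/api/prom
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: https://logs.example.grafana.net
  - name: Tempo
    type: tempo
    access: proxy
    url: https://tempo.example.grafana.net
```

Whether provisioned by file or by UI, the assistant picks the sources up the same way, since it reuses the stack's existing connections.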
  3. Let the AI Agents Work in the Background
    After enabling the assistant and confirming your data sources, the system runs a swarm of AI agents that:
    • Identify all Prometheus, Loki, and Tempo data sources in your Grafana Cloud stack.
    • Query Prometheus metrics in parallel to discover services, deployments, and infrastructure components.
    • Correlate logs (Loki) and traces (Tempo) with their corresponding metrics, enriching the knowledge base with log formats, trace structures, and service dependencies.
    • For each discovered service group, generate structured documentation covering: what the service does, its key metrics and labels, how it's deployed, its upstream/downstream dependencies, and relevant health indicators.
      This whole process happens automatically and continuously—no manual triggers required. Expect the first full scan to complete within a few minutes depending on your stack size.
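To make the fan-out idea concrete, here is a minimal Python sketch of the pattern described above: query every data source in parallel and merge the results into one knowledge base. It is not Grafana's actual implementation; the `discover_services` stub stands in for real Prometheus API calls (e.g. against `/api/v1/series`), and the data source names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def discover_services(datasource):
    """Stub for a per-data-source discovery call. A real implementation
    would hit the Prometheus HTTP API and extract service labels."""
    sample = {
        "prometheus-prod": ["checkout", "payment", "orders"],
        "prometheus-staging": ["checkout", "orders"],
    }
    return {datasource: sample.get(datasource, [])}

def build_knowledge_base(datasources):
    """Fan discovery out across data sources in parallel, then merge
    the per-source results into a single knowledge base dict."""
    kb = {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        for result in pool.map(discover_services, datasources):
            kb.update(result)
    return kb

if __name__ == "__main__":
    print(build_knowledge_base(["prometheus-prod", "prometheus-staging"]))
```

Running discovery concurrently is what keeps the first full scan down to minutes even on stacks with many data sources.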
  4. Verify the Knowledge Base
    After a short wait, test what the assistant knows. In the Grafana Assistant chat interface (accessible from the toolbar or directly via URL), ask a simple question like:
    • "What services are running on my infrastructure?"
    • "Which downstream services does my payment system depend on?"
    • "Show me the key latency metrics for the checkout service."
    If the assistant responds with accurate, detailed information without asking for context, the knowledge base is populated and ready for action.
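As a complementary check, you can list the data sources the assistant has access to via Grafana's HTTP API (`GET /api/datasources`). The sketch below assumes a service-account token with the right permissions; the stack URL and token are placeholders.

```python
import json
import urllib.request

def summarize_datasources(payload):
    """Count connected data sources by type (prometheus, loki, tempo, ...)."""
    summary = {}
    for ds in payload:
        summary[ds["type"]] = summary.get(ds["type"], 0) + 1
    return summary

def fetch_datasources(base_url, token):
    """Call Grafana's /api/datasources endpoint with a bearer token."""
    req = urllib.request.Request(
        f"{base_url}/api/datasources",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example usage (fill in your stack URL and a service-account token):
#   payload = fetch_datasources("https://your-stack.grafana.net", "<token>")
#   print(summarize_datasources(payload))
```

If Prometheus, Loki, and Tempo all appear in the summary, the assistant has everything it needs to correlate metrics, logs, and traces.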
  5. Use Assistant for Incident Response
    When an incident occurs, simply ask questions directly—no need to re-explain your setup. For example:
    • "Why is my checkout service slow?" – The assistant already knows its metrics live in a specific Prometheus data source and its logs are structured JSON in Loki. It will correlate metrics, logs, and traces to pinpoint the root cause.
    • "What upstream services could be causing errors?" – It knows dependencies from pre-scanned relations, so even if you're new to the system, you get accurate answers instantly.
    • "Show recent traces for the order service." – Traces are linked to metrics and logs without additional query construction.
    The assistant's pre-built context shaves valuable minutes off mean time to resolution (MTTR), especially for teams where not everyone has full infrastructure knowledge.
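When you want to sanity-check the assistant's answer against the raw data, you can query Prometheus directly. The helper below builds a p99 latency expression and the corresponding `/api/v1/query` URL; the metric name `http_request_duration_seconds_bucket` and the `service` label are assumptions, so match them to your own instrumentation.

```python
import urllib.parse

def latency_query(service, quantile=0.99, window="5m"):
    """Build a histogram_quantile PromQL expression for a service's request
    latency. Metric and label names are assumptions; adjust to your setup."""
    return (
        f"histogram_quantile({quantile}, "
        f"sum(rate(http_request_duration_seconds_bucket"
        f'{{service="{service}"}}[{window}])) by (le))'
    )

def instant_query_url(prom_base, expr):
    """URL for the Prometheus HTTP API's instant-query endpoint."""
    return f"{prom_base}/api/v1/query?" + urllib.parse.urlencode({"query": expr})

# Example:
#   print(instant_query_url("https://prometheus.example.net", latency_query("checkout")))
```

This is the kind of query the assistant constructs for you behind the scenes; having it at hand is useful when you want to verify a finding or drill deeper manually.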

Conclusion

By following these steps, you transform Grafana Assistant from a reactive helper into a proactive partner that already knows your infrastructure inside out. No more context sharing—just faster, smarter troubleshooting.
