Introduction
Monitoring infrastructure and applications is essential for maintaining system health and performance. This guide walks you through setting up a powerful monitoring stack using Docker Compose that combines:
- Prometheus: An open-source monitoring and alerting system that collects and stores time-series metrics data with a powerful query language. Prometheus excels at monitoring containerized environments, microservices, and dynamic infrastructure through its pull-based architecture.
- Grafana: A feature-rich visualization platform that transforms metrics into insightful dashboards with beautiful graphs, charts, and alerts. Grafana connects to various data sources, with Prometheus being one of the most popular.
Together, these tools create a robust monitoring solution that’s easy to deploy and maintain. Prometheus handles the collection and storage of metrics, while Grafana provides the visualization layer that makes those metrics actionable and understandable.
This guide will help you set up a complete monitoring stack capable of tracking system resources, container performance, and application metrics in a containerized environment.
Prerequisites
Before beginning, ensure you have:
- Docker Engine installed and running
- Docker Compose installed
- Minimum 2GB RAM recommended
- Ports 9090, 3000, 9100, and 8080 available
Project Structure
First, create the following directory structure:
monitoring/
├── docker-compose.yml
└── prometheus/
    └── prometheus.yml
Basic Configuration
Create a docker-compose.yml file with the following basic configuration:
version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus:/etc/prometheus
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    ports:
      - "9090:9090"
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    volumes:
      - grafana_data:/var/lib/grafana
    ports:
      - "3000:3000"
    restart: unless-stopped
    depends_on:
      - prometheus

volumes:
  prometheus_data:
  grafana_data:
This Docker Compose configuration creates:
- Prometheus container: Runs the official Prometheus image with configuration files mounted from your local directory. The prometheus_data volume ensures metrics data persists between container restarts. Prometheus listens on port 9090 and automatically restarts if it crashes.
- Grafana container: Deploys the official Grafana image with a persistent volume for storing dashboards, users, and other settings. It exposes port 3000 for web access and depends on the Prometheus container, so Docker starts it only after the Prometheus container has started (note that depends_on waits for the container to start, not for Prometheus to be ready).
- Persistent volumes: Two named volumes (prometheus_data and grafana_data) ensure your monitoring data and configurations survive container restarts or rebuilds.
Adding Exporters
Enhance your monitoring capabilities by adding these exporter services under the services: key of your docker-compose.yml:
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    restart: unless-stopped

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"
    restart: unless-stopped
This configuration adds two important exporters to your monitoring stack:
- Node Exporter: Collects and exposes system metrics from the host machine including CPU usage, memory, disk space, and network statistics. It mounts host system directories as read-only volumes to access hardware and OS metrics. The command parameters specify paths to host filesystems and exclude unnecessary mount points.
- cAdvisor (Container Advisor): Provides container-specific metrics about resource usage and performance. It requires access to Docker’s data directory and system information to gather container statistics. cAdvisor exposes a web interface and API on port 8080 that both humans and Prometheus can use to view container metrics.
Both exporters are configured to restart automatically if they crash, ensuring continuous monitoring of your system and containers.
Adding Alert Manager
Extend your setup with alerting capabilities by adding this service (create an alertmanager/ directory alongside prometheus/ to hold its configuration):
  alertmanager:
    image: prom/alertmanager:latest
    container_name: alertmanager
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager:/etc/alertmanager
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
This configuration adds:
- Alert Manager: Handles alerts sent by Prometheus and routes them to the appropriate receiver channels like email, Slack, or PagerDuty. It deduplicates, groups, and routes alerts based on rules defined in its configuration file. The container mounts a local directory containing alerting rules and listens on port 9093 for incoming alerts and web UI access.
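Alertmanager will not start without a configuration file in the mounted directory, and Prometheus must be told where to send alerts. Here is a minimal sketch of alertmanager/alertmanager.yml, assuming a placeholder Slack webhook URL and channel that you would replace with your own:
route:
  receiver: 'default'
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: 'default'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/XXX/YYY/ZZZ'  # placeholder webhook
        channel: '#alerts'
Then point Prometheus at Alertmanager by adding this block to prometheus.yml:
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']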
Additional Exporters
Consider adding these exporters based on your infrastructure:
  mysql-exporter:
    image: prom/mysqld-exporter:latest
    ports:
      - "9104:9104"

  redis-exporter:
    image: oliver006/redis_exporter:latest
    ports:
      - "9121:9121"
These specialized exporters extend your monitoring capabilities:
- MySQL Exporter: Collects performance and health metrics from MySQL database servers. It exposes metrics like connection counts, query execution time, buffer usage, and more on port 9104. It requires database credentials, typically supplied via environment variables (see the sketch below).
- Redis Exporter: Monitors Redis instances by gathering metrics about memory usage, connections, persistence, replication lag, and command statistics. It makes these metrics available on port 9121 for Prometheus to scrape. Like the MySQL exporter, it needs connection details configured via environment variables.
Adding these exporters allows you to monitor database performance alongside your system and container metrics, giving you comprehensive visibility across your entire stack.
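As a sketch of how those connection details might look (the service names, user, and password here are placeholders; recent mysqld-exporter releases prefer a .my.cnf config file over the DATA_SOURCE_NAME variable):
  mysql-exporter:
    image: prom/mysqld-exporter:latest
    environment:
      - DATA_SOURCE_NAME=exporter:password@(mysql:3306)/  # hypothetical credentials
    ports:
      - "9104:9104"

  redis-exporter:
    image: oliver006/redis_exporter:latest
    environment:
      - REDIS_ADDR=redis://redis:6379  # hypothetical Redis service name
    ports:
      - "9121:9121"
Remember to add matching jobs under scrape_configs in prometheus.yml so these endpoints are actually scraped:
  - job_name: 'mysql-exporter'
    static_configs:
      - targets: ['mysql-exporter:9104']
  - job_name: 'redis-exporter'
    static_configs:
      - targets: ['redis-exporter:9121']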
Configuring Prometheus
Create prometheus/prometheus.yml with the following configuration:
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
Launching the Stack
- Navigate to your project directory:
cd monitoring
- Start the services:
docker-compose up -d
- Verify all containers are running:
docker-compose ps
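Beyond docker-compose ps, a couple of quick endpoint checks can confirm the stack is actually serving metrics (assuming the default ports used above):
# Prometheus liveness endpoint; responds with a short "healthy" message
curl -s http://localhost:9090/-/healthy

# Node Exporter should return plain-text metrics
curl -s http://localhost:9100/metrics | head
You can also open http://localhost:9090/targets in a browser to see the scrape status of every configured job.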
Configuring Grafana
- Access Grafana at http://localhost:3000
- Log in with default credentials:
  - Username: admin
  - Password: admin
- Change the password when prompted
- Add Prometheus data source:
  - Click “Connections” → “Data Sources” → “Add data source”
  - Select “Prometheus”
  - Set URL to http://prometheus:9090
  - Click “Save & Test”
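If you prefer configuration-as-code over clicking through the UI, Grafana can also provision the data source at startup. A minimal sketch, assuming you mount the file below into the grafana service at /etc/grafana/provisioning/datasources/prometheus.yml:
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy            # Grafana's backend proxies queries to Prometheus
    url: http://prometheus:9090
    isDefault: true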
Creating Your First Dashboard

After connecting Prometheus as a data source, you can create your first monitoring dashboard:
- Create a new dashboard:
- Click on the “+” icon in the top navigation
- Select “New Dashboard”
- Click “Add visualization”
- Configure your first panel:
- Select “Prometheus” as the data source
- In the query builder, enter a basic metric like node_cpu_seconds_total
- Add a filter: mode="user"
- Apply the rate function: rate(node_cpu_seconds_total{mode="user"}[1m])
- Click “Run queries” to preview
- Customize the visualization:
- Change the panel title to “CPU Usage”
- Select an appropriate visualization type (try “Time series”)
- Under the “Standard options” tab, set the unit to “Misc” → “Percent (0.0-1.0)”
- Adjust the min/max range if needed
- Add a description for clarity
- Add more panels:
- Return to dashboard view by clicking “Apply”
- Click “Add” → “Visualization” again
- In the query editor, switch from “Builder” to “Code”
- Enter this query, which charts disk usage as a percentage: ((node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes * 100)
- Organize your dashboard:
- Drag panels to rearrange them
- Resize panels by dragging corners
- Add a “Text” panel with instructions or context
- Create rows to group related panels
- Add dashboard variables for dynamic filtering:
- Click the gear icon to open dashboard settings
- Select “Variables” and “Add variable”
- Create a variable named “instance” that queries label_values(node_exporter_build_info, instance)
- Use this variable in your queries: node_cpu_seconds_total{instance="$instance"}
- Save your dashboard:
- Click the save icon in the top right
- Give your dashboard a descriptive name
- Add relevant tags for easier searching
- Click “Save”

This simple dashboard gives you immediate visibility into key system metrics. As you become more comfortable with Grafana, you can create more complex dashboards with additional panels, alerts, and annotations.
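For example, two refinements of the panel queries above are common in practice (the expressions are illustrative; adjust the label filters to your environment):
# Overall CPU utilization (%) per instance, derived from idle time
100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))

# Disk usage (%), excluding pseudo-filesystems such as tmpfs and overlay
(node_filesystem_size_bytes{fstype!~"tmpfs|overlay"} - node_filesystem_free_bytes{fstype!~"tmpfs|overlay"})
  / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"} * 100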
For more advanced monitoring, explore Grafana’s pre-built dashboards by clicking “Import” from the dashboard menu and entering these popular dashboard IDs:
- Node Exporter Full (ID: 1860)
- Docker and System Monitoring (ID: 893)
- Prometheus 2.0 Overview (ID: 3662)
- Container Monitoring (ID: 14282)
Best Practices
Security
- Change default credentials immediately: Default credentials are widely known and the first target for attackers. Update the default admin password for Grafana during initial login and configure application-level authentication for Prometheus using a reverse proxy like Nginx with basic auth.
- Implement authentication for exposed services: Configure authentication for all externally accessible endpoints. For Prometheus, use a reverse proxy with basic authentication or OAuth2. For Grafana, set up LDAP/Active Directory integration or OAuth providers for enterprise environments.
- Use network isolation through Docker networks: Create dedicated Docker networks for your monitoring stack to control container-to-container communication. Only expose the ports that need to be publicly accessible:
networks:
  monitoring:
    driver: bridge

services:
  prometheus:
    networks:
      - monitoring  # only reachable from other services on the monitoring network
- Regularly update container images: Schedule routine updates to patch security vulnerabilities. Use specific version tags instead of ‘latest’ for production deployments to control when upgrades occur:
# Update images and restart containers
docker-compose pull
docker-compose up -d
Performance
- Adjust scrape intervals based on your needs: Default 15-second scrape intervals may be too frequent for some metrics and not frequent enough for others. Tune these based on metric volatility and importance:
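For instance (the intervals here are illustrative), a per-job scrape_interval in prometheus.yml overrides the global default:
global:
  scrape_interval: 15s        # default for all jobs
scrape_configs:
  - job_name: 'cadvisor'
    scrape_interval: 30s      # container-level metrics rarely need 15s resolution
    static_configs:
      - targets: ['cadvisor:8080']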
- Monitor resource usage of the stack: The monitoring system itself needs monitoring. Set resource limits in your Docker Compose file to prevent containers from consuming excessive resources:
services:
  prometheus:
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 1G
Note that Docker Compose v2 applies these deploy limits directly, while classic docker-compose requires the --compatibility flag outside Swarm mode.
- Set appropriate retention periods for metrics: Balance storage requirements with data retention needs. Configure Prometheus’s retention settings:
command:
  - '--config.file=/etc/prometheus/prometheus.yml'
  - '--storage.tsdb.path=/prometheus'
  - '--storage.tsdb.retention.time=15d'
- Use recording rules for complex queries: Pre-compute expensive queries on a schedule (Prometheus evaluates them at the configured evaluation_interval) to improve dashboard performance:
# In prometheus.yml
rule_files:
  - /etc/prometheus/rules/*.yml

# In rules/system_metrics.yml
groups:
  - name: system
    rules:
      - record: node:memory_utilization:percent
        expr: 100 - ((node_memory_MemFree_bytes + node_memory_Cached_bytes + node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100)
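Dashboards and alerts can then reference the pre-computed series by its record name, which is far cheaper than re-evaluating the full expression on every panel refresh:
# Grafana panel query using the recorded series
node:memory_utilization:percent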
Key Takeaways for Prometheus and Grafana
When implementing a Prometheus and Grafana monitoring stack, keep these key takeaways in mind:
Prometheus Strengths
- Pull-Based Architecture: Prometheus actively scrapes metrics from targets on a configurable interval, making it resilient to network issues and providing control over monitoring load.
- Powerful Query Language (PromQL): Allows for complex data analysis and aggregation with a purpose-built query language designed specifically for time-series data.
- Service Discovery: Automatically discovers targets to monitor in dynamic environments like Kubernetes, AWS, or Docker Swarm, reducing manual configuration.
- Built-in Alerting: Native alert definition and generation capability, with AlertManager handling notification routing, grouping, and deduplication.
- Dimensional Data Model: Uses key-value pairs (labels) that enable powerful filtering and grouping of metrics across various dimensions of your infrastructure.
Grafana Advantages
- Visualization Flexibility: Offers numerous visualization options beyond simple graphs, including heatmaps, histograms, tables, and gauges to represent data meaningfully.
- Multi-Source Dashboards: Supports multiple data sources in a single dashboard, allowing you to correlate metrics from various systems in one view.
- User Management: Provides comprehensive user access controls with roles, teams, and permissions for enterprise deployments.
- Alert Management: Offers a unified alerting system that works across all data sources, not just Prometheus.
- Annotation Support: Enables marking significant events (like deployments, incidents, or maintenance) on your monitoring graphs for better context.
Use Cases and Applications
- Infrastructure Monitoring: Track server health, resource utilization, network traffic, and storage capacity.
- Container Orchestration: Monitor Docker containers and Kubernetes clusters with detailed performance metrics.
- Application Performance: Observe response times, error rates, and throughput of your applications.
- Database Monitoring: Track query performance, connection counts, and resource utilization across database systems.
- Business Metrics: Beyond technical metrics, you can monitor business KPIs when your applications expose relevant data points.
Scaling Considerations
- Federation: For large-scale deployments, implement Prometheus federation to have multiple Prometheus instances with hierarchical scraping.
- Remote Storage: Use remote storage integrations for long-term metric retention and to offload storage requirements from Prometheus.
- Load Balancing: For high-availability, deploy multiple scrapers behind a load balancer and use service discovery for automatic configuration.
- Resource Planning: As metrics volume grows, plan for increased CPU, memory, and storage resources, especially for Prometheus instances handling high cardinality metrics.
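As a sketch of the federation pattern (the downstream host name and match filter here are hypothetical), a global Prometheus scrapes selected series from shard instances through their /federate endpoint:
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 30s
    honor_labels: true
    metrics_path: /federate
    params:
      'match[]':
        - '{job="node-exporter"}'
    static_configs:
      - targets: ['prometheus-shard-a:9090']  # hypothetical downstream Prometheus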
This Prometheus and Grafana stack provides a solid foundation for comprehensive monitoring of your infrastructure and applications. The Docker Compose setup makes it easy to deploy, while the configuration options allow you to customize the stack to your specific monitoring needs. As your infrastructure grows, you can expand this monitoring solution with additional exporters, custom metrics, and more sophisticated alerting rules.