If you've been running Netdata or basic monitoring scripts on your Ubuntu VPS, you've probably hit a wall. Maybe you need metrics from multiple services correlated on a single dashboard. Maybe you need alerting that goes beyond "disk is full." Maybe you need to retain weeks of historical data to spot trends. That's when it's time to graduate to Prometheus and Grafana — the industry-standard open-source monitoring stack used by companies from startups to Fortune 500s.

This guide walks you through deploying a production-ready monitoring stack on an Ubuntu VPS using Docker Compose. By the end, you'll have Prometheus scraping metrics from your system, your web server, your database, and your containers — all visualized in Grafana dashboards with alerting configured.


When to Move Beyond Basic Monitoring

Tools like Netdata (covered in our VPS monitoring setup guide) are excellent for real-time system observation. But as your infrastructure grows, you'll need capabilities that basic tools can't provide: correlating metrics from multiple services on a single dashboard, alerting on custom conditions and trends rather than simple thresholds, and retaining weeks of historical data for capacity planning.

Architecture Overview

The Prometheus + Grafana stack follows a pull-based architecture:

┌───────────────┐     scrapes     ┌──────────────────┐
│ node_exporter ├────────────────►│                  │
└───────────────┘                 │                  │
┌───────────────┐     scrapes     │    Prometheus    │     queries     ┌─────────┐
│ nginx_exporter├────────────────►│ (time-series DB) ├────────────────►│ Grafana │
└───────────────┘                 │                  │                 │  (UI)   │
┌───────────────┐     scrapes     │                  │                 └─────────┘
│mysqld_exporter├────────────────►│                  │
└───────────────┘                 └──────────────────┘
┌───────────────┐     scrapes              │
│   cAdvisor    ├──────────────────────────┘
└───────────────┘

Prometheus periodically scrapes HTTP endpoints (called "targets") exposed by exporters. Each exporter translates service-specific metrics into Prometheus's text format. Prometheus stores these metrics as time-series data. Grafana connects to Prometheus as a data source and lets you build dashboards and alerts using PromQL queries.
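
To make that last step concrete, here is roughly what the Grafana-to-Prometheus interaction looks like once the stack from this guide is running: a PromQL expression sent to Prometheus's HTTP query API. The built-in up metric reports 1 for every target Prometheus can currently scrape.

# What Grafana does under the hood: send a PromQL query to Prometheus's HTTP API
curl -s 'http://localhost:9090/api/v1/query?query=up'

# Abbreviated response:
# {"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"up","job":"node","instance":"vps-01"},"value":[...,"1"]}, ...]}}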

Prerequisites

You'll need an Ubuntu VPS with at least 2 vCPU and 4 GB RAM. The full stack (Prometheus, Grafana, four exporters) uses approximately 500–800 MB of RAM at idle and scales with the number of metrics and retention period.

Docker and Docker Compose must be installed. If you haven't set those up yet, follow our Docker installation guide first.

Verify your Docker installation:

docker --version
# Docker version 27.x.x

docker compose version
# Docker Compose version v2.x.x

The full stack runs on a Cloud VPS with 2 vCPU / 4 GB RAM. Prometheus stores time-series data — scale storage independently as your retention grows.

Project Structure

Create the directory structure for your monitoring stack:

mkdir -p ~/monitoring/{prometheus,grafana/{provisioning/datasources,provisioning/dashboards}}
cd ~/monitoring

The final structure will look like this:

~/monitoring/
├── docker-compose.yml
├── prometheus/
│   └── prometheus.yml
└── grafana/
    └── provisioning/
        ├── datasources/
        │   └── prometheus.yml
        └── dashboards/
            └── dashboards.yml

Docker Compose: The Full Stack

Create the docker-compose.yml file that defines all services:

cat > ~/monitoring/docker-compose.yml << 'EOF'
services:
  prometheus:
    image: prom/prometheus:v2.53.0
    container_name: prometheus
    restart: unless-stopped
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
      - '--storage.tsdb.retention.size=10GB'
      - '--web.enable-lifecycle'
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus_data:/prometheus
    ports:
      - "127.0.0.1:9090:9090"
    networks:
      - monitoring

  grafana:
    image: grafana/grafana:11.1.0
    container_name: grafana
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=changeme_strong_password_here
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_SERVER_ROOT_URL=https://monitoring.yourdomain.com
      - GF_SMTP_ENABLED=true
      - GF_SMTP_HOST=smtp.gmail.com:587
      - GF_SMTP_USER=alerts@yourdomain.com
      - GF_SMTP_PASSWORD=your_app_password
      - GF_SMTP_FROM_ADDRESS=alerts@yourdomain.com
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning:ro
    ports:
      - "127.0.0.1:3000:3000"
    depends_on:
      - prometheus
    networks:
      - monitoring

  node_exporter:
    image: prom/node-exporter:v1.8.1
    container_name: node_exporter
    restart: unless-stopped
    command:
      - '--path.rootfs=/host'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
      - '--collector.netclass.ignored-devices=^(veth.*|br-.*|docker.*)$$'
    volumes:
      - /:/host:ro,rslave
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
    pid: host
    networks:
      - monitoring

  nginx_exporter:
    image: nginx/nginx-prometheus-exporter:1.1
    container_name: nginx_exporter
    restart: unless-stopped
    command:
      - '-nginx.scrape-uri=http://host.docker.internal:8080/nginx_status'
    extra_hosts:
      - "host.docker.internal:host-gateway"
    networks:
      - monitoring

  mysqld_exporter:
    image: prom/mysqld-exporter:v0.15.1
    container_name: mysqld_exporter
    restart: unless-stopped
    environment:
      - DATA_SOURCE_NAME=exporter:exporter_password@(host.docker.internal:3306)/
    extra_hosts:
      - "host.docker.internal:host-gateway"
    networks:
      - monitoring

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.49.1
    container_name: cadvisor
    restart: unless-stopped
    privileged: true
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    devices:
      - /dev/kmsg
    networks:
      - monitoring

volumes:
  prometheus_data:
  grafana_data:

networks:
  monitoring:
    driver: bridge
EOF

Key decisions in this configuration: Prometheus and Grafana are published only on 127.0.0.1, so nothing is reachable from the internet until you add a reverse proxy or SSH tunnel; image versions are pinned for reproducible deployments; Prometheus retention is capped both by time (30 days) and size (10 GB); --web.enable-lifecycle allows configuration reloads without a restart; and named volumes preserve metric data and Grafana settings across container recreations.

Prometheus Configuration

Create the Prometheus configuration file that defines what to scrape and how often:

cat > ~/monitoring/prometheus/prometheus.yml << 'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  scrape_timeout: 10s

rule_files:
  - /etc/prometheus/alert_rules.yml

scrape_configs:
  # Prometheus monitors itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: 'prometheus'

  # System metrics from node_exporter
  - job_name: 'node'
    static_configs:
      - targets: ['node_exporter:9100']
        labels:
          instance: 'vps-01'

  # Nginx metrics
  - job_name: 'nginx'
    static_configs:
      - targets: ['nginx_exporter:9113']
        labels:
          instance: 'nginx-main'

  # MySQL/MariaDB metrics
  - job_name: 'mysql'
    static_configs:
      - targets: ['mysqld_exporter:9104']
        labels:
          instance: 'mysql-main'

  # Docker container metrics from cAdvisor
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
        labels:
          instance: 'docker-host'
EOF

A scrape_interval of 15 seconds is a sensible default for most deployments. Shorter intervals (5s) increase storage requirements proportionally; longer intervals (30s–60s) can miss short-lived spikes.

Understanding Scrape Configuration

Each job_name groups related targets. The static_configs block lists the endpoints to scrape. Since all our exporters run in the same Docker network, we use container names as hostnames. The labels section adds custom metadata that you can use in PromQL queries and Grafana dashboards to filter and group metrics.
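
For example, the custom instance label attached above can be used directly in queries and dashboard filters:

# Memory available on the host we labelled 'vps-01' in the scrape config
node_memory_MemAvailable_bytes{instance="vps-01"}

# Average CPU idle ratio grouped by the instance label (useful once you scrape several hosts)
avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))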

Node Exporter: System Metrics

The node_exporter service defined in our Docker Compose file exposes hundreds of system-level metrics. Here are the key metric families:

Metric Prefix             What It Measures              Example Metric
node_cpu_seconds_total    CPU time per core and mode    Time spent in user, system, idle, iowait
node_memory_*             Memory allocation             MemTotal, MemAvailable, Buffers, Cached
node_filesystem_*         Disk space and inodes         Size, free, available per mount point
node_disk_*               Disk I/O operations           Reads/writes completed, bytes, time
node_network_*            Network interface stats       Bytes received/transmitted, errors, drops
node_load1/5/15           System load averages          1, 5, and 15-minute load averages

Once the stack is running, you can verify the node_exporter output. Because the exporter is only exposed on the internal Docker network (no host port is published), query it from inside the Prometheus container:

docker exec prometheus wget -qO- http://node_exporter:9100/metrics | head -20

# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 283974.22
node_cpu_seconds_total{cpu="0",mode="iowait"} 142.87
node_cpu_seconds_total{cpu="0",mode="system"} 4821.63
node_cpu_seconds_total{cpu="0",mode="user"} 12487.91
...

Adding Nginx Exporter

The nginx_exporter reads from Nginx's stub_status module. First, enable stub_status on your Nginx server. Add a location block to your Nginx configuration:

sudo nano /etc/nginx/conf.d/status.conf

Add the following content:

server {
    listen 8080;
    server_name localhost;

    location /nginx_status {
        stub_status on;
        allow 127.0.0.1;
        allow 172.16.0.0/12;  # Docker network range
        deny all;
    }
}

Test and reload Nginx:

sudo nginx -t
sudo systemctl reload nginx

# Verify stub_status works
curl http://localhost:8080/nginx_status

# Expected output:
# Active connections: 3
# server accepts handled requests
#  1542 1542 8923
# Reading: 0 Writing: 1 Waiting: 2

The Nginx exporter translates this into Prometheus metrics like nginx_connections_active, nginx_http_requests_total, and nginx_connections_reading/writing/waiting.
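
Like node_exporter, the Nginx exporter is only reachable on the internal Docker network, so you can confirm it is producing metrics the same way Prometheus sees them:

# Query the exporter from inside the Docker network (no host port is published)
docker exec prometheus wget -qO- http://nginx_exporter:9113/metrics | grep '^nginx_'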

Adding MySQL/MariaDB Exporter

The mysqld_exporter needs a dedicated MySQL user with limited privileges. If you're running MySQL or MariaDB on the host (see our MySQL/MariaDB tuning guide for optimization), create the exporter user:

sudo mysql -u root << 'EOF'
CREATE USER 'exporter'@'%' IDENTIFIED BY 'exporter_password';
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'%';
FLUSH PRIVILEGES;
EOF

Security note: Use a strong password in production. The exporter user only needs read-only access. The PROCESS privilege allows reading query statistics, and REPLICATION CLIENT allows reading replication status.

Update the DATA_SOURCE_NAME environment variable in your docker-compose.yml with the password you chose. Key metrics exposed by the MySQL exporter include mysql_up (whether the exporter can reach the server), mysql_global_status_queries and mysql_global_status_slow_queries (query throughput), mysql_global_status_threads_connected and mysql_global_variables_max_connections (connection usage), and the mysql_global_status_innodb_* family (InnoDB buffer pool and I/O activity).
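
Before building dashboards on these metrics, it's worth confirming the exporter can actually reach MySQL — mysql_up should report 1:

# Check from inside the Docker network; mysql_up is 1 when the exporter can connect
docker exec prometheus wget -qO- http://mysqld_exporter:9104/metrics | grep '^mysql_up'
# mysql_up 1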

Docker Metrics via cAdvisor

cAdvisor (Container Advisor) exposes per-container resource usage metrics. This is essential if you run your applications in Docker containers. The key metrics include container_cpu_usage_seconds_total (CPU time per container), container_memory_usage_bytes (memory footprint), container_network_receive_bytes_total and container_network_transmit_bytes_total (per-container network traffic), and container_start_time_seconds / container_last_seen (useful for detecting restarts and disappeared containers).

cAdvisor automatically discovers all running containers and labels metrics with the container name, image, and ID — no additional configuration needed.
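
Because every series carries the container name as a label, a single query gives a per-container breakdown, for example:

# Memory usage per container, in MB, using cAdvisor's automatic name label
sum by (name) (container_memory_usage_bytes{name!=""}) / 1024 / 1024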

Grafana Data Source Provisioning

Instead of configuring the Prometheus data source manually through the Grafana UI, we'll provision it automatically via a YAML file:

cat > ~/monitoring/grafana/provisioning/datasources/prometheus.yml << 'EOF'
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: false
    jsonData:
      timeInterval: '15s'
      httpMethod: POST
EOF

This tells Grafana to connect to Prometheus using the Docker network hostname. The timeInterval matches our Prometheus scrape interval, which helps Grafana choose appropriate step sizes for graph queries.
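
Once Grafana is up, you can confirm the provisioned data source was picked up via the Grafana HTTP API, using the admin credentials from your docker-compose.yml:

# Lists configured data sources; the provisioned Prometheus entry should appear
curl -s -u admin:changeme_strong_password_here http://localhost:3000/api/datasources | python3 -m json.tool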

Dashboard Provisioning

Configure Grafana to look for dashboard JSON files in a specific directory:

cat > ~/monitoring/grafana/provisioning/dashboards/dashboards.yml << 'EOF'
apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: 'Provisioned'
    type: file
    disableDeletion: false
    editable: true
    updateIntervalSeconds: 30
    options:
      path: /etc/grafana/provisioning/dashboards
      foldersFromFilesStructure: false
EOF
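
Any dashboard JSON file dropped into that directory is loaded automatically and re-scanned every 30 seconds (per updateIntervalSeconds). One way to populate it is to export an existing dashboard through the Grafana API — the dashboard UID and credentials below are placeholders:

# Export a dashboard's JSON by UID and store it where the file provider will pick it up
curl -s -u admin:changeme_strong_password_here \
  http://localhost:3000/api/dashboards/uid/YOUR_DASHBOARD_UID \
  | python3 -c 'import json,sys; print(json.dumps(json.load(sys.stdin)["dashboard"], indent=2))' \
  > ~/monitoring/grafana/provisioning/dashboards/vps-overview.json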

Launch the Stack

Start all services:

cd ~/monitoring
docker compose up -d

# Verify all containers are running
docker compose ps

Expected output:

NAME               IMAGE                                 STATUS          PORTS
cadvisor           gcr.io/cadvisor/cadvisor:v0.49.1      Up 2 minutes
grafana            grafana/grafana:11.1.0                 Up 2 minutes    127.0.0.1:3000->3000/tcp
mysqld_exporter    prom/mysqld-exporter:v0.15.1           Up 2 minutes
nginx_exporter     nginx/nginx-prometheus-exporter:1.1    Up 2 minutes
node_exporter      prom/node-exporter:v1.8.1              Up 2 minutes
prometheus         prom/prometheus:v2.53.0                Up 2 minutes    127.0.0.1:9090->9090/tcp

Verify Prometheus is scraping all targets:

# Check Prometheus targets status
curl -s http://localhost:9090/api/v1/targets | python3 -m json.tool | grep -A2 '"health"'

All targets should show "health": "up". If any show "health": "down", check the container logs:

# Check logs for a specific exporter
docker compose logs nginx_exporter --tail 20

Importing Pre-Built Community Dashboards

Grafana has thousands of community-created dashboards on grafana.com/grafana/dashboards. Here are the most useful ones for this stack:

Dashboard                      ID       What It Shows
Node Exporter Full             1860     Complete system metrics (CPU, memory, disk, network)
Docker Container Monitoring    893      Per-container resource usage via cAdvisor
Nginx Exporter                 12708    Nginx connections and request rates
MySQL Overview                 7362     MySQL queries, connections, buffer pool
Prometheus Stats               2        Prometheus self-monitoring

To import a dashboard:

  1. Access Grafana at http://localhost:3000 (through the SSH tunnel described under Securing Access, or via your reverse proxy)
  2. Log in with the admin credentials from your docker-compose.yml
  3. Navigate to Dashboards → New → Import
  4. Enter the dashboard ID (e.g., 1860) and click Load
  5. Select Prometheus as the data source and click Import

The Node Exporter Full dashboard (ID 1860) is particularly comprehensive. It shows over 30 panels covering CPU utilization by mode, memory usage breakdown, disk I/O rates, network traffic, system load, and more.

Creating Custom Dashboards

Community dashboards are a great starting point, but you'll want custom dashboards tailored to your specific stack. Here's how to build them.

Essential PromQL Queries

Before creating panels, understand the key PromQL patterns:

# CPU usage percentage (all cores combined)
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory usage percentage
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

# Disk usage percentage for root filesystem
(1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})) * 100

# Network received bytes per second
rate(node_network_receive_bytes_total{device="eth0"}[5m])

# Nginx requests per second
rate(nginx_http_requests_total[5m])

# MySQL queries per second
rate(mysql_global_status_queries[5m])

# Container CPU usage percentage
rate(container_cpu_usage_seconds_total{name!=""}[5m]) * 100

# Container memory usage in MB
container_memory_usage_bytes{name!=""} / 1024 / 1024

# Disk I/O wait time (indicates storage bottleneck)
rate(node_cpu_seconds_total{mode="iowait"}[5m]) * 100

# Filesystem predicted full (linear prediction, 4 hours ahead)
predict_linear(node_filesystem_avail_bytes{mountpoint="/"}[6h], 4*3600) < 0

Building a VPS Overview Dashboard

Create a dashboard with these panels for a single-VPS overview:

Row 1 — Stat panels (single values): current CPU usage %, memory usage %, and root filesystem usage %.

Row 2 — Time-series panels: CPU usage over time and memory usage over time, both from node_exporter.

Row 3 — Time-series panels: disk I/O wait and network traffic on the main interface.

Row 4 — Application panels: Nginx requests per second, MySQL queries per second, and per-container CPU and memory from cAdvisor.

Using Dashboard Variables

Dashboard variables (also called template variables) let you create reusable dashboards. For example, add a variable to select the network interface dynamically:

  1. Open Dashboard Settings → Variables → New Variable
  2. Name: interface
  3. Type: Query
  4. Query: label_values(node_network_receive_bytes_total, device)
  5. Regex: /^(eth|ens|enp).*/ (filters out virtual interfaces)

Then use $interface in your PromQL queries:

rate(node_network_receive_bytes_total{device="$interface"}[5m])

This adds a dropdown at the top of the dashboard where you can select which network interface to display.

Alert Rules and Notification Channels

Grafana's built-in alerting system can send notifications when metrics cross thresholds. Here's how to set up production-ready alerts.

Configure Contact Points

In Grafana, navigate to Alerting → Contact points. You can configure multiple notification channels:

Email — Uses the SMTP settings from our Docker Compose environment variables. Add a contact point of type "Email" and specify recipient addresses.

Slack — Create an incoming webhook in your Slack workspace, then add a contact point of type "Slack" with the webhook URL:

# Slack webhook URL format
https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX

Telegram — Create a bot via @BotFather, get the bot token and chat ID, then configure a Telegram contact point with those values.
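
If you're not sure of the chat ID, you can read it from the Telegram Bot API after sending your bot a message (the token below is a placeholder):

# The numeric chat ID appears under "chat" in the response
curl -s "https://api.telegram.org/bot<YOUR_BOT_TOKEN>/getUpdates" | python3 -m json.tool | grep -B1 -A3 '"chat"'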

Essential Alert Rules

Create these alert rules under Alerting → Alert rules → New alert rule:

High CPU Usage:

# Query A (PromQL)
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Condition: Is above 85
# For: 10m (sustained for 10 minutes before firing)
# Summary: CPU usage above 85% for 10 minutes

Memory Running Low:

# Query A (PromQL)
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

# Condition: Is above 90
# For: 5m
# Summary: Memory usage above 90% for 5 minutes

Disk Space Critical:

# Query A (PromQL)
(1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})) * 100

# Condition: Is above 90
# For: 5m
# Summary: Root filesystem above 90% capacity

Disk Space Prediction (fills in 4 hours):

# Query A (PromQL)
predict_linear(node_filesystem_avail_bytes{mountpoint="/"}[6h], 4*3600) < 0

# Condition: Is equal to 1 (the expression itself returns 1 when true)
# For: 30m
# Summary: Disk predicted to fill within 4 hours

High I/O Wait:

# Query A (PromQL)
avg(rate(node_cpu_seconds_total{mode="iowait"}[5m])) * 100

# Condition: Is above 20
# For: 10m
# Summary: High I/O wait - possible storage bottleneck

Nginx Connections Spike:

# Query A (PromQL)
nginx_connections_active

# Condition: Is above 500
# For: 5m
# Summary: Nginx active connections above 500 for 5 minutes

MySQL Too Many Connections:

# Query A (PromQL)
mysql_global_status_threads_connected / mysql_global_variables_max_connections * 100

# Condition: Is above 80
# For: 5m
# Summary: MySQL connection usage above 80%

Container Restart Loop:

# Query A (PromQL) — counts restarts via changes in the container start time
changes(container_start_time_seconds{name!=""}[10m]) > 2

# Condition: Is above 0
# For: 0m (fire as soon as the condition holds)
# Summary: Container restarted more than twice in 10 minutes

# Related check for a container that has disappeared entirely:
# increase(container_last_seen{name!=""}[5m]) == 0

Notification Policies

Under Alerting → Notification policies, configure how alerts route to contact points: set a default contact point that catches everything, add more specific policies that match on labels (for example, route severity=critical alerts to Slack or Telegram while warnings go to email), and tune group wait and repeat intervals so related alerts arrive batched rather than as one notification per rule.

Securing Access

Since Prometheus and Grafana are bound to 127.0.0.1, they're not directly accessible from the internet. You have two options for external access:

Option 1: Nginx Reverse Proxy with SSL

server {
    listen 443 ssl http2;
    server_name monitoring.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/monitoring.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/monitoring.yourdomain.com/privkey.pem;

    # Grafana
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # WebSocket support for Grafana live
    location /api/live/ {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
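
The certificate paths above assume a Let's Encrypt certificate already exists for the hostname. If you haven't issued one yet, Certbot can do it — assuming Certbot with the Nginx plugin is installed and DNS for the hostname points at this VPS:

# Issue the certificate referenced in the server block, then reload Nginx
sudo certbot certonly --nginx -d monitoring.yourdomain.com
sudo nginx -t && sudo systemctl reload nginx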

Option 2: SSH Tunnel

For quick access without configuring a reverse proxy, use an SSH tunnel from your local machine:

# Forward Grafana to your local port 3000
ssh -L 3000:127.0.0.1:3000 user@your-vps-ip

# Then open http://localhost:3000 in your browser

For more SSH tunnel patterns, see our WireGuard VPN guide for persistent secure access.

Storage Management for Time-Series Data

Prometheus stores all scraped metrics as time-series data on disk. Understanding storage growth is essential for long-term operation.

Estimating Storage Requirements

The formula for Prometheus storage:

Storage ≈ num_series × samples_per_day × bytes_per_sample × retention_days

Approximate:
- bytes_per_sample ≈ 1-2 bytes (after compression)
- samples_per_day at 15s interval = 5,760

For our stack with ~1,500 active time series (node_exporter ~700, cAdvisor ~400, MySQL ~200, Nginx ~50, Prometheus ~150):

1,500 series × 5,760 samples/day × 2 bytes × 30 days ≈ 500 MB

With more containers, more MySQL schemas, or shorter scrape intervals, this grows proportionally.
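
Rather than guessing the series count, you can ask Prometheus how many samples each job actually returns per scrape and how many series are currently live:

# Samples returned by the most recent scrape, per job
sum by (job) (scrape_samples_scraped)

# Total active series currently held in the TSDB head block
prometheus_tsdb_head_series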

Monitoring Prometheus Storage

Add these PromQL queries to a "Prometheus Health" dashboard panel:

# Current TSDB block storage size in GB
prometheus_tsdb_storage_blocks_bytes / 1024 / 1024 / 1024

# Number of active time series
prometheus_tsdb_head_series

# Samples ingested per second
rate(prometheus_tsdb_head_samples_appended_total[5m])

# Projected block storage size 30 days from now, based on the last 7 days of growth
predict_linear(prometheus_tsdb_storage_blocks_bytes[7d], 30*24*3600)

Compaction and Cleanup

Prometheus automatically compacts data blocks over time. You can also manage retention actively:

# Check current disk usage
docker exec prometheus du -sh /prometheus

# Current retention settings are in docker-compose.yml:
# --storage.tsdb.retention.time=30d
# --storage.tsdb.retention.size=10GB

# To change retention, update docker-compose.yml and restart:
docker compose up -d prometheus

If you need to free space immediately, reduce the retention time and Prometheus will clean up old blocks at the next compaction cycle (runs approximately every 2 hours).

Reloading Configuration Without Downtime

When you modify prometheus.yml (for example, to add new scrape targets), you don't need to restart the container:

# Reload Prometheus configuration via API
curl -X POST http://localhost:9090/-/reload

# Verify the reload was successful
curl -s http://localhost:9090/api/v1/status/config | python3 -m json.tool | head -5

This works because we enabled --web.enable-lifecycle in the Prometheus startup command.

Adding More Exporters

The Prometheus ecosystem has exporters for almost everything. Common additions:

Service                  Exporter             Default Port
PostgreSQL               postgres_exporter    9187
Redis                    redis_exporter       9121
PHP-FPM                  php-fpm_exporter     9253
Blackbox (URL probing)   blackbox_exporter    9115
SSL Certificate          ssl_exporter         9219

To add a new exporter, add the container to docker-compose.yml and a corresponding job_name block in prometheus.yml, then reload both services.
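
For example, a Redis exporter could be wired in like this — the image tag and Redis address are illustrative, so check the exporter's documentation for current values:

# docker-compose.yml — additional service (community image by oliver006)
  redis_exporter:
    image: oliver006/redis_exporter:v1.58.0
    container_name: redis_exporter
    restart: unless-stopped
    environment:
      - REDIS_ADDR=redis://host.docker.internal:6379
    extra_hosts:
      - "host.docker.internal:host-gateway"
    networks:
      - monitoring

# prometheus.yml — additional scrape job
  - job_name: 'redis'
    static_configs:
      - targets: ['redis_exporter:9121']
        labels:
          instance: 'redis-main'

# Apply both changes
docker compose up -d redis_exporter
curl -X POST http://localhost:9090/-/reload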

Troubleshooting Common Issues

Target Shows "Down" in Prometheus

# Check if the exporter is running
docker compose ps

# Check exporter logs
docker compose logs node_exporter --tail 50

# Test connectivity from Prometheus container
docker exec prometheus wget -qO- http://node_exporter:9100/metrics | head -5

Grafana Shows "No Data"

Confirm the data source first: in Grafana, open Connections → Data sources → Prometheus and click Test. Then check that the dashboard's time range covers a period when metrics were actually collected, and run the panel's PromQL query directly in Prometheus (http://localhost:9090/graph) to confirm it returns data. If the query works in Prometheus but not in Grafana, the panel is usually pointing at the wrong data source.

High Memory Usage by Prometheus

# Check Prometheus memory usage
docker stats prometheus --no-stream

# If memory is high, check the number of active series
curl -s http://localhost:9090/api/v1/status/tsdb | python3 -m json.tool

# Reduce cardinality by dropping unnecessary labels in prometheus.yml:
# metric_relabel_configs:
#   - source_labels: [__name__]
#     regex: 'go_.*'
#     action: drop

Prefer Managed Monitoring?

Setting up and maintaining a monitoring stack requires ongoing attention: updating container images, managing storage growth, tuning alert thresholds, and responding to incidents. If you'd rather focus on your application while experts handle the infrastructure monitoring, MassiveGRID's fully managed dedicated hosting includes 24/7 monitoring, proactive alerting, and incident response — all handled by our team.

PromQL queries are CPU-intensive for complex dashboards. If you're running dashboards with many panels, heavy aggregations, or long time ranges, dedicated CPU resources ensure consistent query performance without contention from other workloads.

Summary

You now have a production-grade monitoring stack running on your Ubuntu VPS: Prometheus scraping system, Nginx, MySQL, and container metrics every 15 seconds with 30 days of retention, Grafana provisioned automatically with a Prometheus data source and community dashboards, alert rules covering CPU, memory, disk, I/O wait, Nginx, MySQL, and container restarts, and access restricted to localhost behind a reverse proxy or SSH tunnel.

This stack scales well. You can add exporters for any new service, create dashboards for different audiences (ops team vs developers), and extend alerting rules as your application grows. The pull-based architecture means adding a new target is just a configuration change — no agents to install on the monitored service.

For ongoing disk management as your Prometheus data grows, check our disk space management guide. And if your monitoring needs grow beyond a single VPS, consider running Prometheus with remote storage backends like Thanos or Cortex for multi-server aggregation.