To improve the observability of a system, we can use Prometheus to monitor resource consumption at different levels: system-level disk I/O and network bandwidth; tenant-level CPU and memory usage; component-level Ray cluster resource utilization, backend platform system call counts, and garbage collection status; and operator-level object sizes and execution times. Accumulating this data helps developers troubleshoot issues and points the way for subsequent system optimization.
There are two ways to deploy Prometheus. The first is a lightweight deployment based on Docker-Compose, which is mainly used when there is only a single machine or a small number of machines. The second is a deployment based on Kubernetes. We provide detailed introductions to both approaches, along with concrete operational steps for deployment personnel.
Docker-Compose-based deployment#
When deploying with Docker-Compose, the resources of the client's servers are often limited; for some clients, having the Prometheus component occupy 500 MB of memory is not acceptable. Therefore, the Prometheus component needs to be designed to be pluggable. In addition, since clients may have several servers and Prometheus is relatively independent of the system's core functions, it can be deployed separately on an idle machine without occupying Ray cluster resources. For these reasons, we use a separate deployment file for Prometheus instead of merging it into the existing deployment file.
Assume that Docker and Docker-Compose are already installed on each server. With N servers in total, the Prometheus monitoring module is divided into 1 Master machine and N-1 Slave machines. Deploying Prometheus based on Docker-Compose consists of the following two steps.
- Create a service on each Slave server.
- Create a service on the Master server.
Create a service on each Slave server#
Create a configuration file for the service. The service occupies port 9100 on the Slave server. Assuming the server's IP address is 192.168.88.101, node-exporter will run on 192.168.88.101:9100. In the next step, 192.168.88.101:9100 needs to be added to the prometheus.yml configuration file on the Master.
```yaml
version: '3'
services:
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node_exporter
    hostname: node-exporter
    restart: always
    volumes:
      # Mount the host's /proc, /sys, and root filesystem read-only so that
      # node-exporter can report host-level rather than container-level metrics.
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
      - /etc/hostname:/etc/hostname:ro
    command:
      # Point node-exporter at the host paths mounted above.
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
    ports:
      - "9100:9100"
      # If port 9100 is occupied, you can use
      # - "9200:9100"
    networks:
      - monitor

networks:
  monitor:
    driver: bridge
    ipam:
      config:
        - subnet: 172.16.102.0/24
```
Start the service by running `docker-compose up -d` in the same directory.
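Once the container is running, a quick sanity check confirms that metrics are being exposed. The IP and port below follow the example above; adjust them if you mapped a different port:

```bash
# Fetch the first few lines of the metrics endpoint exposed by node-exporter.
curl -s http://192.168.88.101:9100/metrics | head
```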
Create a service on the Master server#
Assume the IP address of the Master host is 192.168.88.13. First, create `docker-compose.yml` and `prometheus.yml` in the same directory. As shown in `docker-compose.yml`:

- The `node-exporter` for the Master machine will be created at 192.168.88.13:9100.
- The `prometheus` for the Master machine will be created at 192.168.88.13:9090.
- The `grafana` for the Master machine will be created at 192.168.88.13:3000.
- The `alertmanager` for the Master machine will be created at 192.168.88.13:9093.
- The `cadvisor` for the Master machine will be created at 192.168.88.13:8080.
version: "3.7"
services:
# Service1: Node monitoring
node-exporter:
image: prom/node-exporter:latest
container_name: "node-exporter"
ports:
- "9100:9100"
restart: always
# Service2: Node monitoring
prometheus:
image: prom/prometheus:latest
container_name: "prometheus0"
restart: always
ports:
- "9090:9090"
volumes:
- "./prometheus.yml:/etc/prometheus/prometheus.yml"
- "./prometheus_data:/prometheus"
# Service3: Data dashboard
grafana:
image: grafana/grafana
container_name: "grafana"
ports:
- "3000:3000"
restart: always
volumes:
- "./grafana_data:/var/lib/grafana"
- "./grafana_log:/var/log/grafana"
- "./grafana_data/crypto_data:/crypto_data" # The host address is before the colon and the container address is after the colon. This is used to specify the location of the sqlite database.
# Service4: Alert processing
alertmanager:
image: prom/alertmanager:latest
container_name: Myalertmanager
hostname: alertmanager
restart: always
ports:
- '9093:9093'
volumes:
- './prometheus/config:/config'
- './prometheus/data/alertmanager:/alertmanager/data'
# Service5: Docker monitoring
cadvisor:
image: lagoudocker/cadvisor:v0.37.0
container_name: cadvisor
restart: always
volumes:
- /:/rootfs:ro
- /var/run:/var/run:rw
- /sys:/sys:ro
- /dev/disk/:/dev/disk:ro
- /var/lib/docker/:/var/lib/docker:ro
command:
- "--disable_metrics=udp,tcp,percpu,sched"
- "--storage_duration=15s"
- "-docker_only=true"
- "-housekeeping_interval=30s"
- "-disable_metrics=disk"
ports:
- 8080:8080
networks:
- monitor
networks:
monitor:
name: monitor
driver: bridge
As shown in `prometheus.yml`, Prometheus scrapes the node-exporter on the Slave, the node-exporter on the Master, and cAdvisor on the Master:

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # Must be consistent with the 'alertmanager' service in the
            # Master machine's docker-compose.yml.
            - 192.168.88.13:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  #- "app/prometheus/rules/*.yml"
  - "rule.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing the endpoints to scrape.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'slave101NodeExporter'
    static_configs:
      - targets: ['192.168.88.101:9100']
        labels:
          host: slave101
  - job_name: 'masterNodeExporter'
    static_configs:
      - targets: ['192.168.88.13:9100']
        labels:
          host: master
  - job_name: 'masterCadvisor'
    static_configs:
      - targets: ['192.168.88.13:8080']
        labels:
          host: master
  # Add the NodeExporters of other servers here, for example:
  # - job_name: 'slave21NodeExporter'
  #   static_configs:
  #     - targets: ['192.168.88.21:9100']
  #       labels:
  #         host: slave21NodeExporter
```
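The `rule_files` section above references a `rule.yml` that is not shown here; it must also be mounted into the Prometheus container (for example as `./rule.yml:/etc/prometheus/rule.yml`, alongside `prometheus.yml`). A minimal sketch of such a rule file is shown below; the alert name and threshold are only placeholders:

```yaml
# rule.yml: a minimal example alerting rule (placeholder values).
groups:
  - name: node-alerts
    rules:
      - alert: InstanceDown
        # 'up' is 1 when a scrape target is reachable, 0 otherwise.
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```

With both configuration files in place, start the stack on the Master by running `docker-compose up -d` in the same directory, then open http://192.168.88.13:9090/targets to confirm that the slave101NodeExporter, masterNodeExporter, and masterCadvisor targets are in the UP state.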
Deployment Based on K8S#
Because KubeSphere ships with Prometheus and node-exporter natively installed, only Grafana needs to be installed in KubeSphere. The steps consist of two parts: deploying Grafana using Helm and adding a persistent volume to Grafana.
Deploying Grafana Using Helm#
The K8S-based deployment uses Helm. Use the following commands to create Grafana in the kubesphere-monitoring-system namespace, which is the default namespace for KubeSphere's monitoring components.
```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install grafana grafana/grafana -n kubesphere-monitoring-system
```
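After the chart is installed, the auto-generated admin password can be read from the secret the chart creates, and the service can be forwarded locally for a quick look. This assumes the release name `grafana` used above and the chart's default service port of 80:

```bash
# Read the auto-generated admin password from the secret created by the chart.
kubectl get secret grafana -n kubesphere-monitoring-system \
  -o jsonpath="{.data.admin-password}" | base64 --decode; echo

# Temporarily forward the Grafana service to localhost:3000.
kubectl port-forward svc/grafana 3000:80 -n kubesphere-monitoring-system
```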
Adding a Persistent Volume to Grafana#
Next, we add a persistent volume to Grafana so that it can persistently store dashboards and user information. First, create a PVC named grafana-storage in the kubesphere-monitoring-system namespace by applying a manifest.
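A minimal PVC manifest for this, assuming 1 Gi of storage and the cluster's default StorageClass, might look as follows:

```yaml
# grafana-pvc.yaml: a 1Gi PVC for Grafana, using the cluster's default StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-storage
  namespace: kubesphere-monitoring-system
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

Apply it with `kubectl apply -f grafana-pvc.yaml`.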
Then, modify the volumes section in the YAML of the Grafana Deployment as shown below.
```yaml
volumes:
  - configMap:
      defaultMode: 420
      name: grafana
    name: config
  - name: grafana-storage
    persistentVolumeClaim:
      claimName: grafana-storage
```
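For the claim to take effect, the Grafana container in the same Deployment also needs a corresponding volumeMounts entry pointing at Grafana's data directory. A sketch of that fragment is shown below, assuming the default data path /var/lib/grafana; the existing mount name in the chart-generated Deployment may differ:

```yaml
# Container spec fragment: mount the PVC-backed volume at Grafana's data directory.
containers:
  - name: grafana
    volumeMounts:
      - name: grafana-storage
        mountPath: /var/lib/grafana
```

After saving the change, Kubernetes rolls out a new Grafana pod whose dashboards and user information survive pod restarts.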