Monitoring system metrics

Anaconda Platform 7.0.0 is available through a limited early access program. Contact your Anaconda Technical Account Manager (TAM) if you’re interested in adopting the latest version.

System metrics are collected by Prometheus, alerts are handled by Alertmanager, and both are visualized in Perses.

Enabling and managing the monitoring stack

System monitoring is generally enabled during installation, but can be configured at any time.

Log in to the admin console.
Select the Config tab.
Under Observability, select or clear checkboxes to manage your monitoring stack:
- Enable Monitoring Stack: Enable or disable the use of the platform’s monitoring software.
- Enable Perses for Visualization: Enable Perses dashboards for visualizing metrics from the monitoring stack.
- Expose Prometheus & Alertmanager Ports Externally: Select this checkbox to display the subsequent checkboxes.
  - Expose Prometheus on Port 9090: Exposes Prometheus externally on port 9090.
  - Expose Alertmanager on Port 9093: Exposes Alertmanager externally on port 9093.
Exposing Prometheus & Alertmanager externally enables you to connect your own visualization tools to the monitoring stack.
Scroll to the bottom and select Save config.
Select the Dashboard tab and redeploy your instance to apply your changes.
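
Once Prometheus is exposed on port 9090, external tools can read metrics through the standard Prometheus HTTP API. The following is a minimal sketch, not an official client: the function names are hypothetical, and it assumes the port is served over plain HTTP without authentication, which may not match your deployment.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(fqdn: str, promql: str, port: int = 9090, scheme: str = "http") -> str:
    """Build an instant-query URL for the Prometheus HTTP API (/api/v1/query).

    The scheme default is an assumption; use "https" if your instance requires it.
    """
    params = urllib.parse.urlencode({"query": promql})
    return f"{scheme}://{fqdn}:{port}/api/v1/query?{params}"


def parse_instant_vector(payload: dict) -> dict:
    """Map each series' sorted label pairs to its sample value as a float."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")
    results = {}
    for series in data["result"]:
        labels = tuple(sorted(series["metric"].items()))
        _timestamp, value = series["value"]  # Prometheus encodes the value as a string
        results[labels] = float(value)
    return results


def query_prometheus(fqdn: str, promql: str, timeout: float = 10.0) -> dict:
    """Run an instant query against the exposed Prometheus port and parse the result."""
    url = build_query_url(fqdn, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_instant_vector(json.load(resp))
```

For example, calling `query_prometheus("anaconda-platform.example.com", "up")` would return one entry per scrape target, where a value of 1.0 means the most recent scrape of that target succeeded.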

Accessing Perses to monitor system metrics

To access the Perses dashboards, open a browser and navigate to https://<FQDN>:8888, replacing <FQDN> with the fully qualified domain name of your Anaconda Platform instance.

Example

https://anaconda-platform.example.com:8888
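
If you want to confirm from a script that Perses is answering before you open a browser, the URL pattern above can be built and probed as follows. This is a rough sketch with hypothetical helper names. It treats any HTTP response as "reachable", and it reports an instance with an untrusted (for example, self-signed) certificate as unreachable.

```python
import urllib.error
import urllib.request


def perses_url(fqdn: str, port: int = 8888) -> str:
    """Build the Perses URL from a bare FQDN, following https://<FQDN>:8888."""
    host = fqdn.strip()
    if not host or "://" in host or "/" in host:
        raise ValueError("expected a bare FQDN such as anaconda-platform.example.com")
    return f"https://{host}:{port}"


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the server answers at all, even with an HTTP error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example 401 or 403), so it is up.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, or TLS verification error.
        return False
```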

Viewing dashboards

Anaconda Platform includes several dashboards for monitoring the health of your system. Select a dashboard to view its metrics.

Dashboard

Node Exporter / Nodes: Shows CPU load, memory usage, and system-level performance per node. Useful for detecting hardware bottlenecks or I/O-related slowdowns.
Kubernetes / Compute Resources / Namespace (Pods): Displays CPU, memory, network, and IOPS metrics for all pods in a namespace. Helps identify resource pressure, network issues, or misconfigured limits across workloads.
Kubernetes / Compute Resources / Node (Pods): Shows aggregate CPU and memory usage for pods running on a specific node, both with and without cache. Useful for pinpointing uneven resource distribution.
Kubernetes / Persistent Volumes: Visualizes disk space and inode usage for persistent volumes. Indicates available capacity and helps identify full or slow storage volumes.
Alerts: Lists all active Prometheus alerts and their severity levels, showing trends over time. Highlights warning and critical events such as unready pods or failed jobs.
Prometheus / Overview: Displays Prometheus server performance metrics, including scrape duration, storage usage, and query rates. Useful for monitoring the health of the monitoring stack itself.
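
If Alertmanager is exposed on port 9093, a summary similar to the Alerts dashboard can be approximated outside Perses through the Alertmanager v2 API. The sketch below is illustrative only: the function names are hypothetical, plain HTTP without authentication is an assumption, and it relies on alerts carrying a `severity` label, which is a common convention rather than a guarantee.

```python
import json
import urllib.request
from collections import Counter


def summarize_alerts(alerts: list) -> Counter:
    """Count active alerts per `severity` label; alerts without one count as "none"."""
    counts = Counter()
    for alert in alerts:
        # Skip alerts that are suppressed (silenced or inhibited) or unprocessed.
        if alert.get("status", {}).get("state") != "active":
            continue
        severity = alert.get("labels", {}).get("severity", "none")
        counts[severity] += 1
    return counts


def fetch_alerts(fqdn: str, port: int = 9093, scheme: str = "http", timeout: float = 10.0) -> list:
    """Fetch current alerts from the Alertmanager v2 API (/api/v2/alerts).

    The scheme default is an assumption; adjust it to match your deployment.
    """
    url = f"{scheme}://{fqdn}:{port}/api/v2/alerts"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `summarize_alerts(fetch_alerts("anaconda-platform.example.com"))` would then give a per-severity count of the alerts that are currently firing.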
