Prometheus

https://prometheus.io/docs/introduction/overview/

Server config

export URL=http://localhost:9090

Show config:

curl "${URL}/api/v1/status/config" | jq -r .data.yaml

Server CLI options

e.g. --log.level=debug

Query Examples

Query alerts:

curl ${URL}/api/v1/alerts | jq .data.alerts

Query examples:

curl "${URL}/api/v1/query?query=up"
curl "${URL}/api/v1/query?query=up" | jq '.data.result[0]'
curl -g "${URL}/api/v1/query?query={job='brix-node-exporter'}[7d]"
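
The -g flag only stops curl from interpreting braces and brackets; it does not URL-encode the query, so characters such as + or spaces in a selector can still get mangled. A minimal sketch of a wrapper that lets curl do the encoding, assuming URL is exported as above (prom_instant is a hypothetical helper name, not part of Prometheus):

```shell
# Hypothetical helper: run an instant query with proper URL encoding.
# Assumes URL points at the Prometheus server, e.g. http://localhost:9090.
prom_instant() (
  # $1 = PromQL expression, handed to curl for encoding
  curl -sS -G "${URL}/api/v1/query" --data-urlencode "query=$1"
)
```

Usage: prom_instant 'up{job=~".+"}' | jq '.data.result'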

promtool

Only installable via GitHub release
https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/#syntax-checking-rules

Install:

eget --file=promtool prometheus/prometheus

Set env var:

export URL=https://...
promtool query instant ${URL} up

Use promtool from prometheus container

Set alias:

alias promtool='kubectl -n monitoring exec -it -c prometheus \
  statefulset/prometheus-kube-prometheus-stack-prometheus -- promtool'
# Runs inside the container, so the server is assumed to listen on localhost:9090.
alias prom_query='kubectl -n monitoring exec -it -c prometheus \
  statefulset/prometheus-kube-prometheus-stack-prometheus -- promtool query instant http://localhost:9090'

promtool query instant ${URL} node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate

Check rules

Syntax-checking rules
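
A minimal sketch of a syntax check with a locally installed promtool. The rule file content and the job:up:sum record name are made up for illustration; substitute your own rule files.

```shell
# Write a minimal, hypothetical recording rule file to a temp location.
rules_file="$(mktemp)"
cat > "${rules_file}" <<'EOF'
groups:
  - name: example
    rules:
      - record: job:up:sum
        expr: sum by (job) (up)
EOF

# Validate it when a local promtool binary is available.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules "${rules_file}"
fi
```

promtool exits non-zero on invalid rules, so the check can also be used in CI.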

CLI clients

Stale / unmaintained

Helm chart

Issues:

Invalid Scrape Target in 'kubernetes-pods' for Pods with Multiple Containers

Queries

https://prometheus.io/docs/prometheus/latest/querying/basics/
https://prometheus.io/docs/prometheus/latest/querying/examples/

Examples:

avg_over_time(node_memory_MemAvailable_bytes[1m])/1024/1024
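
To evaluate such an expression over a time window instead of at a single instant, the HTTP API offers /api/v1/query_range with start, end and step parameters. A rough sketch, assuming URL is exported as above; prom_range and its argument order are hypothetical:

```shell
# Hypothetical helper: evaluate an expression over the last N seconds.
prom_range() (
  # $1 = PromQL expression, $2 = lookback in seconds, $3 = step (default 60s)
  end=$(date +%s)
  start=$((end - $2))
  curl -sS -G "${URL}/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "start=${start}" \
    --data-urlencode "end=${end}" \
    --data-urlencode "step=${3:-60s}"
)
```

Usage: prom_range 'avg_over_time(node_memory_MemAvailable_bytes[1m])/1024/1024' 3600 | jq '.data.result[0].values'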

Show failed blackbox exporter queries:

probe_success==0

Data maintenance

Analyzing Prometheus data with external tools

Dropping metrics at scrape time with Prometheus
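
Dropping at scrape time is done with metric_relabel_configs in the scrape config. A sketch that writes a small config and validates it when promtool is installed; the job name, target and stackdriver_ prefix are assumptions for illustration only:

```shell
# Hypothetical scrape config: drop every series whose metric name starts
# with "stackdriver_" after the scrape, before it is stored.
config_file="$(mktemp)"
cat > "${config_file}" <<'EOF'
scrape_configs:
  - job_name: example                  # assumed job name
    static_configs:
      - targets: ['localhost:9100']    # assumed target
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: 'stackdriver_.+'
        action: drop
EOF

# Validate it when a local promtool binary is available.
if command -v promtool >/dev/null 2>&1; then
  promtool check config "${config_file}"
fi
```

Relabel regexes are fully anchored, so 'stackdriver_.+' only matches names that begin with that prefix.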

List all metric names:

curl "${URL}/api/v1/label/__name__/values" | jq .

List all metric names starting with stackdriver:

curl -G "${URL}/api/v1/label/__name__/values" --data-urlencode "match[]={__name__=~'stackdriver.+'}" | jq .

List all metric names matching a label:

curl -G "${URL}/api/v1/label/__name__/values" --data-urlencode \
  "match[]={__name__=~'.+', pod='prometheus-stackdriver-exporter-77c9767477-dz64s'}" | jq .

Admin API

For some tasks (e.g. metric deletion), the admin API must be enabled.

Docker compose:

Add the flag to the command list:

command:
  - '--web.enable-admin-api'
  ...

Kubernetes:

Set enableAdminAPI: true in prometheus/kube-prometheus-stack-prometheus:

kubectl -n monitoring edit prometheus kube-prometheus-stack-prometheus
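
A non-interactive alternative to kubectl edit is a merge patch on the Prometheus custom resource. This is a sketch; enable_admin_api is a hypothetical helper and the defaults simply mirror the names used above:

```shell
# Hypothetical helper: set spec.enableAdminAPI on a Prometheus resource.
enable_admin_api() (
  # $1 = namespace, $2 = name of the Prometheus custom resource
  kubectl -n "${1:-monitoring}" patch prometheus \
    "${2:-kube-prometheus-stack-prometheus}" \
    --type merge -p '{"spec":{"enableAdminAPI":true}}'
)
```

If the resource is managed by the Helm chart, a later upgrade may revert the patch; setting the chart value prometheus.prometheusSpec.enableAdminAPI should be the persistent equivalent, but verify this against your chart version.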

Delete metrics

Prometheus: Delete Time Series Metrics

Note: The admin API must be enabled first (see above).

# Pick one selector; a second export overrides the first.
export QUERY="{instance=~'mediaplayer.cas.*'}"
export QUERY="{__name__=~'stackdriver.*'}"

Search for old time series:

curl --netrc -g "${URL}/api/v1/query?query=${QUERY}[7d]"

Mark metrics matching a regex for deletion:

curl --netrc -X POST -g "${URL}/api/v1/admin/tsdb/delete_series?match[]=${QUERY}"

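Delete all series up to a Unix timestamp (the selector is URL-encoded so that the + in the regex is not decoded as a space):

curl --netrc -X POST -G "${URL}/api/v1/admin/tsdb/delete_series" \
  --data-urlencode 'match[]={__name__=~".+"}' --data-urlencode 'end=1725208202'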

Actually remove the marked data from disk:

curl --netrc -X POST -g "${URL}/api/v1/admin/tsdb/clean_tombstones"
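
The two steps can be combined into one function. A sketch under the assumption that URL is exported and the admin API is enabled; prom_delete_series is a hypothetical name:

```shell
# Hypothetical helper: mark matching series for deletion, then clean tombstones.
prom_delete_series() (
  # $1 = series selector, $2 = optional end timestamp (Unix seconds)
  selector=$1
  end_ts=${2:-}
  set -- --data-urlencode "match[]=${selector}"
  if [ -n "${end_ts}" ]; then
    set -- "$@" --data-urlencode "end=${end_ts}"
  fi
  # -G moves the encoded data into the query string; -X POST keeps the method.
  curl --netrc -sS -f -X POST -G \
    "${URL}/api/v1/admin/tsdb/delete_series" "$@" &&
    curl --netrc -sS -f -X POST "${URL}/api/v1/admin/tsdb/clean_tombstones"
)
```

Usage: prom_delete_series '{__name__=~"stackdriver.+"}' 1725208202. Tombstones are only cleaned if the delete request succeeded.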

Mixins

https://github.com/monitoring-mixins/docs

A mixin is a set of Grafana dashboards and Prometheus rules and alerts, packaged together in a reusable and extensible bundle. Mixins are written in jsonnet, and are typically installed and updated with jsonnet-bundler.

You can either generate mixins by hand or use the pre-generated alerts.

cd ~/projects/monitoring/prometheus/mixins/monitoring-mixins-website
gup

Prometheus-operator helm chart

Uninstall

https://github.com/helm/charts/tree/master/stable/prometheus-operator#uninstalling-the-chart

# Helm 2 syntax; with Helm 3 use: helm uninstall oas-test-prometheus
helm delete --purge oas-test-prometheus
kubectl delete crd prometheuses.monitoring.coreos.com \
  prometheusrules.monitoring.coreos.com \
  servicemonitors.monitoring.coreos.com \
  podmonitors.monitoring.coreos.com \
  alertmanagers.monitoring.coreos.com

If needed, also delete the persistent volume claims:

kubectl delete -n oas persistentvolumeclaims \
  prometheus-prometheus-oas-test-prometheus-promet-prometheus-0 \
  alertmanager-alertmanager-oas-test-prometheus-promet-alertmanager-0