Prometheus
https://prometheus.io/docs/introduction/overview/
Server config
export URL=http://localhost:9090
Show config:
curl -s "${URL}/api/v1/status/config" | jq -r .data.yaml
server cli options
e.g. --log.level=debug
Query Examples
Query alerts:
curl ${URL}/api/v1/alerts | jq .data.alerts
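To narrow the alert list down to what is actually firing, a small jq filter helps. This is a minimal sketch: the function name `firing_alerts` is made up for these notes, and it assumes the usual response shape of `/api/v1/alerts` (`.data.alerts[]` with `state` and `labels.alertname`).

```shell
# Print the alertname of every alert whose state is "firing".
# Reads the JSON body of /api/v1/alerts on stdin; requires jq.
# (firing_alerts is a hypothetical helper name, not part of Prometheus.)
firing_alerts() {
  jq -r '.data.alerts[] | select(.state == "firing") | .labels.alertname'
}

# Usage (assumes URL is exported as above):
#   curl -s "${URL}/api/v1/alerts" | firing_alerts
```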
Query examples:
curl -g "${URL}/api/v1/query?query={'up'}"
curl -g "${URL}/api/v1/query?query={'up'}" | jq '.data.result[0]'
curl -g "${URL}/api/v1/query?query={job='brix-node-exporter'}[7d]"
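The `-g` form above relies on the shell and curl leaving braces and quotes alone. An alternative sketch lets curl do the URL encoding via `-G --data-urlencode`. The helper name `prom_api_query` is an assumption for these notes (chosen so it does not clash with the `prom_query` alias further down).

```shell
# Run an instant query against ${URL}; curl URL-encodes the expression,
# so selectors with braces, quotes, or spaces need no manual escaping.
# (prom_api_query is a hypothetical helper name.)
prom_api_query() {
  curl -sG "${URL}/api/v1/query" --data-urlencode "query=$1"
}

# Usage:
#   prom_api_query 'up' | jq '.data.result[0]'
#   prom_api_query '{job="brix-node-exporter"}[7d]'
```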
promtool
- Only installable via GitHub release
- https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/#syntax-checking-rules
Install:
eget --file=promtool prometheus/prometheus
Setup env var:
export URL=https://...
promtool query instant ${URL} up
Use promtool from prometheus container
Set alias:
alias promtool='kubectl -n monitoring exec -it -c prometheus \
statefulset/prometheus-kube-prometheus-stack-prometheus -- promtool'
alias prom_query='kubectl -n monitoring exec -it -c prometheus \
statefulset/prometheus-kube-prometheus-stack-prometheus -- promtool query instant ${URL}:9090'
promtool query instant ${URL} node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate
Check rules
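A minimal sketch of a rule check with `promtool check rules`, as described in the syntax-checking link above. The file path and the recording rule are illustrative placeholders, and a locally installed promtool is assumed.

```shell
# Write a minimal recording-rule file and validate it.
# The path and the rule itself are placeholders for illustration.
RULES_FILE="${TMPDIR:-/tmp}/example.rules.yml"

cat > "${RULES_FILE}" <<'EOF'
groups:
  - name: example
    rules:
      - record: job:up:sum
        expr: sum by (job) (up)
EOF

# Only run the check when promtool is on the PATH.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules "${RULES_FILE}"
fi
```

When promtool runs inside the Prometheus container instead, the rule file has to exist inside that container.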
cli clients
Stale / unmaintained
- https://github.com/mtulio/prometheus-cli
- https://github.com/ryotarai/prometheus-query
- https://github.com/prometheus-junkyard/prometheus_cli
Helm chart
Issues:
Queries
https://prometheus.io/docs/prometheus/latest/querying/basics/
https://prometheus.io/docs/prometheus/latest/querying/examples/
Examples:
avg_over_time(node_memory_MemAvailable_bytes[1m])/1024/1024
Show failed blackbox exporter queries:
probe_success==0
Data maintenance
List all metric names:
curl "${URL}/api/v1/label/__name__/values" | jq .
List all metric names starting with stackdriver:
curl -G "${URL}/api/v1/label/__name__/values" --data-urlencode "match[]={__name__=~'stackdriver.+'}" | jq .
List all metric names matching a label:
curl -G "${URL}/api/v1/label/__name__/values" --data-urlencode \
"match[]={__name__=~'.+', pod='prometheus-stackdriver-exporter-77c9767477-dz64s'}" | jq .
Admin API
For some tasks (e.g. metric deletion) the admin API must be enabled.
Docker compose:
Add the flag to the command list:
command:
- '--web.enable-admin-api'
...
Kubernetes:
Set enableAdminAPI: true in prometheus/kube-prometheus-stack-prometheus:
kubectl -n monitoring edit prometheus kube-prometheus-stack-prometheus
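A non-interactive alternative to `kubectl edit` is a merge patch on the same resource. This is a sketch: the function name `enable_admin_api` is made up here, and it assumes the field lives at `spec.enableAdminAPI` of the Prometheus custom resource.

```shell
# Enable the admin API on the Prometheus custom resource without opening an editor.
# (enable_admin_api is a hypothetical helper name; the namespace and resource
# name are the ones used in the kubectl edit command above.)
enable_admin_api() {
  kubectl -n monitoring patch prometheus kube-prometheus-stack-prometheus \
    --type merge -p '{"spec":{"enableAdminAPI":true}}'
}

# Usage:
#   enable_admin_api
```

If the resource is managed by the Helm chart, a later upgrade may revert the patch. The chart should expose the same setting as `prometheus.prometheusSpec.enableAdminAPI`, but check the chart's values before relying on that.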
Delete metrics
Prometheus: Delete Time Series Metrics
Note: The admin API needs to be enabled; see above for how to do this.
# Example selectors; the second export overrides the first, so set only the one you need:
export QUERY="{instance=~'mediaplayer.cas.*'}"
export QUERY="{__name__=~'stackdriver.*'}"
Search for old time series:
curl --netrc -g "${URL}/api/v1/query?query=${QUERY}[7d]"
Mark metrics matching a regex for deletion:
curl --netrc -X POST -g "${URL}/api/v1/admin/tsdb/delete_series?match[]=${QUERY}"
Actually delete them:
curl --netrc -X POST -g "${URL}/api/v1/admin/tsdb/clean_tombstones"
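The two steps above can be wrapped in one function. This is a sketch, not a tested tool: `prom_delete_series` is a made-up name, `-f` is added so curl fails on HTTP errors, and `--netrc` is left out (add it back if the server needs credentials).

```shell
# Tombstone all series matching a selector, then purge them from disk.
# Requires the admin API to be enabled on the server.
# (prom_delete_series is a hypothetical helper name.)
prom_delete_series() {
  # Step 1: mark matching series for deletion; stop if the request fails.
  curl -sf -X POST -G "${URL}/api/v1/admin/tsdb/delete_series" \
    --data-urlencode "match[]=$1" || return 1
  # Step 2: remove the tombstoned data from disk.
  curl -sf -X POST "${URL}/api/v1/admin/tsdb/clean_tombstones"
}

# Usage:
#   prom_delete_series "{__name__=~'stackdriver.*'}"
```

Running the range query from above first is still a good idea, since deletion cannot be undone.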
Mixins
https://github.com/monitoring-mixins/docs
A mixin is a set of Grafana dashboards and Prometheus rules and alerts, packaged together in a reusable and extensible bundle. Mixins are written in jsonnet, and are typically installed and updated with jsonnet-bundler.
You can either generate mixins by hand or use the pre-generated alerts.
cd ~/projects/monitoring/prometheus/mixins/monitoring-mixins-website
gup
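Generating a mixin by hand could look roughly like this. It is a sketch under assumptions: `jb` (jsonnet-bundler) and `jsonnet` are installed, the node-mixin path is only an example, the mixin exposes `prometheusAlerts` in `mixin.libsonnet`, and `render_mixin_alerts` is a made-up function name. Run it in an empty working directory.

```shell
# Vendor a mixin with jsonnet-bundler and render its alerts to YAML.
# The node-mixin path is only an example; substitute the mixin you need.
MIXIN="github.com/prometheus/node_exporter/docs/node-mixin"

# (render_mixin_alerts is a hypothetical helper name.)
render_mixin_alerts() {
  jb init || return 1                 # creates jsonnetfile.json in the current directory
  jb install "${MIXIN}" || return 1   # downloads the mixin into ./vendor
  # Evaluate the mixin and write its alert rules to alerts.yml.
  jsonnet -J vendor -S \
    -e "std.manifestYamlDoc((import \"${MIXIN}/mixin.libsonnet\").prometheusAlerts)" \
    > alerts.yml
}

# Usage:
#   render_mixin_alerts && promtool check rules alerts.yml
```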
Prometheus-operator helm chart
Uninstall
https://github.com/helm/charts/tree/master/stable/prometheus-operator#uninstalling-the-chart
helm delete --purge oas-test-prometheus
kubectl delete crd prometheuses.monitoring.coreos.com \
prometheusrules.monitoring.coreos.com \
servicemonitors.monitoring.coreos.com \
podmonitors.monitoring.coreos.com \
alertmanagers.monitoring.coreos.com
If needed, also delete the persistent volume claims:
kubectl delete -n oas persistentvolumeclaims \
prometheus-prometheus-oas-test-prometheus-promet-prometheus-0 \
alertmanager-alertmanager-oas-test-prometheus-promet-alertmanager-0