The monitoring stack runs in the monitor namespace and is deployed as a single ArgoCD Application at sync-wave 2 (after ingress-nginx and cert-manager). Prometheus stores 30 days of metrics on local-path storage; Grafana persists dashboards to a separate 5 GiB volume. Alertmanager is enabled but alert routing is configured separately.
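For orientation, a minimal sketch of what such an Application might look like. Only the name `monitor`, the target namespace, the sync-wave value, and the kube-prometheus-stack chart come from this page. The `argocd` namespace, the `default` project, the chart version placeholder, and the sync option are assumptions for illustration, not the actual manifest. Wave ordering between Applications also assumes a parent (app-of-apps) Application that syncs them.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: monitor
  namespace: argocd                      # assumption: ArgoCD's usual install namespace
  annotations:
    argocd.argoproj.io/sync-wave: "2"    # after ingress-nginx and cert-manager
spec:
  project: default                       # assumption
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    chart: kube-prometheus-stack
    targetRevision: "<chart-version>"    # placeholder: pin to the version in use
  destination:
    server: https://kubernetes.default.svc
    namespace: monitor
  syncPolicy:
    syncOptions:
      - CreateNamespace=true             # assumption: namespace created on first sync
```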
01 Stack components
| Component | URL | Notes |
|---|---|---|
| Grafana | grafana.in.alybadawy.com | Admin credentials from Vault secret/grafana-admin |
| Prometheus | prometheus.in.alybadawy.com | 30-day retention, 20 GiB PV on local-path |
| Alertmanager | alerts.in.alybadawy.com | Enabled — routing config TBD |
| node-exporter | DaemonSet (no UI) | CPU, memory, disk, network per node |
| kube-state-metrics | Deployment (no UI) | Kubernetes object state metrics (pods, deployments, etc.) |
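The UI hostnames above are typically exposed through the chart's ingress values. A hedged sketch for Grafana only, assuming the ingress class is named `nginx` and that cert-manager issues the certificate. The issuer name and TLS Secret name are hypothetical, and the real values file may differ.

```yaml
grafana:
  ingress:
    enabled: true
    ingressClassName: nginx                          # assumption: ingress-nginx class name
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt    # hypothetical issuer name
    hosts:
      - grafana.in.alybadawy.com
    tls:
      - secretName: grafana-tls                      # hypothetical Secret name
        hosts:
          - grafana.in.alybadawy.com
```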
02 Key configuration
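Several kube-prometheus-stack defaults are disabled: the scrape targets for etcd, the scheduler, and the controller-manager assume standalone control-plane components, which a single-node k3s cluster does not run as separate pods (k3s embeds the scheduler and controller-manager in its own process and, by default, uses SQLite rather than etcd). The Prometheus operator's admission webhooks are also disabled to simplify the install.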
prometheusOperator:
  tls:
    enabled: false            # no TLS on operator; ingress-nginx handles it
  admissionWebhooks:
    enabled: false            # simplified install; not needed for single-node

grafana:
  admin:
    existingSecret: grafana-admin   # ESO syncs this from Vault secret/grafana-admin
    userKey: admin-user
    passwordKey: admin-password
  persistence:
    enabled: true
    storageClassName: local-path    # not Longhorn; no backup needed for dashboards
    size: 5Gi

prometheus:
  prometheusSpec:
    externalLabels:
      cluster: homelab
      env: prod
    retention: 30d
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: local-path
          resources:
            requests:
              storage: 20Gi

# disabled: these components don't exist in single-node k3s
kubeEtcd:
  enabled: false
kubeScheduler:
  enabled: false
kubeControllerManager:
  enabled: false
Why local-path instead of Longhorn? Prometheus and Grafana store metrics and dashboard configs, which are useful but not critical. Using local-path keeps Longhorn's backup job list clean and reduces Longhorn I/O. If the node is destroyed, metrics history is lost but dashboards can be reimported from JSON.
03 Grafana credentials
Grafana admin credentials are managed through Vault → ESO. The grafana-admin Kubernetes Secret is created by an ExternalSecret that pulls from secret/grafana-admin in Vault.
$ vault kv put secret/grafana-admin \
    admin-user="admin" \
    admin-password="<strong-password>"
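The ExternalSecret that produces the grafana-admin Secret could look roughly like the sketch below. The store kind and name (`vault-backend`), the refresh interval, and the API version are assumptions. The `remoteRef.key` is written relative to a KV mount named `secret`, which is assumed to be configured on the store. The property names match the keys written by the `vault kv put` command above.

```yaml
apiVersion: external-secrets.io/v1beta1   # assumption: depends on the installed ESO version
kind: ExternalSecret
metadata:
  name: grafana-admin
  namespace: monitor
spec:
  refreshInterval: 1h                     # assumption
  secretStoreRef:
    kind: ClusterSecretStore              # assumption
    name: vault-backend                   # hypothetical store name
  target:
    name: grafana-admin                   # matches grafana.admin.existingSecret
  data:
    - secretKey: admin-user               # matches grafana.admin.userKey
      remoteRef:
        key: grafana-admin                # relative to the store's KV mount (assumed "secret")
        property: admin-user
    - secretKey: admin-password           # matches grafana.admin.passwordKey
      remoteRef:
        key: grafana-admin
        property: admin-password
```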
The monitor Application has an ignoreDifferences block for the Grafana secret and checksum annotation. This prevents ArgoCD from trying to revert Grafana's auto-generated secret hash on every sync. Without it, the app would show a permanent diff.
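A sketch of what such a block might contain, as a fragment of the Application spec. The resource names are hypothetical (they depend on the Helm release name), and the ignored paths are an assumption based on the Grafana chart's `checksum/secret` pod annotation. The real block may target different fields.

```yaml
spec:
  ignoreDifferences:
    - kind: Secret
      name: monitor-grafana               # hypothetical: depends on the release name
      namespace: monitor
      jsonPointers:
        - /data                           # ignore generated Secret contents
    - group: apps
      kind: Deployment
      name: monitor-grafana               # hypothetical
      namespace: monitor
      jqPathExpressions:
        # ignore the pod-template hash that changes with the Secret
        - .spec.template.metadata.annotations["checksum/secret"]
```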