Prometheus in Jsonnet via kube-prometheus
One of the main goals of rebuilding my Kubernetes cluster was configuring my ad hoc observability stack in a declarative fashion. The linchpin of the stack is Prometheus. I intended to install it via the kube-prometheus Jsonnet library, which includes Grafana. The recommended approach is to use jsonnet-bundler, but, given how complicated that would get with Argo CD, I wanted to provide simple, instantly usable Jsonnet.
I could do this either by using jsonnet-bundler locally (under WSL) and vendoring the dependencies (i.e. committing them to my own repository) or by using Git submodules. I preferred the latter approach, which wouldn’t require committing the dependencies’ files to my repository. I created a new repository containing just kube-prometheus at release-0.8 as a submodule. Unfortunately, this didn’t work: there were issues with absolute paths, relative paths, and transitive dependencies. I had no choice but to vendor the libraries, drastically and permanently inflating this new repository.
At any rate, once that was done, I encountered a chicken-and-egg situation: I had many ServiceMonitors to enable, but I could only do so once the Prometheus Operator, which defines the ServiceMonitor resource, was installed. I decided Prometheus would have to come immediately after Linkerd, which has instructions for scraping metrics without a ServiceMonitor. (This is possible for any application, but ServiceMonitors provide a simpler, more widely used interface.) I’d write my own ServiceMonitor for cert-manager, which came before Linkerd; everything afterwards would have monitoring enabled.
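Such a ServiceMonitor is only a few lines of Jsonnet. This is a minimal sketch; the selector labels and port name are assumptions and must match however cert-manager actually exposes its metrics Service:

Jsonnet
// Minimal ServiceMonitor sketch for cert-manager. The matchLabels and the
// port name are illustrative, not copied from a real manifest.
{
  apiVersion: 'monitoring.coreos.com/v1',
  kind: 'ServiceMonitor',
  metadata: {
    name: 'cert-manager',
    namespace: 'cert-manager',
  },
  spec: {
    selector: {
      matchLabels: { 'app.kubernetes.io/name': 'cert-manager' },
    },
    endpoints: [
      { port: 'http-metrics', interval: '60s' },
    ],
  },
}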
I tried it out. The entire kube-prometheus Application refused to sync, with a Status of Unknown. I increased the log level in the Argo CD application controller and kept seeing request object is too large; above that, I could see an error about kind: being missing in the YAML that was generated from the Jsonnet. I realized I was returning an object instead of an array, so I copied the basic example, which was able to sync.
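In outline, the array-returning shape looks like this; a sketch assuming the release-0.8 component layout, not the exact example I copied:

Jsonnet
// Sketch: have the Jsonnet entry point evaluate to a flat array of manifests
// (which Argo CD can render) by collecting the kube-prometheus components.
local kp = (import 'kube-prometheus/main.libsonnet') + {
  values+:: {
    common+: { namespace: 'monitoring' },
  },
};

std.objectValues(kp.prometheusOperator) +
std.objectValues(kp.prometheus) +
std.objectValues(kp.alertmanager) +
std.objectValues(kp.grafana) +
std.objectValues(kp.nodeExporter) +
std.objectValues(kp.kubeStateMetrics)  // remaining components elided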
I added the linkerd.io/inject annotation afterwards.
I put the suggested Linkerd scrape configurations in a Secret, which I pointed kube-prometheus’s additionalScrapeConfigs at.
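A sketch of such a Secret follows; the Secret name, key, and file path are placeholders, and the scrape jobs themselves come straight from the Linkerd documentation:

Jsonnet
// Sketch of the Secret carrying the Linkerd scrape configuration.
// 'additional-scrape-configs' and the file names are illustrative.
{
  apiVersion: 'v1',
  kind: 'Secret',
  metadata: {
    name: 'additional-scrape-configs',
    namespace: 'monitoring',
  },
  stringData: {
    'prometheus-additional.yaml': importstr 'linkerd-scrape-configs.yaml',
  },
}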
It didn’t seem to be processed: the Prometheus UI showed no Linkerd jobs in its configuration. I tried the main and release-0.9 branches of kube-prometheus, neither of which I was able to build:

Output
couldn't open import "github.com/kubernetes-monitoring/kubernetes-mixin/lib/add-runbook-links.libsonnet": no match locally or in the Jsonnet library path
Regardless, the real reason, as I discovered, was that I had prometheus+: under values+:: instead of as a sibling. Fixing that made the jobs appear.
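In sketch form, the working overlay keeps prometheus+: alongside values+:: (the Secret name and key are the same placeholders as above):

Jsonnet
// prometheus+: must be a sibling of values+::, not nested inside it.
local kp = (import 'kube-prometheus/main.libsonnet') + {
  values+:: {
    common+: { namespace: 'monitoring' },
  },
  prometheus+: {
    prometheus+: {
      spec+: {
        additionalScrapeConfigs: {
          name: 'additional-scrape-configs',
          key: 'prometheus-additional.yaml',
        },
      },
    },
  },
};

std.objectValues(kp.prometheus)  // plus the other components, flattened as before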
The next issue was that the jsonnet CLI tool could generate manifests but Argo CD said it couldn’t unmarshal an array into an object. I debugged this by generating the manifests locally, running them through gojsontoyaml, turning the resultant YAML into an array instead of a stream, and running that through kubectl apply --dry-run=client. All this showed that the ultimate cause was the namespaces: key I was trying to specify. I thought it might be a bug in the library, since the YAML looked malformed to me, so I temporarily removed it.
At last, I could use the dashboards I had copied from the Linkerd repository! I added x509-certificate-exporter (both the application and the dashboard) to monitor the TLS certificates I had created and potentially remind me when I needed to rotate them.
I later moved all observability-related resources into one Application, which I defined using k8s-libsonnet. I initially had to keep both the original YAML files and the JSON equivalents (e.g. for Helm values), until std.parseYaml was released in version 0.18.0 of go-jsonnet.
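That function lets the original YAML be read directly at evaluation time instead of maintaining a JSON copy; for example (the file path here is illustrative):

Jsonnet
// go-jsonnet 0.18.0+: parse a YAML Helm values file directly.
local values = std.parseYaml(importstr 'values.yaml');
values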
I noticed at some point that Grafana was ignoring the environment variables I had set for the admin username and password. Inspecting the Pod showed me it was indeed missing that configuration. I had to merge it the hard way:
Jsonnet
// Replace the container list wholesale: take the existing Grafana container
// (the only one in the Deployment) and merge in grafanaCredentials, the list
// of admin username/password environment variables defined elsewhere; every
// other Grafana manifest passes through unchanged.
[
  kp.grafana.deployment {
    spec+: {
      template+: {
        spec+: {
          containers: [
            super.containers[0] {
              env+: grafanaCredentials,
            },
          ],
        },
      },
    },
  },
] +
[kp.grafana[name] for name in std.filter(function(name) name != 'deployment', std.objectFields(kp.grafana))]