Prometheus cannot access etcd targets created by relation

Bug #2004612 reported by Vern Hart
This bug affects 4 people
Affects: Etcd Charm
Status: Fix Released
Importance: High
Assigned to: Adam Dyess

Bug Description

When we relate etcd to prometheus, a manual job is added with the following config:

  - job_name: etcd-cecc7935-ce48-44ca-b734-2400f78a24cb
    scheme: https
    static_configs:
    - targets:
      - 10.50.13.59:2379
      - 10.50.13.86:2379
      - 10.50.13.77:2379

But the targets page of the web UI shows TLS errors for these targets:

    Get "https://10.50.13.59:2379/metrics": x509: certificate signed by unknown authority

I tried adding the following:

    tls_config:
      insecure_skip_verify: true

But then the new error on connection is:

    Get "https://10.50.13.59:2379/metrics": remote error: tls: bad certificate

It seems we need a valid client certificate to communicate with etcd.
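
For illustration, here is a minimal Python sketch of what a working scrape job would have to carry: a CA file to verify the etcd server certificate, plus a client certificate and key for etcd to verify Prometheus. The `ca_file`, `cert_file`, and `key_file` keys are standard Prometheus `tls_config` settings; the helper name and the file paths on the Prometheus side are assumptions made for this example, not what the charm actually writes.

```python
import json


def build_etcd_scrape_job(job_name, targets, ca_file, cert_file, key_file):
    """Return a Prometheus scrape job (as a dict) that uses mutual TLS.

    Hypothetical helper: it only assembles the config structure.
    """
    return {
        "job_name": job_name,
        "scheme": "https",
        "tls_config": {
            # CA used to verify the etcd server certificate.
            "ca_file": ca_file,
            # Client certificate and key presented to etcd.
            "cert_file": cert_file,
            "key_file": key_file,
        },
        "static_configs": [{"targets": list(targets)}],
    }


if __name__ == "__main__":
    job = build_etcd_scrape_job(
        "etcd-cecc7935-ce48-44ca-b734-2400f78a24cb",
        ["10.50.13.59:2379", "10.50.13.86:2379", "10.50.13.77:2379"],
        # Assumed locations on the Prometheus unit, for illustration only.
        "/path/to/etcd-ca.crt",
        "/path/to/etcd-client.crt",
        "/path/to/etcd-client.key",
    )
    # JSON is also valid YAML, so this output could be pasted under scrape_configs.
    print(json.dumps([job], indent=2))
```

Note that `insecure_skip_verify` is deliberately left out: it only disables verification of the server certificate and does nothing about the client certificate that etcd requires.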

Since etcd supports the relation with prometheus, it should take care of passing all the necessary bits to enable the communication.

In case it matters, this is prometheus2 charm, stable channel, rev 33 and etcd from the latest/stable channel, rev 724. Both deployed on focal.

George Kraft (cynerva) wrote :

Thanks for the report. The Prometheus job is defined by the etcd charm in the register_prometheus_jobs handler[1]. Looking at the prometheus-manual interface definition[2], it should be possible to provide client certificates to the register_job call.

[1]: https://github.com/charmed-kubernetes/layer-etcd/blob/ae98be0046953ced628f682eee266d0e875a62b0/reactive/etcd.py#L904-L912
[2]: https://github.com/juju-solutions/interface-prometheus-manual/blob/51094180f38de3e66afcda5e520cd0b895e88c26/provides.py#L17-L25

no longer affects: charm-aws-iam
Changed in charm-etcd:
importance: Undecided → High
status: New → Triaged
Vern Hart (vern) wrote :

I'm unsure why I originally filed this against charm-aws-iam. That was not my intention. Ah, I see now that I was directed to the charmed-kubernetes project to file the bug and did not notice the drop-down. My apologies.

I agree with your assessment that it should be possible (maybe even simple?) to provide the certificate on the relation.

The charm has an action for getting the certs called package-client-credentials and it references ~/.bash_aliases:

  $ cat ~/.bash_aliases
  export ETCDCTL_KEY=/var/snap/etcd/common/client.key
  export ETCDCTL_CERT=/var/snap/etcd/common/client.crt
  export ETCDCTL_CACERT=/var/snap/etcd/common/ca.crt

This suggests that updating the register job call with ca_cert, client_cert, and client_key with the contents of the above files would be all that is required. And, it turns out, there is already a method for pulling in those files.
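
A rough sketch of that idea in Python follows. The keyword names `ca_cert`, `client_cert`, and `client_key` come from the discussion above, and the file names match the `~/.bash_aliases` output. Everything else is an assumption: `register_etcd_job` is a hypothetical helper, the `job_data` layout is guessed from the generated job, and `RecordingEndpoint` is a stand-in for the real relation endpoint whose exact `register_job` signature should be checked against the interface code.

```python
import tempfile
from pathlib import Path


def register_etcd_job(prometheus, targets, cert_dir="/var/snap/etcd/common"):
    """Read etcd's client credentials and pass them along with the job.

    `prometheus` is any object exposing register_job(); the keyword names
    below follow the bug discussion and are not verified against the interface.
    """
    base = Path(cert_dir)
    job_data = {
        "scheme": "https",
        "static_configs": [{"targets": list(targets)}],
    }
    prometheus.register_job(
        job_name="etcd",
        job_data=job_data,
        ca_cert=(base / "ca.crt").read_text(),
        client_cert=(base / "client.crt").read_text(),
        client_key=(base / "client.key").read_text(),
    )


class RecordingEndpoint:
    """Hypothetical stand-in for the relation endpoint; it only records calls."""

    def __init__(self):
        self.calls = []

    def register_job(self, job_name, job_data, ca_cert=None,
                     client_cert=None, client_key=None):
        self.calls.append({
            "job_name": job_name,
            "job_data": job_data,
            "ca_cert": ca_cert,
            "client_cert": client_cert,
            "client_key": client_key,
        })


if __name__ == "__main__":
    # Use placeholder files so the sketch runs without an etcd snap installed.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("ca.crt", "client.crt", "client.key"):
            (Path(tmp) / name).write_text("placeholder " + name)
        endpoint = RecordingEndpoint()
        register_etcd_job(endpoint, ["10.50.13.59:2379"], cert_dir=tmp)
        print(sorted(endpoint.calls[0]))
```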

I gave that a try by patching a live charm and, it turns out, etcd (724) from latest/stable doesn't have the latest prometheus-manual interface layer so it doesn't support the client cert. I patched that too with the latest and it works. Prometheus is now pulling the etcd metrics.

Vern Hart (vern) wrote :

For completeness, I updated the patch to include the prometheus-manual interface, but this is probably not needed with a new charm build.

Chris Johnston (cjohnston) wrote :

The way that the original operation was supposed to work [1] is that etcd has a certificates relation with easyrsa/vault, and then prometheus should have the same relation. Prometheus would then get the certs from easyrsa/vault. In the case of Vern's deployment, it looks like there was no relation between prometheus and easyrsa.

I've just done a deployment using FCE and it looks like FCE configured etcd to use easyrsa, but then related prometheus to vault, so obviously this doesn't work.

It looks like the prometheus-manual interface possibly did not accept client_cert and client_key when this functionality was originally added [2].

It would seem safest to me for etcd to provide the certificates, but I'm not sure if there could be issues if etcd provides the certificates and there is a relation between prometheus and the correct one of easyrsa/vault.

[1] https://github.com/charmed-kubernetes/layer-etcd/pull/187
[2] https://github.com/juju-solutions/interface-prometheus-manual/commit/13501d437928fdafc9241f95265badf777255c8b

Chris Johnston (cjohnston) wrote :
tags: added: review-needed
Adam Dyess (addyess)
Changed in charm-etcd:
status: Triaged → In Progress
assignee: nobody → Adam Dyess (addyess)
milestone: none → 1.27+ck1
Adam Dyess (addyess) wrote :
Changed in charm-etcd:
status: In Progress → Fix Committed
tags: added: backport-needed
removed: review-needed
George Kraft (cynerva) wrote :
tags: removed: backport-needed
Changed in charm-etcd:
status: Fix Committed → Fix Released