Authenticator pod failing to authenticate

Hi there,

While configuring the OpenShift integration, I am currently hitting the following authentication error in the authenticator init container.

INFO: 2019/10/30 06:07:08 authenticator.go:187: CAKC005I Trying to login Conjur…
INFO: 2019/10/30 06:07:08 authenticator.go:120: CAKC007I Logging in as user host/conjur/authn-k8s/conjur-authenticator/apps/conjur-follower/conjur_service_account/conjur-cluster.
INFO: 2019/10/30 06:07:08 requests.go:23: CAKC011I Login request to: https://conjurmaster.nodewhatever/authn-k8s/conjur-authenticator/inject_client_cert
ERROR: 2019/10/30 06:07:08 authenticator.go:139: CAKC029E Received invalid response to certificate signing request. Reason: status code 401,
ERROR: 2019/10/30 06:07:08 authenticator.go:190: CAKC015E Login failed
ERROR: 2019/10/30 06:07:08 main.go:46: CAKC016E Failed to authenticate
ERROR: 2019/10/30 06:07:08 main.go:74: CAKC031E Retransmission backoff exhausted

Is there a way to test the authentication directly, from outside the pod, to make sure the configuration is correct?

Thanks in advance.

Regards,

Jose

After talking to @QuincyCheng, I was pointed to the troubleshooting page (https://gist.github.com/micahlee/d20fcf0a47cfffeaf487f49c7e26b6df), which helped me a lot.

I found that my variable conjur/authn-k8s//kubernetes/ca-cert was empty. I used the following step to populate it:

conjur variable values add \
  conjur/authn-k8s//kubernetes/ca-cert \
  "$(kubectl get secret -n $TOKEN_SECRET_NAME -o json \
    | jq -r '.data["ca.crt"]' \
    | base64 --decode)"

Unfortunately, even after that, the issue remains:

INFO: 2019/10/30 07:38:46 authenticator.go:187: CAKC005I Trying to login Conjur…
INFO: 2019/10/30 07:38:46 authenticator.go:120: CAKC007I Logging in as user host/conjur/authn-k8s/conjur-authenticator/apps/conjur-follower/conjur_service_account/conjur-cluster.
INFO: 2019/10/30 07:38:46 requests.go:23: CAKC011I Login request to: https://conjurmaster.nodewhatever/authn-k8s/conjur-authenticator/inject_client_cert
ERROR: 2019/10/30 07:38:46 authenticator.go:139: CAKC029E Received invalid response to certificate signing request. Reason: status code 401,
ERROR: 2019/10/30 07:38:46 authenticator.go:190: CAKC015E Login failed
ERROR: 2019/10/30 07:38:46 main.go:46: CAKC016E Failed to authenticate

curl -ks https://conjurmaster.nodewhatever/info | jq .authenticators
{
  "installed": [
    "authn",
    "authn-iam",
    "authn-k8s",
    "authn-ldap",
    "authn-oidc"
  ],
  "configured": [
    "authn",
    "authn-k8s/conjur-authenticator"
  ],
  "enabled": [
    "authn",
    "authn-k8s/conjur-authenticator"
  ]
}

Hi,

Your host record probably needs to be host/conjur/authn-k8s/conjur-authenticator/apps/conjur-follower/service_account/conjur-cluster, where conjur-cluster is the service account you created in OCP and conjur-follower is the namespace. The value that sits between those two in the host entry must be the OCP resource type of the resource listed after it. Phrased differently: host is the Conjur resource type, conjur is the Conjur account name in your case, authn-k8s/conjur-authenticator is the authenticator service ID, apps is a hardcoded prefix, conjur-follower is the namespace, and the last two segments are the OCP resource type (typically service_account) and the resource name. Hopefully that clarifies things, but let me know if I can explain further.
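
To make the layout concrete, here is a minimal shell sketch that assembles a host ID from its segments and encodes the slashes the way the ID appears in the authenticate request path. The helper names build_host_id and urlencode_slashes are hypothetical (they are not part of any Conjur tooling), and the argument values are simply the ones from this thread.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of the Conjur CLI.

# Assemble a host ID following the layout described above:
#   host/<prefix>/<authenticator service ID>/apps/<namespace>/<resource type>/<resource name>
build_host_id() {
  prefix="$1"          # first segment after "host" ("conjur" in this thread)
  service_id="$2"      # e.g. authn-k8s/conjur-authenticator
  namespace="$3"       # OCP namespace
  resource_type="$4"   # typically service_account
  resource_name="$5"   # e.g. the service account name
  printf 'host/%s/%s/apps/%s/%s/%s\n' \
    "$prefix" "$service_id" "$namespace" "$resource_type" "$resource_name"
}

# Encode the slashes so the ID can be used as a single URL path segment.
urlencode_slashes() {
  printf '%s\n' "$1" | sed 's|/|%2F|g'
}

host_id="$(build_host_id conjur authn-k8s/conjur-authenticator \
  conjur-follower service_account conjur-cluster)"
echo "$host_id"
urlencode_slashes "$host_id"
```

Comparing the printed ID against the "Logging in as user" line in the authenticator log is a quick way to spot a wrong resource type segment.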

Thanks @nathan.whipple, I wasn’t aware that the resource type had to be “service_account”. I definitely have it wrong, as you can see below too:

I will update it and let you know how it goes.

Appreciate the help.

Hi @nathan.whipple. I did update as suggested, but it is still failing to authenticate.

INFO: 2019/10/31 04:58:09 main.go:43: CAKC006I Authenticating as user ‘host/conjur/authn-k8s/conjur-authenticator/apps/conjur-follower/service_account/conjur-cluster’
INFO: 2019/10/31 04:58:09 authenticator.go:176: CAKC008I Cert expires: 2019-11-03 04:57:20 +0000 UTC
INFO: 2019/10/31 04:58:09 authenticator.go:177: CAKC009I Current date: 2019-10-31 04:58:09.945300346 +0000 UTC
INFO: 2019/10/31 04:58:09 authenticator.go:178: CAKC010I Buffer time: 30s
INFO: 2019/10/31 04:58:09 requests.go:46: CAKC012I Authn request to: https://conjurmaster.dtt-iam.xyz/authn-k8s/conjur-authenticator/deloittepilot/host%2Fconjur%2Fauthn-k8s%2Fconjur-authenticator%2Fapps%2Fconjur-follower%2Fservice_account%2Fconjur-cluster/authenticate
ERROR: 2019/10/31 04:58:09 main.go:46: CAKC016E Failed to authenticate

$ oc get serviceaccount conjur-cluster -n conjur-follower
NAME SECRETS AGE
conjur-cluster 2 45h

$ oc describe serviceaccount conjur-cluster -n conjur-follower
Name:                conjur-cluster
Namespace:           conjur-follower
Labels:
Annotations:         kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"conjur-cluster","namespace":"conjur-follower"}}

Image pull secrets:  conjur-cluster-dockercfg-f4m4f
                     dockerpullsecret
                     dockerpullsecretconjur
Mountable secrets:   conjur-cluster-token-nwnm4
                     conjur-cluster-dockercfg-f4m4f
Tokens:              conjur-cluster-token-dlzpl
                     conjur-cluster-token-nwnm4
Events:

I did try to update the resource to “serviceaccount” instead, but the error was the same.

I will keep working on it, thanks in advance.

This is the error that has started to show up on the Conjur master side:

conjur/authn-k8s/conjur-authenticator/apps/conjur-follower/service_account/conjur-cluster failed to authenticate with authenticator authn-k8s service deloittepilot:webservice:conjur/authn-k8s/conjur-authenticator: CONJ00029E Client SSL certificate is missing from the header

The master server log:

2019-10-31T05:48:53.000+00:00 d511651609b8 conjur-possum: [origin=13.238.105.115] [req_id=c41ebdd9-4ef4-40a4-a15b-9aa5ef7ecc55] Started POST “/authn-k8s/conjur-authenticator/deloittepilot/host%2Fconjur%2Fauthn-k8s%2Fconjur-authenticator%2Fapps%2Fconjur-follower%2Fservice_account%2Fconjur-cluster/authenticate” for 13.238.105.115 at 2019-10-31 05:48:53 +0000
2019-10-31T05:48:53.000+00:00 d511651609b8 conjur-possum: [origin=13.238.105.115] [req_id=c41ebdd9-4ef4-40a4-a15b-9aa5ef7ecc55] Processing by AuthenticateController#authenticate as HTML
2019-10-31T05:48:53.000+00:00 d511651609b8 conjur-possum: [origin=13.238.105.115] [req_id=c41ebdd9-4ef4-40a4-a15b-9aa5ef7ecc55] Parameters: {“authenticator”=>“authn-k8s”, “service_id”=>“conjur-authenticator”, “account”=>“deloittepilot”, “id”=>“host/conjur/authn-k8s/conjur-authenticator/apps/conjur-follower/service_account/conjur-cluster”}
2019-10-31T05:48:53.000+00:00 d511651609b8 conjur-possum: [origin=13.238.105.115] [req_id=c41ebdd9-4ef4-40a4-a15b-9aa5ef7ecc55] Completed 401 Unauthorized in 5ms
2019-10-31T05:48:54.065+00:00 d511651609b8 nginx: 10.5.10.228 “POST /authn-k8s/conjur-authenticator/deloittepilot/host%2Fconjur%2Fauthn-k8s%2Fconjur-authenticator%2Fapps%2Fconjur-follower%2Fservice_account%2Fconjur-cluster/authenticate HTTP/1.1” 401 5 “-” “Go-http-client/1.1” 0.007 0.004

Hi,

Did you update the host resource in Conjur and the entitlement as well?

Hi @nathan.whipple, yes, I have updated the policy so that the host has the authenticate privilege on the webservice:

- !policy
    id: apps
    body:
    # All application roles that are run in K8s must have
    # membership in the `apps` layer
    - !layer

    # `authenticated-resources` is an array of hosts that map to
    # resources in K8s. The naming convention is
    # namespace/resource type/resource name
    - &authenticated-resources
      - !host
        id: conjur-follower/service_account/conjur-cluster
        annotations:
          kubernetes/authentication-container-name: authenticator
          # Uncomment the following line to display the platform's icon in the UI
          # <platform>: "true"

I previously had a different load-balancer certificate; I have now changed to using only the self-signed certificates created by evoke configure during the initial master configuration (I re-configured a new master from scratch today). I ended up with the same issue, and I am not sure where else I can add the Conjur master certificate chain, apart from the OCP config map “server-certificate”. Any other ideas would be welcome. Thanks!

curl --cacert server_certificate.cert https://conjurmaster.dtt-iam.xyz/health
{
  "services": {
    "ui": "ok",
    "possum": "ok",
    "ok": true
  },
  "database": {
    "ok": true,
    "connect": {
      "main": "ok"
    },
    "free_space": {
      "main": {
        "kbytes": 4302400,
        "inodes": 802413
      }
    },
    "replication_status": {
      "pg_current_xlog_location": "0/1A8F788",
      "pg_current_xlog_location_bytes": 27850632
    }
  },
  "ok": true
}

So this looks good, you’re getting a certificate so the first step is passing. Can you confirm if you added your ca-cert to conjur/authn-k8s/conjur-authenticator/kubernetes/ca-cert? Your note from last week listed the resource as conjur/authn-k8s/kubernetes, so I’m not sure if that was just a typo in your post or not. Also, can you confirm that the service-account-token and api-url values have been populated under conjur/authn-k8s/conjur-authenticator/kubernetes/ as well please?
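
One way to run that check is sketched below. It assumes the conjur CLI is already logged in with read access to these variables, and it uses the "conjur variable value <id>" form that appears elsewhere in this thread. check_k8s_vars is a hypothetical helper name, not an existing command.

```shell
#!/bin/sh
# Hypothetical helper: report which authn-k8s Kubernetes variables hold a value.
# Assumes the conjur CLI is installed and logged in.
check_k8s_vars() {
  service_id="$1"   # e.g. conjur-authenticator
  missing=0
  for name in ca-cert service-account-token api-url; do
    id="conjur/authn-k8s/${service_id}/kubernetes/${name}"
    # Only the presence of a value is reported; secret values are never printed.
    if value="$(conjur variable value "$id" 2>/dev/null)" && [ -n "$value" ]; then
      echo "populated: $id"
    else
      echo "EMPTY OR UNREADABLE: $id"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_k8s_vars conjur-authenticator prints one line per variable and returns a non-zero status if any of the three is empty or cannot be read.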

Hi @nathan.whipple, thanks for your help and for sticking with this (your guidance has been important for the details I was missing, sorry about that). I think the resources from last week might not have been right. I have re-created and re-configured the master from scratch, reviewing each and every step, so the latest setup should be more in line with the observations from your previous posts.

I have used the following to add the ca-cert:

conjur variable values add \
  conjur/authn-k8s/conjur-authenticator/kubernetes/ca-cert \
  "$(oc get secret -n conjur-follower $TOKEN_SECRET_NAME -o json \
    | jq -r '.data["ca.crt"]' \
    | base64 --decode)"

The conjur/authn-k8s/conjur-authenticator/kubernetes/api-url variable was set to the API server URL taken from:

oc config view --minify -o json \
  | jq -r '.clusters[0].cluster.server'

The service-account-token was taken from:

TOKEN_SECRET_NAME="$(oc get secrets -n conjur-follower \
  | grep 'conjur.*service-account-token' \
  | head -n1 \
  | awk '{print $1}')"

TOKEN_SECRET_VALUE="$(oc get secret -n conjur-follower $TOKEN_SECRET_NAME -o json \
  | jq -r .data.token \
  | base64 --decode)"
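
For completeness, values gathered this way could then be stored with the same "conjur variable values add" form used for the ca-cert. This is only a sketch: set_k8s_var is a hypothetical wrapper, and API_URL is an assumed variable name holding the output of the oc config view command above.

```shell
#!/bin/sh
# Hypothetical wrapper; assumes the conjur CLI is logged in with update rights.
set_k8s_var() {
  service_id="$1"   # e.g. conjur-authenticator
  name="$2"         # ca-cert, service-account-token, or api-url
  value="$3"
  # Guard against silently storing an empty value, which is what went
  # unnoticed with the ca-cert earlier in this thread.
  if [ -z "$value" ]; then
    echo "refusing to store an empty value for $name" >&2
    return 1
  fi
  conjur variable values add \
    "conjur/authn-k8s/${service_id}/kubernetes/${name}" "$value"
}

# Example calls (left commented out so nothing is written by accident):
# set_k8s_var conjur-authenticator service-account-token "$TOKEN_SECRET_VALUE"
# set_k8s_var conjur-authenticator api-url "$API_URL"
```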

Not sure if this helps, but I am able to reproduce the same 401 from a curl command (outside of the authenticator pod).

In this case I am just using the server_certificate.cert file that is generated by the https://github.com/cyberark/kubernetes-conjur-deploy/start script and uploaded as the config map.

Request (without passing --key or --cert; I am not sure whether one of those is what is missing):

curl -X POST -v --cacert server_certificate.cert https://conjurmaster.dtt-iam.xyz/authn-k8s/conjur-authenticator/deloittepilot/host%2Fconjur%2Fauthn-k8s%2Fconjur-authenticator%2Fapps%2Fconjur-follower%2Fservice_account%2Fconjur-cluster/authenticate
*   Trying 13.237.229.226...
* TCP_NODELAY set
* Connected to conjurmaster.dtt-iam.xyz (13.237.229.226) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: server_certificate.cert
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=conjurmaster.dtt-iam.xyz
*  start date: Nov  4 07:13:27 2019 GMT
*  expire date: Nov  1 07:13:27 2029 GMT
*  subjectAltName: host "conjurmaster.dtt-iam.xyz" matched cert's "conjurmaster.dtt-iam.xyz"
*  issuer: O=deloittepilot; OU=Conjur CA; CN=conjurmaster.dtt-iam.xyz
*  SSL certificate verify ok.
> POST /authn-k8s/conjur-authenticator/deloittepilot/host%2Fconjur%2Fauthn-k8s%2Fconjur-authenticator%2Fapps%2Fconjur-follower%2Fservice_account%2Fconjur-cluster/authenticate HTTP/1.1
> Host: conjurmaster.dtt-iam.xyz
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 401 Unauthorized
< Cache-Control: no-cache
< Content-Type: text/html
< Date: Tue, 05 Nov 2019 00:33:04 GMT
< Server: nginx
< X-Content-Type-Options: nosniff
< X-Frame-Options: SAMEORIGIN
< X-Request-Id: c4226977-a442-439a-a049-ba2951befd70
< X-Runtime: 0.007000
< X-XSS-Protection: 1; mode=block
< Content-Length: 0
< Connection: keep-alive
< 
* Connection #0 to host conjurmaster.dtt-iam.xyz left intact

The error in the console is the same “CONJ00029E Client SSL certificate is missing from the header”.

And the server logs:

2019-11-05T00:33:04.000+00:00 3c5dd9119af3 conjur-possum: [origin=122.105.87.62] [req_id=c4226977-a442-439a-a049-ba2951befd70] Started POST "/authn-k8s/conjur-authenticator/deloittepilot/host%2Fconjur%2Fauthn-k8s%2Fconjur-authenticator%2Fapps%2Fconjur-follower%2Fservice_account%2Fconjur-cluster/authenticate" for 122.105.87.62 at 2019-11-05 00:33:04 +0000
2019-11-05T00:33:04.000+00:00 3c5dd9119af3 conjur-possum: [origin=122.105.87.62] [req_id=c4226977-a442-439a-a049-ba2951befd70] Processing by AuthenticateController#authenticate as */*
2019-11-05T00:33:04.000+00:00 3c5dd9119af3 conjur-possum: [origin=122.105.87.62] [req_id=c4226977-a442-439a-a049-ba2951befd70]   Parameters: {"authenticator"=>"authn-k8s", "service_id"=>"conjur-authenticator", "account"=>"deloittepilot", "id"=>"host/conjur/authn-k8s/conjur-authenticator/apps/conjur-follower/service_account/conjur-cluster"}
2019-11-05T00:33:04.000+00:00 3c5dd9119af3 conjur-possum: [origin=122.105.87.62] [req_id=c4226977-a442-439a-a049-ba2951befd70] Completed 401 Unauthorized in 6ms
2019-11-05T00:33:05.522+00:00 3c5dd9119af3 nginx: 10.5.10.228 "POST /authn-k8s/conjur-authenticator/deloittepilot/host%2Fconjur%2Fauthn-k8s%2Fconjur-authenticator%2Fapps%2Fconjur-follower%2Fservice_account%2Fconjur-cluster/authenticate HTTP/1.1" 401 5 "-" "curl/7.54.0" 0.008 0.008

Hi Jose,

Did you initialize the internal CA we use to create a client certificate that is provided to the pod? Those would be the steps that populate the variables /conjur/authn-k8s/conjur-authenticator/ca/cert and /conjur/authn-k8s/conjur-authenticator/ca/key. I’ve never seen this error before, but the only client certificate in the mix here is the one the authenticator is going to generate with this CA and feed back to the pod. I’ve seen a bad value set for these that throws a different error along with the 401, but I can’t recall the error you see when the value is blank. Regardless, I don’t see that we’ve verified that bit yet in this thread, so worth making sure.

-Nate

Hi @nathan.whipple,

I had initialised it before, but I did it again just in case, to refresh the secrets. I then re-ran the follower deployment and updated the service account token (the CA cert remained the same on the OCP/Kubernetes side).

docker exec conjur-master chpst -u conjur conjur-plugin-service possum rake authn_k8s:ca_init["conjur/authn-k8s/conjur-authenticator"]
`/root` is not writable.
Bundler will use `/tmp/bundler/home/unknown' as your home directory temporarily.
Rails Error: Unable to access log file. Please ensure that /opt/conjur/possum/log/appliance.log exists and is writable (ie, make it writable for user and group: chmod 0664 /opt/conjur/possum/log/appliance.log). The log level has been raised to WARN and the output directed to STDERR until the problem is fixed.
Populated CA and Key of service conjur/authn-k8s/conjur-authenticator
To print values:
 conjur variable value conjur/authn-k8s/conjur-authenticator/ca/cert
 conjur variable value conjur/authn-k8s/conjur-authenticator/ca/key

Issue remained the same though:

conjur/authn-k8s/conjur-authenticator/apps/conjur-follower/service_account/conjur-cluster failed to authenticate with authenticator authn-k8s service deloittepilot:webservice:conjur/authn-k8s/conjur-authenticator: CONJ00029E Client SSL certificate is missing from the header

I also tried the handy troubleshoot script, and the service account's access to the OpenShift API server works:

$ sh -x troubleshoot.sh 
+ CLI=oc
+ CONJUR_NAMESPACE_NAME=conjur-follower
++ uname -s
+ [[ Darwin == \L\i\n\u\x ]]
+ BASE64D='base64 -D'
++ oc get secrets -n conjur-follower
++ grep 'conjur.*service-account-token'
++ head -n1
++ awk '{print $1}'
+ TOKEN_SECRET_NAME=conjur-cluster-token-4jm4j
++ oc get secret -n conjur-follower conjur-cluster-token-4jm4j -o json
++ jq -r '.data["ca.crt"]'
++ base64 -D
+ CERT='******'
++ oc get secret -n conjur-follower conjur-cluster-token-4jm4j -o json
++ jq -r .data.token
++ base64 -D
+ TOKEN=****
++ oc config view --minify -o yaml
++ grep server
++ awk '{print $2}'
+ API=https://api.oc4-cluster.dtt-iam.xyz:6443
+ echo '$CERT_CHAIN**'
++ curl -s --cacert k8s.crt --header 'Authorization: Bearer ${TOKEN}' https://api.oc4-cluster.dtt-iam.xyz:6443/healthz
+ [[ ok == \o\k ]]
+ echo 'Service account access to K8s API verified.'
Service account access to K8s API verified.
+ rm k8s.crt

@micahlee, sorry to tag you here, but I saw that you are the author of the troubleshooting page, which was really helpful, so I thought I would ask your opinion on what this issue could be (I am running out of options at the moment).

Also, I have had a hard time finding documentation on how to increase the Conjur master's log level to get more details about this.

Thanks in advance!