Conjur cluster health status dashboard - standby & follower shows unknown

Hello, I am building a lab for a Conjur POC and installed Conjur v12.4.0, with HAProxy as the load balancer. I pretty much followed the documentation and built the cluster (1 Master and 2 Standbys) plus 1 Follower node. Everything seems to work, but the dashboard in the UI displays “unable to connect to the health API for this node” for all Standbys and Followers.

When I issue a curl command, I can see that the health of the Standbys and Followers is normal, but the UI says it cannot connect. I am confused as to how curl and the UI can show different results. I checked the nginx logs on the Master and did not see much information.

I have spent a good deal of time troubleshooting this and am reaching out to the community for thoughts. Can you please point me in the right direction on why this is happening?

Thanks

Hi @senko,

What this means is that the IP address of the Standby or Follower connection doesn’t allow the Master to connect back and check the health. This is typically because the Standbys or Followers are connecting to the Master through a load balancer.

You may notice that your Standbys and Followers all show an IP address of 10.2.0.6. I suspect this is your load balancer’s IP address. When the Master attempts to connect to this IP address to check the node’s health, it is routed back to itself rather than to the desired Standby or Follower. Because of this, it isn’t able to report the health of the node in this dashboard.

This doesn’t mean those nodes are unhealthy, only that the Master itself isn’t able to check the health with the information it has available.

Thanks much @micahlee. Showing the LB IP address is normal for a load-balanced setup, right? How do I get this fixed in the UI? Any thoughts are greatly appreciated.

Hi @senko, yes, this is normal. Unfortunately, there isn’t a way to change this in the UI. It’s provided as a convenience when the information is available. However, true system-wide monitoring requires an external system that can query each of the health endpoints.
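
As a rough illustration of what such an external check could look like, here is a minimal Python sketch that queries each node's health endpoint directly rather than through the load balancer. The hostnames are placeholders, and the rule that a healthy node answers HTTP 200 with a JSON body containing `"ok": true` is an assumption to verify against the health API of your Conjur version.

```python
import json
import ssl
import urllib.error
import urllib.request

# Hypothetical node addresses: use each node's own hostname, not the load balancer's.
NODES = [
    "conjur-standby-1.example.com",
    "conjur-standby-2.example.com",
    "conjur-follower-1.example.com",
]


def fetch_health(host, timeout=5.0, ca_file=None):
    """Return (HTTP status, response body) from https://<host>/health."""
    context = ssl.create_default_context(cafile=ca_file)
    url = f"https://{host}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # A non-2xx answer may still carry a body worth inspecting.
        return err.code, err.read().decode("utf-8", errors="replace")


def is_healthy(status, body):
    """Assumption: a healthy node answers 200 with JSON containing "ok": true."""
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def check_nodes(hosts, fetch=fetch_health):
    """Map each host to "healthy", "unhealthy", or "unreachable"."""
    results = {}
    for host in hosts:
        try:
            status, body = fetch(host)
        except OSError:
            # Covers connection refused, DNS failure, TLS errors, and timeouts.
            results[host] = "unreachable"
            continue
        results[host] = "healthy" if is_healthy(status, body) else "unhealthy"
    return results
```

Calling `check_nodes(NODES)` from a scheduled job returns a dict such as `{"conjur-standby-1.example.com": "healthy", ...}` that can be fed into whatever alerting is already in place. The `fetch` parameter exists only so the network call can be swapped out in tests.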

Thanks @micahlee. In a live production environment, how is this designed? Do the load balancers need to set X-Forwarded-For to preserve the client IP address? If so, then it must be Layer 7 load balancing, correct? From the official docs, I see we would need Layer 4 load balancing for PostgreSQL replication.

I am mostly curious about how the cluster is designed to preserve the client IP address using X-Forwarded-For headers. Appreciate your help.

Hi @senko,

There are only two persistent connections between Standbys/Followers and the Master:

  1. Postgres for data replication from the Master.

  2. Syslog-ng for audit forwarding from Followers to the Master.

Both of these are TCP connections, not HTTP, so they require a Layer 4 (TCP) load balancer. The cluster UI page gets the list of Standbys and Followers from the active Postgres connections. This is also where the Master gets the IP address for each peer node. Because these are TCP connections through the load balancer, the Master sees the connection as coming from the load balancer and doesn’t see the IP address of the actual Follower or Standby. We haven’t found an option for making the true Follower/Standby IP address available to Postgres, which gives us the current limitation in the UI health page. These TCP connections are only used by Standbys and Followers, and are not used by Conjur API clients.

This limitation isn’t present for API clients using the Conjur Secrets Manager HTTP API. For that, we can use a Layer 7 load balancer and leverage the X-Forwarded-For header to pass through the true client IP address. This is an important configuration for using IP-based authorization and audit logging. More information on how this works and how it is configured is in the documentation here: IP address sourcing
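
To make the X-Forwarded-For idea concrete, here is a minimal Python sketch of how a server behind a Layer 7 load balancer can recover the client address. It illustrates the general technique only and is not Conjur's actual implementation. The function name and the trusted-proxy list are hypothetical, and 10.2.0.6 is used as the load balancer address only because it is the address discussed earlier in this thread.

```python
import ipaddress


def resolve_client_ip(remote_addr, x_forwarded_for, trusted_proxies):
    """Return the client IP after discounting trusted proxies.

    remote_addr:     peer address of the TCP connection (what the server sees).
    x_forwarded_for: raw header value, or None if the header is absent.
    trusted_proxies: iterable of IP addresses or CIDR ranges (strings).
    """
    trusted = [ipaddress.ip_network(p, strict=False) for p in trusted_proxies]

    def is_trusted(addr):
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False  # not a valid IP address, so never trusted
        return any(ip in net for net in trusted)

    # Only honor the header when the direct peer is a known proxy;
    # otherwise any client could forge it.
    if not x_forwarded_for or not is_trusted(remote_addr):
        return remote_addr

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Each proxy appends the address it received the request from, so walk
    # right to left and stop at the first hop that is not a trusted proxy.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every listed hop is a trusted proxy: fall back to the leftmost entry.
    return hops[0] if hops else remote_addr


# The load balancer at 10.2.0.6 forwards a request from 192.168.1.50.
print(resolve_client_ip("10.2.0.6", "192.168.1.50", ["10.2.0.6"]))
```

Without the header, or when the direct peer is not in the trusted list, the function returns the peer address unchanged. A plain TCP connection through a Layer 4 load balancer is always in the first of those situations, since there is no HTTP header to carry the original address.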

Thanks @micahlee. I was thinking of adding the load balancers using “evoke proxy add” to solve this. But as you pointed out, this won’t help, as the Master reads the Standbys from the DB connections. I will dig into this. Appreciate all your responses.
