High Availability (HA) mode solves for horizontal scalability and automatic failover within a single region. When in HA mode, Coder continues using a single Postgres endpoint. GCP, AWS, and other cloud vendors offer fully-managed HA Postgres services that pair nicely with Coder.
For Coder to operate correctly, Coderd instances should have low-latency connections to each other so that they can effectively relay traffic between users and workspaces no matter which Coderd instance users or workspaces connect to. We make a best-effort attempt to warn the user when inter-Coderd latency is too high, but if requests start dropping, this is one metric to investigate.
We also recommend that you deploy all Coderd instances such that they have low-latency connections to Postgres. Coderd often makes several database round-trips while processing a single API request, so prioritizing low latency between Coderd and Postgres is more important than low latency between users and Coderd.
Note that this latency requirement applies only to Coder services. Coder will operate correctly even with a few seconds of latency on workspace <-> Coder and user <-> Coder connections.
Coder automatically enters HA mode when multiple instances simultaneously connect to the same Postgres endpoint.
HA mode adds one configuration variable to set on each Coderd node: CODER_DERP_SERVER_RELAY_URL. The HA nodes use these URLs to communicate with each other. Inter-node communication is only required while using the embedded relay (default). If you're using custom relays, Coder ignores CODER_DERP_SERVER_RELAY_URL since Postgres is the sole rendezvous for the Coder nodes.
CODER_DERP_SERVER_RELAY_URL will never be CODER_ACCESS_URL because CODER_ACCESS_URL is a load balancer to all Coder nodes.
Here's an example 3-node network configuration setup:
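As a rough sketch of such a setup, the per-node environment might look like the following. The node names, private IP addresses, port, and the coder.example.com hostname are hypothetical placeholders, not values from a real deployment. The point it illustrates is that each node advertises its own directly reachable address as the relay URL, while every node shares the same load-balanced access URL.

```yaml
# Illustrative only: node names, IPs, port, and hostname are assumed placeholders.
coder-1:
  CODER_DERP_SERVER_RELAY_URL: "http://10.0.0.1:80"   # this node's own address
  CODER_ACCESS_URL: "https://coder.example.com"       # shared load balancer
coder-2:
  CODER_DERP_SERVER_RELAY_URL: "http://10.0.0.2:80"   # unique per node
  CODER_ACCESS_URL: "https://coder.example.com"       # same on every node
coder-3:
  CODER_DERP_SERVER_RELAY_URL: "http://10.0.0.3:80"   # unique per node
  CODER_ACCESS_URL: "https://coder.example.com"       # same on every node
```

All three nodes would also point at the same Postgres endpoint, which is what causes Coder to enter HA mode.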
If you installed Coder via our Helm Chart, just
If you installed Coder into Kubernetes by some other means, insert the relay URL via the environment like so:
    env:
      - name: POD_IP
        valueFrom:
          fieldRef:
            fieldPath: status.podIP
      - name: CODER_DERP_SERVER_RELAY_URL
        value: http://$(POD_IP)
Then, increase the number of pods.
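For a non-Helm install, the two steps above could be combined roughly as follows. This is a minimal sketch that assumes Coder runs as a Kubernetes Deployment; the name coder, the app label, the image reference, and the replica count are assumptions for illustration, and other required settings (such as the Postgres connection) are omitted.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coder                      # hypothetical Deployment name
spec:
  replicas: 3                      # more than one pod, all using the same Postgres endpoint
  selector:
    matchLabels:
      app: coder                   # hypothetical label
  template:
    metadata:
      labels:
        app: coder
    spec:
      containers:
        - name: coder
          image: ghcr.io/coder/coder:latest   # assumed image reference
          env:
            # POD_IP must be declared before it is referenced via $(POD_IP).
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            # Each pod advertises its own IP as its relay URL.
            - name: CODER_DERP_SERVER_RELAY_URL
              value: http://$(POD_IP)
```

Because the relay URL is derived from each pod's own IP, it stays unique per replica without any per-pod configuration.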