Prometheus

Coder exposes many metrics that can be consumed by a Prometheus server and that give insight into the current state of a live Coder deployment.

If you don't have a Prometheus server installed, you can follow the Prometheus Getting started guide.

Enable Prometheus metrics

The Coder server exports metrics via an HTTP endpoint, which can be enabled using either the environment variable CODER_PROMETHEUS_ENABLE or the flag --prometheus-enable.

The Prometheus endpoint address is http://localhost:2112/ by default. You can use either the environment variable CODER_PROMETHEUS_ADDRESS or the flag --prometheus-address <network-interface>:<port> to select a different listen address.

If coder server --prometheus-enable is started locally, you can preview the metrics endpoint in your browser or with curl:

$ curl http://localhost:2112/
# HELP coderd_api_active_users_duration_hour The number of users that have been active within the last hour.
# TYPE coderd_api_active_users_duration_hour gauge
coderd_api_active_users_duration_hour 0
...
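
The output above is in the Prometheus text exposition format: # HELP and # TYPE comment lines, followed by samples of the form name{labels} value. For a quick spot check without a Prometheus server, that text can be read with a few lines of Python. The following is a minimal sketch, not a complete parser: the function name parse_metrics and the sample string are illustrative assumptions, and escaped quotes in label values and trailing timestamps are deliberately not handled.

```python
import re

# Matches lines such as:  metric_name{label="value",...} 1.5
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse exposition text into {name: [(labels_dict, value), ...]}.

    Simplified: ignores comment lines and does not handle escaped quotes
    in label values or trailing timestamps.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything this sketch does not understand
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        metrics.setdefault(name, []).append((labels, float(raw_value)))
    return metrics


# Sample text in the same shape as the curl output; the second metric's
# label value and sample value are made up for illustration.
SAMPLE = """\
# HELP coderd_api_active_users_duration_hour The number of users that have been active within the last hour.
# TYPE coderd_api_active_users_duration_hour gauge
coderd_api_active_users_duration_hour 0
# TYPE coderd_provisionerd_jobs_current gauge
coderd_provisionerd_jobs_current{provisioner="terraform"} 2
"""

parsed = parse_metrics(SAMPLE)
print(parsed["coderd_api_active_users_duration_hour"])
print(parsed["coderd_provisionerd_jobs_current"])
```

Against a live endpoint, the text could instead be fetched with urllib.request.urlopen("http://localhost:2112/").read().decode() and passed to the same function.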

Kubernetes deployment

The Prometheus endpoint can be enabled in the Helm chart's values.yml by setting the environment variable CODER_PROMETHEUS_ADDRESS to 0.0.0.0:2112. The environment variable CODER_PROMETHEUS_ENABLE will then be enabled automatically. A Service Endpoint will not be exposed; if you need to expose the Prometheus port on a Service (for example, to use a ServiceMonitor), create a separate headless Service such as the following.

apiVersion: v1
kind: Service
metadata:
  name: coder-prom
  namespace: coder
spec:
  clusterIP: None
  ports:
    - name: prom-http
      port: 2112
      protocol: TCP
      targetPort: 2112
  selector:
    app.kubernetes.io/instance: coder
    app.kubernetes.io/name: coder
  type: ClusterIP

Prometheus configuration

To allow Prometheus to scrape the Coder metrics, you will need to create a scrape_config in your prometheus.yml file, or in the Prometheus Helm chart values. The following is an example scrape_config.

scrape_configs:
  - job_name: "coder"
    scheme: "http"
    static_configs:
      # replace with the IP address of the Coder pod or server
      - targets: ["<ip>:2112"]
        labels:
          apps: "coder"

To use the Kubernetes Prometheus operator to scrape metrics, you will need to create a ServiceMonitor in your Coder deployment namespace. The following is an example ServiceMonitor.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: coder-service-monitor
  namespace: coder
spec:
  endpoints:
    - port: prom-http
      interval: 10s
      scrapeTimeout: 10s
  selector:
    matchLabels:
      app.kubernetes.io/name: coder
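
A ServiceMonitor endpoint refers to a Service port by its name, so the port value in the ServiceMonitor has to match a ports[].name entry on the Service it selects. The following is a minimal sketch of that consistency check, with the two manifests written as plain Python dictionaries to avoid a YAML dependency. The helper name unmatched_ports and the trimmed-down manifests are illustrative assumptions; they are not part of Coder or the Prometheus operator.

```python
def unmatched_ports(service, service_monitor):
    """Return ServiceMonitor endpoint port names missing from the Service."""
    service_ports = {p.get("name") for p in service["spec"].get("ports", [])}
    endpoints = service_monitor["spec"].get("endpoints", [])
    return [e["port"] for e in endpoints
            if "port" in e and e["port"] not in service_ports]


# Trimmed-down versions of the manifests, as plain dictionaries.
service = {
    "kind": "Service",
    "spec": {"ports": [{"name": "prom-http", "port": 2112, "targetPort": 2112}]},
}
service_monitor = {
    "kind": "ServiceMonitor",
    "spec": {"endpoints": [{"port": "prom-http", "interval": "10s"}]},
}

missing = unmatched_ports(service, service_monitor)
if missing:
    print("ServiceMonitor references unknown Service ports:", missing)
else:
    print("All ServiceMonitor ports match a Service port name.")
```

If the names differ, the operator has no port to scrape on the selected Service, so a check like this can catch the mismatch before the manifests are applied.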

Available metrics

The coderd_agentstats_* metrics must first be enabled with the flag --prometheus-collect-agent-stats or the environment variable CODER_PROMETHEUS_COLLECT_AGENT_STATS before they can be retrieved from the deployment. They are always available from the agent.

| Name | Type | Description | Labels |
| --- | --- | --- | --- |
| agent_scripts_executed_total | counter | Total number of scripts executed by the Coder agent. Includes cron scheduled scripts. | agent_name success template_name username workspace_name |
| coderd_agents_apps | gauge | Agent applications with statuses. | agent_name app_name health username workspace_name |
| coderd_agents_connection_latencies_seconds | gauge | Agent connection latencies in seconds. | agent_name derp_region preferred username workspace_name |
| coderd_agents_connections | gauge | Agent connections with statuses. | agent_name lifecycle_state status tailnet_node username workspace_name |
| coderd_agents_up | gauge | The number of active agents per workspace. | template_name username workspace_name |
| coderd_agentstats_connection_count | gauge | The number of established connections by agent | agent_name username workspace_name |
| coderd_agentstats_connection_median_latency_seconds | gauge | The median agent connection latency | agent_name username workspace_name |
| coderd_agentstats_currently_reachable_peers | gauge | The number of peers (e.g. clients) that are currently reachable over the encrypted network. | agent_name connection_type template_name username workspace_name |
| coderd_agentstats_rx_bytes | gauge | Agent Rx bytes | agent_name username workspace_name |
| coderd_agentstats_session_count_jetbrains | gauge | The number of session established by JetBrains | agent_name username workspace_name |
| coderd_agentstats_session_count_reconnecting_pty | gauge | The number of session established by reconnecting PTY | agent_name username workspace_name |
| coderd_agentstats_session_count_ssh | gauge | The number of session established by SSH | agent_name username workspace_name |
| coderd_agentstats_session_count_vscode | gauge | The number of session established by VSCode | agent_name username workspace_name |
| coderd_agentstats_startup_script_seconds | gauge | The number of seconds the startup script took to execute. | agent_name success template_name username workspace_name |
| coderd_agentstats_tx_bytes | gauge | Agent Tx bytes | agent_name username workspace_name |
| coderd_api_active_users_duration_hour | gauge | The number of users that have been active within the last hour. | |
| coderd_api_concurrent_requests | gauge | The number of concurrent API requests. | |
| coderd_api_concurrent_websockets | gauge | The total number of concurrent API websockets. | |
| coderd_api_request_latencies_seconds | histogram | Latency distribution of requests in seconds. | method path |
| coderd_api_requests_processed_total | counter | The total number of processed API requests | code method path |
| coderd_api_websocket_durations_seconds | histogram | Websocket duration distribution of requests in seconds. | path |
| coderd_api_workspace_latest_build | gauge | The latest workspace builds with a status. | status |
| coderd_api_workspace_latest_build_total | gauge | DEPRECATED: use coderd_api_workspace_latest_build instead | status |
| coderd_insights_applications_usage_seconds | gauge | The application usage per template. | application_name slug template_name |
| coderd_insights_parameters | gauge | The parameter usage per template. | parameter_name parameter_type parameter_value template_name |
| coderd_insights_templates_active_users | gauge | The number of active users of the template. | template_name |
| coderd_license_active_users | gauge | The number of active users. | |
| coderd_license_limit_users | gauge | The user seats limit based on the active Coder license. | |
| coderd_license_user_limit_enabled | gauge | Returns 1 if the current license enforces the user limit. | |
| coderd_metrics_collector_agents_execution_seconds | histogram | Histogram for duration of agents metrics collection in seconds. | |
| coderd_oauth2_external_requests_rate_limit | gauge | The total number of allowed requests per interval. | name resource |
| coderd_oauth2_external_requests_rate_limit_next_reset_unix | gauge | Unix timestamp of the next interval | name resource |
| coderd_oauth2_external_requests_rate_limit_remaining | gauge | The remaining number of allowed requests in this interval. | name resource |
| coderd_oauth2_external_requests_rate_limit_reset_in_seconds | gauge | Seconds until the next interval | name resource |
| coderd_oauth2_external_requests_rate_limit_total | gauge | DEPRECATED: use coderd_oauth2_external_requests_rate_limit instead | name resource |
| coderd_oauth2_external_requests_rate_limit_used | gauge | The number of requests made in this interval. | name resource |
| coderd_oauth2_external_requests_total | counter | The total number of api calls made to external oauth2 providers. 'status_code' will be 0 if the request failed with no response. | name source status_code |
| coderd_provisionerd_job_timings_seconds | histogram | The provisioner job time duration in seconds. | provisioner status |
| coderd_provisionerd_jobs_current | gauge | The number of currently running provisioner jobs. | provisioner |
| coderd_workspace_builds_total | counter | The number of workspaces started, updated, or deleted. | action owner_email status template_name template_version workspace_name |
| coderd_workspace_latest_build_status | gauge | The current workspace statuses by template, transition, and owner. | status template_name template_version workspace_owner workspace_transition |
| go_gc_duration_seconds | summary | A summary of the pause duration of garbage collection cycles. | |
| go_goroutines | gauge | Number of goroutines that currently exist. | |
| go_info | gauge | Information about the Go environment. | version |
| go_memstats_alloc_bytes | gauge | Number of bytes allocated and still in use. | |
| go_memstats_alloc_bytes_total | counter | Total number of bytes allocated, even if freed. | |
| go_memstats_buck_hash_sys_bytes | gauge | Number of bytes used by the profiling bucket hash table. | |
| go_memstats_frees_total | counter | Total number of frees. | |
| go_memstats_gc_sys_bytes | gauge | Number of bytes used for garbage collection system metadata. | |
| go_memstats_heap_alloc_bytes | gauge | Number of heap bytes allocated and still in use. | |
| go_memstats_heap_idle_bytes | gauge | Number of heap bytes waiting to be used. | |
| go_memstats_heap_inuse_bytes | gauge | Number of heap bytes that are in use. | |
| go_memstats_heap_objects | gauge | Number of allocated objects. | |
| go_memstats_heap_released_bytes | gauge | Number of heap bytes released to OS. | |
| go_memstats_heap_sys_bytes | gauge | Number of heap bytes obtained from system. | |
| go_memstats_last_gc_time_seconds | gauge | Number of seconds since 1970 of last garbage collection. | |
| go_memstats_lookups_total | counter | Total number of pointer lookups. | |
| go_memstats_mallocs_total | counter | Total number of mallocs. | |
| go_memstats_mcache_inuse_bytes | gauge | Number of bytes in use by mcache structures. | |
| go_memstats_mcache_sys_bytes | gauge | Number of bytes used for mcache structures obtained from system. | |
| go_memstats_mspan_inuse_bytes | gauge | Number of bytes in use by mspan structures. | |
| go_memstats_mspan_sys_bytes | gauge | Number of bytes used for mspan structures obtained from system. | |
| go_memstats_next_gc_bytes | gauge | Number of heap bytes when next garbage collection will take place. | |
| go_memstats_other_sys_bytes | gauge | Number of bytes used for other system allocations. | |
| go_memstats_stack_inuse_bytes | gauge | Number of bytes in use by the stack allocator. | |
| go_memstats_stack_sys_bytes | gauge | Number of bytes obtained from system for stack allocator. | |
| go_memstats_sys_bytes | gauge | Number of bytes obtained from system. | |
| go_threads | gauge | Number of OS threads created. | |
| process_cpu_seconds_total | counter | Total user and system CPU time spent in seconds. | |
| process_max_fds | gauge | Maximum number of open file descriptors. | |
| process_open_fds | gauge | Number of open file descriptors. | |
| process_resident_memory_bytes | gauge | Resident memory size in bytes. | |
| process_start_time_seconds | gauge | Start time of the process since unix epoch in seconds. | |
| process_virtual_memory_bytes | gauge | Virtual memory size in bytes. | |
| process_virtual_memory_max_bytes | gauge | Maximum amount of virtual memory available in bytes. | |
| promhttp_metric_handler_requests_in_flight | gauge | Current number of scrapes being served. | |
| promhttp_metric_handler_requests_total | counter | Total number of scrapes by HTTP status code. | code |
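
Because the coderd_agentstats_* metrics only appear in the deployment's output once collection is enabled, one way to confirm the setting took effect is to look for that prefix in the endpoint output. The following is a minimal sketch that works on a text string. The function name agentstats_metric_names and the sample label and sample values are hypothetical; in practice the text would come from the metrics endpoint.

```python
def agentstats_metric_names(exposition_text, prefix="coderd_agentstats_"):
    """Return the sorted, distinct metric names that start with prefix."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


# Made-up sample; label values and sample values are for illustration only.
SAMPLE = """\
# TYPE coderd_agentstats_tx_bytes gauge
coderd_agentstats_tx_bytes{agent_name="main",username="alice",workspace_name="dev"} 1024
coderd_agentstats_rx_bytes{agent_name="main",username="alice",workspace_name="dev"} 2048
coderd_api_concurrent_requests 1
"""

found = agentstats_metric_names(SAMPLE)
if found:
    print("agent stats are exported:", ", ".join(found))
else:
    print("no coderd_agentstats_* metrics found; is collection enabled?")
```

An empty result does not by itself prove the flag is unset, since the deployment may simply have no connected agents reporting statistics yet.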