Troubleshooting templates
Occasionally, you may run into scenarios where a workspace is created, but the agent is either not connected or the startup script has failed or timed out.
Agent connection issues
If the agent is not connected, it means the agent or init script has failed on the resource.
$ coder ssh myworkspace
⢄⡱ Waiting for connection from [agent]...
While troubleshooting steps vary by resource, here are some general best practices:
- Ensure the resource has curl installed (alternatively, wget or busybox)
- Ensure the resource can curl your Coder access URL
- Manually connect to the resource and check the agent logs (e.g., kubectl exec, docker exec or AWS console)
  - The Coder agent logs are typically stored in /tmp/coder-agent.log
  - The Coder agent startup script logs are typically stored in /tmp/coder-startup-script.log
  - The Coder agent shutdown script logs are typically stored in /tmp/coder-shutdown-script.log
- This can also happen if the websockets are not being forwarded correctly when running Coder behind a reverse proxy. Read our reverse-proxy docs.
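The checks in this list can be sketched as a few shell helpers to run on the resource itself. This is a minimal sketch, not an official Coder tool: the function names are made up for illustration, the log file names are the typical ones mentioned in this section, and the access URL you pass in is your own.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names), meant to be run on the resource.

# Print the first available download tool among curl, wget, and busybox.
find_download_tool() {
  for tool in curl wget busybox; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  echo "none of curl, wget, or busybox found" >&2
  return 1
}

# Check that the Coder access URL (passed as $1) answers an HTTP request.
check_access_url() {
  curl -fsS -o /dev/null --max-time 10 "$1"
}

# Print the last lines of each agent log under $1 (defaults to /tmp).
show_agent_logs() {
  log_dir="${1:-/tmp}"
  for name in coder-agent.log coder-startup-script.log coder-shutdown-script.log; do
    if [ -f "$log_dir/$name" ]; then
      echo "== $log_dir/$name =="
      tail -n 20 "$log_dir/$name"
    else
      echo "== $log_dir/$name (not found) =="
    fi
  done
}
```

For example, after connecting with kubectl exec or docker exec, you might run find_download_tool, then check_access_url with your deployment's access URL, then show_agent_logs.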
Startup script issues
Depending on the contents of the startup script, and whether or not the startup script behavior is set to blocking or non-blocking, you may notice issues related to the startup script. In this section we will cover common scenarios and how to resolve them.
Unable to access workspace, startup script is still running
If you're trying to access your workspace and are unable to because the startup script is still running, it means the startup script behavior option is set to blocking or you have enabled the --wait=yes option (e.g., for coder ssh or coder config-ssh). In that case, you can still access the workspace through the web terminal, or via SSH using the --wait=no option. If the startup script is running longer than it should, or never completing, you can try to debug the startup script to resolve the issue. Alternatively, you can try to force the startup script to exit by terminating processes started by it or terminating the startup script itself (on Linux, ps and kill are useful tools).
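As a rough sketch of the ps and kill approach, the helpers below could be run from inside the workspace (for example, in the web terminal). The function names are hypothetical, the ps options assume a Linux procps-style ps, and you still need to identify the right process ID yourself before stopping anything.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names) for inspecting and stopping a
# stuck process on Linux.

# List PID, parent PID, and command line of the direct children of PID $1.
list_children() {
  ps -eo pid=,ppid=,args= | awk -v parent="$1" '$2 == parent'
}

# Ask PID $1 to terminate, wait up to 5 seconds, then force-kill it.
stop_process() {
  pid="$1"
  kill -TERM "$pid" 2>/dev/null || return 0  # process is already gone
  i=0
  while kill -0 "$pid" 2>/dev/null && [ "$i" -lt 5 ]; do
    sleep 1
    i=$((i + 1))
  done
  if kill -0 "$pid" 2>/dev/null; then
    kill -KILL "$pid" 2>/dev/null
  fi
  return 0
}
```

Sending SIGTERM first gives the process a chance to clean up. SIGKILL is only used if it is still alive after the grace period.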
For tips on how to write a startup script that doesn't run forever, see the startup_script section. For more ways to override the startup script behavior, see the startup_script_behavior section.
Template authors can also set the startup script behavior option to non-blocking, which will allow users to access the workspace while the startup script is still running. Note that the workspace must be updated after changing this option.
Your workspace may be incomplete
If you see a warning that your workspace may be incomplete, programs, files, or settings may be missing from your workspace. This can happen if the startup script is still running or has exited with a non-zero status (see startup script error). No action is necessary, but you may want to check the startup script logs for issues, or start a new shell session after the script has completed.
Session was started before the startup script finished
The web terminal may show this message if the session was started before the startup script finished, and the startup script has since finished. This message can safely be dismissed. However, be aware that your preferred shell or dotfiles may not yet be activated for this shell session. You can either start a new session or source your dotfiles manually. Note that starting a new session means that commands running in the terminal will be terminated and you may lose unsaved work.
Examples for activating your preferred shell or sourcing your dotfiles:
exec zsh -l
source ~/.bashrc
Startup script exited with an error
When the startup script exits with an error, it means the last command run by the script failed. When set -e is used, this means that any failing command will immediately exit the script and the remaining commands will not be executed. This also means that your workspace may be incomplete. If you see this error, you can check the startup script logs to figure out what the issue is.
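To make the effect of set -e concrete, here is a small self-contained demonstration. It is not a real startup script: false simply stands in for any failing command, and the variable and function names are made up for illustration.

```shell
#!/bin/sh
# A tiny stand-in for a startup script that uses set -e.
demo_script='set -e
echo "step 1"
false            # fails with status 1, so the script exits here
echo "step 2"    # never reached
'

# Run the demo in a child shell so its exit does not end the current shell.
run_demo() {
  sh -c "$demo_script"
}
```

Calling run_demo prints only "step 1" and returns exit status 1. Everything after the failing command is skipped, which is why the workspace may be left incomplete.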
Common causes for startup script errors:
- A missing command or file
- A command that fails due to missing permissions
- Network issues (e.g., unable to reach a server)
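One way to surface the first two causes early is to check prerequisites at the top of the startup script and print a clear message when one is missing. The helper names below are hypothetical and the sketch is only one possible approach.

```shell
#!/bin/sh
# Illustrative preflight helpers (hypothetical names) for a startup script.

# Fail with a readable message if the command named by $1 is not installed.
require_command() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "Missing command: $1" >&2
    return 1
  fi
}

# Fail with a readable message if the path $1 is not writable by this user.
require_writable() {
  if [ ! -w "$1" ]; then
    echo "No write permission: $1" >&2
    return 1
  fi
}
```

A startup script could then begin with lines such as require_command git || exit 1, so that the log names the missing prerequisite instead of showing a later, less obvious failure.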
Debugging the startup script
The simplest way to debug the startup script is to open the workspace in the Coder dashboard and click "Show startup log" (if not already visible). This will show all the output from the script. Another option is to view the log file inside the workspace (usually /tmp/coder-startup-script.log). If the logs don't indicate what's going on or going wrong, you can increase verbosity by adding set -x to the top of the startup script (note that this will show all commands run and may output sensitive information). Alternatively, you can add echo statements to show what's going on.
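If tracing the whole script with set -x is too noisy or risks exposing secrets, tracing can be limited to the part under investigation by switching it off again with set +x. The snippet below is a small self-contained demonstration with a made-up function name, not a real startup script.

```shell
#!/bin/sh
# Run a tiny script in a child shell that traces only one command.
trace_demo() {
  sh -c 'set -x
echo "installing tools"
set +x
echo "tracing is off again"'
}
```

The traced commands are written to standard error with a leading +, while the normal output of both echo commands goes to standard output.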
Here's a short example of an informative startup script:
echo "Running startup script..."
echo "Run: long-running-command"
/path/to/long-running-command
status=$?
echo "Done: long-running-command, exit status: ${status}"
if [ "$status" -ne 0 ]; then
  echo "Startup script failed, exiting..."
  exit "$status"
fi
Note: We don't use set -x here because we're manually echoing the commands. This protects against sensitive information being shown in the log.
This script tells us what command is being run and what the exit status is. If the exit status is non-zero, it means the command failed and we exit the script. Since we are manually checking the exit status here, we don't need set -e at the top of the script to exit on error.
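When a startup script runs several commands, the same echo-and-check pattern can be wrapped in a small helper so each step is logged the same way. This is only a sketch: run_step is a hypothetical name, and the label you pass is what appears in the log, so the real command line and its arguments are never printed.

```shell
#!/bin/sh
# Run a command under a label, logging the start and the exit status.
# Usage: run_step LABEL COMMAND [ARGS...]
run_step() {
  name="$1"
  shift
  echo "Run: ${name}"
  status=0
  "$@" || status=$?
  echo "Done: ${name}, exit status: ${status}"
  return "$status"
}
```

A script could then call, for example, run_step "long-running-command" /path/to/long-running-command || exit 1 to stop at the first failing step.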
Note: If you aren't seeing any logs, check that the dir directive points to a valid directory in the file system.