Commit fe9a96e

[CF1] ZTIA troubleshooting guide
1 parent ded181f commit fe9a96e

5 files changed

+311
-17
lines changed

src/content/docs/cloudflare-one/applications/non-http/infrastructure-apps.mdx

Lines changed: 1 addition & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -90,21 +90,7 @@ If a user is connected to a target in VNET-A and needs to connect to a target in
9090

9191
Users can use `warp-cli` to display a list of targets they can access. On the WARP device, open a terminal and run the following command:
9292

93-
```sh
94-
warp-cli target list
95-
```
96-
97-
```sh output
98-
╭──────────────────────────────────────┬──────────┬───────┬───────────────────────┬──────────────────────┬────────────╮
99-
│ Target ID │ Protocol │ Port │ Attributes │ IP (Virtual Network) │ Usernames │
100-
├──────────────────────────────────────┼──────────┼───────┼───────────────────────┼──────────────────────┼────────────┤
101-
│ 0193f22a-9df3-78e3-b5bb-7ab631903306 │ SSH │ 22 │ hostname: do-target │ 10.116.0.3 (a1net) │ alice │
102-
├──────────────────────────────────────┼──────────┼───────┼───────────────────────┼──────────────────────┼────────────┤
103-
│ 0193f22a-9df3-78e3-b5bb-7ab631903306 │ SSH │ 23 │ hostname: do-target │ 10.116.0.3 (a1net) │ root │
104-
├──────────────────────────────────────┼──────────┼───────┼───────────────────────┼──────────────────────┼────────────┤
105-
│ 01943cff-6130-7989-8bff-cbc02b59a2b1 │ SSH │ 80 │ hostname: az-target │ 172.16.0.0 (b1net) │ alice, bob │
106-
╰──────────────────────────────────────┴──────────┴───────┴───────────────────────┴──────────────────────┴────────────╯
107-
```
93+
<Render file="tunnel/warp-cli-target-list" product="cloudflare-one" />
10894

10995
You can optionally add flags to filter the output. For example:
11096

src/content/docs/cloudflare-one/connections/connect-networks/use-cases/ssh/ssh-infrastructure-access.mdx

Lines changed: 289 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -188,3 +188,292 @@ The following SSH features are not supported:
188188
### Session duration
189189

190190
SSH sessions have a maximum expected duration of 10 hours. For more information, refer to the [Troubleshooting FAQ](/cloudflare-one/faq/troubleshooting/#long-lived-ssh-sessions-frequently-disconnect).
191+
192+
## Troubleshooting
193+
194+
An SSH connection failure can have several causes. Use the following steps to find and resolve the source of the failure.
195+
196+
1. [Verify that your Access policies](#1-review-access-policies) allow the user to access the target machine.
197+
2. [Check Cloudflare Tunnel](#2-check-target-machine-connection) health.
198+
3. [Confirm user existence](#3-confirm-user-existence-on-the-target-server) on the target server.
199+
4. [Check your `sshd_config` file](#4-debug-sshd_config-file-misconfiguration) for misconfiguration.
200+
201+
### 1. Review Access policies
202+
203+
A user may be blocked by an Access policy from reaching an SSH target because:
204+
205+
- An Access policy exists that denies that user access, or
206+
- No explicit allow Access policy exists and Access is set to deny the user by default.
207+
208+
:::note[Access policies and infrastructure applications]
209+
210+
The Access infrastructure application (created in [step 5](/cloudflare-one/connections/connect-networks/use-cases/ssh/ssh-infrastructure-access/#5-add-an-infrastructure-application)) is the policy container for your SSH server. Cloudflare refers to your SSH server as a [target](/cloudflare-one/connections/connect-networks/use-cases/ssh/ssh-infrastructure-access/#4-add-a-target).
211+
212+
[Access policies](/cloudflare-one/policies/access/policy-management/) are the rules attached to this Access infrastructure application, determining who can connect and what UNIX usernames they can log in as on the server. Cloudflare will not create new users on the target. UNIX users must already be present on the server.
213+
214+
You were guided to create an Access policy for your SSH target in [substep 9 of step 5: Add an infrastructure application](#5-add-an-infrastructure-application).
215+
216+
:::
217+
218+
#### End users
219+
220+
As an end user, run [`warp-cli target list`](/cloudflare-one/applications/non-http/infrastructure-apps/#display-available-targets) to verify that you have access to the target machine.
221+
222+
<Render file="tunnel/warp-cli-target-list" product="cloudflare-one" />
223+
224+
- If the target appears in the list, check whether the username you are connecting with is shown in the output. If the username is not shown, an administrator must add it to the Access policy associated with the target machine. That policy should have been created in [substep 9 of step 5: Add an infrastructure application](/cloudflare-one/connections/connect-networks/use-cases/ssh/ssh-infrastructure-access/#5-add-an-infrastructure-application). If the username is shown, the Access policy is granting access, and you should check that the tunnel is healthy in [step 2](/cloudflare-one/connections/connect-networks/use-cases/ssh/ssh-infrastructure-access/#2-check-target-machine-connection).
225+
226+
- If the target does not appear in the list, an administrator must audit your organization's policies for the target machine in the Zero Trust dashboard for potential misconfigurations that may be blocking access.
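
The end-user check above can also be scripted. The following is a minimal sketch, assuming the table layout shown in the sample `warp-cli target list` output (a `hostname: <name>` attribute and a comma-separated Usernames column as the last column). `target_has_user` is a hypothetical helper name and is not part of `warp-cli`.

```sh
# Hypothetical helper: reads `warp-cli target list` output on stdin and
# succeeds if the row for hostname $1 lists username $2 in its last column.
target_has_user() {
  grep -F "hostname: $1 " |
    # Drop the closing border, then keep only the last column (Usernames).
    sed 's/[[:space:]]*│[[:space:]]*$//; s/.*│//' |
    # Put each username on its own line and trim surrounding whitespace.
    tr ',' '\n' |
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
    # Require an exact, whole-line match on the username.
    grep -Fxq -- "$2"
}

# Usage:
#   warp-cli target list | target_has_user do-target alice \
#     && echo "alice is allowed on do-target"
```

If the helper fails for a target that does appear in the list, the username is likely missing from the Access policy.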
227+
228+
#### Administrators
229+
230+
As an admin, instead of running `warp-cli target list`, you can use the Access logs to review if an Access policy is causing connection issues. Reviewing logs is useful when troubleshooting connection issues on behalf of the end user.
231+
232+
:::note
233+
234+
You will need Cloudflare dashboard access and log view [permissions](/cloudflare-one/roles-permissions/) to proceed with this step.
235+
236+
:::
237+
238+
1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **Logs** > **Access**.
239+
240+
2. Select the application you are testing, or filter by _Infrastructure_ as the App Type.
241+
242+
3. Review the **Decision**. If the **Decision** is `Access denied`, select the application and copy the name under App.
243+
244+
If the decision is `Access granted`, Access policies are not interfering with your connection attempts and your connection issue is due to the Cloudflare Tunnel, SSH server, or the `sshd_config` file.
245+
246+
4. Go to **Access** > **Applications**.
247+
248+
5. Input the app name in the search bar and select the application.
249+
250+
6. Select **Configure**.
251+
252+
7. Go to [**Policies**](/cloudflare-one/policies/access/policy-management/#test-your-policies) to review what criteria may be blocking the user.
253+
254+
Editing a [policy](/cloudflare-one/policies/access/) that is explicitly blocking the user, or adding a new policy that explicitly allows the user, should resolve the connection issue. After saving your policy changes, attempt to connect to the target machine as the end user.
255+
256+
If you are still having connection issues after auditing your Access policies, review Tunnel health in the following step.
257+
258+
### 2. Check target machine connection
259+
260+
If the end user cannot connect to the target SSH machine, the tunnel you set up in [step 1: Connect the server to Cloudflare](#1-connect-the-server-to-cloudflare) may be down or inactive.
261+
262+
To check the status of your Tunnel:
263+
264+
1. In [Zero Trust](https://one.dash.cloudflare.com/), go to **Networks** > **Routes**.
265+
2. Search for the target's IP address to find the Tunnel associated with it.
266+
267+
This IP will be visible in the `warp-cli target list` output in [the previous step](#1-review-access-policies). If you are an admin, you can also go to **Networks** > **Targets** and find the IP next to your Hostname.
268+
269+
3. Copy the Tunnel name.
270+
4. Go to **Networks** > **Tunnels** and search by your Tunnel name.
271+
5. Confirm that the [Tunnel status](/cloudflare-one/connections/connect-networks/monitor-tunnels/notifications/#available-notifications) is `Healthy`, and not `Down`, `Degraded`, or `Inactive`.
272+
273+
| Status | Meaning | Recommended Action |
274+
|-----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
275+
| **Healthy** | The tunnel is active and serving traffic through four connections to the Cloudflare global network. | No action is required. Your Tunnel is running correctly. |
276+
| **Inactive** | The Tunnel has been created (via the API or dashboard) but the `cloudflared` connector has never been run to establish a connection. | Run the tunnel as a service (recommended) or use the `cloudflared tunnel run` command on your origin server to connect the tunnel to Cloudflare. Refer to [substep 6 of step 1 in the Create a Tunnel dashboard guide](/cloudflare-one/connections/connect-networks/get-started/create-remote-tunnel/#1-create-a-tunnel) or step 4 in the [Create a Tunnel API guide](/cloudflare-one/connections/connect-networks/get-started/create-remote-tunnel/#1-create-a-tunnel). |
277+
| **Down** | The Tunnel was previously connected but is currently disconnected because the `cloudflared` process has stopped. | 1. Ensure the `cloudflared` service or process is actively running on your server. <br /> 2. Check for server-side issues, such as the machine being powered off, an application crash, or recent network changes. |
278+
| **Degraded** | The `cloudflared` connector is running and the tunnel is serving traffic, but at least one individual connection has failed. Further degradation in [tunnel availability](/cloudflare-one/connections/connect-networks/configure-tunnels/tunnel-availability/) could risk the tunnel going down and failing to serve traffic. | 1. Review your `cloudflared` logs for connection failures or error messages. <br /> 2. Investigate local network and firewall rules to ensure they are not blocking connections to the [Cloudflare Tunnel IPs and ports](/cloudflare-one/connections/connect-networks/configure-tunnels/tunnel-with-firewall/). <br /> |
279+
280+
For detailed steps on troubleshooting, refer to the [Troubleshooting Tunnel documentation](/cloudflare-one/connections/connect-networks/troubleshoot-tunnels/). Review the [Tunnel with Firewall documentation](/cloudflare-one/connections/connect-networks/configure-tunnels/tunnel-with-firewall/#test-connectivity) to ensure your network is correctly configured to allow `cloudflared` connections.
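
For scripted triage, the status table above can be expressed as a small lookup. This is only a sketch: `tunnel_next_step` is a hypothetical helper, and the commented `systemctl` and `journalctl` commands assume a systemd host where the connector runs as a service named `cloudflared`.

```sh
# Hypothetical helper: print the recommended action for a tunnel status,
# following the status table above.
tunnel_next_step() {
  case "$1" in
    Healthy)  echo "No action required." ;;
    Inactive) echo "Start the connector: run cloudflared as a service or use 'cloudflared tunnel run'." ;;
    Down)     echo "Check that the cloudflared service or process is running on the server." ;;
    Degraded) echo "Review cloudflared logs and firewall rules for failed connections." ;;
    *)        echo "Unknown status: $1" >&2; return 1 ;;
  esac
}

# On the server, assuming a systemd service named "cloudflared":
#   systemctl is-active cloudflared
#   journalctl -u cloudflared --since "1 hour ago" --no-pager
```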
281+
282+
After you have verified that there are no issues with your Tunnel's health, confirm the user's existence on the target SSH server in the following step.
283+
284+
### 3. Confirm user existence on the target server
285+
286+
Run `id <USERNAME>` on the target SSH server to verify that the end user's UNIX account exists. If the command reports that the user does not exist, you must add the user to the server.
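
As a minimal sketch, the `id` check can be wrapped so that it works in scripts. `user_exists` is a hypothetical helper name, and `alice` stands in for the username listed in your Access policy.

```sh
# Succeeds if an account named $1 is known to the system.
# `id` exits non-zero when the user does not exist.
user_exists() {
  id "$1" >/dev/null 2>&1
}

# Usage (run on the target SSH server):
#   user_exists alice || echo "alice is missing on this server"
```

On many Linux distributions, a missing account can be created with `sudo useradd -m <USERNAME>`. Treat that command as an assumption about your environment, since user management differs between systems.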
287+
288+
If the user exists on the target machine, debug your `sshd_config` file in the following step.
289+
290+
### 4. Debug `sshd_config` file misconfiguration
291+
292+
A misconfigured `sshd_config` file can prevent a user from connecting to your SSH endpoint. Follow the steps below to audit your `sshd_config` file for misconfigurations.
293+
294+
#### Review your `sshd` logs
295+
296+
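`sshd` logs can confirm whether or not the user's connection is reaching the server. Where `sshd` logs are written depends on the `SyslogFacility` directive in your `sshd_config` and on your system's logging setup. On Ubuntu and Debian, run `journalctl -u ssh` or check `/var/log/auth.log`. On Red Hat-based systems, run `journalctl -u sshd` or check `/var/log/secure`.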
297+
298+
Using your `sshd` logs, validate that SSH connection attempts are arriving at the SSH target machine.
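
To narrow the logs down to one user's attempts, a filter like the following can help. It is a sketch that assumes syslog-style lines that contain the string `sshd`. `sshd_attempts_for` is a hypothetical helper name, and the `ssh` unit name follows the Ubuntu example (other distributions may name the unit `sshd`).

```sh
# Print sshd log lines from stdin that mention username $1.
sshd_attempts_for() {
  grep -F 'sshd' | grep -F -- "$1"
}

# Usage:
#   journalctl -u ssh --since "15 minutes ago" --no-pager | sshd_attempts_for alice
```

If the filter prints nothing while the user is attempting to connect, the connection is probably not reaching the server, which points back to the Access policy or Tunnel checks.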
299+
300+
#### Review your `sshd_config` file for misconfigurations
301+
302+
To rule out any issues in your `sshd_config` file, compare your existing `sshd_config` file with the example below to verify if any directives are causing authentication issues. The following example `sshd_config` file will result in successful authentication:
303+
304+
<details>
305+
<summary>Example `sshd_config` file</summary>
306+
307+
```
308+
# This is the sshd server system-wide configuration file. See
309+
# sshd_config(5) for more information.
310+
311+
# The strategy used for options in the default sshd_config shipped with
312+
# OpenSSH is to specify options with their default value where
313+
# possible, but leave them commented. Uncommented options override the
314+
# default value.
315+
316+
PubkeyAuthentication yes
317+
TrustedUserCAKeys /etc/ssh/ca.pub
318+
319+
Include /etc/ssh/sshd_config.d/*.conf
320+
321+
# When systemd socket activation is used (the default), the socket
322+
# configuration must be re-generated after changing Port, AddressFamily, or
323+
# ListenAddress.
324+
#
325+
# For changes to take effect, run:
326+
#
327+
# systemctl daemon-reload
328+
# systemctl restart ssh.socket
329+
#
330+
#Port 22
331+
#AddressFamily any
332+
#ListenAddress 0.0.0.0
333+
#ListenAddress ::
334+
335+
#HostKey /etc/ssh/ssh_host_rsa_key
336+
#HostKey /etc/ssh/ssh_host_ecdsa_key
337+
#HostKey /etc/ssh/ssh_host_ed25519_key
338+
339+
# Ciphers and keying
340+
#RekeyLimit default none
341+
342+
# Logging
343+
#SyslogFacility AUTH
344+
LogLevel DEBUG3
345+
346+
# Authentication:
347+
348+
#LoginGraceTime 2m
349+
PermitRootLogin yes
350+
#StrictModes yes
351+
#MaxAuthTries 6
352+
#MaxSessions 10
353+
354+
355+
356+
# Expect .ssh/authorized_keys2 to be disregarded by default in future.
357+
#AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2
358+
359+
#AuthorizedPrincipalsFile none
360+
361+
#AuthorizedKeysCommand none
362+
#AuthorizedKeysCommandUser nobody
363+
364+
# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
365+
#HostbasedAuthentication no
366+
# Change to yes if you don't trust ~/.ssh/known_hosts for
367+
# HostbasedAuthentication
368+
#IgnoreUserKnownHosts no
369+
# Don't read the user's ~/.rhosts and ~/.shosts files
370+
#IgnoreRhosts yes
371+
372+
# To disable tunneled clear text passwords, change to no here!
373+
#PasswordAuthentication yes
374+
#PermitEmptyPasswords no
375+
376+
# Change to yes to enable challenge-response passwords (beware issues with
377+
# some PAM modules and threads)
378+
KbdInteractiveAuthentication no
379+
380+
# Kerberos options
381+
#KerberosAuthentication no
382+
#KerberosOrLocalPasswd yes
383+
#KerberosTicketCleanup yes
384+
#KerberosGetAFSToken no
385+
386+
# GSSAPI options
387+
#GSSAPIAuthentication no
388+
#GSSAPICleanupCredentials yes
389+
#GSSAPIStrictAcceptorCheck yes
390+
#GSSAPIKeyExchange no
391+
392+
# Set this to 'yes' to enable PAM authentication, account processing,
393+
# and session processing. If this is enabled, PAM authentication will
394+
# be allowed through the KbdInteractiveAuthentication and
395+
# PasswordAuthentication. Depending on your PAM configuration,
396+
# PAM authentication via KbdInteractiveAuthentication may bypass
397+
# the setting of "PermitRootLogin prohibit-password".
398+
# If you just want the PAM account and session checks to run without
399+
# PAM authentication, then enable this but set PasswordAuthentication
400+
# and KbdInteractiveAuthentication to 'no'.
401+
UsePAM yes
402+
403+
#AllowAgentForwarding yes
404+
#AllowTcpForwarding yes
405+
#GatewayPorts no
406+
X11Forwarding yes
407+
#X11DisplayOffset 10
408+
#X11UseLocalhost yes
409+
#PermitTTY yes
410+
PrintMotd no
411+
#PrintLastLog yes
412+
#TCPKeepAlive yes
413+
#PermitUserEnvironment no
414+
#Compression delayed
415+
#ClientAliveInterval 0
416+
#ClientAliveCountMax 3
417+
#UseDNS no
418+
#PidFile /run/sshd.pid
419+
#MaxStartups 10:30:100
420+
#PermitTunnel no
421+
#ChrootDirectory none
422+
#VersionAddendum none
423+
424+
# no default banner path
425+
#Banner none
426+
427+
# Allow client to pass locale environment variables
428+
AcceptEnv LANG LC_*
429+
430+
# override default of no subsystems
431+
Subsystem sftp /usr/lib/openssh/sftp-server
432+
433+
# Example of overriding settings on a per-user basis
434+
#Match User anoncvs
435+
# X11Forwarding no
436+
# AllowTcpForwarding no
437+
# PermitTTY no
438+
# ForceCommand cvs server
439+
```
440+
441+
</details>
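
Comparing two long files by eye is error-prone, so it can help to print only the directives that are actually in effect. The following is a minimal sketch. `active_directives` is a hypothetical helper name that only strips comments and blank lines, and it does not follow `Include` files.

```sh
# Print only the active (uncommented, non-blank) lines of config file $1.
active_directives() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Usage:
#   active_directives /etc/ssh/sshd_config
#
# To see the effective configuration, including defaults and Include files,
# sshd can print it in extended test mode (usually requires root):
#   sudo sshd -T
```

Run against the example file, this leaves lines such as `PubkeyAuthentication yes`, `TrustedUserCAKeys /etc/ssh/ca.pub`, and `LogLevel DEBUG3`, which are the directives to compare with your own configuration first.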
442+
443+
#### Replace and test with example configuration
444+
445+
The next steps will walk you through a troubleshooting regimen. You will temporarily replace your existing `sshd_config` file with the provided example to rule out configuration issues. Before proceeding, carefully [review and compare both files](#review-your-sshd_config-file-for-misconfigurations) to identify any conflicting directives.
446+
447+
:::caution[You may lose access to your SSH server]
448+
449+
These troubleshooting steps could result in you being locked out of your SSH server because your existing auth may rely on existing configuration that is not in the [example file](#review-your-sshd_config-file-for-misconfigurations). Proceed with utmost caution.
450+
451+
:::
452+
453+
1. Back up the existing `sshd_config` file.
454+
455+
```sh
456+
mv /etc/ssh/sshd_config /etc/ssh/sshd_config.bak
457+
```
458+
459+
2. Create a new `sshd_config` file.
460+
461+
```sh
462+
vi /etc/ssh/sshd_config
463+
```
464+
465+
3. Enter insert mode by pressing the `i` key on your keyboard.
466+
467+
4. Paste in the [example file](#review-your-sshd_config-file-for-misconfigurations).
468+
469+
5. Exit insert mode by pressing the escape (`esc`) key.
470+
6. Enter `:x` to save and exit.
471+
7. [Reload](#reload-your-ssh-server) your SSH server.
472+
473+
:::caution[Do not restart]
474+
Restarting your `sshd` service will result in the termination of your current SSH connection. Make sure to reload instead of restarting to avoid terminating all currently open SSH sessions.
475+
:::
476+
477+
<Render file="ssh/restart-server" product="cloudflare-one" />
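
To reduce the lockout risk, the replacement steps can be combined with a syntax check and an automatic rollback. The following is a sketch under stated assumptions: `swap_sshd_config` is a hypothetical helper, it copies the backup with `cp -p` instead of `mv` so the live file is never missing, and it relies on `sshd -t -f <file>` to check a configuration file for errors before you reload.

```sh
# Hypothetical helper: back up live config $1, install candidate $2, and roll
# back if validation fails. $3 is the validation command (a parameter added
# for this sketch); it defaults to "sshd -t -f".
swap_sshd_config() {
  live=$1
  candidate=$2
  validate=${3:-"sshd -t -f"}

  cp -p "$live" "$live.bak" || return 1
  cp "$candidate" "$live" || return 1

  # $validate is intentionally unquoted so "sshd -t -f" splits into words.
  if ! $validate "$live"; then
    # Validation failed: restore the original file and report failure.
    cp -p "$live.bak" "$live"
    return 1
  fi
}

# Usage (as root; /tmp/example_sshd_config is a placeholder path):
#   swap_sshd_config /etc/ssh/sshd_config /tmp/example_sshd_config \
#     && echo "Config installed; reload sshd to apply it."
```

A passing syntax check does not guarantee that you can still authenticate, so keep your current SSH session open until a new connection succeeds.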
478+
479+
By completing all four troubleshooting steps, you should have resolved any connection issues caused by misconfiguration of the SSH server. If issues persist, [recheck `sshd` logs](/cloudflare-one/connections/connect-networks/use-cases/ssh/ssh-infrastructure-access/#review-your-sshd-logs). The example [`sshd_config` shared above](/cloudflare-one/connections/connect-networks/use-cases/ssh/ssh-infrastructure-access/#review-your-sshd_config-file-for-misconfigurations) enables debug logging and may expose more specific issues.
