Commit 37335e3

RFC proposal for ssh-over-ws
- link PR in meta section
- Incorporate additional feedback on websocket proxies for OpenSSH
- Changed v2 to v3 compatibility only for CAPI
- Change according to feedback (ssh ingress)
- Add section to problem statement and solution proposal wrt the open SSH port
1 parent 437ec9b commit 37335e3

2 files changed

+333
-0
lines changed

toc/rfc/rfc-draft-ssh-over-ws.md

@@ -0,0 +1,333 @@
1+
# Meta
2+
[meta]: #meta
3+
- Name: CF SSH over WebSockets
4+
- Start Date: 2024-04-08
5+
- Author(s): @domdom82
6+
- Status: Draft
7+
- RFC Pull Request: https://github.com/cloudfoundry/community/pull/807
8+
9+
10+
## Summary
11+
12+
This RFC aims at providing an additional way of using CF SSH in an effort to reduce the potential attack surface against SSH.
13+
14+
By tunneling SSH through a WebSocket connection, the SSH client may reuse the same port as regular HTTP traffic, removing the need for port 2222 as well as a separate ingress for platform operators.
15+
16+
## Problem
17+
18+
Cloud Foundry is a shared multi-tenant platform.
19+
20+
If a feature requiring an open port such as CF SSH is used by at least one customer, the port has to be opened on the whole platform. It cannot be enabled for one tenant but disabled for another.
21+
22+
The way CF SSH is configurable today, a specific port must be open and it must be open for the whole platform. The port is easily discoverable by both customers and potential attackers.
23+
24+
In production deployments of Cloud Foundry, it is commonplace for customers to run security scans on their workloads to ensure compliance and reduce attack surfaces.
25+
26+
These scans usually flag that, besides port 443 for HTTPS traffic, port 2222 is also open for SSH on the internet-facing load balancer. Many customers demand closure of the port as they view it as a potential attack surface for brute-force attacks against the SSH protocol.
28+
29+
The issue is exacerbated by features like "bring-your-own-domain" where customers provide a CNAME entry alongside a matching x509 certificate that points their own (sub-)domain to an app hosted on the shared Cloud Foundry environment.
30+
31+
If the Cloud Foundry operator chooses to host the HTTP ingress and the SSH ingress on the same load balancer, this can damage the customer's reputation because, to an outsider, port 2222 appears to be open on the customer's domain. For example, an open `www.big-corp.com:2222` looks worse than an open `my-app.cf-app.com:2222`, even though both are technically the same.
32+
33+
If the Cloud Foundry operator runs the HTTP ingress and SSH ingress on different load balancers, this may be less of an issue as the customer's domain would be `www.big-corp.com:443` only and be separate from a potential `ssh.cf-app.com:2222`. However, such an ingress would still look very inviting to attackers and would incur additional costs and maintenance.
34+
35+
Security-wise it is easier for an attacker to probe the environment by simply doing a port scan and then testing direct attacks on the SSH proxy:
36+
37+
```
nmap api.cf-app.com

Starting Nmap 7.94 ( https://nmap.org ) at 2024-04-08 10:29 CEST
Nmap scan report for api.cf-app.com (1.2.3.4)
Host is up (0.031s latency).
Not shown: 997 filtered tcp ports (no-response)
PORT     STATE SERVICE
80/tcp   open  http
443/tcp  open  https
2222/tcp open  EtherNetIP-1
```
49+
50+
The attacker can then connect to port 2222:
51+
52+
```
nc api.cf.cf-app.com 2222

SSH-2.0-diego-ssh-proxy
```
57+
58+
The SSH proxy greets with its SSH-2.0 version banner, allowing the attacker to try all kinds of SSH-2.0-related probes.
59+
60+
Another facet of the problem is that many customers require explicit allow-listing of external web sites from their on-prem environments.
Port 443 is usually not a problem as many firewalls open it by default, but port 2222 is likely to be frowned upon and to cause concern.
62+
63+
## Proposal
64+
65+
The proposal aims to make CF SSH less discoverable and to reduce the complexity and cost of potentially maintaining two separate load balancers for HTTP and SSH traffic.
66+
67+
### Note on Security
68+
The proposal does not aim to solve security issues of (CF) SSH per se. Tunneling a protocol through another does not make it inherently more secure. Any weakness of SSH within the CF SSH proxy will still be present even if the RFC is accepted. The main goal is to reduce the number of open ports required. There is, however, one minor security improvement: the SSH traffic will additionally be encrypted by TLS within the WebSocket tunnel.
69+
70+
71+
### Architecture Changes
72+
73+
![architecture-changes](rfc-draft-ssh-over-ws/cf-ssh-over-websocket.drawio.png)
74+
75+
There are three scenarios:
76+
77+
#### Scenario A: Legacy SSH
78+
This is the default scenario as of today. All changed components MUST ensure this scenario still works. It SHOULD be tested with legacy CLI versions such as v6 which are no longer updated. This scenario includes regular SSH clients like OpenSSH.
79+
80+
#### Scenario B: Dual SSH / WS
81+
This scenario MAY be enabled by operators. It allows legacy SSH as well as pure WebSocket-based SSH in one environment. The CF CLI MUST understand the legacy flow from scenario A. The CF CLI SHOULD understand the new flow using a WebSocket tunnel. Other components MUST support both scenarios.
82+
83+
#### Scenario C: WS only
84+
This scenario MAY be enabled by operators. It turns off the legacy scenario A and allows operators to close the public port 2222 on the load balancer. The CF CLI MUST understand the new flow using a WebSocket tunnel. Other components MUST support the new flow and MAY support both scenarios.
85+
86+
87+
### Changes per Component (Scenario B and C)
88+
89+
#### CF CLI
90+
- The CLI MUST query the "/" endpoint of Cloud Controller.
91+
- The CLI MUST use at least v3 of CC API for this feature.
92+
- The "v2/info" endpoint is deprecated and not used by CLI v7/8. It MUST NOT return "app_ssh_ws_endpoint" info.
93+
- The "/" endpoint SHOULD return the legacy SSH info, if the legacy SSH feature is enabled:
94+
95+
```
{
  "links": {
    (...)
    "app_ssh": {
      "href": "ssh.cf.cf-app.com:2222",
      "meta": {
        "host_key_fingerprint": "TD1dRQINLi2KxilVLzAI8tXB2h8MP79oyVJnUwshjdc",
        "oauth_client": "ssh-proxy"
      }
    }
  }
}
```
109+
- The "/" endpoint MUST NOT return the legacy SSH info, if the legacy SSH feature is disabled.
110+
111+
- The endpoint MUST return an object for ssh-over-websocket if the feature is enabled:
112+
113+
```
{
  "links": {
    (...)
    "app_ssh_ws": {
      "href": "wss://ssh.cf.cf-app.com",
      "meta": {
        "host_key_fingerprint": "TD1dRQINLi2KxilVLzAI8tXB2h8MP79oyVJnUwshjdc",
        "oauth_client": "ssh-proxy"
      }
    }
  }
}
```
127+
128+
- If both legacy SSH and SSH-over-WebSocket features are enabled, the "/" endpoint MUST return info objects for both features:
129+
130+
```
{
  "links": {
    (...)
    "app_ssh": {
      "href": "ssh.cf.cf-app.com:2222",
      "meta": {
        "host_key_fingerprint": "TD1dRQINLi2KxilVLzAI8tXB2h8MP79oyVJnUwshjdc",
        "oauth_client": "ssh-proxy"
      }
    },

    "app_ssh_ws": {
      "href": "wss://ssh.cf.cf-app.com",
      "meta": {
        "host_key_fingerprint": "TD1dRQINLi2KxilVLzAI8tXB2h8MP79oyVJnUwshjdc",
        "oauth_client": "ssh-proxy"
      }
    }
  }
}
```
152+
153+
- CF CLI MUST prefer the `app_ssh_ws` link over the `app_ssh` link if present.
154+
- CF CLI MAY fall back to `app_ssh` link if `app_ssh_ws` did not connect successfully and both are present.
155+
- When using `app_ssh_ws`, the CF CLI MUST wrap the SSH connection in a WebSocket connection:
156+
157+
```
func (c *secureShell) Connect(opts *options.SSHOptions) error {
	err := c.validateTarget(opts)
	if err != nil {
		return err
	}

	clientConfig := &ssh.ClientConfig{
		User: fmt.Sprintf("cf:%s/%d", c.app.GUID, opts.Index),
		Auth: []ssh.AuthMethod{
			ssh.Password(c.token),
		},
		HostKeyCallback: fingerprintCallback(opts, c.sshEndpointFingerprint),
	}

	// Prefer SSH wrapped in a WebSocket if the platform advertises it
	var secureClient SecureClient
	if c.sshWsEndpoint != "" {
		secureClient, err = tunnelSSHThruWebSocket(c.sshWsEndpoint, clientConfig)
	}

	// Use SSH over TCP if no WebSocket endpoint is advertised, or fall back
	// to it if the WebSocket attempt failed and a legacy endpoint exists
	if c.sshWsEndpoint == "" || (err != nil && c.sshEndpoint != "") {
		secureClient, err = c.secureDialer.Dial("tcp", c.sshEndpoint, clientConfig)
	}
	if err != nil {
		return err
	}

	c.secureClient = secureClient
	c.opts = opts
	return nil
}
```
191+
Code adapted from [cloudfoundry/cli ssh.go](https://github.com/cloudfoundry/cli/blob/main/cf/ssh/ssh.go#L119)
192+
193+
The `sshWsEndpoint` would have to be fetched from the `app_ssh_ws.href` field of the "/" endpoint of CAPI.
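How the CLI might read both links from the "/" response can be sketched as follows. This is only an illustration under assumptions: the type names and the `sshEndpoints` helper are hypothetical and are not part of the existing CLI code. Only the JSON field names are taken from the examples in this RFC.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// link mirrors one entry of the "links" object of the "/" endpoint.
type link struct {
	Href string `json:"href"`
	Meta struct {
		HostKeyFingerprint string `json:"host_key_fingerprint"`
		OAuthClient        string `json:"oauth_client"`
	} `json:"meta"`
}

// rootInfo holds only the two links this RFC is concerned with.
type rootInfo struct {
	Links struct {
		AppSSH   *link `json:"app_ssh"`
		AppSSHWS *link `json:"app_ssh_ws"`
	} `json:"links"`
}

// sshEndpoints (hypothetical helper) returns the WebSocket endpoint and the
// legacy endpoint. Either may be empty, but not both.
func sshEndpoints(body []byte) (string, string, error) {
	var info rootInfo
	if err := json.Unmarshal(body, &info); err != nil {
		return "", "", err
	}
	var ws, legacy string
	if info.Links.AppSSHWS != nil {
		ws = info.Links.AppSSHWS.Href
	}
	if info.Links.AppSSH != nil {
		legacy = info.Links.AppSSH.Href
	}
	if ws == "" && legacy == "" {
		return "", "", errors.New("neither app_ssh nor app_ssh_ws is advertised")
	}
	return ws, legacy, nil
}

func main() {
	body := []byte(`{"links":{
		"app_ssh":{"href":"ssh.cf.cf-app.com:2222"},
		"app_ssh_ws":{"href":"wss://ssh.cf.cf-app.com"}}}`)
	ws, legacy, err := sshEndpoints(body)
	if err != nil {
		panic(err)
	}
	fmt.Println("preferred (app_ssh_ws):", ws)
	fmt.Println("fallback (app_ssh):", legacy)
}
```

Using pointers for the two links lets the caller tell a missing link apart from a link with an empty `href`, which matches the rule that a disabled feature MUST NOT be returned at all.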
194+
195+
196+
A `tunnelSSHThruWebSocket` function could look something like this:
197+
```
// Assumes the golang.org/x/net/websocket package
func tunnelSSHThruWebSocket(url string, config *ssh.ClientConfig) (SecureClient, error) {
	wsConn, err := websocket.Dial(url, "", url)
	if err != nil {
		return nil, err
	}
	// SSH is a binary protocol, so send binary frames instead of the default text frames
	wsConn.PayloadType = websocket.BinaryFrame

	c, chans, reqs, err := ssh.NewClientConn(wsConn, url, config)
	if err != nil {
		_ = wsConn.Close() // do not leak the tunnel if the SSH handshake fails
		return nil, err
	}
	sshClient := ssh.NewClient(c, chans, reqs)
	return &secureClient{client: sshClient}, nil
}
```
212+
213+
- The rest of CF CLI SHOULD be unchanged and work as before.
214+
215+
#### CAPI
216+
217+
- CAPI MUST provide an additional `app_ssh_ws` object in the "/" endpoint, if the feature is enabled:
218+
```
"app_ssh_ws": {
  "href": "wss://ssh.cf.cf-app.com",
  "meta": {
    "host_key_fingerprint": "TD1dRQINLi2KxilVLzAI8tXB2h8MP79oyVJnUwshjdc",
    "oauth_client": "ssh-proxy"
  }
}
```
227+
- CAPI MUST provide both `app_ssh` and `app_ssh_ws` objects in the "/" endpoint, if both features are enabled.
228+
- CAPI MUST NOT change the `app_ssh` object in any way to remain backwards compatible.
229+
- The URL presented in the `app_ssh_ws.href` field MUST be `wss://` + the route as announced by route-registrar for SSH proxy (see [CF Deployment](#cf-deployment)).
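The CAPI rules above can be summarized in a small sketch. It is written in Go only for consistency with the other examples in this RFC and is not meant to reflect CAPI's actual implementation. The `sshLinks` function and its parameters are hypothetical: an empty `legacyAddr` stands for "legacy SSH disabled" and an empty `wsRoute` for "SSH-over-WebSocket disabled".

```go
package main

import (
	"encoding/json"
	"fmt"
)

type sshMeta struct {
	HostKeyFingerprint string `json:"host_key_fingerprint"`
	OAuthClient        string `json:"oauth_client"`
}

type sshLink struct {
	Href string  `json:"href"`
	Meta sshMeta `json:"meta"`
}

// sshLinks (hypothetical) builds the SSH-related entries of the "links" object.
// The app_ssh entry is passed through unchanged; app_ssh_ws is "wss://" plus
// the route announced by route-registrar.
func sshLinks(legacyAddr, wsRoute string, meta sshMeta) map[string]sshLink {
	links := map[string]sshLink{}
	if legacyAddr != "" {
		links["app_ssh"] = sshLink{Href: legacyAddr, Meta: meta}
	}
	if wsRoute != "" {
		links["app_ssh_ws"] = sshLink{Href: "wss://" + wsRoute, Meta: meta}
	}
	return links
}

func main() {
	meta := sshMeta{
		HostKeyFingerprint: "TD1dRQINLi2KxilVLzAI8tXB2h8MP79oyVJnUwshjdc",
		OAuthClient:        "ssh-proxy",
	}
	out, err := json.MarshalIndent(sshLinks("ssh.cf.cf-app.com:2222", "ssh.cf.cf-app.com", meta), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```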
230+
231+
#### SSH Proxy
232+
233+
- Diego SSH Proxy SHOULD provide the option to [launch a separate WebSocket server](https://github.com/cloudfoundry/diego-ssh/blob/main/cmd/ssh-proxy/main.go#L65) besides the regular SSH server
234+
- SSH Proxy SHOULD provide a separate handler for unwrapping WebSocket connections:
235+
```
sshWsProxy := wsProxy.New(logger, proxySSHServerConfig, metronClient, tlsConfig)
wsServer := server.NewServer(logger, sshProxyConfig.Address, websocket.Handler(sshWsProxy), time.Duration(sshProxyConfig.IdleConnectionTimeout))
```
239+
Changes in [ssh-proxy/main.go](https://github.com/cloudfoundry/diego-ssh/blob/main/cmd/ssh-proxy/main.go#L65C1-L67C1)
240+
241+
```
func (p *WsProxy) HandleConnection(wsConn *websocket.Conn) {
	logger := p.logger.Session("handle-ws-connection")
	defer wsConn.Close()

	// Here we start a SSH handshake inside the WebSocket tunnel
	serverConn, serverChannels, serverRequests, err := ssh.NewServerConn(wsConn, p.serverConfig)

	(...)
```
251+
Changes in [proxy.go](https://github.com/cloudfoundry/diego-ssh/blob/main/proxy/proxy.go#L67)
252+
253+
**Remark:**
254+
(these changes are not meant as verbatim code; they only illustrate areas that would need to change)
255+
256+
257+
- SSH Proxy SHOULD retain the legacy SSH server on port 2222 for backwards compatibility
258+
- The code of the legacy SSH feature SHOULD NOT be deleted
259+
- Diego Release SHOULD provide [a spec](https://github.com/cloudfoundry/diego-release/blob/develop/jobs/ssh_proxy/spec#L53) to define a `diego.ssh_proxy.listen_ws_addr` property with a default of `0.0.0.0:443` or `0.0.0.0:9443`
260+
- Diego Release SHOULD provide [a spec](https://github.com/cloudfoundry/diego-release/blob/develop/jobs/ssh_proxy/spec#L59) to disable the `diego.ssh_proxy.listen_addr` property, closing the default port of 2222.
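How the two listen properties relate to scenarios A, B and C can be sketched as below. This is a minimal illustration, assuming that an empty address means the corresponding server is disabled. The `proxyConfig` type, its field names and the `scenario` function are hypothetical and do not exist in diego-ssh.

```go
package main

import (
	"errors"
	"fmt"
)

// proxyConfig (hypothetical) mirrors the two listen properties.
type proxyConfig struct {
	ListenAddr   string // diego.ssh_proxy.listen_addr, e.g. "0.0.0.0:2222"; empty = disabled
	ListenWSAddr string // diego.ssh_proxy.listen_ws_addr, e.g. "0.0.0.0:9443"; empty = disabled
}

// scenario reports which scenario of this RFC a configuration corresponds to.
func scenario(cfg proxyConfig) (string, error) {
	switch {
	case cfg.ListenAddr != "" && cfg.ListenWSAddr != "":
		return "B", nil // dual SSH / WS
	case cfg.ListenWSAddr != "":
		return "C", nil // WS only, port 2222 can be closed
	case cfg.ListenAddr != "":
		return "A", nil // legacy SSH
	default:
		return "", errors.New("neither listen_addr nor listen_ws_addr is set")
	}
}

func main() {
	configs := []proxyConfig{
		{ListenAddr: "0.0.0.0:2222"},
		{ListenAddr: "0.0.0.0:2222", ListenWSAddr: "0.0.0.0:9443"},
		{ListenWSAddr: "0.0.0.0:9443"},
	}
	for _, cfg := range configs {
		s, err := scenario(cfg)
		if err != nil {
			panic(err)
		}
		fmt.Printf("listen_addr=%q listen_ws_addr=%q -> scenario %s\n", cfg.ListenAddr, cfg.ListenWSAddr, s)
	}
}
```

Rejecting a configuration with both addresses empty is a design choice of this sketch; an operator who wants no SSH access at all would more likely disable the job or the feature instead.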
261+
262+
#### CF Deployment
263+
264+
- CF Deployment [deploys SSH Proxy on Scheduler](https://github.com/cloudfoundry/cf-deployment/blob/main/cf-deployment.yml#L1329) but there is no route to it at the moment.
265+
- CF Deployment SHOULD deploy [route-registrar](https://github.com/cloudfoundry/routing-release/tree/develop/jobs/route_registrar) alongside SSH Proxy and configure it to announce a TLS route at Gorouter:
266+
```
- name: route_registrar
  properties:
    nats:
      tls:
        enabled: true
        client_cert: "((nats_client_cert.certificate))"
        client_key: "((nats_client_cert.private_key))"
    route_registrar:
      routes:
      - name: ssh-ws-proxy
        port: 443 # or 9443 if preferred
        tls_port: 443 # or 9443 if preferred
        registration_interval: 20s
        server_cert_domain_san: ssh.((system_domain))
        uris:
        - ssh.((system_domain)) # e.g. ssh.cf-app.com, announced by CAPI as wss://ssh.cf-app.com
  release: routing
```
285+
286+
287+
### Remaining Compatible with OpenSSH
288+
289+
- Getting the ssh token via `cf ssh-code` MUST remain unchanged as it connects to UAA.
290+
- Regular ssh, scp and sftp commands support a `-o ProxyCommand` option.
291+
- The program run as `ProxyCommand` MUST read from `stdin` and write to `stdout` for ssh to work.
292+
- The CF community MAY offer a simple WebSocket proxy for regular ssh as a [homebrew formula](https://github.com/cloudfoundry/homebrew-tap) or similar option.
293+
- The CF CLI MAY also offer a WebSocket proxy, e.g. as a sub-command like `cf proxy-ssh`
294+
- A simple WebSocket proxy could look like this:
295+
```
package main

import (
	"fmt"
	"io"
	"os"
	"sync"

	"golang.org/x/net/websocket"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: ws-proxy <wss-url>")
		os.Exit(1)
	}
	url := os.Args[1]

	wsConn, err := websocket.Dial(url, "", url)
	if err != nil {
		panic(err)
	}
	// SSH is a binary protocol; the default payload type is text frames
	wsConn.PayloadType = websocket.BinaryFrame

	wg := sync.WaitGroup{}
	wg.Add(2)

	// stdin -> WebSocket
	go func() {
		defer wg.Done()
		if _, err := io.Copy(wsConn, os.Stdin); err != nil {
			// stdout carries the SSH stream, so errors go to stderr
			fmt.Fprintln(os.Stderr, err.Error())
		}
	}()

	// WebSocket -> stdout
	go func() {
		defer wg.Done()
		if _, err := io.Copy(os.Stdout, wsConn); err != nil {
			fmt.Fprintln(os.Stderr, err.Error())
		}
	}()

	wg.Wait()
	_ = wsConn.Close()
}
```
328+
Source of `ws-proxy` used below
329+
330+
Which can be used with regular ssh like so:
331+
```
332+
ssh -o ProxyCommand="ws-proxy wss://ssh.cf-app.com" cf:$(cf app app-name --guid)/[email protected]
333+
```