Skip to content

feat: add SSH subscribe/execute/report remote-ops#1

Open
kooksee wants to merge 13 commits into
masterfrom
feat/tunnel
Open

feat: add SSH subscribe/execute/report remote-ops#1
kooksee wants to merge 13 commits into
masterfrom
feat/tunnel

Conversation

@kooksee

@kooksee kooksee commented Jun 6, 2026

Copy link
Copy Markdown

Device agent (BusyBox sh, no runtime) + cloud Go dispatcher (gliderlabs/ssh, independent of bastion):

  • device-agent/remote-agent.sh: poll fetch -> run via sh -> report, with offline spooling, idempotency and per-task timeout; inittab respawn snippet.
  • device-server/main.go: pubkey->devid auth (device never self-reports id), fetch/report verbs over SSH exec, file-backed task queue, taskid path- traversal allowlist, 4 MiB result cap.

devid is derived from the authenticated public key; host-key pinning enforced on the device side (no -y). Verified end-to-end (fetch/execute/report, unauth key rejected, path traversal rejected).

Device agent (BusyBox sh, no runtime) + cloud Go dispatcher (gliderlabs/ssh,
independent of bastion):

- device-agent/remote-agent.sh: poll fetch -> run via sh -> report, with
  offline spooling, idempotency and per-task timeout; inittab respawn snippet.
- device-server/main.go: pubkey->devid auth (device never self-reports id),
  fetch/report verbs over SSH exec, file-backed task queue, taskid path-
  traversal allowlist, 4 MiB result cap.

devid is derived from the authenticated public key; host-key pinning enforced
on the device side (no -y). Verified end-to-end (fetch/execute/report, unauth
key rejected, path traversal rejected).

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a minimal SSH-based 'subscribe / execute / report' system consisting of a shell-based device agent (remote-agent.sh) and a Go-based dispatcher server (device-server). Feedback on these changes highlights several critical security and robustness improvements. Key recommendations include validating taskid on the agent and devid on the server to prevent path traversal vulnerabilities, explicitly rejecting empty device IDs, avoiding standard input conflicts when executing scripts on the agent, preserving SSH stderr output for troubleshooting, properly detecting truncated execution reports, and handling file deletion errors to prevent infinite execution loops.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

[ -n "$resp" ] || return 0

taskid=$(printf '%s\n' "$resp" | head -n1)
[ -n "$taskid" ] || return 0

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The taskid received from the server is used directly to write files in the spool directory ($SPOOL/$taskid). If the server is compromised or spoofed, a malicious taskid containing path traversal characters (e.g., ../../etc/cron.d/malicious) could allow arbitrary file write on the device. Validate the taskid to ensure it only contains safe characters.

Suggested change
[ -n "$taskid" ] || return 0
[ -n "$taskid" ] || return 0
case "$taskid" in
*[!A-Za-z0-9._-]* | "." | "..")
log "invalid taskid: $taskid"
return 0
;;
esac

Comment thread device-server/main.go Outdated
}

func (d *dispatcher) handle(sess ssh.Session) {
devid, _ := sess.Context().Value("devid").(string)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

If devid is empty (e.g., due to a misconfiguration or if another authentication mechanism is added later), the server will perform operations on the root of tasksDir or resultsDir (since filepath.Join(dir, "") resolves to dir). This could allow unauthorized access or directory listing of all tasks. Explicitly reject connections with an empty devid.

	devid, _ := sess.Context().Value("devid").(string)
	if devid == "" {
		io.WriteString(sess.Stderr(), "unauthorized: missing device identity\n")
		_ = sess.Exit(1)
		return
	}


script=$(printf '%s\n' "$resp" | tail -n +2)
log "executing task $taskid"
out=$(printf '%s\n' "$script" | timeout "$TASK_TIMEOUT" sh 2>&1)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Piping the script body into sh's standard input can cause issues if any command inside the script (such as read, ssh, or interactive tools) also reads from standard input. Those commands will consume the remaining lines of the script itself, leading to truncated execution or unexpected behavior. Running the script via sh -c with stdin redirected from /dev/null avoids this issue and is more efficient.

Suggested change
out=$(printf '%s\n' "$script" | timeout "$TASK_TIMEOUT" sh 2>&1)
out=$(timeout "$TASK_TIMEOUT" sh -c "$script" < /dev/null 2>&1)

run_once() {
flush_spool

resp=$(dbc fetch 2>/dev/null) || { log "fetch failed"; return 0; }

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redirecting stderr to /dev/null for dbc fetch hides useful SSH connection errors (e.g., host key mismatch, network timeout, authentication failure). Removing 2>/dev/null allows these errors to be printed to stderr, which is captured by the init system or syslog, greatly improving observability and troubleshooting.

Suggested change
resp=$(dbc fetch 2>/dev/null) || { log "fetch failed"; return 0; }
resp=$(dbc fetch) || { log "fetch failed"; return 0; }

Comment thread device-server/main.go Outdated
Comment on lines +60 to +68
fields := strings.SplitN(line, " ", 2)
if len(fields) != 2 {
return nil, fmt.Errorf("line %d: expected '<devid> <pubkey>'", i+1)
}
pk, _, _, _, err := gossh.ParseAuthorizedKey([]byte(fields[1]))
if err != nil {
return nil, fmt.Errorf("line %d: %w", i+1, err)
}
devs = append(devs, device{id: fields[0], key: pk})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Validate the devid loaded from the authorized_devices file to ensure it only contains safe characters (reusing validTaskID). This acts as a defense-in-depth measure to prevent path traversal or other unexpected filesystem issues if the configuration file is misconfigured or maliciously modified.

		fields := strings.SplitN(line, " ", 2)
		if len(fields) != 2 {
			return nil, fmt.Errorf("line %d: expected '<devid> <pubkey>'", i+1)
		}
		devid := fields[0]
		if !validTaskID(devid) {
			return nil, fmt.Errorf("line %d: invalid devid %q", i+1, devid)
		}
		pk, _, _, _, err := gossh.ParseAuthorizedKey([]byte(fields[1]))
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", i+1, err)
		}
		devs = append(devs, device{id: devid, key: pk})

Comment thread device-server/main.go Outdated
Comment on lines +157 to +162
body, err := io.ReadAll(io.LimitReader(sess, d.maxResult))
if err != nil {
log.Printf("report read %s/%s: %v", devid, taskid, err)
_ = sess.Exit(1)
return
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using io.LimitReader(sess, d.maxResult) silently truncates the report if it exceeds the maximum size, and the server still deletes the task as if it succeeded. This can lead to silent loss of execution logs. Reading d.maxResult + 1 bytes allows the server to detect when the limit is exceeded, reject the report with an error, and keep the task in the queue for troubleshooting.

Suggested change
body, err := io.ReadAll(io.LimitReader(sess, d.maxResult))
if err != nil {
log.Printf("report read %s/%s: %v", devid, taskid, err)
_ = sess.Exit(1)
return
}
body, err := io.ReadAll(io.LimitReader(sess, d.maxResult+1))
if err != nil {
log.Printf("report read %s/%s: %v", devid, taskid, err)
_ = sess.Exit(1)
return
}
if int64(len(body)) > d.maxResult {
io.WriteString(sess.Stderr(), "error: result exceeds maximum allowed size\n")
_ = sess.Exit(2)
return
}

Comment thread device-server/main.go Outdated
_ = sess.Exit(1)
return
}
_ = os.Remove(filepath.Join(d.tasksDir, devid, taskid))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Ignoring the error from os.Remove can lead to silent infinite task execution loops if the task file cannot be deleted (e.g., due to permission issues). Log the error so that operators are alerted to filesystem issues.

	if err := os.Remove(filepath.Join(d.tasksDir, devid, taskid)); err != nil {
		log.Printf("failed to remove completed task %s/%s: %v", devid, taskid, err)
	}

kooksee added 12 commits June 6, 2026 17:50
- Removed the standalone device-server, integrating its fetch/report logic into the sshportal main program.
- Updated README to reflect the new management commands for devices, device keys, and device tasks.
- Implemented new database models for Device, DeviceKey, DeviceTask, and DeviceReport.
- Added device management commands in the bastion shell for creating, inspecting, enabling, disabling, and removing devices.
- Introduced device task management commands for creating, inspecting, listing, and canceling tasks.
- Implemented device RPC handling for fetch and report commands, ensuring compatibility with existing device-agent scripts.
- Enhanced authentication context to support device-specific operations.
…nality

- Added WebSession model to handle cookie-based authentication.
- Introduced session management in the API server with session creation and expiration.
- Implemented login and logout handlers to manage user sessions.
- Enhanced auth middleware to support both API tokens and web sessions.
- Added session garbage collection to periodically clean up expired sessions.
- Updated server to serve static files with optional on-disk overrides for the embedded UI.
- Introduced a new endpoint for user authentication and session management.
- Added a full-stack demo for end-to-end testing of the new features.
- Introduced new API endpoints for managing hosts and SSH keys, including listing, creating, retrieving, and deleting hosts.
- Implemented WebSocket support for shell access to hosts, allowing users to interact with remote systems via a terminal interface.
- Enhanced the Server struct to include AES key and ACL check command for secure operations.
- Added CheckACL function to share authorization logic between the web API and bastion.
- Updated the API server initialization to accept new parameters for encryption and ACL checks.
- Introduced new SFTP handling routes in the API to manage file operations (list, upload, download, mkdir, remove, rename).
- Implemented SFTP client connection and authentication logic in a new sftp.go file.
- Updated the existing hosts.go file to route SFTP requests appropriately.
- Added necessary data structures and response formats for SFTP operations.
- Enhanced error handling and logging for SFTP actions.
- Introduced device status derived from LastSeenAt with states: online, stale, and offline.
- Implemented deviceInfo model to store periodic self-reports from devices.
- Enhanced taskDTO to include task history and duration metrics.
- Added new API endpoints for device info and task management.
- Implemented event logging for task lifecycle events including queued, dispatched, reported, and cancelled.
- Updated the agent to send periodic info snapshots to the server.
- Added documentation for the device agent and REST API usage.
- Improved error handling and response structures across various endpoints.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant