-
Notifications
You must be signed in to change notification settings - Fork 35
Configurable viewport #73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performed full review of 2d73dc8...fc57289
Analysis
-
Session Disruption Risk: The requirement to restart Chromium after resolution changes could negatively impact user experience and session continuity.
-
Deployment Infrastructure Coupling: Direct modification of supervisor configuration files for Xvfb creates tight coupling between the application and deployment infrastructure.
-
System Component Coordination Complexity: The implementation requires orchestrating multiple system components (X server, supervisor, Chromium), increasing the number of potential failure points and making the system more brittle.
-
Runtime Flexibility vs. Stability Tradeoff: While the configurable viewport adds runtime flexibility, it also introduces additional complexity that could impact system stability.
Tip
⚡ Quick Actions
This review was generated by Mesa.
Actions:
Slash Commands:
/review- Request a full code review/review latest- Review only changes since the last review/describe- Generate PR description. This will update the PR body or issue comment depending on your configuration/help- Get help with Mesa commands and configuration options
4 files reviewed | 0 comments | Review on Mesa | Edit Reviewer Settings
| wget ca-certificates python2 supervisor xclip xdotool \ | ||
| pulseaudio dbus-x11 xserver-xorg-video-dummy \ | ||
| libcairo2 libxcb1 libxrandr2 libxv1 libopus0 libvpx7 \ | ||
| x11-xserver-utils \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need this for xrandr
| s.ProcessExec(ctx, oapi.ProcessExecRequestObject{Body: &removeEnvReq}) | ||
|
|
||
| // Add the environment line with WIDTH and HEIGHT | ||
| addEnvCmd := []string{"-lc", fmt.Sprintf(`sed -i '/\[program:xvfb\]/a environment=WIDTH="%d",HEIGHT="%d",DPI="96",DISPLAY=":1"' /etc/supervisor/conf.d/services/xvfb.conf`, width, height)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic is sufficiently intricate where an e2e test might be beneficial. See https://github.com/onkernel/kernel-images/blob/main/server/e2e/e2e_chromium_test.go#L102-L219
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great stuff!
two general thoughts for neko:
- instead of raw http requests I wonder if it would be easier to generate an oapi client of their spec (e.g. https://github.com/onkernel/neko/blob/aa0487f68ebbff1056d7355ec2da127986e5be5f/server/openapi.yaml#L146-L170 +
- It might be helpful to pull out the neko bits into their own client that internally manages the token rather than that being a responsibility of the
ApiService
| refresh_rate: | ||
| type: integer | ||
| enum: [60, 30, 25] | ||
| description: Display refresh rate in Hz. If omitted, uses the highest available rate for the resolution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double checking - did we also run numbers on impact of refresh rate in benchmarking? I don't see anything in https://docs.google.com/spreadsheets/d/1VBhIEoNTJoD95gKsalM3XUH2lDnz2XwFlyVwECV-45I/edit?gid=0#gid=0
namely, I'd think higher refresh rates would consume more resources for the gstreamer pipeline backing live views, so I'm curious on the numbers there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on the api side, we are only going to allow these following configurations
1024x768x60
1920x1080x60
2560x1440x30
1920 x 1200x60
1440 x 900x30
On the images side, I think we can leave it flexible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did we also want 10 in here?
server/openapi.yaml
Outdated
| live_view_sessions: | ||
| type: integer | ||
| description: Number of active Neko viewer sessions. | ||
| is_recording: | ||
| type: boolean | ||
| description: Whether recording is currently active | ||
| is_replaying: | ||
| type: boolean | ||
| description: Whether replay is currently active | ||
| resizable_now: | ||
| type: boolean | ||
| description: True when no blockers are present for resizing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need all these bits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess not... I can just return the current width, height and refresh_rate
server/cmd/api/api/display.go
Outdated
| if requireIdle { | ||
| live := s.getActiveNekoSessions(ctx) | ||
| isRecording := s.anyRecordingActive(ctx) | ||
| isReplaying := false // replay not currently implemented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this droppable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was my thought process.
We can make the images api to be flexible to accommodate both cases where it requires no active sessions or not.
If the request says requireIdle, then we reject the request if there are active sessions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah sorry this was in reference to isReplaying. I think it's just a dupe of isRecording and it's not implemented
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah I see. Yeah, I will do that. Good thanks!!
| displayMode := s.detectDisplayMode(ctx) | ||
|
|
||
| // Parse restartChromium flag (default depends on mode) | ||
| restartChrome := (displayMode == "xvfb") // default true for xvfb, false for xorg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking my understanding - why don't we restart for headful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we use this flag to --start-maximized to start chromium.
So when x.org expands, the chromium expands. But, we can optionally override this through the request. chromium restarts take time, so we want to avoid it as much as possible
| // detectDisplayMode detects whether we're running Xorg (headful) or Xvfb (headless) | ||
| func (s *ApiService) detectDisplayMode(ctx context.Context) string { | ||
| log := logger.FromContext(ctx) | ||
| checkCmd := []string{"-lc", "supervisorctl status xvfb >/dev/null 2>&1 && echo 'xvfb' || echo 'xorg'"} | ||
| checkReq := oapi.ProcessExecRequest{Command: "bash", Args: &checkCmd} | ||
| checkResp, _ := s.ProcessExec(ctx, oapi.ProcessExecRequestObject{Body: &checkReq}) | ||
|
|
||
| if execResp, ok := checkResp.(oapi.ProcessExec200JSONResponse); ok { | ||
| if execResp.StdoutB64 != nil { | ||
| if output, err := base64.StdEncoding.DecodeString(*execResp.StdoutB64); err == nil { | ||
| outputStr := strings.TrimSpace(string(output)) | ||
| if outputStr == "xvfb" { | ||
| log.Info("detected Xvfb display (headless mode)") | ||
| return "xvfb" | ||
| } | ||
| } | ||
| } | ||
| } | ||
| log.Info("detected Xorg display (headful mode)") | ||
| return "xorg" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
honestly wonder if this is easier as an env var
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah.... good point. Let me research this. I talked about this with Raf as well. He was more comfortable with this approach
server/cmd/api/api/display.go
Outdated
| // setResolutionXorgViaNeko changes resolution for Xorg using Neko API | ||
| func (s *ApiService) setResolutionXorgViaNeko(ctx context.Context, width, height, refreshRate int, restartChrome bool) error { | ||
| if err := s.setResolutionViaNeko(ctx, width, height, refreshRate); err != nil { | ||
| return fmt.Errorf("failed to change resolution via Neko API: %w", err) | ||
| } | ||
|
|
||
| if restartChrome { | ||
| s.restartChromium(ctx) | ||
| } | ||
|
|
||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this wrapper?
server/cmd/api/api/display.go
Outdated
| // getNekoToken obtains a bearer token from Neko API for authentication. | ||
| // It caches the token and reuses it for subsequent requests. | ||
| func (s *ApiService) getNekoToken(ctx context.Context) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - we should pull the neko port / url from the env instead of hardcoding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how long do these tokens last for? do they survive neko restarts? mainly wondering if we could just do it at API start up time so it's always ready but can see there could be problems with that approach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the token lifetime is technically infinite. But you can manually expire it or rotate it. Maybe it is safer to maintain the current implementation?
| eval "FLAGS_ARRAY=($CHROMIUM_FLAGS)" | ||
| else | ||
| FLAGS_ARRAY=() | ||
| fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Environment Variable Injection Vulnerability
The use of eval "FLAGS_ARRAY=($CHROMIUM_FLAGS)" in both scripts introduces a command injection vulnerability. Since CHROMIUM_FLAGS can be set via environment variables, malicious input could lead to arbitrary shell code execution.
Additional Locations (1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is fine. It is not for production use case
| fi | ||
| done | ||
| FLAGS_JSON+=']}' | ||
| echo "$FLAGS_JSON" > "$FLAGS_FILE" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Shell Code Injection via User-Controlled Flags
The eval "FLAGS_ARRAY=($CHROMIUM_FLAGS)" command in both scripts introduces a command injection vulnerability. Since CHROMIUM_FLAGS can be user-controlled, malicious input could lead to arbitrary shell code execution.
Additional Locations (1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, not user facing and this is fine
| // Note: Using bash -c (not -lc) to avoid login shell overriding DISPLAY env var | ||
| cmd := exec.CommandContext(ctx, "bash", "-c", "xrandr | grep -E '\\*' | awk '{print $1}'") | ||
| cmd.Env = append(os.Environ(), fmt.Sprintf("DISPLAY=%s", display)) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Insecure Command Execution in Resolution Fetching
The getCurrentResolution function directly uses exec.CommandContext with bash -c instead of the ProcessExec API. This approach is fragile and vulnerable to command injection due to unescaped DISPLAY and xrandr output parsing. It also bypasses internal process controls and logging, and handles the DISPLAY environment variable inconsistently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not user facing path. This is fine
| "--privileged", | ||
| "--network=host", | ||
| "-p", "10001:10001", // API server | ||
| "-p", "9222:9222", // DevTools proxy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: API URL Mismatch After Network Change
The runContainer function switched from --network=host to explicit port mapping (-p 10001:10001). This means the hardcoded apiBaseURL (http://127.0.0.1:10001) no longer correctly points to the container's API server from the host, causing API calls in tests that use runContainer to fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to do this to run this on Mac. I will revert if this is not necessary
| s.ProcessExec(ctx, oapi.ProcessExecRequestObject{Body: &removeEnvReq}) | ||
|
|
||
| // Add the environment line with WIDTH and HEIGHT | ||
| addEnvCmd := []string{"-lc", fmt.Sprintf(`sed -i '/\[program:xvfb\]/a environment=WIDTH="%d",HEIGHT="%d",DPI="96",DISPLAY=":1"' /etc/supervisor/conf.d/services/xvfb.conf`, width, height)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Vulnerability in sed Command Injection
The sed commands in setResolutionXvfb use fmt.Sprintf to inject width and height into shell commands without escaping. This allows for command injection. The direct string manipulation of the supervisor config is also fragile, risking breakage if the format changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is fine. Not user facing
| eval "FLAGS_ARRAY=($CHROMIUM_FLAGS)" | ||
| else | ||
| FLAGS_ARRAY=() | ||
| fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Shell Injection via CHROMIUM_FLAGS
The eval command in run-unikernel.sh and run-docker.sh parses CHROMIUM_FLAGS. This creates a security vulnerability, allowing arbitrary command injection if CHROMIUM_FLAGS contains malicious shell code, as it can be controlled by user input or environment variables.
Additional Locations (1)
Sayan-
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall lgtm! last couple of q's
server/cmd/api/api/api.go
Outdated
| // Initialize Neko authenticated client | ||
| adminPassword := os.Getenv("NEKO_ADMIN_PASSWORD") | ||
| if adminPassword == "" { | ||
| adminPassword = "admin" // Default from neko.yaml | ||
| } | ||
| nekoAuthClient, err := nekoclient.NewAuthClient("http://127.0.0.1:8080", "admin", adminPassword) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to create neko auth client: %w", err) | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we prefer to initial this here instead of passing it in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no reasons lol
| refresh_rate: | ||
| type: integer | ||
| enum: [60, 30, 25] | ||
| description: Display refresh rate in Hz. If omitted, uses the highest available rate for the resolution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did we also want 10 in here?
| // AuthClient wraps the Neko OpenAPI client and handles authentication automatically. | ||
| // It manages token caching and refresh on 401 responses. | ||
| type AuthClient struct { | ||
| client *nekooapi.ClientWithResponses | ||
| tokenMu sync.Mutex | ||
| token string | ||
| username string | ||
| password string | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is way easier to read, thanks for iterating!
| N25 PatchDisplayRequestRefreshRate = 25 | ||
| N30 PatchDisplayRequestRefreshRate = 30 | ||
| N60 PatchDisplayRequestRefreshRate = 60 | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d1189ff to
9802b64
Compare
9802b64 to
513dcbb
Compare
| else | ||
| FLAGS_JSON+=",\"$flag\"" | ||
| fi | ||
| fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Shell Command Injection via eval
The eval command used to parse CHROMIUM_FLAGS in both run-unikernel.sh and run-docker.sh introduces a command injection vulnerability. If CHROMIUM_FLAGS contains malicious shell commands, eval will execute them, potentially leading to arbitrary code execution.
Additional Locations (1)
| fi | ||
| done | ||
| FLAGS_JSON+=']}' | ||
| echo "$FLAGS_JSON" > "$FLAGS_FILE" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: JSON Flag Parsing Fails on Special Characters
The JSON generation for Chromium flags doesn't escape special characters within flag values. If a flag contains double quotes, backslashes, or other JSON special characters, the resulting JSON will be invalid. This could cause the downstream flag parsing system to fail or misinterpret flags.
<!-- mesa-description-start -->

<!-- mesa-description-start -->
<!-- mesa-description-start -->
<!-- mesa-description-start -->
Initially
After running
For headless, I exec-ed into the running docker, and I confirmed the xvfb config by
TL;DR
This PR introduces a
PATCH /displayAPI endpoint to dynamically change the browser's viewport resolution without restarting the container.Why we made these changes
This allows users to programmatically control the browser's viewport size, which is crucial for testing responsive designs, simulating different device screens, and tailoring the resolution for various automation tasks.
What changed?
PATCH /displayendpoint inserver/cmd/api/api/display.goto allow dynamic resolution changes for both headful (Xorg) and headless (Xvfb) modes.xrandrfor Xorg and supervisor restarts for Xvfb. It includes arequire_idlesafety check and restarts Chromium to apply the new viewport.server/lib/nekoclient/client.go) to manage headful sessions.x11-xserver-utilspackage to theimages/chromium-headful/Dockerfileto providexrandr.images/chromium-headful/xorg.confto include additional display modes.openapi.yamlwith the new display endpoint and regenerated the Go client.run-docker.shandrun-unikernel.shto pass Chromium flags as a JSON array instead of a space-separated string.TestDisplayResolutionChange, to validate the functionality.Validation
curlcommand.require_idleflag prevents changes during an active session.Description generated by Mesa. Update settings
<!-- mesa-description-end -->
Description generated by Mesa. Update settings
<!-- mesa-description-end -->
Description generated by Mesa. Update settings
<!-- mesa-description-end -->
Description generated by Mesa. Update settings