Skip to content

📡 Add ftp server and cloudrun deployment configuration#8

Open
stevejpurves wants to merge 4 commits intomainfrom
add-ftp-service
Open

📡 Add ftp server and cloudrun deployment configuration#8
stevejpurves wants to merge 4 commits intomainfrom
add-ftp-service

Conversation

@stevejpurves
Copy link
Member

@stevejpurves stevejpurves commented Feb 5, 2026

Ok, so this adds the server as a package that package is also dependent on our SCMS Cloud Run client package which is back in the main monorepo.

There is also now a services/ folder with the actual Cloud Run deployment assets and scripts. We're going to use the same pattern of the services folder on the main monorepo as well for our other cloud room functions.

There's no secrets or anything in here. I've checked. There is something in pubsub.sh, but they're not secrets. They do sort of expose service account identifiers for the staging server. This was already in a public repo, and I think it's still very low risk. You can't actually do anything about those without credentials to the GCP project, so I think it's fine.


Note

Medium Risk
Introduces a new network-facing deposit service plus Cloud Run build/deploy scripts, and refactors PMC email message persistence to shared utilities, which could affect ingestion/status updates if misconfigured.

Overview
Adds a new @hhmi/pmc-ftp-service package implementing an Express/PubSub-driven workflow to validate deposit manifests, stream-download submission files with concurrency limits, generate manifest.txt/bulk_meta.xml, bundle into tar.gz, and upload to an SFTP target directory.

Adds Cloud Run deployment assets under services/pmc-ftp-cloudrun (Dockerfile, env sample, local/build/deploy scripts, and test scripts) to build and run the bundled service.

PMC backend alignment/refactor: bumps @curvenote/common and updates email-db.server.ts to use shared createMessageRecord/updateMessageStatus utilities instead of direct Prisma/UUID handling. Includes Changesets for versioning.

Written by Cursor Bugbot for commit 0d228de. This will update automatically on new commits. Configure here.

@changeset-bot
Copy link

changeset-bot bot commented Feb 5, 2026

🦋 Changeset detected

Latest commit: 0d228de

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@hhmi/pmc-ftp-service Minor
@hhmi/pmc Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@stevejpurves stevejpurves changed the title Add ftp server and cloudrun deployment configuration 📡 Add ftp server and cloudrun deployment configuration Feb 5, 2026
},
},
},
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test script has wrong message structure

Medium Severity

The test data structure in testPost.ts doesn't match the expected API format. The service expects message.data to contain base64-encoded JSON of the manifest, and message.attributes to contain userId, successState, failureState, statusUrl, jobUrl, and handshake. Instead, the test incorrectly places the manifest inside message.attributes.manifest and completely omits the required data field and all required attributes. This test will always fail immediately with a "no message data" error.

Fix in Cursor Fix in Web

const dirExists = await sftpClient.exists(targetDir);
if (!dirExists) {
console.log('Creating target directory:', targetDir);
await sftpClient.mkdir(targetDir);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SFTP mkdir fails without parent directory

Low Severity

The sftpClient.mkdir(targetDir) call creates a directory like upload/2024-01-15 but doesn't use the recursive option. If the upload/ parent directory doesn't exist on the SFTP server, the mkdir operation will fail. The ssh2-sftp-client mkdir method accepts a second boolean parameter for recursive creation that isn't being used here.

Fix in Cursor Fix in Web

console.log('Uploaded tar file successfully');

// Clean up SFTP connection
await sftpClient.end();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SFTP connection not closed on error after connect

Medium Severity

The sftpClient is connected at line 282-287, but if subsequent operations (exists, mkdir, or put) throw an error, the catch block at line 324 never closes the connection. The sftpClient variable is scoped to the try block and inaccessible from catch. Each failed upload after successful connection leaves an open SFTP connection that persists until timeout or container recycling, potentially exhausting server connection limits under sustained error conditions.

Additional Locations (1)

Fix in Cursor Fix in Web


// Clean and recreate temporary folder (match original behavior: empty dir for this job)
removeFolder(tmpFolder);
await fs.mkdir(tmpFolder, { recursive: true });
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially async removeFolder called without await

Medium Severity

The removeFolder(tmpFolder) call on line 78 is not awaited, while fs.mkdir on line 79 is awaited. Following Node.js naming conventions, functions without a "Sync" suffix are typically async. If removeFolder returns a Promise, this creates a race condition where mkdir may execute before the folder is fully removed, potentially leaving stale files from previous jobs that could be included in the archive and uploaded to PMC.

Fix in Cursor Fix in Web

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 5 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

"build": "npm run build:service && ./build.sh",
"deploy": "./deploy.sh",
"build:local": "npm run build:service && docker build --tag pmc-ftp .",
"dev": "npm run build:local && ./run.sh"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Local dev image tag mismatch

Medium Severity

npm run dev builds a Docker image tagged pmc-ftp, but run.sh starts pmc-ftp-local. This makes the default local workflow fail because the expected image name is never built.

Additional Locations (1)

Fix in Cursor Fix in Web

@@ -0,0 +1,3 @@
*.sh
README.md
scripts No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build context can leak environment secrets

Medium Severity

.gcloudignore does not exclude .env, while deployment scripts read sensitive values like FTP_PASSWORD from that file. gcloud builds submit can upload .env to the remote build context, unnecessarily exposing secrets outside local development.

Fix in Cursor Fix in Web

// Generate XML metadata file required by PMC
const xml = pmcXmlFromManifest(manifest);
console.log('Generated XML metadata');
await fs.writeFile(path.join(tmpFolder, 'bulk_meta.xml'), xml);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reserved filenames can overwrite user files

Medium Severity

The service writes generated manifest.txt and bulk_meta.xml into the same folder as downloaded files without checking filename collisions. If manifest.files includes either reserved name, the downloaded file is overwritten and the archive contents become incorrect.

Additional Locations (1)

Fix in Cursor Fix in Web

const bufferSize = parseInt(process.env.STREAM_BUFFER_KB ?? '256') * 1024;
const writeStream = createWriteStream(filePath, {
highWaterMark: bufferSize, // Controls internal buffer size
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalid buffer env can crash downloads

Low Severity

STREAM_BUFFER_KB is parsed with parseInt and used directly as highWaterMark without validation. Non-numeric or zero/negative values can produce an invalid stream configuration and crash file download handling at runtime.

Fix in Cursor Fix in Web

WORKDIR /usr/app

ADD index.js ./
ADD *.node ./
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wildcard native copy breaks Docker builds

Medium Severity

Dockerfile uses ADD *.node ./, which fails when no .node files exist in the build context. This can make docker build and Cloud Build fail for builds where the bundled service emits only index.js.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant