Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdded a validation step to shapefile unzip logic that verifies each ZIP entry's canonical path is contained within the target directory and raises an exception if not; tests added to exercise valid extraction and invalid/traversal scenarios. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj (1)
113-120: Consider creating test zip files in a temp directory.Creating zip files in the current working directory (
"invalid.zip", etc.) could cause issues in CI environments or with parallel test execution. UsingFiles/createTempFilewould be more robust.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj` around lines 113 - 120, The helper create-single-entry-zip writes test zip files to cwd; change it to create the zip in a temp location using java.nio.file.Files/createTempFile (or clojure.java.io/file wrapping that) instead of io/file on the given filename: update the function create-single-entry-zip to call Files/createTempFile (use the provided filename as prefix/suffix or drop it), open a ZipOutputStream to that temp file, write the entry as before, and return the temp File (optionally call .toFile() and/or file/deleteOnExit()). This ensures tests create zips in a temp directory safe for CI and parallel runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@search-app/src/cmr/search/services/parameters/converters/shapefile.clj`:
- Around line 55-59: The validate-entry-dir function currently uses a plain
startsWith string check which can be bypassed by path prefix attacks; compute
the canonical path of target-dir once (e.g., target-canonical via
(.getCanonicalPath (.toFile target-dir))) and then validate entries by checking
either equality with target-canonical or that the entry canonical path
startsWith target-canonical plus File/separator (or, better, perform a
path-segment-aware containment check using java.nio.file.Paths/Path#startsWith
to ensure the resolved entry is inside target-dir) and throw the exception if
that containment test fails.
In
`@search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj`:
- Around line 16-17: Duplicate import: ZipInputStream appears in both import
lists; consolidate by removing the duplicate so ZipInputStream is imported only
once. Update the import forms that reference (java.util.zip ZipEntry
ZipOutputStream ZipInputStream) and (java.util.zip ZipFile ZipInputStream) to a
single combined import (e.g., include ZipEntry, ZipOutputStream, ZipInputStream,
and ZipFile together) so ZipInputStream is no longer repeated.
- Around line 135-141: The test for shapefile/unzip-file leaks the extracted
temp directory because it only deletes the source zip (valid-zip) but not the
returned temp directory (result); update the test to delete the extracted
directory in the finally block (e.g., call io/delete-file on result with
recursive true) so both the created zip (valid-zip) and the unpacked temp
directory (result) are cleaned up; ensure this cleanup runs even on assertion
failures within the (try ... finally) around the shapefile/unzip-file call.
---
Nitpick comments:
In
`@search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj`:
- Around line 113-120: The helper create-single-entry-zip writes test zip files
to cwd; change it to create the zip in a temp location using
java.nio.file.Files/createTempFile (or clojure.java.io/file wrapping that)
instead of io/file on the given filename: update the function
create-single-entry-zip to call Files/createTempFile (use the provided filename
as prefix/suffix or drop it), open a ZipOutputStream to that temp file, write
the entry as before, and return the temp File (optionally call .toFile() and/or
file/deleteOnExit()). This ensures tests create zips in a temp directory safe
for CI and parallel runs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9b5928b7-effd-4f95-bdef-993de8db84bf
📒 Files selected for processing (2)
search-app/src/cmr/search/services/parameters/converters/shapefile.cljsearch-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj
| (defn validate-entry-dir | ||
| [target-dir entry] | ||
| (let [canonical (.getCanonicalPath (File. (.toString target-dir) (str entry)))] | ||
| (when-not (.startsWith canonical (.getCanonicalPath (.toFile target-dir))) | ||
| (throw (Exception. "Given zip content is not allowed"))))) |
There was a problem hiding this comment.
Security: String prefix check can be bypassed in zip-slip validation.
The String.startsWith comparison is vulnerable to prefix attacks. An attacker could craft an entry like ../Shapes123-evil/malware.txt which, when the temp directory is /tmp/Shapes123, resolves to /tmp/Shapes123-evil/malware.txt. This path passes the startsWith("/tmp/Shapes123") check while escaping the intended directory.
Use path-segment-aware comparison instead:
🔒 Proposed fix using proper path containment check
(defn validate-entry-dir
[target-dir entry]
- (let [canonical (.getCanonicalPath (File. (.toString target-dir) (str entry)))]
- (when-not (.startsWith canonical (.getCanonicalPath (.toFile target-dir)))
+ (let [entry-path (.toPath (File. (.toString target-dir) (str entry)))
+ canonical-path (.toRealPath entry-path (into-array java.nio.file.LinkOption []))
+ target-canonical (.toRealPath target-dir (into-array java.nio.file.LinkOption []))]
+ (when-not (.startsWith canonical-path target-canonical)
(throw (Exception. "Given zip content is not allowed")))))Alternatively, append File/separator to the target path before the string comparison:
(when-not (or (.equals canonical target-canonical)
(.startsWith canonical (str target-canonical File/separator)))
...)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@search-app/src/cmr/search/services/parameters/converters/shapefile.clj`
around lines 55 - 59, The validate-entry-dir function currently uses a plain
startsWith string check which can be bypassed by path prefix attacks; compute
the canonical path of target-dir once (e.g., target-canonical via
(.getCanonicalPath (.toFile target-dir))) and then validate entries by checking
either equality with target-canonical or that the entry canonical path
startsWith target-canonical plus File/separator (or, better, perform a
path-segment-aware containment check using java.nio.file.Paths/Path#startsWith
to ensure the resolved entry is inside target-dir) and throw the exception if
that containment test fails.
There was a problem hiding this comment.
I cannot use this func because it checks for an existing file, but the purpose of this validation func is to check if the the proposed entry file path is valid here BEFORE we create the actual file.
There was a problem hiding this comment.
@jmaeng72 — you're absolutely right, apologies for that. Path.toRealPath() resolves symlinks against the actual filesystem and throws NoSuchFileException when the file doesn't exist, which is exactly the situation here. That proposed fix was incorrect.
The original string-based issue (prefix attack) can still be addressed without requiring the file to exist. Two valid approaches:
Option 1 — Path.normalize() + Path.startsWith() (path-segment-aware, cleanest since target-dir is already a java.nio.file.Path)
(defn validate-entry-dir
[target-dir entry]
(let [entry-path (.normalize (.resolve target-dir (str entry)))]
(when-not (.startsWith entry-path (.normalize target-dir))
(throw (Exception. "Given zip content is not allowed")))))Path.normalize() resolves ../. lexically (no I/O), and Path.startsWith() (the java.nio.file.Path version) compares path segments, not strings — so /tmp/Shapes123-evil will not satisfy .startsWith("/tmp/Shapes123").
Option 2 — Append File/separator to the target string (minimal change from current code)
(defn validate-entry-dir
[target-dir entry]
(let [canonical (.getCanonicalPath (File. (.toString target-dir) (str entry)))
target-canonical (.getCanonicalPath (.toFile target-dir))]
(when-not (or (.equals canonical target-canonical)
(.startsWith canonical (str target-canonical java.io.File/separator)))
(throw (Exception. "Given zip content is not allowed")))))File.getCanonicalPath() resolves .. without requiring the path to exist, and appending the separator prevents the prefix-attack bypass.
Option 1 is slightly cleaner since target-dir is already a Path. Both correctly address the vulnerability without touching the filesystem for the entry being validated.
🧠 Learnings used
Learnt from: daniel-zamora
Repo: nasa/Common-Metadata-Repository PR: 2361
File: search-app/src/cmr/search/services/parameters/conversion.clj:473-487
Timestamp: 2025-12-30T13:06:02.702Z
Learning: The collection-progress field in CMR collections is a single-valued enum. A collection can only have one progress value at a time, so AND operations on collection-progress parameters do not make logical sense. In code reviews, ensure parameter handlers or validation logic do not treat collection-progress as multi-valued and avoid composing multiple progress values with AND. Treat collection-progress as a single-valued enum and validate inputs accordingly, documenting this constraint where relevant.
There was a problem hiding this comment.
Also, the situation mentioned here is already tested in the unit tests in this PR and I account for this specific "../Shapes123-evil/malware.txt" type of scenario
There was a problem hiding this comment.
🧩 Analysis chain
Let me check the exact test cases to verify what entry names are being tested.
🏁 Script executed:
#!/bin/bash
fd "shapefile.clj" --full-path --type f | xargs grep -n "" | grep -A 80 "unzip-file-test\|create-single-entry-zip"
Length of output: 3844
@jmaeng72, thanks for pointing that out. Looking at the test file closely (lines 123–128), the test entry used is "../somewhere_else.txt" — this case resolves to /tmp/somewhere_else.txt, which does not start with /tmp/ShapesXXXXXX, so it is correctly rejected. That test is valid.
However, the specific prefix-attack scenario I raised — e.g. "../ShapesXXXXXX-evil/file.txt" — is not covered by the current tests. That entry resolves to /tmp/ShapesXXXXXX-evil/file.txt, which would pass the startsWith("/tmp/ShapesXXXXXX") check (the false positive).
That said, I concede this is not practically exploitable in the current code: Files/createTempDirectory "Shapes" generates a random suffix (e.g. Shapes3842917023847) that an attacker cannot predict when crafting their ZIP file. So the bypass path is theoretically constructible but unknowable to an attacker in advance.
If you'd like to close off the theoretical concern entirely, Option 1 from my previous comment (Path.normalize + Path.startsWith) would do it cleanly since target-dir is already a java.nio.file.Path:
(defn validate-entry-dir
[target-dir entry]
(let [entry-path (.normalize (.resolve target-dir (str entry)))]
(when-not (.startsWith entry-path (.normalize target-dir))
(throw (Exception. "Given zip content is not allowed")))))But given the random temp directory name renders the bypass unexploitable, I'll leave this to your discretion.
🧠 Learnings used
Learnt from: jmaeng72
Repo: nasa/Common-Metadata-Repository PR: 2401
File: search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj:129-129
Timestamp: 2026-03-13T14:49:00.112Z
Learning: In `search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj`, the abbreviation "tar" in test descriptions refers to "target" (as in target directory), not the tar archive format.
Learnt from: daniel-zamora
Repo: nasa/Common-Metadata-Repository PR: 2361
File: search-app/src/cmr/search/services/parameters/conversion.clj:473-487
Timestamp: 2025-12-30T13:06:02.702Z
Learning: The collection-progress field in CMR collections is a single-valued enum. A collection can only have one progress value at a time, so AND operations on collection-progress parameters do not make logical sense. In code reviews, ensure parameter handlers or validation logic do not treat collection-progress as multi-valued and avoid composing multiple progress values with AND. Treat collection-progress as a single-valued enum and validate inputs accordingly, documenting this constraint where relevant.
There was a problem hiding this comment.
the key to why the current implementation works is because the file path is validated BEFORE the file is created
There was a problem hiding this comment.
@jmaeng72 — that's a good clarification, and you're right. File.getCanonicalPath() resolves .. lexically at the OS level and does not require the file to exist, so the pre-creation ordering makes the check fully valid. The pattern is sound:
- Validate the entry path (
validate-entry-dir) — purely lexical, no filesystem state required. - Then create the file (
io/copy) — only reached if the path is safe.
Combined with the unpredictable random suffix from Files/createTempDirectory, the implementation is secure. I'm satisfied with the explanation — no changes needed here.
🧠 Learnings used
Learnt from: jmaeng72
Repo: nasa/Common-Metadata-Repository PR: 2401
File: search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj:129-129
Timestamp: 2026-03-13T14:49:00.112Z
Learning: In `search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj`, the abbreviation "tar" in test descriptions refers to "target" (as in target directory), not the tar archive format.
Learnt from: daniel-zamora
Repo: nasa/Common-Metadata-Repository PR: 2361
File: search-app/src/cmr/search/services/parameters/conversion.clj:473-487
Timestamp: 2025-12-30T13:06:02.702Z
Learning: The collection-progress field in CMR collections is a single-valued enum. A collection can only have one progress value at a time, so AND operations on collection-progress parameters do not make logical sense. In code reviews, ensure parameter handlers or validation logic do not treat collection-progress as multi-valued and avoid composing multiple progress values with AND. Treat collection-progress as a single-valued enum and validate inputs accordingly, documenting this constraint where relevant.
search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj
Show resolved
Hide resolved
search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj
Outdated
Show resolved
Hide resolved
eereiter
left a comment
There was a problem hiding this comment.
code rabbit found an issue.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj (2)
135-141:⚠️ Potential issue | 🟡 MinorResource leak: extracted temp directory is not cleaned up.
The test deletes the source zip file but not the extracted directory returned by
result. This leaves temp files on disk after test execution.Proposed fix to clean up temp directory
(testing "valid zip file returns file obj" (let [valid-zip (create-single-entry-zip "valid.zip" "somewhere_else.txt" "something") result (shapefile/unzip-file valid-zip)] (try (is (some? result)) (finally + (when result + (run! io/delete-file (reverse (file-seq result)))) (io/delete-file valid-zip true)))))),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj` around lines 135 - 141, The test for shapefile/unzip-file leaks the extracted temp directory because it only deletes the source zip; update the finally block to also remove the extracted directory returned in result (from shapefile/unzip-file), e.g. check that result is non-nil and call io/delete-file on it with recursive=true so the temp directory produced by shapefile/unzip-file is cleaned up after the test; keep the existing deletion of valid-zip and ensure both deletions run in the finally clause.
16-17:⚠️ Potential issue | 🟡 MinorDuplicate import of
ZipInputStream.
ZipInputStreamis imported on both line 16 and line 17. Consolidate into a single import statement.Proposed fix
- (java.util.zip ZipEntry ZipOutputStream ZipInputStream) - (java.util.zip ZipFile ZipInputStream) + (java.util.zip ZipEntry ZipFile ZipInputStream ZipOutputStream),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj` around lines 16 - 17, Duplicate import of ZipInputStream detected; consolidate the imports by removing the redundant import and ensure ZipInputStream appears only once alongside the other java.util.zip symbols—update the import list that currently mentions (java.util.zip ZipEntry ZipOutputStream ZipInputStream) and (java.util.zip ZipFile ZipInputStream) so that ZipInputStream is included in a single java.util.zip form (e.g., with ZipEntry, ZipOutputStream, ZipFile) and the duplicate line is removed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj`:
- Line 129: Update the test description string in the testing block that
currently reads "invalid file path within the tar dir throws error" to reference
"zip" instead of "tar" (e.g., change the testing description in the (testing
"invalid file path within the tar dir throws error" ...) block in shapefile
tests) so the wording matches the ZIP file functionality being tested.
---
Duplicate comments:
In
`@search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj`:
- Around line 135-141: The test for shapefile/unzip-file leaks the extracted
temp directory because it only deletes the source zip; update the finally block
to also remove the extracted directory returned in result (from
shapefile/unzip-file), e.g. check that result is non-nil and call io/delete-file
on it with recursive=true so the temp directory produced by shapefile/unzip-file
is cleaned up after the test; keep the existing deletion of valid-zip and ensure
both deletions run in the finally clause.
- Around line 16-17: Duplicate import of ZipInputStream detected; consolidate
the imports by removing the redundant import and ensure ZipInputStream appears
only once alongside the other java.util.zip symbols—update the import list that
currently mentions (java.util.zip ZipEntry ZipOutputStream ZipInputStream) and
(java.util.zip ZipFile ZipInputStream) so that ZipInputStream is included in a
single java.util.zip form (e.g., with ZipEntry, ZipOutputStream, ZipFile) and
the duplicate line is removed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 16e4a903-2f4d-45a7-8022-ec6076f90403
📒 Files selected for processing (1)
search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj
search-app/test/cmr/search/test/unit/services/parameter_converters/shapefile.clj
Outdated
Show resolved
Hide resolved
jceaser
left a comment
There was a problem hiding this comment.
I think we should change the file name to a UUID saved to a walled off directory.
Overview
What is the objective?
Add feature
What are the changes?
Add feature
What areas of the application does this impact?
All
Required Checklist
Additional Checklist
Summary by CodeRabbit
Bug Fixes
Tests