Commit 9a6f1d0

Merge pull request #28 from SemClone/fix/download-and-scan-json-parsing
fix: Critical bug in download_and_scan_package JSON parsing (v1.5.8)
2 parents 3dedf26 + ecada6e commit 9a6f1d0

8 files changed: +555 -156 lines changed


CHANGELOG.md

Lines changed: 61 additions & 0 deletions
@@ -7,6 +7,67 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [1.5.8] - 2025-01-13
+
+### Fixed & Redesigned
+
+#### Critical Bug + Complete Redesign: download_and_scan_package
+
+**Problem 1 - Tool was completely broken (v1.5.7):**
+The `download_and_scan_package` tool returned JSON parsing errors:
+```
+"metadata_error": "the JSON object must be str, bytes or bytearray, not CompletedProcess"
+"scan_error": "the JSON object must be str, bytes or bytearray, not CompletedProcess"
+```
+
+**Root Cause:**
+The `_run_tool()` helper returns `subprocess.CompletedProcess` objects, but the code tried to parse them directly as JSON instead of using `.stdout`.
+
+**Problem 2 - Incorrect workflow (v1.5.7):**
+Original implementation tried to use `upmex` and `osslili` with PURLs directly, but these tools require local file paths.
+
+**NEW IMPLEMENTATION - Correct Multi-Method Workflow:**
+
+**Workflow (tries methods in order until sufficient data is collected):**
+1. **Primary**: Use `purl2notices` to download and analyze (fastest, most comprehensive)
+2. **Deep scan**: If incomplete, use `purl2src` to get download URL → download artifact → run `osslili` for deep license scanning + `upmex` for metadata
+3. **Online fallback**: If still incomplete, use `upmex --api clearlydefined` for online metadata
+
+**New Dependencies:**
+- Added `purl2src>=1.2.3` to translate PURLs to download URLs for Step 2
+
+**Changes:**
+```python
+# OLD (v1.5.7) - Broken
+upmex_result = _run_tool("upmex", [purl]) # ❌ upmex needs file path, not PURL
+metadata = json.loads(upmex_result) # ❌ CompletedProcess is not JSON
+
+# NEW (v1.5.8) - Correct multi-method workflow
+# Step 1: Try purl2notices (downloads internally, extracts from cache file)
+purl2notices_result = _run_tool("purl2notices", ["-i", purl, "--cache", temp_cache])
+cache_data = json.loads(open(temp_cache).read()) # ✅ Read cache file
+
+# Step 2: If incomplete, get download URL and download artifact
+purl2src_result = _run_tool("purl2src", [purl, "--format", "json"])
+download_url = json.loads(purl2src_result.stdout)[0]["download_url"]
+urllib.request.urlretrieve(download_url, local_file) # Download artifact
+osslili_result = _run_tool("osslili", [local_file]) # ✅ Run on local file
+upmex_result = _run_tool("upmex", ["extract", local_file]) # ✅ Run on local file
+
+# Step 3: If still incomplete, use online APIs
+upmex_online = _run_tool("upmex", ["extract", temp_file, "--api", "clearlydefined"])
+```
+
+**Impact:**
+- ✅ Tool now works correctly with proper multi-method fallback
+- ✅ Uses the correct workflow: purl2notices → download+osslili+upmex → online APIs
+- ✅ Returns `method_used` field showing which method succeeded
+- ✅ Proper error handling with `methods_attempted` tracking
+- ✅ JSON parsing fixed (uses `.stdout` correctly)
+
+**Thanks:**
+User feedback identified the bugs and clarified the correct workflow design
+
 ## [1.5.7] - 2025-01-13
 
 ### Added
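
The root cause named in the CHANGELOG entry above (passing a `subprocess.CompletedProcess` to `json.loads` instead of its `.stdout`) can be reproduced in isolation. The sketch below is not the project's code: `run_tool` is a hypothetical stand-in for `_run_tool()`, and the "tool" is simply the current Python interpreter printing a small JSON document.

```python
import json
import subprocess
import sys


def run_tool(args):
    """Hypothetical stand-in for the project's _run_tool() helper.

    Assumption: like the helper described in the changelog, it returns the
    subprocess.CompletedProcess object, not the decoded output.
    """
    return subprocess.run(args, capture_output=True, text=True, check=False)


# A fake "tool": the current interpreter printing one JSON object to stdout.
fake_tool = 'import json; print(json.dumps({"license": "MIT"}))'
result = run_tool([sys.executable, "-c", fake_tool])

# v1.5.7 pattern: handing the CompletedProcess itself to json.loads fails.
try:
    json.loads(result)
except TypeError as exc:
    error_message = str(exc)

# v1.5.8 pattern: parse the captured standard output instead.
metadata = json.loads(result.stdout)

print(error_message)
print(metadata)
```

The `TypeError` text names the offending class, which is why `CompletedProcess` appears in the `metadata_error` and `scan_error` strings quoted in the changelog.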

README.md

Lines changed: 14 additions & 3 deletions
@@ -14,23 +14,34 @@ mcp-semclone integrates the complete SEMCL.ONE toolchain to provide LLMs with po
 - **Binary Analysis**: Analyze compiled binaries (APK, EXE, DLL, SO, JAR) for OSS components and licenses
 - **Vulnerability Assessment**: Query multiple vulnerability databases for security issues
 - **Package Discovery**: Identify packages from source code and generate PURLs
-- **SBOM Generation**: Create Software Bill of Materials in SPDX/CycloneDX formats
+- **SBOM Generation**: Create Software Bill of Materials in CycloneDX format
 - **Policy Validation**: Check license compatibility and organizational compliance
 
 ## Features
 
 ### Tools
+
+**Analysis & Scanning:**
 - `scan_directory` - Comprehensive directory scanning for packages, licenses, and vulnerabilities
 - `scan_binary` - Analyze compiled binaries (APK, EXE, DLL, SO, JAR) for OSS components
 - `check_package` - Check specific packages for licenses and vulnerabilities
+- `download_and_scan_package` - Download package source from registries and perform deep license/copyright scanning
+
+**Legal Notices & Documentation:**
+- `generate_legal_notices` - Generate legal notices by scanning source code directly (fast, recommended)
+- `generate_legal_notices_from_purls` - Generate legal notices from PURL list (downloads from registries)
+- `generate_sbom` - Generate Software Bill of Materials in CycloneDX format
+
+**License & Policy Validation:**
 - `validate_policy` - Validate licenses against organizational policies
 - `validate_license_list` - Quick license safety validation for distribution types
 - `get_license_obligations` - Get detailed compliance requirements for licenses
 - `check_license_compatibility` - Check if two licenses can be mixed
 - `get_license_details` - Get comprehensive license information including full text
 - `analyze_commercial_risk` - Assess commercial distribution risks
+
+**Complete Workflows:**
 - `run_compliance_check` - Universal one-shot compliance workflow for any project type
-- `generate_sbom` - Generate SBOM for projects
 
 ### Resources
 - `license_database` - Access license compatibility information
@@ -225,7 +236,7 @@ asyncio.run(main())
 1. **Scan project structure** to identify components
 2. **Extract metadata** for each component
 3. **Detect licenses** and copyright information
-4. **Format as SBOM** (SPDX or CycloneDX)
+4. **Format as SBOM** (CycloneDX 1.4 JSON)
 5. **Validate completeness** of the SBOM
 
 ## Architecture
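
The `download_and_scan_package` tool listed in the README diff above is described in the CHANGELOG as trying methods in order until sufficient data is collected, while recording `methods_attempted` and `method_used`. A minimal sketch of that control flow follows. Everything other than those two field names is an assumption: the completeness rule, the per-method error key, the example PURL, and the three stand-in functions, which call no real SEMCL.ONE tools.

```python
from typing import Callable, Optional

Method = Callable[[str], Optional[dict]]


def is_sufficient(data: Optional[dict]) -> bool:
    """Assumed completeness rule: a result needs at least one license."""
    return bool(data) and bool(data.get("licenses"))


def scan_with_fallback(purl: str, methods: list[tuple[str, Method]]) -> dict:
    """Try each method in order and stop at the first sufficient result."""
    report: dict = {"purl": purl, "method_used": None, "methods_attempted": []}
    for name, method in methods:
        report["methods_attempted"].append(name)
        try:
            data = method(purl)
        except Exception as exc:  # record the failure, then fall through
            report[f"{name}_error"] = str(exc)  # hypothetical key name
            continue
        if is_sufficient(data):
            report.update(data)
            report["method_used"] = name
            break
    return report


# Stand-ins for the three stages; none of them runs a real tool.
def via_purl2notices(purl: str) -> dict:
    return {"licenses": []}  # simulates an incomplete result


def via_download_and_scan(purl: str) -> dict:
    raise RuntimeError("download failed")  # simulates a failed download


def via_online_api(purl: str) -> dict:
    return {"licenses": ["MIT"]}


report = scan_with_fallback(
    "pkg:npm/example@1.0.0",  # hypothetical PURL
    [
        ("purl2notices", via_purl2notices),
        ("download_and_scan", via_download_and_scan),
        ("online_api", via_online_api),
    ],
)
print(report["method_used"], report["methods_attempted"])
```

Keeping the methods in an ordered list of `(name, callable)` pairs means a new stage can be added without touching the loop, and a caller can see from `methods_attempted` how far the fallback chain had to go.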

examples/strands-agent-ollama/requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 # Python 3.10+ required
 
 # MCP server with SEMCL.ONE compliance tools
-mcp-semclone>=1.5.7
+mcp-semclone>=1.5.8
 
 # MCP SDK for connecting to MCP servers
 mcp>=1.0.0

guides/IDE_INTEGRATION_GUIDE.md

Lines changed: 44 additions & 32 deletions
@@ -67,17 +67,19 @@ The SEMCL.ONE MCP server works with any IDE that supports the Model Context Prot
 "disabled": false,
 "autoApprove": [
 "scan_directory",
+"scan_binary",
 "check_package",
+"download_and_scan_package",
+"generate_legal_notices",
+"generate_legal_notices_from_purls",
+"generate_sbom",
 "validate_policy",
+"validate_license_list",
 "get_license_obligations",
 "check_license_compatibility",
 "get_license_details",
 "analyze_commercial_risk",
-"validate_license_list",
-"run_compliance_check",
-"generate_legal_notices",
-"generate_sbom",
-"scan_binary"
+"run_compliance_check"
 ]
 }
 }
@@ -95,17 +97,19 @@ The SEMCL.ONE MCP server works with any IDE that supports the Model Context Prot
 "disabled": false,
 "autoApprove": [
 "scan_directory",
+"scan_binary",
 "check_package",
+"download_and_scan_package",
+"generate_legal_notices",
+"generate_legal_notices_from_purls",
+"generate_sbom",
 "validate_policy",
+"validate_license_list",
 "get_license_obligations",
 "check_license_compatibility",
 "get_license_details",
 "analyze_commercial_risk",
-"validate_license_list",
-"run_compliance_check",
-"generate_legal_notices",
-"generate_sbom",
-"scan_binary"
+"run_compliance_check"
 ]
 }
 }
@@ -201,17 +205,19 @@ Kiro is Amazon's new agentic AI IDE with native MCP support.
 "disabled": false,
 "autoApprove": [
 "scan_directory",
+"scan_binary",
 "check_package",
+"download_and_scan_package",
+"generate_legal_notices",
+"generate_legal_notices_from_purls",
+"generate_sbom",
 "validate_policy",
+"validate_license_list",
 "get_license_obligations",
 "check_license_compatibility",
 "get_license_details",
 "analyze_commercial_risk",
-"validate_license_list",
-"run_compliance_check",
-"generate_legal_notices",
-"generate_sbom",
-"scan_binary"
+"run_compliance_check"
 ]
 }
 }
@@ -229,17 +235,19 @@ Kiro is Amazon's new agentic AI IDE with native MCP support.
 "disabled": false,
 "autoApprove": [
 "scan_directory",
+"scan_binary",
 "check_package",
+"download_and_scan_package",
+"generate_legal_notices",
+"generate_legal_notices_from_purls",
+"generate_sbom",
 "validate_policy",
+"validate_license_list",
 "get_license_obligations",
 "check_license_compatibility",
 "get_license_details",
 "analyze_commercial_risk",
-"validate_license_list",
-"run_compliance_check",
-"generate_legal_notices",
-"generate_sbom",
-"scan_binary"
+"run_compliance_check"
 ]
 }
 }
@@ -283,9 +291,9 @@ Kiro is Amazon's new agentic AI IDE with native MCP support.
 The `autoApprove` field allows these tools to run without prompting the user:
 
 - **License Analysis**: `get_license_details`, `get_license_obligations`, `check_license_compatibility`
-- **Package Scanning**: `scan_directory`, `check_package`, `scan_binary`
+- **Package Scanning**: `scan_directory`, `scan_binary`, `check_package`, `download_and_scan_package`
 - **Policy & Risk**: `validate_policy`, `analyze_commercial_risk`, `validate_license_list`
-- **Documentation**: `generate_legal_notices`, `generate_sbom`
+- **Documentation**: `generate_legal_notices`, `generate_legal_notices_from_purls`, `generate_sbom`
 - **Complete Workflow**: `run_compliance_check` (one-shot compliance check for any project type)
 
 **Note**: Only include tools you trust to run automatically. You can remove sensitive tools if needed.
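
Because the same `autoApprove` list is repeated in every IDE snippet, it is easy for one copy to drift. The sketch below shows one way to audit a single server entry against the tool names in the updated lists. `KNOWN_TOOLS` and `audit_auto_approve` are hypothetical names, and the entry is assumed to have the `"disabled"` / `"autoApprove"` shape visible in the diff; the rest of each config file is not shown here and is not modelled.

```python
import json

# Tool names copied from the updated autoApprove lists in this diff.
KNOWN_TOOLS = {
    "scan_directory",
    "scan_binary",
    "check_package",
    "download_and_scan_package",
    "generate_legal_notices",
    "generate_legal_notices_from_purls",
    "generate_sbom",
    "validate_policy",
    "validate_license_list",
    "get_license_obligations",
    "check_license_compatibility",
    "get_license_details",
    "analyze_commercial_risk",
    "run_compliance_check",
}


def audit_auto_approve(server_entry: dict) -> dict:
    """Compare one server entry's autoApprove list with KNOWN_TOOLS."""
    approved = server_entry.get("autoApprove", [])
    return {
        "unknown": sorted(set(approved) - KNOWN_TOOLS),
        "not_auto_approved": sorted(KNOWN_TOOLS - set(approved)),
        "duplicates": sorted({t for t in approved if approved.count(t) > 1}),
    }


# The pre-1.5.8 list, as shown on the removed and context lines of the diff.
old_entry = json.loads("""
{
  "disabled": false,
  "autoApprove": [
    "scan_directory", "check_package", "validate_policy",
    "get_license_obligations", "check_license_compatibility",
    "get_license_details", "analyze_commercial_risk",
    "validate_license_list", "run_compliance_check",
    "generate_legal_notices", "generate_sbom", "scan_binary"
  ]
}
""")

audit = audit_auto_approve(old_entry)
print(audit["not_auto_approved"])
```

A non-empty `not_auto_approved` list is not an error in itself, since the note above says sensitive tools may be left out on purpose; the audit only makes the difference visible.
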
@@ -316,17 +324,19 @@ Cline is a popular AI coding extension for VS Code with native MCP support.
 "disabled": false,
 "autoApprove": [
 "scan_directory",
+"scan_binary",
 "check_package",
+"download_and_scan_package",
+"generate_legal_notices",
+"generate_legal_notices_from_purls",
+"generate_sbom",
 "validate_policy",
+"validate_license_list",
 "get_license_obligations",
 "check_license_compatibility",
 "get_license_details",
 "analyze_commercial_risk",
-"validate_license_list",
-"run_compliance_check",
-"generate_legal_notices",
-"generate_sbom",
-"scan_binary"
+"run_compliance_check"
 ]
 }
 }
@@ -344,17 +354,19 @@ Cline is a popular AI coding extension for VS Code with native MCP support.
 "disabled": false,
 "autoApprove": [
 "scan_directory",
+"scan_binary",
 "check_package",
+"download_and_scan_package",
+"generate_legal_notices",
+"generate_legal_notices_from_purls",
+"generate_sbom",
 "validate_policy",
+"validate_license_list",
 "get_license_obligations",
 "check_license_compatibility",
 "get_license_details",
 "analyze_commercial_risk",
-"validate_license_list",
-"run_compliance_check",
-"generate_legal_notices",
-"generate_sbom",
-"scan_binary"
+"run_compliance_check"
 ]
 }
 }

mcp_semclone/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 """MCP SEMCL.ONE - Model Context Protocol server for OSS compliance."""
 
-__version__ = "1.5.7"
+__version__ = "1.5.8"
 __author__ = "Oscar Valenzuela B."
 __email__ = "oscar.valenzuela.b@gmail.com"
