Releases: SemClone/mcp-semclone

1.6.2

19 Nov 07:37
38bdade

v1.6.2 - Improved LLM/IDE Integration

Overview

This release enhances how mcp-semclone integrates with LLM-powered development environments (Cursor,
Windsurf, etc.) by improving tool recognition and selection guidance.

Problem Addressed

When users asked LLM-powered IDEs to "do compliance for this project", the LLM would often:

  • Not recognize that mcp-semclone handles compliance tasks
  • Attempt to install external tools like npm install license-checker or pip install scancode-toolkit
  • Struggle to select the correct tool from the 14 available options

What's New

1. Trigger Keywords for Better Recognition

Added explicit keywords that help LLMs recognize when to use mcp-semclone:

  • "compliance", "license compliance", "do compliance"
  • "SBOM", "software bill of materials", "supply chain security"
  • "can I ship this?", "license compatibility"
  • "mobile app compliance", "SaaS compliance"

2. Clear Decision Tree for Tool Selection

Added IF-THEN logic for common scenarios:

  • IF user says "do compliance" → use run_compliance_check()
  • IF source code directory → use scan_directory()
  • IF package archive (.jar, .whl, etc.) → use check_package()
  • IF compiled binary (.apk, .exe, etc.) → use scan_binary()
  • IF "can I use [license]?" → use validate_policy()
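
The IF-THEN rules above can be read as a small dispatch function. The following is only an illustrative sketch: select_tool and the two suffix sets are hypothetical names, not part of mcp-semclone, and the real selection is made by the LLM from the server instructions rather than by code like this.

```python
from pathlib import Path
from typing import Optional

# Assumption: suffix sets are taken only from the examples in the list above.
ARCHIVE_SUFFIXES = {".jar", ".whl"}
BINARY_SUFFIXES = {".apk", ".exe"}


def select_tool(request: str, target: Optional[str] = None) -> Optional[str]:
    """Return the tool name the decision tree points to (hypothetical helper)."""
    text = request.lower()
    if "do compliance" in text:
        return "run_compliance_check"
    if "can i use" in text:
        return "validate_policy"
    if target is not None:
        suffix = Path(target).suffix.lower()
        if suffix in ARCHIVE_SUFFIXES:
            return "check_package"
        if suffix in BINARY_SUFFIXES:
            return "scan_binary"
        # Anything else with a target is treated as a source directory.
        return "scan_directory"
    # No rule matched; the notes do not define a default.
    return None


print(select_tool("do compliance for this project"))
print(select_tool("check this artifact", "dist/library.whl"))
```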

3. Condensed Instructions

Reduced instruction block by ~55% (384 → 175 lines) while maintaining critical information:

  • Kept: Anti-patterns, key workflows, tool descriptions, constraints
  • Condensed: License interpretation, binary scanning, workflows
  • Removed: Verbose examples, redundant explanations
  • Added: References to detailed docstrings

Impact

  • Faster tool recognition for compliance queries in LLM-powered IDEs
  • Reduced hallucination and incorrect tool selection
  • Better first-time user experience
  • Clearer guidance without reading full documentation

Files Changed

  • mcp_semclone/server.py: Enhanced MCP server instructions
  • pyproject.toml: Version bump to 1.6.2
  • mcp_semclone/__init__.py: Version bump to 1.6.2
  • CHANGELOG.md: Added v1.6.2 entry

Full Changelog

Full Changelog: v1.6.1...v1.6.2

1.6.1

14 Nov 06:33
ef29270

[1.6.1] - 2025-11-13

Fixed

download_and_scan_package: Handle osslili Informational Output

Problem: The download_and_scan_package tool was failing with JSON parsing errors when osslili outputs informational messages before JSON output. osslili now prefixes output with messages like:

Processing local path: package.tar.gz

This caused json.loads() to fail with "Expecting value: line 1 column 1 (char 0)".

Root Cause: Line 2026 in server.py attempted to parse osslili stdout directly as JSON without stripping informational messages.

Solution: Added preprocessing to find the first { character and parse JSON from that position, effectively stripping any informational messages before the JSON payload.
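
A minimal sketch of that preprocessing step, assuming the tool output is available as a string. The function name parse_json_after_preamble and the "licenses" key in the sample are illustrative only; the actual code in server.py may differ.

```python
import json


def parse_json_after_preamble(stdout: str) -> dict:
    """Skip any text before the first '{' and parse the rest as JSON."""
    start = stdout.find("{")
    if start == -1:
        raise ValueError("no JSON object found in tool output")
    return json.loads(stdout[start:])


# Output with an informational line in front of the JSON payload.
raw = 'Processing local path: package.tar.gz\n{"licenses": ["MIT"]}'
print(parse_json_after_preamble(raw))

# Output without a preamble still parses, which keeps older versions working.
print(parse_json_after_preamble('{"licenses": []}'))
```

One limitation of this approach: an informational message that itself contains a { character would still break parsing.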

Changes:

  • mcp_semclone/server.py:
    • Added informational message stripping before JSON parsing (lines 2026-2031)
    • Finds first { in output and parses from there
    • Preserves backward compatibility with osslili versions that don't output messages

Installation Note: When using pipx, ensure purl2src is installed with console scripts enabled:

pipx inject mcp-semclone purl2src --include-apps --force

1.6.0

14 Nov 04:42
99c6837

[1.6.0] - 2025-11-13

Added

Maven Parent POM License Resolution + Source Header Detection

Problem:
Maven packages often don't declare licenses directly in their package POM - the license can be in:

  1. Source file headers (e.g., // Licensed under Apache-2.0)
  2. Parent POM (declared in parent but not in package POM)

When download_and_scan_package analyzed such packages, it would miss one or both of these sources.

Solution:
Enhanced Maven-specific license resolution to check ALL three sources and combine results:

How it works:

  1. Source file headers: osslili scans all source files for license headers → populates detected_licenses
  2. Package POM: upmex extracts metadata from package POM → populates declared_license (if present)
  3. Parent POM (Maven-specific): If no declared_license, automatically triggers upmex with --registry --api clearlydefined to query ClearlyDefined, which resolves parent POM licenses
  4. Combines results: Parent POM license added to detected_licenses if not already there
  5. Updates result with license_source: "parent_pom_via_clearlydefined"
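
The combining logic in steps 3 to 5 might look roughly like the sketch below. It assumes the parent POM license has already been resolved (for example via ClearlyDefined) and is passed in as a string; merge_parent_pom_license is a hypothetical helper, and only the field names come from these notes.

```python
from typing import Optional


def merge_parent_pom_license(result: dict, parent_license: Optional[str]) -> dict:
    """Fold a parent POM license into a scan result (illustrative only)."""
    merged = dict(result)
    detected = list(result.get("detected_licenses", []))
    # Only fall back to the parent POM when the package POM declared nothing.
    if parent_license and not result.get("declared_license"):
        merged["declared_license"] = parent_license
        if parent_license not in detected:
            detected.append(parent_license)
        metadata = dict(result.get("metadata", {}))
        metadata["license_source"] = "parent_pom_via_clearlydefined"
        merged["metadata"] = metadata
    merged["detected_licenses"] = detected
    return merged


# Source headers found MIT; the parent POM declares Apache-2.0.
scan = {"declared_license": None, "detected_licenses": ["MIT"]}
print(merge_parent_pom_license(scan, "Apache-2.0"))
```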

Examples:

Scenario 1: License only in parent POM

download_and_scan_package(purl="pkg:maven/org.example/library@1.0.0")

# Before (v1.5.8):
#   declared_license: None
#   detected_licenses: []

# After (v1.6.0):
#   declared_license: "Apache-2.0"  # From parent POM
#   detected_licenses: ["Apache-2.0"]
#   metadata.license_source: "parent_pom_via_clearlydefined"

Scenario 2: Licenses in BOTH source headers AND parent POM

download_and_scan_package(purl="pkg:maven/org.example/another@2.0.0")

# Result:
#   declared_license: "Apache-2.0"  # From parent POM
#   detected_licenses: ["MIT", "Apache-2.0"]  # MIT from source, Apache from parent
#   scan_summary: "Deep scan completed. found 2 licenses. (includes parent POM license). ..."

Changes:
- mcp_semclone/server.py:
  - Added detailed 3-source license detection comment (lines 2059-2068)
  - Maven parent POM resolution with ClearlyDefined API integration
  - Combines parent POM license with source header licenses
  - Enhanced summary showing "(includes parent POM license)"
- Tool docstring: Documented Maven-specific behavior with all three sources
- tests/test_server.py:
  - Added test_maven_parent_pom_resolution (parent POM only)
  - Added test_maven_combined_source_and_parent_pom_licenses (both sources)

Impact:
- Maven packages now report licenses from ALL sources (source headers + parent POM)
- Source header licenses (MIT, BSD) combined with parent POM licenses (Apache-2.0)
- Automatic detection - no user configuration needed
- Transparent tracking with license_source metadata field
- Enhanced summary indicates when the parent POM was used
- Falls back gracefully if parent POM resolution fails

1.5.8

14 Nov 02:41
9a6f1d0

v1.5.8 - 2025-11-13

Fixed & Redesigned

Critical Bug + Complete Redesign: download_and_scan_package

Two critical issues fixed:

Problem 1 - Tool was completely broken (v1.5.7):
The download_and_scan_package tool returned JSON parsing errors:
"metadata_error": "the JSON object must be str, bytes or bytearray, not CompletedProcess"
"scan_error": "the JSON object must be str, bytes or bytearray, not CompletedProcess"

Root Cause:
The _run_tool() helper returns subprocess.CompletedProcess objects, but the code tried to parse them directly as JSON instead of using .stdout.
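
The failure mode is easy to reproduce with the standard library alone. In this sketch a child Python process stands in for the external tool (an assumption made for self-containment); _run_tool itself is not shown.

```python
import json
import subprocess
import sys

# Stand-in for an external tool: a child process that prints JSON to stdout.
completed = subprocess.run(
    [sys.executable, "-c", "import json; print(json.dumps({'license': 'MIT'}))"],
    capture_output=True,
    text=True,
    check=True,
)

# The mistake: handing the CompletedProcess object itself to json.loads().
try:
    json.loads(completed)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The fix: parse the captured standard output instead.
data = json.loads(completed.stdout)
print(data)
```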

Problem 2 - Incorrect workflow (v1.5.7):
Original implementation tried to use upmex and osslili with PURLs directly, but these tools require local file paths.

NEW IMPLEMENTATION - Correct Multi-Method Workflow:

The tool now implements a robust 3-step fallback workflow:

  1. Primary: Use purl2notices to download and analyze (fastest, most comprehensive)
  2. Deep scan: If incomplete, use purl2src to get download URL → download artifact → run osslili for deep license scanning + upmex for metadata
  3. Online fallback: If still incomplete, use upmex --api clearlydefined for online metadata
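
A fallback chain of this shape can be sketched as follows. Everything here is an assumption for illustration: run_with_fallback and is_complete are hypothetical names, the completeness rule is a guess, and the three methods are passed in as plain callables rather than real tool invocations. Only the method_used and methods_attempted field names come from these notes.

```python
from typing import Callable, List, Optional, Tuple


def is_complete(result: Optional[dict]) -> bool:
    """Assumed rule: a result is complete once any license was found."""
    if not result:
        return False
    return bool(result.get("declared_license") or result.get("detected_licenses"))


def run_with_fallback(purl: str, methods: List[Tuple[str, Callable[[str], dict]]]) -> dict:
    """Try each method in order and stop at the first complete result."""
    attempted = []
    for name, method in methods:
        attempted.append(name)
        try:
            result = method(purl)
        except Exception:
            # A failing method should not abort the whole workflow.
            continue
        if is_complete(result):
            return {**result, "method_used": name, "methods_attempted": attempted}
    return {"method_used": None, "methods_attempted": attempted}


# Fake methods: the primary returns nothing useful, the deep scan succeeds.
outcome = run_with_fallback(
    "pkg:pypi/example@1.0.0",
    [
        ("purl2notices", lambda purl: {}),
        ("deep_scan", lambda purl: {"detected_licenses": ["MIT"]}),
        ("online_fallback", lambda purl: {"declared_license": "MIT"}),
    ],
)
print(outcome)
```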

New Dependencies:

  • Added purl2src>=1.2.3 to translate PURLs to download URLs for Step 2

Impact:

  • Tool now works correctly with proper multi-method fallback
  • Uses the correct workflow: purl2notices → download+osslili+upmex → online APIs
  • Returns method_used field showing which method succeeded
  • Proper error handling with methods_attempted tracking
  • JSON parsing fixed (uses .stdout correctly)

Testing:

  • Added 5 comprehensive unit tests covering all workflows
  • All 31 tests pass (26 existing + 5 new)
  • Test coverage: primary workflow, deep scan, online fallback, error handling, file cleanup

Thanks:
User feedback identified the bugs and clarified the correct workflow design.

Full Changelog: https://github.com/SemClone/mcp-semclone/blob/main/CHANGELOG.md

1.5.7

14 Nov 01:08
3dedf26

What's Changed in v1.5.7

New Tool: download_and_scan_package - Comprehensive Package Source Analysis

FEATURE: Download package source from registries and perform deep scanning

Problem:

  • Users didn't know we CAN download source code from PURLs
  • LLMs said "I don't have a tool to download source code" when we do!
  • Existing tools (check_package, generate_legal_notices_from_purls) can download but it wasn't explicit
  • No single tool that orchestrates: download → extract metadata → scan licenses → find copyrights

Solution:

New download_and_scan_package(purl) tool that makes it CRYSTAL CLEAR we can download and analyze packages

What it does:

  1. Downloads actual package source from npm/PyPI/Maven/etc registries
  2. Extracts package to temporary directory
  3. Uses upmex to extract metadata (license, homepage, description)
  4. Uses osslili to perform deep license scanning of ALL source files
  5. Scans for copyright statements in source code
  6. Returns download location for manual inspection (optional)

When to use:

  • Package metadata is incomplete (e.g., PyPI shows "UNKNOWN" license)
  • Need to verify what's ACTUALLY in package files (not just package.json)
  • Security auditing - inspect actual package contents before approval
  • Find licenses embedded in source files that aren't in metadata
  • Extract copyright statements from source code

Real-world example from user conversation:
User: "Can you check if duckdb@0.2.3 has license info in the source code?"
Before: "I don't have a tool to download source code"
After: download_and_scan_package("pkg:pypi/duckdb@0.2.3")
Result: {"declared_license": "UNKNOWN", "detected_licenses": ["CC0-1.0"], ...}

API:

# Basic usage - download and scan
download_and_scan_package(purl="pkg:pypi/duckdb@0.2.3")

# Keep downloaded files for manual inspection
result = download_and_scan_package(
    purl="pkg:npm/suspicious-package@1.0.0",
    keep_download=True
)
print(f"Inspect at: {result['download_path']}")

# Quick metadata only (no deep scan)
download_and_scan_package(
    purl="pkg:pypi/requests@2.28.0",
    scan_licenses=False
)

Returns:
- purl: Package URL analyzed
- download_path: Where files are (if keep_download=True)
- metadata: Package metadata from upmex
- declared_license: License from package metadata
- detected_licenses: Licenses found by scanning source files
- copyright_statements: Copyright statements extracted
- files_scanned: Number of files analyzed
- scan_summary: Human-readable summary

Why this matters:
- Makes capabilities EXPLICIT - LLMs know we can download source
- Single orchestrating tool - no need to chain multiple tools
- Comprehensive analysis - metadata + deep scanning + copyrights
- Real source verification - see what's actually in the package

1.5.6

14 Nov 00:47
255d62b

What's Changed in v1.5.6

Split Legal Notices Generation into Two Clear Tools

CLARITY IMPROVEMENT: Separated source scanning from PURL downloads

Problem:

  • v1.5.5 had one tool with two modes (path OR purls parameter)
  • Confusing for LLMs to choose which parameter to use
  • Not obvious which approach is faster/recommended

Solution:

Split generate_legal_notices into two distinct tools with clear purposes:

1. generate_legal_notices(path, ...) - PRIMARY TOOL (FAST)

  • Default tool for most cases
  • Scans source code directly (node_modules/, site-packages/)
  • Detects all transitive dependencies automatically
  • 10x faster than downloading from registries
  • Required parameter: path (no optional parameters confusion)

2. generate_legal_notices_from_purls(purls, ...) - SPECIAL CASES (SLOW)

  • Use only when dependencies NOT installed locally
  • Downloads packages from npm/PyPI/etc registries
  • Required parameter: purls list
  • Clear name indicates it's downloading from registries

Benefits:

  • Clear separation of concerns: Each tool does one thing
  • Better LLM guidance: Tool names indicate purpose and performance
  • No parameter confusion: path vs purls is now two separate tools
  • Self-documenting: Names make it obvious which to use

Updated Workflow Instructions:

  • CRITICAL WORKFLOW RULES now lists two tools clearly
  • Guidance on when to use each tool
  • Emphasizes generate_legal_notices (path) as default

Breaking Changes

  • generate_legal_notices(purls=[...]) no longer works
  • Use generate_legal_notices_from_purls(purls=[...]) instead
  • generate_legal_notices now requires path parameter (not optional)

Migration Guide

# OLD (v1.5.5 - no longer works):
generate_legal_notices(purls=purl_list, output_file="NOTICE.txt")

# NEW (v1.5.6):
generate_legal_notices_from_purls(purls=purl_list, output_file="NOTICE.txt")

# RECOMMENDED (v1.5.6 - use this instead):
generate_legal_notices(path="/path/to/project", output_file="NOTICE.txt")

User Impact

- Clearer workflow: Know which tool to use by default
- 10x performance improvement: Fast source scanning vs slow downloads
- Better LLM guidance: Tool names are self-documenting
- Simpler API: Each tool has one clear purpose

1.5.4

13 Nov 21:27
2f29d7b

What's Changed in v1.5.4

Server Instructions: Prevent External Tool Installation

Added prominent warning to prevent LLMs from installing external compliance tools:

  • Added "IMPORTANT - ALL TOOLS ARE BUILT-IN" section at the top of server instructions
  • Explicitly warns against installing: npm license-checker, scancode-toolkit, ngx, fossil, etc.
  • Clarifies that all necessary tools (purl2notices, ossnotices, osslili, ospac, vulnq) are pre-installed
  • Directs LLMs to use MCP-provided tools instead of trying to install external packages

Why this matters:

  • Prevents LLMs from wasting time trying to install tools that are already available
  • Avoids confusion about which tools to use (use MCP tools, not external CLIs)
  • Reduces risk of LLMs using outdated or incorrect external tools
  • Ensures consistent compliance scanning using the SEMCL.ONE toolchain

User Impact:

  • Faster response times (no unnecessary tool installation attempts)
  • More reliable results (always uses the correct, pre-installed tools)
  • Clearer guidance for LLMs on how to perform compliance tasks

1.5.2

12 Nov 02:25
8803659

v1.5.2 - 2025-11-12

Fixed

Improved Workflow Instructions to Prevent Single-Package Detection Issues

Problem: Users reported that compliance checks generated notices for only 1 package, rather than all transitive dependencies (e.g., 1 package instead of 48 in node_modules/).

Root Cause: LLMs were bypassing scan_directory or not using ALL packages from the scan result. Some were manually extracting PURLs from package.json instead of using the comprehensive scan.

Changes:

  • Enhanced server instructions with CRITICAL WORKFLOW RULES section
  • Added explicit warnings in generate_legal_notices against manual PURL extraction
  • Added diagnostic logging to warn when suspiciously few packages detected (3 packages or fewer)
  • Improved examples showing WRONG vs RIGHT workflow approaches
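
The diagnostic check for suspiciously few packages could be as small as the sketch below. The logger name, the message wording, and the helper warn_if_few_packages are assumptions; only the threshold of 3 comes from the list above.

```python
import logging

logger = logging.getLogger("mcp_semclone.sketch")  # hypothetical logger name

LOW_PACKAGE_THRESHOLD = 3  # "3 packages or fewer" per the change list


def warn_if_few_packages(packages: list) -> bool:
    """Log a warning and return True when the package count looks too low."""
    if len(packages) <= LOW_PACKAGE_THRESHOLD:
        logger.warning(
            "Only %d package(s) detected; transitive dependencies may be missing. "
            "Was scan_directory run on the whole project?",
            len(packages),
        )
        return True
    return False


print(warn_if_few_packages(["pkg:npm/express@4.18.2"]))
print(warn_if_few_packages([f"pkg:npm/dep-{i}@1.0.0" for i in range(48)]))
```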

Impact:

  • LLMs are now instructed to ALWAYS use scan_directory first
  • Clear guidance that an npm project with one direct dependency can mean approximately 50 packages in node_modules
  • Better visibility when the workflow is not followed correctly

Note: The underlying MCP server code and purl2notices scanning work correctly. This release only improves instructions and logging to prevent misunderstandings in the workflow.

What's Changed

  • Improve workflow instructions to prevent single-package detection issues
  • Bump version to 1.5.2 and update changelog

Full Changelog: v1.5.1...v1.5.2

1.5.1

12 Nov 01:10
72ef162

v1.5.1 - Architecture Simplification

Changed

Architecture Simplification: purl2notices for Everything

scan_directory now uses purl2notices scan mode exclusively:

  • REMOVED: osslili dependency for scan_directory (still used by check_package)
  • REMOVED: src2purl dependency entirely (replaced by purl2notices)
  • NEW: purl2notices scan mode handles all scanning in one pass:
    • Detects ALL packages including transitive dependencies
    • Extracts licenses from both project source and dependencies
    • Extracts copyright statements automatically from source code
    • No manual PURL extraction needed

Benefits

  • 100% accurate package detection (vs 83-88% fuzzy matching from src2purl)
  • Detects ALL transitive dependencies (e.g., 51 packages vs 8 fuzzy matches)
  • No confusing fuzzy match results
  • Automatic copyright extraction as a bonus feature
  • Simpler architecture: one tool instead of two

For npm projects

  • Scans entire node_modules/ directory (50+ packages)
  • NOT just direct dependencies from package.json (1-2 packages)
  • Includes all transitive dependencies automatically

Deprecated parameters in scan_directory

  • identify_packages - now deprecated, purl2notices always detects packages
  • check_licenses - now deprecated, purl2notices always scans licenses
  • Parameters still accepted for backwards compatibility but have no effect
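
One common way to keep deprecated parameters callable while ignoring them is sketched below. This is not the actual scan_directory implementation: scan_directory_sketch is a stand-in, its return value is stubbed, and emitting a DeprecationWarning is an assumption (the notes only say the parameters have no effect).

```python
import warnings


def scan_directory_sketch(path, identify_packages=None, check_licenses=None):
    """Accept the old flags for compatibility, but never act on them."""
    deprecated = {
        "identify_packages": identify_packages,
        "check_licenses": check_licenses,
    }
    for name, value in deprecated.items():
        if value is not None:
            warnings.warn(
                f"{name} is deprecated and has no effect",
                DeprecationWarning,
                stacklevel=2,
            )
    # A real scan would always detect packages and licenses; stubbed here.
    return {"path": path, "packages": []}


# Old-style call still works; the flag is simply ignored.
print(scan_directory_sketch("/path/to/project", identify_packages=True))
```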

Updated tool descriptions

  • scan_directory now documents that it detects ALL packages including transitive deps
  • Clarified that for npm projects, this means entire node_modules/ not just package.json
  • Added emphasis on automatic copyright extraction
  • Updated workflow examples to reflect simplified approach

Dependencies

  • Removed: src2purl>=1.3.4 (no longer used)
  • Still kept: osslili>=1.5.7 and upmex>=1.6.7 (used by check_package for archives)

What's Changed

  • v1.5.1: Simplify architecture - use purl2notices for comprehensive scanning by @oscarvalenzuelab in #21
  • Fix purl2notices integration - use JSON format output
  • Fix test mocks to match purl2notices JSON output format

Full Changelog: v1.4.0...v1.5.1

1.4.0

11 Nov 18:46
4f23cb5

mcp-semclone v1.4.0

This release implements a universal compliance workflow and improves agent usability. It contains breaking changes (project-type-specific tools were removed) and also rolls up the enhancements from v1.3.6 and v1.3.7.

Breaking Changes

Removed generate_mobile_legal_summary (formerly generate_mobile_legal_notice)

Project-type-specific tools do not scale across different distribution types.

Migration paths:

  • Use run_compliance_check for automated one-shot workflows
  • Use generate_legal_notices for manual workflow orchestration

The generate_legal_notices tool was always the correct choice for complete legal documentation.

New Tool: run_compliance_check

Universal compliance workflow that works for any project type (mobile, desktop, SaaS, embedded, etc.).

Capabilities:

  • Automatic workflow: scan → generate NOTICE.txt → validate policy → generate sbom.json → check vulnerabilities
  • Returns APPROVED/REJECTED decision with risk level
  • Generates NOTICE.txt and sbom.json artifacts
  • Provides a complete report with actionable recommendations
  • Uses default policy if none specified
  • Distribution type is a parameter, not a separate workflow

Usage:
result = run_compliance_check(path, distribution_type="mobile")

Enhanced Tool Descriptions

All primary tools now include structured guidance:

scan_directory:

  • Positioned as FIRST STEP in workflows
  • WHEN TO USE and WHEN NOT TO USE sections
  • WORKFLOW POSITION guidance
  • Three complete workflow examples

generate_legal_notices:

  • Positioned as a PRIMARY TOOL for legal documentation
  • Emphasizes purl2notices backend for copyright extraction
  • WHEN TO USE and WHEN NOT TO USE sections
  • Three workflow examples: mobile app compliance, package analysis, batch compliance

validate_license_list:

  • Positioned for quick license validation
  • Clear return values: safe_for_distribution, app_store_compatible
  • Complete workflow example

Documentation Updates

  • Updated IDE integration guides for Cursor, Cline, and Kiro
  • Updated mobile app compliance guide
  • Updated configuration examples and autoApprove lists
  • Removed all references to deleted tools
  • Added migration guidance

Architecture Changes

Design principles:

  • No project-type-specific tools
  • Distribution type used only for policy validation context
  • Default policy provided
  • Single standardized workflow
  • Scales without code changes

Standard workflow options:

Option 1 (Recommended):
run_compliance_check(path, distribution_type) → APPROVED/REJECTED + artifacts

Option 2 (Manual):
scan_directory → generate_legal_notices → validate_license_list → generate_sbom

From v1.3.7 (2025-11-10)

License Approval/Rejection Workflow:

  • Enhanced validate_policy tool with approve/deny/review decision support
  • Added context parameter for static_linking and dynamic_linking scenarios
  • Returns structured decision output with action, severity, requirements, and remediation
  • Added summary object with boolean flags: approved, blocked, requires_review
  • Distribution-specific policy rules (GPL blocked for mobile, AGPL blocked for SaaS)
  • Updated OSPAC dependency to >=1.2.3
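
To illustrate how the three decision actions relate to the boolean summary flags, here is a small sketch. The mapping is an assumption inferred from the names in the list above; build_summary is a hypothetical helper and not the validate_policy implementation.

```python
VALID_ACTIONS = ("approve", "deny", "review")


def build_summary(action: str) -> dict:
    """Map a decision action to the summary flags (assumed mapping)."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return {
        "approved": action == "approve",
        "blocked": action == "deny",
        "requires_review": action == "review",
    }


for action in VALID_ACTIONS:
    print(action, build_summary(action))
```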

From v1.3.6 (2025-11-10)

Pipx Installation Support:

  • Comprehensive pipx installation documentation
  • Instructions for pipx inject to include all SEMCL.ONE tools
  • Isolated environment prevents dependency conflicts
  • All tools are accessible as both libraries and CLI commands
  • Updated MCP configuration examples for pip and pipx
  • Documentation for included tools: osslili, binarysniffer, src2purl, purl2notices, ospac, vulnq, upmex

Migration Example

# Before v1.4.0
scan_result = scan_directory(path)
notice = generate_mobile_legal_summary(project_name, licenses)

# After v1.4.0
result = run_compliance_check(path, distribution_type="mobile")
# Automatically generates NOTICE.txt and sbom.json
# Returns APPROVED/REJECTED decision

# Alternative: manual workflow
scan_result = scan_directory(path, identify_packages=True)
purls = [pkg["purl"] for pkg in scan_result["packages"]]
generate_legal_notices(purls, output_file="NOTICE.txt")

See https://github.com/SemClone/mcp-semclone/blob/main/CHANGELOG.md for complete details.