Skip to content

skill-evaluator v1.2.2 - Production Ready Release

Latest

Choose a tag to compare

@bjulius bjulius released this 07 Nov 02:34
· 8 commits to master since this release

skill-evaluator v1.2.2 - Production Ready Release

Overview

Comprehensive evaluation toolkit for analyzing Claude skills across security, quality, utility, and compliance dimensions. This release represents a production-ready tool with a self-assessment score of 95.9/100 (HIGHLY RECOMMENDED).

What's New in v1.2.2

Production Ready Features

  • Dynamic Version Management: Version automatically read from SKILL.md at runtime
  • Windows Unicode Support: Full cross-platform compatibility with graceful ASCII fallback
  • Comprehensive Disclaimers: Clear documentation of evaluation limitations and user responsibilities
  • MIT License: Open source permissive licensing
  • Enhanced Report Template: Score breakdown table and aligned risk level logic

Core Capabilities

5-Layer Security Architecture

  • Input Validation
  • Execution Environment Control
  • Output Sanitization
  • Privilege Management
  • Self-Protection

4-Dimensional Scoring System

  • Security (35%): 5-layer vulnerability detection with hybrid scanner
  • Quality (25%): Code quality, documentation, and maintainability
  • Utility (20%): Functionality, usability, and value proposition
  • Compliance (20%): Adherence to skill-creator guidelines

Hybrid Scanner Technology

Zero false positives achieved through 6-component context-aware system:

  1. File exclusion (security_scanner.py, quality_checker.py)
  2. PATTERNS block detection
  3. Comment detection
  4. Ignore markers support (# evaluator: ignore, # noqa)
  5. String literal detection
  6. Directory exclusions (node_modules, venv, etc.)

Self-Evaluation Results

Dimension Score Assessment
Security 100/100 Perfect security posture
Quality 89.9/100 Excellent code quality
Utility 100/100 Highly valuable tool
Compliance 92/100 Strong guideline adherence
Overall 95.9/100 HIGHLY RECOMMENDED

Installation

For Claude Code (CLI/Desktop)

macOS/Linux:

# Download and extract
wget https://github.com/bjulius/skill-evaluator/archive/refs/tags/v1.2.2.zip
unzip v1.2.2.zip
cp -r skill-evaluator-1.2.2/skill-evaluator ~/.claude/plugins/

Windows:

# Download and extract, then:
xcopy /E /I skill-evaluator "%USERPROFILE%\.claude\plugins\skill-evaluator"

Direct Download

Download skill-evaluator.zip from the Assets section below for a ready-to-install package.

Usage

# Full evaluation
/skill-evaluator path/to/skill.zip

# Security-focused scan
/skill-evaluator --security-only path/to/skill.zip

# Pre-publication review
/skill-evaluator --pre-publish path/to/skill.zip

What's Included

  • 5 Python evaluation scripts: Complete evaluation engine
  • 4 reference documents: Security patterns, quality criteria, utility metrics, compliance checklist
  • Report template: Professional markdown output with score breakdowns
  • Comprehensive documentation: README, VERSION_ROADMAP, FUTURE_ENHANCEMENTS
  • MIT License: Open source licensing

Validation Testing

Successfully tested on multiple skills:

  • playwright-skill: 90.5/100 (HIGHLY RECOMMENDED)
  • csv-data-summarizer: 80.9/100 (RECOMMENDED)
  • xlsx (Anthropic): 75.7/100 (RECOMMENDED)
  • Video Processor: 24/100 (DO NOT INSTALL) - correctly identified 17 real vulnerabilities

Roadmap

v2.0 (Planned Q2 2026)

  • Community Trust Score: Integration with GitHub metrics (stars, downloads, maintenance activity, author reputation)
  • See FUTURE_ENHANCEMENTS.md for detailed specifications

Important Disclaimers

This evaluation CANNOT determine with certainty that a skill is safe. It should be used as ONE input into your security decision, not the sole determining factor.

Your Responsibilities

  1. Manual code review
  2. Test in isolation
  3. Follow organizational policies
  4. Assess your risk tolerance
  5. Monitor ongoing behavior

License

MIT License - See LICENSE file for details

Support


Full Changelog: https://github.com/bjulius/skill-evaluator/commits/v1.2.2