Thank you for your interest in contributing to these AI learning projects. They are designed as hands-on exercises for developing a practical understanding of modern AI systems by building real, working implementations.
These projects emphasize learning through building rather than producing polished end-user products. Each project explores specific technical questions around RAG, agent design, evaluation frameworks, governance, and safety through hands-on experimentation, spec-driven development, and systematic evaluation.
Contributions that deepen understanding, improve architectural patterns, add evaluation capabilities, or enhance educational value are especially welcome.
The Upcoming Projects section offers the best entry points: these projects have defined roadmaps and concepts but are still early in development.
An AI-powered accessibility remediation assistant focusing on WCAG/Section 508 compliance.
Good for learning:
- Multi-agent architecture with specialized roles (Analyzer, Strategist, Coder, Validator)
- Integration with deterministic validation tools (axe-core, Lighthouse)
- Human-in-the-loop patterns for subjective validation
- Audit trail generation and compliance documentation
Contribution opportunities:
- Implementing specialized agents with YAML specifications
- Designing context-aware remediation strategies
- Building deterministic validation loops
- Creating audit artifact generators
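The deterministic validation loop described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: `validate` stands in for a tool such as axe-core, and `remediate` stands in for an LLM-backed fixer.

```typescript
// Hypothetical deterministic validation loop: remediate, re-validate, repeat
// until clean or a round limit is hit. Both helpers are stubs.
type Issue = { rule: string; detail: string };

function validate(html: string): Issue[] {
  // Stub check: flag img elements missing an alt attribute.
  return /<img(?![^>]*\balt=)/i.test(html)
    ? [{ rule: "image-alt", detail: "img element missing alt attribute" }]
    : [];
}

function remediate(html: string, issues: Issue[]): string {
  // Stub fix: add an empty alt attribute for a human to fill in later.
  return issues.some((i) => i.rule === "image-alt")
    ? html.replace(/<img /i, '<img alt="" ')
    : html;
}

function remediationLoop(html: string, maxRounds = 3): { html: string; issues: Issue[] } {
  let current = html;
  for (let round = 0; round < maxRounds; round++) {
    const issues = validate(current);
    if (issues.length === 0) return { html: current, issues };
    current = remediate(current, issues);
  }
  return { html: current, issues: validate(current) };
}
```

The round limit matters: remediation that never converges should surface to a human rather than loop forever.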
A governance-first RAG system emphasizing auditability and evaluation.
Good for learning:
- Document ingestion and chunking strategies
- Provenance tracking and audit logging
- Retrieval evaluation frameworks
- Refusal semantics and epistemic constraints
- Spec-driven development with OpenSpec
Contribution opportunities:
- Implementing chunking experiments and comparisons
- Building evaluation harnesses for retrieval quality
- Designing governance-focused audit logs
- Experimenting with embedding and retrieval strategies
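As a starting point for chunking experiments, here is a minimal fixed-size chunker with overlap. It splits on raw characters purely for illustration; a real ingestion pipeline would split on token or sentence boundaries.

```typescript
// Minimal fixed-size character chunker with overlap, useful as a baseline
// when comparing retrieval quality across chunking strategies.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (overlap >= size) throw new Error("overlap must be smaller than chunk size");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

Varying `size` and `overlap` and re-running a retrieval evaluation is one of the simplest experiments to contribute.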
An assistant demonstrating parallel input and output validation using LLM guardrails.
Good for learning:
- LLM guardrail patterns and safety constraints
- Input validation and sanitization
- Output validation before user delivery
- Security-focused AI system design
Contribution opportunities:
- Implementing guardrail validation layers
- Designing safety constraint specifications
- Building parallel validation pipelines
- Creating test suites for security scenarios
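The parallel validation pattern above might be sketched like this. Every function here is a hypothetical stand-in: the real guardrail checks, refusal messages, and model call would come from the project's own specifications.

```typescript
// Sketch: run input validators in parallel, gate the model call on the
// results, then validate the output before delivering it.
type Check = { name: string; passed: boolean };

async function checkLength(input: string): Promise<Check> {
  return { name: "length", passed: input.length <= 2000 };
}

async function checkInjection(input: string): Promise<Check> {
  // Toy heuristic only: reject obvious prompt-injection phrasing.
  return { name: "injection", passed: !/ignore (all )?previous instructions/i.test(input) };
}

async function callModel(input: string): Promise<string> {
  return `echo: ${input}`; // stand-in for a real LLM call
}

async function guardedRespond(input: string): Promise<string> {
  const checks = await Promise.all([checkLength(input), checkInjection(input)]);
  const failed = checks.filter((c) => !c.passed);
  if (failed.length > 0) return `refused: ${failed.map((c) => c.name).join(", ")}`;
  const output = await callModel(input);
  // Output validation before user delivery (stubbed leakage check).
  return /system prompt/i.test(output) ? "refused: output-check" : output;
}
```

Running the input checks with `Promise.all` keeps added latency close to that of the slowest single validator.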
All existing projects welcome improvements:
- Evaluation frameworks: Add systematic testing, LLM-as-judge metrics, regression detection
- Agent specifications: Convert existing agents to YAML specs, improve prompt templates
- Documentation: Add architecture diagrams, decision logs, usage examples
- Retrieval improvements: Experiment with chunking strategies, hybrid search, reranking
- Security enhancements: Add input validation, safety guardrails, audit logging
- Performance optimization: Improve response times, reduce token usage, optimize vector searches
Review the README.md to understand available projects. For first contributions, start with the Upcoming Projects or choose an existing project that matches your interests.
Before contributing, read:
- Project README and documentation
- Existing code to understand patterns and architecture
- Any YAML agent specifications in the project
- OpenSpec specifications (if the project uses spec-driven development)
For Upcoming Projects:
- Review the roadmap/concept documentation
- Choose an area to implement (agent, evaluation, ingestion, etc.)
- Start with a small, well-defined piece rather than the entire system
For Existing Projects:
- Look for TODOs or enhancement opportunities in the code
- Consider adding evaluation capabilities if missing
- Improve documentation or add usage examples
- Propose architectural improvements based on learnings
Many projects use OpenSpec for spec-driven development. For these projects:
- Write specifications first (YAML agent specs, API specs, etc.)
- Get feedback on specifications before implementation
- Implement code to match specifications
- Keep specs and code synchronized
Reference projects using OpenSpec: Cortex, QueryCraft, Veridex
For projects with AI agents:
- Define agent role and responsibilities in YAML
- Design prompt templates with clear instructions
- Specify input/output schemas using Zod or similar
- Implement agent with proper error handling
- Add evaluation criteria and test cases
Example projects: Plants FieldGuide, QueryCraft, A11y Remediation Assistant
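The schema step above might look like the following. This is a dependency-free sketch; in the projects themselves a library such as Zod would define these schemas, and the field names here are illustrative rather than taken from any actual agent spec.

```typescript
// Illustrative agent output validation: parse the model's raw JSON and
// reject anything that does not match the expected shape.
interface AgentOutput {
  answer: string;
  confidence: number; // expected range: 0..1
  citations: string[];
}

function parseAgentOutput(raw: string): AgentOutput {
  const data = JSON.parse(raw);
  if (typeof data.answer !== "string") throw new Error("answer must be a string");
  if (typeof data.confidence !== "number" || data.confidence < 0 || data.confidence > 1)
    throw new Error("confidence must be a number in [0, 1]");
  if (!Array.isArray(data.citations) || !data.citations.every((c: unknown) => typeof c === "string"))
    throw new Error("citations must be an array of strings");
  return data as AgentOutput;
}
```

Validating at this boundary turns malformed LLM output into a typed error the agent can handle, rather than a silent downstream failure.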
For retrieval-augmented generation projects:
- Design document ingestion and chunking strategy
- Implement embedding and vector storage
- Build retrieval pipeline with ranking
- Add response synthesis and citation
- Create evaluation harness for retrieval quality
Example projects: PLANTS NLQI, Plants FieldGuide, Veridex
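The embed-store-retrieve-rank steps can be sketched end to end with toy components. Bag-of-words vectors stand in for learned embeddings and an array stands in for a vector store; only the pipeline shape is meant to match the projects.

```typescript
// Toy retrieval pipeline: bag-of-words "embeddings" ranked by cosine similarity.
function embed(text: string): Map<string, number> {
  const vec = new Map<string, number>();
  for (const tok of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    vec.set(tok, (vec.get(tok) ?? 0) + 1);
  }
  return vec;
}

function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0, na = 0, nb = 0;
  for (const [tok, v] of a) { dot += v * (b.get(tok) ?? 0); na += v * v; }
  for (const v of b.values()) nb += v * v;
  return na && nb ? dot / (Math.sqrt(na) * Math.sqrt(nb)) : 0;
}

function retrieve(query: string, docs: string[], k = 2): string[] {
  const q = embed(query);
  return docs
    .map((d) => ({ d, score: cosine(q, embed(d)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.d);
}
```

Swapping `embed` for a real embedding model and the array scan for a vector store upgrades this sketch without changing its structure.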
- TypeScript: Use strict type checking; avoid `any` types
- Error handling: Implement comprehensive error handling with clear messages
- Logging: Add structured logging for debugging and observability
- Testing: Include tests for core functionality, especially evaluation code
- Documentation: Update README files and inline comments for complex logic
- Dependencies: Minimize dependencies, prefer standard libraries
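The structured-logging standard above can be as small as one JSON object per event; the field names here are illustrative only.

```typescript
// Minimal structured logger: emit one JSON object per event so logs can
// be filtered and aggregated by field rather than grepped as free text.
function logEvent(event: string, fields: Record<string, unknown>): string {
  const entry = { ts: new Date().toISOString(), event, ...fields };
  const line = JSON.stringify(entry);
  console.log(line);
  return line;
}
```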
Strong emphasis on evaluation:
- Add automated tests for deterministic components
- Include evaluation datasets and test cases for LLM components
- Implement LLM-as-judge metrics where appropriate
- Document expected behavior and failure modes
- Create regression tests for fixed issues
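A minimal evaluation harness in the spirit of the list above: run a system under test over labeled cases and report accuracy, so fixes can be regression-tested. The cases and the system here are stand-ins.

```typescript
// Tiny evaluation harness: exact-match accuracy over labeled test cases.
// LLM-as-judge metrics would replace the strict equality check.
type Case = { input: string; expected: string };

function evaluate(system: (input: string) => string, cases: Case[]): number {
  if (cases.length === 0) throw new Error("evaluation needs at least one case");
  const passed = cases.filter((c) => system(c.input) === c.expected).length;
  return passed / cases.length;
}
```

Checking this score in CI and failing the build when it drops is the simplest form of regression detection.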
- Fork the repository
- Create a feature branch with a descriptive name
- Make your changes following the standards above
- Test thoroughly, including edge cases
- Update documentation to reflect changes
- Submit a pull request with:
  - A clear description of changes
  - Rationale for design decisions
  - Test results or evaluation outcomes
  - Any architectural trade-offs made
- Build specialized agents for A11y Remediation Assistant
- Add agent orchestration patterns to Cortex
- Design intent classification for Plants FieldGuide
- Implement multi-agent workflows
- Experiment with chunking strategies in Veridex
- Add hybrid search to PLANTS NLQI
- Implement reranking in Plants FieldGuide
- Design retrieval evaluation frameworks
- Build LLM-as-judge metrics for any project
- Create evaluation datasets for testing
- Implement regression detection systems
- Add confidence scoring and uncertainty quantification
- Add guardrails to Soil Guard CLI
- Implement audit trails in Veridex
- Design provenance tracking systems
- Build compliance documentation generators
- Implement research papers as small projects (see Research Paper Projects)
- Run ablation studies on existing systems
- Compare different architectural approaches
- Document failure modes and edge cases
Each project demonstrates specific AI concepts:
- RAG Architecture: PLANTS NLQI, Plants FieldGuide, Veridex
- Agent Design: Cortex, QueryCraft, A11y Remediation Assistant
- Evaluation: FairEval-CLI, QueryCraft evaluation harness
- Spec-Driven Development: Cortex, QueryCraft, Veridex
- Security & Safety: QueryCraft (SQL injection prevention), Soil Guard CLI
Study these projects to understand patterns before contributing.
If you have questions or want to discuss contribution ideas:
- Open an issue describing your proposal
- Reference the specific project and area of interest
- Explain the learning goals or technical questions you want to explore
- Discuss architectural approach before implementing large features
- Focus on learning and knowledge sharing
- Provide constructive feedback on code and architecture
- Document your thinking and design decisions
- Respect that these are educational projects, not production systems
- Help others learn by explaining your approaches
All contributors will be recognized for their work. Contributions that demonstrate strong understanding of AI system architecture, thoughtful evaluation design, or innovative approaches to safety and governance are especially valued.
Thank you for helping make these learning projects better for everyone!