🦅 Raven

Raven is an intelligent security news aggregator designed to help organizations stay informed about relevant security threats and updates. It uses LLMs to analyze and filter security news based on your organization's tech stack, compliance requirements, and critical dependencies.

Key Features

Multi-Source Collection: Supports multiple news sources (Risky.biz, The Record Media)
Smart Filtering: Uses LLMs to analyze news relevance based on your company profile
Deduplication: Intelligent cross-source deduplication to avoid redundant news
Context-Aware: Considers your tech stack, third-party dependencies, and compliance requirements
Modular Design: Easy to extend with new collectors and output formats
Efficient Processing: Two-stage LLM analysis to minimize resource usage
Time-Based Filtering: Configurable age-based news filtering
Mock Data Support: Built-in mock collector for testing and development
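
The cross-source deduplication listed above can be illustrated with a small sketch. This is not Raven's actual implementation: the `Story` class, the `normalize` and `deduplicate` helpers, and the 0.85 similarity threshold are all assumptions made for illustration, using plain title similarity from the standard library.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Story:
    """Hypothetical stand-in for a collected news item."""
    title: str
    source: str


def normalize(title: str) -> str:
    """Lowercase the title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(stories: list[Story], threshold: float = 0.85) -> list[Story]:
    """Keep the first story from each group of near-identical titles."""
    kept: list[Story] = []
    for story in stories:
        candidate = normalize(story.title)
        is_duplicate = any(
            SequenceMatcher(None, candidate, normalize(k.title)).ratio() >= threshold
            for k in kept
        )
        if not is_duplicate:
            kept.append(story)
    return kept


stories = [
    Story("Critical flaw found in Example VPN", "riskybiz"),
    Story("Critical flaw found in Example VPN!", "therecord"),
    Story("New phishing campaign targets banks", "therecord"),
]
unique = deduplicate(stories)  # the second story is dropped as a duplicate
```

Title similarity is the simplest possible signal; a real deduplicator may also compare content or URLs.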

Getting Started

Prerequisites

Python 3.12+
Ollama with the mistral-small model installed

Installation

# Clone the repository
git clone https://github.com/marklechner/raven.git
cd raven

# Set up Python environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install dependencies
pip install -r requirements.txt

# Pull required Ollama model
ollama pull mistral-small

Configuration

Create config/config.yaml by copying config/config.yaml.example and adjusting it to your organization:

global:
  max_age_days: 7  # Default for all collectors

collectors:
  riskybiz:
    enabled: true
    feed_url: "https://risky.biz/feeds/risky-business/"
  therecord:
    enabled: true
  mock:
    enabled: false
    data_dir: "data/mock_news"

llm:
  model: "mistral-small"
  relevance_threshold: 0.6

company:
  name: "Your Company"
  industry: "Your Industry"
  size: "startup|enterprise|..."
  region: "EU|US|APAC|..."
  
  tech_stack:
    cloud: ["AWS", "GCP", "Azure"]
    languages: ["Python", "Java"]
    frameworks: ["Flask", "React"]
    infrastructure: ["Kubernetes", "GitHub"]
    
  security_concerns:
    high_priority: ["Cloud Security", "API Security"]
    compliance: ["SOC 2", "GDPR"]
    3rd_party_providers: ["Provider1", "Provider2"]
    
  assets:
    critical_systems: ["System1", "System2"]

Usage

# Basic usage
python -m src.main

# Available options:
python -m src.main --help                      # Show help
python -m src.main --check-config              # Validate config
python -m src.main --config custom-config.yaml  # Use custom config
python -m src.main --log-level DEBUG           # Set log level
python -m src.main --max-age 3                 # Override max age
python -m src.main --dry-run                   # Preview collection
python -m src.main --no-dedup                  # Disable deduplication
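
For readers extending the command line, the options above could be wired up with `argparse` roughly as follows. This is a sketch, not the code in src/main.py: the flag names come from the list above, while the defaults, help strings, and log-level choices are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the options listed in the Usage section."""
    parser = argparse.ArgumentParser(
        prog="python -m src.main",
        description="Collect and filter security news.",
    )
    parser.add_argument("--check-config", action="store_true",
                        help="validate the configuration and exit")
    # Assumed default path, based on the Configuration section.
    parser.add_argument("--config", default="config/config.yaml",
                        help="path to the configuration file")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                        help="logging verbosity")
    parser.add_argument("--max-age", type=int, default=None,
                        help="override max_age_days from the config")
    parser.add_argument("--dry-run", action="store_true",
                        help="preview collection without processing")
    parser.add_argument("--no-dedup", action="store_true",
                        help="disable deduplication")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--max-age", "3", "--no-dedup"])
```

Note that `argparse` converts dashes to underscores, so `--max-age` is read as `args.max_age` and `--no-dedup` as `args.no_dedup`.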

Project Structure

raven/
├── config/               # Configuration files
│   ├── config.yaml
│   └── config.yaml.example
├── data/
│   └── mock_news/       # Mock news for testing
├── src/
│   ├── collectors/      # News source collectors
│   │   ├── base_collector.py
│   │   ├── riskybiz_collector.py
│   │   ├── record_collector.py
│   │   └── mock_collector.py
│   ├── processors/      # Processing logic
│   │   ├── llm_processor.py
│   │   └── deduplication_processor.py
│   ├── delivery/        # Output formatting
│   ├── models/          # Data models
│   ├── utils/           # Utility functions
│   └── main.py
└── tests/               # Test suite

Adding New Features

Implementing a New Collector

Create a new collector class in src/collectors/:

from collectors.base_collector import BaseCollector
from models.news_item import NewsItem
from typing import List

class MyNewCollector(BaseCollector):
    def __init__(self, config: dict):
        super().__init__(config)
        # Initialize collector-specific settings

    async def collect(self) -> List[NewsItem]:
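        # Implement collection logic here and return the collected items.
        # Returning an empty list keeps the template valid until then.
        return []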
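
To show the template filled in end to end, here is a self-contained sketch of a collector that builds items from entries in its own config. The `NewsItem` and `BaseCollector` definitions below are simplified stand-ins, not the real classes in src/models/ and src/collectors/, and `StaticCollector` with its `entries` key is hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class NewsItem:
    """Simplified stand-in for the project's news item model."""
    title: str
    content: str
    published_date: datetime
    source: str
    categories: list[str] = field(default_factory=list)


class BaseCollector(ABC):
    """Simplified stand-in for the project's base collector."""

    def __init__(self, config: dict):
        self.config = config

    @abstractmethod
    async def collect(self) -> list[NewsItem]:
        """Return the items gathered from this source."""


class StaticCollector(BaseCollector):
    """Hypothetical collector that reads entries from its own config."""

    def __init__(self, config: dict):
        super().__init__(config)
        self.entries = config.get("entries", [])

    async def collect(self) -> list[NewsItem]:
        await asyncio.sleep(0)  # placeholder for real network I/O
        return [
            NewsItem(
                title=entry["title"],
                content=entry["content"],
                published_date=datetime.fromisoformat(entry["published_date"]),
                source="static",
                categories=entry.get("categories", []),
            )
            for entry in self.entries
        ]


collector = StaticCollector({
    "enabled": True,
    "entries": [{
        "title": "Test Security Event",
        "content": "Detailed description of the security event...",
        "published_date": "2024-12-19T10:00:00",
    }],
})
items = asyncio.run(collector.collect())
```

A real collector would replace the `asyncio.sleep(0)` placeholder with an HTTP request or feed parse.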

Testing with Mock Data

Create YAML files in data/mock_news/:

- title: "Test Security Event"
  content: "Detailed description of the security event..."
  published_date: "2024-12-19T10:00:00"
  categories: ["category1", "category2"]

Enable the mock collector in config/config.yaml:

collectors:
  mock:
    enabled: true
    data_dir: "data/mock_news"
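
Mock items are a convenient way to exercise the time-based filtering controlled by `max_age_days`. The sketch below is an assumption about how such a filter could work, not Raven's actual code: `filter_by_age` is a hypothetical helper, and the items are plain dicts shaped like the mock YAML entries above.

```python
from datetime import datetime, timedelta


def filter_by_age(items: list[dict], max_age_days: int, now: datetime) -> list[dict]:
    """Keep items published within the last max_age_days days."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        item for item in items
        if datetime.fromisoformat(item["published_date"]) >= cutoff
    ]


mock_items = [
    {"title": "Test Security Event", "published_date": "2024-12-19T10:00:00"},
    {"title": "Older Security Event", "published_date": "2024-12-01T10:00:00"},
]

# A fixed "now" keeps the example reproducible.
recent = filter_by_age(mock_items, max_age_days=7, now=datetime(2024, 12, 20, 12, 0, 0))
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, makes the filter easy to test against fixed mock data.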

Development Guidelines

Use type hints for better code clarity
Add logging for important operations
Follow the existing modular architecture
Write tests for new features
Update documentation as needed
Use the mock collector when testing new features
Validate configurations using the built-in validator

Future Improvements

License

This project is licensed under the MIT License - see the LICENSE file for details.
