| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +Socials is a Python library for detecting and extracting social media profile URLs from a list of hrefs. It uses regex pattern matching to identify profile URLs for Facebook, Twitter, LinkedIn, GitHub, Instagram, and YouTube, as well as email addresses. |
| 8 | + |
| 9 | +## Development Commands |
| 10 | + |
| 11 | +```bash |
| 12 | +# Setup |
| 13 | +virtualenv -p /usr/bin/python3 venv |
| 14 | +source venv/bin/activate |
| 15 | +pip install -r requirements-dev.txt |
| 16 | + |
| 17 | +# Testing |
| 18 | +tox # Multi-Python version testing |
| 19 | +python setup.py test # Single test run |
| 20 | +make test # Alternative: runs py.test |
| 21 | + |
| 22 | +# Linting |
| 23 | +make lint # Runs flake8 socials tests |
| 24 | + |
| 25 | +# Coverage |
| 26 | +make coverage # Generates HTML coverage report |
| 27 | + |
| 28 | +# Release (requires bumpversion) |
| 29 | +bumpversion patch && git push --tags |
| 30 | +``` |
| 31 | + |
| 32 | +## Architecture |
| 33 | + |
| 34 | +The library is a single-module design in `socials/socials.py`: |
| 35 | + |
| 36 | +- **Platform constants**: `PLATFORM_FACEBOOK`, `PLATFORM_GITHUB`, etc. |
| 37 | +- **Regex patterns**: Each platform has a list of regex patterns (e.g., `FACEBOOK_URL_REGEXS`) |
| 38 | +- **PATTERNS dict**: Maps platform constants to their regex lists |
| 39 | +- **Extraction class**: Wrapper returned by `socials.extract()` with methods: |
| 40 | + - `get_matches_per_platform()` - returns dict of all matches by platform |
| 41 | + - `get_matches_for_platform(platform)` - returns list for specific platform |
| 42 | +- **Cleaners**: Optional post-processing functions (e.g., `clean_mailto` strips `mailto:` prefix) |
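
To make the pieces listed above concrete, here is a minimal, self-contained sketch of how they might fit together. It is not the code in `socials/socials.py`: the constant values, both regexes, and the bodies of `Extraction`, `get_cleaner()`, and `extract()` are assumptions written for illustration, and only the names come from this file.

```python
import re

# Platform constants (names follow the list above; the string values are assumed).
PLATFORM_GITHUB = 'github'
PLATFORM_EMAIL = 'email'

# Illustrative patterns only, not the library's actual regexes.
GITHUB_URL_REGEXS = [r'^http(s)?://(www\.)?github\.com/[A-Za-z0-9_-]+/?$']
EMAIL_URL_REGEXS = [r'^mailto:[^@\s]+@[^@\s]+$']

# Maps each platform constant to its list of regexes.
PATTERNS = {
    PLATFORM_GITHUB: GITHUB_URL_REGEXS,
    PLATFORM_EMAIL: EMAIL_URL_REGEXS,
}


def clean_mailto(url):
    """Strip a leading 'mailto:' prefix, if present."""
    prefix = 'mailto:'
    return url[len(prefix):] if url.startswith(prefix) else url


def get_cleaner(platform):
    """Return the cleaner for a platform, or None if it has none."""
    return {PLATFORM_EMAIL: clean_mailto}.get(platform)


class Extraction:
    """Wrapper around a list of hrefs (assumed shape of the real class)."""

    def __init__(self, hrefs):
        self._hrefs = list(hrefs)

    def get_matches_per_platform(self):
        # One entry per known platform, even when nothing matched.
        return {p: self.get_matches_for_platform(p) for p in PATTERNS}

    def get_matches_for_platform(self, platform):
        cleaner = get_cleaner(platform)
        matches = []
        for href in self._hrefs:
            if any(re.match(rx, href) for rx in PATTERNS[platform]):
                matches.append(cleaner(href) if cleaner else href)
        return matches


def extract(hrefs):
    return Extraction(hrefs)


hrefs = [
    'https://github.com/octocat',
    'mailto:jane@example.com',
    'https://example.com/about',  # matches no platform
]
extraction = extract(hrefs)
github_matches = extraction.get_matches_for_platform(PLATFORM_GITHUB)
all_matches = extraction.get_matches_per_platform()
```

In this sketch the cleaner runs only on hrefs that already matched, so `clean_mailto` never sees a non-email URL. Whether the real module applies cleaners at the same point is not stated here.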
| 43 | + |
| 44 | +## Adding a New Platform |
| 45 | + |
| 46 | +1. Add platform constant: `PLATFORM_X = 'x'` |
| 47 | +2. Add regex list: `X_URL_REGEXS = [r'^http(s)?://...']` |
| 48 | +3. Add to `PATTERNS` dict |
| 49 | +4. Add test cases in `tests/test_socials.py` |
| 50 | +5. If needed, add cleaner function and register in `get_cleaner()` |
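
Steps 1 to 4 above could look roughly like the following. Everything here is hypothetical: the `example` platform, its domain and path layout, the empty stand-in `PATTERNS` dict, and the `matches()` helper are invented for illustration and do not exist in the repository.

```python
import re

# Stand-in for the module's existing PATTERNS dict
# (assumed shape: platform constant -> list of regex strings).
PATTERNS = {}

# Step 1: platform constant ('example' is a made-up platform).
PLATFORM_EXAMPLE = 'example'

# Step 2: regex list, following the conventions used elsewhere in this file.
EXAMPLE_URL_REGEXS = [
    r'^http(s)?://(www\.)?example\.com/users/[A-Za-z0-9_-]+/?$',
]

# Step 3: register the regex list under the new constant.
PATTERNS[PLATFORM_EXAMPLE] = EXAMPLE_URL_REGEXS


def matches(platform, href):
    """Hypothetical helper: True if any of the platform's regexes match."""
    return any(re.match(rx, href) for rx in PATTERNS[platform])


# Step 4: a pytest-style test covering a positive and a negative case.
def test_example_profile_urls():
    assert matches(PLATFORM_EXAMPLE, 'https://www.example.com/users/jane_doe/')
    assert not matches(PLATFORM_EXAMPLE, 'https://www.example.com/users/jane_doe/photos')
```

Pairing each new pattern with at least one URL it must reject (a deeper path, in this case) helps catch a missing `$` anchor early.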
| 51 | + |
| 52 | +## Regex Conventions |
| 53 | + |
| 54 | +- All patterns use `^` and `$` anchors for exact matching |
| 55 | +- Character classes spell out `[A-Za-z0-9_-]` instead of a range like `[A-z]`, which would also match the ASCII punctuation between `Z` and `a` |
| 56 | +- Optional trailing slash: `/?$` |
| 57 | +- Optional https: `http(s)?://` |
| 58 | +- Optional www: `(www\.)?` |
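
The conventions above can be combined into a single pattern, sketched below. `PROFILE_REGEX` and `is_profile()` are hypothetical names for a made-up `example.com` platform. The second half shows why the character class is spelled out rather than written as `[A-z]`.

```python
import re

# Hypothetical pattern assembled from the conventions above:
# anchors, optional https, optional www, explicit character class,
# optional trailing slash.
PROFILE_REGEX = r'^http(s)?://(www\.)?example\.com/[A-Za-z0-9_-]+/?$'


def is_profile(url):
    """True only if the whole string is a profile URL."""
    return re.match(PROFILE_REGEX, url) is not None


# The [A-z] range covers ASCII 65..122, which includes the six
# punctuation characters between 'Z' (90) and 'a' (97).
SLOPPY = re.compile(r'^[A-z]+$')
STRICT = re.compile(r'^[A-Za-z]+$')

sloppy_accepts_caret = SLOPPY.match('a^b') is not None
strict_accepts_caret = STRICT.match('a^b') is not None
```

With these definitions, `is_profile` accepts `http://example.com/jane` and `https://www.example.com/jane-doe/` but rejects `https://example.com/jane/posts`, because the `$` anchor forbids anything after the optional trailing slash.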