Python library and CLI to turn URLs into structured social media profiles.
You have a list of URLs from a scrape, a CSV export, or email signatures. Some of them are social media profiles. Socials finds them and gives you structured data to work with.
| Extract | Pull social profiles from scraped pages or contact lists |
| Validate | Check if URLs are recognized social profiles |
| Normalize | Get consistent usernames from messy URL variations |
| Categorize | Group URLs by platform or entity type |
| Automate | Batch process URL files via CLI |
Note: This README documents the upcoming 1.0 release. To try it, install with pre-release support:
pip install --pre socials
# or
uv add --pre socials
Feedback welcome at GitHub Issues.
For the current stable version (0.3.x), use pip install socials and see the
v0.3.0 documentation.
import socials
# Parse a single URL
repo = socials.parse("https://github.com/lorey/socials")
print(repo)
# GitHubRepoURL(owner='lorey', repo='socials')
print(repo.platform)
# 'github'
print(repo.owner)
# 'lorey'
# Parse multiple URLs at once
urls = ["https://github.com/lorey", "https://twitter.com/karllorey", "https://example.com"]
result = socials.parse_all(urls)
print(result.all())
# [GitHubProfileURL(username='lorey'), TwitterProfileURL(username='karllorey')]
print(result.by_platform())
# {'github': [...], 'twitter': [...]}
- Structured data, not strings. You get typed Python objects with extracted fields like `username`, `repo`, or `company`, not just a matched URL string.
- Handles the edge cases. With or without `www`. Trailing slashes or not. Old URL formats. Mobile URLs. Socials normalizes them all.
- Comprehensive platform coverage. 8 platforms with multiple entity types each. Profiles, repos, companies, channels. Continuously updated as platforms change their URL formats.
- Extensible. Need to support an internal tool or a platform we don't cover? Register your own parser and it works with the same API.
- Built for messy real-world data. Lenient by default. Unknown URLs return `None` instead of crashing. Strict mode available when you need validation.
- Type-safe with IDE support. Full type hints. Autocomplete works. Catch bugs before runtime.
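To make the normalization idea concrete, here is a minimal standard-library sketch of how several spellings of one profile URL can collapse to a single username, with `None` for anything unrecognized. It is an illustration only, not how Socials is implemented: `github_username` is a hypothetical helper, and it deliberately ignores details a real parser has to handle, such as reserved paths like `/settings`.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Hypothetical helper for illustration; this is NOT part of the Socials API.
_GITHUB_HOSTS = {"github.com", "www.github.com"}


def github_username(url: str) -> str | None:
    """Return the username from a GitHub profile URL, or None if it is not one."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https"):
        return None
    # urlsplit lowercases the hostname, so "GitHub.com" and "github.com" match.
    if (parts.hostname or "") not in _GITHUB_HOSTS:
        return None
    # Drop empty segments so trailing or doubled slashes do not matter.
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) != 1:
        return None  # root page, repository, or a deeper path
    return segments[0]


variants = [
    "https://github.com/lorey",
    "http://www.github.com/lorey/",
    "https://GitHub.com/lorey?tab=repositories",
    "https://example.com/lorey",
]
for url in variants:
    print(url, "->", github_username(url))
```

The first three variants yield the same username, while the last one yields `None` rather than raising, which mirrors the lenient default described above.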
Each parsed URL is a typed object with platform-specific fields:
import socials
company = socials.parse("https://linkedin.com/company/acme-corp")
print(company)
# LinkedInCompanyURL(company_name='acme-corp')
print(company.platform)
# 'linkedin'
print(company.entity_type)
# 'company'
Navigate from a repo to its owner, or from any URL to its root:
import socials
repo = socials.parse("https://github.com/lorey/socials")
print(repo.get_parent())
# GitHubProfileURL(username='lorey')
Parse many URLs at once and group the results:
import socials
urls = ["https://github.com/lorey", "https://twitter.com/karllorey"]
result = socials.parse_all(urls)
result.all()
# list of all parsed URLs
result.by_platform()
# {'github': [...], 'twitter': [...]}
result.by_type()
# {'profile': [...]}
Only extract what you need:
import socials
extractor = socials.Extractor(platforms=["github", "linkedin"])
print(extractor.parse("https://twitter.com/someone"))
# None

| Platform | Entity Types | Example Fields |
|---|---|---|
| GitHub | profile, repo | username, owner, repo |
| Twitter/X | profile | username |
| LinkedIn | profile, company | username, company_name |
| | profile | username |
| | profile | username |
| YouTube | channel | channel_id, username |
| Phone | phone | phone |
Missing a platform? Open an issue or submit a PR!
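The relationship between a platform, its entity types, and their fields can be pictured with a small sketch. Everything below is hypothetical and written for illustration: `ProfileRef`, `RepoRef`, and `classify_github` are stand-ins, not the classes or the parser registration API that Socials ships. The sketch assumes the entity type can be decided from the number of path segments alone.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from urllib.parse import urlsplit


# Hypothetical stand-ins for typed results; not the Socials classes.
@dataclass(frozen=True)
class ProfileRef:
    username: str
    entity_type: str = "profile"


@dataclass(frozen=True)
class RepoRef:
    owner: str
    repo: str
    entity_type: str = "repo"

    def parent(self) -> ProfileRef:
        """A repository's parent is the profile of its owner."""
        return ProfileRef(username=self.owner)


def classify_github(url: str) -> ProfileRef | RepoRef | None:
    """Map a GitHub URL to an entity by counting its path segments."""
    parts = urlsplit(url)
    if parts.hostname not in ("github.com", "www.github.com"):
        return None
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) == 1:
        return ProfileRef(segments[0])
    if len(segments) == 2:
        return RepoRef(segments[0], segments[1])
    return None


# Group recognized URLs by entity type, skipping anything unrecognized.
grouped = defaultdict(list)
for url in ["https://github.com/lorey", "https://github.com/lorey/socials"]:
    entity = classify_github(url)
    if entity is not None:
        grouped[entity.entity_type].append(entity)

print(dict(grouped))
```

Keeping one small typed class per entity type is what lets each one carry only the fields that make sense for it, and it gives a natural place for navigation such as going from a repository to its owner.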
The CLI lets you process URLs directly from the command line. Run it with uvx (no install needed) or install globally with pip install socials.
$ uvx socials --help
Usage: socials [OPTIONS] COMMAND [ARGS]...
Extract social media profile URLs from a list of URLs.
╭─ Commands ─────────────────────────────────────────╮
│ extract   Extract social media URLs from input.    │
│ check     Check which platform a URL belongs to.   │
╰────────────────────────────────────────────────────╯
Examples:
# Find all social links on a webpage
$ curl -s https://karllorey.com | grep -oE 'https?://[^"]+' | socials extract
linkedin https://www.linkedin.com/in/karllorey
github https://github.com/lorey
instagram https://www.instagram.com/karllorey
# Check what platform a URL belongs to
$ socials check https://github.com/lorey
github
Full docs at socials.readthedocs.io
- Getting Started - Tutorial with examples
- CLI Reference - Command-line usage
- API Reference - Full API docs
- Architecture - How it works
- Socials API - REST API wrapper. Free hosted version available.
- social-media-profiles-regexs - Regular expressions for social media URLs.
- flutter_url_recognizer - Similar implementation for Flutter.
GNU General Public License v3