Local-first GitHub repository traffic archival, promotion tracking, propagation intelligence, and repository observability.
GitHub Traffic Intelligence persistently collects and structures GitHub repository traffic data into a local SQLite database, generating a static dashboard and historical intelligence layer over time.
Unlike GitHub’s built-in traffic graphs, this system preserves historical data indefinitely and allows traffic correlation against real-world actions such as:
- Reddit posts
- Facebook posts
- release announcements
- README rewrites
- demo videos
- benchmark publications
- documentation pushes
- social media campaigns
- fork activity
- star/fork growth
- external discovery events
This project treats GitHub traffic as an intelligence surface rather than merely a statistics panel.
GitHub only exposes a rolling 14-day traffic window.
Without local archival:
- older traffic disappears
- campaign history is lost
- growth trends become invisible
- promotion impact becomes difficult to measure
- repository evolution becomes hard to analyze
- organic propagation becomes difficult to reconstruct
This project solves that problem by continuously collecting:
- repository views
- repository clones
- popular paths
- referrers
- repository metadata
- stars
- forks
- fork lineage
- metadata snapshots
and preserving them locally.
The goal is long-term repository intelligence ownership.
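As a rough illustration of the collection step, the sketch below builds authenticated requests for the GitHub REST endpoints that back the data listed above. The endpoint paths are the public GitHub API routes; the names `ENDPOINTS`, `build_request`, and `fetch` are hypothetical and are not taken from `github_traffic_collect.py`.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"

# Public GitHub REST routes; the dictionary keys are illustrative labels.
ENDPOINTS = {
    "views": "/repos/{repo}/traffic/views",
    "clones": "/repos/{repo}/traffic/clones",
    "paths": "/repos/{repo}/traffic/popular/paths",
    "referrers": "/repos/{repo}/traffic/popular/referrers",
    "metadata": "/repos/{repo}",
    "forks": "/repos/{repo}/forks",
}


def build_request(kind, repo, token):
    """Build an authenticated request for one endpoint of an owner/name repo."""
    url = API_ROOT + ENDPOINTS[kind].format(repo=repo)
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def fetch(kind, repo, token):
    """Perform the request and return the decoded JSON payload."""
    with urllib.request.urlopen(build_request(kind, repo, token), timeout=30) as resp:
        return json.load(resp)
```

Keeping request construction separate from the network call makes the URL and header logic easy to check without touching the API.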
Persist GitHub traffic history indefinitely into SQLite.
Tracks:
- views
- unique viewers
- clones
- unique cloners
- popular paths
- popular referrers
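A minimal sketch of how daily view counts might be persisted so that repeated collection runs never duplicate a day. The table name `traffic_views_daily` appears in this README, but the column layout and the `store_views` helper are assumptions for illustration. The payload shape follows the GitHub traffic views response (`views` entries with `timestamp`, `count`, `uniques`).

```python
import sqlite3

# Assumed column layout; only the table name comes from this README.
SCHEMA = """
CREATE TABLE IF NOT EXISTS traffic_views_daily (
    repo    TEXT NOT NULL,
    day     TEXT NOT NULL,      -- ISO date, e.g. 2024-05-01
    views   INTEGER NOT NULL,
    uniques INTEGER NOT NULL,
    PRIMARY KEY (repo, day)
)
"""


def store_views(conn, repo, payload):
    """Upsert each per-day entry of a traffic/views response; return rows written."""
    conn.execute(SCHEMA)
    rows = [
        (repo, item["timestamp"][:10], item["count"], item["uniques"])
        for item in payload.get("views", [])
    ]
    # ON CONFLICT ... DO UPDATE needs SQLite 3.24 or newer.
    conn.executemany(
        """
        INSERT INTO traffic_views_daily (repo, day, views, uniques)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(repo, day) DO UPDATE SET
            views = excluded.views,
            uniques = excluded.uniques
        """,
        rows,
    )
    conn.commit()
    return len(rows)
```

Because the rolling window overlaps between runs, an upsert keyed on `(repo, day)` lets the latest count for a day replace the earlier partial one while older days stay in place.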
Track how repository attention spreads over time.
Current propagation surfaces include:
- incoming link timeline
- path timeline
- referrer first-seen intelligence
- path first-seen intelligence
- propagation highlights
- repository metadata timeline
- star/fork/watch snapshots
- fork lineage tables
- fork tree views
This helps answer questions such as:
- Where did this traffic come from?
- Which referrers appeared first?
- Which paths attracted serious attention?
- Did forks appear after a discovery event?
- Did GitHub internal propagation compound after external exposure?
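The "which referrers appeared first" question can be answered with a single aggregate query. The sketch below assumes `popular_referrers_snapshot` has `repo`, `referrer`, and `captured_on` (ISO date) columns; those column names and the `first_seen_referrers` helper are hypothetical.

```python
import sqlite3


def first_seen_referrers(conn, repo):
    """Return (referrer, first_seen_date) pairs for one repo, earliest first."""
    return conn.execute(
        """
        SELECT referrer, MIN(captured_on) AS first_seen
        FROM popular_referrers_snapshot
        WHERE repo = ?
        GROUP BY referrer
        ORDER BY first_seen, referrer
        """,
        (repo,),
    ).fetchall()
```

The same pattern would apply to paths by swapping the table and the grouped column.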
Collects and preserves repository metadata including:
- stars
- watchers
- forks
- open issues
- default branch
- pushed timestamp
- GitHub update timestamp
- repository URL
Metadata is snapshotted so growth can be analyzed historically.
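To show why snapshots matter, here is a small sketch that turns a series of star-count snapshots into per-snapshot growth. The `(timestamp, stars)` tuple shape and the `star_deltas` name are assumptions, not the project's actual data model.

```python
def star_deltas(snapshots):
    """Given (iso_timestamp, stars) pairs, return (timestamp, change) per step.

    Timestamps must share one ISO format so that string order is time order.
    """
    ordered = sorted(snapshots)
    return [
        (current[0], current[1] - previous[1])
        for previous, current in zip(ordered, ordered[1:])
    ]
```

For example, snapshots of 10, 12, and 11 stars on three consecutive days yield changes of +2 and -1.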
Fork discovery is collected and displayed in the dashboard.
Tracks:
- fork full name
- fork owner
- fork repository name
- fork creation date
- last pushed date
- default branch
- fork stars
- clickable fork links
This allows the dashboard to function as a lightweight repository lineage observatory.
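The tracked fork fields map directly onto the repository objects returned by the GitHub forks endpoint. The sketch below flattens one such object; the GitHub field names (`full_name`, `owner.login`, `stargazers_count`, `html_url`, and so on) are real, while the output keys and the `fork_row` helper are illustrative assumptions.

```python
def fork_row(fork):
    """Flatten one GitHub fork object into the fields tracked for lineage."""
    return {
        "full_name": fork["full_name"],
        "owner": fork["owner"]["login"],
        "name": fork["name"],
        "created_at": fork["created_at"],
        "pushed_at": fork["pushed_at"],
        "default_branch": fork["default_branch"],
        "stars": fork["stargazers_count"],
        "url": fork["html_url"],  # used for the clickable link
    }
```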
Attach real-world actions to traffic spikes.
Examples:
- posted on /r/git
- Facebook launch
- release announcement
- benchmark publication
- new screenshots
- README overhaul
- demo video release
Each event supports:
- editable metadata
- framework grouping
- retroactive correction
- adjustable trail windows
- event overlays on charts
No external web server required.
Generates a local static HTML dashboard with:
- repository overview cards
- traffic charts
- event overlays
- referrer tables
- path tables
- inventory/ranking views
- promotion timelines
- propagation timelines
- fork lineage
- first-seen intelligence
- metadata snapshots
Optional localhost-only API for:
- adding events
- editing events
- deleting events
- regenerating dashboards
Security model:
- localhost only
- no public exposure
- no arbitrary shell execution
- structured JSON actions only
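The security model above can be sketched with the standard library alone. This is not the code in `github_traffic_local_api.py`; the action names, the `dispatch` function, and the default port are assumptions chosen to illustrate an allow-list of structured JSON actions served on the loopback interface only.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical action names mirroring the operations listed above.
ALLOWED_ACTIONS = {"add_event", "edit_event", "delete_event", "regenerate_dashboard"}


def dispatch(body):
    """Validate a raw JSON request body and return (http_status, response_dict)."""
    try:
        request = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, {"error": "invalid JSON"}
    if not isinstance(request, dict):
        return 400, {"error": "expected a JSON object"}
    action = request.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return 400, {"error": "unknown action"}
    # A real handler would call the event store or dashboard generator here.
    return 200, {"ok": True, "action": action}


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = dispatch(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port=8765):
    # Binding to 127.0.0.1 keeps the API unreachable from other hosts.
    return HTTPServer(("127.0.0.1", port), Handler)
```

Since requests only ever select from a fixed set of action names, no request content is passed to a shell.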
Supports:
- cron automation
- systemd user services
- fully local-first workflows
Everything is stored locally:
- raw API responses
- normalized traffic tables
- event metadata
- repository inventory
- repository metadata snapshots
- fork lineage
- collection runs
- propagation history
git clone https://github.com/TorMatzAndren/github-traffic.git
cd github-traffic
Debian / Ubuntu:
sudo apt install python3 sqlite3
Create a GitHub personal access token with repository traffic access.
Recommended:
- fine-grained token
- read-only repository permissions
GitHub:
- Settings
- Developer settings
- Personal access tokens
sudo install -d -m 0750 -o root -g "$USER" /etc/tokens
sudo install -m 0640 -o root -g "$USER" /dev/null /etc/tokens/github.token
sudo nano /etc/tokens/github.token
Paste token into:
/etc/tokens/github.token
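A collector might read the token along these lines. The path is the one used in this README; the `load_token` helper is a hypothetical sketch that strips the trailing newline an editor usually adds and never echoes the value.

```python
from pathlib import Path

TOKEN_PATH = Path("/etc/tokens/github.token")


def load_token(path=TOKEN_PATH):
    """Read the token file and return its contents without surrounding whitespace."""
    token = Path(path).read_text(encoding="utf-8").strip()
    if not token:
        # Report the path only; the token itself is never included in messages.
        raise ValueError(f"token file {path} is empty")
    return token
```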
./setup_github_traffic.sh
This automatically:
- installs local API systemd service
- enables service
- creates local config
- creates dashboard directories
- installs daily cron job
- generates first dashboard
./github_traffic_daily.sh
This:
- collects traffic
- stores SQLite history
- updates repository metadata
- discovers forks
- snapshots propagation surfaces
- regenerates dashboard
./open_dashboard.sh
The dashboard provides:
- repository overview cards
- historical traffic graphs
- event overlays
- referrer intelligence
- popular path tracking
- repository ranking surfaces
- incoming link timelines
- path timelines
- propagation highlights
- first-seen intelligence
- metadata timeline
- fork lineage and fork tree views
Top repositories are dynamically ranked by traffic activity and velocity.
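The exact ranking formula is not documented here. One plausible reading of "activity and velocity", offered purely as an assumption, is to rank by views in the latest window and break ties by the change relative to the window before it. The `velocity` and `rank` names are hypothetical.

```python
def velocity(daily_views, window=7):
    """Views in the latest window minus views in the window before it."""
    recent = sum(daily_views[-window:])
    previous = sum(daily_views[-2 * window:-window])
    return recent - previous


def rank(repos, window=7):
    """Rank repos by recent activity, then velocity, highest first.

    repos maps a repository name to its daily view counts, oldest first.
    Returns (name, activity, velocity) tuples.
    """
    scored = [
        (name, sum(views[-window:]), velocity(views, window))
        for name, views in repos.items()
    ]
    return sorted(scored, key=lambda row: (row[1], row[2]), reverse=True)
```

A repository with fewer than two full windows of history simply gets a smaller (or empty) comparison window rather than an error.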
GitHub Traffic Intelligence can help reconstruct discovery chains such as:
Reddit discussion
→ README traffic
→ clone spike
→ fork creation
→ GitHub internal propagation
→ secondary stars/forks
This is especially useful because GitHub’s own traffic windows are temporary.
The local database becomes the historical memory layer.
Promotion events are the core manual intelligence layer.
Example:
Posted ChronoGit on Reddit
→ /r/git
→ framework: reddit_launch
→ trail_days: 7
Traffic changes can then be correlated against real-world actions.
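One simple way to quantify that correlation is to compare views during the event's trail window with the same number of days before the event. The sketch below assumes daily views keyed by `datetime.date`; the `trail_impact` helper and its return shape are illustrative, with `trail_days` named after the field in the example above.

```python
from datetime import date, timedelta


def trail_impact(daily_views, event_day, trail_days=7):
    """Compare views in the trail window with the equal-length window before it.

    daily_views maps datetime.date to a view count; missing days count as zero.
    The trail window starts on event_day itself.
    """
    def total(start, days):
        return sum(daily_views.get(start + timedelta(days=i), 0) for i in range(days))

    after = total(event_day, trail_days)
    before = total(event_day - timedelta(days=trail_days), trail_days)
    return {"before": before, "after": after, "delta": after - before}
```

This is a naive before/after comparison and does not separate overlapping events or weekly seasonality.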
- Notice unexpected clone/view spike
- Check incoming link timeline
- Check referrer first-seen table
- Check path first-seen table
- Check fork lineage
- Correlate against stars/forks/metadata timeline
- Post project to Reddit
- Add event
- Watch traffic spike appear
- Compare against future campaigns
- Publish release
- Add release event
- Observe clone/view impact
- Improve repository presentation
- Add documentation event
- Compare conversion changes
github_traffic_collect.py
Queries the GitHub API endpoints and normalizes traffic, metadata, and fork data.
github_traffic_query.py
Produces structured JSON intelligence surfaces.
github_traffic_event.py
Handles:
- event insertion
- editing
- deletion
- framework metadata
generate_static_dashboard.py
Produces static HTML + JSON dashboard.
github_traffic_local_api.py
Optional localhost-only structured API.
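github_traffic_daily.sh
Runs the daily collection and regenerates the dashboard.
setup_github_traffic.sh
Installs the local API service, local config, dashboard directories, and daily cron job.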
Primary tables:
- collection_runs
- repositories
- repository_metadata_snapshots
- repository_forks
- traffic_views_daily
- traffic_clones_daily
- popular_paths_snapshot
- popular_referrers_snapshot
- raw_api_responses
- promotion_events
Both raw and normalized data are preserved.
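Keeping the raw payload next to the normalized rows means the history can be re-parsed later if the normalization logic changes. A minimal sketch of the raw side follows; the `raw_api_responses` table name comes from the list above, but its columns and the `archive_raw` helper are assumptions.

```python
import json
import sqlite3
from datetime import datetime, timezone


def archive_raw(conn, repo, endpoint, payload):
    """Store one API response verbatim as JSON text and return its row id."""
    # Assumed column layout; only the table name comes from this README.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS raw_api_responses (
            id         INTEGER PRIMARY KEY,
            repo       TEXT NOT NULL,
            endpoint   TEXT NOT NULL,
            fetched_at TEXT NOT NULL,   -- UTC, ISO 8601
            body       TEXT NOT NULL
        )
        """
    )
    cursor = conn.execute(
        "INSERT INTO raw_api_responses (repo, endpoint, fetched_at, body) "
        "VALUES (?, ?, ?, ?)",
        (
            repo,
            endpoint,
            datetime.now(timezone.utc).isoformat(),
            json.dumps(payload, sort_keys=True),
        ),
    )
    conn.commit()
    return cursor.lastrowid
```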
Tokens are stored outside the repository:
/etc/tokens/github.token
Never committed.
Never printed.
The API:
- binds only to 127.0.0.1
- does not expose arbitrary shell execution
- accepts structured actions only
- regenerates dashboards safely
Dashboard is static HTML.
No external SaaS dependency.
No external telemetry.
No cloud analytics.
Portable public version.
Features:
- static dashboard
- localhost helper API
- SQLite persistence
- event intelligence
- propagation intelligence
- fork lineage
- metadata snapshots
No Jarri dependency.
Experimental Jarri integration branch.
Future goals:
- Jarri Workspace panel
- jarri_cmd_api.py routing
- integrated audit surfaces
- Workspace-native visualization
Daily collection:
15 9 * * * /home/USER/projects/github-traffic/github_traffic_daily.sh
Local API:
systemctl --user status github-traffic-api.service
This project is intentionally:
- local-first
- inspectable
- archival
- deterministic
- portable
- SaaS-independent
The goal is not merely analytics.
The goal is long-term repository intelligence ownership.
Planned future work:
- traffic velocity scoring
- campaign grouping
- anomaly detection
- repository ranking engine
- event impact scoring
- comparative repository analytics
- timeline overlays
- traffic attribution systems
- referrer/path correlation
- fork activity scoring
- repository propagation graphs
- Jarri Workspace integration
- advanced intelligence layers
MIT
Tor Matz Andren
https://jarri.systems
