Add traffic collection and slack messaging if those fail #9
Changes from 10 commits
78f1c92
36011b6
4a030e8
8291149
ddd8965
7f03c2a
0371756
c129ee8
a9bd174
614e4b7
22f04b4
ce69353
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| name: Biweekly Traffic collection | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
| inputs: | ||
| slack_channel: | ||
| description: Slack channel to post the error message to if the builds fail. | ||
| required: false | ||
| default: "sdv-alerts-debug" | ||
|
|
||
| schedule: | ||
| - cron: "0 0 */14 * *" # Runs at midnight UTC on days 1, 15 and 29 of each month | ||
|
|
||
| jobs: | ||
| collect_traffic: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - name: Set up Python 3.13 | ||
| uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: '3.13' | ||
| - name: Install dependencies | ||
| run: | | ||
| python -m pip install --upgrade pip | ||
| python -m pip install . | ||
| - name: Collect GitHub Traffic Data | ||
| run: | | ||
| github-analytics traffic -v -t ${{ secrets.PERSONAL_ACCESS_TOKEN }} -c traffic_config.yaml | ||
| env: | ||
| PYDRIVE_CREDENTIALS: ${{ secrets.PYDRIVE_CREDENTIALS }} | ||
| alert: | ||
| needs: [collect_traffic] | ||
| runs-on: ubuntu-latest | ||
| if: failure() | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: '3.13' | ||
| - name: Install slack dependencies | ||
| run: | | ||
| python -m pip install --upgrade pip | ||
| python -m pip install invoke | ||
| python -m pip install -e .[dev] | ||
| - name: Slack alert if failure | ||
| run: python -m github_analytics.slack_utils -r ${{ github.run_id }} -c ${{ github.event.inputs.slack_channel || 'sdv-alerts' }} | ||
| env: | ||
| SLACK_TOKEN: ${{ secrets.SLACK_TOKEN }} | ||
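
A note on the schedule in this workflow: in standard cron syntax a `*/14` day-of-month field steps through the month starting at day 1, so it matches days 1, 15 and 29 rather than a rolling 14-day interval. The sketch below is only an illustration of that stepping rule in plain Python; the helper names `step_days` and `run_dates` are made up for this note and are not part of the PR.

```python
import calendar
from datetime import date


def step_days(year, month, step=14):
    """Days of the month matched by a `*/step` day-of-month field (hypothetical helper)."""
    last_day = calendar.monthrange(year, month)[1]
    # The step starts at day 1 and restarts every month.
    return list(range(1, last_day + 1, step))


def run_dates(year, step=14):
    """All calendar dates in `year` on which such a schedule would fire."""
    return [
        date(year, month, day)
        for month in range(1, 13)
        for day in step_days(year, month, step)
    ]


dates = run_dates(2025)
# Distinct gaps, in days, between consecutive runs within 2025.
gaps = sorted({(later - earlier).days for earlier, later in zip(dates, dates[1:])})

print(step_days(2025, 1))  # [1, 15, 29]
print(step_days(2025, 2))  # [1, 15]
print(gaps)                # [2, 3, 14]
```

Under this reading, the run on the 29th is followed by another run two or three days later on the 1st. Since GitHub's traffic endpoints report roughly the last 14 days, the overlap is probably harmless, but the collected windows will not line up on a strict biweekly boundary.
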
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,184 @@ | ||||||||||||||||||
| """Traffic client for retrieving github information.""" | ||||||||||||||||||
|
|
||||||||||||||||||
| import logging | ||||||||||||||||||
|
|
||||||||||||||||||
| import pandas as pd | ||||||||||||||||||
| import requests | ||||||||||||||||||
|
|
||||||||||||||||||
| logging.basicConfig(level=logging.INFO) | ||||||||||||||||||
| LOGGER = logging.getLogger(__name__) | ||||||||||||||||||
|
|
||||||||||||||||||
| GITHUB_API_URL = 'https://api.github.com' | ||||||||||||||||||
|
|
||||||||||||||||||
|
|
||||||||||||||||||
| class TrafficClient: | ||||||||||||||||||
| """Client to fetch traffic data (referrers, paths, views and clones) for a given repository. | ||||||||||||||||||
|
|
||||||||||||||||||
| Args: | ||||||||||||||||||
| token (str): | ||||||||||||||||||
| GitHub personal access token for authentication. | ||||||||||||||||||
| """ | ||||||||||||||||||
|
|
||||||||||||||||||
| def __init__(self, token): | ||||||||||||||||||
| self.token = token | ||||||||||||||||||
| self.headers = { | ||||||||||||||||||
| 'Authorization': f'token {token}', | ||||||||||||||||||
| 'Accept': 'application/vnd.github.v3+json', | ||||||||||||||||||
| } | ||||||||||||||||||
|
|
||||||||||||||||||
| def _get_traffic_data(self, repo: str, endpoint: str) -> list: | ||||||||||||||||||
| """Helper method to fetch traffic data from GitHub's REST API. | ||||||||||||||||||
|
|
||||||||||||||||||
| Args: | ||||||||||||||||||
| repo (str): | ||||||||||||||||||
| The repository in the format "owner/repo". | ||||||||||||||||||
| endpoint (str): | ||||||||||||||||||
| The traffic API endpoint (e.g., "popular/referrers", "popular/paths", "views" or | ||||||||||||||||||
| "clones"). | ||||||||||||||||||
|
|
||||||||||||||||||
| Returns: | ||||||||||||||||||
| list: | ||||||||||||||||||
| The JSON response containing traffic data. | ||||||||||||||||||
|
|
||||||||||||||||||
| Raises: | ||||||||||||||||||
| RuntimeError: | ||||||||||||||||||
| If the API request fails. | ||||||||||||||||||
| """ | ||||||||||||||||||
| url = f'{GITHUB_API_URL}/repos/{repo}/traffic/{endpoint}' | ||||||||||||||||||
| LOGGER.info(f'Fetching traffic data from: {url}') | ||||||||||||||||||
|
|
||||||||||||||||||
| response = requests.get(url, headers=self.headers, timeout=30)  # avoid hanging the job indefinitely | ||||||||||||||||||
|
|
||||||||||||||||||
| if response.status_code == 200: | ||||||||||||||||||
| LOGGER.info(f'Successfully retrieved {endpoint} data for {repo}.') | ||||||||||||||||||
| return response.json() | ||||||||||||||||||
| else: | ||||||||||||||||||
| LOGGER.error(f'GitHub API Error ({response.status_code}): {response.text}') | ||||||||||||||||||
| raise RuntimeError(f'GitHub API Error ({response.status_code}): {response.text}') | ||||||||||||||||||
|
|
||||||||||||||||||
| def get_traffic_referrers(self, repo: str) -> pd.DataFrame: | ||||||||||||||||||
| """Fetches the top referring domains that send traffic to the given repository. | ||||||||||||||||||
|
|
||||||||||||||||||
| Args: | ||||||||||||||||||
| repo (str): | ||||||||||||||||||
| The repository in the format "owner/repo". | ||||||||||||||||||
|
|
||||||||||||||||||
| Returns: | ||||||||||||||||||
| pd.DataFrame: | ||||||||||||||||||
| DataFrame containing referrer traffic details with columns: | ||||||||||||||||||
| - `referrer`: Source domain. | ||||||||||||||||||
| - `count`: Number of views. | ||||||||||||||||||
| - `uniques`: Number of unique visitors. | ||||||||||||||||||
| """ | ||||||||||||||||||
| LOGGER.info(f'Fetching traffic referrers for {repo}.') | ||||||||||||||||||
| data = self._get_traffic_data(repo, 'popular/referrers') | ||||||||||||||||||
| df = pd.DataFrame(data, columns=['referrer', 'count', 'uniques']) | ||||||||||||||||||
|
|
||||||||||||||||||
| LOGGER.info(f'Retrieved {len(df)} referrer records for {repo}.') | ||||||||||||||||||
| return df | ||||||||||||||||||
|
|
||||||||||||||||||
| def get_traffic_paths(self, repo: str) -> pd.DataFrame: | ||||||||||||||||||
| """Fetches the most visited paths in the given repository. | ||||||||||||||||||
|
|
||||||||||||||||||
| Args: | ||||||||||||||||||
| repo (str): | ||||||||||||||||||
| The repository in the format "owner/repo". | ||||||||||||||||||
|
|
||||||||||||||||||
| Returns: | ||||||||||||||||||
| pd.DataFrame: DataFrame containing popular paths with columns: | ||||||||||||||||||
| - `path`: The visited path. | ||||||||||||||||||
| - `title`: Page title. | ||||||||||||||||||
| - `count`: Number of views. | ||||||||||||||||||
| - `uniques`: Number of unique visitors. | ||||||||||||||||||
| """ | ||||||||||||||||||
| LOGGER.info(f'Fetching traffic paths for {repo}.') | ||||||||||||||||||
| data = self._get_traffic_data(repo, 'popular/paths') | ||||||||||||||||||
| df = pd.DataFrame(data, columns=['path', 'title', 'count', 'uniques']) | ||||||||||||||||||
|
|
||||||||||||||||||
| LOGGER.info(f'Retrieved {len(df)} path records for {repo}.') | ||||||||||||||||||
| return df | ||||||||||||||||||
|
|
||||||||||||||||||
| def get_traffic_views(self, repo: str) -> pd.DataFrame: | ||||||||||||||||||
| """Fetches the number of views for the given repository over time. | ||||||||||||||||||
|
|
||||||||||||||||||
| Args: | ||||||||||||||||||
| repo (str): | ||||||||||||||||||
| The repository in the format "owner/repo". | ||||||||||||||||||
|
|
||||||||||||||||||
| Returns: | ||||||||||||||||||
| pd.DataFrame: | ||||||||||||||||||
| DataFrame containing repository views with columns: | ||||||||||||||||||
| - `timestamp`: Date of views. | ||||||||||||||||||
| - `count`: Number of views. | ||||||||||||||||||
| - `uniques`: Number of unique visitors. | ||||||||||||||||||
| """ | ||||||||||||||||||
| data = self._get_traffic_data(repo, 'views') | ||||||||||||||||||
| return pd.DataFrame(data['views'], columns=['timestamp', 'count', 'uniques']) | ||||||||||||||||||
|
||||||||||||||||||
Suggested change
| - return pd.DataFrame(data['views'], columns=['timestamp', 'count', 'uniques']) |
| + return pd.DataFrame(data['views'], columns=['timestamp', 'views', 'unique_visitors']) |
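
One thing that may be worth checking with the renamed labels: when `pd.DataFrame` is built from a list of dicts, `columns=` selects keys rather than renaming them, so labels that do not exist in the payload come back as all-NaN columns. The sketch below uses a made-up `sample` payload shaped like the `views` list (keys `timestamp`, `count`, `uniques`, as in the original line); it is an illustration under that assumption, not the code that was merged.

```python
import pandas as pd

# Hypothetical payload, shaped like the list under the 'views' key.
sample = {
    'views': [
        {'timestamp': '2025-01-01T00:00:00Z', 'count': 10, 'uniques': 4},
        {'timestamp': '2025-01-02T00:00:00Z', 'count': 7, 'uniques': 3},
    ]
}

# Passing the new labels straight to `columns=` selects keys that do not exist.
selected = pd.DataFrame(sample['views'], columns=['timestamp', 'views', 'unique_visitors'])
print(selected['views'].isna().all())  # True

# Selecting the real keys first and renaming afterwards keeps the values.
renamed = pd.DataFrame(
    sample['views'], columns=['timestamp', 'count', 'uniques']
).rename(columns={'count': 'views', 'uniques': 'unique_visitors'})
print(renamed['views'].tolist())  # [10, 7]
```

The same consideration would apply to the `clones` variant if its columns are relabelled the same way.
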
Suggested change
| - return pd.DataFrame(data['clones'], columns=['timestamp', 'count', 'uniques']) |
| + return pd.DataFrame(data['clones'], columns=['timestamp', 'clones', 'unique_cloners']) |
gsheni marked this conversation as resolved.
Suggested change
| - 'Traffic Referrers': self.get_traffic_referrers(repo), |
| - 'Traffic Paths': self.get_traffic_paths(repo), |
| - 'Traffic Views': self.get_traffic_views(repo), |
| - 'Traffic Clones': self.get_traffic_clones(repo), |
| + 'Traffic Referring Sites': self.get_traffic_referrers(repo), |
| + 'Traffic Popular Content': self.get_traffic_paths(repo), |
| + 'Traffic Visitors': self.get_traffic_views(repo), |
| + 'Traffic Git Clones': self.get_traffic_clones(repo), |
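
The new keys appear to mirror the section headings on GitHub's Insights > Traffic page (Referring sites, Popular content, Visitors, Git clones), while the method calls stay the same. A minimal sketch of that relabelling is below; `StubTrafficClient` and `collect_all` are hypothetical names used only here, and the stub returns strings instead of calling the API, because the enclosing method is not visible in this diff view.

```python
class StubTrafficClient:
    """Hypothetical stand-in for TrafficClient; returns labels instead of DataFrames."""

    def get_traffic_referrers(self, repo):
        return f'referrers for {repo}'

    def get_traffic_paths(self, repo):
        return f'paths for {repo}'

    def get_traffic_views(self, repo):
        return f'views for {repo}'

    def get_traffic_clones(self, repo):
        return f'clones for {repo}'


def collect_all(client, repo):
    """Build the label -> data mapping using the suggested keys (hypothetical helper)."""
    return {
        'Traffic Referring Sites': client.get_traffic_referrers(repo),
        'Traffic Popular Content': client.get_traffic_paths(repo),
        'Traffic Visitors': client.get_traffic_views(repo),
        'Traffic Git Clones': client.get_traffic_clones(repo),
    }


result = collect_all(StubTrafficClient(), 'owner/repo')
for label, value in result.items():
    print(f'{label}: {value}')
```

Any downstream code that looks these entries up by the old keys would need the same rename.
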
