Add traffic collection and slack messaging if those fail #9
| run: | | ||
| python -m pip install --upgrade pip | ||
| python -m pip install invoke | ||
| python -m pip install -e .[dev] |
| python -m pip install -e .[dev] | |
| python -m pip install .[dev] |
github_analytics/github/traffic.py
Outdated
| repo (str): | ||
| The repository in the format "owner/repo". | ||
| endpoint (str): | ||
| The traffic API endpoint (e.g., "popular/referrers" or "popular/paths"). |
Let's list the values this could be.
| The traffic API endpoint (e.g., "popular/referrers" or "popular/paths"). | |
| The traffic API endpoint (e.g., "popular/referrers", "popular/paths", "views", "clones"). |
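One way to keep that list of values in a single place is a module-level constant that the docstring and a small check can both refer to. This is only a sketch: `VALID_TRAFFIC_ENDPOINTS` and `validate_traffic_endpoint` are hypothetical names, not part of this PR.

```python
# Hypothetical constant and helper; the names are illustrative, not from the PR.
VALID_TRAFFIC_ENDPOINTS = ('popular/referrers', 'popular/paths', 'views', 'clones')


def validate_traffic_endpoint(endpoint):
    """Return ``endpoint`` unchanged, or raise if it is not a known traffic endpoint."""
    if endpoint not in VALID_TRAFFIC_ENDPOINTS:
        valid = ', '.join(VALID_TRAFFIC_ENDPOINTS)
        raise ValueError(f"Unknown traffic endpoint '{endpoint}'. Expected one of: {valid}.")
    return endpoint
```

Calling the check at the top of `_get_traffic_data` would turn a typo in the endpoint into an early, readable error instead of a failed request.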
| output_folder (str): | ||
| Folder in which the metrics will be stored. | ||
| """ | ||
| timestamp = datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S') |
Is it possible to save the timeframe for the data as another sheet? So the start and end timestamp (last 2 weeks)?
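A rough sketch of what that extra sheet could contain, assuming the window is the 14 days ending at collection time. `build_timeframe_row` and its column names are hypothetical, and the 14-day default is taken from the "last 2 weeks" in the comment rather than from the code.

```python
import datetime


def build_timeframe_row(end=None, days=14):
    """Return the start and end of the reporting window as ISO-formatted strings.

    ``days=14`` is an assumption based on the "last 2 weeks" mentioned in the review.
    """
    end = end or datetime.datetime.now()
    start = end - datetime.timedelta(days=days)
    return {
        'start_timestamp': start.isoformat(timespec='seconds'),
        'end_timestamp': end.isoformat(timespec='seconds'),
    }
```

Wrapping the result as `pd.DataFrame([build_timeframe_row()])` would give a one-row sheet that can be written next to the other traffic sheets.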
amontanez24 left a comment
Looking good to me!
github_analytics/drive.py
Outdated
| if parent_folder.startswith('gdrive://'): | ||
| parent_folder = parent_folder.replace('gdrive://', '') |
minor: can we make 'gdrive://' a constant?
This file was updated 614e4b7
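For reference, the constant requested in this thread might look like the sketch below. `GDRIVE_PREFIX` and `strip_gdrive_prefix` are hypothetical names; the actual change in 614e4b7 may differ. Slicing is used instead of `replace` so that only the leading prefix is removed.

```python
GDRIVE_PREFIX = 'gdrive://'  # hypothetical constant name


def strip_gdrive_prefix(parent_folder):
    """Return the folder ID without the leading ``gdrive://`` scheme, if present."""
    if parent_folder.startswith(GDRIVE_PREFIX):
        # Slice off only the leading prefix; later occurrences are left untouched.
        return parent_folder[len(GDRIVE_PREFIX):]
    return parent_folder
```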
Addressed @gsheni's feedback; here are the results: https://docs.google.com/spreadsheets/d/1ggFsHydE7csoL95qJHRtR2LXQMTotYi07SkoH3lEpmE/edit?usp=sharing
444938b to 614e4b7
| """ | ||
| LOGGER.info(f'Fetching traffic referrers for {repo}.') | ||
| data = self._get_traffic_data(repo, 'popular/referrers') | ||
| df = pd.DataFrame(data, columns=['referrer', 'count', 'uniques']) |
| """ | ||
| LOGGER.info(f'Fetching traffic paths for {repo}.') | ||
| data = self._get_traffic_data(repo, 'popular/paths') | ||
| df = pd.DataFrame(data, columns=['path', 'title', 'count', 'uniques']) |
| df = pd.DataFrame(data, columns=['path', 'title', 'count', 'uniques']) | |
| df = pd.DataFrame(data, columns=['content', 'title', 'views', 'unique_visitors']) |
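A caveat worth checking before applying this kind of rename: if `_get_traffic_data` returns the raw records as a list of dicts keyed `path`, `title`, `count`, `uniques` (an assumption here, since its return value is not shown), then `columns=` selects keys rather than renaming them, and names that don't exist in the records come back as empty columns. A sketch that builds the frame with the original keys and renames afterwards, using made-up sample records:

```python
import pandas as pd

# Sample records shaped like the assumed traffic paths response (values are made up).
data = [
    {'path': '/sdv-dev/SDV', 'title': 'SDV', 'count': 120, 'uniques': 45},
    {'path': '/sdv-dev/SDV/issues', 'title': 'Issues', 'count': 30, 'uniques': 12},
]

# Build the frame with the original keys, then rename to the suggested column names.
df = pd.DataFrame(data, columns=['path', 'title', 'count', 'uniques'])
df = df.rename(columns={'path': 'content', 'count': 'views', 'uniques': 'unique_visitors'})
```

The same pattern would apply to the views and clones frames, with their own rename mappings.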
github_analytics/github/traffic.py
Outdated
| - `uniques`: Number of unique visitors. | ||
| """ | ||
| data = self._get_traffic_data(repo, 'views') | ||
| return pd.DataFrame(data['views'], columns=['timestamp', 'count', 'uniques']) |
| return pd.DataFrame(data['views'], columns=['timestamp', 'count', 'uniques']) | |
| return pd.DataFrame(data['views'], columns=['timestamp', 'views', 'unique_visitors']) |
github_analytics/github/traffic.py
Outdated
| - `uniques`: Number of unique cloners. | ||
| """ | ||
| data = self._get_traffic_data(repo, 'clones') | ||
| return pd.DataFrame(data['clones'], columns=['timestamp', 'count', 'uniques']) |
| return pd.DataFrame(data['clones'], columns=['timestamp', 'count', 'uniques']) | |
| return pd.DataFrame(data['clones'], columns=['timestamp', 'clones', 'unique_cloners']) |
github_analytics/github/traffic.py
Outdated
| 'Traffic Referrers': self.get_traffic_referrers(repo), | ||
| 'Traffic Paths': self.get_traffic_paths(repo), | ||
| 'Traffic Views': self.get_traffic_views(repo), | ||
| 'Traffic Clones': self.get_traffic_clones(repo), |
| 'Traffic Referrers': self.get_traffic_referrers(repo), | |
| 'Traffic Paths': self.get_traffic_paths(repo), | |
| 'Traffic Views': self.get_traffic_views(repo), | |
| 'Traffic Clones': self.get_traffic_clones(repo), | |
| 'Traffic Referring Sites': self.get_traffic_referrers(repo), | |
| 'Traffic Popular Content': self.get_traffic_paths(repo), | |
| 'Traffic Visitors': self.get_traffic_views(repo), | |
| 'Traffic Git Clones': self.get_traffic_clones(repo), |
584c7ec to 22f04b4



Resolves #5
CU-86b40hf4k
Workflow: https://github.com/datacebo/github-analytics/actions/runs/13764032556
The workflow failed because the `token` doesn't have access to the repo data. I'm not sure if we have to add it to the organization or something (it is the [email protected] account).
Results are stored in here and organized in folders, one per sdv-dev repo. The filenames are timestamps. Here is an SDV example.
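The folder layout described above (one folder per repo, timestamped filenames) could be expressed roughly as follows. `build_output_path` is a hypothetical helper, not code from this PR; the timestamp format is the one used in `traffic.py`, and using the repo name without the owner as the folder name is an assumption.

```python
import datetime


def build_output_path(output_folder, repo, now=None):
    """Return ``<output_folder>/<repo name>/<timestamp>`` for a repo such as "sdv-dev/SDV"."""
    now = now or datetime.datetime.now()
    timestamp = now.strftime('%Y-%m-%d_%H-%M-%S')
    # Assumption: the folder is named after the repo, without the owner part.
    repo_name = repo.split('/')[-1]
    return f"{output_folder.rstrip('/')}/{repo_name}/{timestamp}"
```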