pR0Ps commented Oct 1, 2020

Also adds the ability to list activities using web scraping instead of the API. The activities are returned as `ScrapedActivity` objects, which are mostly compatible with the normal `Activity` objects returned by the API-based activity listing function.

Fixes #4

NOTE: stravalib moving to Pydantic for its models is going to break a LOT of this. Will need some work.

pR0Ps force-pushed the feature/standalone-scraping branch from b2f0204 to 923a1c3 on October 1, 2020 06:03
pR0Ps force-pushed the feature/standalone-scraping branch from 923a1c3 to 13737f7 on January 11, 2022 05:30
pR0Ps force-pushed the feature/standalone-scraping branch 2 times, most recently from 9a82176 to d8ed33a, on January 31, 2022 19:55
pR0Ps marked this pull request as draft on February 3, 2022 06:08
This should be done by the library consumer if it's needed.

It's not going to be perfect, but the idea is that for the most basic
cases it should be a pretty close replacement. The goal is to keep the
work needed to support both API and scraping-based clients to a minimum.
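
As a rough illustration of that goal, consumer code that only touches attributes shared by both object types can stay agnostic about which client produced them. The two dataclasses below are hypothetical stand-ins, not the real `stravalib` or scraped models, and the attribute names and units are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ApiActivity:
    """Hypothetical stand-in for an API-backed activity model."""
    id: int
    name: str
    distance: float  # metres (assumed unit for this sketch)


@dataclass
class ScrapedActivityStub:
    """Hypothetical stand-in for a scraped activity; shared fields only."""
    id: int
    name: str
    distance: float  # metres (assumed unit for this sketch)


def total_distance_km(activities):
    """Sum distances using only attributes that both stand-ins expose."""
    return sum(a.distance for a in activities) / 1000


mixed = [
    ApiActivity(1, "Morning Ride", 12500.0),
    ScrapedActivityStub(2, "Evening Run", 5000.0),
]
print(total_distance_km(mixed))
```

Anything beyond the shared attributes would still need per-type handling, which is the "not going to be perfect" part.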

To support this, the WebClient now uses delegation instead of
inheritance to add scraper-based functionality. This enables the
`ScrapingClient` class to use the same function names without
automatically overriding the `stravalib.Client` functions when used
through the `WebClient` class.
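
A minimal sketch of the difference, using hypothetical stand-in classes rather than the real `stravalib.Client`, `ScrapingClient`, or `WebClient`. The method name `scrape_activities` is an assumption made up for illustration; the actual layout of `WebClient` may differ.

```python
class ApiClient:
    """Hypothetical stand-in for the API-based client."""

    def get_activities(self, limit=None):
        return ["api-activity"]


class ScrapingClientStub:
    """Hypothetical scraper that deliberately reuses the same method name."""

    def get_activities(self, limit=None):
        return ["scraped-activity"]


class InheritingWebClient(ScrapingClientStub, ApiClient):
    """Inheritance: the scraper's get_activities shadows the API one via the MRO."""


class DelegatingWebClient(ApiClient):
    """Delegation: the scraper is held as an attribute, so nothing is overridden."""

    def __init__(self):
        super().__init__()
        self._scraper = ScrapingClientStub()

    def scrape_activities(self, limit=None):
        # Hypothetical method name: forwards explicitly to the scraper.
        return self._scraper.get_activities(limit=limit)


inherited = InheritingWebClient()
print(inherited.get_activities())   # the scraper version shadows the API version

client = DelegatingWebClient()
print(client.get_activities())      # the API version is left untouched
print(client.scrape_activities())   # the scraper is reached explicitly
```

With delegation, the scraper class remains usable on its own under the familiar method names, while the combined client decides which scraper methods to expose and under what names.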

The default used to be to download the JSON blob. It now requests the
GPX format instead, since GPX is a more standardized format for an
activity.

Now accepts (but ignores) the parameters that the `stravalib` version accepts.

 - Make pagination actually work (forgot to increment page number)
 - Handle stopping based on the `before` param
 - Properly handle workout types
 - Move models to a separate file
 - Add more detailed scraping of activity details
 - Add more detailed scraping of bike data
 - Tweak LazyLoaded
 - Add scraping for challenges
 - Tweak gear access
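
The first two items in the list above (incrementing the page number and stopping on `before`) could look roughly like the sketch below. It is not the code from this branch: `fetch_page` and the activity dictionaries are hypothetical, and it assumes, purely for illustration, that pages arrive oldest-first. With newest-first ordering the same check would skip items instead of stopping.

```python
from datetime import datetime


def fetch_page(page, per_page=2):
    """Hypothetical page fetcher returning fake activities, oldest first."""
    data = [{"id": i, "start_date": datetime(2022, 1, i)} for i in range(1, 8)]
    start = (page - 1) * per_page
    return data[start:start + per_page]


def iter_activities(before=None, fetch=fetch_page):
    """Yield activities page by page, stopping once `before` is reached."""
    page = 1
    while True:
        batch = fetch(page)
        if not batch:
            return  # ran out of pages
        for activity in batch:
            if before is not None and activity["start_date"] >= before:
                # Oldest-first assumption: everything from here on is too new.
                return
            yield activity
        page += 1  # without this increment the first page repeats forever


ids = [a["id"] for a in iter_activities(before=datetime(2022, 1, 4))]
print(ids)
```

Writing it as a generator lets a caller stop consuming early without fetching further pages.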

BeautifulSoup v4.9.0 changed how `.text` works for `<script>` tags (i.e.
it no longer returns their contents), which broke parsing.

See https://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/revision/564
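
One way to read script contents without relying on `.text` is the tag's `.string` attribute. This is a sketch of that approach, not necessarily the fix applied in this branch. The HTML snippet, the `id` value, and the helper name `script_contents` are made up for the example.

```python
import json

from bs4 import BeautifulSoup

# Made-up page fragment with JSON embedded in a <script> tag.
HTML = """
<html><body>
<script id="data" type="application/json">{"name": "Morning Ride", "distance": 12.5}</script>
</body></html>
"""


def script_contents(soup, script_id):
    """Return the raw contents of a <script> tag, or None if unavailable."""
    tag = soup.find("script", id=script_id)
    if tag is None or tag.string is None:
        return None
    # .string returns the tag's single child string, script content included.
    return str(tag.string)


soup = BeautifulSoup(HTML, "html.parser")
payload = json.loads(script_contents(soup, "data"))
print(payload["name"])
```

Note that `.string` is `None` when a tag has more than one child, so the helper returns `None` in that case instead of raising.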
pR0Ps force-pushed the feature/standalone-scraping branch from d8ed33a to 3685464 on September 16, 2022 05:32
Successfully merging this pull request may close these issues.

Allow using the library without API access
