OTA engine robustness #1174

@michielbdejong

Initial tests with #1164 show a number of problems:

  • Even with git commit-graph, the runs are slow. This may be because the repo starts out as a shallow clone of 5 commits but can then build up 6000 commits in a single run. So it's probably necessary to avoid wildcards; I will look into that today.
  • The script sometimes trips over itself: it errors with 'git in use' or 'expected only one file to be changed' and then fails.
  • If git is left with uncommitted changes, the next run fails too.
  • The script builds up a lot of memory usage (several gigabytes).
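
As a rough illustration of the second and third points, here is a minimal sketch of a guard a run could take before touching a repo. It is not how the crawler works today; the names `single_run_lock`, `is_dirty` and `discard_leftovers` and the lock file path are hypothetical. It uses an exclusive `flock` so that two runs cannot overlap, and plain `git status --porcelain`, `git reset --hard` and `git clean -fd` to detect and drop leftovers from a crashed run. `fcntl` is Unix-only.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def single_run_lock(lock_path):
    """Hold an exclusive, non-blocking lock for the duration of one run."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR)
    try:
        # Raises BlockingIOError right away if another run holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def is_dirty(repo_dir):
    """Return True if the working tree has uncommitted or untracked changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return bool(result.stdout.strip())


def discard_leftovers(repo_dir):
    """Throw away whatever a previous, crashed run left behind."""
    subprocess.run(["git", "reset", "--hard", "HEAD"], cwd=repo_dir, check=True)
    subprocess.run(["git", "clean", "-fd"], cwd=repo_dir, check=True)
```

A run would then wrap its work in `with single_run_lock("/path/to/run.lock"):` and call `discard_leftovers(repo_dir)` first whenever `is_dirty(repo_dir)` is true. Whether discarding is acceptable, or whether leftovers should be inspected instead, is a judgment call this sketch does not settle.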

To reduce the effect of a git repo getting into the wrong state, it would help to shard the snapshots and versions repos -> #1172
Apart from sharding the repos, I also want to rethink the scheduling through a cronjob -> #1173
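
To make the sharding idea a bit more concrete, here is a minimal sketch of one way services could be assigned to shard repos. The actual scheme is up to #1172 and is not described here; `shard_for`, `shard_repo_name`, the repo naming pattern and the shard count are all assumptions for illustration. It uses a stable hash (`hashlib.sha256`) rather than Python's built-in `hash()`, which is randomized per process, so a service always lands in the same shard across runs.

```python
import hashlib


def shard_for(service_name, shard_count):
    """Map a service name to a shard index in [0, shard_count), stable across runs."""
    digest = hashlib.sha256(service_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shard_repo_name(kind, service_name, shard_count):
    """Build a hypothetical repo name such as 'snapshots-03' for a service."""
    return f"{kind}-{shard_for(service_name, shard_count):02d}"
```

With a layout like this, a repo that ends up in a bad state only blocks the services in its own shard, and each shard accumulates a fraction of the commits per run.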

I think these two measures combined will make the crawler a lot more stable, and also allow us to run it on the same server as the rest of ToS;DR, so it will not only be more reliable, but also cheaper for us to host.
