Skip to content

Archiving error loop with Travis::Logs::Services::ArchiveLog::VerificationFailed #143

@meatballhat

Description

@meatballhat

In the past few months, there has been a noticeable increase in the occurrence of Travis::Logs::Services::ArchiveLog::VerificationFailed, which is raised during log archiving when comparing the database log content with the S3 log content. This should happen rarely, if ever, as it is an indication that archiving was triggered prior to job completion. There is a tight retry loop inside the ArchiveLog service that I believe is not allowing for enough time to pass before raising an error and causing the job to go to sidekiq retry handling. Ideally, the sidekiq retry handling should be enough to ensure that the logs are eventually correctly archived, in which case we should consider:

  • stop sending these errors to sentry
  • send all sidekiq retries to sentry instead (?)
  • send one event to sentry after being sidekiq retried n number of times
  • something else?

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions