[Feature]: Add checkbox to workflow editor to fetch robots.txt and respect disallows #2935

@tw4l

Description

Related to webrecorder/browsertrix-crawler#631

Once this feature is released in the crawler (PR: webrecorder/browsertrix-crawler#888), we'll want to add the corresponding option to the Browsertrix UI.

We may want to set a configurable minimum crawler version for this feature, similar to the existing min_seed_file_crawler_image and min_autoclick_crawler_image settings.

Requirements

  • Checkbox option in the workflow editor that adds the --robots flag to the crawler args
  • Configurable minimum crawler version check in backend
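
A minimal sketch of how the two requirements above might fit together on the backend, assuming the minimum version is configured as an image reference in the same style as min_seed_file_crawler_image. The setting name min_robots_crawler_image, the helper names, and the version numbers in the usage note are all hypothetical and not taken from the Browsertrix codebase; only the --robots flag comes from this issue. Treating an unparseable tag (such as "latest") as unsupported is also an assumption.

```python
import re
from typing import List, Optional, Tuple


def parse_image_version(image: str) -> Optional[Tuple[int, ...]]:
    """Extract a numeric version tuple from an image reference's tag.

    Returns None when the reference has no tag or the tag does not
    start with a dotted numeric version (for example "latest").
    """
    _, sep, tag = image.rpartition(":")
    # No colon at all, or the colon belongs to a registry port ("host:5000/name").
    if not sep or "/" in tag:
        return None
    match = re.match(r"v?(\d+(?:\.\d+)*)", tag)
    if not match:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def supports_robots(crawler_image: str, min_robots_crawler_image: str) -> bool:
    """Hypothetical check: is crawler_image at or above the configured minimum?"""
    current = parse_image_version(crawler_image)
    minimum = parse_image_version(min_robots_crawler_image)
    if current is None or minimum is None:
        # Assumption: be conservative when either version cannot be parsed.
        return False
    # Pad the shorter tuple with zeros so "1.9" compares equal to "1.9.0".
    width = max(len(current), len(minimum))
    current += (0,) * (width - len(current))
    minimum += (0,) * (width - len(minimum))
    return current >= minimum


def build_crawler_args(
    base_args: List[str],
    use_robots: bool,
    crawler_image: str,
    min_robots_crawler_image: str,
) -> List[str]:
    """Append --robots only when the checkbox is set and the crawler is new enough."""
    args = list(base_args)
    if use_robots and supports_robots(crawler_image, min_robots_crawler_image):
        args.append("--robots")
    return args
```

With a placeholder minimum of "crawler:1.9.0", calling build_crawler_args(["--workers", "2"], True, "crawler:1.10.0", "crawler:1.9.0") would add --robots, while the same call with "crawler:1.8.5" would leave the args unchanged. Comparing integer tuples rather than raw strings avoids ordering "1.10.0" below "1.9.0".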

Context

No response

Metadata

Assignees

Labels

  • back end: Requires back end dev work
  • front end: Requires front end dev work
  • ui/ux: This issue requires UI/UX work

Projects

Status

Todo

Milestone

No milestone
