Provide a script to cleanly download and normalize text #3

@ruohoruotsi

Rather than the current system, in which each sub-corpus lives in its own folder with its own code, create a top-level downloads.sh that can re-assemble the sub-corpora.
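
As a rough starting point, a top-level script could walk the sub-corpus folders and run each one's own download step. This is only a sketch under assumptions: the function name `assemble_corpora` and the per-folder script name `download.sh` are hypothetical, since the issue does not say how the existing per-folder code is named or invoked.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a top-level downloads.sh.
set -euo pipefail

# Run every <root>/<sub-corpus>/download.sh in turn.
# ASSUMPTION: each sub-corpus folder ships a script named download.sh;
# the real per-folder entry points may be named differently.
assemble_corpora() {
  local root="${1:-.}"
  local script dir
  for script in "$root"/*/download.sh; do
    # With no matches the glob stays literal, so skip anything that is not a file.
    [ -f "$script" ] || continue
    dir="$(dirname "$script")"
    echo "==> $(basename "$dir")"
    # Run in a subshell so the cd does not leak into the next iteration.
    (cd "$dir" && bash ./download.sh)
  done
}

# Example invocation from the repository root:
#   assemble_corpora .
```

Folders without a `download.sh` are skipped silently, so sub-corpora could be migrated to this layout one at a time.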

Separately, have the downloaded and pre-processed sub-corpora ready to be referenced from the ADR and NMT repos, e.g. as submodules.
