Open
Description
We can use, e.g., the Wikipedia monolingual corpora at https://linguatools.org/tools/corpora/wikipedia-monolingual-corpora/
Steps:
- For each XML file in the Wikipedia corpus:
- Extract its text: read_xml("C:/Users/Jan/Desktop/sample-monolingual.xml") %>% xml_text()
- Append that text to a single file
- Build the model
- Save the model
Looks pretty easy.
- Probably run this on a beefier machine
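The extraction and append steps could be sketched roughly as below. This is only a minimal sketch, assuming the xml2 package is installed and that all corpus files sit in one directory. The helper names extract_text and build_corpus, and the xml_dir and out_file parameters, are hypothetical and not part of any existing code. Building and saving the model is left out, since the issue does not say which model is meant.

```r
library(xml2)

# Hypothetical helper: return all text content of one XML file as a single string.
extract_text <- function(path) {
  xml_text(read_xml(path))
}

# Hypothetical helper: append the text of every .xml file in `xml_dir` to `out_file`.
# Returns (invisibly) the number of files processed.
build_corpus <- function(xml_dir, out_file) {
  xml_files <- list.files(xml_dir, pattern = "\\.xml$", full.names = TRUE)
  con <- file(out_file, open = "a")  # "a" = append, so reruns add to the file
  on.exit(close(con))
  for (path in xml_files) {
    # useBytes = TRUE writes the UTF-8 bytes unchanged instead of re-encoding them
    writeLines(extract_text(path), con, useBytes = TRUE)
  }
  invisible(length(xml_files))
}

# Example call (both paths are placeholders):
# build_corpus("path/to/xml/files", "corpus.txt")
```

Note that read_xml parses each file fully in memory, so large corpus files may need a machine with plenty of RAM. Because the output file is opened in append mode, running the function twice on the same output file duplicates the text.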