resh-edu

This repository contains scripts used to scrape and process data from resh.edu.ru. More information and the dataset can be found here.

Only the raw version of the dataset is available for now; a version prepared for causal language modeling is expected in a few days.