Commit d4d373c

add reader
1 parent 44e14e3 commit d4d373c

1 file changed: +1 -1 lines changed

datasets/readme.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ sh data_process.sh
 |[senti_clas](https://baidu-nlp.bj.bcebos.com/sentiment_classification-dataset-1.0.0.tar.gz)| [textcnn](../models/contentunderstanding/textcnn/senti_clas_reader.py)|Sentiment Classification (Senta for short) takes Chinese text containing subjective descriptions, automatically determines its sentiment polarity, and returns a corresponding confidence score. Sentiment is classified as positive or negative. Sentiment classification can help companies understand consumer habits, analyze trending topics, and monitor public opinion during crises, providing useful support for business decisions.|--|
 |[one_billion](http://www.statmt.org/lm-benchmark/)| [word2vec](../models/recall/word2vec/word2vec_reader.py) |A one-billion-word benchmark that provides standard training and testing for language modeling experiments.|[One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling](https://arxiv.org/abs/1312.3005)|
 |[MIND](https://paddlerec.bj.bcebos.com/datasets/MIND/bigdata.zip)| [naml](../models/rank/naml/NAMLDataReader.py) |MIND is short for MIcrosoft News Dataset. Its data comes from the behavior logs of Microsoft News users. The dataset contains 1,000,000 users and their interactions with 160,000 articles.|[Microsoft(2020)](https://msnews.github.io)|
-|[movielens_pinterest_NCF](https://paddlerec.bj.bcebos.com/ncf/Data.zip)| [NCF](../models/rcall/ncf/movielens_reader.py) |The movielens and pinterest datasets as preprocessed by the paper's original authors, [github](https://github.com/hexiangnan/neural_collaborative_filtering)|[Neural Collaborative Filtering](https://arxiv.org/pdf/1708.05031.pdf)|
+|[movielens_pinterest_NCF](https://paddlerec.bj.bcebos.com/ncf/Data.zip)| [NCF](../models/recall/ncf/movielens_reader.py) |The movielens and pinterest datasets as preprocessed by the paper's original authors, [github](https://github.com/hexiangnan/neural_collaborative_filtering)|[Neural Collaborative Filtering](https://arxiv.org/pdf/1708.05031.pdf)|
 |[Anime](https://paddlerec.bj.bcebos.com/datasets/Anime/archive.zip)| -- |This dataset contains preference data from 73,516 users on 12,294 anime. Each user can add anime to a list and give it a rating; the dataset is a compilation of those ratings.|[Kaggle](https://www.kaggle.com/CooperUnion/anime-recommendations-database)|
 |[LFM-1b](https://paddlerec.bj.bcebos.com/datasets/LFM_1b/LFM-1b.zip)| -- |This dataset contains more than one billion music listening records created by more than 120,000 Last.FM users. Each listening record is characterized by artist, album, and track name, and includes a timestamp.|[ICMR 2016](http://www.cp.jku.at/datasets/LFM-1b/)|
 |[LFM-1b UGP](https://paddlerec.bj.bcebos.com/datasets/LFM_1b_UGP/LFM-1b_UGP.zip)| -- |User genre profiles for the LFM-1b dataset, provided as a supplementary extension to LFM-1b.|[ISM 2017](http://www.cp.jku.at/datasets/LFM-1b/)|
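
The one-word path change in this hunk (`rcall` to `recall`) is the kind of broken relative link that a small check could catch before commit. The following is a minimal sketch, not part of this repository: `relative_links`, `resolve`, `broken_links`, and the `known_files` set are hypothetical names introduced for illustration, and the file listing is assumed to come from somewhere like `git ls-files`.

```python
import posixpath
import re

# Matches markdown links of the form [text](target).
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def relative_links(markdown_line):
    """Return link targets that are relative paths (not http/https URLs)."""
    return [
        target
        for _text, target in LINK_RE.findall(markdown_line)
        if not target.startswith(("http://", "https://"))
    ]


def resolve(readme_dir, target):
    """Resolve a relative link against the README's directory (lexically)."""
    return posixpath.normpath(posixpath.join(readme_dir, target))


def broken_links(markdown_line, readme_dir, known_files):
    """List relative links whose resolved path is not in known_files."""
    return [
        target
        for target in relative_links(markdown_line)
        if resolve(readme_dir, target) not in known_files
    ]


# Hypothetical repository listing (assumption for this example only).
known_files = {"models/recall/ncf/movielens_reader.py"}

# Shortened versions of the removed and added rows from the hunk.
old_row = "|[NCF](../models/rcall/ncf/movielens_reader.py)|"
new_row = "|[NCF](../models/recall/ncf/movielens_reader.py)|"

print(broken_links(old_row, "datasets", known_files))
print(broken_links(new_row, "datasets", known_files))
```

Under these assumptions the removed row is reported as having one broken link and the added row as having none. The check is purely lexical, so it never touches the filesystem and can run against any file listing.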
