Commit 8ce45be

Merge pull request #591 from yinhaofeng/datasets
add datasets
2 parents f59c749 + d4d373c commit 8ce45be

16 files changed

+69 -25 lines changed

datasets/Adult/run.sh

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+wget https://paddlerec.bj.bcebos.com/datasets/Adult/adult.data

datasets/Avazu/readme.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+You can download the Avazu dataset from https://www.kaggle.com/c/avazu-ctr-prediction/data by clicking 'Download All'.

datasets/Epinions/run.sh

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+wget https://paddlerec.bj.bcebos.com/datasets/Epinions/soc-Epinions1.txt.gz
+gzip -d soc-Epinions1.txt.gz

datasets/Gowalla/run.sh

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+wget https://paddlerec.bj.bcebos.com/datasets/Gowalla/loc-gowalla_totalCheckins.txt.gz
+gzip -d loc-gowalla_totalCheckins.txt.gz
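
Because `gzip -d` removes the `.gz` archive once it has been decompressed, running either of the two scripts above a second time would download the archive again and then hit an already-existing output file. A minimal sketch of one way to make the step repeatable follows; the helper name `fetch_and_gunzip_once` is hypothetical and not part of this commit, and it assumes the decompressed file name is simply the archive name without `.gz`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): download and decompress a
# .gz file only when the decompressed file is not already present.
fetch_and_gunzip_once() {
    url="$1"
    archive="${url##*/}"      # file name = everything after the last slash
    target="${archive%.gz}"   # assumed name of the decompressed file
    if [ -e "$target" ]; then
        echo "skip: $target already exists"
        return 0
    fi
    # Decompress only if the download succeeded.
    wget "$url" && gzip -d "$archive"
}
```

It could be called as `fetch_and_gunzip_once https://paddlerec.bj.bcebos.com/datasets/Epinions/soc-Epinions1.txt.gz` in place of the two-line script.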

datasets/Imagenet/readme.md

Lines changed: 2 additions & 0 deletions
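@@ -0,0 +1,2 @@
+The ImageNet project is a large visual database for visual object recognition research; more than 14 million of its images have been annotated by hand. ImageNet-1k is a subset of the ImageNet dataset that contains 1000 classes. Its training set contains 1281167 images and its validation set contains 50000 images. Since 2010, the ImageNet project has held an annual image classification competition, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), which uses ImageNet-1k as its dataset. ImageNet-1k has become one of the most important datasets in computer vision: it has driven progress across the field, and the initialization models for many downstream computer vision tasks are trained on it.
+If you need the ImageNet-1k dataset, you can obtain it through [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.3/docs/zh_CN/data_preparation/classification_dataset.md#ImageNet1k).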

datasets/Imagenet/run.sh

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_train.tar
+tar -xvf ILSVRC2012_img_train.tar
+wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar
+tar -xvf ILSVRC2012_img_val.tar
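
The ILSVRC2012 training tarball is commonly distributed as an archive of per-class tar files (for example `n01440764.tar`), so extracting it once leaves nested archives rather than images. A minimal sketch of the follow-up step is shown here, under the assumption that the tarball was extracted into a dedicated directory; the function name `unpack_class_tars` and the `train` directory are hypothetical and not part of this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): unpack every per-class
# archive found in a directory into a sub-directory named after the class.
# Assumes "$1" contains only the per-class .tar files, not the outer tarball.
unpack_class_tars() {
    dir="$1"
    for archive in "$dir"/*.tar; do
        # Skip the unexpanded pattern when the directory has no .tar files.
        [ -e "$archive" ] || continue
        class_dir="${archive%.tar}"
        mkdir -p "$class_dir"
        # Remove the per-class archive only after it extracted cleanly.
        tar -xf "$archive" -C "$class_dir" && rm -f "$archive"
    done
}
```

A possible usage is `mkdir -p train && tar -xf ILSVRC2012_img_train.tar -C train && unpack_class_tars train`. Extracting into a separate directory matters here: if the function were pointed at the directory that still holds `ILSVRC2012_img_train.tar` itself, that file would also match `*.tar`.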

datasets/JD/run.sh

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+wget https://paddlerec.bj.bcebos.com/datasets/JD/jdata_tfrecord.zip
+unzip jdata_tfrecord.zip

datasets/LastFM/run.sh

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+wget https://paddlerec.bj.bcebos.com/datasets/LastFM/lastfm-2k.zip
+unzip lastfm-2k.zip

datasets/Phishing_Websites/run.sh

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+wget https://paddlerec.bj.bcebos.com/datasets/Phishing_Websites/train.arff

datasets/Pinterest/run.sh

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+wget https://paddlerec.bj.bcebos.com/datasets/Pinterest/pinterest-20.train.rating
