UCI Machine Learning Repository
- Adults Data Set:
adult.data.bz2 (bz2 compressed), adult.names (the description)
- Wine Quality Data Set:
winequality-red.csv and winequality-white.csv
- Phase 3 Release:
ALL.chr22.phase3_1000.vcf.bz2 (first 1000 variants), integrated_call_samples_v2.20130502.ALL.ped, 1000_gen_populations.txt
Others:
- BOM: About Air Temperature Data:
bom_data_Note.txt, nsw_temp.csv
- Enron Spam Dataset:
ham.zip and spam.zip (zip compressed documents from ham and spam folders in enron1.tar.gz)
- Project Gutenberg: "The Prince" by Machiavelli:
prince_by_machiavelli.txt
- The Internet Classics Archive: "The Art of War" by Sun Tzu:
artwar.1b.txt
- Twitter:
tweets.json (a sample of tweets captured with the public API)