@@ -37,8 +37,8 @@ In this project tutorial, we will explore a dataset about top Twitch streamers a
3737
3838The two CSV files we have are:
3939
40- - [ Top 1000 Twitch Streamers (2024)] ( https://www.kaggle.com/datasets/hibrahimag1/top-1000-twitch-streamers-data-may-2024 ) featuring streamers like [ KaiCenat] ( https://www.twitch.tv/kaicenat ) and [ Jynxzi] ( https://www.twitch.tv/jynxzi ) .
41- - [ Top 1000 Twitch Streamers (2021)] ( https://www.kaggle.com/datasets/aayushmishra1512/twitchdata ) featuring [ xQc] ( https://www.twitch.tv/xqc ) , [ summit1g] ( https://www.twitch.tv/summit1g ) , [ TimTheTatman] ( https://www.twitch.tv/timthetatman ) , and [ pokimane] ( https://www.twitch.tv/pokimane ) .
40+ - [Top 1000 Twitch Streamers (2024)](https://www.kaggle.com/datasets/hibrahimag1/top-1000-twitch-streamers-data-may-2024) featuring streamers like [KaiCenat](https://www.twitch.tv/kaicenat) and [Jynxzi](https://www.twitch.tv/jynxzi).
41+ - [Top 1000 Twitch Streamers (2021)](https://www.kaggle.com/datasets/aayushmishra1512/twitchdata) featuring streamers like [xQc](https://www.twitch.tv/xqc), [summit1g](https://www.twitch.tv/summit1g), [TimTheTatman](https://www.twitch.tv/timthetatman), and [pokimane](https://www.twitch.tv/pokimane).
4242
4343Do you recognize any of the names?
4444
@@ -63,18 +63,68 @@ Download one of the CSV files and open it up to make sure the CSV file is workin
6363
6464**CSV** stands for **C**omma-**S**eparated **V**alues. It’s a simple way to store spreadsheet data in a plain text format. Each line in the file represents a row, and the columns in that row are separated by commas.
6565
66- So the Top 1000 Twitch Streams data looks like:
66+ So the **streamers2021.csv** data looks like:
67+
68+ ![streamers2021 screenshot](https://raw.githubusercontent.com/codedex-io/projects/refs/heads/main/projects/analyze-twitch-data-with-sqlite/streamers2021.png)
69+
70+ But behind the scenes, the plain text is just:
71+
72+ ``` output
73+ Channel, Watch time, Stream time, Peak viewers, Average viewers, Followers, Followers gained, Views gained, Partnered, Mature, Language
74+ xQcOW, 6196161750, 215250, 222720, 27716, 3246298, 1734810, 93036735, True, False, English
75+ summit1g, 6091677300, 211845, 310998, 25610, 5310163, 1370184, 89705964, True, False, English
76+ Gaules, 5644590915, 515280, 387315, 10976, 1767635, 1023779, 102611607, True, True, Portuguese
77+ ESL_CSGO, 3970318140, 517740, 300575, 7714, 3944850, 703986, 106546942, True, False, English
78+ Tfue, 3671000070, 123660, 285644, 29602, 8938903, 2068424, 78998587, True, False, English
79+ ... and so on
80+ ```
81+
82+ The column names are:
83+
84+ - Channel
85+ - Watch time
86+ - Stream time
87+ - Peak viewers
88+ - Average viewers
89+ - Followers
90+ - Followers gained
91+ - Views gained
92+ - Partnered
93+ - Mature
94+ - Language
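
To see how the plain text maps onto those column names, here is a minimal sketch that assumes you have Python installed (the tutorial itself does not require it). It reads the header and first row shown above with Python's built-in `csv` module. The `sample` string is only an illustration copied from the preview; in practice you would open the downloaded file instead.

```python
import csv
import io

# Illustrative sample: the header and first data row from the preview above,
# keeping the space that the preview shows after each comma.
sample = """Channel, Watch time, Stream time, Peak viewers, Average viewers, Followers, Followers gained, Views gained, Partnered, Mature, Language
xQcOW, 6196161750, 215250, 222720, 27716, 3246298, 1734810, 93036735, True, False, English
"""

# skipinitialspace=True drops the blank that follows each comma.
reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
header = next(reader)      # first line: the column names
first_row = next(reader)   # second line: the first streamer

print(len(header), "columns")
for name, value in zip(header, first_row):
    print(f"{name}: {value}")
```

Every value comes back as text, which is why the data types only start to matter once the rows go into a database table.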
95+
96+ And the **streamers2024.csv** data looks like:
6797
6898``` output
69- Channel, Watch Time, Stream time, peak viewers, average viewers, Followers, Mature
70- xQcOW, 6196161750,215250,222720, 277163246298, 173481093036735, False
71- summit1g, 6091677300,211845,310998, 2561053101, 63137018489705964, False
72- Gaules, 5644590915,515280,387315,109761767635, 1023779102611607, True
73- ESL_CSGO, 3970318140,517740, 300575,7714,3944850, 703986106546942, False
74- Tfue, 3671000070,123660,285644, 29602, 8938903, 206842478998587, False
99+ RANK, NAME, LANGUAGE, TYPE, MOST_STREAMED_GAME, 2ND_MOST_STREAMED_GAME, AVERAGE_STREAM_DURATION, FOLLOWERS_GAINED_PER_STREAM, AVG_VIEWERS_PER_STREAM, AVG_GAMES_PER_STREAM, TOTAL_TIME_STREAMED, TOTAL_FOLLOWERS, TOTAL_VIEWS, TOTAL_GAMES_STREAMED, ACTIVE_DAYS_PER_WEEK, MOST_ACTIVE_DAY, DAY_WITH_MOST_FOLLOWERS_GAINED
100+ 1, kaicenat, English, personality, Just Chatting, I'm Only Sleeping, 7.6, 18405, 15852, 2.3, 4698, 10600000, 9150000, 194, 3.6, Friday, Saturday
101+ 2, jynxzi, English, personality, Tom Clancy's Rainbow Six Siege, NBA 2K20, 5.4, 3386, 1145, 1.2, 8407, 5760000, 1950000, 54, 5.6, Tuesday, Sunday
102+ 3, caedrel, English, personality, League of Legends, I'm Only Sleeping, 6.3, 689, 12331, 1.3, 6728, 797000, 14200000, 111, 2.8, Thursday, Sunday
103+ 4, caseoh_, English, personality, NBA 2K23, Just Chatting, 4.6, 7185, 0, 3.6, 2554, 4220000, 53, 385, 6.2, Friday, Monday
104+ 5, ibai, Spanish, personality, Just Chatting, League of Legends, 4.1, 8289, 190714, 1.5, 6865, 15600000, 359000000, 149, 4.3, Wednesday, Saturday
75105... and so on
76106```
77107
108+ The column names are:
109+
110+ - RANK
111+ - NAME
112+ - LANGUAGE
113+ - TYPE
114+ - MOST_STREAMED_GAME
115+ - 2ND_MOST_STREAMED_GAME
116+ - AVERAGE_STREAM_DURATION
117+ - FOLLOWERS_GAINED_PER_STREAM
118+ - AVG_VIEWERS_PER_STREAM
119+ - AVG_GAMES_PER_STREAM
120+ - TOTAL_TIME_STREAMED
121+ - TOTAL_FOLLOWERS
122+ - TOTAL_VIEWS
123+ - TOTAL_GAMES_STREAMED
124+ - ACTIVE_DAYS_PER_WEEK
125+ - MOST_ACTIVE_DAY
126+ - DAY_WITH_MOST_FOLLOWERS_GAINED
127+
78128There are 1,001 rows in each of the datasets because there is 1 row for the headings and 1,000 rows for the streamers!
79129
80130Let’s move back into our terminal and start the database.
@@ -89,17 +139,32 @@ sqlite twitch.db
89139
90140Inside the SQLite prompt, create a table called `streamers`:
91141
92- ```
93- CREATE TABLE streams (
94- id TEXT PRIMARY KEY,
95- user_name TEXT,
96- game_name TEXT,
97- title TEXT,
98- viewer_count INTEGER,
99- started_at TEXT
142+ ``` sql
143+ CREATE TABLE streamers (
144+   channel TEXT PRIMARY KEY,
145+   watch_time INTEGER,
146+   stream_time INTEGER,
147+   peak_viewers INTEGER,
148+   followers INTEGER,
149+   followers_gained INTEGER,
150+   partnered TEXT,
151+   mature TEXT,
152+   language TEXT
100153);
101154```
102155
156+ - Channel
157+ - Watch time
158+ - Stream time
159+ - Peak viewers
160+ - Average viewers
161+ - Followers
162+ - Followers gained
163+ - Views gained
164+ - Partnered
165+ - Mature
166+ - Language
167+
103168These names will have to match the CSV columns as well as the data type here, or else there will be an error later.
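
The `CREATE TABLE` statement above defines nine columns, while the 2021 CSV header lists eleven: `Average viewers` and `Views gained` have no counterpart in the table. The sketch below shows one way to check for that kind of mismatch before importing. It assumes Python is available and that the table should have one column per CSV header, named in lowercase with underscores; both are assumptions for illustration, not something the tutorial prescribes. It uses an in-memory database, so it does not touch `twitch.db`.

```python
import sqlite3

# Header names as listed for streamers2021.csv.
csv_header = [
    "Channel", "Watch time", "Stream time", "Peak viewers",
    "Average viewers", "Followers", "Followers gained",
    "Views gained", "Partnered", "Mature", "Language",
]

# Assumed naming rule: lowercase, with spaces replaced by underscores.
expected = [name.lower().replace(" ", "_") for name in csv_header]

conn = sqlite3.connect(":memory:")  # throwaway database for the check
conn.execute("""
CREATE TABLE streamers (
  channel TEXT PRIMARY KEY,
  watch_time INTEGER,
  stream_time INTEGER,
  peak_viewers INTEGER,
  average_viewers INTEGER,
  followers INTEGER,
  followers_gained INTEGER,
  views_gained INTEGER,
  partnered TEXT,
  mature TEXT,
  language TEXT
);
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
table_columns = [row[1] for row in conn.execute("PRAGMA table_info(streamers)")]
missing = [name for name in expected if name not in table_columns]
conn.close()

print("Table columns:", table_columns)
print("Missing from table:", missing)
```

If `missing` is not empty, the table has fewer columns than the CSV has fields, which is the kind of mismatch to fix before moving on to the import.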
104169
105170### Import CSV into SQLite