Commit 6a61a87

2 parents 2146d78 + 00d34cf commit 6a61a87

2 files changed

+82
-17
lines changed

projects/analyze-twitch-data-with-sqlite/analyze-twitch-data-with-sqlite.mdx

Lines changed: 82 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -37,8 +37,8 @@ In this project tutorial, we will explore a dataset about top Twitch streamers a
3737

3838
The two CSV files we have are:
3939

40-
- [Top 1000 Twitch Streamers (2024)](https://www.kaggle.com/datasets/hibrahimag1/top-1000-twitch-streamers-data-may-2024) featuring streamers like [KaiCenat](https://www.twitch.tv/kaicenat) and [Jynxzi](https://www.twitch.tv/jynxzi).
41-
- [Top 1000 Twitch Streamers (2021)](https://www.kaggle.com/datasets/aayushmishra1512/twitchdata) featuring [xQc](https://www.twitch.tv/xqc), [summit1g](https://www.twitch.tv/summit1g), [TimTheTatman](https://www.twitch.tv/timthetatman), and [pokimane](https://www.twitch.tv/pokimane).
40+
- [Top 1000 Twitch Streamers (2024)](https://www.kaggle.com/datasets/hibrahimag1/top-1000-twitch-streamers-data-may-2024) featuring streamers like [KaiCenat](https://www.twitch.tv/kaicenat), [Jynxzi](https://www.twitch.tv/jynxzi).
41+
- [Top 1000 Twitch Streamers (2021)](https://www.kaggle.com/datasets/aayushmishra1512/twitchdata) featuring streamers like [xQc](https://www.twitch.tv/xqc), [summit1g](https://www.twitch.tv/summit1g), [TimTheTatman](https://www.twitch.tv/timthetatman), [pokimane](https://www.twitch.tv/pokimane).
4242

4343
Do you recognize any of the names?
4444

@@ -63,18 +63,68 @@ Download one of the CSV files and open it up to make sure the CSV file is workin
6363

6464
A **CSV** file stands for **C**omma-**S**eparated **V**alues file. It’s a simple way to store spreadsheet data in a plain text format. Each line in the file represents a row, and the columns in that row are separated by commas.
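
As a minimal sketch of how a program reads that row-and-comma structure, the snippet below parses a two-line sample with Python's standard `csv` module. The sample is abbreviated from the 2021 file purely for illustration, and `skipinitialspace=True` is only needed because this sample puts a space after each comma.

```python
import csv
import io

# Abbreviated sample for illustration: a header line plus one data row.
sample = (
    "Channel, Watch time, Stream time\n"
    "xQcOW, 6196161750, 215250\n"
)

# skipinitialspace drops the blank that follows each comma in this sample.
reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
rows = list(reader)

# Each line becomes one row; each comma-separated value becomes one column.
header, first_row = rows[0], rows[1]
print(header)
print(first_row)
```

Note that every value comes back as text; turning `"6196161750"` into a number is a separate step, which is what the column types in a database table take care of.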
6565

66-
So the Top 1000 Twitch Streams data looks like:
66+
So the **streamers2021.csv** data looks like:
67+
68+
![streamers2021 screenshot](https://raw.githubusercontent.com/codedex-io/projects/refs/heads/main/projects/analyze-twitch-data-with-sqlite/streamers2021.png)
69+
70+
But behind the scenes, the plain text is just:
71+
72+
```output
73+
Channel, Watch time, Stream time, Peak viewers, Average viewers, Followers, Followers gained, Views gained, Partnered, Mature, Language
74+
xQcOW, 6196161750, 215250, 222720, 27716, 3246298, 1734810, 93036735, True, False, English
75+
summit1g, 6091677300, 211845, 310998, 25610, 5310163, 1370184, 89705964, True, False, English
76+
Gaules, 5644590915, 515280, 387315, 109761767635, 1023779102611607, True
77+
ESL_CSGO, 3970318140, 517740, 300575, 7714,3944850, 703986106546942, False
78+
Tfue, 3671000070, 123660, 285644, 29602, 8938903, 206842478998587, False
79+
... and so on
80+
```
81+
82+
The column names are:
83+
84+
- Channel
85+
- Watch time
86+
- Stream time
87+
- Peak viewers
88+
- Average viewers
89+
- Followers
90+
- Followers gained
91+
- Views gained
92+
- Partnered
93+
- Mature
94+
- Language
95+
96+
And the **streamers2024.csv** data looks like:
6797

6898
```output
69-
Channel, Watch Time, Stream time, peak viewers, average viewers, Followers, Mature
70-
xQcOW, 6196161750,215250,222720, 277163246298, 173481093036735, False
71-
summit1g, 6091677300,211845,310998, 2561053101, 63137018489705964, False
72-
Gaules, 5644590915,515280,387315,109761767635, 1023779102611607, True
73-
ESL_CSGO, 3970318140,517740, 300575,7714,3944850, 703986106546942, False
74-
Tfue, 3671000070,123660,285644, 29602, 8938903, 206842478998587, False
99+
RANK, NAME, LANGUAGE, TYPE, MOST_STREAMED_GAME, 2ND_MOST_STREAMED_GAME, AVERAGE_STREAM_DURATION, FOLLOWERS_GAINED_PER_STREAM, AVG_VIEWERS_PER_STREAM, AVG_GAMES_PER_STREAM, TOTAL_TIME_STREAMED, TOTAL_FOLLOWERS, TOTAL_VIEWS, TOTAL_GAMES_STREAMED, ACTIVE_DAYS_PER_WEEK, MOST_ACTIVE_DAY, DAY_WITH_MOST_FOLLOWERS_GAINED
100+
1, kaicenat, English, personality, Just Chatting, I'm Only Sleeping, 7.6, 18405, 15852, 2.3, 4698, 10600000, 9150000, 194, 3.6, Friday, Saturday
101+
2, jynxzi, English, personality, Tom Clancy's Rainbow Six Siege, NBA 2K20, 5.4, 3386, 1145, 1.2, 8407, 5760000, 1950000, 54, 5.6, Tuesday, Sunday
102+
3, caedrel, English, personality, League of Legends, I'm Only Sleeping, 6.3, 689, 12331, 1.3, 6728, 797000, 14200000, 111, 2.8, Thursday, Sunday
103+
4, caseoh_, English, personality, NBA 2K23, Just Chatting, 4.6, 7185, 0, 3.6, 2554, 4220000, 53, 385, 6.2, Friday, Monday
104+
5, ibai, Spanish, personality, Just Chatting, League of Legends, 4.1, 8289, 190714, 1.5, 6865, 15600000, 359000000, 149, 4.3, Wednesday, Saturday
75105
... and so on
76106
```
77107

108+
The column names are:
109+
110+
- RANK
111+
- NAME
112+
- LANGUAGE
113+
- TYPE
114+
- MOST_STREAMED_GAME
115+
- 2ND_MOST_STREAMED_GAME
116+
- AVERAGE_STREAM_DURATION
117+
- FOLLOWERS_GAINED_PER_STREAM
118+
- AVG_VIEWERS_PER_STREAM
119+
- AVG_GAMES_PER_STREAM
120+
- TOTAL_TIME_STREAMED
121+
- TOTAL_FOLLOWERS
122+
- TOTAL_VIEWS
123+
- TOTAL_GAMES_STREAMED
124+
- ACTIVE_DAYS_PER_WEEK
125+
- MOST_ACTIVE_DAY
126+
- DAY_WITH_MOST_FOLLOWERS_GAINED
127+
78128
There are 1,001 rows in each of the datasets because there is 1 row for the headings and 1,000 rows for the streamers!
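
The row arithmetic can be checked with a short sketch. It builds a stand-in CSV in memory rather than reading the real download, so the column names and the `streamer_N` values are placeholders, not data from either file.

```python
import csv
import io

# Build a stand-in CSV in memory: 1 header row plus 1,000 placeholder streamers.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Channel", "Followers"])
for i in range(1, 1001):
    writer.writerow([f"streamer_{i}", i])  # hypothetical values, not real data

# Rewind and read everything back, the way a real file would be read.
buffer.seek(0)
all_rows = list(csv.reader(buffer))

total_rows = len(all_rows)
data_rows = total_rows - 1  # subtract the header row

print(total_rows, data_rows)
```

To run the same check on a downloaded file, the in-memory buffer could be swapped for `open(...)` on that file's path.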
79129

80130
Let’s move back into our terminal and start the database.
@@ -89,17 +139,32 @@ sqlite twitch.db
89139

90140
Inside the SQLite prompt, create a table called `streamers`:
91141

92-
```
93-
CREATE TABLE streams (
94-
id TEXT PRIMARY KEY,
95-
user_name TEXT,
96-
game_name TEXT,
97-
title TEXT,
98-
viewer_count INTEGER,
99-
started_at TEXT
142+
```sql
143+
CREATE TABLE streamers (
144+
channel TEXT PRIMARY KEY,
145+
watch_time INTEGER,
146+
stream_time INTEGER,
147+
peak_viewers INTEGER,
148+
followers INTEGER,
149+
followers_gained INTEGER,
150+
partnered TEXT,
151+
mature TEXT,
152+
language TEXT
100153
);
101154
```
102155

156+
- Channel
157+
- Watch time
158+
- Stream time
159+
- Peak viewers
160+
- Average viewers
161+
- Followers
162+
- Followers gained
163+
- Views gained
164+
- Partnered
165+
- Mature
166+
- Language
167+
103168
The column names and data types here have to match the columns in the CSV file, or else there will be an error later when we import it.
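
To illustrate why the table and the CSV have to line up, here is a hedged sketch using Python's built-in `sqlite3` module with an in-memory database. It assumes the table gets one column per CSV column, so it includes `average_viewers` and `views_gained` to cover all eleven names listed above. The two complete rows are copied from the 2021 excerpt; the short row at the end is deliberately incomplete to trigger the mismatch.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Assumption: one table column per CSV column (eleven in streamers2021.csv).
conn.execute("""
    CREATE TABLE streamers (
        channel TEXT PRIMARY KEY,
        watch_time INTEGER,
        stream_time INTEGER,
        peak_viewers INTEGER,
        average_viewers INTEGER,
        followers INTEGER,
        followers_gained INTEGER,
        views_gained INTEGER,
        partnered TEXT,
        mature TEXT,
        language TEXT
    )
""")

# Two complete rows taken from the 2021 excerpt.
rows = [
    ("xQcOW", 6196161750, 215250, 222720, 27716, 3246298,
     1734810, 93036735, "True", "False", "English"),
    ("summit1g", 6091677300, 211845, 310998, 25610, 5310163,
     1370184, 89705964, "True", "False", "English"),
]
conn.executemany(
    "INSERT INTO streamers VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
)

# A row with too few values does not match the table and is rejected.
try:
    conn.execute("INSERT INTO streamers VALUES (?, ?)", ("Tfue", 3671000070))
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

row_count = conn.execute("SELECT COUNT(*) FROM streamers").fetchone()[0]
print(row_count)
print(mismatch_error)
```

Only the two complete rows end up in the table; the short row raises an error instead of being stored with missing values.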
104169

105170
### Import CSV into SQLite