Commit e23f1fe

guides: web analytics guide - overview
1 parent 778df72 commit e23f1fe

5 files changed: 35 additions & 0 deletions

guides/web-analytics/content/1_overview.md

Lines changed: 27 additions & 0 deletions
@@ -2,3 +2,30 @@
 order: 1
 title: "Overview"
 ---
+
+Building your own analytics engine, like the one behind Google Analytics, sounds like
+a sophisticated engineering problem, and it truly is. In the past, shipping such a
+piece of software would have required years of engineering time. But the data
+landscape has changed, and we now have many tools that each solve a different part
+of this problem extremely well: data collection, storage, aggregation, and querying.
+By breaking the problem into smaller pieces and solving them one by one with
+existing open-source tools, we will be able to build our own web analytics
+engine.
+
+If you’re familiar with Google Analytics (GA), you probably already know that every web page tracked by GA contains a GA tracking code. It loads an async script that assigns a tracking cookie to the user if one isn’t set yet. It also sends an XHR for every user interaction, such as a page load. These requests are then processed, and the raw event data is stored and scheduled for aggregation. Depending on the total volume of incoming requests, the data may also be sampled.
+
+Even though this is only a high-level overview of how Google Analytics works, it’s enough to reproduce most of its functionality.
+
+You can check out the online demo of the application we are going to build here, and the complete source code is available on GitHub.
+
+## Architecture overview
+
+Below you can see the architecture of the application we are going to build.
+We'll use Snowplow for data collection, Athena as the main data warehouse, MySQL to store pre-aggregations, and Cube.js as the aggregation and querying engine. The frontend will be built with React, Material UI, and Recharts. Although the diagram below shows some AWS services, they can be partially or fully replaced with open-source alternatives: Kafka, MinIO, and PrestoDB instead of Kinesis, S3, and Athena, respectively.
+
+![](https://raw.githubusercontent.com/cube-js/cube.js/master/examples/web-analytics/web-analytics-schema.png)
+
+We'll start with data collection and gradually build the whole application
+including the frontend. If you have any questions while going through this guide, please feel free to join this Slack community and post your question there.
+
+Happy Hacking! 💻
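
As a reading aid for the flow described in `1_overview.md` (assign a tracking cookie if one isn't set, emit an event per interaction, optionally sample, then aggregate), here is a minimal TypeScript sketch. It is not part of this commit and is not how GA, Snowplow, or Cube.js are actually implemented. Every name in it (`TrackedEvent`, `CookieJar`, the `_uid` cookie name, `ensureUserId`, `buildEvent`, `isSampled`, `pageViewsPerDay`) is hypothetical, and the per-user hash sampling is just one assumed strategy, since the guide does not say how sampling is done.

```typescript
// Hypothetical event shape; real trackers send many more fields.
interface TrackedEvent {
  userId: string;    // value of the tracking cookie
  type: string;      // e.g. "page_view"
  url: string;
  timestamp: string; // ISO 8601
}

// A plain Map stands in for browser cookies so the sketch runs outside a browser.
type CookieJar = Map<string, string>;

const COOKIE_NAME = "_uid"; // assumed cookie name, for illustration only

// Return the existing tracking id, or assign a new one if the cookie is not set yet.
function ensureUserId(cookies: CookieJar, newId: () => string): string {
  const existing = cookies.get(COOKIE_NAME);
  if (existing !== undefined) return existing;
  const id = newId();
  cookies.set(COOKIE_NAME, id);
  return id;
}

// Build the payload a tracker would send to the collector for one interaction.
function buildEvent(
  cookies: CookieJar,
  newId: () => string,
  type: string,
  url: string,
  now: Date
): TrackedEvent {
  return {
    userId: ensureUserId(cookies, newId),
    type,
    url,
    timestamp: now.toISOString(),
  };
}

// Assumed sampling strategy: hash the user id so that a fixed fraction of users
// is kept, and every event of a kept user is retained.
function isSampled(userId: string, rate: number): boolean {
  let hash = 0;
  for (let i = 0; i < userId.length; i++) {
    hash = (hash * 31 + userId.charCodeAt(i)) >>> 0; // unsigned 32-bit rolling hash
  }
  return hash / 0x100000000 < rate;
}

// Pre-aggregate raw events into page-view counts per calendar day (UTC).
function pageViewsPerDay(events: TrackedEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    if (e.type !== "page_view") continue;
    const day = e.timestamp.slice(0, 10); // YYYY-MM-DD prefix of the ISO timestamp
    counts.set(day, (counts.get(day) ?? 0) + 1);
  }
  return counts;
}
```

In the architecture the guide describes, the first two functions would roughly correspond to what the tracking script does in the browser, while the aggregation step is the kind of work the guide assigns to Cube.js with pre-aggregations stored in MySQL.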

guides/web-analytics/content/2_data_collection_and_storage.md

Lines changed: 2 additions & 0 deletions
@@ -2,3 +2,5 @@
 order: 2
 title: "Data Collection and Storage"
 ---
+
+test

guides/web-analytics/content/3_data_schema.md

Lines changed: 2 additions & 0 deletions
@@ -2,3 +2,5 @@
 order: 3
 title: "Data Schema"
 ---
+
+test

guides/web-analytics/content/4_performance_optimization.md

Lines changed: 2 additions & 0 deletions
@@ -2,3 +2,5 @@
 order: 4
 title: "Performance Optimization"
 ---
+
+test

guides/web-analytics/content/5_frontend_customization.md

Lines changed: 2 additions & 0 deletions
@@ -2,3 +2,5 @@
 order: 5
 title: "Frontend Customization"
 ---
+
+test
