This guide is part of a series that creates a real-time data pipeline with Astra and Decodable. For context and prerequisites, start xref:streaming-learning:use-cases-architectures:real-time-data-pipeline/index.adoc[here].
12
+
====
10
13
11
14
== Creating message topics to capture the stream of click data
12
+
====
10
13
11
-
Now we have all the pieces of our data processing pipeline in place. It’s time to start the connection and pipelines up and input some test data.
14
+
Now that we have all the pieces of our data processing pipeline in place, it’s time to start the connection and pipelines up and input some test data.
12
15
13
16
== Starting the processing
14
17
15
-
. Navigate to the “Connections” area and click the 3 dots at the right for each connection. Click the “Start” option on all 3 connections.
18
+
. Navigate to the “Connections” area and click the three dots at the right for each connection.
19
+
Click the “Start” option on all three connections.
16
20
+
17
21
image:decodable-data-pipeline/03/image9.png[]
18
22
19
-
. It might take a minute or so but each connection should refresh with a state of “Running”.
23
+
. Be patient.
24
+
It might take a minute or so, but each connection should refresh with a state of “Running”.
20
25
+
21
26
image:decodable-data-pipeline/03/image1.png[]
22
27
+
23
-
TIP: If one of the connections has an issue startup up (like a wrong setting or expired token) you can click on that connection to get more information.
28
+
TIP: If one of the connections has an issue starting up (like an incorrect setting or expired token), click on that connection for more information.
24
29
25
-
. Navigate to the “Pipelines” area and use the same menu on each pipeline to start. Same as the connections, they might take a minute or so to get going. Grab a coffee while you wait - you’ve earned it.
30
+
. Navigate to the “Pipelines” area and use the same three-dot menu on each pipeline to start.
31
+
As with the connections, they might take a minute or so to get going.
32
+
Grab a coffee while you wait - you’ve earned it.
26
33
+
27
34
image:decodable-data-pipeline/03/image3.png[]
28
35
29
-
Let’s make sure we have all the pieces in order…
36
+
Before ingesting data, let’s make sure we have all the pieces in order...
@@ -52,82 +59,146 @@ Let’s make sure we have all the pieces in order…
52
59
]
53
60
----
54
61
55
-
. Click “Upload” to simulate data being posted to the endpoint. You should receive a confirmation that data has been received.
62
+
. Click “Upload” to simulate data being posted to the endpoint. You will receive a confirmation that data has been received.
56
63
57
-
NOTE: NO, this was not a big moment with cheers and balloons. The celebration is at the end of the next area.
64
+
No, this was not the big moment with cheers and balloons - the celebration is at the end of the next area.
58
65
59
66
== Following the flow
60
67
61
68
For this first record of data, let’s look at each step along the way and confirm processing is working.
62
69
63
-
. After the data was ingested, the “Webstore-Raw-Clicks-Normalize-Pipeline” should have received it. You can confirm this by navigating the “Pipelines” area and clicking that pipeline to see its metrics. Notice in the “Input Metrics” area 1 record has been received.
70
+
. After the data was ingested, the “Webstore-Raw-Clicks-Normalize-Pipeline” received it.
71
+
You can confirm this by inspecting the “Webstore-Raw-Clicks-Normalize-Pipeline” pipeline metrics.
72
+
The “Input Metrics” area reports that one record has been received, and the “Output Metrics” area reports that one record has been written.
73
+
This confirms that the data passed successfully through this pipeline.
64
74
+
65
75
image:decodable-data-pipeline/03/image2.png[]
66
76
67
-
. Also notice in the “Output Metrics” 1 record has been written. This confirms that the data passed successfully through this pipeline.
68
-
69
-
. Next we can go to the “Connectors” area and click the “Astra-Streaming-All-Webclicks-Connector”. In the “Input Metrics” we see that 1 record has been received.
77
+
. In the “Connections” area, click the “Astra-Streaming-All-Webclicks-Connector”.
78
+
In “Input Metrics”, we see that 1 record has been received.
70
79
+
71
80
image:decodable-data-pipeline/03/image4.png[]
72
81
73
-
. Now we can go to our Astra Streaming tenant “webstore-clicks” and navigate to the “Namespace and Topics” area. Expand the “Production” namespace and click the “all-clicks” topic. Notice that “Data In” has 1 message and “Data Out” has 1 message. That means the topic took the data in and a consumer acknowledged receipt of the message.
82
+
. Return to your Astra Streaming tenant “webstore-clicks” and navigate to the “Namespace and Topics” area.
83
+
Expand the “production” namespace and select the “all-clicks” topic.
84
+
Confirm that “Data In” has 1 message and “Data Out” has 1 message. This means the topic took the data in and a consumer acknowledged receipt of the message.
74
85
+
75
86
image:decodable-data-pipeline/03/image6.png[]
76
87
77
-
. On to the “Sinks” tab in Astra and click the “all-clicks” sink. In “Instance Stats” you see “Reads” has a value of 1 and “Writes” has a value of 1. This confirms the Sink consumed a message from the topic and wrote the data to the store.
88
+
. In the “Sinks” tab in Astra, select the “all-clicks” sink. In “Instance Stats” you see “Reads” has a value of 1 and “Writes” has a value of 1. This means the sink consumed a message from the topic and wrote the data to the store.
78
89
+
79
90
image:decodable-data-pipeline/03/image5.png[]
80
91
81
-
. Let’s look at the final data in Astra DB. Navigate to the Astra home and click the “webstore-clicks” Serverless Database. Choose the “CQL Console” tab and copy/paste the following command in the terminal.
92
+
. Finally, let’s look at the stored data in your Astra database. Navigate to the Astra home and click the “webstore-clicks” Serverless Database. Choose the “CQL Console” tab and copy/paste the following command in the terminal.
82
93
+
83
-
[source,sql]
94
+
[tabs]
95
+
====
96
+
CQL::
97
+
+
98
+
--
99
+
[source,sql,subs="attributes+"]
84
100
----
85
101
select * from click_data.all_clicks;
86
102
----
103
+
--
104
+
105
+
Result::
87
106
+
88
-
Your should see a single record output.
89
-
+
90
-
image:decodable-data-pipeline/03/image8.png[]
107
+
--
108
+
[source,sql]
109
+
----
110
+
token@cqlsh> EXPAND ON; //this cleans up the output
This confirms that the data was successfully written to the database.
91
130
92
-
{emoji-tada}{emoji-tada} Queue the loud cheers and high-fives! Our pipeline ingested raw web click data, normalized it, and persisted the parsed data to the database! Woot woot!!
131
+
{emoji-tada}{emoji-tada} Cue the cheers and high-fives! Our pipeline ingested raw web click data, normalized it, and persisted the parsed data to the database! Woot woot!!
93
132
94
133
== Follow the flow of the product clicks data
95
134
96
-
Similar to how you followed the above flow follow this flow to confirm the filtered messages were stored.
135
+
Similar to how you followed the above flow of raw click data, follow this flow to confirm the filtered messages were stored.
97
136
98
137
. Navigate to your Decodable pipeline named “Webstore-Product-Clicks-Pipeline”.
99
138
.. The “Input Metrics” should be 1 record and the “Output Metrics” should be 1 record.
100
139
101
140
. Navigate to your Decodable connection named “Astra-Streaming-Product-Webclicks-Connection”.
102
141
.. The “Input Metrics” should be 1 record.
103
142
104
-
. Head over to your Astra tenant to the production/product-clicks topic.
143
+
. Navigate to your Astra tenant and check the production/product-clicks topic.
105
144
.. There should be 1 message in “Data In” and 1 message in “Data Out”.
106
145
107
-
. Finally, to your Astra database CQL Console.
108
-
.. Query the product clicks table
146
+
. Finally, navigate to your Astra database CQL Console.
147
+
.. Query the product_clicks table:
109
148
+
110
-
[source,sql]
149
+
[tabs]
150
+
====
151
+
CQL::
152
+
+
153
+
--
154
+
[source,sql,subs="attributes+"]
111
155
----
112
156
select * from click_data.product_clicks;
113
157
----
114
-
+
115
-
image:decodable-data-pipeline/03/image7.png[]
158
+
--
116
159
117
-
{emoji-rocket}{emoji-rocket} Yesssss! The first web click data we entered happened to be a product click. So the data was filtered in the pipeline and processed into the correct table!
{emoji-rocket}{emoji-rocket} Yesssss! The first web click data we entered happened to be a product click, so the data was filtered in the pipeline and processed into the correct table!
118
174
119
175
== Example real-time site data
120
176
121
-
Let’s see what this can do! To put a load on the pipeline we’ll need a way to continuously post data to our endpoint. Below are a few examples. Use the download button below to download a zip of a static(html) site ecommerce catalog, that silently posts click data to an endpoint. The site is a copy of https://www.blazemeter.com/[BlazeMeter^]’s{external-link-icon} https://www.demoblaze.com/[Demoblaze site^]{external-link-icon}.
177
+
Let’s see what this can do! To put a load on the pipeline, we’ll need a way to continuously post data to our endpoint. Below are a few examples.
122
178
123
-
You’ll need 2 pieces of information the Endpoint URL and an authorization token. Learn more about retrieving both of those in https://docs.decodable.co/docs/connector-reference-rest#authentication[Decodable documentation^]{external-link-icon}.
124
-
125
-
Once you extract the zip, open the folder in your text editor of IDE of choice and look in the script.js file. There are 2 placeholders for the data retrieved above.
179
+
. Use the download button below to download a zip of a static HTML e-commerce catalog that silently posts click data to an endpoint.
180
+
The site is a copy of https://www.blazemeter.com/[BlazeMeter^]’s{external-link-icon} https://www.demoblaze.com/[Demoblaze site^]{external-link-icon}.
Open the phones.html file in your browser (yes, as a local file) and begin clicking on products. Each click should be a new post to your Decodable endpoint.
184
+
. Extract the zip, open the folder in your text editor or IDE of choice, and open the "script.js" file.
185
+
There are 2 placeholders for data you'll need to retrieve from Decodable: the Endpoint URL and an authorization token.
186
+
+
187
+
[source,javascript]
188
+
----
189
+
function post_click(url){
190
+
let decodable_token = "access token: <value retrieved from access_token in .decodable/auth>";
191
+
let endpoint_url = "https://ddieruf.api.decodable.co/v1alpha2/connections/4f003544/events";
192
+
----
193
+
+
194
+
Learn more about retrieving the Endpoint URL and auth token in the https://docs.decodable.co/docs/connector-reference-rest#authentication[Decodable documentation^]{external-link-icon}.
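For orientation, here is a rough sketch of what a click handler like the one in "script.js" might do with those two values. It is not the code shipped in the zip: the helper names `buildClickRequest` and `postClick`, the `{ events: [...] }` body wrapper, the Bearer authorization header, and the click field names are all assumptions for illustration, so check them against the Decodable documentation and your own stream schema before relying on them.

```javascript
// Hypothetical helper: builds the HTTP request for a single click event.
// Assumptions (not confirmed by this guide): the endpoint accepts a JSON body
// shaped like { "events": [ ... ] } and a "Bearer <token>" Authorization header.
function buildClickRequest(endpointUrl, token, click) {
  return {
    url: endpointUrl,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${token}`,
      },
      // Wrap the single click in an array so several events could be batched later.
      body: JSON.stringify({ events: [click] }),
    },
  };
}

// Hypothetical sender: posts one click and surfaces HTTP errors.
// Uses the standard fetch API (available in browsers and in Node.js 18+).
async function postClick(endpointUrl, token, click) {
  const { url, options } = buildClickRequest(endpointUrl, token, click);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Endpoint returned HTTP ${response.status}`);
  }
  return response.status;
}
```

Keeping the request-building step separate from the network call makes it easy to inspect the headers and payload (for example, with `console.log(buildClickRequest(...))`) before sending anything to your endpoint. The click object you pass in should match the fields your raw clicks stream expects.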