|
2 | 2 | "cells": [ |
3 | 3 | { |
4 | 4 | "cell_type": "markdown", |
5 | | - "id": "f60c52ce", |
| 5 | + "id": "5d5afc66", |
6 | 6 | "metadata": { |
7 | 7 | "tags": [] |
8 | 8 | }, |
|
27 | 27 | }, |
28 | 28 | { |
29 | 29 | "cell_type": "markdown", |
30 | | - "id": "7ec7724a", |
| 30 | + "id": "d20eee58", |
31 | 31 | "metadata": {}, |
32 | 32 | "source": [ |
33 | 33 | "## <span style='color:#ff5f27'> 📝 Imports" |
|
36 | 36 | { |
37 | 37 | "cell_type": "code", |
38 | 38 | "execution_count": null, |
39 | | - "id": "42efa933", |
| 39 | + "id": "802d6425", |
40 | 40 | "metadata": {}, |
41 | 41 | "outputs": [], |
42 | 42 | "source": [ |
|
46 | 46 | { |
47 | 47 | "cell_type": "code", |
48 | 48 | "execution_count": null, |
49 | | - "id": "a4120a95", |
| 49 | + "id": "bd55a691", |
50 | 50 | "metadata": {}, |
51 | 51 | "outputs": [], |
52 | 52 | "source": [ |
|
64 | 64 | }, |
65 | 65 | { |
66 | 66 | "cell_type": "markdown", |
67 | | - "id": "010286f1", |
| 67 | + "id": "879a539d", |
68 | 68 | "metadata": {}, |
69 | 69 | "source": [ |
70 | 70 | "## <span style=\"color:#ff5f27;\"> 💽 Loading the Data </span>\n", |
|
84 | 84 | { |
85 | 85 | "cell_type": "code", |
86 | 86 | "execution_count": null, |
87 | | - "id": "491a37b4", |
| 87 | + "id": "586d4111", |
88 | 88 | "metadata": {}, |
89 | 89 | "outputs": [], |
90 | 90 | "source": [ |
|
100 | 100 | { |
101 | 101 | "cell_type": "code", |
102 | 102 | "execution_count": null, |
103 | | - "id": "9ed8c4f0", |
| 103 | + "id": "0cdb9c3d", |
104 | 104 | "metadata": {}, |
105 | 105 | "outputs": [], |
106 | 106 | "source": [ |
|
118 | 118 | { |
119 | 119 | "cell_type": "code", |
120 | 120 | "execution_count": null, |
121 | | - "id": "2543c511", |
| 121 | + "id": "5a6fb4f4", |
122 | 122 | "metadata": {}, |
123 | 123 | "outputs": [], |
124 | 124 | "source": [ |
|
135 | 135 | }, |
136 | 136 | { |
137 | 137 | "cell_type": "markdown", |
138 | | - "id": "717f9102", |
| 138 | + "id": "fbb301ce", |
139 | 139 | "metadata": {}, |
140 | 140 | "source": [ |
141 | 141 | "---" |
142 | 142 | ] |
143 | 143 | }, |
144 | 144 | { |
145 | 145 | "cell_type": "markdown", |
146 | | - "id": "16357302", |
| 146 | + "id": "bf6f9299", |
147 | 147 | "metadata": {}, |
148 | 148 | "source": [ |
149 | 149 | "## <span style=\"color:#ff5f27;\"> 🛠️ Feature Engineering </span>\n", |
|
158 | 158 | { |
159 | 159 | "cell_type": "code", |
160 | 160 | "execution_count": null, |
161 | | - "id": "0b919bf8", |
| 161 | + "id": "d1af85e4", |
162 | 162 | "metadata": {}, |
163 | 163 | "outputs": [], |
164 | 164 | "source": [ |
|
181 | 181 | { |
182 | 182 | "cell_type": "code", |
183 | 183 | "execution_count": null, |
184 | | - "id": "5d6fd9ae", |
| 184 | + "id": "4956cb52", |
185 | 185 | "metadata": {}, |
186 | 186 | "outputs": [], |
187 | 187 | "source": [ |
|
191 | 191 | }, |
192 | 192 | { |
193 | 193 | "cell_type": "markdown", |
194 | | - "id": "92820fab", |
| 194 | + "id": "bc4e505f", |
195 | 195 | "metadata": {}, |
196 | 196 | "source": [ |
197 | 197 | "Next, you will create features that aggregate data from multiple time steps for each credit card.\n", |
|
203 | 203 | { |
204 | 204 | "cell_type": "code", |
205 | 205 | "execution_count": null, |
206 | | - "id": "47116f13", |
| 206 | + "id": "ee689954", |
207 | 207 | "metadata": {}, |
208 | 208 | "outputs": [], |
209 | 209 | "source": [ |
|
223 | 223 | }, |
224 | 224 | { |
225 | 225 | "cell_type": "markdown", |
226 | | - "id": "0b3abcfc", |
| 226 | + "id": "ed7039bb", |
227 | 227 | "metadata": {}, |
228 | 228 | "source": [ |
229 | 229 | "Next, let's compute windowed aggregates. Here you will use 4-hour windows, but feel free to experiment with different window lengths by setting `window_len` below to a value of your choice." |
|
232 | 232 | { |
233 | 233 | "cell_type": "code", |
234 | 234 | "execution_count": null, |
235 | | - "id": "f9bf9dd7", |
| 235 | + "id": "791708c4", |
236 | 236 | "metadata": {}, |
237 | 237 | "outputs": [], |
238 | 238 | "source": [ |
|
248 | 248 | }, |
249 | 249 | { |
250 | 250 | "cell_type": "markdown", |
251 | | - "id": "b3021fa4", |
| 251 | + "id": "391e478f", |
252 | 252 | "metadata": {}, |
253 | 253 | "source": [ |
254 | 254 | "### <span style=\"color:#ff5f27;\">⚙️ Convert datetime objects to Unix epoch in milliseconds </span>" |
|
257 | 257 | { |
258 | 258 | "cell_type": "code", |
259 | 259 | "execution_count": null, |
260 | | - "id": "837bafd5", |
| 260 | + "id": "bc6bcae3", |
261 | 261 | "metadata": {}, |
262 | 262 | "outputs": [], |
263 | 263 | "source": [ |
|
270 | 270 | }, |
271 | 271 | { |
272 | 272 | "cell_type": "markdown", |
273 | | - "id": "a3df26a9", |
| 273 | + "id": "bc3fec23", |
274 | 274 | "metadata": {}, |
275 | 275 | "source": [ |
276 | 276 | "## <span style=\"color:#ff5f27;\">👮🏻♂️ Great Expectations </span>" |
|
279 | 279 | { |
280 | 280 | "cell_type": "code", |
281 | 281 | "execution_count": null, |
282 | | - "id": "c9391579", |
| 282 | + "id": "cbc3ca3c", |
283 | 283 | "metadata": {}, |
284 | 284 | "outputs": [], |
285 | 285 | "source": [ |
|
299 | 299 | { |
300 | 300 | "cell_type": "code", |
301 | 301 | "execution_count": null, |
302 | | - "id": "8969ff27", |
| 302 | + "id": "b3891fe5", |
303 | 303 | "metadata": {}, |
304 | 304 | "outputs": [], |
305 | 305 | "source": [ |
|
340 | 340 | }, |
341 | 341 | { |
342 | 342 | "cell_type": "markdown", |
343 | | - "id": "adf03efd", |
| 343 | + "id": "da2f174f", |
344 | 344 | "metadata": {}, |
345 | 345 | "source": [ |
346 | 346 | "---" |
347 | 347 | ] |
348 | 348 | }, |
349 | 349 | { |
350 | 350 | "cell_type": "markdown", |
351 | | - "id": "7c069f5a", |
| 351 | + "id": "c48bbff9", |
352 | 352 | "metadata": {}, |
353 | 353 | "source": [ |
354 | 354 | "## <span style=\"color:#ff5f27;\"> 📡 Connecting to Hopsworks Feature Store </span>" |
355 | 355 | ] |
356 | 356 | }, |
357 | 357 | { |
358 | 358 | "cell_type": "markdown", |
359 | | - "id": "ac32437d", |
| 359 | + "id": "4d68f207", |
360 | 360 | "metadata": { |
361 | 361 | "tags": [] |
362 | 362 | }, |
|
373 | 373 | { |
374 | 374 | "cell_type": "code", |
375 | 375 | "execution_count": null, |
376 | | - "id": "287491fa", |
| 376 | + "id": "4c259c35", |
377 | 377 | "metadata": {}, |
378 | 378 | "outputs": [], |
379 | 379 | "source": [ |
|
386 | 386 | }, |
387 | 387 | { |
388 | 388 | "cell_type": "markdown", |
389 | | - "id": "23b61089", |
| 389 | + "id": "0f80ccc5", |
390 | 390 | "metadata": {}, |
391 | 391 | "source": [ |
392 | 392 | "To create a feature group, you need to give it a name and specify a primary key. It is also good practice to provide a description of the contents of the feature group and a version number; if the version is not defined, it will automatically be incremented to `1`." |
|
395 | 395 | { |
396 | 396 | "cell_type": "code", |
397 | 397 | "execution_count": null, |
398 | | - "id": "58353322", |
| 398 | + "id": "c78c614a", |
399 | 399 | "metadata": {}, |
400 | 400 | "outputs": [], |
401 | 401 | "source": [ |
|
412 | 412 | }, |
413 | 413 | { |
414 | 414 | "cell_type": "markdown", |
415 | | - "id": "832333fa", |
| 415 | + "id": "45fcf5c3", |
416 | 416 | "metadata": {}, |
417 | 417 | "source": [ |
418 | 418 | "A full list of arguments can be found in the [documentation](https://docs.hopsworks.ai/feature-store-api/latest/generated/api/feature_store_api/#create_feature_group).\n", |
|
423 | 423 | { |
424 | 424 | "cell_type": "code", |
425 | 425 | "execution_count": null, |
426 | | - "id": "7f82e4a2", |
| 426 | + "id": "e97b5aab", |
427 | 427 | "metadata": {}, |
428 | 428 | "outputs": [], |
429 | 429 | "source": [ |
|
435 | 435 | { |
436 | 436 | "cell_type": "code", |
437 | 437 | "execution_count": null, |
438 | | - "id": "0c6832a9", |
| 438 | + "id": "2f6cc0c0", |
439 | 439 | "metadata": {}, |
440 | 440 | "outputs": [], |
441 | 441 | "source": [ |
|
462 | 462 | }, |
463 | 463 | { |
464 | 464 | "cell_type": "markdown", |
465 | | - "id": "36f7e30a", |
| 465 | + "id": "a57846b4", |
466 | 466 | "metadata": {}, |
467 | 467 | "source": [ |
468 | 468 | "When the feature group is created, you will be shown a URL that links directly to it; there you can explore some aspects of your newly created feature group.\n", |
|
472 | 472 | }, |
473 | 473 | { |
474 | 474 | "cell_type": "markdown", |
475 | | - "id": "1250c3ce", |
| 475 | + "id": "777e6d4e", |
476 | 476 | "metadata": {}, |
477 | 477 | "source": [ |
478 | 478 | "You can move on and do the same for the feature group with the windowed aggregations." |
|
481 | 481 | { |
482 | 482 | "cell_type": "code", |
483 | 483 | "execution_count": null, |
484 | | - "id": "6b8ef9f6", |
| 484 | + "id": "35ca5bbb", |
485 | 485 | "metadata": {}, |
486 | 486 | "outputs": [], |
487 | 487 | "source": [ |
|
498 | 498 | { |
499 | 499 | "cell_type": "code", |
500 | 500 | "execution_count": null, |
501 | | - "id": "1970657b", |
| 501 | + "id": "e16aa93e", |
502 | 502 | "metadata": {}, |
503 | 503 | "outputs": [], |
504 | 504 | "source": [ |
|
510 | 510 | { |
511 | 511 | "cell_type": "code", |
512 | 512 | "execution_count": null, |
513 | | - "id": "68b0c84f", |
| 513 | + "id": "2d80f5e2", |
514 | 514 | "metadata": {}, |
515 | 515 | "outputs": [], |
516 | 516 | "source": [ |
|
530 | 530 | }, |
531 | 531 | { |
532 | 532 | "cell_type": "markdown", |
533 | | - "id": "e22dce87", |
| 533 | + "id": "eb2254b9", |
534 | 534 | "metadata": {}, |
535 | 535 | "source": [ |
536 | 536 | "Both feature groups are now accessible and searchable in the UI.\n", |
|
540 | 540 | }, |
541 | 541 | { |
542 | 542 | "cell_type": "markdown", |
543 | | - "id": "4f8b5d8a", |
| 543 | + "id": "c6e783a2", |
544 | 544 | "metadata": {}, |
545 | 545 | "source": [ |
546 | 546 | "## <span style=\"color:#ff5f27;\">⏭️ **Next:** Part 02: Training Pipeline\n", |
|
555 | 555 | "hash": "e1ddeae6eefc765c17da80d38ea59b893ab18c0c0904077a035ef84cfe367f83" |
556 | 556 | }, |
557 | 557 | "kernelspec": { |
558 | | - "display_name": "Python 3", |
| 558 | + "display_name": "Python 3 (ipykernel)", |
559 | 559 | "language": "python", |
560 | 560 | "name": "python3" |
561 | 561 | }, |
|
569 | 569 | "name": "python", |
570 | 570 | "nbconvert_exporter": "python", |
571 | 571 | "pygments_lexer": "ipython3", |
572 | | - "version": "3.10.11" |
| 572 | + "version": "3.10.13" |
573 | 573 | } |
574 | 574 | }, |
575 | 575 | "nbformat": 4, |
|