CI/CD-enabled, Snowflake-based data pipelines for US CBP imports and exports data, plus an end-user application that shows analytical dashboards on this supply chain dataset and an AI-enabled SQL query engine for ad hoc analysis.
- Aditya Kawale
- NUID 002766716
- Email [email protected]
- Nidhi Singh
- NUID 002925684
- Email [email protected]
- Uddhav Zambare
- NUID 002199488
- Email [email protected]
- Codelab Doc - link
- Demo Link - link
- Streamlit - link
- GitRepo Links for Part A
  - Nidhi - link
  - Aditya - link
  - Uddhav - link
Download the SnowSQL CLI from https://developers.snowflake.com/snowsql/
Pre-requisites:
- Anaconda Installation
- VS Code
- Snowflake VS Code extension
Create a conda environment by running
conda env create -f environment.yml
Activate the conda environment
conda activate snowflake-demo
Point your VS Code workspace at the Python interpreter of this newly created conda environment.
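To confirm VS Code picked up the right interpreter, a quick sanity check from any Python file or the integrated terminal (a minimal sketch; the exact env path depends on your Anaconda install):

```python
import sys

# The interpreter path should point inside the snowflake-demo conda
# environment, e.g. .../envs/snowflake-demo/bin/python
print(sys.executable)
print(sys.version_info[:2])
```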
Update the config file at ~/.snowsql/config by adding the following snippet. The [connections.example] section will already be present, so edit it in place; then add a new [connections.dev] section:
[connections.example]
accountname = <Your SF Account Name>
username = <Your SF Username>
password = <Your SF Password>
[connections.dev]
accountname = <Your SF Account Name>
username = <Your SF Username>
password = <Your SF Password>
rolename = ROLE_DAMG7245_TEAM7
warehousename = WH_DAMG7245_TEAM7
dbname = DB_DAMG7245_TEAM7
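The SnowSQL config is INI-formatted, so the [connections.dev] section can be read with Python's standard-library configparser if a script ever needs the same connection settings. A minimal sketch with hypothetical placeholder values standing in for the real file:

```python
import configparser

# Hypothetical contents mirroring the ~/.snowsql/config snippet above
SAMPLE = """
[connections.dev]
accountname = my_account
username = my_user
password = my_password
rolename = ROLE_DAMG7245_TEAM7
warehousename = WH_DAMG7245_TEAM7
dbname = DB_DAMG7245_TEAM7
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)  # in practice: config.read(Path.home() / ".snowsql/config")

dev = config["connections.dev"]
print(dev["rolename"])       # ROLE_DAMG7245_TEAM7
print(dev["warehousename"])  # WH_DAMG7245_TEAM7
```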
To test connectivity to Snowflake using the SnowSQL CLI, run the following command in your terminal:
snowsql -c example
If you have set up MFA for Snowflake, complete the authorization prompt to successfully log in to Snowflake via the CLI.
Run the 01_setup_snowflake.sql script to create the warehouse, database, and roles configured in the [connections.dev] section of ~/.snowsql/config.
To test connectivity to the newly created artifacts using the SnowSQL CLI, run the following command in your terminal:
snowsql -c dev
Use the Snowflake Marketplace listings below to 'Get' the free datasets into your Snowflake account. Note: while configuring each database in your Snowflake account, keep the default settings (database name, schema name, etc.).
Dataset 1: Trademo, US Imports Bill of Lading
Dataset 2: Resilinc, EventWatch AI
Dataset 3: Veridion, Supplier Intelligence: Product Catalogs
Run the script 02_setup_sfmarketplace_data.sql to set aliases for these Snowflake Marketplace datasets.
To create the views, execute the 03_imports_party_rel_v.py and 07_create_imports_events_view.py scripts. Let's execute them from the terminal. Go back to the terminal in VS Code, make sure the snowflake-demo conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps
python 03_imports_party_rel_v.py
python 07_create_imports_events_view.py
To test the UDF locally, execute the steps/04_price_per_manifest_udf/app.py
script. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your snowflake-demo conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/04_price_per_manifest_udf
python app.py 35
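The real handler lives in steps/04_price_per_manifest_udf/app.py; the function below is a hypothetical stand-in (name and signature assumed, not taken from the repo) illustrating the kind of pure-Python logic such a UDF might hold and the `python app.py 35` local-test pattern, where the CLI argument feeds the function before deployment:

```python
import sys

def price_per_manifest(total_price: float, manifest_count: int) -> float:
    """Hypothetical handler: average price per manifest line."""
    if manifest_count <= 0:
        raise ValueError("manifest_count must be positive")
    return total_price / manifest_count

if __name__ == "__main__":
    # Mirrors `python app.py 35`: treat the CLI argument as a test input
    value = float(sys.argv[1]) if len(sys.argv) > 1 else 35.0
    print(price_per_manifest(value, 5))
```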
To deploy your UDF to Snowflake we will use the SnowCLI tool. The SnowCLI tool will do all the heavy lifting of packaging up your application, copying it to a Snowflake stage, and creating the object in Snowflake. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/04_price_per_manifest_udf
snow function create
To test the UDF locally, execute the steps/05_kg_to_pounds_udf/app.py
script. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your snowflake-demo conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/05_kg_to_pounds_udf
python app.py 35
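As a sketch of what this UDF computes (the conversion factor is standard, but the function name and CLI handling are assumptions mirroring the repo layout, not copied from it):

```python
import sys

KG_TO_LB = 2.20462  # pounds per kilogram

def kg_to_pounds(kg: float) -> float:
    """Convert a weight in kilograms to pounds."""
    return kg * KG_TO_LB

if __name__ == "__main__":
    # Mirrors `python app.py 35`: convert the CLI argument locally
    kg = float(sys.argv[1]) if len(sys.argv) > 1 else 35.0
    print(kg_to_pounds(kg))  # 35 kg -> ~77.16 lb
```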
To deploy your UDF to Snowflake we will use the SnowCLI tool. The SnowCLI tool will do all the heavy lifting of packaging up your application, copying it to a Snowflake stage, and creating the object in Snowflake. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/05_kg_to_pounds_udf
snow function create
To test the procedure locally, you will execute the steps/06_IMPORT_PARTY_METRICS_sp/app.py
script. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your snowflake-demo conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/06_IMPORT_PARTY_METRICS_sp
python app.py
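The procedure aggregates metrics per importing party; the snippet below is a hedged, pure-Python sketch of that kind of aggregation (the row shape, party names, and function name are all hypothetical, standing in for what the stored procedure would compute over the bill-of-lading view in Snowflake):

```python
from collections import defaultdict

# Hypothetical rows standing in for bill-of-lading query results
rows = [
    {"party": "ACME IMPORTS", "shipments": 3, "weight_kg": 1200.0},
    {"party": "ACME IMPORTS", "shipments": 2, "weight_kg": 800.0},
    {"party": "GLOBEX CORP",  "shipments": 5, "weight_kg": 4100.0},
]

def party_metrics(records):
    """Aggregate shipment counts and total weight per importing party."""
    totals = defaultdict(lambda: {"shipments": 0, "weight_kg": 0.0})
    for r in records:
        t = totals[r["party"]]
        t["shipments"] += r["shipments"]
        t["weight_kg"] += r["weight_kg"]
    return dict(totals)

print(party_metrics(rows))
```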
To deploy your sproc to Snowflake we will use the SnowCLI tool. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your snowflake-demo conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/06_IMPORT_PARTY_METRICS_sp
snow procedure create
To test the procedure locally, you will execute the steps/08_populate_imports_product_catalog_view_sp/app.py
script. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your snowflake-demo conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/08_populate_imports_product_catalog_view_sp
python app.py
To deploy your sproc to Snowflake we will use the SnowCLI tool. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your snowflake-demo conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/08_populate_imports_product_catalog_view_sp
snow procedure create
To test the procedure locally, you will execute the steps/09_populate_supplier_product_catalog_view_sp/app.py
script. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your snowflake-demo conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/09_populate_supplier_product_catalog_view_sp
python app.py
To deploy your sproc to Snowflake we will use the SnowCLI tool. Like we did in the previous steps, we'll execute it from the terminal. So go back to the terminal in VS Code, make sure that your snowflake-demo conda environment is active, then run the following commands (which assume that your terminal has the root of your repository open):
cd steps/09_populate_supplier_product_catalog_view_sp
snow procedure create
- Aditya: 34%
- Nidhi: 33%
- Uddhav: 33%
| Developer | Deliverables |
|---|---|
| Aditya | Data loading to Snowflake for the pipeline |
| Aditya | Creating views and UDFs for one of the datasets |
| Aditya | CI/CD code and testing |
| Uddhav | Architecture diagram |
| Uddhav | Creating views and UDFs for one of the datasets |
| Nidhi | Streamlit and LangChain code |
| Nidhi | Data research and documentation |
WE ATTEST THAT WE HAVEN’T USED ANY OTHER STUDENTS’ WORK IN OUR ASSIGNMENT AND ABIDE BY THE POLICIES LISTED IN THE STUDENT HANDBOOK.