
Commit 620c5d7

Update README.md
1 parent 910b451 commit 620c5d7

File tree

1 file changed: +7 −7 lines


README.md

Lines changed: 7 additions & 7 deletions
@@ -156,8 +156,8 @@ All transformations were written in modular `.sql` models, configured via `dbt_p
 
 It shows how each model in the pipeline is derived from raw external source tables in BigQuery:
 
-- **Sources** (`SRC`) like `claims_data_external`, `patient_data_external`, and `ehr_data_external` represent external tables that directly query files stored in Google Cloud Storage
-- **Models** (`MDL`) like `high-claim-patients`, `chronic-conditions-summary`, and `health-anomalies` represent transformed tables built using SQL logic in DBT
+- **Sources** (`SRC`) like `claims_data_external`, `patient_data_external`, and `ehr_data_external` represent external tables that directly query files stored in Google Cloud Storage
+- **Models** (`MDL`) like `high-claim-patients`, `chronic-conditions-summary`, and `health-anomalies` represent transformed tables built using SQL logic in DBT
 
 
@@ -172,7 +172,7 @@ It shows how each model in the pipeline is derived from raw external source tabl
 To ensure data quality and trust in the pipeline, I implemented column-level tests and added documentation using `schema.yml` files in DBT.
 DBT allows us to define tests and metadata **alongside our models** — all inside YAML. These tests run automatically using `dbt test`.
 
-#### Why I Used `schema.yml`:
+#### Why I Used `schema.yml`:
 
 - To enforce data integrity on critical columns (`not_null`, `unique`)
 - To validate raw data coming from external sources
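A minimal `schema.yml` illustrating this pattern (the model and column names below are hypothetical, chosen only to echo the models mentioned earlier):

```yaml
version: 2

models:
  - name: high_claim_patients        # hypothetical model name
    description: "Patients whose total claim amount is unusually high"
    columns:
      - name: patient_id             # hypothetical column name
        description: "Unique patient identifier"
        tests:
          - not_null                 # column must never be NULL
          - unique                   # exactly one row per patient
```

Running `dbt test` compiles each declared test into a SQL query against the model and fails the run if any violating rows are returned.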
@@ -222,7 +222,7 @@ This ensures:
 
 <p align="center"> <img src="./images/github-actions-ci-success.png" alt="CI Passing" width="700"/> </p>
 
-Example snippet from `ci.yml`:
+Example snippet from `ci.yml`:
 
 ```yaml
 on:
@@ -259,7 +259,7 @@ Triggered on push to main
 
 <p align="center"> <img src="./images/github-actions-cd-success.png" alt="CD to Production" width="700"/> </p>
 
-Secure deployment:
+Secure deployment:
 
 GCP credentials stored as GitHub Secrets
 
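One common way to generate the key file at runtime from a GitHub Secret is a workflow step like the sketch below (the step name and the secret name `GCP_SA_KEY` are assumptions, not taken from this repo's workflow files):

```yaml
- name: Write GCP service account key
  env:
    GCP_SA_KEY: ${{ secrets.GCP_SA_KEY }}  # stored as a GitHub Secret
  run: echo "$GCP_SA_KEY" > gcp-key.json   # created at runtime, never committed
```

Because the secret is injected through `env` and GitHub Actions masks secret values in logs, the key never appears in the repository or the build output.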
@@ -273,15 +273,15 @@ gcp-key.json and profiles.yml are generated at runtime (not stored in repo)
 - Practiced writing modular, testable SQL models with automated validations
 - Built an end-to-end pipeline that mirrors real-world engineering workflows
 
-## Conclusion
+## Conclusion
 
 This project started as a hands-on learning exercise and became a full-stack, automated data engineering pipeline. I worked with industry-standard tools (GCP, DBT, GitHub Actions), built my own data sources, and pushed transformations all the way to production.
 
 It reflects both the technical skills I’ve developed and my drive to learn independently and build real, usable solutions.
 
 ---
 
-## 🙏 Acknowledgements
+## Acknowledgements
 
 This project was built by closely following a YouTube tutorial by [DATA TIME](https://www.youtube.com/playlist?list=PLs9W2D7jqlTXbHWkpNUzIC_G8KpLMH6yZ), which covered how to build an end-to-end data pipeline using DBT, BigQuery, and GitHub Actions.
 
