diff --git a/docs/2025/data-pipeline/updates/2025-06-11.md b/docs/2025/data-pipeline/updates/2025-06-11.md
index 07fe73b030..68aacbf9ee 100644
--- a/docs/2025/data-pipeline/updates/2025-06-11.md
+++ b/docs/2025/data-pipeline/updates/2025-06-11.md
@@ -11,7 +11,7 @@ SPDX-FileCopyrightText: 2025 Abdulsobur Oyewale
 -->
 # WEEK 2
-*(June 11, 2024)*
+*(June 11, 2025)*
 ## Attendees:
 - [Shaheem Azmal M MD](https://github.com/shaheemazmalmmd)
@@ -24,6 +24,9 @@ SPDX-FileCopyrightText: 2025 Abdulsobur Oyewale
 * After a successful writing of the SQL script to fetch the required content from the fossology server, I proceeded to write a python program to embed the PostgreSQL script into the program using the psycog library to achieve the connection to the Postgres database server.
 * With this, i was able to automate the collection of copyright content data from the fossology server running in the local host.
+![image](/img/data-pipeline/script1.png)
+![image](/img/data-pipeline/script2.png)
+
 ## Meeting Discussion:
 * I discuss with the mentors about the progress of the week and how the project s going, including if there was any obstacle.
diff --git a/docs/2025/data-pipeline/updates/2025.06.25.md b/docs/2025/data-pipeline/updates/2025.06.25.md
index 33587072a1..a8d4e39d54 100644
--- a/docs/2025/data-pipeline/updates/2025.06.25.md
+++ b/docs/2025/data-pipeline/updates/2025.06.25.md
@@ -25,6 +25,10 @@ SPDX-FileCopyrightText: 2025 Abdulsobur Oyewale
 * I created a `pipeline.yml` file and applied the above script preprocessing and Data spliting script into the pipeline, and included the ability to download the output from each script from the logs while it's performing the triggered GitHub actions.
+![image](/img/data-pipeline/pipe.png)
+
+![image](/img/data-pipeline/pipe1.png)
+
 * I was able to deploy this into my GitHub repository to allow me to test this feature and changes separately on my own GitHub Actions before going ahead to create a Pull Request.
 ## Meeting Discussion:
diff --git a/docs/2025/data-pipeline/updates/2025.07.09.md b/docs/2025/data-pipeline/updates/2025.07.09.md
new file mode 100644
index 0000000000..7d5b2da8d4
--- /dev/null
+++ b/docs/2025/data-pipeline/updates/2025.07.09.md
@@ -0,0 +1,36 @@
+---
+title: Week 6
+author: Abdulsobur Oyewale
+tags: [gsoc25, Data Pipeline for Safaa]
+---
+
+
+
+# WEEK 5
+*(July 02, 2025)*
+
+## Attendees:
+- [Kaushlendra Pratap](https://github.com/Kaushl2208)
+
+### Engagements
+* This week was about migrating all the progress made so far in the demo repository into the main Safaa repository.
+It was really challenging and an eye-opener for me on working with the repository codebase. I faced challenges such as incompatible code,
+fitting the code I have written so far into the repository, and a lot of git conflicts.
+I was able to resolve all the code conflicts, pushed the result to one of my repository branches, and submitted a pull request for review.
+However, most of the git conflicts were still unresolved.
+
+## Meeting Discussion:
+* In this week's meeting, I had the opportunity to have [Kaushlendra Pratap](https://github.com/Kaushl2208) in the meeting. He walked me through resolving these git conflicts,
+explained what caused them, guided me on how to resolve the problem, explained every command we used and what it does, and showed how this can
+be resolved if such an issue comes up in the future.
+
+![image](/img/data-pipeline/git_issues.png)
+![image](/img/data-pipeline/git_resolve.png)
+
+## Subsequent Steps
+* After submitting my code and opening a pull request, I received a lot of feedback on where I can improve and which adjustments would make the code better.
+My main goal for the next week is to go through this review, make corrections based on the feedback, and learn from it.
\ No newline at end of file
diff --git a/static/img/data-pipeline/git_issues.png b/static/img/data-pipeline/git_issues.png
new file mode 100644
index 0000000000..03bf9a10ec
Binary files /dev/null and b/static/img/data-pipeline/git_issues.png differ
diff --git a/static/img/data-pipeline/git_resolve.png b/static/img/data-pipeline/git_resolve.png
new file mode 100644
index 0000000000..ae1b39b424
Binary files /dev/null and b/static/img/data-pipeline/git_resolve.png differ
diff --git a/static/img/data-pipeline/pipe.png b/static/img/data-pipeline/pipe.png
new file mode 100644
index 0000000000..cc7a12e966
Binary files /dev/null and b/static/img/data-pipeline/pipe.png differ
diff --git a/static/img/data-pipeline/pipe1.png b/static/img/data-pipeline/pipe1.png
new file mode 100644
index 0000000000..f515230fc0
Binary files /dev/null and b/static/img/data-pipeline/pipe1.png differ
diff --git a/static/img/data-pipeline/script1.png b/static/img/data-pipeline/script1.png
new file mode 100644
index 0000000000..f06f1dfe7e
Binary files /dev/null and b/static/img/data-pipeline/script1.png differ
diff --git a/static/img/data-pipeline/script2.png b/static/img/data-pipeline/script2.png
new file mode 100644
index 0000000000..8e2c0e4cb6
Binary files /dev/null and b/static/img/data-pipeline/script2.png differ