feat: implemented idempotency #25

Satvik-Singh192 · 2025-10-23T12:01:44Z

Description

This PR implements idempotency in the database load function, resolving Issue #12.

It replaces the previous df.to_sql(..., if_exists="append") call, which appended every row on each run and therefore created duplicate entries whenever the pipeline was run more than once.

The new implementation:

Adds a CREATE TABLE IF NOT EXISTS statement that defines the table schema and, critically, sets employee_id as the PRIMARY KEY.
Uses a bulk INSERT OR IGNORE statement with cursor.executemany(). The database skips any row whose employee_id already exists, so data is not duplicated on subsequent runs.
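
For reviewers who want to see the pattern in isolation, here is a minimal sketch of the two steps above. It is not the code from this PR: it assumes SQLite via Python's sqlite3 module (INSERT OR IGNORE is SQLite syntax), and the table name employees, the name and salary columns, and the load_rows helper are hypothetical. Only employee_id as the PRIMARY KEY comes from the description.

```python
import sqlite3

# Hypothetical schema: only employee_id (PRIMARY KEY) is taken from the PR
# description; the table name and the other columns are placeholders.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS employees (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT,
    salary      REAL
)
"""

INSERT_SQL = (
    "INSERT OR IGNORE INTO employees (employee_id, name, salary) "
    "VALUES (?, ?, ?)"
)


def load_rows(conn, rows):
    """Insert rows, skipping any whose employee_id already exists.

    Returns the total number of rows in the table after the load.
    """
    cur = conn.cursor()
    cur.execute(CREATE_SQL)            # no-op if the table is already there
    cur.executemany(INSERT_SQL, rows)  # one bulk call for all rows
    conn.commit()
    return cur.execute("SELECT COUNT(*) FROM employees").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    sample = [(1, "Ada", 100.0), (2, "Lin", 200.0)]
    load_rows(conn, sample)
    # Running the same load again leaves the row count unchanged.
    print(load_rows(conn, sample))
```

Note that INSERT OR IGNORE keeps the row that is already stored, so a rerun with changed values for an existing employee_id does not update that row. If updates need to propagate, an upsert would be required instead.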

Semver Changes

Patch (bug fix, no new features)
Minor (new features, no breaking changes)
Major (breaking changes)

Issues

Closes #12

Checklist

I have read the Contributing Guidelines.

Closes OPCODE-Open-Spring-Fest#12

Satvik-Singh192 added 2 commits October 23, 2025 17:29

feat: implemented idempotency

157ee7e

Closes OPCODE-Open-Spring-Fest#12

fix: correct datatype for columns in database

ed9be5b

Dheerajyadav1 added Type:Hard Semver:minor PR:Accept hactoberfest-accepted labels Oct 23, 2025

Dheerajyadav1 merged commit 6e5d072 into OPCODE-Open-Spring-Fest:main Oct 23, 2025
5 of 6 checks passed
