Refactor: Move all deduplication functions from library to CLI #29
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem
The
bibfmtlibrary contained several functions that belonged more naturally in thebibdedupCLI tool rather than in the library, as they either performed I/O operations or implemented workflow logic specific to the CLI tool. This:Solution
This PR refactors the deduplication functionality by:
Moved to CLI (
bibfmt/bin/bibdedup.ml):deduplicate_entries- orchestrates the deduplication workflow with progress outputresolve_conflicts- interactively prompts users to resolve field conflictsdisplay_conflict- displays conflict information to usersprompt_user_choice- reads user input for conflict resolutionmerge_entries_non_interactive- merges duplicate entries (workflow logic)Kept in Library (
bibfmt/lib/bibtex.ml):find_duplicate_groups- pure function that identifies duplicates for programmatic usestring_of_field_value,make_fieldfield_conflict,duplicate_groupUpdated Public Interface (
bibfmt/lib/bibtex.mli):deduplicate_entriesandmerge_entries_non_interactivefrom public APIstring_of_field_valueandmake_fieldhelpers (needed by CLI)Benefits
bibdedupCLI works exactly as beforefind_duplicate_groupscan be used programmatically for duplicate detectionTesting
While I don't have the OCaml toolchain set up to run tests, the refactoring:
The
bibdedupCLI should behave identically to before, and the library now provides a cleaner API focused on duplicate detection rather than complete deduplication workflows.Closes #[issue-number]
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.