Contributor

Copilot AI commented Oct 10, 2025

Problem

The bibfmt library contained several functions that belong in the bibdedup CLI tool rather than in the library, because they either perform I/O or implement workflow logic specific to the CLI. This:

  1. Mixed business logic with user interaction
  2. Made the library unsuitable for programmatic use in non-interactive contexts
  3. Violated separation of concerns between library and application layers

Solution

This PR refactors the deduplication functionality by:

Moved to CLI (bibfmt/bin/bibdedup.ml):

  • deduplicate_entries - orchestrates the deduplication workflow with progress output
  • resolve_conflicts - interactively prompts users to resolve field conflicts
  • display_conflict - displays conflict information to users
  • prompt_user_choice - reads user input for conflict resolution
  • merge_entries_non_interactive - merges duplicate entries (workflow logic)
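
As an illustration of why these functions sit in the CLI layer, here is a minimal sketch of an interactive resolver. It is not the code from bibdedup.ml: the `conflict` record, the `choice` type, `parse_choice`, and the singular `resolve_conflict` are hypothetical names introduced for this example, and the `~read` parameter is an assumption that lets the prompt be driven by something other than stdin.

```ocaml
(* Hypothetical conflict record; the real field_conflict type in
   bibtex.ml is not shown in this PR and may differ. *)
type conflict = { field : string; left : string; right : string }

(* Hypothetical result of a prompt. *)
type choice = Keep_left | Keep_right

(* Writes to an output channel, so it belongs with the CLI code. *)
let display_conflict oc c =
  Printf.fprintf oc "Conflict in field '%s':\n  [1] %s\n  [2] %s\n"
    c.field c.left c.right

(* Pure parsing step: turns one line of input into a choice, if valid. *)
let parse_choice line =
  match String.trim line with
  | "1" -> Some Keep_left
  | "2" -> Some Keep_right
  | _ -> None

(* Re-prompts until the input is valid. [read] is injected so that a
   scripted source can stand in for the terminal. *)
let rec prompt_user_choice ~read oc c =
  display_conflict oc c;
  Printf.fprintf oc "Choose 1 or 2: %!";
  match parse_choice (read ()) with
  | Some choice -> choice
  | None -> prompt_user_choice ~read oc c

(* Returns the (field, value) pair the user decided to keep. *)
let resolve_conflict ~read oc c =
  match prompt_user_choice ~read oc c with
  | Keep_left -> (c.field, c.left)
  | Keep_right -> (c.field, c.right)
```

In a terminal session this sketch could be called as `resolve_conflict ~read:read_line stdout c`. Passing `read` explicitly is a design choice of the example, not something the PR states.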

Kept in Library (bibfmt/lib/bibtex.ml):

  • find_duplicate_groups - pure function that identifies duplicates for programmatic use
  • Helper functions: string_of_field_value, make_field
  • Type definitions: field_conflict, duplicate_group
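
To make "pure function" concrete, the following is a minimal sketch of duplicate detection with no printing or prompting. It does not reproduce the library's implementation: the `entry` type, the `normalize` helper, the title-based matching rule, and the return type are all assumptions made for illustration, and the real `find_duplicate_groups` in bibtex.ml may use a different signature and different criteria.

```ocaml
(* Hypothetical, simplified entry type; the real bibfmt types are not
   shown in this PR. *)
type entry = { citekey : string; fields : (string * string) list }

(* Assumed matching rule: lowercase the title and keep only ASCII
   letters and digits. *)
let normalize s =
  String.lowercase_ascii s
  |> String.to_seq
  |> Seq.filter (fun c -> (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
  |> String.of_seq

(* No I/O: returns groups of two or more entries sharing a normalized
   title, ordered by the first appearance of each title. *)
let find_duplicate_groups (entries : entry list) : entry list list =
  let tbl = Hashtbl.create 16 in
  let keys = ref [] in
  List.iter
    (fun e ->
      match List.assoc_opt "title" e.fields with
      | None -> ()
      | Some t ->
        let k = normalize t in
        (match Hashtbl.find_opt tbl k with
         | None ->
           keys := k :: !keys;
           Hashtbl.replace tbl k [ e ]
         | Some es -> Hashtbl.replace tbl k (e :: es)))
    entries;
  List.rev !keys
  |> List.filter_map (fun k ->
       match Hashtbl.find tbl k with
       | _ :: _ :: _ as es -> Some (List.rev es)
       | _ -> None)
```

Because a function of this shape only returns data, a caller is free to print the groups, merge them automatically, or ignore them, which is the kind of programmatic use the library is meant to support.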

Updated Public Interface (bibfmt/lib/bibtex.mli):

  • Removed deduplicate_entries and merge_entries_non_interactive from public API
  • Exposed string_of_field_value and make_field helpers (needed by CLI)

Benefits

  • Cleaner architecture: Library now provides only core duplicate detection functionality
  • Better separation of concerns: All deduplication workflow logic is confined to the CLI layer
  • No duplicate code: All deduplication and merging logic lives in one place
  • Maintains functionality: bibdedup CLI works exactly as before
  • Improved reusability: Library's find_duplicate_groups can be used programmatically for duplicate detection

Testing

While I don't have the OCaml toolchain set up to run tests, the refactoring:

  • Preserves all existing functionality by moving code without changing logic
  • Uses the same algorithms and data structures as before
  • Only changes where the code lives (library vs CLI)

The bibdedup CLI should behave identically to before, and the library now provides a cleaner API focused on duplicate detection rather than complete deduplication workflows.

Closes #[issue-number]

Original prompt

On bibtex.ml, resolve_conflicts, merge_entries and deduplicate_entries are all functions that require interaction or IO to inform the user. They belong better inside the bibdedup.ml cli tool rather than here in the library. Make a PR to move the code appropriately, clean up unnecessary duplication



Copilot AI changed the title from "[WIP] Move IO functions to bibdedup CLI tool" to "Refactor: Move IO-dependent deduplication functions from library to CLI" Oct 10, 2025
Owner

mseri commented Oct 10, 2025

@copilot move also merge_entries_non_interactive

Contributor Author

Copilot AI commented Oct 10, 2025

@copilot move also merge_entries_non_interactive

Done. Moved merge_entries_non_interactive to bibdedup.ml in commit 8dd3e89.

Copilot AI changed the title from "Refactor: Move IO-dependent deduplication functions from library to CLI" to "Refactor: Move all deduplication functions from library to CLI" Oct 10, 2025
Copilot AI requested a review from mseri October 10, 2025 20:34
mseri marked this pull request as ready for review October 10, 2025 20:46
mseri merged commit ca395c3 into main Oct 10, 2025
18 checks passed
mseri deleted the copilot/move-functions-to-bibdedup-cli branch October 10, 2025 20:48