Skip to content

Added Mod 10 check#149

Open
alan-y wants to merge 17 commits intomasterfrom
feature/mod10
Open

Added Mod 10 check#149
alan-y wants to merge 17 commits intomasterfrom
feature/mod10

Conversation

@alan-y
Copy link
Copy Markdown

@alan-y alan-y commented Oct 29, 2025

Added Mod 10 check to checksum() for chi_check(). I have done some initial testing and seems ok to me but definitely needs more testing from others and maybe documentation updates, etc. Happy to take suggestions if coding can be improved as well.

Just thought of an idea - maybe include an option to run Mod 11 only or both Mod 11 and Mod 10?

@alan-y alan-y requested review from Moohan, Nic-Chr and Tina815 October 29, 2025 21:06
@alan-y alan-y self-assigned this Oct 29, 2025
@alan-y alan-y added the enhancement New feature or request label Oct 29, 2025
@alan-y alan-y linked an issue Oct 31, 2025 that may be closed by this pull request
@alan-y alan-y requested a review from Tina815 November 4, 2025 16:43
@Nic-Chr
Copy link
Copy Markdown
Contributor

Nic-Chr commented Nov 5, 2025

I just ran the chi_check() function on a test dataset of CHIs provided by Alan and generated these results.

I’m guessing that it’s normal for some mod10 CHIs to coincidentally pass the mod11 check (and vice versa)?

library(tidyverse)
library(phsmethods) # feature/mod10 version

chis <- readr::read_csv("~/chi_dataset.csv")
#> Rows: 105000 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): CHI_NUMBER, MODULUS
#> lgl (1): IS_VALID
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

chis |> 
  count(MODULUS)
#> # A tibble: 3 × 2
#>   MODULUS     n
#>   <chr>   <int>
#> 1 Mod10   50000
#> 2 Mod11   50000
#> 3 N/A      5000

chis <- chis |> 
  mutate(valid_mod11 = chi_check(CHI_NUMBER, check_mod11 = TRUE, check_mod10 = FALSE),
         valid_mod10 = chi_check(CHI_NUMBER, check_mod11 = FALSE, check_mod10 = TRUE),
         valid_any = chi_check(CHI_NUMBER))

chis |> 
  filter(valid_mod11 == "Valid CHI" & MODULUS != "Mod11") |> 
  count(MODULUS)
#> # A tibble: 1 × 2
#>   MODULUS     n
#>   <chr>   <int>
#> 1 Mod10    4530

chis |> 
  filter(valid_mod10 == "Valid CHI" & MODULUS != "Mod10") |> 
  count(MODULUS)
#> # A tibble: 1 × 2
#>   MODULUS     n
#>   <chr>   <int>
#> 1 Mod11    4920

# All combinations
chis |> 
  count(MODULUS, valid_mod10, valid_mod11)
#> # A tibble: 5 × 4
#>   MODULUS valid_mod10      valid_mod11          n
#>   <chr>   <chr>            <chr>            <int>
#> 1 Mod10   Valid CHI        Invalid checksum 45470
#> 2 Mod10   Valid CHI        Valid CHI         4530
#> 3 Mod11   Invalid checksum Valid CHI        45080
#> 4 Mod11   Valid CHI        Valid CHI         4920
#> 5 N/A     Invalid checksum Invalid checksum  5000

Created on 2025-11-05 with reprex v2.1.1

@Moohan
Copy link
Copy Markdown
Member

Moohan commented Nov 5, 2025

I’m guessing that it’s normal for some mod10 CHIs to coincidentally pass the mod11 check (and vice versa)?

I think definitely some to be expected, although the ~10% number you found is surprising / worrying, as I would have hoped it would be much lower than that!

@alan-y
Copy link
Copy Markdown
Author

alan-y commented Nov 5, 2025

I just ran the chi_check() function on a test dataset of CHIs provided by Alan and generated these results.

I’m guessing that it’s normal for some mod10 CHIs to coincidentally pass the mod11 check (and vice versa)?

Yeah I saw this as well when I tested. The CHIs in that dataset were generated and not real so I'm not sure if the CHIs given to people for real will run into this issue as much? I'm guessing that in reality they can only assign CHIs that haven't already been used for people so the likelihood of this happening (CHI generated with Mod10 also passing Mod11) could be a lot less in practice?

I also think that passing both mod11 and mod10 isn't really the issue. The main issue is probably passing Mod10 but not Mod11 as that is not allowed prior to implementation date.

I'll do some testing on a real current dataset as that'll give us a chance to see how many CHIs are passing Mod10 and not Mod11 which is not currently allowed so we can see the scale of the issue.

alan-y and others added 3 commits November 5, 2025 12:08
Co-authored-by: James Hayes (né McMahon) <james.hayes2@phs.scot>
@alan-y
Copy link
Copy Markdown
Author

alan-y commented Nov 25, 2025

@Moohan I've tried updating the vignette to add something about Mod-10. I'm sure wording will need edited (e.g. making a bit more use of 'plain English' but thought it would be good to put something down for now. I think that the vignette has ended up being run through the air formatter - hope that's ok!

Copy link
Copy Markdown
Member

@Moohan Moohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some fairly minor wording change suggestions.

Co-authored-by: James Hayes (né McMahon) <james.hayes2@phs.scot>
@codecov
Copy link
Copy Markdown

codecov bot commented Feb 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support Modulus 10 check digit on CHI

4 participants