Add fetching and processing for EUCC Certificates + refactoring the rest of the pipeline #552

Open
tkachyna wants to merge 13 commits into main from eucc-fetching

@tkachyna tkachyna commented Jan 21, 2026

This pull request:

  • adds logic for scraping the official ENISA webpage, parsing the metadata obtained from it, and initializing the EUCC dataset and certificates (src/sec_certs/dataset/eucc.py, src/sec_certs/sample/eucc.py)
  • refactors the logic shared between EUCC and CC into files named 'cc_eucc_common.py'.
    CC and EUCC share significantly more code with each other than with the other datasets in the code base, so the prefix makes the intent explicit and immediately signals to developers that this logic is meant only for these two datasets.
    Type safety is preserved: no other dataset or certificate type can be passed to the functions defined in these common files.
    • Originally I wanted to solve this by defining a Mixin class that CCDataset and EUCCDataset would inherit from. However, I ran into issues with static analysis: mypy could not see the fields that the Mixin referenced but that were defined in the Dataset class. I had assumed Python has something similar to Scala's self-types, which I work with every day, but unfortunately it does not.
  • adds a new regex for German EUCC certificates

Tests for the new functions will be added in a follow-up PR.
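For readers skimming the thread, the "shared functions restricted to two types" idea described above can be sketched roughly as follows. This is only a minimal illustration, not the code from this PR: the classes `CCCertificate`, `EUCCCertificate`, `FIPSCertificate`, the `name` field, and the helper `with_normalized_name` are all hypothetical stand-ins.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import TypeVar


# Hypothetical, heavily simplified stand-ins for the real certificate classes.
@dataclass(frozen=True)
class CCCertificate:
    name: str


@dataclass(frozen=True)
class EUCCCertificate:
    name: str


@dataclass(frozen=True)
class FIPSCertificate:
    name: str


# Constrained TypeVar: a static checker accepts only these two types.
CertT = TypeVar("CertT", CCCertificate, EUCCCertificate)


def with_normalized_name(cert: CertT) -> CertT:
    """Shared helper: collapse whitespace and lowercase the certificate name."""
    return replace(cert, name=" ".join(cert.name.split()).lower())


cc = with_normalized_name(CCCertificate("  Smart   Card OS "))
eucc = with_normalized_name(EUCCCertificate("Secure  Element"))

# A type checker such as mypy is expected to reject the call below,
# because FIPSCertificate is not one of the allowed types:
# with_normalized_name(FIPSCertificate("Some Module"))
```

The restriction is enforced only by the type checker; at runtime Python would still execute the commented-out call, so the guarantee depends on mypy running in CI.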

@tkachyna tkachyna force-pushed the eucc-fetching branch 4 times, most recently from 42a2a99 to befd669 January 21, 2026 23:03
@crocs-muni crocs-muni deleted a comment from codecov bot Jan 21, 2026
@crocs-muni crocs-muni deleted a comment from codecov bot Jan 24, 2026
@tkachyna tkachyna self-assigned this Jan 24, 2026
codecov bot commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 58.67550% with 312 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.97%. Comparing base (b543a1e) to head (cde12ab).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/sec_certs/dataset/eucc.py | 0.00% | 178 Missing ⚠️ |
| src/sec_certs/sample/eucc.py | 35.00% | 65 Missing ⚠️ |
| src/sec_certs/sample/cc_eucc_common.py | 79.19% | 61 Missing ⚠️ |
| src/sec_certs/dataset/cc_eucc_common.py | 95.58% | 5 Missing ⚠️ |
| src/sec_certs/heuristics/common.py | 85.72% | 2 Missing ⚠️ |
| src/sec_certs/sample/cc.py | 97.15% | 1 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (b543a1e) and HEAD (cde12ab).

HEAD has 3 fewer uploads than BASE.

| Flag | BASE (b543a1e) | HEAD (cde12ab) |
|---|---|---|
| | 4 | 1 |
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #552       +/-   ##
===========================================
- Coverage   71.91%   56.97%   -14.93%     
===========================================
  Files          76       78        +2     
  Lines        8834     9126      +292     
===========================================
- Hits         6352     5199     -1153     
- Misses       2482     3927     +1445     


@tkachyna tkachyna marked this pull request as ready for review January 25, 2026 10:06
@tkachyna tkachyna requested a review from adamjanovsky January 25, 2026 13:01

@adamjanovsky adamjanovsky left a comment

Thanks for the PR 👍. A few notes:

  • I'd say it's generally a good idea to have the code and the corresponding tests in a single PR. If the tests uncover problems, you'll be addressing them in a somewhat unrelated PR, which is not ideal. But proceed as you wish in this case.
  • I agree that inheritance or mix-ins are not ideal in this case. The current approach is somewhat wordy but fairly easy to understand.
  • I requested some minor changes here and there.
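As background on the mix-in point: one pattern that gets reasonably close to self-types is annotating `self` in the mix-in's methods with a `typing.Protocol`, which mypy documents for mix-in classes. The sketch below only illustrates that pattern and is not a suggestion to change this PR. `HasCertificates`, `CommonProcessingMixin`, `ToyDataset`, and the `certs` attribute are hypothetical names.

```python
from __future__ import annotations

from typing import Protocol


class HasCertificates(Protocol):
    """What the mix-in expects its host class to provide (hypothetical)."""

    certs: dict[str, str]


class CommonProcessingMixin:
    # Annotating `self` with the protocol tells the type checker which
    # attributes are available, although the mix-in does not define them.
    def certificate_ids(self: HasCertificates) -> list[str]:
        return sorted(self.certs)


class ToyDataset(CommonProcessingMixin):
    def __init__(self, certs: dict[str, str]) -> None:
        self.certs = certs


ids = ToyDataset({"cert-b": "Product B", "cert-a": "Product A"}).certificate_ids()
```

The trade-off is that every mix-in method needs the explicit `self` annotation, which some readers may find more obscure than plain module-level functions.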


@adamjanovsky adamjanovsky left a comment

There's still one more import we can get rid of. Once addressed, feel free to merge.

@J08nY J08nY added the enhancement, cc, and python labels Feb 12, 2026