FIX float counts in Paladin mapping by celiosantosjr · Pull Request #74 · BigDataBiology/macrel

celiosantosjr · 2025-06-19T03:33:15Z

Paladin mapping performs frameshifts and probabilistic alignment, which leads to fractional weights propagating into the SAM/BAM files. To fix that:

- Filter the SAM output to keep only uniquely mapped matches, which avoids the problem.
- Count only uniquely mapped matches.
- Downgrade the version and IR version of the ONNX models for wider compatibility.
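To illustrate why the counts can become fractional and why restricting them to uniquely mapped matches restores integers, here is a minimal sketch. It assumes a simplified model in which a read with several hits is split equally among them; the function `count_reads` and the toy data are hypothetical and are not the NGLess or Paladin implementation.

```python
from collections import defaultdict


def count_reads(alignments, unique_only=False):
    """Count reads per reference from (read_id, reference) pairs.

    Assumption for illustration: a read with several hits contributes
    1 / n_hits to each of them, which is where fractions come from.
    """
    hits_per_read = defaultdict(list)
    for read_id, ref in alignments:
        hits_per_read[read_id].append(ref)

    counts = defaultdict(float)
    for refs in hits_per_read.values():
        if unique_only and len(refs) > 1:
            continue  # drop reads that map to more than one place
        weight = 1.0 / len(refs)
        for ref in refs:
            counts[ref] += weight
    return dict(counts)


# Toy data: r2 hits both references, r1 and r3 are unique.
alignments = [
    ("r1", "AMP_A"),
    ("r2", "AMP_A"),
    ("r2", "AMP_B"),
    ("r3", "AMP_B"),
]

split_counts = count_reads(alignments)
unique_counts = count_reads(alignments, unique_only=True)
print(split_counts)   # fractional: r2 is shared between AMP_A and AMP_B
print(unique_counts)  # whole numbers: only r1 and r3 are counted
```

With unique-only counting every retained read has weight 1, so a final integer conversion no longer discards any information.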

- include a rounding in ngless counts to avoid split matches in Paladin

- Eliminate compatibility problems - do not test using Python 3.8 due to ONNX models

- eliminate ir10 model

- Add downgraded IR versions of the Hemo and AMP models for wider compatibility
- Downgrade the IR version to 7 (compatible with the currently installed environment and earlier ones)
- Include the same models, adapted with:

```
import onnx

# Load the model
model = onnx.load("your_model.onnx")

# Downgrade IR version from 10 to 7
model.ir_version = 7

# Save the model
onnx.save(model, "downgraded_model.onnx")
```

- Downgrade the model to ONNX opset 16
- Downgrade the IR version to 7

```
>>> import onnx
>>> from onnx import version_converter
>>> from onnxoptimizer import optimize  # assumption: optimize() comes from the onnxoptimizer package
>>> model = onnx.load('TEST.onnx')
>>> model.ir_version = 7
>>> model = optimize(model, ["eliminate_identity", "fuse_consecutive_transposes"])
>>> model = version_converter.convert_version(model, 16)  # target opset version
>>> onnx.save(model, "TEST.onnx")
```

Key Changes:

- Added filter() step with samflag(secondary=False), which is the NGless equivalent of -F 256
- The filtered alignments are then passed to count()
- Maintained your existing integer conversion and output
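For readers unfamiliar with the flag, -F 256 in samtools excludes records whose FLAG field has bit 0x100 (secondary alignment) set. The sketch below shows that bit test on plain SAM text lines. It is only an illustration of the flag logic under that assumption; the helper `drop_secondary` and the example records are hypothetical and do not come from count.ngl.

```python
SECONDARY = 0x100  # SAM FLAG bit 256: secondary alignment


def drop_secondary(sam_lines):
    """Keep header lines and alignments without the secondary bit set."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            kept.append(line)  # header lines pass through unchanged
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not flag & SECONDARY:
            kept.append(line)
    return kept


# Hypothetical records: r2 has one primary and two secondary alignments.
sam_lines = [
    "@SQ\tSN:AMP_A\tLN:30",
    "r1\t0\tAMP_A\t1\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r2\t0\tAMP_A\t1\t0\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r2\t256\tAMP_B\t1\t0\t10M\t*\t0\t0\t*\t*",
    "r2\t272\tAMP_B\t5\t0\t10M\t*\t0\t0\t*\t*",
]

primary_only = drop_secondary(sam_lines)
for line in primary_only:
    print(line)
```

After this step each read appears at most once among the primary records, so a downstream count sees one alignment per read.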

The big change is switching to ONNX for model storage to avoid depending on very particular versions of scikit-learn

- Fix float counts in Paladin mapping (BigDataBiology#74)
- Fix some issues on macOS

celiosantosjr added 9 commits June 19, 2025 11:30

Round counts in count.ngl

acaa663

- include a rounding in ngless counts to avoid split matches in Paladin

FIX setup.py

6813e92

- Eliminate compatibility problems - do not test using Python 3.8 due to ONNX models

Delete macrel/data/models/AMP.onnx.gz

795b09f

- eliminate ir10 model

Delete macrel/data/models/Hemo.onnx.gz

6115a30

- eliminate ir10 model

Delete macrel/data/models directory

2174108

Create README.txt

f2949d7

ENH count.ngl

4193ebb

Key Changes:

- Added filter() step with samflag(secondary=False), which is the NGless equivalent of -F 256
- The filtered alignments are then passed to count()
- Maintained your existing integer conversion and output

This was linked to issues Jun 19, 2025

Use a more robust way to distribute models #72

Closed

Non-integer results appear when AMP abundance is measured. #73

Closed

celiosantosjr added 5 commits June 19, 2025 12:50

FIX count.ngl

36cf937

Update count.ngl

a86d4a7

Update count.ngl

2993803

Merge branch 'main' into celiosantosjr-patch-1

81f44aa

Update count.ngl

d3170c1

celiosantosjr added the bug and enhancement labels Jun 19, 2025

celiosantosjr merged commit c21cd19 into main Jun 19, 2025
30 checks passed

celiosantosjr deleted the celiosantosjr-patch-1 branch June 19, 2025 06:19

celiosantosjr merged 14 commits into main from celiosantosjr-patch-1
