Hello-
Thanks for your efforts; it's an interesting paper and an exciting advancement! I ran some test data both on Hugging Face and locally, and I'm seeing discrepancies in the outputs, particularly after the InstaNovo+ step. I am running with the default parameters. The diffusion_predictions are often empty and are not consistent with the transformer_predictions in the final output.
Hugging Face output:
| scan_number | precursor_mz | precursor_charge | retention_time | spectrum_id | experiment_name | transformer_prediction | transformer_log_probability | refined_prediction | refined_log_probability | refined_delta_mass_ppm |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 468.871978759766 | 3 | 212.1862777 | test:0 | test | GHNSYTC[UNIMOD:4]EATHK | -0.4897 | GHNSYTC[UNIMOD:4]EATHK | -0.0007 | 335465.81 |
| 1 | 778.737365722656 | 3 | 212.454416 | test:1 | test | GGEEEEEEEEEEEEEEEEK | -252.0484 | GGEEEEEEEEEEEEEEEEK | -0.1081 | 331829.78 |
Here is the local results file before refinement:
| scan_number | precursor_mz | precursor_charge | experiment_name | spectrum_id | predictions | predictions_tokenised | log_probabilities | token_log_probabilities | delta_mass_ppm |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 468.871978759766 | 3 | test | test:0 | GHNSYTC[UNIMOD:4]EATHK | G, H, N, S, Y, T, C[UNIMOD:4], E, A, T, H, K | -0.49145248532295227 | [-0.07410459220409393, -0.013641584664583206, -0.2776679992675781, -0.002582312561571598, -0.11276249587535858, -0.0019204046111553907, -3.8980677345534787e-05, -0.0028956886380910873, -6.09140915912576e-05, -0.00011085849109804258, -0.00025555206229910254, -0.00501991854980588] | 3.7517556583161897 |
| 1 | 778.737365722656 | 3 | test | test:1 | GGEEEEEEEEEEEEEEEEK | G, G, E, E, E, E, E, E, E, E, E, E, E, E, E, E, E, E, K | -252.05099487304688 | [-3.0641820430755615, -2.57352352142334, -1.11761474609375, -1.0334386825561523, -0.8424715995788574, -0.6960052847862244, -0.6549468636512756, -0.7385885119438171, -0.818410336971283, -0.8820850253105164, -0.8136520385742188, -0.6193901896476746, -0.833812415599823, -1.0504907369613647, -1.124313235282898, -1.5463999509811401, -0.7983300685882568, -0.8665404319763184, -0.2932451367378235] | 3149.0891502897002 |
Here is the table with the final results:
| scan_number | precursor_mz | precursor_charge | experiment_name | spectrum_id | delta_mass_ppm | diffusion_predictions_tokenised | diffusion_predictions | diffusion_log_probabilities | transformer_predictions | transformer_predictions_tokenised | transformer_log_probabilities | transformer_token_log_probabilities |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 468.871978759766 | 3 | test | test:0 | 3.7517556583161897 | [] | | -0.003761152969673276 | GHNSYTC[UNIMOD:4]EATHK | G, H, N, S, Y, T, C[UNIMOD:4], E, A, T, H, K | -0.49145248532295227 | [-0.07410459220409393, -0.013641584664583206, -0.2776679992675781, -0.002582312561571598, -0.11276249587535858, -0.0019204046111553907, -3.8980677345534787e-05, -0.0028956886380910873, -6.09140915912576e-05, -0.00011085849109804258, -0.00025555206229910254, -0.00501991854980588] |
| 1 | 778.737365722656 | 3 | test | test:1 | 3149.0891502897002 | ['F', 'S'] | FS | -0.03220748156309128 | GGEEEEEEEEEEEEEEEEK | G, G, E, E, E, E, E, E, E, E, E, E, E, E, E, E, E, E, K | -252.05099487304688 | [-3.0641820430755615, -2.57352352142334, -1.11761474609375, -1.0334386825561523, -0.8424715995788574, -0.6960052847862244, -0.6549468636512756, -0.7385885119438171, -0.818410336971283, -0.8820850253105164, -0.8136520385742188, -0.6193901896476746, -0.833812415599823, -1.0504907369613647, -1.124313235282898, -1.5463999509811401, -0.7983300685882568, -0.8665404319763184, -0.2932451367378235] |
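A minimal sketch for flagging the affected rows, assuming the final results are saved as a CSV with the column names shown in the table above. The helper name `find_mismatches` is hypothetical and not part of InstaNovo, and the inline sample only mirrors the two rows reported here. A "differs" flag is not necessarily an error, since refinement may legitimately change a sequence; it only marks rows worth inspecting.

```python
import csv
import io


def find_mismatches(rows):
    """Return (spectrum_id, reason) pairs for rows worth inspecting.

    Hypothetical helper: 'empty' means diffusion_predictions is blank,
    'differs' means it does not match transformer_predictions.
    """
    flagged = []
    for row in rows:
        diffusion = (row.get("diffusion_predictions") or "").strip()
        transformer = (row.get("transformer_predictions") or "").strip()
        if not diffusion:
            flagged.append((row["spectrum_id"], "empty"))
        elif diffusion != transformer:
            flagged.append((row["spectrum_id"], "differs"))
    return flagged


# Inline sample with only the relevant columns, copied from the table above.
# For a real file, pass csv.DictReader(open(path, newline="")) instead.
sample = """spectrum_id,diffusion_predictions,transformer_predictions
test:0,,GHNSYTC[UNIMOD:4]EATHK
test:1,FS,GGEEEEEEEEEEEEEEEEK
"""

rows = list(csv.DictReader(io.StringIO(sample)))
flagged = find_mismatches(rows)
for spectrum_id, reason in flagged:
    print(spectrum_id, reason)
```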