Skip to content

Conversation

@ParfenovS
Copy link

The default value for molecule parameter of query_lines function is 'All'. One, therefore, can expect that the following example code should work and return the lines of all species:

import astropy.units as u
from astroquery.linelists.cdms import CDMS
CDMS.query_lines(min_frequency=100.3 * u.GHz,max_frequency=100.5 * u.GHz,min_strength=-5)

However, this code does not work. This PR adds support for such a use case and makes the code above work. The returned line list can include the species with bad formatting. This PR adds a filter to exclude such species from the list (there are species with bad formatting in the frequency range in the example code above). Maybe it can be a good idea to warn a user that some lines can be excluded from the query results (I think one can discuss on how to do it in a proper way). This PR also adds some test of the new functionality. This test does not follow a pattern of other tests in that it does not utilize the the row number to check the test query result. The intent is to avoid the test breakage due to CDMS data updates.

PR2385 fixed some errors of line list parsing. However, some errors still remained. The starts value for MOLWT field should be decreased as this field occupies a larger number of characters that it is assumed in the current code version. An example is contained in the results of the example code above, where MOLWT field has an extra character for 1-CN-2-CCH-C6H4.
In the current code version, parse_letternumber function does not work correctly for parameters like a0, b0 etc. as it should according to CDMS documentation.
This PR fixes both these parsing errors.

@keflavich
Copy link
Contributor

Thanks for these fixes.

Regarding the failed example cases, I'll want to more carefully review and note them in tests. I'll do that now, just letting you know it's work in progress.

@keflavich
Copy link
Contributor

I'm trying to hunt down a molecule that actually uses the negative values for quantum numbers. Any ideas for a good one to use?

@keflavich
Copy link
Contributor

I built more tests, and in doing so, discovered more problems & corner cases. I have the feeling that CDMS is changing their fixed formatting on me every week, but the lack of test coverage means I have no evidence for that.

@ParfenovS
Copy link
Author

CH3CN is a good example of molecule, where lower case characters are used to signal negative quantum numbers smaller than –9.

@codecov
Copy link

codecov bot commented Apr 28, 2025

Codecov Report

Attention: Patch coverage is 43.18182% with 25 lines in your changes missing coverage. Please review.

Project coverage is 69.82%. Comparing base (4ddb0aa) to head (c94ad8d).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
astroquery/linelists/cdms/core.py 43.18% 25 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3302      +/-   ##
==========================================
- Coverage   69.87%   69.82%   -0.06%     
==========================================
  Files         232      232              
  Lines       19769    19797      +28     
==========================================
+ Hits        13814    13823       +9     
- Misses       5955     5974      +19     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bsipocz bsipocz added the cdms label Apr 28, 2025
@bsipocz bsipocz added this to the v0.4.11 milestone Apr 28, 2025
Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a changelog, also I see test failures with pytest -P linelists -R. Could you @keflavich have another look?

(also, I see a couple of commits even with a "needs to be squashed" message, so do go ahead and give it a squash, too :) )

ParfenovS and others added 6 commits May 21, 2025 18:19
…; fix CDMS quantum numbers parsing

1) Adding support for CDMS queries with lines of all species
2) Fixing the CMDS lines list parsing

Support CDMS all species option; fix format for CDMS linelist reading; fix CDMS quantum numbers parsing

Adding test for a new functionality when all species are requested from CDMS
trivial formatting

fix my refactor; it was incorrect

oops, fix to last one (yes, this needs to be squashed; pushing fast to skip tests... and spam my inbox...)
shift tag back one spot.  Fix tests to accommodate more complete "tag" name

add the b1 = -21 test

whitespace
@keflavich
Copy link
Contributor

I swear CDMS is trolling me.... this was working 3 weeks ago. I'll look more closely again.

@bsipocz
Copy link
Member

bsipocz commented May 21, 2025

I swear CDMS is trolling me.... this was working 3 weeks ago. I'll look more closely again.

Yeap, I'm sorry that it took some for me to get back to this for a final review. On the bright side, it's not main that is broken now, only the PR 😅

@bsipocz bsipocz modified the milestones: v0.4.11, 0.4.12 Sep 20, 2025
Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some doctest failures, but overall this is looking good (well, I trust your parser and tests, I didn't dive into that part of the code during review).

Please fix the doctests and this can be merged.

if return_response:
return response
result = self._parse_cat(response)
result = self._parse_cat(response.text)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we raise_for_status prior to this? Or maybe can we even upstream that raise for status and do it inside _request()? -- that case should/would be a follow-up rather than in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, just adding the raise_for_status here for now. Doing it in request is an idea, but then it changes the behavior of request from being near-equivalent to request.get/request.post to being very different - I don't like that, it will have a lot of secondary effects that I expect would make debugging harder.

return result

def _parse_cat(self, response, *, verbose=False):
def _parse_cat(self, text, *, verbose=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, but maybe call it text_response :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function works on any .cat file from CDMS, so I'd like to consider making it public and leaving this as-is. Maybe that can be a different PR though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants