
Conversation

@keflavich (Contributor)

Follow-up to #3296 & #3297.

  • the catalog is updated
  • there was an off-by-one error that affected SiC6, aka "Hexatriynylidenesilylidyne", which reads like a bad joke of a molecule name
  • the Big Data test is parametrized so one can see the failures more clearly

@keflavich (Contributor, author)

omg... what happened to that commit message? I'm going to clean that up.

…r this time. Now just guesswork though....'! I didn't check the AI autocomplete... how did it come up with 3300?)
parametrize test over full list of molecules
@bsipocz (Member) commented Apr 22, 2025

fwiw, I use Tom's script to see what the next number is, without needing to open a browser: https://github.com/astropy/astropy-tools/blob/main/next_pr_number.py
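
The gist of that approach, sketched against the public GitHub REST API (this is an illustration, not the actual contents of next_pr_number.py; the repo name is just an example):

    # Ask GitHub for the most recently created issue/PR in a repo and add one.
    # The /issues endpoint includes pull requests, so it covers both counters.
    import requests

    def next_number(repo="astropy/astroquery"):
        url = f"https://api.github.com/repos/{repo}/issues"
        params = {"state": "all", "sort": "created", "direction": "desc", "per_page": 1}
        latest = requests.get(url, params=params, timeout=30).json()[0]
        return latest["number"] + 1

    print(next_number())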

codecov bot commented Apr 22, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.47%. Comparing base (213a0cd) to head (bc759a5).
Report is 156 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3298   +/-   ##
=======================================
  Coverage   69.47%   69.47%           
=======================================
  Files         232      232           
  Lines       19707    19707           
=======================================
  Hits        13691    13691           
  Misses       6016     6016           


@keflavich (Contributor, author)

This is ready from my side. Let me know if you want another changelog entry (besides fixing the last one...).

@bsipocz (Member) left a comment:

The test runtime doubles, so that should ideally be looked into before this can be merged.

The diff under discussion (from astroquery/linelists/cdms/tests/test_cdms_remote.py):

-for row in species_table:
+@pytest.mark.parametrize('row', species_table)
+def test_regression_allcats(self, row):
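
For context, the shape of the refactor is roughly the following sketch (illustrative stand-ins, not the exact test module contents):

    import pytest

    # Stand-in for the real species table; the actual test builds this from the
    # CDMS species catalog (~1300 rows).
    species_table = ["CO, v=0", "SiC6", "HC7N"]

    def check_row(row):
        # Hypothetical per-row query and assertions.
        assert row

    class TestRegressionAllCats:

        # Old layout: a single test looping over every row; the first exception
        # aborts the loop, so later rows are never queried and pytest reports
        # only one failure.
        #
        #     def test_regression_allcats(self):
        #         for row in species_table:
        #             check_row(row)
        #
        # New layout: one collected test per row, so each row gets its own
        # pass/fail line and a failure in one row doesn't stop the rest.
        @pytest.mark.parametrize('row', species_table)
        def test_regression_allcats(self, row):
            check_row(row)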
@bsipocz (Member):

Now there are two failures, and the overall test runtime has doubled.

FAILED astroquery/linelists/cdms/tests/test_cdms_remote.py::TestRegressionAllCats::test_regression_allcats[row1135] - requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
FAILED astroquery/linelists/cdms/tests/test_cdms_remote.py::TestRegressionAllCats::test_regression_allcats[row1268] - requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

@keflavich (Contributor, author):

Those are (probably) intermittent errors, and the runtime doubling is likely server-side variation: the test is identical to what it was before; it just does more reporting now. Admittedly, that could be pytest overhead, but I suspect not.

(also there are 6 more entries than previously, but it's unlikely that 6 / 1306 doubles the time)

@keflavich (Contributor, author):

This test is marked big_data and is normally skipped, but it's really useful to be able to run occasionally to check whether they've added a new molecule that hits a corner case we hadn't handled before (as happened with 100501).
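
For reference, plain pytest marker selection is enough to get that opt-in behaviour; a minimal sketch (the exact marker spelling and command-line wiring in astroquery's own configuration may differ):

    import pytest

    # Everyday runs deselect the slow catalog sweep:   pytest -m "not bigdata"
    # An occasional full sweep opts back in:           pytest -m bigdata
    # (Register the marker in pytest.ini/setup.cfg to avoid the unknown-marker
    # warning.)
    @pytest.mark.bigdata
    def test_full_catalog_sweep():
        ...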

@bsipocz (Member):

I'm OK with keeping the test in even if it's slow (when running locally I don't skip the bigdata ones), but this PR makes it take 15 minutes instead of 7. I'm pretty sure that's not pytest overhead.

@keflavich (Contributor, author):

Are you saying that this test accounts for 50% of the total astroquery test time? Or that the change in this PR specifically has increased the test time by 2x?

If the latter... I don't understand what could cause that; it's just a change from a for loop to a parametrization over the same list. I greatly prefer this version, since it gives a useful progress bar and a more useful message if it breaks.

My guess is still that it's time-of-day/server-load dependent, since the test is doing ~1300 independent queries (before & after this PR)

@keflavich (Contributor, author):

Also, I ran the tests locally without failure, so that points to the failures above being intermittent, one-time things.

@bsipocz (Member):

It's this module's timing that's up from 7 to 15 minutes going from main to this PR.

(Total time for the remote tests is in the ~1.5-2 hour range including the reruns, so it's not a critical deal in the big picture, but I would like to understand why this refactoring doubles the runtime.)

@keflavich (Contributor, author):

OK, yeah, that's really weird. I wonder if one of the new molecules has an absolutely gigantic file... does pytest do per-test timing? I'll check.
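
(For the record, pytest does do per-test timing out of the box: running with --durations=25 prints the slowest test calls along with their setup/teardown phases, and --durations=0 lists every test, which should show whether one of the new molecules dominates the runtime of astroquery/linelists/cdms/tests/test_cdms_remote.py.)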

@bsipocz (Member):

I'll DM you on Slack :)

@bsipocz (Member):

OK, I think I see what the issue is now; this PR should be OK.

(I see much shorter times with the previous layout because I run into a ConnectionError, so a lot of molecules are never run. With the new layout, if a molecule fails, the next one is still picked up rather than the whole thing bailing on the remainder of the loop.)

@bsipocz (Member) left a comment:

Thanks! This in fact now ensures that we always run all the molecules.

@bsipocz merged commit 67b5ed5 into astropy:main on Apr 23, 2025
12 of 13 checks passed