test: Stop creating extra datasets #791
This saves ~12% in execution time and avoids unnecessary BQ churn.
and blacken
I see that we already have a session-scope python-bigquery/tests/system/conftest.py Line 35 in e587029
Maybe there's a way we can use that with our mix of
Maybe we could even update that

    import pytest
    import test_utils.prefixer

    from google.cloud import bigquery

    from . import helpers

    prefixer = test_utils.prefixer.Prefixer("python-bigquery", "tests/system")


    @pytest.fixture(scope="session", autouse=True)
    def cleanup_datasets(bigquery_client: bigquery.Client):
        # Delete every dataset that the prefixer flags for cleanup.
        for dataset in bigquery_client.list_datasets():
            if prefixer.should_cleanup(dataset.dataset_id):
                bigquery_client.delete_dataset(
                    dataset, delete_contents=True, not_found_ok=True
                )

    ...

    @pytest.fixture(scope="session")
    def dataset_id(bigquery_client: bigquery.Client, project_id: str):
        # Create one dataset for the whole session and delete it at the end.
        dataset_id = prefixer.create_prefix()
        full_dataset_id = f"{project_id}.{dataset_id}"
        dataset = bigquery.Dataset(full_dataset_id)
        bigquery_client.create_dataset(dataset)
        yield dataset_id
        bigquery_client.delete_dataset(dataset, delete_contents=True, not_found_ok=True)
I'm using that.
You can't use non-auto-use fixtures in unittest tests. I tried. That's why I moved those three tests to pytest tests.
Will do.
done
    rows = sorted(row.values() for row in rows_iter)
    assert rows == [(1, "one"), (2, "two")]

    def temp_dataset(self, dataset_id, location=None):
This test uses the shared dataset_id fixture, but wants to create the dataset itself -- should it be updated to use a different one?
I'm confused. Are you referring to test_table_snapshots? It was using the dataset created in setUp. I don't see it creating a dataset. Am I blind? :)
Sorry, I was referring to the temp_dataset helper (my eye read that as test_dataset). It is used by test_create_dataset, test_update_dataset, etc. Because it isn't a test, it isn't using the dataset_id fixture, but an actual passed-in dataset_id parameter.
I don't know whether all those tests actually need to be using separate datasets, but could imagine that at least some could share.
A number of the other tests could and should be refactored to use the shared dataset. In the interest of not making this PR too complex, I opted to save updating the other tests for a future PR.
This PR does the minimum to get rid of the mostly-unused dataset creation in setUp.
Fixes #716 🦕
This mostly just stops creating extra datasets. Only 3 tests used the dataset created in setUp. Those 3 tests now share the dataset created using the dataset_id fixture. This saves ~12% of test-execution time.

More could be saved by converting most of the remaining tests in test_client to pytest and using the dataset_id fixture. That should be done in follow-on PRs.
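To make the source of the saving concrete, here is a rough stand-in that counts dataset creations under the two patterns. It is only a sketch under stated assumptions: `FakeClient`, `shared_dataset` and the two runner functions are hypothetical and use neither pytest nor the BigQuery client. The context manager mirrors the set-up, yield, tear-down shape of the session-scoped `dataset_id` fixture.

```python
import contextlib


class FakeClient:
    """Hypothetical stand-in that only counts dataset creations and deletions."""

    def __init__(self):
        self.created = 0
        self.deleted = 0

    def create_dataset(self, dataset_id):
        self.created += 1

    def delete_dataset(self, dataset_id):
        self.deleted += 1


@contextlib.contextmanager
def shared_dataset(client, dataset_id="shared_ds"):
    # Same shape as a yield fixture: set up, hand out the ID, tear down.
    client.create_dataset(dataset_id)
    try:
        yield dataset_id
    finally:
        client.delete_dataset(dataset_id)


def run_with_per_test_setup(client, tests):
    # setUp-style pattern: every test pays for its own dataset, used or not.
    for index, test in enumerate(tests):
        dataset_id = f"ds_{index}"
        client.create_dataset(dataset_id)
        try:
            test(dataset_id)
        finally:
            client.delete_dataset(dataset_id)


def run_with_shared_dataset(client, tests):
    # Shared pattern: one dataset is created for the whole run.
    with shared_dataset(client) as dataset_id:
        for test in tests:
            test(dataset_id)


if __name__ == "__main__":
    tests = [lambda dataset_id: None] * 10
    per_test, shared = FakeClient(), FakeClient()
    run_with_per_test_setup(per_test, tests)
    run_with_shared_dataset(shared, tests)
    print(per_test.created, shared.created)  # prints "10 1"
```

The counts say nothing about real timings. They only show why the number of create and delete calls stops growing with the number of tests once the dataset is shared.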