Conversation

@jimfulton
Contributor

This saves ~12% in execution time and avoids unnecessary BQ churn.

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #716 🦕

This mostly just stops creating extra datasets. Only 3 tests used the dataset created in setUp. Those 3 tests now share the dataset created by the dataset_id fixture. This saves ~12% of test-execution time.

More could be saved by converting most of the remaining tests in test_client to pytest and using the dataset_id fixture. That should be done in follow-on PRs.

Jim Fulton added 2 commits July 21, 2021 13:56
This saves ~12% in execution time and avoids unnecessary BQ churn.
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Jul 21, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Jul 21, 2021
@tswast
Contributor

tswast commented Jul 21, 2021

I see that we already have a session-scope dataset_id fixture in

def dataset_id(bigquery_client):

Maybe there's a way we can use that with our mix of unittest and pytest style tests?

@tswast
Contributor

tswast commented Jul 21, 2021

Maybe we could even update that conftest.py file to use the new "prefixer" module?

import pytest
import test_utils.prefixer

from google.cloud import bigquery
from . import helpers


prefixer = test_utils.prefixer.Prefixer("python-bigquery", "tests/system")


@pytest.fixture(scope="session", autouse=True)
def cleanup_datasets(bigquery_client: bigquery.Client):
    for dataset in bigquery_client.list_datasets():
        if prefixer.should_cleanup(dataset.dataset_id):
            bigquery_client.delete_dataset(
                dataset, delete_contents=True, not_found_ok=True
            )

...

@pytest.fixture(scope="session")
def dataset_id(bigquery_client: bigquery.Client, project_id: str):
    dataset_id = prefixer.create_prefix()
    full_dataset_id = f"{project_id}.{dataset_id}"
    dataset = bigquery.Dataset(full_dataset_id)
    bigquery_client.create_dataset(dataset)
    yield dataset_id
    bigquery_client.delete_dataset(dataset, delete_contents=True, not_found_ok=True)
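For readers unfamiliar with the pattern: the prefixer's job is to give every test resource a recognizable, timestamped name so a session-level fixture can sweep up leftovers from crashed runs. A rough sketch of the idea (`SimplePrefixer` is invented here for illustration and is not the real `test_utils.prefixer` implementation):

```python
import datetime
import random
import string


class SimplePrefixer:
    """Toy stand-in for test_utils.prefixer.Prefixer, for illustration only."""

    def __init__(self, product: str, path: str, separator: str = "_"):
        self._separator = separator
        # Derive a stable base prefix from the product name and test path,
        # e.g. "python_bigquery_tests_system".
        raw = product + separator + path
        self._base = raw.replace("/", separator).replace("-", separator)

    def create_prefix(self) -> str:
        # Embed a UTC timestamp so stale resources can be aged out later.
        stamp = datetime.datetime.utcnow().strftime("%Y%m%d%H%M%S")
        suffix = "".join(random.choices(string.ascii_lowercase, k=6))
        return self._separator.join([self._base, stamp, suffix])

    def should_cleanup(self, resource_id: str, max_age_hours: int = 2) -> bool:
        # Only touch resources that carry our prefix, and only if they are stale
        # (fresh ones may belong to a test run that is still in progress).
        marker = self._base + self._separator
        if not resource_id.startswith(marker):
            return False
        stamp = resource_id[len(marker):].split(self._separator)[0]
        try:
            created = datetime.datetime.strptime(stamp, "%Y%m%d%H%M%S")
        except ValueError:
            return False
        age = datetime.datetime.utcnow() - created
        return age > datetime.timedelta(hours=max_age_hours)
```

With this scheme, the autouse `cleanup_datasets` fixture above can safely delete any dataset whose name matches the prefix and is old enough, without touching unrelated datasets in the project.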

@jimfulton
Contributor Author

I see that we already have a session-scope dataset_id fixture in

def dataset_id(bigquery_client):

I'm using that.

Maybe there's a way we can use that with our mix of unittest and pytest style tests?

You can't use non-autouse fixtures in unittest-style tests. I tried. That's why I converted those three tests to pytest style.

@jimfulton
Contributor Author

Maybe we could even update that conftest.py file to use the new "prefixer" module?

Will do.

@jimfulton
Contributor Author

done

@jimfulton jimfulton marked this pull request as ready for review July 22, 2021 16:39
@jimfulton jimfulton requested a review from a team July 22, 2021 16:39
@jimfulton jimfulton requested a review from a team as a code owner July 22, 2021 16:39
@jimfulton jimfulton requested review from shollyman and removed request for a team July 22, 2021 16:39
rows = sorted(row.values() for row in rows_iter)
assert rows == [(1, "one"), (2, "two")]

def temp_dataset(self, dataset_id, location=None):
Contributor

This test uses the shared dataset_id fixture, but wants to create the dataset itself -- should it be updated to use a different one?

Contributor Author

I'm confused. Are you referring to test_table_snapshots? It was using the dataset created in setUp. I don't see it creating a dataset. Am I blind? :)

Contributor

Sorry, I was referring to the temp_dataset helper (my eye read that as test_dataset). It is used by test_create_dataset, test_update_dataset, etc. Because it isn't a test, it isn't using the dataset_id fixture, but an actual passed-in dataset_id parameter.

I don't know whether all those tests actually need to be using separate datasets, but could imagine that at least some could share.
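For tests that only need distinct tables rather than distinct datasets, a common way to share one dataset is to randomize the table name instead (a hypothetical helper, not part of this PR):

```python
import uuid


def unique_table_id(dataset_id: str, prefix: str = "tbl") -> str:
    """Build a collision-free table ID inside a shared dataset.

    Each call returns a fresh name, so tests running against the same
    session-scoped dataset cannot clobber each other's tables.
    """
    return f"{dataset_id}.{prefix}_{uuid.uuid4().hex}"
```

Tests that genuinely exercise dataset creation or update (like test_create_dataset) would still need their own datasets, which is what the temp_dataset helper provides.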

Contributor Author

A number of the other tests could and should be refactored to use the shared dataset. In the interest of not making this PR too complex, I opted to save updating the other tests for a future PR.

Contributor Author

This PR does the minimum to get rid of the mostly-unused dataset creation in setUp.

@shollyman shollyman requested review from tswast and removed request for shollyman July 22, 2021 18:56
@jimfulton jimfulton merged commit eef6c8e into master Jul 25, 2021
@jimfulton jimfulton deleted the riversnake-system-tests-fewer-datasets branch July 25, 2021 13:51

Development

Successfully merging this pull request may close these issues.

test: System tests create and destroy way too many datasets