Improve client benchmark identification #3496
@@ -1,9 +1,11 @@
+from http import HTTPStatus
 from typing import AnyStr, List, NoReturn, Union

 from flask import current_app

 from pbench.server import JSON, PbenchServerConfig
 from pbench.server.api.resources import (
+    APIAbort,
     ApiAuthorizationType,
     ApiContext,
     APIInternalError,
@@ -119,11 +121,13 @@ def get_index(self, dataset: Dataset, root_index_name: AnyStr) -> AnyStr:
         try:
             index_map = Metadata.getvalue(dataset=dataset, key=Metadata.INDEX_MAP)
         except MetadataError as exc:
-            current_app.logger.error("{}", exc)
-            raise APIInternalError(f"Required metadata {Metadata.INDEX_MAP} missing")
+            if Metadata.getvalue(dataset, Metadata.SERVER_ARCHIVE):
+                raise APIAbort(HTTPStatus.CONFLICT, "Dataset is not indexed")
Comment on lines +124 to +125

Member: This message is insufficient: in 90% of the cases when users receive it, they are going to ask, "why isn't this dataset indexed?" And it's unlikely that they will realize it's because indexing was disabled on the dataset when it was uploaded. So this message really should say something more like, "Indexing of this dataset was disabled by the owner".

Member (Author): It's not necessarily by deliberate action of the owner: if a tarball is uploaded without a metadata.log, it is automatically marked for archive only.

Member: Ah, yes, I'd neglected that detail. But, yes, "marked for archive only" would be much better than "not indexed".
+            raise APIInternalError(
+                f"Required metadata {Metadata.INDEX_MAP} missing"
+            ) from exc

         if index_map is None:
-            current_app.logger.error("Index map metadata has no value")
             raise APIInternalError(
                 f"Required metadata {Metadata.INDEX_MAP} has no value"
             )
@@ -612,6 +612,13 @@ class Metadata(Database.Base):
     METALOG = "metalog"
     NATIVE_KEYS = [GLOBAL, METALOG, SERVER, USER]

+    # BENCHMARK provides a simple identification of the native benchmark,
+    # where one can be identified. At the simplest, for Pbench Agent tarballs,
+    # this is just "dataset.metalog.pbench.script", but it will be expanded in
+    # the future, for example to identify a "pbench-user-benchmark -- fio" as
+    # "fio".
+    SERVER_BENCHMARK = "server.benchmark"
+    SERVER_BENCHMARK_UNKNOWN = "unknown"
Comment on lines +615 to +620

Member: It probably would be good to have a reference here to possible values for this key (i.e., to …

Member (Author): Those would constitute examples, but I already added …

Member: So, the value associated with …
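For reference, a self-contained sketch (illustrative only, not pbench source; `derive_benchmark` is a hypothetical name) of how the server might populate this key from the agent's metadata.log, consistent with the assertions in the functional tests below:

    SERVER_BENCHMARK_UNKNOWN = "unknown"

    def derive_benchmark(metalog: dict) -> str:
        """Derive a "server.benchmark" value from agent metadata.

        At the simplest this is the "pbench.script" value recorded in
        metadata.log; a tarball uploaded without a metadata.log has no
        script and falls back to "unknown".
        """
        try:
            return metalog["pbench"]["script"] or SERVER_BENCHMARK_UNKNOWN
        except (KeyError, TypeError):
            return SERVER_BENCHMARK_UNKNOWN

    assert derive_benchmark({"pbench": {"script": "fio"}}) == "fio"
    assert derive_benchmark({}) == SERVER_BENCHMARK_UNKNOWN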

     # DELETION timestamp for dataset based on user settings and system
     # settings when the dataset is created.
     #
@@ -79,7 +79,13 @@ def test_upload_all(self, server_client: PbenchServerClient, login_user):
             print(f"\t... uploaded {name}: {a}")

         datasets = server_client.get_list(
-            metadata=["dataset.access", "server.tarball-path", "dataset.operations"]
+            metadata=[
+                "dataset.access",
+                "dataset.metalog.pbench.script",
+                "server.benchmark",
+                "server.tarball-path",
+                "dataset.operations",
+            ]
         )
         found = frozenset({d.name for d in datasets})
         expected = frozenset(tarballs.keys())

@@ -91,6 +97,10 @@ def test_upload_all(self, server_client: PbenchServerClient, login_user):
                 t = tarballs[dataset.name]
                 assert dataset.name in dataset.metadata["server.tarball-path"]
                 assert dataset.metadata["dataset.operations"]["UPLOAD"]["state"] == "OK"
+                assert (
+                    dataset.metadata["dataset.metalog.pbench.script"]
+                    == dataset.metadata["server.benchmark"]
+                )
Comment on lines +100 to +103

Member: While this assertion (generally) holds at the moment, this isn't really what we want going forward, right? That is, we're hoping that at some point we'll have … (Also, you didn't take this shortcut at lines 190 and 192.)

Member (Author): For the archiveonly test, I know exactly what values to expect; this is an iteration, and all we can really know at this point is that they ought to match. When and if we work out a more general "base benchmark" recognition that doesn't rely on …

Member: They should only match currently because of an implementation detail, not because that's specifically the way we want these fields to behave. It would be best if the tests were constructed in a way that doesn't require them to change when details of the implementation change. (Ideally, we would not change the tests nearly as often as we seem to do.)
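One way to decouple the assertion from that implementation detail (a hypothetical reformulation; `check_benchmark` is not part of the test suite) would be to accept either the agent-recorded script or the server's documented fallback:

    def check_benchmark(metadata: dict) -> None:
        # Accept the agent's recorded script or the "unknown" fallback,
        # rather than requiring strict equality between the two keys.
        script = metadata.get("dataset.metalog.pbench.script")
        benchmark = metadata["server.benchmark"]
        assert benchmark in (script, "unknown")

    check_benchmark({"dataset.metalog.pbench.script": "fio", "server.benchmark": "fio"})
    check_benchmark({"server.benchmark": "unknown"})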
                 assert t.access == dataset.metadata["dataset.access"]
             except HTTPError as exc:
                 pytest.fail(

@@ -169,9 +179,17 @@ def test_archive_only(server_client: PbenchServerClient, login_user):
         API.DATASETS_INVENTORY, {"dataset": md5, "target": ""}
     )
     metadata = server_client.get_metadata(
-        md5, ["dataset.operations", "server.archiveonly"]
+        md5,
+        [
+            "dataset.metalog.pbench.script",
+            "dataset.operations",
+            "server.archiveonly",
+            "server.benchmark",
+        ],
     )
+    assert metadata["dataset.metalog.pbench.script"] == "fio"
     assert metadata["server.archiveonly"] is True
+    assert metadata["server.benchmark"] == "fio"

     # NOTE: we could wait here; however, the UNPACK operation is only
     # enabled by upload, and INDEX is only enabled by UNPACK: so if they're

@@ -209,9 +227,18 @@ def test_no_metadata(server_client: PbenchServerClient, login_user):
         API.DATASETS_INVENTORY, {"dataset": md5, "target": ""}
     )
     metadata = server_client.get_metadata(
-        md5, ["dataset.operations", "dataset.metalog", "server.archiveonly"]
+        md5,
+        [
+            "dataset.operations",
+            "dataset.metalog",
+            "server.archiveonly",
+            "server.benchmark",
+        ],
    )
-    assert metadata["dataset.metalog"] == {"pbench": {"script": "Foreign"}}
+    assert metadata["dataset.metalog"] == {
+        "pbench": {"name": name, "script": "unknown"}
+    }
+    assert metadata["server.benchmark"] == "unknown"
     assert metadata["server.archiveonly"] is True

     # NOTE: we could wait here; however, the UNPACK operation is only
@@ -361,17 +388,21 @@ def test_details(self, server_client: PbenchServerClient, login_user):
         )
         for d in datasets:
             print(f"\t... checking run index for {d.name}")
-            indexed = not d.metadata["server.archiveonly"]
             response = server_client.get(
-                API.DATASETS_DETAIL, {"dataset": d.resource_id}
+                API.DATASETS_DETAIL, {"dataset": d.resource_id}, raise_error=False
             )
             detail = response.json()
+            indexed = not d.metadata["server.archiveonly"]
+            if indexed:
+                assert (
+                    response.ok
+                ), f"DETAILS for {d.name} failed with {detail['message']}"
ndokos marked this conversation as resolved.
                 assert (
                     d.metadata["dataset.metalog.pbench"]["script"]
                     == detail["runMetadata"]["script"]
                 )
+            else:
+                assert response.status_code == HTTPStatus.CONFLICT
+                print(f"\t\t... {d.name} is archiveonly")

@@ -554,8 +585,15 @@ def test_contents(self, server_client: PbenchServerClient, login_user):

         for dataset in datasets:
             response = server_client.get(
-                API.DATASETS_CONTENTS, {"dataset": dataset.resource_id, "target": ""}
+                API.DATASETS_CONTENTS,
+                {"dataset": dataset.resource_id, "target": ""},
+                raise_error=False,
             )
+            archive = dataset.metadata["server.archiveonly"]
+            if archive:
+                assert response.status_code == HTTPStatus.CONFLICT
+                return
webbnh marked this conversation as resolved.

             assert (
                 response.ok
             ), f"CONTENTS {dataset.name} failed {response.status_code}:{response.json()['message']}"

@@ -565,15 +603,14 @@ def test_contents(self, server_client: PbenchServerClient, login_user):
             # directory is technically possible, but not legal unless it's a
             # trivial "archiveonly" dataset. NOTE: this will also fail if
             # either the "directories" or "files" JSON keys are missing.
-            archive = dataset.metadata["server.archiveonly"]
-            assert json["directories"] or json["files"] or archive
+            assert json["directories"] or json["files"]
webbnh marked this conversation as resolved.

             # Even if they're empty, both values must be lists
             assert isinstance(json["directories"], list)
             assert isinstance(json["files"], list)

-            # Unless archiveonly, we need at least a metadata.log
-            assert archive or "metadata.log" in (f["name"] for f in json["files"])
+            # We need at least a metadata.log
+            assert "metadata.log" in (f["name"] for f in json["files"])

             for d in json["directories"]:
                 uri = server_client._uri(
@@ -598,7 +635,7 @@ def test_visualize(self, server_client: PbenchServerClient, login_user):
         """
         datasets = server_client.get_list(
             owner="tester",
-            filter=["dataset.metalog.pbench.script:uperf"],
+            filter=["server.benchmark:uperf"],
         )

         for dataset in datasets:
@@ -624,7 +661,7 @@ def test_compare(self, server_client: PbenchServerClient, login_user):
         """
         datasets = server_client.get_list(
             owner="tester",
-            filter=["dataset.metalog.pbench.script:uperf"],
+            filter=["server.benchmark:uperf"],
         )

         candidates = [dataset.resource_id for dataset in datasets]
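Both tests now select datasets by the server-computed key rather than the raw agent metadata. A client usage sketch (hypothetical; it assumes get_list accepts filter and metadata together, which the separate uses above suggest but do not demonstrate):

    # Select uperf runs via the new server-computed key; values illustrative.
    datasets = server_client.get_list(
        owner="tester",
        filter=["server.benchmark:uperf"],
        metadata=["server.benchmark", "server.archiveonly"],
    )
    for d in datasets:
        print(d.name, d.metadata["server.benchmark"])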
@@ -686,15 +723,25 @@ class TestUpdate:
     @pytest.mark.parametrize("access", ("public", "private"))
     def test_publish(self, server_client: PbenchServerClient, login_user, access):
         expected = "public" if access == "private" else "private"
-        datasets = server_client.get_list(access=access, mine="true")
+        datasets = server_client.get_list(
+            access=access, mine="true", metadata=["server.archiveonly"]
+        )
         print(f" ... updating {access} datasets to {expected} ...")
         for dataset in datasets:
-            server_client.update(dataset.resource_id, access=expected)
-            print(f"\t ... updating {dataset.name} to {access!r}")
-            meta = server_client.get_metadata(
-                dataset.resource_id, metadata="dataset.access"
-            )
-            assert meta["dataset.access"] == expected
+            response = server_client.update(
+                dataset.resource_id, access=expected, raise_error=False
+            )
+            print(f"\t ... updating {dataset.name} to {access!r}")
+            if response.ok:
Comment on lines 730 to +735

Member: Shouldn't the …

Member (Author): This shows up as a diff because I rearranged code, but I didn't change the line.

Member: True. But, I still think the lines are in the wrong order, etc. 🙂
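For illustration, the ordering the first comment appears to be driving at (a hypothetical rearrangement of the loop above, not part of this PR); note the existing print also reports `access` where `expected` is the actual target:

    for dataset in datasets:
        # Log the intent first, so the line precedes any failure output,
        # and report the target value ("expected") rather than the old one.
        print(f"\t ... updating {dataset.name} to {expected!r}")
        response = server_client.update(
            dataset.resource_id, access=expected, raise_error=False
        )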
+                assert not dataset.metadata["server.archiveonly"]
+                meta = server_client.get_metadata(
+                    dataset.resource_id, metadata="dataset.access"
+                )
+                assert meta["dataset.access"] == expected
+            else:
+                assert dataset.metadata[
+                    "server.archiveonly"
+                ], f"Indexed dataset {dataset.name} failed to update with {response.json()['message']}"

 class TestDelete:
Alternatively, …