
[Validate] Add Metadata and Attribute filters to Metrics #269


Merged
merged 21 commits into master from gunnar-metadata-and-attribute-filters on Apr 7, 2022

Conversation

@gatli (Contributor) commented Mar 31, 2022

This builds on @Anirudh-Scale's 3D Cuboid metrics PR: #261

That PR hasn't been merged yet, so the diff here is huge 😅

I've added annotation_filters and prediction_filters, which allow arbitrary filtering of the data passed into the metrics. The filters support the form (cond AND cond AND cond ...) OR (cond AND cond AND cond ...) OR ....

For example, if you would like to evaluate pedestrian predictions closer than 70 meters with more than 15 points, together with pedestrians farther than 70 m with more than 5 points, only against pedestrian annotations, you could write something like this:

close_pedestrians = [MetadataFilter("relative_distance", "<", 70), MetadataFilter("n_points", ">", 15), FieldFilter("label", "=", "pedestrian")]
farther_away_pedestrians = [MetadataFilter("relative_distance", ">", 70), MetadataFilter("n_points", ">", 5), FieldFilter("label", "=", "pedestrian")]
either_close_or_farther = [close_pedestrians, farther_away_pedestrians]
only_pedestrian_annotations = [FieldFilter("label", "=", "pedestrian")]

evaluation_functions = [
    eval_functions.cuboid_precision(prediction_filters=either_close_or_farther, annotation_filters=only_pedestrian_annotations),
    eval_functions.cuboid_recall(prediction_filters=either_close_or_farther, annotation_filters=only_pedestrian_annotations),
]

config_description = [f"{f.key} {f.op} {f.value}" for f in close_pedestrians] + [f"{f.key} {f.op} {f.value}" for f in farther_away_pedestrians]
test_name = f"Frog Lidar V2 - ({config_description})"
client.validate.create_scenario_test(
    test_name, training_slc_id, evaluation_functions=evaluation_functions
)

The filters support the following comparison operators:

class FilterOp(str, Enum):
    GT = ">"
    GTE = ">="
    LT = "<"
    LTE = "<="
    EQ = "="
    EQEQ = "=="
    NEQ = "!="
    IN = "in"
    NOT_IN = "not in"
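
For intuition, here is a minimal sketch of how such an OR-of-AND filter list could be evaluated against an annotation's metadata. This is illustrative only, not the library's implementation; the OPS table and the passes helper are assumptions:

import operator

# Illustrative sketch only, not the nucleus implementation. Assumes each filter
# exposes .key, .op and .value, and that annotations expose a metadata dict.
OPS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "=": operator.eq, "==": operator.eq, "!=": operator.ne,
    "in": lambda a, b: a in b,
    "not in": lambda a, b: a not in b,
}

def passes(annotation, or_branches):
    # or_branches is a list of AND branches; each branch is a list of filters.
    # A FieldFilter would look up getattr(annotation, f.key) instead of metadata.
    return any(
        all(OPS[f.op](annotation.metadata.get(f.key), f.value) for f in branch)
        for branch in or_branches
    )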

TODO:

  • Finish up adding the filters to all Metric classes
  • Make shapely installation optional

Anirudh-Scale and others added 7 commits March 31, 2022 15:43
linting for circle ci

version

used native polygon

adding shapely

adding shapely

changing shapely

changing shapely

updating shapely

poetry added shapely

edge case

np type
@gatli self-assigned this Mar 31, 2022
@gatli requested review from sasha-scale and pfmark March 31, 2022 13:59
@sasha-scale (Contributor) left a comment

Overall code looks really well structured and clean - amazing work 🙌

Testing locally, I think I found a small bug (although I wasn't able to root cause it).

Here's the bug report info:

https://dashboard.scale.com/nucleus/[email protected]

Here's how I created the unit test:

from nucleus.metrics import MetadataFilter, FieldFilter
BDD_TEST_SLICE = 'slc_c933az3ptmpg0cs2vyk0'
example_annotation_filter = [MetadataFilter("occluded", "==", False), FieldFilter("label", "=", "pedestrian")]
evaluation_functions = [
        client.validate.eval_functions.cuboid_precision(annotation_filters=example_annotation_filter),
        client.validate.eval_functions.cuboid_recall(annotation_filters=example_annotation_filter)
]
config_description = [f"{f.key} {f.op} {f.value}" for f in example_annotation_filter]
test_name = f"BDD Experiments 1 - ({config_description})"
test = client.validate.create_scenario_test(test_name,
                                         BDD_TEST_SLICE, evaluation_functions=evaluation_functions)

When I evaluated the unit test, there was an exception on the ML side. The error is:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 405, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 697, in __protected_call__
    return self.run(*args, **kwargs)
  File "/workspace/model_ci/services/evaluate/tasks.py", line 51, in evaluate
    response: MetricEvaluationResponse = evaluate_metric(request)
  File "/workspace/model_ci/services/metrics.py", line 66, in evaluate_metric
    per_item_metrics, overall_metric = compute_metric(preds_grouped, gt_grouped, eval_func)
  File "/workspace/model_ci/services/metrics.py", line 42, in compute_metric
    result = nucleus_metric(anno_list, pred_list)
  File "/usr/local/lib/python3.8/site-packages/nucleus/metrics/cuboid_metrics.py", line 272, in __call__
    cuboid_annotations = filter_by_metadata_fields(
  File "/usr/local/lib/python3.8/site-packages/nucleus/metrics/cuboid_metrics.py", line 190, in filter_by_metadata_fields
    and_conditions = [
  File "/usr/local/lib/python3.8/site-packages/nucleus/metrics/cuboid_metrics.py", line 191, in <listcomp>
    filter_to_comparison_function(cond) for cond in or_branch
  File "/usr/local/lib/python3.8/site-packages/nucleus/metrics/cuboid_metrics.py", line 127, in filter_to_comparison_function
    if FilterType(metadata_filter.type) == FilterType.FIELD:
AttributeError: 'str' object has no attribute 'type'

@@ -83,7 +86,10 @@ def create_scenario_test(
name=name,
slice_id=slice_id,
evaluation_functions=[
ef.to_entry() for ef in evaluation_functions # type:ignore
EvalFunctionListEntry(
Contributor

thank you 🙏

Contributor Author

No problem! Your solution was fine but I just disliked that it made some immutable fields look mutable 🙂

@gatli (Contributor Author) commented Apr 4, 2022

> Testing locally, I think I found a small bug (although I wasn't able to root cause it). […]

I'll have a look. I actually just fixed the re-parsing of lists of AND conditions, so it might be related to that. Thanks for testing it end-to-end!
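
For context on the fix mentioned above: the AttributeError shows a bare string reaching code that expects filter objects, which is consistent with a flat AND list being iterated as if it were a list of OR branches. A normalization step along these lines (a sketch, not the actual patch) avoids that:

# Sketch of the kind of normalization the fix implies; names are illustrative.
def ensure_dnf(filters):
    """Wrap a flat list of filters (an implicit AND) into [[f1, f2, ...]] so that
    downstream code can always iterate OR branches of AND-ed filters."""
    if not filters:
        return []
    if not isinstance(filters[0], (list, tuple)):
        # A flat list of filter objects is a single AND branch.
        return [list(filters)]
    return [list(branch) for branch in filters]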

@gatli (Contributor Author) commented Apr 4, 2022

I've updated the deployment and tested all public functions. They all work in my tests 🙂

gatli added 5 commits April 5, 2022 12:53
commit 7aef2e8
Author: Gunnar Atli Thoroddsen <[email protected]>
Date:   Wed Jan 19 12:02:58 2022 +0000

    Update version and add changelog

commit 5c0fec5
Author: Gunnar Atli Thoroddsen <[email protected]>
Date:   Mon Jan 17 13:04:00 2022 +0000

    Improved error message

commit a0d58c2
Author: Gunnar Atli Thoroddsen <[email protected]>
Date:   Mon Jan 17 13:00:03 2022 +0000

    Add shapely specific import error

commit 39299d5
Author: Gunnar Atli Thoroddsen <[email protected]>
Date:   Mon Jan 17 12:24:19 2022 +0000

    Add apt-get update

commit 1fc50b9
Author: Gunnar Atli Thoroddsen <[email protected]>
Date:   Mon Jan 17 12:19:39 2022 +0000

    Add README section

commit 708d9e3
Author: Gunnar Atli Thoroddsen <[email protected]>
Date:   Mon Jan 17 12:07:54 2022 +0000

    Adapt test suite to skip metrics tests if shapely not found

commit f50bebd
Author: Gunnar Atli Thoroddsen <[email protected]>
Date:   Mon Jan 17 11:08:06 2022 +0000

    Make Shapely installation optional with extras flag [metrics]
@gatli requested a review from jean-lucas April 5, 2022 12:13
@pfmark (Contributor) left a comment

Overall really great work! 🚀

I tested it locally and I have a few questions:

  • I added pretty arbitrary filters ([MetadataFilter("relative_distance", ">", 50000), FieldFilter("label", "=", "lalala")]), which should result in no items being passed to the test. Evaluating the test with a model that has the metadata and with a model that doesn't have the relevant metadata still produces results. Any thoughts on why this happens? Am I getting something wrong here?
  • Any chance we can add some feedback to the scenario test creation that would let the user know how many items were filtered per evaluation function? I think this would be super useful.

README.md Outdated
apt-get install libgeos-dev
```

`pip install scale-nucleus[metrics]`
Contributor

The default installation would still be pip install scale-nucleus?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup! But this should be pip install scale-nucleus[shapely]
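
For reference, the usual pattern for this kind of optional dependency is a guarded import that only raises when the metrics are actually used. The following is a sketch with assumed names and message text, not the exact code in this PR (the extra name follows the comment above):

# Sketch of an optional-dependency guard; module layout and wording are assumptions.
try:
    from shapely.geometry import Polygon  # noqa: F401
    _SHAPELY_AVAILABLE = True
except (ImportError, OSError):
    _SHAPELY_AVAILABLE = False

def _require_shapely():
    if not _SHAPELY_AVAILABLE:
        raise ModuleNotFoundError(
            "Polygon metrics require shapely. Install the optional extra with "
            "`pip install scale-nucleus[shapely]` (libgeos may also be needed, "
            "e.g. `apt-get install libgeos-dev`)."
        )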

Comment on lines +31 to +32
EQ = "="
EQEQ = "=="
Contributor

Can we just put == here? They are both mapped to == anyway. I think having both of them is confusing.

Contributor Author

One is Python style and the other is SQL style; I think it is fine practice to support both (PyArrow does the same). It has no effect on maintainability but might save somebody a headache, so I'll keep it in. 🙂
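
A tiny sketch of what "both mapped to ==" means in practice (illustrative only, with a stripped-down copy of the enum from the PR description):

import operator
from enum import Enum

class FilterOp(str, Enum):  # only the two values relevant here
    EQ = "="
    EQEQ = "=="

# Both spellings resolve to the same comparison function, so "=" and "=="
# behave identically; only the accepted syntax differs.
OP_TO_FUNC = {FilterOp.EQ: operator.eq, FilterOp.EQEQ: operator.eq}
assert OP_TO_FUNC[FilterOp("=")] is OP_TO_FUNC[FilterOp("==")]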

@gatli changed the title from "Add Metadata and Attribute filters" to "[Validate] Add Metadata and Attribute filters to Metrics" on Apr 7, 2022
@gatli merged commit e6328af into master Apr 7, 2022
@gatli deleted the gunnar-metadata-and-attribute-filters branch April 7, 2022 08:32