[Validate] Add Metadata and Attribute filters to Metrics #269

Merged: 21 commits, merged on Apr 7, 2022
7 changes: 4 additions & 3 deletions .circleci/config.yml
@@ -17,10 +17,11 @@ jobs:
- run:
name: Install Environment Dependencies
command: | # install dependencies
apt-get -y install curl
pip install --upgrade pip
apt-get update
apt-get -y install curl libgeos-dev
pip install --upgrade pip
pip install poetry
poetry install
poetry install -E shapely

- run:
name: Black Formatting Check # Only validation, without re-formatting
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,15 @@ All notable changes to the [Nucleus Python Client](https://github.com/scaleapi/n
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.9.0](https://github.com/scaleapi/nucleus-python-client/releases/tag/v0.9.0) - 2022-04-07

### Added

- Validate metrics support metadata and field filtering on input annotations and predictions
- 3D/Cuboid metrics: Recall, Precision, 3D IOU and bird's eye 2D IOU
- Shapely can be used for metric development if the optional scale-nucleus[shapely] is installed
- Full support for passing parameters to evaluation configurations

## [0.8.4](https://github.com/scaleapi/nucleus-python-client/releases/tag/v0.8.4) - 2022-04-06
- Changing `camera_params` of dataset items can now be done through the dataset method `update_items_metadata`

21 changes: 21 additions & 0 deletions README.md
@@ -179,3 +179,24 @@ cd docs
sphinx-autobuild . ./_build/html --watch ../nucleus
```
`sphinx-autobuild` will spin up a server on localhost (port 8000 by default) that will watch for and automatically rebuild a version of the API reference based on your local docstring changes.


## Custom Metrics using Shapely in scale-validate

Certain metrics use `shapely`, which is installed as an optional dependency:
```bash
pip install scale-nucleus[shapely]
```

Note that you might need to install a local GEOS package since Shapely doesn't provide binaries bundled with GEOS for every platform.

```bash
# macOS
brew install geos
# Ubuntu/Debian flavors
apt-get install libgeos-dev
```

To develop locally, use:

```bash
poetry install -E shapely
```
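As background on why metrics need a geometry backend at all: polygon metrics reduce to intersection-over-union computations, which `shapely`/GEOS handle for arbitrary polygons. A minimal sketch of the axis-aligned special case, with no `shapely` dependency (illustrative helper, not part of the client):

```python
def box_iou(a, b):
    """IOU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Unit squares overlapping in a 0.5 x 1 strip: inter = 0.5, union = 1.5
print(box_iou((0, 0, 1, 1), (0.5, 0, 1.5, 1)))  # 0.5 / 1.5 ≈ 0.333
```

For rotated cuboid footprints or general polygons this clamping trick no longer works, which is where GEOS comes in.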
3 changes: 3 additions & 0 deletions nucleus/annotation.py
@@ -557,6 +557,7 @@ class SegmentationAnnotation(Annotation):
annotations: List[Segment]
reference_id: str
annotation_id: Optional[str] = None
# metadata: Optional[dict] = None # TODO(sc: 422637)

def __post_init__(self):
if not self.mask_url:
@@ -574,6 +575,7 @@ def from_json(cls, payload: dict):
],
reference_id=payload[REFERENCE_ID_KEY],
annotation_id=payload.get(ANNOTATION_ID_KEY, None),
# metadata=payload.get(METADATA_KEY, None), # TODO(sc: 422637)
)

def to_payload(self) -> dict:
@@ -582,6 +584,7 @@ def to_payload(self) -> dict:
MASK_URL_KEY: self.mask_url,
ANNOTATIONS_KEY: [ann.to_payload() for ann in self.annotations],
ANNOTATION_ID_KEY: self.annotation_id,
# METADATA_KEY: self.metadata, # TODO(sc: 422637)
}

payload[REFERENCE_ID_KEY] = self.reference_id
7 changes: 7 additions & 0 deletions nucleus/metrics/__init__.py
@@ -1,5 +1,12 @@
from .base import Metric, ScalarResult
from .categorization_metrics import CategorizationF1
from .cuboid_metrics import CuboidIOU, CuboidPrecision, CuboidRecall
from .filtering import (
FieldFilter,
ListOfOrAndFilters,
MetadataFilter,
apply_filters,
)
from .polygon_metrics import (
PolygonAveragePrecision,
PolygonIOU,
104 changes: 102 additions & 2 deletions nucleus/metrics/base.py
@@ -1,9 +1,14 @@
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, List
from typing import Iterable, List, Optional, Union

from nucleus.annotation import AnnotationList
from nucleus.metrics.filtering import (
ListOfAndFilters,
ListOfOrAndFilters,
apply_filters,
)
from nucleus.prediction import PredictionList


@@ -86,12 +91,107 @@ def __call__(
metric(annotations, predictions)
"""

def __init__(
self,
annotation_filters: Optional[
Union[ListOfOrAndFilters, ListOfAndFilters]
] = None,
prediction_filters: Optional[
Union[ListOfOrAndFilters, ListOfAndFilters]
] = None,
):
"""
Args:
annotation_filters: Filter predicates. Allowed formats are:
ListOfAndFilters, where each filter forms a chain of AND predicates,
or
ListOfOrAndFilters, where filters are expressed in disjunctive normal form (DNF), e.g.
[[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"]), ...]].
DNF allows arbitrary boolean combinations of single-field predicates. The innermost structures
each describe a single-field predicate. The list of inner predicates is interpreted as a conjunction
(AND), forming a more selective multi-field predicate.
Finally, the outermost list combines these conjunctions as a disjunction (OR).
prediction_filters: Filter predicates. Allowed formats are:
ListOfAndFilters, where each filter forms a chain of AND predicates,
or
ListOfOrAndFilters, where filters are expressed in disjunctive normal form (DNF), e.g.
[[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"]), ...]].
DNF allows arbitrary boolean combinations of single-field predicates. The innermost structures
each describe a single-field predicate. The list of inner predicates is interpreted as a conjunction
(AND), forming a more selective multi-field predicate.
Finally, the outermost list combines these conjunctions as a disjunction (OR).
"""
self.annotation_filters = annotation_filters
self.prediction_filters = prediction_filters
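The DNF filter semantics described in the docstring above can be illustrated standalone. This is a simplified sketch, assuming dict-shaped items and tuple predicates; the real implementation is `apply_filters` in `nucleus.metrics.filtering`, operating on `MetadataFilter`/`FieldFilter` objects:

```python
import operator

# Each inner list is ANDed; the outer list is ORed (disjunctive normal form).
OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt,
       "<": operator.lt, "in": lambda a, b: a in b}

def matches(item, dnf_filters):
    # An item matches if any conjunction has all of its predicates satisfied.
    return any(
        all(OPS[op](getter(item), value) for getter, op, value in conjunction)
        for conjunction in dnf_filters
    )

def apply_filters_sketch(items, dnf_filters):
    return [item for item in items if matches(item, dnf_filters)]

anns = [{"label": "cat", "metadata": {"short_haired": True}},
        {"label": "car", "metadata": {"short_haired": False}}]
dnf = [[(lambda a: a["metadata"]["short_haired"], "==", True),
        (lambda a: a["label"], "in", ["cat", "dog"])]]
print(apply_filters_sketch(anns, dnf))  # keeps only the short-haired cat
```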

@abstractmethod
def __call__(
def call_metric(
self, annotations: AnnotationList, predictions: PredictionList
) -> MetricResult:
"""A metric must override this method and return a metric result, given annotations and predictions."""

def __call__(
self, annotations: AnnotationList, predictions: PredictionList
) -> MetricResult:
annotations = self._filter_annotations(annotations)
predictions = self._filter_predictions(predictions)
return self.call_metric(annotations, predictions)

def _filter_annotations(self, annotations: AnnotationList):
if (
self.annotation_filters is None
or len(self.annotation_filters) == 0
):
return annotations
annotations.box_annotations = apply_filters(
annotations.box_annotations, self.annotation_filters
)
annotations.line_annotations = apply_filters(
annotations.line_annotations, self.annotation_filters
)
annotations.polygon_annotations = apply_filters(
annotations.polygon_annotations, self.annotation_filters
)
annotations.cuboid_annotations = apply_filters(
annotations.cuboid_annotations, self.annotation_filters
)
annotations.category_annotations = apply_filters(
annotations.category_annotations, self.annotation_filters
)
annotations.multi_category_annotations = apply_filters(
annotations.multi_category_annotations, self.annotation_filters
)
annotations.segmentation_annotations = apply_filters(
annotations.segmentation_annotations, self.annotation_filters
)
return annotations

def _filter_predictions(self, predictions: PredictionList):
if (
self.prediction_filters is None
or len(self.prediction_filters) == 0
):
return predictions
predictions.box_predictions = apply_filters(
predictions.box_predictions, self.prediction_filters
)
predictions.line_predictions = apply_filters(
predictions.line_predictions, self.prediction_filters
)
predictions.polygon_predictions = apply_filters(
predictions.polygon_predictions, self.prediction_filters
)
predictions.cuboid_predictions = apply_filters(
predictions.cuboid_predictions, self.prediction_filters
)
predictions.category_predictions = apply_filters(
predictions.category_predictions, self.prediction_filters
)
predictions.segmentation_predictions = apply_filters(
predictions.segmentation_predictions, self.prediction_filters
)
return predictions
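The split between `__call__` and `call_metric` above is the template-method pattern: filtering happens once in the base class, and subclasses override only `call_metric`. A minimal standalone sketch (illustrative class names, not the client's API):

```python
from abc import ABC, abstractmethod

class FilteredMetric(ABC):
    """Base class: filters inputs once, then delegates to call_metric."""

    def __init__(self, keep=None):
        self.keep = keep  # simplified stand-in for annotation/prediction filters

    def __call__(self, annotations, predictions):
        if self.keep is not None:
            annotations = [a for a in annotations if self.keep(a)]
            predictions = [p for p in predictions if self.keep(p)]
        return self.call_metric(annotations, predictions)

    @abstractmethod
    def call_metric(self, annotations, predictions):
        ...

class CountMetric(FilteredMetric):
    def call_metric(self, annotations, predictions):
        return len(annotations), len(predictions)

metric = CountMetric(keep=lambda item: item["label"] == "cat")
print(metric([{"label": "cat"}, {"label": "car"}], [{"label": "cat"}]))  # (1, 1)
```

This design means individual metrics never have to re-implement filtering, and existing subclasses only need to rename their old `__call__` to `call_metric`.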

@abstractmethod
def aggregate_score(self, results: List[MetricResult]) -> ScalarResult:
"""A metric must define how to aggregate results from single items to a single ScalarResult.
62 changes: 58 additions & 4 deletions nucleus/metrics/categorization_metrics.py
@@ -1,11 +1,12 @@
from abc import abstractmethod
from dataclasses import dataclass
from typing import List, Set, Tuple, Union
from typing import List, Optional, Set, Tuple, Union

from sklearn.metrics import f1_score

from nucleus.annotation import AnnotationList, CategoryAnnotation
from nucleus.metrics.base import Metric, MetricResult, ScalarResult
from nucleus.metrics.filtering import ListOfAndFilters, ListOfOrAndFilters
from nucleus.metrics.filters import confidence_filter
from nucleus.prediction import CategoryPrediction, PredictionList

@@ -56,12 +57,37 @@ class CategorizationMetric(Metric):
def __init__(
self,
confidence_threshold: float = 0.0,
annotation_filters: Optional[
Union[ListOfOrAndFilters, ListOfAndFilters]
] = None,
prediction_filters: Optional[
Union[ListOfOrAndFilters, ListOfAndFilters]
] = None,
):
"""Initializes CategorizationMetric abstract object.

Args:
confidence_threshold: minimum confidence threshold for predictions to be taken into account for evaluation. Must be in [0, 1]. Default 0.0
annotation_filters: Filter predicates. Allowed formats are:
ListOfAndFilters, where each filter forms a chain of AND predicates,
or
ListOfOrAndFilters, where filters are expressed in disjunctive normal form (DNF), e.g.
[[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"]), ...]].
DNF allows arbitrary boolean combinations of single-field predicates. The innermost structures
each describe a single-field predicate. The list of inner predicates is interpreted as a conjunction
(AND), forming a more selective multi-field predicate.
Finally, the outermost list combines these conjunctions as a disjunction (OR).
prediction_filters: Filter predicates. Allowed formats are:
ListOfAndFilters, where each filter forms a chain of AND predicates,
or
ListOfOrAndFilters, where filters are expressed in disjunctive normal form (DNF), e.g.
[[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"]), ...]].
DNF allows arbitrary boolean combinations of single-field predicates. The innermost structures
each describe a single-field predicate. The list of inner predicates is interpreted as a conjunction
(AND), forming a more selective multi-field predicate.
Finally, the outermost list combines these conjunctions as a disjunction (OR).
"""
super().__init__(annotation_filters, prediction_filters)
assert 0 <= confidence_threshold <= 1
self.confidence_threshold = confidence_threshold

@@ -83,7 +109,7 @@ def eval(
def aggregate_score(self, results: List[CategorizationResult]) -> ScalarResult: # type: ignore[override]
pass

def __call__(
def call_metric(
self, annotations: AnnotationList, predictions: PredictionList
) -> CategorizationResult:
if self.confidence_threshold > 0:
@@ -139,7 +165,15 @@ class CategorizationF1(CategorizationMetric):
"""Evaluation method that matches categories and returns a CategorizationF1Result that aggregates to the F1 score"""

def __init__(
self, confidence_threshold: float = 0.0, f1_method: str = "macro"
self,
confidence_threshold: float = 0.0,
f1_method: str = "macro",
annotation_filters: Optional[
Union[ListOfOrAndFilters, ListOfAndFilters]
] = None,
prediction_filters: Optional[
Union[ListOfOrAndFilters, ListOfAndFilters]
] = None,
):
"""
Args:
@@ -169,8 +203,28 @@ def __init__(
Calculate metrics for each instance, and find their average (only
meaningful for multilabel classification where this differs from
:func:`accuracy_score`).
annotation_filters: Filter predicates. Allowed formats are:
ListOfAndFilters, where each filter forms a chain of AND predicates,
or
ListOfOrAndFilters, where filters are expressed in disjunctive normal form (DNF), e.g.
[[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"]), ...]].
DNF allows arbitrary boolean combinations of single-field predicates. The innermost structures
each describe a single-field predicate. The list of inner predicates is interpreted as a conjunction
(AND), forming a more selective multi-field predicate.
Finally, the outermost list combines these conjunctions as a disjunction (OR).
prediction_filters: Filter predicates. Allowed formats are:
ListOfAndFilters, where each filter forms a chain of AND predicates,
or
ListOfOrAndFilters, where filters are expressed in disjunctive normal form (DNF), e.g.
[[MetadataFilter("short_haired", "==", True), FieldFilter("label", "in", ["cat", "dog"]), ...]].
DNF allows arbitrary boolean combinations of single-field predicates. The innermost structures
each describe a single-field predicate. The list of inner predicates is interpreted as a conjunction
(AND), forming a more selective multi-field predicate.
Finally, the outermost list combines these conjunctions as a disjunction (OR).
"""
super().__init__(confidence_threshold)
super().__init__(
confidence_threshold, annotation_filters, prediction_filters
)
assert (
f1_method in F1_METHODS
), f"Invalid f1_method {f1_method}, expected one of {F1_METHODS}"
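For reference, the default `f1_method="macro"` averages per-class F1 scores with equal weight; the client delegates the actual computation to `sklearn.metrics.f1_score`. A pure-Python sketch of that aggregation:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    scores = []
    for label in set(y_true) | set(y_pred):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / len(scores)

print(macro_f1(["cat", "dog", "cat"], ["cat", "dog", "dog"]))  # ≈ 0.667
```

Because each class contributes equally regardless of support, macro averaging is sensitive to rare classes, which is why the other `f1_method` options exist.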