Better guidance on how to use a search expression for SageMaker Experiments #4274

Open
@lorenzwalthert

Description

What did you find confusing? Please describe.

I want to filter SageMaker Experiments runs by the name of the Run Group (formerly Trial). I found the docs with the following example:

import boto3

search_params = {
    "MaxResults": 10,
    "Resource": "TrainingJob",
    "SearchExpression": {
        "Filters": [{
            "Name": "Tags.Project",
            "Operator": "Equals",
            "Value": "Project_Binary_Classifier"
        }]
    },
    "SortBy": "Metrics.train:binary_classification_accuracy",
    "SortOrder": "Descending"
}

smclient = boto3.client(service_name='sagemaker')
results = smclient.search(**search_params)

The page linked to the Search docs, from which I guessed that I need to use ExperimentTrial as the Resource and that the Names allowed in a Filter are described in the SearchRecord docs. I came up with this code, which works:

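import boto3

# Pipeline execution whose run group I am looking for (not used in the query itself).
pipeline_execution_arn = "arn:aws:sagemaker:eu-central-1:982361546614:pipeline/staging-inference/execution/l8qeaasd6csp"

search_params = {
    "MaxResults": 10,
    # Search run groups (trials) rather than training jobs.
    "Resource": "ExperimentTrial",
    "SearchExpression": {
        "Filters": [{
            "Name": "TrialName",
            "Operator": "Equals",
            # Same value as the last segment of the execution ARN above.
            "Value": "l8qeaasd6csp"
        }]
    },
}

smclient = boto3.client(service_name='sagemaker')
results = smclient.search(**search_params)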
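In the example above the run group name is the same string as the last segment of the pipeline execution ARN. A minimal sketch of building the same search parameters from the ARN follows. It assumes this naming relationship holds for the pipeline in question, and both helper names (`run_group_name_from_execution_arn`, `trial_name_search_params`) are hypothetical, not part of boto3 or the SageMaker SDK.

```python
def run_group_name_from_execution_arn(arn: str) -> str:
    """Return the last path segment of a pipeline execution ARN.

    Assumption: the run group (trial) is named after the execution ID,
    as the values in the example above suggest.
    """
    return arn.rsplit("/", 1)[-1]


def trial_name_search_params(trial_name: str, max_results: int = 10) -> dict:
    """Build keyword arguments for the SageMaker Search API call
    that match a run group (trial) by its exact name."""
    return {
        "MaxResults": max_results,
        "Resource": "ExperimentTrial",
        "SearchExpression": {
            "Filters": [{
                "Name": "TrialName",
                "Operator": "Equals",
                "Value": trial_name,
            }]
        },
    }


arn = "arn:aws:sagemaker:eu-central-1:982361546614:pipeline/staging-inference/execution/l8qeaasd6csp"
params = trial_name_search_params(run_group_name_from_execution_arn(arn))
print(params["SearchExpression"]["Filters"][0]["Value"])  # l8qeaasd6csp
```

The resulting dictionary can be passed to the client as before, i.e. `smclient.search(**params)`.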

However, I failed to reproduce the same with the SageMaker SDK. After numerous attempts and reading the source code, I found how to make it work:

import sagemaker.analytics

analytics = sagemaker.analytics.ExperimentAnalytics(
    # experiment_name="staging-inference",
    search_expression={
        "Filters": [{"Name": "Parents.TrialName", "Operator": "Contains", "Value": "l8qeaasd6csp"}]
    },
)
df = analytics.dataframe()

In particular, various alternatives to Parents.TrialName suggested by the API docs, such as Trial.TrialName, TrialName, RunGroupName, RunGroup.Name and similar, did not work. Nor is there a comprehensive example in the SDK guide or the sagemaker examples repo.
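A likely explanation, based on reading the SDK source and to be treated as an assumption rather than documented behavior: ExperimentAnalytics appears to search the ExperimentTrialComponent resource, not ExperimentTrial, so the run group name is only reachable through the component's Parents list. The sketch below builds the working search_expression and what I believe is the roughly equivalent raw Search request. All three function names are hypothetical helpers.

```python
def parents_trial_name_expression(trial_name: str, operator: str = "Contains") -> dict:
    """Build a search_expression that matches run components whose
    parent run group (trial) name matches trial_name."""
    return {
        "Filters": [{
            "Name": "Parents.TrialName",
            "Operator": operator,
            "Value": trial_name,
        }]
    }


def trial_component_search_params(trial_name: str, max_results: int = 10) -> dict:
    """Raw Search API arguments assumed to correspond to the SDK call.

    Assumption: the SDK queries the ExperimentTrialComponent resource.
    """
    return {
        "MaxResults": max_results,
        "Resource": "ExperimentTrialComponent",
        "SearchExpression": parents_trial_name_expression(trial_name),
    }


def load_run_group_dataframe(trial_name: str):
    """Fetch the runs of one run group as a DataFrame (needs AWS credentials)."""
    # Imported lazily so the helpers above work without the SDK installed.
    import sagemaker.analytics

    analytics = sagemaker.analytics.ExperimentAnalytics(
        search_expression=parents_trial_name_expression(trial_name),
    )
    return analytics.dataframe()


expression = parents_trial_name_expression("l8qeaasd6csp")
print(expression["Filters"][0]["Name"])  # Parents.TrialName
```

Calling `load_run_group_dataframe("l8qeaasd6csp")` should then return the same DataFrame as the snippet above, provided the assumption about the searched resource is correct.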

Describe how documentation can be improved

Add more examples of how search_expression works in the SDK. All the docs currently give is the following, with no mention of how to deal with nested names such as TrialName.
[Screenshot 2023-11-27 at 16:19:07]

Additional context

It took me an hour to figure this out. I don't think filtering by run group name is a very exotic thing to do.
