
Add date/time filtering capability to Vertex AI Experiment.get_data_frame API to prevent 503 errors with large runs #5298

@suk1yak1

Description


Problem

When calling Experiment.get_data_frame() on an experiment that contains many runs, the API consistently returns 503 "Service Unavailable" errors. This appears to happen because the method fetches all experiment runs at once, which can overload the service when the experiment is large.

Current behavior

experiment = aiplatform.Experiment(experiment_name="specific-experiment-name")
df_experiment = experiment.get_data_frame(include_time_series=False)

This call fails with 503 errors when the experiment contains too many runs.
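Until filtering exists, one possible workaround is to fetch runs one at a time and build the frame incrementally, so that no single backend call has to return every run. A minimal sketch, where the `runs` iterable is an assumption; in practice it would be fed by `aiplatform.ExperimentRun.list()` plus each run's `get_params()`/`get_metrics()`:

```python
import pandas as pd

def assemble_frame(runs):
    """Build the experiment DataFrame one run at a time instead of in a
    single bulk call. `runs` yields (run_name, params, metrics) tuples;
    in the real SDK these would come from aiplatform.ExperimentRun.list()
    and each run's get_params()/get_metrics().
    """
    rows = []
    for name, params, metrics in runs:
        row = {"run_name": name}
        # Mirror get_data_frame's flat column layout: param.* and metric.*
        row.update({f"param.{k}": v for k, v in params.items()})
        row.update({f"metric.{k}": v for k, v in metrics.items()})
        rows.append(row)
    return pd.DataFrame(rows)
```

This trades one large request for many small ones, which avoids the single oversized response but is slower; server-side filtering would still be the better fix.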

Proposed solution

Add date/time filtering parameters to the get_data_frame() method to limit the number of runs being processed:

df_experiment = experiment.get_data_frame(
    include_time_series=False,
    start_time="2023-01-01T00:00:00Z",
    end_time="2023-01-31T23:59:59Z"
)

This would allow users to retrieve manageable chunks of experiment data by time period, preventing service overload and improving reliability when working with large experiments.
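The filtering the example implies is an inclusive window on each run's timestamp (presumably its create time). A small sketch of that check, with illustrative names; either bound may be omitted:

```python
from datetime import datetime

def in_window(ts, start_time=None, end_time=None):
    """Inclusive check that ISO-8601 timestamp `ts` falls inside
    [start_time, end_time]; a None bound means unbounded on that side."""
    def parse(s):
        # fromisoformat on Pythons before 3.11 rejects a trailing "Z"
        return datetime.fromisoformat(s.replace("Z", "+00:00"))
    t = parse(ts)
    if start_time is not None and t < parse(start_time):
        return False
    if end_time is not None and t > parse(end_time):
        return False
    return True
```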

Additional benefits

  • Improved performance for large experiments
  • Better user experience by avoiding timeout errors
  • More control over data retrieval for analysis purposes
  • Reduced load on backend services
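The backend-load reduction above depends on the window being applied server-side rather than after fetching. One hypothetical implementation detail: translate the two parameters into a list filter on `create_time` before listing runs. The filter syntax here is an assumption to verify against the Vertex ML Metadata documentation:

```python
def build_time_filter(start_time=None, end_time=None):
    """Translate the proposed start_time/end_time parameters into a
    list-request filter string on run create time (assumed syntax;
    verify against the Vertex ML Metadata filter docs)."""
    clauses = []
    if start_time:
        clauses.append(f'create_time >= "{start_time}"')
    if end_time:
        clauses.append(f'create_time <= "{end_time}"')
    return " AND ".join(clauses)
```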

Metadata

Labels

api: vertex-ai (Issues related to the googleapis/python-aiplatform API)
