Tool enhancements #36

Merged
merged 6 commits on Feb 4, 2025
Changes from all commits
7 changes: 7 additions & 0 deletions .vscode/extensions.json
@@ -0,0 +1,7 @@
{
"recommendations": [
"charliermarsh.ruff",
"njpwerner.autodocstring",
"editorconfig.editorconfig"
]
}
69 changes: 69 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,69 @@
# Contributing to WorkflowAI

## Setup

### Prerequisites

- [Poetry](https://python-poetry.org/docs/#installation) for dependency management and publishing

### Getting started

```bash
# We recommend configuring the virtual envs in-project with poetry so that
# it can easily be picked up by IDEs

# poetry config virtualenvs.in-project true
poetry install --all-extras

# Install the pre-commit hooks
poetry run pre-commit install
# or `make install` to install the pre-commit hooks and the dependencies

# Check the code quality
# Run ruff
poetry run ruff check .
# Run pyright
poetry run pyright
# or `make lint` to run ruff and pyright

# Run the unit and integration tests
# They do not require any configuration
poetry run pytest --ignore=tests/e2e # make test

# Run the end-to-end tests
# They require the `WORKFLOWAI_TEST_API_URL` and `WORKFLOWAI_TEST_API_KEY` environment variables to be set
# If they are present in the `.env` file, they will be picked up automatically
poetry run pytest tests/e2e
```

#### Configuring VSCode

Suggested extensions are available in the [.vscode/extensions.json](.vscode/extensions.json) file.

### Dependencies

#### Ruff

[Ruff](https://github.com/astral-sh/ruff) is a very fast Python code linter and formatter.

```sh
ruff check . # check the entire project
ruff check src/workflowai/core # check a specific directory
ruff check . --fix # fix linting errors automatically in the entire project
```

#### Pyright

[Pyright](https://github.com/microsoft/pyright) is a static type checker for Python.

> We preferred it to `mypy` because it is faster and easier to configure.
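
For example, pyright flags type mismatches before the code ever runs (a contrived snippet for illustration, not from this repo):

```python
def add(a: int, b: int) -> int:
    return a + b


add("1", 2)  # flagged by pyright: "str" is not assignable to parameter "a" of type "int"
```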

#### Pydantic

[Pydantic](https://docs.pydantic.dev/) is a data validation library for Python.
It provides very convenient methods to serialize and deserialize data, introspect its structure, set validation
rules, etc.
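
A minimal illustration of that serialization and introspection (a generic sketch assuming Pydantic v2, unrelated to this repo's models):

```python
from pydantic import BaseModel, Field


class City(BaseModel):
    name: str = Field(description="Name of the city")
    population: int = 0


print(City.model_json_schema())  # introspect the structure as a JSON Schema
print(City(name="Tokyo").model_dump_json())  # serialize to JSON
```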

#### HTTPX

[HTTPX](https://www.python-httpx.org/) is a modern HTTP library for Python.
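
For illustration, a minimal async request with HTTPX (the URL is a placeholder):

```python
import asyncio

import httpx


async def main() -> None:
    async with httpx.AsyncClient() as client:
        response = await client.get("https://example.com")
        print(response.status_code)


asyncio.run(main())
```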
120 changes: 114 additions & 6 deletions README.md
@@ -121,6 +121,8 @@ WorkflowAI supports a long list of models. The source of truth for models we sup
You can set the model explicitly in the agent decorator:

```python
from workflowai import Model

@workflowai.agent(model=Model.GPT_4O_LATEST)
def say_hello(input: Input) -> Output:
...
@@ -151,16 +153,31 @@ def say_hello(input: Input) -> AsyncIterator[Run[Output]]:
...
```

### The Run object

Although having an agent return only the run output covers most use cases, some use cases require more
information about the run.

By changing the return type annotation of the agent function to `Run[Output]`, the generated function will return
the full run object.

```python
# Return the full run object, useful if you want to extract metadata like cost or duration
@workflowai.agent()
async def say_hello(input: Input) -> Run[Output]: ...


run = await say_hello(Input(name="John"))
print(run.output) # the output, as before
print(run.model) # the model used for the run
print(run.cost_usd) # the cost of the run in USD
print(run.duration_seconds) # the duration of the inference in seconds
```

### Streaming

You can configure the agent function to stream by changing the return type annotation to an `AsyncIterator`.

```python
# Stream the output, the output is filled as it is generated
@workflowai.agent()
def say_hello(input: Input) -> AsyncIterator[Output]:
@@ -172,6 +189,38 @@ def say_hello(input: Input) -> AsyncIterator[Run[Output]]:
...
```
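
Consuming the stream is then a plain `async for` loop (a short sketch reusing the `Input` model from the earlier examples):

```python
async for chunk in say_hello(Input(name="John")):
    # Each chunk is a partially filled Output; fields appear as they are generated
    print(chunk)
```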

### Replying to a run

Some use cases require a back-and-forth between the client and the LLM. For example:

- tools ([see below](#tools)) use the reply mechanism internally
- chatbots
- correcting the LLM output

In WorkflowAI, this is done by replying to a run. A reply can contain:

- a user response
- tool results

<!-- TODO: find a better example for reply -->

```python
# Returning the full run object is required to use the reply feature
@workflowai.agent()
async def say_hello(input: Input) -> Run[Output]:
...

run = await say_hello(Input(name="John"))
run = await run.reply(user_response="Now say hello to his brother James")
```

The output of a reply to a run has the same type as the original run, which makes it easy to iterate towards a
final output.

> To allow run iterations, it is very important to have outputs that are tolerant to missing fields, i.e., that
> have default values for most of their fields. Otherwise the agent will raise a `WorkflowAIError` on missing fields
> and the run chain will be broken.
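
For example, an output model that tolerates missing fields could look like this (an illustrative sketch, not part of this diff; field names are made up):

```python
from typing import Optional

from pydantic import BaseModel


class Output(BaseModel):
    # Defaults let a partially generated output validate between reply iterations
    greeting: str = ""
    signature: Optional[str] = None
```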

### Tools

Tools enhance an agent's capabilities by allowing it to call external functions.
@@ -222,9 +271,16 @@ def get_current_time(timezone: Annotated[str, "The timezone to get the current t
"""Return the current time in the given timezone in iso format"""
return datetime.now(ZoneInfo(timezone)).isoformat()

# Tools can also be async
async def fetch_webpage(url: str) -> str:
"""Fetch the content of a webpage"""
async with httpx.AsyncClient() as client:
response = await client.get(url)
return response.text

@agent(
id="answer-question",
tools=[get_current_time, fetch_webpage],
version=VersionProperties(model=Model.GPT_4O_LATEST),
)
async def answer_question(_: AnswerQuestionInput) -> Run[AnswerQuestionOutput]: ...
@@ -261,6 +317,29 @@ except WorkflowAIError as e:
print(e.message)
```

#### Recoverable errors

Sometimes the LLM outputs an object that is only partially valid. Common examples are:

- the model context window was exceeded during the generation
- the model decided that a tool call result was a failure

In these cases, an agent that returns only the output will always raise an `InvalidGenerationError`, which
subclasses `WorkflowAIError`.

However, an agent that returns a full run object will try to recover from the error by using the partial output.

```python

run = await agent(input=Input(name="John"))

# The run will have an error
assert run.error is not None

# The run will have a partial output
assert run.output is not None
```

### Defining input and output types

There are some important subtleties when defining input and output types.
@@ -368,3 +447,32 @@ async for run in say_hello(Input(name="John")):
print(run.output.greeting1) # will be empty if the model has not generated it yet

```

#### Field properties

Pydantic allows a variety of other validation criteria for fields: minimum, maximum, pattern, etc.
These additional criteria are included in the JSON Schema that is sent to WorkflowAI, and are sent to the model.

```python
class Input(BaseModel):
name: str = Field(min_length=3, max_length=10)
age: int = Field(ge=18, le=100)
email: str = Field(pattern=r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$")
```

These arguments can be used to steer the model in the right direction. The caveat is that validation that is
too strict can lead to invalid generations. In case of an invalid generation:

- WorkflowAI retries the inference once by providing the model with the invalid output and the validation error
- if the model still fails to generate a valid output, the run fails with an `InvalidGenerationError`;
  the partial output is available in the `partial_output` attribute of the `InvalidGenerationError`

```python
@agent()
async def my_agent(_: Input) -> Output: ...
```
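
For illustration, a hedged sketch of recovering the partial output when this happens (import paths omitted; `InvalidGenerationError` and its `partial_output` attribute are described above):

```python
try:
    output = await my_agent(Input(name="John", age=30, email="john@example.com"))
except InvalidGenerationError as e:
    # Fall back to whatever the model did manage to generate
    output = e.partial_output
```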

## Contributing

See the [CONTRIBUTING.md](./CONTRIBUTING.md) file for more details.
6 changes: 3 additions & 3 deletions poetry.lock


1 change: 1 addition & 0 deletions pyproject.toml
@@ -64,6 +64,7 @@ unfixable = []
# in bin we use rich.print
"bin/*" = ["T201"]
"*_test.py" = ["S101"]
"conftest.py" = ["S101"]

[tool.pyright]
pythonVersion = "3.9"
4 changes: 2 additions & 2 deletions tests/e2e/tools_test.py
@@ -6,7 +6,7 @@

from workflowai import Run, agent
from workflowai.core.domain.model import Model
from workflowai.core.domain.tool import ToolDefinition
from workflowai.core.domain.tool_call import ToolCallResult
from workflowai.core.domain.version_properties import VersionProperties

@@ -20,7 +20,7 @@ class AnswerQuestionOutput(BaseModel):


async def test_manual_tool():
get_current_time_tool = ToolDefinition(
name="get_current_time",
description="Get the current time",
input_schema={},
90 changes: 90 additions & 0 deletions tests/integration/conftest.py
@@ -0,0 +1,90 @@
import json
from typing import Any, Optional
from unittest.mock import patch

import pytest
from pydantic import BaseModel
from pytest_httpx import HTTPXMock, IteratorStream

from workflowai.core.client.client import WorkflowAI


@pytest.fixture(scope="module", autouse=True)
def init_client():
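    """Point the shared client at a test API key and endpoint for the duration of the test module."""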
with patch("workflowai.shared_client", new=WorkflowAI(api_key="test", endpoint="https://run.workflowai.dev")):
yield


class CityToCapitalTaskInput(BaseModel):
city: str


class CityToCapitalTaskOutput(BaseModel):
capital: str


class IntTestClient:
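    """Helper around HTTPXMock to mock WorkflowAI endpoints and assert on outgoing requests."""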
REGISTER_URL = "https://api.workflowai.dev/v1/_/agents"

def __init__(self, httpx_mock: HTTPXMock):
self.httpx_mock = httpx_mock

def mock_register(self, schema_id: int = 1, task_id: str = "city-to-capital", variant_id: str = "1"):
self.httpx_mock.add_response(
method="POST",
url=self.REGISTER_URL,
json={"schema_id": schema_id, "variant_id": variant_id, "id": task_id},
)

def mock_response(
self,
task_id: str = "city-to-capital",
capital: str = "Tokyo",
json: Optional[dict[str, Any]] = None,
url: Optional[str] = None,
status_code: int = 200,
):
self.httpx_mock.add_response(
method="POST",
url=url or f"https://run.workflowai.dev/v1/_/agents/{task_id}/schemas/1/run",
json=json or {"id": "123", "task_output": {"capital": capital}},
status_code=status_code,
)

def mock_stream(self, task_id: str = "city-to-capital"):
self.httpx_mock.add_response(
url=f"https://run.workflowai.dev/v1/_/agents/{task_id}/schemas/1/run",
stream=IteratorStream(
[
b'data: {"id":"1","task_output":{"capital":""}}\n\n',
b'data: {"id":"1","task_output":{"capital":"Tok"}}\n\ndata: {"id":"1","task_output":{"capital":"Tokyo"}}\n\n', # noqa: E501
b'data: {"id":"1","task_output":{"capital":"Tokyo"},"cost_usd":0.01,"duration_seconds":10.1}\n\n',
],
),
)

def check_request(
self,
version: Any = "production",
task_id: str = "city-to-capital",
task_input: Optional[dict[str, Any]] = None,
**matchers: Any,
):
request = self.httpx_mock.get_request(**matchers)
assert request is not None
assert request.url == f"https://run.workflowai.dev/v1/_/agents/{task_id}/schemas/1/run"
body = json.loads(request.content)
assert body == {
"task_input": task_input or {"city": "Hello"},
"version": version,
"stream": False,
}
assert request.headers["Authorization"] == "Bearer test"
assert request.headers["Content-Type"] == "application/json"
assert request.headers["x-workflowai-source"] == "sdk"
assert request.headers["x-workflowai-language"] == "python"


@pytest.fixture
def test_client(httpx_mock: HTTPXMock) -> IntTestClient:
return IntTestClient(httpx_mock)