Tool enhancements #36

Merged
merged 6 commits on Feb 4, 2025
Changes from all commits
7 changes: 7 additions & 0 deletions .vscode/extensions.json
@@ -0,0 +1,7 @@
{
"recommendations": [
"charliermarsh.ruff",
"njpwerner.autodocstring",
"editorconfig.editorconfig"
]
}
69 changes: 69 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,69 @@
# Contributing to WorkflowAI

## Setup

### Prerequisites

- [Poetry](https://python-poetry.org/docs/#installation) for dependency management and publishing

### Getting started

```bash
# We recommend configuring the virtual envs in-project with poetry so that
# it can easily be picked up by IDEs

# poetry config virtualenvs.in-project true
poetry install --all-extras

# Install the pre-commit hooks
poetry run pre-commit install
# or `make install` to install the pre-commit hooks and the dependencies

# Check the code quality
# Run ruff
poetry run ruff check .
# Run pyright
poetry run pyright
# or `make lint` to run ruff and pyright

# Run the unit and integration tests
# They do not require any configuration
poetry run pytest --ignore=tests/e2e # make test

# Run the end-to-end tests
# They require the `WORKFLOWAI_TEST_API_URL` and `WORKFLOWAI_TEST_API_KEY` environment variables to be set
# If they are present in the `.env` file, they will be picked up automatically
poetry run pytest tests/e2e
```

#### Configuring VSCode

Suggested extensions are available in the [.vscode/extensions.json](.vscode/extensions.json) file.

### Dependencies

#### Ruff

[Ruff](https://github.com/astral-sh/ruff) is a very fast Python code linter and formatter.

```sh
ruff check . # check the entire project
ruff check src/workflowai/core # check a specific directory
ruff check . --fix # fix linting errors automatically in the entire project
```

#### Pyright

[Pyright](https://github.com/microsoft/pyright) is a static type checker for Python.

> We preferred it to `mypy` because it is faster and easier to configure.
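
For example, pyright flags type mismatches before the code ever runs (a contrived snippet for illustration, not from this repo):

```python
def add(a: int, b: int) -> int:
    return a + b


add("1", 2)  # flagged by pyright: "str" is not assignable to parameter "a" of type "int"
```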

#### Pydantic

[Pydantic](https://docs.pydantic.dev/) is a data validation library for Python.
It provides very convenient methods to serialize and deserialize data, introspect its structure, set validation
rules, etc.
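
A minimal illustration of that serialization and introspection (a generic sketch assuming Pydantic v2, unrelated to this repo's models):

```python
from pydantic import BaseModel, Field


class City(BaseModel):
    name: str = Field(description="Name of the city")
    population: int = 0


print(City.model_json_schema())  # introspect the structure as a JSON Schema
print(City(name="Tokyo").model_dump_json())  # serialize to JSON
```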

#### HTTPX

[HTTPX](https://www.python-httpx.org/) is a modern HTTP library for Python.
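
For illustration, a minimal async request with HTTPX (the URL is a placeholder):

```python
import asyncio

import httpx


async def main() -> None:
    async with httpx.AsyncClient() as client:
        response = await client.get("https://example.com")
        print(response.status_code)


asyncio.run(main())
```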
120 changes: 114 additions & 6 deletions README.md
@@ -121,6 +121,8 @@ WorkflowAI supports a long list of models. The source of truth for models we sup
You can set the model explicitly in the agent decorator:

```python
from workflowai import Model

@workflowai.agent(model=Model.GPT_4O_LATEST)
def say_hello(input: Input) -> Output:
...
@@ -151,16 +153,31 @@ def say_hello(input: Input) -> AsyncIterator[Run[Output]]:
...
```

### The Run object

Although having an agent return only the run output covers most use cases, some use cases require more
information about the run.

By changing the return type annotation of the agent function to `Run[Output]`, the generated function will return
the full run object.

```python
# Return the full run object, useful if you want to extract metadata like cost or duration
@workflowai.agent()
async def say_hello(input: Input) -> Run[Output]: ...


run = await say_hello(Input(name="John"))
print(run.output) # the output, as before
print(run.model) # the model used for the run
print(run.cost_usd) # the cost of the run in USD
print(run.duration_seconds) # the duration of the inference in seconds
```

### Streaming

You can configure the agent function to stream by changing the return type annotation to an `AsyncIterator`.

```python
# Stream the output, the output is filled as it is generated
@workflowai.agent()
def say_hello(input: Input) -> AsyncIterator[Output]:
@@ -172,6 +189,38 @@ def say_hello(input: Input) -> AsyncIterator[Run[Output]]:
...
```
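
Consuming the stream is then a plain `async for` loop (a short sketch reusing the `Input` model from the earlier examples):

```python
async for chunk in say_hello(Input(name="John")):
    # Each chunk is a partially filled Output; fields appear as they are generated
    print(chunk)
```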

### Replying to a run

Some use cases require a back-and-forth between the client and the LLM. For example:

- tools ([see below](#tools)) use the reply mechanism internally
- chatbots
- correcting the LLM output

In WorkflowAI, this is done by replying to a run. A reply can contain:

- a user response
- tool results

<!-- TODO: find a better example for reply -->

```python
# Returning the full run object is required to use the reply feature
@workflowai.agent()
async def say_hello(input: Input) -> Run[Output]:
...

run = await say_hello(Input(name="John"))
run = await run.reply(user_response="Now say hello to his brother James")
```

The output of a reply to a run has the same type as the original run, which makes it easy to iterate towards a
final output.

> To allow run iterations, it is very important to have outputs that are tolerant to missing fields, i.e., that
> have default values for most of their fields. Otherwise the agent will raise a `WorkflowAIError` on missing fields
> and the run chain will be broken.
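
For example, an output model that tolerates missing fields could look like this (an illustrative sketch, not part of this diff; field names are made up):

```python
from typing import Optional

from pydantic import BaseModel


class Output(BaseModel):
    # Defaults let a partially generated output validate between reply iterations
    greeting: str = ""
    signature: Optional[str] = None
```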

### Tools

Tools enhance an agent's capabilities by allowing it to call external functions.
@@ -222,9 +271,16 @@ def get_current_time(timezone: Annotated[str, "The timezone to get the current t
"""Return the current time in the given timezone in iso format"""
return datetime.now(ZoneInfo(timezone)).isoformat()

# Tools can also be async
async def fetch_webpage(url: str) -> str:
"""Fetch the content of a webpage"""
async with httpx.AsyncClient() as client:
response = await client.get(url)
return response.text

@agent(
id="answer-question",
tools=[get_current_time, fetch_webpage],
version=VersionProperties(model=Model.GPT_4O_LATEST),
)
async def answer_question(_: AnswerQuestionInput) -> Run[AnswerQuestionOutput]: ...
@@ -261,6 +317,29 @@ except WorkflowAIError as e:
print(e.message)
```

#### Recoverable errors

Sometimes the LLM outputs an object that is only partially valid. Common examples are:

- the model context window was exceeded during the generation
- the model decided that a tool call result was a failure

In these cases, an agent that returns only the output will always raise an `InvalidGenerationError`, which
subclasses `WorkflowAIError`.

However, an agent that returns a full run object will try to recover from the error by using the partial output.

```python

run = await agent(input=Input(name="John"))

# The run will have an error
assert run.error is not None

# The run will have a partial output
assert run.output is not None
```

### Defining input and output types

There are some important subtleties when defining input and output types.
@@ -368,3 +447,32 @@ async for run in say_hello(Input(name="John")):
print(run.output.greeting1) # will be empty if the model has not generated it yet

```

#### Field properties

Pydantic allows a variety of other validation criteria for fields: minimum, maximum, pattern, etc.
These additional criteria are included in the JSON Schema that is sent to WorkflowAI, and are sent to the model.

```python
class Input(BaseModel):
name: str = Field(min_length=3, max_length=10)
age: int = Field(ge=18, le=100)
email: str = Field(pattern=r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$")
```

These arguments can be used to steer the model in the right direction. The caveat is that validation that is
too strict can lead to invalid generations. In case of an invalid generation:

- WorkflowAI retries the inference once by providing the model with the invalid output and the validation error
- if the model still fails to generate a valid output, the run fails with an `InvalidGenerationError`;
  the partial output is available in the `partial_output` attribute of the `InvalidGenerationError`

```python
@agent()
async def my_agent(_: Input) -> Output: ...
```
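
For illustration, a hedged sketch of recovering the partial output when this happens (import paths omitted; `InvalidGenerationError` and its `partial_output` attribute are described above):

```python
try:
    output = await my_agent(Input(name="John", age=30, email="john@example.com"))
except InvalidGenerationError as e:
    # Fall back to whatever the model did manage to generate
    output = e.partial_output
```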

## Contributing

See the [CONTRIBUTING.md](./CONTRIBUTING.md) file for more details.
6 changes: 3 additions & 3 deletions poetry.lock


1 change: 1 addition & 0 deletions pyproject.toml
@@ -64,6 +64,7 @@ unfixable = []
# in bin we use rich.print
"bin/*" = ["T201"]
"*_test.py" = ["S101"]
"conftest.py" = ["S101"]

[tool.pyright]
pythonVersion = "3.9"
4 changes: 2 additions & 2 deletions tests/e2e/tools_test.py
@@ -6,7 +6,7 @@

from workflowai import Run, agent
from workflowai.core.domain.model import Model
from workflowai.core.domain.tool import ToolDefinition
from workflowai.core.domain.tool_call import ToolCallResult
from workflowai.core.domain.version_properties import VersionProperties

@@ -20,7 +20,7 @@ class AnswerQuestionOutput(BaseModel):


async def test_manual_tool():
get_current_time_tool = ToolDefinition(
name="get_current_time",
description="Get the current time",
input_schema={},
90 changes: 90 additions & 0 deletions tests/integration/conftest.py
@@ -0,0 +1,90 @@
import json
from typing import Any, Optional
from unittest.mock import patch

import pytest
from pydantic import BaseModel
from pytest_httpx import HTTPXMock, IteratorStream

from workflowai.core.client.client import WorkflowAI


@pytest.fixture(scope="module", autouse=True)
def init_client():
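    """Point the shared client at a test API key and endpoint for the duration of the test module."""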
with patch("workflowai.shared_client", new=WorkflowAI(api_key="test", endpoint="https://run.workflowai.dev")):
yield


class CityToCapitalTaskInput(BaseModel):
city: str


class CityToCapitalTaskOutput(BaseModel):
capital: str


class IntTestClient:
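    """Helper around HTTPXMock to mock WorkflowAI endpoints and assert on outgoing requests."""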
REGISTER_URL = "https://api.workflowai.dev/v1/_/agents"

def __init__(self, httpx_mock: HTTPXMock):
self.httpx_mock = httpx_mock

def mock_register(self, schema_id: int = 1, task_id: str = "city-to-capital", variant_id: str = "1"):
self.httpx_mock.add_response(
method="POST",
url=self.REGISTER_URL,
json={"schema_id": schema_id, "variant_id": variant_id, "id": task_id},
)

def mock_response(
self,
task_id: str = "city-to-capital",
capital: str = "Tokyo",
json: Optional[dict[str, Any]] = None,
url: Optional[str] = None,
status_code: int = 200,
):
self.httpx_mock.add_response(
method="POST",
url=url or f"https://run.workflowai.dev/v1/_/agents/{task_id}/schemas/1/run",
json=json or {"id": "123", "task_output": {"capital": capital}},
status_code=status_code,
)

def mock_stream(self, task_id: str = "city-to-capital"):
self.httpx_mock.add_response(
url=f"https://run.workflowai.dev/v1/_/agents/{task_id}/schemas/1/run",
stream=IteratorStream(
[
b'data: {"id":"1","task_output":{"capital":""}}\n\n',
b'data: {"id":"1","task_output":{"capital":"Tok"}}\n\ndata: {"id":"1","task_output":{"capital":"Tokyo"}}\n\n', # noqa: E501
b'data: {"id":"1","task_output":{"capital":"Tokyo"},"cost_usd":0.01,"duration_seconds":10.1}\n\n',
],
),
)

def check_request(
self,
version: Any = "production",
task_id: str = "city-to-capital",
task_input: Optional[dict[str, Any]] = None,
**matchers: Any,
):
request = self.httpx_mock.get_request(**matchers)
assert request is not None
assert request.url == f"https://run.workflowai.dev/v1/_/agents/{task_id}/schemas/1/run"
body = json.loads(request.content)
assert body == {
"task_input": task_input or {"city": "Hello"},
"version": version,
"stream": False,
}
assert request.headers["Authorization"] == "Bearer test"
assert request.headers["Content-Type"] == "application/json"
assert request.headers["x-workflowai-source"] == "sdk"
assert request.headers["x-workflowai-language"] == "python"


@pytest.fixture
def test_client(httpx_mock: HTTPXMock) -> IntTestClient:
return IntTestClient(httpx_mock)