diff --git a/python-recipes/agents/05_agent_framework_agent.ipynb b/python-recipes/agents/05_agent_framework_agent.ipynb new file mode 100644 index 00000000..3de6b4ae --- /dev/null +++ b/python-recipes/agents/05_agent_framework_agent.ipynb @@ -0,0 +1,648 @@ +{ + "cells": [
+ { + "cell_type": "markdown", + "metadata": { + "id": "a9U6R3RkGoCx" + }, + "source": [ + "![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)\n", + "\n", + "# Agentic tasks with Agent Framework and Redis\n", + "\n", + "This notebook demonstrates how to build an agent using Microsoft's [Agent Framework](https://github.com/microsoft/agent-framework/tree/main).\n", + "\n", + "It adapts the work found in the [Redis-AI-Resources Autogen Agent Tutorial](https://github.com/redis-developer/redis-ai-resources/blob/main/python-recipes/agents/04_autogen_agent.ipynb), and simply recreates the same setup with Agent Framework instead.\n", + "\n", + "We'll define an agent, give it access to tools and memory, then set it on a task to see how it uses its abilities.\n", + "\n", + "## Note\n", + "Agent Framework relies on either the OpenAI or Azure OpenAI API for LLM capability. This notebook will rely on the OpenAI API.\n", + "\n", + "## Let's Begin!\n", + "\n", + "\"Open\n" + ] + },
+ { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 1000 + }, + "id": "Yx5yx41OVOBw", + "outputId": "dcff6959-305d-4dd6-c205-aaee1278a356" + }, + "outputs": [], + "source": [ + "%pip install agent-framework --pre\n", + "%pip install -q \"redisvl>=0.8.0\" sentence-transformers openai tiktoken python-dotenv redis google pandas requests" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "sc4Jn-UYaohD" + }, + "source": [ + "## For Colab: download and run a Redis instance\n", + "\n", + "\n" + ] + },
+ { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "_ujoLxHWaoCK", + "outputId": "a51733ed-dfae-4860-b6b9-1093c36ffbe0" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb jammy main\n", + "Starting redis-stack-server, database path /var/lib/redis-stack\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "gpg: cannot open '/dev/tty': No such device or address\n", + "curl: (23) Failed writing body\n" + ] + } + ], + "source": [ + "# NBVAL_SKIP\n", + "%%sh\n", + "curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg\n", + "echo \"deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main\" | sudo tee /etc/apt/sources.list.d/redis.list\n", + "sudo apt-get update > /dev/null 2>&1\n", + "sudo apt-get install redis-stack-server > /dev/null 2>&1\n", + "redis-stack-server --daemonize yes" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "XSCgi3Ppa6B1" + }, + "source": [ + "## Format plain text into Markdown-style blockquotes for display in a Colab notebook" + ] + },
+ { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "idsIktqlVpYb" + }, + "outputs": [], + "source": [ + "import textwrap\n", + "\n", + "from IPython.display import display\n", + "from IPython.display import Markdown\n", + "\n", + "def to_markdown(text):\n", + "    text = text.replace('•', ' 
*')\n", + "    return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "sWNdJqBTa_Ff" + }, + "source": [ + "## Import Dependencies" + ] + },
+ { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "id": "IJsiE43wWSdC" + }, + "outputs": [], + "source": [ + "import asyncio\n", + "import os\n", + "\n", + "import json\n", + "import re\n", + "import requests\n", + "from collections import Counter\n", + "from typing import List" + ] + },
+ { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "AMlNJ8u4HMNr", + "outputId": "1cf55d70-d604-4814-8d95-fd5fa889e0d9" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Connecting to Redis at: redis://localhost:6379\n" + ] + } + ], + "source": [ + "# Use the environment variable if set, otherwise default to localhost\n", + "REDIS_URL = os.getenv(\"REDIS_URL\", \"redis://localhost:6379\")\n", + "print(f\"Connecting to Redis at: {REDIS_URL}\")" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "neDrDDx9HOiG" + }, + "source": [ + "## Building our agent\n", + "\n", + "We'll be building a restaurant review writing agent that takes in a set of restaurant reviews, collects relevant information, and provides a summary and analysis you can use for SEO." + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "Wc7tFmQAHRe_" + }, + "source": [ + "### Defining tools\n", + "One of the defining features of an agent is its ability to use tools, so let's give it some. Our agent will decide when to call each tool, construct the appropriate arguments, then retrieve and utilize the results.\n", + "\n", + "With Agent Framework, that just requires we define a well-named function with type hints in its signature.\n", + "\n", + "We will have three main tools:\n", + "1. A `summarize()` function that can take in a collection of reviews and boil them all down to a single summary.\n", + "2. A `get_keywords()` function to count the most common words present in the article, because LLMs often struggle with counting.\n", + "3. A `publish_article()` function that will write our final article to a separate file we can then upload elsewhere."
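+ , + "\n", + "\n", + "As a quick aside (not part of the agent flow itself): once the next cell defines these functions, you can sanity-check a tool by calling it directly with `await`. The sample text below is hypothetical, and the result follows directly from the stopword filtering and counting logic shown in the cell that follows:\n", + "\n", + "```python\n", + "# illustrative only - call the tool directly, outside the agent\n", + "sample = \"The pasta was amazing and the pasta sauce was rich\"\n", + "await get_keywords(sample)  # -> ['pasta', 'amazing', 'sauce', 'rich'] (order of ties may vary)\n", + "```"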
+ ] + },
+ { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "id": "cur0F0M3HURD" + }, + "outputs": [], + "source": [ + "from transformers import pipeline  # summarization model used by the summarize() tool\n", + "\n", + "\n", + "async def summarize(restaurant_name: str, all_reviews: List[str]) -> str:\n", + "    \"\"\"takes a list of reviews for a single restaurant and returns a summary of all of them.\"\"\"\n", + "    # set up a summarizer model\n", + "    summarizer = pipeline('summarization', model='facebook/bart-large-cnn')\n", + "    # pass all the reviews\n", + "    summary = summarizer(\n", + "        '\\n'.join(all_reviews),  # concatenate all the reviews together\n", + "        truncation=True,  # truncate input that exceeds the model's maximum length\n", + "        max_length=1024,\n", + "        min_length=128,\n", + "        do_sample=False,\n", + "    )[0][\"summary_text\"]\n", + "    return restaurant_name + \": \" + summary\n", + "\n", + "\n", + "async def get_keywords(full_text: str) -> List[str]:\n", + "    \"\"\"extract the most commonly occurring keywords present in the reviews to know\n", + "    which terms it is likely to rank highly for in keyword search engines.\"\"\"\n", + "    # define a set of common English stopwords to ignore\n", + "    STOPWORDS = {\n", + "        'the', 'of', 'and', 'to', 'for', 'in', 'on', 'at', 'a', 'an', 'is', 'it', 'its', 'with', 'as', 'by', 'from', 'that',\n", + "        'this', 'those', 'be', 'are', 'was', 'were', 'or', 'but', 'not', 'so', 'if', 'then', 'than', 'which', 'who', 'whom',\n", + "        'about', 'into', 'out', 'up', 'down', 'over', 'under', 'again', 'further', 'once', 'here', 'there', 'when',\n", + "        'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most', 'other', 'some', 'such', 'no',\n", + "        'nor', 'only', 'own', 'same', 'can', 'will', 'just', 'don', 'should', 'now', 'has', 'have', 'had', 'do', 'does',\n", + "        'did', 'their', 'them', 'they', 'you', 'your', 'yours', 'he', 'him', 'his', 'she', 'her', 'hers', 'we', 'us',\n", + "        'our', 'ours', 'i', 's', 'me', 'my', 'mine', 'also', 'place'\n", + "    }\n", + "    # remove punctuation and lowercase the text\n", + "    words = re.findall(r'\\b\\w+\\b', full_text.lower())\n", + "    # filter out stopwords\n", + "    filtered_words = [word for word in words if word not in STOPWORDS]\n", + "    # count occurrences\n", + "    word_counts = Counter(filtered_words)\n", + "    # return the top 10\n", + "    return [word for word, _ in word_counts.most_common(10)]\n", + "\n", + "\n", + "async def publish_article(final_draft: str, file_name: str = \"food_article.md\") -> str:\n", + "    \"accepts the final version of an article, writes it to a markdown file and returns the full file location path.\"\n", + "    with open(file_name, 'w') as file:\n", + "        file.write(final_draft)\n", + "\n", + "    # __file__ is not defined inside a notebook, so resolve the path from the file we just wrote\n", + "    return os.path.abspath(file_name)" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "KjuT0X5WIIKt" + }, + "source": [ + "## Adding relevant memories\n", + "Our agent needs to know what people think of these restaurants, so we'll add the user reviews to our agent memory, powered by Redis.\n" + ] + },
+ { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "id": "bsPzxix6IJgn" + }, + "outputs": [], + "source": [ + "# fetch the reviews from our public S3 bucket\n", + "# the original dataset can be found here: https://www.kaggle.com/datasets/jkgatt/restaurant-data-with-100-trip-advisor-reviews-each\n", + "def fetch_data(file_name):\n", + "    dataset_path = 'datasets/'\n", + "    try:\n", + "        with open(dataset_path + file_name, 'r') as f:\n", + "            return json.load(f)\n", + "    except (FileNotFoundError, json.JSONDecodeError):\n", + "        # fall back to downloading the file when a local copy isn't available\n", + "        url = 'https://redis-ai-resources.s3.us-east-2.amazonaws.com/recommenders/datasets/two-towers/'\n", + "        r = requests.get(url + 
file_name)\n", + "        if not os.path.exists(dataset_path):\n", + "            os.makedirs(dataset_path)\n", + "        with open(dataset_path + file_name, 'wb') as f:\n", + "            f.write(r.content)\n", + "        return json.loads(r.content.decode('utf-8'))" + ] + },
+ { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "Uq9-_1BqIK1r", + "outputId": "24f3aa38-3837-4bd8-dccb-7aa304d90640" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "we have 147 restaurants in our dataset, with 14700 total reviews\n" + ] + } + ], + "source": [ + "restaurant_data = fetch_data('factual_tripadvisor_restaurant_data_all_100_reviews.json')\n", + "\n", + "print(f\"we have {restaurant_data['restaurant_count']} restaurants in our dataset, with {restaurant_data['total_review_count']} total reviews\")\n", + "\n", + "restaurant_reviews = restaurant_data[\"restaurants\"] # ignore the count fields\n", + "\n", + "# drop some of the fields that we don't need\n", + "for restaurant in restaurant_reviews:\n", + "    for field in ['region', 'country', 'tel', 'fax', 'email', 'website', 'address_extended', 'chain_name', 'trip_advisor_url']:\n", + "        if field in restaurant:\n", + "            restaurant.pop(field)" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "Hqu5erM0bBjc" + }, + "source": [ + "## Initialize Redis Provider + Vectorizer" + ] + },
+ { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "sHCkRr_2IOBK" + }, + "outputs": [], + "source": [ + "from agent_framework import ChatMessage, Role\n", + "from agent_framework.openai import OpenAIChatClient\n", + "from agent_framework_redis._chat_message_store import RedisChatMessageStore\n", + "from agent_framework_redis._provider import RedisProvider\n", + "from redisvl.extensions.cache.embeddings import EmbeddingsCache\n", + "from redisvl.utils.vectorize import OpenAITextVectorizer\n", + "from logging import WARNING, getLogger\n", + "from tqdm import tqdm\n", + "\n", + "logger = getLogger()\n", + "logger.setLevel(WARNING)\n", + "\n", + "REDIS_OPENAI_EMBEDDINGS_CACHE_NAME = os.getenv(\"REDIS_OPENAI_EMBEDDINGS_CACHE_NAME\", \"openai_embeddings_cache\")\n", + "OPENAI_API_KEY = os.getenv(\"OPENAI_API_KEY\", \"\")\n", + "OPENAI_CHAT_MODEL_ID = os.getenv(\"OPENAI_CHAT_MODEL_ID\", \"gpt-5-mini\")\n", + "OPENAI_EMBEDDING_MODEL_ID = os.getenv(\"OPENAI_EMBEDDING_MODEL_ID\", \"text-embedding-3-small\")\n", + "\n", + "thread_id = \"test_thread\"\n", + "\n", + "vectorizer = OpenAITextVectorizer(\n", + "    model=OPENAI_EMBEDDING_MODEL_ID,\n", + "    api_config={\"api_key\": OPENAI_API_KEY},\n", + "    cache=EmbeddingsCache(name=REDIS_OPENAI_EMBEDDINGS_CACHE_NAME, redis_url=REDIS_URL)\n", + ")\n", + "\n", + "provider = RedisProvider(\n", + "    redis_url=REDIS_URL,\n", + "    index_name=\"restaurant_reviews\",\n", + "    prefix=\"trip_advisor\",\n", + "    application_id=\"restaurant_reviews_app\",\n", + "    agent_id=\"restaurant_reviews_agent\",\n", + "    user_id=\"restaurant_reviews_user\",\n", + "    redis_vectorizer=vectorizer,\n", + "    vector_field_name=\"vector\",\n", + "    vector_algorithm=\"hnsw\",\n", + "    vector_distance_metric=\"cosine\",\n", + "    thread_id=thread_id,\n", + "    overwrite_index=True\n", + ")\n" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "MFqKQ3N8XG3u" + }, + "source": [ + "### Add Memories in batches to the Redis Memory provider" + ] + },
+ { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "DXTkhG1AXFse" + },
+ "outputs": [], + "source": [ + "batch_size = 128\n", + "messages = []\n", + "await provider.thread_created(thread_id=thread_id)\n", + "for restaurant in tqdm(restaurant_reviews):\n", + "    # add each review to our agent memory\n", + "    # for brevity we'll take only the first 10 reviews per restaurant\n", + "    for review in restaurant['reviews'][:10]:\n", + "        try:\n", + "            review_title = review['review_title']\n", + "            review_text = review[\"review_text\"]\n", + "            meta_data = str(\"\\n\".join([str(key) + \": \" + str(val) for key, val in restaurant.items() if key != \"reviews\"]))\n", + "            memory = \"\\n\".join([review_title, review_text, meta_data])\n", + "            message = ChatMessage(role='system', conversation_id=thread_id, text=memory)\n", + "            messages.append(message)\n", + "            if len(messages) == batch_size:\n", + "                # send the whole accumulated batch, not just the latest message\n", + "                await provider.invoked(request_messages=messages)\n", + "                messages = []\n", + "        except Exception as e:\n", + "            print(e)\n", + "\n", + "# flush any remaining messages that didn't fill a complete batch\n", + "if messages:\n", + "    await provider.invoked(request_messages=messages)" + ] + },
+ { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "fO2OVM7lTQPS", + "outputId": "1d4200a4-83f2-4bd5-df12-04ef3968c0d5" + }, + "outputs": [], + "source": [ + "# Viewing sample reviews \n", + "# all=await provider.search_all()\n", + "# # Number of Reviews\n", + "# print(len(all))\n", + "# # Sample Review\n", + "# # print(all[100][\"content\"])\n" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "tiTIAnWWW9Dn" + }, + "source": [ + "### Initialize Agent with Provider and Chat Message Store" + ] + },
+ { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "id": "JSc_Pg3QJmLQ" + }, + "outputs": [], + "source": [ + "chat_message_store_factory = lambda: RedisChatMessageStore(\n", + "    redis_url=REDIS_URL,\n", + "    thread_id=\"test_thread\",\n", + "    key_prefix=\"chat_messages\",\n", + "    max_messages=100\n", + ")\n", + "\n", + "client = OpenAIChatClient(model_id=OPENAI_CHAT_MODEL_ID, api_key=OPENAI_API_KEY)\n", + "# Create agent wired to the Redis context provider. The provider automatically\n", + "# persists conversational details and surfaces relevant context on each turn.\n", + "review_agent = client.create_agent(\n", + "    name=\"MemoryEnhancedAssistant\",\n", + "    instructions=(\n", + "        \"You are a helpful assistant. Personalize replies using provided context. \"\n", + "        \"Before answering, always check for stored context.\"\n", + "    ),\n", + "    tools=[summarize, get_keywords, publish_article],\n", + "    context_providers=provider,\n", + "    chat_message_store_factory=chat_message_store_factory,\n", + ")" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "FKkMhtmGW3TZ" + }, + "source": [ + "### Teaching Preferences" + ] + },
+ { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "eD1tTAVsWt3x", + "outputId": "e1593ef6-aecf-47e8-9a61-0b20ba161211" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Got it — I remember you hate beef. I’ll avoid recommending beef dishes and beef‑focused restaurants going forward.\n", + "\n", + "Would you like a beef‑free shortlist now? 
(Options: by neighborhood, by price tier, by cuisine, or for a special occasion.)\n" + ] + } + ], + "source": [ + "preference = \"Remember that I hate beef\"\n", + "result = await review_agent.run(preference)\n", + "print(result)" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "kjsFtIR0W5FO" + }, + "source": [ + "### Running a Streaming Task" + ] + },
+ { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "-NlF9YQiL5n2", + "outputId": "5fd90ef2-2248-424f-9975-d6f52bfc7ce9" + }, + "outputs": [], + "source": [ + "# import asyncio\n", + "\n", + "# async def streaming_example(agent, query) -> None:\n", + "#     print(f\"User: {query}\")\n", + "#     print(\"Agent: \", end=\"\", flush=True)\n", + "#     async for chunk in agent.run_stream(query):\n", + "#         if chunk.text:\n", + "#             print(chunk.text, end=\"\", flush=True)\n", + "#     print(\"\\n\")\n", + "\n", + "# writing_task = \"Write an article reviewing the restaurants in the San Francisco Bay Area. \\\n", + "#     Include a brief summary of the most popular restaurants based on the user reviews. \\\n", + "#     Group them into categories based on their cuisine and price, and talk about the \\\n", + "#     top rated restaurants in each category.\"\n", + "\n", + "# await streaming_example(review_agent, writing_task)" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "sA0d3smMWl9C" + }, + "source": [ + "That's a lot of output from our agent, and it can be hard to parse through all of it.\n", + "\n", + "There's no need for us to read it closely, as the final article will be written cleanly to an external file when we - or rather, our agent - are finished." + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "MbeI2TouXNog" + }, + "source": [ + "## Follow up tasks\n", + "Our agent doesn't have to be a one-and-done worker. We can ask it to continue toward our overall goal of having a widely read food critic article.\n", + "\n", + "You've probably noticed the agent's output is somewhat messy. That's OK, as our final article will be written cleanly to a markdown file." + ] + },
+ { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "F0pGU--3MnLq", + "outputId": "29a4b7d9-f8fa-450e-80db-893ae143e522" + }, + "outputs": [], + "source": [ + "# task_list = [\"Now analyze your article and tell me the key search terms it is likely to rank highly for.\",\n", + "#     \"Using your analysis suggest changes to the original article to improve keyword ranking.\",\n", + "#     \"Based on your suggestions, edit and modify your article to improve SEO keyword ranking. Give a new list of top keywords.\",\n", + "#     \"When it is ready, publish the article by saving it to a markdown file.\"]\n", + "\n", + "# for task in task_list:\n", + "#     await streaming_example(review_agent, task)" + ] + },
+ { + "cell_type": "markdown", + "metadata": { + "id": "RpLtRTmlXSWr" + }, + "source": [ + "## The finished product\n", + "We got another large block of agent output showing us its hard work. What we really care about is the finished product. Check your local directory for a markdown file with our finished article.\n", + "\n", + "That's it!"
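+ , + "\n", + "\n", + "If you'd like to preview it inline, here is a minimal sketch (assuming the agent kept `publish_article()`'s default `food_article.md` file name; it may have chosen another):\n", + "\n", + "```python\n", + "# optional: render the published article in the notebook, reusing to_markdown() from earlier\n", + "from pathlib import Path\n", + "\n", + "article = Path(\"food_article.md\")  # default file name from publish_article()\n", + "if article.exists():\n", + "    display(to_markdown(article.read_text()))\n", + "else:\n", + "    print(\"No article found yet - ask the agent to publish it first.\")\n", + "```"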
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HcPsZZcBabuv" + }, + "outputs": [], + "source": [] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.13.7" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}