4 changes: 0 additions & 4 deletions Text-Summarizer-Browser-Plugin/.gitignore
@@ -61,10 +61,6 @@ local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

274 changes: 131 additions & 143 deletions Text-Summarizer-Browser-Plugin/README.md
@@ -1,178 +1,166 @@
# Text summarizer browser Plugin Sample
# Text Summarizer Browser Plugin

A plug-and-play Chrome extension seamlessly integrates with Flask and leverages an OpenVINO backend for fast and efficient summarization of webpages (via URL) and PDFs (via upload). Powered by LangChain tools, it handles advanced tasks like text splitting and vectorstore management to deliver accurate and meaningful summaries.
A plug-and-play Chrome extension that integrates with a FastAPI backend and uses OpenVINO for fast, efficient summarization of webpages (via URL) and PDFs (via upload). It relies on LangChain tools for text splitting and vector store management to deliver accurate and meaningful summaries.

## How it Works
---

<img width="1000" alt="image" src="./assets/Text-Summarizer-Overview.png">
## Table of Contents
- [Overview](#overview)
- [Architecture](#architecture)
- [Prerequisites](#prerequisites)
- [Project Structure](#project-structure)
- [Setup & Installation](#setup--installation)
  - [Environment Setup](#environment-setup)
  - [Model Preparation](#model-preparation)
  - [Running the Backend](#running-the-backend)
  - [Loading the Chrome Extension](#loading-the-chrome-extension)
- [Usage](#usage)
  - [Webpage/PDF Summarization](#webpagepdf-summarization)
- [Troubleshooting](#troubleshooting)
- [License](#license)

## Sample Structure
---

The directory contains:
- **backend:** Includes `code.py` and `server.py` for processing text from webpages or PDFs and managing Flask-related operations.
- **extension:** Contains `manifest.json` for the Chrome extension along with `popup.html`, `popup.js`, and `style.css` for the user interface.
## Overview

## Prerequisites
This project provides a minimalist Chrome extension that allows users to summarize the content of any webpage or PDF directly from their browser. The extension communicates with a FastAPI server, which performs the summarization using OpenVINO-optimized models and LangChain utilities.

| Optimized for | Description |
| :------------ | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| OS | Windows 11 64-bit (22H2, 23H2) and newer or Ubuntu* 22.04 64-bit (with Linux kernel 6.6+) and newer |
| Hardware | Intel® Core™ Ultra Processors |
| Software | 1. [Intel® GPU drivers from Intel® Arc™ & Iris® Xe Graphics for Windows](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html) or [Linux GPU drivers](https://dgpu-docs.intel.com/driver/client/overview.html) <br> 2. NPU(Optional): [Intel® NPU Driver for Windows](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html) or [Linux NPU Driver](https://github.com/intel/linux-npu-driver/releases) |
| Browsers | [Google Chrome](https://www.google.com/chrome/dr/download/?brand=MRUS&ds_kid=43700079286123654&gad_source=1&gclid=EAIaIQobChMI0J3fybvSigMV5dXCBB1TDARCEAAYASAAEgL36_D_BwE&gclsrc=aw.ds) & [Microsoft Edge](https://www.microsoft.com/en-us/edge/download?form=MA13FJ)


1. **Install the below necessary tools/packages:**
- Git
- [Git for Windows](https://git-scm.com/downloads)
- Git for Linux
- For Debian/Ubuntu-based systems:
```bash
sudo apt update && sudo apt -y install git
```
- For RHEL/CentOS-based systems:
```bash
sudo dnf update && sudo dnf -y install git
```
- Miniforge
- [Miniforge for Windows](https://conda-forge.org/download/)
- Miniforge for Linux
Download and install Miniforge using the commands below.
```bash
wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
```
Replace </move/to/miniforge3/bin/folder> with your actual Miniforge bin folder path and run the cd command to go there. Initialize the conda environment and restart the terminal.
```bash
cd </move/to/miniforge3/bin/folder>
```
```bash
./conda init
```

2. **Create a Conda Environment:**
- Run the command:
```bash
conda create -n summarizer_plugin python=3.11 libuv
```
```bash
conda activate summarizer_plugin
```
## Architecture

3. **Install Dependencies:**
- Execute:
```bash
pip install -r requirements.txt
```

4. **Install an ipykernel to select the "summarizer_plugin" environment:**
```bash
python -m ipykernel install --user --name=summarizer_plugin
```

>**Note**: Run your terminal as admin to avoid any permission issues.
- **Chrome Extension:** User interface for input (webpage URL or PDF upload) and displaying summaries.
- **FastAPI Server:** Backend for processing requests, running summarization models, and returning results.
- **OpenVINO & LangChain:** Accelerated inference and advanced text processing.
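
The text-processing step can be pictured with a small sketch. Long pages and PDFs usually have to be cut into overlapping chunks before a model with a limited context window can summarize them. The snippet below is a standard-library illustration of that idea only: `split_text`, `chunk_size`, and `overlap` are hypothetical names, and this is not the LangChain splitter that the backend uses.

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Hypothetical helper for illustration; the real backend relies on
    LangChain's text splitters, which are more sophisticated.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    page_text = "OpenVINO accelerates inference on Intel hardware. " * 40
    chunks = split_text(page_text, chunk_size=200, overlap=20)
    print(f"{len(chunks)} chunks, first chunk has {len(chunks[0])} characters")
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of processing some text twice.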


## Next Steps
Below are the steps to run the plugin from Jupyter Notebook **OR** Terminal.
![Architecture Overview](./assets/Text-Summarizer-Overview.png)

### **Steps to follow via Jupyter Notebook:**
Once the environment is created, we can run the plugin via [TextSummarizerPlugin.ipynb](./TextSummarizerPlugin.ipynb). Please follow the below steps to open the jupyter notebook:
1. Open Jupyter Notebook & run the cells:
```
jupyter notebook
```
2. Select the **summarizer_plugin** kernel.
---

### **Steps to follow via Terminal:**
1. **Download and Convert the Huggingface Model to OpenVINO IR Format:**
- Log in to Huggingface:
```
huggingface-cli login
```
- Generate a token from Huggingface. For private or gated models, refer to [Huggingface documentation](https://huggingface.co/docs/hub/en/models-gated).
- Convert the model using `optimum-cli`: create a directory named **models** and save the converted models inside it:
## Prerequisites

| Component | Details |
|-------------|-------------|
| OS | Windows 11 64-bit (22H2, 23H2+) or Ubuntu 22.04 64-bit (kernel 6.6+)|
| Hardware | Intel® GPU (Arc™ & Iris® Xe) drivers ([Windows](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html) / [Linux](https://dgpu-docs.intel.com/driver/client/overview.html)), NPU (optional: [Windows](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html) / [Linux](https://github.com/intel/linux-npu-driver/releases)), Intel® Core™ Ultra Processors |
| Software | [uv](https://docs.astral.sh/uv/)|
| Browsers | [Google Chrome](https://www.google.com/chrome/) or [Microsoft Edge](https://www.microsoft.com/en-us/edge/download)|

---

## Project Structure

```
Text-Summarizer-Browser-Plugin/
├── backend/                     # FastAPI server and model code
│   ├── code.py
│   ├── server.py
│   └── ...
├── extension/                   # Chrome extension files
│   ├── manifest.json
│   ├── popup.html
│   ├── popup.js
│   └── style.css
├── models/                      # OpenVINO IR models
│   └── ...
├── assets/                      # Images and diagrams
├── TextSummarizerPlugin.ipynb   # Jupyter notebook for running the backend
├── README.md
└── pyproject.toml
```

---

## Setup & Installation

### Environment Setup

1. **Install `uv`** ([docs](https://docs.astral.sh/uv/getting-started/installation/)):
- **Windows:**
```sh
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```
mkdir models
cd models
optimum-cli export openvino --model meta-llama/Llama-2-7b-chat-hf --weight-format int4 ov_llama_2
optimum-cli export openvino --model Qwen/Qwen2-7B-Instruct --weight-format int4 ov_qwen7b

- **Linux:**
```sh
curl -LsSf https://astral.sh/uv/install.sh | sh
# or
wget -qO- https://astral.sh/uv/install.sh | sh
```
>**Note**: [Raise access request](https://www.llama.com/llama-downloads) for Llama models as it is a gated repository.


2. **Install dependencies:**
```sh
cd AI-PC_Notebooks/Text-Summarizer-Browser-Plugin/
uv sync
# If you encounter issues, clear the cache and sync again:
uv cache clean
uv sync
```

2. **Load the Extension:**
- To load an unpacked extension in developer mode:
- Go to the Extensions page by entering **chrome://extensions** in a new tab. (By design chrome:// URLs are not linkable.)
- Alternatively, **click the Extensions menu puzzle button and select Manage Extensions** at the bottom of the menu.
- Or, click the Chrome menu, hover over More Tools, then select Extensions.
- Enable **Developer Mode** by clicking the toggle switch next to Developer mode.
- Click the **Load unpacked** button and select the extension directory.
- Refer to [Chrome’s development documentation](https://developer.chrome.com/docs/extensions/get-started/tutorial/hello-world#load-unpacked) for further details.

<img width="389" alt="image" src="https://github.com/user-attachments/assets/c276f522-6f03-4aac-91ff-d38faf8c1f67">

### Model Preparation

1. **Log in to Hugging Face:**
```sh
uv run huggingface-cli login
```
2. **Export models to OpenVINO IR format:**

3. **Pin the Extension:**
> ⚠️ **Warning:** Exporting models can take a significant amount of time, potentially exceeding 10 minutes depending on your hardware and network conditions.

Pin your extension to the toolbar to quickly access your extension.
> **Note:** Llama models require [access approval](https://www.llama.com/llama-downloads).

<img width="389" alt="image" src="https://github.com/user-attachments/assets/1bcc1571-b2d6-4ece-a3ca-c435733436b5">


```sh
mkdir models
cd models
uv run optimum-cli export openvino --model meta-llama/Llama-2-7b-chat-hf --weight-format int4 ov_llama_2
uv run optimum-cli export openvino --model Qwen/Qwen2-7B-Instruct --weight-format int4 ov_qwen7b
```
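
Since the export is slow, it can be worth confirming that it finished before starting the backend. The sketch below checks each output directory for the IR files. It assumes the export writes `openvino_model.xml` and `openvino_model.bin`, which is the usual layout for `optimum-cli export openvino` but may differ between versions; `missing_ir_files` is a hypothetical helper, and the script is meant to be run from the project root.

```python
from pathlib import Path

# Assumption: these are the IR file names the OpenVINO export normally writes.
EXPECTED_FILES = ("openvino_model.xml", "openvino_model.bin")


def missing_ir_files(model_dir) -> list[str]:
    """Return the expected IR file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Directory names match the export commands above.
    for name in ("ov_llama_2", "ov_qwen7b"):
        missing = missing_ir_files(Path("models") / name)
        status = "ok" if not missing else f"missing {', '.join(missing)}"
        print(f"{name}: {status}")
```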

### Running the Backend

#### Steps to Run the Plugin
- **Via Jupyter Notebook:**

1. **Start the Flask Server:**
- Navigate to the backend folder:
```
cd ../backend
python server.py
```

2. **Open the Chrome Browser:**
- Activate & Pin the loaded extension.
- Plugin UI looks as follows:
Launch JupyterLab:
```sh
uv run jupyter lab
```
Next, open `TextSummarizerPlugin.ipynb` and run all cells.

<img width="286" alt="image" src="https://github.com/user-attachments/assets/37349acc-ff37-437b-928a-673ca4ad3986">
- **Via Terminal:**

3. **Select an OpenVINO Model:**
- Choose an OpenVINO IR format model previously converted from Huggingface.

<img width="286" alt="image" src="https://github.com/user-attachments/assets/953050c9-c64c-4ce6-831d-626a52547d0b">
Start the FastAPI server:
```sh
cd backend
uv run fastapi dev server.py --port 5000
```
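
Before opening the extension, a quick way to confirm that the server is accepting connections is a plain TCP check. This is a minimal sketch that assumes the server listens on `127.0.0.1:5000`, as in the command above; `server_is_listening` is a hypothetical helper and does not call any endpoint of the backend.

```python
import socket


def server_is_listening(host: str = "127.0.0.1", port: int = 5000,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if server_is_listening():
        print("Backend is accepting connections on port 5000.")
    else:
        print("Nothing is listening on port 5000; start the server first.")
```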

### Loading the Chrome Extension

4. **Interact with the UI:**
- Choose either **Web Page** or **PDF** post selecting one of the converted OV models:
1. Open Chrome and navigate to `chrome://extensions`.
2. Enable **Developer Mode**.
3. Click **Load unpacked** and select the `extension` directory.
4. Pin the extension for quick access.

<img width="285" alt="image" src="https://github.com/user-attachments/assets/065022e9-c9a2-474c-ae4e-5a2f298a9934">
![Loading the Chrome Extension](assets/load_extension.png)
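
If **Load unpacked** rejects the folder, the cause is often a missing or malformed `manifest.json`. The sketch below checks for the three keys Chrome requires in every manifest (`manifest_version`, `name`, `version`). `check_manifest` is a hypothetical helper, not part of this project, and it does not validate anything beyond those keys.

```python
import json
from pathlib import Path

# Keys that Chrome requires in every extension manifest.
REQUIRED_KEYS = ("manifest_version", "name", "version")


def check_manifest(extension_dir) -> list[str]:
    """Return a list of problems found in extension_dir/manifest.json."""
    manifest_path = Path(extension_dir) / "manifest.json"
    if not manifest_path.is_file():
        return [f"{manifest_path} not found"]
    try:
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"manifest.json is not valid JSON: {exc}"]
    if not isinstance(manifest, dict):
        return ["manifest.json must contain a JSON object"]
    return [f"missing required key: {key}"
            for key in REQUIRED_KEYS if key not in manifest]


if __name__ == "__main__":
    # Run from the project root; "extension" is the directory loaded in Chrome.
    problems = check_manifest("extension")
    print("manifest looks fine" if not problems else "\n".join(problems))
```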

---

- **Web Summarizer:**
1. Enter the URL of the webpage to summarize.
2. Click the "Summarize" button.
3. After summarization, the text appears, and users can ask follow-up questions.
## Usage

<img width="287" alt="image" src="https://github.com/user-attachments/assets/5f308ad3-b5bc-4b3e-9d29-b8002dc88e29">
### Webpage/PDF Summarization
1. Click the extension icon in Chrome.
2. Select an OpenVINO model.
3. Choose **Web Page** or **PDF** mode.
4. Enter the URL or upload a PDF, and click **Summarize**.
5. View the summary and ask follow-up questions.

![Extension Demo](assets/extension_demo.png)

---

- **PDF Summarizer:**
1. Upload a PDF file.
2. Click "Upload & Summarize."
3. After summarization, the text appears, and users can ask additional questions.
## Troubleshooting
- **Dependency Issues:** Run `uv cache clean` and then `uv sync`.
- **Model Access:** Ensure you are logged in with a Hugging Face access token that has been granted access to the gated models.
- **Extension Not Loading:** Make sure you select the correct `extension` directory and enable Developer Mode in Chrome.

<img width="290" alt="image" src="https://github.com/user-attachments/assets/4d6e3ce0-1650-4cd0-a073-0e84891518a3">

4. Sample output post summarization.

<img width="300" alt="image" src="https://github.com/user-attachments/assets/ea05eca2-fa53-4b17-9c85-67a692607376">
---

## License

5. **Reload the Page:**
- Refresh the webpage or re-open the plugin to restart.
This project is licensed under the MIT License. See [LICENSE](LICENSE) for details.