A Streamlit-based web application to extract text from uploaded images and videos using EasyOCR.
- Image OCR: Upload an image (PNG, JPG, JPEG) and extract text.
- Video OCR: Upload a video (MP4, MOV, AVI) and extract text from frames at a specified sampling rate.
- Video ASR: Extract text from the audio channel of videos using automatic speech recognition.
  - Supports local inference using Hugging Face models (e.g., OpenAI Whisper, Wav2Vec2).
  - Uses GPU acceleration if available.
- Multi-language Support: Support for English, Chinese (Simplified), French, German, Spanish, Japanese, Korean, and Dutch.
- Downloadable Results: Download the extracted text as a `.txt` file.
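
As a rough illustration of how the language selection and GPU fallback listed above could be wired to EasyOCR, here is a minimal sketch. The helper names (`EASYOCR_CODES`, `to_easyocr_codes`, `build_reader`, `extract_text`) are hypothetical and not taken from `app.py`; the language codes are the ones EasyOCR uses for these languages, and the GPU check assumes PyTorch is installed (EasyOCR depends on it).

```python
# Display names from the feature list mapped to EasyOCR language codes.
# The dictionary and helper names are illustrative, not part of this project.
EASYOCR_CODES = {
    "English": "en",
    "Chinese (Simplified)": "ch_sim",
    "French": "fr",
    "German": "de",
    "Spanish": "es",
    "Japanese": "ja",
    "Korean": "ko",
    "Dutch": "nl",
}


def to_easyocr_codes(selected):
    """Translate selected display names into EasyOCR codes, defaulting to English."""
    codes = [EASYOCR_CODES[name] for name in selected]
    return codes or ["en"]


def build_reader(selected):
    """Create an EasyOCR reader, using the GPU only when PyTorch reports one."""
    import easyocr  # imported lazily so the mapping above works without it
    import torch

    return easyocr.Reader(to_easyocr_codes(selected), gpu=torch.cuda.is_available())


def extract_text(reader, image_path):
    """Return the recognized strings from one image, joined by newlines."""
    # detail=0 makes readtext return plain strings instead of (box, text, confidence).
    return "\n".join(reader.readtext(image_path, detail=0))
```

For example, `extract_text(build_reader(["English", "French"]), "sample.png")` would return the recognized text for a hypothetical `sample.png`. Note that EasyOCR rejects some language combinations: Chinese (Simplified), Japanese, and Korean can each be paired with English but not with the other languages in this list.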
- Clone the repository (if applicable) or ensure you have the project files.
- Install the required dependencies:
`pip install -r requirements.txt`

Note: For video audio extraction, `ffmpeg` is required. `moviepy` usually handles this, but if you encounter issues, ensure `ffmpeg` is installed on your system.
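
If audio extraction fails, one quick way to check whether a system-wide `ffmpeg` is visible is the sketch below. It uses only the Python standard library; the function name `ffmpeg_available` is made up for illustration. A negative result does not necessarily mean `moviepy` cannot work, since it may be using a bundled binary instead of the one on `PATH`.

```python
import shutil


def ffmpeg_available():
    """Return True when an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found on PATH; install it with your system package manager")
```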
Run the Streamlit application using the following command:
`streamlit run app.py`

- Select Languages: Use the sidebar to select one or more languages for OCR. The default is English.
- Upload File: Upload an image or video file.
- Extract Text:
  - For Images: The extracted text is displayed after you click "Extract Text".
  - For Videos: Adjust the sampling rate in the sidebar (the number of seconds between sampled frames), then click "Extract Text from Video" to process.
- Download: Click the "Download Text" button to save the results.
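
To make the video sampling setting more concrete, the sketch below shows one way frames could be picked at a fixed interval. It assumes OpenCV (`cv2`) is used to read the video, which this README does not confirm, and the function names `sample_frame_indices` and `iter_sampled_frames` are hypothetical.

```python
def sample_frame_indices(total_frames, fps, seconds_per_sample):
    """Return the indices of the frames to OCR, one every `seconds_per_sample` seconds."""
    if fps <= 0 or seconds_per_sample <= 0:
        raise ValueError("fps and seconds_per_sample must be positive")
    # Convert the interval in seconds into a step in frames (at least one frame).
    step = max(1, round(fps * seconds_per_sample))
    return list(range(0, total_frames, step))


def iter_sampled_frames(video_path, seconds_per_sample):
    """Yield (timestamp_seconds, frame) pairs from a video using OpenCV."""
    import cv2  # assumption: OpenCV is installed; imported lazily

    capture = cv2.VideoCapture(video_path)
    try:
        fps = capture.get(cv2.CAP_PROP_FPS)
        total = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
        for index in sample_frame_indices(total, fps, seconds_per_sample):
            # Jump straight to the sampled frame instead of decoding every frame.
            capture.set(cv2.CAP_PROP_POS_FRAMES, index)
            ok, frame = capture.read()
            if not ok:
                break
            yield index / fps, frame
    finally:
        capture.release()
```

Each yielded frame is a NumPy array, which EasyOCR's `readtext` accepts directly, so the frames can be passed to the same reader used for images.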
- The first run may be slow because EasyOCR downloads the necessary models.
- Video processing can be time-consuming depending on the video length and sampling rate.
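
As a back-of-the-envelope guide to the point above, the number of OCR passes grows with video length and shrinks as the interval between sampled frames increases. The helper below is a hypothetical estimate, not part of the application, and it assumes one OCR pass per sampled frame starting at time zero.

```python
import math


def estimate_ocr_passes(duration_seconds, seconds_per_sample):
    """Approximate how many frames will be sent through OCR."""
    if seconds_per_sample <= 0:
        raise ValueError("seconds_per_sample must be positive")
    # One frame at t=0, then one more for every full interval that fits.
    return math.floor(duration_seconds / seconds_per_sample) + 1
```

Under these assumptions, a 10-minute video sampled every second needs roughly 600 passes, while sampling every 5 seconds brings that down to roughly 120.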