Extract text from PDFs, PPTs, & URLs (with OCR support). Converts PPT to PDF & handles files or folders. 🦍
Updated Apr 14, 2025 - Python
This tool compares the text content of two PDF files or images and generates an HTML file highlighting the differences in a format similar to VSCode's Git Diff view. It supports text extraction from PDFs and images (using Tesseract OCR) and provides a visual side-by-side comparison of the differences. Perfect for document version control, proofread
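
The side-by-side HTML comparison described above can be approximated with Python's standard library. The sketch below is not the tool's actual code: the function name `html_side_by_side_diff` and its parameters are hypothetical, and it assumes the text has already been extracted from the PDFs or images (for example by an OCR step), so it only covers the diff-to-HTML stage.

```python
import difflib


def html_side_by_side_diff(old_text: str, new_text: str,
                           old_label: str = "old", new_label: str = "new") -> str:
    """Return a standalone HTML page showing a side-by-side diff of two texts.

    Hypothetical helper for illustration; the inputs are plain strings that
    are assumed to come from an earlier text-extraction step.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # HtmlDiff renders a two-column table and marks added, deleted,
    # and changed text with <span> elements styled by its built-in CSS.
    differ = difflib.HtmlDiff(wrapcolumn=80)
    return differ.make_file(old_lines, new_lines,
                            fromdesc=old_label, todesc=new_label)
```

Writing the returned string to a file such as `diff.html` and opening it in a browser gives a two-column view with the differing lines highlighted.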