Add an OCR text layer to PDF files
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted.
- Generates a searchable PDF/A file from a regular PDF
- Places OCR text accurately below the image to make copying and pasting easier
- Keeps the exact resolution of the original embedded images
- When possible, inserts OCR information as a "lossless" operation without rendering vector information
- Keeps file size about the same
- If requested, deskews and/or cleans the image before performing OCR
- Validates input and output files
- Provides a debug mode for easy verification of the OCR results
- Processes pages in parallel when more than one CPU core is available
- Uses the Tesseract OCR engine
- Supports the more than 100 languages recognized by Tesseract
- Battle-tested on thousands of PDFs, and backed by a test suite and continuous integration
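Several of the features above map onto options of the ocrmypdf command-line tool. The following is a minimal sketch, not an official recipe: the helper names `build_ocrmypdf_command` and `run_ocr` and the file names are hypothetical, while the flags used (`--output-type`, `-l`, `--deskew`, `--clean`, `--jobs`) are ones the tool documents. It assumes ocrmypdf is installed and on the PATH.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    languages: Sequence[str] = ("eng",),
    deskew: bool = False,
    clean: bool = False,
    jobs: Optional[int] = None,
) -> List[str]:
    """Assemble an argument list for the ocrmypdf CLI (hypothetical helper)."""
    # Ask for PDF/A output, matching the "searchable PDF/A" feature.
    cmd = ["ocrmypdf", "--output-type", "pdfa"]
    # Tesseract language codes are combined with '+', e.g. eng+deu.
    cmd += ["-l", "+".join(languages)]
    if deskew:
        # Straighten skewed pages before OCR.
        cmd.append("--deskew")
    if clean:
        # Clean pages before OCR; this option relies on the external unpaper tool.
        cmd.append("--clean")
    if jobs is not None:
        # Limit how many pages are processed in parallel.
        cmd += ["--jobs", str(jobs)]
    cmd += [input_pdf, output_pdf]
    return cmd


def run_ocr(cmd: List[str]) -> None:
    """Run the assembled command, failing early if ocrmypdf is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} was not found on PATH")
    subprocess.run(cmd, check=True)
```

For example, `run_ocr(build_ocrmypdf_command("scan.pdf", "scan_ocr.pdf", languages=("eng", "deu"), deskew=True, jobs=2))` would request English and German recognition with deskewing on at most two parallel jobs. Building the argument list separately from running it makes the options easy to inspect before any file is touched.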
Project information
- Maintainer: Registry Administrators
- Driver: Not yet selected
- Licence: MIT / X / Expat Licence
Series and milestones
The trunk series is the current focus of development.
Packages in Distributions
- ocrmypdf source package in Oracular: Version 16.3.1+dfsg1-1 uploaded
- ocrmypdf source package in Noble: Version 15.2.0+dfsg1-1 uploaded
- ocrmypdf source package in Mantic: Version 14.0.1+dfsg1-1 uploaded
- ocrmypdf source package in Lunar: Version 14.0.1+dfsg1-1 uploaded
- ocrmypdf source package in Jammy: Version 13.4.0+dfsg-1 uploaded