Registered by James R Barlow

Add an OCR text layer to PDF files

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted.

- Generates a searchable PDF/A file from a regular PDF
- Places OCR text accurately below the image to ease copy and paste
- Keeps the exact resolution of the original embedded images
- When possible, inserts OCR information as a "lossless" operation without rendering vector information
- Keeps file size about the same
- If requested, deskews and/or cleans the image before performing OCR
- Validates input and output files
- Provides a debug mode to enable easy verification of the OCR results
- Processes pages in parallel when more than one CPU core is available
- Uses the Tesseract OCR engine
- Supports more than 100 languages recognized by Tesseract
- Battle-tested on thousands of PDFs, with a test suite and continuous integration
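
As an illustration of how the options in the list above might be combined, here is a minimal Python sketch that assembles an ocrmypdf command line and runs it. The helper names build_ocrmypdf_command and run_ocr are hypothetical and exist only for this example. The flags shown (--language, --deskew, --clean, --jobs) are assumed to match the installed release; confirm them with `ocrmypdf --help` before relying on this.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    languages: Sequence[str] = ("eng",),
    deskew: bool = False,
    clean: bool = False,
    jobs: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: build an argument list for the ocrmypdf CLI.

    Assumes the CLI flags --language, --deskew, --clean and --jobs.
    """
    # Tesseract language codes are joined with "+", e.g. "eng+deu".
    cmd = ["ocrmypdf", "--language", "+".join(languages)]
    if deskew:
        cmd.append("--deskew")  # straighten pages before OCR
    if clean:
        cmd.append("--clean")  # clean the page image before OCR
    if jobs is not None:
        cmd += ["--jobs", str(jobs)]  # limit the number of parallel workers
    # Input and output paths come last.
    cmd += [input_pdf, output_pdf]
    return cmd


def run_ocr(input_pdf: str, output_pdf: str, **options) -> None:
    """Hypothetical helper: run ocrmypdf if it is available on PATH."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf executable not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(
        build_ocrmypdf_command(input_pdf, output_pdf, **options),
        check=True,
    )
```

A call such as run_ocr("scan.pdf", "scan_ocr.pdf", languages=("eng", "deu"), deskew=True) would then request English and German recognition with deskewing, assuming both Tesseract language packs are installed.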

Project information

Maintainer: Registry Administrators
Driver: Not yet selected
Licence: MIT / X / Expat Licence

Series and milestones

The trunk series is the current focus of development.

Downloads

OCRmyPDF has no download files registered with Launchpad.