tesseract-ocr - Tesseract command line OCR tool (devel)

PPA description

This PPA contains an OCR engine (libtesseract) and a command-line program (tesseract). The development version available here (currently 5.0.0 Alpha) is better in many respects (functionality, speed, stability) but is not 100% API-compatible with version 4.0.

Tesseract 4 added a new neural-net (LSTM) based OCR engine that focuses on line recognition. It still supports the legacy OCR engine of Tesseract 3, which works by recognizing character patterns. Compatibility with Tesseract 3 is enabled by using the Legacy OCR Engine mode (--oem 0), which also needs traineddata files that support the legacy engine, for example those from the tessdata repository.

Tesseract has Unicode (UTF-8) support and can recognize more than 100 languages "out of the box". It supports various output formats: plain text, hOCR (HTML), PDF, invisible-text-only PDF, and TSV. The master branch also has experimental support for ALTO (XML) output.
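The engine mode and output formats described here map onto tesseract command-line arguments roughly as sketched below. This is an illustration under assumptions, not the packager's own recipe: format_args and ocr are hypothetical helper names, the short format labels (text, textpdf, ...) are invented for this example, and the sketch assumes the txt, hocr, pdf, tsv and alto config files were installed along with the package.

```shell
# format_args (hypothetical helper): print the trailing tesseract arguments
# that select one of the output formats listed in the description.
format_args() {
    case "$1" in
        text)    echo "txt" ;;
        hocr)    echo "hocr" ;;
        pdf)     echo "pdf" ;;
        textpdf) echo "-c textonly_pdf=1 pdf" ;;   # invisible-text-only PDF
        tsv)     echo "tsv" ;;
        alto)    echo "alto" ;;                    # experimental per the description
        *)       echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# ocr (hypothetical helper): ocr IMAGE OUTPUT_BASE FORMAT [OEM]
# Pass 0 as OEM to request the legacy engine; omit it to use the default.
ocr() {
    fmt_args=$(format_args "$3") || return 1
    # $fmt_args is left unquoted on purpose so that
    # "-c textonly_pdf=1 pdf" is split into separate words.
    tesseract "$1" "$2" ${4:+--oem "$4"} $fmt_args
}
```

For example, `ocr scan.png out pdf` would write out.pdf, and `ocr scan.png out text 0` would request the legacy engine, which only works if the installed traineddata supports it. The file name scan.png is a placeholder.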

Adding this PPA to your system

You can update your system with unsupported packages from this untrusted PPA by adding ppa:alex-p/tesseract-ocr-devel to your system's Software Sources.

sudo add-apt-repository ppa:alex-p/tesseract-ocr-devel
sudo apt-get update

Technical details about this PPA

This PPA can be added to your system manually by copying the lines below and adding them to your system's software sources.

deb http://ppa.launchpad.net/alex-p/tesseract-ocr-devel/ubuntu YOUR_UBUNTU_VERSION_HERE main 
deb-src http://ppa.launchpad.net/alex-p/tesseract-ocr-devel/ubuntu YOUR_UBUNTU_VERSION_HERE main 
Signing key:
1024R/8529B1E0F8BF7F65C12FABB0A4BCBD87CEF9E52D
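
The manual route can be sketched as follows. The helper name ppa_lines and the file name tesseract-ocr-devel.list are assumptions made for this example. The commented commands need root and network access, and apt-key is deprecated on newer Ubuntu releases, so treat them as a starting point rather than a definitive procedure.

```shell
# ppa_lines (hypothetical helper): print the two sources.list entries
# for the given Ubuntu codename (e.g. bionic, focal).
ppa_lines() {
    codename=$1
    base="http://ppa.launchpad.net/alex-p/tesseract-ocr-devel/ubuntu"
    printf 'deb %s %s main\n' "$base" "$codename"
    printf 'deb-src %s %s main\n' "$base" "$codename"
}

# Possible usage (left commented out because it modifies the system):
#   ppa_lines "$(lsb_release -sc)" | sudo tee /etc/apt/sources.list.d/tesseract-ocr-devel.list
#   sudo apt-key adv --keyserver keyserver.ubuntu.com \
#       --recv-keys 8529B1E0F8BF7F65C12FABB0A4BCBD87CEF9E52D
#   sudo apt-get update
```

Running `ppa_lines focal` prints the two entries with YOUR_UBUNTU_VERSION_HERE replaced by focal.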

For questions and bugs with software in this PPA please contact Alexander Pozdnyakov.

PPA statistics

8 updates added during the past month.

Overview of published packages

1 to 15 of 15 results

Package         Version                               Uploaded by
leptonlib       1.78.0-1+nmu1ppa1~xenial1             Alexander Pozdnyakov (2019-10-18)
leptonlib       1.78.0-1+nmu1ppa1~bionic1             Alexander Pozdnyakov (2019-10-18)
openjpeg2       2.3.0-1+nmu2ppa1~xenial1              Alexander Pozdnyakov (2019-10-18)
tesseract       5.0.0~git4989-95b98042-1ppa1~groovy1  Alexander Pozdnyakov (9 hours ago)
tesseract       5.0.0~git4989-95b98042-1ppa1~focal1   Alexander Pozdnyakov (9 hours ago)
tesseract       5.0.0~git4989-95b98042-1ppa1~bionic1  Alexander Pozdnyakov (9 hours ago)
tesseract       5.0.0~git4565-0a634846-1ppa1~eoan1    Alexander Pozdnyakov (2020-06-24)
tesseract-equ   5.0.0~git39-f97ee73-1ppa1~xenial1     Alexander Pozdnyakov (2019-10-19)
tesseract-equ   5.0.0~git39-f97ee73-1ppa1~disco1      Alexander Pozdnyakov (2019-10-19)
tesseract-equ   5.0.0~git39-f97ee73-1ppa1~bionic1     Alexander Pozdnyakov (2019-10-19)
tesseract-lang  1:5.0.0~git39-6572757-2ppa1~xenial1   Alexander Pozdnyakov (2019-11-06)
tesseract-lang  1:5.0.0~git39-6572757-2ppa1~groovy1   Alexander Pozdnyakov (2020-11-02)
tesseract-lang  1:5.0.0~git39-6572757-2ppa1~focal1    Alexander Pozdnyakov (2020-05-26)
tesseract-lang  1:5.0.0~git39-6572757-2ppa1~eoan1     Alexander Pozdnyakov (2019-11-06)
tesseract-lang  1:5.0.0~git39-6572757-2ppa1~bionic1   Alexander Pozdnyakov (2019-11-06)

Latest updates

  • tesseract 9 hours 40 minutes ago
    Successfully built
  • tesseract 9 hours 40 minutes ago
    Successfully built
  • tesseract 9 hours 40 minutes ago
    Successfully built
  • tesseract-lang 11 weeks ago
    Successfully built
  • tesseract 29 weeks ago
    Successfully built