tesseract-ocr - Tesseract command line OCR tool (devel)

PPA description

This PPA contains an OCR engine (libtesseract) and a command-line program (tesseract).

The development version available here (currently 5.0.0) is better in many respects (functionality, speed, stability), but it is not 100% API compatible with version 4.0.

Tesseract 4 added a new neural-net (LSTM) based OCR engine focused on line recognition. It also still supports the legacy OCR engine of Tesseract 3, which works by recognizing character patterns. Compatibility with Tesseract 3 is enabled by selecting the Legacy OCR Engine mode (--oem 0), which also needs traineddata files that support the legacy engine, for example those from the tessdata repository.

Tesseract has Unicode (UTF-8) support and can recognize more than 100 languages "out of the box".

Tesseract supports various output formats: plain text, hOCR (HTML), PDF, invisible-text-only PDF, and TSV. The master branch also has experimental support for ALTO (XML) output.
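As a rough illustration of the engine mode and output formats described above, the sketch below builds two tesseract command lines. It is not taken from the PPA description: the file name input.png, the output base name out, and the language eng are placeholders, and the legacy call assumes an eng.traineddata with legacy-engine support is installed. By default the helper only prints the commands; set DRY_RUN=0 to execute them on a system where tesseract is installed.

```shell
#!/bin/sh
# Dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Legacy OCR engine only (--oem 0), as used for Tesseract 3 compatibility.
# Args: image, output base name, language code (all placeholder values here).
ocr_legacy() {
    run tesseract "$1" "$2" --oem 0 -l "$3"
}

# Several output formats in one run: each trailing word names a config
# (txt, hocr, pdf, tsv) and produces out.txt, out.hocr, out.pdf, out.tsv.
ocr_formats() {
    run tesseract "$1" "$2" -l "$3" txt hocr pdf tsv
}

ocr_legacy input.png out eng
ocr_formats input.png out eng
```

The dry-run default keeps the sketch safe to paste into a shell without an image or traineddata at hand.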

Adding this PPA to your system

You can update your system with unsupported packages from this untrusted PPA by adding ppa:alex-p/tesseract-ocr-devel to your system's Software Sources.

sudo add-apt-repository ppa:alex-p/tesseract-ocr-devel
sudo apt update
        
Technical details about this PPA

This PPA can be added to your system manually by copying the lines below and adding them to your system's software sources.

sources.list entries (replace YOUR_UBUNTU_VERSION_HERE with your Ubuntu release codename):
deb https://ppa.launchpadcontent.net/alex-p/tesseract-ocr-devel/ubuntu YOUR_UBUNTU_VERSION_HERE main 
deb-src https://ppa.launchpadcontent.net/alex-p/tesseract-ocr-devel/ubuntu YOUR_UBUNTU_VERSION_HERE main 
Signing key:
1024R/8529B1E0F8BF7F65C12FABB0A4BCBD87CEF9E52D
Fingerprint:
8529B1E0F8BF7F65C12FABB0A4BCBD87CEF9E52D
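
For the manual route, a small sketch of filling in the release placeholder might look like the following. The function name ppa_entries is hypothetical, and bionic is used only because it appears in this PPA's package versions; on a live system the codename would typically come from lsb_release -cs. The sketch only prints the two entries. Writing them to a file under /etc/apt/sources.list.d/ and trusting the signing key listed above are separate steps not shown here.

```shell
#!/bin/sh
# Print the deb and deb-src lines for this PPA for a given Ubuntu codename.
# ppa_entries is a hypothetical helper name, not part of any package.
ppa_entries() {
    base="https://ppa.launchpadcontent.net/alex-p/tesseract-ocr-devel/ubuntu"
    printf 'deb %s %s main\n' "$base" "$1"
    printf 'deb-src %s %s main\n' "$base" "$1"
}

# Example using the codename seen in the published package versions.
ppa_entries bionic
```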

For questions and bugs with software in this PPA please contact Alexander Pozdnyakov.

PPA statistics

Activity
3 updates added during the past month.

Overview of published packages

1 to 3 of 3 results
Package         Version                                 Uploaded by
leptonlib       1.78.0-1+nmu1ppa1~bionic1               Alexander Pozdnyakov
tesseract       5.3.4+git6311-fbff9362-1ppa1~bionic1    Alexander Pozdnyakov
tesseract-lang  1:5.0.0~git39-6572757-2ppa1~bionic1     Alexander Pozdnyakov

Latest updates

  • tesseract 10 days ago
    Successfully built
  • tesseract 10 days ago
    Successfully built
  • tesseract 10 days ago
    Successfully built
  • tesseract 11 weeks ago
    Failed to build: amd64 arm64 armhf i386 ppc64el s390x
  • tesseract 16 weeks ago
    Successfully built