tesseract assert failure: tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.

Bug #565688 reported by Crashbit
This bug affects 34 people
Affects             Status        Importance  Assigned to  Milestone
Tesseract           Unknown       Unknown
tesseract (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

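Tesseract aborts with an assertion failure in unicharset.cpp when run with the "spa" language (-l spa). System details and the crash report follow.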

ignasi@ignasi-desktop:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu lucid (development branch)
Release: 10.04
Codename: lucid
ignasi@ignasi-desktop:~$

ProblemType: Crash
DistroRelease: Ubuntu 10.04
Package: tesseract-ocr 2.04-2
ProcVersionSignature: Ubuntu 2.6.32-21.32-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-21-generic x86_64
NonfreeKernelModules: nvidia
Architecture: amd64
AssertionMessage: tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
CheckboxSubmission: be855d426122c5a11956fef117ded5b1
CheckboxSystem: edda5d4f616ca792bf437989cb597002
CrashCounter: 1
Date: Sun Apr 18 02:36:10 2010
ExecutablePath: /usr/bin/tesseract
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Beta amd64 (20100318)
ProcCmdline: tesseract /tmp/UZBC9Lf9jw/jvkFElmSob.tif /tmp/UZBC9Lf9jw/BE5MYeneZ6 -l spa
ProcEnviron:
 LANG=ca_ES.utf8
 SHELL=/bin/bash
Signal: 6
SourcePackage: tesseract
StacktraceTop:
 raise () from /lib/libc.so.6
 abort () from /lib/libc.so.6
 __assert_fail () from /lib/libc.so.6
 ?? ()
 ?? ()
Title: tesseract assert failure: tesseract: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
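
For anyone scripting around this crash: "Signal: 6" in the report above is SIGABRT, which a failed assert() raises through abort(). Below is a minimal sketch, assuming Python 3 on a POSIX system and a tesseract binary on PATH, of how a wrapper might detect that abort instead of silently losing the output. The helper names aborted and run_ocr are hypothetical and not part of tesseract; the argument order simply mirrors the ProcCmdline line from the report.

```python
import signal
import subprocess


def aborted(returncode):
    """Return True if a subprocess return code means the child died from SIGABRT."""
    # On POSIX, subprocess reports death by signal as the negated signal number,
    # so an assertion failure (abort, signal 6) shows up as -6.
    return returncode == -signal.SIGABRT


def run_ocr(image_path, output_base, lang="spa", binary="tesseract"):
    """Hypothetical wrapper: run `tesseract IMAGE OUTPUTBASE -l LANG` as in the report."""
    proc = subprocess.run(
        [binary, image_path, output_base, "-l", lang],
        capture_output=True,
        text=True,
    )
    if aborted(proc.returncode):
        # The assertion message is written to stderr before the abort.
        raise RuntimeError("tesseract aborted (SIGABRT): " + proc.stderr.strip())
    return proc.returncode
```

Note that this only applies when the process is started directly; when launched through a shell, the same abort is usually reported as exit status 134 (128 + 6) instead.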

garrison (jim-garrison) wrote :

This bug has also broken ocropus for me. I am running amd64 as well.

$ ocroscript recognize image.jpg
ocroscript: unicharset.cpp:76: const UNICHAR_ID UNICHARSET::unichar_to_id(const char*, int) const: Assertion `ids.contains(unichar_repr, length)' failed.
Aborted

Nemo157 (ghostunderscore) wrote :

This may be the same bug as the one reported here: http://code.google.com/p/tesseract-ocr/issues/detail?id=265#c0
If so, it is supposed to be fixed in the 3.0 release.

arndtc (arndtc) wrote :

I hope the 3.0 release comes soon. GOCR is not nearly as good an alternative as tesseract.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in tesseract (Ubuntu):
status: New → Confirmed
Luzius Thöny (lucius-antonius) wrote :

Deb packages of Tesseract 3.0 are available here: http://notesalexp.net/oneiric/main/t/tesseract/

(works for me)

Jeff Breidenbach (jeff-jab) wrote :

Obsolete; Tesseract 3 is shipping with Ubuntu. Please close.

Jeff Breidenbach (jeff-jab) wrote :

Also, Ocropus is no longer shipping.

Changed in tesseract (Ubuntu):
status: Confirmed → Fix Released