--- tesseract-2.04.orig/debian/tesseract.1 +++ tesseract-2.04/debian/tesseract.1 @@ -0,0 +1,27 @@ +.TH TESSERACT 1 "August 21, 2007" +.SH NAME +tesseract \- command line OCR tool +.SH SYNOPSIS +.B tesseract +.RI "imagename outputbase [configfile [[+|-]varfile]...] [-l ]" +.SH DESCRIPTION +This manual page documents briefly the +.B tesseract +command. +.PP +\fBtesseract\fP is a commercial quality OCR engine originally developed at +HP between 1985 and 1995. In 1995, this engine was among the top 3 evaluated +by UNLV. It was open-sourced by HP and UNLV in 2005. +.SH SEE ALSO +.BR feh (1), +.BR convert (1), +.BR mftraining (1), +.BR cntraining (1), +.BR unicharset_extractor (1), +.BR wordlist2dawg (1). +.br +.SH AUTHOR +tesseract was written by Ray Smith. +.PP +This manual page was written by Jeffrey Ratcliffe , +for the Debian project (but may be used by others). --- tesseract-2.04.orig/debian/watch +++ tesseract-2.04/debian/watch @@ -0,0 +1,3 @@ +version=3 +http://code.google.com/p/tesseract-ocr/downloads/list http://tesseract-ocr.googlecode.com/files/tesseract-(.*)\.ta?r?\.?gz + --- tesseract-2.04.orig/debian/changelog +++ tesseract-2.04/debian/changelog @@ -0,0 +1,103 @@ +tesseract (2.04-1ubuntu1) lucid; urgency=low + + * Merge from debian testing, remaining changes (LP: #477746): + - debian/patches/gcc4.4: fixes FTBFS with gcc 4.4 + + -- Ilya Barygin Fri, 13 Nov 2009 09:49:40 +0300 + +tesseract (2.04-1) unstable; urgency=low + + * New upstream version (Closes: #484052) + * Added -fPIC to CFLAGS + * Removed --as-needed from LDFLAGS + * Bumped standards to 3.8.2 (no changes needed) + * Adapted java patch to fix distclean target + * Moved to dh7 + * Added watch file + * Updated copyright file according to http://dep.debian.net/deps/dep5/ + + -- Jeffrey Ratcliffe Fri, 03 Jul 2009 23:35:24 +0200 + +tesseract (2.03-3) unstable; urgency=low + + * Patch wordlist2dawg + * Bumped standards + * Fixed lintian errors in copyright + + -- Jeffrey Ratcliffe Thu, 15 Aug 2008 
23:59:00 +0200 + +tesseract (2.03-2ubuntu1) karmic; urgency=low + + * debian/patches/gcc4.4: fixes FTBFS with gcc 4.4, thanks to Martin + Michlmayr (LP: #445602). + + -- Ilya Barygin Wed, 07 Oct 2009 20:31:40 +0400 + +tesseract (2.03-2) unstable; urgency=low + + * Patch ccmain/baseapi.cpp to allow use with ocropus (Closes: #483896) + + -- Jeffrey Ratcliffe Thu, 12 Jun 2008 23:17:00 +0200 + +tesseract (2.03-1) unstable; urgency=low + + * Initial release of 2.03 (Closes: #478556) + * Switch to quilt for managing patches + * Patch java/makefile to fix install and distclean targets + * Patch ccutil/Makefile.* to fix redefine warnings (Closes: #455397) + * Patch viewer/scrollview.cpp, viewer/svmnode.cpp & viewer/svutil.cpp + to fix FTBFS with gcc 4.3 + * Corrected debian/copyright (thanks Winnie) + + -- Jeffrey Ratcliffe Tue, 22 Apr 2008 20:35:09 +0200 + +tesseract (2.01-4) unstable; urgency=low + + * + libtiff dependency (Closes: #459811) + * Updated description (Closes: #418991) + * Bumped standards + * + Uploaders: Gürkan Sengün + * + XS-DM-Upload-Allowed: yes + + -- Jeffrey Ratcliffe Tue, 08 Jan 2008 22:10:17 +0100 + +tesseract (2.01-3) unstable; urgency=low + + * - Recommends: (Closes: #451865) + + -- Jeffrey Ratcliffe Tue, 20 Nov 2007 21:14:26 +0100 + +tesseract (2.01-2) unstable; urgency=low + + * + Replaces: tesseract-ocr-data (Closes: #451042) + + -- Jeffrey Ratcliffe Thu, 15 Nov 2007 20:16:59 +0100 + +tesseract (2.01-1) unstable; urgency=low + + * Initial release of 2.01 (Closes: #434152) + * Applied tesseract-2.01.patch1.tar.gz + * Changed packaging licence to GPLv3 + + -- Jeffrey Ratcliffe Sat, 20 Oct 2007 09:07:28 +0200 + +tesseract (1.02-3) unstable; urgency=medium + + * Applied patch of Bryan Stillwell to fix + FTBFS on 64 bit arches. (Closes: #398379) + + -- Gürkan Sengün Mon, 11 Dec 2006 11:23:00 +0100 + +tesseract (1.02-2) unstable; urgency=low + + * Applied patch to fix tessdata directory access. (Closes: #400183) + * Split the data to a data package. 
+ + -- Gürkan Sengün Mon, 27 Nov 2006 11:11:31 +0100 + +tesseract (1.02-1) unstable; urgency=low + + * Initial release. (Closes: #390204) + + -- Gürkan Sengün Mon, 9 Oct 2006 17:15:29 +0200 + --- tesseract-2.04.orig/debian/mftraining.1 +++ tesseract-2.04/debian/mftraining.1 @@ -0,0 +1,31 @@ +.TH MFTRAINING 1 "August 21, 2007" +.SH NAME +mftraining \- feature training for tesseract +.SH SYNOPSIS +Part of the process to train tesseract for a new language. When the character features of all the training pages have been extracted, we need to cluster them to create the prototypes. The character shape features can be clustered using the mftraining and cntraining programs: +.PP +.B mftraining +.RI "fontfile_1.tr fontfile_2.tr ..." +.PP +This will output two data files: inttemp (the shape prototypes) and pffmtable (the number of expected features for each character). (A third file called Microfeat is also written by this program, but it is not used.) +.SH DESCRIPTION +This manual page documents briefly the +.B mftraining +command. +.PP +\fBtesseract\fP is a commercial quality OCR engine originally developed at +HP between 1985 and 1995. In 1995, this engine was among the top 3 evaluated +by UNLV. It was open-sourced by HP and UNLV in 2005. +.SH SEE ALSO +.BR feh (1), +.BR convert (1), +.BR tesseract (1), +.BR cntraining (1), +.BR unicharset_extractor (1), +.BR wordlist2dawg (1). +.br +.SH AUTHOR +tesseract was written by Ray Smith. +.PP +This manual page was written by Jeffrey Ratcliffe , +for the Debian project (but may be used by others). --- tesseract-2.04.orig/debian/docs +++ tesseract-2.04/debian/docs @@ -0,0 +1 @@ +README --- tesseract-2.04.orig/debian/cntraining.1 +++ tesseract-2.04/debian/cntraining.1 @@ -0,0 +1,31 @@ +.TH CNTRAINING 1 "August 21, 2007" +.SH NAME +cntraining \- character normalization training for tesseract +.SH SYNOPSIS +Part of the process to train tesseract for a new language.
When the character features of all the training pages have been extracted, we need to cluster them to create the prototypes. The character shape features can be clustered using the mftraining and cntraining programs: +.PP +.B cntraining +.RI "fontfile_1.tr fontfile_2.tr ..." +.PP +This will output the normproto data file (the character normalization sensitivity prototypes). +.SH DESCRIPTION +This manual page documents briefly the +.B cntraining +command. +.PP +\fBtesseract\fP is a commercial quality OCR engine originally developed at +HP between 1985 and 1995. In 1995, this engine was among the top 3 evaluated +by UNLV. It was open-sourced by HP and UNLV in 2005. +.SH SEE ALSO +.BR feh (1), +.BR convert (1), +.BR mftraining (1), +.BR tesseract (1), +.BR unicharset_extractor (1), +.BR wordlist2dawg (1). +.br +.SH AUTHOR +tesseract was written by Ray Smith. +.PP +This manual page was written by Jeffrey Ratcliffe , +for the Debian project (but may be used by others). --- tesseract-2.04.orig/debian/copyright +++ tesseract-2.04/debian/copyright @@ -0,0 +1,218 @@ +This package was debianized by Jeffrey Ratcliffe +on Mon, 06 Aug 2007 21:27:22 +0200. + +It was downloaded from http://code.google.com/p/tesseract-ocr/ + +Upstream Authors: +Ray Smith (lead developer) +Phil Cheatle +Simon Crouch +Dan Johnson +Mark Seaman +Sheelagh Huddleston +Chris Newton +... and several others. + +Copyright: + + Copyright 2007 Google Inc. + +License: + + Licensed under the Apache License, Version 2.0 (the "License"); you + may not use this file except in compliance with the License. You may + obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+ +On a Debian system the complete text of the Apache-2.0 license can be found in +/usr/share/common-licenses/Apache-2.0 + +The Debian packaging is copyright 2007--2009, +Jeffrey Ratcliffe and is licensed under the +Apache-2.0 licence. + + +The files below have different copyright notices: + +Files: cutil/listio.cpp cutil/listio.h cutil/oldlist.cpp cutil/tessarray.cpp + dict/lookdawg.cpp dict/lookdawg.h dict/makedawg.cpp dict/makedawg.h + dict/reduce.cpp dict/reduce.h classify/baseline.cpp + classify/baseline.h classify/hideedge.cpp classify/hideedge.h + classify/protos.cpp classify/protos.h cutil/cutil.cpp cutil/cutil.h + cutil/oldlist.h cutil/tessarray.h dict/choicearr.h dict/dawg.cpp + dict/dawg.h dict/hyphen.cpp dict/hyphen.h dict/permdawg.cpp + dict/permdawg.h dict/permnum.cpp dict/permnum.h dict/trie.cpp + dict/trie.h wordrec/chop.cpp wordrec/chop.h wordrec/chopper.cpp + wordrec/chopper.h wordrec/closed.cpp wordrec/closed.h + wordrec/findseam.cpp wordrec/findseam.h wordrec/gradechop.cpp + wordrec/gradechop.h wordrec/heuristic.cpp wordrec/heuristic.h + wordrec/makechop.cpp wordrec/makechop.h wordrec/measure.h + wordrec/metrics.cpp wordrec/metrics.h wordrec/olutil.cpp + wordrec/olutil.h wordrec/pieces.cpp wordrec/pieces.h + wordrec/plotseg.cpp wordrec/plotseg.h wordrec/seam.cpp + wordrec/seam.h wordrec/split.cpp wordrec/split.h wordrec/tally.cpp + wordrec/tally.h +Copyright: Copyright 1987, Hewlett-Packard Company +License: Apache-2.0 + +Files: classify/adaptive.cpp classify/adaptive.h classify/adaptmatch.cpp + classify/adaptmatch.h classify/blobclass.cpp classify/blobclass.h + classify/cluster.cpp classify/cluster.h classify/clusttool.cpp + classify/clusttool.h classify/cutoffs.cpp classify/cutoffs.h + classify/extract.cpp classify/extract.h classify/featdefs.cpp + classify/featdefs.h classify/flexfx.cpp classify/flexfx.h + classify/float2int.cpp classify/float2int.h classify/fpoint.cpp + classify/fpoint.h classify/fxdefs.cpp classify/fxdefs.h + 
classify/intfx.cpp classify/intfx.h classify/intmatcher.cpp + classify/intmatcher.h classify/intproto.cpp classify/intproto.h + classify/kdtree.cpp classify/kdtree.h classify/mf.cpp + classify/mfdefs.cpp classify/mfdefs.h classify/mf.h + classify/mfoutline.cpp classify/mfoutline.h classify/mfx.cpp + classify/mfx.h classify/normfeat.cpp classify/normfeat.h + classify/normmatch.cpp classify/normmatch.h classify/ocrfeatures.cpp + classify/ocrfeatures.h classify/outfeat.cpp classify/outfeat.h + classify/picofeat.cpp classify/picofeat.h classify/sigmenu.cpp + classify/sigmenu.h classify/speckle.cpp classify/speckle.h + classify/xform2d.cpp classify/xform2d.h cutil/bitvec.cpp + cutil/bitvec.h cutil/danerror.cpp cutil/danerror.h cutil/efio.cpp + cutil/efio.h cutil/emalloc.h cutil/funcdefs.h cutil/general.h + cutil/minmax.h cutil/oldheap.cpp cutil/oldheap.h dict/matchdefs.h + dict/stopper.cpp dict/stopper.h training/cnTraining.cpp + training/mergenf.cpp training/mergenf.h training/mfTraining.cpp + training/name2char.h wordrec/badwords.cpp wordrec/badwords.h + wordrec/mfvars.cpp wordrec/mfvars.h +Copyright: Copyright Hewlett-Packard Company, 1988 +License: Apache-2.0 + +Files: ccutil/host.h +Copyright: Copyright Hewlett-Packard Company, 1988-1996 +License: Apache-2.0 + +Files: cutil/globals.cpp ccstruct/blobs.cpp ccstruct/blobs.h + ccstruct/vecfuncs.cpp ccstruct/vecfuncs.h classify/fxid.h + cutil/debug.cpp cutil/debug.h cutil/globals.h cutil/tordvars.h + cutil/variables.cpp cutil/variables.h dict/choices.cpp dict/choices.h + dict/permute.cpp dict/permute.h wordrec/djmenus.cpp + wordrec/msmenus.cpp wordrec/outlines.cpp wordrec/outlines.h + wordrec/plotedges.cpp wordrec/plotedges.h wordrec/render.cpp + wordrec/render.h ccmain/expandblob.cpp ccutil/errcode.cpp ccutil/globaloc.cpp +Copyright: Copyright 1989, Hewlett-Packard Company +License: Apache-2.0 + +Files: classify/extern.h cutil/freelist.h cutil/structures.cpp + cutil/structures.h cutil/tordvars.cpp dict/context.cpp
dict/context.h + dict/states.cpp dict/states.h wordrec/associate.cpp + wordrec/associate.h wordrec/bestfirst.cpp wordrec/bestfirst.h + wordrec/djmenus.h wordrec/matchtab.cpp wordrec/matchtab.h + wordrec/matrix.cpp wordrec/matrix.h wordrec/msmenus.h + wordrec/wordclass.cpp wordrec/wordclass.h ccutil/basedir.cpp + ccutil/basedir.h ccutil/errcode.h ccutil/fileerr.h ccutil/globaloc.h + ccutil/lsterr.h ccutil/memryerr.h ccutil/memry.h ccutil/serialis.cpp + ccutil/serialis.h ccutil/stderr.h image/imgerrs.h image/img.h + image/imgio.cpp image/imgio.h image/imgs.cpp image/imgs.h + image/imgtiff.cpp image/imgtiff.h image/imgunpk.h +Copyright: Copyright 1990, Hewlett-Packard Company +License: Apache-2.0 + +Files: ccmain/pgedit.cpp textord/blkocc.cpp textord/blkocc.h + ccmain/charcut.h ccmain/pagewalk.cpp ccmain/pagewalk.h + ccmain/tessio.h ccstruct/blckerr.h ccstruct/blread.cpp + ccstruct/blread.h ccstruct/coutln.cpp ccstruct/coutln.h + ccstruct/crakedge.h ccstruct/genblob.cpp ccstruct/genblob.h + ccstruct/ipoints.h ccstruct/linlsq.cpp ccstruct/linlsq.h + ccstruct/mod128.cpp ccstruct/mod128.h ccstruct/ocrblock.cpp + ccstruct/ocrblock.h ccstruct/ocrrow.cpp ccstruct/ocrrow.h + ccstruct/pdblock.cpp ccstruct/pdblock.h ccstruct/points.cpp + ccstruct/points.h ccstruct/polyblob.cpp ccstruct/polyblob.h + ccstruct/polyvert.cpp ccstruct/polyvert.h ccstruct/poutline.cpp + ccstruct/poutline.h ccstruct/quadratc.cpp ccstruct/quadratc.h + ccstruct/quspline.cpp ccstruct/quspline.h ccstruct/rect.cpp + ccstruct/rect.h ccstruct/statistc.cpp ccstruct/statistc.h + ccstruct/stepblob.cpp ccstruct/stepblob.h ccstruct/werd.cpp + ccstruct/werd.h ccutil/bits16.cpp ccutil/bits16.h ccutil/clst.cpp + ccutil/clst.h ccutil/elst2.cpp ccutil/elst2.h ccutil/elst.cpp + ccutil/elst.h ccutil/mainblk.cpp ccutil/mainblk.h ccutil/ndminx.h + ccutil/strngs.cpp ccutil/strngs.h ccutil/varable.cpp ccutil/varable.h + image/bitstrm.cpp image/bitstrm.h textord/drawedg.cpp + textord/drawedg.h textord/edgblob.cpp 
textord/edgblob.h + textord/edgloop.cpp textord/edgloop.h textord/scanedg.cpp + textord/scanedg.h textord/tessout.h +Copyright: Copyright 1991, Hewlett-Packard Ltd +License: Apache-2.0 + +Files: ccmain/adaptions.h ccmain/callnet.cpp ccmain/callnet.h + ccmain/charcut.cpp ccmain/control.cpp ccmain/control.h + ccmain/fixxht.cpp ccmain/fixxht.h ccmain/imgscale.cpp + ccmain/imgscale.h ccmain/reject.cpp ccmain/reject.h + ccmain/scaleimg.cpp ccmain/scaleimg.h ccmain/tessbox.cpp + ccmain/tessbox.h ccmain/tessedit.cpp ccmain/tessedit.h + ccmain/tessembedded.cpp ccmain/tessembedded.h + ccmain/tesseractmain.cpp ccmain/tesseractmain.h ccmain/tessvars.cpp + ccmain/tessvars.h ccmain/tfacep.h ccmain/tfacepp.cpp ccmain/tfacepp.h + ccmain/tstruct.cpp ccmain/tstruct.h ccmain/werdit.cpp ccmain/werdit.h + ccstruct/blobbox.cpp ccstruct/blobbox.h ccstruct/lmedsq.cpp + ccstruct/lmedsq.h ccstruct/normalis.cpp ccstruct/normalis.h + ccstruct/pageres.cpp ccstruct/pageres.h ccstruct/pdclass.h + ccstruct/ratngs.cpp ccstruct/ratngs.h ccutil/hashfn.cpp + ccutil/hashfn.h ccutil/memblk.cpp ccutil/memblk.h ccutil/memry.cpp + ccmain/adaptions.cpp textord/drawtord.cpp textord/drawtord.h + textord/makerow.cpp textord/makerow.h textord/pithsync.cpp + textord/pithsync.h textord/pitsync1.cpp textord/pitsync1.h + textord/tordmain.cpp textord/tordmain.h textord/wordseg.cpp + textord/wordseg.h wordrec/drawfx.cpp wordrec/drawfx.h + wordrec/tessinit.cpp wordrec/tessinit.h wordrec/tface.cpp +Copyright: Copyright 1992, Hewlett-Packard Ltd +License: Apache-2.0 + +Files: ccmain/applybox.cpp ccmain/applybox.h ccmain/blobcmp.cpp + ccmain/blobcmp.h ccmain/charsample.cpp ccmain/matmatch.cpp + ccmain/matmatch.h ccmain/paircmp.cpp ccmain/paircmp.h + ccstruct/labls.cpp ccstruct/labls.h ccstruct/polyaprx.cpp + ccstruct/polyaprx.h ccstruct/polyblk.cpp ccstruct/polyblk.h + ccstruct/quadlsq.cpp ccstruct/quadlsq.h ccstruct/rwpoly.cpp + ccstruct/rwpoly.h ccstruct/txtregn.cpp ccstruct/txtregn.h + textord/blobcmpl.h
textord/fpchop.cpp textord/fpchop.h + textord/oldbasel.cpp textord/oldbasel.h textord/sortflts.cpp + textord/sortflts.h textord/topitch.cpp textord/topitch.h + textord/tovars.cpp textord/tovars.h wordrec/charsample.h + ccmain/fixspace.cpp ccmain/fixspace.h +Copyright: Copyright 1993, Hewlett-Packard Ltd +License: Apache-2.0 + +Files: ccmain/docqual.cpp ccmain/docqual.h ccmain/output.cpp ccmain/output.h + ccstruct/rejctmap.cpp ccstruct/rejctmap.h textord/underlin.cpp + textord/underlin.h +Copyright: Copyright 1994, Hewlett-Packard Ltd +License: Apache-2.0 + +Files: ccutil/nwmain.h ccutil/tessopt.cpp ccutil/tessopt.h ccutil/tprintf.cpp + ccutil/tprintf.h +Copyright: Copyright 1995, Hewlett-Packard Co +License: Apache-2.0 + +Files: ccstruct/callcpp.cpp ccstruct/hpddef.h ccutil/debugwin.cpp + ccutil/debugwin.h ccutil/notdll.h ccutil/ocrclass.h + ccutil/ocrshell.cpp ccutil/ocrshell.h cutil/callcpp.h +Copyright: Copyright 1996, Hewlett-Packard Co +License: Apache-2.0 + +Files: image/imgbmp.cpp image/imgbmp.h +Copyright: Copyright 1998, Ray Smith +License: Apache-2.0 + +Files: ccmain/baseapi.cpp ccutil/scanutils.cpp ccutil/scanutils.h + image/svshowim.cpp image/svshowim.h ccutil/unichar.cpp + ccutil/unichar.h ccutil/unicharmap.cpp ccutil/unicharmap.h + ccutil/unicharset.cpp ccutil/unicharset.h + training/unicharset_extractor.cpp training/wordlist2dawg.cpp +Copyright: Copyright 2006, Google Inc +License: Apache-2.0 + +Files: tessdll.cpp tessdll.h +Copyright: Copyright 2007, Jetsoftdev +License: Apache-2.0 --- tesseract-2.04.orig/debian/compat +++ tesseract-2.04/debian/compat @@ -0,0 +1 @@ +7 --- tesseract-2.04.orig/debian/tesseract-ocr.install +++ tesseract-2.04/debian/tesseract-ocr.install @@ -0,0 +1,4 @@ +usr/bin/* +usr/share/tessdata/configs/* usr/share/tesseract-ocr/tessdata/configs/ +usr/share/tessdata/confsets usr/share/tesseract-ocr/tessdata/ +usr/share/tessdata/tessconfigs/* usr/share/tesseract-ocr/tessdata/tessconfigs/ --- 
tesseract-2.04.orig/debian/tesseract-ocr-dev.install +++ tesseract-2.04/debian/tesseract-ocr-dev.install @@ -0,0 +1,3 @@ +usr/lib/* +usr/include/tesseract/*.h + --- tesseract-2.04.orig/debian/wordlist2dawg.1 +++ tesseract-2.04/debian/wordlist2dawg.1 @@ -0,0 +1,32 @@ +.TH WORDLIST2DAWG 1 "August 21, 2007" +.SH NAME +wordlist2dawg \- convert a wordlist to a DAWG for tesseract +.SH SYNOPSIS +Part of the process to train tesseract for a new language. Tesseract uses 3 dictionary files for each language. Two of the files are coded as a Directed Acyclic Word Graph (DAWG), and the other is a plain UTF-8 text file. To make the DAWG dictionary files, you first need a wordlist for your language. The wordlist is formatted as a UTF-8 text file with one word per line. Split the wordlist into two sets: the frequent words, and the rest of the words, and then use wordlist2dawg to make the DAWG files: +.PP +.B wordlist2dawg +.RI "frequent_words_list freq-dawg" +.PP +.B wordlist2dawg +.RI "words_list word-dawg" +.SH DESCRIPTION +This manual page documents briefly the +.B wordlist2dawg +command. +.PP +\fBtesseract\fP is a commercial quality OCR engine originally developed at +HP between 1985 and 1995. In 1995, this engine was among the top 3 evaluated +by UNLV. It was open-sourced by HP and UNLV in 2005. +.SH SEE ALSO +.BR feh (1), +.BR convert (1), +.BR mftraining (1), +.BR cntraining (1), +.BR unicharset_extractor (1), +.BR tesseract (1). +.br +.SH AUTHOR +tesseract was written by Ray Smith. +.PP +This manual page was written by Jeffrey Ratcliffe , +for the Debian project (but may be used by others).
--- tesseract-2.04.orig/debian/rules +++ tesseract-2.04/debian/rules @@ -0,0 +1,13 @@ +#!/usr/bin/make -f +CFLAGS = -Wall -g -fPIC -DTESSDATA_PREFIX=/usr/share/tesseract-ocr/ +%: + dh --with quilt $@ + +override_dh_auto_test: + +override_dh_auto_clean: + dh_auto_clean + dh_clean java/com/Makefile java/com/google/Makefile java/com/google/scrollview/Makefile java/com/google/scrollview/events/Makefile java/com/google/scrollview/ui/Makefile + +override_dh_auto_configure: + ./configure --host=$(DEB_HOST_GNU_TYPE) --build=$(DEB_BUILD_GNU_TYPE) --prefix=/usr --mandir=\$${prefix}/share/man --infodir=\$${prefix}/share/info CFLAGS="$(CFLAGS)" CXXFLAGS="$(CFLAGS)" LDFLAGS="-Wl,-z,defs" --- tesseract-2.04.orig/debian/tesseract-ocr.manpages +++ tesseract-2.04/debian/tesseract-ocr.manpages @@ -0,0 +1,6 @@ +debian/tesseract.1 +debian/mftraining.1 +debian/cntraining.1 +debian/unicharset_extractor.1 +debian/wordlist2dawg.1 + --- tesseract-2.04.orig/debian/unicharset_extractor.1 +++ tesseract-2.04/debian/unicharset_extractor.1 @@ -0,0 +1,32 @@ +.TH UNICHARSET_EXTRACTOR 1 "August 21, 2007" +.SH NAME +unicharset_extractor \- extract the unicharset from tesseract box files +.SH SYNOPSIS +Part of the process to train tesseract for a new language. Tesseract needs to know the set of possible characters it can output. To generate the unicharset data file, use the unicharset_extractor program on the training pages' bounding box files: +.PP +.B unicharset_extractor +.RI "fontfile_1.box fontfile_2.box ..." +.SH DESCRIPTION +This manual page documents briefly the +.B unicharset_extractor +command. +.PP +\fBtesseract\fP is a commercial quality OCR engine originally developed at +HP between 1985 and 1995. In 1995, this engine was among the top 3 evaluated +by UNLV. It was open-sourced by HP and UNLV in 2005. +.PP +Tesseract needs to have access to character properties isalpha, isdigit, isupper, islower. This data must be encoded in the unicharset data file. Each line of this file corresponds to one character.
The character in UTF-8 is followed by a hexadecimal number representing a binary mask that encodes the properties. Each bit corresponds to a property. If the bit is set to 1, it means that the property is true. The bit ordering is (from least significant bit to most significant bit): isalpha, islower, isupper, isdigit. +.PP +.SH SEE ALSO +.BR feh (1), +.BR convert (1), +.BR mftraining (1), +.BR cntraining (1), +.BR tesseract (1), +.BR wordlist2dawg (1). +.br +.SH AUTHOR +tesseract was written by Ray Smith. +.PP +This manual page was written by Jeffrey Ratcliffe , +for the Debian project (but may be used by others). --- tesseract-2.04.orig/debian/control +++ tesseract-2.04/debian/control @@ -0,0 +1,39 @@ +Source: tesseract +Section: graphics +Priority: optional +Maintainer: Ubuntu Developers +XSBC-Original-Maintainer: Jeffrey Ratcliffe +Build-Depends: debhelper (>= 7), libtiff4-dev, quilt (>= 0.40) +Standards-Version: 3.8.2 +Homepage: http://code.google.com/p/tesseract-ocr/ +XS-DM-Upload-Allowed: yes + +Package: tesseract-ocr +Architecture: any +Depends: ${shlibs:Depends}, ${misc:Depends}, tesseract-ocr-language +Replaces: tesseract-ocr-data +Description: Command line OCR tool + The Tesseract OCR engine was originally developed at HP between 1985 and 1995. + It was open-sourced by HP and UNLV in 2005 and Google has led further + development. + . + The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV + Accuracy test. Between 1995 and 2006 it had little work done on it, but it + is probably one of the most accurate open source OCR engines available. It + will read a binary, grey or color image and output text. + +Package: tesseract-ocr-dev +Architecture: any +Depends: ${shlibs:Depends}, ${misc:Depends}, tesseract-ocr +Description: Development files for the tesseract command line OCR tool + The Tesseract OCR engine was originally developed at HP between 1985 and 1995.
+ It was open-sourced by HP and UNLV in 2005 and Google has led further + development. + . + The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV + Accuracy test. Between 1995 and 2006 it had little work done on it, but it + is probably one of the most accurate open source OCR engines available. It + will read a binary, grey or color image and output text. + . + This package contains the header files + --- tesseract-2.04.orig/debian/patches/java +++ tesseract-2.04/debian/patches/java @@ -0,0 +1,29 @@ +# Description: Fixes FTBFS due to distclean not working in java directory +# Origin: Adapted from 2.03-1 +Index: tesseract-2.04/java/Makefile.in +=================================================================== +--- tesseract-2.04.orig/java/Makefile.in 2009-07-01 00:24:19.000000000 +0200 ++++ tesseract-2.04/java/Makefile.in 2009-07-04 00:06:08.000000000 +0200 +@@ -402,7 +402,7 @@ + + clean-am: clean-generic mostlyclean-am + +-distclean: distclean-recursive ++distclean: + -rm -f Makefile + distclean-am: clean-am distclean-generic distclean-tags + +Index: tesseract-2.04/java/makefile +=================================================================== +--- tesseract-2.04.orig/java/makefile 2009-07-04 00:16:06.000000000 +0200 ++++ tesseract-2.04/java/makefile 2009-07-04 00:16:52.000000000 +0200 +@@ -47,6 +47,9 @@ + clean : + rm -f ScrollView.jar *.class + ++distclean: clean ++ -rm -f Makefile ++ + # all-am does nothing, to make the java part optional.
+ all all-am install : + --- tesseract-2.04.orig/debian/patches/series +++ tesseract-2.04/debian/patches/series @@ -0,0 +1,2 @@ +java +gcc-4.4 --- tesseract-2.04.orig/debian/patches/gcc-4.4 +++ tesseract-2.04/debian/patches/gcc-4.4 @@ -0,0 +1,12 @@ +Index: tesseract-2.04-1ubuntu1/viewer/svutil.cpp +=================================================================== +--- tesseract-2.04-1ubuntu1.orig/viewer/svutil.cpp 2009-11-13 09:48:19.000000000 +0300 ++++ tesseract-2.04-1ubuntu1/viewer/svutil.cpp 2009-11-13 09:49:10.000000000 +0300 +@@ -43,6 +43,7 @@ + #endif + + #include ++#include + + const int kBufferSize = 65536; + const int kMaxMsgSize = 4096;