pdfimages produces inverted image for black & white image

Bug #134313 reported by Jeffrey Ratcliffe
This bug affects 1 person
Affects           Status         Importance   Assigned to   Milestone
gimp (Ubuntu)     Fix Released   Low          Unassigned
poppler (Ubuntu)  Fix Released   Low          Unassigned
xpdf (Ubuntu)     Invalid        Undecided    Unassigned

Bug Description

Binary package hint: xpdf

For any PDF with black & white images, pdfimages extracts the images with the colours reversed.

Revision history for this message
Jeffrey Ratcliffe (jeffreyratcliffe) wrote :

Try the following:

pdfimages pg_0003.pdf x

File x-000.pbm is produced, and the colours are inverted (reversed)

Revision history for this message
Yves-Antoine (yae) wrote :

This behavior can be reproduced with the GIMP:
  - Open the attached file test.gif in the GIMP.
  - Save it as test.pbm, choosing raw data formatting.
  - Close the image window.
  - Open the previously saved file test.pbm.
  - The image is inverted.

I confirm the same behavior with the pdfimages included in the poppler-utils package.

All of this was tested on an up-to-date Gutsy as of 2 October 2007:
gimp 2.4.0~rc1-3ubuntu2
poppler-utils 0.6-0ubuntu1

gimp and poppler-utils should be added to the list of affected packages.

Revision history for this message
Yves-Antoine (yae) wrote :

Adding the resulting file test.pbm from the above test.

Revision history for this message
Yves-Antoine (yae) wrote :

Here is a program to invert a PBM file, as a workaround for this bug.
Compilation:
gcc -o pbminvert pbminvert.c
Execution:
cat file_to_be_inverted.pbm | ./pbminvert > inverted_file.pbm
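
For readers without access to the attachment, the same workaround can be sketched in Python. This is not the attached pbminvert.c, only an illustration under stated assumptions: the input is a raw (binary, `P4`) PBM file, any `#` comments sit between header tokens rather than directly before the raster, and the function name `invert_raw_pbm` is made up for this example. In PBM a 1 bit is black and a 0 bit is white, so flipping every bit of the raster swaps the two.

```python
def _read_token(data, pos):
    """Return (token, position after it), skipping whitespace and '#' comments."""
    n = len(data)
    while pos < n:
        c = data[pos:pos + 1]
        if c == b"#":
            # A comment runs to the end of the line.
            while pos < n and data[pos:pos + 1] not in (b"\n", b"\r"):
                pos += 1
        elif c.isspace():
            pos += 1
        else:
            break
    start = pos
    while pos < n and not data[pos:pos + 1].isspace():
        pos += 1
    return data[start:pos], pos


def invert_raw_pbm(data):
    """Invert a raw (P4) PBM image given as bytes; returns the new file bytes."""
    magic, pos = _read_token(data, 0)
    if magic != b"P4":
        raise ValueError("not a raw (P4) PBM file")
    width, pos = _read_token(data, pos)
    height, pos = _read_token(data, pos)
    w, h = int(width), int(height)
    pos += 1  # exactly one whitespace byte separates the header from the raster
    row_bytes = (w + 7) // 8  # each row is padded to a whole number of bytes
    raster = data[pos:pos + row_bytes * h]
    if len(raster) != row_bytes * h:
        raise ValueError("truncated raster")
    # Padding bits at the end of each row are "don't care", so flipping
    # whole bytes is enough.
    return data[:pos] + bytes(b ^ 0xFF for b in raster)


# Tiny in-memory check: an 8x2 image, first row half black, second row white.
original = b"P4\n8 2\n" + bytes([0b11110000, 0b00000000])
inverted = invert_raw_pbm(original)
assert inverted == b"P4\n8 2\n" + bytes([0b00001111, 0b11111111])
```

To use it as a filter like the command above, one could read `sys.stdin.buffer`, pass the bytes to `invert_raw_pbm`, and write the result to `sys.stdout.buffer`.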

Revision history for this message
Greg Grossmeier (greg.grossmeier) wrote :

I can confirm this in GIMP, but not with pdfimages in Gutsy. pdfimages does not produce inverted PBMs when run as "pdfimages file x".

poppler-utils: 0.6-0ubuntu2.1
GIMP: 2.4.2-0ubuntu0.7.10.1

Changed in gimp:
status: New → Confirmed
Revision history for this message
Greg Grossmeier (greg.grossmeier) wrote :

Is the pdfimages issue still present for anyone?

Changed in xpdf:
status: New → Invalid
Changed in poppler:
status: New → Incomplete
Revision history for this message
Yves-Antoine (yae) wrote :

Hello Greg,

Thanks for looking into this issue.

No, everything now works fine with the pdfimages program on an up-to-date Hardy.

I had added poppler because poppler-utils is the package that provides the pdfimages command on a vanilla Ubuntu system. I have just tested on a system with poppler-utils installed, so this is no longer an issue with poppler either.

I confirm the bug is still there with the GIMP on an up-to-date Hardy.

Regards
Yves-Antoine

Changed in poppler:
status: Incomplete → Invalid
Revision history for this message
Jeffrey Ratcliffe (jeffreyratcliffe) wrote :

Here is a PDF whose image (PBM) is extracted reversed.

Changed in poppler:
status: Invalid → Confirmed
Revision history for this message
Jeffrey Ratcliffe (jeffreyratcliffe) wrote :

This bug is present in 0.6.4-1ubuntu2 in Hardy.

Changed in gimp (Ubuntu):
importance: Undecided → Low
Changed in poppler (Ubuntu):
importance: Undecided → Low
Revision history for this message
madbiologist (me-again) wrote :

I tested this on Ubuntu 10.04 "Lucid Lynx" alpha 2 with poppler updated to the latest available Ubuntu package.

Uname: Linux 2.6.32-10-generic i686
Packages: evince 2.29.5-0ubuntu1
          poppler 0.12.3-0ubuntu1

I used pdfimages to extract the image from the file in comment #8. The PBM image is now normal, and no longer reversed/inverted.

Changed in poppler (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
Thomas Hotz (thotz-deactivatedaccount) wrote :

Thank you for telling us and testing this. I'll mark this bug as fixed.

Changed in gimp (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
Karl Kastner (kastner-karl) wrote :

Still occurring regularly on Ubuntu 17.10 Artful. I regularly OCR PDFs of scanned documents, and this occurs frequently for fax-compressed files.

Revision history for this message
Gabriel Staples (ercaguy) wrote :

Still occurring on Ubuntu 18.04. This is weird. There's got to be a way to fix it!

Revision history for this message
Gabriel Staples (ercaguy) wrote :

Version info:
```
$ pdfimages -v
pdfimages version 0.62.0
Copyright 2005-2017 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1996-2011 Glyph & Cog, LLC
```

Revision history for this message
Gabriel Staples (ercaguy) wrote :

Related:

"In general, Acrobat and Reader will show a negative image when the image data indicates that it needs to be inverted. When you have images that are compressed with the CCITTG4 algorithm, there is a flag in the image data that indicates if a logical "1" indicates black or white. The software that created this document must have used the wrong value for this flag. I've seen similar effects also with certain CMYK JPEG images (but it's been years since I encountered the last one of those) where the PDF generator created corrupt data."

Source: https://answers.acrobatusers.com/Why-does-the-PDF-show-as-a-reversed-image-q132625.aspx
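
In PDF terms, the flag the quote describes is presumably the `BlackIs1` entry in the `CCITTFaxDecode` filter parameters, which defaults to false (0 bits are black, 1 bits are white). PBM uses the opposite convention (1 is black). The toy sketch below is not poppler's code; `to_pbm_bits` is a hypothetical helper that only illustrates why a wrong or ignored flag yields a fully inverted image.

```python
def to_pbm_bits(decoded_bits, black_is_1):
    """Map decoded 1-bit samples to the PBM convention, where 1 means black."""
    if black_is_1:
        # The samples already use 1 for black, same as PBM.
        return list(decoded_bits)
    # Default PDF convention: 0 is black, so every bit must be flipped.
    return [1 - b for b in decoded_bits]


row = [0, 0, 1, 1, 0, 1]  # made-up decoded samples for one scan line
as_intended = to_pbm_bits(row, black_is_1=False)
with_wrong_flag = to_pbm_bits(row, black_is_1=True)

# Getting the flag wrong flips every pixel, i.e. the whole image is inverted.
assert with_wrong_flag == [1 - b for b in as_intended]
```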

Revision history for this message
Gabriel Staples (ercaguy) wrote :

`pdfimages -list mypdf.pdf` shows all of my pages were encoded in `ccitt` format.
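
The same check can be scripted. This is a rough sketch that assumes the `-list` output starts with a header row containing a column named `enc`, followed by a row of dashes and one row per image; both function names are made up for this example, and the layout may differ between poppler versions.

```python
import subprocess


def encodings_from_list_output(text):
    """Extract the 'enc' column from the text printed by `pdfimages -list`."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    # Locate the column by its header name instead of hard-coding an index.
    enc_col = lines[0].split().index("enc")
    encodings = []
    for ln in lines[1:]:
        if set(ln) <= {"-", " "}:
            continue  # separator row under the header
        fields = ln.split()
        if len(fields) > enc_col:
            encodings.append(fields[enc_col])
    return encodings


def list_encodings(pdf_path):
    """Run `pdfimages -list` on pdf_path and return the encoding of each image."""
    result = subprocess.run(
        ["pdfimages", "-list", pdf_path],
        check=True, capture_output=True, text=True,
    )
    return encodings_from_list_output(result.stdout)
```

For example, `set(list_encodings("mypdf.pdf")) == {"ccitt"}` would be true for a file where every image is fax-compressed, as in the case described above.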
