Embedded jpg exported to EPS/PDF as non-jpeg (JPEGs are stored /FlateDecode instead of /DCTDecode)

Bug #168708 reported by Norbert Nemec
This bug affects 23 people
Affects            Status        Importance  Assigned to         Milestone
Inkscape           Fix Released  Medium      Krzysztof Kosinski
inkscape (Ubuntu)  Fix Released  Low         Unassigned

Bug Description

Embedding jpg images in an svg file works beautifully. Exporting the same to
an eps or a pdf is a pain. The image is actually converted to a lossless
tiff image, blowing up the file size immensely. The only way to reduce the
file size later on is to go via pdf2ps and ps2pdf, which encodes all the
internal raster images as jpg. This doubles the amount of jpg artefacts and
I lose all control over which raster images should be compressed in
which way.

Postscript level 2 does allow embedding JPG (see
http://www.pdflib.com/products/more/jpeg2ps.html)
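
One way to tell whether an exported PDF shows this problem is to look at the /Filter entry of its image objects: /DCTDecode means the JPEG data was stored as-is, while /FlateDecode means the decoded pixels were stored losslessly. The following is a minimal Python sketch of such a check, not a real PDF parser. The function name image_filters and the regular expressions are illustrative assumptions, and the scan is a heuristic that only looks at the dictionaries of stream objects.

```python
import re
from collections import Counter

# Dictionary of one stream object: "N G obj << ... >> stream", never crossing "endobj".
STREAM_DICT = re.compile(
    rb"\d+\s+\d+\s+obj\s*<<((?:(?!endobj).)*?)>>\s*stream", re.DOTALL
)
IMAGE = re.compile(rb"/Subtype\s*/Image\b")
# /Filter is either a single name or an array of names.
FILTER = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
NAME = re.compile(rb"/([A-Za-z0-9]+)")


def image_filters(pdf_bytes):
    """Count the /Filter values used by image XObjects in raw PDF bytes."""
    counts = Counter()
    for match in STREAM_DICT.finditer(pdf_bytes):
        dictionary = match.group(1)
        if not IMAGE.search(dictionary):
            continue  # not an image, e.g. a content stream or font
        found = FILTER.search(dictionary)
        names = NAME.findall(found.group(1)) if found else []
        key = "+".join(n.decode("ascii") for n in names) or "none"
        counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for name, count in image_filters(handle.read()).items():
            print(name, count)
```

Run on an affected export, a check like this would be expected to report the embedded photo under FlateDecode rather than DCTDecode.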

Tags: exporting
Norbert Nemec (nnemec) wrote :

In fact, this is *not* a duplicate of bug #168434: The other bug report is about lossless raster images stored in PDF files without compression. That issue was fairly easy to fix and has been fixed by switching to Cairo PDF export.

This bug, however, is about embedded *JPG* raster images. As of now, Inkscape decompresses these images and embeds them in a lossless format.

AFAIK, Cairo does not yet offer any mechanism to handle raw JPG data and avoid recompression with loss of quality. (see long thread starting with http://lists.freedesktop.org/archives/cairo/2007-January/009096.html) That means, Inkscape cannot really solve this issue either. Therefore, this bug should be considered "on hold" until Cairo has resolved the issue on their side.

Proposed "solutions" like offering an option for the compression level in jpg images embedded in exported PDF are not satisfactory as they still do not avoid the unnecessary recompression.

Changed in inkscape:
status: New → In Progress
nightrow (jb-benoit)
Changed in inkscape:
importance: Undecided → Medium
wvengen (wvengen) wrote :

I experienced this problem just now, and the effect was that the printer took an unreasonably long time to process the resulting pdf (not surprising with these file sizes). I can see this being a serious issue for people using the pdfs (eps won't do when transparency is present) in their LyX/LaTeX documents.

Jan David Mol (jjdmol) wrote :

Same here. Mailing a pdf created by Inkscape is a bit of a pain this way if it includes a large JPG such as a photo.

enodev (jdi) wrote :

This may in the future be possible with a new addition to the (not yet released) cairo API:
  cairo_surface_{set,get}_mime_data

http://cgit.freedesktop.org/cairo/commit/?id=3c684347f49a581bfba35202ec61a5f6334acd4a
http://cgit.freedesktop.org/cairo/commit/?id=3707178fa48e23b85c5640f3cee72e19f49c700b

Poppler wants to use it as well for this purpose

http://<email address hidden>/msg03310.html

Pander (pander) wrote :

In version 0.47, compressed PDF export is still not supported. This is the workaround:
  http://wiki.inkscape.org/wiki/index.php/Current_PDF_Support#Uncompressed_PDF_Output
If it is going to take a long time for cairo to support compressed PDF, please implement a workaround.

Provide an option when saving as PDF called "compress via EPS", or at least tell users that when compression is needed they should export to EPS first and then convert the result to PDF. That would only work easily on Linux, since those users have epstopdf.

Please improve PDF compression in one way or another (workaround, upgrade cairo, some helpful text, etc.) as soon as possible.

Adrian Johnson (ajohnson-redneon) wrote :

Cairo 1.10 supports embedding of JPEG and other formats in PS/PDF.

The cairo_surface_set_mime_data() function is used to attach the JPEG data to an image surface. Then, if the backend supports JPEG, it will use the JPEG data instead of the image data.

http://www.cairographics.org/manual/cairo-cairo-surface-t.html#cairo-surface-set-mime-data
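
A minimal sketch of the call described above, written against pycairo, the Python binding, whose Surface.set_mime_data() wraps cairo_surface_set_mime_data(). It assumes cairo 1.10 or later and a pycairo version that exposes set_mime_data. The helper names attach_jpeg and export_jpeg_as_pdf are made up for illustration, and this is not how Inkscape implements its export.

```python
JPEG_MIME = "image/jpeg"  # the string behind CAIRO_MIME_TYPE_JPEG


def attach_jpeg(surface, jpeg_bytes):
    """Attach the original, still-compressed JPEG stream to an image surface."""
    surface.set_mime_data(JPEG_MIME, jpeg_bytes)
    return surface


def export_jpeg_as_pdf(jpeg_path, pdf_path, width, height):
    """Write a one-page PDF showing the JPEG; width/height are its pixel size."""
    import cairo  # pycairo; imported here so attach_jpeg is usable without it

    with open(jpeg_path, "rb") as handle:
        jpeg_bytes = handle.read()

    # Placeholder pixels only. A real exporter would also decode the JPEG into
    # the surface, because backends without JPEG support use the pixel data.
    image = cairo.ImageSurface(cairo.FORMAT_RGB24, width, height)
    attach_jpeg(image, jpeg_bytes)

    # One PDF point per image pixel, purely to keep the sketch short.
    pdf = cairo.PDFSurface(pdf_path, width, height)
    context = cairo.Context(pdf)
    context.set_source_surface(image, 0, 0)
    context.paint()
    pdf.finish()
```

The attached data should be set after the surface's pixels are final, since cairo drops attached MIME data when the surface is drawn on afterwards.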

su_v (suv-lp) wrote :

Setting status back to 'Triaged' because at the moment there is no Inkscape developer assigned to this bug, nor are there patches available that implement the cairo-version-dependent new feature for the embedding of JPEG images in exported PDF files.

@Norbert - if you are working on a patch, please revert the status change.

@Adrian - thanks for the update about the new feature in cairo 1.10!

Changed in inkscape:
status: In Progress → Triaged
Norbert Nemec (norbert-nemec-list) wrote :

No, I am not working on this and never did. I might have spent some time on this three years ago, but back then Cairo was not ready for it. Now, I do not have the time to spare any more...

chrysn (chrysn) wrote :

as the eps export workaround didn't work for me, i created another workaround, which works by exporting the bloated pdf first and then replacing the images with their pre-compressed jpeg versions.

the attached script is written in an extremely quick-and-dirty fashion. it takes a pdf file and several jpeg files as arguments, reads the jpegs, hashes them (in an ultra-primitive way) and then walks through the pdf file, tries to read the big images, hashes them too, and if a hash matches, uses imagemagick's convert tool to convert the jpg to pdf (which it does the way it is supposed to, storing the dct image). then it extracts the pdf chunk from the converted jpg and pastes it into the main pdf, replacing the old image.

there are countless ways in which this approach could fail (grayscale jpg, imagemagick changing the names it uses inside the pdf, features of pdf i don't know), but it is usable as a workaround and typically reduces output sizes by a factor of ten.
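
For reference, the object such a replacement has to end up with is an image XObject whose stream is the untouched JPEG file and whose filter is /DCTDecode. The sketch below is not the attached script. It is a minimal, stdlib-only Python illustration with made-up helper names (jpeg_info, dct_image_xobject) that reads the image size from the JPEG header and builds that object. It ignores details such as inverted CMYK data and does not write a complete PDF file.

```python
import struct

# Start-of-frame markers: 0xC0..0xCF without DHT (C4), JPG (C8) and DAC (CC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}
COLOR_SPACES = {1: b"/DeviceGray", 3: b"/DeviceRGB", 4: b"/DeviceCMYK"}


def jpeg_info(data):
    """Return (width, height, components, bits) from a JPEG's frame header."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte in front of the real marker
            pos += 1
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            bits, height, width, comps = struct.unpack(
                ">BHHB", data[pos + 4:pos + 10]
            )
            return width, height, comps, bits
        pos += 2 + length  # skip the marker and its segment
    raise ValueError("no start-of-frame segment found")


def dct_image_xobject(jpeg_bytes):
    """Build the body of a PDF image XObject that stores the JPEG unchanged."""
    width, height, comps, bits = jpeg_info(jpeg_bytes)
    header = (
        b"<< /Type /XObject /Subtype /Image /Width %d /Height %d "
        b"/ColorSpace %s /BitsPerComponent %d /Filter /DCTDecode /Length %d >>"
        % (width, height, COLOR_SPACES[comps], bits, len(jpeg_bytes))
    )
    return header + b"\nstream\n" + jpeg_bytes + b"\nendstream"
```

Because the stream is a byte-for-byte copy of the JPEG file, the PDF grows by roughly the size of the JPEG and no recompression takes place.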

Changed in inkscape (Ubuntu):
status: New → Triaged
importance: Undecided → Low
Paul Sladen (sladen)
summary: - embedded jpg exported to eps&pdf as non-jpg
+ Embedded jpg exported to EPS/PDF as non-jpeg (JPEGs are stored
+ /FlateDecode instead of /DCTDecode)
Peter (ppp) wrote :

Just a few comments on Chrysn's ''workaround python script''

I played around with it, but finally I've given up. The file size of the replacement file looks right, but on opening it, the text is totally garbled and the picture isn't visible. Probably one of the countless ways to fail.

But I just want to mention here my experiences, mostly regarding Windows OS.

In lines 12 and 54, the file must be opened in binary mode to let pyPdf work correctly.
In line 52, I had to use a renamed version of convert, because the identically named Windows system file gets in the way.
In line 60, I commented out the delete command, as this needs an inode file system.

phil (fongpwf) wrote :

Any explanation on why the priority is set to low? This bug makes exporting drawings with high resolution JPG files embedded useless.

phil (fongpwf) wrote :

Never mind, I now realize the low importance is only in the inkscape package in Ubuntu. It is normal in Inkscape.

Alex Valavanis (valavanisalex) wrote :

Hi Phil,

The priority relates to Ubuntu as a whole - not just to the Inkscape package. Inkscape isn't a core package for Ubuntu (such as the Xserver or the linux kernel) so usability bugs won't normally be prioritised very highly. Don't worry though, I can assure you that the upstream developers are very active and I (among others) will happily help to apply any decent patches to the Ubuntu package as soon as they become available.

Patrick Storz (ede123) wrote :

This problem severely limits Inkscape's usefulness when one wants to annotate raster graphics by overlaying them with vector elements.

E.g. I'm using Inkscape to create images for my thesis. Since I'm using pdftex I'd want to export PDFs from Inkscape, but since I can't save the graphics without losing quality I abandoned Inkscape in favor of GIMP and pure LaTeX.

This bug is only part of a fundamental problem of Inkscape, see e.g.
- Bug #871563
- Bug #958371
and probably many more.

If this can be solved in Cairo with cairo_surface_set_mime_data() as laid out above, this should be a high priority bug and should be fixed as soon as possible, since it also creates issues like bug #1220912 that could be avoided by not re-encoding embedded or linked images.

Riccardo Masala (raggioscuro) wrote :

I had the same problem.
My personal workaround: open the svg file with Scribus and export from here.

sam tygier (samtygier) wrote :

Has this been fixed? I just tested latest bzr, making a file with just a jpeg image in it, and then I exported to PDF. The PDF file is only 1KB bigger than the original jpeg.

su_v (suv-lp) wrote :

Based on tests with archived builds, this was fixed in revision 12516:
<http://bazaar.launchpad.net/~inkscape.dev/inkscape/trunk/revision/12516>

(reproduced with r12511, not reproduced with r12517)

su_v (suv-lp) wrote :

[ @Krzysztof - please reopen if the report was closed prematurely. ]

Changed in inkscape:
assignee: nobody → Krzysztof Kosinski (tweenk)
milestone: none → 0.91
status: Triaged → Fix Committed
Changed in inkscape:
status: Fix Committed → Fix Released
Changed in inkscape (Ubuntu):
status: Triaged → Fix Released