form text input placed wrong

Bug #152929 reported by Nicolò Chieffo
Affects            Status         Importance   Assigned to   Milestone
Poppler            Fix Released   Medium
poppler (Ubuntu)   Fix Released   Low          Unassigned

Bug Description

Binary package hint: evince

The new Evince supports filling in form fields. This particular form (attached) has the following problem (in Windows, with Acrobat, it displays correctly):
When I type a letter into a 1-character field, the letter is displayed below the image "cell"; it should be placed in the center of the cell instead.
I will attach a screenshot for a more accurate description.

Revision history for this message
Sebastien Bacher (seb128) wrote :

Thanks for your bug report. This bug has been reported to the developers of the software. You can track it and make comments here: https://bugs.freedesktop.org/show_bug.cgi?id=12808

Changed in evince:
importance: Undecided → Low
status: New → Triaged
Revision history for this message
Nicolò Chieffo (yelo3) wrote :

The error is the position of the number "8" under the field "DATA DI NASCITA" (date of birth).

Changed in poppler:
status: Unknown → Confirmed
Revision history for this message
Sergio Zanchetta (primes2h) wrote :

I have exactly the same problem with the same pdf.

Revision history for this message
Mvrable (mvrable) wrote :

I have run into the same problem, and can verify that it is still present in the most recent (as of 2008-01-29) development sources from git. I spent some time debugging, and have found what I think is the problem.

Form field contents are displayed in Annot::drawText. The text to display (passed as text) is a UTF-16 string, with byte-order mark (BOM). The field value is, I believe, set in FormWidgetText::setContent, which explicitly adds a BOM to the string.
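To make the byte layout concrete, here is a minimal Python sketch (not poppler code) of what a one-character field value looks like once a BOM has been added. The helper name `with_bom` is hypothetical; it only mimics what FormWidgetText::setContent is described as doing above.

```python
import codecs

def with_bom(text: str) -> bytes:
    """Encode text as big-endian UTF-16 and prepend the FE FF byte-order mark."""
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")

value = with_bom("8")
print(value.hex(" "))  # fe ff 00 38
print(len(value))      # 4 bytes for a single visible character
```

A single digit therefore occupies four bytes, two of which are the BOM.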

When generating the appearance stream, the text is converted (in Annot::writeTextString) from Unicode to the appropriate 8-bit characters needed for the selected font. However, the string width is calculated before this, in the main body of Annot::drawText, treating the original unconverted UTF-16 string as an 8-bit string. The two bytes in the BOM (FE FF) are treated as characters to display, so the computed width is too large. This doesn't affect left-justified form fields, but centered and right-justified fields are placed incorrectly.
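The effect on alignment can be illustrated with a small Python sketch. This is not the poppler implementation: the function names, the fixed glyph width, the font size and the field width are all assumptions chosen to keep the arithmetic easy to follow.

```python
import codecs

GLYPH_WIDTH = 0.5  # assumed: every glyph is half an em wide (illustrative only)

def naive_width(raw: bytes, font_size: float) -> float:
    """Buggy approach: count every byte, including the BOM, as one glyph."""
    return len(raw) * GLYPH_WIDTH * font_size

def decoded_width(raw: bytes, font_size: float) -> float:
    """Strip the BOM and decode UTF-16BE before measuring."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        text = raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    else:
        text = raw.decode("latin-1")  # stand-in for an 8-bit encoding
    return len(text) * GLYPH_WIDTH * font_size

def centered_x(field_width: float, text_width: float) -> float:
    """Left edge of the text when it is centered in the field."""
    return (field_width - text_width) / 2

raw = codecs.BOM_UTF16_BE + "8".encode("utf-16-be")
print(centered_x(20.0, naive_width(raw, 10.0)))    # 0.0
print(centered_x(20.0, decoded_width(raw, 10.0)))  # 7.5
```

With the byte-wise width the text starts at the left edge of the field, while measuring the decoded string places it in the middle. A left-justified field ignores the width entirely, which is consistent with only centered and right-justified fields being affected.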

I currently have an ugly patch which works around this bug, and field alignment appears correct after applying it. But I don't yet handle anything other than a simple single-line form field, so there are other cases which are probably still buggy.

A larger issue, which I'm trying to figure out, is how the form field contents are supposed to be interpreted. Section 8.6.3 of the PDF 1.6 specification says "The field's text is held in a text string (or, beginning with PDF 1.5, a stream) in the V (value) entry of the field dictionary. The contents of this text string or stream are used to construct an appearance stream for displaying the field...". The phrase "text string" seems to imply that the string is either in PDFDocEncoding or UTF-16, which is what poppler seems to assume. However, from a little experimentation it seems Acrobat Reader (sorry, forget which version) simply treats the field value as a string to be interpreted according to whatever encoding is used by the font for the field, not PDFDocEncoding. I'm currently trying to make some sense of this, and figure out what the correct fix is for the problem.
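For reference, the "text string" reading described above (UTF-16 if the string starts with FE FF, otherwise PDFDocEncoding) can be sketched in Python as follows. The function name `decode_text_string` is hypothetical, and Latin-1 is used only as an approximation of PDFDocEncoding, which differs from it for some code points.

```python
import codecs

def decode_text_string(raw: bytes) -> str:
    """Decode a PDF text string: UTF-16BE when a BOM is present, 8-bit otherwise."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # Assumption: Latin-1 stands in for PDFDocEncoding in this sketch.
    return raw.decode("latin-1")

print(decode_text_string(b"\xfe\xff\x00A\x00B"))  # AB
print(decode_text_string(b"AB"))                  # AB
```

This sketch does not model the alternative behavior observed in Acrobat Reader, where the bytes appear to be interpreted according to the field font's own encoding.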

Revision history for this message
Mvrable (mvrable) wrote :

Some more investigation of the behavior of Adobe Reader 7.0.9 (Windows):

I'm not sure I should use Adobe Reader as a guide for proper behavior. My test file is http://www.irs.gov/pub/irs-pdf/f1040.pdf. Adobe Reader is exhibiting some rather strange behavior here: the default appearance string for most form fields specifies /HeBo (Helvetica-Bold) as a font, but when editing a field and saving the resulting file, it looks like Adobe Reader is using /Helvetica-Condensed-Bold as a font. Additionally, the two fonts have different encodings specified (WinAnsiEncoding vs. StandardEncoding), so it's not so surprising that some character encoding issues are coming up.

At the very least, it does seem that Adobe Reader will decode form field values that are encoded in UTF-16 (though it still displays them incorrectly). So, using UTF-16 for form field values in poppler seems reasonable.

I'll see if I can't clean up my earlier patch a bit and post something.

Revision history for this message
Mvrable (mvrable) wrote :

Created an attachment (id=14224)
Patch to add better Unicode support to form fields, fixing some alignment bugs

This is a patch I've created which fixes (for me) the mis-alignment of text in form fields. This has also been posted to the mailing list, but I'm including it here as well so there is a record with the bug itself.

This may not be committed to the poppler tree until after some reorganization (to split forms code out from the core annotations support), but the attached patch should apply cleanly to git commit 3e994e8586fa1c87ef7e7f82af1cdacf2cd36310.

Revision history for this message
Carlos Garcia Campos (carlosgc) wrote :

Patch has been committed to git master. Thank you very much.

Changed in poppler:
status: Confirmed → Fix Released
Revision history for this message
Pedro Villavicencio (pedro) wrote :

fixed upstream.

Changed in poppler:
status: Triaged → Fix Committed
Revision history for this message
Sebastien Bacher (seb128) wrote :

the bug is fixed in jaunty now

Changed in poppler:
status: Fix Committed → Fix Released
Changed in poppler:
importance: Unknown → Medium
Changed in poppler:
importance: Medium → Unknown
Changed in poppler:
importance: Unknown → Medium