automatic handling of non breaking spaces

Bug #36977 reported by Sebastien Bacher on 2006-03-28
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Launchpad itself

Bug Description

I don't know about other languages, but in French non-breaking spaces are used before
the ";:!? »" chars and after "«".

A non-breaking space can't be separated from the previous word.
i.e. you want to write "example :" and not "example
:" (with a standard space, nothing prevents the ":" from ending up on a new line).

The non-breaking space defined in gucharmap:

To make this easier, some of the French translators use scripts to add non-breaking spaces automatically. An example of such a script for Emacs can be found here (the encoding seems to be wrong in the list archive and non-breaking spaces are displayed as "?"):

Basically, if you enter one of ";:!?" it automatically adds a non-breaking space before it, so you don't even have to bother with the compose key every time. If you enter a "«" it automatically displays "« _ »" (where "_" is the cursor and the spaces are non-breaking ones).
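As a rough illustration of the rule described above, here is a minimal Python sketch. It is not the Emacs script mentioned in this report and not anything Rosetta implements; the function name `add_french_nbsp` is hypothetical, and the sketch normalizes a finished string rather than reacting to keystrokes. It also makes no attempt to skip cases such as the ":" in a URL.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE


def add_french_nbsp(text: str) -> str:
    """Put a no-break space before ; : ! ? » and after « (hypothetical helper)."""
    # Before the punctuation: replace any run of ordinary or no-break spaces
    # (possibly empty) that follows a non-space character with one NBSP.
    text = re.sub(r"(?<=\S)[ \u00a0]*([;:!?»])", NBSP + r"\1", text)
    # After the opening guillemet: same idea, in the other direction.
    text = re.sub(r"«[ \u00a0]*(?=\S)", "«" + NBSP, text)
    return text


if __name__ == "__main__":
    # repr() makes the otherwise invisible no-break spaces visible as \xa0.
    print(repr(add_french_nbsp("exemple :")))
    print(repr(add_french_nbsp("« Bonjour ! »")))
```

Running the function on text that already contains the non-breaking spaces leaves it unchanged, so it can be applied repeatedly to the same translation.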

That makes a big difference for translators (in one case you need to deal with the compose key all the time, in the other you only have to type), and it would be a nice feature to have in Rosetta.

Changed in rosetta:
assignee: nobody → carlos
status: Unconfirmed → Confirmed
Carlos Perelló Marín (carlos) wrote :

Here we have some extra information about nonbreaking spaces:

Claude Paroz (paroz) wrote :

Another related problem is that Mozilla-based code doesn't preserve non-breaking spaces in input forms.

Carlos Perelló Marín (carlos) wrote :

Yeah, we are aware of that problem. The solution we are going to use is to add a visual tag to represent non-breaking spaces; that way you will see if there is one and Mozilla will not 'eat' it.

Didier Raboud (odyx) wrote :

But the question is not being able to "see" whether the English version has non-breaking spaces, but being able to translate "the dog:" into "le chien :" (note the non-breaking space). So we still have to type it, to _add_ it, as the English version normally has NO non-breaking spaces.

Or do I misunderstand it?

Carlos Perelló Marín (carlos) wrote :

You misunderstand it. I'm going to add the option to type in a special tag, like we do with the '\t' char, which we add as [tab], so you would add something similar to denote a non-breaking space.

I cannot give you a lot of details here, because I need to think a bit more about this, but that's the main idea I have in mind. This will also work around the problem with Firefox ignoring it.
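A minimal Python sketch of the visual-tag idea, under stated assumptions: the [tab] tag comes from the comment above, while the tag names [nbsp] and [nnbsp] are borrowed from a later comment in this report, and the function names `to_visual` and `from_visual` are hypothetical. It does not show how Rosetta actually implements this, and it does not handle a translation that contains a literal "[nbsp]" string.

```python
# Mapping from hard-to-see characters to the visual tags shown in the form.
# Only [tab] is confirmed by the comment above; the other two are assumptions.
VISUAL_TAGS = {
    "\t": "[tab]",
    "\u00a0": "[nbsp]",    # NO-BREAK SPACE
    "\u202f": "[nnbsp]",   # NARROW NO-BREAK SPACE
}


def to_visual(text: str) -> str:
    """Replace special characters with tags before showing text in a form."""
    for char, tag in VISUAL_TAGS.items():
        text = text.replace(char, tag)
    return text


def from_visual(text: str) -> str:
    """Turn typed tags back into the real characters before storing the text."""
    for char, tag in VISUAL_TAGS.items():
        text = text.replace(tag, char)
    return text


if __name__ == "__main__":
    print(to_visual("le chien\u00a0:"))
    print(repr(from_visual("le chien[nbsp]:")))
```

Because the browser only ever sees plain ASCII tags, it has nothing to 'eat', and a translator can type the tag without needing the compose key.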

Didier Raboud (odyx) wrote :

OK. Fine. Sorry then.

That's a good workaround then. But _please_ keep the possibility of typing normal ones; I use it every day with Konqueror.

Matthew Revell (matthew.revell) wrote :

Following discussion with Danilo, I've reported bug #81281, which deals with visual tags for non-breaking spaces.

Changed in rosetta:
assignee: carlos → nobody
Changed in rosetta:
importance: Medium → Low
tags: added: feature
Nicolas Delvaux (malizor) wrote :

In fact, at least in French, non-breaking spaces are used before the ": »" chars and after "«", but *narrow* non-breaking spaces are used before the ";!?" chars.

The narrow non-breaking space is U+202F.
I opened bug #608631 about this.

Otherwise, this automatic handling of non-breaking spaces is a must-have to achieve perfect translations in some locales!
Perhaps it should just be an automatic suggestion from Launchpad when it detects there is no [nbsp] or [nnbsp] in some matching cases?
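A minimal Python sketch of such a suggestion check, assuming the rule exactly as stated in this comment (non-breaking space before ": »" and after "«", narrow non-breaking space before ";!?"). The function name `find_missing_spaces` and the shape of its result are hypothetical; nothing here reflects what Launchpad actually does, and the rule table would need adjusting for a different typographic convention.

```python
NBSP = "\u00a0"    # NO-BREAK SPACE
NNBSP = "\u202f"   # NARROW NO-BREAK SPACE

# Which space is expected immediately before each punctuation mark,
# following the rule given in the comment above.
EXPECTED_BEFORE = {":": NBSP, "»": NBSP, ";": NNBSP, "!": NNBSP, "?": NNBSP}


def find_missing_spaces(text: str) -> list[tuple[int, str, str]]:
    """Return (index, character, 'before' or 'after') for each suspicious spot."""
    problems = []
    for i, char in enumerate(text):
        expected = EXPECTED_BEFORE.get(char)
        # Punctuation must be directly preceded by its expected space.
        if expected is not None and (i == 0 or text[i - 1] != expected):
            problems.append((i, char, "before"))
        # An opening guillemet must be directly followed by a no-break space.
        if char == "«" and text[i + 1:i + 2] != NBSP:
            problems.append((i, char, "after"))
    return problems


if __name__ == "__main__":
    print(find_missing_spaces("le chien :"))
    print(find_missing_spaces("le chien\u00a0:"))
```

An empty list means no suggestion is needed; otherwise each entry points at the punctuation mark where a [nbsp] or [nnbsp] could be proposed.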

verdy_p (verdy-p) wrote :

Nicolas : all those non-breaking spaces (fines) around punctuations should be narrow, including those after "«" and before "»" which are not different from those before ";:!?".

What is true is that THINSP should never be used in texts, because it is breaking, even if most fonts define no mapping for NNBSP but only for THINSP. The main reason is that the difference between breaking and non-breaking is NEVER handled by fonts themselves, but always by plain-text renderers and by layout engines for rich-text renderers.

The same is true for the unneeded mapping of NBSP (U+00A0): fonts can just define the mapping for the standard ASCII SPACE (U+0020), and renderers will use the same glyph to represent both. Why? Simply because fonts are not true programs and can't perform any layout. All that fonts define is a table of glyphs to map for a selected subset of the UCS (if the UCS encoding is used in that font).

Fonts also don't have any mapping for controls (except the special format controls needed to control the joining behavior of ligatures within GSUB / GDEF substitution rules, where ZWJ and ZWNJ may be given specific roles without defining any glyph themselves, and possibly also CGJ, which is completely transparent in fonts and is used to block the canonical relative reordering of combining diacritics), so they don't map anything for the line-breaking CR or LF, or for TAB: if they do, they just map them to the same glyph as the normal SPACE.

> Nicolas : all those non-breaking spaces (fines) around punctuations
> should be narrow, including those after "«" and before "»" which are
> not different from those before ";:!?".
Well, it seems to depend on the typographer ;-)
(but I agree, it's more logical)

verdy_p (verdy-p) wrote :

Also please reopen Bug #608631

Your unilateral decision to make it "Won't fix" was extremely bad and damaging, because it was absolutely not thought through. Visual tags are needed, adding more visual tags for "problematic" or easily confusable characters is needed, and the GUI form can also be improved to explain what these visual tags mean.
