Comment 11 for bug 44919

Diogo Matsubara (matsubara) wrote :

OOPS-874A388 is a recent occurrence.

<matsubara> flacoste: hi, did you file a bug about the badly encoded query string OOPS?
<flacoste> matsubara: no
<flacoste> matsubara: i would consider it part of an existing one actually, i don't think it's different
<matsubara> flacoste: which one?
<flacoste> bug 44919
<flacoste> it's the fact that when the query isn't encoded properly it's a bytes string instead of unicode
<flacoste> there might be a way to blanket those
<flacoste> into a UFD
<flacoste> wouldn't solve the issues with fields sent using POST though
<flacoste> maybe we could also blanket that at the publication level
<BjornT> flacoste: well, the question is whether UFD is the right thing there. it's the browser that is misbehaving. UFD will be just as bad as an oops for the user. the user won't understand what is wrong, and will think that launchpad is broken.
<flacoste> BjornT: right, it's just as bad for the user, but it won't be part of our OOPS report anymore :-)
<flacoste> which I think is what is annoying kiko
<flacoste> BjornT: a "proper" fix might be to try to parse those strings using the first value in HTTP_ACCEPT_ENCODING
<flacoste> since that's what it seems to be in these cases
<BjornT> flacoste: true. i would use a different exception, though (like BrokenUserAgent)
<flacoste> right
<matsubara> flacoste: what you suggest is like trying to detect what encoding they're sending to us?
<flacoste> matsubara: well not auto-detect, but guess for broken one based on their HTTP_ACCEPT_ENCODING
<flacoste> and only try the first value
<matsubara> flacoste: can't we use chardet for that?
<flacoste> matsubara: what charset are you talking about?
<matsubara> flacoste: http://chardet.feedparser.org/
<BjornT> flacoste: you can't use HTTP_ACCEPT_ENCODING. you have to use HTTP_ACCEPT_CHARSET, which most of these oops seem to lack...
<BjornT> i think replacing undecodable characters with ? would do fine.
<flacoste> BjornT: well, the two OOPSes that kiko pasted do contain a HTTP_ACCEPT_CHARSET header, but the first value is wrong in the first case and would work in the second case, so that's not bullet-proof
<flacoste> chardet might be better
<flacoste> well, using ? causes data lossage on the client
<flacoste> so i say either raise BrokenUserAgent
<flacoste> or try to detect encoding using chardet, although that might also lead to data lossage
<BjornT> flacoste: i think some data lossage is ok. if the user has a broken browser, he's probably used to not being able to use non-ascii characters...
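
To make the options discussed above concrete, here is a minimal sketch of the fallback chain: strict UTF-8 first, then the first HTTP_ACCEPT_CHARSET value, then either replacing undecodable bytes with ? or raising BrokenUserAgent. It is an illustration only, not Launchpad's actual code. The function names, the `lossy` flag, the choice of UTF-8 as the first attempt, and the BrokenUserAgent class body are all assumptions made for this example, and it is written for Python 3.

```python
import codecs


class BrokenUserAgent(Exception):
    """Hypothetical exception for requests whose bytes cannot be decoded."""


def first_accept_charset(header):
    """Return the first usable charset in an Accept-Charset value, or None."""
    if not header:
        return None
    # Take the first list entry and drop any ";q=..." parameter.
    first = header.split(",")[0].split(";")[0].strip()
    if not first or first == "*":
        return None
    try:
        codecs.lookup(first)  # reject names Python has no codec for
    except LookupError:
        return None
    return first


def decode_query_value(raw, accept_charset=None, lossy=True):
    """Decode raw query-string bytes to text (assumed helper, not a real API).

    Order: strict UTF-8, then the first Accept-Charset value, then either
    a lossy decode (undecodable bytes become "?") or BrokenUserAgent.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    guess = first_accept_charset(accept_charset)
    if guess is not None:
        try:
            return raw.decode(guess)
        except UnicodeDecodeError:
            pass
    if lossy:
        # errors="replace" yields U+FFFD; map it to "?" as suggested above.
        return raw.decode("utf-8", errors="replace").replace("\ufffd", "?")
    raise BrokenUserAgent("undecodable query string: %r" % (raw,))
```

As noted in the discussion, the Accept-Charset guess is not bullet-proof. A single-byte charset such as ISO-8859-1 decodes any byte sequence without error, so a wrong first value produces garbled text silently rather than an exception. Passing `lossy=False` corresponds to the BrokenUserAgent option, and the default corresponds to accepting some data loss.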