Comment 6 for bug 605775

Martin Pool (mbp) wrote : Re: [Bug 605775] Re: Loggerhead doesn't support linking to the raw content

On 7 December 2010 15:51, Max Kanat-Alexander <email address hidden> wrote:
> The only URL that uses file ids by default still is the download URL.
> The raw controller listed in that MP is using paths like all of the
> other controllers, and once that MP goes in, it will be very simple to
> also make the download URL use paths. (The general architecture of
> loggerhead also still allows using file ids in the query string
> parameters if you want.)
>
> As far as how the raw view is supposed to be used, I suppose there are
> two cases:
>
> 1) Somebody wants to see the raw content of the file quickly without any of the view or annotation issues.
> 2) Somebody wants to use loggerhead to serve some content.
>
> At first I thought that #2 wasn't going to be feasible, but now I've
> discovered that the raw view is so fast that it could actually be done.

That is pretty cool.

> For #1, you could certainly say, "just serve everything as text/plain",
> but that doesn't actually solve the problem of XSS, because IE 7 and
> below will still sniff the content and render it as whatever the browser
> *thinks* it is. I believe it's only IE 8 and above that support
> X-Content-Type-Options.
>
> So I figured I'd go with the most logical choice and attempt to serve
> the content with its actual, correct MIME type. That's particularly
> valuable for binary files like images or other media, which couldn't
> have a raw view otherwise.
>
> One advantage to this also would be that it gives people the ability to
> rapidly get a single file out of bzr without having to check out an
> entire repository.
>
> For the most part, controlling the MIME type of a file is only the
> illusion of security, which is worse than no security (because it makes
> people believe that they are secure when they are not).
>
> The solution to the XSS problem is very much doable, and it would just
> involve having a secondary domain for serving raw content. I started to
> implement it as part of the above MP, but it turned out to be more
> complicated than I was expecting, so I wanted to save it for a second
> patch, since patches should generally be small and focused so that they
> can be polished and debugged appropriately (among many other important
> reasons to keep changes small and focused).

I wasn't suggesting just changing the mime type to text/plain, and I
agree that's not enough. What I was suggesting was to serve an html
page that contains nothing but the entity-escaped plain text of the
file. For "just show me the plain text" this seems like an easy/safe
solution, though it will be much less useful if you want a URL for a
machine download. However, in the second case probably attachment
disposition is enough.
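
A minimal sketch of the escaped-text idea, assuming only Python's standard
`html` module. The function name `render_plain_text_page` is hypothetical and
not part of loggerhead; it just shows that the file content reaches the
browser as inert text inside an HTML page:

```python
import html


def render_plain_text_page(file_bytes, encoding="utf-8"):
    """Return (headers, body) for an HTML page holding only escaped text.

    Hypothetical helper for illustration; not loggerhead's actual API.
    """
    # Decode leniently so arbitrary file content cannot raise here.
    text = file_bytes.decode(encoding, errors="replace")
    # Entity-escape <, >, & and quotes so no markup in the file is live.
    escaped = html.escape(text, quote=True)
    body = (
        "<!DOCTYPE html>\n"
        "<html><body><pre>" + escaped + "</pre></body></html>"
    )
    headers = [("Content-Type", "text/html; charset=utf-8")]
    return headers, body.encode("utf-8")
```

Because the response is explicitly HTML, there is nothing left for the
browser to sniff: a file containing `<script>` comes out as `&lt;script&gt;`.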

So I think we have three cases:

1- I want to paste a URL to wget: use /download and get it as
disposition: attachment
2- I want to see just the text in a web browser, without frills: use
/text and you get content-type: text/html, escaped
3- I want to serve binary assets direct from loggerhead: use /binary
and you get what this patch does, either from the same domain or a
different one. Launchpad would presumably have this off for the
moment; at any rate I don't feel an urgent need to turn it on.
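
The three cases could be sketched as a choice of response headers. This is
only an illustration under stated assumptions: `response_headers` and the
`kind` values are hypothetical names, the MIME guess uses Python's standard
`mimetypes` module rather than whatever the patch actually does, and the
separate-domain question is left out entirely:

```python
import mimetypes


def response_headers(kind, filename):
    """Pick headers for the /download, /text and /binary cases.

    Hypothetical helper for illustration; not loggerhead's actual API.
    """
    if kind == "download":
        # Keep only conservative characters so the header stays well-formed.
        safe = "".join(
            c if c.isalnum() or c in "._-" else "_" for c in filename
        )
        return [
            ("Content-Type", "application/octet-stream"),
            ("Content-Disposition", 'attachment; filename="%s"' % safe),
        ]
    if kind == "text":
        # The body is expected to be an HTML page of entity-escaped text.
        return [("Content-Type", "text/html; charset=utf-8")]
    if kind == "binary":
        # Serve the guessed type, falling back to a generic binary type.
        guessed, _encoding = mimetypes.guess_type(filename)
        return [("Content-Type", guessed or "application/octet-stream")]
    raise ValueError("unknown kind: %r" % (kind,))
```

Only the "binary" case hands the browser content that it may render as
active markup, which is why that is the one to keep switched off or move
to another domain.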

I'm not sure if using a single separate domain will be enough if you
have mutually-untrusting branches served from the same machine.

--
Martin