Feed URL's and ID best practices are unclear

Bug #175482 reported by Islam Alshaikh إسلام الشيخ
Affects: Launchpad itself
Status: Fix Released
Importance: High
Assigned to: Brad Crittenden

Bug Description

The RSS feed URL and entry URLs in the Announcement feeds are not correct. From the code, it is not clear where those URLs should point. Should they point back at the feeds themselves, or should they point at the specific web-presented content that they are syndicating?

In the case of Announcement feeds, should:

 (1) the Feed.url point to:
   * feeds.launchpad.net/project/announcements.atom or
   * launchpad.net/project/

 (2) the FeedEntry.url point to:
   * the Announcement in Launchpad, or
   * the Announcement.url (i.e. the web page of the announcement if it is elsewhere)

From the IFeed interface it is not at all clear what the best practice is for each of the various URLs and IDs that make up a feed or a feed entry.

Mark Shuttleworth (sabdfl) wrote :

Can we clarify what is meant by this? There are two links, IIRC. One for the feed as a whole, and one for each entry. Is that correct? Can you give me exact examples of where those two should point?

Changed in launchpad:
assignee: nobody → sabdfl
importance: Undecided → High
status: New → Incomplete
Mark Shuttleworth (sabdfl) wrote :

Folks, I've just subscribed a couple of LP developers who have worked on the feeds infrastructure. This bug was originally filed as an issue with the announcements feeds, about where the various URLs in the feed should be pointing. But it seems to me that the real problem is that there is no *in code* documentation as to the best practices to follow, for Feed and FeedEntry subclasses, for:

 * the ids
 * the urls (link_alternate etc)

At the moment, this is assigned to me because of the way it affects announcement feeds but I would also like to draw the attention of the feed subsystem architects to the problem and ask them to fix the interfaces and the docstrings. Also, I would like to reiterate that we should have a super-commented implementation, and I would be delighted to make that the announcements.py codebase.

description: updated
Leonard Richardson (leonardr) wrote :

Hopefully this little essay will clear things up. I've found a few errors, most of them stemming from incorrect use of the "alternate" relationship between links.

= Feeds =

There are three URLs mentioned in the interface. The only problem here
is that "alternate_url" is used incorrectly.

site_url: Only used internally as a root address. Shows up indirectly
as the root of the URL to the icon.

url: The URL to the feed resource itself. So for:

  http://feeds.launchpad.net/ubuntu/announcements.atom

 the url would be:

  http://feeds.launchpad.net/ubuntu/announcements.atom

 and in the feed document I expect to see:

  <link rel="self"
          href="http://feeds.launchpad.net/ubuntu/announcements.atom"/>

 which is correct.

alternate_url: The URL to a resource that is the human-readable
 equivalent of the feed. So for:

  http://feeds.launchpad.net/ubuntu/announcements.atom

 the alternate_url would be:

  http://launchpad.net/ubuntu/+announcements

 Both those URLs are in some sense "the same thing": the list of announcements
 for Ubuntu. In http://launchpad.net/ubuntu/+announcements we see this
 markup:

  <link rel="alternate" type="application/atom+xml"
        href="http://feeds.launchpad.net/ubuntu/announcements.atom"
        title="Announcements for Ubuntu" />

 Similarly, in http://feeds.launchpad.net/ubuntu/announcements.atom
 I expect to see this markup:

    <link rel="alternate"
          href="https://launchpad.net/ubuntu/+announcements"/>

 Instead I see this, which is wrong, and probably causing some
 confusion.

    <link rel="alternate"
          href="https://launchpad.net/ubuntu"/>

 The Ubuntu project is not 'the same thing' as the list of Ubuntu
 announcements. This would be acceptable (if suboptimal) if
 https://launchpad.net/ubuntu were the only place you could go to see
 Ubuntu announcements, but fortunately we have
 https://launchpad.net/ubuntu/+announcements. The alternate_url should
 be changed.
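
The feed-level rules above can be sketched in Python. This is a minimal illustration, not Launchpad's actual feed code: the FeedLinks class and the atom_links function are hypothetical names, and the site_url value is an assumed example. Only the three URL roles (site_url, url, alternate_url) and the two feed URLs come from the discussion.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class FeedLinks:
    """Hypothetical container for the three feed-level URLs."""
    site_url: str        # root address, used internally (e.g. for the icon)
    url: str             # the feed resource itself -> rel="self"
    alternate_url: str   # human-readable equivalent -> rel="alternate"


def atom_links(links):
    """Build the <link> elements a feed document should carry."""
    return [
        ET.Element("link", {"rel": "self", "href": links.url}),
        ET.Element("link", {"rel": "alternate", "href": links.alternate_url}),
    ]


feed = FeedLinks(
    site_url="https://launchpad.net/",  # assumed value, for illustration only
    url="http://feeds.launchpad.net/ubuntu/announcements.atom",
    alternate_url="https://launchpad.net/ubuntu/+announcements",
)
rendered = [ET.tostring(el, encoding="unicode") for el in atom_links(feed)]
for line in rendered:
    print(line)
```

The point of keeping url and alternate_url as separate, commented fields is that neither can silently stand in for the other: the "self" link always names the feed, and the "alternate" link always names the page that lists the same announcements.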

= Entries =

You've got one URL here, link_alternate, which is used
incorrectly. There are also two other URLs you could use here, one of
which (link_via) will do what you're currently using link_alternate
for.

link_self (NEW): This isn't present but I thought I'd mention it. This
 is the link to the Atom entry itself. It's analogous to the "url"
 you've defined for a feed. If it were present, it would look like this in
 the Atom feed at
 https://feeds.launchpad.net/ubuntu/announcements.atom:

    <link rel="self"
          href="https://feeds.launchpad.net/ubuntu/announcements/2.atom"/>

 It's not present because you don't give individual announcements
 their own URLs. We're going to start doing something like this for
 web services, but you don't have to.

link_alternate: Analogous to the feed's alternate_url. Currently the
 feed's alternate_url is wrong and this is wrong in an analogous way.

 This is the URL to a different representation of 'the same thing' as
 this entry. Right now it looks like this:

    <link rel="alternate"
          href="http://www.ubuntu.com/news/macedonia-school-computers"/>

 This is wrong; http://www.ubuntu.com/news/macedonia-school-computers
 is not 'the same thing' as the second entry in the U...

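
For entries, the same distinction might be sketched as follows. This is an illustration only: entry_links is a hypothetical helper, the Launchpad announcement URL is a made-up example of an on-site page for the entry, and treating that page as the "alternate" target is an assumption, since the explanation above is cut off. What the discussion does state is that the external news page fits rel="via" rather than rel="alternate".

```python
import xml.etree.ElementTree as ET


def entry_links(alternate_url, via_url=None):
    """Return the <link> elements for one Atom entry.

    alternate_url: another representation of 'the same thing' as the entry.
    via_url: the external source the entry was derived from, if any.
    """
    links = [ET.Element("link", {"rel": "alternate", "href": alternate_url})]
    if via_url is not None:
        links.append(ET.Element("link", {"rel": "via", "href": via_url}))
    return links


links = entry_links(
    # Hypothetical on-site URL for the announcement; not taken from the thread.
    alternate_url="https://launchpad.net/ubuntu/+announcement/2",
    via_url="http://www.ubuntu.com/news/macedonia-school-computers",
)
rels = [el.get("rel") for el in links]
for el in links:
    print(ET.tostring(el, encoding="unicode"))
```

Making via_url optional reflects that not every announcement has an external page; when it is absent, only the "alternate" link is emitted.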

Brad Crittenden (bac)
Changed in launchpad:
assignee: sabdfl → bradcrittenden
Mark Shuttleworth (sabdfl) wrote : Re: [Bug 175482] Re: Feed URL's and ID best practices are unclear

Leonard, the issue is not specific to the implementation in
Announcements. The issue is that the interfaces do not explain best
practice, so EVERY Launchpad developer who tries to implement Feeds is
going to get it wrong. First, the API naming should be improved, so that
things like link_alternate actually say what they are. Second, the
docstrings should be clearer. Third, one of the Feed implementations (I
would be happy for it to be the Announcements one) should be designated
as a reference implementation, and LOTS of comments (paragraphs of them)
added. This way, people can cargo-cult that implementation, read the
comments, tweak and amend, and get it right.

W.r.t. the tag: url, I was not trying to use a real URL fragment; I was
trying to ensure that the "announcements" space was separated from any
other space, so that someone who does an RSS feed for, say, bugs
does not inadvertently clash in namespaces. The model we have used
elsewhere is that "namespaces" are indicated by +. That's why we have
/ubuntu/+source/apache. It means that "source" is not a distroseries,
it's a new namespace under distro.
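
The namespacing idea can be sketched with a small helper. This is an illustration under stated assumptions, not the code in feeds/announcements.py: make_entry_id is a hypothetical function, and the date and identifier values are placeholders. It follows the general tag: URI layout (authority, comma, date, colon, specific part) and puts a "+"-prefixed namespace in the specific part so that ids for different kinds of feed cannot collide.

```python
def make_entry_id(authority, date, namespace, ident):
    """Build a tag: URI whose specific part starts with a '+' namespace.

    authority: domain that mints the id, e.g. "launchpad.net"
    date:      a date (YYYY-MM-DD) on which the authority owned that domain
    namespace: kind of object, e.g. "announcement" or "bug"
    ident:     identifier unique within that namespace
    """
    return "tag:%s,%s:/+%s/%s" % (authority, date, namespace, ident)


# Placeholder values, chosen only for illustration.
announcement_id = make_entry_id("launchpad.net", "2007-12-11", "announcement", 2)
bug_id = make_entry_id("launchpad.net", "2007-12-11", "bug", 2)

# Same numeric identifier, different namespaces, so the ids differ.
assert announcement_id != bug_id
print(announcement_id)
print(bug_id)
```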

Please feel free to update feeds/announcements.py accordingly, designate
it the reference implementation, and comment it in detail!

Thanks,
Mark

Brad Crittenden (bac) wrote :

Mark's and Leonard's feedback is being incorporated into a fix for the announcements feeds. Many of the issues do affect all feeds and the code has been refactored to take those changes into account.

Leonard's suggestions for "link self" and "link via" for entries have been noted but will be deferred from the fix for this bug.

Brad Crittenden (bac) wrote :

Committed in RF 5455

Changed in launchpad:
status: Incomplete → Fix Committed
Elliot Murphy (statik)
Changed in launchpad:
milestone: none → 1.2.1
Changed in launchpad:
status: Fix Committed → Fix Released