DistroSeries:+queue Timeout accepting many packages queue page

Bug #276950 reported by Diogo Matsubara
46
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Launchpad itself
Fix Released
Critical
Unassigned

Bug Description

The newest known at the moment is OOPS-1907G1439

Branch: launchpad-rev-12632.1
Revno: 12633
SQL time: 11140 ms
Non-sql time: 1809 ms
Total time: 12949 ms
Statement Count: 284

278. 2016 9942ms SQL-launchpad-main-master
SELECT PackageUpload.archive,
       PackageUpload.changesfile,
       PackageUpload.date_created,
       PackageUpload.distroseries,
       PackageUpload.id,
       PackageUpload.pocket,
       PackageUpload.signing_key,
       PackageUpload.status
FROM PackageUpload
WHERE packageupload.distroseries = 104
  AND packageupload.archive IN (1, 534)
  AND packageupload.status IN (0)
ORDER BY PackageUpload.id DESC LIMIT 31
OFFSET 0

Related branches

description: updated
Celso Providelo (cprov)
Changed in soyuz:
milestone: none → pending
Revision history for this message
Björn Tillenius (bjornt) wrote :

I'm going to take a look and see I there's something I can do.

Changed in soyuz:
assignee: nobody → bjornt
Changed in soyuz:
assignee: Björn Tillenius (bjornt) → nobody
tags: added: queue-page
Changed in soyuz:
status: New → Triaged
importance: Undecided → High
summary: - Timeout accepting more than 6 packages in queue page
+ DistroSeries:+queue Timeout accepting many packages queue page
description: updated
Revision history for this message
Scott Kitterman (kitterman) wrote :

Visibly it 'feels' faster recently and I've not had this issue since the most current LP release. I'll post an updated OOPS when it happens again.

Revision history for this message
Scott Kitterman (kitterman) wrote :

Conveniently it happened almost right away:

OOPS-1721K2125

Revision history for this message
Scott Kitterman (kitterman) wrote :

Tried again (this is 4 source packages)

https://launchpad.net/ubuntu/maverick/+queue?queue_state=1&queue_text=

OOPS-1721E1997

Worked on the third try.

description: updated
Revision history for this message
William Grant (wgrant) wrote :

The repeated DistroArchSeries and Archive queries come mostly from SourcePackageRelease.getBuildByArch. It has a bad habit of retrieving the matching DAS for every series in the ancestry, each with a separate query. It then goes on to find every distro archive for every ancestor series... roughly three queries per ancestor per build. Those can be collapsed into a total of two queries per build. That should cut out roughly 400 of the 1400 queries in OOPS-1721E1997, which is a respectable start. A lot of the rest is to do with bug notifications.

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 276950] Re: DistroSeries:+queue Timeout accepting many packages queue page

Is there any reason you can't do one query to get the data for all the builds?

Revision history for this message
Julian Edwards (julian-edwards) wrote :

> DistroArchSeries.architecturetag = %s ORDER BY DistroArchSeries.id +
> + Consolidating those into just-query-it-once would save 2.5 seconds
> + straight up.
> +
> + Its also possible that the view is preparing stuff needed for GET on
> + POST, which would be wasteful.

It's not, it redirects after a POST.

> + The big killer however is ~ 600 individual bug lookups, which I am
> + guessing is being driven by notifications or something : if so we'll
> + want to restructure the events being raised.

Correct - uploads that close bugs. All of that closure/notification code is
in the lp.bugs tree.

This page is *horrendously* difficult to optimise. I've been chipping away at
it for 3 years now. The problem is that it pulls data from all over the place
and the data it needs depends on other data which depends on other data. I
know of one definite place that can be attacked that has not been optimised
yet.

Revision history for this message
Robert Collins (lifeless) wrote :

On Mon, Sep 20, 2010 at 8:17 PM, Julian Edwards
<email address hidden> wrote:
>> DistroArchSeries.architecturetag = %s ORDER BY DistroArchSeries.id +
>> + Consolidating those into just-query-it-once would save 2.5 seconds
>> + straight up.
>> +
>> + Its also possible that the view is preparing stuff needed for GET on
>> + POST, which would be wasteful.
>
> It's not, it redirects after a POST.

BTW, since I made that guess, wgrant confirmed on IRC that the POST is
prepping the data for the GET, then ignoring it, then doing its work,
and finally the redirect -> the GET that users see.

-Rob

Revision history for this message
William Grant (wgrant) wrote :

It doesn't prepare all the GET data, but it does do some of it. The fix is trivial (move the performQueueAction call into initialize()).

A lot of the remaining time is indeed bug notifications, which are causing problems elsewhere anyway.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

> BTW, since I made that guess, wgrant confirmed on IRC that the POST is
> prepping the data for the GET, then ignoring it, then doing its work,
> and finally the redirect -> the GET that users see.
>
> -Rob

Spe4ial. :/

tags: added: pg83
Revision history for this message
Scott Kitterman (kitterman) wrote :

Here's a newer oops: (Error ID: OOPS-1768F1835)

Stuart Bishop (stub)
tags: removed: pg83
Revision history for this message
Robert Collins (lifeless) wrote :

The slow query in the updated description is slow on qastaging:

SELECT PackageUpload.archive, PackageUpload.changesfile, PackageUpload.date_created, PackageUpload.distroseries, PackageUpload.id, PackageUpload.pocket, PackageUpload.signing_key, PackageUpload.status FROM PackageUpload WHERE packageupload.distroseries = 103 AND packageupload.archive IN (1, 534) AND packageupload.status IN (0) ORDER BY PackageUpload.id DESC LIMIT 31 OFFSET 0;
 archive | changesfile | date_created | distroseries | id | pocket | signing_key | status
---------+-------------+----------------------------+--------------+---------+--------+-------------+--------
       1 | 57505775 | 2010-10-12 15:35:10.566515 | 103 | 2153607 | 30 | 5446 | 0
(1 row)

Time: 1625.788 ms

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.00..824.59 rows=31 width=36) (actual time=0.575..1627.040 rows=1 loops=1)
   -> Index Scan Backward using distroreleasequeue_pkey on packageupload (cost=0.00..81368.69 rows=3059 width=36) (actual time=0.571..1627.035 rows=1 loops=1)
         Filter: ((archive = ANY ('{1,534}'::integer[])) AND (distroseries = 103) AND (status = 0))
 Total runtime: 1627.106 ms
(4 rows)

description: updated
description: updated
Changed in launchpad:
importance: High → Critical
Revision history for this message
William Grant (wgrant) wrote :

The current indices have been inappropriate since ArchiveRework in 2006-7, but the planner still makes bad choices. adding an index on packageupload(archive, distroseries, status, id) brings the query down to <1ms.

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (8.1 KiB)

SELECT DISTINCT BinaryPackageName.id,
                BinaryPackageName.name
FROM BinaryPackageName,
     BinaryPackagePublishingHistory,
     BinaryPackageRelease,
     DistroArchSeries
WHERE BinaryPackagePublishingHistory.binarypackagerelease = BinaryPackageRelease.id
  AND BinaryPackagePublishingHistory.distroarchseries = DistroArchSeries.id
  AND DistroArchSeries.distroseries = 103
  AND BinaryPackagePublishingHistory.status IN (2,
                                                1)
  AND BinaryPackagePublishingHistory.archive IN (1,
                                                 534)
  AND BinaryPackageRelease.binarypackagename = BinaryPackageName.id
  AND BinaryPackageName.id IN (151425,
                               151426,
                               67843,
                               67846,
                               112519,
                               151432,
                               96654,
                               112530,
                               151427,
                               175764,
                               85790,
                               85791,
                               77217,
                               77218,
                               77219,
                               77221,
                               77222,
                               77223,
                               77224,
                               81577,
                               134905,
                               151430,
                               73520,
                               71474,
                               73526,
                               73527,
                               175544,
                               100925,
                               175553,
                               160578,
                               160579,
                               88773,
                               81144,
                               104275,
                               151431,
                               88789,
                               120023,
                               86619,
                               86620,
                               67841,
                               67937,
                               67938,
                               81768,
                               84716,
                               175763,
                               119540,
                               182134,
                               182135,
                               146424,
                               146425,
                               126586,
                               67838,
                               133717)
  AND (1=1)

->

cold

 HashAggregate (cost=15994.44..15994.97 rows=53 width=29) (actual time=201944.860..201944.889 rows=53 loops=1)
   -> Hash Join (cost=165.86..15994.14 rows=60 width=29) (actual time=21018.844..201943.066 rows=720 loops=1)
         Hash Cond: (binarypackagepublishinghistory.distroarchseries = distroarchseries.id)
         -> Nested Loop (cost=163.74..15988.25 rows=843 width=33) (actual time=1090.569..201937.187 rows=5186 loops=1)
               -> Nested Loop (cos...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (5.0 KiB)

This is the core work:
select BinaryPackageRelease.binarypackagename from
     BinaryPackagePublishingHistory,
     BinaryPackageRelease,
     DistroArchSeries
WHERE BinaryPackagePublishingHistory.binarypackagerelease = BinaryPackageRelease.id
  AND BinaryPackagePublishingHistory.distroarchseries = DistroArchSeries.id
  AND DistroArchSeries.distroseries = 103
  AND BinaryPackagePublishingHistory.status IN (2, 1)
  AND BinaryPackagePublishingHistory.archive IN (1, 534)
  AND BinaryPackageRelease.binarypackagename IN (151425,
                               151426, 67843, 67846, 112519,
                               151432, 96654, 112530, 151427,
                               175764, 85790, 85791, 77217,
                               77218, 77219, 77221, 77222,
                               77223, 77224, 81577, 134905,
                               151430, 73520, 71474, 73526,
                               73527, 175544, 100925, 175553,
                               160578, 160579, 88773, 81144,
                               104275, 151431, 88789, 120023,
                               86619, 86620, 67841, 67937,
                               67938, 81768, 84716, 175763,
                               119540, 182134, 182135, 146424,
                               146425, 126586, 67838, 133717);

 Hash Join (cost=179847.88..408910.32 rows=4516 width=4) (actual time=1006.846..2036.877 rows=720 loops=1)
   Hash Cond: (binarypackagepublishinghistory.binarypackagerelease = binarypackagerelease.id)
   -> Hash Join (cost=24542.79..252348.43 rows=96931 width=4) (actual time=502.347..1714.819 rows=215414 loops=1)
         Hash Cond: (binarypackagepublishinghistory.distroarchseries = distroarchseries.id)
         -> Bitmap Heap Scan on binarypackagepublishinghistory (cost=24540.67..246288.13 rows=1357031 width=8) (actual time=502.045..1173.741 rows=1752450 loops=1)
               Recheck Cond: ((arch...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :

Most of the cost is here:

         -> Hash (cost=154514.79..154514.79 rows=63224 width=8) (actual time=116.328..116.328 rows=67260 loops=1)
               -> Bitmap Heap Scan on binarypackagerelease (cost=1770.43..154514.79 rows=63224 width=8) (actual time=25.402..88.795 rows=67260 loops=1)
                     Recheck Cond: (binarypackagename = ANY ('{151425,151426,67843,67846,112519,151432,96654,112530,151427,175764,85790,85791,77217,77218,77219,77221,77222,77223,77224,81577,134905,151430,73520,71474,73526,73527,175544,100925,175553,160578,160579,88773,81144,104275,151431,88789,120023,86619,86620,67841,67937,67938,81768,84716,175763,119540,182134,182135,146424,146425,126586,67838,133717}'::integer[]))
                     -> Bitmap Index Scan on binarypackagerelease_binarypackagename_key (cost=0.00..1754.62 rows=63224 width=0) (actual time=18.315..18.315 rows=67260 loops=1)
                           Index Cond: (binarypackagename = ANY ('{151425,151426,67843,67846,112519,151432,96654,112530,151427,175764,85790,85791,77217,77218,77219,77221,77222,77223,77224,81577,134905,151430,73520,71474,73526,73527,175544,100925,175553,160578,160579,88773,81144,104275,151431,88789,120023,86619,86620,67841,67937,67938,81768,84716,175763,119540,182134,182135,146424,146425,126586,67838,133717}'::integer[]))

Revision history for this message
Robert Collins (lifeless) wrote :
Download full text (4.2 KiB)

with bpr as (select binarypackagerelease.id, binarypackagename from binarypackagerelease where binarypackagename in (151425,
                               151426, 67843, 67846, 112519,
                               151432, 96654, 112530, 151427,
                               175764, 85790, 85791, 77217,
                               77218, 77219, 77221, 77222,
                               77223, 77224, 81577, 134905,
                               151430, 73520, 71474, 73526,
                               73527, 175544, 100925, 175553,
                               160578, 160579, 88773, 81144,
                               104275, 151431, 88789, 120023,
                               86619, 86620, 67841, 67937,
                               67938, 81768, 84716, 175763,
                               119540, 182134, 182135, 146424,
                               146425, 126586, 67838, 133717))
select bpr.binarypackagename from
     bpr,
     BinaryPackagePublishingHistory
WHERE BinaryPackagePublishingHistory.binarypackagerelease =bpr.id
  AND BinaryPackagePublishingHistory.distroarchseries in ( 97, 98, 96, 99, 101, 102)
  AND BinaryPackagePublishingHistory.status IN (2, 1)
  AND BinaryPackagePublishingHistory.archive IN (1, 534);

about 800:

 Merge Join (cost=425359.81..427362.24 rows=63224 width=4) (actual time=1825.164..1891.782 rows=720 loops=1)
   Merge Cond: (bpr.id = binarypackagepublishinghistory.binarypackagerelease)
   CTE bpr
     -> Bitmap Heap Scan on binarypackagerelease (cost=1770.43..154514.79 rows=63224 width=8) (actual time=43.265..141.848 rows=67260 loops=1)
           Recheck Cond: (binarypackagename = ANY ('{151425,151426,67843,67846,112519,151432,96654,112530,151427,175764,85790,85791,77217,77218,77219,77221,77222,77223,77224,81577,134905,151430,73520,71474,73526,73527,175544,100925,175553,160578,160579,88773,81144,104275,151431,88789,120023,86619,86620,67841,67937,67938,81768,84716,175763,119540,182134,182135,146424,146425,126586,67838,133717}'::integer[]))
           -> Bitmap Index Scan on binarypackagerelease_binarypackagename_key (cost=0.00..1754.62 rows=63224 width=0) (actual time=30.468..30.468 rows=67260 loops=1)
                 Index Cond: (binarypackagename = ANY ('{151425,151426,67843,67846,112519,151432,96654,112530,151427,175764,85790,85791,77217,77218,77219,77221,77222,77223,77224,81577,134905,151430,73520,71474,73526,73527,175544,100925,175553,160578,160579,88773,81144,104275,151431,88789,120023,86619,86620,67841,67937,67938,81768,84716,175763,119540,182134,182135,146424,146425,126586,67838,133717}'::integer[]))
   -> Sort (cost=6306.02..6464.08 rows=63224 width=8) (actual time=280.025..302.738 rows=66539 loops=1)
         Sort Key: bpr.id
         Sort Method: quicksort Memory: 6225kB
         -> CTE Scan on bpr (cost=0.00..1264.48 rows=63224 width=8) (actual time=43.273..210.836 rows=67260 loops=1)
   -> Sort (cost=264539.00..265066.04 rows=210814 width=4) (actual time=1482.507..1534.476 rows=215414 loops=1)
         Sort Key: binarypackagepublishinghistory.binarypackagerelease
         Sort Method: external sort Disk: 3792kB
         -> Bitmap Heap Scan on binarypackagepu...

Read more...

description: updated
Revision history for this message
Robert Collins (lifeless) wrote :

slow query uses

 Limit (cost=0.00..1843.12 rows=31 width=36) (actual time=248.269..25253.418 rows=1 loops=1)
   -> Index Scan Backward using distroreleasequeue_pkey on packageupload (cost=0.00..93404.35 rows=1571 width=36) (actual time=248.266..25253.412 rows=1 loops=1)
         Filter: ((archive = ANY ('{1,534}'::integer[])) AND (distroseries = 104) AND (status = 0))
 Total runtime: 25253.503 ms

at the moment. 1.6 seconds hot.

Revision history for this message
Robert Collins (lifeless) wrote :

With a new index:
create index packageupload__distroseries__status__archive__idx on packageupload(distroseries, status, archive, id);'

we get

explain analyze SELECT PackageUpload.archive,
       PackageUpload.changesfile,
       PackageUpload.date_created,
       PackageUpload.distroseries,
       PackageUpload.id,
       PackageUpload.pocket,
       PackageUpload.signing_key,
       PackageUpload.status
FROM PackageUpload
WHERE packageupload.distroseries = 104
  AND packageupload.archive IN (1, 534)
  AND packageupload.status IN (0)
ORDER BY distroseries desc, status desc, archive desc, PackageUpload.id DESC LIMIT 31
 OFFSET 0;
                                                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.00..105.89 rows=31 width=36) (actual time=0.106..0.108 rows=1 loops=1)
   -> Index Scan Backward using packageupload__distroseries__status__archive__idx on packageupload (cost=0.00..5365.99 rows=1571 width=36) (actual time=0.104..0.106 rows=1 loops=1)
         Index Cond: ((distroseries = 104) AND (status = 0))
         Filter: (archive = ANY ('{1,534}'::integer[]))
 Total runtime: 0.159 ms
(5 rows)

Time: 1.931 ms

Revision history for this message
Robert Collins (lifeless) wrote :

Our existing query does not use the new index on its own.

Revision history for this message
Robert Collins (lifeless) wrote :

launchpad_qastaging=> \di+ packageupload__distroseries__status__archive__idx
                                                  List of relations
 Schema | Name | Type | Owner | Table | Size | Description
--------+---------------------------------------------------+-------+----------+---------------+-------+-------------
 public | packageupload__distroseries__status__archive__idx | index | postgres | packageupload | 72 MB |

Revision history for this message
Robert Collins (lifeless) wrote :

Our existing index gives a reasonable plan when changed to use a sort that uses the new index - and it should consistently get better results when the history is spread all over the place :

launchpad_qastaging=> explain analyze SELECT PackageUpload.archive,
       PackageUpload.changesfile,
       PackageUpload.date_created,
       PackageUpload.distroseries,
       PackageUpload.id,
       PackageUpload.pocket,
       PackageUpload.signing_key,
       PackageUpload.status
FROM PackageUpload
WHERE packageupload.distroseries = 104
  AND packageupload.archive IN (1, 534)
  AND packageupload.status IN (0)
ORDER BY distroseries desc, status desc, archive desc, PackageUpload.id DESC LIMIT 31
 OFFSET 0;
                                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=5400.68..5400.76 rows=31 width=36) (actual time=0.053..0.054 rows=1 loops=1)
   -> Sort (cost=5400.68..5404.61 rows=1571 width=36) (actual time=0.052..0.052 rows=1 loops=1)
         Sort Key: archive, id
         Sort Method: quicksort Memory: 25kB
         -> Index Scan using packageupload__distroseries__status__idx on packageupload (cost=0.00..5353.91 rows=1571 width=36) (actual time=0.032..0.034 rows=1 loops=1)
               Index Cond: ((distroseries = 104) AND (status = 0))
               Filter: (archive = ANY ('{1,534}'::integer[]))
 Total runtime: 0.116 ms
(8 rows)

(Note that strictly speaking this is a different sort order, but for the *specific* query in hand the only difference is that different archives will be grouped together in the output. I'm not sure why we have a limited query here on POST rather than a) get all or b) check for existence. Thats something for code spelunking.

Revision history for this message
Robert Collins (lifeless) wrote :

With this index: create index packageupload__distroseries__status__archive__idx on packageupload(archive, distroseries, status, id);

original query (packageupload.id desc):

                                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.00..1843.12 rows=31 width=36) (actual time=49.705..1815.657 rows=1 loops=1)
   -> Index Scan Backward using distroreleasequeue_pkey on packageupload (cost=0.00..93404.35 rows=1571 width=36) (actual time=49.700..1815.649 rows=1 loops=1)
         Filter: ((archive = ANY ('{1,534}'::integer[])) AND (distroseries = 104) AND (status = 0))
 Total runtime: 1816.044 ms
(4 rows)

ORDER BY archive desc, PackageUpload.id DESC gives

                                                                                         QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.00..1150.06 rows=31 width=36) (actual time=237.566..329.911 rows=1 loops=1)
   -> Index Scan Backward using packageupload__distroseries__status__archive__idx on packageupload (cost=0.00..58282.10 rows=1571 width=36) (actual time=237.563..329.906 rows=1 loops=1)
         Index Cond: ((distroseries = 104) AND (status = 0))
         Filter: (archive = ANY ('{1,534}'::integer[]))
 Total runtime: 329.973 ms

Revision history for this message
Robert Collins (lifeless) wrote :

create index packageupload__distroseries__status__archive__idx on packageupload(date_created, distroseries, archive) where status in (0,1);

launchpad_qastaging=> explain analyze SELECT PackageUpload.archive,
       PackageUpload.changesfile,
       PackageUpload.date_created,
       PackageUpload.distroseries,
       PackageUpload.id,
       PackageUpload.pocket,
       PackageUpload.signing_key,
       PackageUpload.status
FROM PackageUpload
WHERE packageupload.distroseries = 104
  AND packageupload.archive IN (1, 534)
  AND packageupload.status IN (0)
ORDER BY PackageUpload.date_created DESC LIMIT 31
 OFFSET 0;
                                                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.00..104.34 rows=31 width=36) (actual time=0.258..7.719 rows=1 loops=1)
   -> Index Scan Backward using packageupload__distroseries__status__archive__idx on packageupload (cost=0.00..5287.65 rows=1571 width=36) (actual time=0.257..7.716 rows=1 loops=1)
         Index Cond: (distroseries = 104)
         Filter: ((archive = ANY ('{1,534}'::integer[])) AND (status = 0))
 Total runtime: 7.787 ms
(5 rows)

Revision history for this message
Robert Collins (lifeless) wrote :

this index:
create index packageupload__distroseries__status__archive__idx on packageupload(id, distroseries, archive) where status in (0,1);

launchpad_qastaging=> explain analyze SELECT PackageUpload.archive,
       PackageUpload.changesfile,
       PackageUpload.date_created,
       PackageUpload.distroseries,
       PackageUpload.id,
       PackageUpload.pocket,
       PackageUpload.signing_key,
       PackageUpload.status
FROM PackageUpload
WHERE packageupload.distroseries = 104
  AND packageupload.archive IN (1, 534)
  AND packageupload.status IN (0)
ORDER BY PackageUpload.id DESC LIMIT 31
 OFFSET 0;
                                                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.00..104.34 rows=31 width=36) (actual time=0.155..6.775 rows=1 loops=1)
   -> Index Scan Backward using packageupload__distroseries__status__archive__idx on packageupload (cost=0.00..5287.65 rows=1571 width=36) (actual time=0.153..6.772 rows=1 loops=1)
         Index Cond: (distroseries = 104)
         Filter: ((archive = ANY ('{1,534}'::integer[])) AND (status = 0))
 Total runtime: 6.829 ms
(5 rows)

Revision history for this message
Robert Collins (lifeless) wrote :

better name:
create index packageupload__id_distroseries__archive__idx on packageupload(id, distroseries, archive) where status in (0,1);

Revision history for this message
Robert Collins (lifeless) wrote :

There may be other problems lurking, this is quite an old bug after all. However we've added the index and *that* cause should be gone.

We can reopen this or open a new clean bug if problems reoccur.

Changed in launchpad:
status: Triaged → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.