Comment 3 for bug 1066775

Raphaël Badin (rvb) wrote :

This is mainly because the call to NodesHandler.list() is very expensive: 3 SQL queries are issued for each node.

The fact that we return the fields 'macaddress_set' and 'tag_names' is responsible for this (http://paste.ubuntu.com/1285463/).

In a single node record (http://paste.ubuntu.com/1285467/), the macaddress_set bit is expensive to compute: it has to fetch the MAC addresses and then it fetches the node itself (again) to compute the 'resource_uri' bit. This is obviously fairly stupid, but the problem is that we can't use "nodes = nodes.prefetch_related('macaddress_set__node')" because internally piston uses queryset.iterator() (https://docs.djangoproject.com/en/dev/ref/models/querysets/#iterator) to iterate over the query, and this discards any optimization made with prefetch_related(). I've tried adding the prefetch_related() call and changing the source of piston to use queryset.all() instead of queryset.iterator(), and it works fine.

For the tag_names stuff, it's the same problem (the fact that piston uses queryset.iterator() rules out the use of prefetch_related()), but on top of that it seems that values_list() can't work with prefetch_related() either. If I change piston to use queryset.all() instead of queryset.iterator() and change tag_names to not use values_list() (http://paste.ubuntu.com/1285476/), then the number of queries is constant.
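
To make the difference in query counts concrete, here is a rough standalone sketch. It is not MAAS or piston code: it uses plain sqlite3 instead of the Django ORM, and the table names, the CountingDB helper and both list_nodes_* functions are made up for illustration. The first function mimics the current per-node pattern (MAC addresses, the node again, then tags), the second mimics what prefetching would give us (one query per table, joined in Python).

```python
import sqlite3
from collections import defaultdict


class CountingDB:
    """Hypothetical helper: wraps a connection and counts the queries issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def make_db(n_nodes):
    """Build an in-memory database with one MAC address and one tag per node."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE node (id INTEGER PRIMARY KEY, hostname TEXT);
        CREATE TABLE macaddress (id INTEGER PRIMARY KEY, node_id INTEGER, mac TEXT);
        CREATE TABLE node_tag (node_id INTEGER, name TEXT);
    """)
    for i in range(1, n_nodes + 1):
        conn.execute("INSERT INTO node VALUES (?, ?)", (i, f"node-{i}"))
        conn.execute("INSERT INTO macaddress (node_id, mac) VALUES (?, ?)",
                     (i, f"00:00:00:00:00:{i:02x}"))
        conn.execute("INSERT INTO node_tag VALUES (?, ?)", (i, "virtual"))
    conn.commit()
    return CountingDB(conn)


def list_nodes_per_node(db):
    """Per-node pattern: 1 query for the list, then 3 queries for each node."""
    out = []
    for node_id, hostname in db.query("SELECT id, hostname FROM node ORDER BY id"):
        macs = []
        for mac, owner_id in db.query(
                "SELECT mac, node_id FROM macaddress WHERE node_id = ? ORDER BY id",
                (node_id,)):
            # Re-fetch the owning node, as is done to compute 'resource_uri'.
            (owner,) = db.query("SELECT id FROM node WHERE id = ?", (owner_id,))[0]
            macs.append({"mac": mac, "node": owner})
        tags = [name for (name,) in db.query(
            "SELECT name FROM node_tag WHERE node_id = ? ORDER BY name", (node_id,))]
        out.append({"hostname": hostname, "macaddress_set": macs, "tag_names": tags})
    return out


def list_nodes_batched(db):
    """Prefetch-style pattern: one query per table, whatever the node count."""
    nodes = db.query("SELECT id, hostname FROM node ORDER BY id")
    macs, tags = defaultdict(list), defaultdict(list)
    for node_id, mac in db.query("SELECT node_id, mac FROM macaddress ORDER BY id"):
        macs[node_id].append({"mac": mac, "node": node_id})
    for node_id, name in db.query("SELECT node_id, name FROM node_tag ORDER BY name"):
        tags[node_id].append(name)
    return [{"hostname": hostname, "macaddress_set": macs[node_id],
             "tag_names": tags[node_id]} for node_id, hostname in nodes]


if __name__ == "__main__":
    slow, fast = make_db(5), make_db(5)
    assert list_nodes_per_node(slow) == list_nodes_batched(fast)
    print("per-node queries:", slow.queries)
    print("batched queries:", fast.queries)
```

With one MAC address per node, the per-node version issues 1 + 3N queries while the batched version stays at 3. That is the same shape as what I see when switching piston from queryset.iterator() to queryset.all() with prefetch_related() in place.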