st and client.py's reads aren't checking for EOF condition

Bug #704015 reported by gholt
This bug affects 1 person
Affects: OpenStack Object Storage (swift)
Status: Expired
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

If a file's size changes during a read (or the EOF is reached earlier than expected for some other reason), st and client.py enter an endless CPU-eating loop.

Revision history for this message
Pete Zaitcev (zaitcev) wrote :

This script fails to reproduce the problem: the download of the object always succeeds.

How would one go about reducing the size of the file?

Revision history for this message
John Dickinson (notmyname) wrote :

A guess on how I would test it (a rough code sketch follows the steps):

1) Upload file A with length X (this writes it to replicas I, J, and K)

2) fail replica K

3) Upload file B with length Y (where X>Y) to the same object name (this writes to replicas I and J)

4) restore replica K

5) begin downloading the object from replica K

6) have replication copy from replica I or J to replica K while the client is downloading the object
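
For concreteness, here is a rough driver for steps 1, 3 and 5 above, written against client.py's Connection class; steps 2, 4 and 6 are node-level operations and appear only as comments. The auth URL, credentials, container and object names are SAIO-style placeholders, and the Connection method signatures are assumed from the client.py of that era, so treat this purely as a sketch.

    # Hypothetical repro driver; nothing below comes from the bug report itself.
    from swift.common import client

    conn = client.Connection('http://127.0.0.1:8080/auth/v1.0',
                             'test:tester', 'testing')

    # 1) upload file A of length X (written to replicas I, J and K)
    with open('fileA', 'rb') as f:
        conn.put_object('repro', 'victim', f)

    # 2) fail replica K (e.g. stop the object server on that node) -- by hand

    # 3) upload a shorter file B under the same object name (only I and J get it)
    with open('fileB', 'rb') as f:
        conn.put_object('repro', 'victim', f)

    # 4) restore replica K -- by hand

    # 5) download the object, hoping the proxy serves this GET from replica K
    headers, body = conn.get_object('repro', 'victim', resp_chunk_size=65536)
    with open('downloaded', 'wb') as out:
        for chunk in body:
            # 6) while this loop runs, let replication overwrite K's stale copy
            out.write(chunk)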

Revision history for this message
Pete Zaitcev (zaitcev) wrote :

Here's an update. I think I understand now why neither gholt nor notmyname fixed this bug in 10 minutes.

There is no "loop". It seems like the client is burning CPU, but that is only because the reaper thread is doing sleep(0.01). The actual problem is a hang in the worker thread. It hangs here:

        def _object_body():
            buf = resp.read(resp_chunk_size)
            while buf:
                yield buf
                buf = resp.read(resp_chunk_size)
        object_body = _object_body()

It is invoked from here, much later:
                for chunk in body:
                    fp.write(chunk)
                    read_length += len(chunk)
                    if md5sum:
                        md5sum.update(chunk)
                fp.close()

The worker thread goes into resp.read(65536) and hangs there instead of getting an empty buf back. The resp here is a plain httplib.HTTPResponse, not subclassed or wrapped in any way.
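
For what it's worth, here is a minimal client-side sketch of the two guards that are missing, assuming the Python 2-era httplib used above: a socket timeout on the connection so a stalled read raises socket.timeout instead of blocking forever, and a Content-Length check so a short body is reported instead of silently accepted. This is an illustration, not the eventual client.py fix.

    import httplib

    def read_object(host, port, path, chunk_size=65536, timeout=60):
        # timeout= puts a deadline on every socket operation, so a read that
        # never receives more data raises socket.timeout instead of hanging
        conn = httplib.HTTPConnection(host, port, timeout=timeout)
        conn.request('GET', path)
        resp = conn.getresponse()
        expected = int(resp.getheader('content-length', '-1'))
        received = 0
        buf = resp.read(chunk_size)
        while buf:
            received += len(buf)
            yield buf
            buf = resp.read(chunk_size)
        # if the server ended the body early, complain instead of staying quiet
        if expected >= 0 and received != expected:
            raise IOError('premature EOF: got %d of %d bytes'
                          % (received, expected))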

Here's what the strace looks like. First, a usual cycle, including the -EAGAIN path:

7969 <... select resumed> ) = 0 (Timeout)
7972 <... write resumed> ) = 1
7969 select(0, NULL, NULL, NULL, {0, 4000} <unfinished ...>
7972 write(19, "\276\321S;\262\371Sbz(S\35g\276\230\242t\263\213i\374\316\20\335\253~\301\240v\343\245x"..., 4096) = 4096
7972 write(19, "7P\306kG\357\206V\241\245\303\321\34\373\337\217F\262,\224\240\237\260\265\220\370\207\r\264\364>D"..., 61440) = 61440
7972 recvfrom(18, 0x7f8db0025164, 65536, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
7972 epoll_ctl(5, EPOLL_CTL_ADD, 18, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=18, u64=14381426149290934290}}) = 0
7972 epoll_wait(5, <unfinished ...>
7969 <... select resumed> ) = 0 (Timeout)
7969 select(0, NULL, NULL, NULL, {0, 2614}) = 0 (Timeout)
7969 select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
.............
7969 select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
7969 select(0, NULL, NULL, NULL, {0, 2000} <unfinished ...>
7972 <... epoll_wait resumed> {{EPOLLIN, {u32=18, u64=14381426149290934290}}}, 1023, 60000) = 1
7972 epoll_ctl(5, EPOLL_CTL_DEL, 18, {EPOLLWRBAND|EPOLLMSG|EPOLLERR|EPOLLHUP|0x19cb020, {u32=0, u64=14381426149290934272}}) = 0
7972 recvfrom(18, "\246R@y\2\201\373,\356\3559Zt\25FT\262q\250\205pvm\333\351J\215E\260\250N\360"..., 65536, 0, NULL, NULL) = 27016
............
7972 recvfrom(18, "\263f\3604\325\307\333\355\352Z|?\6m\303%\214\244\260\341\n <\236x\330\3374.z?a"..., 3768, 0, NULL, NULL) = 3768
7969 <... select resumed> ) = 0 (Timeout)
7972 madvise(0x7f8db0056000, 32768, MADV_DONTNEED <unfinished ...>

So, the library maintains the fiction of a blocking socket: when recvfrom() returns -EAGAIN it registers the fd for polling, waits, and deregisters it once EPOLLIN arrives.
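
In other words, what the trace shows is the standard way to emulate a blocking read on a non-blocking socket. A generic sketch of the pattern (not eventlet's actual code) looks like this:

    import errno
    import select
    import socket

    def blocking_recv(sock, nbytes, poll_timeout=60.0):
        # sock is non-blocking; emulate a blocking recv() as in the trace above
        ep = select.epoll()
        try:
            while True:
                try:
                    return sock.recv(nbytes)            # recvfrom(...) in the trace
                except socket.error as e:
                    if e.errno != errno.EAGAIN:         # anything but -EAGAIN is fatal
                        raise
                ep.register(sock.fileno(),
                            select.EPOLLIN | select.EPOLLPRI)   # EPOLL_CTL_ADD
                ep.poll(poll_timeout)                           # epoll_wait(..., 60000)
                ep.unregister(sock.fileno())                    # EPOLL_CTL_DEL
        finally:
            ep.close()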

The hang occurs like this:

7972 recvfrom(18, "\v\245\375\227\v\306E\357\366/.N\t\317\366\333\340\3768\307\314\315\333\314j\373v\236\v\360\237R"..., 65536, 0, NULL, NULL) = 65536
7972 write(19, "\v\245\375\227\v\306E\357\366/.N\t\317\366\333\340\3768\307\314\315\333\314j\373v\236\v\360\237R"..., 4096) = 4096
7972 write(19, "\300\22J\2458\252\331\315;\302O\vd\343\366\232\277\374\242wpPD\25^\241\0%L\357\215\213"..., 61440) = 61440
7972 r...


Revision history for this message
Pete Zaitcev (zaitcev) wrote :

The problem that I reproduced (or "reproduced") using truncate(1) turned out not to have anything to do with the client at all. If the file is truncated, the DiskFile in the object server reaches EOF and quits. The proxy server does nothing about it either. However, the client does not know that there was an EOF, so it just sits there. Meanwhile, the servers keep their connections open and cached, and so we simply hang. In theory, there should be a timeout on an idle connection, but it does not happen. No matter: the object server has to signal the premature EOF, and the proxy server has to pass it on to the client. And the only way to do that in HTTP, once the headers have gone out, is to break the TCP connection.
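
For illustration, a hedged sketch of what that means on the object-server side, written as a plain WSGI-style body iterator (this is not the actual Swift code): once the status line and Content-Length have gone out, the only signal left is to abort the iterator, which typically ends with the server closing the TCP connection.

    def body_iter(data_file, content_length, chunk_size=65536):
        # Stream the object body, but refuse to end cleanly if the backing
        # file comes up short (e.g. it was truncated underneath us).
        sent = 0
        while sent < content_length:
            chunk = data_file.read(min(chunk_size, content_length - sent))
            if not chunk:
                # Premature EOF: raising here aborts the response mid-body,
                # which typically leaves the server no option but to close
                # the TCP connection.
                raise IOError('short read: sent %d of %d bytes'
                              % (sent, content_length))
            sent += len(chunk)
            yield chunk

Paired with a client that checks received bytes against Content-Length (as in the earlier sketch), the broken connection then surfaces as an error instead of a silent short download or a hang.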

Pete Zaitcev (zaitcev)
Changed in swift:
status: New → In Progress
assignee: nobody → Pete Zaitcev (zaitcev)
Revision history for this message
gholt (gholt) wrote :

I was just re-alerted to this bug report; sorry I haven't been paying attention.

I'm pretty sure the original bug was due to the file size changing for a file being uploaded.

It reads like someone mentioned it to me in a conversation and I dropped it in here real quick.

Revision history for this message
Pete Zaitcev (zaitcev) wrote :

Very well. Since I am unable to make uploads and downloads race each other, we need a reproducer (either a script or a set of instructions); otherwise I may be chasing an imaginary bug, because I only "reproduce" it by tinkering with the backing store directly.

I'm going to put this away for now.

Revision history for this message
Pete Zaitcev (zaitcev) wrote :

Saving WIP for future reference.

Changed in swift:
importance: Undecided → Low
Revision history for this message
Pete Zaitcev (zaitcev) wrote :

See also bug 736485: apparently Mike Barton had clients hanging forever too.

Revision history for this message
Tom Fifield (fifieldt) wrote :

This bug has been idle for a long time. Can it be reproduced in a supported version?

Changed in swift:
status: In Progress → Incomplete
assignee: Pete Zaitcev (zaitcev) → nobody
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Object Storage (swift) because there has been no activity for 60 days.]

Changed in swift:
status: Incomplete → Expired