ceph-osd from Folsom leaks memory constantly

Bug #1215014 reported by James Troup
This bug affects 1 person
Affects        Status     Importance  Assigned to  Milestone
ceph (Ubuntu)  Won't Fix  High        Unassigned

Bug Description

ii ceph 0.48.3-0ubuntu1~cloud0 distributed storage and file system

Freshly started ceph-osd:

root 26317 1.7 0.4 612016 78224 ? Ssl 16:35 0:03 /usr/bin/ceph-osd --cluster=ceph -i 15 -f

ceph-osd after 6+ months:

root 7892 0.7 8.9 6441656 1465660 ? Ssl 2012 2774:02 /usr/bin/ceph-osd --cluster=ceph -i 16 -f

All our ceph-osds seem to do this; this is a Folsom OpenStack cloud from the Ubuntu Cloud Archive on Ubuntu 12.04 LTS.
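To put a number on the two process listings above, here is a minimal sketch that parses them and compares resident memory. It assumes the lines use the usual `ps aux` column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND), which the report does not state, and `parse_ps_aux` is a hypothetical helper name. The two lines come from different OSDs (15 and 16), so the ratio is only indicative.

```python
def parse_ps_aux(line):
    """Parse one `ps aux`-style line (assumed column order, see lead-in)."""
    # Split into at most 11 fields so the command keeps its spaces.
    fields = line.split(None, 10)
    return {
        "pid": int(fields[1]),
        "vsz_kib": int(fields[4]),   # virtual size, KiB
        "rss_kib": int(fields[5]),   # resident set size, KiB
        "command": fields[10],
    }


# The two lines quoted in the bug description.
fresh = parse_ps_aux(
    "root 26317 1.7 0.4 612016 78224 ? Ssl 16:35 0:03 "
    "/usr/bin/ceph-osd --cluster=ceph -i 15 -f"
)
aged = parse_ps_aux(
    "root 7892 0.7 8.9 6441656 1465660 ? Ssl 2012 2774:02 "
    "/usr/bin/ceph-osd --cluster=ceph -i 16 -f"
)

# Ratio of resident memory between the long-running and the fresh daemon.
growth = aged["rss_kib"] / fresh["rss_kib"]

print(f"fresh RSS: {fresh['rss_kib'] / 1024:.1f} MiB")
print(f"aged RSS:  {aged['rss_kib'] / 1024:.1f} MiB")
print(f"ratio:     {growth:.1f}x")
```

Resident set size is used here rather than virtual size because it reflects memory actually backed by RAM.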

Tags: prodstack
James Troup (elmo)
tags: added: prodstacks
tags: added: prodstack
removed: prodstacks
Sage Weil (sage-newdream) wrote :

There are known leaks in cuttlefish and bobtail, but we have not backported fixes because the leaks are very, very slow. Dumpling (0.67.x) is leak free.

Changed in ceph (Ubuntu):
importance: Undecided → High
status: New → Confirmed
James Page (james-page) wrote :

Hi James

Any chance you could enable the profiler following:

http://ceph.com/docs/master/rados/troubleshooting/memory-profiling/

I have a hunch that this might actually be related to the version of google-perftools we have in precise - but this at least will give us some more information about what's chewing memory.

Cheers

James

Changed in ceph (Ubuntu):
status: Confirmed → Incomplete
James Troup (elmo) wrote : Re: [Bug 1215014] Re: ceph-osd from Folsom leaks memory constantly

James Page <email address hidden> writes:

> Any chance you could enable the profiler following:
>
> http://ceph.com/docs/master/rados/troubleshooting/memory-profiling/
>
> I have a hunch that this might actually be related to the version of
> google-perftools we have in precise - but this at least will give us
> some more information about what's chewing memory.

So, given:

57274 root 20 0 1427m 713m 9932 S 1 4.5 250:20.11 /usr/bin/ceph-osd --cluster=ceph -i 34 -f
58731 root 20 0 1001m 470m 10m S 80 2.9 229:51.78 /usr/bin/ceph-osd --cluster=ceph -i 36 -f
60881 root 20 0 998m 461m 11m S 0 2.9 181:30.65 /usr/bin/ceph-osd --cluster=ceph -i 37 -f
64272 root 20 0 910m 351m 11m S 1 2.2 197:16.14 /usr/bin/ceph-osd --cluster=ceph -i 35 -f
64584 root 20 0 855m 321m 10m S 1 2.0 101:52.52 /usr/bin/ceph-osd --cluster=ceph -i 2 -f
60613 root 20 0 824m 307m 9744 S 0 1.9 164:26.79 /usr/bin/ceph-osd --cluster=ceph -i 3 -f
58493 root 20 0 785m 296m 8852 S 0 1.9 147:28.98 /usr/bin/ceph-osd --cluster=ceph -i 4 -f
64948 root 20 0 773m 273m 9.9m S 0 1.7 150:23.20 /usr/bin/ceph-osd --cluster=ceph -i 33 -f

I chose OSD 34.

2013-08-29 13:09:57.574853 7fcdbcdeb700 0 log [INF] : osd.34 started profiler
2013-08-29 13:09:57.576890 7fcdbcdeb700 0 osd.34 6002 do_command r=0
2013-08-29 13:10:06.076818 7fcdbcdeb700 0 log [INF] : osd.34tcmalloc heap stats:------------------------------------------------
2013-08-29 13:10:06.138029 7fcdbcdeb700 0 log [INF] : MALLOC: 666882640 ( 636.0 MB) Bytes in use by application
2013-08-29 13:10:06.138057 7fcdbcdeb700 0 log [INF] : MALLOC: + 62881792 ( 60.0 MB) Bytes in page heap freelist
2013-08-29 13:10:06.138080 7fcdbcdeb700 0 log [INF] : MALLOC: + 3149808 ( 3.0 MB) Bytes in central cache freelist
2013-08-29 13:10:06.138097 7fcdbcdeb700 0 log [INF] : MALLOC: + 83456 ( 0.1 MB) Bytes in transfer cache freelist
2013-08-29 13:10:06.208103 7fcdbcdeb700 0 log [INF] : MALLOC: + 5633984 ( 5.4 MB) Bytes in thread cache freelists
2013-08-29 13:10:06.209350 7fcdbcdeb700 0 log [INF] : MALLOC: + 6160384 ( 5.9 MB) Bytes in malloc metadata
2013-08-29 13:10:06.209386 7fcdbcdeb700 0 log [INF] : MALLOC: ------------
2013-08-29 13:10:06.209404 7fcdbcdeb700 0 log [INF] : MALLOC: = 744792064 ( 710.3 MB) Actual memory used (physical + swap)
2013-08-29 13:10:06.210670 7fcdbcdeb700 0 log [INF] : MALLOC: + 143736832 ( 137.1 MB) Bytes released to OS (aka unmapped)
2013-08-29 13:10:06.210685 7fcdbcdeb700 0 log [INF] : MALLOC: ------------
2013-08-29 13:10:06.210903 7fcdbcdeb700 0 log [INF] : MALLOC: = 888528896 ( 847.4 MB) Virtual address space used
2013-08-29 13:10:06.210919 7fcdbcdeb700 0 log [INF] : MALLOC:
2013-08-29 13:10:06.211010 7fcdbcdeb700 0 log [INF] : MALLOC: 75893 Spans in use
2013-08-29 13:10:06.211338 7fcdbcdeb700 0 log [INF] : MALLOC: 193 Thread heaps in use
2013-08-29 13:10:06.211350 7fcdbcdeb700 0 log [INF] : MALLOC: 4096 Tcmalloc page size
2013-08-29 13:10:06.211361 7fcdbcdeb700 0 log [INF] : -...
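The tcmalloc figures above can be cross-checked mechanically. The sketch below parses the `MALLOC:` lines (with the timestamp and log prefix trimmed for brevity), confirms that each `=` row equals the running sum of the rows before it, and reports how much of the total is "in use by application" as opposed to sitting in allocator freelists. The regular expression and the function names `parse_heap_stats` and `check_totals` are assumptions made for this illustration, not part of Ceph or gperftools.

```python
import re

# MALLOC rows from the osd.34 log above, log prefix trimmed.
LOG = """\
MALLOC: 666882640 ( 636.0 MB) Bytes in use by application
MALLOC: + 62881792 ( 60.0 MB) Bytes in page heap freelist
MALLOC: + 3149808 ( 3.0 MB) Bytes in central cache freelist
MALLOC: + 83456 ( 0.1 MB) Bytes in transfer cache freelist
MALLOC: + 5633984 ( 5.4 MB) Bytes in thread cache freelists
MALLOC: + 6160384 ( 5.9 MB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 744792064 ( 710.3 MB) Actual memory used (physical + swap)
MALLOC: + 143736832 ( 137.1 MB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 888528896 ( 847.4 MB) Virtual address space used
"""

# Optional sign ("+" or "="), byte count, "( N MB)", then the label.
ROW = re.compile(r"MALLOC:\s+([=+]?)\s*(\d+)\s+\(\s*[\d.]+ MB\)\s+(.+)")


def parse_heap_stats(text):
    """Return (sign, bytes, label) for every row that carries a byte count."""
    rows = []
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            rows.append((match.group(1), int(match.group(2)), match.group(3).strip()))
    return rows


def check_totals(rows):
    """For each '=' row, report whether it equals the running sum so far."""
    running = 0
    results = []
    for sign, value, label in rows:
        if sign == "=":
            results.append((label, value, running == value))
        else:
            running += value
    return results


rows = parse_heap_stats(LOG)
totals = check_totals(rows)
for label, value, ok in totals:
    print(f"{label}: {value} bytes, consistent={ok}")

in_use = rows[0][1]          # "Bytes in use by application"
actual = totals[0][1]        # "Actual memory used (physical + swap)"
in_use_share = in_use / actual
print(f"in use by application: {in_use_share:.1%} of actual memory used")
```

Both totals are consistent with their components, and the bulk of the memory is reported as in use by the application rather than held in freelists. That suggests, but does not prove, that the growth comes from live allocations rather than allocator overhead.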


Changed in ceph (Ubuntu):
status: Incomplete → New
James Page (james-page)
Changed in ceph (Ubuntu):
status: New → Won't Fix