[corosync] Memory leak on 1.2.3

Steven Dake sdake at redhat.com
Tue Dec 20 15:11:49 GMT 2011


On 12/20/2011 03:12 AM, Chris Alexander wrote:
> Hi all,
> 
> We are using Corosync as part of the Redhat cluster stack. Their
> currently supported version is 1.2.3.
> 

While Red Hat's corosync is "version 1.2.3", the z stream almost entirely
matches the flatiron 1.4 branch.  I take patches and apply them to the RPM.

> Every few days our nodes are (non-simultaneously) being fenced due to
> corosync taking up vast amounts of memory (i.e. 100% of the box). Please
> see a sample log message, we have several just like this, [1] which
> occurs when this happens. Note that it is not always corosync being
> killed - but it is clearly corosync eating all the memory (see top
> output from three servers at various times since their last reboot, [2]
> [3] [4]).
> 
> The corosync version is 1.2.3:
> [g@cluster1 ~]$ corosync -v
> Corosync Cluster Engine, version '1.2.3'
> Copyright (c) 2006-2009 Red Hat, Inc.
> 
> We had a bit of a dig around and there are a significant number of
> bugfix updates which address various segfaults, crashes, memory leaks
> etc. in this minor as well as subsequent minor versions. [5] [6] However
> it seems the Redhat repos haven't been updated past 1.2.3 as yet.
> 
> We're trialling the Fedora 14 (fc14) RPMs for corosync and corosynclib
> (v1.4.2) to see if it fixes the particular issue we are seeing (i.e.
> whether or not the memory keeps spiralling way out of control).
> 

The latest z stream would be your best solution here.

> Has anyone else seen an issue like this, and is there any known way to
> debug or fix it? If we can assist debugging by providing further
> information, please specify what this is (and, if non-obvious, how to
> get it). Any additional tips also welcome.
> 

I haven't seen this problem in the field.  Please report it to
support.  They may have seen it and can map it to a BZ, or, if not, help
reproduce it and get it fixed.
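
When reporting, a record of how corosync's resident memory grows over
time may be useful to attach.  The following is only a minimal sketch in
Python, assuming a Linux host where /proc/<pid>/status exposes a VmRSS
line.  The function names and the log file name are hypothetical and not
part of corosync or any Red Hat tooling.

```python
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like: "VmRSS:     123456 kB"
            return int(line.split()[1])
    # Kernel threads and exited processes have no VmRSS line.
    return None


def read_rss_kb(pid):
    """Read the current resident set size of a process (Linux /proc only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vmrss_kb(f.read())


def log_rss(pid, interval_s=60, count=10, path="corosync-rss.log"):
    """Append `count` timestamped RSS samples for `pid` to `path`.

    `path` is a hypothetical file name chosen for this sketch.
    """
    with open(path, "a") as out:
        for i in range(count):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            out.write("%s %s kB\n" % (stamp, read_rss_kb(pid)))
            out.flush()
            # Sleep between samples, but not after the last one.
            if i < count - 1:
                time.sleep(interval_s)
```

For example, calling log_rss(pid, interval_s=60, count=1440) with the pid
printed by "pidof corosync" would collect roughly one day of samples at
one-minute intervals.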

Regards
-steve

> Thanks again for your help
> 
> Chris
> 
> [1] http://pastebin.com/CbyERaRT
> [2] http://pastebin.com/uk9ZGL7H
> [3] http://pastebin.com/H4w5Zg46
> [4] http://pastebin.com/KPZxL6UB
> [5] http://rhn.redhat.com/errata/RHBA-2011-1361.html
> [6] http://rhn.redhat.com/errata/RHBA-2011-1515.html
> 
> 


