[corosync] [PATCH v2] Fixed bug: Corosync may trigger token lost timeout in RECOVERY state

John Thompson thompa26 at gmail.com
Wed Nov 30 21:35:55 GMT 2011


On Thu, Dec 1, 2011 at 7:19 AM, Steven Dake <sdake at redhat.com> wrote:

> The analysis looks good, but the solution looks wrong.
>
> If the token is lost (the previous ring id fails for some reason) on the
> node that cancels the token timeout, the protocol will freeze.
>
> This part of your logs doesn't look correct:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring
> messages from 42726-46695.
> > Nov 30 00:52:06 corosync [TOTEM ] totemsrp.c:2145 Originated 16240
> messages in RECOVERY.
> > ==> It took 13 seconds to copy all old ring messages
>
> At most 3969 messages would be originated, not 16240 messages.  Perhaps
> the loop is operating incorrectly?  Can you print out range and
> messages_originated at line 2111 with your test case?  One of these
> values may be getting corrupted by an errant memcpy.
>

The message sequence numbers printed here are in hex, so the difference
between them is 16240, not 3969.

Regards,
John

> Inspecting the code, the memset and memcpy operations look correct in
> size in the range loop (although the previous code could have an errant
> memcpy).
>
> Regards
> -steve
>
>
> On 11/30/2011 08:20 AM, Yunkai Zhang wrote:
> > I found that corosync may trigger the token lost timeout in RECOVERY
> > state. The reason is that different nodes may spend different amounts
> > of time copying old ring messages from their regular_sort_queue to
> > retrans_message_queue in the *memb_state_recovery_enter* function.
> >
> > Let's have a look at the following logs (ignore the line numbers;
> > I have modified the source code for debugging):
> >
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2034 my_memb:5, new_memb:5,
> trans_memb:5
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [0] member
> 192.168.0.92:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [1] member
> 192.168.0.124:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [2] member
> 192.168.0.142:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [3] member
> 192.168.0.160:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2037 TRANS [4] member
> 192.168.0.164:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [0] member
> 192.168.0.92:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq
> 1161924 rep 192.168.0.92
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high
> delivered 46695 received flag 1
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [1] member
> 192.168.0.124:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq
> 1161924 rep 192.168.0.92
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high
> delivered 46695 received flag 1
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [2] member
> 192.168.0.142:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq
> 1161924 rep 192.168.0.92
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 42725 high
> delivered 42725 received flag 0
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [3] member
> 192.168.0.160:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq
> 1161924 rep 192.168.0.92
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high
> delivered 46695 received flag 1
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2041 position [4] member
> 192.168.0.164:
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2045 previous ring seq
> 1161924 rep 192.168.0.92
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2051 aru 46695 high
> delivered 46695 received flag 1
> >
> > We can see that five nodes formed a new ring at 00:51:53. They had
> > different aru values, and node 192.168.0.142 had the smallest one.
> >
> > Here are some logs from these five nodes; we can see how much time
> > they spent copying old ring messages:
> >
> > [logs from 192.168.0.92]
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring
> messages from 42726-46695.
> > Nov 30 00:52:06 corosync [TOTEM ] totemsrp.c:2145 Originated 16240
> messages in RECOVERY.
> > ==> It took 13 seconds to copy all old ring messages
> >
> > [logs from 192.168.0.124]
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring
> messages from 42726-46695.
> > Nov 30 00:52:09 corosync [TOTEM ] totemsrp.c:2145 Originated 16240
> messages in RECOVERY.
> > ==> It took 16 seconds to copy all old ring messages
> >
> > [logs from 192.168.0.142]
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring
> messages from 42726-46693.
> > Nov 30 00:51:56 corosync [TOTEM ] totemsrp.c:2145 Originated 2321
> messages in RECOVERY.
> > ==> It took 3 seconds to copy all old ring messages
> >
> > [logs from 192.168.0.160]
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring
> messages from 42726-46695.
> > Nov 30 00:51:58 corosync [TOTEM ] totemsrp.c:2145 Originated 16240
> messages in RECOVERY.
> > ==> It took 5 seconds to copy all old ring messages
> >
> > [logs from 192.168.0.164]
> > Nov 30 00:51:53 corosync [TOTEM ] totemsrp.c:2111 copying all old ring
> messages from 42726-46695.
> > Nov 30 00:52:05 corosync [TOTEM ] totemsrp.c:2145 Originated 16240
> messages in RECOVERY.
> > ==> It took 12 seconds to copy all old ring messages
> >
> > After node 192.168.0.142 took 3 seconds to copy all messages, it
> > started waiting for the commit_token from its upstream (192.168.0.124),
> > but it could not receive it, because that upstream was busy copying
> > messages between 00:51:56 and 00:52:09. So node 192.168.0.142 would
> > finally trigger the token lost timeout in RECOVERY state.
> >
> > Worse, after it triggered the token lost timeout, node 192.168.0.142
> > entered the GATHER state, but it could not receive a JoinMSG from the
> > other nodes because they were also busy at that time, and this node
> > finally ran into the consensus expired timeout.
> >
> > To solve this issue, we can cancel the token_timeout
> > (but keep the token_retransmit_timeout) in RECOVERY state.
> >
> > Signed-off-by: Yunkai Zhang <qiushu.zyk at taobao.com>
> > ---
> >  exec/totemsrp.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/exec/totemsrp.c b/exec/totemsrp.c
> > index 5a78962..e0299e7 100644
> > --- a/exec/totemsrp.c
> > +++ b/exec/totemsrp.c
> > @@ -2206,7 +2206,7 @@ originated:
> >       instance->my_install_seq = SEQNO_START_MSG;
> >       instance->last_released = SEQNO_START_MSG;
> >
> > -     reset_token_timeout (instance); // REVIEWED
> > +     cancel_token_timeout (instance);
> >       reset_token_retransmit_timeout (instance); // REVIEWED
> >
> >       instance->memb_state = MEMB_STATE_RECOVERY;
>
> _______________________________________________
> discuss mailing list
> discuss at corosync.org
> http://lists.corosync.org/mailman/listinfo/discuss
>