[corosync] [PATCH v2] Merge downlist and joinlist into one confchg event

Yunkai Zhang qiushu.zyk at taobao.com
Tue Nov 8 06:25:02 GMT 2011

On Tue, Nov 8, 2011 at 2:13 PM, Yunkai Zhang <qiushu.zyk at taobao.com> wrote:
> On Tue, Nov 8, 2011 at 11:50 AM, Steven Dake <sdake at redhat.com> wrote:
>> On 11/07/2011 08:26 PM, Angus Salkeld wrote:
>>> On Mon, Nov 07, 2011 at 12:18:49PM +0800, Yunkai Zhang wrote:
>>>> In the previous version of cpg.c, we collected downlist info by
>>>> exchanging messages among all nodes in the cluster. In my opinion,
>>>> this is not necessary.
>>>> We can calculate this info from
>>>> left_nodes (my_member_nodes - trans_nodes) and their related
>>>> process_infos. So I dropped the message exchange and collect downlist
>>>> info directly in the cpg_leftlist_collect function.
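To make the set difference above concrete, here is a minimal sketch in C. It is not the patch itself and not corosync code: the function name left_nodes_collect, its parameters, and the use of plain uint32_t node ids are assumptions made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from cpg.c): computes
 * left_nodes = my_member_nodes - trans_nodes.
 * Every node id present in members[] but absent from trans[] is copied
 * to left[], which must have room for member_count entries.
 * Returns the number of ids written to left[].
 */
size_t left_nodes_collect(const uint32_t *members, size_t member_count,
                          const uint32_t *trans, size_t trans_count,
                          uint32_t *left)
{
    size_t left_count = 0;

    for (size_t i = 0; i < member_count; i++) {
        int still_present = 0;

        /* A node survived if it also appears in the transitional list. */
        for (size_t j = 0; j < trans_count; j++) {
            if (members[i] == trans[j]) {
                still_present = 1;
                break;
            }
        }
        if (!still_present) {
            left[left_count++] = members[i];
        }
    }
    return left_count;
}
```

With members {1, 2} and trans {1}, the only node that left is 2. Looking up the process_infos that belong to each left node is a separate step that this sketch does not cover.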
>>> This would need a _lot_ of testing! I strongly suggest leaving the
>>> downlist messaging in to avoid regressions.
>>>> Although I calculate downlist information locally, I
>>>> won't send it until I have collected all joinlists as regular
>>>> messages among all nodes in the cluster. Even if no node has any
>>>> joinlist content, this patch will still send a _bare_
>>>> joinlist including only the message header so that all nodes can
>>>> receive all joinlist messages and finally reach synchronization. So I
>>>> think it won't deliver configuration changes to the application ahead
>>>> of the regular messages.
>>>> I know that merging downlist and joinlist into one confchg event
>>>> will break wire compatibility, but I think it is more reasonable,
>>>> and most importantly it reflects the truth and obeys the CPG API
>>>> description.
>>> You break on-wire compatibility by removing the downlist, not by
>>> merging the events to the client applications
>>> (on-wire == totem messages to other nodes, not IPC messages to the application).
> Hi Steven, Angus:
> Thanks for this explanation. It helped me understand the _on-wire_
> concept exactly, and I realize that keeping _on-wire_ compatibility is
> necessary in some situations.
> But what confuses me is: why do we need to keep the downlist on-wire?
> Suppose there were two rings, R1(c1,c2) and R2(c3,c4,c5), and they
> merged into R3(c1,c3) at some point for some reason. In this case, which
> downlist should c1 and c3 receive? I think c1 and c2 should

s/I think c1 and c2 should/I think c1 and c3 should/

> receive _different_ downlists because they come from _different_ rings:
> the downlist for c1 should be D1(c2), and the downlist for c2 should

s/and the downlist for c2 should/and the downlist for c3 should/

> be D2(c4, c5).
> But with the approach we use now, we exchange downlist
> messages among all nodes (c1, c3) in R3 and choose the downlist
> containing the largest number of nodes as the final downlist for all
> nodes. So c1 and c3 will receive the same downlist, D2(c4, c5), in this
> case, which is very strange for c1! c1 would not learn what happened to c2,
> which belonged to its old ring R1, and it does not know who c4 and c5 are.
> In my opinion, the downlist need not and should not be kept
> on-wire: nodes that come from different old rings should receive
> different downlists.
> If I have misunderstood something in this case, please point it out :D.
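The R1/R2 example above can be sketched in C as follows. This only models the "largest downlist wins" rule as I described it, not the real cpg.c logic; struct downlist, downlist_choose_largest, and the node ids 1 to 5 standing in for c1 to c5 are all assumptions for illustration, and the real code may apply further tie-breaks.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LEFT 8

/* Hypothetical model of one downlist message (not the on-wire format). */
struct downlist {
    uint32_t sender;          /* node that sent this downlist */
    size_t left_count;        /* number of nodes it reports as gone */
    uint32_t left[MAX_LEFT];  /* ids of the nodes reported as gone */
};

/*
 * Returns the downlist that reports the most departed nodes, or NULL if
 * count is 0. On a tie, the earliest entry wins (an arbitrary choice made
 * for this sketch only).
 */
const struct downlist *downlist_choose_largest(const struct downlist *lists,
                                               size_t count)
{
    const struct downlist *best = NULL;

    for (size_t i = 0; i < count; i++) {
        if (best == NULL || lists[i].left_count > best->left_count) {
            best = &lists[i];
        }
    }
    return best;
}
```

If c1 sends D1(c2) and c3 sends D2(c4, c5), this rule picks D2 for every node in R3, so c1 is told that c4 and c5 left even though it never shared a ring with them, and it is never told about c2.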
>> If we ever decide in the future to break onwire compat, we should
>> address the many pending things related to onwire which need fixing.
>> short list:
>> 1. remove evil.c
>> 2. remove syncv1
>> 3. rework cpg for the 5th time
>> 4. add unlimited redundant ring count to totem messages
>> 5. rewrite totempg so that all messages are aligned and
>> fragmentation/assembly work correctly in all cases and maintain VS
>> guarantees
>> There may be others.
>> We are not addressing any on-wire compat changes without addressing all
>> of them.
>> Regards
>> -steve
>>> The problem with that is what happens when you are upgrading and you
>>> have a cluster with a mix of versions - they will not agree on the
>>> same cpg group membership.
>>> -Angus
>>>> --
>>>> Yunkai Zhang
>>>> Work at Taobao
>>> _______________________________________________
>>> discuss mailing list
>>> discuss at corosync.org
>>> http://lists.corosync.org/mailman/listinfo/discuss
> --
> Yunkai Zhang
> Work at Taobao

Yunkai Zhang
Work at Taobao
