[corosync] information request

Slava Bendersky volga629 at networklab.ca
Sun Nov 24 19:34:52 UTC 2013


Hello Steven, 
Here are the testing results. 
iptables is stopped on both ends. 

[root at eusipgw01 ~]# iptables -L -nv -x 
Chain INPUT (policy ACCEPT 474551 packets, 178664760 bytes) 
pkts bytes target prot opt in out source destination 

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes) 
pkts bytes target prot opt in out source destination 

Chain OUTPUT (policy ACCEPT 467510 packets, 169303071 bytes) 
pkts bytes target prot opt in out source destination 
[root at eusipgw01 ~]# 


First case: udpu transport with rrp_mode: none 

totem { 
    version: 2 
    token: 160 
    token_retransmits_before_loss_const: 3 
    join: 250 
    consensus: 300 
    vsftype: none 
    max_messages: 20 
    threads: 0 
    nodeid: 2 
    rrp_mode: none 
    interface { 
        member { 
            memberaddr: 10.10.10.1 
        } 
        ringnumber: 0 
        bindnetaddr: 10.10.10.0 
        mcastport: 5405 
    } 
    transport: udpu 
} 
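One detail worth checking: with the udpu transport, corosync expects every cluster node, including the local one, to be listed in its own member block, but the config above lists only 10.10.10.1. A sketch of the interface section with both addresses from this thread (assuming 10.10.10.2 is the second node; adjust to your actual membership):

```
totem {
    version: 2
    transport: udpu
    rrp_mode: none
    nodeid: 2
    interface {
        ringnumber: 0
        bindnetaddr: 10.10.10.0
        mcastport: 5405
        member {
            memberaddr: 10.10.10.1
        }
        member {
            memberaddr: 10.10.10.2
        }
    }
}
```

With only one member listed, the other node's unicast packets are not expected, which by itself can prevent the ring from forming.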

Error: 

Nov 24 14:25:29 corosync [MAIN ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly. 

pbx01*CLI> corosync show members 

============================================================= 
=== Cluster members ========================================= 
============================================================= 
=== 
=== 
============================================================= 


The same happens with rrp: passive. I think the unicast failure is related to some incompatibility with VMware? Only multicast is getting through, but even then the cluster does not form completely. 

Slava. 

----- Original Message -----

From: "Steven Dake" <sdake at redhat.com> 
To: "Slava Bendersky" <volga629 at networklab.ca>, "Digimer" <lists at alteeve.ca> 
Cc: discuss at corosync.org 
Sent: Sunday, November 24, 2013 12:01:09 PM 
Subject: Re: [corosync] information request 


On 11/23/2013 11:20 PM, Slava Bendersky wrote: 



Hello Digimer, 
Here is what I see from the asterisk box: 
pbx01*CLI> corosync show members 

============================================================= 
=== Cluster members ========================================= 
============================================================= 
=== 
=== Node 1 
=== --> Group: asterisk 
=== --> Address 1: 10.10.10.1 
=== Node 2 
=== --> Group: asterisk 
=== --> Address 1: 10.10.10.2 
=== 
============================================================= 

[2013-11-24 01:12:43] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6) 
[2013-11-24 01:12:43] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6) 




These errors come from Asterisk via the CPG libraries because corosync cannot get a proper configuration. The first message in this thread lists the scenarios under which they occur. In a past log you had the error indicating a network fault; IIRC that error indicates the firewall is enabled. The error from Asterisk is expected if your firewall is enabled. This was suggested before by Digimer, but can you confirm that you totally disabled your firewall on the box (rather than just configured it as you thought was correct)? 
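For reference, the "(6)" in these messages is a cs_error_t value from corosync's corotypes.h, and 6 is CS_ERR_TRY_AGAIN, the retryable condition described above. A small lookup sketch (values transcribed from the header; verify against the corotypes.h shipped with your corosync version):

```python
# Map the numeric code in "CPG mcast failed (6)" to its cs_error_t name.
# Values transcribed from corosync's include/corosync/corotypes.h;
# double-check against the header for your installed version.
CS_ERRORS = {
    1: "CS_OK",
    2: "CS_ERR_LIBRARY",
    3: "CS_ERR_VERSION",
    4: "CS_ERR_INIT",
    5: "CS_ERR_TIMEOUT",
    6: "CS_ERR_TRY_AGAIN",  # retryable: sync in progress, or totem not formed
}

def decode_cs_error(code: int) -> str:
    """Return the symbolic name for a cs_error_t value, if known."""
    return CS_ERRORS.get(code, f"unknown ({code})")

print(decode_cs_error(6))  # CS_ERR_TRY_AGAIN
```

Asterisk keeps hitting CS_ERR_TRY_AGAIN here because the ring never forms, so corosync stays permanently "not yet synchronized".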

Turn off the firewall - which will help us eliminate that as a source of the problem. 

Next, use UDPU mode without RRP; confirm whether that works. 

Next, use UDPU with _passive_ RRP mode; confirm whether that works. 

One thing at a time in each step please. 

Regards 
-steve 



<blockquote>

Is it possible that the message is related to the permissions of the user running corosync or asterisk? 

Another point: when I send a ping, I see the MAC address of eth0 (the default gateway interface), not the cluster interface. 


</blockquote>
Corosync does not use the gateway address in any of its routing calculations. Instead it physically binds to the interface specified as detailed in corosync.conf.5. By physically binding, it avoids the gateway entirely. 

Regards 
-steve 


<blockquote>

pbx01*CLI> corosync ping 
[2013-11-24 01:16:54] NOTICE[2057]: res_corosync.c:303 ast_event_cb: (ast_event_cb) Got event PING from server with EID: 'MAC address of the eth0' 
[2013-11-24 01:16:54] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6) 


Slava. 


----- Original Message -----

From: "Slava Bendersky" <volga629 at networklab.ca> 
To: "Digimer" <lists at alteeve.ca> 
Cc: discuss at corosync.org 
Sent: Sunday, November 24, 2013 12:26:40 AM 
Subject: Re: [corosync] information request 

Hello Digimer, 
I am trying to find information about VMware multicast problems. In tcpdump I see multicast traffic from the remote end, but I can't confirm whether the packets arrive as they should. 
Can you please confirm that memberaddr: is the IP address of the second node? 

06:05:02.408204 IP (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 221) 
10.10.10.1.5404 > 226.94.1.1.5405: [udp sum ok] UDP, length 193 
06:05:02.894935 IP (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 221) 
10.10.10.2.5404 > 226.94.1.1.5405: [bad udp cksum 1a8c!] UDP, length 193 
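One way to check multicast delivery independently of corosync is a bare UDP socket joined to the same group. A sketch assuming the group and port from the tcpdump above (226.94.1.1:5405); run the receiver on one node and send a datagram to the group from the other:

```python
# Minimal multicast receiver for sanity-checking delivery between nodes,
# independent of corosync. Group/port taken from the tcpdump in this thread.
import socket

GROUP, PORT = "226.94.1.1", 5405

def membership_request(group: str, iface: str = "0.0.0.0") -> bytes:
    """Pack the 8-byte ip_mreq struct used by IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(iface)

def make_receiver(group: str = GROUP, port: int = PORT) -> socket.socket:
    """Bind a UDP socket on all interfaces and join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock

# Usage on the receiving node (stop corosync first so the port is free):
#   rx = make_receiver(); rx.settimeout(30); print(rx.recvfrom(1500))
# Then send from the other node, e.g.:
#   echo hi | socat - UDP4-DATAGRAM:226.94.1.1:5405
# If nothing arrives, the fabric (not corosync) is dropping multicast.
```

On a second interface, pass that interface's address as `iface` so the IGMP join goes out the right NIC.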


Slava. 



----- Original Message -----

From: "Digimer" <lists at alteeve.ca> 
To: "Slava Bendersky" <volga629 at networklab.ca> 
Cc: discuss at corosync.org 
Sent: Saturday, November 23, 2013 11:54:55 PM 
Subject: Re: [corosync] information request 

If I recall correctly, VMWare doesn't do multicast properly. I'm not 
sure though, I don't use it. 

Try unicast with no RRP. See if that works. 

On 23/11/13 23:16, Slava Bendersky wrote: 
> Hello Digimer, 
> All machines are RHEL 6.4 on VMware; there is no physical switch, 
> only VMware's virtual switching. I set rrp_mode to none and the cluster formed. 
> With this config I am getting constant error messages. 
> 
> [root at eusipgw01 ~]# cat /etc/redhat-release 
> Red Hat Enterprise Linux Server release 6.4 (Santiago) 
> 
> [root at eusipgw01 ~]# rpm -qa | grep corosync 
> corosync-1.4.1-15.el6.x86_64 
> corosynclib-1.4.1-15.el6.x86_64 
> 
> 
> [2013-11-23 22:46:20] WARNING[2057] res_corosync.c: CPG mcast failed (6) 
> [2013-11-23 22:46:20] WARNING[2057] res_corosync.c: CPG mcast failed (6) 
> 
> iptables 
> 
> -A INPUT -i eth1 -p udp -m state --state NEW -m udp --dport 5404:5407 -j 
> NFLOG --nflog-prefix "dmz_ext2fw: " --nflog-group 2 
> -A INPUT -i eth1 -m pkttype --pkt-type multicast -j NFLOG 
> --nflog-prefix "dmz_ext2fw: " --nflog-group 2 
> -A INPUT -i eth1 -m pkttype --pkt-type unicast -j NFLOG --nflog-prefix 
> "dmz_ext2fw: " --nflog-group 2 
> -A INPUT -i eth1 -p igmp -j NFLOG --nflog-prefix "dmz_ext2fw: " 
> --nflog-group 2 
> -A INPUT -j ACCEPT 
> 
> 
> ------------------------------------------------------------------------ 
> *From: *"Digimer" <lists at alteeve.ca> 
> *To: *"Slava Bendersky" <volga629 at networklab.ca> 
> *Cc: *discuss at corosync.org 
> *Sent: *Saturday, November 23, 2013 10:34:00 PM 
> *Subject: *Re: [corosync] information request 
> 
> I don't think you ever said what OS you have. I've never had to set 
> anything in sysctl.conf on RHEL/CentOS 6. Did you try disabling RRP 
> entirely? If you have a managed switch, make sure persistent multicast 
> groups are enabled or try a different switch entirely. 
> 
> *Something* is interrupting your network traffic. What does 
> iptables-save show? Are these physical or virtual machines? 
> 
> The more information about your environment that you can share, the 
> better we can help. 
> 
> On 23/11/13 22:29, Slava Bendersky wrote: 
>> Hello Digimer, 
>> As an idea, might be some settings in sysctl.conf ? 
>> 
>> Slava. 
>> 
>> 
>> ------------------------------------------------------------------------ 
>> *From: *"Slava Bendersky" <volga629 at networklab.ca> 
>> *To: *"Digimer" <lists at alteeve.ca> 
>> *Cc: *discuss at corosync.org 
>> *Sent: *Saturday, November 23, 2013 10:27:22 PM 
>> *Subject: *Re: [corosync] information request 
>> 
>> Hello Digimer, 
>> Yes I set to passive and selinux is disabled 
>> 
>> [root at eusipgw01 ~]# sestatus 
>> SELinux status: disabled 
>> [root at eusipgw01 ~]# cat /etc/corosync/corosync.conf 
>> totem { 
>>     version: 2 
>>     token: 160 
>>     token_retransmits_before_loss_const: 3 
>>     join: 250 
>>     consensus: 300 
>>     vsftype: none 
>>     max_messages: 20 
>>     threads: 0 
>>     nodeid: 2 
>>     rrp_mode: passive 
>>     interface { 
>>         ringnumber: 0 
>>         bindnetaddr: 10.10.10.0 
>>         mcastaddr: 226.94.1.1 
>>         mcastport: 5405 
>>     } 
>> } 
>> 
>> logging { 
>>     fileline: off 
>>     to_stderr: yes 
>>     to_logfile: yes 
>>     to_syslog: off 
>>     logfile: /var/log/cluster/corosync.log 
>>     debug: off 
>>     timestamp: on 
>>     logger_subsys { 
>>         subsys: AMF 
>>         debug: off 
>>     } 
>> } 
>> 
>> 
>> Slava. 
>> 
>> ------------------------------------------------------------------------ 
>> *From: *"Digimer" <lists at alteeve.ca> 
>> *To: *"Slava Bendersky" <volga629 at networklab.ca> 
>> *Cc: *"Steven Dake" <sdake at redhat.com> , discuss at corosync.org 
>> *Sent: *Saturday, November 23, 2013 7:04:43 PM 
>> *Subject: *Re: [corosync] information request 
>> 
>> First up, I'm not Steven. Secondly, did you follow Steven's 
>> recommendation to not use active RRP? Does the cluster form with no RRP 
>> at all? Is selinux enabled? 
>> 
>> On 23/11/13 18:29, Slava Bendersky wrote: 
>>> Hello Steven, 
>>> In multicast the log is filling with these messages: 
>>> 
>>> Nov 24 00:26:28 corosync [TOTEM ] A processor failed, forming new 
>>> configuration. 
>>> Nov 24 00:26:28 corosync [TOTEM ] A processor joined or left the 
>>> membership and a new membership was formed. 
>>> Nov 24 00:26:31 corosync [CPG ] chosen downlist: sender r(0) 
>>> ip(10.10.10.1) ; members(old:2 left:0) 
>>> Nov 24 00:26:31 corosync [MAIN ] Completed service synchronization, 
>>> ready to provide service. 
>>> 
>>> In udpu it is not working at all. 
>>> 
>>> Slava. 
>>> 
>>> 
>>> ------------------------------------------------------------------------ 
>>> *From: *"Digimer" <lists at alteeve.ca> 
>>> *To: *"Slava Bendersky" <volga629 at networklab.ca> 
>>> *Cc: *"Steven Dake" <sdake at redhat.com> , discuss at corosync.org 
>>> *Sent: *Saturday, November 23, 2013 6:05:56 PM 
>>> *Subject: *Re: [corosync] information request 
>>> 
>>> So multicast works with the firewall disabled? 
>>> 
>>> On 23/11/13 17:28, Slava Bendersky wrote: 
>>>> Hello Steven, 
>>>> I disabled iptables and there is no difference; the error message is the same, but at 
>>>> least multicast wasn't generating the error. 
>>>> 
>>>> 
>>>> Slava. 
>>>> 
>>>> ------------------------------------------------------------------------ 
>>>> *From: *"Digimer" <lists at alteeve.ca> 
>>>> *To: *"Slava Bendersky" <volga629 at networklab.ca> , "Steven Dake" 
>>>> <sdake at redhat.com> 
>>>> *Cc: *discuss at corosync.org 
>>>> *Sent: *Saturday, November 23, 2013 4:37:36 PM 
>>>> *Subject: *Re: [corosync] information request 
>>>> 
>>>> Does either mcast or unicast work if you disable the firewall? If so, 
>>>> then at least you know for sure that iptables is the problem. 
>>>> 
>>>> The link here shows the iptables rules I use (for corosync in mcast and 
>>>> other apps): 
>>>> 
>>>> https://alteeve.ca/w/AN!Cluster_Tutorial_2#Configuring_iptables 
>>>> 
>>>> digimer 
>>>> 
>>>> On 23/11/13 16:12, Slava Bendersky wrote: 
>>>>> Hello Steven, 
>>>>> Then this is what I see when set up through UDPU: 
>>>>> 
>>>>> Nov 23 22:08:13 corosync [MAIN ] Compatibility mode set to whitetank. 
>>>>> Using V1 and V2 of the synchronization engine. 
>>>>> Nov 23 22:08:13 corosync [TOTEM ] adding new UDPU member {10.10.10.1} 
>>>>> Nov 23 22:08:16 corosync [MAIN ] Totem is unable to form a cluster 
>>>>> because of an operating system or network fault. The most common cause 
>>>>> of this message is that the local firewall is configured improperly. 
>>>>> 
>>>>> 
>>>>> Might I be missing some firewall rules? I allowed unicast. 
>>>>> 
>>>>> Slava. 
>>>>> 
>>>>> 
> ------------------------------------------------------------------------ 
>>>>> *From: *"Steven Dake" <sdake at redhat.com> 
>>>>> *To: *"Slava Bendersky" <volga629 at networklab.ca> 
>>>>> *Cc: *discuss at corosync.org 
>>>>> *Sent: *Saturday, November 23, 2013 10:33:31 AM 
>>>>> *Subject: *Re: [corosync] information request 
>>>>> 
>>>>> 
>>>>> On 11/23/2013 08:23 AM, Slava Bendersky wrote: 
>>>>> 
>>>>> Hello Steven, 
>>>>> 
>>>>> My setup 
>>>>> 
>>>>> 10.10.10.1 primary server ----- EoIP tunnel, VPN IPsec ----- DR server 10.10.10.2 
>>>>> 
>>>>> Both servers have 2 interfaces: eth0, which has the default gateway out, and eth1, 
>>>>> where corosync lives. 
>>>>> 
>>>>> Iptables: 
>>>>> 
>>>>> -A INPUT -i eth1 -p udp -m state --state NEW -m udp --dport 5404:5407 
>>>>> -A INPUT -i eth1 -m pkttype --pkt-type multicast 
>>>>> -A INPUT -i eth1 -p igmp 
>>>>> 
>>>>> 
>>>>> Corosync.conf 
>>>>> 
>>>>> totem { 
>>>>> version: 2 
>>>>> token: 160 
>>>>> token_retransmits_before_loss_const: 3 
>>>>> join: 250 
>>>>> consensus: 300 
>>>>> vsftype: none 
>>>>> max_messages: 20 
>>>>> threads: 0 
>>>>> nodeid: 2 
>>>>> rrp_mode: active 
>>>>> interface { 
>>>>> ringnumber: 0 
>>>>> bindnetaddr: 10.10.10.0 
>>>>> mcastaddr: 226.94.1.1 
>>>>> mcastport: 5405 
>>>>> } 
>>>>> } 
>>>>> 
>>>>> Join message 
>>>>> 
>>>>> [root at eusipgw01 ~]# corosync-objctl | grep member 
>>>>> runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(10.10.10.2) 
>>>>> runtime.totem.pg.mrp.srp.members.2.join_count=1 
>>>>> runtime.totem.pg.mrp.srp.members.2.status=joined 
>>>>> runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(10.10.10.1) 
>>>>> runtime.totem.pg.mrp.srp.members.1.join_count=254 
>>>>> runtime.totem.pg.mrp.srp.members.1.status=joined 
>>>>> 
>>>>> Is it possible that the ping is sent out of the wrong interface? 
>>>>> 
>>>>> Slava, 
>>>>> 
>>>>> I wouldn't expect so. 
>>>>> 
>>>>> Which version? 
>>>>> 
>>>>> Have you tried udpu instead? If not, it is preferable to multicast 
>>>>> unless you want absolute performance on cpg groups. In most cases the 
>>>>> performance difference is very small and not worth the trouble of 
>>>>> setting up multicast in your network. 
>>>>> 
>>>>> Fabio had indicated rrp active mode is broken. I don't know the 
>>>>> details, but try passive RRP - it is actually better than active IMNSHO :) 
>>>>> 
>>>>> Regards 
>>>>> -steve 
>>>>> 
>>>>> Slava. 
>>>>> 
>>>>> 
>>>> ------------------------------------------------------------------------ 
>>>>> *From: *"Steven Dake" <sdake at redhat.com> 
>>>>> *To: *"Slava Bendersky" <volga629 at networklab.ca> , discuss at corosync.org 
>>>>> *Sent: *Saturday, November 23, 2013 6:01:11 AM 
>>>>> *Subject: *Re: [corosync] information request 
>>>>> 
>>>>> 
>>>>> On 11/23/2013 12:29 AM, Slava Bendersky wrote: 
>>>>> 
>>>>> Hello Everyone, 
>>>>> Corosync run on box with 2 Ethernet interfaces. 
>>>>> I am getting this message 
>>>>> CPG mcast failed (6) 
>>>>> 
>>>>> Any information thank you in advance. 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> https://github.com/corosync/corosync/blob/master/include/corosync/corotypes.h#L84 
>>>>> 
>>>>> This can occur because: 
>>>>> a) firewall is enabled - there should be something in the logs 
>>>>> telling you to properly configure the firewall 
>>>>> b) a config change is in progress - this is a normal response, and 
>>>>> you should try the request again 
>>>>> c) a bug in the synchronization code is resulting in a blocked 
>>>>> unsynced cluster 
>>>>> 
>>>>> c is very unlikely at this point. 
>>>>> 
>>>>> 2 ethernet interfaces = rrp mode, bonding, or something else? 
>>>>> 
>>>>> Digimer needs moar infos :) 
>>>>> 
>>>>> Regards 
>>>>> -steve 
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________ 
>>>>> discuss mailing list 
>>>>> discuss at corosync.org 
>>>>> http://lists.corosync.org/mailman/listinfo/discuss 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Digimer 
>>>> Papers and Projects: https://alteeve.ca/w/ 
>>>> What if the cure for cancer is trapped in the mind of a person without 
>>>> access to education? 
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> 








</blockquote>



