Node never fenced after token loss and kill because rejoin without restart in RHEL 5.6

Solution Unverified - Updated 2024-08-06T08:07:54+00:00

Issue

A token loss occurred and the node was killed when it attempted to rejoin the cluster without a restart, but it was never fenced.
gfs_controld on a remaining node in the cluster logs repeated cpg_mcast_joined retries while the removed node remains unfenced:

Apr 30 22:04:58 node4 gfs_controld[10877]: cpg_mcast_joined retry 100 MSG_PLOCK
Apr 30 22:04:58 node4 gfs_controld[10877]: cpg_mcast_joined retry 200 MSG_PLOCK

rgmanager became stuck waiting for the node to be fenced:

Apr 30 22:05:02 node4 clurgmgrd[13935]: <info> Waiting for node #1 to be fenced

The output of group_tool dump does not show groupd ever processing a confchg (configuration change), as would be expected following the node removal:

1335837897 cman: node 1 removed
1335837897 add_recovery_set_cman nodeid 1
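
To line up the group_tool dump entries with the syslog messages, it can help to convert the leading number on each dump line into a readable time. The following is a minimal sketch, assuming that the leading field is a Unix epoch timestamp in seconds; the helper name parse_dump_line is hypothetical and not part of any cluster tool. Syslog timestamps are in the node's local time, so the UTC values printed here still need to be shifted by the node's UTC offset before comparing.

```python
from datetime import datetime, timezone


def parse_dump_line(line):
    """Split an 'EPOCH message' line into (UTC datetime, message).

    Assumes the first whitespace-separated field is epoch seconds.
    """
    stamp, _, message = line.partition(" ")
    return datetime.fromtimestamp(int(stamp), tz=timezone.utc), message


# Sample lines copied from the group_tool dump excerpt above.
dump = """1335837897 cman: node 1 removed
1335837897 add_recovery_set_cman nodeid 1"""

for line in dump.splitlines():
    when, message = parse_dump_line(line)
    print(when.strftime("%Y-%m-%d %H:%M:%S UTC"), message)
```

For the sample lines this gives 2012-05-01 02:04:57 UTC, which sits one second before the 22:04:58 gfs_controld messages if the node's clock was four hours behind UTC (an inference from the excerpts, not something the logs state).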

Environment

Red Hat Enterprise Linux (RHEL) 5 with the High Availability Add-On
openais prior to release 0.80.6-28.el5_6.2 (RHEL 5.6), 0.80.6-30.el5_7.1 (RHEL 5.7), or 0.80.6-36.el5 (RHEL 5.8)
