Commit 1c01967
Changwei Ge authored and torvalds committed
ocfs2: fix cluster hang after a node dies

When a node dies, the other live nodes have to choose a new master for each existing lock resource that was mastered by the dead node. In the ocfs2/dlm implementation, this is done by dlm_move_lockres_to_recovery_list(), which marks those lock resources as DLM_LOCK_RES_RECOVERING and manages them via a list from which DLM later changes each lock resource's master.

So without invoking dlm_move_lockres_to_recovery_list(), no master will be chosen after DLM recovery completes, since no lock resource can be found through the ::resource list.

What's worse, if DLM_LOCK_RES_RECOVERING is not set on lock resources mastered by a dead node, synchronization among nodes breaks.

So invoke dlm_move_lockres_to_recovery_list() again.

Fixes: ee8f7fc ("ocfs2/dlm: continue to purge recovery lockres when recovery master goes down")
Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373CED6E0F9@H3CMLB14-EX.srv.huawei-3com.com
Signed-off-by: Changwei Ge <[email protected]>
Reported-by: Vitaly Mayatskikh <[email protected]>
Tested-by: Vitaly Mayatskikh <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

1 parent 98d6c09 commit 1c01967

1 file changed, +1 -0 lines changed
fs/ocfs2/dlm/dlmrecovery.c

Lines changed: 1 addition & 0 deletions
@@ -2419,6 +2419,7 @@ static void dlm_do_local_recovery_cleanup(struct dlm_ctxt *dlm, u8 dead_node)
 					dlm_lockres_put(res);
 					continue;
 				}
+				dlm_move_lockres_to_recovery_list(dlm, res);
 			} else if (res->owner == dlm->node_num) {
 				dlm_free_dead_locks(dlm, res, dead_node);
 				__dlm_lockres_calc_usage(dlm, res);
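
To make the effect of the added line easier to follow, here is a minimal user-space sketch of the mechanism the commit message describes. It is not the kernel code: every `model_*` name, the `MODEL_RES_RECOVERING` flag, and the `with_fix` switch are hypothetical stand-ins invented for illustration, and the real DLM structures and locking are considerably more involved. The sketch only assumes what the message states: resources owned by the dead node must be flagged as recovering and queued on a recovery list, and a new master is later assigned only to resources found on that list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for DLM_LOCK_RES_RECOVERING. */
#define MODEL_RES_RECOVERING 0x1u

/* Simplified lock resource: an owner node, a state word, a list link. */
struct model_lockres {
	unsigned char owner;
	unsigned int state;
	struct model_lockres *next_reco;
};

/* Simplified DLM context holding the recovery list head. */
struct model_dlm {
	unsigned char node_num;
	struct model_lockres *reco_list;
};

/* Models the role of dlm_move_lockres_to_recovery_list(): flag and queue. */
static void model_move_to_recovery_list(struct model_dlm *dlm,
					struct model_lockres *res)
{
	assert(dlm != NULL && res != NULL);
	if (res->state & MODEL_RES_RECOVERING)
		return; /* already queued */
	res->state |= MODEL_RES_RECOVERING;
	res->next_reco = dlm->reco_list;
	dlm->reco_list = res;
}

/*
 * Models local cleanup after dead_node dies. with_fix == 0 imitates the
 * regression (resources owned by the dead node are never queued);
 * with_fix != 0 imitates the behaviour restored by the one-line fix.
 */
static void model_local_cleanup(struct model_dlm *dlm,
				struct model_lockres *all, size_t n,
				unsigned char dead_node, int with_fix)
{
	for (size_t i = 0; i < n; i++) {
		if (all[i].owner == dead_node && with_fix)
			model_move_to_recovery_list(dlm, &all[i]);
	}
}

/* Remasters every queued resource; returns how many were remastered. */
static size_t model_finish_recovery(struct model_dlm *dlm,
				    unsigned char new_master)
{
	size_t count = 0;

	while (dlm->reco_list != NULL) {
		struct model_lockres *res = dlm->reco_list;

		dlm->reco_list = res->next_reco;
		res->next_reco = NULL;
		res->owner = new_master;
		res->state &= ~MODEL_RES_RECOVERING;
		count++;
	}
	return count;
}
```

In this model, calling `model_local_cleanup()` with `with_fix` set to 0 leaves the recovery list empty, so `model_finish_recovery()` remasters nothing and the resources stay owned by the dead node. With `with_fix` set to 1, each such resource is flagged, queued, and then handed to the new master.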
