Problem
During testing, one of the LUN paths was disabled on the SAN. As a consequence, one of the nodes no longer started up correctly: it failed to join the cluster.
Error codes:
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-9452 TIME_JOIN 1ac9011: start of slave_response 4
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-9450 TIME_JOIN 1ac9011: slave_response begin do_next 4
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-8222 slave: missing disk 1330071079.29.s51uh10
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-7830 cannot find disk 1330071079.29.s51uh10
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-11092 cleanup_client: (Cannot find disk on slave node) 222
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-11467 kernel_fail_join() : Reconfiguration interrupted: Reason is retry to add a node failed (13, 0)
Feb 24 11:52:41 s51uh10 vmunix: WARNING: VxVM vxio V-5-0-164 Failed to join cluster jav1prodcl, aborting
Feb 24 11:52:41 s51uh10 vmunix: NOTICE: TIME_JOIN 0x1ac9011 VxVM cvm:vxio V-5-3-9367 volcvm_abort: END RECONFIG (ABORT)==========
All disks were available but one path for all LUNs remained off-line.
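To see quickly which disks the joining node reports as missing, the V-5-1-8222 messages can be filtered out of the log. The following is only a sketch: the function name missing_disks is made up for this note, and it assumes the disk ID is the last field of the message, as in the excerpt above.

```shell
# Hypothetical helper: read syslog text on stdin and print each disk ID
# reported in a "V-5-1-8222 slave: missing disk <id>" message, once.
missing_disks() {
    awk '/V-5-1-8222/ && /missing disk/ { print $NF }' | sort -u
}

# Possible usage, assuming the default HP-UX syslog location:
#   missing_disks < /var/adm/syslog/syslog.log
```

If the list is empty, the join failure probably has a different cause than the one described here.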
Solution
Disable ALUA for the SCSI driver:
# scsimgr set_attr -N "/escsi/esdisk" -a alua_enabled=0
# scsimgr save_attr -N "/escsi/esdisk" -a alua_enabled=0
ALUA stands for Asymmetric Logical Unit Access.
- Check the current settings:
# scsimgr get_attr -N "/escsi/esdisk"
SCSI ATTRIBUTES FOR SETTABLE ATTRIBUTE SCOPE : "/escsi/esdisk"
name = transient_secs
current = 120
default = 120
saved =
name = format_secs
current = 86400
default = 86400
saved =
name = start_unit_secs
current = 60
default = 60
saved =
name = max_retries
current = 45
default = 45
saved =
name = path_fail_secs
current = 120
default = 120
saved =
name = esd_secs
current = 30
default = 30
saved =
name = max_q_depth
current = 8
default = 8
saved =
name = load_bal_policy
current = round_robin
default = round_robin
saved =
name = disable_flags
current = WCE
default = WCE
saved =
name = infinite_retries_enable
current = false
default = false
saved =
name = alua_enabled
current = true
default = true
saved =
name = retry_delay_enabled
current = true
default = true
saved =
name = entry_name
current = /escsi/esdisk
default =
saved =
name = ping_type
current = basic
default = basic
saved =
name = ping_recovery
current = immediate
default = immediate
saved =
name = ping_count_threshold
current = 0
default = 0
saved =
name = ping_time_threshold
current = 0
default = 0
saved =
name = congest_max_retries
current = 90
default = 90
saved =
name = priority_type
current = none
default = none
saved =
- Update the attribute:
# scsimgr set_attr -N "/escsi/esdisk" -a alua_enabled=0
# scsimgr save_attr -N "/escsi/esdisk" -a alua_enabled=0
- Check again:
# scsimgr get_attr -N "/escsi/esdisk"
SCSI ATTRIBUTES FOR SETTABLE ATTRIBUTE SCOPE : "/escsi/esdisk"
name = transient_secs
current = 120
default = 120
saved =
name = format_secs
current = 86400
default = 86400
saved =
name = start_unit_secs
current = 60
default = 60
saved =
name = max_retries
current = 45
default = 45
saved =
name = path_fail_secs
current = 120
default = 120
saved =
name = esd_secs
current = 30
default = 30
saved =
name = max_q_depth
current = 8
default = 8
saved =
name = load_bal_policy
current = round_robin
default = round_robin
saved =
name = disable_flags
current = WCE
default = WCE
saved =
name = infinite_retries_enable
current = false
default = false
saved =
name = alua_enabled
current = false
default = true
saved = false
name = retry_delay_enabled
current = true
default = true
saved =
name = entry_name
current = /escsi/esdisk
default =
saved =
name = ping_type
current = basic
default = basic
saved =
name = ping_recovery
current = immediate
default = immediate
saved =
name = ping_count_threshold
current = 0
default = 0
saved =
name = ping_time_threshold
current = 0
default = 0
saved =
name = congest_max_retries
current = 90
default = 90
saved =
name = priority_type
current = none
default = none
saved =
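The full listing is long, so comparing the before and after values by eye is error-prone. As a rough sketch, a small filter can pick one attribute out of the get_attr output. The function name attr_values is made up for this note, and it assumes the "name / current / default / saved" layout shown above, with "saved" as the last line of each attribute.

```shell
# Hypothetical helper: read `scsimgr get_attr` output on stdin and print
# the current, default and saved values of the attribute named in $1.
attr_values() {
    awk -v want="$1" '
        $2 != "="              { next }                 # skip the header line
        $1 == "name"           { hit = ($3 == want) }   # start of an attribute
        hit && $1 == "current" { cur = $3 }
        hit && $1 == "default" { def = $3 }
        hit && $1 == "saved"   {
            printf "current=%s default=%s saved=%s\n", cur, def, $3
            hit = 0
        }
    '
}

# Possible usage on the affected node:
#   scsimgr get_attr -N "/escsi/esdisk" | attr_values alua_enabled
```

With the listing above as input, this should report current=false, default=true and saved=false for alua_enabled, which confirms that the change is both active and saved.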