Problem

During testing, one of the LUN paths was disabled on the SAN. As a result, one of the nodes no longer started up correctly: it failed to join the cluster.

Error messages from the system log:

Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-9452 TIME_JOIN 1ac9011: start of slave_response 4
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-9450 TIME_JOIN 1ac9011: slave_response begin do_next 4
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-8222 slave: missing disk 1330071079.29.s51uh10
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-7830 cannot find disk 1330071079.29.s51uh10
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-11092 cleanup_client: (Cannot find disk on slave node) 222
Feb 24 11:52:41 s51uh10 vxvm:vxconfigd: V-5-1-11467 kernel_fail_join() : Reconfiguration interrupted: Reason is retry to add a node failed (13, 0)
Feb 24 11:52:41 s51uh10 vmunix: WARNING: VxVM vxio V-5-0-164 Failed to join cluster jav1prodcl, aborting
Feb 24 11:52:41 s51uh10 vmunix: NOTICE: TIME_JOIN 0x1ac9011 VxVM cvm:vxio V-5-3-9367 volcvm_abort: END RECONFIG (ABORT)==========

All disks were available, but one path to every LUN remained offline.
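
To see which disks the joining node reports as missing, the V-5-1-8222 messages can be filtered out of the log. The following is a minimal sketch, not part of the original procedure: the function name missing_disks is hypothetical, and it assumes the disk ID is the last field of the message, as in the excerpt above.

```shell
# Print the unique disk IDs that vxconfigd reports as missing
# (message V-5-1-8222). Reads syslog text on standard input.
# Assumption: the disk ID is the last whitespace-separated field.
missing_disks() {
    awk '/V-5-1-8222/ && /missing disk/ { print $NF }' | sort -u
}
```

On HP-UX the system log is normally /var/adm/syslog/syslog.log, so a typical call would be: missing_disks < /var/adm/syslog/syslog.log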

Solution

Disable ALUA for the SCSI disk driver (esdisk):

# scsimgr set_attr -N "/escsi/esdisk" -a alua_enabled=0
# scsimgr save_attr -N "/escsi/esdisk" -a alua_enabled=0

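ALUA stands for Asymmetric Logical Unit Access, a SCSI feature through which a storage array reports some paths to a LUN as optimized and others as non-optimized.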

  1. Check the current settings:
# scsimgr get_attr -N "/escsi/esdisk"

        SCSI ATTRIBUTES FOR SETTABLE ATTRIBUTE SCOPE : "/escsi/esdisk"

name = transient_secs
current = 120
default = 120
saved =

name = format_secs
current = 86400
default = 86400
saved =

name = start_unit_secs
current = 60
default = 60
saved =

name = max_retries
current = 45
default = 45
saved =

name = path_fail_secs
current = 120
default = 120
saved =

name = esd_secs
current = 30
default = 30
saved =

name = max_q_depth
current = 8
default = 8
saved =

name = load_bal_policy
current = round_robin
default = round_robin
saved =

name = disable_flags
current =  WCE
default =  WCE
saved =

name = infinite_retries_enable
current = false
default = false
saved =

name = alua_enabled
current = true
default = true
saved =

name = retry_delay_enabled
current = true
default = true
saved =

name = entry_name
current = /escsi/esdisk
default =
saved =

name = ping_type
current = basic
default = basic
saved =

name = ping_recovery
current = immediate
default = immediate
saved =

name = ping_count_threshold
current = 0
default = 0
saved =

name = ping_time_threshold
current = 0
default = 0
saved =

name = congest_max_retries
current = 90
default = 90
saved =

name = priority_type
current = none
default = none
saved =
  2. Update the attribute (set_attr changes the running value, save_attr makes it persistent across reboots):
# scsimgr set_attr -N "/escsi/esdisk" -a alua_enabled=0
# scsimgr save_attr -N "/escsi/esdisk" -a alua_enabled=0
  3. Check again; alua_enabled should now show current = false and saved = false:
# scsimgr get_attr -N "/escsi/esdisk"

        SCSI ATTRIBUTES FOR SETTABLE ATTRIBUTE SCOPE : "/escsi/esdisk"

name = transient_secs
current = 120
default = 120
saved =

name = format_secs
current = 86400
default = 86400
saved =

name = start_unit_secs
current = 60
default = 60
saved =

name = max_retries
current = 45
default = 45
saved =

name = path_fail_secs
current = 120
default = 120
saved =

name = esd_secs
current = 30
default = 30
saved =

name = max_q_depth
current = 8
default = 8
saved =

name = load_bal_policy
current = round_robin
default = round_robin
saved =

name = disable_flags
current =  WCE
default =  WCE
saved =

name = infinite_retries_enable
current = false
default = false
saved =

name = alua_enabled
current = false
default = true
saved = false

name = retry_delay_enabled
current = true
default = true
saved =

name = entry_name
current = /escsi/esdisk
default =
saved =

name = ping_type
current = basic
default = basic
saved =

name = ping_recovery
current = immediate
default = immediate
saved =

name = ping_count_threshold
current = 0
default = 0
saved =

name = ping_time_threshold
current = 0
default = 0
saved =

name = congest_max_retries
current = 90
default = 90
saved =

name = priority_type
current = none
default = none
saved =
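
Scanning the full attribute listing by eye is error-prone, so the check can be narrowed to a single attribute. The following is a minimal sketch that parses the get_attr output shown above. The function name attr_field is hypothetical, and it assumes the "name = ..." / "current = ..." / "saved = ..." layout printed in this note.

```shell
# Print one field (current, default or saved) of one attribute from
# "scsimgr get_attr" output supplied on standard input.
# Usage: attr_field NAME FIELD
# Prints an empty line if the field has no value (e.g. "saved =").
attr_field() {
    awk -v name="$1" -v field="$2" '
        # A "name = X" line starts a new attribute block.
        $1 == "name" && $2 == "=" { in_block = ($3 == name); next }
        # Inside the wanted block, print the value of the requested field.
        in_block && $1 == field && $2 == "=" { print $3; exit }
    '
}
```

For example, scsimgr get_attr -N "/escsi/esdisk" | attr_field alua_enabled current should print false after the change, and the same call with saved instead of current should also print false once save_attr has been run.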
