Dell EMC Unity : SPA/B Panic - Alert Text: Storage Processor SP A/B is restarting  Message Id: 14:6046f  (Dell EMC Correctable)


   Article Number: 533464     Article Version: 2     Article Type: Break Fix




Dell EMC Unity 550F





Systems running Unity OE 4.5.x or earlier may experience panics on both SPs in close succession.
    The panic log indicators are:
    CSX RT: panic requested at: <file-unknown>:0 (thread: 139884958963456 aka 139884958963456) [PID:20090 TID:14025 CORE:0 [csx_ic_std.x] [dnsUpd] [04/13/2019 16:18:19 UTC]] (panic action:DEFAULT expr:<no-expr> flags:-) [info:0]   
    Ktrace indicators:   
    "2019-04-13T16:20:19.613Z" "spb" "Kittyhawk_safe" "20094" "unix/spb/root" "ERROR" "1:1678040" :: "Peer SP A is Down." :: Category=System Component=espkg
    "2019-04-13T16:21:16.394Z" "pspb" "Kittyhawk_safe" "20094" "unix/spb/root" "INFO" "1:115000d" :: "Internal Information only. PSM File "udoctor" Deleted." :: Category=System Component=psm   
    "2019-04-13T16:21:25.278Z" "priungpr02_spb" "Neo_CEM" "879" "N/A" "ERROR" "14:603ab" :: "Domain Controller servers configured for the SMB server 'DCNAME' are not reachable." :: Category=User Component=Health   
    "2019-04-13T16:21:25.392Z" "spb" "Neo_CEM" "879" "N/A" "WARN" "14:6046f" :: "Storage Processor SP A is restarting" :: Category=User Component=Health   
    2019/04/13-16:22:02.666821  94K     7EE0388A0705     sade:SMB: 6:[CIFSERVER]  No site defined for domain "". Ensure that the Active Directory Site configuration i   
    2019/04/13-16:22:02.666823    0     7EE0388A0705     sade:SMB: 6:[CIFS-SERVER]  ncludes the data mover's attached network.   
    2019/04/13-16:22:02.774794 107K     7EE0388A0709     sade:SMB: 6:[CIFS=SERVER]  Obsolete DC=XXX at IP=IP-addr of domain DOMNAME removed from list after 605707 seco               <<<<===========   
    2019/04/13-16:22:02.774796    0     7EE0388A0709     sade:SMB: 6:[priungpr02ap002]  nds (DomCnt=5,275592 lastUpdt=Sat Apr  6 16:06:26 2019   
    2019/04/13-16:22:02.774802    3     7EE0388A0709     sade:SMB: 6:[priungpr02ap002]   DC*XXX(IP-addr) R=1 S=17,0xc0000233/27   
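The indicators above can be searched for mechanically. The following is a minimal Python sketch, not a Dell EMC tool: the function name `scan_indicators` and the layout of its result are hypothetical, and the two patterns are derived only from the sample lines quoted in this article, so other log formats may need adjustment. Because the log wraps long messages across lines, the second pattern matches only up to "seco". The sketch also converts the "removed from list after N seconds" value to days; for the sample value of 605707 seconds this is roughly 7 days.

```python
import re

# Patterns derived from the sample log lines in this article (assumption:
# real logs follow the same layout).
RESTART_ALERT = re.compile(
    r'"14:6046f"\s*::\s*"Storage Processor SP ([AB]) is restarting"'
)
OBSOLETE_DC = re.compile(
    r'Obsolete DC=(\S+) at IP=(\S+) of domain (\S+) '
    r'removed from list after (\d+) seco'
)

SECONDS_PER_DAY = 86400


def scan_indicators(lines):
    """Collect SP restart alerts and obsolete-DC removals from log lines."""
    findings = {"restarting_sps": [], "obsolete_dcs": []}
    for line in lines:
        alert = RESTART_ALERT.search(line)
        if alert:
            findings["restarting_sps"].append(alert.group(1))
        removal = OBSOLETE_DC.search(line)
        if removal:
            dc, ip, domain, seconds = removal.groups()
            age_days = round(int(seconds) / SECONDS_PER_DAY, 2)
            findings["obsolete_dcs"].append((dc, ip, domain, age_days))
    return findings


# Two of the lines quoted above, used as sample input.
SAMPLE = [
    '"2019-04-13T16:21:25.392Z" "spb" "Neo_CEM" "879" "N/A" "WARN" "14:6046f" '
    ':: "Storage Processor SP A is restarting" :: Category=User Component=Health',
    '2019/04/13-16:22:02.774794 107K     7EE0388A0709     sade:SMB: 6:[CIFS=SERVER]  '
    'Obsolete DC=XXX at IP=IP-addr of domain DOMNAME removed from list after 605707 seco',
]

result = scan_indicators(SAMPLE)
print(result)
```

A restart alert together with an obsolete-DC removal aged about 7 days is consistent with the signature described in this article, but it is only a hint and does not replace a review of the panic log.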






The root cause is related to a decommissioned domain controller (DC) on the customer's network.
    Because of a Dclist parameter that keeps track of cached DCs, the SPs may panic due to a code-related problem.






    The customer has decommissioned and removed a DC from their set of domain controller servers.






Workaround applied: the parameter was disabled (set to "0").






  Q: Are there any potential issues with running the system with the parameter disabled?
  A: With the parameter change, a DC that has been removed from the environment remains in the list of DCs that the system tries to contact. This should not cause any impact. The DC removal feature was added because some customers did not want retired DCs to be remembered and contacted. Note that "ghost" DC entries do not survive a reboot, so they are not permanent even with the feature disabled.
  Q: If a DC is rebooted or shut down, can that panic the SP?
  A: Yes, if the DC stays down for more than a week. The default interval for checking DCs is 7 days, so the system scans and purges its DC data again after a week; if the DC is down at that point, the system treats the condition as an error and panics. Once the parameter is set to "0", the check no longer runs and the panic is avoided.
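The timing described in the answer above can be illustrated with a short Python sketch. This is a simplified model and not Unity code: the function name `dc_check_is_due` and the argument `check_days` are hypothetical stand-ins for the behavior described here (a default interval of 7 days, and a value of 0 that disables the check). The timestamps are taken from the log excerpt in this article (lastUpdt and the time of the "Obsolete DC" entry).

```python
from datetime import datetime, timedelta


def dc_check_is_due(last_update, now, check_days=7):
    """Model of the stale-DC check: True if a cached DC is old enough to be examined.

    check_days=0 models the workaround, where the check never runs.
    """
    if check_days == 0:
        return False
    return now - last_update >= timedelta(days=check_days)


# Timestamps from the log excerpt in this article.
last_update = datetime(2019, 4, 6, 16, 6, 26)   # lastUpdt=Sat Apr  6 16:06:26 2019
log_time = datetime(2019, 4, 13, 16, 22, 2)     # time of the "Obsolete DC" entry

default_behavior = dc_check_is_due(last_update, log_time)               # 7-day default
with_workaround = dc_check_is_due(last_update, log_time, check_days=0)  # check disabled
print(default_behavior, with_workaround)
```

Under this model, the DC in the excerpt crosses the 7-day threshold shortly before the log entry, while with the value set to 0 no check is ever due.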