[IDPA] IDPA deployment on DP8300, DP8800 fails on Avamar with the error "FAILED: Configuring Avamar server. ERROR: Avamar configuration failed"

           

   Article Number:     538753                                   Article Version: 2     Article Type:    Break Fix 
   

 


Product:

 

Integrated Data Protection Appliance Family, Integrated Data Protection Appliance Hardware, Integrated Data Protection Appliance SW, DP8300 Appliance, DP8800 Appliance, Integrated Data Protection Appliance Software

 

Issue:

 

 

On the ACM UI, the following error can be seen when the Avamar deployment/configuration fails at 11%:

[Screenshot: Avamar configuration failure error on the ACM UI]

The diagnostic report on the ACM UI shows the following error message:

[Screenshot: diagnostic report error message on the ACM UI]

In the server.log, the following failure messages are seen for the Avamar configuration:
     
     
2019-10-18 15:02:19,458 INFO  [Thread-3605]-avadapter.ConfigAvamarServerTask$StreamWrapper: Avamar config CLI status : Checking Task Prerequisites (1 of 172)             |   0 | 2019/10/18-20:52:18 |
2019-10-18 15:02:19,458 INFO  [Thread-3605]-avadapter.ConfigAvamarServerTask$StreamWrapper: Avamar config CLI status :
2019-10-18 15:02:19,458 INFO  [Thread-3605]-avadapter.ConfigAvamarServerTask$StreamWrapper: Avamar config CLI status : Available Transitions:
2019-10-18 15:02:19,458 INFO  [Thread-3605]-avadapter.ConfigAvamarServerTask$StreamWrapper: Avamar config CLI status : Retry Current Task
2019-10-18 15:02:19,458 INFO  [Thread-3605]-avadapter.ConfigAvamarServerTask$StreamWrapper: Avamar config CLI status : Call EMC Support
2019-10-18 15:02:19,458 INFO  [Thread-3605]-avadapter.ConfigAvamarServerTask$StreamWrapper: Avamar config CLI status : Abort Workflow
2019-10-18 15:02:19,461 INFO  [pool-63-thread-2]-avadapter.ConfigAvamarServerTask: Execution of Avamar config CLI completed. Exit Value = 0
2019-10-18 15:02:19,461 INFO  [pool-63-thread-2]-avadapter.ConfigAvamarServerTask: Error while executing Avamar config CLI :
2019-10-18 15:02:19,462 ERROR [pool-63-thread-2]-avadapter.ConfigAvamarServerTask: Exception occurred while executing Avamar config server task. com.emc.vcedpa.common.exception.ApplianceException: Avamar configuration failed.
     
     
     
     
Review the logs on the Avamar Utility node under "/usr/local/avamar/var/avi/server_log/avinstaller.log.0" and "/data01/avamar/repo/<package_name>/tmp/workflow.log".
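To surface the relevant failures in those logs quickly, the error and exception lines can be filtered with a small helper. This is only a sketch: the function name `scan_log` is illustrative, and the paths in the usage comments are the log locations quoted above.

```shell
#!/bin/bash
# scan_log <file>: print the last 20 error/exception lines from a log file.
# The function name is illustrative, not part of the appliance tooling.
scan_log() {
    grep -iE 'error|exception' "$1" | tail -n 20
}

# Typical usage on the Avamar Utility Node:
# scan_log /usr/local/avamar/var/avi/server_log/avinstaller.log.0
# scan_log /data01/avamar/repo/<package_name>/tmp/workflow.log
```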
       
        The following errors can be seen on the Avamar side :
   
        

2019-10-18 21:02:18 (+0000) 4107124 INFO: The following exeception was raised while running the command "sudo -H -u root bash -c \"ssh-agent bash /usr/local/avamar/var/run_command-tmpscript.18709.4107124\" </dev/null >/usr/local/avamar/var/run_command-outer-output.18709.4107124 2>&1": #<SignalException: SIGTERM>
2019-10-18 21:02:18 (+0000) 4107124 INFO: The tmpscript at the time of the exception:
2019-10-18 21:02:18 (+0000) 4107124 INFO: ["#!/bin/bash\n", "# Temporary script\n", "[ -r /etc/profile ] && . /etc/profile\n", "export LOGNAME=root USERNAME=root\n", "ssh-add /home/admin/.ssh/dpnid && (ssh -q root@x.x.x.x env) >/usr/local/avamar/var/run_command-sysout.18709.4107124 2>&1 ; RC=$?\n", "[ -d /usr/local/avamar/var/run_command-keys.18709.4107124 ] && rm -rf /usr/local/avamar/var/run_command-keys.18709.4107124\n", "exit $RC\n"]
2019-10-18 21:02:18 (+0000) 4107124 INFO: The sysout at the time of the exception:
2019-10-18 21:02:18 (+0000) 4107124 INFO: []
2019-10-18 21:02:18 (+0000) 4107124 INFO: The outer_output at the time of the exception:
2019-10-18 21:02:18 (+0000) 4107124 INFO: ["Identity added: /home/admin/.ssh/dpnid (/home/admin/.ssh/dpnid)\n"]
From the exception above, it is clear that the Avamar configuration workflow was unable to SSH to one of the nodes (the IP shown as "x.x.x.x"). This caused the prechecks to fail on Avamar, the Avamar Install SLES workflow to halt, and the Avamar configuration to be marked as failed on the ACM.
   
                                                                

 

 

Cause:

 

 

The Avamar PreCheck task in the Install SLES workflow runs a few commands and verifies that it can log in successfully to all storage nodes with the SSH keys loaded. In this case, it was unable to SSH to one of the nodes, which caused the precheck workflow to halt on the Avamar AVI and the Avamar configuration to be marked as failed on the ACM. The SSH issue on that node must be fixed.
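The precheck essentially amounts to a non-interactive SSH probe of each node. A minimal sketch of that idea is shown below; `check_node` and `SSH_CMD` are illustrative names (not part of the product), and the probe mirrors the kind of login test the workflow performs, not its exact implementation.

```shell
#!/bin/bash
# check_node <ip>: returns 0 if a non-interactive SSH probe to the node
# succeeds, mirroring the kind of login test the precheck performs.
# SSH_CMD is overridable so the probe can be dry-run without a real node.
SSH_CMD=${SSH_CMD:-ssh}

check_node() {
    local ip=$1
    $SSH_CMD -q -o BatchMode=yes -o ConnectTimeout=10 "root@$ip" env >/dev/null 2>&1
}

# Example: probe each storage node and report reachability.
# for ip in 192.168.255.1 192.168.255.2 192.168.255.3; do
#     check_node "$ip" && echo "node $ip: OK" || echo "node $ip: UNREACHABLE"
# done
```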

 

 

Resolution:

 

 

1: Log in to the Avamar Utility Node as the 'admin' user.
   Note: The password for the Avamar nodes is still set to the default at this point, which is "changeme".

2: Load the SSH keys on the Utility Node:
   ssh-agent bash
   ssh-add /home/admin/.ssh/dpnid
       
       
3: SSH to each storage node to make sure they are all reachable.

   For DP8300, run the following from the Utility Node and make sure all required nodes respond:
   mapall date
   ssn 0.0
   ssn 0.1
   ssn 0.2

   For DP8800, run the following from the Utility Node and make sure all required nodes respond:
   mapall date
   ssn 0.0
   ssn 0.1
   ssn 0.2
   ssn 0.3
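If many nodes are involved, the `mapall date` output can be sifted mechanically rather than by eye. The awk helper below is only a sketch (the function name `find_silent_nodes` is made up for illustration): it prints the node IDs whose ssh line was not followed by any output line.

```shell
#!/bin/bash
# find_silent_nodes: reads `mapall date` output on stdin and prints the
# node IDs (e.g. 0.2) whose ssh invocation produced no following output,
# which indicates the node did not answer.
find_silent_nodes() {
    awk '
        /^\([0-9]+\.[0-9]+\) ssh / {
            # A new ssh line: if the previous node never replied, report it.
            if (pending != "") print pending
            match($0, /\([0-9]+\.[0-9]+\)/)
            pending = substr($0, RSTART + 1, RLENGTH - 2)
            next
        }
        pending != "" { pending = "" }   # any other line counts as a reply
        END { if (pending != "") print pending }
    '
}

# Typical usage on the Utility Node:
# mapall date 2>&1 | find_silent_nodes
```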
       
4: If any node does not return the date, or throws an error when running the date command above, that node is the affected node. You may see output like the following on a bad node:

   admin@xxxxxxxxxxxx:~/.ssh/> mapall date
   Using /usr/local/avamar/var/probe.xml
   (0.0) ssh -q  -x  -o GSSAPIAuthentication=no admin@x.x.x.x 'date'
   Tue Nov 12 16:22:43 UTC 2019
   (0.1) ssh -q  -x  -o GSSAPIAuthentication=no admin@x.x.x.x 'date'
   Tue Nov 12 16:23:38 UTC 2019
   (0.2) ssh -q  -x  -o GSSAPIAuthentication=no admin@x.x.x.x 'date'

   In the example above, node 0.2 did not return the date output, confirming the issue with that node. You can also SSH to the node manually to confirm:

   ssn 0.2
   Using /usr/local/avamar/var/probe.xml
   ssh -x admin@x.x.x.x ''
   ^C
   ERROR: command "ssh -x admin@x.x.x.x  ''" failed, return: 2, exitcode: 0, signal: 2, dumped core: 0

5: Reboot the affected node, or restart the sshd service on that node, to fix the issue. The IPMITool utility can be used from the ACM to remotely reboot the node.

   IPMITool IP addresses for each Avamar node:
   Utility Node: 192.168.100.114
   Storage Node 1: 192.168.100.115
   Storage Node 2: 192.168.100.116
   Storage Node 3: 192.168.100.117
   Storage Node 4: 192.168.100.118

   From the ACM, issue the following commands to reboot the affected node:

   Check power status via the command line:
   ipmitool -I lanplus -H 192.168.100.11x -U root -P Idpa_1234 power status

   Power off the node via the command line:
   ipmitool -I lanplus -H 192.168.100.11x -U root -P Idpa_1234 power off

   Power on the node via the command line:
   ipmitool -I lanplus -H 192.168.100.11x -U root -P Idpa_1234 power on

   Note: Test a manual SSH to the affected node after it is rebooted. Repeat step 3 to confirm the issue is fixed.
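The three ipmitool invocations can be wrapped into a single helper so the off/on sequence is not typed by hand. This is a sketch only (`power_cycle`, `IPMI_CMD`, and `POWER_OFF_WAIT` are illustrative names); the flags and credentials are the ones listed in the article.

```shell
#!/bin/bash
# power_cycle <bmc-ip>: check status, power the node off, wait, and power
# it back on. IPMI_CMD is overridable so the sequence can be dry-run.
IPMI_CMD=${IPMI_CMD:-ipmitool}

power_cycle() {
    local bmc_ip=$1
    $IPMI_CMD -I lanplus -H "$bmc_ip" -U root -P Idpa_1234 power status
    $IPMI_CMD -I lanplus -H "$bmc_ip" -U root -P Idpa_1234 power off
    sleep "${POWER_OFF_WAIT:-30}"      # let the node fully power down
    $IPMI_CMD -I lanplus -H "$bmc_ip" -U root -P Idpa_1234 power on
}

# Example (storage node 2 on the addresses listed above):
# power_cycle 192.168.100.116
```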
   
6: On the Avamar AVI Installer UI, abort the Avamar Install workflow. On the ACM UI, click Retry to roll back the failed configuration.

7: Retry the configuration.
                                                           

 

 

Notes:

 

 

Remove any Avamar configuration checkpoint .dat files in the ACM config directory before proceeding with the deployment again.
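As a sketch of that cleanup: the helper below lists .dat files in a directory so they can be reviewed before removal. `list_checkpoint_files` is an illustrative name, and the directory in the usage comment is a placeholder; substitute the actual ACM config directory on your appliance.

```shell
#!/bin/bash
# list_checkpoint_files <dir>: list checkpoint .dat files in the given
# directory so they can be reviewed before removal. The directory argument
# is a placeholder; substitute the real ACM config directory.
list_checkpoint_files() {
    [ -d "$1" ] && find "$1" -maxdepth 1 -type f -name '*.dat' -print
}

# Review the output first, then remove the files, e.g.:
# list_checkpoint_files /path/to/acm/config | xargs -r rm -v
```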