RecoverPoint: Deployment Manager upgrade fails while waiting for a Gen6 RPA to boot after the ISO upgrade - "WARN  - Apply Failed. - Connection failed. Verify the RPA is reachable, make sure the login credentials are correct, check user status and try again."

           

   Article Number: 534102     Article Version: 3     Article Type: Break Fix
   

 


Product:

 

RecoverPoint, RecoverPoint CL, RecoverPoint EX, RecoverPoint SE, RecoverPoint Gen6 Server

 

Issue:

 

 

   

      When upgrading a RecoverPoint cluster that contains Gen6 RPAs, the operation may fail on the Gen6 RPAs with the following error:
          WARN  - Apply Failed. - Connection failed. Verify the RPA is reachable, make sure the login credentials are correct, check user status and try again.         
         
          Error in ui.log (DM Folder\logs\ui.log):
          2019-05-18 04:56:23,429 [ApplyNDUUPdaterThread] (ApplyNDUSummaryPage.java:419) INFO  - setProcessStarted[true] RUNNING         
          2019-05-18 04:56:23,437 [main] (ApplyNDUSummaryPage.java:454) WARN  - Apply Failed. - Connection failed. Verify the RPA is reachable, make sure the login credentials are correct, check user status and try again.         
          2019-05-18 04:56:29,430 [ApplyNDUUPdaterThread] (ApplyNDUSummaryPage.java:419) INFO  - setProcessStarted[false] FAILED_ERROR         
         
          Error in dm.log (DM Folder\logs\dm.log):
          2019-05-18 03:46:28,706 [pool-1-thread-13] (NDUApplyCommand.java:1090) DEBUG - Finished ISO upgrading rpa: RPA [ip=XXX.XXX.XXX.XXX, isCredentails=Credentials [user=boxmgmt, password=*****], fapiCredentials=Credentials [user=admin, password=*****], alternativeISCredentails=null, number=4, siteName=Site1, version=5.0.SP1.P2(e.265), boxWWNs=null, rpaState=null, generation=UNKNOWN]         
          2019-05-18 03:46:28,706 [pool-1-thread-13] (GeneralUtils.java:115) DEBUG - Sleeping for: Waiting for rpas to go down: RPA [ip=XXX.XXX.XXX.XXX, isCredentails=Credentials [user=boxmgmt, password=*****], fapiCredentials=Credentials [user=admin, password=*****], alternativeISCredentails=null, number=4, siteName=Site1, version=5.0.SP1.P2(e.265), boxWWNs=null, rpaState=null, generation=UNKNOWN], time to sleep in millis is: 300000         
          2019-05-18 03:51:28,719 [pool-1-thread-13] (BaseDMCommand.java:198) DEBUG - Command:VerifyRPAsAreUpCommand Started         
          2019-05-18 03:51:28,719 [pool-1-thread-13] (VerifyRPAsAreUpCommand.java:37) DEBUG - Verifing rpas are up: [RPA [ip=XXX.XXX.XXX.XXX, isCredentails=Credentials [user=boxmgmt, password=*****], fapiCredentials=Credentials [user=admin, password=*****], alternativeISCredentails=null, number=4, siteName=Site1, version=5.0.SP1.P2(e.265), boxWWNs=null, rpaState=null, generation=UNKNOWN]], timeoutInSecs=900         
          2019-05-18 03:51:28,720 [pool-1-thread-21] (BaseDMCommand.java:198) DEBUG - Command:VerifyBoxUpLogicCommand Started         
          2019-05-18 03:51:28,721 [pool-1-thread-21] (BaseDMCommand.java:198) DEBUG - Command:AuthenticateCommand Started         
          2019-05-18 03:51:28,721 [pool-1-thread-21] (BaseInstallationServerAdapter.java:75) DEBUG - Connecting to rpa: RPA [ip=XXX.XXX.XXX.XXX, isCredentails=Credentials [user=boxmgmt, password=*****], fapiCredentials=Credentials [user=admin, password=*****], alternativeISCredentails=null, number=4, siteName=Site1, version=5.0.SP1.P2(e.265), boxWWNs=null, rpaState=null, generation=UNKNOWN]         
          2019-05-18 03:51:50,846 [pool-1-thread-21] (BaseInstallationServerAdapter.java:222) INFO  - rpa: RPA [ip=XXX.XXX.XXX.XXX, isCredentails=Credentials [user=boxmgmt, password=*****], fapiCredentials=Credentials [user=admin, password=*****], alternativeISCredentails=null, number=4, siteName=Site1, version=5.0.SP1.P2(e.265), boxWWNs=null, rpaState=null, generation=UNKNOWN]         
           *** has no alternative IS credentials.         
          2019-05-18 03:51:50,847 [pool-1-thread-21] (BaseInstallationServerAdapter.java:216) ERROR - rpa: RPA [ip=XXX.XXX.XXX.XXX, isCredentails=Credentials [user=boxmgmt, password=*****], fapiCredentials=Credentials [user=admin, password=*****], alternativeISCredentails=null, number=4, siteName=Site1, version=5.0.SP1.P2(e.265), boxWWNs=null, rpaState=null, generation=UNKNOWN]         
           *** no alternative IPs, cannot retry operation: VALIDATE_PASSWORD, params={}         
          com.kashya.installation.exception.InstallationServiceException: rpa: RPA [ip=XXX.XXX.XXX.XXX, isCredentails=Credentials [user=boxmgmt, password=*****], fapiCredentials=Credentials [user=admin, password=*****], alternativeISCredentails=null, number=4, siteName=Site1, version=5.0.SP1.P2(e.265), boxWWNs=null, rpaState=null, generation=UNKNOWN]         
           *** has no alternative IS credentials.         
          ...         
          2019-05-18 04:06:43,322 [pool-1-thread-21] (ReturnValue.java:51) ERROR -          
          2019-05-18 04:06:43,322 [pool-1-thread-21] (BaseDMCommand.java:253) DEBUG - Command VerifyBoxUpLogicCommand FINISHED with returnValue=[ReturnValue [_isSuccess=false, _message=Rpa: XXX.XXX.XXX.XXX is down., _data=null, _throwable=com.emc.dm.exception.InstallationFlowException: Rpa: XXX.XXX.XXX.XXX is down., _failureReason=UNKNOWN]]         
          2019-05-18 04:06:43,322 [pool-1-thread-21] (BaseDMCommand.java:207) DEBUG - Command:VerifyBoxUpLogicCommand Finished         
          2019-05-18 04:06:43,323 [pool-1-thread-13] (BaseDMCommand.java:152) ERROR - VerifyRPAsAreUpCommand FAILED         
          ...         
          2019-05-18 04:06:43,324 [pool-1-thread-13] (ReturnValue.java:51) ERROR - Found some RPAs that are down. Their IP addresses are:          
          2019-05-18 04:06:43,324 [pool-1-thread-13] (BaseDMCommand.java:253) DEBUG - Command VerifyRPAsAreUpCommand FINISHED with returnValue=[ReturnValue [_isSuccess=false, _message=Found some RPAs that are down. Their IP addresses are: , _data=null, _throwable=com.emc.dm.exception.InstallationFlowException: Rpa: XXX.XXX.XXX.XXX is down., _failureReason=UNKNOWN]]         
          2019-05-18 04:06:43,324 [pool-1-thread-13] (BaseDMCommand.java:207) DEBUG - Command:VerifyRPAsAreUpCommand Finished         
          2019-05-18 04:06:43,325 [pool-1-thread-13] (DMUtils.java:51) ERROR - Got a failed return value for: Verifing rpas are up after ISO upgrade, return value is: ReturnValue [_isSuccess=false, _message=Found some RPAs that are down. Their IP addresses are: , _data=null, _throwable=com.emc.dm.exception.InstallationFlowException: Rpa: XXX.XXX.XXX.XXX is down., _failureReason=UNKNOWN]         
          com.emc.dm.exception.InstallationFlowException: Rpa: XXX.XXX.XXX.XXX is down.         
              at com.emc.dm.commands.infra.Commands.throwFirstRVExceptionIfThereAreAny(Commands.java:53)         
              at com.emc.dm.commands.infra.Commands.invokeAllAndVerifySuccess(Commands.java:131)         
              at com.emc.dm.commands.infra.BaseDMCommand.invokeAllAndVerifySuccess(BaseDMCommand.java:101)         
              at com.emc.dm.commands.concrete.verify.VerifyRPAsAreUpCommand.inner(VerifyRPAsAreUpCommand.java:48)         
              at com.emc.dm.commands.infra.BaseDMCommand.execute(BaseDMCommand.java:41)         
              at com.emc.dm.commands.concrete.status.NDUApplyCommand.runRPAPhaseISOUpgradeStep(NDUApplyCommand.java:459)         
              at com.emc.dm.commands.concrete.status.NDUApplyCommand.runRPAPhase(NDUApplyCommand.java:101)         
              at com.emc.dm.commands.concrete.status.NDUApplyCommand.inerExecuteAction(NDUApplyCommand.java:73)         
              at com.emc.dm.commands.concrete.status.BaseInstallationActionWithStatusLogicCommand.innerWithoutReturnValue(BaseInstallationActionWithStatusLogicCommand.java:37)         
              at com.emc.dm.commands.concrete.status.BaseStatusCommand.inner(BaseStatusCommand.java:50)         
              at com.emc.dm.commands.infra.BaseDMCommand.execute(BaseDMCommand.java:41)         
              at com.emc.dm.commands.concrete.status.ApplyCommand.innerWithoutReturnValue(ApplyCommand.java:48)         
              at com.emc.dm.commands.concrete.status.BaseStatusCommand.inner(BaseStatusCommand.java:50)         
              at com.emc.dm.commands.infra.BaseDMCommand.execute(BaseDMCommand.java:41)         
              at com.emc.dm.commands.infra.BaseDMCommand.call(BaseDMCommand.java:161)         
              at com.emc.dm.commands.infra.BaseDMCommand.call(BaseDMCommand.java:32)         
              at java.util.concurrent.FutureTask.run(Unknown Source)         
              at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)         
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)         
              at java.lang.Thread.run(Unknown Source)         
          Caused by: com.emc.dm.exception.InstallationFlowException: Rpa: XXX.XXX.XXX.XXX is down.         
              at com.emc.dm.actions.VerifyBoxUpLogicCommand.inner(VerifyBoxUpLogicCommand.java:40)         
              ... 7 more         
          Caused by: com.emc.recoverpoint.utils.state_observer.StateException: State wasn't reached.         
              at com.emc.recoverpoint.utils.state_observer.StateObserver.waitUntilStateIsReached(Predicate.scala:22)         
              at com.emc.dm.actions.VerifyBoxUpLogicCommand.inner(VerifyBoxUpLogicCommand.java:38)         
              ... 7 more
   

                                                             

 

 

Cause:

 

 

   

      Deployment Manager (DM) applies two consecutive timeouts while an RPA completes the ISO upgrade and reboots:
          Waiting for rpas to go down - time to sleep in millis is: 300000 (5 minutes)
          Verifing rpas are up - timeoutInSecs=900 (15 minutes)
         
          Combined, DM allows 20 minutes for the RPA to come back up.
          A Gen6 RPA may take slightly longer to complete the ISO upgrade, approximately 25 minutes, so DM reports the RPA as down before it has finished booting.
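
The 20-minute figure can be checked directly from the two dm.log lines quoted above. The following is a minimal sketch, not a DM tool: the function name total_wait_seconds is hypothetical, and the RPA details in the log excerpts are trimmed to "[...]" for readability.

```python
import re

# Excerpts of the dm.log lines quoted in this article (RPA details trimmed to "[...]").
LOG_EXCERPT = """\
2019-05-18 03:46:28,706 [pool-1-thread-13] (GeneralUtils.java:115) DEBUG - Sleeping for: Waiting for rpas to go down: RPA [...], time to sleep in millis is: 300000
2019-05-18 03:51:28,719 [pool-1-thread-13] (VerifyRPAsAreUpCommand.java:37) DEBUG - Verifing rpas are up: [RPA [...]], timeoutInSecs=900
"""


def total_wait_seconds(log_text):
    """Add the fixed sleep (milliseconds) to the boot-verification timeout (seconds)."""
    sleep_ms = int(re.search(r"time to sleep in millis is: (\d+)", log_text).group(1))
    verify_s = int(re.search(r"timeoutInSecs=(\d+)", log_text).group(1))
    return sleep_ms // 1000 + verify_s


total = total_wait_seconds(LOG_EXCERPT)
# 300 s + 900 s = 1200 s, which is 20 minutes
print(f"DM waits up to {total} s ({total // 60} minutes) for the RPA to come back up")
```

The timestamps in the log agree with this: the sleep starts at 03:46:28 and VerifyRPAsAreUpCommand fails at 04:06:43, about 20 minutes later.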
     
          

                                                             

 

 

Change:

 

 

Using Deployment Manager to upgrade a RecoverPoint cluster with Gen6 RPAs.

 

 

Resolution:

 

 

   

      Workaround:
         
          Wait for DM to fail, verify that the RPA answers ping, and then allow DM to retry the operation.
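
The ping check in the workaround can be scripted if preferred. The sketch below is illustrative only: the function names, the 600-second default, and the 15-second polling interval are assumptions, not DM settings. It calls the operating system's ping command, and a ping reply only shows that the RPA is reachable on the network, not that its services have finished starting. On Windows, ping can exit with code 0 for some "unreachable" replies, so treat a positive result with care.

```python
import platform
import subprocess
import time


def ping_once(host):
    """Return True if one ICMP echo request to host succeeds (uses the OS ping command)."""
    count_flag = "-n" if platform.system().lower() == "windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_until_reachable(host, timeout_s=600, interval_s=15,
                         probe=ping_once, sleep=time.sleep):
    """Poll probe(host) until it succeeds or timeout_s elapses; return True on success."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(host):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Example (replace the placeholder with the RPA management IP):
#   if wait_until_reachable("XXX.XXX.XXX.XXX"):
#       print("RPA answers ping; retry the operation in DM")
```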
         
      Resolution:
     
      Before starting Deployment Manager for the upgrade, change the following parameter in DM Folder\conf\dm.properties from
      # Timeout in seconds for a box to come up from reboot after auto upgrade
      BOX_UP_AFTER_AUTO_UPGRADE_TIMEOUT_IN_SECS = 900
     
      to
      # Timeout in seconds for a box to come up from reboot after auto upgrade
      BOX_UP_AFTER_AUTO_UPGRADE_TIMEOUT_IN_SECS = 1800
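
If the change has to be made on several DM installations, the edit can be scripted. This is a minimal sketch under stated assumptions: set_timeout is a hypothetical helper, not part of Deployment Manager, and it works on the text of dm.properties rather than on the file itself so that it can be tried safely. Back up the original file before writing any result back to DM Folder\conf\dm.properties.

```python
import re

KEY = "BOX_UP_AFTER_AUTO_UPGRADE_TIMEOUT_IN_SECS"


def set_timeout(properties_text, new_value=1800):
    """Return properties_text with KEY set to new_value; all other lines are unchanged."""
    # Match only spaces and tabs around the value so line breaks are preserved.
    pattern = re.compile(rf"^([ \t]*{KEY}[ \t]*=[ \t]*)\d+[ \t]*$", re.MULTILINE)
    updated, count = pattern.subn(lambda m: f"{m.group(1)}{new_value}", properties_text)
    if count == 0:
        raise ValueError(f"{KEY} not found")
    return updated


sample = (
    "# Timeout in seconds for a box to come up from reboot after auto upgrade\n"
    "BOX_UP_AFTER_AUTO_UPGRADE_TIMEOUT_IN_SECS = 900\n"
)
print(set_timeout(sample))
```

With the 300-second sleep that precedes the check, a value of 1800 gives the RPA up to 2100 seconds (35 minutes) to come back up, which covers the roughly 25 minutes a Gen6 RPA may need.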