Determining Whether an Avamar System Is Experiencing a Time Synchronization (NTP) Issue

Summary: How to determine whether an Avamar system is experiencing a time synchronization (NTP) issue.

Article Content


Instructions

Time synchronization amongst all nodes is essential for the healthy operation of an Avamar system.

If nodes within an Avamar system are not time synchronized, we can expect the following types of behavior:

  • The Avamar server is unable to start
  • Nodes go offline
  • HFScheck fails with MSG_ERR_CGSAN_FAILED
  • HFScheck fails with MSG_ERR_HFSCHECKERRORS
  • Checkpoints fail
  • Garbage Collection fails
  • Data consistency issues (if the time changes during garbage collection)

Examples of error messages commonly reported as a result of loss of time synchronization:

  • samconn::checkallsucceed request failed DPNTIMECHECK=230 
  • FATAL ERROR: <0001> dpn time mismatch: synchronize clocks and retry
  • ERROR: <0001> dpncheckmanager::verifyStartup cgsan died unexpectedly. terminating  
  • not enough valid responses received in time
Avamar can experience NTP time synchronization problems for various reasons, for example:
  • Problems with the time synchronization (ntpd) server
  • Problems with the time synchronization client
  • Network problems
To diagnose such an issue, we must first recognize that it exists.

This article helps the reader determine whether the Avamar system is experiencing a time synchronization issue. Resolving the issue is beyond the scope of this article.

Many websites cover NTP troubleshooting, and the reader is encouraged to consult them. Helpful URLs available at the time of writing are listed under 'Additional Information'.

To proceed:
1. Log in to the Avamar server as admin, following the KB article "Avamar: How to Log in to an Avamar Server and Load Various Keys".

2. To determine whether Avamar nodes are time synchronized, check the current time and date of each node on the Avamar system. See APPENDIX A for sample output.

mapall --all --parallel '/bin/date'

If all nodes report the same date and time (allowing for any time zone difference between the utility node and the data nodes), time is synchronized across all nodes on this system.
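
Comparing human-readable dates by eye becomes error-prone when nodes use different time zones. As a minimal sketch, assuming GNU date (which supports '+%s' for seconds since the epoch), the helper below reads per-node epoch timestamps and prints the largest difference between them. The function name max_skew is hypothetical and is not part of Avamar.

```shell
#!/bin/sh
# Hypothetical helper: read lines from stdin, keep only those that are
# plain epoch timestamps, and print the spread (max - min) in seconds.
# Other lines, such as the "(0.s) ssh ..." banners from mapall, are ignored.
max_skew() {
    awk '
        /^[0-9]+$/ {
            if (n == 0 || $1 < min) min = $1
            if (n == 0 || $1 > max) max = $1
            n++
        }
        END {
            if (n == 0) { print "no timestamps found"; exit 1 }
            print max - min
        }
    '
}

# Possible usage on an Avamar utility node (not run here):
#   mapall --all --parallel 'date +%s' | max_skew
```

Because each timestamp is collected over ssh, a reported spread of a second or so may reflect command latency rather than genuine clock drift.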

3. To keep time synchronized on the nodes, Avamar uses Network Time Protocol (NTP). The Linux command "ntpq -p" returns the state of time synchronization. See APPENDIX B for sample output.

mapall --all --noerror '/usr/sbin/ntpq -p'

 

4. General observations on the sample output in APPENDIX B:

  • All nodes are set to prefer 128.xxx.xxx.xx as the primary time source.
  • The secondary time source for all nodes is the local BIOS clock on "avmtest1" (node 0.s).
  • The tertiary time source is set to be avmtest2 (node 0.0) which is itself referring to avmtest1.
  • All nodes are synchronizing with avmtest1. The time server marked with an asterisk (*) is the one that the node is currently synchronizing with.
  • In this case, 128.xxx.xxx.xx is located remotely. Its 'reach' value is 0, meaning that it is currently unreachable and cannot serve as a time source.
  • avmtest1 and avmtest2 both have a reachability register of octal 377. This is the highest figure attainable. Therefore, the nodes are all synchronizing with the secondary source.
Note: A full discussion of the 'reach' field is beyond the scope of this article. In short, the 'reach' value reports the status of the previous eight transactions between the NTP client and the NTP server. A value of 377 means that the last eight transactions were all successful. See the references below to understand how the 'reach' value works.
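
The 'reach' value is an 8-bit register printed in octal, so expanding it to binary shows which of the last eight polls succeeded. The following is a minimal sketch of that conversion; the function name reach_bits is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: expand an octal 'reach' value into eight bits.
# The leftmost bit is the oldest poll and the rightmost is the most recent;
# 1 means a reply was received, 0 means no reply.
reach_bits() {
    val=$((0$1))    # the leading 0 makes the shell read the value as octal
    bits=""
    i=7
    while [ "$i" -ge 0 ]; do
        bits="${bits}$(( (val >> i) & 1 ))"
        i=$((i - 1))
    done
    printf '%s\n' "$bits"
}

reach_bits 377   # prints 11111111 (all eight polls answered)
reach_bits 376   # prints 11111110 (the most recent poll was missed)
reach_bits 0     # prints 00000000 (no replies at all)
```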

5. Looking at the ntpq output for node 0.2:

(0.2) ssh  -x  admin@10.64.18.164 '/usr/sbin/ntpq -p'
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 128.xxx.xxx.xx  .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
*avmtest1.emcvmw LOCAL(0)         9 u   54  256  377    0.085   -0.116   0.002
+avmtest2.emcvmw xx.xx.xx.xxx    10 u   56  256  377    0.090    0.073   0.012

We learn that:
  • Node 0.2 is currently synchronizing with avmtest1 (the entry marked with an asterisk).
  • avmtest1 is at stratum 9, implying that node 0.2 is at stratum 10.
  • Node 0.2 is polling avmtest1 once every 256 seconds.
  • The reachability register for avmtest1 is octal 377.
  • The offset between the clock on node 0.2 and the clock on avmtest1 is -0.116 milliseconds (116 microseconds in magnitude).
  • The round-trip delay to avmtest1 is 0.085 milliseconds (85 microseconds).
  • The measurement of the variance in latency on the network (jitter) between node 0.2 and avmtest1 is 0.002 milliseconds (2 microseconds).
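
The same reading of the ntpq columns can be sketched with awk. The helper below is an illustration only: the name check_peers and its output labels are hypothetical, and it assumes the standard ten-column 'ntpq -p' layout shown above, fed with the output of a single node. It reports any peer whose 'reach' is 0 and the peer currently selected for synchronization.

```shell
#!/bin/sh
# Hypothetical helper: summarize 'ntpq -p' output read from stdin.
# Columns: remote refid st t when poll reach delay offset jitter
check_peers() {
    awk '
        $1 == "remote" || /^=+$/ || NF != 10 { next }   # keep only peer rows
        {
            tally = substr($0, 1, 1)                 # first column: " ", "*", "+", ...
            name = $1
            if (tally != " ") name = substr($1, 2)   # strip the tally character
            if ($7 == "0") printf "UNREACHABLE %s\n", name
            if (tally == "*") {
                printf "SELECTED %s reach=%s offset_ms=%s\n", name, $7, $9
                found = 1
            }
        }
        END { if (!found) print "NO_SELECTED_PEER" }
    '
}

# Possible usage on a single node (not run here):
#   /usr/sbin/ntpq -p | check_peers
```

A result of NO_SELECTED_PEER would indicate that the node is not currently synchronizing with any of its configured sources.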

NTP configuration (/etc/ntp.conf):
The /etc/ntp.conf file on node 0.2 corresponds to the ntpq output above.

#Customer premises / external time servers.

#
server xxx.xxx.xxx.xx     <--  Primary time source (this is an external server located remote to the Avamar grid)
# - - - - -
# DPN time servers here and in the other module(s).
#
server xx.xx.xx.xxx   <--  Secondary time source (this is the utility node)
server xx.xx.xx.xxx   <--  Tertiary time source (this is node 0.0)
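
To compare the configured sources against the ntpq output quickly, the 'server' lines can be extracted from the file. This is a minimal sketch; the function name list_time_sources is hypothetical, and the numbering only reflects the order in which the entries appear in the file.

```shell
#!/bin/sh
# Hypothetical helper: print the address on each "server" line of an
# ntp.conf-style file, numbered in the order listed. Comment lines are skipped
# because their first field is not the word "server".
list_time_sources() {
    awk '$1 == "server" { n++; printf "%d %s\n", n, $2 }' "${1:-/etc/ntp.conf}"
}

# Possible usage (defaults to /etc/ntp.conf when no file is given):
#   list_time_sources
```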

Logging:
NTP logging is directed to the /var/log/messages file.
To view NTP-related log entries, grep the contents of /var/log/messages* for 'ntp'.
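
As a minimal sketch of that search, the wrapper below performs a case-insensitive match so that entries logged as 'ntpd' or 'NTP' are both caught. The function name ntp_log_lines is hypothetical, and reading /var/log/messages may require elevated privileges on some systems.

```shell
#!/bin/sh
# Hypothetical helper: print lines containing "ntp" (any letter case)
# from the log files given as arguments.
ntp_log_lines() {
    grep -i 'ntp' "$@"
}

# Possible usage on an Avamar node (not run here):
#   ntp_log_lines /var/log/messages*
```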

Resolving Time Synchronization Issues:
If an Avamar system experiences time synchronization issues, the problem must be fixed. Resolving time synchronization issues is beyond the scope of this article.

If an external time server is unreliable, as in the example given above, it is acceptable to use an internal time server. The internal time may drift slowly from UTC, but the most important consideration is that data nodes are time synchronized with one another.

The Avamar asktime utility can be used to select new, preferred time sources for NTP.
See the KB article "Avamar: How to configure NTP on an Avamar Server using asktime".

Additional Information:
http://support.microsoft.com/kb/939322 - Windows Domain controllers should not be used for good time keeping.

Additional Information

APPENDIX A:
Example of all nodes showing synchronized time.

Note: The '--parallel' flag runs the command on each node simultaneously.
Note: The utility node (0.s) is set to the local time zone (in this example, 'BST'), whereas the data nodes are set to the 'UTC' time zone. This is expected behavior.

On a system where time is synchronized, the output is similar to the following:

mapall --all --parallel 'date'

Using /usr/local/avamar/var/probe.xml
(0.s) ssh  -x  admin@xx.xx.xx.xxx 'date'
(0.0) ssh  -x  admin@xx.xx.xx.xxx 'date'
(0.1) ssh  -x  admin@xx.xx.xx.xxx 'date'
(0.2) ssh  -x  admin@xx.xx.xx.xxx 'date'
Mon Jun 20 12:01:12 BST 2011
Mon Jun 20 11:01:12 UTC 2011
Mon Jun 20 11:01:12 UTC 2011
Mon Jun 20 11:01:12 UTC 2011


APPENDIX B:

Example of ntpq output from an Avamar with one utility node and three data nodes:
Note: Adding the 'n' flag to the command below (ntpq -pn) disables name resolution. Output is returned more quickly, but IP addresses are shown instead of hostnames, which can make the output harder to read.

 
mapall --all --noerror '/usr/sbin/ntpq -p'
(0.s) ssh  -x  admin@10.xx.xx.xxx '/usr/sbin/ntpq -p'
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 128.xxx.xxx.xx  .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
*LOCAL(0)        LOCAL(0)         8 l    8   64  377    0.000    0.000   0.001

(0.0) ssh  -x  admin@10.xx.xx.xxx '/usr/sbin/ntpq -p'
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 128.xxx.xxx.xx  .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
*avmtest1.emcvmw LOCAL(0)         9 u  750 1024  377    0.126   -0.197   0.001  

(0.1) ssh  -x  admin@10.xx.xx.xxx '/usr/sbin/ntpq -p'
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 128.xxx.xxx.xx  .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
*avmtest1.emcvmw LOCAL(0)         9 u  194  256  377    0.095   -0.139   0.004
+avmtest2.emcvmw xx.xx.xx.xxx    10 u  189  256  377    0.097    0.062   0.005

(0.2) ssh  -x  admin@10.xx.xx.xxx '/usr/sbin/ntpq -p'
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 128.xxx.xxx.xx  .INIT.          16 u    - 1024    0    0.000    0.000 4000.00
*avmtest1.emcvmw LOCAL(0)         9 u   54  256  377    0.085   -0.116   0.002
+avmtest2.emcvmw xx.xx.xx.xxx    10 u   56  256  377    0.090    0.073   0.012

Article Properties

Affected Product: Avamar
Product: Avamar
Last Published Date: 21 Aug 2023
Version: 3
Article Type: How To