|Article Number: 516547||Article Version: 3||Article Type: Break Fix|
ViPR SRM 3.0,ViPR SRM 3.5,ViPR SRM 3.6,ViPR SRM 3.7,ViPR SRM 4.0,ViPR SRM 4.1,ViPR SRM,Watch4net,Watch4net 6.6,Watch4net 6.5,Watch4net 6.4,Watch4net 6.3
SEVERE -- [2017-12-19 23:27:36 AEDT] -- CollectorWorkerCommandTask::run(): null
java.lang.NullPointerException
    at com.watch4net.apg.common.jmxutils.VMUtils.listPrivateJVMs(VMUtils.java:176)
    at com.watch4net.apg.common.jmxutils.VMUtils.getVmInfos(VMUtils.java:53)
    ....
When the Watch4net/ViPR SRM installation directory is the mount point of a file system, the fsck utility creates a lost+found directory in the root of that file system each time it is run to check and repair the file system. fsck is run on the file system after a system crash, when the file system is flagged as "dirty".
The lost+found directory is accessible to the root user only. Because the health collector runs as user "apg", it cannot access the lost+found directory. This stops the health collector's search for JVM instances on the host, and as a result the collector does not collect any data.
For example, a binary installation is performed in /app/APG, where /app/APG is the mount point of a file system. When fsck is run on that file system, it creates the directory /app/APG/lost+found/, which is accessible to the root user only. Running 'find /app/APG -name ".*"' as user "apg" on the host then reports a "permission denied" message for the lost+found directory.
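The check described above can be sketched as a small shell function. This is a minimal illustration, not part of the product: the function name check_lost_found is hypothetical, and /app/APG is simply the example path used in this article. It should be run as the user the collector runs as ("apg"), because the result depends on that user's permissions.

```shell
#!/bin/sh
# check_lost_found DIR   (hypothetical helper, not shipped with the product)
# Reports whether DIR/lost+found exists and whether the CURRENT user
# can read and traverse it. Prints one of: absent, accessible, blocked.
check_lost_found() {
    dir="$1/lost+found"
    if [ ! -e "$dir" ]; then
        # No lost+found in the installation directory.
        echo "absent"
    elif [ -r "$dir" ] && [ -x "$dir" ]; then
        # Directory exists and this user can list and enter it.
        echo "accessible"
    else
        # Directory exists but this user lacks read or search permission.
        echo "blocked"
    fi
}
```

For example, calling check_lost_found /app/APG as user "apg" would be expected to print "blocked" in the situation this article describes, and "absent" once the directory has been removed.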
A fix for this problem is scheduled for EMC M&R/Watch4net 6.9.
For earlier versions, the workaround is to remove the lost+found directory and its contents from the SRM installation directory and then restart the health collector. Repeat this each time the health graphs and reports stop collecting data.
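The removal step of the workaround might look like the following sketch. The function name remove_lost_found is hypothetical, and the installation path is passed in by the caller (for instance /app/APG from the example in this article). It needs to run as root, since lost+found is accessible to root only. Restarting the health collector is not shown, because the restart command depends on the installation and is not given in this article.

```shell
#!/bin/sh
# remove_lost_found DIR   (hypothetical helper, run as root)
# Removes DIR/lost+found and its contents if the directory exists.
remove_lost_found() {
    # Refuse to run without an installation directory argument.
    if [ -z "$1" ]; then
        echo "usage: remove_lost_found INSTALL_DIR" >&2
        return 2
    fi
    target="$1/lost+found"
    if [ -d "$target" ]; then
        # Delete the directory and everything inside it.
        rm -rf -- "$target" && echo "removed $target"
    else
        echo "nothing to remove in $1"
    fi
}
```

After the function reports that the directory was removed, restart the health collector using the procedure normal for the installation so that it resumes collecting data.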