I guess the title could have been “Frequently Asked Questions”.  But as with everything in life, it’s all a matter of perspective.  Honestly, these aren’t frequently asked, but IMO they are facts about InsightIQ that are not well known or well understood.

 

The following facts about InsightIQ are based on version 3.2, code name “Lumen”.


Latest Version

 

The latest version of Isilon InsightIQ (IIQ) at the time of writing is version 3.2, code name “Lumen”.  The latest production revision is version 3.2.2, released in September 2015.

 

Release notes and user guides are on the InsightIQ - Isilon Info Hub (bookmark this): https://community.emc.com/docs/DOC-42096

 

Installation

 

InsightIQ can be installed on:

  1. A bare-metal machine or VM running CentOS or RHEL, using the “.sh” RPM-based deployment method
  2. A VMware ESXi host, using the “.ova” deployment method
  3. Any machine that supports running VMware Player, using the popular “.vmdk” VM method

 

During the installation of InsightIQ, the administrator must give InsightIQ the login credentials for the InsightIQ UNIX user on the cluster, along with an IP address.  This IP address can be the IP address of a specific node, or the SmartConnect Service IP (SSIP).  The official guidance is to use the SSIP address.  The Isilon cluster itself must have the InsightIQ license activated; otherwise, the attempt to connect to the cluster during InsightIQ installation will fail.

 

During the installation of InsightIQ, the administrator chooses where to locate the InsightIQ datastore.  The datastore is IIQ’s own repository of historical performance metrics from the monitored cluster(s).  It can live either on a local disk on the IIQ machine, or on an NFS server reached over an NFS mount (usually the Isilon cluster itself).  BOTH methods are supported configurations.

 

This IIQ datastore does not contain File System Analytics information.  It contains only historical performance metrics.  (The word “historical” is technically accurate. Even though the performance data stored is “historical”, it can be as fresh as 30 seconds old.)

 

Data Sources and Update Frequency

 

There is a statistics engine running on the Isilon cluster that aggregates various metrics once every 5 seconds.  This statistics engine is referred to as the “statsd” engine.  These metrics are stored on-cluster for approximately 1 hour. You can see a brief example of these metrics on the WebUI main page, where cluster CPU and throughput graphs are displayed (notice the 1-hour duration of the graphs).  After the metrics have been stored on-cluster for an hour, they start to expire and will be lost forever unless retrieved by InsightIQ.

 

InsightIQ polls performance metrics from the monitored cluster(s) every 15 seconds over the OneFS API.  The performance metrics come from the statsd engine and are saved in InsightIQ’s datastore.  Prior to v3.2.0 “Lumen”, the IIQ datastore was based on a sqlite3 database.  From version 3.2.0 onward, the IIQ datastore has been refactored to be based on PostgreSQL, making the InsightIQ application much more responsive and visibly faster.
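
To put the intervals above side by side, here is a small back-of-the-envelope sketch in Python. The 5-second, 1-hour, and 15-second figures come from the text; the function and variable names are purely illustrative and are not part of OneFS or InsightIQ.

```python
# Figures taken from the text above; all names here are illustrative only.
STATSD_INTERVAL_S = 5       # on-cluster aggregation interval
RETENTION_S = 60 * 60       # approximate on-cluster retention (about 1 hour)
IIQ_POLL_INTERVAL_S = 15    # InsightIQ polling interval


def samples_retained(interval_s=STATSD_INTERVAL_S, retention_s=RETENTION_S):
    """Approximate number of samples per metric held on-cluster at any time."""
    return retention_s // interval_s


def samples_per_poll(interval_s=STATSD_INTERVAL_S, poll_s=IIQ_POLL_INTERVAL_S):
    """Number of new on-cluster samples produced between two consecutive polls."""
    return poll_s // interval_s


def polls_before_expiry(poll_s=IIQ_POLL_INTERVAL_S, retention_s=RETENTION_S):
    """Number of polling opportunities before a sample ages out of the cluster."""
    return retention_s // poll_s


print(samples_retained())     # 720
print(samples_per_poll())     # 3
print(polls_before_expiry())  # 240
```

Read this way, the roughly 1-hour on-cluster retention is the safety margin: if an IIQ instance is unreachable for longer than that, the metrics from the start of the outage have presumably already expired on the cluster.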

 

For file system analytics reports, InsightIQ gets the File System Analytics (FSA) job results from the monitored cluster(s) using an NFS mount. The FSA results are in sqlite3 database format.  The completed FSA results reside on the Isilon cluster itself, and are never stored on the machine that hosts the InsightIQ instance.  When the end user tells InsightIQ to grab a particular FSA report, IIQ reaches over NFS to the Isilon cluster and grabs the corresponding FSA result set.

 

InsightIQ v4.0 “Aurora”, to be released in the first half of 2016 to support OneFS 8.0, will get both the performance metrics and the FSA job reports over the OneFS API.

 

Officially Supported Maximums

 

A single InsightIQ instance can monitor up to 8 clusters or 150 nodes, whichever limit is reached first.  In practice, your mileage may vary, as the volume of metrics that IIQ receives from a cluster depends on how busy the cluster is and how many metrics it generates.  Official verbiage on page 10 of the InsightIQ v3.2 Installation Guide (https://support.emc.com/docu59976_InsightIQ-3.2-Installation-Guide.pdf?language=en_US):

 

You can configure InsightIQ to monitor more than one Isilon cluster simultaneously. The maximum number of clusters that you can simultaneously monitor varies depending on the resources available to the virtual machine. It is recommended that you do not monitor a cluster that contains more than 80 nodes and that you monitor no more than eight clusters or 150 nodes at a time. If you want to monitor more clusters or nodes than this, we recommend that you deploy an additional instance of InsightIQ.
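
As a quick way to sanity-check a deployment plan against the recommendations quoted above, here is a minimal Python sketch. The three limits (8 clusters, 150 nodes in total, 80 nodes per cluster) come from the Installation Guide excerpt; the function name, the input format, and the cluster names are hypothetical.

```python
# Limits taken from the Installation Guide excerpt above.
MAX_CLUSTERS = 8
MAX_TOTAL_NODES = 150
MAX_NODES_PER_CLUSTER = 80


def check_plan(node_counts):
    """Return a list of warnings for one planned InsightIQ instance.

    node_counts maps a cluster name (hypothetical) to its node count.
    An empty list means the plan stays within the recommended limits.
    """
    warnings = []
    if len(node_counts) > MAX_CLUSTERS:
        warnings.append(
            f"{len(node_counts)} clusters exceeds the recommended {MAX_CLUSTERS}"
        )
    total = sum(node_counts.values())
    if total > MAX_TOTAL_NODES:
        warnings.append(
            f"{total} total nodes exceeds the recommended {MAX_TOTAL_NODES}"
        )
    for name, nodes in sorted(node_counts.items()):
        if nodes > MAX_NODES_PER_CLUSTER:
            warnings.append(
                f"cluster {name!r} has {nodes} nodes, above the recommended "
                f"{MAX_NODES_PER_CLUSTER}"
            )
    return warnings


# Example with made-up clusters: two problems are reported.
for line in check_plan({"prod": 90, "dr": 70}):
    print(line)
```

Any warning is a hint that, per the guide, an additional InsightIQ instance is the recommended route.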

 

A typical IIQ instance is configured with at least 8 GB of RAM.  One could go higher, to 16 GB, and err on the side of caution.  While there is no official guidance on the number of CPU/vCPU cores necessary to run IIQ, many customers have opted to deploy IIQ on 4-core or even 8-core machines in anticipation of heavier use of IIQ down the road for monitoring more clusters or more nodes.  Keep in mind that the IIQ instance makes API calls to every cluster it monitors every 15 seconds to poll performance metrics, then aggregates and stores the data. The more clusters and nodes it monitors, the more work it has to do.
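
To get a feel for how that work scales, here is a rough Python sketch that counts polling cycles per day. It assumes exactly one polling cycle per cluster every 15 seconds; how many individual API calls each cycle actually makes is not stated here, so treat the numbers as a lower bound on request volume. The names are illustrative only.

```python
POLL_INTERVAL_S = 15               # polling interval from the text
SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_day(clusters, poll_interval_s=POLL_INTERVAL_S):
    """Polling cycles per day, assuming one cycle per cluster per interval."""
    return clusters * (SECONDS_PER_DAY // poll_interval_s)


for clusters in (1, 4, 8):
    print(clusters, polls_per_day(clusters))
# 1 5760
# 4 23040
# 8 46080
```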

 

Officially Supported Datastore and Network Configuration

  • InsightIQ supports the placement of its datastore either locally on the machine that is running InsightIQ, or on the Isilon cluster over an NFS mount, or on any other NFS-enabled server.
  • If storing InsightIQ’s datastore on an NFS server (on the Isilon cluster or another NFS-enabled server), the NFS server hosting the datastore needs to be on the same LAN as the IIQ instance.
  • The InsightIQ instance and the Isilon cluster being monitored need to be on the same LAN (not geographically distributed).

 

While placing the InsightIQ instance outside of the immediate LAN where the cluster resides can functionally work, it is not a supported configuration.  The reason is latency.  Keep in mind that InsightIQ polls performance metrics from each monitored cluster every 15 seconds.  IIQ then needs to combine the cluster configuration with the performance metrics and store the aggregated metrics in its own datastore. If the IIQ instance, the monitored cluster, and the IIQ datastore are not all on the same LAN in the same geographical location, the increased round-trip network latency can slow IIQ down to the point where it becomes impractical to use as a monitoring solution.
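
The latency argument can be made concrete with a rough Python sketch. Everything except the 15-second polling interval is an assumption made for illustration: the number of network round trips per polling cycle (200 here) and the two round-trip times are invented figures, not measured InsightIQ behaviour.

```python
POLL_INTERVAL_S = 15.0  # polling interval from the text


def poll_cycle_time_s(rtt_ms, round_trips, processing_s=0.0):
    """Time spent on one polling cycle if its round trips run sequentially."""
    return round_trips * rtt_ms / 1000.0 + processing_s


def budget_used(rtt_ms, round_trips, processing_s=0.0,
                interval_s=POLL_INTERVAL_S):
    """Fraction of the polling interval consumed by one cycle (above 1.0 = falling behind)."""
    return poll_cycle_time_s(rtt_ms, round_trips, processing_s) / interval_s


HYPOTHETICAL_ROUND_TRIPS = 200  # invented figure, for illustration only

for label, rtt_ms in (("same LAN", 0.5), ("remote site", 80.0)):
    used = budget_used(rtt_ms, HYPOTHETICAL_ROUND_TRIPS)
    print(f"{label}: {used:.1%} of the 15 s interval")
```

Under these assumed numbers, the same-LAN case uses well under 1% of the interval, while the remote case needs more than the full 15 seconds per cycle, so polling would fall steadily behind.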

 

Related Resources