You might know that InsightIQ collects statistics about OneFS storage clusters from the OneFS API. But what is the API? And how does InsightIQ fetch historical performance data using this API to make beautiful charts? Can you use the API to retrieve the same data as InsightIQ and use the results in a different analytics application? This blog aims to answer these questions.

Architecture Overview

This diagram illustrates the flow of information between OneFS, the InsightIQ application, and your web browser:

iiq_papi.png


The OneFS API is a RESTful API that allows you to manage your storage cluster programmatically. The API provides methods to access statistical data on the storage cluster via a daemon called isi_stats_d. This is only a small part of what the API provides. In fact, the OneFS web administration interface relies heavily on the API to interact with your storage cluster. InsightIQ uses many other API calls to get information about a cluster, including quota reports, deduplication results, and cluster status.

In the background, the InsightIQ server makes regular REST calls to the OneFS API and stores the data it receives in a PostgreSQL database. The data is then aggregated so that you can view information about the whole cluster, or break it out by node, protocol, and so on. After 24 hours, the stored data is downsampled, that is, converted into larger time intervals, to save space and improve InsightIQ’s performance. Because of this process, InsightIQ can serve data to the application quickly, without a round trip to the storage cluster. It also lets InsightIQ retain more historical data than the cluster keeps in memory.
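To make the idea of downsampling concrete, here is a minimal Python sketch. It assumes that samples are (time, value) pairs and that each larger interval stores the average of the samples that fall inside it. The post does not say how InsightIQ actually combines samples, so both the averaging rule and the downsample function are illustrative assumptions, not InsightIQ's implementation.

```python
from collections import defaultdict


def downsample(samples, bucket_seconds):
    """Average (time, value) samples into buckets of bucket_seconds.

    Hypothetical helper: averaging is an assumption made for illustration,
    not a description of how InsightIQ downsamples its stored data.
    """
    buckets = defaultdict(list)
    for time, value in samples:
        # Align each sample to the start of the bucket it falls into.
        bucket_start = time - (time % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Five 30-second samples collapsed into 60-second buckets.
samples = [(0, 10), (30, 12), (60, 14), (90, 16), (120, 20)]
print(downsample(samples, 60))  # [(0, 11.0), (60, 15.0), (120, 20.0)]
```

Fewer, wider buckets mean fewer rows to store and scan, which is the space and speed trade-off described above.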


When you access the InsightIQ administration interface from your browser, the browser sends a request to the InsightIQ server for data. The server then requests the data from the PostgreSQL database. The data from the PostgreSQL database is returned through the server to the browser. Finally, the data is converted into the charts and tables you see in the InsightIQ administration interface.

An example request to the OneFS API


Let’s take a closer look at one of InsightIQ’s most frequently used API calls: /platform/1/statistics/history. This request fetches performance statistics.


On the OneFS side, the statistics for our performance charts all come from “statistics keys”. Each key represents a different metric. The statistics daemon uses these keys to manage the data. For example, let’s look at the node.clientstats.connected.nfs key. This key is one of many that are used to populate the Connected Clients chart in InsightIQ. Specifically, it collects the number of connected NFS clients per node.


This is an example of how to use the API call with the key:


https://<cluster_ip>:8080/platform/1/statistics/history?key=node.clientstats.connected.nfs&devid=all&degraded=true&interval=30&memory_only=true&begin=1444172910&end=1444173060
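If you want to assemble this request in your own application, the query string can be built programmatically. The Python sketch below uses only the standard library and reproduces the example URL. The build_history_url helper and its argument names are hypothetical, and the address 192.0.2.10 is a placeholder for your cluster IP.

```python
from urllib.parse import urlencode


def build_history_url(cluster_ip, key, begin, end, interval=30):
    """Assemble a statistics/history URL like the example above.

    Hypothetical helper: the parameter names and values come from the
    example request, but the function itself is only an illustration.
    """
    params = [
        ("key", key),
        ("devid", "all"),
        ("degraded", "true"),
        ("interval", interval),
        ("memory_only", "true"),
        ("begin", begin),
        ("end", end),
    ]
    query = urlencode(params)
    return f"https://{cluster_ip}:8080/platform/1/statistics/history?{query}"


# 192.0.2.10 is a documentation placeholder, not a real cluster address.
print(build_history_url("192.0.2.10", "node.clientstats.connected.nfs",
                        1444172910, 1444173060))
```

The sketch only builds the URL. Sending the request also requires authenticating to the cluster, which is outside the scope of this example.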


(Tip: You can see all of the OneFS API calls that your InsightIQ application makes, just like this one, by enabling debug logging for the logger_api_connection module. Then, these API calls are logged to /var/log/insightiq_clusters/<cluster_host>-<GUID>/api_connection.log. For more details on how to enable debug logging in InsightIQ for specific loggers, see KB 176679: How to enable debug logging for InsightIQ.)


Let’s break down the parameters used.


  • key: This parameter is the key mentioned above, node.clientstats.connected.nfs. In this example, there is only one key, but multiple “key” parameters can be used in one API call to query multiple keys at once.
  • devid: This parameter is the node device ID to query. It is specified either by a number or by “all”. In the example, data is fetched for all nodes at once.
  • degraded: If this parameter is enabled, data is returned even if some data is unavailable. This means errors might be present in the returned data. Any errors that occur are indicated in the response from the API, so they can be handled by the application.
  • interval: This parameter is the minimum sample interval in seconds. In this example, data is queried in 30-second intervals. The API returns data at an interval that is a whole-number multiple of the key’s collection interval. For example, because this statistic is collected every 5 seconds, a requested interval of 59 is rounded up to 60, the next whole-number multiple of 5. For data that doesn’t update frequently, InsightIQ requests longer sample intervals and fetches data less frequently.
  • memory_only: This parameter indicates whether or not to use the data saved in the cluster’s memory. If disabled, data is fetched from a database on the cluster. Enabling this option returns statistics more quickly, but there is a smaller time range of data available. Each key has different data retention settings, called a retention policy. InsightIQ relies solely on in-memory data in order to fetch statistics quickly.
  • begin: This is the beginning of the time interval to fetch from the cluster, in UNIX timestamp format. Since memory_only is enabled, the time range must be within the retention policy time for the statistic queried, or else no data is returned.
  • end: This parameter is the end of the time interval to fetch from the cluster, in UNIX timestamp format.
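The rounding described for the interval parameter can be written out as a small calculation. This Python sketch assumes the behavior is exactly as described above (round the requested interval up to the next whole-number multiple of the key's collection interval). The effective_interval function is a hypothetical illustration, not part of the OneFS API.

```python
def effective_interval(requested, collection_interval):
    """Round requested up to the next whole-number multiple of collection_interval.

    Hypothetical helper that mirrors the rounding described for the
    interval parameter; the API's internal logic is not shown in this post.
    """
    multiples = (requested + collection_interval - 1) // collection_interval
    return multiples * collection_interval


print(effective_interval(59, 5))  # 60
print(effective_interval(30, 5))  # 30
```

A request for 59 seconds against a 5-second collection interval yields 60, while the 30 used in the example request is already a multiple of 5 and is left unchanged.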


(Tip: You can see a more thorough description of the available options for this API call by visiting this URL on your cluster: https://<cluster_ip>:8080/platform/1/statistics/history?describe)

(Tip: You can see all available keys on your cluster and all of their retention policies and intervals by visiting this URL on your cluster: https://<cluster_ip>:8080/platform/1/statistics/keys)

An example response from the OneFS API


This is an example JSON snippet from the response:

{
    "stats" : [ ..., {
        "devid" : 26,
        "error" : null,
        "error_code" : null,
        "key" : "node.clientstats.connected.nfs",
        "values" : [{
            "time" : 1444172914,
            "value" : 15
        },{
            "time" : 1444172945,
            "value" : 15
        },{
            "time" : 1444172975,
            "value" : 16
        },{
            "time" : 1444173005,
            "value" : 16
        },{
            "time" : 1444173035,
            "value" : 17
        }]
    }, ... ]
}


This response snippet tells us that for the node with device ID 26, there were 15 connected NFS clients on October 6, 2015, at 4:08:34 PM local time, UTC-7 (1444172914 in UNIX time). Over the next couple of minutes, the number increased, and at 4:10:35 PM (1444173035 in UNIX time) there were 17 connected NFS clients. A total of five 30-second samples were returned. As the screenshot shows, these values appear in the InsightIQ UI when the right filters are applied to the corresponding chart.


screenshot.png
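To use a response like this in your own analytics application, you need to parse the JSON and convert the UNIX timestamps. The Python sketch below works on a trimmed copy of the snippet above, with the elided entries left out. The read_samples helper is hypothetical, and skipping entries whose "error" field is not null is an assumption about how you might handle errors, based only on the fields visible in the snippet.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the response snippet above (elided entries omitted).
response_text = """
{"stats": [{"devid": 26, "error": null, "error_code": null,
  "key": "node.clientstats.connected.nfs",
  "values": [{"time": 1444172914, "value": 15},
             {"time": 1444172945, "value": 15},
             {"time": 1444172975, "value": 16},
             {"time": 1444173005, "value": 16},
             {"time": 1444173035, "value": 17}]}]}
"""


def read_samples(text):
    """Return (devid, key, UTC datetime, value) tuples from a history response.

    Hypothetical helper: entries with a non-null "error" are skipped, which
    is an assumption about error handling, not documented API behavior.
    """
    rows = []
    for stat in json.loads(text)["stats"]:
        if stat["error"] is not None:
            continue
        for sample in stat["values"]:
            when = datetime.fromtimestamp(sample["time"], tz=timezone.utc)
            rows.append((stat["devid"], stat["key"], when, sample["value"]))
    return rows


for devid, key, when, value in read_samples(response_text):
    print(devid, key, when.isoformat(), value)
```

The first row prints the time as 2015-10-06T23:08:34+00:00, which is the same instant as the 4:08:34 PM reading described above, expressed in UTC.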


Hopefully you now have a better understanding of InsightIQ’s data collection process, as well as how to use the OneFS API to fetch statistics! We encourage you to use this information to make requests to the OneFS API in your own analytics applications.


Stay tuned for further blog posts about how InsightIQ works with statistics from OneFS.