Hi again! The VMAX for Splunk 2.0 add-on and app have been out for a little while now, so I decided it was time to follow on from the version 1.0 blog and put one together for 2.0.

 

A number of VMAX for Splunk users have been in contact since the release of the initial offering, and a number of improvements have been made based on their suggestions, so I hope version 2.0 will be as well received as our first outing!

 

As always, I will try to cover everything involved in getting VMAX for Splunk set up in your environment. If there is anything you would like more information on, anything that isn't covered, or you still need some help, get in contact in the comments or via vmax.splunk.support@emc.com.


Download Links for VMAX for Splunk 2.0

  • VMAX for Splunk Technology Add-on (TA): Splunkbase
  • VMAX for Splunk Technology Add-on User-guide: DECN
  • VMAX for Splunk App: Splunkbase
  • VMAX for Splunk App User-guide: DECN


About the VMAX for Splunk TA and App

The Splunk Technology Add-on (TA) for Dell EMC VMAX allows a Splunk Enterprise administrator to collect inventory, performance information, and summary information from VMAX storage arrays. You can then directly analyse the data or use it as a contextual data feed to correlate with other operational or security data in Splunk Enterprise.

 

The Splunk VMAX TA is configured to report events in 5 minute intervals, which is the finest granularity available for performance metrics reporting. Event metric values represent the value recorded at that point in time on the VMAX: values shown for an event in Splunk at 10:00am are the values recorded at 10:00am on the VMAX.


The Splunk App for Dell EMC VMAX allows a Splunk Enterprise administrator to take inventory, performance information, and summary information collected from VMAX storage arrays by the VMAX Technology Add-on (TA) and present it in pre-built dashboards, tables, and time charts for in-depth analysis, with drill-downs to individual events.


Improvements from Version 1.0

There have been a number of improvements. The new TA is configured to work with Unisphere 8.4, so it features new endpoints and metrics not previously available in version 1.0. In addition, the number one ask from our customers, the ability to define an instance of Unisphere per VMAX input, has been included, removing the restriction which required an instance of the TA per Unisphere instance. This will greatly benefit customer environments where embedded Unisphere is in operation, or where multiple instances of Unisphere are configured to manage arrays. SSL has also been included, giving customers the option to encrypt all of the data travelling between Unisphere and their Splunk instances.

 

The number of metrics ingested from Unisphere has greatly increased. Users can now collect metrics for the following levels:

  • Array
  • Storage Resource Pool
  • Storage Group
  • Director (FE/BE/RDF/IM/EDS)
  • Ports
  • Port Groups
  • Hosts
  • Initiators
  • Workload Planner (Compliance/Headroom)
  • VMAX System Alerts

 

In addition, the TA gives customers the ability to select what metrics they would like to ingest into Splunk instead of collecting all metrics. For example, if a customer wants to only collect Array level metrics, they can turn off all other metric reporting levels.

 

The front-end app has had a complete overhaul also. Each reporting level mentioned above has its own dedicated dashboard(s), each driven by the KPIs defined in Unisphere, so that all information is collated in one place with control over filtering right down to individual resources, or reporting on everything at once. To help drive these bulkier searches, the app has post-search processing configured to reduce its resource usage footprint. Instead of having an individual search per dashboard panel, one search can drive up to 12 panels at once, drastically reducing report generation time and improving performance. Drilldowns have been greatly improved: users can report against all arrays/resources, or drill down by array and dynamically load resources for further drilldowns, giving them complete control over the level at which they view their VMAX environment.

 

Data Collection and Source Types

The VMAX TA provides the index-time and search-time knowledge for inventory, performance metrics, and summary information. By default, all VMAX data is indexed into the default Splunk index, which is the ‘main’ index unless changed by the admin.


The add-on collects many different kinds of events for VMAX including performance, inventory, and summary metrics. Depending on the activity of the Hosts, Port Groups & Initiators in your environment, there may be events where no performance metrics are collected. This can be confirmed if there is a metric present in the event named ‘perf_data’ with a value of ‘false’. To limit the amount of data collected and stored on a VMAX, only active Hosts, Port Groups & Initiators are reported against, so it is intended behaviour to have no performance metrics for those which have been inactive for some time. More on this in the section 'Active vs. Inactive Performance Metrics Gathering'.


The source type used for the Splunk Add-on for VMAX is 'dellemc:vmax:rest'. All events are in key=value pair format. Every event has an assigned 'reporting_level' which indicates the level the event describes, along with the associated VMAX array ID and, if reporting at lower levels, the object ID (e.g. Storage Group, Director, Host).
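As a rough illustration of the key=value event format described above, the Python sketch below splits an event string into a dictionary. The sample event is made up for illustration: only 'reporting_level' is a field name taken from the text, and the real TA output may quote, name, or order fields differently.

```python
import shlex


def parse_vmax_event(raw_event):
    """Split a key=value event string into a dict of field -> value."""
    fields = {}
    # shlex.split keeps quoted values such as "Storage Group" together
    for token in shlex.split(raw_event):
        if "=" not in token:
            continue  # ignore anything that is not a key=value pair
        key, value = token.split("=", 1)
        fields[key] = value
    return fields


# Hypothetical event; 'array_id' and 'sg_name' are assumed field names
sample = 'reporting_level="Storage Group" array_id=000123456789 sg_name=finance_sg'
event = parse_vmax_event(sample)
print(event["reporting_level"])
```

In practice Splunk extracts these fields for you at search time, so a parser like this is only useful when inspecting raw events outside of Splunk.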

 

Hardware and Software Requirements

To install and configure the VMAX TA & App, you must have Splunk admin privileges. Because this add-on runs on Splunk Enterprise, all of the Splunk Enterprise system requirements apply.

 

There are no specific hardware or software requirements for the VMAX TA; it will point towards your existing environment and Unisphere to gather metrics. The VMAX app no longer requires additional packages to be installed in Splunk, so it is ready to go once installed in the environment and configured for use.


VMAX Support Matrix

 

VMAX | Supported
VMAX-1 Series (VMAX, VMAXe, VMAX-SE) | No
VMAX-2 Series (VMAX 10K, VMAX 20K, VMAX 40K) | No
VMAX-3 Series (VMAX 100K, 200K, 400K, 250F(X), 450F(X), 850F(X), 950F(X)) | Yes

 

Pulling Data from VMAX-2 Series Arrays (not using the VMAX TA)

Occasionally I get a query through to the support box asking about support for VMAX-2 series arrays. Whilst this isn't supported by the VMAX Add-on for Splunk, there is another option. Cody Hosterman has put together a great blog article on setting up Solutions Enabler as a syslog source and forwarding the data to Splunk via UDP.

 

One of my colleagues verified this method works recently, so I am presuming it still does, however, if this changes in the future please let me know so I can stop telling people about it!

 

Single Instance/Distributed Environment Installations

In a distributed deployment, install the Splunk VMAX TA on your search heads and heavy forwarders. This TA does not support universal forwarders because it requires Python. It does not need to be installed on indexers: because universal and light forwarders are not supported, parsing occurs on the heavy forwarder rather than on the indexers. The app only needs to be installed on the search heads, and requires no additional configuration.

 

For detailed single/distributed installation instructions, refer to Splunk's "Installing add-ons", which describes how to install an add-on in the following deployment scenarios:

  • Single-instance Splunk Enterprise
  • Distributed Splunk Enterprise
  • Splunk Cloud
  • Splunk Light

 

VMAX TA Installation Considerations

The add-on does not require the ability to modify VMAX configuration. It is highly recommended that you create a read-only user account for it to provide greater security over access to your storage network.

 

The VMAX TA works through RESTful communication between Splunk and Unisphere, so it is necessary to have Unisphere set up and running in your environment with your arrays added. I won't go into details about REST here, but if you would like to know more about it, my colleague Paul Martin has put together a great series of blog articles on REST & VMAX to get you started. The first article in that series is 'Getting Started with the REST API'.

 

Performance of data collection is dependent on many factors, such as VMAX system load, Splunk Enterprise system load, and environmental factors such as network latency. I have written a VMAX for Splunk sizer script which is designed to mimic the function of Splunk and provide recommended reporting intervals for the arrays in your environment. More on the sizer script later in this blog.

 

Enabling VMAX Performance Metrics Gathering

Before any metrics can be collected from a VMAX you must also ensure that the VMAX is registered to collect performance metrics. This is enabled from within the Unisphere for VMAX Web UI.

 

To register your VMAX(s) follow these steps:

 

1. Log in to Unisphere and from the main home screen identify the VMAX you want to add to Splunk

 

2. In the VMAX’s summary panel, under ‘System Utilization’ click ‘Register this system to collect performance metrics’

 

Pt2 - Click Register.png


3. A new page for ‘System Registrations’ will open where you will see your VMAX listed. Click the VMAX to highlight it and click ‘Register’

 

Pt3 - Settings register.png


4. When the ‘Registration’ dialogue window opens, select the check-box for ‘Root Cause Analysis’ and click OK.


Pt4 - RCA Apply.png


5. If the registration process is successful, you will see a green dot to signify root-cause analysis is enabled.


Pt5 - RCA enabled.png


6. With the registration process complete, leave Unisphere for 8-24 hours to start gathering performance metrics before adding the VMAX to Splunk. Performance metrics collection is not immediate; for more information, refer to the ‘Performance Management – Metrics’ section of the ‘Unisphere 8.4 Online Help’ guide available on support.emc.com.

 

Active vs. Inactive Performance Metrics Gathering

To limit the amount of data collected and stored on a VMAX, only active Port Groups, Hosts, and Initiators are reported against for performance metrics. Inactivity is determined by no activity being recorded by performance monitors for a specified amount of time. The VMAX TA ingests a wide range of metrics across each of the reporting levels. To get detailed definitions of each of the performance metrics see the ‘Performance Management – Metrics’ section of the ‘Unisphere 8.4 Online Help’ guide available on support.emc.com

 

This is not enforced by Splunk but is the behaviour of the VMAX: recording zero values for every Port Group, Host, and Initiator in an environment would very quickly fill databases with useless data.

 

When the VMAX TA is collecting information on the Port Groups, Hosts, or Initiators in your environment, it will first obtain a list of all objects for each reporting level. Using this list, calls will be made to Unisphere for performance metrics for each object. If an object is inactive, no performance metrics will be returned. This inactivity is reflected in the VMAX events through the key/value pairs below.

 

{reporting_level}_perf_details: false

{reporting_level}_perf_message: No active {reporting_level} performance data available
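To show how the inactivity marker above might be used when post-processing events, here is a minimal Python sketch that separates active from inactive objects. It assumes the events have already been parsed into dictionaries and that '{reporting_level}' expands to a lowercase prefix such as 'host'; the 'host_id' and 'host_ios' keys are hypothetical.

```python
def has_performance_data(event, reporting_level):
    """Return False when the event carries the inactive-object marker."""
    flag = event.get("{}_perf_details".format(reporting_level))
    # The marker is only expected (as 'false') for inactive objects;
    # a missing key is treated here as "performance data present".
    return str(flag).lower() != "false"


# Hypothetical parsed events; the 'host' prefix and other keys are assumptions
events = [
    {"host_id": "esx01", "host_perf_details": "false",
     "host_perf_message": "No active host performance data available"},
    {"host_id": "esx02", "host_ios": "1520.4"},
]
active = [e["host_id"] for e in events if has_performance_data(e, "host")]
print(active)
```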

 

SSL Configuration

One of the biggest asks from version 1.0 of the add-on was to have end-to-end SSL availability in the VMAX/Splunk environment. Thankfully, the root cause of the issue surrounding hostname resolution was identified and SSL is available as a configuration option for each VMAX input.

 

SSL is enabled by default in the VMAX TA when adding inputs. To retrieve the required certificate from Unisphere, follow these steps:


1. Get the CA certificate of the Unisphere server. This pulls the CA cert file and saves it as a .pem file:


# openssl s_client -showcerts -connect {unisphere_host}:8443 </dev/null 2> \
  /dev/null|openssl x509 -outform PEM >{unisphere_host}.pem

Where {unisphere_host} is the IP address or hostname of the Unisphere instance.


2. OPTIONAL: If you want to add the cert to the system certificate bundle so no certificate path is specified in the VMAX data input, copy the .pem file to the system certificate directory as a .crt file:


# cp {unisphere_host}.pem /usr/share/ca-certificates/{unisphere_host}.crt

3. OPTIONAL: Update CA certificate database with the following commands:


# dpkg-reconfigure ca-certificates
# update-ca-certificates

Check that the new {unisphere_host}.crt will activate by selecting ask on the dialog. If it is not enabled for activation, use the down and up keys to select, and the space key to enable or disable.


4. If steps 2 & 3 are skipped and instead the cert from step 1 will just remain in a local directory, you can specify the location of the .pem cert in the VMAX data input setting 'SSL Cert Location'. Otherwise, leave ‘SSL Cert Location’ blank and ‘Enable SSL’ enabled to use the cert from the system certificate bundle.
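The certificate steps above can also be sketched in Python. This is not how the TA itself is implemented (its internals are not shown here); it is a minimal illustration that fetches the certificate the server presents using the standard library, and maps the ‘Enable SSL’ and ‘SSL Cert Location’ settings onto the three values the Python Requests library accepts for its 'verify' option (False, True, or a path). The function names are made up for this example.

```python
import os
import ssl


def fetch_unisphere_cert(host, port=8443):
    """Fetch the certificate presented by the server as PEM text (needs network)."""
    return ssl.get_server_certificate((host, port))


def save_pem(pem_text, host, directory="."):
    """Write the PEM text to {host}.pem and return the file path."""
    path = os.path.join(directory, "{}.pem".format(host))
    with open(path, "w") as handle:
        handle.write(pem_text)
    return path


def resolve_verify(enable_ssl, cert_location=""):
    """Map the two input settings onto a Requests-style 'verify' value."""
    if not enable_ssl:
        return False            # SSL verification disabled entirely
    if cert_location.strip():
        return cert_location    # verify against the given .pem file
    return True                 # fall back to the system certificate bundle


# No network needed for this part; the file name is a placeholder
print(resolve_verify(True), resolve_verify(True, "unisphere01.pem"), resolve_verify(False))
```

To fetch and store a certificate you would call save_pem(fetch_unisphere_cert(host), host) against a reachable Unisphere instance.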


VMAX for Splunk Sizer

An additional script has been included with the VMAX TA to help determine the optimum reporting interval required for your VMAX data inputs. This sizer is meant to be used with one instance of Unisphere at a time; it is not concerned with performance across multiple instances of Unisphere, as this would fall under the remit of Splunk performance. The sizer will help you set the VMAX TA input intervals so that each input has enough time to complete before the reporting interval is exceeded and metric collection intervals are missed.

 

Metrics collection run times depend entirely on the environment and the VMAX itself, including how heavily utilised and loaded with resources it is, so there is no one-size-fits-all option. This script will simulate Splunk and gather summary and performance metrics from an instance of Unisphere and VMAX(s) of your choosing. These collection runs will also run concurrently, as they do in Splunk. When complete, information will be output as to how long metric collection lasted for a given VMAX, and the recommended reporting interval time.
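To give a feel for the kind of recommendation involved, the Python sketch below rounds a measured collection run time up to the next multiple of 300 seconds, since intervals are set in increments of 300s. This is an assumption about a sensible rule, not the sizer script's actual logic, which may add extra headroom.

```python
import math

UNISPHERE_INTERVAL = 300  # seconds; performance metrics are reported every 5 minutes


def recommended_interval(run_seconds):
    """Round a collection run time up to the next multiple of 300 seconds."""
    multiples = max(1, math.ceil(run_seconds / float(UNISPHERE_INTERVAL)))
    return int(multiples) * UNISPHERE_INTERVAL


# Hypothetical run times in seconds for three arrays
for run_time in (95, 300, 412):
    print(run_time, "->", recommended_interval(run_time))
```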

 

To run the VMAX for Splunk sizer script, you will require Python 2.7 and the Python Requests library.

 

To run the sizer script, follow the steps below:

 

1. Navigate to the VMAX TA folder containing the sizer script and configuration file:

 

# cd {splunk_dir}/etc/apps/TA-DellEMC-VMAX/bin/sizer

2. Open the vmax_splunk_sizer_config.ini configuration file for editing.

 

3. Under [ENVIRONMENT_SETTINGS] set

  • The Unisphere IP address or hostname
  • The Unisphere port (default is 8443)
  • The Unisphere username & password
  • The required SSL setup:
    • If you require no SSL, set this to False
    • If you have an SSL cert loaded into the system bundle, set this to True
    • If you have an SSL cert but want to specify the path, set this to the path to the cert
  • Your VMAX numerical IDs, for more than one VMAX separate with a comma


4. Under [REPORTING_LEVELS], set a reporting level to True to turn it on or to False to turn it off

 

5. Debug mode is not necessary unless diagnosing an issue with VMAX for Splunk support, but if you would like to see all calls output to screen, change this to True


6. With all the environment settings configured, run the VMAX for Splunk environment sizer script using the Python file 'rest_vmax_splunk_sizer.py'


$ python rest_vmax_splunk_sizer.py


7. Once the script has run to completion, details of the metrics collection run will be output to the screen along with recommendations on the optimum reporting interval for each VMAX.


Sizer.png
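For readers who have not worked with .ini files before, the Python 3 sketch below shows how a configuration laid out like the one described in the sizer steps could be read. The two section names come from the steps above, but every key name and value in the sample is an assumption for illustration; check the shipped vmax_splunk_sizer_config.ini for the real keys.

```python
import configparser

# Hypothetical file contents: section names are from the steps above,
# key names and values are assumptions for illustration only.
SAMPLE_INI = """
[ENVIRONMENT_SETTINGS]
unisphere_host = 10.0.0.10
unisphere_port = 8443
ssl = False
vmax_ids = 000123456789, 000123456790

[REPORTING_LEVELS]
array = True
host = False
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)

env = config["ENVIRONMENT_SETTINGS"]
# More than one VMAX ID is separated with a comma
array_ids = [item.strip() for item in env["vmax_ids"].split(",")]
# Keep only the reporting levels that are switched on
levels = config["REPORTING_LEVELS"]
enabled_levels = [name for name in levels if levels.getboolean(name)]
print(array_ids, enabled_levels)
```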

Installation and configuration overview for the Splunk Add-on for VMAX

Once you have Splunk set up and running in your environment, there is very little required to get the VMAX TA set up and collecting information. There are no additional requirements or dependencies, so once you have the VMAX TA downloaded from the Splunkbase website or through the app store from within Splunk you are good to go with set up!

 

I am going to go through the process of setting up the TA first, adding VMAX arrays as data inputs afterwards, then finally setting up the VMAX app to start viewing meaningful analysis of your environment through Splunk. The installation of both the TA and app follows the same procedure as any other Splunk add-on or app, so I won't bother including screenshots of the process.

 

1. From your Splunk home screen, click the cog icon beside ‘Apps’ to navigate to the ‘Manage Apps’ section.

 

2. Within the ‘Manage Apps’ section, click the button ‘Install App from file’.

 

3. Click ‘Choose File’, select the VMAX Add-on for Splunk, and click ‘Upload’.

 

4. Once the upload is complete you will be prompted to restart Splunk to complete the installation; click ‘Restart now’. When Splunk restarts, navigate back to the home screen and you will now see a dashboard panel for the VMAX TA. Click on the panel to start adding your VMAX(s) to Splunk.

 

5. Once opened, you can add VMAX(s) to Splunk by clicking on the ‘Create New Input’ button in the top right of the UI.

 

6. To add a VMAX to Splunk, you must enter a number of details into Splunk about the instance of Unisphere used, VMAX details, SSL details, and reporting metrics configuration. The table below lists each option, its default value if there is one, and a description of the option. Once all options are set, click ‘Add’ to add the VMAX as a data input to Splunk.

 

Input.PNG.png

 

Input | Default | Description
Name | None | The name of the input as it will appear in Splunk.
Interval | 300 | The metrics collection interval. This should be set in increments of 300s as this is the reporting interval of performance metrics in Unisphere. For more information on determining the ideal setting for the reporting interval for your environment, see the ‘VMAX for Splunk Sizer’ section above.
Index | Default | The index to which data from Unisphere for this VMAX will be written.
Unisphere IP Address | None | Unisphere IP address or hostname.
Unisphere Port | 8443 | Unisphere port.
Unisphere Username | None | Unisphere username.
Unisphere Password | None | Unisphere password.
VMAX Numerical ID | None | The 12-digit numerical VMAX ID.
Enable SSL | True | Enable if you require end-to-end SSL communication between Splunk and Unisphere. Uncheck to disable SSL entirely. See the ‘SSL Configuration’ section above for more information on SSL set-up.
SSL Cert Location | None | If ‘Enable SSL’ is enabled, this option has two behaviours: (1) if left blank, Splunk will search the system certs bundle for a valid Unisphere cert; (2) if a path is provided, this is the path Splunk will use to access the Unisphere cert independently of the system certs bundle.
REST Request Timeout | 60 | The amount of time Splunk will wait for a response from Unisphere for any given call before timing out and logging an error. If changing from the default, consider Unisphere load; setting it too low may have a negative impact on metrics collection.
Array | True | Collect array level metrics.
Alerts | True | Collect VMAX system alerts.
Collect VMAX only metrics | False | If enabled, Splunk will collect only those alerts which directly specify the Array ID in the alert description (see the known issues section for the impact of enabling this option). If disabled, Splunk will gather all system alerts from the instance of Unisphere it is collecting VMAX metrics from, even for those VMAXs which are not added as an input to Splunk.
Storage Resource Pool | True | Collect Storage Resource Pool metrics.
Storage Group | True | Collect Storage Group metrics.
Director | True | Collect Director metrics.
Port | True | Collect Port metrics.
Port Group | True | Collect Port Group metrics.
Host | True | Collect Host metrics.
Initiator | True | Collect Initiator metrics.
Workload Planner | True | Collect Workload Compliance & Headroom metrics.

7. To add another VMAX to the TA, repeat steps 5-6 as many times as necessary.

 

8. When all VMAX(s) have been added to the TA, you will see them listed within the TA. From here you can enable, disable, or edit the options for a given VMAX after it has been configured.

 

InputsMain.png


9. Once a VMAX has been added to the VMAX TA, it starts gathering information immediately. To access that data, use Splunk Search to start looking at VMAX related events using the SPL query: sourcetype="dellemc:vmax:rest"


SPLSearch.png
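If you find yourself typing variations of that search often, a small helper can assemble the query string for you. The Python sketch below is just string building: the source type and the 'reporting_level' field come from this post, while the function name and the 'vmax_index' index name are placeholders.

```python
def build_vmax_search(reporting_level=None, index=None):
    """Build an SPL search string for events written by the VMAX TA."""
    parts = []
    if index:
        parts.append("index={}".format(index))
    parts.append('sourcetype="dellemc:vmax:rest"')
    if reporting_level:
        parts.append('reporting_level="{}"'.format(reporting_level))
    return " ".join(parts)


print(build_vmax_search())
print(build_vmax_search("Array", index="vmax_index"))
```

The resulting string can be pasted straight into the Splunk search bar.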


Troubleshooting the VMAX TA

The VMAX TA has been developed to give the end-user as much detail as possible about the activity of the add-on in their environment. All add-on logged events will be marked as either info, error, or critical depending on the nature of the event. If you are having any issues with the add-on, the logs will be able to give you precise information as to the cause of the problem. These issues could be related to, but are not limited to:

  • Incorrect Unisphere configuration or username/password combination
  • Incorrect SSL setup
  • Incorrect Array ID
  • VMAX is not performance registered
  • Performance metrics timestamp is not up-to-date


The two log files that you can use to diagnose problems with this add-on are:

  • /{splunk_install_dir}/splunk/var/log/splunk/ta_dellemc_vmax_inputs.log
  • /{splunk_install_dir}/splunk/var/log/splunk/splunkd.log


Before the add-on successfully runs for the first time, error logs go to splunkd.log. After the add-on successfully runs, error logs go to ta_dellemc_vmax_inputs.log.
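When a log file gets long, it can help to pull out only the error and critical entries. The Python sketch below does this for a list of lines. The exact layout of the TA log is not documented here, so the sketch only assumes that the level word (info, error, or critical) appears somewhere in each line; the sample lines are invented.

```python
import re

LEVEL_PATTERN = re.compile(r"\b(INFO|ERROR|CRITICAL)\b", re.IGNORECASE)


def problem_lines(lines):
    """Return (level, line) tuples for lines logged as error or critical."""
    found = []
    for line in lines:
        match = LEVEL_PATTERN.search(line)  # first level word on the line
        if match and match.group(1).upper() in ("ERROR", "CRITICAL"):
            found.append((match.group(1).upper(), line.rstrip()))
    return found


# Hypothetical log lines; the real log layout may differ
sample_log = [
    "2018-01-10 10:00:01 INFO collection run started",
    "2018-01-10 10:00:04 ERROR could not reach Unisphere",
    "2018-01-10 10:05:02 CRITICAL array is not performance registered",
]
for level, text in problem_lines(sample_log):
    print(level, text)
```

To run it against a real file, pass an open file object, for example problem_lines(open(path)), where path points at ta_dellemc_vmax_inputs.log.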


Installation and Configuration Overview for the VMAX App for Splunk

As there are no dependencies required for the installation of the VMAX App for Splunk, the set-up is completed from the Splunk Web UI and, if required, associated VMAX App macro configuration file.


The VMAX App uses Splunk macros to shorten lengthy and frequent search queries. By default, these queries use the Splunk default index as configured by the user. Typically this is the ‘main’ index, but it can be changed. If you store your VMAX event data in a different index from the default index, or have distributed your VMAX data across multiple indexes, you will need to configure these macros to use the correct indexes in order for the app dashboards to work as intended.


1. From your Splunk home screen, click the cog icon beside ‘Apps’ to navigate to the ‘Manage Apps’ section.

 

2. Within the ‘Manage Apps’ section, click the button ‘Install App from file’.


3. Click ‘Choose File’, select the VMAX App for Splunk, and click ‘Upload’.


4. (OPTIONAL) If you have configured the VMAX TA for Splunk to index event data in an index other than the default index you will need to reconfigure the VMAX App macro configuration. Navigate to the installation directory of the VMAX App for Splunk which contains all default configuration files:

 

# cd {splunk_dir}/etc/apps/App-DellEMC-VMAX/default

Copy macros.conf to the local directory in the App installation directory:


# mkdir -p {splunk_dir}/etc/apps/App-DellEMC-VMAX/local
# cp macros.conf {splunk_dir}/etc/apps/App-DellEMC-VMAX/local/

Edit the newly copied macros.conf so that each ‘index=’ key/value pair represents the indexes in use in your environment. Each reporting level ingested by the VMAX TA corresponds to a macro in macros.conf, so you will be able to set different indexes for array level, host level, alert level metrics for example.


Example:

[vmax_array]
definition = index=main sourcetype=dellemc:vmax:rest reporting_level="Array"

Becomes:

[vmax_array]
definition = index=vmax_index sourcetype=dellemc:vmax:rest reporting_level="Array"

Once all the macros have been updated to reflect the indexes in use, save the file and return to Splunk UI.


5. Once the VMAX App is added to Splunk you will not be prompted to restart Splunk to complete the installation, but it is advisable to restart Splunk before using the App, and also to apply any optional changes made in step 4. Navigate to ‘Settings > Server Controls’, and click ‘Restart now’. When Splunk restarts, navigate back to the home screen and you will now see a dashboard panel for the VMAX App.
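If all of your VMAX data lives in one non-default index, the macro edit in step 4 can be scripted rather than done by hand. The Python sketch below rewrites the index= value on every 'definition' line of some macros.conf text. It is a simple text substitution under that single-index assumption; if different reporting levels use different indexes, edit the relevant macros individually instead.

```python
import re


def set_macro_index(conf_text, new_index):
    """Replace the index= value on every 'definition' line of macros.conf text."""
    updated = []
    for line in conf_text.splitlines():
        if line.strip().startswith("definition"):
            line = re.sub(r"\bindex=\S+", "index={}".format(new_index), line)
        updated.append(line)
    return "\n".join(updated)


original = (
    "[vmax_array]\n"
    'definition = index=main sourcetype=dellemc:vmax:rest reporting_level="Array"'
)
print(set_macro_index(original, "vmax_index"))
```

Run it against the copy of macros.conf in the App's local directory, never the one in default.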


Overview-All.PNG.png

Host-Select-1.PNG.png

VMAX_Alerts-All.PNG.png


VMAX App Usage & Navigation

Navigating throughout the VMAX App is done entirely through the menu featured at the top of the screen when you open the VMAX App.


AppMenu1.png

 

By default, the drop-down boxes in each dashboard will feature all the objects available at that reporting level through the ‘ALL’ option. You can drill down further by array ID and more depending on the dashboard you are within.


2AppMenu1.png

 

The default time range for each dashboard time chart is the last 24 hours. This can be changed using the time settings under ‘Time Frame’, which works the same as all other Splunk time range pickers.

 

AppMenu3.png

 

It is important to note that not all panels within each dashboard are set to represent data from the last 24 hours; most are set to present the most recent information to the user. As a rule of thumb, only time charts in the VMAX App dashboards reflect data from the time range specified in the time range picker. All other panels, such as tables, single number panels, and pie charts, show values from the most recent VMAX event data. If there is an issue with data collection and the data is more than 10 minutes old, some of the panels may display a ‘No Results Found’ message. If you see this, check your VMAX TA logs to determine if data is still being collected by the VMAX TA.
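The 10 minute rule above boils down to a simple timestamp comparison. The Python sketch below shows the check in isolation; how the App's own searches implement it is not shown in this post, and the timestamps are made up.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(minutes=10)  # threshold taken from the text above


def is_stale(latest_event_time, now):
    """True when the newest VMAX event is more than 10 minutes old."""
    return (now - latest_event_time) > MAX_AGE


# Hypothetical timestamps
now = datetime(2018, 1, 10, 10, 30)
print(is_stale(datetime(2018, 1, 10, 10, 25), now))
print(is_stale(datetime(2018, 1, 10, 10, 5), now))
```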

 

Troubleshooting the VMAX App for Splunk

The VMAX App for Splunk has been designed in such a way that there is little or no user interaction required in order to get it running in the environment. The only use-case for when manual configuration is required is when indexes are used by the VMAX TA which differ from the default Splunk index (see installation & configuration section above for more info).

 

If you are having issues with the macros in your VMAX App, check the indexes configured for use by each of the VMAX TA data inputs. The indexes specified by each input should match those used by the VMAX App in the macros.conf configuration file. If these do match and everything appears to be correct, restart Splunk to ensure that the changes have been applied and settings are read from the VMAX App’s local directory.

 

As the VMAX App only takes the data ingested by the VMAX TA and presents it in dashboards populated with various panels, there are no other VMAX App settings or areas which could cause issues to appear within the App. If you are facing any issues in the VMAX App and the macros are configured correctly, the next place to troubleshoot is the VMAX TA itself.

 

Known Issues

VMAX Alerts

When requesting VMAX alert information from the Unisphere REST API through the /system/alert endpoints, there are no key/value pairs for the array and object ID. When the event is processed in the VMAX TA, the alert description is parsed and, if an array or object ID is present, it is added before the data is indexed in Splunk. At present, most information can be parsed from the description, but this does not always work.

 

An example of not being able to parse the array/object info from an alert is with array metadata usage. Using the REST API there are no IDs associated with the alert, so unless the user reverts to Unisphere manually there is no way they can know which array the alert belongs to.

 

The impact of this in the VMAX TA is that when ‘Collect VMAX only metrics’ is enabled, certain alerts which do not have the array/object info in the alert description will not be ingested into Splunk. This is because this option uses the array ID key/value pair to determine if it is related to the VMAX ID specified in the data input.

 

It must be noted, however, that the data which the alert describes can be viewed within the various dashboards of the VMAX App for Splunk. For example, the array metadata usage percentage is featured as a time chart in the VMAX dashboard. This is because, for each of the reporting levels, every possible piece of information pertaining to that reporting level’s objects is retrieved from Unisphere.

If ‘Collect VMAX only metrics’ is left disabled, all system alerts from the instance of Unisphere specified in the VMAX data input will be ingested into Splunk. This also means that any system alerts for other arrays which are not added as data inputs, but present in that instance of Unisphere associated with the data input are ingested into Splunk.
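The filtering behaviour described in this section can be pictured with a short Python sketch. It assumes the array ID is found by looking for a 12-digit number in the alert description, which is a guess at the approach rather than the TA's actual parsing code, and the alert descriptions are invented examples.

```python
import re

ARRAY_ID_PATTERN = re.compile(r"\b\d{12}\b")  # 12-digit numerical VMAX ID


def extract_array_id(description):
    """Return the first 12-digit ID in an alert description, or None."""
    match = ARRAY_ID_PATTERN.search(description)
    return match.group(0) if match else None


def keep_alert(description, input_array_id, vmax_only):
    """Decide whether an alert would be kept for a given data input."""
    if not vmax_only:
        return True  # option disabled: every alert from Unisphere is kept
    return extract_array_id(description) == input_array_id


# Hypothetical alert descriptions
alerts = [
    "Array 000123456789 storage group finance_sg exceeded threshold",
    "Array metadata usage is above 80%",
]
print([keep_alert(a, "000123456789", vmax_only=True) for a in alerts])
```

The second alert has no ID in its description, so with the option enabled it is dropped, which mirrors the array metadata usage case described above.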

 

REST Response Code 401

Occasionally when the Unisphere REST API is under heavy load of REST requests it may return a 401 response to a request from the VMAX TA. This is temporary and will clear itself, usually immediately.
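If you are writing your own scripts against the Unisphere REST API and hit the same temporary 401, a short retry loop is one way to ride it out. The Python sketch below is a generic pattern, not something this post says the TA does: 'send_request' stands in for whatever function makes your REST call and returns the HTTP status code.

```python
import time


def call_with_retry(send_request, retries=3, delay_seconds=1.0):
    """Call send_request() until it returns a status other than 401."""
    status = send_request()
    attempts = 1
    while status == 401 and attempts <= retries:
        time.sleep(delay_seconds)  # brief pause before trying again
        status = send_request()
        attempts += 1
    return status


# Stand-in for a REST call: returns 401 twice, then 200
responses = iter([401, 401, 200])
print(call_with_retry(lambda: next(responses), delay_seconds=0))
```

A 401 that persists after several retries is more likely a genuine username/password problem than load on Unisphere.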

 

Contacting Support

For any and all issues or queries, please contact vmax.splunk.support@emc.com. Include as much information as possible about the issue, the Operating System and Splunk versions, and associated VMAX TA logs.