
NetWorker, Avamar, SHA-1 Certificates, and You

 

As you may be aware, the major browser vendors are gradually (or not so gradually) sunsetting support for SSL certificates signed using the SHA-1 hashing algorithm. This is coming very, very soon. As you may also be aware, Dell EMC released a technical advisory because certain NetWorker and Avamar components use certificates signed using SHA-1:

 

ETA 493820: Avamar, NetWorker: Browser support for SHA-1 Certificates expiring January 1, 2017 may cause incompatibility with Avamar and NetWorker Virtual Edition Browser UI functionality

https://support.emc.com/kb/493820

 

If you use Avamar or NetWorker, you should review this ETA, since the issue may impact operations in your environment.

 

Note: While I do work for Dell EMC, this post is not an official Dell EMC document. The ETA document is the official Dell EMC response to this issue. Any information provided here is provided as-is by me personally and should only be used at your own risk.

 

Frequently Asked Questions

 

Q: Will this impact my backups?

A: No.

 

Q: What types of Avamar nodes are affected by this issue? Does this issue affect both physical Avamar nodes and Avamar Virtual Edition?

A: This issue affects physical Avamar nodes of all supported hardware configurations as well as Avamar Virtual Edition. The issue affects services running on the Utility Node (or Single Node for Single Node Servers).

 

Q: Is the remote access hardware (DRAC, RMM, etc.) on physical Avamar Servers affected by this issue?

A: For Gen4 and Gen4S, yes. You can use http instead of https as a workaround. The tools and procedures for replacing these certificates need to be tidied up before they can be made available for customer use. This effort is ongoing. The SSL certificate used for the remote access interface on Gen4T hardware uses a stronger signature algorithm, so Gen4T is not affected by this issue.

 

Q: What types of NetWorker servers are affected by this issue? Does this issue affect physical NetWorker servers, NetWorker servers installed in virtual environments, or the NetWorker Virtual Edition (NVE) appliance?

A: This issue affects only the NetWorker Virtual Edition (NVE) appliance. The issue does not affect physical NetWorker servers or virtual NetWorker servers where the NetWorker software has been installed on a "beige box" Windows or Linux virtual machine.

 

Q: Is the NetWorker Virtual Backup Appliance (VBA) affected?

A: Yes.

 

Q: Is any other NetWorker software affected?

A: No. All other NetWorker certificates use SHA-256 or SHA-512 signatures.

 

Q: Exactly what problem(s) will this issue cause?

A: Starting with Chrome 56, the Chrome browser can no longer access certain browser-based interfaces on the Avamar Server, AVE, NVE, or VBA. For other browsers, an additional warning message will be displayed indicating that the server's certificate is using a weak signature algorithm. If you browse to https://myserver.example.com to access the "Documents and Downloads" page, you will receive a certificate error. Depending on the URL used to access certain features, there may also be issues accessing Avamar Installer, DTLT, or other browser-based services.

 

Q: What Avamar software services are affected?

A: The Apache Web Server and any interfaces that use it are affected. That means only browser-based services like the Documents and Downloads page, DTLT, Avamar Client Manager, Proxy Deployment Manager, etc. will be affected. The Avamar Extended Retention (AER) GUI is also affected.

 

Q: Are the Avamar Administrator Server (MCS) or Avamar Administrator GUI (MC-GUI) affected?

A: No. While the MCS does use a SHA-1 certificate, these services do not use a browser engine and are therefore unaffected by this issue. The procedure for replacing this certificate is in development.

 

Q: How do I fix it?

A: The affected certificates will need to be replaced. See KB 493774 for the instructions to replace the Apache Web Server certificate and KB 467848 for the instructions to replace the AER GUI certificate. Ideally, these certificates should be replaced with certificates signed by an internal or external certificate authority (CA), but the KBs also include instructions for regenerating the certificates as self-signed certificates using SHA-256.
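
If you go the CA-signed route, the request itself is plain openssl. The file names and subject below are placeholders for illustration (not the values from the KBs), so follow the KB for the real paths:

# Generate a new private key and a SHA-256 certificate signing request to submit to your CA
openssl req -new -newkey rsa:2048 -sha256 -nodes -keyout server.key -out server.csr -subj "/CN=myserver.example.com"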


Q: One of the commands in the KB failed.

A: Probably a typo. Please, please copy and paste. If you get really stuck, feel free to post in the Avamar or NetWorker Community (I watch both) or reply here.


Q: I completed the procedure in KB 493774. Why does my certificate still say it's using SHA-1?

A: Make sure you're looking at the signature algorithm, not the certificate fingerprints. It's the signature algorithm that matters.
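
A quick way to check from any machine with openssl (assuming the Apache service is listening on port 443, as in the example URL above):

# Pull the server certificate and print its signature algorithm
echo | openssl s_client -connect myserver.example.com:443 2>/dev/null | openssl x509 -noout -text | grep "Signature Algorithm"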


Q: Can I contact support about this?

A: Yes, but please don't. The instructions for checking and replacing the certificates are fairly straightforward, and we expect this issue to put a strain on the support team as it is.


Q: Can I use SHA-512 instead?

A: Go for it. In the openssl command for signing the certificate, just replace the -sha256 flag with -sha512.
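
For illustration only -- this is the generic openssl pattern, not the exact command from the KB, and the file names and validity period are placeholders:

# Regenerate a self-signed certificate with a SHA-512 signature from an existing key
openssl req -x509 -new -sha512 -key server.key -out server.crt -days 1825 -subj "/CN=myserver.example.com"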


Q: Does this issue affect the security of the system?

A: Not yet. While SHA-1 has been broken (see the "Shattered" attack), it would still take significant time (days or months) and resources (potentially six figure dollar amounts) to generate a collision. It's time to walk to the exits in an orderly fashion (by replacing any SHA-1 certificates still around with more secure SHA-2 or SHA-3 certificates), not time to panic.


Entirely apart from the weakness of SHA-1 certificates, if you're still using the self-signed certificates that ship with these systems, worrying about the SHA-1 signatures is akin to worrying that the picture on your fake ID might be fake. Self-signed certificates provide no assurance of identity. If you really want to make sure your system is secure, you must use certificates signed by a certificate authority.


Q: I heard Avamar uses SHA-1 internally. Is that true? Isn't that a security risk?

A: The Avamar de-dupe and storage mechanisms use SHA-1 hashes internally. However, these hashes are not used in a security-sensitive context. Avamar also has a number of mechanisms in place to detect SHA-1 collisions and help protect the integrity of the data even in the event of a collision. You're much more likely to be affected by disk errors than SHA-1 collisions in the de-dupe engine.


Q: The Webkit svn repository fell over when somebody checked the "Shattered" PDFs into it. Would backing up these two PDFs cause an issue on the Avamar server?

A: No. Avamar uses sub-file hashing when backing up and restoring files. The Avamar architecture is resilient against this type of issue since we don't use the SHA-1 hash of the whole file to uniquely identify it. Since there's sometimes a difference between theory and practice, I also specifically tested this scenario in the lab. The two PDFs backed up and restored without issue and there was no impact to the Avamar server.


Q: I have another question.

A: Please reply here or post in the Avamar or NetWorker community.

One of my colleagues recently forwarded a question he had received about how to find the count of virtual machines by domain. Since the machine type is stored in the MCS database, a SQL query of the mcdb views seemed like the most reasonable way to retrieve the data.

 

The first step is to connect to the mcdb as the viewuser. Details on various ways to do this are included in the Administration Guide.

 

I connected to the database by logging into the Avamar utility node as the admin user and running the following command at the shell:

psql -p 5555 -U viewuser mcdb

 

Once in the psql terminal, the following query will return a table with the VM counts:

SELECT substring(full_domain_name FROM '(.*)/[^/]+$') AS domain, count(client_name) AS vm_count
FROM v_clients
WHERE client_type='VMACHINE'
GROUP BY domain;

 

Here's an example of the output from a test system:

mcdb=# SELECT substring(full_domain_name FROM '(.*)/[^/]+$') AS domain, count(client_name) AS vm_count FROM v_clients WHERE client_type='VMACHINE' GROUP BY domain;
                  domain                  | vm_count
------------------------------------------+----------
/vcenter.example.com/VirtualMachines     |        4
/MC_RETIRED                              |        1
/vcenter.example.com                     |        1
(3 rows)
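
If you'd rather run the same query non-interactively from the utility node shell, something like the following should also work (same viewuser connection; the backslash before the $ just keeps the shell from touching the regex anchor):

psql -p 5555 -U viewuser -d mcdb -c "SELECT substring(full_domain_name FROM '(.*)/[^/]+\$') AS domain, count(client_name) AS vm_count FROM v_clients WHERE client_type='VMACHINE' GROUP BY domain;"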


Hotfix Numbering Jump

Posted by Ian Anderson Aug 11, 2014

This is more of a tidbit but I thought it was worth posting. Some of you may have noticed that there was a huge jump in Avamar hotfix numbers. This jump was caused by a tools consolidation inside DPAD. The bug database numbers jumped when bug data from two separate bug tracking systems was merged.

On Backup Jobs in Avamar

It's been a while since I've posted anything here so I thought I would put together a short-but-hopefully-informative overview of the various types of backup jobs you might see in Avamar.


Scheduled Backups

Scheduled backups are started by the Avamar Administrator Server (MCS) based on the configured groups and their associated schedules. These backups appear in the activity monitor and report the group name and schedule name in the description of the job, the log files, etc. It's possible to manually start a "scheduled" backup by opening the "Manage Schedules..." page in the GUI, selecting a schedule, and using the "Run Now" button.

 

On-Demand Backups

There are two types of on-demand backups in Avamar.

 

"MOD" or "MCS On-Demand" backups are one-off backups initiated through the Avamar Administrator GUI or mccli. Manually starting a group or using the "Backup and Restore" window in the GUI will run an MCS On-Demand backup.

 

The second type of on-demand backup is the "COD" or "Client On-Demand" backup (no, not Call of Duty). Client On-Demand backups are started through the Avamar tray icon on the client itself. DTLT is a special case here -- backups started through the DTLT interface are not COD backups because the backup request comes from the DTLT application, not from the client itself.

 

For MOD and COD backups, you will see those terms in the job descriptions and log file names.

 

Naked Ad-Hoc Backups

Finally, we have "NAH" or "Naked Ad-Hoc" backups. For other backup types, the MCS generates a workorder that is picked up by the agent on the client and executed. Naked Ad-Hoc backups start up avtar directly and do not use a workorder. NAH backups are mainly used when an outside application has control over job scheduling, such as with NetWorker / Avamar integration or Oracle RMAN backups. NAH backups do not appear in the Avamar Administrator's Activity Monitor.

A fair number of the questions I've answered recently in the Avamar community required some use of the avtar client binary at a command prompt, so I figured I would gather up some of that knowledge here and just post a link to this entry if these types of questions come up again.

 

avtar is like tar

The first thing to learn about avtar is that the syntax was designed to be similar to the UNIX / Linux tar command (avtar -- get it?). When I mention this to my training classes, you can almost see the light bulb turn on.

 

To get a list of files and directories:

tar -tf filename.tar <options>

avtar -t <options> file1 file2 dir1

 

To extract (restore) files and directories:

tar -xf filename.tar <options> file1 file2 dir1

avtar -x <options> file1 file2 dir1

 

The avtar commands can be run from any system where the client software is installed, or from the utility node of the server. Note that it is not possible to restore files to a remote system using avtar (the files will always be restored locally). Several options are required for avtar to make a connection to the server:

 

--server=<server hostname or IP>

--id=<username>

--ap=<password>

--path=<client domain and account name>

 

Edit: You may also see the client domain and account name specified using the --account flag. The --account and --path flags are interchangeable.

 

Example 1

avtar -t --server=avamar1.example.com --id=ianderson@avamar/ --ap=Password1 --path=/clients/testclient.example.com

 

This example command would:

  1. attempt to log into the Avamar server using the user account "ianderson" in the / (root) domain
  2. get a listing of the files in the most recent backup for the client "testclient.example.com" in the /clients domain
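
For reference, here's the same command written with the interchangeable --account flag mentioned in the edit above (everything else is unchanged):

avtar -t --server=avamar1.example.com --id=ianderson@avamar/ --ap=Password1 --account=/clients/testclient.example.com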

 

Other avtar Commands

There are some other avtar commands that can be useful as well:

avtar --backups

avtar --showlog

 

The "backups" command will list the backups of the specified client that are on the server.

 

The "showlog" command will print out the log of the session that created the specified backup (up to the point where the backup was "sealed" or finalized on the server).

 

The required options from above (server, id, ap and path) are also required for these commands.
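
Using the same made-up server, account, and client as Example 1, those commands would look something like this (I'm assuming --labelnum, described below, is how you point --showlog at a particular backup):

avtar --backups --server=avamar1.example.com --id=ianderson@avamar/ --ap=Password1 --path=/clients/testclient.example.com

avtar --showlog --labelnum=214 --server=avamar1.example.com --id=ianderson@avamar/ --ap=Password1 --path=/clients/testclient.example.com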

 

Useful avtar Options

Besides the required options above, there are some additional avtar options that may be useful depending on your needs:

 

--encrypt=ssl

Enable SSL encryption.

 

--labelnum=<backup number>

Operate on the specified backup instead of the most recent backup (you can find backup label numbers in the GUI, by using avtar --backups or by using mccli backup show).

 

--target=<path where files should be written>

During restore, write the files and folders being restored to the specified directory or folder.

 

Example 2

avtar -x --server=avamar1.example.com --id=ianderson@avamar/clients --ap=Password1 --path=/clients/testclient.example.com --labelnum=214 --target=. /home/testuser/testfile.txt

 

In this example, the command would:

  1. attempt to log into the Avamar server using the user account "ianderson" in the /clients domain
  2. restore a file called /home/testuser/testfile.txt from backup number 214 of the client "testclient.example.com", writing the data to the current working directory

 

Avamar User Authentication Syntax

By far the most confusing part of the avtar syntax is the --id parameter so I'll break it down a bit.

 

We're going to be using the word "domain" a lot here and (unfortunately) it means at least two different things in Avamar:

  • An authentication domain is how the software decides whether to query an external authentication system (LDAP or NIS) or use the built-in Avamar authentication mechanism. The authentication domain for an LDAP or NIS server will normally be the DNS domain name associated with the directory service (e.g., example.com). For built-in authentication, we use the hard-coded "avamar" authentication domain.
  • An Avamar domain is a container in the Avamar server's user accounting system. For example, when we talk about the "clients domain", we are referring to the Avamar domain called /clients. The top level container is called the root domain and it is denoted by the slash character / (like in UNIX or Linux). These containers can hold other domains (called sub-domains), Avamar user accounts, and client accounts (also called machine accounts).

 

Now that we have (hopefully) clarified what we're talking about when we say "domain", the general syntax of the --id parameter is as follows:

--id=<username>@<authentication domain>

 

The examples given above use the "avamar" authentication domain which is the built-in Avamar authentication mechanism. For accounts in the avamar authentication domain, an additional parameter -- the user account path -- is required. The syntax then becomes:

--id=<username>@avamar<user account path>

 

This user account path is the Avamar domain containing the specified user account (e.g. /clients). In the previous example, the id flag used was:

--id=ianderson@avamar/clients

 

This flag tells the Avamar server that the user account uses Avamar authentication, that the account is in the /clients Avamar domain and that the account name is "ianderson". Because external authentication requires an authentication domain, Avamar authentication is assumed if the authentication domain isn't specified. This means we could abbreviate the example to:

--id=ianderson@/clients

 

If instead we were using LDAP authentication, the --id parameter would look something like this:

--id=ianderson@example.com

 

Notice that we don't specify an Avamar domain here. We don't need the Avamar domain because this account is not an Avamar user account.

 

Note: Unfortunately avtar only supports legacy LDAP authentication -- the avtar command line does not support the LDAP Maps feature that was introduced in Avamar 6.1. The developers are aware of this and support for LDAP Maps is planned for a future release.

 

A Word of Warning

The avtar binary also has the ability to create new backups. As a general practice, running avtar backups from the command line is not recommended, for several reasons:

  • The Avamar Administrator Server is not aware of these command line backups (called "naked ad-hoc" backups) so they cannot be monitored or managed through mccli or the Avamar Administrator GUI (though they are visible using the session monitor)
  • It's easy to make a mistake. Specifying the wrong path may write a backup to the wrong account on the Avamar server, for example.
  • Running avtar at the command line may cause the password for the Avamar user account to be visible in plain text in process lists, history files, etc. One way to avoid this is to omit the --ap flag, in which case avtar will prompt for a password (see the example below).
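
Using the values from Example 1, that looks like this -- avtar will prompt for the password interactively:

avtar -t --server=avamar1.example.com --id=ianderson@avamar/ --path=/clients/testclient.example.com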

If command line backups are needed, the recommended approach is to use mccli. The mccli utility is available for RHEL systems (and SLES systems starting in Avamar 6.1.1-87) and can also be run from the Avamar utility node. Ignore this advice at your peril.

 

Wrap-up

So those are the basics of avtar. If you need more information on the syntax, you can find it in the client guide or in the output of avtar --help. If you need some specific guidance or you have any questions, please don't hesitate to comment here, post in the Avamar community, or get in touch with support.

Waiter, I'll Have the Daily Maintenance with a Side of Backups

If you've been paying attention to the Avamar space, by now I'm sure you've heard the news that garbage collection in the next Avamar release will no longer require a read-only window1. Great! Fantastic news. Now backups can run... whenever! Any time! 24x7!

 

Unfortunately, as the saying goes, there's no such thing as a free lunch. So what's this lunch going to cost?

 

Performance Impact

Server resources are finite and during periods of strenuous activity, the bottleneck on the server is generally the storage I/O performance. The most intensive I/O operations on an Avamar server are:

  • Garbage collection
  • The daily data integrity check (a.k.a. hfscheck), specifically the "indexsweep" phase and the "refcheck" phase.
  • The peak of backup activity (which is typically shortly after the start of the backup window)

 

The performance team is currently running tests to quantify the impact of overlapping garbage collection and backups but the bottom line is that -- due to resource contention -- overlapping these operations is guaranteed to take longer than allowing them to run serially.

 

Session Limits

The same session limits that apply during hfscheck will apply during garbage collection. As of today, this limit is set at 20 sessions per node but this may change before release.

 

Gotchas

I'll have to give a little bit of background on the changes to the garbage collect algorithm that allow backups to run at the same time before I can explain one of the most important "gotchas" to this new functionality.

 

Essentially, a new flag was added to the system that records whether or not each chunk2 was recently referenced by a backup. If any client makes a move that looks like it might lead to that chunk being added to a new backup, the chunk will be flagged as referenced. The garbage collector will operate as it always did (finding unreferenced chunks and deleting them), except that the deletion of any chunk with this "referenced" flag set will be aborted.

 

Once garbage collection has finished and the number of running backups3 drops to 0, all of these "referenced" flags will be reset. In order for these flags to reset, the system must be idle -- no backups, no incoming replication, and no garbage collection. If the system is configured to run backups 24x7 with no idle time at all, it's theoretically possible for all of the data on the system to remain locked indefinitely. The flag reset is extremely fast (less than a minute) but it's critically important. Compared with a daily three to five hour read-only blackout window, this is a pretty dramatic improvement (but that hasn't stopped the support team from starting a pool about how long it will take for a customer to open an SR because they've been bit by this).

 

Best Practices

The best practices are currently preliminary (the software isn't out yet so these may change once the product starts to roll out to the field) but the current recommendations from Engineering are:

  • Avoid overlapping backups with garbage collection unless absolutely necessary.
  • If you must overlap backups and garbage collection, try to schedule the overlap for the period of lowest backup activity on the system. For most customers this will be at the end of the backup window when only the longest running clients are still backing up.
  • Ensure that there is a brief window every day where the system is completely idle so the system has a chance to reset the "referenced" flags.

 


1 Consider this the standard disclaimer about features and release dates being subject to change.

2 Some detail removed for the sake of simplicity.

3 A "running backup" for the purposes of garbage collection is any backup sending a data stream to the Avamar server. Restore sessions, "progress" avtar sessions, and (most) Avamar / DDR integrated backups do not count.

I recently received a tweet from @nicnicGS3 asking about Avamar capacity information in the PostgreSQL database. You can find information about the capacity of the system in the v_node_space view.

 

For this example, I've used psql on the utility node to log into the database directly but any of the connectivity methods from the Avamar Administration Guide can be used.

 

If you want to see the capacity utilization on a per-node and per-partition basis, you can use the following SELECT statement:

 

SELECT node, disk, utilization FROM v_node_space WHERE date_time = (SELECT MAX(date_time) FROM v_node_space);

 

Here's an example of the output:

mcdb=# SELECT node, disk, utilization FROM v_node_space WHERE date_time = (SELECT MAX(date_time) FROM v_node_space);
node | disk | utilization
-----+------+-------------
0.0  |    0 |      12.92
0.0  |    1 |      12.77
0.0  |    2 |      12.77
0.1  |    0 |      12.92
0.1  |    1 |      12.77
0.1  |    2 |      12.92
0.2  |    0 |      12.92
0.2  |    1 |      12.77
0.2  |    2 |      12.62
0.3  |    0 |      13.08
0.3  |    1 |      12.77
0.3  |    2 |      12.92
(12 rows)

 

If instead you want to see the overall server capacity (the number that would appear as the overall utilization of the system in the Avamar Administrator GUI), you can run the following SELECT statement instead.

 

SELECT MAX(utilization) as system_utilization FROM v_node_space WHERE date_time = (SELECT MAX(date_time) FROM v_node_space);

 

Here's an example of the output from the same system:

mcdb=# SELECT MAX(utilization) as system_utilization FROM v_node_space WHERE date_time = (SELECT MAX(date_time) FROM v_node_space);
system_utilization
--------------------
              13.08
(1 row)
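
If you want to grab that overall number from a script (a cron job or a monitoring check, say), here's a sketch of a non-interactive version using psql's tuples-only (-t) and unaligned (-A) flags:

psql -p 5555 -U viewuser -d mcdb -t -A -c "SELECT MAX(utilization) FROM v_node_space WHERE date_time = (SELECT MAX(date_time) FROM v_node_space);"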

 

Notes:

1. The capacity information in the mcdb may be stale by up to 5 minutes.

2. The utilization numbers stored in the v_node_space table are normalized to the user capacity limit of the system (in other words, the utilization numbers are out of 100%, not out of the 65% you would see in tools such as status.dpn).

Important Edit (2013-03-06): *** The node add workflow for Avamar 6.1 has been temporarily withdrawn. The manual node add procedure is live in the procedure generator now. ***

 

I recently came across a tweet by @MennodeLiege linking to a post on his blog discussing the new Avamar 6.1 Node Add procedure (and some of its shortcomings). Rather than posting a lengthy comment, I thought it would be worthwhile to discuss it here. In particular, I wanted to provide some explanations about certain quirks of the process.

Besides this, you need to wipe [all] the files below the /etc/sysconfig/network folder.

...

 

When the node addition procedure starts, you need to be sure the networking configuration is wiped out of the specified location. Why is EMC doing this? First of all, they want to make it easier for guys like us to add nodes to the grid.

The main goal of the "workflow" (the software bundled in the AVP package that drives the installer) is definitely to make the node add procedure easier for people performing it in the field.

 

Avamar nodes shipped fresh from the factory do not have an IPv4 address assigned to them. Knowing this, you might wonder why the procedure has instructions to remove the network configuration from the node. Well, one of the other things the nodes do not have when shipped from the factory is the kernel update for the 208 day uptime bug. The workflow currently does not have the ability to install this patch on the node(s) automatically, so until the workflow is updated to perform this task, the patch has to be installed manually. To copy the patch over to the new node, the node has to have an IP address. Once the patch has been applied, the IP is removed, the node is rebooted and the workflow can proceed.

The workflow will now try to detect new nodes connected to the internal switches by using some kind of broadcast.

This part is actually pretty clever. The workflow running on the utility node is actually querying the switch to find out the MAC address of the new node and transforming that MAC address into an IPv6 Link Local Address. Knowing this link local address lets us connect to the new node on the internal network before we have assigned the internal IPv4 address to it. How cool is that?
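
For the curious, the MAC-to-link-local conversion is just the standard EUI-64 transformation, which you can reproduce in a few lines of bash. This is a rough sketch with a made-up MAC address, not the workflow's actual code:

# Example MAC address as learned from the internal switch (made up)
mac="00:25:90:ab:cd:ef"

# Split the MAC into its six octets
IFS=':' read -r o1 o2 o3 o4 o5 o6 <<< "$mac"

# Flip the universal/local bit (0x02) of the first octet
o1=$(printf '%02x' $(( 0x$o1 ^ 0x02 )))

# Insert ff:fe in the middle and prefix with the fe80:: link-local scope
printf 'fe80::%s%s:%sff:fe%s:%s%s\n' "$o1" "$o2" "$o3" "$o4" "$o5" "$o6"
# -> fe80::0225:90ff:feab:cdef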

 

At this point I should mention that the steps to install patches and remove the network configuration only apply to Gen4 hardware. We can't use this switch magic on Gen3 (or older) hardware since we don't have internal switches on older ADS hardware. That means Gen3 nodes have to have an IP configured.

Now he will compare this to the dpnnetutil.xml file, which is a file you generated at the initial grid installation. This means that when you changed the networking configuration on an Avamar OS level after the initial installation, the dpnnetutil.xml is not up to date anymore. Now the node addition workflow will give you all kinds of errors resulting in contacting EMC support most of the time.

 

Now you manually updated the dpnnetutil.xml file, which cost you quite some time I must say! Now you can proceed with the workflow.

I have bad news, I have good news and I have better news.

 

The bad news is that, unfortunately, you're stuck manually updating dpnnetutil.xml for the time being. If you run into trouble, please don't hesitate to call support since we have several people in every time zone who are experienced with updating this file.

 

The good news is that my team (the Application Engineering Team) is developing a tool to re-synchronize the dpnnetutil.xml file. This tool should be available to field personnel in the next couple of weeks.

 

The better news is that long term, the workflow will be updated to re-synchronize the dpnnetutil.xml file automatically.

In the dpnnetutil procedure, even when using subinterfaces, you need to specify an IP configuration for the Bond0, besides doing this for the subinterfaces. Something you can easily workaround by configuring the networking part by yourself without using dpnnetutil. Now this is not possible anymore! Even the node addition workflow needs to have an IP configured on the Bond0 or else the workflow will fail again and you again need to contact EMC support. I needed to do this and it took EMC 4 days to find a solution for this issue. Besides this, they do not want to share this solution with me at this moment which is something I can understand.

Unfortunately during development of the workflows, an assumption was made that there would always be an IP address on the "untagged" bond0 interface. This assumption has carried through to the current workflows. The developers are aware of the problem and are working on removing this limitation but there is a fair amount of code that has to be changed. I do not know when this limitation will be removed but in the meantime, the support team has a workaround available.

 

The reason the workaround isn't being released publicly is that it involves "short-circuiting" some very important configuration and health checks of the server so those checks need to be done by hand or Very Bad Things might happen.

 

If you have any questions or comments on this post specifically or about the new node add procedure generally, please don't hesitate to leave a comment here and I will answer as quickly as I can.

 

If you're interested in test-driving the dpnnetutil.xml synchronization tool ("nodedb_sync") and you're willing to provide feedback on your experience with it, send me a PM with your e-mail address. I may be able to arrange early access for you.

 

Edit: Removed roaming herds of typos.
