Backing Up Archived Files
Archiving and backing up data are different operations:
- Archiving makes a copy of a data set for long-term preservation, with no regard to the state of the system.
- Backing up makes a copy of a data set with the express goal of recovering the system after a failure.
Preserving the integrity of archived data and the corresponding indexes is critical. Like all electronic data, archived data is subject to risks unless properly managed with respect to integrity and recoverability. Indexes can be rebuilt from the source data (i.e., archives), but if source data is destroyed, it cannot be regenerated from the indexes.
The following five data types have to be backed up:
- Indexes, stored as .XML
- Email, stored as .XML
- Attachments, or ATT, stored in their native formats
- Audits, or ADT, stored as .XML
- System configurations, stored as LDAP
When evaluating backup solutions for protecting archived data, consider the following:
- Data type
- Volume of data
- Required backup frequency
- Recovery requirements
For example, audit files consist of millions of small documents and require a tool designed to handle them. The average user sends and receives approximately 7,500 messages per year, so a company with 1,000 users generates approximately 7.5 million messages per year. Choose a backup tool and schedule with these volumes in mind.
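The volume estimate above is simple multiplication, and it can be rerun with your own figures. The following is a minimal sketch; the function name estimate_messages_per_year is hypothetical and is not part of Netmail.

```shell
# Rough yearly message volume: messages per user per year times number of users.
# estimate_messages_per_year is a hypothetical helper, not a Netmail command.
estimate_messages_per_year() {
  per_user="$1"
  users="$2"
  echo $(( per_user * users ))
}

# Figures from the example above: 7,500 messages per user, 1,000 users.
estimate_messages_per_year 7500 1000
```

Substitute your own per-user average and user count to size the backup target.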
Solr has a built-in ability to deposit the contents of its shards to a specified location. This functionality can be used to back up index files. Using the web console, a backup can be triggered much like a regular archive job.
- Create a shared location.
  - The backup assumes that all Solr servers in the cluster can see the same location using the same path. The easiest way to do this is with a network share mounted to the same place on all servers.
  - Create a folder on the C:\ drive of the master Archive server called C:\SolrBackup.
  - Share this folder on the network and give the local 'netmail' user read/write permissions.
- Mount the shared location.
  - Since this is a Windows fileshare, you need to install CIFS components on the CentOS machines so they can read from and write to it:
    - yum install samba-client samba-common cifs-utils
  - Create a mount point:
    - mkdir /mnt/backup
  - Edit the /etc/fstab file to add this line, which mounts the fileshare at the mount point created above (note the forward slashes in the share path):
    - //MasterArchive/SolrBackup /mnt/backup cifs user,uid=500,rw,suid,username=netmail,password=M3ss4g1ng,domain=MasterArchive 0 0
  - Instruct the OS to mount it:
    - mount /mnt/backup
  - Test the location by writing a file to /mnt/backup. It should appear in C:\SolrBackup on the master Archive server.
- Configure the job.
  - Log in to the Netmail web console and navigate to Indexing.
  - In the Backup tab, name the backup job.
  - Specify the path that the index servers must use to access the shared location, i.e., the mount point created earlier (/mnt/backup).
  - Click Save.
- Run the job.
  - In the Backup tab, click Run. If everything is configured correctly, data appears in C:\SolrBackup.
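The write test from the mounting step can be scripted so that it is repeatable on every index server. The following is a minimal sketch and not part of Netmail: the function name check_backup_dir and the probe file name are hypothetical. It only confirms that the directory exists and accepts a write; it does not confirm that the file reached the Windows share.

```shell
# Write a probe file into a backup directory, read it back, then remove it.
# Prints "ok" and returns 0 on success; prints a reason and returns 1 otherwise.
# check_backup_dir and the .backup_probe_* file name are hypothetical.
check_backup_dir() {
  dir="$1"
  probe="$dir/.backup_probe_$$"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  # Group the write so that a redirection error message is suppressed.
  if ! { echo "probe" > "$probe"; } 2>/dev/null; then
    echo "not writable: $dir"
    return 1
  fi
  content=$(cat "$probe")
  rm -f "$probe"
  if [ "$content" != "probe" ]; then
    echo "read-back mismatch: $dir"
    return 1
  fi
  echo "ok"
}
```

On each CentOS index server, run check_backup_dir /mnt/backup as the user that Solr runs under. Any output other than "ok" suggests the share is not mounted or the permissions on C:\SolrBackup need review.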
The following steps show how to identify the email files that need to be backed up.
If you are using IPRO Archive Store, copies of the email data can be made across disks, clusters, and sites. In this case, traditional backup is not required.
The following steps show how to identify the attachment files that need to be backed up.
If you are using IPRO Archive Store, copies of the attachment data can be made across disks, clusters, and sites. In this case, traditional backup is not required.