Configure Data Repository

Configure Data Repository for automated backups.

Follow these steps:

  1. Log in to Data Repository as the Linux user account for the database administrator user.

    Note: In a cluster installation, you can log in to Data Repository from any of the three hosts that are participating in the cluster. However, we recommend logging in to the Data Repository host that will initiate the backups.

  2. To create a reusable configuration file for backing up and restoring Data Repository, type the following command as the Linux user account for the database administrator user:
    /opt/vertica/bin/vbr.py --setupconfig
    

    Note: We recommend launching this command in the target directory for the configuration file. The Linux user account for the database administrator user must have privileges to write to that directory.

    You are prompted to provide answers to various questions and statements. The list of questions and statements and a description of their typical answers are as follows:

  3. Back up Data Repository. Type the following command:
    /opt/vertica/bin/vbr.py --task backup --config-file configuration_directory_path_filename
    
    configuration_directory_path_filename

    Indicates the directory path and filename of the configuration file you created previously. This file is located where you ran the backup utility (/opt/vertica/bin/vbr.py).

    For example:

    /opt/vertica/bin/vbr.py --task backup --config-file /home/vertica/vert-db-production.ini
    

    If you are prompted about the authenticity of the host, answer yes.

    Note: In a cluster installation, you only have to perform this step on one of the hosts that are participating in the cluster.

    Data Repository is backed up.

  4. (Optional) If you do not want to retain the Data Repository password in clear text for future manual backups, complete the following steps in the configuration file:
    1. Verify that the following line exists under the [Database] section:
      dbPromptForPassword = True
      
    2. Remove the following line from the [Database] section:
      dbPassword = password
      

    Note: For automated backups, the dbPassword line must remain in the configuration file with a corresponding password. Set the dbPromptForPassword to False.

  5. Do the following to set up an automated daily backup (recommended) of Data Repository:
    1. Open your preferred text editor to create a new wrapper shell script.
    2. Add the following single line to the wrapper shell script:
      /opt/vertica/bin/vbr.py --task backup --config-file configuration_directory_path_filename
      
      configuration_directory_path_filename

      Indicates the directory path and filename of the configuration file you created previously. This file is located where you ran the backup utility (/opt/vertica/bin/vbr.py).

      For example:

         /opt/vertica/bin/vbr.py --task backup --config-file /home/vertica/vert-db-production.ini
      
    3. Save the contents to a new file named backup_script.sh in a location of your choice.

      For example:

      /home/vertica/backup_script.sh
      
    4. Change permissions for running the script by typing the following command:
      chmod 777 location_backup_script.sh/backup_script.sh
      

      For example:

      chmod 777 /home/vertica/backup_script.sh
      
    5. As the Linux user account for the database administrator user, type the following command:
      crontab -e
      
    6. Add a cron job that will run the backup script that you created previously.

      Note: We suggest that you create a cron job to run the script daily at an off-peak time.

      For example:

      00 02 * * *   /home/vertica/backup_script.sh >/tmp/backup.log  2>&1
      

      This example cron job will run the backup script every day at 2:00 AM.

      Important! The first time you back up Data Repository, a full backup is performed. This full backup can take a considerable amount of time to complete, depending on the amount of historical data that exists. After the initial backup, subsequent scheduled backups are incremental. With a daily backup, each incremental backup only has to account for database activity that occurred since the last backup (that is, within the last 24 hours).
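
The single-line wrapper script described above is sufficient for a scheduled backup. As an optional variation, the following sketch shows a wrapper that first checks whether the configuration file looks usable for an unattended run (a dbPassword line is present and dbPromptForPassword is not set to True), and then writes the backup output to a dated log file instead of overwriting /tmp/backup.log. The names VBR_BIN, CONFIG_FILE, LOG_DIR, config_ready_for_cron, run_backup, and the --run argument are illustrative and are not part of the product. The sketch also assumes that vbr.py returns a nonzero exit status when a backup fails, which is not confirmed here. Treat it as a starting point, not a supported script.

```shell
#!/bin/bash
# Illustrative backup wrapper (a sketch, not a supported script).
# VBR_BIN, CONFIG_FILE, and LOG_DIR are hypothetical names; adjust the
# defaults to match your installation.
VBR_BIN="${VBR_BIN:-/opt/vertica/bin/vbr.py}"
CONFIG_FILE="${CONFIG_FILE:-/home/vertica/vert-db-production.ini}"
LOG_DIR="${LOG_DIR:-/home/vertica/backup_logs}"

# Return 0 if the configuration file looks usable for an unattended backup:
# it is readable, has a non-empty dbPassword line, and does not set
# dbPromptForPassword to True. This is a simple line check; it does not
# verify that the lines appear under the [Database] section.
config_ready_for_cron() {
    local cfg="$1"
    [ -r "$cfg" ] || return 1
    grep -Eq '^[[:space:]]*dbPassword[[:space:]]*=[[:space:]]*[^[:space:]]' "$cfg" || return 1
    if grep -Eiq '^[[:space:]]*dbPromptForPassword[[:space:]]*=[[:space:]]*true' "$cfg"; then
        return 1
    fi
    return 0
}

# Run the backup and append all output to a dated log file.
# Returns 2 if the configuration check fails; otherwise returns whatever
# exit status the backup command reported.
run_backup() {
    local log_file status
    mkdir -p "$LOG_DIR" || return 1
    log_file="$LOG_DIR/backup_$(date +%Y%m%d).log"
    if ! config_ready_for_cron "$CONFIG_FILE"; then
        echo "$(date) configuration not suitable for unattended backup: $CONFIG_FILE" >>"$log_file"
        return 2
    fi
    echo "$(date) starting backup" >>"$log_file"
    "$VBR_BIN" --task backup --config-file "$CONFIG_FILE" >>"$log_file" 2>&1
    status=$?
    echo "$(date) backup finished with exit status $status" >>"$log_file"
    return "$status"
}

# Start a backup only when called with --run, so that the functions can be
# sourced and inspected without triggering a backup.
if [ "${1:-}" = "--run" ]; then
    run_backup
    exit $?
fi
```

With this variation, the cron entry would call the script with the --run argument, for example: 00 02 * * * /home/vertica/backup_script.sh --run. Because cron runs the job as the owner of the crontab, and the configuration file holds the database password for automated backups, it is reasonable to restrict access to both files to that user.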