

Working with Controller Impex Volume

The impex volume is important, but not critical, for controller and grid operation. It is used to import and export volumes, classes and appliances, so it is the channel through which the grid and the controller exchange data with the outside world (leaving aside the connections of the individual applications and appliances). It does not contain any file that the controller needs in order to boot, however, and if it becomes 100% full or corrupted, this does not by itself disrupt controller operation.

The impex volume is the only controller volume whose size may be set at grid installation; neither the boot nor the meta volume size is configurable. It is an unpartitioned volume, which makes it simpler to work with. Its default size is 64 Gbyte, which may be too small for many customers' environments. Also unlike the meta and boot volumes, the impex volume is created empty. However, since it is mounted on the controller at startup, it is not simple to resize it or to rebuild it when that becomes necessary. This document presents procedures for verifying its integrity, temporarily increasing its size and, ultimately, replacing it altogether.

The impex volume is the /dev/hdc device, mounted on /vol/_impex. Running df -kh on the controller shows it as:

Filesystem                              Size  Used Avail Use% Mounted on
/dev/hda1                               1.2G  873M  272M  77% /
/dev/hdb                               1008M   33M  925M   4% /var/applogic
none                                    385M   48K  385M   1% /dev/shm
127.0.0.1:/var/local/lib/3trsh          1.2G  873M  272M  77% /vol
127.0.0.1:/var/local/lib/3trsh_lib/etc  1.2G  873M  272M  77% /vol/etc
127.0.0.1:/var/local/lib/3trsh_lib/lib  1.2G  873M  272M  77% /vol/lib
/dev/hdc                                 63G   50G   11G  83% /vol/_impex
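Since a full impex volume gets in the way of imports and exports, it can be useful to read its usage from a script rather than by eye. The following is only a minimal sketch and not part of AppLogic: the function name impex_use_percent is hypothetical, and it assumes the one-line-per-filesystem layout produced by df -P, read from standard input.

```sh
# Print the Use% figure (without the % sign) for one mount point,
# reading `df -P` style output from standard input.
# impex_use_percent is a hypothetical helper name, not an AppLogic tool.
impex_use_percent() {
    # $1: mount point to look for (defaults to /vol/_impex)
    awk -v mp="${1:-/vol/_impex}" '
        NR > 1 && $NF == mp {
            pct = $(NF - 1)        # the Use% column sits before the mount point
            sub(/%$/, "", pct)
            print pct
            found = 1
        }
        END { if (!found) exit 1 } # non-zero status when nothing is mounted there
    '
}
```

On the controller it could be used as df -P /vol/_impex | impex_use_percent, and the result compared against whatever threshold is considered safe before a large import or export.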

Before carrying out any operation that involves unmounting the impex volume (fsck, resizing, etc.), bear in mind that the controller runs a cron job at 42 minutes past every hour that remounts it. This cron job is defined in /etc/cron.d/3tctlmon:

PATH="/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/root/bin"
MAILTO=""
14 * * * * root /usr/local/applogic/bin/3tctlmon sys check --all < /dev/null >/dev/null 2>&1
28 * * * * root /usr/local/applogic/bin/3tctlmon vol check --ro < /dev/null > /dev/null 2>&1
42 * * * * root /usr/local/applogic/bin/3tctlmon vol check --impex < /dev/null > /dev/null 2>&1
52 23 * * * root /usr/local/applogic/bin/3tctlmon vol check --diskspace < /dev/null > /dev/null 2>&1
So, as a first step, the vol check --impex line should be commented out and cron restarted before unmounting the impex volume or performing any operation on it.
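Commenting the entry out and restoring it later can be scripted. The sketch below is an illustration under stated assumptions, not an AppLogic tool: set_impex_check is a hypothetical name, and the function prints the edited file to standard output instead of modifying it, so the result can be reviewed (or tried on a copy) before replacing /etc/cron.d/3tctlmon.

```sh
# Comment out or restore the `vol check --impex` entry of a cron file.
# set_impex_check is a hypothetical helper; it writes the edited file
# to standard output and leaves the input file untouched.
set_impex_check() {
    # $1: "off" to comment the entry out, "on" to restore it
    # $2: cron file, for example a copy of /etc/cron.d/3tctlmon
    case "$1" in
        off)
            # Prefix the entry with "# " unless it is already a comment.
            sed -e '/^[[:space:]]*#/!s/^\(.*vol check --impex\)/# \1/' "$2"
            ;;
        on)
            # Strip the leading "#" (and any blanks around it) from the entry.
            sed -e '/vol check --impex/s/^[[:space:]]*#[[:space:]]*//' "$2"
            ;;
        *)
            echo "usage: set_impex_check on|off FILE" >&2
            return 2
            ;;
    esac
}
```

For example, set_impex_check off /etc/cron.d/3tctlmon > /tmp/3tctlmon.new produces the commented version; once it has been checked and copied into place, crond is restarted as described in this document.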
As a complementary note, 3tctlmon mounts the impex volume by invoking the /etc/init.d/3tsrh-init script, which in turn calls the /usr/local/applogic/bin/3trsh script. This script also creates the NFS mounts shown in the controller mtab:
# mount it (export and mount via NFS - loopback graft, to get a r/o tree)
if [[ "$R" == 0 ]] ; then
    if ! exportfs | grep -q "$JAIL_IMG" ; then
        exportfs -i -o ro,secure,no_subtree_check,all_squash \
            "127.0.0.1:${JAIL_IMG}_lib"
        exportfs -i -o ro,secure,no_subtree_check,all_squash \
            "127.0.0.1:${JAIL_IMG}"
    fi
    mount -t nfs -o ro 127.0.0.1:$JAIL_IMG $JAIL_ROOT && \
    mount -t nfs -o ro 127.0.0.1:${JAIL_IMG}_lib/etc $JAIL_ROOT/etc && \
    mount -t nfs -o ro 127.0.0.1:${JAIL_IMG}_lib$JAIL_L $JAIL_ROOT$JAIL_L
    R=$?
fi

Checking the impex Volume

If the impex volume fails to mount, or data corruption is suspected, it can be unmounted and checked directly with fsck.

To do that, follow these steps:

  1. Comment out the line in the /etc/cron.d/3tctlmon cron file that checks the impex volume:
    14 * * * * root /usr/local/applogic/bin/3tctlmon sys check --all < /dev/null >/dev/null 2>&1
    28 * * * * root /usr/local/applogic/bin/3tctlmon vol check --ro < /dev/null > /dev/null 2>&1
    # 42 * * * * root /usr/local/applogic/bin/3tctlmon vol check --impex < /dev/null > /dev/null 2>&1
    52 23 * * * root /usr/local/applogic/bin/3tctlmon vol check --diskspace < /dev/null > /dev/null 2>&1
    
  2. Restart cron:
    service crond restart
    stopping crond: [ OK ]
    starting crond: [ OK ]
    
  3. Unmount the impex volume:
    umount /vol/_impex
    
  4. Run fsck on the impex volume:
    fsck /dev/hdc
    fsck 1.35 (28-Feb-2004)
    e2fsck 1.35 (28-Feb-2004)
    /dev/hdc: clean, 161/4194304 files, 13167274/16777216 blocks
    
  5. Remount the impex volume:
    mount /dev/hdc /vol/_impex
    
  6. Remove the comment from the 3tctlmon vol check --impex line in /etc/cron.d/3tctlmon and restart the crond service as in step 2.
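The sample run in step 4 reports a clean filesystem, but when fsck does find something, its exit status says what happened. The status is a bit mask whose values are listed in the fsck manual page. The sketch below decodes it; describe_fsck_status is a hypothetical helper name introduced only for illustration.

```sh
# Translate an fsck exit status into readable text. The status is a
# bit mask, so several conditions can be reported at once.
# describe_fsck_status is a hypothetical helper name.
describe_fsck_status() {
    # $1: exit status returned by fsck
    if [ "$1" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for entry in \
        "1:filesystem errors corrected" \
        "2:system should be rebooted" \
        "4:filesystem errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:checking canceled by user request" \
        "128:shared-library error"
    do
        bit=${entry%%:*}     # numeric flag before the colon
        text=${entry#*:}     # description after the colon
        if [ $(( $1 & bit )) -ne 0 ]; then
            echo "$text"
        fi
    done
}
```

It would be called right after the check, for instance fsck /dev/hdc followed by describe_fsck_status $?. A status that includes 4 means the volume still has errors and should not simply be remounted.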

Temporarily Increasing the Size of the impex Volume

If the impex volume only needs to be larger temporarily (for instance, to carry out a large import or export operation), a workaround documented in the forums (http://forum.3tera.com/showthread.php?p=2101) can be used. It is included here for completeness:

  1. On the controller, create an empty application and assign to it a new volume that is large enough to hold the entity you want to export or import:
    3t app create impex
    Creating descriptors for application 'impex'...
    Warning: Application 'impex' does not have an owner.
    3t vol create impex:data fs=ext3 size=128G
    Creating volume impex:data
    Warning: Application 'Sys_Filer_impex-user-data' does not have an owner.
    Preparing to create filesystem... Done
    Creating filesystem... Done
    Cleaning up... Done
    Volume 'impex:data' created with filesystem 'ext3'.
    
  2. Mount the new, temporary impex volume in the place of the regular impex volume. To do that it will be necessary to mount the volume just created as a block device in the controller and use it instead of the old impex volume.
    3tctl mount sv impex.user.data (use your application name and volume name instead of impex and data respectively)
    mount_id = /dev/md1
    
  3. The md device reported in step 2 is the one used to mount the temporary impex volume. First, comment out the cron line, restart crond and unmount the current impex volume, as in steps 1 to 3 of the "Checking the impex Volume" section of this document.

    After this, the temporary impex volume may be mounted using the md device reported in step 2:

    mount /dev/md1 /vol/_impex
    df -kh
    Filesystem Size Used Avail Use% Mounted on
    /dev/hda1 1.2G 873M 272M 77% /
    /dev/hdb 1008M 33M 925M 4% /var/applogic
    none 385M 48K 385M 1% /dev/shm
    127.0.0.1:/var/local/lib/3trsh 1.2G 873M 272M 77% /vol
    127.0.0.1:/var/local/lib/3trsh_lib/etc 1.2G 873M 272M 77% /vol/etc
    127.0.0.1:/var/local/lib/3trsh_lib/lib 1.2G 873M 272M 77% /vol/lib
    /dev/md1 128G 173M 19G 1% /vol/_impex
    
  4. The "new" impex volume can now be used for import/export operations. Once done, unmount it, remount the original impex volume, re-enable the cron job and restart crond:
    umount /vol/_impex
    mount /dev/hdc /vol/_impex
    
  5. Finally, unmount the temporary volume as a block device on the controller, then destroy the volume and the application:
    3tctl umount sv impex.user.data
    3t vol destroy impex:data
    Are you sure that you want to destroy volume 'impex:data' (y/n)? y
    3t app destroy impex
    Are you sure that you want to destroy application 'impex'? (y/n) y
    Destroying descriptors for application 'impex'...
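Because this procedure swaps the device behind /vol/_impex twice, it is worth confirming which device is actually mounted there before importing data and again before destroying the temporary volume. The following sketch reads a mounts table in /proc/mounts format; mounted_device is a hypothetical helper name, and the table path is a parameter only so that the function can be exercised against a test file.

```sh
# Print the device currently mounted on a mount point, reading a
# mounts table in /proc/mounts format.
# mounted_device is a hypothetical helper name.
mounted_device() {
    # $1: mount point, $2: mounts table (defaults to /proc/mounts)
    awk -v mp="$1" '
        $2 == mp { dev = $1 }          # keep the last (most recent) match
        END {
            if (dev == "") exit 1      # nothing mounted on that directory
            print dev
        }
    ' "${2:-/proc/mounts}"
}
```

With the example devices used above, mounted_device /vol/_impex would be expected to print /dev/md1 while the temporary volume is in place and /dev/hdc after the original volume has been remounted.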
    

Permanently Replacing the impex Volume

At times a temporary replacement is not enough, and a bigger permanent impex volume is needed that replaces the existing one while preserving the data stored in the smaller volume.

This can be done by creating a bigger data stream, copying the files over to it, and putting it in place of the controller impex data stream on the node where the controller is running. That stream is then marked as synchronized, the one on the other server is marked as unsynchronized, and the controller is restarted.

These are the steps that should be followed:

  1. On the node running the controller, determine which data streams correspond to the impex volume and take a copy of them on both nodes, in case it is necessary to go back to the original situation:
    3tsrv sd get
    cluster
    {
    signature = "S20120402215634478274868435381"
    } 
    volume boot
    {
    mirrors
    {
    mirror v-ctl-boot: server = srv1, synced = 1
    mirror v-e69315e2-fff3-473c-97dc-af309466134a: server = srv2, synced = 1
    }
    }
    volume meta
    {
    mirrors
    {
    mirror v-ddbe6bb2-9cfc-4c4c-b8e0-95b1f9f99208: server = srv1, synced = 1
    mirror v-1b0ecc0e-782c-4ddb-b8cb-b7981560fd3e: server = srv2, synced = 1
    }
    } 
    volume impex
    {
    mirrors
    {
    mirror v-ctl-impex: server = srv1, synced = 1
    mirror v-f821f907-4ae0-4044-87ea-6dd39cefe9fd: server = srv2, synced = 1
    }
    } 
    server srv1: ha_role = primary
    server srv2: ha_role = secondary
    server lodg6srv8.lod.ca.labs.com: ha_role = reference
    cd /var/applogic
    mkdir impex_back
    cp /var/applogic/volumes/vols/v-ctl-impex /var/applogic/impex_back/v-ctl-impex_back
    

This step should also be repeated on srv2 for its data stream, i.e. v-f821f907-4ae0-4044-87ea-6dd39cefe9fd in this example.
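Picking the right mirror names out of the 3tsrv sd get listing by eye is error-prone, since the boot, meta and impex blocks look alike. The sketch below extracts the mirrors of one volume; list_mirrors is a hypothetical helper, and it relies only on the output layout shown in this document, which may differ in other AppLogic versions.

```sh
# Print "mirror-name server synced" for every mirror of one volume,
# reading `3tsrv sd get` style output from standard input.
# list_mirrors is a hypothetical helper based on the layout shown above.
list_mirrors() {
    # $1: volume name (boot, meta or impex)
    awk -v vol="$1" '
        $1 == "volume" { current = $2 }     # remember which block we are in
        $1 == "mirror" && current == vol {
            name = $2
            sub(/:$/, "", name)             # drop the trailing colon
            server = $5
            sub(/,$/, "", server)           # drop the trailing comma
            print name, server, $NF
        }
    '
}
```

Run as 3tsrv sd get | list_mirrors impex, it would print one line per impex mirror, which gives the stream file name to back up on each server.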

  1. Next, determine the block size of the impex volume. It is unlikely to differ between versions, but it is worth checking. To do so, find the md device associated with the impex volume and run sfdisk against it:
    3tsrvctl list mounts | grep impex
    mnt.srv1.SYSTEM:_sys.impex 2 /dev/md3
    sfdisk -uB -l /dev/md3
    Disk /dev/md3: 16777216 cylinders, 2 heads, 4 sectors/track
    sfdisk: ERROR: sector 0 does not have an msdos signature
    /dev/md3: unrecognized partition table type
    

    The block size itself can be read with blockdev:

    blockdev --getbsz /dev/md3
    4096
    
  2. Now create two data streams of the new impex size, attach them to loop devices, build a new md device from them and format it:
    dd if=/dev/zero of=/var/applogic/impex_back/new-impex bs=4096 count=20000000
    20000000+0 records in
    20000000+0 records out
    81920000000 bytes (82 GB) copied, 690.235 seconds, 119 MB/s
    losetup -f
    /dev/loop0
    losetup /dev/loop0 /var/applogic/impex_back/new-impex
    dd if=/dev/zero of=/var/applogic/impex_back/new-impex2 bs=4096 count=20000000
    20000000+0 records in
    20000000+0 records out
    81920000000 bytes (82 GB) copied, 832.347 seconds, 98.4 MB/s
    losetup -f
    /dev/loop1
    losetup /dev/loop1 /var/applogic/impex_back/new-impex2
    mdadm --create /dev/md150 --level=1 --raid-devices=2 /dev/loop1 /dev/loop0
    mkfs -t ext3 /dev/md150
    mke2fs 1.41.1 (01-Sep-2008)
    Filesystem label=
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    5005312 inodes, 19999984 blocks
    999999 blocks (5.00%) reserved for the super user
    First data block=0
    Maximum filesystem blocks=0
    611 block groups
    32768 blocks per group, 32768 fragments per group
    8192 inodes per group
    Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424
    Writing inode tables: done
    Creating journal (32768 blocks): done
    Writing superblocks and filesystem accounting information: done 
    This filesystem will be automatically checked every 29 mounts or
    180 days, whichever comes first. Use tune2fs -c or -i to override.
    
  3. Once the RAID device is created, mount it on a directory, do the same with the copy of the old stream, and transfer all data from the old data stream to the new one:
    mkdir /var/applogic/impex_back/source
    losetup -f
    /dev/loop2
    losetup /dev/loop2 /var/applogic/impex_back/v-ctl-impex_back
    mdadm --assemble /dev/md160 --force --run /dev/loop2
    mdadm: /dev/md160 has been started with 1 drive.
    mount /dev/md160 /var/applogic/impex_back/source
    mkdir /var/applogic/impex_back/dest
    mount /dev/md150 /var/applogic/impex_back/dest
    cd /var/applogic/impex_back/source
    tar cf - . | (cd /var/applogic/impex_back/dest; tar xvf -)
    

    ./
    ./tmp/
    ./tmp/apk-install
    ./gilmi06_oni_prf_test_w2k8.tar
    tar: ./gilmi06_oni_prf_test_w2k8.tar: file changed as we read it
    ./lost+found/
    ./Sys_Filer_Windows08-3.5.14.tar
    tar: ./Sys_Filer_Windows08-3.5.14.tar: file changed as we read it

  1. Next, unmount the md devices, stop them and detach the loop devices:
    umount /var/applogic/impex_back/source
    umount /var/applogic/impex_back/dest
    mdadm --stop /dev/md160
    mdadm --stop /dev/md150
    losetup -d /dev/loop1
    losetup -d /dev/loop0
    losetup -d /dev/loop2
    rm -rf /var/applogic/impex_back/new-impex2
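The count=20000000 value passed to dd when the new streams were created is simply the wanted stream size divided by the block size reported by blockdev. A small sketch of that arithmetic follows; blocks_for_size is a hypothetical helper name, and sizes are given in bytes.

```sh
# Number of blocks dd has to write to obtain a stream of the wanted size.
# blocks_for_size is a hypothetical helper name.
blocks_for_size() {
    # $1: wanted size in bytes
    # $2: block size in bytes, as reported by blockdev --getbsz
    # Rounds up, so the stream is never smaller than requested.
    echo $(( ($1 + $2 - 1) / $2 ))
}

blocks_for_size 81920000000 4096   # prints 20000000, the count used above
blocks_for_size 68719476736 4096   # prints 16777216, a 64 Gbyte volume
```

The second figure matches the 16777216 blocks that fsck reports for the default 64 Gbyte impex volume, which is a quick way to confirm that the 4096-byte block size is the right one to use.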
    

The old data stream must now be replaced by the new one on the server running the controller and marked as good, the stream on the other server marked as bad, and the controller rebooted. When the controller comes back up it will use the new impex volume. First, comment out the 3tctlmon cron line, restart crond and unmount /vol/_impex on the controller, as indicated in steps 1 to 3 of the "Checking the impex Volume" section of this document.

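  1. Next, go back to the server that hosts the controller. Save the stream descriptor to a file, edit it so that the impex stream is marked as good (synced = 1) on this server and as bad (synced = 0) on the other server, as in the listing below, then reload it: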
    3tsrv sd get > /tmp/meta_file
    cluster
    {
    signature = "S20120402215634478274868435381"
    } 
    volume boot
    {
    mirrors
    {
    mirror v-ctl-boot: server = srv1, synced = 1
    mirror v-e69315e2-fff3-473c-97dc-af309466134a: server = srv2, synced = 1
    }
    }
    volume meta
    {
    mirrors
    {
    mirror v-ddbe6bb2-9cfc-4c4c-b8e0-95b1f9f99208: server = srv1, synced = 1
    mirror v-1b0ecc0e-782c-4ddb-b8cb-b7981560fd3e: server = srv2, synced = 1
    }
    } 
    volume impex
    {
    mirrors
    {
    mirror v-ctl-impex: server = srv1, synced = 1
    mirror v-f821f907-4ae0-4044-87ea-6dd39cefe9fd: server = srv2, synced = 0
    }
    } 
    server srv1: ha_role = primary
    server srv2: ha_role = secondary
    server lodg6srv8.lod.ca.labs.com: ha_role = reference
    
    3tsrv sd set file=/tmp/meta_file
    
  2. The next step is to bring down the controller and destroy the mounts:
    xm destroy controller (stopping the necessary heartbeat services)
    3tsrvctl list mounts | grep impex
    mnt.srv1.SYSTEM:_sys.impex 2 /dev/md3
    3tsrvctl destroy mount mnt.srv1.SYSTEM:_sys.impex
    3tsrv bd list | grep md3 (get the loop devices associated here)
    mdadm --stop /dev/md3 (if needed)
    losetup -d /dev/loop2
    
  3. Now copy the new impex data stream in place of the old one:
    cp -pr /var/applogic/impex_back/new-impex /var/applogic/volumes/vols/v-ctl-impex
    
  4. Reboot the controller (3tsrv set role=primary --recover). It should come up with the new impex volume size, which it will then synchronize to the other server. Make sure the impex cron job is enabled again afterwards.
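The descriptor edit in the step that marks the other server's impex stream as bad can be done with an editor, but a scripted edit avoids touching the boot or meta blocks by mistake. The following is a minimal sketch under the assumption that the file saved with 3tsrv sd get has the layout shown in this document; mark_unsynced is a hypothetical helper, and it prints the edited descriptor rather than changing the file.

```sh
# Print a copy of a saved stream descriptor in which, inside one volume
# block only, the mirror held by the given server is marked synced = 0.
# mark_unsynced is a hypothetical helper; the input file is not modified.
mark_unsynced() {
    # $1: volume name, $2: server name, $3: descriptor file
    awk -v vol="$1" -v srv="$2" '
        $1 == "volume" { current = $2 }     # track the current volume block
        $1 == "mirror" && current == vol && $5 == (srv ",") {
            sub(/synced = 1/, "synced = 0")
        }
        { print }
    ' "$3"
}
```

A possible use is mark_unsynced impex srv2 /tmp/meta_file > /tmp/meta_file.new, followed by a diff of the two files to confirm that only the intended line changed before the new file is loaded with 3tsrv sd set.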