August 2019 ~ Dilli's Oracle/MySQL Linux Blog

Voting Disk

The voting disk is a file that resides on shared storage and records node membership in the cluster. Each voting disk must be accessible by all nodes in the cluster.

The voting disk is used as a central reference for all nodes and keeps the heartbeat information between nodes. If a node is unable to reach the voting disk, the cluster recognizes the communication failure and evicts that node from the cluster. The voting disk is also used to reassign cluster ownership between the nodes in case of failure. A cluster can have a minimum of 1 and a maximum of 15 voting disks.

A node must be able to access more than half of the voting disks to stay in the cluster. The number of voting disk failures that can be tolerated is therefore the same for (2n-1) and 2n voting disks, where n can be 1, 2 or 3: it is n-1 in both cases. Hence, to save a redundant voting disk, an odd number of voting disks, i.e. (2n-1), is desirable.
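The arithmetic behind this can be checked with a small shell sketch. It assumes the usual majority rule (a node must reach more than half of the voting disks); the function name tolerated_failures is made up for illustration.

```shell
#!/bin/sh
# Number of voting disk failures a cluster can survive, assuming a node
# must still reach a strict majority of the voting disks.
# $1 = total number of voting disks
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

# Print the tolerance for 1 to 6 voting disks.
for total in 1 2 3 4 5 6; do
    printf '%s voting disk(s): tolerates %s failure(s)\n' \
        "$total" "$(tolerated_failures "$total")"
done
```

Note that 3 and 4 disks both tolerate 1 failure, and 5 and 6 both tolerate 2, so the even counts buy nothing extra.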

# View voting disk location

crsctl query css votedisk

# Backup voting disk

The voting disk data is automatically backed up in OCR as part of any configuration change so you do not have to perform manual backups of the voting disk.

# Adding voting disk

When the voting disks are stored in ASM (Oracle 11g Release 2 onwards), you cannot add a voting disk directly. Instead, create a new diskgroup with the desired redundancy and relocate the voting disk to it. The redundancy level of the new diskgroup determines how many voting disks you get.

# Deleting voting disk

Adding and deleting individual voting disks is not allowed on ASM. To reduce the number of voting disks, create a new diskgroup with a lower redundancy level and relocate the voting disk to it.

Note:

There is 1 voting disk if the DG has external redundancy

There are 3 voting disks if the DG has normal redundancy

There are 5 voting disks if the DG has high redundancy
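
The same mapping, written as a small lookup you could drop into a check script. This is only a sketch of the rule in the note above; the function name voting_files_for is hypothetical.

```shell
#!/bin/sh
# Map a diskgroup redundancy level to the number of voting disks
# ASM keeps in it, following the note above.
voting_files_for() {
    case "$1" in
        external) echo 1 ;;
        normal)   echo 3 ;;
        high)     echo 5 ;;
        *) echo "unknown redundancy: $1" >&2; return 1 ;;
    esac
}

for level in external normal high; do
    echo "$level redundancy: $(voting_files_for "$level") voting disk(s)"
done
```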

# Relocate voting disk, or recover voting disk

crsctl replace votedisk +<DiskGroup>

Modifying redundancy level of Diskgroup containing voting disk.

Let's say we have DG VDISK and it is configured with external redundancy.

If we want to increase the level of redundancy to Normal or High, we need to go through the following steps.

Create a diskgroup with the desired redundancy.
Add another disk to the diskgroup and mark it as a quorum disk. The quorum disk is one small disk that keeps one mirror of the voting file (500 MB should be on the safe side here, since the voting file is only about 280 MB in size). With normal redundancy you need one quorum disk: the other two disks each contain one voting file as well as the other stripes of the database area, but the quorum disk gets only its one voting file. With high redundancy you need two quorum disks. QUORUM disks can contain the voting file for Cluster Synchronization Services (CSS). REGULAR disks, or disks in non-quorum failure groups, can contain any files.
Now relocate the voting disk from the existing diskgroup to the newly created diskgroup.

Checking current voting disk location

Checking available ASM disks

Create ASM diskgroup with desired redundancy.
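
A minimal sketch of this step for normal redundancy. The function only prints the SQL; on a real system you would pipe it into sqlplus connected to the ASM instance as SYSASM. The diskgroup name VDISK_NEW, the failure group names and the disk paths are all hypothetical.

```shell
#!/bin/sh
# Print a CREATE DISKGROUP statement for a normal redundancy diskgroup
# with two regular failure groups.
# $1 = diskgroup name, $2 and $3 = disk paths (all hypothetical here)
create_dg_sql() {
    cat <<EOF
CREATE DISKGROUP $1 NORMAL REDUNDANCY
  FAILGROUP fg1 DISK '$2'
  FAILGROUP fg2 DISK '$3';
EOF
}

# Example only; to run it for real: create_dg_sql ... | sqlplus / as sysasm
create_dg_sql VDISK_NEW /dev/oracleasm/disks/VOTE1 /dev/oracleasm/disks/VOTE2
```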

Set diskgroup compatible.asm attribute to 11.2.0.0.0

Add quorum disk to diskgroup. We need to add 2 quorum disks for a DG with high redundancy.
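
The attribute and quorum steps could look roughly like this. Again the function only prints SQL meant for sqlplus as SYSASM. The attribute is set first because quorum failure groups need compatible.asm at 11.2 or higher. The failure group name fg_quorum, the diskgroup name and the disk path are hypothetical.

```shell
#!/bin/sh
# Print the ALTER DISKGROUP statements that raise compatible.asm and
# then add one quorum failure group.
# $1 = diskgroup name, $2 = path of the quorum disk (both hypothetical)
quorum_sql() {
    cat <<EOF
ALTER DISKGROUP $1 SET ATTRIBUTE 'compatible.asm' = '11.2.0.0.0';
ALTER DISKGROUP $1 ADD QUORUM FAILGROUP fg_quorum DISK '$2';
EOF
}

# Example only; to run it for real: quorum_sql ... | sqlplus / as sysasm
quorum_sql VDISK_NEW /dev/oracleasm/disks/VOTE3
```

For high redundancy you would add a second quorum failure group under a different name.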

Validate ASM diskgroup.

Relocate votedisk to newly created Diskgroup

Query and validate changes.
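
Put together, the check, relocate and validate steps could be scripted as below. This is a sketch, not a tested procedure: it defaults to a dry run that only prints the commands, the run helper and the diskgroup name VDISK_NEW are assumptions, and on a real cluster the replace command has to be run as root.

```shell
#!/bin/sh
# Dry run by default: print commands instead of executing them.
# Set DRY_RUN=0 to really run them (as root, on a cluster node).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = target diskgroup name without the leading '+'
relocate_votedisk() {
    run crsctl query css votedisk         # current location
    run crsctl replace votedisk "+$1"     # move to the new diskgroup
    run crsctl query css votedisk         # validate the change
}

relocate_votedisk VDISK_NEW
```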

OCR (Oracle Cluster Registry)

The OCR (Oracle Cluster Registry) resides on shared storage and is accessed by all nodes in the cluster. It maintains information about the cluster configuration and about the cluster databases.

The OCR contains information such as which database instances run on which nodes and which services run on which database. It is created during Grid installation. It stores the information needed to manage Oracle Clusterware and its components, such as RAC databases, listeners, VIPs, SCAN IPs and services. A cluster can have a minimum of 1 and a maximum of 5 copies of the OCR.

# Check OCR file details

ocrcheck

OCR Backup:

Oracle automatically takes a backup every 4 hours on the master node. You can also take a backup with the ocrconfig -export option.

Oracle 11g R2 and higher releases simplified OCR and voting file management by storing the OCR and voting files in ASM (Automatic Storage Management). ASM automatically maintains the number of OCR/voting disks based on the underlying diskgroup redundancy, further reducing manual DBA file management tasks. In addition, the Clusterware stack takes periodic automatic backups of these files.

To determine OCR file location

more /etc/oracle/ocr.loc

Adding new location

ocrconfig -add +<DiskGroup>

Deleting location

ocrconfig -delete +<DiskGroup>

View ocr backup location

ocrconfig -showbackup

Manually backup

ocrconfig -manualbackup

Dump backup of ocr file

ocrconfig -export ocr_backup_$(date +%Y_%m_%d).dmp
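
The manual backup and the dated export can be wrapped in one helper. This is a sketch that defaults to a dry run and only prints the commands; the run helper, the backup_ocr function name and the target directory /tmp are assumptions, and both ocrconfig calls need root on a real cluster.

```shell
#!/bin/sh
# Dry run by default: print commands instead of executing them.
# Set DRY_RUN=0 to really run them (as root).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = directory for the export file (defaults to the current directory)
backup_ocr() {
    dump_file="${1:-.}/ocr_backup_$(date +%Y_%m_%d).dmp"
    run ocrconfig -manualbackup           # physical backup
    run ocrconfig -export "$dump_file"    # logical export with today's date
}

backup_ocr /tmp
```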

Restore OCR

ocrconfig -restore <backup_file>

The following are new features from Oracle 11g R2 onward.

The OCR and voting disk can be stored on ASM or on a certified cluster file system.
The voting disk and OCR can be dynamically added or replaced.
The voting disk and OCR can be kept in the same diskgroup or in different diskgroups.
The automatic backups of the voting disk and OCR are kept together in a single file.
Automatic backups of the voting disk and OCR happen every four hours, at the end of the day and at the end of the week.
Administrative access: root or sudo privileges are required to manage them.

Step-by-step restore of the OCR and voting disk when the DG containing them fails.

If there is no voting disk, or the diskgroup containing the voting disk fails to mount because of insufficient disk members, the only way to recover the OCR and voting disk is to create a new DG and start recovery. Error messages like the ones below will appear in the alert log file.

[gpnpd(3183)]CRS-2328:GPNPD started on node rac01.

2019-08-25 12:45:39.548

[cssd(3253)]CRS-1713:CSSD daemon is started in clustered mode

2019-08-25 12:45:41.343

[ohasd(3025)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE

2019-08-25 12:45:43.641

[cssd(3253)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/rac01/cssd/ocssd.log
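
To spot this condition without reading the whole log, you can grep for the CRS-1714 message code. A minimal sketch; the function name voting_files_missing is made up, and the alert log path in the usage comment is an assumption based on the Grid home and node name seen in the messages above.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the given log file contains CRS-1714,
# i.e. CSSD could not discover any voting files.
# $1 = path to the clusterware alert log
voting_files_missing() {
    grep -q 'CRS-1714' "$1"
}

# Usage (assumed path, adjust to your Grid home and node name):
# voting_files_missing /u01/app/11.2.0/grid/log/rac01/alertrac01.log \
#     && echo "voting files not found, restore needed"
```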

Please follow these steps to restore the OCR and voting disk.

1. Create a new diskgroup with the desired redundancy. The ASM attribute compatible.asm should be 11.2.0.0.0 or higher, and there should be enough quorum failure groups for the redundancy level: 1 quorum failure group for normal redundancy and 2 quorum failure groups for high redundancy.

2. Stop CRS with the -f (force) option.

crsctl stop crs -f

3. Start CRS in exclusive mode without starting the CRS daemon (crsd). Check the CRS status with the -init option.

crsctl start crs -excl -nocrs
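
Steps 2 and 3, including the status check with -init, can be chained as below. A sketch only: it defaults to a dry run that prints the commands, the run helper and function name are assumptions, and the real commands must be run as root on the node where you do the restore.

```shell
#!/bin/sh
# Dry run by default: print commands instead of executing them.
# Set DRY_RUN=0 to really run them (as root).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

start_exclusive() {
    run crsctl stop crs -f               # force stop the stack
    run crsctl start crs -excl -nocrs    # exclusive mode, crsd not started
    run crsctl stat res -t -init         # check the lower stack resources
}

start_exclusive
```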

4. Check the OCR location in the ocr.loc file. This file names the diskgroup where the OCR resides. If the cluster were running, we could use the ocrconfig command to change the location. Since the cluster is offline, the file needs to be modified manually: replace the old diskgroup with the newly created one.

cat /etc/oracle/ocr.loc

vi /etc/oracle/ocr.loc
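
If you prefer not to edit the file by hand, the same change can be made with sed. This is a sketch that assumes the location is stored on a line starting with ocrconfig_loc=; the function name set_ocr_location and the diskgroup name in the usage comment are hypothetical. It keeps a .bak copy of the original file.

```shell
#!/bin/sh
# Point the ocrconfig_loc entry of an ocr.loc file at a new diskgroup.
# A copy of the original is kept next to it with a .bak suffix.
# $1 = path to ocr.loc, $2 = new diskgroup name without the leading '+'
set_ocr_location() {
    cp "$1" "$1.bak" &&
    sed "s|^ocrconfig_loc=.*|ocrconfig_loc=+$2|" "$1.bak" > "$1"
}

# Usage (as root; the diskgroup name is an example):
# set_ocr_location /etc/oracle/ocr.loc VDISK_NEW
# cat /etc/oracle/ocr.loc
```

All other lines in the file are left as they are.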