Building a High-Availability Cluster for Applications with Oracle CRS
[IT168 Technical Document]
Preface: An Introduction to CRS and Its Origins
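Starting with Oracle 10gR1 RAC, Oracle shipped its own cluster software, named Oracle Cluster Ready Services, or CRS for short. From Oracle 10gR2 onward, including the latest release, 11g, Oracle renamed it Clusterware, but in everyday usage CRS = Clusterware = Oracle Cluster Ready Services = Oracle Cluster Software.
CRS is usually used to build Oracle's parallel database, RAC. Besides its interface to RAC, however, CRS also provides a set of high-availability application programming interfaces (APIs) for building high-availability clusters for ordinary applications, which is what is commonly called two-node hot standby; for example, CRS can provide hot standby for MySQL.
This active/standby style of hot standby can cover many third-party applications: a virtual IP, a disk group, a file system, a MySQL database, Apache, a single-node Oracle instance, a single-node ASM instance, and so on. Each of these can be registered with CRS as a resource. CRS then starts, stops, and monitors the application, and dependencies between resources can be defined so that groups of resources start in the correct order.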
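This article uses the protection of a single-node Oracle instance as an example to show how CRS provides these functions. The main software used is Solaris 10u4, Oracle CRS 10.2.0.2, Oracle RDBMS 10.2.0.3, and VxVM 5.0; the disk array is an AMS1000.
[Figure: system topology diagram]
The main steps are as follows: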
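I. Preparation: software installation and database creation
1. Install Solaris 10 and Veritas Volume Manager (installation steps omitted)
2. Create the dba/oinstall groups and the oracle user, and edit the corresponding profile and the /etc/hosts file;
Edit the rhosts file to set up user equivalence for the oracle user;
Connect the heartbeat network cables;
In a production environment, a gigabit heartbeat network is recommended, with two NICs per machine connected to two separate network switches;
Bring up the heartbeat NIC in the operating system;
Prepare the shared disk; here it is the shared SAN disk ams_wms0_0098, 20 GB in size;
3. Install the Oracle cluster software and the database software in silent mode
The advantage of this approach is that it is fast and needs no graphical interface, which suits remote installation and database creation;
the drawback is that it is less intuitive and the response files have to be written by hand;
The CRS installation in this step needs to be run on only one node: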
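oracle@rac01 $ ./clusterware/runInstaller -silent -responsefile /tmp/shahand/crs.rsp
For the contents of crs.rsp, see section V-7.
After the CRS runInstaller finishes, run root.sh manually on both nodes,
and then run $CRS_HOME/cfgtoollogs/configToolAllCommands manually.
Check that CRS is installed and configured correctly: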
root@rac01 # crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....c01.gsd application    ONLINE    ONLINE    rac01
ora....c01.ons application    ONLINE    ONLINE    rac01
ora....c01.vip application    ONLINE    ONLINE    rac01
ora....c02.gsd application    ONLINE    ONLINE    rac02
ora....c02.ons application    ONLINE    ONLINE    rac02
ora....c02.vip application    ONLINE    ONLINE    rac02
Check that the heartbeat network is set correctly:
root@rac01 # $CRS_HOME/bin/oifcfg getif
e1000g0 10.198.88.0 global public
e1000g1 192.168.2.0 global cluster_interconnect
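The interface names reported here are reused later when the virtual IP resource is created, so it can help to read them out of the `oifcfg getif` output rather than typing them by hand. The following is a minimal sketch, not part of the original procedure: `get_if_by_role` is a hypothetical helper name, and the sample text is copied from the output above. On Solaris 10 the default `awk` may not accept `-v`, in which case `nawk` or `/usr/xpg4/bin/awk` would probably be needed.

```shell
#!/bin/sh
# get_if_by_role (hypothetical helper): reads `oifcfg getif` output on stdin
# and prints the interface name whose last column equals the requested role.
get_if_by_role() {
    # $1 = role, e.g. public or cluster_interconnect
    awk -v role="$1" '$NF == role { print $1 }'
}

# Sample input copied from the output shown above. On a real node, pipe in
# "$CRS_HOME/bin/oifcfg getif" instead of this variable.
sample_getif='e1000g0 10.198.88.0 global public
e1000g1 192.168.2.0 global cluster_interconnect'

printf '%s\n' "$sample_getif" | get_if_by_role public
printf '%s\n' "$sample_getif" | get_if_by_role cluster_interconnect
```

With the sample input, the first call prints the public interface and the second prints the interconnect interface.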
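Install the Oracle database software in silent mode; this step must be done on both nodes:
./runInstaller -silent -responsefile /tmp/shahand/db.rsp
For the contents of db.rsp, see section V-8.
4. Create the disk group, logical volume, and file system needed for the Oracle database files, then mount the file system and set its permissions;
this needs to be done on only one node: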
root@rac01 # vxdisksetup -i ams_wms0_0098
root@rac01 # vxdg init oradata12 ams_wms0_0098
root@rac01 # vxassist -g oradata12 make oradata 18G
root@rac01 # vxedit -g oradata12 set user=oracle group=dba mode=644 oradata
root@rac01 # timex mkfs -F vxfs /dev/vx/rdsk/oradata12/oradata
version 7 layout
37748736 sectors, 18874368 blocks of size 1024, log size 65536 blocks
largefiles supported
real 10.30
user 0.07
sys 0.04
root@rac01 # mount -F vxfs -o largefiles /dev/vx/dsk/oradata12/oradata /oradata
root@rac01 # chown oracle:dba /oradata
root@rac01 # df -h /oradata/
Filesystem             size   used  avail capacity  Mounted on
/dev/vx/dsk/oradata12/oradata
                        18G    70M    17G     1%    /oradata
5. Create the Oracle database in silent mode; this needs to be done on only one node:
oracle@rac01 $ dbca -silent -createDatabase -sid orcl -sysPassword sys -systemPassword sys \
-datafileDestination /oradata -gdbName orcl -templateName General_Purpose.dbc
Copying database files
1% complete
3% complete
11% complete
18% complete
26% complete
37% complete
Creating and starting Oracle instance
40% complete
45% complete
50% complete
55% complete
56% complete
60% complete
62% complete
Completing Database Creation
66% complete
70% complete
73% complete
85% complete
96% complete
100% complete
Check the status of database orcl manually, for example by logging in to the database and running select status from v$instance.
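The manual status check can also be scripted, which is useful later when a check action is needed for the database resource. The following is only a sketch under stated assumptions: `instance_status` and the `SQLPLUS` variable are hypothetical names introduced here, the function is assumed to run as the oracle user with ORACLE_HOME and ORACLE_SID already set, and it is not the db.sh script referred to elsewhere in this article.

```shell
#!/bin/sh
# SQLPLUS is a hypothetical override hook; it defaults to the sqlplus
# binary found on PATH.
SQLPLUS=${SQLPLUS:-sqlplus}

# instance_status (hypothetical helper): prints the STATUS column of
# v$instance, for example OPEN or MOUNTED, or the first word of an error.
instance_status() {
    "$SQLPLUS" -s "/ as sysdba" <<'EOF' | awk 'NF { print $1; exit }'
set heading off feedback off pagesize 0
select status from v$instance;
exit
EOF
}

# Typical use on a database node (not executed here):
#   [ "`instance_status`" = "OPEN" ] && echo "orcl is open"
```

The here-document delimiter is quoted so that the shell does not try to expand `$instance` inside the SQL text.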
II. Manually registering resources with the Oracle cluster software
1. Unregister the ons, gsd, and vip resources that ship with CRS
root@rac01 # crs_stop -all
Attempting to stop `ora.rac01.gsd` on member `rac01`
Attempting to stop `ora.rac01.ons` on member `rac01`
Attempting to stop `ora.rac02.gsd` on member `rac02`
Attempting to stop `ora.rac02.ons` on member `rac02`
Stop of `ora.rac02.gsd` on member `rac02` succeeded.
Stop of `ora.rac02.ons` on member `rac02` succeeded.
Stop of `ora.rac01.gsd` on member `rac01` succeeded.
Stop of `ora.rac01.ons` on member `rac01` succeeded.
Attempting to stop `ora.rac01.vip` on member `rac01`
Attempting to stop `ora.rac02.vip` on member `rac02`
Stop of `ora.rac02.vip` on member `rac02` succeeded.
Stop of `ora.rac01.vip` on member `rac01` succeeded.
root@rac01 # crs_unregister ora.rac01.gsd
root@rac01 # crs_unregister ora.rac01.ons
root@rac01 # crs_unregister ora.rac01.vip
root@rac01 # crs_unregister ora.rac02.vip
root@rac01 # crs_unregister ora.rac02.ons
root@rac01 # crs_unregister ora.rac02.gsd
root@rac01 # crs_stat -t
CRS-0202: No resources are registered.
2. Create the virtual IP resource:
root@rac01 # crs_profile -create havip -t application -a /oracle/crs/bin/usrvip \
-o oi=e1000g0,ov=10.198.94.139,on=255.255.248.0
root@rac01 # crs_register havip
root@rac01 # crs_setperm havip -o root
root@rac01 # crs_setperm havip -u user:oracle:r-x
root@rac01 # crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------
havip application 0/1 0/0 OFFLINE OFFLINE
root@rac01 # crs_start havip
root@rac01 # crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------
havip application 0/1 0/0 ONLINE ONLINE rac01
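3. Prepare the scripts dg.sh/fs.sh/db.sh/lsnr.sh that start, stop, and check the other resources
For the contents of these four scripts, see sections V-3/4/5/6.
A brief explanation of the options and parameters of the crs_profile command:
(1) The -r option defines the resources that this resource requires. In the examples below, oradata_mount
requires disk_group to be started first, and oradata_mount must be stopped before disk_group can be stopped;
orcl_db requires oradata_mount and listener directly and, through them, disk_group and havip;
(2) The -o parameter takes: ci, the interval at which CRS checks the state of the resource (CHECK_INTERVAL), in seconds;
ra: the number of times CRS tries to restart the resource on the same node (RESTART_ATTEMPTS); once it is reached, the resource is relocated to another node;
fi: the interval, in seconds, over which CRS counts failures against the failure threshold (FAILURE_INTERVAL);
ft: the number of failures within that interval after which CRS marks the resource unavailable and stops monitoring it (FAILURE_THRESHOLD);
These parameters can be left at their defaults, which are 60 seconds/1/0 seconds/0 respectively.
(3) The -a parameter is the ACTION_SCRIPT: the script that starts, stops, and checks the resource. The script always receives one of three arguments:
start/stop/check;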
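The start/stop/check contract of an action script can be illustrated with a small skeleton. This is a sketch, not one of the four scripts used in this article: the `do_start`, `do_stop`, and `do_check` functions and the `STATE_FILE` marker are placeholders invented for the example, standing in for the real commands that would manage a disk group, file system, or database. CRS treats an exit status of 0 as success (or, for check, as "resource is online") and a non-zero status as failure.

```shell
#!/bin/sh
# Placeholder resource: a marker file stands in for a real service.
STATE_FILE=${STATE_FILE:-/tmp/demo_resource.state}

do_start() { touch "$STATE_FILE"; }   # replace with the real start command
do_stop()  { rm -f "$STATE_FILE"; }   # replace with the real stop command
do_check() { [ -f "$STATE_FILE" ]; }  # status 0 only if the resource is up

# Dispatch on the single argument that CRS passes to the action script.
action() {
    case "$1" in
        start) do_start ;;
        stop)  do_stop ;;
        check) do_check ;;
        *)     echo "usage: $0 {start|stop|check}" >&2; return 2 ;;
    esac
}

# Demonstration only. A real action script would instead end with:
#   action "$1"; exit $?
action start && action check && echo "resource is up"
action stop
action check || echo "resource is down"
```

Returning the status of the underlying command, rather than always exiting 0, is what lets CRS notice a failed start or a resource that has gone offline.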
Managing the database listener:
Edit the $ORACLE_HOME/network/admin/listener.ora file,
changing (HOST = rac01) to (HOST = 10.198.94.139) (the virtual IP address)
crs_profile -create listener -t application -a /oracle/crs/crs/public/lsnr.sh -r havip -o \
ci=180,ra=6,ft=2,fi=12
crs_register listener
crs_setperm listener -o root
crs_setperm listener -u user:oracle:r-x
crs_start listener
Managing the disk group and logical volume:
crs_profile -create disk_group -t application -a /oracle/crs/crs/public/dg.sh -r havip -o \
ci=180,ra=6,ft=2,fi=12
crs_register disk_group
crs_setperm disk_group -o root
crs_setperm disk_group -u user:oracle:r-x
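Note: starting the disk group does not in itself depend on the virtual IP. The dependency is set here
to prevent the virtual IP from starting on one node while the disk group starts on the other, which would leave the resources inconsistent.
Managing the file system: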
crs_profile -create oradata_mount -t application -a /oracle/crs/crs/public/fs.sh -r disk_group -o \
ci=180,ra=6,ft=2,fi=12
crs_register oradata_mount
crs_setperm oradata_mount -o root
crs_setperm oradata_mount -u user:oracle:r-x
Managing the database instance:
crs_profile -create orcl_db -t application -a /oracle/crs/crs/public/db.sh -r \
"oradata_mount listener" -o ci=180,ra=6,ft=2,fi=12
crs_register orcl_db
crs_setperm orcl_db -o root
crs_setperm orcl_db -u user:oracle:r-x
crs_start orcl_db
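The same four commands (crs_profile, crs_register, and two crs_setperm calls) are repeated above for every resource, so they can be wrapped in one function. The sketch below is an illustration only: `register_app` and the `RUN` variable are hypothetical names, and the option values are simply the ones used in the commands above. By default it runs in dry-run mode and only prints the commands; setting `RUN` to the empty string would execute them on a node where the CRS binaries are on PATH.

```shell
#!/bin/sh
# RUN defaults to "echo" (dry run). Export RUN="" to execute for real.
RUN=${RUN-echo}

# register_app (hypothetical wrapper): NAME ACTION_SCRIPT "REQUIRED RESOURCES"
register_app() {
    res_name=$1       # resource name, e.g. listener
    res_script=$2     # path of the action script
    res_requires=$3   # required resources, separated by spaces
    $RUN crs_profile -create "$res_name" -t application -a "$res_script" \
        -r "$res_requires" -o ci=180,ra=6,ft=2,fi=12 &&
    $RUN crs_register "$res_name" &&
    $RUN crs_setperm "$res_name" -o root &&
    $RUN crs_setperm "$res_name" -u user:oracle:r-x
}

# Dry run for the listener resource defined above.
register_app listener /oracle/crs/crs/public/lsnr.sh havip
```

Note that the dry-run output drops the quotes around a multi-resource list such as "oradata_mount listener"; the quoting is preserved when the commands are actually executed.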
4. Make sure the scripts are executable, and copy the contents of the public and profile directories to the second node.
# chmod +x /oracle/crs/crs/public/*
# rcp -r -p /oracle/crs/crs/public/* rac02:/oracle/crs/crs/public/
5. Start all resources
As shown below, CRS starts and stops resources in the order given by the dependencies defined earlier:
root@rac01 # crs_stop -all
Attempting to stop `orcl_db` on member `rac01`
Stop of `orcl_db` on member `rac01` succeeded.
Attempting to stop `oradata_mount` on member `rac01`
Stop of `oradata_mount` on member `rac01` succeeded.
Attempting to stop `disk_group` on member `rac01`
Stop of `disk_group` on member `rac01` succeeded.
Attempting to stop `listener` on member `rac01`
Stop of `listener` on member `rac01` succeeded.
Attempting to stop `havip` on member `rac01`
Stop of `havip` on member `rac01` succeeded.
root@rac01 # crs_start -all
Attempting to start `havip` on member `rac01`
Start of `havip` on member `rac01` succeeded.
Attempting to start `listener` on member `rac01`
Start of `listener` on member `rac01` succeeded.
Attempting to start `disk_group` on member `rac01`
Start of `disk_group` on member `rac01` succeeded.
Attempting to start `oradata_mount` on member `rac01`
Start of `oradata_mount` on member `rac01` succeeded.
Attempting to start `orcl_db` on member `rac01`
Start of `orcl_db` on member `rac01` succeeded.
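The start and stop order in the transcript above follows from the -r dependencies, and the same kind of ordering can be reproduced with a topological sort. The sketch below uses the standard `tsort` utility on the dependency pairs declared earlier; `start_order` and `stop_order` are hypothetical helper names, and the result is only one valid ordering, since resources that do not depend on each other (such as listener and disk_group) may appear in either order.

```shell
#!/bin/sh
# Each line is "prerequisite dependent", mirroring the -r options used when
# the resources were registered.
deps='havip listener
havip disk_group
disk_group oradata_mount
oradata_mount orcl_db
listener orcl_db'

# A valid start order: every prerequisite is printed before its dependents.
start_order() {
    printf '%s\n' "$deps" | tsort
}

# A valid stop order is the start order reversed.
stop_order() {
    start_order | awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }'
}

echo "start order:"; start_order
echo "stop order:";  stop_order
```

In any ordering produced this way, havip comes first on start and orcl_db comes first on stop, which matches what crs_start -all and crs_stop -all did above.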
Check that the resource states are normal:
oracle@rac01 $ crs_stat -t
Name Type Target State Host
------------------------
disk_group application ONLINE ONLINE rac01
havip application ONLINE ONLINE rac01
listener application ONLINE ONLINE rac01
oradata_mount application ONLINE ONLINE rac01
orcl_db application ONLINE ONLINE rac01
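Reading the State column by eye works for five resources, but the same check can be automated by filtering the `crs_stat -t` output. The following is a minimal sketch: `offline_resources` is a hypothetical helper, and the sample text is invented for the demonstration (it deliberately shows havip as OFFLINE, which is not what the real output above reports). As with any `awk` one-liner, `nawk` may be preferable on Solaris.

```shell
#!/bin/sh
# offline_resources (hypothetical helper): reads `crs_stat -t` output on
# stdin and prints the name of every resource whose State is not ONLINE.
# The header line and the dashed separator are skipped.
offline_resources() {
    awk '$1 != "Name" && $1 !~ /^-/ && NF >= 4 && $4 != "ONLINE" { print $1 }'
}

# Invented sample in the same layout as the output above. On a real node:
#   crs_stat -t | offline_resources
sample_stat='Name Type Target State Host
------------------------
disk_group application ONLINE ONLINE rac01
havip application ONLINE OFFLINE
orcl_db application ONLINE ONLINE rac01'

printf '%s\n' "$sample_stat" | offline_resources
```

An empty result means every registered resource is ONLINE; any names printed are the ones to investigate.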
2008-01-10 17:30:22.526: [ CRSRES][1580] startRunnable: setting CLI values
2008-01-10 17:30:22.527: [ CRSRES][1580] Attempting to start `dg` on member `rac01`
2008-01-10 17:30:22.589: [ CRSRES][1581] startRunnable: setting CLI values
2008-01-10 17:30:22.629: [ CRSRES][1581] Attempting to start `havip` on member `rac01`
2008-01-10 17:30:22.688: [ CRSAPP][1581] StartResource error for havip error code = 1
2008-01-10 17:30:22.749: [ CRSAPP][1581] StopResource error for havip error code = 1
2008-01-10 17:30:22.757: [ CRSRES][1581] X_OP_StopResourceFailed : Stop Resource failed(File: rti.cpp, line: 1796
2008-01-10 17:30:22.758: [ CRSRES][1581][ALERT] `havip` on member `rac01` has experienced an ...
2008-01-10 17:30:22.758: [ CRSRES][1581] Human intervention required to resume its availability.
2008-01-10 17:30:23.211: [ CRSRES][1580] Start of `dg` on member `rac01` succeeded.
At this point it turned out that the wrong NIC name had been typed in by mistake; the public NIC name should be e1000g0,
#!/bin/sh
# *****************************************************************
# shahand 2008-1-3
# /oracle/crs/crs/public/dg.sh
# Action script for the VxVM disk group: imports/deports the disk group
# and starts/stops its volumes.
SCRIPT=$0
ACTION=$1 # Action (start, stop or check)
DG_NAME=oradata12
VOL_NAME=oradata
case $1 in
'start')
    /usr/sbin/vxdg -tfC import $DG_NAME;
    /usr/sbin/vxvol -g $DG_NAME startall
    echo "Resource STARTED"
    ;;
'stop')
    /usr/sbin/vxvol -g $DG_NAME stopall;
    /usr/sbin/vxdg deport $DG_NAME
    echo "Resource STOPPED"
    ;;
'check')
    # Use [ ] rather than [[ ]]: /bin/sh on Solaris is the Bourne shell,
    # which does not support [[ ]].
    # The resource counts as online only if vxprint lists the volume.
    if [ `/usr/sbin/vxprint -v -g $DG_NAME | grep $VOL_NAME | wc -c` -eq 0 ]
    then
        exit 1
    else
        exit 0
    fi
    ;;
*)
    echo "usage: $0 {start stop check}"
    ;;
esac
exit 0
#!/bin/sh
# *****************************************************************
# #
# shahand 2008-1-3
# *****************************************************************
# /oracle/crs/crs/public/lsnr.sh
SCRIPT=$0
ACTION=$1 # Action (start, stop or check)
case $1 in
'start')
    su - oracle -c "lsnrctl start"
    exit $?
    ;;
'stop')
    su - oracle -c "lsnrctl stop"
    exit $?
    ;;
'check')
    # Exclude the grep process itself; otherwise the pipeline normally
    # matches its own command line and the check never reports a failure.
    if [ `ps -ef | grep tnslsnr | grep -v grep | wc -c` -eq 0 ]
    then
        echo "bad";
        exit 1
    else
        echo "good";
        exit 0
    fi
    ;;
*)
    echo "usage: $0 {start stop check}"
    ;;
esac
exit 0
7. The response file crs.rsp for the Oracle cluster software installation; the entries above the #### line need to be modified
ORACLE_HOME="/oracle/crs"
sl_tableList={"rac01:rac01-priv:rac01-vip:N:Y","rac02:rac02-priv:rac02-vip:N:Y"}
ret_PrivIntrList={"e1000g0:10.198.88.0:1","e1000g1:192.168.0.0:2"}
n_storageTypeOCR=2
s_ocrpartitionlocation="/oracle/ocrfile1"
s_ocrMirrorLocation=""
n_storageTypeVDSK=2
s_votingdisklocation="/oracle/vdfile1"
s_OcrVdskMirror1RetVal=""
s_VdskMirror2RetVal=""
ORACLE_HOME_NAME="OraCRS10ghome1"
################# complete modify #####################
RESPONSEFILE_VERSION=2.2.1.0.0
UNIX_GROUP_NAME="oinstall"
FROM_LOCATION="../stage/products.xml"
NEXT_SESSION_RESPONSE=<Value Unspecified>
TOPLEVEL_COMPONENT={"oracle.crs","10.2.0.1.0"}
DEINSTALL_LIST={"oracle.crs","10.2.0.1.0"}
SHOW_SPLASH_SCREEN=false
SHOW_WELCOME_PAGE=false
SHOW_NODE_SELECTION_PAGE=false
SHOW_SUMMARY_PAGE=false
SHOW_INSTALL_PROGRESS_PAGE=false
SHOW_CONFIG_TOOL_PAGE=false
SHOW_XML_PREREQ_PAGE=false
SHOW_ROOTSH_CONFIRMATION=true
SHOW_END_SESSION_PAGE=false
SHOW_EXIT_CONFIRMATION=false
NEXT_SESSION=false
NEXT_SESSION_ON_FAIL=false
SHOW_DEINSTALL_CONFIRMATION=false
SHOW_DEINSTALL_PROGRESS=false
RESTART_SYSTEM=false
RESTART_REMOTE_SYSTEM=false
REMOVE_HOMES=
ORACLE_HOSTNAME=<Value Unspecified>
SHOW_END_OF_INSTALL_MSGS=false
COMPONENT_LANGUAGES={"en"}
s_clustername="crs"
CLUSTER_CONFIGURATION_FILE=""
8. The response file db.rsp for the Oracle database software installation; the entries above the #### line need to be modified
CLUSTER_NODES={}
ORACLE_HOME_NAME="OraDB10ghome1"
ORACLE_HOME="/oracle/10g"
##########################################################
FROM_LOCATION="../stage/products.xml"
RESPONSEFILE_VERSION=2.2.1.0.0
UNIX_GROUP_NAME="oinstall"
FROM_LOCATION_CD_LABEL=<Value Unspecified>
SHOW_WELCOME_PAGE=true
SHOW_CUSTOM_TREE_PAGE=true
SHOW_COMPONENT_LOCATIONS_PAGE=true
SHOW_SUMMARY_PAGE=true
SHOW_INSTALL_PROGRESS_PAGE=true
SHOW_REQUIRED_CONFIG_TOOL_PAGE=true
SHOW_CONFIG_TOOL_PAGE=true
SHOW_RELEASE_NOTES=true
SHOW_ROOTSH_CONFIRMATION=true
SHOW_END_SESSION_PAGE=true
SHOW_EXIT_CONFIRMATION=true
NEXT_SESSION=false
NEXT_SESSION_ON_FAIL=true
NEXT_SESSION_RESPONSE=<Value Unspecified>
DEINSTALL_LIST={"oracle.server","10.2.0.1.0"}
SHOW_DEINSTALL_CONFIRMATION=true
SHOW_DEINSTALL_PROGRESS=true
ACCEPT_LICENSE_AGREEMENT=false
TOPLEVEL_COMPONENT={"oracle.server","10.2.0.1.0"}
SHOW_SPLASH_SCREEN=true
SELECTED_LANGUAGES={"en"}
COMPONENT_LANGUAGES={"en"}
INSTALL_TYPE="Enterprise Edition"
sl_superAdminPasswds=<Value Unspecified>
sl_dlgASMCfgSelectableDisks={}
s_superAdminSamePasswd=<Value Unspecified>
s_globalDBName="orcl"
s_dlgASMCfgRedundancyValue="2 (Norm)"
s_dlgASMCfgNewDisksSize="0"
s_dlgASMCfgExistingFreeSpace="0"
s_dlgASMCfgDiskGroupName="DATA"
s_dlgASMCfgDiskDiscoveryString=""
s_dlgASMCfgAdditionalSpaceNeeded=" MB"
s_dbSelectedUsesASM=""
s_dbSIDSelectedForUpgrade=""
s_dbRetChar=""
s_dbOHSelectedForUpgrade=""
s_ASMSYSPassword=<Value Unspecified>
n_performUpgrade=0
n_dlgASMCfgRedundancySelected=2
n_dbType=1
n_dbSelection=0
b_useSamePassword=false
b_useFileSystemForRecovery=true
b_receiveEmailNotification=false
b_loadExampleSchemas=false
b_enableAutoBackup=false
b_dlgASMShowCandidateDisks=true
b_centrallyManageASMInstance=true
sl_dlgASMDskGrpSelectedGroup={" "," "," "," "}
s_dlgRBOUsername=""
s_dlgEMCentralAgentSelected="No Agents Found"
b_useDBControl=true
s_superAdminSamePasswdAgain=<Value Unspecified>
s_dlgEMSMTPServer=""
s_dlgEMEmailAddress=""
s_dlgRBORecoveryLocation="/oracle/db/flash_recovery_area/"
n_upgradeDB=1
n_configurationOption=3
csl_upgradableSIDBInstances={}
n_upgradeASM=0
sl_dlgASMCfgDiskSelections={}
s_ASMSYSPasswordAgain=<Value Unspecified>
n_dbStorageType=0
s_rawDeviceMapFileLocation=""
sl_upgradableRACDBInstances={}
s_dlgRBOPassword=<Value Unspecified>
b_stateOfUpgradeDBCheckbox=false
s_dbSid="orcl"
b_dbSelectedUsesASM=false
sl_superAdminPasswdsAgain=<Value Unspecified>
s_mountPoint="/oracle/db/oradata/"
b_stateOfUpgradeASMCheckbox=false
oracle.assistants.server:OPTIONAL_CONFIG_TOOLS="{}"
oracle.has.common:OPTIONAL_CONFIG_TOOLS="{}"
oracle.network.client:OPTIONAL_CONFIG_TOOLS="{}"
oracle.sqlplus.isqlplus:OPTIONAL_CONFIG_TOOLS="{}"
oracle.sysman.console.db:OPTIONAL_CONFIG_TOOLS="{}"
varSelect=1
s_nameForOPERGrp="dba"
s_nameForDBAGrp="dba"
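Because only the entries above the #### line of each response file are meant to be edited, it can be convenient to print just that part when reviewing a file before running the installer. The sketch below is an illustration under stated assumptions: `editable_part` is a hypothetical helper, it assumes the marker is the first line beginning with "####", and the demo file contains only a few lines copied from crs.rsp above.

```shell
#!/bin/sh
# editable_part (hypothetical helper): prints the lines of a response file
# that come before the first line starting with "####".
editable_part() {
    sed '/^####/,$d' "$1"
}

# Demo with a shortened copy of crs.rsp written to a temporary file.
demo_rsp=/tmp/demo_rsp.$$
cat > "$demo_rsp" <<'EOF'
ORACLE_HOME="/oracle/crs"
ORACLE_HOME_NAME="OraCRS10ghome1"
################# complete modify #####################
RESPONSEFILE_VERSION=2.2.1.0.0
EOF
editable_part "$demo_rsp"
rm -f "$demo_rsp"
```

For the demo file this prints the ORACLE_HOME and ORACLE_HOME_NAME lines and nothing after the marker.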