Author: Bo Tang
Abstract
In this article we continue with the environment left over at the end of "Oracle 12c RAC and Grid Infrastructure Deployment, Part 3". Readers who have followed along since Part 2 will know that ASMFD and Flex ASM were already configured before this lab begins. Because this article turns to ADVM/ACFS and leans toward operating-system topics, one point must be stated up front: in every environment in this series, from Part 1 onward, SELinux has been enabled. In this lab we first create an ADVM volume, then create and mount an ACFS file system (called the Oracle Cloud File System in 12c), and finally create a tablespace on the ACFS file system to test the cluster-based resilience of the Oracle database server.
Contents
1. Reviewing the original lab environment
2. A few words about SELinux
3. Creating the Oracle Cloud File System
3.1 Creating ADVM/ACFS
3.2 The ADVM proxy instance (the APX instance)
4. Starting and stopping the APX instance
5. Testing the APX instance
5.1 Applying a workload
5.2 Placing an ordinary file on the ACFS file system
5.3 Forcibly shutting down the +ASM2 instance and verifying that ADVM/ACFS stays mounted
5.4 Manually restarting the +ASM2 instance that was forcibly shut down
5.5 Relocating the +APX2 instance and the db12c2 database instance back to +ASM2
Conclusion
1. Reviewing the original lab environment
We continue with the environment left over at the end of "Oracle 12c RAC and Grid Infrastructure Deployment, Part 3". First, an overview of the cluster:
[grid@station11 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM.lsnr ora....er.type ONLINE ONLINE station11
ora.DATA.dg ora....up.type ONLINE ONLINE station11
ora.FRA.dg ora....up.type ONLINE ONLINE station11
ora....ER.lsnr ora....er.type ONLINE ONLINE station11
ora....N1.lsnr ora....er.type ONLINE ONLINE station12
ora....N2.lsnr ora....er.type ONLINE ONLINE station12
ora....N3.lsnr ora....er.type ONLINE ONLINE station11
ora.MGMTLSNR ora....nr.type ONLINE ONLINE station11
ora.asm ora.asm.type ONLINE ONLINE station11
ora.cvu ora.cvu.type ONLINE ONLINE station11
ora.db12c.db ora....se.type ONLINE ONLINE station11
ora.mgmtdb ora....db.type ONLINE ONLINE station11
ora....network ora....rk.type ONLINE ONLINE station11
ora.oc4j ora.oc4j.type ONLINE ONLINE station11
ora.ons ora.ons.type ONLINE ONLINE station11
ora.scan1.vip ora....ip.type ONLINE ONLINE station12
ora.scan2.vip ora....ip.type ONLINE ONLINE station12
ora.scan3.vip ora....ip.type ONLINE ONLINE station11
ora....11.lsnr application ONLINE ONLINE station11
ora....n11.ons application ONLINE ONLINE station11
ora....n11.vip ora....t1.type ONLINE ONLINE station11
ora....12.lsnr application ONLINE ONLINE station12
ora....n12.ons application ONLINE ONLINE station12
ora....n12.vip ora....t1.type ONLINE ONLINE station12
Next, check the ASM disks:
[grid@station11 ~]$ asmcmd lsdsk
Path
AFD:RACDISK1
AFD:RACDISK10
AFD:RACDISK11
AFD:RACDISK2
AFD:RACDISK3
AFD:RACDISK4
AFD:RACDISK5
AFD:RACDISK6
AFD:RACDISK7
AFD:RACDISK8
AFD:RACDISK9
[grid@station11 ~]$ asmcmd afd_lsdsk
--------------------------------------------------------------------------------
Label Filtering Path
================================================================================
RACDISK1 ENABLED /dev/sda5
RACDISK2 ENABLED /dev/sda6
RACDISK3 ENABLED /dev/sda7
RACDISK4 ENABLED /dev/sda8
RACDISK5 ENABLED /dev/sda9
RACDISK6 ENABLED /dev/sda10
RACDISK7 ENABLED /dev/sda11
RACDISK8 ENABLED /dev/sda12
RACDISK9 ENABLED /dev/sda13
RACDISK10 ENABLED /dev/sda14
RACDISK11 ENABLED /dev/sda15
Because this article turns to ADVM/ACFS and leans toward operating-system topics, it must be stated up front: in every environment in this series, from Part 1 onward, SELinux has been enabled.
[root@station11 /]# getenforce
Enforcing
[root@station12 ~]# getenforce
Enforcing
2. A few words about SELinux
SELinux is a mandatory access control (MAC) security system based on the domain-type model. It was written by the NSA (reportedly more than 20,000 lines of effective code) and is built into the Linux kernel as a kernel module. SELinux is one of the most distinguished security subsystems in the history of Linux.
Among the Red Hat Linux distributions, SELinux first appeared in Red Hat Enterprise Linux 4. As is well known, SELinux has been a mandatory topic, and a key to passing, in every RHCE exam from that release onward: a candidate who cannot make any other exam topic (a web server, say) work together with SELinux will not pass the RHCE certification. After all these years, then, SELinux should be familiar territory for technical staff.
Oracle long discouraged users from running with SELinux. That is understandable, because SELinux is extremely strict: its mandatory access control policies block any process it deems illegitimate, and it adds special context labels on the file system to mark the security attributes of files and directories. A system of processes as large as Oracle's can easily trip over an SELinux denial, immediately producing baffling problems. Put another way, SELinux is like an exceptionally fierce guard dog: it does make the home safer, but mishandled, it can also turn on its owner. This explains why, even in the most recent 12cR2 Beta RAC installation guide (this article was written in October 2016) compiled and circulated by a domestic Oracle optimization team, you will find that they still habitually disable SELinux when installing the operating system.
Starting with 11.2.0.2, Oracle's documentation declares installation supported on SELinux, because the major SELinux-related bugs (far more than those listed below, but most of the others are duplicates of the ones listed) were largely resolved in 11.2.0.2. The 11g documentation, however, still states that the ACFS component does not support SELinux:
Bug number | Description
6079461 | ERROR WHILE LOADING SHARED LIBRARY WHEN SELINUX IS ENABLED ON ORACLE LINUX 5 AND RED HAT ENTERPRISE LINUX 5 (fixed in 11.2.0.2)
6140224 | SQLPLUS FAILS TO LOAD LIBNNZ11.SO WITH SELINUX ENABLED ON EL5/RHEL5 (affects all releases before 11.2.0.2; fixed in 11.2.0.2)
6155260 | ASSEMBLY CODE IN RSA CRYPTOC IS NOT POSITION INDEPENDENT (root cause of bug 6140224; fixed in 11.2.0.2)
9215184 | OCI APPLICATION FAILS FOR 64 BITS LIBRARIES ON SELINUX (fixed in 11.2.0.2)
9746474 | SELINUX IS PREVENTING "EXECMOD" (fixed in 11.2.0.2)
9817031 | 11.2 GRID INFRA. FAILS DURING ROOT.SH, IF DEFAULT SELINUX IS SET TO "ENFORCED" (fixed in 11.2.0.2)
12754448 | SECURITY-ENHANCED LINUX (SELINUX) IS NOT SUPPORTED ON ORACLE AUTOMATIC STORAGE MANAGEMENT ... CLUSTER FILE SYSTEM (ORACLE ACFS) FILE SYSTEMS (fixed in 12.1.0.1)
14030380 | ACFS START FILESYSTEM FAILS WITH MIXED CASE HOSTNAME WHEN SELINUX ENFORCING (fixed in 11.2.0.3.2)
Back to the subject. Concerning ADVM/ACFS in 12c, Oracle's official 12c documentation at http://docs.oracle.com/database/121/OSTMG/GUID-A972A122-D434-415D-A8B5-BCCBCDD0BC8C.htm#OSTMG92000 describes it as follows:
When using Security-Enhanced Linux (SELinux) in enforcing mode with Oracle ACFS, ensure that the Oracle ACFS file systems are mounted with an SELinux default context. Refer to your Linux vendor documentation for information about the context mount option.
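The "context mount option" the documentation points to is the standard context= option of mount(8). A minimal sketch, assuming the same volume and mount point used later in this lab and a context value chosen purely for illustration (not something verified here):

```shell
# Mount an ACFS volume with an explicit SELinux context (run as root;
# the context value below is illustrative, not from this lab).
mount -t acfs -o context=system_u:object_r:default_t:s0 /dev/asm/vol0-382 /acfsdata

# The equivalent persistent /etc/fstab entry would look like (config sketch):
# /dev/asm/vol0-382  /acfsdata  acfs  context=system_u:object_r:default_t:s0  0 0
```

For a clusterware-managed ACFS resource, the same string could presumably go into the resource's mount options instead of fstab.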
In short: since Oracle itself says so, ADVM/ACFS under SELinux is worth exploring in 12c, with ASMFD and Flex ASM also present in the environment. That is the motivation for this article. Remember: if enforcing SELinux were unnecessary, the NSA would not have developed it, and Linus Torvalds, the father of Linux, would not have accepted it into the mainline kernel.
3. Creating the Oracle Cloud File System
The Oracle Cloud File System simplifies and even automates storage management while improving storage utilization, longevity, agile tuning based on predicted performance changes, and high availability for both general-purpose and database files. It consists of the volume manager ADVM and the cluster file system ACFS. In 12c the Oracle Cloud File System has become very flexible: it can store database files as well as unstructured data such as configuration files and application files, and it can even host database software installations and database backups.
3.1 Creating ADVM/ACFS
In this lab we first create an ADVM volume in the Flex ASM environment, then create an ACFS file system (called the Oracle Cloud File System in 12c), and finally mount it. Since this section mainly prepares the environment, it focuses on the actual commands and does not dwell on the underlying theory.
Load the kernel modules:
[root@station11 ~]# /u01/app/12.1.0/grid/bin/acfsload start -s
/u01/app/12.1.0/grid/bin/acfsload: line 34: /lib/acfstoolsdriver.sh: No such file or directory
/u01/app/12.1.0/grid/bin/acfsload: line 34: exec: /lib/acfstoolsdriver.sh: cannot execute: No such file or directory
[root@station12 ~]# /u01/app/12.1.0/grid/bin/acfsload start -s
/u01/app/12.1.0/grid/bin/acfsload: line 34: /lib/acfstoolsdriver.sh: No such file or directory
/u01/app/12.1.0/grid/bin/acfsload: line 34: exec: /lib/acfstoolsdriver.sh: cannot execute: No such file or directory
The failure messages above indicate that the ADVM/ACFS drivers still need to be installed:
[root@station11 ~]# /u01/app/12.1.0/grid/bin/acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
[root@station11 ~]# /u01/app/12.1.0/grid/bin/acfsload start -s
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9322: completed
[root@station12 ~]# /u01/app/12.1.0/grid/bin/acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
[root@station12 ~]# /u01/app/12.1.0/grid/bin/acfsload start -s
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9322: completed
Load the kernel modules once more to confirm:
[root@station11 ~]# acfsload start -s
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9322: completed
[root@station12 ~]# acfsload start -s
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9322: completed
Create the mount point directories and set their SELinux context to default_t:
[root@station11 ~]# mkdir /acfsdata
[root@station11 ~]# chcon -t default_t /acfsdata
[root@station11 ~]# ls -Zd /acfsdata
drwxr-xr-x. root root unconfined_u:object_r:default_t:s0 /acfsdata
[root@station12 ~]# mkdir /acfsdata
[root@station12 ~]# chcon -t default_t /acfsdata
[root@station12 ~]# ls -Zd /acfsdata
drwxr-xr-x. root root unconfined_u:object_r:default_t:s0 /acfsdata
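One hedged caveat about chcon: it only changes the label currently on the inode, and a full file-system relabel would revert it. If the default_t context on the mount point should survive relabels, the conventional approach is a persistent file-context rule (requires the semanage tool from the policycoreutils packages; a config sketch, not something this lab depends on):

```shell
# Record a persistent SELinux file-context rule for the mount point (run as root).
semanage fcontext -a -t default_t "/acfsdata"
# Re-apply the context stored in the policy to the directory:
restorecon -v /acfsdata
```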
In the graphical environment on station11, start asmca as the grid user.
Click "Volumes", then "Create".
Enter "vol0" for "Volume Name"; choose "DATA" from the "Disk Group Name" drop-down list; enter "1G" for "Size" (we demonstrate with a small 1 GB volume). Then click OK.
Wait while the volume is created...
Confirm that the volume was created successfully:
The overview of volume "/dev/asm/vol0-382" is displayed.
Next, click the "ASM Cluster File Systems" tab, then the "Create" button.
Choose "Cluster File System" from the "Type of ACFS" drop-down list; enter "/acfsdata" for "Mount Point"; check the "Auto Mount" checkbox; choose "VOL0-/dev/asm/vol0-382-1.0G" from the "Select Volume" drop-down list. Then click OK.
The ACFS file system has now been created on the /dev/asm/vol0-382 volume. All that remains is to run the /u01/app/grid/cfgtoollogs/asmca/scripts/acfs_script.sh script as root on station11 (the machine where asmca was invoked) to register the file system with the clusterware and mount it. Because the script calls srvctl, it does not need to be run on the other machines in the cluster.
Let's look at the contents of /u01/app/grid/cfgtoollogs/asmca/scripts/acfs_script.sh:
#!/bin/sh
/u01/app/12.1.0/grid/bin/srvctl add filesystem -d /dev/asm/vol0-382 -m /acfsdata -fstype ACFS -autostart ALWAYS
if [ $? = "0" -o $? = "2" ]; then
/u01/app/12.1.0/grid/bin/srvctl start filesystem -d /dev/asm/vol0-382
if [ $? = "0" ]; then
chmod 775 /acfsdata
/u01/app/12.1.0/grid/bin/srvctl status filesystem -d /dev/asm/vol0-382
exit 0
fi
/u01/app/12.1.0/grid/bin/srvctl status filesystem -d /dev/asm/vol0-382
fi
Execute the script:
[root@station11 acfsdata]# /u01/app/12.1.0/grid/bin/srvctl start filesystem -d /dev/asm/vol0-382
PRCR-1079 : Failed to start resource ora.data.vol0.acfs
CRS-5016: Process "/bin/mount" spawned by agent "ORAROOTAGENT" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/grid/diag/crs/station11/crs/trace/crsd_orarootagent_root.trc"
Error mounting file system '/dev/asm/vol0-382' on '/acfsdata': /sbin/mount.acfs.bin: error while loading shared libraries: libhasgen12.so: failed to map segment from shared object: Permission denied
CRS-2674: Start of 'ora.data.vol0.acfs' on 'station11' failed
During execution, the first half, "srvctl add filesystem -d /dev/asm/vol0-382 -m /acfsdata -fstype ACFS -autostart ALWAYS", runs without problems. But with SELinux "enforcing" (as in this environment), the second half, "/u01/app/12.1.0/grid/bin/srvctl start filesystem -d /dev/asm/vol0-382", reports a typical SELinux AVC error: registering with the clusterware succeeds, while mounting the ACFS file system fails. This is almost certainly an unpublished bug. The /var/log/audit/audit.log log on station11 confirms it:
type=AVC msg=audit(1476967058.678:458): avc: denied { execute } for pid=18342 comm="mount.acfs.bin" path="/u01/app/12.1.0/grid/lib/libhasgen12.so" dev="dm-0" ino=1843106 scontext=system_u:system_r:mount_t:s0 tcontext=system_u:object_r:default_t:s0 tclass=file
type=SYSCALL msg=audit(1476967058.678:458): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=ff46c0 a2=5 a3=802 items=0 ppid=18341 pid=18342 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 ses=4294967295 tty=(none) comm="mount.acfs.bin" exe="/sbin/mount.acfs.bin" subj=system_u:system_r:mount_t:s0 key=(null)
Since Oracle says 12c supports ACFS under SELinux, we will not set SELinux to "disabled" or "permissive" for this experiment. SELinux troubleshooting is actually not as complicated as it may seem: if auditd is not running, check /var/log/messages; if auditd is running, check /var/log/audit/audit.log. Going deeper, tools such as strace and ddd can help track a problem down and resolve it. From the log above, the guess is that when the grid user performs the mount-related operations, SELinux decides the privileges are insufficient (it did not object to the grid user registering its own clusterware resource). If that is the case, mounting as root should solve it:
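A hedged aside: the AVC record above is exactly the kind of input audit2allow consumes. A common workaround sketch, assuming the audit2allow tool (from the policycoreutils packages) is installed, builds a small local policy module from the denial. Whether allowing mount_t to execute that library is acceptable is a policy decision for your environment; this lab did not take that route:

```shell
# Generate and load a local SELinux policy module from the mount.acfs
# denials (run as root; module name "acfsmount" is illustrative).
grep 'comm="mount.acfs.bin"' /var/log/audit/audit.log | audit2allow -M acfsmount
semodule -i acfsmount.pp
```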
[root@station11 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/RAC12crhel6-root 52G 28G 22G 56% /
tmpfs 3.8G 1.3G 2.6G 33% /dev/shm
/dev/vda1 283M 131M 133M 50% /boot
[root@station11 ~]# mount.acfs -o all
[root@station11 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/RAC12crhel6-root 52G 28G 22G 56% /
tmpfs 3.8G 1.3G 2.6G 33% /dev/shm
/dev/vda1 283M 131M 133M 50% /boot
/dev/asm/vol0-382 1.0G 943M 82M 93% /acfsdata
[root@station12 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/RAC12crhel6-root 52G 29G 21G 58% /
tmpfs 3.8G 1.3G 2.6G 33% /dev/shm
/dev/vda1 283M 131M 133M 50% /boot
[root@station12 ~]# mount.acfs -o all
[root@station12 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/RAC12crhel6-root 52G 29G 21G 58% /
tmpfs 3.8G 1.3G 2.6G 33% /dev/shm
/dev/vda1 283M 131M 133M 50% /boot
/dev/asm/vol0-382 1.0G 943M 82M 93% /acfsdata
If you prefer to save trouble, you can also leave SELinux in "disabled" or "permissive" mode, in which case the /u01/app/grid/cfgtoollogs/asmca/scripts/acfs_script.sh script runs without any problems.
For reference, the execution with SELinux turned off:
[root@station11 ~]# /u01/app/grid/cfgtoollogs/asmca/scripts/acfs_script.sh
ACFS file system /acfsdata mounted on nodes station11,station12
Even if the script failed earlier and the file system was then mounted manually as root, the graphical interface still reports that the ACFS creation succeeded:
[root@station11 acfsdata]# /u01/app/12.1.0/grid/bin/srvctl config filesystem -d /dev/asm/vol0-382
Volume device: /dev/asm/vol0-382
Canonical volume device: /dev/asm/vol0-382
Accelerator volume devices:
Mountpoint path: /acfsdata
User:
Type: ACFS
Mount options:
Description:
ACFS file system is enabled
ACFS file system is individually enabled on nodes:
ACFS file system is individually disabled on nodes:
To use this file system for tablespace data files, change its owner and group. Note that after mounting, the SELinux context is no longer the mount point's own "default_t" label but the actual "unlabeled_t" label of the ACFS file system, which is entirely consistent with how Linux works.
[root@station11 ~]# ls -ld /acfsdata/
drwxrwxr-x. 4 root root 4096 Oct 17 16:05 /acfsdata/
[root@station11 ~]# chown oracle:oinstall /acfsdata
[root@station11 ~]# ls -ld /acfsdata
drwxrwxr-x. 4 oracle oinstall 4096 Oct 17 16:05 /acfsdata
[root@station11 acfsdata]# ls -Zd ./
drwxrwxr-x. oracle oinstall system_u:object_r:unlabeled_t:s0 ./
[root@station12 ~]# ls -ld /acfsdata/
drwxrwxr-x. 4 root root 4096 Oct 17 16:11 /acfsdata/
[root@station12 ~]# chown oracle:oinstall /acfsdata
[root@station12 ~]# ls -ld /acfsdata/
drwxrwxr-x. 4 oracle oinstall 4096 Oct 17 16:11 /acfsdata/
[root@station12 acfsdata]# ls -Zd ./
drwxrwxr-x. oracle oinstall system_u:object_r:unlabeled_t:s0 ./
Looking at the cluster overview again at this point, it now carries the Oracle Cloud File System resources:
[grid@station11 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM.lsnr ora....er.type ONLINE ONLINE station11
ora....L0.advm ora....me.type ONLINE ONLINE station11
ora.DATA.dg ora....up.type ONLINE ONLINE station11
ora.FRA.dg ora....up.type ONLINE ONLINE station11
ora....ER.lsnr ora....er.type ONLINE ONLINE station11
ora....N1.lsnr ora....er.type ONLINE ONLINE station12
ora....N2.lsnr ora....er.type ONLINE ONLINE station11
ora....N3.lsnr ora....er.type ONLINE ONLINE station11
ora.MGMTLSNR ora....nr.type ONLINE ONLINE station11
ora.asm ora.asm.type ONLINE ONLINE station11
ora.cvu ora.cvu.type ONLINE ONLINE station11
ora....l0.acfs ora.acfs.type ONLINE ONLINE station11
ora.db12c.db ora....se.type ONLINE ONLINE station11
ora.mgmtdb ora....db.type ONLINE ONLINE station11
ora....network ora....rk.type ONLINE ONLINE station11
ora.oc4j ora.oc4j.type ONLINE ONLINE station11
ora.ons ora.ons.type ONLINE ONLINE station11
ora.proxy_advm ora....vm.type ONLINE ONLINE station11
ora.scan1.vip ora....ip.type ONLINE ONLINE station12
ora.scan2.vip ora....ip.type ONLINE ONLINE station11
ora.scan3.vip ora....ip.type ONLINE ONLINE station11
ora....11.lsnr application ONLINE ONLINE station11
ora....n11.ons application ONLINE ONLINE station11
ora....n11.vip ora....t1.type ONLINE ONLINE station11
ora....12.lsnr application ONLINE ONLINE station12
ora....n12.ons application ONLINE ONLINE station12
ora....n12.vip ora....t1.type ONLINE ONLINE station12
3.2 The ADVM proxy instance (the APX instance)
[grid@station11 ~]$ . oraenv
ORACLE_SID = [+ASM1] ? +APX1
The Oracle base remains unchanged with value /u01/app/grid
[grid@station11 ~]$ sqlplus /nolog
SQL*Plus: Release 12.1.0.2.0 Production on Mon Oct 17 16:08:33 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> show parameter instance_type
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
instance_type                        string      ASMPROXY
[grid@station12 ~]$ . oraenv
ORACLE_SID = [+ASM2] ? +APX2
The Oracle base remains unchanged with value /u01/app/grid
[grid@station12 ~]$ sqlplus /nolog
SQL*Plus: Release 12.1.0.2.0 Production on Mon Oct 17 16:11:10 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> show parameter instance_type
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
instance_type                        string      ASMPROXY
The two queries above show that the ADVM proxy instance is a special kind of Oracle instance: its instance type is ASMPROXY, and it is named +APX followed by the node number. It is what allows ADVM/ACFS to connect to the Flex ASM instances. An APX instance must run on every machine where ADVM/ACFS is configured. Like ASM and Flex ASM instances, APX instances are managed with AMM, and the default memory_target needs no adjustment in most scenarios.
If the ACFS kernel modules have not been loaded (acfsload start -s), the APX instance cannot start; and without an APX instance, a machine cannot enable (volenable) ADVM volumes at all.
4. Starting and stopping the APX instance
APX instances can be started and stopped independently.
[grid@station11 ~]$ srvctl stop asm -proxy -node station11
[grid@station11 ~]$ srvctl status asm -proxy -detail
ADVM proxy is not running
ADVM proxy is enabled
[grid@station12 ~]$ srvctl start asm -proxy -node station12
[grid@station11 ~]$ srvctl status asm -proxy -detail
ADVM proxy is running on node station12,station11
ADVM proxy is enabled
5. Testing the APX instance
5.1 Applying a workload
Create the ACFSDATA tablespace on the ACFS volume:
[oracle@station12 ~]$ sqlplus /nolog
SQL*Plus: Release 12.1.0.2.0 Production on Mon Oct 17 23:19:28 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
SQL> conn system/oracle_4U
Connected.
SQL> set linesize 1000
SQL> col TABLESPACE_NAME format a20
SQL> col FILE_NAME format a80
SQL> select tablespace_name , file_name from dba_data_files;
TABLESPACE_NAME FILE_NAME
-------------------- --------------------------------------------------------------------------------
SYSTEM +DATA/DB12C/DATAFILE/system.279.923784037
SYSAUX +DATA/DB12C/DATAFILE/sysaux.278.923783913
EXAMPLE +DATA/DB12C/DATAFILE/example.286.923784339
USERS +DATA/DB12C/DATAFILE/users.280.923784203
UNDOTBS2 +DATA/DB12C/DATAFILE/undotbs2.287.923785293
UNDOTBS1 +DATA/DB12C/DATAFILE/undotbs1.281.923784205
6 rows selected.
SQL> create tablespace acfsdata datafile '/acfsdata/acfsdata01.dbf' size 800M;
Tablespace created.
SQL> select tablespace_name , file_name from dba_data_files;
TABLESPACE_NAME FILE_NAME
-------------------- --------------------------------------------------------------------------------
SYSTEM +DATA/DB12C/DATAFILE/system.279.923784037
SYSAUX +DATA/DB12C/DATAFILE/sysaux.278.923783913
EXAMPLE +DATA/DB12C/DATAFILE/example.286.923784339
USERS +DATA/DB12C/DATAFILE/users.280.923784203
UNDOTBS2 +DATA/DB12C/DATAFILE/undotbs2.287.923785293
UNDOTBS1 +DATA/DB12C/DATAFILE/undotbs1.281.923784205
ACFSDATA /acfsdata/acfsdata01.dbf
7 rows selected.
Look at the file created on the ACFS file system, along with its SELinux context label. The actual context label of files inside ACFS is unlabeled_t.
[oracle@station11 ~]$ cd /acfsdata/
[oracle@station11 acfsdata]$ ls
acfsdata01.dbf lost+found
[oracle@station11 acfsdata]$ du -sh acfsdata01.dbf
801M acfsdata01.dbf
[root@station11 acfsdata]# ls -Z
-rw-r-----. oracle asmadmin system_u:object_r:unlabeled_t:s0 acfsdata01.dbf
drwx------. root root system_u:object_r:unlabeled_t:s0 lost+found
[oracle@station12 ~]$ cd /acfsdata
[oracle@station12 acfsdata]$ ls
acfsdata01.dbf lost+found
[oracle@station12 acfsdata]$ du -sh acfsdata01.dbf
801M acfsdata01.dbf
[root@station12 acfsdata]# ls -Z
-rw-r-----. oracle asmadmin system_u:object_r:unlabeled_t:s0 acfsdata01.dbf
drwx------. root root system_u:object_r:unlabeled_t:s0 lost+found
Now apply a workload specifically to this tablespace:
[oracle@station12 ~]$ sqlplus /nolog
SQL*Plus: Release 12.1.0.2.0 Production on Mon Oct 17 23:19:28 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
SQL> conn system/oracle_4U
Connected.
SQL> create table sh.sales2 tablespace acfsdata
2 as select * from sh.sales;
Table created.
SQL> update sh.sales2 set AMOUNT_SOLD=AMOUNT_SOLD+1 ;
(keep this statement running)
5.2 Placing an ordinary file on the ACFS file system
[root@station12 acfsdata]# dd if=/dev/zero of=acfs.file bs=10M count=5
5+0 records in
5+0 records out
52428800 bytes (52 MB) copied, 0.0902295 s, 581 MB/s
[root@station12 acfsdata]# ls
acfsdata01.dbf acfs.file lost+found
This demonstrates ACFS's general-purpose nature, and also that this dd is entirely different in character from the dd against ASM disks in Part 2: ASMFD does not block this dd, and SELinux raises no objection either.
5.3 Forcibly shutting down the +ASM2 instance and verifying that ADVM/ACFS stays mounted
Before forcibly shutting down the +ASM2 instance, here is how the ASM, APX, and database server instances are connected:
[grid@station11 ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.1.0.2.0 Production on Sat Oct 17 23:46:55 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> set linesize 1000
SQL> col client_instance_name format a20
SQL> col ASM_INSTANCE_NAME format a20
SQL> col DB_NAME format a20
SQL> col STATUS format a20
SQL> select distinct i.instance_name asm_instance_name, c.instance_name client_instance_name,
2 c.db_name, c.status
3 from gv$instance i, gv$asm_client c
4 where i.inst_id=c.inst_id;
ASM_INSTANCE_NAME CLIENT_INSTANCE_NAME DB_NAME STATUS
-------------------- -------------------- -------------------- ------------------------------------
+ASM1 +APX1 +APX CONNECTED
+ASM1 +ASM1 +ASM CONNECTED
+ASM1 -MGMTDB _mgmtdb CONNECTED
+ASM1 db12c1 db12c CONNECTED
+ASM2 +APX2 +APX CONNECTED
+ASM2 +ASM2 +ASM CONNECTED
+ASM2 db12c2 db12c CONNECTED
7 rows selected.
While the workload is running, forcibly shut down +ASM2 (the Flex ASM instance that the +APX2 instance and the db12c2 database instance are connected to):
[grid@station11 ~]$ srvctl stop asm -node station12 -stopoption ABORT -force
or, alternatively, kill the +ASM2 SMON background process directly:
[root@station12 ~]# ps -eo pid,command | grep 'asm_smon_+ASM2' | grep -v grep | cut -c -6 | xargs kill -9
Check the state of the Flex ASM instances across the cluster and confirm that +ASM2 is dead:
[grid@station11 ~]$ srvctl status asm -verbose
ASM is running on station11
Detailed state on node station11: Started
Detailed state on node station12: Instance Shutdown
The following query confirms that the Flex ASM feature is doing its job: the +APX2 instance was automatically and transparently reconnected to +ASM1 and avoided crashing, and the db12c2 database instance was likewise automatically and transparently reconnected to +ASM1 and avoided crashing. For a database server with tablespaces on ADVM/ACFS, this is what genuinely improves the cluster-based resilience of the Oracle database server.
[grid@station11 ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.1.0.2.0 Production on Sat Oct 17 23:56:55 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> set linesize 1000
SQL> col client_instance_name format a20
SQL> col ASM_INSTANCE_NAME format a20
SQL> col DB_NAME format a20
SQL> col STATUS format a20
SQL> select distinct i.instance_name asm_instance_name, c.instance_name client_instance_name,
2 c.db_name, c.status
3 from gv$instance i, gv$asm_client c
4 where i.inst_id=c.inst_id;
ASM_INSTANCE_NAME CLIENT_INSTANCE_NAME DB_NAME STATUS
-------------------- -------------------- -------------------- ------------------------------------
+ASM1 +APX1 +APX CONNECTED
+ASM1 +APX2 +APX CONNECTED
+ASM1 +ASM1 +ASM CONNECTED
+ASM1 -MGMTDB _mgmtdb CONNECTED
+ASM1 db12c1 db12c CONNECTED
+ASM1 db12c2 db12c CONNECTED
6 rows selected.
5.4 Manually restarting the +ASM2 instance that was forcibly shut down
Now let's manually restart the +ASM2 instance that was forcibly shut down:
[grid@station11 ~]$ srvctl start asm -node station12
[grid@station11 ~]$ srvctl status asm -verbose
ASM is running on station12,station11
Detailed state on node station11: Started
Detailed state on node station12: Started
In the current 12.1.0.2 GI release, the db12c2 database instance and the +APX2 instance do not automatically reconnect to +ASM2:
[grid@station11 ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.1.0.2.0 Production on Sat Oct 18 00:03:55 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> set linesize 1000
SQL> col client_instance_name format a20
SQL> col ASM_INSTANCE_NAME format a20
SQL> col DB_NAME format a20
SQL> col STATUS format a20
SQL> select distinct i.instance_name asm_instance_name, c.instance_name client_instance_name,
2 c.db_name, c.status
3 from gv$instance i, gv$asm_client c
4 where i.inst_id=c.inst_id;
ASM_INSTANCE_NAME CLIENT_INSTANCE_NAME DB_NAME STATUS
-------------------- -------------------- -------------------- ------------------------------------
+ASM1 +APX1 +APX CONNECTED
+ASM1 +APX2 +APX CONNECTED
+ASM1 +ASM1 +ASM CONNECTED
+ASM1 -MGMTDB _mgmtdb CONNECTED
+ASM1 db12c1 db12c CONNECTED
+ASM1 db12c2 db12c CONNECTED
6 rows selected.
5.5 Relocating the +APX2 instance and the db12c2 database instance back to +ASM2
If the workload started in section 5.1 has finished, commit in that session and then run the workload once more. The fact that the workload can finish cleanly is itself proof that killing +ASM2 in the previous experiment did no harm to the +APX2 instance or the db12c2 database instance.
SQL> commit;
Commit complete.
SQL> update sh.sales2 set AMOUNT_SOLD=AMOUNT_SOLD+1 ;
Make sure all of the following experiments are performed while this workload is running. Incidentally, can the ordinary file from section 5.2 still be manipulated?
[root@station11 ~]# cd /acfsdata/
[root@station11 acfsdata]# ls
acfsdata01.dbf acfs.file lost+found
[root@station11 acfsdata]# rm -f acfs.file
[root@station11 acfsdata]# ls
acfsdata01.dbf lost+found
Next, relocate the +APX2 instance back to the +ASM2 instance.
[grid@station11 ~]$ sqlplus /nolog
SQL*Plus: Release 12.1.0.2.0 Production on Thu Oct 18 00:05:10 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
SQL> conn / as sysasm
Connected.
SQL> alter system relocate client '+APX2:+APX';
System altered.
SQL> set linesize 1000
SQL> col client_instance_name format a20
SQL> col ASM_INSTANCE_NAME format a20
SQL> col DB_NAME format a20
SQL> col STATUS format a20
SQL> select distinct i.instance_name asm_instance_name, c.instance_name client_instance_name,
2 c.db_name, c.status
3 from gv$instance i, gv$asm_client c
4 where i.inst_id=c.inst_id;
ASM_INSTANCE_NAME CLIENT_INSTANCE_NAME DB_NAME STATUS
-------------------- -------------------- -------------------- ------------------------------------
+ASM1 +APX1 +APX CONNECTED
+ASM1 +ASM1 +ASM CONNECTED
+ASM1 -MGMTDB _mgmtdb CONNECTED
+ASM1 db12c1 db12c CONNECTED
+ASM1 db12c2 db12c CONNECTED
+ASM2 +APX2 +APX CONNECTED
6 rows selected.
Next, relocate the db12c2 database instance back to the +ASM2 instance.
[grid@station11 ~]$ sqlplus /nolog
SQL*Plus: Release 12.1.0.2.0 Production on Thu Oct 18 00:07:50 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
SQL> conn / as sysasm
Connected.
SQL> alter system relocate client 'db12c2:db12c';
System altered.
SQL> set linesize 1000
SQL> col client_instance_name format a20
SQL> col ASM_INSTANCE_NAME format a20
SQL> col DB_NAME format a20
SQL> col STATUS format a20
SQL> select distinct i.instance_name asm_instance_name, c.instance_name client_instance_name,
2 c.db_name, c.status
3 from gv$instance i, gv$asm_client c
4 where i.inst_id=c.inst_id;
ASM_INSTANCE_NAME CLIENT_INSTANCE_NAME DB_NAME STATUS
-------------------- -------------------- -------------------- ------------------------------------
+ASM1 +APX1 +APX CONNECTED
+ASM1 +ASM1 +ASM CONNECTED
+ASM1 -MGMTDB _mgmtdb CONNECTED
+ASM1 db12c1 db12c CONNECTED
+ASM2 db12c2 db12c CONNECTED
+ASM2 +APX2 +APX CONNECTED
6 rows selected.
Conclusion
ADVM/ACFS under SELinux is worth exploring in 12c, with ASMFD and Flex ASM also present in the environment. In this Oracle Cloud File System environment, the ADVM proxy instance is a special kind of Oracle instance: its instance type is ASMPROXY, and it is named +APX followed by the node number. It is what allows ADVM/ACFS to connect to the Flex ASM instances. An APX instance must run on every machine where ADVM/ACFS is configured.
Postscript: to be candid, UEK3's support for ACFS is not great. Without any patches, and in an environment as complex as this one (ASMFD + Flex ASM + SELinux), the two hosts initially rebooted at random during the experiments. Analysis showed this had little to do with SELinux: the memory-mapping behavior of the ora_j001 process was corrupting the UEK3 kernel. After adjusting kernel.shmmax on both machines to half of physical memory, the problem went away (the systems have now run for more than 500 hours without incident). One hopes that Oracle will not set the default kernel.shmmax too high in future OEL releases.
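As a sketch of that final tuning step (the sysctl.d file name below is illustrative, not taken from the lab), half of physical memory in bytes can be derived from /proc/meminfo, whose MemTotal line is expressed in kB:

```shell
# Compute half of physical RAM in bytes (MemTotal in /proc/meminfo is in kB).
half_bytes=$(awk '/^MemTotal:/ { printf "%.0f", $2 * 1024 / 2 }' /proc/meminfo)
echo "kernel.shmmax = ${half_bytes}"
# To apply persistently (as root; the file name is illustrative):
#   echo "kernel.shmmax = ${half_bytes}" > /etc/sysctl.d/98-shmmax.conf
#   sysctl -p /etc/sysctl.d/98-shmmax.conf
```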