Bo's Oracle Station

【博客文章2021】对Oracle RAC TRANSACTION可飘移服务的工作过程的思考

2021-11-30 10:00| 发布者: admin| 查看: 4553| 评论: 0|原作者: Bo Tang

摘要: 【博客文章2021】对Oracle RAC TRANSACTION可飘移服务的工作过程的思考
【博客文章2021】对Oracle RAC TRANSACTION可飘移服务的工作过程的思考

Author: Bo Tang

1. 详细配置Oracle RAC TRANSACTION可飘移服务:
    Policy-Managed RAC的服务为例子,详细配置一个Oracle RAC Transaction可飘移服务,来支持APPLICATION CONTINUITY和TRANSACTION GUARD:


    -commit_outcome TRUE代表实现了TRANSACTION GUARD这个Oracle功能。
    对于Oracle RAC Transaction可飘移服务来讲,连接负载均衡选项(-j)比较推荐使用SHORT。但是使用LONG也是可以的。连接负载均衡选项代表会话负载均衡的分配机制主要考量短会话(SHORT)还是长会话(LONG),长会话比较典型的是连接池。
    对于Oracle RAC Transaction可飘移服务来讲,运行时负载均衡选项-rlbgoal比较推荐使用SERVICE_TIME。但是使用THROUGHPUT也是可以的。运行时负载均衡选项是考量以相应时间最快为优化目标还是以吞吐量最大为优化目标。
    -retention选项代表commit outcome信息要被保留的秒数,默认就是86400秒,最长不超过1个月(2592000秒)
    -replay_init_time代表在指定的秒数范围内,事务replay必须启动。如果在指定的秒数内它未能启动,则放弃该事务replay。  

2. Oracle RAC TRANSACTION可飘移服务的飘移过程:

    当可恢复的故障发生的时候(比如所连接的RAC节点出现意外故障),配置了Oracle RAC TRANSACTION可飘移服务的RAC数据库,就会尝试寻找仍然存活的节点进行会话重新连接。首先判断-retention选项代表commit outcome是否超时。如果没有超时,就完成会话的重新连接。重新连接尝试的次数是由-failoverretry(或者-z)来决定的。每次重新连接的间隔是由-failoverdelay(或者-w)来决定的。一旦连接成功,允许事务在 -replay_init_time规定的秒数内开始replay。在等待relay的过程中,正在检查以下两个重要的指标:
    a. 数据库是否是原数据库,SCN是否正常(没有发生过不完全恢复)?
    b. 发生故障时最后的那个事务是否提交成功了?
    检查完成以上信息后,整理出一个将要执行的SQL语句队列,然后这个队列将被按照顺序执行。

3. Oracle RAC TRANSACTION可飘移服务的测试过程:

    Oracle RAC SESSION可漂移服务或者Oracle RAC SELECT可漂移服务就可以通过sqlplus中做一个会话,接着kill会话所连接的节点上的ora_smon进程的方式来测试。Oracle RAC SELECT可漂移服务在测试中表现尤其惊艳:查询短暂停顿(SQL>提示符还在),接着会话和cursor被漂移到新的节点,剩余的结果集一行不多一行不少地返回给用户。Oracle RAC TRANSACTION可漂移服务不能通过在sqlplus中做一个事务,不提交,接着kill会话所连接的节点上的ora_smon进程的方式来测试。如果那样做,只会看到与Oracle RAC SESSION可漂移服务一样的结果,即:会话被飘到新的节点(用户不需要重新连接),但是事务并没有replay,用户需要重新做这个事务。那么Oracle RAC SESSION可漂移服务到底怎么测试呢?它是通过逻辑事务ID,在出现数据库连接故障时,来识别数据库返回给应用的信息有没有到达,还有保证replay过程中不会发生重复提交。因此测试时需要应用端的配合,不是简单地在sqlplus中做个事务能够测试的。
    测试过程必需使用UCP连接池并import oracle.ucp.admin.UniversalConnectionPoolManagerImpl来连接数据库。

4. 使用oracle.ucp.admin.UniversalConnectionPoolManagerImpl来进行成功测试Oracle RAC TRANSACTION可飘移服务

    在数据库中创建一个表来做实验:


    接着写一个java文件:Worker.java来模拟一个serverlet,完整的Worker.java供下载:

package acdemo;

import javax.sql.DataSource;
import java.sql.*;
import oracle.ucp.jdbc.ValidConnection;
import oracle.ucp.jdbc.PoolDataSource;
import oracle.ucp.jdbc.oracle.OracleJDBCConnectionPoolStatistics;
import java.util.Random;
import java.text.SimpleDateFormat;
import java.text.DateFormat;
import java.util.Calendar;
/**
 * Worker thread that runs the workload in a loop. This simulates a servlet
 * execution that accesses the database.
 *
 * In a loop the threads will:
 *  1. Grab a connection from the pool
 *  2. Execute SQL
 *  3. Release the connection from the pool
 *  4. Sleep to simulate time used for the business execution
 * 
 * @author Jean de Lavarene
 * @Contributing author Kuassi Mensah
......
    {
       PreparedStatement pstmt = c.prepareStatement("insert into emp(empno,ename,sal) values(?,?,?)");
       int empno = (int)System.nanoTime()%9999;
       pstmt.setInt(1,empno);
       pstmt.setString(2,"Bob"+empno);
       pstmt.setInt(3,8000);
       try{
            pstmt.executeUpdate();
            if (ACDemo.VERBOSE) { System.out.println("Adding row to emp"); }

        } catch(SQLException insertsqlex)
        {
         if (insertsqlex instanceof SQLIntegrityConstraintViolationException) {
            if (insertsqlex.getMessage().startsWith("ORA-00001: unique constraint (HR.PK_EMP) violated")){
              System.out.println("in catch block for constraint violation\n");
              empno++;
            }
         }
         else throw insertsqlex;
       }
      pstmt.close();
    }
     // c.rollback();
     c.commit();
  }
 
  public void run() {
    long counter = 0;
     boolean retry = false;
    while(true) {
      Connection c = null;
     DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd HH:MM:SS");
     Calendar cal = Calendar.getInstance();
    
      long nanoTimeStart=0,timeSpentOnDb = 0;
      try {
        // Step 1: Get a connection from UCP. This corresponds to the beginning
        // of the AC request. The replay driver will now start recording all the
        // JDBC calls.
        c = ds.getConnection();
        // Make sure auto commit if off:
        c.setAutoCommit(false);

        /*
         *  Print connection instance
         */
        /*
          ResultSet rs;
          Statement stmt = c.createStatement();
         
          rs =

           stmt.executeQuery("select '... Connected to '||sys_context('userenv','instance_name') from dual");
 
          while (rs.next()) {
              // Only display for VERBOSE operation
              if (ACDemo.VERBOSE) { System.out.println( dateFormat.format(cal.getTime()) + "  " + rs.getString(1)); }
          }
          rs.close();
          stmt.close();
          */
           if (retry) {
              System.out.println(" Application driven connection retry succeeded");
              retry = false;
           }

        nanoTimeStart = System.nanoTime();
       
        // Step 2: run the workload
        if (ACDemo.VERBOSE) { System.out.println("Executing databaseWorkload()"); }
        databaseWorkload(c);
       
      } catch (SQLException ea) {
        // Application developers have to write code to recover
        // from database errors. With AC they would still have to do so but
        // it'll only be exercised if  AC wasn't able to do its magic.       
        //
        // Fast Connection Failover
        try {
         if (c == null ||!((ValidConnection)c).isValid()){
          ea.printStackTrace();
          System.out.println("Application error handling: attempting to get a new connection "+ea.getMessage()+".");
           c.close();
           String fcfInfo = ((OracleJDBCConnectionPoolStatistics) ds.getStatistics()).getFCFProcessingInfoProcessedOnly();
           System.out.println("FCF information: " + fcfInfo);
            retry = true;
         } else {
          System.out.println("unknown exception: " + ea);
          }
        }catch (SQLException ea1) {}
        //
        synchronized (ACDemo.statsLock) {
          ACDemo.nbOfExceptions++;
          if(ACDemo.applicationCrashOnErrors && ACDemo.nbOfExceptions > 20)
          {
            // I'm a very poorly written application and I will crash after
            // 20 exceptions:
            System.err.println("20 fatal exceptions.");
            System.err.println("");
            System.err.println("*** APPLICATION CRASHED ***");
            System.err.println("");
            System.exit(1);
          }
        }
        if(ACDemo.VERBOSE) {
          ea.printStackTrace();
         
          System.err.println("."+ea.getMessage()+".");
        }
       
      } finally {
        timeSpentOnDb = (System.nanoTime()-nanoTimeStart)/1000000; // in ms
        try {
          if (c != null) {
            // Step 3: release the connection into the pool. This corresponds
            // to the end of the AC request. The AC driver stops recording and
            // can purge the replay queue.
            c.close();
            if (ACDemo.VERBOSE) { System.out.println("Closed connection"); }
          }
        } catch (SQLException ea) {}
      }

      // don't update the state for the first run:
      if(counter > 0) {
        synchronized (ACDemo.statsLock) {
          ACDemo.operationsCompleted++;
          // ACDemo.timeSpentOnGetConnection += timeSpentOnGetConnection;
          ACDemo.timeSpentOnDb += timeSpentOnDb;
        }
      }
      // Step 4: sleep to simulate the business processing time
      if (ACDemo.threadThinkTime > 0) {
        // Introduce delay between requests for processing webpages
        long timeToSleep = ACDemo.threadThinkTime +
          random.nextInt((ACDemo.threadThinkTime<10)?10:ACDemo.threadThinkTime/10);
        try {
          Thread.sleep(timeToSleep);
        } catch (Exception ea) {}
      }
      counter++;
    }
  }
}

      调用Worker.java的ACDemo.java供下载:(包括PrintACStatsThread.java   PrintStatThread.java),以下是ACDemo.java的全部代码:

package acdemo;
import java.io.FileInputStream;
import java.io.IOException;
import java.sql.SQLException;
import java.util.Properties;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import javax.sql.DataSource;

import oracle.ucp.admin.UniversalConnectionPoolManagerImpl;
import oracle.ucp.jdbc.PoolDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;
import oracle.ucp.admin.UniversalConnectionPoolManager;
import oracle.ucp.UniversalConnectionPoolAdapter;
import oracle.ucp.UniversalConnectionPoolException;
import oracle.ucp.UniversalConnectionPoolStatistics;

/**
 * This demo of Application Continuity uses the Universal Connection Pool (UCP) and
 * the JDBC thin driver to simulate a web workload with servlets concurrently
 * requesting connections from the pool to execute SQL on the database.
 * In order to show the effect of application continuity the database sessions need to be
 * killed manually. Without AC this demo will crash. With AC only a small response time
 * increase should be seen.
 *
 * This demo is packaged in a jar to facilitate its execution.
 * See runreplay and runnoreplay script files.
 * Beforehand,  you need to change the configuration parameters in "acdemo.properties".
 *
 * Note that the demo will run forever and needs to be killed.
 *
 * @author Jean de Lavarene
 * @contributing author Kuassi Mensah
 * @contributing author Troy Anthony
 */
public class ACDemo extends Thread {

  // Number of concurrent threads running in the application
  // UCP is tuned to have MAX and MIN limit set to this

  // How often should the thread print statistics.   Time in milliseconds
  static final int DELAY_BETWEEN_PRINTING_STATS = 5 * 1000;
  static boolean VERBOSE = true;
  private static String PROP_FILE="acdemo.properties";
  static int connectionWaitTimeout = 3; // seconds
 
  static int nbOfThreads = 0;
  static int ucpPoolSize = 0;
  static int threadThinkTime = 0;
  // Do not require validateConnectionOnBorrow if patch 31112088 applied
  static boolean validateConnectionOnBorrow = false;

  static boolean applicationCrashOnErrors = true;
  static boolean fastConnectionFailover = false;

  static boolean cpuIntensive = false;

  static final Object statsLock = new Object();
  static int operationsCompleted = 0;
 
  // timeSpentOnDb should not include wait time
  static long timeSpentOnDb = 0;
 
  static int nbOfExceptions = 0;


  static public void main(String args[])
    throws SQLException {
    Connection conn = null;
    PreparedStatement pstmt = null;
    ResultSet rs = null;

    if(args.length > 0) {
      PROP_FILE = args[0];
    }
    try {
      Properties prop = new Properties();
      try {
        prop.load(new FileInputStream(PROP_FILE));
      } catch (IOException e) {e.printStackTrace();}
     
      nbOfThreads = Integer.parseInt(prop.getProperty("number_of_threads"));
      ucpPoolSize = Integer.parseInt(prop.getProperty("ucp_pool_size"));
      threadThinkTime = Integer.parseInt(prop.getProperty("thread_think_time","20"));
      VERBOSE = Boolean.parseBoolean(prop.getProperty("verbose","false"));
      applicationCrashOnErrors = Boolean.parseBoolean(prop.getProperty("application_crash_on_errors","true"));
      fastConnectionFailover = Boolean.parseBoolean(prop.getProperty("fastConnectionFailover","false"));
      validateConnectionOnBorrow = Boolean.parseBoolean(prop.getProperty("validateConnectionOnBorrow","false"));
      connectionWaitTimeout = Integer.parseInt(prop.getProperty("connectionWaitTimeout","3"));
      PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
      pds.setConnectionFactoryClassName(prop.getProperty("datasource"));
      // Set DataSource Property
      pds.setUser(prop.getProperty("username","HR"));
      pds.setPassword(prop.getProperty("password","HR"));
      pds.setURL(prop.getProperty("url"));
      pds.setConnectionPoolName(UCP_POOL_NAME);
      pds.setConnectionWaitTimeout(connectionWaitTimeout);
      pds.setFastConnectionFailoverEnabled(fastConnectionFailover);
      pds.setValidateConnectionOnBorrow(validateConnectionOnBorrow);
      pds.setInitialPoolSize(ucpPoolSize);
      pds.setMinPoolSize(ucpPoolSize);
      pds.setMaxPoolSize(ucpPoolSize);
      pds.setConnectionProperties(prop);

      System.out.println("######################################################");
      System.out.println("Connecting to " + prop.getProperty("url"));
      System.out.println(" # of Threads             : " + nbOfThreads);
      System.out.println(" UCP pool size            : " + ucpPoolSize);
      System.out.println("FCF Enabled:  " + pds.getFastConnectionFailoverEnabled());
      System.out.println("VCoB Enabled: " + pds.getValidateConnectionOnBorrow());
      System.out.println("ONS Configuration:  " + pds.getONSConfiguration());
      System.out.println("Enable Intensive Wload:  " + cpuIntensive);
      System.out.format("Thread think time        : %d ms\n",
        threadThinkTime);
      System.out.println("######################################################");
      System.out.println("");
     
      // Start the connection pool with the PoolManager:
      UniversalConnectionPoolManager poolManager =
        UniversalConnectionPoolManagerImpl.getUniversalConnectionPoolManager();
      poolManager.createConnectionPool((UniversalConnectionPoolAdapter)pds);
      System.out.println("Starting the pool now... (please wait)");
      long start = System.currentTimeMillis();
      poolManager.startConnectionPool(UCP_POOL_NAME);
      long end = System.currentTimeMillis();
      System.out.println("Pool is started in "+(end-start)+"ms");
      ACDemo u = new ACDemo();
      u.runDemo(pds);
    } catch (SQLException sqlea) {
      do{
        sqlea.printStackTrace();
        sqlea = sqlea.getNextException();
      }
      while(sqlea != null);
    }
    catch (Exception ea) {
      System.out.println("Error during execution: " + ea);
      ea.printStackTrace();
    } finally {
         if (rs != null) rs.close();
         if (pstmt != null) pstmt.close();
         if (conn != null) conn.close();
    }

  }

  /**
   * Start the worker threads:
   */
  private void runDemo(PoolDataSource pds)
    throws Exception {

    Thread[] t = new Thread[nbOfThreads];

    for (int i = 0; i < nbOfThreads; ++i) {
      t[i] = new Thread(new Worker(pds));
      t[i].start();
    }

    /*  Stats thread - displays some UCP statistics */
    Thread stat = new PrintStatThread();
    stat.start();
    /*  AC Stats thread - displays AC client statistics
         Note: The acchk utility is database resident in 19.12 and provides view-based access to the Application Continuity statistics.
         The acchk utility is available from within Orachk prior to Oracle Database 19c
    */
    //Thread acStat = new PrintACStatThread(pds);
    //acStat.start();
   
    // Wait for all threads to be done:
    for (int i = 0; i < nbOfThreads; ++i) {
      t[i].join();
    }
    needToPrintStats = false;
    stat.interrupt();
    //acStat.interrupt();

  }

  static boolean needToPrintStats = true;
  static String UCP_POOL_NAME="actest";
}


    下面是调用ACDemo.java的,能够进行Transaction Replay的shell程序


    其中的ac_replay.properties

# Stub file to create ac_replay.properties
# Use replay datasource
datasource=oracle.jdbc.replay.OracleDataSourceImpl

# Set verbose mode
VERBOSE=FALSE

# database JDBC URL
url=jdbc:oracle:thin:@(DESCRIPTION=(CONNECT_TIMEOUT=90)(RETRY_COUNT=50)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=3)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=cluster11-scan.example.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=serv3)))

# database username and password:
username=hr
password=oracle_4U

# Enable FAN
fastConnectionFailover=TRUE

#Disable connection tests
validateConnectionOnBorrow=TRUE

# number of connections in the UCP's pool:
ucp_pool_size=20

#Connection Wait Timeout for busy pool
connectionWaitTimeout=5

# number of active threads (this simulates concurrent load):
number_of_threads=10

# think time is how much time the threads will sleep before looping:
thread_think_time=50

    它运行时,我们kill掉一个实例(UCP会连接serv3的两个实例),不会报错。查看hr.emp表会发现,其中的行数还在不断增加。

5. 使用oracle.ucp.admin.UniversalConnectionPoolManager来进行失败对比测试Oracle RAC TRANSACTION可飘移服务

   下面是 调用ACDemo.java的,不能够进行Transaction Replay的shell程序


    其中的ac_noreplay.properties

#Stub file to build ac_noreplay.properties
# Use vanilla datasource
datasource=oracle.jdbc.pool.OracleDataSource

# Set verbose mode
VERBOSE=FALSE

# database JDBC URL
url=jdbc:oracle:thin:@(DESCRIPTION=(CONNECT_TIMEOUT=90)(RETRY_COUNT=50)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=3)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=cluster11-scan.example.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=serv3)))

# database username and password
username=hr
password=oracle_4U

# Disable FAN
fastConnectionFailover=FALSE

#Disable connection tests
validateConnectionOnBorrow=FALSE

# number of connections in the UCP's pool
ucp_pool_size=20

#Connection Wait Timeout for busy pool
connectionWaitTimeout=5

# number of active threads (this simulates concurrent load)
number_of_threads=10

# think time is how much time the threads will sleep before looping
thread_think_time=50
~                                   

    它运行时,我们kill掉一个实例(UCP会连接serv3的两个实例),会报错。查看hr.emp表会发现,其中的行数停止增加。

















路过

雷人

握手

鲜花

鸡蛋

QQ|手机版|Bo's Oracle Station   

GMT+8, 2022-5-9 21:00 , Processed in 0.024786 second(s), 21 queries .

返回顶部