SHAREDPLEX 8.6 out of sync for queue xxx

复制架构:

source:oracle 11g rac
target:teradata(数仓)

从event log里报错信息可以看出,s和t端的数据不一致,到底是因为什么不一致,需要查一下信息,可通过t端的/vardir/log下的相关queue的*errlog.sql及数据库端来核实具体原因。
splex evet log里的报错信息如下:
Error 2016-04-28 13:13:03.633404 42512 1259554560 Table “TRSEN_DATA”.”PAY_TEST” out of sync for queue p1 at source rowid AAAcDnADTAAAOFvAAD: — out of sync (posting from jxcard, queue p1, to NBAT_TIM0)
Error 2016-04-28 13:13:03.045695 42512 1259554560 Table “TRSEN_DATA”.”PAY_TEST” out of sync for queue p1 at source rowid AAAcDnADTAAAOEbAAL: — out of sync (posting from jxcard, queue p1, to NBAT_TIM0)

产生原因:
因为t端丢失了s端部分数据,丢失数据原因是因为利用qview工具做事务的跳过动作导致。

解决方案:
因为是做异构环境的同步,无法利用repair和copy动作去处理,只能利用新queu来初始化不同步的数据,具体操作如下:
1、找出s端和t端不同步的数据,具体表具体对待
2、sourc:创建表结构,没有数据
3、target:将表中此时间段的数据进行delete
4、target:stop post(原队列)
5、source:edit config文件配置另一个队列中,激活config文件
6、source:insert对应时段的数据
7、target:等新insert的数据结束后,start post(原队列)
8、source:激活原来的config文件(activate config xxx nolock;)

针对上述的操作注意事项:
6操作步骤中,注意通过splex的post queue来核实insert的数据有没有全部同步到t端,不要随意去select target端的目标表
7操作步骤中,必须等待数据同步过来再操作
8操作步骤中,必须等待7步骤中start post queue里的backlog数据全部无积压,才能做activate动作,否则会导致原queue因错误无法启动(一句话很麻烦)

Config 文件配置信息:
sp_ctrl (splxvip:2100)> view config teradb3
datasource:o.jxcard
TRSEN.PAY_TEST TRSEN_DATA.PAY_TEST SPLXSCAN:p1@r.NBAT_TIM0
THIG_OWNER.TB_CUST TRSEN_DATA.TB_CUST SPLXSCAN:p10@r.NBAT_TIM0
splex.PAY_TEST TRSEN_DATA.PAY_TEST SPLXSCAN:p10@r.NBAT_TIM0
注意我们是在splex用户进行问题表的部分数据同步,且队列不再原队列上

SHAREPLEX: ORA-01400

sp_ctrl (openstack02:2100)> show log reverse
Error 2015-05-15 09:34:58.733430 46266 784828160 s:2 Poster: Unexpected Oracle error: ORA-01400: cannot insert NULL into (“YTCRM”.”AO_AUTOMATIC_BASE”.”USER_CODE”). (object name: “YTCRM”.”AO_AUTOMATIC_BASE”) (posting from ifdb, queue express, to intwebdb) [module opo]
Notice 2015-05-15 09:34:56.024503 46266 889710336 Poster: SQL Cache disabled. (posting from ifdb, queue express, to intwebdb) [module opo]
Info 2015-05-15 09:34:54.785300 46266 4213745408 Poster launched, pid = 46266 (posting from ifdb, queue express, to intwebdb)
Notice 2015-05-15 09:34:54.783704 46262 451921664 User command: oracle start post (from openstack02)

Compare errors with following event_log errors:
01/22/07 19:38 Internal error: Error -1 in de_compare_client see sebprod_declt-owner-table-pid-*.log [sp_declt(deq)/pid]
01/22/07 18:17 Error: Oracle Error 1400: ORA-01400: cannot insert NULL into
(“owner”.”table”.”col”)

Cause
This error is thrown if an attempt is made to insert a null into a column that has not null constraint.

RESOLUTION:
Check deferrable primary key constraints: when the constraint is set immediate causing the error或者修改报错列让其能接受其null值(需要和app商量)

sp_ctrl (openstack02:2100)> status

Brief Status for openstack02
Process State PID Running Since
————— —————————— ——– ——————–
Cop Running 14108 23-Apr-15 14:51:19
MTPost Running 46798 15-May-15 09:47:19
Import Running 37315 08-May-15 17:12:20
Cmd & Ctrl Running 46262 15-May-15 09:34:31
There are no active configuration files

sp_ctrl (openstack02:2100)> qstatus

Queues Statistics for openstack02
Name: express (o.ifdb-o.intwebdb) (MTPost queue)
Number of messages: 13582373 (Age 5793 min; Size 6474 mb)
Backlog (messages): 13582094 (Age 5757 min)

sp_ctrl (openstack02:2100)> /

Queues Statistics for openstack02
Name: express (o.ifdb-o.intwebdb) (MTPost queue)
Number of messages: 13581655 (Age 5793 min; Size 6474 mb)
Backlog (messages): 13581330 (Age 5757 min)

splex多链路配置

一、To establish the variable-data directories
1、Install SharePlex in the normal manner according to the instructions in the SharePlex
Installation and Demonstration Guide. SharePlex uses the default port of 2100.
These instructions assume, as an example, that the SharePlex product directory (containing
the binaries) is installed in a directory named /splex/proddir, and that the
SharePlex variable-data directory is installed in a directory named /splex/vardir.

Note:Do not start SharePlex until directed to do so. The following steps create a variable-
data directory for each sp_cop. In this example, two directories are created, one
using port 2101 and one using port 2200. The default port of 2100 is used for the
original SharePlex installation.

2、Tar the original /splex/vardir and its subdirectories. The following commands put
the resulting tar file in /splex (one directory up). Pay close attention to the dots in the
tar command.
$ cd /splex/vardir
$ tar cvf ../splex.tar .

3、Make new variable-data directories, each named for the port number (or another
unique identifier) to be used by the associated sp_cop instance.
$ mkdir /splex/vardir/splex2101 (for the first instance of sp_cop)
$ mkdir /splex/vardir/splex2200 (for the second instance of sp_cop)

4、Change directories to the variable-data directory made for port number 2101.
$ cd /splex/vardir/splex2101

5、Enter the original variable-data directory structure into this directory.
$ tar xvf /splex/splex.tar

6、Repeat steps 4 and 5 for each additional variable-data directory. (In the example,
there is only one additional directory.)
$ cd /splex/vardir/splex2200
$ tar xvf /splex/splex.tar

7、Set the port number for each variable-data directory that you created by setting the
TCP and UDP port parameters in the SharePlex paramdb. There is a paramdb file in
each of the variable-data directories you created. See page 241 for instructions on
setting the port numbers.

8、[PERFORM THIS STEP ONLY IF THERE IS AN ACTIVE CONFIGURATION.] Log on as a SharePlex
Administrator and run the clean_vardir.sh script for each variable-data directory
that you created. The script removes duplicated replication queues and restores each
one to a fresh state. Instructions for running clean_vardir.sh are on page 413.

9、In the rim sub-directory of each variable-data directory, delete the shstinfo.ipc and
shmaddr.loc files.==============>此步比较重要,如果不清理,可能无法复制数据
Note: The shstinfo.ipc and shmaddr.loc files may not exist if sp_cop has never been
started against this variable-data directory.

二、To establish the Oracle users
Create an Oracle user account for each variable-data directory for which there will be an
instance of sp_cop. The account enables the Post process to log into Oracle and post
data. To create the accounts, run the SharePlex ora_setup program for each sp_cop
instance. To identify an account, use its port number, for example splex2101. See
page 387 for instructions on running ora_setup.

三、To start and control the sp_cop instances
To run multiple sp_cop sessions, start a session of sp_cop and sp_ctrl for each variable-
data directory. Follow these steps.

1 Export the SP_SYS_VARDIR environment variable to point to the first variable-data
directory.
$ export SP_SYS_VARDIR=/splex/vardir/splex2101 (ksh shell)
Or…
$ setenv SP_SYS_VARDIR /splex/vardir/splex2101 (csh shell)

2 Start sp_cop by typing the absolute path name and the –uportnumber argument,
where portnumber is the port number assigned to the first variable-data directory’s
sp_cop instance. This points SharePlex to the correct port and allows you to see that
sp_cop instance when you use the ps -ef | grep sp_ command. Use the & argument
to run sp_cop in the background.
$ /splex/proddir/bin/sp_cop -u2101 &

3 To run sp_ctrl for that instance of sp_cop, start sp_ctrl, then issue the port command.
$ ./sp_ctrl
sp_ctrl(sysA)> port 2101

4 Repeat the preceding steps for each variable-data directory.
Example syntax for a second variable-data directory
$ export SP_SYS_VARDIR=/splex/vardir/splex2200 (ksh shell)
Or…
$ setenv SP_SYS_VARDIR /splex/vardir/splex2200 (csh shell)
$ /splex/proddir/bin/sp_cop -u2200 &
$ ./sp_ctrl
sp_ctrl(sysA)> port 2200
Note: If you receive an error message similar to this:
Error cleaning up previous shared memory segment ###.
Cannot delete because there are users attached.
Check if SharePlex processes are running and kill them if necessary.
Find out if someone else started a session of sp_cop using the same port number and
variable-data directory. Kill the processes associated with that session, then start
sp_cop again.

以下操作都需要以Shareplex的管理用户(即当时安装Shareplex时指定的系统用户,一般来说是oracle)来执行。
3. 修改source&&target端paramdb
$ cd /data/shareplex /vardir/2500/data
$ vi paramdb 编辑使其内容如下:
SP_COP_TPORT 2200 #端口号更改
SP_COP_UPORT 2200 #端口号更改
SP_SYS_LIC_1076  ” E1BVM2JNHA47DRRRLWRYQSN822B7XXS5BA: SHANGHAI YTO EXPRESS CO LTD ”

至关重要:设置环境export SP_SYS_VARDIR=/quest/splex/vardir/2200 ,否则会清空现有链路信息,到时候就死定了