Tuesday, December 9, 2008

What can I kill?

Fairly often, I have to kill Oracle background processes. At first this probably seems crazy. Do I want to destroy my database?

No, it is SOP. Archivers (ARCn) are my favorite targets. You can solve all kinds of problems by killing the archivers and letting them be automatically restarted. Often issues with redo transport services (Streams, CDC, Logical Data Guard) can be solved by forcing the archivers to restart.

Another time you might want to kill a background process is if it is leaking memory.

Unfortunately, there are many processes in Oracle that cannot be killed. Off the top of your head you might mention PMON, SMON and LGWR. Is there anyplace in the documentation that breaks down which you can kill without terminating the instance? No. So I experimented on 11g with these results:

Killable:
ARCn: Redo log archivers
CJQn: Job scheduler coordinator
Dnnn: Dispatchers
DIA0: Diagnosibility process 0
DIAG: Diagnosibility coordinator
FDBA: Flashback data archiver process
Jnnn: Job scheduler processes
MMNL: Manageability Monitor Process 2
MMON: Manageability Monitor Process
PING: Interconnect latency measurement
Qnnn: Queue cleanup processes
QMNC: Queue coordinator
RECO: Distributed recovery process
Snnn: Shared servers
SMCO: Space management coordinator
Wnnn: Space management processes

Instance-Critical:
ACMS: Atomic controlfile to memory server
CKPT: Checkpoint
DBRM: Resource manager process
DBWn: Database writer processes
LGWR: Redo log writer
LMDn: Global enqueue service daemons
LMON: Global enqueue service monitor
MMAN: Memory manager
PMON: Process monitor
PSPn: Process spawners
RMSn: RAC management server
RVWR: Recovery writer
SMON: System monitor process
VKTM: Virtual keeper of time process

Monday, December 8, 2008

Keeping CDC from falling apart catastrophically

CDC, especially the log-based implementations such as "asynchronous autolog" can be a really nice effective way to maintain a DSS system with low impact on a production OLTP data source.

One problem with CDC is that it uses Streams, specifically the logminer capture and apply components, which can be challenging to make work successfully. A constant danger is the CDC staging database encounterin a problem that makes it unable to continue processing changes. In development, we find ourselves rebuilding the change source and sets frequently to overcome bugs, mistakes and mysterious failures.

In production, however, rebuilding the CDC environment requires rebuilding the entire DSS reporting system or data warehouse. Each change from the production source must be captured by CDC or there will be data loss. Rebuilding CDC implies losing transactions.

As it turns out, everything hinges on the availability fo a recent logminer data dictionary build in the redo logs. With CDC, you can reinstantiate the CDC environment as of any point in time, as long as you have a source archivelog with a dictionary build, and all subsequent logs to the current pont in time. Thus my #1 recommendation for people using async autolog CDC:

Generate a data dictionary into the source redologs daily using a script similar to this:



SQL> variable dicscn number
SQL> exec dbms_capture_adm.build(:dicscn)
SQL> column discn format 9999999999999999999999
SQL> print dicscn

(Record this SCN)


If you ever need to reinstantiate the CDC environment, re-create the change source as of the SCN of the last available dictionary build:


SQL> exec dbms_cdc_publish.create_autolog_change_source( -
change_source_name=>'MYSOURCE', -
description=>'My OLTP Source', -
source_database=>'MYDB', -
first_scn=>'', -
online_log=>'y')


Then ceate a change set that begins processing changes as of the time that the previous failed CDC staging system failed:


SQL> exec dbms_cdc_publish.create_change_set( -
change_set_name=>'MYSET', -
description=>'My Change Set', -
change_source_name=>'MYSOURCE', -
begin_date=>to_date('2008-12-09 18:00:00','yyyy-mm-dd hh24:mi:ss'), -
stop_on_ddl=>'y')


Finally create your change tables and subscriptions as usual. The changes in the change tables will beg as of the date specified in create_change_set.