Out of Memory Killer kills Oracle RMAN session with error RMAN-03004 RMAN-10038


When an RMAN backup to disk is run with 2 channels, the backup appears to consume all of the system memory. After a while, the out-of-memory (OOM) killer kills one of the RMAN sessions with the following error stack:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00601: fatal error in recovery manager
RMAN-03004: fatal error during execution of command
RMAN-10038: database session for channel ORA_DISK_2 terminated unexpectedly



OOM Killer Strikes When There Is No Swap Usage
Linux generally does not start to use the swap area (due to lazy
swapping) until the total free memory is very low. So if the OOM killer kills
some processes while the system is not swapping, it means that pages are still
available in the HighMem zone but the LowMem zone (on a 32-bit kernel) is exhausted.
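
One way to confirm this condition is to compare the LowFree, HighFree and SwapFree figures in /proc/meminfo, which 32-bit kernels built with HighMem support report. The following is a minimal sketch, not a definitive check: the function names and the 16 MB LowFree threshold are assumptions chosen for illustration, not values from Oracle or the kernel documentation.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value (kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def lowmem_pressure(fields, low_free_threshold_kb=16384):
    """Return True if LowFree is small while swap is almost untouched.

    low_free_threshold_kb is a hypothetical threshold for illustration only.
    """
    low_free = fields.get("LowFree")
    if low_free is None:
        # No LowFree field: the kernel has no LowMem/HighMem split to report.
        return False
    swap_total = fields.get("SwapTotal", 0)
    swap_free = fields.get("SwapFree", 0)
    swap_untouched = swap_total > 0 and swap_free >= 0.95 * swap_total
    return low_free < low_free_threshold_kb and swap_untouched
```

To use it, pass the contents of /proc/meminfo (for example, open("/proc/meminfo").read()) to parse_meminfo and give the result to lowmem_pressure. A True result is consistent with the LowMem shortage described above, but it is only a hint.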

In the messages log file, you would see something like:

kernel: Out of Memory: Killed process NNNN
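
To find these entries quickly, the log can be scanned for the message shown above. This is a minimal sketch: the function name is hypothetical, and the match is case-insensitive on the assumption that capitalization of the message may differ between kernel versions.

```python
import re

# Matches the kernel message quoted above and captures the killed PID.
OOM_PATTERN = re.compile(r"Out of Memory: Killed process (\d+)", re.IGNORECASE)


def find_oom_kills(lines):
    """Return the PIDs of processes reported as killed by the OOM killer."""
    pids = []
    for line in lines:
        match = OOM_PATTERN.search(line)
        if match:
            pids.append(int(match.group(1)))
    return pids
```

It accepts any iterable of lines, so an open file object for the messages log (commonly /var/log/messages, though the location depends on the syslog configuration) can be passed directly.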

The unusual part of this symptom is that the swap area is not being used.

Check whether the kernel parameters are set appropriately. For example, check whether kernel.shmmax is set too high. If it is, lower it to an appropriate value. On 32-bit Oracle Enterprise Linux, for example, set:

kernel.shmmax = 3221225472

If this parameter is set too high, it can lead to inappropriate memory allocation and swap usage.
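
The current setting can be compared with the value given above by reading /proc/sys/kernel/shmmax. This is a minimal sketch: the helper names are hypothetical, and the limit is simply the 32-bit Oracle Enterprise Linux figure from this note (3221225472 bytes, which is 3 GiB), not a general recommendation.

```python
# Value quoted in this note for 32-bit Oracle Enterprise Linux (3 GiB).
RECOMMENDED_SHMMAX = 3221225472


def read_shmmax(path="/proc/sys/kernel/shmmax"):
    """Read the current kernel.shmmax value in bytes from procfs."""
    with open(path) as f:
        return int(f.read().strip())


def shmmax_too_high(current, limit=RECOMMENDED_SHMMAX):
    """Return True if the current shmmax exceeds the given limit."""
    return current > limit
```

For example, shmmax_too_high(read_shmmax()) returns True when the running value is above 3221225472. The value can then be lowered in /etc/sysctl.conf and applied with sysctl -p.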