Thanks for checking, Torsten.
The issues started when we noticed that logging in via dbmcli was not possible; all login types were rejected.
The SAP system had been shut down (gracefully), which probably explains why there are no client connections.
The unusual entries in KnlMsg seem to start with a lot of connect/connection released messages like this:
ask | 108 2013-09-24 15:57:31 | CONNECT | 12633: Connect req. (DSL, T108, connection obj. 0x8013a49b0, Node:'uan112-se', PID: 15922) |
Thread | 0x23CE Task | 108 2013-09-24 15:57:31 | CONNECT | 12651: Connection released (DSL, T108, connection obj. 8013a49b0) |
Thread | 0x23CE Task | 108 2013-09-24 15:58:36 | CONNECT | 12633: Connect req. (DSL, T108, connection obj. 0x8013a49b0, Node:'uan112-se', PID: 15922) |
Thread | 0x23CE Task | 108 2013-09-24 15:58:36 | CONNECT | 12651: Connection released (DSL, T108, connection obj. 8013a49b0) |
The failed logins look like this:
ask | 168 2013-09-24 16:09:09 | RTESec | 2: User control attempts to connect |
| 2013-09-24 16:09:09 | RTESec | 0: Authentication rejected |
| 2013-09-24 16:09:09 | RTESec | 0: Authentication method: SCRAMMD5V1 |
| 2013-09-24 16:09:09 | RTESec | 0: Authentication rejected |
| 2013-09-24 16:09:09 | RTESec | 0: Authentication method: SCRAMMD5 |
We did restart the x_server a number of times.
Then an error occurs with the watchdog:
ask | - 2013-09-24 17:38:13 ERR RTEKernel | 125: The watchdog process is no longer alive,_FILE=RTEKernel_StartupUnix+noPIC.cpp,_LINE=408 |
| Contact your system administrator. Show him the error message which points to an operating system |
and it looks like the DB then tries an emergency shutdown:
Thread | 0x2391 Task | - 2013-09-24 17:56:21 | RTEKernel | 114: Caught STOP signal |
Thread | 0x2390 Task | - 2013-09-24 17:56:21 | RTE | 20225: Database tries automatic shutdown |
Thread | 0x7A5E Task | - 2013-09-24 17:56:21 ERR RTE | 20126: Database automatic shutdown failed,_FILE=RTE_ExternalCall+noPIC.cpp,_LINE=937 |
Thread | 0x2390 Task | - 2013-09-24 17:56:21 WNG RTEKernel | 121: Kernel is being stopped in ONLINE state |
Thread | 0x2390 Task | - 2013-09-24 17:56:22 | RTEKernel | 61: rtedump written to file 'rtedump' |
Thread | 0x2390 Task | - 2013-09-24 17:56:22 | RunTime | 3: State changed from ONLINE to KILL |
Thread | 0x2390 Task | - 2013-09-24 17:56:22 | RTEKernel | 111: Tracewriter resumed |
Thread | 0x2390 Task | - 2013-09-24 17:56:22 | RTEKernel | 94: Waiting for tracewriter to finish work |
Thread | 0x23C7 Task | 3 2013-09-24 17:56:22 | Trace | 20000: Start flush kernel trace |
Thread | 0x2390 Task | - 2013-09-24 17:56:22 | RTEKernel | 116: Tracewriter termination timeout: 60 seconds |
Thread | 0x23C7 Task | 3 2013-09-24 17:56:22 | Trace | 20001: Stop flush kernel trace |
Thread | 0x23C7 Task | 3 2013-09-24 17:56:22 | Trace | 20002: Start flush kernel dump |
Thread | 0x23C7 Task | 3 2013-09-24 17:56:24 | Trace | 20003: Stop flush kernel dump |
Thread | 0x23AE Task | - 2013-09-24 17:56:33 ERR RTEKernel | 125: The watchdog process is no longer alive,_FILE=RTEKernel_StartupUnix+noPIC.cpp,_LINE=408 |
| Contact your system administrator. Show him the error message which points to an operating system configuration error and then contact the database support if your system administrator can not fix the error. |
Thread | 0x23C7 Task | 3 2013-09-24 17:56:39 | RTEKernel | 110: Releasing tracewriter |
Thread | 0x2390 Task | - 2013-09-24 17:56:39 | TENANT | 13008: Requestor for tenant database DSL has stopped |
Thread | 0x2390 Task | - 2013-09-24 17:56:39 | RTEThread | 13: The thread LegacyRequestor is finished |
Thread | 0x23B0 Task | - 2013-09-24 17:56:39 | RTE | 20214: CONSOLE thread stopped |
Thread | 0x2390 Task | - 2013-09-24 17:56:39 | RTEKernel | 58: Backup of diagnostic files will be forced at next restart |
Thread | 0x2390 Task | - 2013-09-24 17:56:39 | RTEKernel | 118: SERVERDB DSL has stopped |
| 2013-09-24 17:56:39 | RTEKernel | 14: Kernel version: Kernel | 7.8.02 Build 036-121-248-298 |
Thread | 0x2390 Task | - 2013-09-24 17:56:39 | RunTime | 3: State changed from KILL to STOPPED |
Thread | 0x2390 Task | - 2013-09-24 17:56:39 | RTEThread | 13: The thread Requestor is finished |
Thread | 0x2390 Task | - 2013-09-24 17:56:39 | TENANT | 13005: Tenant database DSL has stopped |
Thread | 0x2390 Task | - 2013-09-24 17:56:40 | RTEKernel | 119: Kernel aborts |
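To pull only the errors and warnings out of a long excerpt like the one above, something along these lines might help. It is only a rough sketch: the line layout is assumed from the pasted lines alone (timestamp, optional ERR/WNG, component, then "id: text"), and the names `parse_line` and `problem_entries` are made up for illustration, not part of any MaxDB tool.

```python
import re
from typing import Iterable, List, Optional

# Assumed layout, inferred only from the pasted excerpts:
#   ... <date> <time> [ERR|WNG] <component> | <id>: <text> |
ENTRY = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"  # timestamp
    r"\s+(?:(ERR|WNG)\s+|\|\s*)?"              # level, or an empty level column
    r"(\w+)\s*\|\s*"                           # component, e.g. RTEKernel
    r"(\d+):\s*(.*?)\s*\|?\s*$"                # message id and message text
)


def parse_line(line: str) -> Optional[dict]:
    """Return the fields of one message line, or None if it does not match."""
    match = ENTRY.search(line)
    if match is None:
        return None  # e.g. continuation lines without a timestamp
    timestamp, level, component, msg_id, text = match.groups()
    return {
        "timestamp": timestamp,
        "level": level,  # None for plain informational entries
        "component": component,
        "msg_id": int(msg_id),
        "text": text,
    }


def problem_entries(lines: Iterable[str]) -> List[dict]:
    """Keep only the entries marked ERR or WNG."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] in ("ERR", "WNG")]
```

Feeding it the lines of the message file one by one and printing `timestamp`, `component` and `text` for each result gives a short list of the ERR/WNG entries in chronological order.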
The last entry in the KnlMsg file was written several hours before the x_cons suspends were noticed.
The running dbmsrv processes are:
uan112:sqddsl 1002> ps -ef |grep dbmsrv|grep DSL
sdb 5652 5651 0 Sep24 ? 00:00:05 /sapdb/DSL/db/pgm/dbmsrv -sdbstarter 3600 3600 A -P 0000000300000007000000080000000B
sdb 11302 1 94 Sep24 ? 13:45:32 /sapdb/DSL/db/pgm/dbmsrv -sdbstarter 3600 3600 A -P 0000000300000007000000080000000B
sdb 13644 1 4 07:37 ? 00:00:00 /sapdb/DSL/db/pgm/dbmsrv -P 0000000b0000000e0000000f00000012
sdb 13873 1 4 07:37 ? 00:00:00 /sapdb/DSL/db/pgm/dbmsrv -P 0000000b0000000e0000000f00000012
sdb 13978 1 4 07:37 ? 00:00:00 /sapdb/DSL/db/pgm/dbmsrv -P 0000000b0000000e0000000f00000012
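For picking out the suspicious process from a listing like this, a small sketch could look as follows. It assumes the usual Linux `ps -ef` column order (UID, PID, PPID, C, STIME, TTY, TIME, CMD) and a start-time column without embedded spaces, as in the output above. The function names and the one-hour CPU threshold are arbitrary choices for illustration. A parent PID of 1 only means the process was re-parented to init, which is a hint rather than proof that it is stuck.

```python
from typing import Iterable, List, Optional


def cpu_seconds(value: str) -> int:
    """Convert a ps TIME value such as '13:45:32' or '1-02:03:04' to seconds."""
    days, _, clock = value.rpartition("-")
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return int(days or 0) * 86400 + seconds


def parse_ps_line(line: str) -> Optional[dict]:
    """Split one 'ps -ef' line into fields; return None for headers or junk."""
    fields = line.split(None, 7)
    if len(fields) < 8 or not fields[1].isdigit():
        return None
    return {
        "user": fields[0],
        "pid": int(fields[1]),
        "ppid": int(fields[2]),
        "cpu_seconds": cpu_seconds(fields[6]),
        "cmd": fields[7],
    }


def long_running_orphans(lines: Iterable[str], min_cpu_seconds: int = 3600) -> List[dict]:
    """Processes whose parent is PID 1 and that have used a lot of CPU time."""
    result = []
    for line in lines:
        proc = parse_ps_line(line)
        if proc and proc["ppid"] == 1 and proc["cpu_seconds"] >= min_cpu_seconds:
            result.append(proc)
    return result
```

Applied to the listing above, this singles out PID 11302, the only dbmsrv with parent PID 1 and more than an hour of accumulated CPU time.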
As the DB seems to be down, I might kill them later and try to start the DB from scratch.