Hi Maria,
as to whether the CDB cache server really matters, I cannot say. What I do know is that you can delete that cache if needed, so I would say that recovering CDB is (to my knowledge) not necessary - but my knowledge is limited here...
As to the recovery - according to the pictures you added above, the database kernel gets an OS error "wrong file or device name" when trying to open CDB_COM (for details you might want to look into the 'dbm.prt' and 'KnlMsg' log files). It really seems to me that the backup medium you want to recover from is missing here.
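A quick way to double-check the medium definition from dbmcli would be roughly the following (just a sketch - the medium name CDB_COM comes from your error message, and I am quoting the medium_put syntax from memory, so better check 'help medium_put' first):
dbmcli -d <dbname> -u <dbmuser>,<password>
medium_getall
medium_put CDB_COM <correct file or device> FILE DATA
'medium_getall' lists all defined backup media with their file/device names; if the entry for CDB_COM points to a file or device that no longer exists, that would explain the "wrong file or device name" error, and 'medium_put' lets you re-point it.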
Regarding CSADMIN failing, have you looked at the Content Server 'cstrace.txt' log file for any errors? Maybe creating a ticket at SAP would help here - this is always recommended if we might need OS access for the analysis...
Thorsten
Re: Recovering CDB and SBD from MaxDB
Re: Error "-9400 AK Cachedirectory full"
Hello Christophe,
changing the minimum size would not have any influence here, but maybe you should increase the CAT_CACHE_SUPPLY size even more. The size is given in bytes, so 1000000 bytes is just a bit under 1 MB across all sessions. This size adds to the database memory heap, but if needed I would suggest raising it considerably, maybe even to 100 MB or more...
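Checking and changing the value could look roughly like this in dbmcli (a sketch only - depending on your version you might have to use the param_put/param_checkall sequence instead of param_directput, and the change only takes effect after a restart):
dbmcli -d <dbname> -u <dbmuser>,<password>
param_directget CAT_CACHE_SUPPLY
param_directput CAT_CACHE_SUPPLY <new value>
db_restart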
To find out how much CAT_CACHE_SUPPLY is used by a task, just do a "select * from SESSIONS" and look at CatalogCacheUsed (the displayed value is given in KB!). This value is like a 'high water mark', because the catalog cache per session never shrinks - it only grows up to its maximum allowed size. But this is good - just let your application run for a while and then check the cache supply per session (of course, the values are reset after a database restart).
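If you want it a bit more targeted than SELECT *, something like the following should work (I am writing the column names from memory, so fall back to the plain SELECT * if one of them differs in your version):
SELECT SESSIONID, CATALOGCACHEUSED FROM SESSIONS
Run it a few times during normal operation and note the highest values per session - that is the figure to compare against your CAT_CACHE_SUPPLY setting.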
From the database point of view, the maximum number of active connections is indeed shown in the SESSIONS table. If you see a higher number in your JDBC application, this may be a result of JDBC connection pooling being enabled.
In addition, you could run 'x_cons <dbname> show active' to get a list of all active database sessions.
Kind regards,
Thorsten
PS:
the 'diag_pack' command is not known to create any performance issues at the database level - it only converts some pseudo-XML text files to plain text and creates a package of database error log files, which might be of interest for the error diagnosis. But of course, no need to do this right away :-)
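For completeness, the call itself is just a single DBM command (adjust dbname and user; if I remember correctly, the resulting package is written to the database's run directory, but please verify that on your system):
dbmcli -d <dbname> -u <dbmuser>,<password> diag_pack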
Let me know how it turns out with the CatalogCacheUsed values and a possibly increased CAT_CACHE_SUPPLY...
Re: Error "-9400 AK Cachedirectory full"
Hello Thorsten,
Thanks again for your answer. I might be wrong, but I always thought (from the MaxDB documentation) that the value of CAT_CACHE_SUPPLY was in 8 KB pages, so 1'000'000 would mean about 8 GB, which is probably very large now.
I have now generated the log archives and would be glad if you could provide an upload link (we would prefer not to publish these files here on the forum). However, I don't see any info regarding the growth of the catalog cache (I did see that for the data cache) - maybe we have to activate some debug flag for that?
Thanks again for your help,
Christophe
Re: Error "-9400 AK Cachedirectory full"
Hello Christophe,
indeed, CAT_CACHE_SUPPLY is given in 8 KB pages - sorry for the confusion, no idea what made me think it was bytes...
Here is a link for uploading the diagpkg file:
https://mdocs.sap.com/mcm/public/v1/open?shr=2wTgQrXdl2NNZUV5LjwfTIBdW0kssK3O7Rx8R2675eI
What is your current setting for CAT_CACHE_SUPPLY? You mentioned trying 1.000.000 (which would be almost 8 GB) - did it still give you errors with the nearly 8 GB cache?
Thorsten
Re: Error "-9400 AK Cachedirectory full"
Hi Thorsten,
Yes, we still get AK errors even with 1.000.000 (8 GB), and we even got a DB crash today at about 13:29. I uploaded 2 diagpkg files: the one called diagpkg.tgz was generated at approx. 10:30, and the other one at approx. 14:00 after the crash. You can also see that very shortly after the DB restart at 13:30, we got an AK error at 13:37.
I'm still not sure that the log files contain growth debug info for the catalog cache - can you please check and let me know whether we should activate some debug flag for that? Thx.
Thanks again
Christophe
Re: Error "-9400 AK Cachedirectory full"
Hello Christophe,
you are correct, the KnlMsg.txt file does not contain any growth information for the Catalog Cache - this likely means that the Catalog Cache does not need to grow and that the database aborts due to a corrupted Catalog Cache structure (the result of a so far unknown MaxDB bug).
Unfortunately this kind of error is hard to find, as we need to catch the statement/action that corrupts the Catalog Cache. When the database finally detects the error by accessing the corrupt structure, the damage may already have been in the cache for a while.
The problem of locating the bug:
1. Technically we would need to have the database run in debug 'slow kernel' mode, which will slow down database performance drastically.
2. Also, the last crash occurred after about 6 hours of the database being online - this would generate a lot of log data via the 'slow kernel', probably too much to analyse reasonably.
How to proceed:
Can you try to set up a small test case where the database preferably crashes very soon (even better would be to identify the offending SQL statement)?
Maybe you could set up a new database for testing and then try to force the error as soon as possible after database start (or at least without much other activity around)? A small Catalog Cache size might help in forcing the issue, but then again it might not have any impact at all.
Thorsten
Re: SAP Database Studio fail after first start & close
Same problem here...
Re: SAP Database Studio fail after first start & close
Deleting the contents of c:/users/<user>/sdb/DatabaseStudio and the file c:\Users\<user>\.eclipse\org.eclipse.equinox.security\secure_storage and then running DBStudio.exe as Administrator worked for me.
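For anyone who prefers the command prompt, roughly the following does the same (the paths assume the default locations - double-check them before deleting anything; note that the first command removes the whole directory rather than just its contents, which Database Studio should simply recreate on the next start):
rmdir /s /q "%USERPROFILE%\sdb\DatabaseStudio"
del "%USERPROFILE%\.eclipse\org.eclipse.equinox.security\secure_storage"
Then start DBStudio.exe as Administrator again.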
Re: Error "-9400 AK Cachedirectory full"
Hello Thorsten,
Actually we identified at least 2 statements that lead to the "AK Cachedirectory full" error, and one of them causes approximately 90% of the errors. It's a simple UPDATE statement; however, it's a bit of a "rough" statement because we (re)SET all the columns, even the columns that are not necessarily updated. This is "standard" in our application, and we do that for a few hundred tables, but somehow the AK error mainly occurs for just one table. This table is not huge: it has around 88 columns and 800K rows, 31 constraints, 17 indexes, and a few dozen foreign keys.
Regarding your proposal to set up a test DB and reproduce the error, I doubt a bit that it would work because the error does not happen systematically. I will, however, try to "generate" the error on our productive system in debug 'slow kernel' mode during a night or on a Sunday when our users are not working.
Actually, when the AK error occurs, the DB dumps some .stm and .dmp files, all called AKxxxxx.stm or AKxxxxx.dmp (there seems to be one file for each active session). Is there a way to analyse these dump files and maybe find out what exactly the problem was?
By the way, and also interesting: we managed to solve the problem we had with DB connections. It had nothing to do with the MaxUsers parameter - we actually had to increase the maximum allowed number of semaphore arrays on our Linux system. We managed to identify this via the xserver log file, in which we saw the error:
ERR 11277 IPC | create_sem: semget error, No space left on device |
Increasing the maximum number of semaphore arrays from 128 to 256 solved the problem. Maybe this is a silly question, but could it be that another system limit causes the bug because the DB somehow cannot get enough resources for some task?
We also see some error messages in the xserver log file like
ERR -11987 COMMUNIC session re-used, command timeout?
Is this error "normal/standard", or can it be that there is also something wrong here?
Thanks again for your help,
Christophe
Re: Error "-9400 AK Cachedirectory full"
Hello Christophe,
you can ignore the error '-11987, session re-used...' - this should be a warning rather than an error, nothing to worry about...
For the OS 'semmni' parameter I would recommend increasing it even further, to at least 9000, to avoid any 'semget error, No space left on device' messages in the future; I am not aware of any negative side effects of high semmni values. Further details are described in SAP note 628131 - if you want, I can attach more details...
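On most Linux distributions this can be checked and changed like this (a sketch - kernel.sem holds the four values SEMMSL, SEMMNS, SEMOPM and SEMMNI in that order, so only the last one changes; keep whatever your system currently shows for the first three):
cat /proc/sys/kernel/sem
sysctl -w kernel.sem="250 32000 100 9000"
The sysctl call changes the limit for the running system; to make it permanent, put the same kernel.sem line into /etc/sysctl.conf.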
Regarding the AK...stm and AK...dmp files: yes, please upload them - it might help.
If you start your system with the slow kernel, please activate the TraceCatalogCache trace option with level 7 and send us the trace after the next crash.
Regards,
Thorsten
Re: Error "-9400 AK Cachedirectory full"
Hello Thorsten and thanks again for your answer.
I just uploaded a .tgz archive of all the AK dump files (via the upload link you gave me last time): maybe you can see something useful?
Regarding SAP note 628131, I'd be glad if you could attach more details or just send me the content by email (if that's possible): we unfortunately don't have access to these notes.
For the "slow mode" debug, I'll probably do that in a few days when I can find a slot when nobody works.
Best regards and many thx for your time,
Christophe
Re: Error "-9400 AK Cachedirectory full"
Hello Christophe,
I have uploaded a copy of note 628131 into the diagpack directory and will look into the AK dump files tomorrow, if time allows.
Thorsten
Re: Error "-9400 AK Cachedirectory full"
Hello Christophe,
some update for you:
1. The crash might be related to the constraint 'year_id' - can you let me know how that constraint is defined?
2. Or even better, can you do a database catalog extract and upload it? This extract would write the complete database object definitions (but without actual content) to a text file.
To do so, connect via 'loadercli' as "loadercli -d <dbname> -u dbuser,pwd" and then start the extract with "catalogextract user outstream 'yourfilename'".
3. When you try to reproduce the issue using the slow kernel, please also enable the parameter CheckTaskSpecificCatalogCache by setting it to 'YES' before you start the slow kernel (and set the trace level for CheckCatalogCache to 7) - see the little dbmcli sketch below this list. This parameter change should abort (= crash) the database as soon as it detects any inconsistency (as opposed to now, when the database throws the AK Cachedirectory full error for a while and eventually aborts once the cache is sufficiently corrupt...).
4. I will be out of office until January 5.
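For point 3, the sequence could look roughly like this in dbmcli (a sketch from memory - depending on the version the parameter may have to be set via the param_put/param_checkall sequence instead, and 'help trace_on' will show the exact spelling of the trace option):
dbmcli -d <dbname> -u <dbmuser>,<password>
param_directput CheckTaskSpecificCatalogCache YES
db_restart -slow
trace_on CheckCatalogCache 7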
Merry Christmas & Happy New Year,
Thorsten
Re: Error "-9400 AK Cachedirectory full"
Hi Thorsten,
I just uploaded the catalog extract. We actually have 33 constraints with the name YEAR_ID, so we need the table name to find out which one caused the problem (it might actually be the ones from tables PERSON and PERSON_LOG, because we sometimes get AK errors with these 2 tables). FYI, we get almost all AK errors with the table ACCOUNTTRANSACTION.
Actually we recently copied the entire table ACCOUNTTRANSACTION into a new table, dropped the old table, and renamed the new one to ACCOUNTTRANSACTION. We then reinserted all foreign keys and constraints ... but it did not help. :-(
By the way, did you check the file KnlMsgArchive in the diagpack files I uploaded? There you can see a stack trace of the error when the DB crash occurs - it's a SIGSEGV. Maybe you can find something useful there?
Thanks again for your support,
Christophe
Re: Error "-9400 AK Cachedirectory full"
Hi Christophe,
KnlMsg does not help much here, because the database only aborts once it detects the corrupt catalog cache structure - what we need to find out is during which operation the cache gets corrupted in the first place. Hence the request to start the database in slow kernel mode and with CheckTaskSpecificCatalogCache, to have the database kernel detect the corruption sooner, in the hope that we can still see the statement damaging the cache (probably some statement writes into a memory area it is not supposed to write to...).
Thorsten
main_newbas/job_dbdif_upg fails during upgrade from 7.31 to 7.4 on MaxDB
The RADDBDIF job stops with the short dump PERFORM_NOT_FOUND when it tries to call the non-existent form EXECUTE of RSXPLADA.
The cause is an (incorrect?) entry in table DBDIFF for PLAN_TABLE_EXTERNAL, which has RSXPLADA in column SOURCE instead of SDB3FADA like the other objects with DBSYS = ADABAS D.
I changed it to SDB3FADA and the upgrade continued without further problems.
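For reference, the change boils down to something like the following (a sketch only - I am not certain the object-name column in DBDIFF is really called TABNAME, so check the table structure and adjust before changing anything):
UPDATE DBDIFF SET SOURCE = 'SDB3FADA' WHERE TABNAME = 'PLAN_TABLE_EXTERNAL' AND DBSYS = 'ADABAS D'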
Software versions:
NetWeaver 7.31 SPS 14 before upgrade
NetWeaver 7.4 SPS 9 after upgrade
MaxDB 7.9.08.27
Software Update Manager 1.0 SP 12
Re: Error "-9400 AK Cachedirectory full"
Hello Thorsten,
Let me first wish you all the best for 2015!
I wanted to run our DB in "slow mode" yesterday but then realized that the slow kernel is not part of the "community edition". When I run "db_restart -slow" in dbmcli, I get the following error message:
ERR
-24994,ERR_RTE: Runtime environment error
20095,kernel program missing '/opt/sdb/MaxDB/pgm/slowknl'
I just downloaded the .tgz again (64-bit, I tried 7.8.02.39 and 7.9.08.27) and could not find the file 'slowknl'.
Is there any way we can download the slow kernel (we currently use version 7.8.02.38), or shall I just run our DB with the higher debug levels? Would this help?
Thanks again and best regards,
Christophe
Re: -2 ERR_USRFAIL: User authorization failed
Hello guys,
I'm having the same issue while installing an IDES SAP system based on MaxDB on Linux 11R2.
The passwords of SUPERDBA and CONTROL are 8 characters long.
Here's the current database configuration; the database is in ADMIN mode:
/sapdb/programs/bin/dbmcli on xxx : DHS>inst_enum
OK
7.7.06.10 /sapdb/DHS/db
/sapdb/programs/bin/dbmcli on xxx : DHS>db_enum
OK
DHS /sapdb/DHS/db 7.7.06.10 fast running
DHS /sapdb/DHS/db 7.7.06.10 quick offline
DHS /sapdb/DHS/db 7.7.06.10 slow offline
DHS /sapdb/DHS/db 7.7.06.10 test offline
/sapdb/programs/bin # ./sdbregview -l
DB Analyzer /sapdb/programs 7.7.06.10 64 bit valid
Server Utilities /sapdb/programs 7.7.06.10 64 bit valid
PCR 7300 /sapdb/programs 7.3.00.60 valid
PCR 7301 /sapdb/programs 7.3.01.22 valid
PCR 7500 /sapdb/programs 7.5.00.50 64 bit valid
SAP Utilities /sapdb/programs 7.7.06.10 64 bit valid
Redist Python /sapdb/programs 7.7.06.10 64 bit valid
Base /sapdb/programs 7.7.06.10 64 bit valid
JDBC /sapdb/programs 7.6.06.02 valid
Messages /sapdb/programs MSG 0.7732 valid
ODBC /sapdb/programs 7.7.06.10 64 bit valid
Database Kernel /sapdb/DHS/db 7.7.06.10 64 bit valid
SQLDBC 77 /sapdb/programs 7.7.06.10 64 bit valid
Loader /sapdb/programs 7.7.06.10 64 bit valid
SQLDBC /sapdb/programs 7.7.06.10 64 bit valid
Fastload API /sapdb/programs 7.7.06.10 64 bit valid
SQLDBC 76 /sapdb/programs 7.6.05.15 64 bit valid
Do you know what could be the problem?
Thanks for any hint!
Sorin
Re: Error "-9400 AK Cachedirectory full"
Hello Christophe,
oops, it seems that we do not deliver the slowknl with the current Community Edition packages at the moment - no idea why, but I do hope we can change this in the future.
I have uploaded a 'slowknl' file to the usual link; the version is 7.8.02.38 for Linux x86_64. Please download it to the '.../pgm' directory (as indicated in your previous error message) and make sure it gets the same file permissions and ownership as the regular 'kernel' file (also located in that directory).
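The copy itself could look like this (assuming GNU coreutils on your Linux box and the path from your error message; run it as root or as the SDB software owner):
cp slowknl /opt/sdb/MaxDB/pgm/
cd /opt/sdb/MaxDB/pgm
chown --reference=kernel slowknl
chmod --reference=kernel slowknl
The --reference options simply copy owner/group and permission bits from the existing 'kernel' binary to the new 'slowknl' file.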
Kind regards and hope it works now,
Thorsten
Re: Error "-9400 AK Cachedirectory full"
Hello Thorsten,
Thanks for the 'slowknl' file - I have now downloaded it. I don't know when I will have a good slot to restart the DB in slow mode again, as I have to do it when no users are working.
Actually, we have not had any DB crash for almost 10 days now. I will analyze what changes we made just before the last crashes to find out whether one of them could have been the "solution" to our problem, or whether this is just a coincidence. We changed many DB parameters, the server's semaphore configuration, the way we handle JDBC connections and prepared statements in our application, etc. I will let you know if I find out something interesting.
Best regards,
Christophe