Two days ago I was asked to look at the following error, which occurred in a PRE-PRODUCTION database just a few weeks before going live:
ORA-04030: out of process memory when trying to allocate 4032 bytes (qmxdGetElemsBy,qmemNextBuf:alloc)
The aim of this article is to explain how (with the help of my colleagues) I managed to work around this ORA-04030 error.
First, I was aware that ORA-04031 is linked to an issue with the SGA, while the current ORA-04030 is linked to abnormal consumption of PGA memory. Unfortunately, having no prior acquaintance with this application made the issue hard to solve. All I was told was that when the PRE-PRODUCTION application was triggered to parse 10 XML messages of 3MB each, it crashed with the above error while parsing the 6th XML message.
I nevertheless asked the customer whether processing 10 x 3MB XML messages simultaneously is a situation that could really happen in their PRODUCTION application, or whether they were just trying to stress the database with extreme situations. And the answer was, to my disappointment, YES. This kind of XML load is very plausible once the application goes live.
I then asked whether I could have those 10 XML messages in order to inject them into a DEVELOPMENT database. Once they had handed me a copy of the messages, I tried to parse them in the DEVELOPMENT instance which, fortunately (yes, fortunately), crashed with the same error.
There is an ORA-04030 diagnostic tool developed by Oracle Support which, given the corresponding alert log file, generates a set of recommendations for your system that, if implemented, might solve the issue.
When provided with the corresponding trace file, this tool suggested, among many other things, that we:
- Set the hidden parameter _realfree_heap_pagesize_hint
- Use BULK COLLECT with the LIMIT clause (the tool also gave us a link showing how to do that, following Steven Feuerstein's article published in Oracle Magazine)
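For readers unfamiliar with the second recommendation, here is a minimal sketch of what BULK COLLECT with a LIMIT clause looks like. It is not taken from the application or from the tool's output: the cursor, the batch size of 100 and all variable names are assumptions made purely for illustration.

```plsql
DECLARE
  -- Hypothetical row source: 1000 generated rows stand in for a real table.
  CURSOR c_rows IS
    SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 1000;

  TYPE t_rows IS TABLE OF c_rows%ROWTYPE;
  l_rows  t_rows;
  l_total PLS_INTEGER := 0;
BEGIN
  OPEN c_rows;
  LOOP
    -- Fetch at most 100 rows per round trip, so the collection held in
    -- PGA never grows beyond one batch.
    FETCH c_rows BULK COLLECT INTO l_rows LIMIT 100;
    EXIT WHEN l_rows.COUNT = 0;

    FOR i IN 1 .. l_rows.COUNT LOOP
      l_total := l_total + 1;  -- placeholder for the real per-row work
    END LOOP;
  END LOOP;
  CLOSE c_rows;

  DBMS_OUTPUT.PUT_LINE('rows processed: ' || l_total);
END;
/
```

The point of the LIMIT clause is that PGA usage is bounded by the batch size rather than by the size of the whole result set.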
Given my ignorance of the application's business logic and of exactly how the code parses the XML messages, I was looking for a tactical fix that would be safe, require a minimum of testing and have a good chance of causing no side effects. Since I was in a development database, I instrumented the PL/SQL code (added extra logging) until I had clearly isolated the piece of code causing the error:
    BEGIN
      -- New parser
      lx_parser := xmlparser.newParser;

      -- Parse XML file
      xmlparser.parseClob(lx_parser,pil_xml_file); --pil_xml_file);
      lxd_docxml := xmlparser.getDocument(lx_parser);

      -- get all elements
      lxd_element := xmldom.GETDOCUMENTELEMENT(lxd_docxml);

      FOR li_metadata_id IN 1..pit_metadata.COUNT
      LOOP
        IF pit_metadata(li_metadata_id).MXTA_FATHER_ID IS NULL THEN
          gi_num_row_table := 1;
          P_PARSE_XML_ELEMENTS(lxd_element
                              ,NULL
                              ,pit_metadata
                              ,li_metadata_id
                              ,pit_metadata(li_metadata_id).TAG_PATH
                              );
        END IF;
      END LOOP;
      …./….
    END;
The above piece of code is called in a loop, once per XML message, and it is what exhausts the PGA memory: each call builds a DOM document that is never released.
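One way to observe this kind of PGA growth while instrumenting is to log the session's PGA statistic before and after each call. The sketch below is not the logging I actually added: the helper name session_pga_bytes is hypothetical, and it assumes the session has been granted SELECT on the v$mystat and v$statname views.

```plsql
DECLARE
  -- Hypothetical helper: returns the current session's PGA usage in bytes.
  FUNCTION session_pga_bytes RETURN NUMBER IS
    l_bytes NUMBER;
  BEGIN
    SELECT ms.value
      INTO l_bytes
      FROM v$mystat ms
      JOIN v$statname sn ON sn.statistic# = ms.statistic#
     WHERE sn.name = 'session pga memory';
    RETURN l_bytes;
  END session_pga_bytes;
BEGIN
  DBMS_OUTPUT.PUT_LINE('PGA before parse: ' || session_pga_bytes || ' bytes');

  -- ... call the parsing procedure for one XML message here ...
  NULL;

  DBMS_OUTPUT.PUT_LINE('PGA after parse:  ' || session_pga_bytes || ' bytes');
END;
/
```

If the "after" figure keeps climbing from one message to the next and never comes back down, memory is being retained between calls.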
With the help of a colleague, we decided on a tactical fix that consists of freeing the PGA memory each time an XML document has been processed, using the Oracle xmldom.freedocument API. In other words, I added the following line at the end of the above stored procedure:
        END IF;
      END LOOP;
      …./….
      xmldom.freedocument(lxd_docxml); -- added this
    END;
And guess what?
The ORA-04030 stopped bothering us.
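As a possible hardening of the same idea, the sketch below also releases the document when the processing step raises an exception, and frees the parser once the DOM has been built. This goes beyond what we actually deployed: the sample CLOB, the inner block and the call to xmlparser.freeParser are my assumptions, and a failure inside parseClob itself is not covered.

```plsql
DECLARE
  -- Hypothetical sample input standing in for a real 3MB message.
  l_xml       CLOB := '<root><item>1</item><item>2</item></root>';
  lx_parser   xmlparser.Parser;
  lxd_docxml  xmldom.DOMDocument;
  lxd_element xmldom.DOMElement;
BEGIN
  lx_parser := xmlparser.newParser;
  xmlparser.parseClob(lx_parser, l_xml);
  lxd_docxml := xmlparser.getDocument(lx_parser);

  -- The parser is no longer needed once the DOM document exists.
  xmlparser.freeParser(lx_parser);

  BEGIN
    -- Placeholder for the real element processing.
    lxd_element := xmldom.getDocumentElement(lxd_docxml);
    DBMS_OUTPUT.PUT_LINE('root element: ' || xmldom.getTagName(lxd_element));
  EXCEPTION
    WHEN OTHERS THEN
      -- Release the document on failure too, then let the error propagate.
      xmldom.freeDocument(lxd_docxml);
      RAISE;
  END;

  -- Normal path: release the document once processing is done.
  xmldom.freeDocument(lxd_docxml);
END;
/
```

Keeping the normal-path freeDocument call outside the inner block means the document is released exactly once on either path.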
Bottom Line: When troubleshooting an issue in a real-life production application, first try to reproduce it in a TEST environment. Then instrument the PL/SQL code in this TEST environment until you have narrowed the error down to the program that raises it. Finally, look for a tactical fix that will not need a lot of non-regression testing and that should have limited side effects.