Data Management

ORA-04030 doesn't always mean you're running out of RAM


I maintain an ancient Oracle Reports system, which does a peculiar FTP gyration that recently started giving us errors.

Here's the scenario: We have a database that powers a farm of six Oracle Reports servers. Some of the reports we generate are tilde-delimited text files that have to be FTP'd into a mainframe computer. (Don't ask -- it's an ancient system and it's not going to be changed anytime soon.)

This was all written before the feature was introduced in the Oracle Reports server that supports FTP as a target. In other words, it was written before we could FTP directly from the Reports server into the mainframe. So when the database sends a URL to one of the Reports servers, it waits and reads back in the resulting report text file. This gets saved as a blob in the database, and the database periodically exports the blobs into the local file system and runs an external Java FTP process to push those files into the right location on the mainframe.

Things started blowing up recently when the 10g server tried to save the incoming text file as a blob in the database. There's no way -- at least no way that I know -- to stream the incoming file into the database. It has to wait, read in the entire text file from the Reports server, and then save that entire chunk into the database, which means that the database has to allocate temporary storage somewhere to hold the text file until it can be written out.

These text files are not small -- some of them have grown into the hundred-megabyte range. These six files are scheduled very early in the morning, so there should not be  that much traffic on the database server I'm using; besides, the database server has 8 GB of RAM and only runs the database. There is plenty of capacity to temporarily hold these reports.

We recently started seeing a bunch of ORA-04030 errors in our application error log. The error text would say something like "out of process memory when trying to allocate 16396 bytes."

Our first thought was to check RAM usage, which we did, and everything looked fine. The database had way more headroom than it needed to allocate RAM for these files. Our DBA opened up a support ticket with Oracle and, after a bit if investigating, it turns out we were running low on swap space for the database. We had a couple GB of free disk space but apparently that wasn't enough for the hungry database server. Luckily, it's not a real disk -- it's virtual running on a big EMC disk farm. We were able to fairly easily get it expanded to give more room for the database.

I'm still not clear on why the database didn't warn us specifically about the swap space issue. Why give this general out-of-memory error, which mislead us into thinking RAM was the issue? If the database knew it was swap space at issue instead of RAM, it should have been indicated in the error text.

5 comments
Justin James
Justin James

... at the level that the SQL engine that throws the error runs at, swap and physical memory are indistinguishable. For example try just filling up a string in memory with junk. When you run out swap space, your app will throw an out of memory error because the OS cannot allocate any more RAM to it, but the IS will start tossing low swap space warnings and maybe try to increase swap if possible. Now, that being said, the real question is, "why is the database suddenly using a heck of a lot more RAM (physical + swap) in the first place?" Increasing swap may very well just put off a much bigger problem. You might want to look into why the DB now needs a lot more RAM than it used to. J.Ja

kirby
kirby

The title is simplistic, but James has it right. The system uses RAM to hold data. When RAM is full, and on more efficient systems when RAM isn't getting used, it is swapped to disk if there is space available. How did you analyze your RAM usage? Tools like 'top' and 'sar' on a Unix box at the time of the error should have showed a lot of disk activity (swapping) and 0 available RAM. If your database appears to have a lot of RAM available then you might try reducing some of your init parameters. It maybe that the database has enough RAM, but that the OS is starved. Couldn't do more analysis without the details. Might be useful to see how you troubleshot the problem if you do another article.

RexWorld
RexWorld

I haven't asked the DBA to look but I would be willing to bet it's just the Portal using up all that RAM and swap space. As we've rolled out the Portal to more and more users, there's now gigabytes worth of document blobs stored in that schema. Plus many more users logged in, so there's session and state objects being maintained by the Portal. Etc., etc.

RexWorld
RexWorld

Hey, thanks for the pointers. Our configuration is pretty similar to yours -- 8 GB Windows 2003.