too much cached memory?

By cpfeiffe
I have RHEL4 with kernel 2.6.9-5ELsmp on a system with 8 GB RAM. After boot, with all processes started, the system shows roughly 7.3 GB free, 50 MB inactive and 350 MB active. It stays that way (per vmstat 3, top, free, swap -l) for a long time if left alone. But as soon as you access a file (cp, tar, cat, etc.), the "cached" memory (visible in vmstat and free) grows. Some of this is added to "active" and some to "inactive". It never returns to "free", even after sync. It stays cached.
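For readers who want to watch the same counters without eyeballing vmstat or free, here is a minimal Python sketch that pulls them out of /proc/meminfo, which is where those tools get their numbers. The sample text, its values, and the function names are assumptions made up for illustration; they are not output from the system described in this post.

```python
# Hypothetical sample in /proc/meminfo format. The numbers are invented to
# resemble an 8 GB machine; they were not captured from the poster's system.
SAMPLE_MEMINFO = """\
MemTotal:      8167848 kB
MemFree:       5000000 kB
Buffers:        100000 kB
Cached:        2500000 kB
Active:         900000 kB
Inactive:      1800000 kB
SwapTotal:     4194296 kB
SwapFree:      4194296 kB
"""


def parse_meminfo(text):
    """Return {field name: value} for every 'Name:  123 kB' style line."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def summarize_mb(fields):
    """Pick the counters discussed in the post and convert kB to whole MB."""
    keys = ("MemTotal", "MemFree", "Cached", "Active", "Inactive")
    return {key: fields.get(key, 0) // 1024 for key in keys}


def load_meminfo(path="/proc/meminfo"):
    """Read the live file; fall back to the sample where it is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return SAMPLE_MEMINFO


if __name__ == "__main__":
    for name, megabytes in summarize_mb(parse_meminfo(load_meminfo())).items():
        print(f"{name:<10}{megabytes:>8} MB")
```

Running it before and after a large cp should show Cached rising while MemFree falls, which is the behavior described above.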

I think this is normally fine, except that rereading files that have already been read still adds to cached, and cached seems to grow by 2X the file size for cp/mv (as if the file is cached once per file system). The system has been up for 12 hours now and cached is up to 2.5 GB.

It seems to be growing out of control and I can't free any of it. If it keeps growing the system will (and previously has) run out of RAM. In HP-UX you can limit this setting. Can you limit it in RedHat? Is there something else I can do?

Thanks for your help.

11 total posts (Page 1 of 2)

by jmgarvin In reply to too much cached memory?

Did you set up enough swap space? From the sound of it, you either don't have enough swap or your swap isn't being utilized properly.

This is really strange sounding and I can't say I've encountered this one before.

If you run top (while the memory is filling up) and uptime, what is the output?

Do you have a RAID or not?

So many questions...so few answers....

by jmgarvin In reply to

hmmmmm. Could it be something with sysvinit?

Hearing more I am leaning towards something going wrong with your file system. Are you using ext3 and an IDE RAID? That may be your problem.

It also might be the bug from Red Hat, but boy this is an interesting problem!!!

by jmgarvin In reply to

Arg! On second thought if the OS isn't on a RAID, it shouldn't matter about the filesystem...

However, I still think that either the filesystem is misrepresenting the disk, or your disk caching is borked because of the way the RAID is seeing things....

by cpfeiffe In reply to too much cached memory?

There is 4 GB of swap, and it does fill up before the system reports "out of memory", but we really don't want to use swap, and cache is not a good excuse to use swap. First, cache should be flushed to free RAM for other things when needed, and that doesn't appear to be happening enough. Second, the same file should not be cached twice (once per file system during a cp). I'd like to limit this somehow if possible. We are using RAID-1 on the DB disks and no RAID on the OS disks at the moment.

On another note, this might be related to a newly discovered bug. The system locks up and the kernel reports (exit.c:840!) during high disk I/O. I reported it to RedHat and they think it is caused by pdflush. So this might ultimately tie into that.

I'll keep this open and keep posting as I find new things. I'll rate everybody's answers when I close out. If anyone does know of a way to limit the cache memory in RedHat please let me know.

by Nico Baggus In reply to too much cached memory?

The growth by a factor of two comes from the way the cache is built. It keeps disk blocks in memory for a while.

The parameters that control freeing are derived from other settings. Have you read the kernel documentation, Documentation/filesystems/proc.txt?

also
http://gentoo-wiki.com/FAQ_Linux_Memory_Management

might give some insight about preferring cache or swapping.

Having your memory run until it fills completely should be a bug somewhere. Maybe top can show you some exorbitantly large processes? Press M in top to sort by memory use. This filling until memory is exhausted sure looks like some memory leak....

Cache is never swapped out (that wouldn't make sense). Cache entries can always be restored from disk, so dirty entries can simply be written back and then discarded, and clean entries can just be discarded.

kind regards,
Nico Baggus
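
To put rough numbers on the clean versus dirty distinction above, here is a small Python sketch that splits the page cache using the Cached and Dirty fields of /proc/meminfo. Treating Cached minus Dirty as the "clean" part is a simplifying assumption (Cached can also include shared memory, and Dirty can include buffer pages), and the sample values and function names are hypothetical.

```python
# Hypothetical /proc/meminfo excerpt; the values are invented for illustration.
SAMPLE_MEMINFO = """\
MemTotal:      8167848 kB
MemFree:       5000000 kB
Cached:        2500000 kB
Dirty:           12000 kB
"""


def parse_meminfo(text):
    """Return {field name: value in kB} for each 'Name:  123 kB' line."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_breakdown(fields):
    """Estimate how much cache can be dropped at once versus after writeback.

    Assumption: clean cache is approximated as Cached - Dirty, floored at 0.
    """
    cached = fields.get("Cached", 0)
    dirty = fields.get("Dirty", 0)
    clean = max(cached - dirty, 0)
    return {
        "clean_kb": clean,    # can simply be discarded
        "dirty_kb": dirty,    # must be written to disk before being discarded
        "free_plus_clean_kb": fields.get("MemFree", 0) + clean,
    }


if __name__ == "__main__":
    for key, value in cache_breakdown(parse_meminfo(SAMPLE_MEMINFO)).items():
        print(f"{key:<20}{value:>10} kB")
```

On a healthy system the dirty figure is usually a small fraction of the cache, which is why a large Cached value by itself is not a sign of trouble.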

by cpfeiffe In reply to too much cached memory?

Thanks Nico. The doc explained what I wanted to see. If I could just get the system to stop swapping as much, I think we would be OK. Too bad they didn't put this in the release notes. Oh well. I am trying this now. I don't think there is a memory leak; we just have a lot of I/O (it is an Oracle server), so cache fills in a hurry. That is fine, I just don't want to swap excessively for the benefit of disk cache. Oracle has its own cache and we've given it a huge amount of the memory to work with.

I'll rate your answer with everyone else's when I close this. I suppose it will be some time next week. I'm also waiting to see some movement on the pdflush bug RedHat is working on.

by cpfeiffe In reply to too much cached memory?

Could be a cache issue on the RAID. Dell hasn't released the 2.6 update for their products yet so there might be something there. I'll keep everyone posted. Boy, Dell is way behind on this. They aren't supporting RHEL4 until late July now. RH supports EL4 on Dell hardware and has been doing so for months so they must have work-arounds. I thought they were supposed to be working together. Also, Dell says Oracle can't run on RHEL4 even though Oracle is supporting it.

by cpfeiffe In reply to too much cached memory?

Thanks guys. Here's the latest...
I tuned cache/swap per the link. I ran the backups again and watched how memory was handled with "watch free". Everything seemed fine. Cache took all the memory quickly, but it was dumped when the memory was needed. Coincidentally, there is that pdflush bug, which I suspect is there because of Linux's inefficiencies with the new cache management algorithms. So it seems that if we set the swappiness to 0 we can avoid the bug for the most part and also avoid the "out of memory". I am updating RedHat with this info on the bug also. Hopefully they will have a fix soon. The bug is ugly. The kernel just exits. FYI the bug ID is 150653.
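
The post doesn't say exactly how the swappiness change was made, so here is one possible way to do it in Python, offered as a sketch only. It assumes the standard /proc/sys/vm/swappiness interface, a 0 to 100 range as on 2.6 kernels, and root privileges for writing the real file. The function names are hypothetical, and the demo deliberately works on a scratch file so it changes nothing on a running system.

```python
import os
import tempfile

# Standard location of the tunable on Linux; writing to it requires root.
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current swappiness value as an integer."""
    with open(path) as handle:
        return int(handle.read().strip())


def write_swappiness(value, path=SWAPPINESS_PATH):
    """Write a new swappiness value (assumed valid range: 0 to 100)."""
    if not 0 <= value <= 100:
        raise ValueError("swappiness must be between 0 and 100")
    with open(path, "w") as handle:
        handle.write(f"{value}\n")


def demo():
    """Exercise both helpers against a scratch file, not the real tunable."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "swappiness")
        write_swappiness(60, path)  # 60 is the usual kernel default
        before = read_swappiness(path)
        write_swappiness(0, path)   # the value settled on in this thread
        return before, read_swappiness(path)


if __name__ == "__main__":
    before, after = demo()
    print(f"swappiness changed from {before} to {after} (scratch file only)")
```

A value written under /proc does not survive a reboot. The usual way to make it stick is a "vm.swappiness = 0" line in /etc/sysctl.conf.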
