Oracle Sean


The Importance of Monitoring inodes in OCI

I encountered a situation that may be a bit of an edge case, but it’s worth mentioning and monitoring for since it resulted in failure of a database node.

tldr;

We encountered the error “cannot create temp file for here-document: No space left on device” on a filesystem that had plenty of physical space:

$ df -h .
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/x                  24G   20G  3.3G  86% /

The problem wasn’t physical space but inodes:

$ df -i .
Filesystem                       Inodes      IUsed     IFree IUse% Mounted on
/dev/mapper/x                   1572864    1572864         0  100% /

Lesson 1

If you have a job that monitors free space on your system, consider monitoring for free inodes as well. Running out of inodes on a RAC node causes the node to stop.
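A minimal sketch of what such a check might look like, run from cron alongside an existing space monitor; the 90% threshold is an assumption to tune for your environment:

#!/bin/bash
# Warn when inode usage on any local filesystem crosses a threshold (assumed 90%).
THRESHOLD=90
df --local -iP | awk -v t="$THRESHOLD" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= t)
        printf "WARNING: %s at %s%% inode usage (%s)\n", $1, use, $6
}'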

Lesson 2

Oracle Cloud generates a lot of logging related to the management console, and multiple databases amplify this. Review your retention policies and adjust them to prevent pressure on either space or inodes.
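To see where that pressure comes from, one rough way to rank directories by the number of inodes they hold; the starting directory here is only an example, so point it at wherever your logs accumulate:

$ cd /var/opt/oracle
$ for d in */; do printf '%10d %s\n' "$(find "$d" | wc -l)" "$d"; done | sort -rn | head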

Lesson 3

In an OCI environment, browse Oracle Support’s knowledge-base for articles related to cleandb, cleandblogs.pl, and /etc/crontab. We were a victim of bug 29602662 described in note 2572489.1.
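A quick sanity check, assuming the cleanup job is scheduled from the system crontab as the note describes, is to confirm the entry exists and that cron has actually been running it:

$ grep -i cleandb /etc/crontab           # is the cleanup job scheduled (and not commented out)?
$ grep -i cleandb /var/log/cron | tail   # recent runs (Oracle Linux logs cron activity here by default)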

Background

During a POC of Exadata on Oracle Cloud, I was experimenting with renaming a database using NID. I opened a ticket with Oracle Support and they responded that it would work and was no different than renaming a database in an on-premises environment. I asked about interaction with the console and they said it should just know.

Narrator: It didn’t.

The database rename was successful but the console didn’t recognize the new database, nor would it let go of the old one. The backup service continued running, spawning logs to match. In this case it generated enough logs to exhaust the available inodes on the root filesystem.

OCI Logging

OCI generates logs under /var/log/oracle/, including directories named cstate and cstate_vYYYYMMDD_HH. The date/timestamped directories include many links back to the main cstate directory, similar to:

cstate_v20191211_07/cstate_39bbc4f6-c35b-4b53-928a-3224053a2e23.suc.xml -> /var/opt/oracle/cstate/cstate_39bbc4f6-c35b-4b53-928a-3224053a2e23.xml

inodes

When a file is created on a *nix filesystem it gets an inode. An inode, or index node, is the filesystem structure that describes a file, and each filesystem has a finite supply of them. A file uses at least one inode. A symlink uses at least one inode.
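A quick way to see this for yourself; the directory and file names are only an illustration:

$ mkdir /tmp/inode-demo && cd /tmp/inode-demo
$ touch realfile.xml                  # a regular file consumes an inode
$ ln -s realfile.xml realfile.lnk     # the symlink consumes its own, separate inode
$ ls -li                              # the first column shows two distinct inode numbers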

In one of the versioned cstate directories, we see a lot of symlinks:

$ find . -type f | wc -l
39303
$ find . -type l | wc -l
37031

The Problem

The existence of symlinks nearly doubled inode use on the host. This was compounded by the cleandb job not running because of a bug in the default crontab. Without cleanup, we rapidly exhausted inodes.

This was a bit of an edge case since the database rename caused more logging than usual. The system hosts multiple databases, which raised the baseline.

I continue to see an abundance of broken symlinks. These appear to occur when a process or cleanup job removes a file but not its associated links. The broken links are eventually removed by cleandb, but we see thousands at any given time on this system. I have an SR open with Support to identify the root cause, but in the meantime the broken links may be identified with:

find . -xtype l

…and removed with:

find . -xtype l -delete
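If you'd rather see what you're about to remove first, the same test works with a listing (run from whichever directory you intend to clean):

$ find . -xtype l | wc -l              # how many broken links right now?
$ find . -xtype l -exec ls -l {} +     # review each link and its missing target before deleting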

One more thing: maintenance in OCI Exadata tends to run on node one. To get a feel for the impact of maintenance logging, compare the two nodes: physical space used on node one is about 1.5x that of node two, but inode use on node one is nearly 10x higher!
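A quick way to compare the nodes side by side; the hostnames here are placeholders:

$ for h in exanode1 exanode2; do echo "== $h =="; ssh "$h" 'df -h /; df -i /'; done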