NFS client continues to issue “Disk quota exceeded” errors after quota is raised

IdleLayabout

Has anyone else experienced this? I spent quite a bit of time googling and can’t find a similar issue. I’m also unsure whether this occurs on CentOS 8.

It occurs on Rocky 8.6 (kernel 4.18.0) and 9.0 (kernel 5.14.0), and apparently on Ubuntu 20.04, although another team tested that last OS for us.

Tested with an NFS 4.1 mount from a Pure Storage device as well as an NFS 4.2 mount from an xfs volume on a Rocky 9 server.
The user name spaces (UIDs) were identical for all tests.


Summary:
NFS client continues to issue “Disk quota exceeded” errors after quota is raised. This is only for block quotas, not inode quotas. It appears to be related to client side attribute caching.


Description:
NFS file system mounted on host on which client is working.
Client is overquota and tries to write to a file (call this FileA.txt).
Client gets “Disk quota exceeded” error as expected.

Admin now increases the quota sufficiently to allow the user to continue writing to FileA.txt. However, writes to this particular file still produce “Disk quota exceeded” errors, even though the data is successfully written to the file. Writes to other files do not produce errors, so long as the client did not attempt to write to them while the quota was exceeded. Writes to FileA.txt from other hosts which have the NFS file system mounted do not throw this error, even while the error is simultaneously presenting itself on the initial host. Copying the file to another file name and then overwriting the original FileA.txt ‘fixes’ the problem.
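For anyone who wants to script the copy-and-overwrite workaround described above, here is a minimal sketch. The function name refresh_file is made up for illustration, and it reflects one reading of the workaround (copy to a new name, then rename the copy over the original). It assumes the workaround helps because the path ends up pointing at a fresh file; that has not been confirmed.

```shell
# Hypothetical helper: replace a file with a fresh copy of itself.
# Assumption: the workaround is "copy to a new name, then move the copy
# over the original". Why this clears the stale error is not confirmed.
refresh_file() {
    target="$1"
    tmp="${target}.refresh.$$"
    # -p preserves mode and timestamps (and ownership where permitted)
    cp -p -- "$target" "$tmp" || return 1
    # rename within the same directory replaces the original in one step
    mv -f -- "$tmp" "$target"
}
```

Usage would be something like refresh_file /mnt/nfs/FileA.txt, where /mnt/nfs is a placeholder mount point. Note that any process still holding the old file open keeps writing to the old copy.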

The same exports were also made available to a CentOS 7.3 machine (kernel 3.10.0), and the error did not occur there: raising the user quota after a write had failed with “Disk quota exceeded” allowed subsequent writes to that file with no further error messages.

Note 1: when the FS is mounted with the noac option, this bug does not occur. Setting actimeo=0 on its own, however, does not fix the bug. The noac option is a combination of the generic option sync and the NFS-specific option actimeo=0. Hence it appears that the issue is related to the default async behaviour, and that noac fixes it by forcing sync.
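The mount variants from Note 1 can be laid out side by side as below. This is only a sketch: the server name and export path are taken from the tcpdump and exportfs output in this post, /mnt/nfs is a placeholder mount point, and nfs_mount_cmd is a made-up helper that prints the command rather than running it. The third variant (sync,actimeo=0) was not tested here; it is listed only as a way to check the sync hypothesis directly.

```shell
# Names below are assumptions for illustration; adjust to your setup.
SERVER=rocky9.server
EXPORT=/opt/nfs
MNT=/mnt/nfs    # placeholder mount point

# Hypothetical helper: print (not run) the mount command for an option set.
nfs_mount_cmd() {
    printf 'mount -t nfs -o vers=4.2,%s %s:%s %s\n' "$1" "$SERVER" "$EXPORT" "$MNT"
}

# noac           : reported above to avoid the stale error
# actimeo=0      : reported above NOT to avoid it
# sync,actimeo=0 : untested; per nfs(5) this is what noac combines
for opts in noac actimeo=0 sync,actimeo=0; do
    nfs_mount_cmd "$opts"
done
```

Run each printed command as root (after unmounting the previous one) and repeat the FileA.txt test to compare the behaviour.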

Note 2: inode quotas do not cause this issue and behave as expected.

Note 3: making use of soft vs hard quotas does not change the behaviour. The issue occurs at the hard quota.

Note 4: according to tcpdump, the server is not sending any error messages to the client while this condition persists.


Setup:
SELinux and all firewalls disabled

exportfs -v from my xfs NFS server:
/opt/nfs rocky8.client(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
/opt/nfs centos7.client(sync,wdelay,hide,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)

All servers mentioned here:
cat /proc/fs/nfsd/versions
-2 +3 +4 +4.1 +4.2

For the xfs server setup:
acl client = 2.2.53-1.el8.1
acl server = 2.3.1-3.el9
libgssapi: no such package in Rocky
libevent client = 2.1.8-5.el8
libevent server = 2.1.12-6.el9
librpcsecgss: no such package in Rocky
nfs-utils client = 1:2.3.3-51.el8
nfs-utils server = 1:2.5.4-10.el9
util-linux client = 2.32.1-35.el8
util-linux server = 2.37.4-3.el9



TCPDUMP:
We start dumping data to FileA.txt:

cat data >> FileA.txt

16:42:39.788810 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], seq 2497532:2498980, ack 4153, win 12282, options [nop,nop,TS val 2571582773 ecr 4267237588], length 1448
16:42:39.788822 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], seq 2498980:2500428, ack 4153, win 12282, options [nop,nop,TS val 2571582773 ecr 4267237588], length 1448
16:42:39.788823 IP rocky9.server.nfs > rocky8.client.943: Flags [.], ack 2500428, win 24568, options [nop,nop,TS val 4267237589 ecr 2571582773], length 0
16:42:39.788834 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], seq 2500428:2501876, ack 4153, win 12282, options [nop,nop,TS val 2571582773 ecr 4267237588], length 1448

cat data >> FileA.txt

16:42:39.788847 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], seq 2501876:2503324, ack 4153, win 12282, options [nop,nop,TS val 2571582773 ecr 4267237588], length 1448
16:42:39.788849 IP rocky9.server.nfs > rocky8.client.943: Flags [.], ack 2503324, win 24568, options [nop,nop,TS val 4267237589 ecr 2571582773], length 0
16:42:39.788856 IP rocky8.client.943 > rocky9.server.nfs: Flags [P.], seq 2503324:2503872, ack 4153, win 12282, options [nop,nop,TS val 2571582773 ecr 4267237588], length 548

cat data >> FileA.txt

16:42:39.788903 IP rocky9.server.nfs > rocky8.client.943: Flags [P.], seq 4153:4253, ack 2503872, win 24568, options [nop,nop,TS val 4267237589 ecr 2571582773], length 100: NFS reply xid 4118676701 reply ok 96 getattr ERROR: Disc quota exceeded
16:42:39.789416 IP rocky8.client.943 > rocky9.server.nfs: Flags [P.], seq 2503872:2504072, ack 4253, win 12282, options [nop,nop,TS val 2571582775 ecr 4267237589], length 200: NFS request xid 4135453917 196 getattr fh 0,2/53
16:42:39.790175 IP rocky9.server.nfs > rocky8.client.943: Flags [P.], seq 4253:4361, ack 2504072, win 24568, options [nop,nop,TS val 4267237590 ecr 2571582775], length 108: NFS reply xid 4135453917 reply ok 104 getattr NON 3 ids 0/-530227613 sz 695948683
16:42:39.830384 IP rocky8.client.943 > rocky9.server.nfs: Flags [.], ack 4361, win 12282, options [nop,nop,TS val 2571582816 ecr 4267237590], length 0



User ID Used Soft Hard Warn/Grace
---------- ---------------------------------
user 3.9M 4M 4M 00 [------]

# setquota -u user 5M 5M 1000 1000 -a /opt/nfs
# xfs_quota -x -c 'report -h' /opt/nfs/
User quota on /opt/nfs (/dev/mapper/VGsplunk-LVsplunk)
Blocks
User ID Used Soft Hard Warn/Grace
---------- ---------------------------------
user 3.9M 5M 5M 00 [------]

# tcpdump | grep nfs | grep ERROR

cat data >> FileA.txt
cat: write error: Disk quota exceeded
<nothing in tcpdump>
user 4.0M 5M 5M 00 [------]

cat data >> FileA.txt
cat: write error: Disk quota exceeded
<nothing in tcpdump>
user 4.2M 5M 5M 00 [------]

cat data >> FileA.txt
cat: write error: Disk quota exceeded
<nothing in tcpdump>
user 4.4M 5M 5M 00 [------]

Finally we exceed the new limit:
cat data >> FileA.txt
cat: write error: Input/output error
cat: write error: Disk quota exceeded
16:47:32.739902 IP rocky9.server.nfs > rocky8.client.943: Flags [P.], seq 5185:5285, ack 1505904, win 24568, options [nop,nop,TS val 4267530540 ecr 2571875724], length 100: NFS reply xid 2726233309 reply ok 96 getattr ERROR: Disc quota exceeded
 
