NFS mount disappearing
from rehydrate5503@lemmy.world to selfhosted@lemmy.world on 21 Oct 02:40
https://lemmy.world/post/21085739

Hi all,

I’m having an issue with an NFS mount that I use for serving podcasts through Audiobookshelf. The issue has been ongoing for months, and I’m not sure where the problem is or how to start debugging.

My setup:

The issue:

NFS mount randomly drops. When it does, I need to manually mount it again, then restart the Audiobookshelf container (or reboot the VM, but I have other services).

There doesn’t seem to be any rhyme or reason to the unmount. It doesn’t coincide with any scheduled updates or spikes in activity. No issue on the Unraid side that I can see. Sometimes it drops overnight, sometimes midday. Sometimes it’s fine for a week, other times I’m remounting twice a day. What has finally forced me to seek help is that the other day I was listening to a podcast, paused for 10-15 mins and couldn’t restart the episode until I went through the manual mount procedure. I checked and it was not due to the disk spinning down.

I’ve tried updating everything I could, but the issue persists. I only just updated to Fedora 40. It was on 38 previously and initially worked for many months without issue, then randomly started dropping the NFS mounts (I tried setting up other share mounts and had the same problem). I updated to 39, then 40, and the issue persists.

I’m not great with logs but I’m trying to learn. Nothing sticks out so far.

Does anyone have any ideas how I can debug and hopefully fix this?

#selfhosted

schizo@forum.uncomfortable.business on 21 Oct 03:11

I’m going to have to cut up my nerd card here, but I had similar issues with NFS exports from my roll-your-own build.

After a month of troubleshooting I decided that working is better than purity so I just mounted the SMB shares instead and everything just worked going forward.

Best I can tell, NFS is just very, very finicky when it comes to hardware accessibility (drive spun down, etc.) and network reliability, and is just a lot less robust than other options. I never was able to trace why NFS was the one and only thing that never seemed to work right, but at least there are other options as a workaround?

phanto@lemmy.ca on 21 Oct 04:08

I did the hackiest, lamest thing back in the day… I had my client write the current date and time to a file on the share every two minutes as a Cron job… Kept it working for months! I saw it on a forum somewhere, tried it, and… Shocked Pikachu face I don’t know if I ever disabled that Cron job! Haha!
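For anyone who wants to try the same trick, here is a minimal sketch of that kind of keepalive job. It is not the original script (which wasn't posted); the path `/mnt/podcasts/.keepalive`, the function name `nfs_keepalive`, and the script location in the crontab comment are all made up for illustration.

```shell
#!/bin/sh
# Keepalive sketch: overwrite a small file on the NFS share with the current
# date and time so the mount sees regular traffic.
# KEEPALIVE_FILE is a hypothetical path; point it at a file on your own share.
KEEPALIVE_FILE="${KEEPALIVE_FILE:-/mnt/podcasts/.keepalive}"

nfs_keepalive() {
    # ">" truncates the file, so it only ever holds the latest timestamp.
    date '+%Y-%m-%d %H:%M:%S' > "$1"
}

# Only write when the target directory exists, so a missing mount point
# does not produce errors from cron.
if [ -d "$(dirname "$KEEPALIVE_FILE")" ]; then
    nfs_keepalive "$KEEPALIVE_FILE"
fi

# Hypothetical crontab entry (every two minutes), assuming the script is
# saved as /usr/local/bin/nfs-keepalive.sh:
#   */2 * * * * /usr/local/bin/nfs-keepalive.sh
```

Whether regular writes actually keep a flaky mount alive will depend on what is causing the drops, so treat this as a workaround to test rather than a fix.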

walden@sub.wetshaving.social on 21 Oct 14:54

And as a bonus, presumably you have a nice file filled with historic dates and times!

phanto@lemmy.ca on 21 Oct 16:31

I checked, it’s still there! (It doesn’t append, it overwrites, so no, I just have a file with the current date and time accurate to within two minutes.)

jbloggs777@discuss.tchncs.de on 21 Oct 05:12

NFSv3 (udp, stateless) was always as reliable as the network infra under Linux, I found. NFSv4 made things a bit more complicated.

You don’t want any NAT / stateful connection tracking in the network path (anything that could hiccup and forget), and wired connections only for permanent storage mounts, of course.

schizo@forum.uncomfortable.business on 21 Oct 17:38

Yeah it was NAS -> DAC -> Switch -> endpoints and for whatever reason, for some use cases, it would just randomly hiccup and break shit.

I could never figure out what the problem was and as far as I could tell there was nothing in the network path that stopped working or flapped or whatever unless it did it so fast it didn’t trigger any monitoring stuff, yet somehow still broke NFS (and only NFS).

Figured after a bit that since everything else seemed fine, and the data was being exported via like 6 other methods, that meh, I’ll just use something else.

rehydrate5503@lemmy.world on 22 Oct 01:47

Haha don’t cut it up just yet! I’ll try some of the other options suggested here, as I’d like to learn what the issue is. Worst case, I’ll try SMB.

SpeakinTelnet@programming.dev on 21 Oct 03:12

First thing I’d do is to look at the client (fedora) journal for anything funky happening.

‘sudo systemctl status nfs-client’
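A rough sketch of how the journal could be narrowed down to NFS-related lines. The helper name `nfs_log_lines` is made up, and the pattern is only a guess at what is worth looking for; kernel messages such as "nfs: server ... not responding" would be caught by it.

```shell
#!/bin/sh
# Filter a log stream down to lines that mention NFS or RPC activity.
# nfs_log_lines is a hypothetical helper name, not part of any package.
nfs_log_lines() {
    grep -iE 'nfs|rpc'
}

# Possible usage on the Fedora client:
#   journalctl -k -b --no-pager | nfs_log_lines                  # kernel log, current boot
#   journalctl -k --since "2 days ago" --no-pager | nfs_log_lines
#
# Assumption: the unit that nfs-utils ships for the client side is a target,
# so the full name may be needed when checking its status:
#   systemctl status nfs-client.target
```

Comparing the timestamps of any matches with the time a drop was noticed is probably the quickest way to tell whether the client logged anything at all.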

Since it’s random I assume you won’t have any timeout in your /etc/fstab but it might be worth taking a look anyway.

Be aware that if the network drops the NFS will be disconnected and won’t auto-reconnect so this could also be the issue.

I don’t know if it plays well with container mounted volume, but looking at autofs could be a solution to auto-remount the share. I use it profusely for network mounted home directories.
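As a starting point for the autofs route, here is a sketch that writes a direct map for a single share into a staging directory, so nothing under /etc is touched until you have looked at the result. The server name, export path, mount point, and the function name `write_autofs_maps` are placeholders, and the 300 second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Generate a direct autofs map for one NFS share into a staging directory.
# All three arguments are placeholders for illustration.
write_autofs_maps() {
    outdir=$1       # staging directory, e.g. ./autofs-staging
    export_src=$2   # server:/path (hypothetical, e.g. nas.example:/export/podcasts)
    mount_dir=$3    # local mount point (hypothetical, e.g. /mnt/podcasts)

    mkdir -p "$outdir"
    # Master map entry: "/-" marks a direct map; unmount after 300 s idle.
    printf '/-\t/etc/auto.nfs\t--timeout=300\n' > "$outdir/nfs.autofs"
    # Direct map entry: mount point, mount options, then the export.
    printf '%s\t-fstype=nfs,rw\t%s\n' "$mount_dir" "$export_src" > "$outdir/auto.nfs"
}

# Possible next steps, assuming the distro's auto.master includes
# /etc/auto.master.d/*.autofs (check yours before relying on this):
#   write_autofs_maps ./autofs-staging nas.example:/export/podcasts /mnt/podcasts
#   sudo cp ./autofs-staging/nfs.autofs /etc/auto.master.d/
#   sudo cp ./autofs-staging/auto.nfs /etc/auto.nfs
#   (remove the matching line from /etc/fstab so the two don't fight)
#   sudo systemctl enable --now autofs
```

With a direct map the share is mounted on first access and unmounted when idle, so a dropped connection gets another mount attempt the next time something touches the path. How a container with a bind-mounted volume reacts to that is something to test, as noted above.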

rehydrate5503@lemmy.world on 22 Oct 01:56

Thanks for the detailed reply.

So the command gives me an error that nfs-client cannot be found.

The fstab just has basic default config. No timeout set.

I considered network issues, though it seems to be quite stable for other services. Not ruling it out just yet. I have a new switch coming in the next week, so will test if the issue persists when I put that in.

I will also give autofs a shot.

Thanks!

2xsaiko@discuss.tchncs.de on 21 Oct 03:38

Never seen this before, but you can enable NFS debugging with ‘rpcdebug -m nfs -s all’ (or nfsd on the server, or rpc for the underlying protocol). It prints to dmesg.
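A small sketch of how a debugging session with that command might look. The wrapper names `nfs_debug_on` and `nfs_debug_off` are made up; the `-c` flag clears what `-s` sets, which is worth doing afterwards because the debug output is very chatty.

```shell
#!/bin/sh
# Toggle kernel NFS client debugging. Both rpcdebug calls need root.
# The function names are hypothetical wrappers for convenience.
nfs_debug_on() {
    rpcdebug -m nfs -s all    # set all NFS client debug flags
}

nfs_debug_off() {
    rpcdebug -m nfs -c all    # clear them again
}

# Possible session, run as root:
#   nfs_debug_on
#   dmesg --follow | grep -i nfs    # leave running until the mount drops
#   nfs_debug_off
```

Since the drops are random, the kernel ring buffer may wrap before anyone looks at it, so reading the same messages back later with `journalctl -k` might be more practical than watching `dmesg` live.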

rehydrate5503@lemmy.world on 22 Oct 01:45

Thank you, will try this when I have time later this week.

jonno@discuss.tchncs.de on 21 Oct 04:20

Are you losing the mounts after a reboot? As in, are you mounting via /etc/fstab?

rehydrate5503@lemmy.world on 22 Oct 01:44

They are mounted via the gui, but it just puts the mount into fstab. I checked the config there and it is just the standard default options for an nfs mount.

Edit: and no, I don’t lose it on reboot. Reboot re-mounts the share correctly.

AbidanYre@lemmy.world on 21 Oct 12:02

Try autofs

rehydrate5503@lemmy.world on 22 Oct 01:56

I’ll give that a shot, thanks!

saywhatisabigw@lemmy.world on 21 Oct 15:29

Could it be suspended due to power saving settings?

rehydrate5503@lemmy.world on 22 Oct 01:42

Doesn’t seem like it as far as I can tell.

possiblylinux127@lemmy.zip on 21 Oct 17:27

How is it mounted?

rehydrate5503@lemmy.world on 22 Oct 01:42

Through the Cockpit gui, which just puts it into fstab. I checked the fstab config and it just has the basic default settings.

Evoliddaw@lemmy.ca on 22 Oct 20:40

After spending quite some time troubleshooting years ago, I chalked up the odd NFS disconnection once every month or two to random network errors. Autofs looks like it’ll work, but my band-aid solution was just to add a mount command for the NFS share to cron every 5 minutes. Took me 15 seconds and I haven’t had to look at it since.
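A sketch of what such a cron job could look like, with a check added so that `mount` only runs when the share is actually missing. The mount point `/mnt/podcasts`, the function names, and the script path in the crontab comment are assumptions, and the share is assumed to have an entry in /etc/fstab already.

```shell
#!/bin/sh
# Remount sketch for cron: mount the share only when it is not mounted.
# MOUNT_DIR is a hypothetical mount point that must already be in /etc/fstab.
MOUNT_DIR="${MOUNT_DIR:-/mnt/podcasts}"

is_mounted() {
    # mountpoint (util-linux) exits 0 only if the path is a mount point.
    mountpoint -q "$1"
}

remount_if_missing() {
    if is_mounted "$1"; then
        return 0
    fi
    # With only the directory given, mount takes the device and options
    # from the matching fstab entry.
    mount "$1"
}

# Skip quietly if the mount point directory itself does not exist.
if [ -d "$MOUNT_DIR" ]; then
    remount_if_missing "$MOUNT_DIR"
fi

# Hypothetical root crontab entry (every 5 minutes), assuming the script is
# saved as /usr/local/bin/nfs-remount.sh:
#   */5 * * * * /usr/local/bin/nfs-remount.sh
```

This only helps when the share has really disappeared from the mount table. A share that is still listed but has stopped responding would pass the check, and any container using the path may still need a restart after the remount.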