Hello Michael,
may I ask which version of NFS you are using in your environment, and what problems you have had with NFSv4.1?
Especially the performance ones.
Best regards
Florian
Ain’t that the truth!
I had a customer running 18-month-old Red Hat kernels (not patched for a long time).
They upgraded ONTAP to something modern and either did not realize (likely) or forgot they were using NFSv4.1/pNFS. As soon as ONTAP finished booting the last node onto the latest version, ALL the RHEL boxes panicked. Upon review it was due to a kernel bug that had been fixed 15 months earlier. The fix was simple: change the mounts to v3.
Same thought here: if you go down the 4.1 path, it is best to keep everything current, both client and server, to avoid nasty bugs.
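As a sketch of that simple fix, an /etc/fstab entry pinning the mount to NFSv3 might look like this (server name, export path, and mount point here are placeholders, not the customer's actual config):

```
# /etc/fstab -- pin the NetApp export to NFSv3 instead of v4.1/pNFS
filer01:/vol/data   /mnt/data   nfs   vers=3,hard,tcp   0 0
```

A remount (or umount/mount cycle) is needed for the change to take effect on an already-mounted filesystem.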
________________________________
From: Toasters <toasters-bounces(a)teaparty.net> on behalf of Michael Bergman via Toasters <toasters(a)teaparty.net>
Sent: Tuesday, August 17, 2021 7:52:28 AM
To: Toasters <toasters(a)teaparty.net>
Subject:
_______________________________________________
Toasters mailing list
Toasters(a)teaparty.net
https://www.teaparty.net/mailman/listinfo/toasters
On 2021-08-17 12:44, Sebastian Goetze wrote:
> @Florian: Could be, that NFS4.0/4.1 referrals are the solution to keeping
> your paths local, especially, if your volumes do not move around...
I concur, could be. We hardly use NFSv4.1 at all (yet), for various reasons, one main one being performance issues due to the nature of our heavy NFS workloads.
W.r.t. performance in the NFS clients, if that's a worry, nconnect=x, available in later kernels (e.g. in RHEL 8.x), can help speed things up. Quite a lot, in fact. It depends on the workload patterns you have.
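A hedged sketch of what that looks like on the client (server, export, and the connection count 8 are placeholders; nconnect needs a reasonably recent kernel, e.g. RHEL 8.3+):

```
# Open 8 TCP connections to the NFS server instead of the default single one
mount -t nfs -o vers=4.1,nconnect=8 filer01:/vol/scratch /mnt/scratch
```

Whether it helps depends on whether the single TCP connection was actually the bottleneck for your workload.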
pNFS is interesting and can be good for certain applications and their workloads, but it's cumbersome to work with and to get working smoothly. The clients running pNFS have been the bigger problem to get to behave well, in past years at least. Nowadays, with e.g. RHEL 8.x, it's possibly better.
I would say it's still a VERY good idea that *all* your pNFS clients run exactly the same Linux kernel and version, i.e. the same distro and version of it. If you can't build such an environment, I'd recommend refraining from pNFS and mitigating any perceived or proven performance bottlenecks in other ways (e.g. nconnect=x can help, as can a different scale-out config of a new ONTAP cluster, large FlexGroup(s), etc.)
/M
Good morning,
thank you very much for all those replies.
I really hadn't thought of that :)
Yes, our network wasn't designed for pNFS at all, because we have never used it.
We have now started to use clients which use it by default, and we have encountered this problem.
Our storage network has only ever been switched, so no routing is/was possible, and we always had to choose the right IP range for the client and node.
We also normally don't have huge clusters, so no volume movement either.
But I had really thought that the pNFS implementation was at least broadcast-domain aware; it seems it isn't.
This means that pNFS does not work in our environment. That's OK for us.
Thanks a lot for the deeper insight you all have given me.
Best regards
Florian Schmid
On 2021-08-16 18:41, Justin Parisi wrote:
> I think it was "working as designed." RFE is several years old; filed it
> when I was still in support.
It is. Working as designed. That's my take on this.
Meaning: if you misunderstand things a bit and design your network around the
cluster in a way that doesn't really make sense for running and leveraging
pNFS, you'll have to turn pNFS off, and then it'll work again for the clients.
Back to the drawing board: re-design the network, e.g. move your LIFs around
so that they're all in the same VLAN (subnet). We don't run pNFS anywhere,
but that's the customary way we do it for our very large clusters (our
biggest is a 20-node now, in a refresh state between 5-year-old FAS8080s and
new A800s), and has been for many years. And *everything*, all NFS traffic,
always goes across at least one L3 hop. Nothing else (no NFS clients!) is
allowed in the VLANs where our LIFs sit. That way we "own" that VLAN
completely and can control it in a good fashion.
BTW: Wow. A several-years-old RFE from *you* that hasn't been implemented
yet. Not much demand for it then, I'd say.
/M
-------- Original Message --------
Subject: Re: pNFS not working on vserver when several lifs with different
Vlans available
Date: Mon, 16 Aug 2021 12:17:45 -0400
From: tmac <tmacmd(a)gmail.com>
To: Parisi, Justin <Justin.Parisi(a)netapp.com>
CC: Michael Bergman <michael.bergman(a)ericsson.com>, Toasters
<toasters(a)teaparty.net>
Wow...Thanks Justin.
I guess that was possibly an oversight when developed (and hence the RFE)?
--tmac
*Tim McCarthy*, /Principal Consultant/
*Proud Member of the #NetAppATeam*
*I Blog at TMACsRack*
On Mon, Aug 16, 2021 at 12:09 PM Parisi, Justin via Toasters <
toasters(a)teaparty.net> wrote:
> FYI TR-4067 calls out this exact issue as well. There’s currently no way
> to “blacklist” ineligible interfaces for pNFS. If it’s in the SVM, it will
> be used with pNFS, regardless of whether it’s reachable.
>
> We have an open RFE to allow you to specify LIFs for pNFS to avoid this
> problem. In the meantime you can blacklist devices from the client as per
> page 54 of the TR.
>
> https://www.netapp.com/pdf.html?item=/media/10720-tr-4067.pdf
FYI TR-4067 calls out this exact issue as well. There’s currently no way to “blacklist” ineligible interfaces for pNFS. If it’s in the SVM, it will be used with pNFS, regardless of whether it’s reachable.
We have an open RFE to allow you to specify LIFs for pNFS to avoid this problem. In the meantime you can blacklist devices from the client as per page 54 of the TR.
https://www.netapp.com/pdf.html?item=/media/10720-tr-4067.pdf
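On the Linux client side, one quick way to see whether a mount is actually exercising pNFS is to look at the layout-operation counters in /proc/self/mountstats. A minimal parsing sketch (the exact per-op column layout varies by kernel, so this only reads the first number, the call count, after each LAYOUT* op name):

```python
import re

def pnfs_layout_ops(mountstats_text: str) -> dict:
    """Sum the first (call-count) column for each pNFS layout op
    (LAYOUTGET, LAYOUTCOMMIT, LAYOUTRETURN) in mountstats-style text."""
    ops: dict = {}
    for line in mountstats_text.splitlines():
        m = re.match(r"\s*(LAYOUT\w+):\s+(\d+)", line)
        if m:
            ops[m.group(1)] = ops.get(m.group(1), 0) + int(m.group(2))
    return ops

# Usage on a real client (requires Linux with NFS mounts):
#   with open("/proc/self/mountstats") as f:
#       print(pnfs_layout_ops(f.read()))
```

A mount with zero LAYOUTGET calls is falling back to plain NFSv4.1 through the metadata path.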
________________________________
From: Toasters <toasters-bounces(a)teaparty.net> on behalf of Michael Bergman via Toasters <toasters(a)teaparty.net>
Sent: Monday, August 16, 2021 11:10:12 AM
To: Toasters <toasters(a)teaparty.net>
Subject:
Exactly. I was too terse in my reply.
Plus a blurry distinction b/w "striped" and "distributed", where one can
argue (as you did) that striping is a special case of distribution. Which it
is, I agree.
As for the metadata server concept in pNFS, see some details here:
<https://datatracker.ietf.org/doc/html/rfc8434#section-12>
There is, conceptually, one (1) metadata server. It has to keep track of a
whole bunch of things, the clients *must* talk NFSv4.1 to it, and it has to
control the so-called layouts that pNFS clients are given, or revoke layouts
when and if applicable. The ONTAP implementation may "stripe" the metadata
server (!) ;-) somehow, but still, I really don't know -- pNFS is cumbersome
to set up and I have no incentive (or interest) to do it for any use case
strong enough in our internal Storage Service.
An excerpt from the above doc:
data server (DS): a pNFS server that provides the file's data when
the file system object is accessed over a file-based protocol.
Note that this usage differs from that in [RFC5661], which applies
the term in some cases even when other sorts of protocols are
being used. Depending on the layout, there might be one or more
data servers over which the data is striped. While the metadata
server is strictly accessed over the NFSv4.1 protocol, the data
server could be accessed via any file access protocol that meets
the pNFS requirements.
metadata server (MDS): the pNFS server that provides metadata
information for a file system object. It is also responsible for
generating, recalling, and revoking layouts for file system
objects, for performing directory operations, and for performing
I/O operations to regular files when the clients direct these to
the metadata server itself.
See also <http://www.pnfs.com/>
Again: how ONTAP implements the metadata server, I'm not sure. Presumably in
a scale-out way, as cleverly as possible with the technology already
available to NetApp in ONTAP.
There have been (still are..?) ideas on how to stripe the metadata server
function across many devices as well, in a fully standardised way, but I'm
not clear on the current status of this. I only find old references now
(really old, 5+ y):
<https://datatracker.ietf.org/doc/html/draft-mbenjamin-nfsv4-pnfs-metastripe…>
Maybe I'm missing something, comments welcome.
I also stumbled across this [URL below]; it's 10+ y old but I've never seen
it before. It was an interesting read; so many of the things mentioned in
this slide pack have taken place and matured in the past decade. Not bad,
NetApp :-) It took a wee bit longer than we all wanted, sure, but it works
and one can trust it in a VERY BIG system (cluster).
<http://www.nfsv4bat.org/Documents/ConnectAThon/2010/wafl-unstriped.pdf>
/M
Just out of curiosity, what is the output of "network interface show
-vserver pnfs-svm"
The answer might lie there.
I doubt it is a VLAN issue and more a network design issue.
I found this:
https://mysupport.netapp.com/site/article?lang=en&page=%2FAdvice_and_Troubl…
Issue
- When an NFS client attempts to read/write to the SVM with pNFS enabled,
the operation hangs, with "nfs: [LIF address / hostname] server not
responding" in /var/log/messages.
Cause
- This issue occurs in situations where pNFS is enabled on an SVM and
there are LIFs created in different subnets, which are non-routable by the
client
- Data ONTAP may provide any LIF which belongs to the same SVM as the
referral to a client
- The read/write operation hangs because the data LIF IP is not
reachable by the client
Solution
- Make sure clients can reach all data LIFs of the SVM
-or-
- Disable support for pNFS on the SVM
::> nfs modify -v4.1-pnfs disabled
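Following the first solution, a quick way to test from a client whether every data LIF of the SVM is reachable is to try the NFS TCP port on each one. A small sketch (the LIF addresses are placeholders for your own `network interface show` output):

```python
import socket

def lif_reachable(ip: str, port: int = 2049, timeout: float = 2.0) -> bool:
    """True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Hypothetical data LIFs of the pNFS SVM -- substitute your own
    for ip in ["192.0.2.11", "192.0.2.12", "198.51.100.11"]:
        state = "reachable" if lif_reachable(ip, timeout=1.0) else "NOT reachable"
        print(f"{ip}: {state}")
```

Any "NOT reachable" LIF is one that pNFS may still hand out as a referral, which is exactly the hang described above.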
--tmac
*Tim McCarthy*, *Principal Consultant*
*Proud Member of the #NetAppATeam <https://twitter.com/NetAppATeam>*
*I Blog at TMACsRack <https://tmacsrack.wordpress.com/>*
On Mon, Aug 16, 2021 at 9:16 AM Florian Schmid via Toasters <
toasters(a)teaparty.net> wrote:
> ---------- Forwarded message ----------
> From: Florian Schmid <fschmid(a)ubimet.com>
> To: toasters <toasters(a)teaparty.net>
> Date: Mon, 16 Aug 2021 13:13:15 +0000 (UTC)
> Subject: pNFS not working on vserver when several lifs with different
> Vlans available
> Hi,
>
> I had a very strange incident some weeks ago.
>
> We are on 9.7 branch.
>
> In one of our vservers, where we had NFSv4 and 4.1 enabled, clients with
> NFSv4.1 and pNFS were unable to read or write data.
> Directory listing was possible.
>
> On the vserver, we have several different VLANs and each VLAN has an
> interface on each node there. Our storage network is completely switched,
> so no routing between VLANs is possible.
>
> After creating a NetApp case, they told me that pNFS doesn't handle VLANs
> and the client will get any IP on that node, even when the IP is from a
> different VLAN and therefore not accessible from the client.
>
> Is this really the case? Is the pNFS process really not VLAN aware, and
> does it send any IP from the node where the volume is located, even when
> the IP is in a different network than the client IP?
>
> As suggested by NetApp, I have disabled pNFS and now everything is
> working.
>
> What is your experience here?
> Is anybody using pNFS with NetApp, and if yes, do you only have one network
> in your vserver, or is the client able to access all IPs there through
> different VLANs?
>
> I'm curious about your answers.
>
> Best regards
> Florian Schmid
I would respectfully disagree.
pNFS does not stripe data across controllers. Data is distributed, not
striped.
Every node is a metadata server and will redirect the client process to the
node directly holding the volume. No striping.
Maybe you are confusing pNFS with FlexGroups, which distribute files across
nodes.
A very long time ago, in ONTAP GX, there >>was<< an ability to stripe data
across controllers. That feature was removed upon going to ONTAP 8.0.
--tmac
*Tim McCarthy*, *Principal Consultant*
*Proud Member of the #NetAppATeam <https://twitter.com/NetAppATeam>*
*I Blog at TMACsRack <https://tmacsrack.wordpress.com/>*
On Mon, Aug 16, 2021 at 9:36 AM Michael Bergman via Toasters <
toasters(a)teaparty.net> wrote:
> On 2021-08-16 15:13, Florian Schmid wrote:
> > Is this really the case? Is the pNFS process really not Vlan aware and
> > sends any IP from that node where volume is located, even when the IP is
> > in a different network than the client IP?
>
> Yes. That's how pNFS is supposed to work. That's the whole point, one can
> say, of pNFS. It's striping of the data traffic (not metadata) across
> several nodes, all at once.
>
> In that respect, the setup you say you have, this:
>
> "On the vServer, we have several different Vlans and each Vlans has an
> interface for each node there. Our storage network is completely switched,
> so no routing between Vlans possible."
>
> is totally wrong for pNFS. In such a setup you definitely do not want
> pNFS.
>
> /M