In this blog post, I’m sharing a few troubleshooting tips related to File Server Networking in Windows Server 2012 R2.
For each one, I first describe the issue that is commonly reported, followed by a quick explanation of the root cause of the issue and finally a way to solve it.
Let me know if these are helpful, and feel free to share your own File Server Networking issues in the comments.
1. Make sure your network interfaces are RSS-capable
Issue
- Certain 10GbE NICs won’t perform as well as others
- Multichannel might not aggregate multiple 10GbE in certain configurations
Cause
- Some 10GbE NICs show as non-RSS capable. Without RSS, SMB uses 1 TCP connection.
- If mixing RSS and non-RSS NICs, Multichannel will use only the RSS-capable NICs
Solution (both are complete solutions, choose one)
- Server-class NICs should show as RSS-capable. Update the driver. Check RSS settings.
- If you must mix RSS and non-RSS NICs, you might need to disable RSS on the RSS-capable NICs so that Multichannel uses all of them and aggregates throughput.
- Blog: Make sure your network interfaces are RSS-capable
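As a minimal sketch of the "check RSS settings" step, the commands below show how the driver and the SMB client each report RSS. The adapter name "Ethernet 2" is a placeholder for illustration; substitute the name of your own NIC.

```powershell
# List each adapter and whether RSS is currently enabled in the driver.
Get-NetAdapterRss | Format-Table Name, InterfaceDescription, Enabled

# Show how the SMB client sees each interface (RSS capable, RDMA capable, speed).
Get-SmbClientNetworkInterface

# Enable RSS on one adapter. "Ethernet 2" is a hypothetical adapter name.
Enable-NetAdapterRss -Name "Ethernet 2"
```

If a server-class NIC still shows as not RSS capable after enabling RSS, a driver update is the next thing to try.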
2. Use Multiple VNICs on the host
Issue
- When using a VNIC on the host, I cannot achieve maximum performance.
- This is noticeable when using RSS-capable 10GbE NICs and/or NIC Teaming.
Cause
- The VNIC on the host is not RSS capable. Without RSS, SMB uses only 1 TCP connection.
Solution
- Use multiple VNICs to make sure you have multiple connections.
- Blog: Make sure your network interfaces are RSS-capable
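One way to create several host VNICs is sketched below. The switch name "VSwitch1" and the VNIC names "SMB1" and "SMB2" are assumptions for illustration, and the commands assume the Hyper-V virtual switch already exists.

```powershell
# Add two virtual NICs to the management OS (the host) on an existing virtual switch.
# "VSwitch1", "SMB1" and "SMB2" are hypothetical names.
Add-VMNetworkAdapter -ManagementOS -Name "SMB1" -SwitchName "VSwitch1"
Add-VMNetworkAdapter -ManagementOS -Name "SMB2" -SwitchName "VSwitch1"

# Confirm that the host now has both VNICs.
Get-VMNetworkAdapter -ManagementOS
```

Since each non-RSS VNIC gets one TCP connection, two VNICs should give SMB Multichannel two connections to work with.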
3. SMB prefers slow NIC to faster VNICs
Issue
- When using a VNIC on the host, SMB prefers a slower physical NIC to the VNIC.
- This is important when using RSS-capable 10GbE NICs and/or NIC Teaming.
Cause
- The VNIC on the host is not RSS capable and the physical NIC is RSS capable.
- SMB will always prefer RSS NICs to non-RSS NICs, even if at slower speeds.
Solution (both are complete solutions, choose one)
- Disable the RSS-capability of the physical NIC
- Use SMB Multichannel Constraints to prefer the VNICs
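Either option could look roughly like the sketch below. The file server name "FS1", the VNIC aliases and the physical adapter name "Ethernet 1" are placeholders, not names from a real deployment.

```powershell
# Option 1: disable RSS on the slower physical NIC so SMB no longer prefers it.
# "Ethernet 1" is a hypothetical adapter name.
Disable-NetAdapterRss -Name "Ethernet 1"

# Option 2: restrict SMB connections to file server "FS1" (hypothetical) to the host VNICs.
New-SmbMultichannelConstraint -ServerName "FS1" -InterfaceAlias "vEthernet (SMB1)", "vEthernet (SMB2)"

# Review the constraints currently in place.
Get-SmbMultichannelConstraint
```

Constraints are defined per server name on the SMB client, so they only affect traffic to the servers you list.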
4. More TCP connections for SMB inside a VM
Issue
- In Windows Server 2012, I got 1 TCP connection per VMNIC. In R2 I now get 4.
- Why has this changed? Is there a problem?
Cause
- With the new Virtual RSS feature, VMNICs now report themselves as RSS-capable.
Solution
- This is by design. Enjoy the increased performance...
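If you want to see this for yourself inside the VM, the sketch below lists the Multichannel connections and counts established TCP connections to port 445. It assumes some SMB traffic to a file server is already in flight.

```powershell
# Show the interfaces Multichannel selected and whether the client side is RSS capable.
Get-SmbMultichannelConnection

# Count established TCP connections to SMB (port 445) from this machine.
Get-NetTCPConnection -RemotePort 445 -State Established -ErrorAction SilentlyContinue |
    Measure-Object
```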
5. More connections to a Scale-Out File Server
Issue
- Windows Server 2012 R2 creates more connections to Scale-out File Servers.
- Why has this changed? Is there a problem?
Cause
- Windows Server 2012 used one set of connections per Scale-Out File Server.
- Windows Server 2012 R2 uses one set of connections per share on a SOFS when this helps avoid server-side redirection (typical case: mirrored Storage Spaces)
Solution
- This is by design. Enjoy the increased performance...
6. Use multiple subnets when deploying SMB Multichannel in a cluster
Issue
- When using multiple NICs on the same subnet in a cluster, only one is used
Cause
- Cluster networking won’t use more than one NIC per subnet
- You can confirm by using the cmdlet Get-SmbServerNetworkInterface
Solution
- Configure each NIC to use a different subnet. Make sure cluster shows multiple networks.
- Blog: Use multiple subnets when deploying SMB Multichannel in a cluster
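A quick way to confirm the configuration is sketched below, assuming you run it on a node of the file server cluster with the Failover Clustering cmdlets installed.

```powershell
# Each subnet should appear as its own cluster network.
Get-ClusterNetwork | Format-Table Name, Address, AddressMask, Role, State

# List the interfaces the SMB server is offering to clients.
Get-SmbServerNetworkInterface
```

If two NICs share a subnet, expect a single cluster network for that subnet and only one of the two NICs in the SMB server list.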
7. Update firmware and driver for your NICs
Issue
- I’m using an RDMA NIC, but SMB reports the NIC as not RDMA capable.
Cause
- NICs with older firmware or driver might not report themselves correctly.
Solution
- Update the firmware and driver with the latest from the manufacturer’s website.
- Link: Latest RDMA drivers for Chelsio for Windows Server 2012 R2 (Select latest Unified Wire driver)
- Link: Latest RDMA drivers for Mellanox for Windows Server 2012 R2
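Before and after updating, you can check what the adapter and SMB report with something like the sketch below. It only reads state and makes no changes.

```powershell
# Does the driver report RDMA as available and enabled?
Get-NetAdapterRdma | Format-Table Name, InterfaceDescription, Enabled

# Which driver version and date is each adapter running?
Get-NetAdapter | Format-Table Name, DriverVersionString, DriverDate

# How does the SMB client classify each interface (RSS capable, RDMA capable)?
Get-SmbClientNetworkInterface
```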
8. How much traffic needs to pass between the SMB Client and Server before Multichannel actually starts?
Issue
- SMB3 always starts with a single TCP/IP connection, then moves to multiple TCP/IP connections or RDMA.
- Concern with timing of initial handshake and speed of transition to faster behavior.
Cause
- SMB Multichannel is used to discover RSS and RDMA capabilities.
- For server SKUs, Multichannel starts on the first read or write operation.
- For client SKUs, Multichannel won’t start unless you’re doing some amount of work.
Solution (both are complete solutions, choose one)
- On server SKUs, don’t worry. This happens fast and is done only once per session.
- On client SKUs, there is a registry configuration to use the server behavior, if necessary.
- Blog: How much traffic needs to pass between the SMB Client and Server before Multichannel actually starts?
9. Can I use SMB3 storage without RDMA?
Issue
- Concern around performance of SMB3 without RDMA NICs
Cause
- We have talked so much about RDMA...
Solution (both items contribute to the solution)
- Using non-RDMA NICs is fine. You can get good performance with non-RDMA 10GbE.
- Watch your CPU utilization. Make sure you’re using RSS NICs.
- Blog: Can I use SMB3 storage without RDMA?
10. Is it possible to run SMB Direct from within a VM?
Issue
- Running file server or SQL Server in a VM
- Desire to use RDMA networking from the guest
Cause
- Windows Server 2012 and Windows Server 2012 R2 cannot do RDMA to the guest
Solution (both are complete solutions, choose one)
- Run the workload on the bare metal
- Consider solutions that use the Hyper-V storage path (Hyper-V over SMB, Shared VHDX)
- Blog: Is it possible to run SMB Direct from within a VM?
11. Use Client Access network for CSV Traffic
Issue
- Limited performance when accessing a scale-out file server in Windows Server 2012
- Happens when the client connects to a non-owner node and redirection is required
Cause
- The client access network is high speed RDMA but the cluster network is not
- Redirection is happening over cluster network only (usually a 1GbE NIC)
Solution (both are complete solutions, choose one)
- Enable option to use client network for CSV traffic: (Get-Cluster).UseClientAccessNetworksForSharedVolumes=1
- Upgrade to Windows Server 2012 R2 (automatic rebalancing avoids redirection)
- Blog: Automatic SMB Scale-Out Rebalancing in Windows Server 2012 R2
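For the first option, a small sketch that reads the current value before changing it. Run it on a cluster node with the Failover Clustering cmdlets available.

```powershell
# Read the current setting (0 means CSV traffic stays off the client access networks).
(Get-Cluster).UseClientAccessNetworksForSharedVolumes

# Allow CSV redirection traffic to use the client access networks.
(Get-Cluster).UseClientAccessNetworksForSharedVolumes = 1
```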
12. Single file copy performance
Issue
- Limited performance when copying a single large file to a scale-out file server
- Using a 10GbE connection, throughput stays below 150MB/sec
Cause
- File extension is done in 1MB increments, serialized in write-through mode
- This leads to loss of the asynchronous nature of SMB2/3
Solution (both items contribute to the solution)
- Copy multiple files in parallel (ROBOCOPY /MT)
- Apply Windows Server 2012 R2 GA patch (changes to 8MB increments)
- KB article and patch download - http://support.microsoft.com/kb/2883200
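Both parts of the solution can be sketched as follows. The source folder, the share path \\SOFS1\Share1 and the thread count of 16 are assumptions for illustration. Note that /MT parallelizes across files, so it helps when copying many files rather than one.

```powershell
# Copy a folder tree using 16 copy threads.
# C:\Source and \\SOFS1\Share1 are hypothetical paths.
robocopy C:\Source \\SOFS1\Share1 /E /MT:16

# Check whether the update from the KB article above is installed.
# Get-HotFix reports an error if the update is not found.
Get-HotFix -Id KB2883200
```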