The most recent Cisco Nexus implementation I was involved in had some challenges. Since you can’t really have an in depth discussion on Twitter due to the 140 character limit, I figured I would throw this post together and share in the fun that is Cisco Nexus switch deployments. Maybe it will help out with someone’s Cisco Nexus implementation. As always, comments are always welcome. Maybe you will see something I missed, or have a question regarding why something was done a certain way.
I have removed any identifiers to the actual customer this network belongs to and received permission from the customer to post this.
This was going to be a standard Nexus 7010, dual core install with a handful of FEX 2232/2248’s for TOR server connectivity. Ordinarily this is a pretty straightforward install. The Nexus 7000 series switch does not support dual-homing the 2200 series FEX like the Nexus 5000 series does. However, there is a way to connect a FEX to a pair of Nexus 7000’s and have it fail over if one of the 7000’s fails. You just have to wait about 90 seconds or so for the FEX to come online with the second Nexus 7000. This particular configuration is not supported by TAC(that I know of), so getting it to work is done without the assistance of Cisco.
Here’s what the network would look like at this particular data center after the Nexus 7010’s were put in place:
Know what those non-Cisco switches connected into each Nexus 7010 mean from a Spanning Tree perspective? You guessed it. MST. That changes things a little bit, but not too much.
I should also point out that these Nexus 7010’s were purchased with a single N7K-M132XP/12 line card in each chassis. That means the only way into the box was via a 10Gbps connector. We had a 2Gbps circuit connecting this data center to another as the primary link, and a backup 1Gbps circuit connecting to that same remote data center. The 2Gbps circuit came via a 10Gbps handoff, so connecting it to the primary Nexus 7010 was not a problem. The other circuit came via a 1Gbps handoff, so that ruled out using the secondary Nexus 7010 as there were no 1Gbps interfaces. There was the option to buy a module for the Nexus 7010 that supported 1Gbps fiber connections, but since that particular card lists at $27,000(It appears you CAN buy anything on Amazon!), even with a decent discount, that’s a lot of money to spend just for a single 1Gbps connection. That also still doesn’t fix the inability to have redundant connections to a FEX with fast failover capabilities.
Due to issues around redundancy and flexibility of 1/10Gig connectivity, the design was changed to include Nexus 5548UP switches. A pair of 5548UP’s were purchased, and the design changed to the following:
*Note – Although the orange oval is listed as FEX 101 in the legend, it was actually several FEX’s numbered 101, 102, 103, etc. I realized the image looked confusing, but I am too lazy to go back and re-do the Visio drawing. Just know that it represents multiple FEX’s.
Implementation went fine. Everything appeared to be functioning normal. vPC was up on the 7k’s as well as the 5k’s. I tried to make the drawing above reflect what the configuration would look like. I didn’t want to drop lines and lines of config for 4 devices as it would really inflate the size of this post. Each Nexus 5548UP connected to both Nexus 7010′s via vPC. Each FEX connected to both Nexus 5548UP’s via vPC as well.
During testing right after the implementation, one problem did surface. An iSCSI storage array connected to one of the Nexus 2248’s was unable to communicate with another iSCSI storage array in a different data center. Basic ICMP(ie ping) communication was possible, but it was erratic in dropping some of the traffic. It looked like an asymmetrical flow issue where one path was getting black-holed or dumped somewhere along the way. There was also no direct ratio of pings that responded to pings that did not respond. We also tried to restart the replication process as ping is not a foolproof test. However, all ping tests were THROUGH the Nexus gear and not TO the Nexus gear. In the case of the Nexus 7000, there is a built in control plane policing policy that will start dropping excessive ICMP traffic sent TO itself. I’ve personally witnessed an issue where someone thought the Nexus 7000’s we just installed for them were messed up because they were throwing massive amounts of pings at the Nexus 7000’s, and they were not getting 100% of ping requests answered. Just because there is no response doesn’t mean there is a problem. Sometimes it is by design.
Due to the replication failure, we started to unplug some things. The FEX 2248 in question was unplugged from the second Nexus 5548UP it was connected to. That secondary Nexus 5548UP still had connectivity to the primary Nexus 5548UP and both Nexus 7010’s. The problem still happened.
Next, we reconnected the FEX 2248 to the secondary Nexus 5548UP. Then, we disconnected the secondary Nexus 7010 from the secondary Nexus 5548UP. The problem still occurred.
We ended up disconnecting all FEX 2232/2248’s and both Nexus 7010’s from the secondary Nexus 5548UP. The only thing the secondary Nexus 5548UP was connected to was the primary Nexus 5548UP. Everything stabilized after we isolated the secondary Nexus 5548UP.
Lab It Up!
The following week, I was able to recreate the same environment we experienced the failures in except for the iSCSI arrays. I had another pair of Nexus 7010’s and Nexus 5548UP’s for another data center this customer was going to upgrade in the near future. I also had several FEX’s to use for testing along with a 3750 switch that was going to serve as the connection point for the management interfaces on each Nexus 7010/5548UP. To simulate the remote data center, I had a Cisco 3560-X switch with a 10Gbps interface module to connect to the primary Nexus 7010.
I plugged a box into the Nexus 2248 FEX that would be the remote target. I plugged my laptop into the 3560-X using a completely separate VLAN than the link between the 3560-X and Nexus 7010. This ensured I was crossing multiple layer 3 boundaries between my system and the simulated remote target host. Testing showed no problems whatsoever. No packet loss. Nothing.
If At First You Don’t Succeed……
The next week, we went back to the data center and reconnected everything back up to the second Nexus 5548UP. The configuration of the Nexus 7010’s and Nexus 5548UP’s were modified to where the configuration reflected this:
The iSCSI arrays were now able to connect to each other across the WAN. However, another problem surfaced. We weren’t able to connect to the management side of the Nexus 5548UP’s consistently.
Remote Management Woes
While administration of the Nexus 7010’s was in-band via IP addresses assigned to SVI’s, management access to the Nexus 5548UP’s is only possible via the management interface itself. The problem is that this particular client doesn’t have an out of band dedicated management network. These Nexus 7010 switches and Nexus 5548UP switches are going to be the only switches left in place at this data center once everything gets migrated off of the other vendor switches and onto the FEX 2248’s and FEX 2232. Due to that, the gateway for the Nexus 5548UP’s resides on the Nexus 7010’s. Normally, this wouldn’t be that big of a problem. Just plug the management interfaces into a free port on the Nexus 7010’s and be done with it. There’s just one small problem with that. These 7010’s only have 10Gig interfaces on them. Checking Amazon again, this module for 48x1Gbps copper connections is listed at $27,000. Factor in your discount, but even at 50% off, that’s still over 10k per Nexus 7010, and we would need 2 of them.
In order to get the management interfaces connected to the Nexus 7010’s for layer 3 reachability for remote management, they had to connect to a third party switch. We used a Cisco 3560G switch since it was on hand, and had copper and fiber capabilities. The management interfaces from the Nexus 5548UP’s connected into the copper Gig ports on the 3560G. The 3560G connected to the Nexus 5548UP’s via the 1Gig SFP slots. Then, the management traffic would ride the normal 10Gbps uplinks from the 5548UP’s to the 7010’s. Simple right? The black lines below represent regular links between the 7010′s and the 5548UP’s. They also represent the fiber connections from the 3560G to the 5548UP’s. The red lines are the management interface connections from the 5548UP’s to the 3560G switch. The green lines are the management interface connections from the 7010′s to the 3560G switch.
Back to the problem at hand. We could reach the Nexus 7010’s from across the WAN with no problems. We weren’t able to reach the Nexus 5548UP’s from across the WAN or from the local Nexus 7010′s. Additionally, remote access to the 3560 was pretty much non-existent. From the console port of the 3560, I could connect to both 5548UP’s. From the 5548UP’s and the 3560, Spanning-Tree looked fine. This was an MST environment, so spanning-tree was pretty simple to troubleshoot. We found that if we disconnected one of the uplinks from the 3560G SFP port to a 5548UP, the problem went away. Reconnect the second uplink and the problem reappeared. The inter-switch links connecting the 3560G with the 5548UP’s were setup as access ports. The only VLAN flowing across those links was the management VLAN that the 3560G and the 5548UP’s were using for remote device administration. Also, in the case of the 5548UP’s, the management interfaces were also where the vPC keepalive link was setup. For the 7010′s, I just needed 2 IP’s that could talk to each other through the 3560G switch. I followed Chris Marget’s suggestion and used 169.254.1.X addresses for the 7010 management IP’s. These IP’s were only used for the vPC keepalive link.
Testing went on for a bit and then the client made a suggestion that ultimately fixed it. He asked if we could setup both uplinks from the 3560 as a vPC link. Of course! The 5548UP’s were reconfigured for vPC on the interfaces connecting to the 3560. The 3560’s SFP interfaces were configured as a standard port-channel as well. The problem went away after that. The final management connectivity looked like the following:
A few questions regarding the Nexus line the perhaps someone can answer:
- Is it assumed that organizations deploying Cisco Nexus switches are going to have a dedicated management network?
- Will the Nexus 7000’s ever support dual connections to Nexus 2000 FEX’s? Were that feature available, no 5548UP’s would have been purchased for this particular implementation.
- Is there ever going to be a comprehensive Nexus 7000/5000/2000 design guide? Or, is it buried somewhere deep within the Cisco website and I just missed it?
- On a lighter note(no pun intended), why in the world are the Nexus 2000 FEX’s so heavy if they don’t even do local switching?