Does Aerohive Scale?

Note: If you are a TL;DR type of person, let me give you the short answer to the title of the post: Yes! 🙂

For everyone else, I will try my hardest to keep this as short as possible. I will include as many pictures and CLI screens as I think are needed to help answer the scalability question, and no more. While I entertained the idea of making two separate posts regarding scalability, I felt it best to keep it to a single post since AP(Access Point) to AP communication and layer 3 roaming are best explained together. My wife and friends will tell you that I can be long-winded. I apologize in advance.

Let me just start by saying that I work for Aerohive Networks. I have been an employee of Aerohive for about 3 months. In that time, I have learned a tremendous amount about the overall Aerohive solution and architecture. Prior to working for Aerohive, I worked for a reseller that sold Cisco (including Meraki), Aruba, and Aerohive. I wasn’t unaware of Aerohive, but let’s be honest for a minute. Aerohive doesn’t have a lot of information out there about how their various protocols work. This isn’t unique to Aerohive, as plenty of vendors withhold deep technical information. It isn’t necessarily done on purpose. It just takes a lot of time and effort to get that information out there in a digestible format for customers and partners. The smaller the company, the harder that is to do. Lastly, there are plenty of people within IT who don’t really care how it works, just as long as it works.

My goal with this post is simple. I will attempt to expound on Aerohive’s scalability. This topic comes up in pre-sales discussions from time to time. I have not been an Aerohive employee long enough to have it come up more than a handful of times. I can tell you that in my former position at a reseller for multiple wireless vendors(Aerohive, Aruba, Cisco, Meraki), this topic DID come up with customers and internally among the folks within the reseller that I worked for. I suspect that in the months and years to come with me working in a pre-sales capacity at Aerohive, it will come up even more.

I’ll try to address the two biggest scalability issues. They are:

  1. AP to AP communications, to include associated client information
  2. Layer 3 roaming

Before I dive into that, a bit of initial work is needed for those unfamiliar with Aerohive and the concept of Cooperative Control. Through a set of proprietary protocols, APs talk to each other and exchange a variety of information about their current environment or settings, to include client information. For a basic overview of Cooperative Control and the protocols used within it, you can read this whitepaper on Aerohive’s website.

Got it? Good. Let’s tackle the first scalability “issue”.

AP to AP Communications

Imagine you have a building with multiple floors and about a thousand APs. You know that there are other wireless vendors with controller-based systems that can support this many APs. You wonder how Aerohive would do the same thing with no central controller managing the RF environment and the associated clients. How can 1,000 APs manage to keep up with each other? Certainly, the controller that can handle 1,000 APs has sufficient processing power and memory to perform this task. No AP on the market would have similar processing power in a single AP. So how does Aerohive do it?

A few quick points:

  1. It is very likely that these 1,000 APs are not on the same local subnet from a management IP perspective. Maybe you divided up the APs in 2, 3 or 4 subnets for management purposes with the intent of keeping the BUM(broadcast, unknown unicast, and multicast) traffic down to a reasonable level.
  2. All 1,000 APs are not going to be within hearing distance(from an RF perspective) of each other. This is an important point to remember, so hang on to that one in the back of your mind.

The basis of AP to AP communications revolves around the Aerohive Mobility Routing Protocol, or AMRP. To take the definition from the whitepaper I linked to above:

AMRP (Aerohive Mobility Routing Protocol) – Provides HiveAPs with the ability to perform automatic neighbor discovery, MAC-layer best-path forwarding through a wireless mesh, dynamic and stateful rerouting of traffic in the event of a failure, and predictive identity information and key distribution to neighboring HiveAPs. This provides clients with fast/secure roaming capabilities between HiveAPs while maintaining their authentication state, encryption keys, firewall sessions, and QoS enforcement settings.

Since I don’t have 1,000 APs and associated PoE switches to build out my hypothetical multi-floor building, I am going to shrink it down to 2 switches and 7 APs. 5 of the APs will be on one switch, and all APs on that switch will share the same management subnet and client VLANs. The other 2 APs will be on another switch. They will use the same VLAN number for wireless client access, but on that switch the VLAN maps to a different IP subnet, and the AP management VLAN and IP subnet are different as well. The two switches are separated by a layer 3 boundary. My lab environment looks like this:

AerohiveLabSetup

The switches and APs will boot up and start talking to each other. I’ll deal with automatic channel and transmit power selection in another post. AMRP will ensure that all APs are talking to each other. Since I am dealing with 7 APs in relatively close proximity to each other, they will all hear each other at pretty good signal strength. I’ll drill down more into that RF range aspect shortly.

Let me connect to one of the 5 APs on switch 1. I can do this from either HiveManager, or by simply using SSH to connect to the AP locally. Alternatively, I could also use the local console port on the AP to grab this information. I am going to take a look at the neighboring APs from an AMRP perspective.

Show AMRP Neighbor

As you can see, this AP can see 4 neighboring APs. It doesn’t show the other 2 that reside on switch 2 because those APs are on a different subnet from a management perspective. Even if it can hear the other APs over the air, they still don’t show up in the AMRP neighbor list. We’ll get to that when we cover layer 3 roaming.

If I connect a client to the SSID (GoFast2) these APs are all advertising, I can now see that client in the “show amrp client” CLI output. It appears on all APs within the same local subnet. In the first image, I am connected to an AP on the same local subnet, but not the one the client is directly associated to.

Show AMRP Client - Foreign AP

In this second image, I am connected to the AP the client is directly associated to. You can see the output is different as it shows the interface the client is connected to and not the IP address of the AP like in the previous CLI output.

Show AMRP Client - Local AP

Just to clear up any misunderstanding, AMRP is NOT building tunnels between each AP for communications purposes. It is handled in a secure way, but it is not done via a tunnel. All connected client information will be shared among the APs on the same local management subnet via AMRP. That brings me to an important issue regarding scalability.

To increase scalability, Aerohive gives you the ability to ONLY share client information between APs that are within RF range of each other. By default, all APs on the same management subnet will share client information with each other. While this is perfectly fine in many environments, if there are a large number of APs on a given management subnet, you probably want to change that default setting and only have client session information sent to APs that the client can actually roam to. Keep in mind that restricting it to APs within RF range will also include any APs that a layer 3 roam could happen on. In HiveManager NG, this is done in the following location:

Configure/Common Objects/Hives

AMRP Updates - RF Range 1

Select the Hive containing the APs you want to change. At the bottom of the screen in the Client Roaming section, just uncheck the box labeled “Update hive members in the same subnet and VLAN.” Update the configuration on your APs, and client sessions will no longer be shared between APs that are not in radio range of each other.

AMRP Updates - RF Range 2

So now, instead of this:

AMRP-All APs

You have this:

AMRP-RF Range APs

As a client roams from one AP to another, the AP it roams to will now share that client session with all of its APs within RF range. The AP that the client roamed off of will stop sharing that client info with APs that are in RF range of it since it no longer maintains the client session. The cycle repeats itself as the client roams to yet another AP. Using this method, the APs do not have to know about all client sessions on all APs within the same local management subnet. That allows Aerohive to scale out from a layer 2(and layer 3) roaming perspective.
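To put rough numbers on the difference, here is a minimal Python sketch. It is not how AMRP is implemented. It only counts how many APs would receive a client's session state under each setting. The 40 x 25 grid of 1,000 APs, the 15 m spacing, the 35 m "hearing" radius, and the assumption that all 1,000 APs share one management subnet are all made up for illustration. Real RF range depends on walls, floors, and transmit power.

```python
from math import hypot

# Hypothetical floor plan: 1,000 APs on a 40 x 25 grid, 15 m apart.
# The spacing and the 35 m "hearing" radius are illustration values only.
GRID_W, GRID_H, SPACING_M, RF_RANGE_M = 40, 25, 15.0, 35.0

aps = [(x * SPACING_M, y * SPACING_M) for x in range(GRID_W) for y in range(GRID_H)]


def rf_neighbors(ap_index):
    """Return the indexes of APs close enough to hear the given AP."""
    ax, ay = aps[ap_index]
    return [
        i for i, (x, y) in enumerate(aps)
        if i != ap_index and hypot(x - ax, y - ay) <= RF_RANGE_M
    ]


def recipients(ap_index, rf_range_only):
    """Return the APs that would receive client session state from ap_index."""
    if rf_range_only:
        return rf_neighbors(ap_index)
    # Default setting: every other AP on the same management subnet
    # (assumed here to be all 1,000 APs, which is a worst case).
    return [i for i in range(len(aps)) if i != ap_index]


# An AP near the middle of the grid (x=20, y=12).
middle_ap = (GRID_W // 2) * GRID_H + GRID_H // 2
subnet_wide = len(recipients(middle_ap, rf_range_only=False))
rf_only = len(recipients(middle_ap, rf_range_only=True))
print(f"subnet-wide: {subnet_wide} APs, RF range only: {rf_only} APs")
```

With these made-up numbers, the RF range setting cuts the audience for each client update from every other AP on the subnet down to a couple dozen nearby APs, and that count stays roughly flat no matter how many APs you add to the building.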

Layer 3 Roaming

Layer 3 roaming is handled by another Aerohive protocol called Dynamic Network Extension Protocol, or DNXP. To take the definition from the whitepaper I linked to above:

DNXP (Dynamic Network Extension Protocol) – Dynamically creates tunnels on an as needed basis between HiveAPs in different subnets, giving clients the ability to seamlessly roam between subnets while preserving their IP address settings, authentication state, encryption keys, firewall sessions, and QoS enforcement settings.

How it works is pretty interesting. To explain it, I will have to go back and talk about AMRP. On a given AP management subnet, AMRP takes after OSPF to a certain extent. If you are familiar with how OSPF works on a shared Ethernet segment, you know that there is this concept of a designated router, or DR. Additionally, there is a backup designated router, or BDR. The DR and BDR exist to reduce the amount of OSPF traffic flowing between OSPF neighbors. The DR is responsible for sending updates to all the other routers to inform them of the network topology. If it fails, the BDR takes over. Using that same concept, Aerohive APs on a shared Ethernet segment (i.e., layer 2) elect a designated AP, or DA for short. They also elect a backup designated AP, or BDA. This can be seen by running the “show amrp” command on a given AP. Take a look at the following CLI output:

Show AMRP

You can see here that the DA for the shared segment these APs are on is 172.16.100.2. The DA serves several different functions, but one of the things it keeps track of is the load level on all the APs within a given segment. When it comes to layer 3 roaming, this DA has a pretty important job. It decides which AP on its given Ethernet segment will spin up the tunnel required for layer 3 roaming. Instead of just spinning up tunnels from the AP the client left during the course of its layer 3 roam, it will seek out the least loaded AP on the given Ethernet segment and have that AP set up the tunnel to the AP the client roamed to. This chosen AP establishes the DNXP tunnel to the AP that the client roamed to on a different subnet and ensures that the network knows that this now roamed client is reachable through this chosen AP.
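The DA's decision can be pictured with a small Python sketch. This is only an illustration of "pick the least loaded AP", not Aerohive's actual algorithm. The load numbers, the tie-break rule, and every address other than 172.16.100.2 and 172.16.100.6 are assumptions I made up, since I don't know exactly how the DA measures load internally.

```python
def pick_tunnel_ap(ap_loads):
    """Return the management IP of the least loaded AP.

    ap_loads maps an AP management IP to a load value. Ties are broken by
    comparing the IP strings, which is an arbitrary choice for this sketch.
    """
    return min(ap_loads, key=lambda ip: (ap_loads[ip], ip))


# Hypothetical load values tracked by the DA for the APs on switch 1.
loads = {
    "172.16.100.2": 14,  # the DA itself
    "172.16.100.3": 9,
    "172.16.100.4": 11,  # assumed to be the AP the client is leaving
    "172.16.100.5": 8,
    "172.16.100.6": 3,
}

departed_ap = "172.16.100.4"
tunnel_ap = pick_tunnel_ap(loads)
print(f"client left {departed_ap}, tunnel will originate from {tunnel_ap}")
```

The point to notice is that the AP the client just left has no special claim on the tunnel. The DA hands the job to whichever AP on the segment has the most headroom.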

To see which APs are running as the DA or BDA, but without all the AMRP info from the previous command, you can use the “show amrp interface eth0” command, assuming your connection to the network from the AP is using eth0. It may be using a different interface.

Show AMRP Interface Eth0

I used the above AP, because it was not the DA or BDA. I wanted to show an AP that was in the “Attached” state. The APs that are not the DA or BDA will have a state of “Attached” in the same way that OSPF would show a “DROTHER” on a router that was not the DR or BDR.

Another thing to note about layer 3 roaming is that you can restrict which clients can perform a layer 3 roam. This is controlled within the user profile, and a given SSID can have multiple profiles based on a number of different classification methods. Whether you want to allow or deny layer 3 roaming, Aerohive gives you the choice, and at a fairly granular level.

To turn on layer 3 roaming for a particular user profile in HiveManager NG, you simply modify the user profile.

Configure/Common Objects/User Profiles

NG L3 Roam 1

Turn on layer 3 roaming via the Traffic Tunneling tab and if needed, adjust the idle timeout/traffic threshold values. Save the user profile and update the applicable APs.

NG L3 Roam 2

Layer 3 Roaming Test

Although AMRP only shows neighbors on the same Ethernet segment, DNXP is aware of other APs that are nearby, but use a different subnet for the AP’s management IP. You can see these APs with the “show amrp dnxp neighbor” command as shown below:

Show AMRP DNXP Neighbor - Switch 1

Notice that 4 of the APs are in an L2 state. These are the APs on the same Ethernet segment (switch 1) for the management IP. The other 2 APs are in the L3 state. The AP knows that clients could roam to these layer 3 neighbors. In order for this to work, the APs need to be members of the same “hive”, which is a fancy name for a shared administrative domain. Think of it like an autonomous system number that you would see in routing protocols like BGP. The APs within the same hive share a common secret key, which allows them to communicate securely with each other and trade client state and other AP configuration information. If I look at an AP that resides on switch 2 (separated from switch 1 via layer 3), the opposite information appears, with 1 AP in an L2 state, and 5 APs in an L3 state.
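The L2/L3 split in those two outputs follows directly from the management subnets. Here is a short Python sketch of that classification for my 7 AP lab. It assumes every AP is in the same hive and within RF range, as in the lab. The AP names and most of the addresses are hypothetical: only the 172.16.100.x subnet appears in the CLI output above, and the 172.16.200.0/24 subnet for switch 2 is made up.

```python
import ipaddress
from collections import Counter

# Hypothetical management addresses for the lab APs.
LAB_APS = {
    "sw1-ap1": "172.16.100.2/24",
    "sw1-ap2": "172.16.100.3/24",
    "sw1-ap3": "172.16.100.4/24",
    "sw1-ap4": "172.16.100.5/24",
    "sw1-ap5": "172.16.100.6/24",
    "sw2-ap1": "172.16.200.2/24",
    "sw2-ap2": "172.16.200.3/24",
}


def neighbor_states(local_name):
    """Label every other AP as an L2 or L3 neighbor of local_name."""
    local_net = ipaddress.ip_interface(LAB_APS[local_name]).network
    states = {}
    for name, addr in LAB_APS.items():
        if name == local_name:
            continue
        same_subnet = ipaddress.ip_interface(addr).network == local_net
        states[name] = "L2" if same_subnet else "L3"
    return states


for ap in ("sw1-ap1", "sw2-ap1"):
    counts = Counter(neighbor_states(ap).values())
    print(f"{ap}: {counts['L2']} L2 neighbors, {counts['L3']} L3 neighbors")
```

From an AP on switch 1 this gives 4 L2 and 2 L3 neighbors, and from an AP on switch 2 it gives 1 L2 and 5 L3 neighbors, which matches the two screenshots.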

Show AMRP DNXP Neighbor - Switch 2

I still have my client connected from the previous example discussing AMRP. It is connected to the AP 390 on switch 1. The 2 APs on switch 2 are separated by a layer 3 boundary from the other 5 APs, so are they aware of the client connected to the AP 390 device? Yes. Since all APs are within RF range of each other, client information is passed between them all. However, when the APs are separated by a layer 3 boundary, DNXP comes into play. You can see these client sessions in another subnet by looking at the DNXP cache.

Show AMRP DNXP Cache

Note that although this client has not performed a layer 3 roam, the CLI output tells you where the tunnel is going to originate from across the layer 3 boundary when it does roam. This is because the designated AP(DA) has already determined which AP is the least loaded on the subnet that the client will roam from and has assigned that least loaded AP with tunneling duties if the client needs to make a layer 3 roam.

I have moved the client closer to one of the APs on switch 2 so that it roams. I actually had to move the AP out of my office and into the hallway, and move the client into the next room, so that the RSSI value on the AP it was connected to would be low enough for it to roam. As you can see in the CLI output below, the client has roamed over to the AP that resides on switch 2.

Show Station - AP250 - Switch 2

Now, I should be able to see a tunnel spun up between this AP 250 on switch 2 and the AP (172.16.100.6) that the DA on switch 1 chose.

Here it is from the perspective of the AP the client roamed to on switch 2:

Show AMRP Tunnel - AP250 Switch 2

Here it is from the perspective of the AP(172.16.100.6) on switch 1 that was responsible for building the tunnel to connect the two subnets:

Show AMRP Tunnel - AP330 - Switch 1

Other APs on switch 1 are also aware of this tunnel:

Show AMRP Tunnel - AP390 switch 1

Let’s go one step further and add another client. I associate a client to an AP on the switch 1 side. For the purposes of maintaining some brevity, I won’t show all the CLI around that client before it roams. I moved the client into the same room as the first client and it associates to the AP 250 on switch 2. We can now take a look at how the layer 3 roam was constructed for this second client.

Show AMRP Client - L3 roam on AP250-Switch2

We can see above that both clients are attached to this AP via a layer 3 roam. This can be verified additionally with a “show amrp tunnel” command.

Show AMRP Tunnel - Both clients on AP250

If we take a look at an AP on the network (switch 1) that these clients roamed from, we see that these APs are aware of the clients and how to reach them via the AP that set up the tunnel to the other side.

Here is that view from one of the APs on switch 1:

Show AMRP Client - AP1130 switch 1 - 2clients

And the view from another AP on switch 1:

Show AMRP Client - AP130 switch 1 - 2 clients

There is one thing in particular I want you to notice. Both clients have executed a layer 3 roam and have tunnels spun up bridging these two separate subnets. However, notice that the tunnel endpoint on the switch 1 side is different for each client. That is because the DA told one AP to set up a tunnel for layer 3 roaming purposes for the first client, but told a different AP to set up a layer 3 roaming tunnel for the other client. Remember that the DA is aware of each AP’s load on a given subnet it is responsible for. It is spreading the load (no pun intended) among the various APs to keep from overloading any single AP. If there were a lot of clients on my home lab network and several of them performed a layer 3 roam at once, you might see tunnels originating from even more APs. It is this distribution of tunnel origination that allows layer 3 roaming to scale and not overload one particular AP.
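Here is a rough Python sketch of why consecutive roams land on different tunnel endpoints. Again, this is an illustration under assumptions, not the real DNXP logic: I am assuming each new tunnel adds one unit of load to the AP that hosts it, that all five switch 1 APs start out equally loaded, and that ties are broken by IP string. The client names are hypothetical and, apart from .2 and .6, so are the addresses.

```python
from collections import Counter


def assign_tunnels(ap_loads, roaming_clients, cost_per_tunnel=1):
    """Assign each roaming client to the least loaded AP at that moment.

    A copy of the load table is updated after every assignment, so the
    next client sees the extra load from the tunnel that was just built.
    """
    loads = dict(ap_loads)
    assignments = {}
    for client in roaming_clients:
        ap = min(loads, key=lambda ip: (loads[ip], ip))
        assignments[client] = ap
        loads[ap] += cost_per_tunnel
    return assignments


# Five equally loaded APs on switch 1 (load values are made up).
switch1_loads = {f"172.16.100.{host}": 5 for host in range(2, 7)}

two = assign_tunnels(switch1_loads, ["client-1", "client-2"])
print(two)

many = assign_tunnels(switch1_loads, [f"client-{n}" for n in range(1, 11)])
tunnels_per_ap = Counter(many.values())
print(tunnels_per_ap)
```

With two clients, the second tunnel goes to a different AP than the first, just like in my lab. With ten clients roaming at once, the tunnels end up spread evenly across all five APs instead of piling up on one.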

Closing Thoughts

If you made it this far, congratulations. I told you this was going to be a long post. If nothing else, I hope I have shed a little more light on how Aerohive can scale when it comes to APs cooperating with each other and with regard to layer 3 roaming. As I mentioned earlier in the post, I did not cover automatic channel and transmit power selection. That is covered by a different protocol(ACSP) that works in conjunction with AMRP, and I hope to write another post soon showing how that works.

As with any solution, there are limits. Whether it is a controller-based wireless network or a cooperative control environment like Aerohive’s, at some point you can break it. It would be impossible for me to acquire the number of clients and access points to do this on my own. Not to mention the fact that I would need a decent sized building to spread out the APs enough to where they aren’t able to hear each other. I hope I was able to at least demonstrate the scalability, albeit on a smaller scale.

Let me know your thoughts or if you have any additional questions in the comments section below.

Posted in aerohive, wireless | 2 Comments

From Multi-Vendor To Single-Vendor

Careers take a funny turn a lot of times. Opportunities come up that you weren’t expecting and the timing is never as perfect as you want it to be. At least, that is how it has always been with me. I’ve learned though, that sometimes the best thing for you is to charge full speed ahead through the door, roll the dice, and take your chances. That is where I find myself right now. Having accepted an offer from Aerohive Networks to serve in a pre-sales engineering role in my local area, I am leaving behind a job and a company that I have enjoyed tremendously. Yes, there were times when I had to be talked off the ledge and keep on going. I think that comes with most jobs though. Overall, it has been a very rewarding almost 5 years working for a value-added reseller (VAR) and I will miss it greatly.

In the span of a few months, I had to decide to give up the following:

1. Multi-vendor implementations and support.
2. Studying for the CCIE Wireless lab exam with 1 failed lab attempt already under my belt.
3. Involvement with other vendors courtesy of social media(blogging, Twitter, etc). – My involvement with Tech Field Day, HP, and other vendors has brought me into a whole different level of vendor interaction that I didn’t know existed.
4. Extensive travel across the greater US, which isn’t always fun, but I enjoy different locales and different networks to work on. I also haven’t paid for a hotel room or flight for my family in years.
5. Working with people and clients I have known for years.

For all that I gave up, I gained some things.

1. Being able to get really deep in a limited set of products from a single vendor.
2. Travel much closer to home and for shorter durations.
3. Potentially being able to get a better look at how products are brought to market.
4. Potentially being able to understand a vendor’s technology at a much deeper level than I ever could on the partner or end customer side(e.g. The secret sauce around RRM).
5. Potentially having more time for blogging, which I have neglected greatly over the past few years.
6. No more nights and weekends working on customer projects. – This may not totally go away, but it will decrease tremendously.
7. I have always wanted to work for a vendor to complete my overall picture of the IT industry.
8. The chance to compete against larger competitors. – It takes a lot of work to unseat incumbent vendors, or win deals against much larger competitors. Not every deal will be won, but when you can win in an ethical manner, it is a good feeling.
9. Better compensation. – None of us work for free, and I don’t want to be working until I am in my 70’s. Of course, if I can’t sell anything, I might be working until I am in my 70’s.

Which list is better? I came to the conclusion that what I was gaining would outweigh what I was giving up.

The Heart of the Matter

One thing that comes up when talking to peers is being able to go into the single vendor mode mindset after being multi-vendor for so many years. Can it be done? The short answer is yes.

I have worked with a number of networking vendors over the years. However, if I were to break down percentages and allocate them to each vendor, Cisco would have the largest share of the pie. Probably upwards of 75%. I have implemented solutions from Cisco, Meraki, HP, Brocade, Aerohive, Aruba, Meru, SonicWall, Riverbed, Barracuda Networks, Dell, and a number of other smaller vendors. I have worked with, but not implemented, solutions from F5, ExtraHop, SolarWinds, Juniper, and a few others. I wouldn’t claim to have high proficiency in any of them, except Cisco, and “high proficiency” is a rather subjective term. Put me in front of a Cisco Catalyst switch, give me a set of configuration requirements, and I can go to work right away. Put me in front of another vendor’s switch, and I have to stop and think about what needs to be done. I’ll fumble through the CLI, but eventually get it done. Does that make me multi-vendor proficient?

In all reality, to be proficient in more than one vendor requires consistent exposure and experience with each vendor’s products. I can tell you that even within Cisco, there are products I am very familiar with, and other products that I am not as familiar with. There are just too many products and too many caveats to work at a deep technical level on more than a handful of products from Cisco. That is the problem with multi-vendor work. Even if it is consistent, there are so many things to learn about each one. This was a lesson I learned when studying for the Cisco CCIE Wireless exam. I spent months on switches, wireless controllers, APs, Prime, ISE, and the MSE, and I still don’t feel like I am anywhere near an expert with those platforms. I am definitely a lot stronger with those products today than I was a year or two ago, but I still have much to learn.

Perhaps the biggest benefit to being multi-vendor focused is the awareness of each vendor’s product set. I don’t necessarily have to know how to configure each nerd knob. I just have to know what the capabilities are. In short, vendor analysis is as big a part of being multi-vendor as is doing the actual configuration and troubleshooting work. Does working for a vendor like Aerohive mean I cannot spend time learning about how wireless is done at any of their competitors? On the contrary, I think it requires that. If you are going to sell against the competition, you better know what you are selling against. If you rely on vendor competitive documents, you will get bitten eventually. Those documents are rarely up to date, and I have seen them from numerous vendors while working in the VAR space.

In short, I think you can be multi-vendor while working for a single vendor, but from the standpoint of understanding the competition. I know where my paycheck is coming from, so as long as I can do things in an ethical manner, I have no problems only presenting products from the company I represent. I already do that to a certain extent on the VAR side. It isn’t a foreign concept to me.

On another note, if my new job and CWNP studies allow, I plan on doing a lot more blogging. However, don’t be surprised if a fair amount of those posts are “how to’s” on Aerohive. I am VERY excited about being able to get as deep as I can in their products, since I do very little Aerohive work these days. I plan on sharing what I can when I can in the hopes that it will help someone out there. For an IT community that has given me so much, it is the least I can do to return the favor.

Posted in aerohive, career, wireless | 1 Comment

In Pursuit of the CCIE

Just a short post to let you know this blog is not dead. I have not written anything in several months. While I have several posts that are partially complete, I have not been able to finish them... yet.

For the past several months, I have been busy studying for the CCIE Wireless lab exam. Prior to that, I was sort of working towards the CCIE Route/Switch written and lab exam. I wasn’t fully committed, so my studying was sporadic at best. My heart just wasn’t in forcing myself to learn more about IPv6, multicast, MPLS, and some of the other blueprint items.

Somewhere along the line it changed. Maybe it was having another co-worker who was serious in his pursuit of the CCIE Wireless. Maybe it was that my job working for a reseller had me doing more and more Cisco wireless work. Maybe I just liked the fact that wireless was hard. I’m not really sure. I just know that at some point, a switch flipped inside my head and I just decided to go all in on my studies. Honestly, I should have done this years ago, but the timing just didn’t seem right.

I’ve been studying most nights every week for a few months. I don’t sleep a whole lot these days. A lot of times, I fall asleep in my chair up in my office and don’t wake up until my wife comes up to check on me. On those nights when I do make it to my bed, I think about the lab blueprint until my brain finally shuts down and I drift off to dream. I have dreams about odd things like wireless authentication. My thoughts are always on the lab. Whether I am in a meeting with a client, sitting in church, or just driving down the road, it consumes me.

I’m constantly fighting off the voices in the back of my mind telling me to stop and go back to life as it was before the study urges took over. I have a wife and two kids. I have a job that demands a decent level of performance mentally. I travel a fair amount for work. I work odd hours. I am fairly active in my local church. I also make a decent living, so passing the lab doesn’t mean a massive pay raise for me. There are so many reasons I shouldn’t do this, and they almost overshadow the reasons that I should.

On the positive side, I am convinced there are doors that will not open career-wise without the CCIE. Will I make more money after passing the lab? Probably. Will I have more recruiters and HR folks pinging me on LinkedIn? Yes. Will I have interesting career choices cross my path? Probably. I’m not planning on doing anything different work-wise after I pass, but as any of you who have CCIE digits know, you have more options.

Those are all well and good, but if there is one reason I want to pass the lab, it is related to a quote attributed to John F. Kennedy from a speech he gave in 1962 regarding the USA’s attempts to land on the moon:

“We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard.”

That’s it in a nutshell. I need to know if I can push myself to finish something that, on the surface, seems impossible. When I was 15 years old, I ran a mile (1600 meters) in 4 minutes and 56 seconds on a dirt track in Hawaii. I had been trying to break 5 minutes for a while at that point. I remember that race vividly. I had a great running coach who trained me well. I put in a lot of miles on hills and roads leading up to that point, and I only mention the locale (Hawaii) to give you an idea of what kind of “hills” I am referring to. It was the end of our track season and I was in peak shape. Had it been a rubber track, I could have probably run it 5 or 6 seconds faster. It doesn’t matter though. I broke 5 minutes. For some, that is not a big deal. For a kid who had asthma at a younger age, that was huge. It will always be one of my favorite moments in my life, taking a back seat only to the birth of my children and my marriage to my wife.

I am always telling my kids that they can be anything they want to be as long as they are willing to work hard for it. I can tell them all day long. It’s better if I show them through example. I’ll find out in 18 days when I sit the lab for the first time. I may go back several more times before I pass it, but I am prepared to do that.

Nobody ever talks to me about the sub-5-minute mile I ran. In fact, my father was the only one in my family who witnessed it. When, and it is a “when”, I pass the CCIE Wireless lab, most of the people in my day-to-day life, outside of work, will not even know what that is. I am perfectly fine with that. I’m not doing this for accolades or pats on the back. I’m doing this for me, and also to secure a potentially greater ability to provide for my family.

When it is over, I will take a break from studying. I’ll stop reading technical books for a few months, and not think about this stuff too much outside of my work hours. I have several hundred books I have put off reading for several years. I also have 60 years of National Geographic magazines that a friend gave me that are sitting in my office closet begging to be read. After a few months and a few dozen books and magazines, I will get back on the study “horse” and push towards the Aruba ACMX.

While I would have loved to create a bunch of blog posts documenting the technical aspects of my studies, I made the decision to devote that time to studying. Anyone who has written even one technical post knows how much time those things take. I am very grateful for people like Rasika who took the time to document all of their studies. If you are studying for the CCIE Wireless as well, you are probably already familiar with his excellent site. Much of that content applies to the version 3 lab blueprint.

Just wanted to put something up here to let you know I have not abandoned this site. I’m still around. I’m just busy studying.

Posted in career, ccie, learning, wireless | 4 Comments