An NDA Can Keep Bad Decisions Away

Over the past year, I have seen some interesting presentations from vendors showing me some things that they have on their future roadmap. Some of these things have already been released to the public. I’m still waiting on the rest. All of this was a result of having non-disclosure agreements or NDA’s in place. The vendors agree to show us some of their stuff that is coming to market soon on the assumption that we will not release this information to anyone else. While I do enjoy knowing about things before they hit the market, I sometimes feel bad for companies that don’t have access to this information. Not only that, I often wish I had access to all vendor product roadmaps. Let’s face it. From the network hardware/software standpoint, we generally do business with only a handful of vendors. I say that as someone who works in a corporate environment. If you are a consultant, that doesn’t necessarily hold true as you may sell a wide variety of hardware and software.

If your dealings with companies are limited to a select few, those companies have a vested interest in making sure you stay with them. One of the ways to do that is to give you a better view into their product cycle so that you know what is coming. Look at the switch market for example. The number of vendors offering products in that space is growing and growing. I recently spent a LOT of time comparing 10Gig aggregation switches between 6 vendors. What if the vendor I use today had a platform that was average or below average in terms of 10Gig capabilities? If I had a hard requirement for a certain number of 10Gig ports and it had to be contained to 1 chassis, my choices are really going to be driven by 10Gig port density. It could be some other factor like power consumption or even chassis size. It doesn’t really matter. As long as my usual switch vendor cannot meet that requirement, I am going to go outside and look for another vendor. If I am dead set on staying with that vendor, I am going to change my requirements. To me this does not seem like a viable option unless it would cause unbelievable pain and suffering to introduce another vendor into the network. If that is the case, you probably need to re-think the whole single vendor strategy. Then again, if that single vendor works for you, then go for it. It’s your network and we each have to make the decisions that serve our company and customers best.

Back to the fictional switch problem. What if I am the incumbent vendor and I know you have a need that I cannot fill today, but will be able to fill that need in a couple of months, or even a year from now? Should I tell you even if nobody else knows about it? This is where the NDA comes into play. If you have done a bunch of research and are looking at alternatives to the incumbent, your mind might be changed if you happen to know something better is coming. Maybe it is far better than every other vendor’s current products you have been looking at. Maybe it is on par with the replacement vendor you have been looking at. Can you wait that long?

After seeing the new shiny thing that is coming out soon, you may decide to stick with your existing vendor. However, who is to say that the other vendors won’t be coming out with even better hardware/software around the same time or a month or two after? This is the point in which I find myself wishing I had access to all the vendor’s product road maps. I know some vendors will do an NDA on the notion that it will get them a sale, but I don’t know that I am going to be able to spend a bunch of time with every vendor to the point in which an NDA can be put in place. Perhaps it is best to bring in the consultants/integrators that sell products from a number of different companies. I would suspect they have some sort of idea when it comes to the future direction of certain product lines. Or maybe not. They might be in the same boat as I am.

The last thing you want to do is buy a product and have an even better solution appear a week later. I do think that most vendors will let you know that a better product is coming rather than lose the sale as long as the dollar amount is high enough. I don’t think company X is going to reveal a whole lot about their future road map if the net gain is a couple thousand dollars. Then again, if the salesperson has had a REALLY bad quarter all bets are off, but at that point you can smell the desperation in their sales pitch and I tend to be put off by that. That leads me to the thought that you really have to consider a wide range of factors when dealing with vendors. To me, the product has to meet the technical requirements above all. After that, cost is important. Right along with cost is the experience the vendor will give you. What are the hardware/software support capabilities of that vendor? What is the direction of the company? How long has the company been in business? If it is a recent startup(ie less than 2 or 3 years), who are the people running the R&D for these products? Are they known in the sector they are doing business in? In other words, if they sell security solutions, are they using experienced security professionals to develop the products or is this strictly an “academic” operation in which someone had a decent idea and got some venture capital funding? Granted, you can’t always figure all of that stuff out, but if you can, it sure helps when the decision making time comes.

To sum it all up, I think the smart vendors are going to tell their customers what is coming and when they can expect to see it for sale. It helps people like me plan for things down the road. I’m more interested in a vendor that is constantly updating their technology as opposed to one who releases new products with lesser frequency than leap years occur. When it comes to NDA’s, I don’t think the size of the customer should matter either. Small companies can get big relatively quickly in this age of acquisitions and mergers. IT professionals DO talk to each other and tend to trust each other’s opinions MORE than a vendor paid “performance test” by an “independent lab”. If you are a vendor and want to show me your road map, I promise not to tell anyone outside of my company. 🙂 I don’t even want you to buy me lunch. I might just put pictures of your hardware up in my cubicle and drool over it all day until my manager lets me buy it.

Posted in vendors

Nexus 7010 Competitors – Part 2

**** Please note that these are my own thoughts and observations and should not in any way be taken to be the opinion(s) of my employer. Additionally, this is a rather long post, so please bear with me. I promise not to waste your time by babbling incessantly about non relevant things.

Finally! After many hours spent sifting through vendor websites and reading various documents, I have finished my comparison. If there’s one thing I came away with in this process, it’s that some vendors are better than others at providing specifics regarding their platforms. By far, Juniper was the best at providing in depth documentation on their hardware and software. Although Cisco has a ton of information out there about the Nexus 7000, I found that a lot of it was more on the architecture/design side and less on the actual specifics of the platform itself. Some vendors still hide documentation behind a login that only works with a valid support contract. In my opinion, that’s not a good thing. I think most people research products before they decide to buy, so why hide things that are going to cause roadblocks for people like myself trying to do some initial research? I’ve read MANY brochures, white papers, data sheets, third party “independent” tests(meaning a vendor paid for a canned report that gives a big thumbs up to their product), and other marketing documents in the past couple of weeks. I did not actively seek out conversations with sales people in regards to these products. I did have a couple of conversations around these products and not all the people I talked to were straight sales people. Some were very technical. However, I wanted to go off the things that the websites were advertising. Once the list is narrowed down to 2 or 3 platforms, the REAL work begins with an even deeper dive into the platforms.

I wish I could display the whole thing on this website and have it look pretty. Unfortunately, I don’t know how to do that and make it look nice. Remember, I get paid for networking stuff and not my web skills! In consideration of that, I have attached a PDF file of my comparison chart. I have the original in Excel format, but I didn’t upload it. If you want a copy, I can certainly e-mail it to you. You can send me your e-mail address via a direct message on Twitter. I can be found here.

What IS included in the spreadsheet.

I would love to say that I did all of this work for the benefit of my fellow network engineers, but I would be lying if I said that. I built this out of a specific need that my employer has or will have in the coming months/years. Due to that, some of the features that were important to me may not be important to you. If you find yourself wondering why I included it, just chalk it up to it being something that I considered a requirement. Having said that, it would be selfish not to share this information with you, so take it for what it’s worth.

When it comes to the actual numbers of things like fan trays and power supplies, I tend to build out the chassis to the full amount it will hold. If it can take 8 power supplies, I will probably use 8. Same with fabric modules. I like to plan with the belief that I will fully populate the chassis at some point, so I want to have enough power, throughput, and cooling on board to handle any new blades. All chassis examined have the ability to run on less than the maximum number of power supplies.

When it comes to throughput rates, you have to distinguish between full duplex numbers and half duplex numbers. They don’t always specify which is which, so you have to dig through a lot of documentation to figure out what they are really saying. Thankfully marketing people tend to favor the larger numbers so more often than not, the number given is full duplex. In the case of slot bandwidth, I used the half duplex speed. The backplane numbers are all full duplex.
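To make the duplex distinction concrete, here is a minimal sketch of the conversion, assuming (as I do in the spreadsheet) that "half duplex" means the one-direction rate and "full duplex" means both directions added together. The function names and the 80 Gbps figure are made up for illustration and do not come from any vendor data sheet.

```python
def to_half_duplex(full_duplex_gbps: float) -> float:
    """One-direction rate, assuming the full duplex figure counts both directions."""
    return full_duplex_gbps / 2


def to_full_duplex(half_duplex_gbps: float) -> float:
    """Both-directions rate, assuming the half duplex figure is one direction only."""
    return half_duplex_gbps * 2


# Hypothetical slot advertised at 80 Gbps full duplex
print(to_half_duplex(80))   # one-direction slot bandwidth
print(to_full_duplex(40))   # the number marketing would rather print
```

If a data sheet does not say which figure it is quoting, running the number both ways and comparing it to the port count on the blade is usually enough to tell.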

What IS NOT included in the spreadsheet and why.

If I were to include every single thing these switches support, the spreadsheet would be 10 times bigger than it already is. There are quite a few things that I consider to be basic requirements. These basic things were left out of the sheet to avoid cluttering it up with things you probably already know. For example, does the switch support IPv6? This should be a resounding yes. If it doesn’t, why in the world would I even consider it? The same can be said with routing protocols. They all should support OSPFv2 and RIPv2 at a minimum. Most, if not all support IS-IS and BGP as well. It is also worth pointing out that I may not even need this switch to run layer 3. I am looking for 10Gig aggregation and am not necessarily concerned about anything other than layer 2. All of these switches also support QoS. Perhaps they do things a little differently between each switch, but the basics are still the basics and I don’t really need a billion different options when it comes to QoS. That may change in a few years, but for now, I am not looking at running anything other than non-storage traffic over these switches.

I think you see my point by now. I could go on and on about what isn’t included. If it is something well known like SSH for management purposes, I don’t need to include it in the list. It’s a given.

Special note on the TOR(Top of Rack) fabric extension.

While I primarily need 10Gig aggregation, another bonus is the ability to have 1Gig copper aggregation as well. However, I don’t want it all coming back to the chassis itself. The Nexus 7010 has the ability through the Nexus 5000’s(of which I already own several) to attach Nexus 2000 series fabric extenders that function as top of rack switches(although it’s not REALLY a switch). This is a nice bonus feature as I can aggregate a lot of copper connections back to 1 chassis without all the spaghetti wiring that is commonly seen in 6500’s and 4500’s. In the case of Brocade and Force10, they actually have the TOR extensions as nothing more than MRJ-21 patch panels. With 1 cable(which is the width of a pencil) per 6 copper ports, the amount of wiring coming back to the chassis is reduced tremendously.

Additionally, there is no power consumption at the top of the rack like there is with the Nexus 2000’s and it is a direct link to the top of rack connections unlike the Nexus model where I have an intermediate 5000 series switch in between.
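The wiring reduction is easy to put a number on. This is a rough sketch that uses the 6 copper ports per MRJ-21 cable figure quoted above. The 48-port rack in the example is a made-up number, not something taken from a Brocade or Force10 data sheet.

```python
import math

PORTS_PER_MRJ21_CABLE = 6  # figure quoted in the post


def cables_needed(copper_ports: int) -> int:
    """Cables running back to the chassis, rounding up for a partly used cable."""
    return math.ceil(copper_ports / PORTS_PER_MRJ21_CABLE)


# Hypothetical rack with 48 copper connections
print(cables_needed(48))  # 8 cables instead of 48 individual patch runs
```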

One final note. The HP/H3C A12508 is listed on the HP site as the A12508, but when you click into the actual product page, it is listed as the S12508. These terms can be mixed and matched and mean the same chassis. I have chosen to use A12508 as the model number as much as possible in this post, but my previous post that mentioned the various switches used the letter “S” instead of “A”.

I plan on posting a few more thoughts on this process as it pertains to specific platforms. I was awed by several of the platforms, not just by the hardware itself, but by the approach the company is taking to the data center in general. Any of these platforms will do the job I need them to do. Some will do that job a lot better than others. As for cost, I have only seen numbers on a few of the platforms. That’s something that is important, but not the most important. You can read my previous post on this for more clarification on what my thought process is.

Remember that I am not claiming to be an expert in regards to any of these platforms. I have done many hours of research on them, but there is a chance that some information in this PDF file will be wrong. If you see any glaring errors, please let me know. I promise you won’t hurt my feelings. If anything is marked “Unknown”, rest assured that I looked at every possible piece of literature on the website that I could reasonably find. If you managed to read this far in the post, the file is below. Enjoy!

Nexus 7010 Comparison – PDF File

*****Update – The Juniper 8200 series does support multi-chassis link aggregation. It just requires another piece to make it work. The XRE200 External Routing Engine gives the 8200 this capability. Thanks to Abner Germanow from Juniper for clarifying that!

Posted in cisco, switching, vendors

Nexus 7010 Competitors

I have an increasing need for 10Gig connectivity. Although I may have enough ports today, I have to plan for the future. While I can easily buy some more Nexus 5000 series switches, I would rather have a more capable platform. As a heavy user of Cisco hardware, the logical choice was to use the Nexus 7000 series line. It is a platform that I can grow into over time. I don’t need the big 7018, so the 7010 will suffice. My company has a great relationship with Cisco and our sales rep and local engineer are top notch. No hard selling on their part so the relationship is, in my opinion, a very good one.

Having said that, I also have to point out that I have an obligation to my company to ensure the best product is selected. It would be irresponsible of me to make a technical decision of six digit magnitude and have it come up short in features. I need to make sure the product we select is the best fit for our particular needs. That doesn’t mean the Nexus 7010 is the wrong device. For all I know it will be the best thing for us. Of course, I still have to do my due diligence.

Over the past several weeks, I have been looking over some of the competition. Granted, I still have to spend a lot more time looking at Nexus 7010 competitors, so I am nowhere near done. I’ve been really busy with other things, so I haven’t been able to dedicate as much time as I thought I would to figuring this out. What I have done so far is narrow down a list of vendors and the appropriate product that can compete with the Nexus 7010. Here’s a short list of the features I am looking to compare:

1. 10Gig port count across the entire chassis.
2. 10Gig port/blade/module oversubscription rates. (Some products may not have this issue.)
3. Size of chassis.
4. Power consumption.
5. Layer 2 features(STP, TRILL, proprietary)
6. Layer 3 features(Standard based protocols, proprietary protocols)
7. Cost(Not the main driver, but it is a factor to consider after the technical merits.)
8. Product age(Is it a new platform, or has it been around for more than a year or two?)
9. Focus of the company
10. Size of the company
11. Support structure of the company
12. Code updates(Is there a defined release cycle?)
13. Availability of documentation from the vendor.
14. Connectivity options other than 10Gig(1Gig copper ports or some type of TOR integration aka Nexus 2000’s?)
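For item 2 in the list, the oversubscription rate is simple arithmetic once you know the one-direction bandwidth a slot gets to the fabric. Here is a small sketch of how I would compute it. The function name and the example blade (32 ports on a 80 Gbps slot) are hypothetical and only there to show the math.

```python
def oversubscription_ratio(port_count: int, port_speed_gbps: float,
                           slot_bandwidth_gbps: float) -> float:
    """Front-panel bandwidth divided by slot bandwidth, both in one direction.

    A result of 1.0 or less means the blade can run every port at line rate.
    """
    front_panel_gbps = port_count * port_speed_gbps
    return front_panel_gbps / slot_bandwidth_gbps


# Hypothetical 32-port 10Gig blade sitting in a slot with 80 Gbps to the fabric
ratio = oversubscription_ratio(32, 10, 80)
print(f"{ratio:.0f}:1")  # 4:1
```

The important part is to feed it the same kind of number on both sides. Mixing a full duplex slot figure with one-direction port speeds makes every blade look half as oversubscribed as it really is.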

Obviously there are going to be other things to consider. I also was very vague on the L2 and L3 feature requirements. That was on purpose. As I go through this process, I will be able to elaborate more on the particular L2/3 features that are needed vs those that are available.

Here are the models I am comparing:

Cisco Nexus 7010
Brocade NetIron MLX 16
Juniper EX8216
Force10 E1200
Arista 7500
HP S12508 – This was recently changed from the S9512E as it was recommended by someone from HP that it was a better comparison to the Nexus 7010.

It is pretty hard if not downright impossible to find competing platforms that have exactly the same specs. I tried to find the closest match in terms of 10Gig port capability since that is the main driver behind this project.

More posts to come soon on this. I am still trying to decide if I want to do a post on each platform individually or do a few posts focusing on certain features that they all have in common. Any thoughts on this are appreciated.

Posted in cisco, switching, vendors

ACE Boot Camp – Days 3 and 4

Days 3 and 4 did not disappoint! I don’t know if I stated this in the earlier posts, but the days basically consisted of lecture in the morning and labs after lunch. I REALLY, REALLY enjoyed the lecture portion. Again, I have to state that the instructor was fairly knowledgeable in regards to ACE, so he was able to actually teach instead of regurgitate a slide deck like other classes I have been in. That makes all the difference in the world. As for the labs, I guess they do some good if you have not had much experience with the ACE CLI. We did not do any labs using the built in GUI or ANM. The problem I have with labs is that they are a very canned and controlled environment. You end up just going through the motions without actually soaking up what it is that you are doing. Ideally, the labs would need to be tailored to your environment to have the greatest effect. This of course, is not realistic. Having said that, I am sure there are some people who get something out of it. My opinion was shared by others in the class in regards to the effectiveness of the labs, so I am not the only one who feels this way. However, the effectiveness of the lecture portion completely overshadowed any shortcomings of the lab portion.

In the interest of brevity, I am going to touch on the things I thought were the most interesting, but I don’t want this post to be so long it requires a coffee break to finish.

Route Health Injection – On a simplistic level, RHI allows the ACE to inject a host route into the network. You would use this to advertise the VIP(virtual IP) that clients use to connect to a server farm. If the server farm is not available due to any number of issues, the host route can be automatically removed from the route table and not advertised. The alternative is to simply advertise the VIP’s as part of a regular subnet advertisement like you do with any other VLAN or subnet. Again, I am simplifying this and need to point out that this is NOT something that is specific to Cisco ACE. Other vendors implement similar technologies.
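The RHI idea boils down to "advertise the /32 only while something behind the VIP is healthy." Here is a toy model of that decision. It is not how ACE implements it internally; the function name, the set used as a stand-in route table, and the 192.0.2.10 address are all assumptions for illustration.

```python
from ipaddress import ip_network


def rhi_update(route_table: set, vip: str, healthy_servers: int) -> set:
    """Return a new route table with the VIP host route added or withdrawn."""
    host_route = ip_network(f"{vip}/32")
    updated = set(route_table)
    if healthy_servers > 0:
        updated.add(host_route)      # server farm is up: inject the host route
    else:
        updated.discard(host_route)  # server farm is down: stop advertising it
    return updated


routes = rhi_update(set(), "192.0.2.10", healthy_servers=2)
print(routes)   # contains 192.0.2.10/32
routes = rhi_update(routes, "192.0.2.10", healthy_servers=0)
print(routes)   # empty again
```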

KeepAlive-Appliance Protocol(KAL-AP) – There are a few variations of the Cisco ACE, and one of those is the Global Site Selector(GSS). Its purpose is simply to provide higher level load balancing between data centers. Basically, it is a load balancer of load balancers. By using KAL-AP, the GSS can query VIP’s at multiple data centers and determine which one is the best fit to send traffic to.

There are a couple of things that the ACE 4710 appliance does that the ACE module cannot. I asked the question as to why this is the case and was told that the ACE appliance has different architecture than the module. It has certain functionality that might come to the module at some point, but for now is restricted to the appliance. These extra functions really revolve around the ACE appliance being able to cache certain HTTP objects and speeding up the process of delivering a web page to an end user. A fair amount of detail on this can be found here.

It sure seems as if I cut back on the information from days 3 and 4 when compared to 1 and 2. I did. Although there were plenty of interesting things covered in the past 2 days of class, a lot of those things would take a while to explain and draw out via diagrams. That’s also assuming that I actually understand these things well enough to explain them in depth.

That brings me to a more philosophical point in regards to the type of niche product that Cisco ACE is. While it would be great if you knew the CLI on ACE backwards and forwards, it really isn’t necessary. What is necessary is an understanding of what a platform like ACE is capable of. I sat in a meeting today in which some developers wanted ACE to perform health checks on a server outside of a load balance pool and use the results of that query to determine whether or not servers should be removed from a load balance pool. Basically, they wanted to do something that ACE is not really designed to do. Spending 4 days in a classroom learning all about ACE gave me the information needed to have a productive meeting with these developers today. I was able to answer their questions and give better guidance than I would have a couple of weeks ago. I don’t know all the commands for ACE. I will still have to use the configuration guides to look things up now and again. The important thing is that I understand the capabilities and limitations of the ACE load balancer a lot better today than I did prior to taking the ACE class. My main goal is to know what it can and cannot do in order to design anything requiring load balancing properly. To me that is more important than memorizing commands.

Posted in ace, cisco, learning, load balancing

ACE Boot Camp – Day 2

Day 2 of ACE boot camp did not disappoint! Another full day of lecture and labs. We covered the following topics:

Modular Policy CLI
Managing the ACE Appliance and Service Module
Security Features
Layer 4 Load Balancing
Health Monitoring

I’ll cover some general things about each topic and go into additional details on the points I thought were interesting.

Modular Policy CLI – ACE classifies which traffic it will load-balance based on policy maps, which are comprised of class maps. If this sounds a lot like how you build QoS policies on IOS based routers, it is. The big difference is that ACE is far more restrictive in what those policies contain.

Managing the ACE Appliance and Service Module – Like most Cisco devices, ACE can be managed in a number of different ways. Telnet, SSH, HTTPS, and SNMP. You can even use the XML API if you want. With SNMP, versions 1 and 2 cannot understand contexts. SNMP version 3 can. In order for SNMP version 1 and 2 to work with contexts, you have to use the community string format of “community@context” where “community” is the community name and “context” is the name of the virtual context. When the GET, SET, or whatever SNMP action you choose hits the ACE, the “@context” portion is understood and passed along to the appropriate context.
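The “community@context” trick is just string splitting. The sketch below shows the idea of pulling the two halves apart. It is only an illustration of the format described above, not ACE code. What ACE does with a community string that has no “@context” part was not something I noted down, so the sketch simply returns None for the context in that case. The names "public" and "web-prod" are made up.

```python
from typing import Optional, Tuple


def split_community(community_string: str) -> Tuple[str, Optional[str]]:
    """Split 'community@context' into its two parts.

    Returns (community, None) when no '@context' suffix is present.
    """
    community, separator, context = community_string.partition("@")
    return community, (context if separator else None)


print(split_community("public@web-prod"))  # ('public', 'web-prod')
print(split_community("public"))           # ('public', None)
```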

Security Features – There are a ton of different ways to restrict traffic entering and leaving the ACE. Most of the time you will be focused on traffic entering the ACE. As with applying ACL’s to interfaces on switches and routers, very rarely will you see access lists applied in the outbound direction. That feature is there in case you have some special need to use it.

An interesting capability that the access lists have in ACE is the ability to use object groups to identify which traffic to permit or deny. If you have ever worked on the PIX, ASA, or FWSM, you will be familiar with object groups. They make traffic identification much easier not to mention the simplification of the ACE configuration itself.

The much more granular security options were of great interest to me. Take something like IP fragmentation and reassembly. You can specify the max number of fragments allowed from one packet. If it exceeds the number you specify, you can just drop the traffic. Many other options exist with regards to the packet stream itself. You can enforce certain flags from being set. If violations occur, not only can you drop the traffic, but you could actually reset the flag itself and then send the traffic through the ACE. While most options are configurable, there are some rules that are always enforced. For example, the source IP of a traffic flow can never equal the destination IP.
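To show how a configurable check and an always-on check sit side by side, here is a tiny sketch using the two examples from class: the fragment limit you set yourself, and the source-equals-destination rule you cannot turn off. The function and parameter names are my own invention and the limit of 24 is an arbitrary value picked for the example, not an ACE default.

```python
def packet_allowed(src_ip: str, dst_ip: str,
                   fragment_count: int, max_fragments: int) -> bool:
    """Apply one always-enforced rule and one configurable rule to a packet."""
    # Always enforced: the source IP of a flow can never equal the destination IP.
    if src_ip == dst_ip:
        return False
    # Configurable: drop packets that arrive in more fragments than allowed.
    return fragment_count <= max_fragments


print(packet_allowed("198.51.100.7", "192.0.2.10", fragment_count=3, max_fragments=24))
print(packet_allowed("192.0.2.10", "192.0.2.10", fragment_count=1, max_fragments=24))
```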

Layer 4 Load Balancing – This is exactly what it sounds like. Load balancing based on TCP/UDP flows. I think the neatest part about this particular topic was the fact that you can actually load balance traffic across multiple firewalls and have the return traffic come back through the same firewall. This of course requires an ACE on both sides of the firewall, but with the ability for the ACE module to have up to 250 virtual contexts, it doesn’t have to be 2 separate physical ACE modules. The same module can host both contexts that live on either side of the firewall. It is fairly clever how they make this work. Essentially, when traffic comes from one firewall into the ACE, it remembers the MAC address of the sending firewall and places that connection in a state table. When traffic comes back through the ACE, it already knows which firewall to send the traffic to based on that state table. I’m not sure I would want to use an ACE module for load balancing through firewalls, but there are plenty of customers out there that are already doing it or could see the benefit in doing something like that.
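The firewall return-path trick can be modeled with a dictionary. This is my own simplified sketch of the state table idea as it was explained in class, not the actual ACE data structure. The class name, the 5-tuple layout, and the addresses are assumptions made for the example.

```python
class FirewallReturnPath:
    """Remember which firewall a connection arrived from so replies go back the same way."""

    def __init__(self):
        self._state = {}  # flow 5-tuple -> MAC address of the firewall that sent it

    def learn(self, flow, firewall_mac):
        """Record the sending firewall's MAC for a flow seen coming from a firewall."""
        self._state[flow] = firewall_mac

    def lookup_return(self, flow):
        """Find the firewall for return traffic by reversing source and destination."""
        src_ip, src_port, dst_ip, dst_port, proto = flow
        reverse = (dst_ip, dst_port, src_ip, src_port, proto)
        return self._state.get(reverse)


table = FirewallReturnPath()
table.learn(("198.51.100.7", 40000, "192.0.2.10", 443, "tcp"), "00:11:22:33:44:55")
# The server's reply has source and destination swapped
print(table.lookup_return(("192.0.2.10", 443, "198.51.100.7", 40000, "tcp")))
```

Keying on the MAC address rather than an IP is what lets the return traffic pick the right firewall even when the firewalls are otherwise identical next hops.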

Health Monitoring – If there’s one thing the ACE seems to have a fairly large number of options on, it’s the health monitoring or probes. All the major protocols have specific probes on the ACE that are used to check the health of the back end or “real” servers. This is way beyond the load balancer simply pinging the server to make sure it is up and running. Let’s say you used the HTTP probe. Instead of just trying a simple ping to check a back end server’s status, the HTTP probe can actually go out and make an HTTP connection to the server or serverfarm. That’s a far more intelligent way to query server status. Based on the probe results, any number of things can be done to the various serverfarms and servers ACE may be providing services for. They may be taken out of active status, have their priority reduced, etc.
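To show the difference between a ping and an HTTP probe, here is a rough sketch of the probe idea using Python's standard http.client module. This is not the ACE probe implementation or its configuration syntax. The function name, the default path, the timeout, and the choice to treat any 2xx status as healthy are all assumptions on my part.

```python
import http.client


def http_probe(host: str, port: int = 80, path: str = "/",
               timeout: float = 2.0) -> bool:
    """Return True if the server answers an HTTP GET with a 2xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply all count as a failed probe.
        return False
    finally:
        conn.close()
```

A server that answers ping but has a dead web service would pass an ICMP check and fail this one, which is the whole point of probing at the application layer.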

There’s a LOT more to this stuff. This was only day 2 of 4! More to come.

Posted in ace, cisco, learning, load balancing, training

ACE Boot Camp – Day 1

First off, let me point out that this is not a boot camp with a certification in mind. It’s a 4 day course given by Firefly Communications. Although I booked the course through Global Knowledge, I was told that they typically outsource their data center courses to Firefly. Works for me. As long as it is quality training, I don’t care if you outsource it to Elbonia. I am assuming they use the term “boot camp” because it is an end to end ACE class taught in just 4 days.

Which brings me to my first point. My company was able to use Cisco Learning Credits to pay for this class. At 30 credits, that translates to $3,000 US dollars for 4 days worth of training. Sitting in the class, I couldn’t help but notice people doing regular work while the instructor was going through his lecture. I realize most places are understaffed. Outages happen. Fires have to get put out. However, $3,000 for 4 days to me is a big deal. If you send your employees off to training that is critical/applicable for their job, LET THEM TRAIN! Leave them alone while they are there. Of course, that’s a 2 way street in that some employees need to learn to let go as well. The company will function without them for a few days. You can turn off “martyr” and “hero” mode for a couple of days. I am checking e-mail at night, but not being obsessive about it. I have very capable co-workers who can do anything and everything without my help.

Now, on to the actual class. Let me begin by commenting on the quality of instruction. I’ve been to plenty of poor classes in which someone was trying to shovel test material down your throat the whole time. I’ve also sat in several classes where the instructor was obviously out of their league and could not field questions from the crowd that weren’t covered on the vendor approved slide deck. That is simply not the case with Firefly. My instructor is very competent and when he hits the limit of his knowledge, he indicates that. So far, I think I have only seen 1 time out of the dozen or so questions he was hit with today in which that was the case. I guess that is what $3,000 a seat gets you.

It seems as if there is a fairly decent mix of people in this class. About a dozen or so in attendance. A fair amount of them are actually using the ACE 4710 appliance which I thought was rather interesting. Of course, most are using the standard ACE module. There are varying levels of experience with ACE as well. I was under the impression that I would be here mainly for the second half of the class, as I felt comfortable with the basics. Of course, just when you go and get comfortable, you realize how little you know. I learned a LOT today. Mainly, it was about things I never really bothered to dig into. You see, like most people, we probably only dig into the features we absolutely need right now. Maybe we plan on coming back and covering everything else at a later time, but I think that happens far less than we’d like it to. Some of the things we covered today that I was horribly deficient on were:

Resource Management – If you use multiple contexts, RM can prevent a single context from taking over the entire resources of the module. I don’t use this as it is currently not a concern, but good to know if things change!

HTTP Message Structure – 3 fields make up each HTTP message: Start/Request line(includes the METHOD), Header fields, and Body(which is optional)

ACE 4710 appliance – I don’t use it and never have. However, it does do a few things the module does not mainly centered around application acceleration. We have not covered that exhaustively yet, but I will take good notes when we do.
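The three-part HTTP message structure mentioned above is easy to see if you pull a raw request apart by hand. This is a simplified sketch for illustration only. It assumes a well-formed message with CRLF line endings and ignores details like repeated headers. The sample request is made up.

```python
def split_http_message(raw: bytes):
    """Split a raw HTTP message into start line, header fields, and body."""
    # A blank line (CRLF CRLF) separates the header section from the optional body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]  # for a request: METHOD, target, and version
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


request = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
start_line, headers, body = split_http_message(request)
print(start_line)               # GET /index.html HTTP/1.1
print(start_line.split(" ")[0]) # the METHOD: GET
print(headers)                  # {'host': 'example.com'}
print(body)                     # b'' since a GET usually has no body
```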

There were other things covered in which I was glad to get a decent refresher. The main one being TCP sequence numbers. They are always a bit confusing to me if I don’t study them on a fairly regular basis. Although you weren’t there with me in class today, you can read this post by Jeremy Stretch which talks about TCP sequencing. He even uses nice graphics!

We ended the day doing a pretty simple lab in which we created some contexts and messed around with resource management to see if we could oversubscribe the module in terms of CPU, memory, etc in regards to other contexts. Overall, it was a really good first day. I am eagerly anticipating what tomorrow will be like. It is also good to be taught by someone who actually helped develop the slide deck the course is taught from. He was able to add funny little details about how he created this drawing or that. It’s always nice to have someone teach who has a great sense of humor. So far, I give the Firefly ACE boot camp 2 thumbs up!

I am hoping to get a wee bit more technical in the following posts regarding ACE boot camp as the remaining days will REALLY focus on load balancing. Who knows? I might even post a graphic or two! Shocking isn’t it?

Posted in ace, cisco, learning, load balancing, training

Busy, Busy, Busy!

It’s not that I don’t have anything to say! People who know me know that I very rarely shut up for more than a few minutes. It’s just that I have been fairly busy lately. A lot of different things have been eating into my time, and writing things for a network blog takes a lot of time and effort. I have a 4 day Cisco ACE class next week in which I will be out of town, so I hope to get several posts done at night when I am sitting in the hotel. You don’t actually think I will be going out at night, do you? Hmmmm…..a week away from the office and a training day that ends at 4:30pm. That leaves me all sorts of time for the following:

1. Catch up on the billion or so web pages I have bookmarked.
2. Get some things written for the blog that revolve around possible competitors to the Nexus 7000. With HP, Arista, Brocade, Force10, and Juniper selling competing products, there’s a lot of data to sift through. I honestly have no idea who will come out on top. It might just be the Nexus 7000!
3. Comment on my experience with the ACE class I will be taking with Global Knowledge. I’ve spent the last several days at work focused on ACE, so I am very interested in filling in the gaps of my knowledge regarding this interesting product.
4. Read up a little more on the Cisco/EMC/VMware vBlock concept. I went to a presentation today about that and am intrigued to say the least.
5. Write about the concept of baselining your in-house applications. This would be focused on knowing what the normal TCP/UDP operations look like from a packet capture standpoint.

I try and keep a running list in Evernote of the things I would like to write about. The list continues to grow, but the time it takes to transform just one of those ideas into a somewhat coherent post just hasn’t been there.

I hope to have some new content up early next week. The last thing I want is to end up abandoning this blog and waste all my time playing mindless games on my iPad, although I do enjoy doing that a few times a week.

Posted in documentation, efficiency, learning, vendors

Technical Book Annoyances

It seems that if you read enough technical stuff, you are bound to find things you either disagree with, or know to be untrue. At least I think they are untrue until I validate my thoughts with the applicable standard or bounce my thoughts off of others within the IT community. I understand that it is VERY difficult to write books, white papers, articles, etc with a technical focus and have them turn out well. A lot of editing has to take place. The target audience has to be considered as well. In short, it is a lot different than writing a fiction novel or short story as when it comes to technical stuff, there’s a bunch of people out there second guessing every sentence you write and diagram you create. With fiction writing, the entire story is in someone’s head and accuracy is not a factor. Let me also state that I have tremendous respect for most technical authors.

However………

I have been doing a lot of reading on switching lately. I’m lazily making my way to the CCIE R/S lab attempt so I have been reading the 3560 command reference guide. It’s the Good Will Hunting approach. In addition to that, I decided to supplement my lunch today with some reading of End-to-End QoS Network Design. So, now that I am thinking about switching and QoS, I came home after work and started in on my so far unused copy of Cisco LAN Switching Configuration Handbook. There’s a QoS chapter. Hooray! The best of both worlds.

I get to the end of the third page in the chapter and run into a problem. That’s page 223 if you have the printed version handy. I came across this tidbit:

Classes 1 through 4 are termed the Assured Forwarding (AF) service levels. Higher class numbers indicate higher-priority traffic. Each class or AF service level has three drop precedence categories:

* Low (1)
* Medium (2)
* High (3)

Traffic in the AF classes can be dropped, with the most likelihood of dropping in the Low category and the least in the High category. In other words, service level AF class 4 with drop precedence 3 is delivered before AF class 4 with drop precedence 1, which is delivered before AF class 3 with drop precedence 3, and so on.

Did you spot any errors? At first, I thought I was reading it all wrong. Maybe I misinterpreted what the authors were saying. AF classes have different priority levels? Hmmmm. I thought they were the same. Then of course, there’s the issue of the book stating that the higher the drop precedence, the less likely it will get dropped. In other words, the book is saying that AF43 will get dropped less than AF41 along with a repeat of the logic stated earlier that AF41 comes before AF33 in terms of priority. At this point, I am really starting to get confused. That can’t be right. I’ve never heard anything like that before.

Maybe it was an error that was fixed in the errata that Cisco Press usually posts on their site. I checked, and there were no corrections issued for that particular book on the Cisco Press site. I also went over to Amazon and read through the reviews people had written. A lot of times, people will list errors in their book reviews, so it is a good source of information when determining whether or not to buy the book. There was nothing there that alluded to this potential error.

The next step was to check the RFC along with the End-to-End QoS book I am reading as well. RFC 2597 is entitled Assured Forwarding PHB Group, that is, the AF per-hop behavior group. Remember that in a DiffServ environment, QoS is done on a per class basis. There is no end to end guarantee for an individual flow so to speak. That’s what IntServ is for. One might even say that a per hop behavior is indicative of each hop being able to do whatever it wants to traffic based on the class it is in. The router couldn’t care less about the entire flow. It looks at DSCP markings and makes a decision based on that. Maybe that’s being too generic, but that is the way that I understand DSCP, PHB, etc. The RFC said the following:

Section 1 Paragraph 3

Within each AF class IP packets are marked (again by the customer or
the provider DS domain) with one of three possible drop precedence
values. In case of congestion, the drop precedence of a packet
determines the relative importance of the packet within the AF class.
A congested DS node tries to protect packets with a lower drop
precedence value from being lost by preferably discarding packets
with a higher drop precedence value.

Section 2 Paragraphs 1 and 2

Assured Forwarding (AF) PHB group provides forwarding of IP packets
in N independent AF classes. Within each AF class, an IP packet is
assigned one of M different levels of drop precedence. An IP packet
that belongs to an AF class i and has drop precedence j is marked
with the AF codepoint AFij, where 1 <= i <= N and 1 <= j <= M.
Currently, four classes (N=4) with three levels of drop precedence in
each class (M=3) are defined for general use. More AF classes or
levels of drop precedence MAY be defined for local use.

A DS node SHOULD implement all four general use AF classes. Packets
in one AF class MUST be forwarded independently from packets in
another AF class, i.e., a DS node MUST NOT aggregate two or more AF
classes together.

Okay, so the RFC is fairly clear that the lower the drop precedence, the safer the traffic should be. Routers and switches should drop AFx3 before AFx2, and drop AFx2 before AFx1. Additionally, we learn that the 4 AF classes are general use. There is no hierarchy. AF4x is no different than AF2x. We may give AF4x more bandwidth in a given QoS policy during periods of congestion, but we can also do the same for AF3x, AF2x, or AF1x traffic if we want to.
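
Here is a small Python sketch of that reading of the RFC. It builds the twelve general use AF codepoints using the recommended DSCP values from RFC 2597 (which work out to 8 times the class plus 2 times the drop precedence) and orders a set of markings by how eligible they are to be discarded. The names af_dscp, AF_TABLE, and discard_candidates are mine, made up for illustration. The function refuses to compare markings from different AF classes, since drop precedence only has meaning within one class.

```python
AF_CLASSES = range(1, 5)        # N = 4 classes
DROP_PRECEDENCES = range(1, 4)  # M = 3 drop precedence levels


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """Recommended DSCP value for AFij (RFC 2597), e.g. AF11 -> 10."""
    return 8 * af_class + 2 * drop_precedence


AF_TABLE = {
    f"AF{i}{j}": af_dscp(i, j)
    for i in AF_CLASSES
    for j in DROP_PRECEDENCES
}


def discard_candidates(markings):
    """Order markings from ONE AF class, most drop-eligible first."""
    classes = {m[2] for m in markings}  # third character is the class digit
    if len(classes) != 1:
        raise ValueError("drop precedence is only comparable within one AF class")
    # Higher drop precedence means discarded sooner under congestion.
    return sorted(markings, key=lambda m: int(m[3]), reverse=True)


print(AF_TABLE["AF41"], AF_TABLE["AF43"])
print(discard_candidates(["AF31", "AF33", "AF32"]))
```

Running it lists AF33 ahead of AF32 ahead of AF31, which is the opposite of what the Handbook passage claimed.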

Looking for the same type of information in the End-to-End QoS book yields the following taken from chapter 3, Classification and Marking Tools, in the “Marking Tools” section:

DSCP values can be expressed in numeric form or by special keyword names, called per-hop behaviors (PHB). Three defined classes of DSCP PHBs exist: Best-Effort (BE or DSCP 0), Assured Forwarding (AFxy), and Expedited Forwarding (EF). In addition to these three defined PHBs, Class-Selector (CSx) codepoints have been defined to be backward compatible with IP Precedence (in other words, CS1 through CS7 are identical to IP Precedence values 1 through 7). The RFCs describing these PHBs are 2547, 2597, and 3246.

RFC 2597 defines four Assured Forwarding classes, denoted by the letters AF followed by two digits. The first digit denotes the AF class and can range from 1 through 4. (Incidentally, these values correspond to the three most significant bits of the codepoint, or the IPP value that the codepoint falls under.) The second digit refers to the level of drop preference within each AF class and can range from 1 (lowest drop preference) to 3 (highest drop preference). For example, during periods of congestion (on an RFC 2597–compliant node), AF33 would be dropped more often (statistically) than AF32, which, in turn, would be dropped more often (statistically) than AF31. Figure 3-7 shows the Assured Forwarding PHB encoding scheme.
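
The bit layout described in that passage can be checked with a few lines of Python. This is just a sketch with a helper name (decode_dscp) I made up: it takes a 6-bit DSCP value, pulls the three most significant bits as the class (the IP Precedence value the codepoint falls under), and reads the next two bits as the drop precedence.

```python
def decode_dscp(dscp: int):
    """Return (ip_precedence, drop_bits, af_name_or_None) for a 6-bit DSCP."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    ip_precedence = dscp >> 3        # three most significant bits
    drop_bits = (dscp >> 1) & 0b11   # next two bits
    low_bit = dscp & 0b1             # always 0 for AF codepoints
    is_af = 1 <= ip_precedence <= 4 and 1 <= drop_bits <= 3 and low_bit == 0
    name = f"AF{ip_precedence}{drop_bits}" if is_af else None
    return ip_precedence, drop_bits, name


for value in (26, 28, 30):
    print(value, format(value, "06b"), decode_dscp(value))
```

DSCP 26, 28, and 30 all share the same top three bits (011, or class 3) and differ only in the drop precedence bits, which is why they decode to AF31, AF32, and AF33.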

More validation that I was correct in my initial hunch. Even though I was fairly certain the Cisco LAN Switching Configuration Handbook was wrong, I wanted to double-check with perhaps one of the greatest tools available. Twitter. I got responses from 3 different people, which is saying something considering it was about 10 PM CST. Thanks to Steve Shaw, Steve Rossen, and Andrew vonNagy for their assistance in validating my assumptions. Steve Shaw pointed me to the exact paragraph in RFC 2597 and Andrew mentioned 1 of the 3 new QoS videos that Kevin Wallace created and posted on YouTube which discuss the AF class and drop precedence. Check it out. It’s a fantastic video.

Now, you might be wondering why I am making such a big deal out of this. Is it really important? In this case, yes. Understanding the AF class and drop precedence is vital to one’s understanding of DSCP as a whole. If you get this wrong, it could bite you down the road when designing a QoS policy for a large network. It can bite you when troubleshooting a QoS problem. Some things have to be right every single time they are put in print. QoS is not some super-easy technology that can be mastered in a half-day. Auto-QoS is a great feature to have on hardware. I used it this past weekend when configuring 3750 PoE switches for use with Cisco phones. However, typing a command and understanding the ramifications of that command are 2 different things.

Thank goodness for multiple sources!

Posted in cisco, documentation, qos, routing, switching

Are You A Technology Bigot?

If you have been around IT for more than 5 minutes, you have probably been involved in a technology dispute. You have come across the person who loathes any company but one. Or, they hate one company more than any other. Perhaps they hate certain protocols or technologies because they are slightly proprietary. You get the point.

These people are everywhere. Perhaps you are one. I have been one at times. Maybe even right now. With the sheer amount of things your average networking professional is required to know, it is often easier to take refuge in the arms of a select few vendors. In a previous post, I asked the question regarding whether or not we can stay vendor neutral. I think we can, but it takes some concerted effort on our part to do so.

I don’t want to re-hash that old post, so I will move on to the point I want to make in this post. When you think about the companies you buy from (by that I mean the actual hardware/software producer and not the reseller), why do you buy from them? Surely you are not using only price to justify your selection, are you? What are the technical reasons you buy from certain vendors? Can you name any of them? How about if I give you a competing product? Can you tell me why your choice is better than the competition?

About a month ago, I bought an iPad. I went into the Apple store and stood in line to buy my iPad. As I was standing there, a young couple was looking at a MacBook, or iMac, or whatever and asked the sales guy why they should buy a Mac. I was actually impressed with how the lady asked the question. She said: “We are looking to get a new computer and I want you to tell me why I should buy a Mac. They cost a lot more than an HP or Dell system.” Obviously someone who is open to different technology, but wants to make the right purchase. She had “accountant” written all over her. The reply from the salesman really took me by surprise. He said: “You buy a Mac for several things. First, you don’t have to worry about any viruses. Second, it is a lot more secure than any Windows machine. Third, you don’t have to worry about it crashing on you. Fourth, it costs more because it is a much higher quality product.”

I didn’t stick around long enough to hear if he closed the sale or not. I was too enamored with my ability to con my wife into letting me spend $499 on a device that will waste even more of my time with meaningless games and YouTube videos. As I heard him say those things to that couple, I was thinking how incredibly naive and wrong they were. The Apple computing platforms have been relatively unharmed by large amounts of viruses and security issues because their market share has always been in single digits and wasn’t worth the criminal/hacker community’s time and effort. If 90% or more people are using Windows boxes, why would you spend time on less than 10% of the computer population? In the past couple of years, Apple has made huge gains in the consumer market. Huge. You’ll see an increasing number of exploits head Apple’s way as their market share increases. My opinion. I could be wrong, and if I am, call me out on it. As for Apple having to deal with OS or app crashes? Nah. That would never happen right? Perhaps the only thing he said that I would possibly agree with is that it costs more because it is higher quality. After using my iPad for a month, I must say that it is a VERY polished system. I love the way it works, but I do have plenty of apps that crash. Safari included.

Whew! Enough talk about Apple. I mentioned that story just to make a point. Sometimes we delude ourselves into believing that one product/company is better than another based on hearsay, groupthink, or our own positive experience with that product/technology/protocol. Perhaps it is all we’ve ever known and thus we come to the conclusion that it is the best. Or maybe that guy was just trying to make a sale and counted on the ignorance of the consumer. I don’t know. I doubt I will make another trip to the Apple store unless they are the only ones selling Apple TV. What can I say? I’m becoming a convert/fanboy/zombie when it comes to Apple.

Here’s an exercise for you. Don’t worry. It’s purely a mental one. Act as if you were a first time visitor to your company data center, computer room, closet, or wherever you hide your network gear. Ask about the various products you bought and why you chose them over a competing product. If you run a Cisco ASA firewall, why did you pick that over CheckPoint, Juniper NetScreen, WatchGuard, or SonicWall? Why did you choose that Juniper router over Cisco, Vyatta, Brocade, or Adtran? It’s a good exercise because it forces you to confront the real reasons you buy from certain vendors. You see, you can be a fan of a product or a company and buy continually from them without ever really considering why you do it in the first place. At some point, someone who knows a fair amount about that particular product space might ask you to defend your selection. You had better have a better answer than cost or the plethora of free lunches you get from the vendor. If you have no idea what the criteria are for determining the best choice, then you might be in over your head. Don’t worry though. Most people won’t notice as long as the free lunches keep rolling in.

In closing, can you be a technology bigot? Not if you want to be a professional. Every company has flaws and every company will produce bad technology from time to time. Being open to all solutions will keep you from buying the bad technology or using the wrong protocol. Your job as a corporate drone like myself is not to convert everyone to a particular product/technology to where they shut out reason and refuse to consider alternatives. Your job is to find the right product for your particular situation. Let the facts behind your decision speak for themselves. Tell people why you chose a particular product or technology from technical merits alone and you’ll find most people will accept that. Tell people that only a moron would pick something else and you’ll end up with a lot fewer friends. You better hope the vendor you buy from wants to buy you lunch all the time because no one else will.

****EDIT: I should probably make the point that I am only focusing on technical merits of hardware/technology first. There are other very valid reasons to buy or not buy certain products such as ease of use or familiarity by existing staff, ability to procure said equipment, or size and scope of project. If you have a fairly nailed down requirements list for some remote sites and need to deploy equipment there, then I wouldn’t advocate going through a full blown product selection procedure every single time. My point is simply that before any of those things are considered, the product must meet the technical requirements of the job at hand. After determining that, then you can consider the support structure, cost, etc. If the cost is too much, your requirements will have to change.
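
To illustrate the order of operations I am describing (technical requirements first, cost and everything else second), here is a rough Python sketch. Everything in it is hypothetical: the vendor names, the spec numbers, the prices, and the pick_product function are all made up for the example, and a real selection would weigh support and staff familiarity as well.

```python
def pick_product(candidates, hard_requirements, budget):
    """Filter on technical requirements first, then consider cost."""
    # Step 1: a product must meet EVERY hard technical requirement.
    qualified = [
        c for c in candidates
        if all(c["specs"].get(key, 0) >= minimum
               for key, minimum in hard_requirements.items())
    ]
    # Step 2: only now does cost enter the picture.
    affordable = [c for c in qualified if c["cost"] <= budget]
    return sorted(affordable, key=lambda c: c["cost"])


# Hypothetical candidates and numbers, for illustration only.
candidates = [
    {"name": "Vendor A", "specs": {"ports": 48, "throughput_gbps": 40}, "cost": 30000},
    {"name": "Vendor B", "specs": {"ports": 24, "throughput_gbps": 80}, "cost": 18000},
    {"name": "Vendor C", "specs": {"ports": 48, "throughput_gbps": 80}, "cost": 42000},
]
requirements = {"ports": 48, "throughput_gbps": 40}

shortlist = pick_product(candidates, requirements, budget=35000)
print([c["name"] for c in shortlist])
```

If the shortlist comes back empty because nothing that qualifies fits the budget, that is the point where the requirements (or the budget) have to change.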

Thanks to Scott and Jon for their thoughts on the matter.

Posted in vendors

It’s Game Day! Are YOU Ready?

It’s late August here in the United States. That means one thing for a lot of people. Football is starting. No offense, rest of the world. Your football is my soccer, although I tend to side with you that my soccer should be called football. How often does one kick an American football? A LOT less than we touch the ball with our hands. I’m getting off on a “semantics” tangent though. It is the one sport that predominantly resides within North America. Yes, I am acknowledging you too, Canada!

Many athletes at all age levels have been practicing for several months and are ready to get started with the football season. Many a Saturday afternoon, Sunday afternoon, and Monday night will be spent watching people knock each other over to carry a piece of pig skin across some lines on the ground and celebrate by dancing as gracefully as one can when covered by all that protective gear. Millions of people will watch all the way up to early next year when the championships are decided by as little as 1 point. For the most part, there are no do-overs. All of you “instant replay” fans just bite your tongue and let me carry this analogy as far as I can. When the game is over, it is over. There are no series of games like baseball, hockey, and basketball have. You have one shot at glory. Miss it, and you’ll have to wait until next season.

There’s a German proverb which says: “To aim is not enough. You must hit!”

I get paid for things that go bump in the night. Whether that thing happens to be a router failing, or a circuit deciding it no longer likes my 1’s and 0’s, my job is to fix it and fix it fast.

I do come to work during the day. I go to meetings and look at configurations of various hardware. I build network diagrams and dispense or seek advice on a number of different things. I participate in the important philosophical discussions like whether or not Anakin Skywalker was a better Jedi than Luke or Yoda. (In my opinion, Anakin (aka Darth Vader) was the better Jedi and was robbed of his destiny by his meddling child and his rebel scum friends.) I put in change requests for maintenance that must be performed. I can plow through the day to day stuff with hardly any interaction from management. Of course, they care about the quality of the work and if I used these stencils in my Visio diagrams, they might object. However, my overall existence in the day to day network operations life is rather calm.

In essence, I do the things that need to be done during the day, but my REAL job comes in spurts. Kind of like football(From now on, when I say football, I mean American football.) players. My game time comes at odd hours much like the police officer or fire fighter. When trouble happens, I need to perform. I need to be able to ask the right questions and formulate a short list of what the possible problems are. I need to be able to troubleshoot in a logical fashion either working up/down the OSI model or grabbing a packet capture and examining the session flows. When it is my equipment or systems that are at fault, I have to get in there and make the big play. I need a touchdown each and every time. I can’t drop a pass or fumble the ball. I get paid for results and rest assured my management is watching. They have to. All it takes is for someone much higher up on the food chain to ask why they pay the salaries of network people who can’t seem to fix the network. Then, I am out on the street forced to sell my services to the highest bidder, who hopefully doesn’t play Dungeons and Dragons or World of Warcraft with any of my now former co-workers/managers. I would have used a sport like tennis or basketball, but since I am in IT, the odds of that happening are much less than a bunch of technical geeks sitting around in Viking helmets and leather tunics taking part in the raid of an Ogre village on World of Warcraft over a shared broadband connection in an obscure apartment complex deep in suburbia while guzzling Red Bulls and listening to angry death metal music. By the way, for all of you D&D geeks who are shaking your heads in disgust at my mention of Ogre villages, I get it. I saw Shrek. I know they are solitary creatures, but I needed an effective illustration. If I used Elf village, the visual would have been less powerful.

Am I saying that we can’t make mistakes? Well, that depends. There are some places in which you can’t. Ever. Most places will allow mistakes. We’re all human and mistakes will happen. Of course, with enough attention to detail those mistakes can be minimized significantly. What I AM saying is that you need to be able to perform when a crisis hits. Your entire career at a particular company may come to a screeching halt over just a few minutes of doing the wrong thing. It won’t matter how long you have been with company X if your performance is so poor that company X starts bleeding millions of dollars due to an outage that you can’t fix. Problems are going to happen. Outages are going to happen. If your company expects you to fix them, you better fix them. Now I know that some people get in over their heads. It may be the company’s fault for placing an unrealistic demand on you, or it may be your fault for misrepresenting your capabilities. If your company is expecting you to fix and support issues with F5 load balancers and you have never so much as looked at an F5 load balancer, you better let someone know and get up to speed as fast as you can. After all, your job typically is whatever your company says it is. Don’t like that? Tough. Go somewhere else. Life isn’t fair. Sometimes you are the one in the room everyone is counting on to fix the problem, even if it isn’t your equipment that is causing the problem.

In the interest of brevity, let me close with some thoughts on how to ensure your performance is top notch.

1. Know what the scope of your job is. – This may seem a bit simplistic, but you need to be on the same page as management when it comes to your responsibilities. You cannot rely on someone else to tell you if that piece of network gear buried in some rack in a data center is your responsibility. You are going to have to find that out yourself and it needs to happen before the problem occurs. Hopefully your co-workers who have been there longer than you have a good grasp on what things belong to you. For example, if your boss expects you to take care of the wireless network, you better do it or have it handed off to someone else who can take care of it when a problem arises.

2. Develop your skills around your responsibilities. – I’m not advocating you abandon any sort of professional development that is not DIRECTLY related to your job. However, a BIG part of getting a pay check from a company is directly tied to being able to do your job as defined by the company. Good managers won’t load you up with things you are not able to do unless you have managed to con your way into a job by being a bit liberal with your resume. If you are stuck with something you are relatively new to, do the best you can and make sure your management KNOWS you are doing the best you can. Read books, configuration guides, white papers, and other technical documentation. Attend a training class. A career in IT is all about adaptation. None of us are working with the same hardware/software we were 10 years ago. If you are, odds are you either work for the government or a REALLY cheap company. Perhaps there are one or two things that have had a ten year plus shelf life, but for the most part, technology changes so fast that a decade is a lifetime in IT.

3. Be prepared. – Expect the unexpected. Think about different failure scenarios and design the network to remediate any single points of failure. If need be, have some block time purchased with an external consultant or VAR that has considerable experience with your specific hardware/software platforms. Carry maintenance contracts on all your hardware/software that is critical.

4. Raise any red flags early on. – If there are issues you know are going to be a problem, let someone know as soon as possible. Document those issues. Fix those issues. Even if the company says no due to budgetary reasons or some technical issue, at least you have done your homework and tried to make these issues known. If a problem does occur, nobody can come back to you and say that you should have known about this, or that it was your fault, etc. Additionally, it may work out to your benefit as management typically appreciates people who just want to make things better and do what is right for the stability of the network.

5. Stay calm during the outage/problem. – Remember that in a lot of people’s eyes, it is always the network that is at fault. Don’t let that get you down. Stay focused and work on the problem at hand. Ask as many questions as needed to get an idea of what the scope of the problem is. Don’t be afraid to ask very basic questions. One of the best ones to ask is “What changed?” or “When did the problem start?”. Maintain professionalism at all times. I get upset when I am on a conference call and someone won’t stop moaning about why it’s not their fault long enough for me to ask a question or answer one. However, it’s rather immature and unprofessional for me to lash out at them in anger even if I know it’s not my issue. There ARE times when I have had to tell someone to stop talking so that I could either answer a question or ask one of someone else on the call. I hate having to do that, but sometimes in the interest of getting it fixed YOU HAVE TO. If you are dealing with an issue where people are congregating around your desk watching over your shoulder, try and tune them out. You can’t always tell them to get lost or to leave you alone. You have to learn to work under pressure, but if you have taken item number 2 to heart, you should be able to minimize the time these people are hovering near your desk.

6. Be humble. – If people know that you don’t know it all, they tend to cut you a little more slack. If you are condescending and treat people like garbage because they don’t know the difference between a “routed” protocol and a “routing” protocol, they will be very unforgiving of your mistakes. Remember, there is no way possible you can know it all. There are people out there who know far more than you do. Sometimes they are in the same room as you. If you save the day and score a touchdown, good job. You don’t have to do the happy dance in front of everyone if you figure out what caused that routing loop. Your actions will speak for themselves. On the other hand, if you storm into the room demanding people shut up and watch you perform, you better get it right. If you don’t, your stock just went down and at some point, you’ll be looking for work elsewhere.

Ask any athlete how hard they have to work in order to get to their peak performance level and you’ll no doubt hear a recurring answer. You will find that it took a lot of time and effort to get there. There are no shortcuts. When the wide receiver catches the ball and runs 80 yards to the end zone for a touchdown, you can bet he ran sprints hundreds of times in the months prior. When the quarterback throws the ball for 50 yards and drops it right on the chest of the wide receiver, you can bet he threw that same pass hundreds of times in the months prior. When the defensive end wraps his arms around the running back and slams him to the ground, you can bet he practiced on a tackling dummy hundreds of times in the months prior. The examples go on and on. Peak performance takes time and effort. You practice and refine your skills for what is usually a short performance. Sometimes the performance extends over a couple of days or weeks, but generally issues get diagnosed and resolved in a relatively short time. How you prepare will determine the outcome. If you take shortcuts, expect poor results. If you put in the effort to perform well, good things will come your way. Granted, you probably won’t get a multi-million dollar contract with company X, but how many football players do you know who understand cool stuff like policy routing and VRFs? Oh, and being able to fix problems on the network quickly leaves you more time to play World of Warcraft.

Posted in efficiency, learning