Thursday, March 31, 2011

Scary ways to abuse VLANs

I ran across this article the other day (after someone at the Spiceworks Forums posted a link to it).  It made me cringe, repeatedly, to read it.  This post will address why this other article is so wrong, and why you should not do what this guy suggests. 

The key to understanding how to approach this lies in understanding what a "broadcast domain" is.  A broadcast domain is a set of devices in which every member will receive a broadcast sent by any other member.  Normally, every device connected to a standard LAN switch will be in the same broadcast domain; in short, switches define broadcast domains, and every device connected to the same LAN is a member of the same broadcast domain.

What VLANs allow one to do is treat a switch as if it were multiple independent switches, coexisting in the same box.  The switch is told to group its ports, some to one virtual switch, others to another.  The end result is to have multiple LANs (virtual LANs, or VLANs) coexisting on the same hardware.  You could get the same result by buying multiple switches, one for each independent LAN.  VLANs just let you do this with fewer switches.  That's all.  (There's some added complexity when you start talking about trunking and about layer 3 switching, but neither of these is essential to understanding what a VLAN is.)
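To make the "multiple switches in one box" idea concrete, here is a minimal sketch in Python.  It is a toy model, not how any real switch is implemented: the port numbers, the VLAN IDs, and the function names are all made up for illustration, and every port is assumed to be a plain access port in exactly one VLAN.

```python
from collections import defaultdict

# Hypothetical 8-port switch: ports 1-4 are in VLAN 10, ports 5-8 in VLAN 20.
PORT_VLANS = {1: 10, 2: 10, 3: 10, 4: 10, 5: 20, 6: 20, 7: 20, 8: 20}


def broadcast_domains(port_vlans):
    """Group access ports by VLAN; each group is one broadcast domain."""
    domains = defaultdict(set)
    for port, vlan in port_vlans.items():
        domains[vlan].add(port)
    return dict(domains)


def broadcast_recipients(port_vlans, source_port):
    """Return the ports that receive a broadcast sent from source_port."""
    vlan = port_vlans[source_port]
    return {p for p, v in port_vlans.items() if v == vlan and p != source_port}


if __name__ == "__main__":
    print(broadcast_domains(PORT_VLANS))
    print(broadcast_recipients(PORT_VLANS, 1))
```

In this model a broadcast from port 1 reaches ports 2 through 4 and nothing else, exactly as if ports 5 through 8 were on a different physical switch.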

The author's definition of a VLAN (Virtual LAN) as a "technology that enables dividing a physical network into logical segments at Layer 2" is, I suppose, not entirely inaccurate; however, it's less than useful for understanding what a VLAN is.  The problem this author has is that he's viewing VLANs as a partitioning of a physical network.  While VLANs do have this effect, that's not the way to understand them.  It's far better to think of VLANs as a way for multiple LANs (that is, multiple broadcast domains) to independently coexist in the same hardware, much the way that virtualization hypervisors allow multiple computers to independently coexist on the same hardware.

A few lines down from that is another flat-out wrong statement.  VLANs are not used to "join physically separate LANs or LAN segments into a single logical LAN".  You cannot do that with VLANs alone; doing this (if for some reason you wanted to) is the role of a bridge or a tunnel, or just a cable between two switches.  You might use a VLAN in the course of setting up a bridge or tunnel, but VLANs don't allow you to do this on their own.

The discussion on page two about the use of VLANs to control broadcast traffic is fundamentally correct; this is one of the major reasons for separating a network into multiple broadcast domains.  The statement "Small LANs are typically equivalent to a single broadcast domain" really illustrates the fundamental mistake this author made: a LAN is, by definition, a broadcast domain, and so a small LAN would necessarily also be a broadcast domain.  There's also some discussion about IP multicasts that is entirely incorrect and should simply be ignored.  The reason IP multicasting is disabled on most consumer routers is that the routers aren't smart enough to handle it correctly; it's got nothing to do with bandwidth consumption.  In actuality, properly configuring IP multicasting in switches and routers that fully support it reduces, rather than increases, bandwidth use, and most large networks will turn these functions on to make the best use of their bandwidth.

And a little bit later we have another doozy of a statement: "VLANs can be configured on separate subnets".  Indeed, not only can they be, but in fact they pretty much have to be, assuming you're using VLANs properly.  Since each VLAN is a separate broadcast domain, and each separate broadcast domain needs its own subnet, each VLAN (in a properly constructed network) will have its own, distinct, subnet.  The author here gets away with breaking this rule only because the switch he's using offers a port mode that lets a port exist in more than one VLAN simultaneously, which breaks the virtualization model I talked about earlier.  This port mode is found on low-end devices like the Linksys switch he's using; it is typically not found on larger, enterprise-grade switches.  You simply cannot set up a Catalyst 3760 to behave the way this guy has set up this little SRW2008.
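If you want to sanity-check the "one VLAN, one subnet" rule against an addressing plan, something like the following sketch would do it.  It uses Python's standard ipaddress module; the VLAN IDs, the subnets, and the function name are all hypothetical examples of mine, not anything taken from the article.

```python
import ipaddress
from itertools import combinations


def overlapping_vlan_subnets(plan):
    """Return pairs of VLAN IDs whose subnets overlap (ideally an empty list)."""
    nets = {vlan: ipaddress.ip_network(cidr) for vlan, cidr in plan.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


# Hypothetical plan that follows the rule: each VLAN has its own subnet.
GOOD_PLAN = {10: "192.168.10.0/24", 20: "192.168.20.0/24", 30: "192.168.30.0/24"}

# Hypothetical plan that breaks it: three VLANs carved out of one subnet.
BAD_PLAN = {10: "192.168.1.0/24", 20: "192.168.1.0/24", 30: "192.168.1.128/25"}

if __name__ == "__main__":
    print(overlapping_vlan_subnets(GOOD_PLAN))
    print(overlapping_vlan_subnets(BAD_PLAN))
```

An empty result means every VLAN has address space to itself; any pair that comes back is a place where two supposedly separate broadcast domains share a subnet.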

Here's the problem with how this guy is abusing VLANs.  Instead of making each VLAN its own broadcast domain, he's taken an existing broadcast domain and broken it into three pieces.  That, by itself, would be OK, as long as he then set up routing between those domains to enable them to communicate (at layer 3 instead of layer 2).  But he doesn't do that, because the switch he's using doesn't offer layer 3 switching.  So what he does instead is selectively violate the integrity of the segregation between the VLANs.  This works only because this switch permits the "general" port access mode, which allows multiple VLANs to be present on the same port untagged.  I've never seen an enterprise switch, at least not from a major vendor, that allows this, and it's generally not a good idea, precisely because it enables a violation of the cardinal rule that every device connected to the same (V)LAN is in the same broadcast domain.  (He admits that the ability to do this is "key to our example".  Scary.)  The crazy thing that ends up happening with this configuration is that traffic sent to a device on one VLAN will be replied to on a different VLAN entirely.  While this may not create a problem when you're only using one switch that shares its MAC tables across all VLANs, it won't scale up to multiple switches, and this configuration will cause excess unicast flooding in a multiple-switch environment (exactly one of the problems it was supposed to avoid), especially if the switch has independent MAC learning on each VLAN.  And it's a very tricky and tedious configuration to set up and maintain, far more complicated than a proper setup using access mode ports and a layer 3 switch.
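The flooding problem is easier to see in a toy model.  The sketch below is my own simplification, not the behavior of any particular switch: the Switch class, the three-port layout (a server on port 1, clients "A" and "B" on ports 2 and 3), and the VLAN numbers are all assumptions made up for illustration.  Each port tags incoming untagged frames with its own VLAN, but the server's port is an untagged member of both client VLANs and the client ports are untagged members of the server's VLAN, which is the sort of thing "general" mode permits.

```python
class Switch:
    """Toy learning switch; a deliberately simplified model."""

    def __init__(self, pvid, members, shared_learning):
        self.pvid = pvid          # port -> VLAN assigned to untagged ingress frames
        self.members = members    # VLAN -> set of ports that egress that VLAN
        self.shared = shared_learning
        self.table = {}           # (VLAN or None, MAC) -> port

    def _key(self, vlan, mac):
        # Shared learning ignores the VLAN; independent learning keys on it.
        return (None if self.shared else vlan, mac)

    def send(self, in_port, src, dst):
        """Return the set of ports the frame is sent out of."""
        vlan = self.pvid[in_port]
        self.table[self._key(vlan, src)] = in_port   # learn the source address
        out = self.table.get(self._key(vlan, dst))
        if out is not None and out in self.members[vlan]:
            return {out}                             # known destination: one port
        return self.members[vlan] - {in_port}        # unknown destination: flood


def reply_ports(shared_learning):
    """Client A talks to the server; return where the server's reply goes."""
    sw = Switch(
        pvid={1: 1, 2: 2, 3: 3},
        members={1: {1, 2, 3}, 2: {1, 2}, 3: {1, 3}},
        shared_learning=shared_learning,
    )
    sw.send(2, "A", "S")           # request arrives on VLAN 2
    return sw.send(1, "S", "A")    # reply arrives on VLAN 1


if __name__ == "__main__":
    print(reply_ports(shared_learning=True))
    print(reply_ports(shared_learning=False))
```

With shared learning the reply goes only to client A's port.  With independent learning the switch learned A's address in VLAN 2 but looks it up in VLAN 1, never finds it, and floods the reply to client B's port as well.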

So please, do not ever configure a network like this.  The simple fact is that this sort of configuration only works in a small network, and if you have a small network you almost certainly don't need to do this sort of thing anyway!  In fact, please don't use the "general" access mode even if your switches support it; any time you do, you are violating the integrity of the VLAN broadcast domain, and you'll probably end up with hard-to-diagnose network gremlins somewhere down the line, not to mention a configuration that's simply impossible on most upper-end switches.  Just stick to one untagged VLAN per port, please; if you find yourself breaking this rule, you've probably done something wrong in your design.

So, now that you've read my rant about why this is the wrong way to go about it and for the wrong reasons, I should tell you a bit about the right reasons.  For that, go here.  I'm not going to get into the details of how because that varies a lot between switch types.  If you want specific help on a specific problem, go here.