Thursday, September 15, 2011

Thoughts on Bitcasa

Bitcasa has been getting a lot of attention in my Google Plus circles the past week or so; I suspect this is because it was in the running for TechCrunch's Disrupt prize (but ultimately lost to something called "Shaker", which appears to me to be some sort of virtual bar simulation). Bitcasa claims to offer "infinite" storage on desktop computers for a fixed monthly fee. I've yet to see any solid technical information on how they plan to deliver this, but it seems to me that they intend to store your data in the cloud and use the local hard drive as a cache.

There's nothing earth-shattering about this; I set up Tivoli Hierarchical Storage Manager to store infrequently used files on lower-priority storage media (either cheap NAS arrays or tape) four years ago with my last employer, and the technology was fairly mature then. Putting the HSM server, data stores, or both in the cloud was possible even then, and should be even more practical now that cloud services are far more mature. So while there are obviously issues to sort out, this isn't a big reach technically.

More interesting to me is how they plan to provide "infinite" storage. Given the promise of infinite storage, most users will never delete anything; my experience is that users don't delete files until they run out of space, and if they really do have "infinite" storage that will never happen. The rule of thumb I recall from storage planning 101 is that storage requirements double every 18 months. According to Matthew Komorowski, the cost of storage drops by half every 14 months, so their cost to provide that ever-doubling storage should slowly decline, but the margin is fairly thin and may not be sufficient to cover the growing complexity of their storage infrastructure. They'll also have to cope with ever-increasing amounts of data transit, but I can't find good information just now on the trend there, in part because transit pricing is still very complicated.
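A quick back-of-the-envelope sketch in Python, taking the 18-month doubling and 14-month halving figures above at face value, shows how the per-user cost trends:

    # Back-of-the-envelope: per-user stored data doubles every 18 months,
    # while cost per gigabyte halves every 14 months.  Net cost per user
    # is then proportional to 2**(t/18) * 2**(-t/14).

    def relative_cost_per_user(months):
        growth = 2 ** (months / 18)    # how much more data the user stores
        price = 2 ** (-months / 14)    # how much cheaper each gigabyte has become
        return growth * price

    for years in range(0, 11, 2):
        m = years * 12
        print(f"year {years:2d}: relative cost per user = {relative_cost_per_user(m):.2f}")

    # The exponent works out to -t/63: the provider's cost per user halves
    # only about every five years -- a decline, but a slow one.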

Even more interesting to me is that Bitcasa appears to be claiming that they will use deduplication to reduce the amount of data transferred from clients to the server. That, by itself, is not surprising. The surprising thing is that they also claim they will be using interclient deduplication; that is, if you have a file that another client has already transferred, they won't actually transfer the file.

I think they're overselling the savings from interclient deduplication, though. I may not be typical, but the bulk of the file data on my systems falls into a few categories: camera raws for the photos I've taken; datasets I've compiled for various bulk data analyses (e.g. census data, topographic elevation maps, the FCC licensee database); virtual machine images; and savefiles from various games I play. The camera raw files (over 10,000 photographs at around 10 megs each) are obviously unique to me, and as I rarely share the raws, the deduplication gain there is essentially nil. The datasets themselves are duplicative, of course (most of them are downloaded from government sources), but the derived files that I've created from the source datasets are unique to me and are often larger than the source data. So, again, only limited gain opportunity there. Most of my virtual machine images are unique to me, as I've built them from the bottom up myself. And obviously the saved games are unique to me. If I were to sign up my laptop and its 220 GB hard drive (actually larger than that, but I haven't gotten around to resizing the main partition from when I reimaged the drive onto a newer, larger drive after a drive crash a couple months ago, so Windows still thinks it's a 220 GB drive) onto Bitcasa, they'd probably end up having to serve me somewhere around 170 to 200 GB of storage, depending mainly on how well it compresses. (Much of the data on my machine is already compressed.)
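For concreteness, here's a minimal sketch of how interclient, content-addressed deduplication works in general; this is my own toy illustration (the chunk size, hash choice, and in-memory "store" are all made up), not anything Bitcasa has published:

    import hashlib

    # Toy content-addressed store: chunks are keyed by their SHA-256 digest,
    # so identical chunks -- from any client -- are stored (and uploaded) once.
    CHUNK_SIZE = 4 * 1024 * 1024   # 4 MiB chunks; real systems vary
    store = {}                     # digest -> chunk; stands in for the provider's back end

    def upload(data):
        """Return (chunks seen, chunks actually transferred)."""
        chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
        sent = 0
        for chunk in chunks:
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in store:    # only never-before-seen content crosses the wire
                store[digest] = chunk
                sent += 1
        return len(chunks), sent

    # A popular MP3 uploaded by two different users: the second upload sends nothing.
    song = b"\x00" * (8 * 1024 * 1024)
    print(upload(song))   # (2, 2) -- the first client pays the transfer cost
    print(upload(song))   # (2, 0) -- the second client gets it "for free"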

Even my music (what there is of it; I keep a fairly small collection of about 20 gigabytes) doesn't dedup well. I know, because I've tried to dedup my music catalog several times over the past decade-plus, and my experience is that "identical" songs are often not identical at the bit level; the song might be the same but the metadata tags differ in some manner that makes the files not bit-identical. Or the songs might be compressed at different bit rates or even with different algorithms; I have several albums that I've ripped as many as five times over the years with different rippers. Even if you rip the same song twice from the same disc with the same settings on the same ripper, it still might end up with a different bitstream if the disc has a wobbly bit on it.
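A hypothetical illustration of why "the same song" rarely dedups: the byte strings below stand in for two rips with identical audio but different tags, and even a small tag difference yields a completely different file hash.

    import hashlib

    # Two "copies" of the same song: identical audio payload, different metadata tags.
    audio = b"FAKE-AUDIO-STREAM" * 1000          # stand-in for the encoded audio
    rip_a = b"TAG:ripped by EAC 0.99 \x00" + audio
    rip_b = b"TAG:ripped by iTunes 10\x00" + audio

    print(hashlib.sha256(rip_a).hexdigest()[:16])
    print(hashlib.sha256(rip_b).hexdigest()[:16])
    # Different digests, so whole-file dedup treats them as unrelated files,
    # even though the audio payload is byte-for-byte identical.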

If Bitcasa assumes that most of its clients will be "typical" computer users, most of whose data is "stuff downloaded from the Internet", then I suppose they can expect significant deduplication gain for videos and music, and especially for executables and libraries (nearly everyone on a given version of Windows will have the same NTOSKRNL.EXE, for example, although in general OS files cannot be made nonresident anyway without affecting the ability of the computer to boot). The problem I think they're going to run into is that many of their early adopters are not going to be like that. Instead, they're going to be people like us: content creators far more than content consumers, whose computers are all filled to the brim with stuff we've created ourselves out of nothing, unlike anything else out there in the universe.

Then there's the whole issue of getting that 220 GB of data on my machine to their servers. It took my computer nearly 40 days to complete its initial Carbonite image, and that's without backing up any of the executables. I have some large files that I keep around for fairly infrequent use; if Bitcasa decides to offline one of those and I end up needing it, I might be facing a fairly long stall while it fetches the file from their server. Or, if I'm using the computer off the net (something I often do), then I'm hosed; and if I'm on a tether (which is also fairly frequent) then I could be facing a download of a gigabyte file (or larger) over Verizon 3G at about 200 kbps. Good thing Verizon gives me unlimited 3G data!
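Some rough transfer-time arithmetic; the 200 kbps figure is from above, while the 1 Mbps uplink is my own assumption for a typical residential connection of the day:

    # Rough transfer-time arithmetic.  The 200 kbps figure is from the post;
    # the 1 Mbps uplink is an assumed typical residential upload speed.

    def transfer_days(gigabytes, megabits_per_second):
        bits = gigabytes * 8 * 1e9
        seconds = bits / (megabits_per_second * 1e6)
        return seconds / 86400

    print(f"220 GB over a 1 Mbps uplink : {transfer_days(220, 1.0):5.1f} days")
    print(f"  1 GB over 200 kbps 3G     : {transfer_days(1, 0.2) * 24:5.1f} hours")
    # Roughly 20 days of continuous uploading for the initial image, and about
    # 11 hours to pull down a single gigabyte file over a tethered connection.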

I also wonder how Bitcasa will interact with applications that do random access to files, such as database apps and virtual machine hosts. I use both of these on a fairly regular basis, and I think it might get ugly if Bitcasa wants to offline one of my VHDs or MDFs (or even my Outlook OST or my QuickBooks company file). If they are planning to use reparse points on Windows the way Tivoli HSM does, files that need random access, or that need advanced locking semantics, will have to be fully demigrated (recalled from the cloud in their entirety) before they can be used at all.

In addition to all this, the use of cryptographic hashes to detect duplicates is risky. There's always the chance of a hash collision. Yes, I know, the odds of that are very small with even a decent-sized hash. But an event with even very low odds will happen with some regularity if there are enough chances for it to occur, which is why we can detect the 21 cm hyperfine hydrogen spin transition at 1420.405752 MHz: any given hydrogen atom undergoes this transition only very rarely, but we can detect it fairly easily because there are billions upon billions of hydrogen atoms in the universe. With enough clients, eventually there's going to be a hash collision, which will ultimately result in replacing one customer's file with some totally different file belonging to some other customer. Worse yet, this event is undetectable by the service provider. (Kudos to the folks at SpiderOak for pointing this out, along with other security and legal concerns that interclient deduplication presents.)
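For the curious, the "low odds times many chances" argument is just the birthday bound. Here's a rough sketch; the hash widths are assumptions for illustration, since Bitcasa hasn't said what hash they use:

    import math

    # Birthday-bound approximation: with n distinct chunks and a b-bit hash,
    # P(at least one collision) ~= 1 - exp(-n*(n-1) / 2**(b+1)).
    # (expm1 keeps the tiny probabilities from underflowing to zero.)

    def collision_probability(n_chunks, hash_bits):
        return -math.expm1(-n_chunks * (n_chunks - 1) / 2.0 ** (hash_bits + 1))

    for bits in (128, 256):
        for n in (1e12, 1e15, 1e18):
            p = collision_probability(n, bits)
            print(f"{bits}-bit hash, {n:.0e} chunks: p ~= {p:.2e}")

    # With a 128-bit hash the odds stop being negligible once the store holds
    # on the order of 10**18 chunks; a 256-bit hash pushes that point far out
    # of reach.  Either way, a collision silently maps two different files to
    # the same stored object, which is the failure mode described above.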

So while I think the idea is interesting, I think they're going to face some pretty serious issues in both the short term (customer experience not matching customer expectation, especially with respect to broadband speed limitations) and long term (storage costs growing faster than they anticipate). Ought to be interesting to see how it plays out, though. I think it's virtually certain that they'll drop the "infinite" before too very long.

(Disclaimers: Large portions of this post were previously posted by me on Google Plus. I am not affiliated with any of the entities named in this post.)

Tuesday, July 19, 2011

Aaron Swartz v. JSTOR

So the big noise on G+ today (at least in my circles) is all about Aaron Swartz' arrest for hacking into MIT's network and downloading over 4 million journal articles from JSTOR. Demand Progress, the nonprofit that Aaron is connected to, in a masterful bit of spin, is alleging that he has been "bizarrely" charged with "downloading too many journal articles". This would be true insofar as "too many", in the circumstances in which he did it, would have been zero, if you are to believe the indictment. The problem isn't so much that he downloaded "too many" articles, but instead that (as alleged in the indictment) he used several different false IDs in his attempts to do so, did so at such a rate that it created problems for JSTOR's service, took several affirmative steps to evade MIT's and JSTOR's attempts to stop him (at which point he knew, or should have known, he had exceeded his access), and eventually resorted to sneaking into restricted areas at MIT in order to facilitate the process further before finally being caught in the act by MIT police.

Of course, many of the people on Google+ are insisting that this is about copyright (it's not) and that Aaron's actions strike a great blow for the freedom of information (they don't) and that this might even take down JSTOR (it won't).  It is fairly clear that Aaron's actions are for one of two purposes: either he was collecting data for the same sort of mass analysis of journal articles that he did back in 2008 (which purportedly involved some 400,000 journal articles), or else he was planning to release the downloaded articles onto the Internet through a file sharing service (as is alleged in the indictment). Both of these would be fairly noble causes; I find it rather unlikely that Aaron's intent here was venal. 

However, if his intent was noble, that doesn't explain the spin from Demand Progress. If this is civil disobedience, then they should be doing what other protest organizations have done in similar situations: admit what they did, say why they did it, and demand a public outcry against whatever was so wrong that it justified breaking the law, while at the same time standing prepared to accept the consequences of having broken the law. However, DP's PR statement doesn't do that. It instead calls the prosecution "bizarre" and clearly intends to whitewash Aaron's use of falsified identity information, his increasingly determined attempts to evade MIT's and JSTOR's security, and eventually his repeated criminal trespasses onto MIT's grounds, all in order to accomplish his goals. That, to me, is not the mindset of civil disobedience, but instead the mindset of a criminal attempting to avoid responsibility for his crime.

So, to Aaron: dude, boner move. There are better ways to do this sort of thing that don't involve skulking around in MIT's basement peering through the vent holes of your bike helmet. To Aaron's supporters: please don't make this about copyright. It's about Aaron not thinking clearly about his goals and means. I'd have a comment for JSTOR, too, but I really don't know what to say. Someone has to collate, digitize, and store these documents, and to expect them to do it for free seems silly. Someone has to pay for it. (But see also JSTOR's comment on the indictment.)

Information may want to be free, but data centers are not free. Theft of computing services is not a "victimless crime", and fundamentally that's what Aaron did here. If his access really was for "research", then I'd like to know if he attempted to negotiate with JSTOR for the access that he needed, or if he merely assumed that they wouldn't let him have it. I'd be a lot more sympathetic if he made an effort and was rebuffed.

In any case, I imagine Aaron will end up like poor ol' Mitnick: barred from using computers for some time, barred from the MIT campus forever, and slapped with a huge fine and a felony conviction or two. And he's not even 25 yet.

Also, if anybody finds the security camera images of him sneaking around using his bike helmet to hide his face, please let me know.

Monday, May 30, 2011

Weinergate, or the dangers of public WiFi

So the news is all atwitter today over what has been dubbed "Weinergate" by at least some in the media, relating to New York Congressman Anthony Weiner allegedly tweeting a picture of an erection to a college student in Seattle.  Anthony Weiner has claimed that his Twitter account was hacked, a claim which conservatives are disputing.  This post is about the credibility of Weiner's claim, and the hidden danger of using public unencrypted WiFi to access password-protected services.

Weiner is a pretty aggressive user of social media services, from what I've seen, and he seems to use them himself (rather than delegating that to a social media consultant).  He probably uses a smartphone of some sort to post his tweets.  Like many other people, he likely uses public WiFi access points whenever they're available, as such services are typically faster than the 3G network.  The problem with this, though, is that when you access a password-protected service like Twitter or Facebook, your device sends your password to the service provider in order to authenticate your session.  In the worst case, that password is sent using what is called "basic authentication", which applies no encryption at all; the password is sent in the clear, and anyone who can overhear the exchange can see, and more importantly capture, both the username and password.  The key phrase here is "anyone who can overhear the exchange": the only thing protecting your Twitter password is the physical security of the medium being used to send your login request to Twitter.
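To see just how exposed a cleartext login is, take HTTP Basic authentication as the worst-case example: anyone who captures the request recovers the credentials with a single decoding step, no cracking required. (The username and password below are made up.)

    import base64

    # Build the Authorization header a client would send with HTTP Basic auth,
    # then show how trivially an eavesdropper reverses it.
    username, password = "example_user", "s3cret"
    header = "Authorization: Basic " + base64.b64encode(
        f"{username}:{password}".encode()).decode()
    print(header)

    # An eavesdropper who captures that header just undoes the Base64 encoding --
    # no keys, no cracking:
    sniffed = header.split("Basic ", 1)[1]
    print(base64.b64decode(sniffed).decode())   # example_user:s3cret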

This isn't much of a problem if the computer is connected to the Internet via a wired connection: the unencrypted passwords will typically be exposed only to the chain of Internet service providers between the user's computer and the social network.  There are certainly risks here, but in general ISPs do not collect intelligence about their customers and share that intelligence with third parties (other than the US government, that is); I've never heard of a social media account being hacked through password collection at an ISP.  Basically, in the wired case the physical security of the medium is fairly good, and so the risk is low.

The same is true if you're using 3G/4G wireless.  The various digital protocols used for cellular wireless all employ transport encryption, meaning it would be phenomenally difficult to intercept and successfully recover the content of a login request sent over the cellular network.

However, things get a lot shakier when we start talking about WiFi.  WiFi is notorious for its history of poor transport security; the original WEP security provided with early WiFi systems is flawed and can be cracked with an ordinary computer in a matter of hundreds (sometimes tens) of seconds.  Newer standards alleviate this in various ways, and the WPA and WPA2 encryption algorithms are probably about as secure as the underlying wired networks they're connected to.  But the real danger here is unencrypted public WiFi.  Here there is no transport security at all: everything you send, and everything you receive, is transmitted with no encryption whatsoever.  And since it's being sent over a radio medium, anyone with a compatible radio receiver can listen in to the entire conversation.  The long and the short of it is that if you log into Facebook, Twitter, or most other social networking services over a public unencrypted WiFi service, you are sharing your login details, including your password, with everyone in radio range of your device.

There are widely available tools that are specifically designed to sniff WiFi sessions for social media credentials and session cookies, and it's a fair bet that at any event where public-access unencrypted WiFi is available, someone will be running one of these tools.  And if you're a prominent public political figure who is known to use social media from a mobile device, someone like, say, Anthony Weiner, it's reasonable to assume that your political enemies will send someone to follow you about with one of these tools for the sole purpose of trying to capture your passwords.  In short, you got pwned by Firesheep, Anthony.  (Strictly speaking, Firesheep grabs session cookies rather than passwords, but the effect is the same: whoever captures the session can post as you.)

So what's the solution here?  First, don't ever use a public unencrypted WiFi service to send sensitive information, including a password, without taking additional steps to protect your security.  The simplest approach is to not use public WiFi at all.  With many devices, this is the only safe choice: my Droid will automatically attempt to log on to all of its various social networking services (to collect updates) as soon as it detects that it has Internet access.  For mobile devices, therefore, one should rely only on cellular access and on password-protected WiFi networks that you already trust.  This means, for example, turning off the option to automatically connect to any public WiFi that your device might detect.

Another option, which isn't really available on smartphones but would be on notebooks, is to install a browser add-on that forces social media sessions to be conducted via HTTPS instead of HTTP.  Most of Google's properties already offer this; Google forces all login sessions to be sent via HTTPS, meaning the password will be encrypted in transit.  I think Yahoo is also doing this.  There is a plugin available for Firefox that forces Facebook, Twitter, and selected other sites to always use HTTPS encryption, to protect you from password grabbing, and I would recommend the use of such tools.  I use one called Force-TLS on my own notebook.

A more aggressive option, and one that would likely have been a good choice for Congressman Weiner, would be to set up a VPN endpoint at your home or business (or use a public VPN endpoint service like PublicVPN) and force all your public Internet access through that tunnel.  That way, all your Internet activity is encrypted by the VPN client before it leaves your device, so the local WiFi network never sees anything sensitive in the clear.

And, of course, we should all pressure Facebook, Twitter, and other services to do as Google has done and redesign their services to avoid this vulnerability in the first place.

To bring it back to Weinergate, I personally find Weiner's claim that his password was hacked fairly credible.  At least one conservative has pooh-poohed the notion that someone could have hacked both his Twitter password and his Yfrog password at the same time, but in reality that's fairly likely with a WiFi capture tool; all the attacker has to do is observe him using both Twitter and Yfrog in the same session, which is common since most Yfrog usage happens on referral from Twitter.  If one of his political opponents has been following him about with a password sniffer, it's entirely possible that they have a large collection of his passwords.  Not to mention that there's the real risk that he used the same password on both; while Weiner is a smart guy, that doesn't mean he's necessarily an expert on Internet security, and even smart guys fall prey to that fairly common mistake.

Tuesday, April 26, 2011

Oregon tax on electric cars

So Oregon has decided that it's unfair for drivers of electric cars to avoid paying road use taxes and is proposing a special tax on electric cars to make up for this "inequity".  This post will discuss why this is stupid, and why Oregon should resist the urge to implement this tax.

The federal government and, as far as I know, all of the states impose excise taxes on gasoline.  While in most cases these taxes are treated as general revenue and can be used for any purpose, there is the notion that they should be used to pay for road maintenance and construction, on the idea that the more one uses the roads the more one should pay for their upkeep, and gasoline usage is a fairly good proxy for road usage.  Diesel fuel is taxed similarly, but one can also buy "exempt" diesel for use in off-road applications, such as running farm equipment or generators.  The current federal gas tax is 18.4 cents per gallon; state gas taxes vary, but Oregon's (the state in question) is 30 cents a gallon.  Thus, a car that gets 30 miles per gallon (which is slightly better than the 27.5 mpg fleet average required by CAFE) pays one cent per mile in Oregon gas tax.  The proposed tax on electric vehicles is one to two cents per mile, which suggests that Oregon believes that electric car owners should pay more than their fair share for road usage, itself an interesting statement.
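The arithmetic, spelled out with the rates quoted above:

    # Oregon gas tax vs. the proposed per-mile EV tax, using the figures above.
    state_gas_tax = 0.30        # dollars per gallon (Oregon)
    federal_gas_tax = 0.184     # dollars per gallon

    for mpg in (20, 27.5, 30, 40):
        state_per_mile = state_gas_tax / mpg
        print(f"{mpg:4} mpg: {state_per_mile * 100:.2f} cents/mile in Oregon gas tax")

    # A 30 mpg car pays about 1.0 cent/mile to Oregon; the proposed EV tax of
    # 1 to 2 cents/mile therefore matches or doubles the burden on a comparable
    # gasoline car, before even counting the 0.6 cent/mile federal share.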

The proposal, however, is misguided for at least four reasons.  First, all-electric vehicles are, at this time, almost universally passenger cars, and usually small ones at that.  Passenger cars cause almost no wear and tear on roads; virtually all wear and tear on roads is the result of usage by trucks, or the result of weather (or other natural processes like earthquakes or landslides).  So while cars represent the majority of users, they do not cause the majority of wear and tear, and thus of upkeep costs; that burden therefore ought to fall more heavily on operators of larger vehicles.  Diesel taxes are sometimes, but not universally, higher than gas taxes, reflecting the fact that most heavy vehicles run on diesel fuel; in Oregon diesel is also taxed at 30 cents per gallon.  In any case, there is no reason why the tax burden on an electric passenger car should be greater than that on a gasoline-powered passenger car of similar weight and performance.

Second, there are solid public policy reasons to abate road-use taxes on electric vehicles.  Electric vehicles do not produce point pollution, and in the Pacific Northwest especially, where a great deal of the electricity is produced by hydroelectric power, they produce essentially no pollution at all.  The reduction in point pollution is itself sufficient grounds to give a tax abatement to operators of such vehicles.  Certainly imposing a tax burden equal to or greater than that imposed on pollution-generating gasoline-powered vehicles would be nonsensical, because it would tend to discourage consumers from making a choice that we would prefer them to make.

Third, the amount of tax collected would likely not exceed the cost of collecting it.  The typical electric vehicle that would be subject to this tax has a range of about 80 miles.  A vehicle driven 80 miles each day, five days a week, fifty weeks a year would travel around 20,000 miles, and be subject to a tax of between $200 and $400 a year (depending on the rate).  Most vehicles will be driven far less, with correspondingly lower tax revenue.  Oregon estimates that there will be approximately 5000 vehicles subject to the tax in 2014 when it takes effect, generating probably somewhere between $200,000 and $500,000 in annual revenue.  That means that the Oregon Department of Revenue has to implement this tax with fewer than ten full-time equivalents, or it will end up being revenue-negative.
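Here's that math spelled out; the cost per full-time equivalent is my own assumption, not a figure from the bill:

    # The paragraph's arithmetic, spelled out.
    rates = (0.01, 0.02)                 # proposed 1 to 2 cents per mile

    # A heavy user: 80 miles/day, 5 days/week, 50 weeks/year.
    heavy_miles = 80 * 5 * 50            # = 20,000 miles
    print([f"${heavy_miles * r:,.0f}/yr" for r in rates])   # ['$200/yr', '$400/yr']

    # Statewide: Oregon's estimate of ~5,000 taxable vehicles in 2014, with the
    # post's guess of $200k-$500k in total annual revenue.
    fte_cost = 50_000                    # assumed loaded cost per full-time employee
    for revenue in (200_000, 500_000):
        print(f"${revenue:,} revenue supports about {revenue / fte_cost:.0f} FTEs")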

Fourth, a miles-driven basis for taxation raises issues for taxing out-of-state vehicles operated in Oregon and Oregon-titled vehicles operated outside of Oregon.  The use of gasoline taxes as a proxy for road usage relies in part on the fact that, in most cases, motor vehicle fuel is used fairly close to its point of purchase.  So while there is a discrepancy between state of purchase and state of use, in most cases it probably evens out in the end (although there are lots of exceptions, especially for communities near state lines where one state has a significantly lower tax rate than the other).  However, if some road users are taxed by proxy and others for actual usage, that creates an inequitable basis for taxation.  Arguably, if Oregon is going to tax electric vehicle owners for miles driven in Oregon, then it needs to do the same for gasoline-powered vehicle owners.  This then generates additional problems: crediting vehicle owners for miles driven outside of Oregon without being overly intrusive on owner privacy (the pilot program from some years ago used GPS technology, but that amounts to tracking the movements of anyone who owns a vehicle subject to this tax, and that just won't fly), and taxing out-of-state vehicles that are operated within Oregon.  Finally, plug-in hybrids risk double taxation under this plan, since they might well pay both a miles-driven tax and a gasoline excise tax.  Replacing one tax inequity with several new ones is not an improvement.  Worse, it shifts the burden of the inequity from an option disfavored in public policy (polluting) onto one favored in public policy (not polluting), which is just stupid.

Fundamentally, I think Oregon's action in this regard is misguided.  I'm sure they're seeing declining fuel tax revenues; the recession has resulted in people driving far less, and virtually every state has reported the same decline as a result.  Also, I imagine the oil companies have been astroturfing the notion that it's unfair for electric vehicles (which they view as a huge threat) to be allowed to avoid taxation like this, and I'm sure the idea to tax electric vehicles has been driven at least in part by their public policy (read: lobbying) operations.  Finally, the idea of implementing a special tax on a consumer choice that we bend over backwards elsewhere in public policy to encourage is just moronic.  I just don't see the point of creating an entirely novel tax infrastructure to collect what would be at most a half million dollars of revenue on an activity that likely saves the state at least that much in costs elsewhere anyway.  The fact that the revenue collected is not likely to exceed the cost of collecting it leads me to believe that the real purpose of this tax is to discourage people from owning electric vehicles, and that tells me that the real reason for the tax is to protect the oil and gas industry in Oregon.  What's the real motivation here?  (Keep in mind that Oregon is also one of only two states that prohibit self-serve gasoline stations.)

No, Oregon, this is a dumb idea.  Don't put barriers in the way of progress, just because the oil companies want you to.  Say no to HB 2328.

Wednesday, April 06, 2011

An example of when to use VLANs, and the danger of closet monkeys

I wrote a couple days ago about abusing VLANs.  Just yesterday I had an occasion to use VLANs for a client, so I thought I'd write about that.  There's also a "closet monkey" anecdote in here, as a cautionary tale as to why you shouldn't let outside techs into your network closets or server rooms unsupervised.

This client recently entered into an arrangement with a hosted provider for voice-over-IP telephony.  Under this arrangement, the provider installs Polycom SIP phones at the business location, along with a gateway device that is installed on the network to aggregate the SIP devices and trunk calls back to the service provider's facility.  (As far as I can tell, call control is handled at the provider's facility, but that's not important right now.)  This gateway device, in addition to its VoIP functionality, is capable of acting as a fairly generic NAT appliance.  This particular provider's installers apparently work from a playbook that involves removing any gateway device the customer already has and replacing it with theirs.  Their device also provides DHCP with a variety of specialized options preloaded for the benefit of the phones, including, apparently, their own DNS servers, which their system uses in some way that wasn't clearly explained to me.

However, in my client's case this didn't quite work out.  My client has Windows 2008 Small Business Server running at that location, with Active Directory in use.  The SBS server provides both DNS and DHCP for the network; DHCP was not being provided by the existing gateway device (a Watchguard firewall).  So when they ripped the Watchguard out of the network and installed their gateway device, the DHCP server in their device conflicted with the DHCP server in the SBS server; fortunately, the gateway detected this and shut off its DHCP server.  As a result, the phones didn't get the extra DHCP options they needed for optimal operation, nor did they have access to the provider's DNS servers.

It was about this point that they called me, to ask if there was some way to change the DNS for the network to point to their servers instead of the local Windows server.  Of course, that's not acceptable; this client is using Active Directory, and in an AD environment it is absolutely nonnegotiable that all AD clients must use the Active Directory DNS servers, at least for all zones that describe the AD forest.  I was, however, willing to configure the Windows server to use the provider's DNS servers as first-level forwarders, which would mean that any query not answerable by the zones defined in that server would be forwarded to the provider for resolution.  (It is fairly rare for people to understand how DNS works; perhaps I'll blog about this in the future.)
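Conceptually, the change amounts to the decision sketched below; this is a toy model of forwarder behavior, not actual Windows DNS code, and the zone names and forwarder addresses are invented:

    # Toy model of how a DNS server with forwarders decides what to do with a
    # query.  Zone names and forwarder addresses are made up for illustration.
    AUTHORITATIVE_ZONES = {"corp.example.local", "_msdcs.corp.example.local"}
    FORWARDERS = ["203.0.113.10", "203.0.113.11"]   # the VoIP provider's resolvers

    def resolve(query_name):
        # Answer from local zone data if the query falls inside a hosted zone...
        for zone in AUTHORITATIVE_ZONES:
            if query_name == zone or query_name.endswith("." + zone):
                return f"answered locally from zone {zone}"
        # ...otherwise hand it to the first-level forwarders.
        return f"forwarded to {FORWARDERS[0]}"

    print(resolve("dc1.corp.example.local"))    # AD queries stay on the SBS box
    print(resolve("sip.voipprovider.example"))  # everything else goes to the provider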

So, while on the conference call, I went to remote into the client's site in order to make the necessary changes to the DNS service.  And here's where I ran into more problems.  The VPN would not connect, for the fairly simple reason that they had disconnected the Watchguard firewall that was being used as the VPN endpoint.  (It was at this point that I and my client discovered that they had done this.)  Further discussion and inspection determined that they had disconnected the Watchguard from the WAN side, and I suspected also from the LAN side, although that wasn't confirmed until the next day when I went on site.  This was clearly unacceptable.  Remote access via that device is essential to this client's business operations as well as to my ability to provide them remote support; also, this client runs an FTP server at this location which is used for communications with a couple of business partners, and it was obviously also made unavailable as a result of this change.  It's possible that I might have been able to configure this new gateway device to provide comparable services; however, my main complaint is that this provider removed a gateway device without discussion or even notification as part of their install routine.  There's a reason more experienced network engineers like myself refer to such people as "closet monkeys".  When I was a full-time systems person I generally refused to let anyone outside the organization into my server room or network closets without direct supervision; it's incidents like this that explain why.

Anyhow, during the 45-minute conference call two nights ago, after it became apparent that these installers had rather significantly broken my client's network and that I would have to go in to fix it, we discussed how to make all this work in harmony.  Apparently their device doesn't like operating behind another firewall, and I suspect it will also not play well in router-on-a-stick mode.  We could have arranged that using the Watchguard's "optional" network, but that would have required them to break from their playbook and negotiate with me, and getting a closet monkey to negotiate with the customer is usually impossible.

However, they had actually done me a favor.  This client has Comcast business cable modem service using an SMC cable modem.  This modem supports transparent bridging but cannot be configured to do it by the customer; turning that on can only be done from the provider end.  When I migrated the client from DSL to cable modem, about a month ago, I would have preferred transparent bridging but didn't want to deal with calling Comcast to set it up, so I set up a double NAT solution instead by configuring the modem to map each public IP to an RFC 1918 IP, and then using those mapped IPs at the firewall's WAN interface.  This solution was less than ideal, in my opinion, but was working, so I left it alone.  The installers for this system, however, had apparently contacted Comcast and had the modem switched to transparent bridging to better support their device.  A blessing in disguise.  This meant that the cable modem was now presenting five public IP addresses (five of the six usable addresses of a /29 network, the sixth having been allocated to the cable modem itself) on its LAN ports, but their gateway device only needed one; I could use one of the others for the client's firewall and restore remote access, and another for the FTP site; only minor reconfiguration of the firewall would be needed, once it had been reconnected, of course.  The only question remaining was how to run both devices in parallel, without conflict.

Here's where VLANs come in.  The strategy here is to have one VLAN for the PCs (and printers and servers and other devices) and another, entirely separate VLAN, for the phones.  This not only allows my client to continue using their firewall device, which has been set up for their specific business needs, but also allows the provider's edge device to serve all the special DHCP options to the phones that are required to make the phones work correctly, and allow the phones to get the DNS servers that the provider wants them to use, without interfering with the needs of the active directory environment.  It's truly as if there were two entirely separate LANs.  (There isn't even any routing between the two VLANs; while I could have set that up, there was no benefit to doing so.)

The only remaining issue was how to get the phones onto their VLAN without having to run additional cabling.  The phones in question, as I mentioned, are Polycom SIP devices.  Like most VoIP phones, they have passthrough Ethernet ports so that no additional cabling is needed to install them; you just plug the PC into the phone and the phone into the wall jack where the PC was plugged in before.  Also like most VoIP phones, they support 802.1q tagging for their voice traffic, which allows a suitably capable switch to segregate the phone's traffic from the traffic on the passthrough port (which is sent untagged).  The provider wasn't able to advise me on how to set the phones up to do this, but I was able to figure it out anyway, having set up VoIP telephony systems before.  Furthermore, Polycom has fairly decent documentation for its phones available on the net; all that was required was the addition of a special DHCP option to the Windows DHCP server, and I was able to fairly quickly find out which option was needed and what syntax these phones expected for it.  This allowed the phones to operate on the voice VLAN while still using the same cable for passthrough data traffic to any device connected to the phone's passthrough port.

So, I defined a second VLAN on the client's switches, and set up all but three ports on the switches as "untagged 1 tagged 101" (1 being the data VLAN and 101 being the voice VLAN).  The phones, when they boot, execute a DHCP discover on the data (untagged) VLAN.  The Windows server responds with an offer that includes the DHCP option telling the phone to switch to VLAN 101.  The phone then rejects the offer, switches its VoIP interface to the tagged VLAN, and sends another DHCP discover on the voice VLAN, which is answered by the VoIP gateway device with all of the settings that are particular to the voice network.  Other devices on the data network (such as workstations) simply ignore this DHCP option and proceed as usual.  Two of the ports that were not set up this way were set up as "untagged 101"; one of these was connected to the edge device (so that the edge device would not get 802.1q tags that it wasn't set up to deal with) and the other I used for configuration and troubleshooting access during the process.  The final remaining port goes to an unmanaged gigabit switch that interconnects the client's servers; that switch is not 802.1q aware and thus should also not receive tagged packets, and in any case no device on that switch needs to see voice traffic.
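Here's a simplified model of that boot sequence; the DHCP option number and its payload format are deliberately abstracted away, since those details are specific to the phone firmware (Polycom's documentation covers them):

    # Simplified model of the phone boot sequence described above.  The actual
    # DHCP option number and payload syntax are abstracted; only the sequencing
    # is illustrated here.
    DATA_VLAN, VOICE_VLAN = 1, 101

    def phone_boot(dhcp_answer_for):
        """dhcp_answer_for(vlan) -> dict of DHCP options offered on that VLAN."""
        # Step 1: DHCP discover on the untagged (data) VLAN.
        offer = dhcp_answer_for(DATA_VLAN)
        vlan_hint = offer.get("vlan_discovery_option")   # served by the Windows DHCP server
        if vlan_hint is None:
            return f"stay on VLAN {DATA_VLAN} with {offer}"
        # Step 2: decline, retag, and discover again on the voice VLAN, where the
        # provider's gateway answers with the voice-specific options.
        voice_offer = dhcp_answer_for(vlan_hint)
        return f"join VLAN {vlan_hint} with {voice_offer}"

    def dhcp_answer_for(vlan):
        if vlan == DATA_VLAN:   # Windows DHCP on the data VLAN
            return {"vlan_discovery_option": VOICE_VLAN, "dns": "SBS server"}
        return {"dns": "provider DNS", "boot_server": "provider gateway"}  # voice VLAN

    print(phone_boot(dhcp_answer_for))
    # A PC on the same port sees the same data-VLAN offer but simply ignores
    # the VLAN discovery option, as described above.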

In this case, VLANs were key to solving this client's problem.  Traffic segregation for quality of service wasn't really the issue; this client's network is small enough that it's unlikely there'd be capacity problems.  Here, segregation was mandated by the need to have two distinct DHCP environments.  In theory I could have used a DHCP server that keys on the requesting client's client ID or MAC address to serve different DHCP options, but such features are not standard in most DHCP servers.  The VLAN solution was simpler.

One of the problems small businesses often face (often without knowing it) is that there's a bevy of solution providers out there that are offering what amount to turnkey solutions, and in most cases the solution is being deployed by people who are only trained to deal with a small subset of the possible environments they'll run into.  Sometimes that'll work out OK, but really if you want a good result you need someone involved who is looking out for your needs, concerns, and interests.  You just can't count on someone else's technician to do that.  The solution they provide has been optimized for their needs, not necessarily for yours.

Thursday, March 31, 2011

Scary ways to abuse VLANs

I ran across this article the other day (after someone at the Spiceworks Forums posted a link to it).  It made me cringe, repeatedly, to read it.  This post will address why this other article is so wrong, and why you should not do what this guy suggests. 

The key to understanding how to approach this lies in understanding what a "broadcast domain" is.  A broadcast domain is the set of devices all of which will receive a broadcast sent by any other member of that set.  Normally, every device connected to a standard LAN switch will be in the same broadcast domain; in short, switches define broadcast domains.  Every device connected to the same LAN is a member of the same broadcast domain.

What VLANs allow one to do is treat a switch as if it were multiple independent switches, coexisting in the same box.  The switch is told to group its ports, some to one virtual switch, others to another.  The end result is to have multiple LANs (virtual LANs, or VLANs) coexisting on the same hardware.  You could get the same result by buying multiple switches, one for each independent LAN.  VLANs just let you do this with fewer switches.  That's all.  (There's some added complexity when you start talking about trunking and about layer 3 switching, but neither of these is essential to understanding what a VLAN is.)

The author's definition of a VLAN (Virtual LAN) as a "technology that enables dividing a physical network into logical segments at Layer 2" is, I suppose, not entirely inaccurate; however, it's less than useful to understanding what a VLAN is.  The problem this author has is that he's viewing VLANs as a partitioning of a physical network.  But that's not the right approach.  While VLANs have this effect, that's not the way to understand them.  It's far better to think of VLANs as a way for multiple LANs—that is, multiple broadcast domains—to independently coexist in the same hardware, much the way that virtualization hypervisors allow multiple computers to independently coexist on the same hardware. 

A few lines down from that is another flat out wrong statement.  VLANs are not used to "join physically separate LANs or LAN segments into a single logical LAN".  You cannot do that with VLANs alone; doing this (if for some reason you wanted to) is the role of a bridge or a tunnel—or just a cable between two switches.  You might use a VLAN in the course of setting up a bridge or tunnel, but VLANs don't allow you to do this on their own. 

The discussion on page two about the use of VLANs to control broadcast traffic is fundamentally correct; this is one of the major reasons for separating a network into multiple broadcast domains.  The statement "Small LANs are typically equivalent to a single broadcast domain" really illustrates the fundamental mistake this author made: a LAN is, by definition, a broadcast domain, and so a small LAN would necessarily also be a broadcast domain.  There's also some discussion about IP multicasts that is entirely incorrect and should simply be ignored.  The reason IP multicasting is disabled on most consumer routers is that the routers aren't smart enough to handle it correctly; it has nothing to do with bandwidth consumption.  In actuality, properly configuring IP multicast on switches and routers that fully support it reduces, rather than increases, bandwidth use, and most large networks will turn these functions on to make the best use of their bandwidth.

And a little bit later we have another killer doozy of a statement: "VLANs can be configured on separate subnets".  Indeed, not only can they be, but in fact they pretty much have to be, assuming you're using VLANs properly.  Since each VLAN is a separate broadcast domain, and each separate broadcast domain needs its own subnet, each VLAN (in a properly constructed network) will have its own, distinct, subnet.  The author here gets away with breaking this rule only because the switch he's using allows a port mode that lets a port simultaneously exist in more than one VLAN, which breaks the virtualization model I talked about earlier.  This port mode is found on low-end devices like the Linksys switch he's using; it is typically not found on larger, enterprise-grade switches.  You simply cannot set up a Catalyst 3750 to behave the way this guy has set up this little SRW2008.
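To put the one-subnet-per-VLAN rule in concrete terms, a properly built network maps each VLAN to its own distinct subnet, something like this (the VLAN IDs and address ranges are made up for illustration):

    import ipaddress

    # One broadcast domain per VLAN, one subnet per broadcast domain.
    vlan_plan = {
        10: ipaddress.ip_network("192.168.10.0/24"),   # workstations
        20: ipaddress.ip_network("192.168.20.0/24"),   # servers
        30: ipaddress.ip_network("192.168.30.0/24"),   # voice
    }

    # Routing between VLANs happens at layer 3 (a router or layer 3 switch),
    # never by letting one port straddle two VLANs untagged.
    for vlan, subnet in vlan_plan.items():
        print(f"VLAN {vlan}: {subnet} (gateway {next(subnet.hosts())})")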

Here's the problem with how this guy is abusing VLANs.  Instead of making each VLAN its own broadcast domain, he's taken an existing broadcast domain and broken it into three pieces.  That, by itself, would be fine, had he then provided routing between those domains to enable them to communicate (at layer 3 instead of layer 2).  But he doesn't do that, because the switch he's using doesn't offer layer 3 switching.  So what he does instead is selectively violate the integrity of the segregation between the VLANs.  This works only because this switch permits the "general" port access mode, which allows multiple VLANs to be presented untagged on the same port.  I've never seen an enterprise switch, at least not from a major vendor, that allows this, and it's generally not a good idea, precisely because it enables a violation of the cardinal rule that every device connected to the same (V)LAN is in the same broadcast domain.  (He admits that the ability to do this is "key to our example".  Scary.)  The crazy thing that ends up happening with this configuration is that traffic sent to a device on one VLAN will be replied to on a different VLAN entirely.  While this may not create a problem when you're only using one switch that shares its MAC tables across all VLANs, it won't scale up to multiple switches, and this configuration will cause excess unicast flooding in a multiple-switch environment (exactly one of the problems it was supposed to avoid), especially if the switches learn MAC addresses independently on each VLAN.  And it's a very tricky and tedious configuration to set up and maintain, far more complicated than a proper setup using access mode ports and a layer 3 switch.

So please, do not ever configure a network like this.  The simple fact is that this sort of configuration only works in a small network—and if you have a small network you almost certainly don't have a need to do this sort of thing anyway!  In fact, please don't use the "general" access mode even if your switches support it; any time you do you are violating the integrity of the VLAN broadcast domain, and you'll probably end up with hard-to-diagnose network gremlins somewhere down the line, not to mention a configuration that's simply impossible in most upper-end switches.  Just stick to one untagged VLAN per port, please; if you find yourself breaking this rule, you've probably done something wrong in your design.

So, now that you've read my rant about why this is the wrong way to go about it and for the wrong reasons, I should tell you a bit about the right reasons.  For that, go here.  I'm not going to get into the details of how because that varies a lot between switch types.  If you want specific help on a specific problem, go here.

Sunday, February 27, 2011

H.R. 607, the Broadband for First Responders Act of 2011

The following is a letter I've just emailed to my Congressman regarding H.R. 607, the Broadband for First Responders Act of 2011. This has been a matter of some discussion by amateur radio licensees in the United States of late. Paper copy will go off in the mail tomorrow.

Please feel free to adapt for your own purposes.

February 27, 2011

The Honorable Mike Quigley
1124 Longworth HOB
Washington, DC 20515

Dear Representative Quigley:

I am writing you today, as a constituent and an amateur radio operator, in reference to H.R. 607, the Broadband for First Responders Act of 2011. This bill seeks to establish a supply of radio spectrum available for a public safety broadband network, a goal to which I have no objection in principle. However, I wish to bring your attention to a problem with this bill. As introduced, the bill would, if adopted, compromise national security, potentially breach an international treaty to which the United States is a party, and significantly harm the interests of amateur radio operators, all for a purpose that does not clearly serve the stated purposes of the bill. Given that the bill's primary purpose can still be largely met without these negative effects by a relatively simple amendment, I urge you to oppose this bill until the necessary changes are made.

Specifically, I draw your attention to Section 207 in the bill's text. This section seeks to mandate that all current public safety service radio operations currently between 170 and 512 megahertz be moved to the 700 megahertz band. This is mandated not so much to improve public service communications or for any of the other stated purposes of the bill, but instead for two specific purposes: to free radio spectrum to be subsequently auctioned off to wireless communications providers for commercial broadband services, and to force public service entities to purchase new radio equipment. Neither of these purposes directly serves the broader purposes of the bill. Notwithstanding this objection, there is a fatal flaw in this section, relating to the references to the frequency range of 420-440 megahertz. As a brief glance at the Table of Frequency Allocations (47 CFR § 2.106) will reveal, the 420-440 megahertz frequency range is, quite simply, not presently allocated to the public safety service, and so there are no public safety service users to remove from this band.

The 420-440 megahertz band is currently allocated to two separate purposes in the United States. The primary user is the United States government, which uses it primarily for a variety of radiolocation purposes (that is, radar) intended for national defense and border control. The PAVE PAWS early warning radar system, which monitors our coastlines for submarine-launched ballistic missiles and other airborne threats, makes use of these frequencies. In addition, the Border Patrol and other federal law enforcement agencies use radar systems on these frequencies to monitor for persons attempting illegal entry into the United States in border areas such as Texas, New Mexico, Arizona, California, and Florida. The secondary users of this band are amateur radio operators, who use it for a variety of purposes with the clear understanding that the military has primacy in the band. Reassigning the band to commercial purposes would almost certainly result in interference with national security objectives.

In addition, within the 420-440 megahertz band there is a subband at 432-438 megahertz that is allocated to amateur radio as a result of treaty obligations that the United States has agreed to by virtue of being a member of the ITU. Part of this band is used by amateurs specifically to communicate with orbiting amateur radio satellites. Those satellites cannot (for fairly obvious reasons) be retuned to different frequencies. While the United States' obligations as a member of the ITU allow the United States to use, or allow the use of, these frequencies for other purposes, allocating them to broadband services (as this bill proposes) would be likely to create a breach of the convention, as those uses would likely cause harmful interference to amateur service operations in other countries as well as to operations in the Earth exploration satellite service (the other internationally-protected user of the band).

It is fairly obvious that the author of this bill labored under the misapprehension that 420-440 megahertz was a public service band, when the reality of the matter is that this band is a radiolocation and amateur service band. Given that the bill was drafted on a mistaken understanding of the current use of spectrum, the only proper thing to do is to correct the bill so as not to refer to this band. I would urge you to refuse to support this bill unless it is amended so as to either remove the references to the 420-440 megahertz band in section 207, or to remove entirely the spectrum reassignment mandated by Section 207.

I urge you to confer with representatives of the Federal Communications Commission and the National Telecommunications and Information Administration, with representatives of the divisions within the Department of Defense and Department of Homeland Security that make use of the spectrum at issue, and with the American Radio Relay League (ARRL) in deciding how to proceed with respect to this bill. I also suggest you speak with public safety officials in and outside of Illinois to find out how they feel about being mandated to once again purchase new radio equipment, but that is independent of the issue regarding the 420-440 megahertz band. I am confident that you will determine that reassigning the 420-440 megahertz band away from its current allocation as a military radiolocation and amateur band is not in the best interests of the United States.

If you have any questions regarding my objection to this legislation, please feel free to contact me.

Sincerely yours,

Kelly Martin
(address and telephone number redacted)

Tuesday, February 15, 2011

Technology is good for ham radio!

This post is a direct reply to G4ILO's neo-luddite post on his blog entitled "Is technology good for ham radio?" In it, he makes the startling comment, "The more high-tech ham radio becomes, the less magic there is."

Let me put it in short, simple words: there is no magic in ham radio. Ham radio is nothing but technology. Without technology, ham radio is nothing.

Yes, Julian makes a legitimate point regarding the possibility of amateur radio turning into poor copies of existing networks, and I agree with him on the lack of merit of D-STAR specifically. However, there is just as much "magic" in getting a network that combines computer and radio technologies up and running as there is in sending CW with a transmitter made out of parts salvaged from a compact fluorescent lightbulb.  Of course we need to keep the ability to do it "simple", because the complicated ways are, fundamentally, built on top of the simple ways. But that doesn't mean we have to stop at simple, and in fact if we do, we shoot ourselves in the foot. (It should be noted that Julian says he uses PSK31 and other digital modes, all of which are only a decade or so old, so even he doesn't practice what he preaches.)

It never fails to amuse and amaze me how luddite some hams are. I just don't understand how someone who, thirty or forty or fifty years ago, was using a totally newfangled technology to do something can, now, today, be totally unwilling to even entertain the notion that there might be some merit to the newfangled way of doing things.

Thursday, January 13, 2011

Ophiuchus, the 13th Astrological Sign?

I heard today about some noise that's going around about how astrologers have added a new sign to the zodiac, and how this changes everything or some such nonsense. It's sadly fascinating to see stuff like this, because it really exposes the degree to which the Internet has not only not made people less ignorant, but in fact increased the rate at which ignorance spreads. Apparently even Time Magazine is in on this nonsense, based apparently on a press release from the Minnesota Planetarium Society.

Here's the real story.

The ancient Babylonians divided the year into 12 segments, the Babylonians being fond of the number 12 (and also the number 60), and gave names to star groupings that corresponded to each of those twelve segments, enabling them to observe the sky and determine where in the year one was, a very useful skill in a place where the timing of planting is important. The zodiac has exactly twelve equal divisions because that's how it was constructed. It's a human construct, with no natural meaning whatsoever; basically a bookkeeping device. The leading edge of Aries, the first sign of the zodiac, arbitrarily corresponds to the position of the sun in the sky on the vernal equinox, the first day of Spring, which was anciently the first day of the year. The key point is that a "sign of the zodiac" is one of twelve equal divisions of the solar year. (These are not to be confused with months, which were anciently defined by the moon's cycle.)

There are, of course, other asterisms in the sky, such as the Great Bear (Ursa Major) and Orion, which are well known to most people but which are not part of the zodiac because they are not in or near to the plane of the ecliptic, the path the sun takes on its apparent annual cycle through the sky. Now let's fast forward to 1922, when the International Astronomical Union (IAU) formally adopted its constellation map, dividing the celestial sphere into 88 named chunks of astronomical real estate. In so doing, they largely kept the traditional Western names for these asterisms (although some of the Southern hemisphere constellations have modern names because those asterisms were not visible to the Babylonians and so they never named them), but the boundaries they settled on did not take into consideration the Babylonian origins of the signs of the zodiac or their astrological significance. As a result of their lack of concern, the ecliptic ended up passing through not twelve constellations (as it would had they remained faithful to their Babylonian predecessors) but indeed thirteen, and the division is not even remotely equal. The thirteenth is Ophiuchus, the Serpentbearer; the ecliptic passes through one corner of the constellation's defined area, although not particularly close to any major star in the constellation. The key point here is that a constellation is one of 88 (unequal) divisions of the celestial sphere.

The thing is, this isn't new. Astrologers and astronomers alike have known about Ophiuchus' intrusion on the zodiac since, presumably, 1922. Some astrologers think this matters; others don't. I've seen complete astrological systems based on Ophiuchus being part of the zodiac, and I've seen so-called "sidereal" astrologies that take into account precession, which I talk about below. The ones that don't are called "tropical", for some reason I don't recall anymore. Diehard skeptics, of the sort who have a compulsive need to prove astrology wrong, often trot the Ophiuchus issue forward as "proof" of the wrongness of astrology, along with the precession issue, and it's quite likely that the press release that started all this was motivated by that attitude. Astrology, like all forms of divination, involves the use of essentially randomly-generated symbols to spur self-reflection. The symbols used and their correlations are basically arbitrary; as a result, whether the symbols correspond to anything "real" or not is completely irrelevant. You'll get essentially the same results from astrology using the classical Babylonian/Greco-Roman zodiac as you will using this not-really-new 13-sign approach. Or you can play with Vedic astrology from India if you want something completely different, although the cultural context there may be too foreign for most Westerners to get much from it. Or not. Whatever floats your boat.

There's a further complication. The Sun's position on the first day of spring, which originally defined the leading edge of Aries, as I mentioned above, is no longer in Aries. The axis of the Earth's rotation precesses in a cycle of about 26,000 years, causing the apparent position of the Sun against the celestial background on any given date to make a full circuit around the sky over that 26,000-year cycle. We're a couple thousand years into the cycle that started when the leading edge of Aries was defined as the vernal equinox, and as a result the Sun is actually in Pisces on the first day of Spring. In about five hundred years, it'll be in the constellation of Aquarius on the first day of Spring, and according to some astrologers it's already in the sidereal sign of Aquarius, which is the origin of the popular phrase "Age of Aquarius". However, the astrological sign of Aries always starts on the first day of Spring, because that's how it's defined; where the Sun is against the stars in the sky simply isn't part of that definition. The Babylonians almost certainly didn't know about the precession of the equinoxes.

Fundamentally the error here is with the Minnesota Planetarium Society, which has unnecessarily (and likely willfully) conflated the astronomical concept of "constellation" with the astrological concept of "sign". A constellation is not a sign, even though there is a historical relationship between the two, and in fact twelve of the constellations have the same names as the twelve signs. The IAU did not, in 1922, create a thirteenth sign of the zodiac when it decided to define Ophiuchus to include a bit of the ecliptic. Unless, of course, you decide that you want that to be the case, in which case they did—but only for you.

So if you're a fan of traditional astrology, you can go on using it the way you always have, without worrying about this. It doesn't matter. On the other hand, if you want to worry about it, you can do that too. Just don't lose any sleep over it; that would be foolish indeed. As I mentioned, astrology is an entirely human creation, as are the astronomical names for the asterisms, and these definitions and symbolisms can be changed by humans whenever they want, but only if they want to, and not just because some "skeptic" demands it of them.

Addendum: Apparently AOL's article on this nonsense suggests that the Babylonians deliberately skipped Ophiuchus because "they wanted there to be 12 signs". This claim ignores reality, which is part of the basis of my rant on the Internet being used to spread ignorance. Ophiuchus is certainly near the ecliptic (and there is evidence that Greek and Roman astrologers even read some significance into this, treating the sun's near-passage to a bright star in the asterism as a meaningful event), but the fact remains that the zodiac is defined to have twelve signs in it. The modern fictitious "discovery" of a "thirteenth sign" is merely a consequence of the modern definition of the constellations and has nothing to do with the Babylonians.