Scunthorpe is a small town in the north of England. Back in 1996, many of the town’s residents were having trouble signing up for AOL’s Internet service. It seems that the company’s profanity filters were rejecting the name of the town since it contains a slang term for female genitalia. Although residents of Penistone, South Yorkshire and Lightwater, Surrey experienced the same problem, the issue has become known as the “Scunthorpe Problem”.
There are two basic ways to filter content on the Internet. One is to have humans do it, and the other is to have machines do it.
The problem with having humans do it is obvious – there are billions of web pages out there, and it would be a Herculean task to have an actual human visit each page and judge the content therein. Plus, web site operators change web hosts, move their content around, and rename pages all the time; every web site – even a tiny site such as this one – would need to be visited on a very regular basis to make sure that any filters were up to date.
On the other hand, machines work 24 hours a day without a salary. Even a modest computer – such as a home PC from five years ago – could filter tens of thousands of pages every day. But the problem is, machines have no sense of nuance. A computer only looks for a string of letters organized in a certain way. It sees web sites like romansinsussex.co.uk (an educational site about English history) and arkansasextermination.com (a site for an Arkansas-based pest extermination company) and blocks them because of “sex” in their addresses – although those sites have nothing to do with sex!
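To make that concrete, here is a minimal Python sketch of the kind of naive substring check described above. It is an illustration only, not any vendor's actual filter: the `BLOCKED_STRINGS` list and the `is_blocked` function are hypothetical names invented for this example, and `example.org` is just a placeholder address.

```python
# Hypothetical block list, for illustration only.
BLOCKED_STRINGS = ["sex"]


def is_blocked(text: str) -> bool:
    """Naive filter: block if any listed string appears anywhere in the text."""
    lowered = text.lower()
    return any(bad in lowered for bad in BLOCKED_STRINGS)


# The first two addresses come from the paragraph above; the third is a placeholder.
for address in ["romansinsussex.co.uk", "arkansasextermination.com", "example.org"]:
    print(address, "->", "blocked" if is_blocked(address) else "allowed")
```

The check never asks where one word ends and the next begins, so "Sussex" and "Arkansas extermination" trip it just as easily as a genuinely objectionable address would.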
While you might think this is a limited problem, it really isn’t. There are millions of desktop PCs in Internet cafés and public libraries that need some kind of filtering, and there are thousands of names and phrases that contain an “objectionable” string but are not objectionable in themselves. For example:
– Some public libraries block search results for Russian figure skater Irina Slutskaya, since her name contains the string slut.
– In 1998, prospective web site owner (and mushroom fan) Jeff Gold was blocked from registering the domain name shitakemushrooms.com, because of the string shit.
– In 2006, Massachusetts resident Linda Callahan was prevented from registering her name with Yahoo! as an e-mail address since it contained the string Allah (Yahoo! has since reversed its decision).
– In February 2003, a new spam filter was installed on the email server that serves Britain’s House of Commons. The filter immediately started blocking email referring to the “Sexual Offences Bill” then under discussion, as well as some emails about a censorship discussion.
– In an infamous case, the tech site Experts Exchange had to change its domain from expertsexchange.com to experts-exchange.com, because many web filtering programs (seeing only “expert sex change”) banned the site, making it unreachable for millions of people.
– In May 2006, a resident of Manchester, England had his emails to a local city council blocked because he used the word “erection” when referring to a local structure.
– In October 2004, the Horniman Museum in London had all kinds of email issues when its own spam filters decided that the museum’s name was a play on the words “horny man”.
– Emails containing the words “socialism”, “socialist” or “specialist” are sometimes blocked by spam filters as they contain the string Cialis, an erectile dysfunction medication frequently advertised in spam emails.
– This year, two users were prevented from signing up for Microsoft’s Xbox Live online game service because they wanted to use the word “gay” in their “gamer tag” (online nickname). In one case, the user wanted to use his last name (Gaywood); in the other, the user behind “TheGayGamer” was a gay man who wanted to be identified as such.
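One commonly suggested refinement is to match whole words instead of raw substrings. The Python sketch below compares the two approaches on a few cases like the ones above. Everything in it is an assumption made for illustration: the `BLOCKED_WORDS` list, both function names, and the sample sentences are invented, and the regular expression is just one simple way to split text into words.

```python
import re

# Hypothetical block list built from strings mentioned in the examples above.
BLOCKED_WORDS = {"slut", "cialis", "erection"}


def substring_blocked(text: str) -> bool:
    """Block if a listed string appears anywhere, even inside a longer word."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKED_WORDS)


def whole_word_blocked(text: str) -> bool:
    """Block only if a listed string appears as a complete word."""
    words = re.findall(r"[a-z]+", text.lower())
    return any(word in BLOCKED_WORDS for word in words)


samples = [
    "Irina Slutskaya",
    "Ask a specialist",
    "the erection of a new fence",  # invented sentence, legitimate use
]
for sample in samples:
    print(f"{sample!r}: substring={substring_blocked(sample)}, "
          f"whole_word={whole_word_blocked(sample)}")
```

Whole-word matching lets the first two samples through, but it still blocks the third, which is a perfectly innocent sentence. It would also miss offensive text that someone deliberately runs together, so it is a trade-off rather than a fix.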
So – how do we get around the “Scunthorpe Problem”? Well, there aren’t any easy answers. The obvious solution would be to have machines do most of the work, then have humans correct any errors. But this is exactly the system we have today, and with so many companies cutting back on customer service (not to mention not empowering their customer service people to effect any meaningful change), it seems that many users are stuck. Perhaps one day, some smart person will figure out an algorithm to get around such issues. Until then… just don’t move to Scunthorpe!