We all hate the idea of doing anything that will end up making us deal with even more email than we have to manage now. But this is one of those situations where what you don't know can hurt you.
Dow Jones, like all big organizations, has been forced to subscribe to an antispam service to keep a firehose of illicit and offensive mail messages from reaching its employees, reporters included. When the service was first turned on, Outlook inboxes were suddenly free of offers for prescription medicines, mortgage refinances, crude erotica and all the other mainstays of the spam economy. Regular email life could resume -- spam-free. It looked like another victory for technology in the hands of the good guys. If it seemed too good to be true, well, that happens all the time in the tech world.
But after a while, some of my colleagues and I began to wonder where all that spam was going, and whether there was a chance that maybe, just maybe, some of the emails being flagged as spam and sent to an email gulag were actually just innocent communications. (For the longest time, regular access to those files had been blocked by IT policy.)
I asked IT managers for access to what was being caught in our spam filters -- the messages held back in quarantine and not delivered to our inboxes. When access finally was granted to me, and others in the rank and file here, you could hear the gasps from cubicles when we all saw what we had been missing.
The antispam system had been so effective because it had labeled as spam just about everything that was even remotely suspect. It was acting a bit like a police department that, in an effort to curb juvenile delinquency, was hauling in all teenagers without 'A' averages. Naturally, a huge percentage of the emails weren't spam at all. Our freedom from spam had come at a stiff price -- a very high false-positive rate.
How bad was it?
I took a good long look at a few days' worth of messages in my spam bucket. There were 192 in all. Sorting them by hand into 'real mail' and 'actual spam,' I figured that some 46% were legitimate messages that had been flagged as spam. Of these, most were news releases from companies, including VMware, Dell and Hewlett-Packard. Notices from Purdue University, the Semiconductor Industry Association and Forbes Magazine also were blocked, though maybe that last one wasn't such a bad call after all.
I can live without the occasional news release. But what about when real readers take the time to sit down and write to me? That's a message I want to see.
Alas, of the 150 readers who wrote to me about a recent column, 20% had their messages sent to the spam bucket, where I would never have seen them if I hadn't bothered to ask to take a look.
Other reporters who had taken advantage of the more-open access policy had similar tales. One colleague said his spam bucket contained a note from a friend he had assumed was angry with him because he hadn't written. Another found a crucial message from the company's official health-care provider announcing an important change in a health plan.
Spam researchers say this sort of thing is happening all the time at companies everywhere. 'Your experience is not at all unique,' says David Dagon, who studies spam detection at Georgia Tech. 'Antispam technology has become pretty mature in the last few years, but a lot of innovation still has to occur because the problem is so dynamic.'
The antispam software at my shop is provided by Postini, and we can assume it's at least as good as anyone else's by virtue of the fact that Google bought Postini last year.
Postini President Scott Petry seemed surprised that so much of my good mail was being flagged as spam. He said the software uses a number of different variables to score a message; those above a certain threshold get tagged as spam.
Those news releases, for example, were being sent from a single mailbox that had been configured in a way similar to the method spammers like to use. And one of the readers who had written to me had mentioned hospitals and charity work. A lot of spam involves charity scams, which is probably why that message got flagged, he said.
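To make the score-and-threshold idea concrete, here is a minimal sketch in Python. It is not Postini's method: the signal names, the weights and the threshold values are all invented for illustration, and a real filter would weigh far more variables than these four.

```python
# Hypothetical signals and weights, invented for this sketch.
# Nothing here is taken from Postini's actual scoring rules.
WEIGHTS = {
    "bulk_sender_config": 2.0,      # mailbox set up the way spammers' often are
    "mentions_charity": 1.5,        # charity language, common in scam mail
    "mentions_prescriptions": 3.0,  # classic spam subject matter
    "known_correspondent": -2.5,    # sender the user has written to before
}


def spam_score(signals):
    """Add up the weights of every signal found in a message."""
    return sum(WEIGHTS[name] for name in signals)


def is_spam(signals, threshold=3.0):
    """Tag a message as spam when its score is above the threshold."""
    return spam_score(signals) > threshold


# A news release sent from a bulk mailbox that also mentions a charity drive.
press_release = {"bulk_sender_config", "mentions_charity"}
# A reader's note that mentions hospitals and charity work.
reader_note = {"mentions_charity"}

print(is_spam(press_release))               # checked against the default threshold
print(is_spam(reader_note))                 # same threshold, lower score
print(is_spam(reader_note, threshold=1.0))  # a stricter (lower) threshold
```

In this toy setup the news release clears the default threshold and is held, the reader's note gets through, and the same note is held once the threshold is lowered. That is the whole mechanism in miniature: nothing about the message changes, only the line it has to stay under.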
Mr. Petry then proceeded to explain aspects of our antispam software that I never knew about and that could be used to shrink the spam net. Specifically, Postini allows individual users to determine how aggressive its spam filters should be. By default, our filters had been set to a vigilance level of four on a 1-to-5 scale, with five being the most exclusionary.
It turned out -- and this was news to most of us -- that the spam filter could be set by each user to be as aggressive or as permissive as each of us wished. I could lower the rating, Mr. Petry said, and start to see some of the messages that I had previously been missing.
Of course, I would also start seeing a lot more spam. And here you have the sad truth about the state of the art in spam protection. Set up your software to a low setting and you'll get most of your mail, but lots of spam. Ratchet up the controls and you'll see fewer stock picks, but you might miss the note from a long-lost friend.
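The trade-off can be sketched with a toy example. Everything below is an assumption made for illustration: the eight scored messages, the mapping from the 1-to-5 vigilance level to a score threshold, and the function name `outcomes` are all hypothetical, not a description of how Postini implements its settings.

```python
# Toy data: (score, is_actually_spam). The scores are invented for this sketch.
MESSAGES = [
    (0.5, False), (1.5, False), (2.5, False), (3.5, False),
    (2.0, True), (3.0, True), (4.0, True), (5.0, True),
]

# Hypothetical mapping: a higher vigilance level means a lower score
# threshold, so more mail is held back.
THRESHOLDS = {1: 4.5, 2: 3.75, 3: 2.75, 4: 1.75, 5: 0.75}


def outcomes(level):
    """Return (legitimate messages held, spam messages delivered) at a level."""
    threshold = THRESHOLDS[level]
    held_legit = sum(
        1 for score, spam in MESSAGES if score > threshold and not spam
    )
    delivered_spam = sum(
        1 for score, spam in MESSAGES if score <= threshold and spam
    )
    return held_legit, delivered_spam


for level in sorted(THRESHOLDS):
    held, leaked = outcomes(level)
    print(f"level {level}: {held} legitimate held, {leaked} spam delivered")
```

Because the legitimate and spam scores overlap in this toy data, no level gets both counts to zero. Turning the dial up holds more real mail, and turning it down lets more spam through.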
Next time someone starts telling you about how smart computers have become, remind them about this situation, will you?
Michael Sloan