Phillip Smith

Avoiding e-mail list data corruption

Re-posted from the New Internationalist Tech Blog

A few weeks ago, I started a major upgrade to New Internationalist's broadcast e-mail infrastructure. In the process of the upgrade, I noticed that a small number of e-mail subscriber records had been maliciously injected with arbitrary data (in this case, URLs to some other site).

Upon investigating the issue, it occurred to me that many other sites and organizations with larger e-mail lists could be susceptible to this type of e-mail data corruption. So here's a quick run-down of the problem and some possible solutions.

The injection is relatively unsophisticated, and is not specific to one e-mail broadcast tool (i.e., I think it would be an issue for almost any e-mail platform on the market). Basically, the "attacker" is:

  • Posting data into existing e-mail list subscription forms
  • Guessing for common e-mail addresses (think of addresses like
  • Posting in malicious data and links for text fields like "First name" and "Last name"

What results is a record that looks like:


If we were to send out an e-mail campaign with personalization using the First name and Last name, we would inadvertently send that content to the subscriber.

The attack is fairly limited: the attacker has to guess existing subscribers' e-mail addresses and, depending on the e-mail broadcast system, the subscriber may receive a message about the record change or a subscription confirmation. Still, it does point out that groups with larger e-mail lists could take proactive steps to avoid it.

In New Internationalist's case, I added some client-side form validation to help ensure that the data going into our broadcast tool was less likely to contain URLs or obviously bad data.

On the broadcaster side of things, I set up regular reports to look for cases of questionable data in our database, which I can then investigate manually.
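As a rough illustration of what such a report could look like, here is a minimal Python sketch. It assumes subscriber records have already been exported as a list of dictionaries; the field names (`email`, `first_name`, `last_name`), the `URL_LIKE` pattern, and the `find_suspicious_records` function are all hypothetical and not part of any particular broadcast tool.

```python
import re

# Hypothetical pattern: flags values that look like links or domain names.
URL_LIKE = re.compile(r"https?://|www\.|\.(?:com|net|org|info)\b", re.IGNORECASE)


def find_suspicious_records(records, fields=("first_name", "last_name")):
    """Return (email, field, value) for every checked field that looks like a URL."""
    flagged = []
    for record in records:
        for field in fields:
            value = record.get(field) or ""
            if URL_LIKE.search(value):
                flagged.append((record.get("email"), field, value))
    return flagged


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        {"email": "ann@example.org", "first_name": "Ann", "last_name": "Lee"},
        {"email": "info@example.org", "first_name": "http://spam.example/buy", "last_name": "Lee"},
    ]
    for email, field, value in find_suspicious_records(sample):
        print(f"{email}: {field} looks suspicious: {value!r}")
```

A pattern like this will miss some bad data and may flag the odd legitimate name, which is why the flagged records still go to a manual review rather than being deleted automatically.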

Down the road, I'd like us to implement a "middleware" solution that also provides server-side data validation and does some pre-screening of the data to be posted to the broadcaster (using something like Mail-CheckUser).
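To make the idea concrete, here is a minimal sketch of the kind of server-side check such a middleware layer might run before forwarding a submission to the broadcaster. It is written in Python purely for illustration, it is not what New Internationalist runs, and it does not do the mailbox-level checking that a tool like Mail-CheckUser offers. The field names, the length limit, and the `validate_subscription` function are assumptions. Rather than hunting for URLs, it only accepts name fields made of characters you would expect in a name.

```python
import re

MAX_NAME_LENGTH = 50  # assumed limit; adjust to your own data

# Letters (including accented ones), spaces, apostrophes and hyphens, plus a
# period only when it ends a word (as in "J. R."). Slashes, colons and
# mid-word periods are rejected, which rules out most links and domains.
NAME_ALLOWED = re.compile(r"(?:[^\W\d_]|[ '\-]|\.(?= |$))+")

# Loose shape check only; it does not prove the mailbox exists.
EMAIL_SHAPE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_subscription(form):
    """Return a list of problems; an empty list means the data may be forwarded."""
    errors = []
    email = (form.get("email") or "").strip()
    if not EMAIL_SHAPE.fullmatch(email):
        errors.append("email: not a plausible address")
    for field in ("first_name", "last_name"):
        value = (form.get(field) or "").strip()
        if len(value) > MAX_NAME_LENGTH:
            errors.append(f"{field}: too long")
        elif value and not NAME_ALLOWED.fullmatch(value):
            errors.append(f"{field}: contains characters not expected in a name")
    return errors
```

The middleware would only post the record to the broadcaster when the returned list is empty, and could log or queue the rejected submissions for review.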

It would be great to know whether other organizations have implemented anything similar, or have experienced similar attempts at corrupting their e-mail list data.


Hi, I'm Phillip Smith, a veteran digital publishing consultant, online advocacy specialist, and strategic convener. If you enjoyed reading this, find me on Twitter and I'll keep you updated.

