My family web site has a guestbook that a number of people, mostly family, have signed. Since I put it up a couple of years ago, I’ve gotten “guestbook spam” every now and again: messages sent by people we don’t know, advertising their cheap V1@grA or whatever. Some just talk about what a great site it is, and how informative, and don’t actually contain a link. I don’t understand those, but whatever. Anyway, over the last few months the number of spam entries rose sharply, until a couple of weeks ago I was getting four or five a day. I wrote all the code for the website myself, and when someone adds a guestbook entry, I get an email saying who added it, when, and the text of the comment. When the spam started getting out of control, I added a “delete this entry” link to that email. If I click the link, the entry gets deleted. Very easy, but still annoying.
I have no real idea how these messages were being created, but I’m quite certain it wasn’t someone sitting at a browser, looking for guestbooks and adding entries whenever they found one. It had to be a bot of some kind. I figured that if I were writing a bot to do this, I would look at how the majority of guestbooks handle comments and write my bot accordingly. My guess was that the bots simply request pages using POST, sending “comment=&name=&email=” as the POST body. If the page happens to be a guestbook, that may or may not add a comment, and either way the bot moves on to the next page. I suspected that the vast majority of guestbooks use “email” as the name of the email address field, “name” as the name field, and “comment” as the comment field, so I changed my page so that the names of these fields are hard-coded random strings. (I could change it so that the strings are randomly generated at run-time rather than hard-coded, but that’s just too much work.) The end result is that in the week since I made this change, I have not gotten a single spam entry in my guestbook.
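For anyone curious what the "generated rather than hard-coded" version might look like, here is a rough Python sketch. It is not the code running on my site. The secret key, the salt, and the function names are all made up for illustration. The idea is to derive each field name from a secret plus a salt (a fixed string, or something like today's date), so the server can always work out which obscured name maps to which real field.

```python
import hashlib
import hmac
from typing import Dict, Optional

# Hypothetical secret; a real site would keep this in server config.
SECRET_KEY = b"change-me"

# The logical field names the guestbook code uses internally.
LOGICAL_FIELDS = ("name", "email", "comment")


def obscured_name(logical: str, salt: str) -> str:
    """Derive an unguessable but reproducible form field name."""
    message = f"{salt}:{logical}".encode()
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    # Prefix with a letter so the name never starts with a digit.
    return "f" + digest[:12]


def field_map(salt: str) -> Dict[str, str]:
    """Map each logical field to the name used in the HTML form."""
    return {logical: obscured_name(logical, salt) for logical in LOGICAL_FIELDS}


def decode_post(form: Dict[str, str], salt: str) -> Optional[Dict[str, str]]:
    """Translate a submitted form back to logical field names.

    Returns None if any expected field is missing, which is what happens
    when a bot posts plain "name", "email" and "comment".
    """
    names = field_map(salt)
    if not all(obscured in form for obscured in names.values()):
        return None
    return {logical: form[obscured] for logical, obscured in names.items()}
```

The page would render its inputs using `field_map(salt)`, and the POST handler would call `decode_post` with the same salt and drop anything that comes back as `None`. Changing the salt changes every field name at once.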
It’s certainly possible for a script to get the (HTML) source for a page, analyze it to find out what the actual field names are, and then submit spam entries that way, but I guess the bots aren’t smart enough yet to do that. I’m sure it won’t be long though…
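To show how little a renamed field really hides, here is a small sketch using only Python's standard `html.parser` module. The class and function names are my own invention, and I have no idea whether real bots work this way. It just lists the `name` attribute of every form control on a page, which is all a script would need in order to find the "random" field names.

```python
from html.parser import HTMLParser
from typing import List


class FieldNameCollector(HTMLParser):
    """Collects the name attribute of every form control it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.names: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Form data is carried by input, textarea and select elements.
        if tag in ("input", "textarea", "select"):
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


def form_field_names(html: str) -> List[str]:
    """Return the field names found in an HTML document, in page order."""
    collector = FieldNameCollector()
    collector.feed(html)
    return collector.names
```

Whatever the fields are called, they are sitting right there in the page source, so renaming them is a speed bump and nothing more.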
Feb 5 update: Got two spam entries this morning. Oh well.