Subscribe to
Posts
Comments
NSLog(); Header Image

Preventing Multiple TrackBacks

A week or so ago I set about writing a "duplicate TrackBack removal script" in PHP. The script would give you a list of your duplicate TrackBacks and offer to delete them. The deleting was easy - getting the list was the pain in the ass. I put in a good amount of time trying to find the "best" solution. A few attempts used a lot of GROUP BYs. A few used the more boring ORDER BY. I stashed results in arrays so that they could be enumerated and their information (article name, the off duplicate/triplicate/quintuplet/etc.) displayed.

Basically, I spent a lot of time getting nowhere. I already knew how I was going to delete the duplicate TrackBacks and I realizd that I didn't really care to see what I'd be deleting. The only real purpose such a list could serve would be to list the articles that would need rebuilt to remove the off duplicate (or more) TrackBacks.

So here it is - the "pruning" code is a one-line MySQL query:

ALTER IGNORE TABLE mt_tbping ADD UNIQUE
ejb_tbpingindex (tbping_tb_id, tbping_source_url)

What's this actually do? This single query creates a new index using the smallest number of unique keys available: two. The keys are the tbping_tb_id (your article's TrackBack Ping ID, unique to each of your entries) and the tbping_source_url, the URL that "sent" the TrackBack(s). An entry can have TrackBacks from multiple source URLs, and a source URL can ping different articles on your site, but any duplication of these two keys in your database indicates a duplicate TrackBack.

The new index prunes any TrackBacks with identical tbping_tb_ids and tbping_source_urls and - in my tests - seamlessly prevents them from being added in the future. Of course, if you wish to clean up your database, you can remove the index simply:

ALTER TABLE mt_tbping DROP INDEX ejb_tbpingindex

This does some housekeeping, cleaving off the index you created in the first query.

I initially thought that if this index were left in place it may prevent your MovableType installation from responding "OK" to the blog doing the (multiple) TrackBacking. This would result in a (futile) ping being sent every time the remote article is built.

In my testing, however, I found this to be false. I applied the index to several of my friends' blogs (having seven blogs running on my server has its advantages). I then TrackBacked some articles to which I'd already successfully posted TrackBacks and found that:

  1. The second TrackBack was not added to their database.
  2. My MT installation seemed to receive the "OK" from the remote blog.

In other words, the existence of the index seems to work flawlessly!

Your mileage may vary, but if this works as it seems to - preventing multiple TrackBacks from building up - I'm curious as to why this isn't included as a default (or an option) in MovableType. I've posted a question and will update this entry when/if I get an answer or more information. Please post here with your comments - does it work for you? Does it fail miserably? Does it need a small tweak?