
AppleScript for Quickly Opening New Posts on a Forum

I mentioned in the comments on this post that I'd write this up, so here goes. What follows is an AppleScript I activate via FastScripts with the keyboard shortcut cmd-opt-f. It scrapes the page, filters the text, builds a list of URLs to click from the "new post" links (often an image is linked), and then opens them in reverse order, oldest to newest.

This lets me quickly read all the new posts on any of the forums I visit regularly, or at least mark them as read: opening a thread counts even if I close it right away because I'm not interested in it.

Note: a simplified version of this script is available here: http://cl.ly/1i1G0G3C2X1n.

The script starts by setting some global items:

-- Global Items
set siteURL to ""
set suffix to ""
set forumURL to ""
set extraLocation to ""
set rawSource to ""

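These will come in handy later. Strictly speaking they're ordinary top-level variables, but every block that uses them also runs at the top level of the script, so setting them up front means I don't have to worry about their scope.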

Next we get the HTML of the front document:

tell application "Safari"
    activate
    set siteURL to URL of front document
    set rawSource to source of front document

Simple enough.

The next block is a series of checks for which site we're on and the setting of some of those global variables.

For example:

else if siteURL contains "photography-on-the.net" then
    set suffix to "\" id=\"thread_gotonew_"
    set forumURL to "http://photography-on-the.net/forum/"

The real script has a big if/else if/…/else if block covering the various sites I visit, and it is terminated with this:

    else
        beep
        return false
    end if
end tell
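To show the overall shape of that block, here is a minimal sketch that runs without Safari. The sample siteURL and the second site (forum.example.com, with its suffix) are made up for illustration; only the POTN branch comes from the actual script.

```applescript
-- Hypothetical sample input; in the real script this comes from Safari.
set siteURL to "http://photography-on-the.net/forum/index.php"
set suffix to ""
set forumURL to ""

if siteURL contains "photography-on-the.net" then
    set suffix to "\" id=\"thread_gotonew_"
    set forumURL to "http://photography-on-the.net/forum/"
else if siteURL contains "forum.example.com" then
    -- Hypothetical second site; this suffix is invented for the example.
    set suffix to "\" class=\"newpost"
    set forumURL to "http://forum.example.com/"
else
    -- Unknown site: signal it and stop, as the real script does.
    beep
    return false
end if

return {suffix, forumURL}
```

Adding a new forum then only means adding one more else if branch with that site's suffix and base URL.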

POTN (photography-on-the.net) is one of the easiest to filter, so I'll use it as the example. Inspecting the HTML reveals that this is the code used for every "go to new post" link:

<a href="showthread.php?goto=newpost&amp;t=257659" id="thread_gotonew_257659"><img class="inlineimg" src="images/buttons/firstnew.gif" alt="Go to first new post" border="0" /></a>

We're interested in the URL, specifically "showthread.php?goto=newpost&t=257659" (the page source encodes the & as &amp;, which we'll undo later). The "suffix" is a block of text, large enough to be unique to new post links, that immediately follows the URL we want.

This gets the URLs on their own lines (and thus, in AppleScript parlance, in their own paragraphs):

-- All links, on their own line.
set rawSource to replaceText from "href=\"" to (return & forumURL) for rawSource
set rawSource to replaceText from suffix to ("•" & return) for rawSource

The first line (that isn't a comment) puts every link (the href=\" part denotes links) on its own line (the "return") with the forumURL at the front.

The second line replaces the suffix with the character • and a return to further separate the URLs we want on their own lines.

In other words, EVERY link is started on a new line, but only the URLs with a matching suffix - the "new post" links - will end with the • character.

But this replaceText bit, what's that? Simply this:

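to replaceText from textToFind to replacementText for sourceText
    -- Remember the current delimiters so they can be restored afterwards.
    set savedDelimiters to AppleScript's text item delimiters
    -- Split the text wherever textToFind occurs...
    set AppleScript's text item delimiters to {textToFind}
    set sourceTextList to every text item of sourceText
    -- ...then join the pieces back together with replacementText.
    set AppleScript's text item delimiters to {replacementText}
    set resultText to sourceTextList as text
    set AppleScript's text item delimiters to savedDelimiters
    return resultText
end replaceText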

A quick and easy way to replace text.
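The handler works by splitting the text on one delimiter and rejoining it with another. Here is a minimal sketch of those two steps on a made-up sample string, with the delimiters saved and restored around them as a precaution:

```applescript
-- Save the current delimiters so other code is not affected.
set savedDelimiters to AppleScript's text item delimiters

-- Step 1: split the sample text wherever "&amp;" occurs.
set AppleScript's text item delimiters to {"&amp;"}
set pieces to every text item of "a=1&amp;b=2&amp;c=3"
-- pieces is {"a=1", "b=2", "c=3"}

-- Step 2: join the pieces back together with "&" between them.
set AppleScript's text item delimiters to {"&"}
set joined to pieces as text
-- joined is "a=1&b=2&c=3"

set AppleScript's text item delimiters to savedDelimiters
return joined
```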

Moving on, we've now got all of the URLs we want on a line that begins with the forumURL and ends with •. So…

-- Match.
set linkList to {}
set rawSourceParagraphs to paragraphs in rawSource
repeat with theLine in rawSourceParagraphs
    if theLine ends with "•" then
        set theLine to replaceText from "•" to "" for theLine
        set theLine to replaceText from "&amp;" to "&" for theLine
        set linkList to linkList & theLine
    end if
end repeat

First we set a new variable, linkList, to an empty list. Then we set the variable rawSourceParagraphs to a list of the paragraphs in rawSource - this simply makes every new line a paragraph and thus a separate item in the list.

We want to cycle through rawSourceParagraphs and pull out the relevant URLs - remember they'll end with •. Since we know that, we quickly look at the lines (or paragraphs) that end with • and replace the bullet with nothing. Some sites encode & as &amp; so we convert it back because we just want the symbol in our URL.

Then we add the new URL to the existing list "linkList" ((I know, if you've programmed a linked list this is somewhat confusing. I probably could have called this variable "urlList" but I found the slight confusion between linked lists and linkList amusing.)).
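To see the whole filter at work without opening Safari, here is a minimal sketch that runs the same steps on a hand-written sample. The sample source (one "new post" link followed by an ordinary FAQ link) is an assumption for illustration, and replaceText is repeated, with the delimiters restored at the end, so the snippet runs on its own.

```applescript
to replaceText from textToFind to replacementText for sourceText
    set savedDelimiters to AppleScript's text item delimiters
    set AppleScript's text item delimiters to {textToFind}
    set sourceTextList to every text item of sourceText
    set AppleScript's text item delimiters to {replacementText}
    set resultText to sourceTextList as text
    set AppleScript's text item delimiters to savedDelimiters
    return resultText
end replaceText

set forumURL to "http://photography-on-the.net/forum/"
set suffix to "\" id=\"thread_gotonew_"

-- Hypothetical sample source: one "new post" link, then one ordinary link.
set rawSource to "<a href=\"showthread.php?goto=newpost&amp;t=257659\" id=\"thread_gotonew_257659\">new</a> <a href=\"faq.php\">FAQ</a>"

-- Put every link on its own line, then mark the "new post" links with a bullet.
set rawSource to replaceText from "href=\"" to (return & forumURL) for rawSource
set rawSource to replaceText from suffix to ("•" & return) for rawSource

-- Keep only the lines that end with the bullet.
set linkList to {}
repeat with theLine in paragraphs of rawSource
    set lineText to contents of theLine
    if lineText ends with "•" then
        set lineText to replaceText from "•" to "" for lineText
        set lineText to replaceText from "&amp;" to "&" for lineText
        set end of linkList to lineText
    end if
end repeat

return linkList
-- {"http://photography-on-the.net/forum/showthread.php?goto=newpost&t=257659"}
```

The FAQ link also lands on its own line, but it never gets the bullet, so it is dropped.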

So now we've got a list of links we want to open.

tell application "Safari"
    repeat with theLink in reverse of linkList
        try
            open location theLink
        end try
    end repeat

    if extraLocation is not "" then open location extraLocation

    -- Go back to original page
    if (count of linkList) > 0 then open location siteURL
end tell

This should be pretty easy to figure out. One nice touch is the "reverse of" in the repeat loop. It opens the links farther down the page (typically the older threads) first, so you can work your way up to the newest posts. It's how I prefer to read forums.

The last line with (count of linkList) refreshes the original page if there are new posts. The primary reason? To take you back to the first tab which should appear right before the oldest thread, so you can close it and just move through your tabs, left to right, easily ((Depending on your Safari settings, of course. Mine open new links in tabs and to the right of the current tab.)).
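As a small illustration of the ordering, here is a sketch with placeholder strings standing in for URLs (the names are made up). It records the order in which the loop would visit them instead of opening anything:

```applescript
-- Placeholder items in page order: newest thread at the top.
set linkList to {"newest thread", "middle thread", "oldest thread"}

set openOrder to {}
repeat with theLink in reverse of linkList
    -- "contents of" turns the loop reference into plain text.
    set end of openOrder to contents of theLink
end repeat

return openOrder
-- {"oldest thread", "middle thread", "newest thread"}
```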

One bit not explained is the "extraLocation" variable. You can use it to open a page you always visit on that particular site or, as I'll demonstrate below, to open your private messages window if you have any private messages. This block goes right beneath the set forumURL bit:

if rawSource does not contain "/messages/\">0 New Messages" then
    set extraLocation to "http://thesandtrap.com/messages/"
end if

Easy enough, eh?!
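One edge case worth considering: that test also passes when the page contains no messages link at all (say, when you're logged out). Here is a minimal sketch of a stricter check, assuming the link markup always starts with /messages/"> as in the snippet above. The handler name messagesLocationFor and the sample source are hypothetical.

```applescript
-- Hypothetical helper: returns the messages URL only when the page shows a
-- messages link whose count is not zero; otherwise returns "".
on messagesLocationFor(rawSource)
    set linkMarker to "/messages/\">"
    -- No messages link on the page at all: nothing extra to open.
    if rawSource does not contain linkMarker then return ""
    -- Link present but the count is zero: nothing extra to open.
    if rawSource contains (linkMarker & "0 New Messages") then return ""
    return "http://thesandtrap.com/messages/"
end messagesLocationFor

-- Made-up sample source with two unread messages.
set extraLocation to messagesLocationFor("<a href=\"/messages/\">2 New Messages</a>")
return extraLocation
-- "http://thesandtrap.com/messages/"
```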

I'm sure the script could be improved (I made a small improvement today, in fact), but it works for me and I don't have to worry about it, so there you go.

P.S. I do hope the copying/pasting and the formatting of special characters doesn't goof things up. I've taken care to get the & and &amp; and other entities and formatting correct, but if I've missed something somewhere, please let me know.