NSLog();

AppleScripting Safari Downloads

I've done some testing of AppleScripting Safari image downloads from a "members" site (a photography club). Images are always of the form http://server.com/[prefix]xx[suffix].png, with "xx" being a number like 01 or 43. Currently I have a script capable of loading all of the images:

  1. I log in to the site and navigate to the images I want to download.
  2. I run an AppleScript which prompts me for the number of items, the prefix, and the suffix.
  3. The AppleScript builds x URLs and then uses open location to open them all.
  4. I manually drag and drop the images to a folder I've created.
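For reference, the URL-building step in the AppleScript can be mirrored in shell, which matters later if the downloading end ever moves to a command-line tool. A sketch, where `prefix`, `suffix`, and `count` stand in for the values the script prompts for, and two-digit zero-padding is assumed from the "01 or 43" examples above:

```shell
# Build the numbered URLs the same way the AppleScript does.
# prefix/suffix/count stand in for the values the script prompts for;
# two-digit zero-padding is assumed from the "01 or 43" examples.
build_urls() {
    prefix=$1; suffix=$2; count=$3
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'http://server.com/%s%02d%s\n' "$prefix" "$i" "$suffix"
        i=$((i + 1))
    done
}
```

For example, `build_urls img .png 3` prints http://server.com/img01.png through http://server.com/img03.png, one per line.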

The last step can take some time. Unfortunately, Safari's AppleScript dictionary doesn't allow for downloading images. The standard save document 1 approach doesn't work either - Safari attempts to save a page archive instead of the actual PNG file.

I did some Googling and discovered that "URL Access Scripting" has a download method, but this method can't get past the site's membership login requirements (and this is not a login which can be accessed via "http://username:password@server.com").

So, what to do? I'm opposed to using another browser for this functionality, and I don't believe tapping into Safari's JavaScript capabilities (via AS) will work. Can images from a members/login site be downloaded from Safari via AppleScript?

13 Responses to "AppleScripting Safari Downloads"

  1. You might try using the Unix tool wget? It supports credentials via HTTP and/or cookies from a text file. man wget for more information.

  2. [quote comment="39434"]You might try using the Unix tool wget? It supports credentials via HTTP and/or cookies from a text file. man wget for more information.[/quote]

    Surprisingly, wget doesn't appear to be a standard component of Mac OS X, though curl is. I haven't looked into using curl for this, but if it involved much monkey business each time, it wouldn't be any better than manually dragging the images to a folder.

  3. [quote comment="39435"]I haven't looked into using curl for this, but if it involved much monkey business each time, it wouldn't be any better than manually dragging the images to a folder.[/quote]

    curl is really good at dealing with patterns like this.

    curl -O "http://www.site.com/file[01-25].jpg"

    will download the 25 files, "file01.jpg" through "file25.jpg", to the current working directory.

  4. [quote comment="39438"]curl is really good at dealing with patterns like this.[/quote]

    I know it is. But curl will simply download a "please login" page. Please note the "membership/login" requirements mentioned above.

  5. If the site sets a cookie once you login, you could likely login once from Safari, pull the relevant cookie out of ~/Library/Cookies/Cookies.plist and pass that to curl via the "--cookie" parameter.

    Barring that, you could probably have curl send the login form data directly using the "--data" flag.
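If the cookie route pans out, the "pull the relevant cookie out by hand" step could itself be scripted. A rough sketch, assuming the XML-plist format of the era, where each Domain/Name/Value key and its string value sit on their own lines (convert a binary file first with `plutil -convert xml1`); the key names are an assumption and may differ between Safari versions:

```shell
# Rough sketch: print name=value pairs for one domain from an XML
# Cookies.plist. Assumes each <key>Domain/Name/Value</key> line is
# followed by its <string>...</string> value on the next line, which
# may not hold for every Safari version.
cookies_for_domain() {
    awk -v dom="$2" '
        /<dict>/             { name = ""; value = ""; domain = "" }
        /<key>Domain<\/key>/ { getline; gsub(/<[^>]*>|[ \t]/, ""); domain = $0 }
        /<key>Name<\/key>/   { getline; gsub(/<[^>]*>|[ \t]/, ""); name = $0 }
        /<key>Value<\/key>/  { getline; gsub(/<[^>]*>|[ \t]/, ""); value = $0 }
        /<\/dict>/           { if (index(domain, dom) && name != "") print name "=" value }
    ' "$1"
}
```

The output could then be joined and handed to curl, e.g. `curl --cookie "$(cookies_for_domain Cookies.plist server.com | tr '\n' ';')" ...`.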

  6. [quote comment="39440"]If the site sets a cookie once you login, you could likely login once from Safari, pull the relevant cookie out of ~/Library/Cookies/Cookies.plist and pass that to curl via the "--cookie" parameter.[/quote]

    Looking through a 2.5 MB file every time I want to do this is no easier than manually downloading.

    [quote comment="39440"]Barring that, you could probably have curl send the login form data directly using the "--data" flag.[/quote]

    I don't think that would work either. I don't think curl can maintain the "logged-in-ness" of the session.

    Simply put: Safari should let me do this. 😛

  7. [quote comment="39441"]
    I don't think that would work either. I don't think curl can maintain the "logged-in-ness" of the session.[/quote]

    The "--cookie-jar" option looks promising. I'd imagine that if the login info is stored in a cookie, using this option will allow the "logged-in-ness" to persist across invocations of curl.

    [quote comment="39441"]Simply put: Safari should let me do this. :-P[/quote]

    Oh, I agree wholeheartedly, just trying to help find a workaround.

  8. [quote comment="39445"]http://bbs.applescript.net/viewtopic.php?id=15805[/quote]

    That's where I found out about "URL Access Scripting," but as I said above, it can't handle the login info. 🙁

    [quote comment="39445"]http://automator.us/examples-02.html[/quote]

    Automator seems less capable than AppleScript, though I'm sure the two could somehow be paired… perhaps the AppleScript could create a new window, load the next image, call an Automator action to download the image, and move on? Is that possible?

    Or, better yet, a list of URLs could be fed to the "Download URLs" action. But how can the list be created in AppleScript (as is currently done)?

    Hmmm… this looks promising, but I'm not getting my hopes up. I hadn't thought of Automator. Thanks.

    Update: This, too, simply downloads the login page.

  9. Did you try the --cookie-jar option? I would think you could write a shell script that did:
    1. curl: call the login script with your credentials, and ask to save the cookies created in a named cookie-jar file
    2. curl: ask to download a sequence of files with the --cookie option providing your login state

    The shell script could handle options for where to store the downloaded files, and your sequence/prefix/suffix params. You could then wrap it in an AppleScript, if need be, so it can be called through Safari's script menu (or hooked to a trigger in Quicksilver).
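The two curl steps above can be sketched as a small shell function. The login URL, form-field names, and file pattern are all placeholders for whatever the real site uses; setting DRY_RUN=1 prints the curl commands instead of running them:

```shell
# Sketch of the two-step approach. The login URL, form-field names,
# and file pattern are placeholders. Set DRY_RUN=1 to print the curl
# commands instead of running them.
run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "$@"; else "$@"; fi; }

fetch_member_images() {
    prefix=$1; suffix=$2; count=$3; jar=${4:-cookies.txt}
    # Step 1: log in once, saving the session cookie to the jar.
    run curl --cookie-jar "$jar" --data 'login=USER&pass=PASS' \
        'http://server.com/login.php'
    # Step 2: fetch the numbered images, sending the cookie back.
    run curl --cookie "$jar" --remote-name \
        "http://server.com/${prefix}[01-$(printf '%02d' "$count")]${suffix}"
}
```

Something like `fetch_member_images img .png 25` would then do the whole run, and an AppleScript wrapper could invoke it via `do shell script`.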

  10. Using System Events scripting, you could paste the URLs one at a time into the Safari address bar and then have it type option-return to download rather than view.

  11. With curl you can use -c to write out a cookie file it receives, -b to read a cookie file, and -d to fill out a form, like this:

    curl -c '/Users/firstnamelastname/Documents/CurlCookies/nameofsite' -d 'login=joe&pass=1234' http://www.nameofsite.com/login.php

    Now you are logged in, and you have a cookie stored locally (I made a folder in my Documents folder). You can then read that cookie:

    curl -b '/Users/firstnamelastname/Documents/CurlCookies/nameofsite' http://www.nameofsite.com/whatever.jpg

  12. I like rob's idea. But I've yet to successfully log in using curl, so I'm still using Safari.

    What I do is run one script that extracts the URL, and then I run a 2nd script that downloads the image. The nice thing about this is that I can stop the script and then resume downloading at a later time.

    This is an example of the download script. Unfortunately I have not found a way to monitor the download; my script just waits about 2 minutes before moving on to the next download.

    set listOfImagesToDownload to {"image address.jpg", "image address2.jpg"}

    set sizeOfListOfImagesToDownload to count listOfImagesToDownload
    repeat with imageCounter from 1 to sizeOfListOfImagesToDownload

        -- set the URL to download
        set targetURL to item imageCounter of listOfImagesToDownload

        -- start download
        tell application "Safari"
            activate
            --set the URL of document 1 to targetURL as text
            tell application "System Events"
                keystroke "n" using command down -- create a new Safari window
                delay 2 -- wait for the Safari window to open
                keystroke "l" using command down -- select the contents of the location field
                delay 3
                keystroke targetURL -- type targetURL into the location field
                delay 10 -- wait while the link is being typed in
                keystroke return using option down -- option-return tells Safari to download the address instead of loading it
                delay 2
                keystroke "w" using command down -- close the window
            end tell
        end tell

        delay 120 -- wait for the image to download before starting the next one
    end repeat
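For what it's worth, the fixed two-minute guess could go away entirely if the fetch itself were handed to curl, since curl doesn't return until the transfer finishes. A sketch, assuming a cookie file saved as in comment 11 (the cookie path and URLs are placeholders):

```shell
# Sketch: curl blocks until each transfer completes, so no fixed
# delay is needed between downloads. Cookie path and URLs are
# placeholders for the real site's values.
download_all() {
    cookies=$1; shift
    for url in "$@"; do
        curl --cookie "$cookies" --remote-name "$url"
    done
}
```

Usage would look like `download_all "$HOME/Documents/CurlCookies/nameofsite" "http://www.nameofsite.com/image01.jpg" "http://www.nameofsite.com/image02.jpg"`, and the loop can be stopped and resumed just like the two-script approach above.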