Janet Riley

Project Notebook: Delicious Link Checker Overview

February 26, 2013 | In tech

Project

Delicious Link Checker
Code on GitHub

Background

Delicious is a web-based bookmark service where you can save your bookmarks, tag them, and see them from any web browser. It’s super-useful. When I joined I was trying to keep bookmarks in sync between three computers. Now I use it to archive and retrieve interesting links. An If This Then That recipe automatically bookmarks posts from a few favorite foodie blogs - I was saving them all anyway, and it cut down the volume in my RSS reader.

Link rot has set in after eight years and 4000+ bookmarks. About 10% of links are broken, and many more have moved to new locations. If only there were a script that could tidy them up.

Goals

* Clean up my Delicious links the lazy way - detect and delete dead links, and update moved links to their new URLs
* Write some Ruby. Make interesting mistakes.

How it works

The script requests my bookmarks through the Delicious API. One by one, it checks each link's status with an HTTP HEAD request. Dead links (status 404 and 410) are deleted. Moved links (status 301 and 308) are updated with the new location.
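
Here's a rough sketch of that decision logic. It is not the actual script: it uses Ruby's standard Net::HTTP instead of HTTParty so it runs without any gems, it leaves out the Delicious API calls entirely, and the names (classify_link, check_link, DEAD_STATUSES, MOVED_STATUSES) are made up for illustration.

```ruby
require 'net/http'
require 'uri'

# Status codes as described above (hypothetical constant names).
DEAD_STATUSES  = [404, 410].freeze
MOVED_STATUSES = [301, 308].freeze

# Decide what to do with a bookmark, given the HEAD response status
# and the Location header (if any).
# Returns [:delete], [:update, new_url], or [:keep].
def classify_link(status, location = nil)
  if DEAD_STATUSES.include?(status)
    [:delete]
  elsif MOVED_STATUSES.include?(status) && location
    [:update, location]
  else
    # Everything else is left alone, including a "moved" status
    # that arrives without a Location header.
    [:keep]
  end
end

# Send a single HEAD request and classify the answer.
# Net::HTTP does not follow redirects on its own, so a 301 or 308
# is seen directly rather than the page it points to.
def check_link(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
  classify_link(response.code.to_i, response['location'])
end
```

With this sketch, check_link would return [:delete] for a link whose server answers 404 or 410, and [:update, new_url] for one that answers 301 or 308 with a Location header. Network failures such as timeouts and DNS errors are not handled here.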

Under the hood

* A plain Ruby script, to be run from the command line
* HTTParty powers the web requests
* Coming soon, the WebMock gem for testing

Status

* The mechanics work. I’ve logged the results to a file.
* I’ve been testing against an account that I imported my bookmarks into. Delicious no longer supports imports, so I can’t reload URLs that I’ve cleaned up. Rather than one glorious unrepeatable test…
* I found the WebMock gem, which should (?) let me fake all the HTTP status codes for testing error handling.

Next time: 50 Shades of DOH! - In which I discover how many ways an HTTP request can go wrong.