About

DuggBack

DuggBack is a service that lets you quickly find and shuffle through complete website mirrors and web caches of sites that have been hit by the 'Digg Effect'.

The 'Digg Effect' is the term for what happens when a popular site with many users links to a smaller site, causing it to slow down or halt completely under the increased traffic.

This site is not affiliated with Digg, but it uses the Digg API to provide the essential data needed to generate each page and story list. Some may call this site a 'mash-up', since it combines multiple data sources and mixes them into a new, handy web application.

DuggBack might be the 'missing service' that Digg will never provide, because Digg's fundamental idea is to link directly to the website.

The site is not funded and is supported by ad revenue. The codename for this project was 'DuggCache'.

Purpose

Have you ever found yourself crawling through the digg.com stories and clicking on a link to a news article, only to have the link time out or find that the page has been completely removed? You can use DuggBack to find mirror services and web cache snapshots from before the site went down.

The available mirror services are provided by DuggMirror, Coral CDN, Wayback Machine and DotCache. The web caches serve as text-only backups and are provided by the four major search engines: Google, Yahoo, Live Search and Ask. Another cool feature of the Wayback Machine is its list of older captures, categorized by year.

There is also a special section where you can sometimes find a MirrorDot mirror link. This is the mirror service for SlashDot, just as DuggMirror is for Digg. The link is only available when the story in question has been on the front page of SlashDot.

Design

Each story has multiple services available, and you access them through a tab-based navigation system that automatically loads the correct frame when you click a tab. The tabs are relatively large and the different services are easy to recognize.

The page markup and design are done with traditional XHTML and CSS respectively. To make the site display properly, especially widths in Internet Explorer, stylesheets are loaded using conditional comments in the HTML head.

Work on the ideas and code for the site began in late January 2007, and the initial site was launched in late April 2007 (pre-launched on the 27th, GigaOM; official launch on the 30th). Most of the time spent building this website went into making a solid code base and a reasonable database structure. The PHP code has been pre-compiled to speed up site access and server performance.

Caching has also been adjusted so that each page gets a suitable client cache control. This includes disabling the default PHP behavior and enabling the Last-Modified header, so that updated content is sent only when needed.
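The Last-Modified mechanism can be illustrated with a minimal sketch. It is written in Python rather than the site's PHP, and the function name `conditional_response` is hypothetical; it only shows the general idea of answering a conditional request with 304 Not Modified when the client's copy is still current, not the code actually running on the site.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(last_modified, if_modified_since=None):
    """Return (status, headers) for a page last changed at `last_modified`.

    `last_modified` must be a timezone-aware UTC datetime.
    `if_modified_since` is the raw request header value, if the client sent one.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparsable header: fall through to a full response
        # Only compare when the parsed date carries a timezone.
        if since is not None and since.tzinfo is not None and last_modified <= since:
            return 304, headers  # client copy is current, send no body

    return 200, headers  # send the full page


page_changed = datetime(2007, 4, 30, 12, 0, 0, tzinfo=timezone.utc)
print(conditional_response(page_changed, "Mon, 30 Apr 2007 12:00:00 GMT"))
```

A browser that already holds the page sends the date back in If-Modified-Since, and the server can then skip generating and transferring the body.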

If possible, use a display resolution of 1280x1024 pixels or similar for this website. Lower resolutions are supported and have their own rendering, but they might not look as good or be as easy to navigate.

The site design itself is simple and mimics the Digg v3.5 layout. The base sketch is inspired by that design and credit goes to Daniel Burka at Digg.

Usage

To find the mirrors and caches for a Digg story, just add the topic and the Digg story name after duggback.com, e.g.:

www.duggback.com/security/How_to_Encrypt_Your_VoIP

Another easy method is to post a comment on the Digg story page with a link back to 'www.duggback.com'. When someone clicks on that link they will be redirected automatically to the proper DuggBack page, based on the referrer.
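The referrer-based redirect could work roughly as in the sketch below. It assumes that a Digg story page lives at digg.com/<topic>/<story>, mirroring the DuggBack address shown above; the function name and the `DUGGBACK_BASE` constant are hypothetical and not taken from the site's actual code.

```python
from urllib.parse import urlparse

DUGGBACK_BASE = "http://www.duggback.com"  # assumed base address


def duggback_url_from_referrer(referrer):
    """Map a Digg story referrer to the matching DuggBack URL, or None."""
    parts = urlparse(referrer)
    host = (parts.hostname or "").lower()
    if host not in ("digg.com", "www.digg.com"):
        return None  # visitor did not arrive from Digg

    # Assumed Digg story layout: /<topic>/<story>
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 2:
        return None  # not a story page
    topic, story = segments
    return f"{DUGGBACK_BASE}/{topic}/{story}"


print(duggback_url_from_referrer("http://digg.com/security/How_to_Encrypt_Your_VoIP"))
```

When the function returns None, a site could simply show its front page instead of redirecting.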

If you want to do further research, or digg deeper, and find mirrors and caches for a webpage that has not been submitted to Digg yet, you can use another cool function called 'Investigate'. All you have to do is find the link you want to look up and either paste it after the duggback.com domain, e.g.:

www.duggback.com/http://news.com.com/2300-1026_3-6166254.html?tag=nefd.top

Or you can use the 'Investigation' form at the end of every story page. The system will try to find all available mirrors and caches and create a generic page for the website address you submitted. If you submit a website address that has already been submitted and promoted to the Digg front page, you will be redirected to the correct story page at DuggBack.
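The two address styles described so far (a topic and story, or a full link pasted after the domain) can be told apart with a simple check. This is a sketch under stated assumptions: `classify_request` is a hypothetical name, and the real site may route requests differently.

```python
def classify_request(path_and_query):
    """Classify the part of the address that follows the duggback.com domain.

    Returns ('investigate', url), ('story', (topic, story)) or ('unknown', None).
    """
    target = path_and_query.lstrip("/")

    # A pasted link keeps its scheme, so treat it as an Investigate request.
    if target.startswith(("http://", "https://")):
        return "investigate", target

    # Otherwise expect exactly two path segments: topic and story.
    segments = [s for s in target.split("?", 1)[0].split("/") if s]
    if len(segments) == 2:
        return "story", (segments[0], segments[1])

    return "unknown", None


print(classify_request("/security/How_to_Encrypt_Your_VoIP"))
print(classify_request("/http://news.com.com/2300-1026_3-6166254.html?tag=nefd.top"))
```

Note that the query string of a pasted link is kept as part of the investigated address, since it may be needed to identify the page.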

In addition to the mirrors and caches, you can also find other useful bits of information, like the webserver location, the reverse IP hostname, a link to the DuggTrends graph and a list of neighboring websites served from the same IP address.

The geographical location shown in the 'website' tab is not always correct, and in some situations it is far off. A lot of major websites will direct you to a local webserver for performance reasons, e.g. through Akamai's Content Delivery System, and that may skew the IP address and thus the location. So don't be perplexed if you see Norway as the location for an American site. If you know the correct location of the webserver, please let host.info know by making a correction on their site.

If you use Firefox with the Adblock extension, you might experience some problems with the iframes used on this site. If so, try allowing this site in the filter.

Updates

The vote and comment counts are updated every half hour for the latest 75 stories. Older stories are also updated whenever they are visited. The update process is governed by several update intervals (story, cache, hostinfo, geolocation, notfound), and as soon as a piece of data passes its expiration date it will be updated the next time the page is requested.
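The expiration check can be sketched as follows. The interval names come from the list above, but only the half-hour story interval is stated in the text; every other duration is a placeholder chosen for illustration, and `expired_kinds` is a hypothetical function name.

```python
from datetime import datetime, timedelta

# Only the 30-minute story interval is from the text; the rest are placeholders.
UPDATE_INTERVALS = {
    "story": timedelta(minutes=30),
    "cache": timedelta(hours=6),        # assumed
    "hostinfo": timedelta(days=7),      # assumed
    "geolocation": timedelta(days=7),   # assumed
    "notfound": timedelta(days=1),      # assumed
}


def expired_kinds(last_updated, now):
    """Return the kinds of data that should be refreshed on this page request.

    `last_updated` maps a kind to the time it was last refreshed;
    a missing kind counts as expired.
    """
    stale = []
    for kind, interval in UPDATE_INTERVALS.items():
        updated_at = last_updated.get(kind)
        if updated_at is None or now - updated_at >= interval:
            stale.append(kind)
    return stale


now = datetime(2007, 5, 1, 12, 0)
print(expired_kinds({"story": now - timedelta(minutes=45)}, now))
```

Because the check runs when a page is requested, rarely visited stories cost nothing until someone actually opens them.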

There are multiple fail-overs in place in case an update fails or a source is unavailable. This applies to all the main functions, i.e. story, cache, investigation and host information updates. If something fails, the update is skipped and the old data is displayed until the target site is back up.
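The skip-and-keep-old-data behavior can be sketched in a few lines. `refresh_or_keep` and the `fetch` callable are hypothetical names used for illustration; the site's own fail-over code is not shown in this text.

```python
def refresh_or_keep(cached, fetch):
    """Try to refresh a piece of data; on failure keep the old value.

    Returns (value, refreshed), where `refreshed` tells the caller
    whether the stored timestamp should be updated.
    """
    try:
        return fetch(), True
    except Exception:
        # Source unavailable: skip this update and keep showing the old data.
        return cached, False


def failing_fetch():
    raise ConnectionError("target site is down")


print(refresh_or_keep({"diggs": 120}, failing_fetch))
```

Leaving the timestamp untouched on failure means the next page request tries the update again.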

Because of the expiration structure and the need for networked updates, the site is not as stable and reliable as a normal, locally sourced site. The site has multiple breakpoints that will alert an admin if something doesn't behave as it should or if some data expressions fail.

Tools Used

  • Apache v1.3
  • MySQL v4.1
  • PHP v4.3
  • eAccelerator
  • cURL
  • GoLive CS2
  • Photoshop CS2
  • Web Developer Toolbar - Firefox
  • ColorZilla - Firefox

Services Used

-Torsten Lyngaas
Digg: ivc- / E-mail Address