Wayback Follies
As much as I'd like to say I'm a fanatic of forward-thinking, I am a lover of history. This story in particular focuses on the history of the sites I visit, and why I seemed to get the short end of the stick.
In my freshman year of college, I got infected by my first virus and, oddly, it corrupted all my CSS and JPG files. So pretty much any site I had designed from 2000 to 2002 was gone. Sure, I still had the code, but it wasn't the same without the images. Fast forward to last year, when the hard drive I vowed would never die--died. All my work from that point up until 2003 was gone. Yes, I literally cried that day because everything I had made from 2000 to 2003 was lost to eternity. So sure, I claim that I have 5 years of history, but I don't have anything to show for it. "What if people think I'm lying?" I thought.
I then turned to the supposed "king" of history gathering, the Wayback Machine. For those of you who are not familiar with the Wayback Machine of Doom, it's a program hosted by the Internet Archive that allows you to see past versions of a given website. According to their Frequently Asked Questions, Alexa Internet had a large part in the creation of the archive (maybe that's why it sucks).
Alexa Internet has been crawling the web since 1996, which has resulted in a massive archive. If you have a web site, and you would like to ensure that it is saved for posterity in the Internet Archive, and you've searched wayback and found no results, you can visit the Alexa's "Webmasters" page at http://pages.alexa.com/help/webmasters/index.html#crawl_site. [...] Sites are usually crawled within 24 hours and no more then 48. Right now there is a 6-12 month lag between the date a site is crawled and the date it appears in the Wayback Machine.
Geez, a six-month to one-year lag before anything appears. Well, that's the least of my worries. I have never had good results with either Alexa's Traffic Rankings (for example, Alexa says that only one site links back to me) or the Wayback Machine, so that's where the "short end of the stick" comes from. But I'm going to focus on how the Wayback Machine has handled past Avalonstars.
To put it bluntly, not one site I have done has been completely archived. I have never played with the robots.txt file, so Alexa can't come at me with that excuse. If you look here, you'll see that the site hasn't been archived since February. Worse than that, Alexa seems to like archiving my site when it's down, like here, here and here. On top of that, Wayback likes to hide my archives from me every so often, and clicking on a link will yield the mother of all 404 pages.
But it hasn't been all bad. It has saved (though not perfectly) some of my old sites:
- September 24, 2002 - Avalonstar 11: Autobiographical Emissions
- November 30, 2002 - Avalonstar 12: Chaotic Soul
- June 4, 2004 - Avalonstar 15: The Moon of Syn
I don't know if I'll ever master that place. Some sites (like Google and Amazon, and very nearly one of Avalonstar's versions) have been archived perfectly, and frequently at that. Discrimination? I really don't know, and I won't try to guess. But if you have had success with this thing, let me know about your experience. I don't want this version suddenly lost to the web's black hole.