It’s been widely noted that the typical website lasts roughly three to five years. One of the goals of SUP’s Mellon grant is to mitigate that inevitability by exploring a range of preservation approaches for the web-based works we’re publishing. While documentation is a necessary component of archiving digital content, an ideal archive would also offer readers the ability to fully experience the interactive qualities of a digital scholarly work. So I’ve spent the past week learning about what some people are doing to make digital content accessible, even after its three-to-five-year countdown has expired.
Just as last week’s blog post on documentation was going live, I was checking in at the 2017 Joint Conference on Digital Libraries, a meeting with a heavy focus on web archiving. If documentation offers a way to chronicle the experience of interacting with digital content that is no longer accessible, web archiving seeks to capture a snapshot of a website, even if it’s not quite as interactive as the original. Take, for example, the Wayback Machine, a project of the Internet Archive that crawls websites and records them so that readers can ostensibly visit a version of a webpage as it appeared on a specific date in history. It’s an important resource, especially at a time when information can disappear overnight, as if it never existed.
At first glance, web archiving appears to be the perfect solution for capturing and saving web-based content. But it has its limitations.
The big focus still seems to be on accurately capturing news, social media content, and scientific data for historical reasons, and rightly so. It’s becoming more important than ever to ensure that history and science are not manipulated, misrepresented, or altogether deleted. Hopefully, advances in web archiving will continue and will also benefit the scholarly content we’re publishing, so that academic perspectives and arguments are preserved along with cultural and scientific data. Scholarly discourse is happening online, and it isn’t just CNN and Twitter that need to be recorded. We also need to capture and preserve the voices that are interpreting, analyzing, and making sense of this content and publishing their insights in digital formats.
Jasmine Mulliken is Production and Preservation Manager, Digital Projects, at Stanford University Press. She coordinates the production and preservation workflow of born-digital projects, including recommending platforms and coding standards to authors, consulting with authors on projects’ technical attributes, and evaluating best practices for archiving and preservation.