Posts

  • Representative Samples of Yahoo Groups with Webrecorder

    In mid-October 2019, Yahoo announced that it will “no longer host user created content” on Yahoo Groups as of December 14, 2019, which is a rather euphemistic description of its plans to delete countless message histories, files, polls, links, photos, general attachments and more (full list). This history of Yahoo Groups, created over the course of 18 years of the service’s offering, was met with a lot of concern across its user base, the...

  • Self-Hosted Archival Embeds

    The embedded web page is everywhere: Many web sites, social media in particular, allow users to embed content (text posts, videos, map widgets, or full social media timelines) into other pages located outside of their respective originating services. This practice has become a cornerstone of online discussion culture and criticism, political journalism, and brand-building, and it helps provide services to visitors of a page. Ever since political discourse has largely moved to social media, news sites have embedded tweets,...

  • Web archiving, privacy and sustainability—discontinuation of anonymous capturing services

    Up until yesterday, Webrecorder.io offered users who were not logged in to the site the same powerful web capturing service as logged-in users. Anyone, without having to sign up for an account, was able to capture web pages and download the result as a WARC file. This service is now discontinued. Webrecorder.io can still be used free of charge, but users will have to sign up for an account and log in as registered...

  • Announcing Webrecorder API and WASAPI Support

    Over the years, Webrecorder has been developed as a fully API-driven application, with a web archiving backend and React-based frontend. We’ve been working on a spec for the full API and initial documentation. The API documentation is now available at: https://webrecorder.io/docs/api. The API includes all the functionality that is available on https://webrecorder.io/. WASAPI: A subset of the Webrecorder API provides initial support for WASAPI, an API for bulk data transfer from web archives, developed by...

  • wabac.js: Viewing Web Archives Directly in the Browser

    This post was originally written as a guest post for DSHR’s blog. Some of these ideas were also discussed in this Twitter thread. As the web has evolved, many types of digital content can be opened and viewed directly in modern web browsers. A user can view images, watch video, listen to audio, read a book or publication, and even interact with 3D models, directly in the web browser. All of this content can be...

  • Webrecorder on the Desktop

    Create web archives on your computer, from anywhere. We’re thrilled to announce an exciting new development for Webrecorder: the release of our new Webrecorder Desktop App, available as a standalone application for Mac, Windows and Linux. Since 2015, the Webrecorder project has offered special read-only software for browsing web archives offline, the Webrecorder Player. The Webrecorder Desktop combines the ease of use of the Player app and brings the full Webrecorder (including the...

  • Introducing Webrecorder Autopilot

    With this latest release, Webrecorder introduces a new way to enable users to capture technically complex web sites: Autopilot. Autopilot can perform actions on the current web page loaded in Webrecorder, similar to a human user: clicking buttons, scrolling down, expanding sections, and so forth. It does so via “behaviors,” carefully written scripts that are adapted to the specific design and structure of certain web sites. To start, the Webrecorder team has prepared website-specific behaviors...

  • Improving Navigation in Web Archives

    If you don’t already know what you’re looking for, it can be difficult to find your way through web archives. Large collections created with automated tools, for example, can overwhelm visitors by requiring the entry of exact web addresses, showing huge calendars, or listing crawler seed URLs and metadata geared at experts. In short, so far, the UX (user experience) of accessing web archives has been heavily shaped by the technical parameters of capture. In...

  • First Webrecorder Community Call

    We’re excited to hold the first Webrecorder Community Call next Tuesday, Jan 29, 2019, at 11:00am EST. The call will include brief presentations and updates on the Webrecorder project and will be attended by the whole team. To attend the call, join us via the Zoom app at: https://zoom.us/j/320269375. Anyone interested in Webrecorder for any reason is encouraged to attend, no experience necessary! The call will start with a more general overview and we’ll also...

  • Capturing Responsive Design

    High Fidelity Web Archiving for Multi-Source Images. Have you ever visited the same web page using a bunch of different devices and commented to yourself about how crisp the images look regardless of the browser, screen size, and resolution used to view the page? Likewise, have you ever encountered that when capturing such a nice web page with your favorite web archiving tool and attempting to access said page right after, exactly these images appear...

  • A Prototype of Automated Web Archiving, Emulation and Server Preservation

    Web Archiving Automation: Beyond Crawling. Automation has been one of the most requested features for Webrecorder. Users want to keep Webrecorder’s high fidelity capture and replay capabilities, but make it less tedious to ensure full capture of larger sites. The traditional approach to automating web archiving is crawling, automatically visiting page after page until some condition is met. But successful crawling still requires careful user oversight and trial-and-error testing. Scope the crawler too narrowly, and...

  • Announcing Webrecorder and DAT Integration

    After the major UI overhaul in June, which introduced new narrative and curatorial capabilities to Webrecorder, the Webrecorder team has been exploring how to improve the sharing and distribution of web archives. In addition to high-fidelity capture and replay, sharing and distribution of web archives in a decentralized way has always been a core goal of Webrecorder. A few years ago, we launched the Webrecorder Player, a desktop tool that allows users to browse any...

  • Collaborating with the UK Web Archive to improve pywb

    We are thrilled to announce that we’ll be collaborating with the British Library to create a prototype for them to test pywb as the new replay engine for the UK Web Archive. Following up on our release of pywb 2.0 we will be making additional improvements to pywb to support the UK Web Archive use case. All of the work will be open source and the bulk of the work will be part of a...

  • Announcing pywb 2.0 release!

    We’re happy to announce that an updated release of pywb, the Python open-source web archiving engine that powers Webrecorder, has finally been released! The 2.0 release of pywb represents a major refactoring and improvement of pywb, which has become the core engine that powers much of Webrecorder’s functionality. New documentation is also available! Initially, pywb started out as a “Python Wayback” machine but has grown into a flexible, customizable framework for creating and replaying web archives....

  • We're hiring, join us!

    Want to help us make Webrecorder better and work on an exciting open source project with a small, distributed team? We are hiring a Senior Backend Developer. Candidates from traditionally underrepresented groups in tech are especially encouraged to apply! We’ve extended the deadline by a few days; please apply by Friday, January 19th, 2018. (For those who have already applied, thank you! We’ll be getting back to everyone in the next two weeks.)

  • First Post, Webrecorder Grant Announcement

    We’re thrilled to announce that the Andrew W. Mellon Foundation has awarded Rhizome $1 million for further development of Webrecorder. We’re also starting this dev and product blog to provide updates on upcoming new features and technical development of the project. Stay tuned!