Digital Awareness Month at State Records – the dream is over
December 17, 2012
During the final week of DAM2012, Richard Lehane talked about State Records’ Open Data Project and the development of a web API to State Records’ catalogue, Archives Investigator. The API won a Mander Jones Award for the best finding aid to an archival collection held by an Australian institution or about Australia.
The fourth and final newsletter for DAM2012 continued with this theme, looking at the ways in which archives are being made available online as well as the impact that ‘digital natives’ might have on archives. To find out what State Records employees were reading about during DAM2012, read on!
Users of archives increasingly want to access archival collections online, at any time and from anywhere. To meet this need, archival institutions are digitising their collections and providing access to the digital files online.
But how will users find archives online?
Even if all archives are digitised (extremely unlikely!!), users still need to be able to find archives of interest. Traditional archival catalogues are sometimes difficult for users to navigate, especially if users are unfamiliar with the ways in which archives are arranged and described.
The National Archives UK (TNA) recently launched Discovery, a new catalogue for searching across the TNA’s different databases and datasets. Using Discovery, users can search for items in the collection, browse hierarchies of government departments to find items of interest and search across user-generated tags.
The process of designing and developing Discovery involved detailed research into users, including their expectations, capabilities, limitations, preferences and context of use. TNA found that users have widely differing levels of expertise and motivations, and some have significant constraints on their behaviour.
To create a single interface that can satisfy both the seasoned archivist who knows how to navigate expertly through archival holdings and the retired grandmother who has no archival or subject knowledge, TNA turned to ‘persona-based design’. This approach uses fictional characters based on the actual observed behaviours of real users.
TNA interviewed and observed current and potential users to see how they use systems. TNA also looked at users’ daily routines to gain an in-depth understanding of their lifestyles and the challenges they face. Based on this research, TNA could identify patterns across user behaviour to create groups of customer types and the characters that reside within these groups.
TNA’s personas helped to determine which types of online services to prioritise. Would a proposed feature significantly improve the user experience of one or more personas? Would it compromise the user experience of another? Would it meet a need that is currently unmet? This approach enabled TNA to target its resources to focus on the different needs, goals and skill levels of a diverse user base.
The vast quantity of archives which will be made available online presents another challenge – finding archives of interest may be like finding the proverbial needle in a haystack.
The BBC has about a million hours of video and audio content in its archives, most of which is still on magnetic tape and film. The BBC plans to digitise and make most of this content available online by 2022.
Finding relevant content in such a large archive is likely to be a key research challenge, so the BBC’s Multimedia Classification project is researching automated techniques for extracting information about content (e.g. who’s in it, what it’s about etc.). This metadata will be searchable and will allow users to find content of interest.
The project is developing computer systems that can analyse content for video and audio characteristics (e.g. cuts, motion, luminance, faces, audio frequencies etc.). These characteristics are then mapped to a library of characteristics to identify key features in the content. For example, detecting scenes with head and shoulders shots of two people in a brightly lit room suggests that the content is a current affairs or news programme; detecting studio laughter suggests that it is a comedy, while explosions might indicate a thriller or action.
The project is also developing ways to analyse music to help identify the ‘mood’ of a programme. For example, music in a major key is likely to accompany ‘happier’ scenes than music in a minor key.
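To make the idea of mapping detected characteristics to a genre more concrete, here is a minimal sketch in Python. It is not the BBC's system: the feature names, the `SceneFeatures` class and the `guess_genre` function are all hypothetical, and the rules simply restate the three examples given above. A real classifier would work from measured signals and would almost certainly be statistical rather than a handful of fixed rules.

```python
from dataclasses import dataclass


@dataclass
class SceneFeatures:
    """Hypothetical characteristics detected in a scene (illustrative names only)."""
    faces: int = 0
    head_and_shoulders: bool = False
    brightly_lit: bool = False
    studio_laughter: bool = False
    explosions: bool = False


def guess_genre(scene: SceneFeatures) -> str:
    """Return a rough genre label using the three example rules from the text."""
    # Two people in head and shoulders shots in a bright room: news or current affairs.
    if scene.faces == 2 and scene.head_and_shoulders and scene.brightly_lit:
        return "news or current affairs"
    # Studio laughter suggests a comedy.
    if scene.studio_laughter:
        return "comedy"
    # Explosions might indicate a thriller or action programme.
    if scene.explosions:
        return "thriller or action"
    # No rule matched, so make no guess.
    return "unknown"


interview = SceneFeatures(faces=2, head_and_shoulders=True, brightly_lit=True)
sitcom = SceneFeatures(faces=4, studio_laughter=True)
print(guess_genre(interview))
print(guess_genre(sitcom))
```

The rules are checked in a fixed order, so a scene with both laughter and explosions would be labelled a comedy. That ordering is an arbitrary choice made for this sketch.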
What about users who are interested in archives held in different collections?
Another challenge facing users is that often they will be interested in archives relating to a particular subject or person which may be held in different collections. This usually involves searching several different catalogues on separate websites to find items of interest.
A number of institutions have collaborated to provide users with the ability to cross-search archival descriptions across institutions:
In the UK, the Archives Hub is an online gateway to the descriptions of archives held in UK universities, colleges and research institutes. It is the responsibility of Hub contributors to create and submit descriptions for inclusion on the Hub. Each month, the Archives Hub’s Features section highlights archives around a theme or topic, and includes images of selected archives and related web links. Past Features have covered a wide range of themes and topics, including recipes through the ages, tuberculosis, the Welsh in Patagonia and Charles Dickens.
Another site, AIM25, provides online access to collection level descriptions of the archives of over 100 higher education institutions, learned societies, cultural organisations and livery companies within the greater London area. Users can browse descriptions by repository, subject, personal name, corporate name or place name. So a user could browse descriptions of archives relating to apartheid, the London area of Shoreditch, Winston Churchill or Great Ormond Street Hospital.
The Social Networks and Archival Context (SNAC) project is a joint venture that includes researchers and developers at the Institute for Advanced Technology in the Humanities at the University of Virginia, the California Digital Library and the University of California at Berkeley’s School of Information. SNAC aims to bring together archival records in standardised form so that users can navigate among them and see the biographical and cultural contexts that disparate collections document.
The SNAC Prototype allows users to browse alphabetical lists of individuals, corporate entities or families to find ‘archival context records’ for them. For instance, the entry for Ella Fitzgerald provides:
- links to collections of the singer’s papers, photographs and music at the University of California and the Library of Congress
- links to other collections in which she is referenced
- a biographical timeline
- a list of occupations and subjects related to her life and work
- links to entries for people associated with her.
A user can also explore a person’s social and cultural environment with SNAC’s radial-graph feature. It creates a web, which can be manipulated, of a subject’s connections as revealed in archival records.
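As a rough illustration of the kind of data that could sit behind such a web of connections, the Python sketch below builds a simple graph from a list of collections and the names mentioned in each. This is an assumption made for illustration, not a description of how SNAC derives or stores its connections. The function `build_connections`, the collection identifiers and the placeholder names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_connections(collections):
    """Link every pair of names that appear together in the same collection."""
    graph = defaultdict(set)
    for names in collections.values():
        # Remove duplicates, then connect each pair of names in both directions.
        for first, second in combinations(sorted(set(names)), 2):
            graph[first].add(second)
            graph[second].add(first)
    return dict(graph)


# Placeholder data: which names are mentioned in which collection.
collections = {
    "collection-1": ["Person A", "Person B"],
    "collection-2": ["Person A", "Person C"],
}

graph = build_connections(collections)
for name in sorted(graph):
    print(name, "is connected to", sorted(graph[name]))
```

A radial display could then place one name at the centre and draw its connected names around it.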
What else can we do with archival descriptive data?
In his presentation, Richard Lehane talked about the way in which the data in Archives Investigator is being made available through the API. In theory, archival descriptive data could be used to create ‘mashups’ combining the archival data with data from other sources.
National Archives of Australia (NAA) data inspired some creative digital projects as part of the GovHack 2012 conference. This event involved about 300 developers and designers in more than 40 teams competing for $40,000 in prize money for using open government data to create new applications and ways of visualising statistical information.
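A mashup of this kind comes down to joining two datasets on a shared value. The Python sketch below assumes, purely for illustration, that archival descriptions arrive as simple records with a `place` field and that a second source supplies approximate coordinates for place names. The record titles, field names and the `mashup` function are hypothetical and do not reflect the actual data returned by the Archives Investigator API.

```python
def mashup(records, gazetteer):
    """Attach coordinates to each record whose place appears in the gazetteer."""
    combined = []
    for record in records:
        coords = gazetteer.get(record["place"])
        if coords is None:
            # Skip records that cannot be placed on a map.
            continue
        combined.append({**record, "lat": coords[0], "lon": coords[1]})
    return combined


# Hypothetical archival descriptions (not real catalogue entries).
records = [
    {"title": "Example series A", "place": "Sydney"},
    {"title": "Example series B", "place": "Parramatta"},
    {"title": "Example series C", "place": "Unknown"},
]

# A second data source: approximate coordinates keyed by place name.
gazetteer = {
    "Sydney": (-33.87, 151.21),
    "Parramatta": (-33.81, 151.00),
}

combined = mashup(records, gazetteer)
for item in combined:
    print(item["title"], item["lat"], item["lon"])
```

The combined records could then be plotted on a map, which is one common form of mashup.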
The entries making use of NAA data included:
- History in ACTION, a website that gives users the opportunity to create their own personalised bus tour exploring Canberra’s history
- PictureSwipe, an iPad app providing a searching and browsing experience across the NAA’s online PictureSearch database
- A Day in the Life, an app where a user can enter a date and location, and then see a point in time snapshot of what the world looked like then.
The rise of digital natives
The term ‘digital natives’ has been coined to describe people who were born during or after the widespread availability and distribution of computer technology. Digital natives grew up with the pervasive use of technology, continually demand new and improved uses of it, and quickly embrace and adopt new functionality. Digital natives demand delivery of their content anytime, anywhere and want to view and store that content using a variety of devices and technologies.
So what impact might these digital natives have on archives? Two American writers have made some suggestions:
Lawrence Wischerth has suggested that the increasing numbers of digital natives in the workplace may be the catalyst to bring about the ‘tipping point’ for paper. Wischerth suggests that as digital natives move up through the corporate ranks and replace ‘digital immigrants’ in key decision making positions, organisations will move away from traditional paper based business processes and quickly and comprehensively adopt paperless ones.
H. Larry Eiring reported on research to identify and assess the potential impact of digital natives on the information management profession at the ICA Congress in Brisbane. The initial findings of the research indicate that:
- the expectation of instantaneous, uninterrupted communication and unhindered access to information will drive continual improvements in communication and information technology
- individual expectations of personal information privacy will be deemed unrealistic, except for information that individuals keep solely in their own minds
- records and information legislation and regulations will evolve, influenced by the principles of the new ‘information individualism’
- the creation of virtual communities of interest will link each person to every other, enabling knowledge sharing and supporting collaborative solutions.
Email messages, blog posts, Facebook updates, tweets and online videos – these forms of digital communication are replacing handwritten letters, finely detailed diaries and professionally produced films. But will they ever make it into an archive?
In his keynote address to the 17th Brazilian Congress on Archival Science, Chris Prom suggested that we must develop a method for archivists and records creators to work together so that digital communications can be kept alive long enough to be accessioned to an archives. Although many people value their digital communications highly, Prom identified the broader information ecology as making it difficult for them to identify, capture and preserve the records that have the most long-term value.
Prom argues that people need practical tools and services to transform information held in dispersed systems into evidence of their activities. Most people use multiple communication or social media services – how can we bring personally created content together in one location and fix it into defined formats to preserve a fuller, more contextually rich picture of a person’s life and influence?
The myKive project is developing free, open source software for copying, preserving and managing personal social media records, email and desktop files. myKive seeks to assist people in controlling their personal digital legacy and provides them with the opportunity to preserve it permanently if they or their descendants wish to.
Tea room discussion points
- If State Records were to develop a new catalogue or other way to search the collection, which personas would we need to identify and analyse? And what capabilities and limitations might these personas have?
- Mashups have the potential to attract new audiences for archival descriptive data by combining it with data from other sources. What mashups would you like to see using the data from Archives Investigator?
- Digital communications are fragile and difficult to identify, capture and preserve. What impact might this have on the researchers of the future who are interested in the career and influence of today’s writers, artists, politicians etc.?