Digital Forensics for Libraries and Archives June 17, 2016

The State Library of South Australia in Adelaide recently hosted a two-day digital forensics course called “Digital Forensics for Libraries and Archives: Principles, Practices and Possibilities”. Presented by Cal Lee from the University of North Carolina, it was designed to provide an introduction to digital forensics activities for institutions in the GLAM (Galleries, Libraries, Archives and Museums) sector. Representatives from various libraries and archives around Australia attended, including yours truly from State Records NSW.


Cal is well-known for his work in this field – he’s presented at many conferences and has authored or co-authored many papers on the subject, as well as being one of the key people behind the BitCurator project.

The participants had a range of experience, from those who had heard about the subject of digital forensics in cultural institutions and were interested in it, to those who work at institutions with a digital archive and have some familiarity with the tools and processes.

The course gave a broad overview of the hardware and software tools that may be required by institutions that want to set up a digital archive, or want to collect material which is either born digital or has been digitised. When acquiring digital material, the goals are the same as when acquiring analogue material – to ensure the integrity of the material, allow users to make sense of the materials and understand their context, and prevent inadvertent disclosure of sensitive data.

On the first day, participants familiarised themselves with the components that may make up a forensic workstation. Hardware, depending on the requirements of the institution, may include multiple hard drive bays, optical drives, floppy drives (3½″, 5¼″, even 8″!), card readers and Zip drives. Special hardware may be required to interface between old hardware (e.g. 5¼″ disk drives) and new PCs, such as the FC5025 or Kryoflux. To prevent ‘polluting’ the source disk or drive, hardware write blockers may be used. These are special devices which sit between the acquiring computer and the source disk or drive, and ensure that data can only be read from the source, with nothing written back or changed. The use of such devices can help to fulfil the principles of provenance, original order and chain of custody.

There are similarities between the digital forensics tools and techniques used by law enforcement agencies and cultural institutions. However, for those in cultural institutions, digital forensics is not about solving crimes or catching people out. It is about capturing data from places where it’s not immediately visible (such as EXIF metadata stored in images, or user data stored in the Windows registry) so that it isn’t lost; ensuring the actions taken don’t change the source data irreversibly; recognising which data is most likely to be changed; and documenting what has been done so it’s clear to others.

The technical topics and activities covered over both days included:

  • the use of commercial and open-source forensic software to create and mount disk images
  • common disk image formats and the advantages/disadvantages of using each
  • opening files in a hex editor to locate and view metadata
  • using file format identification tools
  • examining the characteristics and differences of file systems such as FAT, NTFS and HFS+
  • hashing algorithms, checksums and why they are used
  • most of the tools contained in the BitCurator virtual machine (e.g. the bulk_extractor tool, which can identify particular patterns in data, such as phone numbers, email addresses, URLs, domains and IP addresses)
  • an introduction to regular expressions, and why they may be useful for finding particular patterns in data
  • an introduction to the Windows and Linux command-line interfaces, and why they would be used (e.g. automating workflows by piping the results of one command into another)
  • the creation of digital forensics XML files
  • the Windows registry and Windows security identifiers (SIDs)
  • legal and ethical concerns surrounding digital forensic examination, and drawing inferences from examination that could be right, but could also be wrong!
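
To give a flavour of the hashing item in the list above, here is a minimal Python sketch. It is an illustration only, not taken from the course materials. The function name `sha256_of_stream` and the in-memory "disk images" are assumptions made for the example; in practice the stream would be a disk image file opened in binary mode.

```python
import hashlib
import io

def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a binary stream, read in chunks.

    Reading in chunks keeps memory use small, which matters for
    multi-gigabyte disk images.
    """
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

# Hypothetical stand-ins for a source disk image, a faithful copy,
# and a copy in which a single byte has changed.
original = io.BytesIO(b"example disk image contents")
faithful_copy = io.BytesIO(b"example disk image contents")
altered_copy = io.BytesIO(b"example disk image contentz")

original_hash = sha256_of_stream(original)
print(original_hash == sha256_of_stream(faithful_copy))  # True: hashes match
print(original_hash == sha256_of_stream(altered_copy))   # False: data differs
```

Recording the hash at acquisition and recomputing it later is one way to show that a disk image has not changed in the meantime.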

As you can see, it was a jam-packed couple of days! Cal covered all these subjects to a greater or lesser extent in a clear and friendly manner.
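
As a small illustration of the regular expressions topic in the list above, the following Python sketch searches some text for email addresses and IPv4 addresses. The patterns are deliberately simplified assumptions for this example (they will miss some valid values and accept some invalid ones), and this is not how bulk_extractor itself is implemented.

```python
import re

# Simplified, illustrative patterns; real-world matching needs more care.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def find_patterns(text):
    """Return every email-like and IPv4-like string found in the text."""
    return {
        "emails": EMAIL.findall(text),
        "ipv4": IPV4.findall(text),
    }

# Hypothetical text, standing in for strings recovered from a disk image.
sample = "Contact jane@example.org or see host 192.0.2.10 for details."
print(find_patterns(sample))
```

A search like this can help flag potentially sensitive personal data before material is made available to users.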

Photo credit: jon crel – tableau usb write blocker (CC BY-ND 2.0)
