Trust no one? The truth is out there
August 12, 2016

Large-Scale Research in a Small Package

Much has been made of privacy concerns around the 2016 Australian census. Unfairly or not, much of the early discourse[1] centred on the Australian Bureau of Statistics’ intention to retain name and address information for four years. Many commentators have shared their views on whether they trust the government to use this information appropriately and to protect it from malicious use. Far less is being said about how the government will maintain this information.

Perhaps the public has an implicit trust in the quality of government digital records management, and assumes that digital records are and will remain authentic and unchanged over time. After all, information that is not fit for use is hardly fit for misuse. But records managers know that, like a swan gliding effortlessly across a lake, we are paddling furiously under the surface to make sure digital records are trustworthy, regardless of how they are used.

We tend to trust technology and automation

For digital records, authenticity and integrity are most assured when the technology is mature and the processes around it are highly automated, giving the user little opportunity to make mistakes. For example, when you receive an email from a government colleague, you assume that what you read is what your colleague typed.

Similarly, when you scan an invoice through a multifunction device’s document feeder, you are fairly confident that the resulting digital image is a good representation of the original. You might check the image to make sure it scanned the front of the page and not the back, or that it’s in colour if colour is important, but you probably don’t scrutinise it like a conscientious Justice of the Peace might. A glance at the image is usually enough to satisfy you that it’s fit for purpose and you can then destroy the original in line with GA 45[2].

Now consider applying an optical character recognition (OCR) process to that digitised image. How much do you trust the result? Would you be confident enough to lift off the text layer, save it as a plain text (.txt) file, and destroy the image file? Maybe not[3], particularly if you’ve had experiences with hilariously inaccurate OCR. But this doesn’t mean the process can’t be trusted: many organisations rely on OCR to process forms. It is simply a matter of bridging the gap between the emotive element of user (dis)trust and the technical trust we build into a process.

Metadata and documentation are evidence of trustworthy processes

How do we uphold trust in our records when the technology is not mature or the processes involve significant manual intervention and decision making? A good example is how we manage the digital transfer process. Databases come in all shapes and sizes, and our Digital Archives team manages migration projects on a case-by-case basis. In some cases we have to decide which parts of the database are State archives, to be retained in the permanent collection, and which are not. There are many decisions to be made and the structure of the database may change in the process.

To ensure that the database’s authenticity and integrity are demonstrable, we maintain comprehensive project documentation that describes the processes and decisions made for the migration. Because this information is persistently linked to the records it relates to, in a sense it forms part of the metadata. So when a researcher in the reading room asks if these digital records are trustworthy, they don’t have to take our word for it; we can show them.
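As a rough illustration of documentation that travels with a record, the sketch below pairs a checksum with a list of migration decisions. It is only an assumption about how such a link could be modelled, not a description of the Digital Archives team's actual tooling: the `MigrationRecord` class, the `document_migration` helper, the record identifier and the sample decision are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class MigrationRecord:
    """Hypothetical documentation entry kept alongside a migrated record."""
    record_id: str
    checksum: str  # fixity value captured at migration time
    decisions: list = field(default_factory=list)  # plain-language decisions

    def verify(self, data: bytes) -> bool:
        """True if the data still matches the checksum captured at migration."""
        return sha256_of(data) == self.checksum


def document_migration(record_id: str, data: bytes, decisions: list) -> MigrationRecord:
    """Capture a checksum and the decisions made while migrating the data."""
    return MigrationRecord(record_id, sha256_of(data), list(decisions))


# Illustrative content and decision only; not real archival data.
content = b"id,name\n1,Example\n"
entry = document_migration(
    "table-001",
    content,
    ["Dropped audit table: assessed as not a State archive"],
)

print(json.dumps(asdict(entry), indent=2))  # documentation linked to the record
print(entry.verify(content))                # content unchanged since migration
print(entry.verify(content + b"x"))         # content altered since migration
```

Keeping the checksum and the decisions in one entry means a later check can show both that the content is unchanged and why it has the shape it does.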

What does this mean for active digital records? The NSW Government is committed to an open government that is transparent and accountable. When agencies maintain good metadata and documentation, the public can have confidence that the records are trustworthy as a matter of course, and can focus instead on the information they contain.

[1] For example, see Longbottom, J. (2016) Census 2016: Privacy advocates say people’s names should not be retained. Retrieved from, 9 August 2016.

[2] General Authority 45 Original or source records that have been copied (GA45)

[3] Without suggesting that there would be any significant saving to digital storage costs for doing this


Image credit: ‘Large-Scale Research in a Small Package‘ by USFWS Mountain-Prairie is licensed under CC BY 2.0

