Saving our digital history byte by byte
July 7, 2016
The following is an insight from the Digital Archives team here at State Records with a fascinating look back at our recent digital history.
On 30 June 1992, First State Computing transferred the database that had been used in the recently ended Royal Commission into the Former Chelmsford Private Hospital and Mental Health Services in New South Wales (commonly known as the Chelmsford Royal Commission) to the then Archives Authority of New South Wales. The records of the Chelmsford Royal Commission were retained as State archives under the Disposal Recommendation DR4207 that was approved on 3/12/1990. First State Computing had held the database since the closure of the Commission in 1990. The transfer resulted in a substantial cost-saving to Government at the time.
The database ran on STATUS (STATUte Search), a system capable of full-text retrieval. The origin of STATUS dates back to the 1970s, when Bryan Niblett, a computer scientist and barrister, developed a retrieval program for the United Kingdom Atomic Energy Authority to enable searching of atomic energy legislation. The following quote sums up what was thought of STATUS at the time:
Just as the space program has given us Teflon, atomic energy has given us STATUS.
Over time the program proved less inspiring as a database, but it did show itself to be a leader in search language programming.
In 1977 an academic from the Australian National University, who had recently returned from the United Kingdom with a copy of the software, piqued the Attorney General’s Department’s interest in using the system. In 1981 the Standing Committee of Attorneys-General (SCAG) announced that the Attorneys-General of all the Australian States had agreed that they would only permit statutes and cases from their jurisdictions to be included in systems which conformed with the database design and user interface of the STATUS software. The benefit of using STATUS was that it was capable of running on a number of different platforms. Subsequently the STATUS software was used for the Chelmsford Royal Commission. This enabled Commission staff to search the transcript, submissions and other supporting documents electronically, removing the need to maintain an extensive and unmanageable index.
A personal computer (PC) running MS-DOS was considered the most effective solution when STATUS was transferred to the Archives Authority, as office staff were familiar with this environment. The database was the main entry point into the physical records. The CRC database was downloaded to an appropriately configured standalone PC. The software is a 16-bit application and was kept running to provide access to the Commission's records until late last year, when a digital migration project was agreed upon to migrate the database into the DSA using the State Records Migration Methodology.
Decisions needed to be made
Accessibility was the first consideration of the project. Because STATUS is a 16-bit application, which 64-bit Windows cannot run directly, a 32-bit Windows virtual machine was required in order to run an instance of the software on our 64-bit Windows PCs. Migrating only the original binary code into the DSA would have limited usage and access considerably, to those familiar with emulation (using a host system to run a guest computer system). This would be possible but not practical for a public archive. The decision was therefore made to extract the data in a usable format from a then unknown structure.
An investigative search for STATUS on Google provided a very important link to an adjunct professor from the Australian National University, a computer scientist who had used STATUS in the past and happened to have kept the file structure in hard copy in a box under his desk. The structures were quite simple: fixed-length records with relative block addressing. The database was made up of three files, the critical one being the text file, which held all the structures together. The text was arranged in chapters, with each chapter containing articles. Using this newfound information and working with our systems programmer, we were able to create code to extract complete chapter blocks and/or individual article blocks as plain text files for the various structures.
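To illustrate what reading fixed-length records with relative block addressing can look like, here is a minimal Python sketch. It is not the code written for the project, and it does not reproduce the real STATUS file layout, which is not described here. The block size, the header fields (a next-block number and a payload length) and the function names are all assumptions made for the example.

```python
import struct

BLOCK_SIZE = 64                 # hypothetical fixed record length in bytes
HEADER = struct.Struct(">IH")   # hypothetical header: next block number, payload length


def make_block(next_block: int, payload: bytes) -> bytes:
    """Build one fixed-length block (used here only to create sample data)."""
    if len(payload) > BLOCK_SIZE - HEADER.size:
        raise ValueError("payload too large for one block")
    body = HEADER.pack(next_block, len(payload)) + payload
    return body.ljust(BLOCK_SIZE, b"\x00")


def read_block(data: bytes, block_no: int) -> tuple[int, bytes]:
    """Return (next block number, payload) for a 1-based block number."""
    # Relative block addressing: the byte offset is derived from the block number.
    offset = (block_no - 1) * BLOCK_SIZE
    if block_no < 1 or offset + BLOCK_SIZE > len(data):
        raise ValueError(f"block {block_no} is outside the file")
    next_block, length = HEADER.unpack_from(data, offset)
    start = offset + HEADER.size
    return next_block, data[start:start + length]


def extract_text(data: bytes, first_block: int) -> str:
    """Follow a chain of blocks (0 marks the end) and join the payloads as text."""
    parts = []
    seen = set()
    block_no = first_block
    while block_no != 0:
        if block_no in seen:
            raise ValueError("cycle in block chain")
        seen.add(block_no)
        block_no, payload = read_block(data, block_no)
        parts.append(payload)
    return b"".join(parts).decode("ascii", errors="replace")


# Sample data: an "article" stored in blocks 1 and 3, with an unrelated block 2.
sample = (
    make_block(3, b"Article 1: ")
    + make_block(0, b"unrelated")
    + make_block(0, b"opening remarks.")
)
print(extract_text(sample, 1))
```

In a sketch like this, each extracted chapter or article would then be written out to its own plain text file. Because every block has the same length, a block number is enough to locate it, which is part of why such structures are simple to decode once the layout is known.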
The STATUS database used by the Commission was commonly known as the CRC database; its full title was the Chelmsford Hospital Royal Commission Transcript Enquiry System. The CRC database held five complete databases: the daily transcripts from the public hearings, documents, submissions, documents used for the second term of the Commission, and the report. The resulting digital artefacts will be plain text files, which will be stored in perpetuity in the DSA. They will be text searchable, just as the content was within STATUS. The original binary code will also be held in the DSA for the digital archivists of the future who would like to emulate and play with a keystroke-driven 16-bit program in all its original glory.
Further work is underway, and more decisions are to be made on how the plain text files will be presented, either by article or by chapter, depending on the type of content held and the access directions covering that content. Parts of the Commission's records are covered by closed access directions; however, an edited transcript of the hearings and the published report will be made available under early access directions. These will be accessible on the State Records website in the near future.
It is critical to migrate and transform the old systems that supported government entities in conducting their business and that are required as State archives. Doing this sooner rather than later is vital to preserve information that might otherwise be lost. Obsolescence is here and will increase as time goes on.