First Hurdle: Data Archiving
Well, here I am, literally one day into my work on my thesis and I’ve hit my first hurdle: The data.
On the positive side of the equation, I don’t have to jump through the hoops of ethics clearance to get access to my data: It’s all been collected and is sitting in a locker waiting for me to use it.
The negative side comes in two parts, the short version and the long version. The short version: Some of it is unreadable.
The long version I’ll leave you to figure out yourselves from this photo I took (I’ll replace it with a scan to better illustrate the problem when I have a scan):
As you can (probably) see, the page is a photocopy of a page of written text, which in itself isn’t great but isn’t the end of the world. What is definitely a problem is that there should be a good deal more text on that page than is actually visible. Behind the blurry, smeared, faded areas used to be what I can only assume was legible text, which means that some of the original data for a language which is very quickly falling out of use is also fading from its written records.
This isn’t (by any means) the condition of the vast majority of the data I have at my disposal: only the earliest data is kept solely in written form, and the later data is in much better condition and easily decipherable in the form available to me. All the original materials are kept at AIATSIS (Australian Institute of Aboriginal and Torres Strait Islander Studies), which keeps a fairly comprehensive archive of language and other cultural material to save it for later use. I must admit that, given the state of photocopies taken a number of years ago, I seriously doubt the legibility of the original documents, which have had several more years to deteriorate. For me, this is a hurdle which will be overcome relatively easily: since I’m working on an honours thesis, I need to exclude a lot of data in any case, so I can leave out much of what is illegible.
The broader issue raised, I think, is about how data is stored, archived, and duplicated. All of the data I’ll be using for my honours thesis is in paper form, with very little available in the way of digital records on the language. This is a relic of the time in which most of the data was collected, and for that reason isn’t something that could easily have been helped.
Given that these are some of the few true records remaining of a moribund language, however, it seems that losing them purely because the original written records deteriorated would be a tragedy, both for anybody wanting to look at the language in any additional detail and for the Yanyuwa people, who may want to work with their language in the future.
I’m not at all saying that anybody has been negligent or lazy or anything else regarding this language data. For all I know, the original transcript pages are kept digitally as images in much better shape than the copies I have access to. It’s more a general observation about what could happen, and about the value that should be placed on these vitally important cultural records before it’s too late.

