Although I cannot make the thesis itself publicly available at this stage, I think it's worth closing out this blog with some reflections on what worked well about the tools and techniques that I used, and on some areas that I might do differently a second time around.
What worked
Version control
This was really key for me, and helped me keep everything in good shape throughout the work. I wrote about the technique I used and a few tips and tricks for the software (Bazaar) that I used. I know that plenty of people get through by making manual backups and using clever naming conventions or similar, but I wouldn't want to work without proper version control.
LaTeX
I'd used LaTeX before and knew it was pretty powerful, but I really enjoyed using it to write my thesis (and a few papers along the way). It is really versatile and, although it has plenty of quirks and idiosyncrasies, it can be moulded to do almost anything that you could want. There is also a wealth of information out there on the web to help with getting it to do what you want. Although it might seem like a steep learning curve at first, it's worth it in the end.
Referencing
I'm aware that there are more sophisticated web-based solutions around for reference management (Mendeley, for example), but I found that my large folder of PDF files, all referenced from a BibTeX file and managed using cb2Bib, was a pretty good approach.
Figures
I developed an approach that allowed me to recompile everything in my thesis from the raw data. This was certainly a hassle at times, as writing a script to lay out a plot definitely takes more time than laying it out by hand, but overall I think it saved me time. When making changes late on in the process, I was able to do things like change colour schemes and add a grid to all my figures relatively easily, without having to manually re-edit everything. Having said this, I might consider changing the software that I used for the job in future - see below.
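As a rough illustration of why scripted figures make late changes cheap, here is a minimal sketch in JavaScript. It is not the original Matlab code, which isn't reproduced here; the `style` object, the `figures` list and the `buildFigure`/`buildAll` helpers are all hypothetical names invented for the example. The point is only that the figure definitions hold data, while colours and grid settings live in one shared place.

```javascript
// Hypothetical shared style: every figure takes its colours and grid setting from here.
const style = {
  palette: ["#1b9e77", "#d95f02", "#7570b3"],
  grid: true,
};

// Hypothetical figure definitions: data only, no styling decisions.
const figures = [
  { name: "fig1", series: [[1, 2, 3], [2, 4, 6]] },
  { name: "fig2", series: [[5, 3, 1]] },
];

// Combine one figure's data with the shared style into a render-ready description.
function buildFigure(figure, sharedStyle) {
  return {
    name: figure.name,
    grid: sharedStyle.grid,
    series: figure.series.map((values, i) => ({
      values,
      // Cycle through the palette if there are more series than colours.
      colour: sharedStyle.palette[i % sharedStyle.palette.length],
    })),
  };
}

// Rebuild every figure in one pass.
function buildAll(allFigures, sharedStyle) {
  return allFigures.map((figure) => buildFigure(figure, sharedStyle));
}

const built = buildAll(figures, style);
console.log(JSON.stringify(built, null, 2));
```

Editing `style.palette` or `style.grid` and re-running the script changes every figure at once, which is the property that made late changes manageable.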
What I might do differently
Not use Matlab for figures
Don't get me wrong, I'm a long-time Matlab user and think it's great, but in trying to produce figures that looked as good as possible I feel like I really pushed it to its limits. What started out as some relatively basic plotting scripts, using mostly built-in functionality, evolved over the course of the work into really quite complicated beasts. In some instances I had multiple sub-plots per figure, all using modified locations, with stacks of up to 4 sets of axes in each plot (in order to achieve both axis breaks and different colour grids). These turned out fine in the end, but were quite a frustration to get right.
Since completing the work I have been increasingly producing my visualisations in JavaScript. This retains the scripted nature of the source code and the ability to regenerate plots as the data changes; it produces/exports vector graphics in a standard format; and it is increasingly common and well supported (both in terms of software and community) on the web, with libraries such as D3 enabling some really awesome stuff. In addition, I think it has the following advantages over Matlab plots:
- It is an open language that is supported in all major web browsers and is therefore about as portable as you can get.
- It is less constrained than Matlab, as you have access to the raw line drawing commands.
- It enables a much greater level of user interaction in the results, with tooltips, data selection, animation and linked plots all being achievable.
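To make the "raw line drawing commands" point concrete, here is a minimal sketch that builds an SVG line plot as a string using nothing but plain JavaScript. The function names (`scale`, `linePlotSvg`) and the sample data are made up for illustration, and a library such as D3 would normally handle the scaling and DOM work. Each marker carries an SVG `<title>` element, which browsers typically show as a tooltip on hover.

```javascript
// Map a data value linearly onto a pixel coordinate.
function scale(value, domainMin, domainMax, rangeMin, rangeMax) {
  if (domainMax === domainMin) return (rangeMin + rangeMax) / 2; // avoid dividing by zero
  const t = (value - domainMin) / (domainMax - domainMin);
  return rangeMin + t * (rangeMax - rangeMin);
}

// Build a standalone SVG document (as a string) for a list of {x, y} points.
function linePlotSvg(points, width = 300, height = 200, margin = 20) {
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  const xMin = Math.min(...xs), xMax = Math.max(...xs);
  const yMin = Math.min(...ys), yMax = Math.max(...ys);

  const px = (x) => scale(x, xMin, xMax, margin, width - margin);
  // SVG y coordinates grow downwards, so the range is flipped.
  const py = (y) => scale(y, yMin, yMax, height - margin, margin);

  // Raw path commands: M = move to the first point, L = draw a line to each later one.
  const d = points
    .map((p, i) => `${i === 0 ? "M" : "L"}${px(p.x)},${py(p.y)}`)
    .join(" ");

  // One circle per data point, each with a <title> child for a hover tooltip.
  const markers = points
    .map((p) => `<circle cx="${px(p.x)}" cy="${py(p.y)}" r="3"><title>(${p.x}, ${p.y})</title></circle>`)
    .join("\n  ");

  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">
  <path d="${d}" fill="none" stroke="black"/>
  ${markers}
</svg>`;
}

const svg = linePlotSvg([{ x: 0, y: 0 }, { x: 1, y: 2 }, { x: 2, y: 1 }]);
console.log(svg);
```

Saving the printed output to a `.svg` file gives a vector graphic that opens directly in a browser, and re-running the script with new data regenerates it.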
Make my thesis more 'sectioned'
I struggled to get a consistent story that I wanted my thesis to tell until very late on in the process. Consequently the sections I wrote got juggled around and rewritten a few times. The techniques I used for writing it certainly helped in allowing me to play with the structure, but I feel like I could have done better in this regard. It's always easy to think like this in hindsight, but I would like to have had the content broken up in a more reusable manner.
Unfortunately, having reusable sections of content is almost the opposite of having the consistent and coherent narrative that a good thesis (allegedly) requires. I don't have a solution to this at present and I'm not sure I ever will, but it's certainly something I would think more about a second time around (not that there will be one).
Make my thesis more engaging
It upsets me that I put a great deal of effort into something that perhaps only 4 other people will ever actually read through in its entirety. It also doesn't seem to me to be the best use of the funding that I received. Nearly 4 years ago I proselytised about future documentation methods, and I think we've since seen a lot of progress in this area, particularly on the web. I would love to have been able to present my thesis in a novel and engaging format, but unfortunately I would still be working on it now if I had tried to go down that route (and also still arguing with supervisors and examination boards about accepting it).
I think it needs someone who is well on top of their work to forge ahead in this area, and produce a truly dazzling thesis that sets a standard for others to copy. With the right approach I believe it should be possible to produce something with the technical depth and quality to merit the award of a PhD, whilst still being accessible to the lay reader, perhaps through staged levels of detail, interactive facilities or similar. Once someone has proved that it can be done and it's "gone viral" or whatever, I feel like the overall approach will be replicable. Unfortunately, changing the overall PhD thesis paradigm, whilst simultaneously studying for and obtaining a PhD, is probably a bit much for most students!