31/07/2012

My referencing process

I suspect everyone has a subtly different way of doing this, depending on the tools they prefer and the way they like to work. I thought I would document my process as I think it's pretty efficient and has some real advantages, although I'm sure there is still plenty of room for improvement.

Overall aim

As an academic I often need to refer to work done by others. This is normally done by noting their published work as a list of references or bibliography at the end of my documents. The most basic way of achieving this would be to have a stack of published books and papers in my drawer that I can refer to, and then type in the reference details at the end of each document I want to write. I'm sure that plenty of people do work this way, but from my point of view it's pretty inefficient (lots of printing, lots of sorting, lots of typing, not very portable, an awful pain to change reference styles, etc).

I manage my references entirely in software using a few specific tools. I've mentioned most of them in other posts but I'll go into a bit more detail here on the actual process I use. 

I'm going to split referencing into two separate processes. Firstly, as I'm researching a topic, I tend to gather references to get an idea what I'm doing. Secondly, when I come to document my work I need to search and cite the references I've found.

Tools

Google Scholar is generally my primary resource for finding papers and documents. I also rely on standard Google searches a massive amount and I have a Google alert set up to email me when a few key phrases appear in new articles added to the web.
IEEE Xplore generally has most of the published work (papers, journal articles, etc) that I need to refer to. The University has a subscription to this that allows me to download what I need (otherwise I'd have to pay!).
PDF is how almost all papers are delivered and the format that I keep them in. I use Foxit Reader to read PDF documents. I keep all my reference documents in one big folder rather than worrying about any kind of complex filing system.
I use Bazaar to version control my folder of PDF documents. I've previously discussed how this works and how it allows me to work between different computers, even without installing any software on them.
I use a tool called cb2bib to maintain a list of references in BibTeX format. I started off just using this to add references to my BibTeX file, but I've found it to also be really good for browsing references and citing whilst writing a document. I changed some of the default setup to help it retrieve data from the net.
I use LaTeX to typeset most of my work. Within this I can simply point it at the BibTeX file for all the details of the references. I have previously mentioned how to use a BibTeX file that is not in the same place as the rest of the LaTeX document.

Gathering references

  1. Search for the document that I need to find using Google, Google Scholar or general web browsing.
  2. Open and read the document to see if it looks relevant and useful. Assuming that it does...
  3. Download the document to my big folder of "third party" references. I tend to use the full title of the work as the save name of the document - this can lead to quite long file names, but it makes it a lot easier to find things!
  4. Add the saved file to my Bazaar version control system. This only takes a couple of clicks through the tortoiseBzr menus in Windows Explorer.
  5. Add the document to my BibTeX file using cb2bib. This is a pretty straightforward process:
    1. Open cb2bib. I have a keyboard shortcut set up for this (it should also remember which BibTeX file is being used)
    2. Click "import from pdf file"
    3. Click "select files"
    4. Select the PDF files saved previously (hold Ctrl to select a bunch at once)
    5. Click "process"
    6. The software will try to extract as much info as possible from the PDF file. This probably won't be enough, so...
    7. Click "Network query" to retrieve all the info about the file from the web (this usually works fine, but it's worth checking the results; you may need to give it the right title to start it off)
    8. Click "save" to add the reference to the BibTeX file
  6. The changed BibTeX file and added references will need to be "committed" to the version control repository.
UPDATE: I have now made a batch file that effectively takes the place of step 4 and runs after step 5. It can be run using "Alt"+"p" after a whole set of references has been input through cb2bib, and it adds them all to my version control repository.
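
As a rough illustration of the scriptable parts of the steps above, here is a minimal Python sketch. It is not the actual batch file, and the function names and the example folder name are made up for the illustration. The only Bazaar commands it relies on are the standard "bzr add" and "bzr commit -m".

```python
import re
import subprocess

# Characters that Windows does not allow in file names. Removing them
# keeps the same save name usable on both Windows and Linux.
INVALID_CHARS = re.compile(r'[<>:"/\\|?*]')


def title_to_filename(title, extension=".pdf"):
    """Turn a paper title into a save name (step 3 above)."""
    cleaned = INVALID_CHARS.sub("", title)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse leftover gaps
    return cleaned + extension


def bzr_commands(folder, message):
    """Build the add and commit commands for the references folder."""
    return [
        ["bzr", "add", folder],            # version any new files in the folder
        ["bzr", "commit", "-m", message],  # record the pending changes
    ]


def run_commands(commands, working_dir):
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=working_dir, check=True)
```

Something like run_commands(bzr_commands("references", "Add new references"), ".") would then add and commit in one go, assuming a hypothetical folder called "references" inside the working tree. A commit without file arguments records every pending change in the tree, so the updated BibTeX file is picked up at the same time.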

Citing references

  1. Start with a LaTeX document that has a pointer to the BibTeX file within it.
  2. Open cb2bib citer (I use a keyboard shortcut) and select the reference(s) required:
    1. [optional] Select the way I want my reference list displayed by pressing "a" (author), "j" (journal), "t" (title) or "y" (year). I find author is usually best.
    2. [optional] Filter the reference list by pressing "f" and then typing what you want to search for ("d" clears the search)
    3. Click on a chosen reference
    4. [optional] Press "o" to open the reference and read it
    5. Press "enter" to cite the reference. A small pen marker will appear next to it (in author view it will appear next to each author of the same paper). Multiple references may be cited by selecting each one and pressing "enter". "delete" clears all the selected references.
    6. Press "c" once all the references for citing are selected. This will close the citer window and copy the LaTeX text for the references to the clipboard.
  3. Paste the text into the LaTeX file to include the references at that point in the document.
Although that might seem like a lot of steps, it's really pretty straightforward once you get used to it. The only real difficulty is remembering the keyboard commands for cb2bib citer. With this setup I can take all my references with me between machines (even between Linux and Windows) and use the same process everywhere.
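
To make the end result of the citing steps a bit more concrete, here is a small Python sketch of the idea: pull the citation keys out of a BibTeX file and join the selected ones into one citation command. It assumes the text that ends up on the clipboard is a standard \cite{...} command, which is an assumption rather than a description of what cb2bib citer produces. The sample entries and function names are placeholders made up for the example.

```python
import re

# Matches the start of a BibTeX entry, e.g. "@article{example2010,",
# and captures the citation key.
ENTRY_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def citation_keys(bibtex_text):
    """Return the citation keys in the order they appear."""
    return ENTRY_KEY.findall(bibtex_text)


def cite_command(keys):
    """Format the selected keys as a single LaTeX \\cite command."""
    return "\\cite{" + ",".join(keys) + "}"


# Placeholder entries, not real references.
SAMPLE_BIB = """
@article{example2010,
  title = {A placeholder title},
}
@inproceedings{sample2011,
  title = {Another placeholder title},
}
"""

keys = citation_keys(SAMPLE_BIB)
print(cite_command(keys))  # \cite{example2010,sample2011}
```

The pasted command only resolves if the LaTeX document points at the same BibTeX file the keys came from, which is why step 1 matters.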

Bazaar GUI issue on Windows 7 - adding files using the command-line GUI

Due to an unfortunate hard drive issue I recently had cause to swap to a new drive. Along with this came an upgrade from XP to Windows 7 (64-bit). This was not a real issue, just a bit of hassle moving everything over and re-installing all my applications.

As part of the process I had to re-install Bazaar, which I'm using to version control all my work. This appeared to go fine, including the addition of tortoiseBzr to integrate the Bazaar commands into the Windows Explorer GUI. Unfortunately, when I came to use it with my work it gave me an error, I think due to the "special" folder structure in Windows 7. Here is a copy of the message I posted to the Bazaar user group:

Hi everyone, I hope someone can help me out.

I'm using Bazaar to version control a whole set of different files between a few different machines. On a previous windows XP machine I was able to use the whole user area as my repository and only add the files that I wanted to control.

I have just upgraded to a windows 7 machine (64bit) and there seem to be some issues with the preconfigured folders not being accessible by Bazaar. I have been able to commit files fine, but when I try to open a window to add new files it crashes as it tries to display them, with the error:

bzr: ERROR: [Error 5] Access is denied: u'C:/Users/welf/AppData/Local/Application Data\\*.*'
(It also seems to have issues with "My Music" and similar)

Any thoughts on why it's happening or a way round the problem?

(I suspect that it would work if I had the repository at a lower level without any "windows" folders, but that is not really the way I'd like to work. I don't actually need to version control anything in these folders, so if they can be skipped in some way that would be fine.)

It seems suspiciously similar to an outstanding bug here:
https://bugs.launchpad.net/qbzr/+bug/1012907
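
The "skip the folders that can't be read" idea from the message can be illustrated with a short Python sketch. This is only the general technique and says nothing about how Bazaar or qbzr actually scan folders. The list_readable name and its list_dir parameter are made up for the example (the parameter is there so the behaviour can be tried out without a genuinely locked folder).

```python
import os


def list_readable(root, list_dir=os.listdir):
    """Walk root and return (files, skipped), skipping unreadable folders."""
    files, skipped = [], []
    pending = [root]
    while pending:
        folder = pending.pop()
        try:
            names = list_dir(folder)
        except PermissionError:
            # e.g. "[Error 5] Access is denied": note the folder and move on
            skipped.append(folder)
            continue
        for name in sorted(names):
            path = os.path.join(folder, name)
            if os.path.isdir(path) and not os.path.islink(path):
                pending.append(path)  # visit sub-folders later
            else:
                files.append(path)
    return files, skipped
```

Calling list_readable on a user area would return the files it could see plus a list of the folders it had to skip, rather than stopping at the first "Access is denied".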

The general upshot is that I can't add new files to version control using the GUI, although I seem to be able to do pretty much everything else. I thought I would detail the workaround I'm currently using. It isn't too bad, but wasn't totally obvious, so I thought I would document it in case it was of use to anyone else:
  1. Right click in the folder containing the files and select the "Tortoise Bazaar" menu item, then select "Run command". A GUI window to allow you to run a command will open.
  2. Select "Core" after "Category"
  3. Select "add" after "Command"
  4. Click "Insert filenames...", a file selecting window will open.
  5. Select the files you wish to add (hold 'Ctrl' to select multiple files) and click "Open"
  6. Click "OK" and the status window should tell you that the files have been added
You're now at the same stage as if you'd added the files using the standard GUI. Of course you will still need to perform a commit before the files are part of the repository.

Hope that's useful to someone.


UPDATE: Discussion on a Bazaar email list confirms that the problem I'm seeing is due to the bug linked above. Unfortunately a fix does not look imminent, so I'll have to continue using the command-line method described.