Showing posts with label bazaar. Show all posts
Showing posts with label bazaar. Show all posts

30/05/2013

More thoughts on thesis documentation...

It's over a year since I put together some thoughts about how documents (PhD theses in particular) get written up. Having just started on the third year of my PhD I need to start compiling all the little bits I've written in the last two years together into something resembling a final thesis format. I can then build this up into a body of work that I can present at the end. This raises the question - how to best go about this process?
First a few conflicting thoughts about the final document I need to produce:
  1. I'd like to write in short concise sections that will either stand on their own, or can be built up into a larger document. I'm concerned that this will not produce a coherent final document though.
  2. I'd like to setup the individual sections in a hierarchy that reflects the hierarchy of the systems they are describing. Unfortunately this is likely to produce far too many levels in the document (when I drew up a draft layout I got to 12 levels!).
  3. I'd like to describe the work down to a level that anyone (even a schoolchild) could understand, however this would probably end up being a massive document that would be too basic for most readers.
I think the problem I have, that lies behind all of this, is that the thesis is really designed to be a document for your examiners to read, cover to cover, in a format that they are familiar and happy with. It will appear printed on paper, be long and probably dull, and, as has been pointed out to me by other people and stated a number of times on the internet, it should "tell a story". A story that took 3 years to write and that a maximum of 3 people will ever actually read.

What I would prefer to write is a document that I could upload to the web and could be referenced and useful to anyone. I've spent some time thinking about how I might be able to cleverly combine the two, but I don't think I'll be able to write just one document to satisfy both aims. So my plan now is to write the thesis in a fairly conventional manner in order to satisfy the requirements for a  PhD, then to later revise this into a format that I prefer. Obviously this second part will be easier if I know what I'm planning during the first part, so here is my initial outline...

Thesis mark I

I'll be using LaTeX for the writing, in whatever document template I'm given (or have to make to fulfill my departments guidelines). I'll try to keep it split into relatively small chunks, but without going to too many levels of subheading (4 max). I plan to keep each chapter in its own subdirectory and include it using the \include command. Within chapters I will break things down again using \input commands to allow me to keep sections and subsections in their own individual tex files.
I'll try to use hyperlinks where possible using \usepackage{hyperref}, using the colorlinks option to avoid the default ugly boxes. By using a subtle colour for the links they shouldn't be too jarring in a printed report (where they will be no use) but obvious enough when read on a computer.
As previously discussed in this blog, I'll try to include figures from matlab and simulink as vector images, stored wherever they are generated. I'll also use a single bibtex reference file for all my references.
As with other work, ongoing revision of the thesis will be controlled using Bazaar. I can include the version used to produce output files within them using vc.tex, which is pretty neat. That way when I have multiple versions printed out and sent to people for comments I hopefully shouldn't lose track of which version they're commenting on!
Latex to Kindle?

Thesis mark II

Latex to html?
html to chd?

I never really finished writing this post, but it gives a flavour of what I was thinking a little while ago when I started writing in earnest!

02/08/2012

Adding bzr files through the cb2bib

I recently described my referencing process where I simultaneously hold reference pdf files in a version control system and keep track of their details in bibtex file. I have also described a problem I've been having with my version control system of choice, Bazaar.

Based on these I decided that there might be a better (more automated) way of adding my references. After a very useful email exchange with the author of cb2bib they confirmed that this should be possible. Now, after a day of messing around and learning about various command-line tools that I hadn't used before, I think I've got it working. Here's how...

First I created a new batch file that I called "bzrAddRef.bat". Here are its contents:


@echo OFF

rem --------------------------------------------------------------------------
rem cb2Bib Tools addition by J. Welford
rem --------------------------------------------------------------------------


echo   bzrAddRef:
echo   cb2Bib script for adding BibTeX files to a Bazaar repository
echo.
echo   Using sed and xargs utilities from:
echo   http://gnuwin32.sourceforge.net
echo.
echo   Path below may need changing to be within the repository
echo.


echo   Processing:
echo.
cd "C:\Users\welf\Documents\literatureReview"
sed -n -e "s|.*file.*=.*{\(.*\)}.*|\"\1\"|p" %1 > bzrRefs.tmp
xargs bzr add <bzrRefs.tmp
del bzrRefs.tmp

Only the last 4 lines are really important. The first one sets the working directory, I don't think it matters what you use as long as you have write access and it is within the repository you want to add to. The next line runs through the current bibtex file and extracts the location of all the references to a temporary file. The next line adds all these files to Bazaar repository. The final line deletes the temporary file. (maybe I could have used a pipe between commands so that the temporary file was not required?)


Some special commands are used that will need to be installed, what you need are: sed.exe (and its dependencies:  regex2.dll, libintl3.dll, libiconv2.dll) and xargs.exe.

Within the settings of cb2bib this batch file can now be pointed at under the "Configure BibTeX" - "External BibTeX Postprocessing" - "Command:" section. Once that is done simply hitting "Alt"+"p" in the cb2bib window should run the batch file and add all the references to version control.

I hope the helps someone! (I presume it could be altered for other version control systems or be called from other applications.)

31/07/2012

My referencing process

I suspect everyone has a subtly different way of doing this, depending on the tools they prefer and the way they like to work. I thought I would document my process as I think it's pretty efficient and has some real advantages, I'm sure there is still plenty of room for improvement though.

Overall aim

As an academic I often need to refer to work done by others, this normally done by noting their published work as a list of references or bibliography at the end of my documents. The most basic way of achieving this would be to have a stack of published books and papers in my drawer that I can refer to and then type in the reference details at the end of the each document I want to write. I'm sure that plenty of people do work this way, but from my point of view I see it as pretty inefficient (lots of printing, lots of sorting, lots of typing, not very portable, an awful pain to change reference styles, etc).

I manage my references entirely in software using a few specific tools. I've mentioned most of them in other posts but I'll go into a bit more detail here on the actual process I use. 

I'm going to split referencing into two separate processes. Firstly, as I'm researching a topic, I tend to gather references to get an idea what I'm doing. Secondly, when I come to document my work I need to search and cite the references I've found.

Tools

Google Scholar is generally my primary resource for finding papers and documents. I also rely on standard Google searches a massive amount and I have a Google alert set up to email me when a few key phrases appear in new articles added to the web.
IEEExplore generally has most of the published work (in terms of papers, Journal articles, etc) that I need to refer to. The University has a subscription to this that allows me to download what I need (otherwise I'd have to pay!).
Pdf format is generally how almost all papers are delivered and the format that I keep them in. I use Foxit reader to read pdf documents. I keep all my reference documents in one big folder rather than worrying about any kind of complex filing system.
I use Bazaar to version control my folder of pdf documents. I've previously discussed how this works and how it allows me to work between different computers, even without installing any software on them.
I use a tool called cb2bib to maintain a list of references in bibtex format. I started off just using this to add references to my bibtex file, but I've found it to also be really good for browsing references and citing whilst writing a document. I changed some of the default setup to help it retrieve data from the net.
I use Latex to typeset most of my work. Within this I can simply point it at the bibtex file for all the details of the references. I have previously mentioned how to use a bibtex file that is not in the same place as the rest latex document.

Gathering references

  1. Search for the document that I need to find using Google, Google Scholar or general web browsing.
  2. Open and read the document to see if it looks relevant and useful. Assuming that it does...
  3. Download the document to my big folder of "third party" references. I tend to use the full title of the work as the save name of the document - this can lead to quite long file names, but it makes it a lot easier to find things!
  4. Add the saved file to my bazaar version control system. This only takes a couple of clicks through  tortoiseBzr menus in windows explorer.
  5. Add the document to my bibtex file using cb2bib. This is a pretty straightforward process:
    1. Open cb2bib, I have a keyboard shortcut setup for this (it should also remember what bibtex file is being used)
    2. Click "import from pdf file"
    3. Click "select files"
    4. Select pdf files saved previously (hold ctrl to select a bunch at once)
    5. Click "process"
    6. The software will try to extract as much info as possible from the pdf file, this probably won't be enough so...
    7. Click Network query to retrieve all the info about the file from the web (this usually works fine, but it's worth checking the results, you may need to give it the right title to start it off)
    8. Click save to add the reference to the bibtex file
  6. The changed bibtex file and added references will need to be "commited" to the version control repository.
UPDATE: I have added now made a batch file that effectively takes the place of step 4 and occurs after step 5 - it can be run after a whole set of references have been input through cd2bib and adds them all to my version control repository using "Alt"+"p".

Citing references

  1. With a Latex document that has a pointer to the bibtex file within it.
  2. Open cb2bib citer (I use a keyboard shortcut) and select the reference(s) required:
    1. [optional] Select the way I want my reference list displayed by pressing "a" (author), "j" (journal), "t" (title) or "y" (year). I find author is usually best.
    2. [optional] Filter the reference list by pressing "f" and then typing what you want to search for ("d" clears the search)
    3. Click on a chosen reference
    4. [optional] Press "o" to open the reference and read it
    5. Press "enter" to cite the reference, a small pen marker will appear next to it (in author view it will appear next to each author for the same paper). Multiple references may be cited by selecting them and pressing "enter". "delete" clears all the selected references.
    6. Press "c" once all the references for citing are selected, this will close the citer window and copy the latex text for the references to the clipboard.
  3. Paste the text into the latex file to include the references at that point in the document.
Although those might seem like a lot of steps it's really pretty straightforward once you get used to it. The only real difficulty is remembering the keyboard commands for cb2bib citer. With this setup I can take all my references with me between machines (even between Linux and Windows) and use the same process everywhere.

Bazaar GUI issue on Windows 7 - adding files using the commandline GUI interface

Due to an unfortunate hard drive issue I recently had cause to swap to a new drive. Along with this came an upgrade from XP to Windows7 (64bit). This was not a real issue, just a bit of hassle moving everything over and re-installing all my applications.

As part of the process I had to re-install Bazaar which I'm using to version control all my work. This appeared to go fine, including the addition of tortoiseBzr to integrate the Bazaar commands into the windows explorer GUI. Unfortunately when I came to use it with my work it gave me an error, I think due to the "special" folders structure in windows7. Here is a copy of the message I posted to the Bazaar user group:

Hi everyone, I hope someone can help me out.

I'm using Bazaar to version control a whole set of different files between a few different machines. On a previous windows XP machine I was able to use the whole user area as my repository and only add the files that I wanted to control.

I have just upgraded to a windows 7 machine (64bit) and there seem to be some issues with the preconfigured folders not being accessible by Bazaar. I have been able to commit files fine, but when I try to open a window to add new files it crashes as it tries to display them, with the error:

bzr: ERROR: [Error 5] Access is denied: u'C:/Users/welf/AppData/Local/Application Data\\*.*'
(It also seems to have issues with "My Music" and similar)

Any thoughts on why it's happening or a way round the problem?

(I suspect that it would work if I had the repository at a lower level without any "windows" folders, but that is not really the way I'd like to work. I don't actually need to version control anything in these folders, so if they can be skipped in some way that would be fine.)

It seems suspiciously similar to an outstanding bug here:
https://bugs.launchpad.net/qbzr/+bug/1012907

The general upshot is that I can't add new files to version control using the GUI, although I seem to be able to do pretty much everything else. I thought I would detail the workaround I'm currently using. It isn't too bad, but wasn't totally obvious, so I thought I would document it in case it was of use to anyone else:
  1. Right click in the folder containing the files and select the "Tortoise Bazaar" menu item, then select "Run command". A GUI window to allow you to run a command will open.
  1. Select "Core" after "Category"
  2. Select "add" after "Command"
  3. Click "Insert filenames...", a file selecting window will open.
  4. Select the files you wish to add (hold 'Ctrl' to select multiple files) and click "Open"
  1. Click "OK" and the status window should tell you that the files have been added
You're now at the same stage as if you'd added the files using the standard GUI. Of course you will still need to perform a commit before the files are part of the repository.

Hope that's useful to someone.


UPDATE: Discussion on a Bazaar email list confirms that the problem I'm seeing is due to the error linked above. Unfortunately a fix does not look imminent therefore I'll have to continue using the command-line method described.

07/06/2011

Portable Version Control

Background
I've mentioned previously that I'm version controlling my work using distributed version control software - specifically Bazaar. I explained that part of the reason I'm doing this is because I work at a number of different locations, and I need to be able to manage various different versions of my work and keep them all in sync.
One of the locations I work at is a Laptop provided by my sponsor. Unfortunately their corporate system does not allow the installation of unauthorised software and Bazaar is not on their approved list. I did briefly look into using subversion (as it was used elsewhere in the company and linking between this and Bazaar is possible) however it looked like it was going to be too much hassle to justify getting it installed. I've therefore decided to attempt to run my version control purely from my flash memory. This will not only allow me to version control my work on my sponsors laptop, but also on any other machine I decide to use.

Incidentally it's worth mentioning that I'm storing all my work on the flash memory (micro-sd) card of my phone. I upgraded to an 8 Gb card which should be plenty (at least initially). I always carry my phone anyway  so all I need to retrieve my work at any time or location is a USB cable (or bluetooth if I'm desperate!). I don't treat this as a working copy, but I backup to it regularly and use it to transfer versions between machines. In version control terminology this means I have a 'branch' on each machine I work at and another 'branch' on my phone. I 'push' or 'pull' changes between the phone branch and the local copy branch to backup/transfer the work.

Bazaar is coded using the Python programming language. This is a pretty modern and, by most accounts, a pretty handy programming language; however it's not something that I'm particularly familiar with. The main benefit to me is that there is a variant of it available called Portable Python which is designed to run from a USB storage device (like my phone). This means that it is possible for me to install Portable Python on my phone, and then run Bazaar from that.

I was quite surprised not to be able to find any instructions for this setup on the net anywhere, as it seems like quite a handy proposition (certainly for someone in my situation). The only references that I could find to it were suggestions of using it here, and here. I'm far from an expert on this type of thing but I've sort of got it working - so I thought I'd publish the steps I took incase anyone has any feedback, or it is of use to someone else.

Unfortunately portable python is only windows based; however this is where I'm most likely to need it. (there's also a chance that it will work under linux through wine, but I haven't really experimented with that yet.)

Python Installation
The current version of Python is 3.2, however I don't think any of the Bazaar source is built on that version, so I went with 2.6 (I'm not sure if I could have gotten away with a newer version or not?). I downloaded Portable Python from here and copied it to a "python" folder on my phone. I then ran the installer and told it to install to the same folder on the phone.

Bazaar Installation
I tried installing Bazaar from some executables, unfortunately these complained that it needed python 2.6 in the registry, and gave me no option for manually specifying a python path. So instead I downloaded a tarball of the Bazaar source from here which I extracted to the Python "App" directory. I then opened a command prompt, navigated to the python "App" directory, and ran:
python bzr-2.4b3\setup.py install
(actually I was still using Linux at this point and instead ran SPE-Portable.exe under wine and then ran setup.py through that - I'm pretty sure it had the same effect though)
This ran for a while and terminated without any errors. Within the bazaar directory a "build" folder had now appeared.

From the command prompt I was then able to run :
python bzr-2.4b3\bzr status C:\test
where C:\test was a directory under version control. This gave the correct response, but it also gave me this warning that some extensions couldn't be loaded. I don't think this is a problem, so I ignored it (the link gives a method for turning off the warning if you're bothered about it).

Bazaar Explorer Installation
I wasn't keen on typing everything through the command line, so I looked into getting bazaar explorer running. The official instructions for this are here.
So I changed to the plugins directory of bazaar:
cd F:\python\PortablePython_1.1_py2.6.1\App\bzr-2.4b3\bzrlib\plugins

and ran bzr with the command to download from launchpad:
F:\python\PortablePython_1.1_py2.6.1\App\python F:\python\PortablePython_1.1_py2.6.1\App\bzr-2.4b3\bzr branch lp:bzr-explorer explorer
This warned me that I hadn't given a launchpad ID, but I don't think that matters as I'm not intending on writing anything back to launchpad. Then it went about downloading and building explorer. This process took a little while.

I then tried to run it, but there were a few dependencies that needed fulfilling. Firstly it told me I needed QBzr. So, still in the plugins directory, I ran:
F:\python\PortablePython_1.1_py2.6.1\App\python F:\python\PortablePython_1.1_py2.6.1\App\bzr-2.4b3\bzr branch lp:qbzr
this then went through the same process as with explorer.

On the next try it requested qt libraries (ERROR: No modeule names PyQt4). Some googling reveals this, which suggests installing it in a "non-portable" install and then copying selected files across. (I did try a proper install before this but got nowhere*). I downloaded "PyQt-Py2.6-gpl-4.5.4-1.exe" from here, and installed to C:\Python26. This created a folder in C:\Python26\Lib called "site-packages", I copied the contents of this to the Lib\site-packages folder of my portable python install.

I then tried running:
python bzr-2.4b3\bzr explorer
and up it popped!

Unfortunately a lot of the buttons bring up a deprecation warning for commands in plugins\explorer\lib\app_suite.py. This is something I haven't been able to fix yet. My guess is that it's due to a new version of explorer being used with an older version of Bazaar, but that's just a guess. If anyone can help with it then I'd love to hear from you!
It's still useful as a GUI for inspecting the changes and looking at differences; however any major commands need to be made from the command prompt.

Also this all took a while working from the phone flash card, next time I might consider doing it locally on the hard drive and then copying it across.

Please let me know if you know of any better methods for getting this working, or where I've gone wrong with getting explorer working. Hope this is of interest to someone.
I've also just come across another (easier looking) solution here.



* The copying method sounded like it might be a bit flaky, and I saw some sites indicating that it could be done in a more "conventional" manner, so I got the files for it from here. I also needed "SIP" (I tried it without but it said no). SIP can be obtained from the same place here, and extracted to the python/App directory. I then changed to the python directory and ran:
python sip-4.12.3\configure.py
after this had created a sip module Makefile it errored with unable to open siplib\siplib.sbf - not sure what this means, so I then tried:
python PyQt-win-gpl-4.8.4\configure.py
this said: "make sure you have a working qt v4 qmake on your path". I was a bit lost then so I gave up and tried the technique of copying across!