Advanced document editing with Subversion version control

April 20, 2010 by Daniel Stender
Filed under: Editing 

A revision control system (RCS), also called a version control system (VCS), such as Subversion archives the versions of your project’s files and restores them on demand; it also lets you track the differences between versions. You may know this kind of system from the revision history of Wikipedia articles, for example. Version control is an essential part of serious programming, but Subversion can also be used with publishing systems whose working documents are plain text rather than binary, such as TeX/LaTeX, Groff, and, with a caveat, those using the Open Document Format (OpenOffice.org Writer and others; note that .odt files are ZIP-compressed XML, so line-based diffs are only meaningful on the unpacked or flat XML form).

Instead of storing everything locally, Subversion and similar systems can also work with repositories on a remote server, which is the key to well-organised collaborative development: several clients receive their copy of the current version from the same server and submit their changes back to it (the name of Subversion’s forerunner, the “Concurrent Versions System”, CVS, refers to exactly this). Such a setup is ideal for maintaining a shared BibTeX database or similar resources in a research group. Even for a single user a server-based setup can be quite useful, for example when working on the same project on different machines or for always keeping a remote backup.

To get started, a simple local Subversion repository can be set up in a minute on your Linux machine, and there are even some LaTeX packages that cooperate with Subversion. I would like to give a little crash course on this here, together with some pointers to plugins and an outlook on the more advanced distributed systems (in case you are interested in the multi-user aspect). Everything works more or less the same way on Mac OS X and on Windows.

Working with a local Subversion repository

If it is not already installed on your system, first get Subversion from your package repository in the usual way: on Debian or Ubuntu, run apt-get install subversion (it won’t take long, the software is quite lean). Afterwards the main program svn and a few others, such as svnadmin, are available on the command line. To set up a local Subversion repository you first have to pick a suitable place for it; a hidden directory like $HOME/.svnrep should do well. The repository is created with: svnadmin create $HOME/.svnrep. After that you will see several items in that directory which you don’t have to care about: conf, db, format, etc. (these belong to the repository’s storage backend; Subversion has used its own FSFS backend by default since version 1.2, with Berkeley DB as the older alternative).

Now, suppose you are in the project’s working directory, which contains the file myproject.txt or something similar. The whole content is saved to the repository with: svn import ./ file://$HOME/.svnrep -m 'any comment' (import and commit always need a log message; if no -m is given, an editor opens). Note that $HOME already begins with a slash, so file://$HOME expands to a URL with the usual three slashes after file:. After that you can delete all of the directory’s content (in this case myproject.txt), or, if you would rather not, move it to /tmp/. Then run: svn checkout file://$HOME/.svnrep ./ and myproject.txt appears again, retrieved from the repository (together with some administrative files in the hidden subdirectory .svn). The directory is now a Subversion working copy. You can check its status with svn info in the working directory, which tells you the associated repository, the current revision number, etc. svn list --verbose file://$HOME/.svnrep shows what the repository actually contains.
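To make the steps so far concrete, here is a minimal sketch as a shell script. It assumes that svn and svnadmin are installed. The names svnrep and myproject are only placeholders of mine, and a temporary scratch directory is used instead of $HOME/.svnrep so that nothing of yours gets touched.

```shell
#!/bin/sh
set -eu

# Scratch locations (hypothetical names, not from the article)
BASE=$(mktemp -d)
REPO="$BASE/svnrep"
WORK="$BASE/myproject"

# 1. Create an empty repository
svnadmin create "$REPO"

# 2. Import an existing project directory into it
mkdir "$WORK"
echo "first draft" > "$WORK/myproject.txt"
svn import -q "$WORK" "file://$REPO" -m "initial import"

# 3. Replace the plain directory with a checked-out working copy
rm -r "$WORK"
svn checkout -q "file://$REPO" "$WORK"

# 4. Inspect the working copy and the repository
svn info "$WORK"
svn list --verbose "file://$REPO"
```

Once this works in the scratch directory, the same commands can be pointed at a permanent repository path of your choice.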

If you want Subversion to handle several projects you can use subdirectories within the same repository tree: .svnrep/project1, .svnrep/project2, etc. Newly created files in the working directory are scheduled for inclusion in the repository with: svn add filename. It is useful to be able to specify which files in the working directory go into the repository: when working with LaTeX, for example, the main .tex file needs to be committed, but it is pointless to do the same with the temporary files (.log, .aux, etc.; clean them up before you import the directory, for instance with latexmk -c), or with the resulting .dvi or .pdf.
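Selecting files can be sketched as follows. Only the LaTeX source is added; in addition, the svn:ignore property (my addition, not something the text above relies on) tells Subversion to stop listing the by-products as unversioned. The file name paper.tex and the scratch directory are assumptions for the example.

```shell
#!/bin/sh
set -eu

# Scratch repository and working copy (hypothetical names)
BASE=$(mktemp -d)
REPO="$BASE/svnrep"
WORK="$BASE/paper"
svnadmin create "$REPO"
svn checkout -q "file://$REPO" "$WORK"
cd "$WORK"

# A source file plus by-products as a LaTeX run would leave them behind
printf '%s\n' '\documentclass{article}' '\begin{document}' 'Hello' '\end{document}' > paper.tex
touch paper.aux paper.log paper.pdf

# Only the source is scheduled for addition
svn add -q paper.tex

# One pattern per line; the property is set on the directory itself
svn propset -q svn:ignore '*.aux
*.log
*.dvi
*.pdf' .

svn status
svn commit -q -m "add LaTeX source, ignore build products"

# The repository now contains the source file only
svn list "file://$REPO"
```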

When you modify myproject.txt, svn diff myproject.txt shows the differences between the version you last got from the repository and the new one in the working directory (diff has several options, please check the manpage or svn help diff). Whenever you feel that your project has reached a state worth keeping, save the current version to the repository with: svn commit -m 'any comment' and that’s it (in VCS jargon a commit is also called a “checkin”). If you want to go back to the last committed version you can always delete the working files and check out again, or simply run svn revert on the modified files. Much more is possible, for example you can restore any previous version, or specific files if you have accidentally deleted something, but that’s basically it: if you’ve mastered this, you’re in.
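The daily cycle might look like the following sketch. It again uses a scratch repository with placeholder names of mine, and it shows three ways of getting old content back: svn revert for an uncommitted edit, svn update for a deleted file, and svn cat -r for looking at an earlier revision.

```shell
#!/bin/sh
set -eu

# Scratch repository and working copy (hypothetical names)
BASE=$(mktemp -d)
REPO="$BASE/svnrep"
WORK="$BASE/myproject"
svnadmin create "$REPO"
svn checkout -q "file://$REPO" "$WORK"
cd "$WORK"

echo "first draft" > myproject.txt
svn add -q myproject.txt
svn commit -q -m "first draft"        # becomes revision 1

echo "second draft" > myproject.txt
svn diff myproject.txt                # shows the pending change
svn commit -q -m "second draft"       # becomes revision 2

# Throw away an uncommitted edit and return to the last committed text
echo "an edit I regret" > myproject.txt
svn revert -q myproject.txt

# Bring back an accidentally deleted file
rm myproject.txt
svn update -q

# Print the text as it was in revision 1, without changing the working copy
svn cat -r 1 myproject.txt
```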

LaTeX and Subversion

In general, Subversion can stamp selected files with several metadata tags, which is very convenient. If, for example, the file myproject.txt contains the keywords $Revision$, $HeadURL$, $Date$, and $Id$ in its header, and you run: svn propset svn:keywords 'Revision HeadURL Date Id' myproject.txt, then after the next commit those places in myproject.txt will carry the revision number, the file’s URL in the repository, the date and time of the last checkin, and, for $Id$, a compact summary of the other tags (this feature is called “keyword substitution”).
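Here is a small sketch of keyword substitution in action, once more in a scratch repository with placeholder names of mine. The property name is svn:keywords (plural), and the expansion shows up in the working file right after the commit.

```shell
#!/bin/sh
set -eu

# Scratch repository and working copy (hypothetical names)
BASE=$(mktemp -d)
REPO="$BASE/svnrep"
WORK="$BASE/myproject"
svnadmin create "$REPO"
svn checkout -q "file://$REPO" "$WORK"
cd "$WORK"

# Unexpanded keyword anchors in the file header
printf '%s\n' '$Revision$' '$HeadURL$' '$Date$' '$Id$' 'Body text.' > myproject.txt
svn add -q myproject.txt

# Switch on substitution for exactly these keywords
svn propset -q svn:keywords 'Revision HeadURL Date Id' myproject.txt
svn commit -q -m "enable keyword substitution"

# The anchors are now filled in, e.g. the first line reads "$Revision: 1 $"
cat myproject.txt
```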

There are a few LaTeX packages which are able to evaluate these tags: svn, svninfo, and, for multi-file documents, svn-multi (use one of them; all are part of texlive-latex-extra). With these it is possible to put a Subversion keyword like $Id$ into a footnote or the document’s header in addition to having it expanded in the source file’s preamble. With the svn package, for example, it goes something like this:

\documentclass{scrartcl}
 
\usepackage{svn}
\SVN $Id$
 
\usepackage{scrpage2}
\pagestyle{scrheadings}
\chead{\SVNId}
 
\begin{document}
Hello world!
\end{document}

(The document class scrartcl and the package scrpage2 belong to the KOMA-Script bundle.) Something like this is quite useful for making sure which version you are actually proofreading. The tags are updated automatically with every checkin. Another VCS support package for TeX is Stephan Hennig’s vc. It can deal with Subversion as well as other systems like Git (see below), can handle multi-file documents, and even runs with plain TeX (for use with ConTeXt see here).

Plugins

Since VCSs are very popular there are dozens of helpers, GUI front ends and plugins to improve the workflow: for automating checkout and commit, for enhanced diffing, for browsing the repository, tracking revisions, helping with administrative tasks, etc. (see for example the Wikipedia article on Subversion clients). To pick out just a few of them, with a view to popular Linux applications:

  • Since the editor Emacs has become a real “eierlegende Wollmilchsau” (a German idiom for a jack of all trades; Manuel Batsching recently posted on Emacs on his blog), it is no surprise that there are also several packages for dealing with Subversion: see the EmacsWiki, XSteve’s page, GNU VC, etc.
  • For Vim there are vcscommand and cvsmenu
  • The ultra-versatile development platform Eclipse (once you start with it you will never need another hobby) has plugins for several VCSs; Subclipse and Subversive are two of the available Subversion clients
  • For the KDE file manager Konqueror there is the Subversion integration KSvn
  • A treat for OS X users: the very fine, prize-winning editor TextMate also features a plugin for working with Subversion, see here.

Revision control beyond CVS

Common VCSs like CVS and Subversion, which run on a centralized server architecture, have some limitations: overlapping commits always produce a conflict which has to be resolved manually, and proper commit access control is a problem, to name just two. More advanced applications like Mercurial or Git use a decentralized model to overcome them, which is why they are called distributed revision control systems (DRCS). Since commits and even branching can be done there without a network connection (a commit only records to the local repository; it is push that sends changes to the server), offline work benefits. It also means that in collaborative development you can use the same version control for your own drafts which you don’t want to publish, and much more. The advantages of these systems go even deeper, so they should be preferred over the conventional ones, especially for multi-user projects. If you are interested in these fascinating matters, please follow the last pointers in the list of references:
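The difference between commit and push can be tried out locally with Git. In this sketch a bare repository in a scratch directory stands in for the shared server; all names, including the author identity, are made up for the example, and git is assumed to be installed.

```shell
#!/bin/sh
set -eu

BASE=$(mktemp -d)

# A bare repository plays the role of the server (hypothetical setup)
git init -q --bare "$BASE/server.git"
git clone -q "$BASE/server.git" "$BASE/work"
cd "$BASE/work"

# Identity for this working copy only (placeholder values)
git config user.name "Example Author"
git config user.email "author@example.org"

# Commit locally; the "server" is not contacted at this point
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "local commit, no server contact needed"

# Only push publishes the commit to the shared repository
git push -q origin HEAD
git --git-dir="$BASE/server.git" log --all --oneline
```

Everything up to the push also works without any network connection, which is the offline benefit mentioned above.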

References (systematically):

Critical editing software (on Linux)

March 22, 2010 by Daniel Stender
Filed under: Editing 

A critical edition is one which presents the editor’s hypothesis about some state of a text and systematically presents the evidence on which that hypothesis is based in a critical apparatus. To justify the constituted text, ideally all of the rejected textual variants have to be presented systematically in the critical apparatus, which should give an aggregate picture of the whole transmission and represent the witnesses sufficiently. Although there are fine critical editions which were made entirely with ordinary word processors like Word or WordPerfect, to save yourself a lot of trouble it is a good idea to plan such a project with software made especially for that purpose. Such software is more convenient, and special typesetting features like multiple levels of footnotes are available only there anyway. Here I would like to present the available solutions with a view to Linux.

Classical Text Editor on Wine

The prize-winning Classical Text Editor (CTE) was developed by Stefan Hagel at the Austrian Academy of Sciences in the context of the KVK project and is quite mature (feature list here); the current version is 8.2. Most recently it has been developed in close connection with the project Philosophy and medicine in Early Classical India at the ISTB for the edition of the Carakasaṃhitā. The CTE supports Unicode and OpenType fonts and has an indexing system. It is Windows software, but I have tried it and it seems to run quite well on Wine (1.0.1 on Debian Squeeze, the 32-bit version even on AMD64). It is not free; the 30-day trial restricts the output. It produces PDF, and TEI/XML and HTML output formats are also supported.

PlainTeX / ConTeXt

If you are using (plain) TeX (which has its advantages) and want to do critical editions, Edmac (last version 3.17) is surely the package you have found already. The Edmac macros were developed in the early nineties by John Lavagnino and Dominik Wujastyk, a real pioneering work. Several fine editions have been done with this package, including those of the Vyāḍīyaparibhāṣāvṛtti and of the Skandapurāṇa in Groningen (see here), and several others in other disciplines. The package has been described in Critical edition typesetting – the EDMAC format for plain TeX (San Francisco, Birmingham 1996) and briefly in Overview of EDMAC [TUGboat 11,4 (1990), 623-43]. There is also a “consumer report” by H. Breger, Erfahrungen bei der Anwendung von plain TeX und Edmac auf die Leibniz-Edition [Die TeXnische Komödie 4 (1996), 16-22]. The structure of TeX and of the macros makes it possible for other applications to use Edmac and for external datasets to be piped into it: for example, the Critical Edition Typesetter (CET) by Bernt Karash (Windows software) features an Edmac export, and the Mac collation package Collate could automatically generate Edmac-tagged text as output (cf. E. Johnson: Collate Interactive collation of large textual traditions); it seems that neither program is developed anymore. Separating the variant storage from the typesetter, as software like Collate (and its successor Anastasia) does, is the right concept. Thinking about how TeX-based critical editing software could develop, cutting-edge applications should definitely be open to importing data containers like XML (cf. P. Robinson: Towards a scholarly editing system for the next decades [2nd ISCLS presentation]).
Looking at what is currently happening outside LaTeX, a critical edition package CriTeXt has been announced for ConTeXt (see here); it looks very promising, not least because ConTeXt itself is well suited to dealing with external datasets.

LaTeX

If you have found your way into LaTeX, a few complete packages serving different needs of critical editors are waiting for you. First to mention is Ledmac (documentation), which was developed by Peter R. Wilson as a port of Edmac to LaTeX (it was later maintained by Dirk-Jan Dekker and is now maintained by Vafa Khalighi; the current version is 0.7). A FAQ and a showcase can be found on Dekker’s page (here). There is a Ledmac mailing list here. Ledmac runs fine with the Unicode/OpenType-capable engine XeTeX (on this, see the blog of Michael Slouber here and here so far). There are also companion packages: Ledarab, and Ledpar for parallel typesetting within the critical environment (minimal running example here). The other comprehensive LaTeX package for critical editions is Ednotes (currently 1.3a) by Christian Tapp and Uwe Lück (CTAN). Colleagues of mine are working with Ednotes and say it also works quite well. A good introduction to Ednotes, including a comparison with Ledmac, is the article Ednotes – critical editions typesetting with LaTeX [originally: TUGboat 24 (2003), 224-36]. Choosing TeX Live as your TeX distribution is, I would say, always a good idea (not only on Linux, where it is the standard everywhere), and Ledmac and Ednotes are included in it, so you don’t have to install anything manually (separate package texlive-humanities on Debian and Ubuntu; Lenny/Karmic ship TeX Live 2007, Squeeze/Lucid ship 2009). TeX Live 2009 also includes the Harvard/Kyoto (or, if you like, Kyoto/Harvard) input mapping (CTAN; on Debian/Ubuntu in the package texlive-xetex). Recently Itrans input mappings for Devanagari and Kannada have also appeared (see here), which makes XeTeX even more relevant for Indology (captured!).

Poemscol (currently 2.53) by John Burt of Brandeis University is especially for typesetting poetry. It is quite mature and well documented (see here). Burt also wrote articles on the package: Typesetting critical editions of poetry [TUGboat 22,4 (2001), 353-61] and Using poemscol for critical editions of poetry (PracTeX 03/2005, with example sources). Poemscol is also included in TeX Live (in texlive-humanities on Debian and Ubuntu).

A quite versatile extension package for LaTeX is David Kastrup’s Bigfoot, which provides multiple footnote levels along with many other enhancements of the standard footnotes (paragraphed footnotes, etc.; a forerunner is Manyfoot). Kastrup presented the paper The bigfoot bundle for critical editions at EuroTeX 2005, and Benefits, care and feeding of the bigfoot package at BachoTeX 2007. There are also two presentations on Bigfoot from BachoTeX 2008 available as videos, State of the ‘bigfoot’ package and Beauty and the beast – design and implementation notes for ‘bigfoot’. Bigfoot is also included in TeX Live 2009 (texlive-latex-extra on Debian and Ubuntu).