However, binary files such as .ODT don't work well with revision control systems.
Firstly, they aren't diff-able, so you also have to keep a .TXT file in the repository, in sync with the .ODT. This is error-prone, and the .TXT diffs don't reflect formatting changes.
Secondly, because OpenOffice.org files are compressed (.ZIP format), the binary deltas that revision control systems (e.g. SVN) use to save space fall apart for even the tiniest change to a document: a small edit to the text can alter most of the compressed bytes that follow it.
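To get a feel for the second point, here is a minimal Python sketch that compares how many bytes change before and after compression when a single character is edited. It uses zlib as a stand-in for the deflate compression inside a .ZIP, and the sample text is made up for illustration, so treat the numbers it prints as indicative only.

```python
import zlib


def count_differing_bytes(a, b):
    """Count positions where two byte strings differ, plus any length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


# Made-up stand-in for a document's text: 2000 short lines.
original = "\n".join(
    f"Paragraph {i}: the quick brown fox {i * i}" for i in range(2000)
).encode()

# A one-character edit near the start of the document (first "quick" -> "quack").
edited = original.replace(b"quick", b"quack", 1)

raw_diff = count_differing_bytes(original, edited)
compressed_diff = count_differing_bytes(
    zlib.compress(original), zlib.compress(edited)
)

print("bytes changed before compression:", raw_diff)
print("bytes changed after compression: ", compressed_diff)
```

A delta algorithm that works on the compressed bytes sees the second number, not the first.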
But thanks to the second anonymous commenter on that blog post, I came up with a new way of revision-controlling OpenOffice.org documents:
What I now do is keep the unzipped OpenOffice.org document in revision control, not the binary .ODT file. This means that I can diff the content.xml between revisions, not bother keeping a .TXT file in sync, and avoid SVN's space-inefficient binary deltas, since I'm really only keeping text files in the repository (more on this later).
If you check out such an unzipped OpenOffice.org document from SVN, you can use my magic Makefile to reconstruct the .ODT from these unzipped contents by typing make. It's just like adding water to milk powder.
And every time you make a change to the .ODT file, the same make command works the other way and updates the unzipped contents to match the changed .ODT. Then you can svn commit a new revision.
Caveats:
* Warning: svn status will not report changes to the .ODT until you type make. Be careful, or you might think that your checkout has no local changes and decide to delete the checkout to "save space"! A "foolproof" way to get around this problem is to skip the svn propedit svn:ignore . [and type kolourpaint-developer-guide.odt] step.
* It actually does store a binary file in revision control, namely Thumbnails/thumbnail.png. I haven't dared try to work around this though, for fear that OpenOffice.org won't like me playing with it.
* Some files such as layout-cache and settings.xml, while not binary (unlike Thumbnails/thumbnail.png), change on every save and probably shouldn't be revision controlled either.
* You can't have two people working on the same document, as merging two content.xml files is asking for trouble. But at least you can diff between revisions.
* A lot of lines in content.xml may change in response to even a small layout change, due to lots of tags changing control numbers.
If you try this scheme, please let me know how it goes.
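If you'd rather not depend on make, the round trip between an .ODT and its unzipped contents can be sketched in a few lines of Python. This is only an illustration of the idea, not my Makefile: the function names unpack_odt and pack_odt are made up for this example, and skipping .svn directories is an assumption about how the working copy is laid out. The one detail worth knowing is that the OpenDocument packaging rules expect the mimetype file to come first in the archive and to be stored uncompressed.

```python
import os
import zipfile


def unpack_odt(odt_path, out_dir):
    """Extract every member of the .ODT (a ZIP archive) into out_dir."""
    with zipfile.ZipFile(odt_path) as odt:
        odt.extractall(out_dir)


def pack_odt(src_dir, odt_path):
    """Rebuild an .ODT from a directory of unzipped contents.

    "mimetype" is written first and uncompressed; everything else is
    deflated as usual.
    """
    with zipfile.ZipFile(odt_path, "w", zipfile.ZIP_DEFLATED) as odt:
        mimetype = os.path.join(src_dir, "mimetype")
        if os.path.exists(mimetype):
            odt.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(src_dir):
            # Assumption: skip SVN metadata directories; sort for a stable order.
            dirs[:] = sorted(d for d in dirs if d != ".svn")
            for name in sorted(files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arcname != "mimetype":
                    odt.write(full, arcname)
```

For example, unpack_odt("thesis.odt", "thesis-unzipped") before a commit and pack_odt("thesis-unzipped", "thesis.odt") after a fresh checkout (both file names are placeholders). Unlike the Makefile, this sketch does not decide which direction to go for you.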
BTW, in the end, my thesis turned out to be 192 pages (probably about 100 pages too long :)). It was on porting the L4 microkernel to processors without virtual memory, specifically the Blackfin processor. It was written in a rush, so apologies for the awful number of spelling and grammatical errors!