-->

A way to track office-suite documents with a VCS?

Jan 21, 2010

Nice to hear Sofia's happy with Dropbox as a way of finding the latest version of her dissertation. I was thinking about what she could do if she had to track older or alternative versions of her dissertation, perhaps even offline, instead of only the latest version in Dropbox. Naturally I thought of Mercurial, which I use for my SCM needs. The problem is that Mercurial, like most version control systems, is good at diffing and merging text files, not binary files like those of popular office suites. So I searched a bit and found a possible solution (although I haven't implemented and tested it yet):

David Heffelfinger posted about OpenOffice.org Document Version Control With Mercurial, in which he writes about using the flat version of the OpenOffice.org ODT format. According to him this solution is not satisfactory because, even in the flat format, a single-letter change alters all sorts of metadata in other parts of the file. This makes it hard to distinguish the important changes between two versions from the inconsequential ones. Instead of going this way, he is currently using the oodiff hack from the Mercurial website.
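To make the metadata problem a bit more concrete, here is a rough sketch of one way to sidestep it: ignore everything except the paragraph text and diff only that. This is not the oodiff hack itself and I haven't tried it on a real dissertation. It assumes a regular zipped .odt file (which keeps its body in content.xml), and the function names odt_paragraphs and diff_odt are my own invention.

```python
import difflib
import zipfile
import xml.etree.ElementTree as ET

# Namespace of ODF text elements such as <text:p> and <text:h>.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odt_paragraphs(source):
    """Return the text of every paragraph and heading in an .odt file.

    `source` is a path or a file-like object holding the zipped document.
    Styles and metadata are ignored; only the text itself is kept.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    wanted = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}
    return ["".join(node.itertext()) for node in root.iter() if node.tag in wanted]


def diff_odt(old, new):
    """Return a unified diff (list of lines) of the text of two .odt files."""
    return list(
        difflib.unified_diff(
            odt_paragraphs(old),
            odt_paragraphs(new),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )
```

Calling diff_odt("draft-v1.odt", "draft-v2.odt") (hypothetical file names) would list only the paragraphs that changed. The price is that formatting changes become invisible, and paragraphs nested inside text boxes may show up twice, so treat it as a starting point rather than a finished tool.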

Obviously a hack would not be an acceptable solution for Sofia to adopt, unless she desperately needed the ability to track changes and could live with using OpenOffice instead of Word (or whatever she's using). But a comment on David's post refers to an interesting tool called Beyond Compare, which seems to be able to generate differences for both Microsoft Office XML files and OpenOffice ODT files! Its makers claim integration with popular DVCSs. I wonder how easy it would be to integrate Beyond Compare into, say, TortoiseHg. Has anyone done this? Maybe I will try it sometime soon.

Any other suggestions for tracking office suite documents with a DVCS?