This is a sample git project that demonstrates how to track progress on a thesis, report, or any other document written in the Microsoft Office Word .docx file format.
It is primarily intended to work on Microsoft Windows operating systems; functionality on other platforms has not been tested.
Tracking your Microsoft Office-based thesis with git works best with GitHub Desktop.
To generate meaningful and human-readable diffs, the following packages should be installed on your desktop:
- Git for Windows
  The Windows distribution of git. Make sure to enable Linux bash commands during the installation process.
- GitHub Desktop (optional)
  This is a really handy git GUI client for Windows which eases repository management and committing. Its use is highly recommended, but any other git GUI will likely work, too.
Download the latest release or clone this repository to your local machine and run setup.sh.
This script will create the following entries in the project's .git\config:
[diff "docx"]
    textconv = tools/pandoc --from=docx --to=markdown --track-changes=all
    prompt = false
    binary = true
[diff "pptx"]
    textconv = sh -c 'tools/pptx2md --disable-image --disable-wmf "$0" -o ~/.cache/git/presentation.md >/dev/null && cat ~/.cache/git/presentation.md'
    cachetextconv = true
    prompt = false
    binary = true
[diff "pdf"]
    textconv = sh -c 'tools/pdftotext -simple -enc UTF-8 "$0" -'
    cachetextconv = true
    prompt = false
    binary = true
[core]
    hooksPath = tools/hooks
This configuration enables the generation of human-readable diffs inside GitHub Desktop.
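git only applies a [diff "..."] driver to paths that a .gitattributes file assigns to it. The text above does not say whether the repository or setup.sh already provides that mapping, so the following is only a sketch of what it could look like, assuming each driver is named after its file extension as in the listing above. The helper name print_diff_attributes is made up for this example.

```shell
#!/bin/sh
# Print .gitattributes lines that route each file type to the diff driver
# of the same name ("docx", "pptx", "pdf") defined in .git/config.
# Assumption: the driver names match the file extensions.
print_diff_attributes() {
    for ext in docx pptx pdf; do
        printf '*.%s diff=%s\n' "$ext" "$ext"
    done
}

# Usage (commented out so that sourcing this file has no side effects):
#   print_diff_attributes >> .gitattributes
#   git check-attr diff -- thesis.docx   # thesis.docx is a placeholder name
```

If the repository already contains a .gitattributes file with these entries, nothing needs to be added; git check-attr is then just a quick way to confirm which driver a file is assigned to.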
Note: Do not forget to unpack pandoc.zip in the tools folder.
The hooksPath setting tells git to look for hooks in the tracked hooks folder. See Publishing for an explanation of the included post-commit hook.
You can of course also add these configuration settings to your global .gitconfig to make them available for all your projects.
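As a rough sketch of the global variant, the docx driver from the listing above could be written with git config instead of editing the file by hand. The function name configure_docx_diff is hypothetical and not part of setup.sh. Note that the textconv path tools/pandoc is relative, so a globally configured driver will presumably only work in repositories that contain the same tools folder, unless you replace it with an absolute path.

```shell
#!/bin/sh
# Write the docx diff driver from the listing above into a git config scope.
# $1 is a scope flag understood by `git config`: --local (default) or --global.
configure_docx_diff() {
    scope=${1:-"--local"}
    # Assumption: tools/pandoc exists relative to the repository root.
    git config "$scope" diff.docx.textconv \
        "tools/pandoc --from=docx --to=markdown --track-changes=all"
    git config "$scope" diff.docx.prompt false
    git config "$scope" diff.docx.binary true
}

# Usage (commented out so that sourcing this file changes nothing):
#   configure_docx_diff --global
#   git config --global --get-regexp '^diff\.docx\.'
```

The pptx and pdf drivers could be added the same way; they are left out here to keep the sketch short.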
While writing your document in MS Word, stick to a few basic rules to avoid long loading times, crashes, and the other problems people commonly complain about. The following is a starting point:
- Insert your images as links instead of embedding them in the document.
  Managing your images in the images folder has several advantages:
  - .docx files are just archives. If you embed an image, it is stored inside the media folder of the archive and your document's file size increases. Large docx files are more prone to crashes than small ones.
  - Linked images are automatically updated in the docx file whenever you replace them with a new version in the images folder. No need to manually replace (and most often resize) them inside the document.
  - The images folder is tracked by git. If you replace an image and want to revert the changes later, just click through your history and revert the changes.
If you want to follow this guide, you can add a linked image to your document via
Insert > Quick Parts > Fields > IncludePicture
Enter images/YourImageName.extension into the File Name field and you're ready to go.
You can find several helpful scripts in the tools directory. Among them are the three binaries that provide human-readable diffs for Microsoft Word documents, PDFs and Microsoft PowerPoint presentations.
Further, you will find a small Python project called py-pdf which includes scripts to manipulate PDF files generated from your document. These allow you to modify and tailor the PDF TOC to your needs and to replace bitmap graphics embedded in the PDF with vector images.
See the corresponding ReadMe files.
This package comes with two pre-configured automatic release methods:
- Using the post-commit hook
  Inside the hooks folder you can find a post-commit hook. This is enabled by default when running the installation script and is the primary way to automate releases on a local machine.
  The hook is triggered by commits whose message is composed like Prepare v7.4, where Prepare v is the keyword the hook checks for and the version number that follows is used for the release. If a valid commit is detected and it contains changes to one or more .docx files, the hook copies these files to the releases folder and submits a Release _file name_ v7.4 commit to the repository.
- Using the GitHub workflow
  Inside the .github/workflows folder you can find a GitHub action file release.yml.dist. This action triggers a release similar to the one executed by the git hook, but only when changes to a .docx file are pushed to the GitHub repository and the commit message contains Prepare v (see the description of the hook).
  To enable the action, simply remove the .dist extension from the workflow file.
- Manual releases
  Of course, you can also manually release a new version of a document by
  - committing all changes,
  - copying the document to be released to the releases folder,
  - renaming the new file with vX.x,
  - committing the new release to the repository.
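To illustrate the commit message convention described for the hook, here is a minimal sketch of how a version number could be pulled out of a message like Prepare v7.4 and turned into the follow-up commit message. It is not the code of the shipped hook; the function names and the assumption that a version consists of dot-separated groups of digits are made up for this example.

```shell
#!/bin/sh
# Sketch only; the post-commit hook shipped in the hooks folder may differ.

# Print the version number from a commit message such as "Prepare v7.4".
# Prints nothing and returns non-zero if the keyword is missing.
release_version() {
    # Assumption: a version is one or more dot-separated groups of digits.
    version=$(printf '%s\n' "$1" |
        sed -n 's/.*Prepare v\([0-9][0-9]*\(\.[0-9][0-9]*\)*\).*/\1/p' |
        head -n 1)
    [ -n "$version" ] && printf '%s\n' "$version"
}

# Build the message of the follow-up commit: "Release <file name> v<version>".
release_commit_message() {
    printf 'Release %s v%s\n' "$1" "$2"
}
```

Inside a hook, the message of the latest commit could be read with git log -1 --pretty=%B and passed to release_version. If the function prints nothing, the commit is not a release commit and the hook would simply exit.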
Any improvements, suggestions and questions about this project are welcome at any time.
Feel free to open an issue for discussion if you think you have found a mistake, have suggestions for improvements or extensions, or have anything else on your mind.
This project is published under the MIT License.
For more information, please refer to the LICENSE file.