Converting Microsoft Word Files (doc, docx) to reStructuredText (rst)

This article describes how to convert Microsoft Word documents to reStructuredText. Everything should be done within a temporary directory with simplified filenames. So let’s assume you want to convert ‘am.docx’ to reStructuredText. The Word document can contain images.
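You need pandoc, an unzip tool, and ImageMagick (which provides convert and mogrify). For the final check you also need Sphinx.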
A few simple steps:
  1. On the command line (either the old cmd or the PowerShell) go to the temporary directory that contains the Word document (e.g. C:\temp):
    cd c:\temp
  2. Convert ‘am.docx’ to ‘am.rst’ using pandoc
    pandoc.exe -f docx am.docx -t rst -o am.rst
  3. Extract the media files (e.g. images) from the Word document
    unzip .\am.docx

    and move the extracted media folder to the current working directory

    mv .\word\media .
  4. All image files should be in the same file format, so convert emf and gif files to png. First jump into the media directory and list all files:
    cd media
    dir

    a) Either by hand:

    convert .\image2.gif .\image2.png
    convert .\image1.emf .\image1.png

    b) Or automatically by using mogrify (also part of ImageMagick):

    mogrify.exe -format png *.emf
    mogrify.exe -format png *.gif

  5. Clean up the original files:
    rm *.gif
    rm *.emf
  6. Go back up one directory (cd ..) and do not forget to search and replace .emf and .gif with .png in the .rst file with the editor of your choice (gvim or Notepad++).
  7. Check the build by creating a quick Sphinx project:
    run sphinx-quickstart (and follow the instructions)
    copy the .rst file over the main document in the source directory
    copy the media folder to the source directory
    run “make.bat html” to create the website and check the result.
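
Step 6 can also be scripted instead of done by hand. The following is only a minimal sketch in Python (which is already installed if you use Sphinx). The function name replace_image_extensions is made up for this example, and am.rst is the file name assumed throughout this article. Note that it replaces every .emf or .gif extension it finds, including any mentioned in normal text, so check the result afterwards.

```python
import re
from pathlib import Path

# Matches the two extensions converted in step 4, but only when the
# extension is not followed by further letters or digits.
_OLD_EXTENSIONS = re.compile(r"\.(?:emf|gif)\b", re.IGNORECASE)


def replace_image_extensions(text):
    """Return text with every .emf/.gif extension replaced by .png."""
    return _OLD_EXTENSIONS.sub(".png", text)


if __name__ == "__main__":
    rst_file = Path("am.rst")  # assumed file name from the example above
    if rst_file.exists():
        original = rst_file.read_text(encoding="utf-8")
        rst_file.write_text(replace_image_extensions(original), encoding="utf-8")
```

Run it from the directory that contains am.rst. The file is rewritten in place, so keep a copy if you want to compare.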

Python 3.4 and Django on Ubuntu 13.10

Why bother about Python versions? 

I recently started a new project creating a web application. As I have a lot of Python programming experience I chose Python with Django over Ruby on Rails. At the beginning of a new project I prefer using the latest versions of the frameworks the application will depend on. Starting now with Python 2.7 would mean that sooner or later there would be additional work porting the codebase to Python 3. Yesterday, Python 3.4 was released. One of the biggest improvements is that it has pip already included which makes handling virtual environments and installing the latest release of Django really easy.

Building Python from source

The downside is that Linux distributions do not include the latest Python release yet. Most of them still ship with Python 2.7 as the default version. The next Fedora and Ubuntu releases might change that, but for now you need to compile it from source. Luckily, that is not a hard task. Go to the download page and grab the latest Python release (recommended if you read this post later and a newer version has been released) or paste the following commands into a terminal.

First make sure you have everything installed to compile Python from source.

sudo apt-get install build-essential

Before downloading create a temporary directory to make the cleanup easier. At the end you can just delete “tmpPython”.

mkdir tmpPython
cd tmpPython
wget --no-check-certificate
tar xvf Python-3.4.0.tgz

After the archive is extracted, cd into the source directory. Create a directory to install to, then run configure, build and install.

cd Python-3.4.0
sudo mkdir /opt/Python34
./configure --prefix=/opt/Python34 && make -j4
sudo make install

Now you have Python 3.4 installed on your system.
Add the path containing the executable to your environment.

export PATH=/opt/Python34/bin:$PATH

Also make sure to add this line to your .bashrc file (or .zshrc if you’re using zsh).

echo 'export PATH=/opt/Python34/bin:$PATH' >> $HOME/.bashrc

Creating a virtual environment

Go to the directory where you want to create the virtual environment. I recommend /opt if you collaborate with others within the environment (you have to create everything with sudo) or your home directory if you work alone. Then run pyvenv to create it.
pyvenv-3.4 djangoEnv
source djangoEnv/bin/activate

The bash prompt changes to

(djangoEnv) mpei@earth /opt

and that means that you are now within this virtual environment.
This command shows you what you have installed:

pip freeze
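
Besides the changed prompt, you can ask the interpreter itself whether it runs inside a virtual environment. This is a small sketch, assuming Python 3.3 or later (where sys.base_prefix exists); the helper name in_virtual_environment is made up for this example.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # Inside an environment created by pyvenv, sys.prefix points at the
    # environment directory, while sys.base_prefix still points at the
    # Python installation the environment was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtual_environment())
```

Save it to a file and run it with the python of the activated environment.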

Installing Django

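Just use pip to install the latest version of Django and the django-extensions package. Inside the activated environment no sudo is needed as long as you own the environment directory; with sudo, the system-wide pip may be picked up instead of the environment’s:
pip install django django-extensions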
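
To confirm from Python itself that the interpreter of the environment can import Django, something like the following sketch could be used. The helper name installed_django_version is made up for this example; django.get_version() is the function Django provides for reading its version string.

```python
def installed_django_version():
    """Return Django's version string, or None if Django is not importable."""
    try:
        import django
    except ImportError:
        return None
    return django.get_version()


if __name__ == "__main__":
    version = installed_django_version()
    if version is None:
        print("Django is not installed for this interpreter")
    else:
        print("Django", version)
```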

And you’re done! You can check the installed versions by running “pip freeze” again. Maybe another blog post on Django and databases? Or the first steps in Django? We’ll see… bye bye!

Organizing C/C++ includes

After starting my new job programming in a big software project, I spent some time thinking about how to organize includes and put together a few recommendations. Here’s what I’ve come up with. As always, some things are obvious, some are not…

  1. You should only include what is necessary to compile the source code. Adding unnecessary
    includes means a longer compilation time, especially in large projects.
  2. Each header and its corresponding source file should compile cleanly on their own. That
    means, if you have a source file that includes only the corresponding header, it
    should compile without errors. The header file should include no more
    than what is necessary for that.
  3. Try to use forward declarations as much as possible. If you’re using a
    class, but the header file only deals with pointers/references to
    objects of that class, then there’s no need to include the definition of
    the class. That is what forward declarations are designed for!

    // Forward declaration

    class MyClass;

  4. Note that some system headers might include others. But apart from a few
    exceptions, there is no guarantee that they do. So if you need both
    <iostream> and <string>, include both, even if the code
    compiles with only one of them.
  5. To prevent multiple inclusion, include loops, and all such attendant horrors, wrap every header in an #ifndef guard.

    #ifndef FOO_H
    #define FOO_H
      …contents of header file…
    #endif // FOO_H

  6. The order in which the includes appear (system includes and user includes) is up to the coding standard you follow.
  7. If you can pick the coding standard for the include order at the beginning of a new project, I recommend going from local to global, with each
    subsection in alphabetical order. That way you avoid introducing
    hidden dependencies. If you reverse the order and, e.g., myclass.cpp
    includes <string> and then "myclass.h", there is no way to catch
    at build time that myclass.h itself depends on <string>. So if someone later includes myclass.h without including <string> first, they will
    get an error that needs to be fixed either in the .cpp file or in the header.
  8. So the recommended order would be:
    • header file corresponding to its .cpp file
    • headers from the same component
    • headers from other components
    • system headers

If you use the Eclipse IDE (which I highly recommend), you can use a very nice feature that helps you organize includes (“Source -> Organize Includes”).

Updating Eclipse

Eclipse 3.7 Indigo has been released! I had a lot of add-ons and did not want to reinstall all of them.

Is an update from 3.6 to 3.7 possible?

Yes! Simply go to “Install Software” and add the Indigo update site to “Available Software Sites”. Check existing repositories and change them from “Helios” to “Indigo”. Then check for updates.

It worked like a charm!