Reproducible research with Stata

Anyone who has used R for statistical analysis would be familiar with the incredible power and ease of use of RMarkdown. Having the code, data and the narrative together in one document enables reproducible research. More than anything else, it makes life easy.

As a fan of Stata, I wasn’t satisfied with its stock options. As usual, user-contributed packages come to the rescue. The result may not be as elegant or extensive as RMarkdown – for instance, you can’t make HTML5 presentations with Stata programmatically. However, with LaTeX you can create beautiful Beamer presentations directly from a single text file written in Markdown – an easy-to-use markup system. The opportunities are nearly infinite.

What will you need?

  1. Stata 14 or above
  2. Markstat package (install using ssc install markstat)
  3. Whereis package (install using ssc install whereis)
  4. Pandoc – install it from the official Pandoc site

Once you have installed Pandoc, install the Markstat and Whereis packages in Stata. The next step is to tell Stata where Pandoc is located, so that it can convert the Markdown file into an HTML document.
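The two package installs mentioned above can be typed straight into the Stata command window. This is only a small sketch of that step; both commands need an internet connection because they download from the SSC archive.

```stata
* Install the two user-written packages from the SSC archive
ssc install markstat
ssc install whereis
```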

If you are on Windows and installed Pandoc in the default location, type this in Stata

If you are on a Mac, fire up the terminal and type

whereis pandoc

Copy the location of the pandoc executable and use it in Stata to register where Pandoc is located, like so

whereis pandoc /usr/bin/pandoc
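To confirm that the location was stored, you can call whereis again in Stata without a path. As far as I understand the package, it then looks up and displays the registered location. Treat this as a quick sanity check under that assumption, not a definitive description of the command.

```stata
* Ask whereis for the stored Pandoc location (no path given).
* Assumption: with no path, whereis retrieves and displays the
* location registered earlier instead of storing a new one.
whereis pandoc
```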

Now that you are done, get back to Stata. Open the Stata Do-file Editor and start typing your narrative in Markdown. Stata code blocks are set apart from the narrative by indenting each line of code with a tab. This is ridiculously easy, even easier than the official dyndoc option in Stata 15.
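A source file might look something like the sketch below. Everything in it is my own illustration, not taken from a real project: the auto dataset that ships with Stata, the variables, and the image name scatter.png. The code lines here are indented with four spaces, which, as far as I know, markstat treats the same way as a tab.

```markdown
# Fuel efficiency

Let us load the auto dataset that ships with Stata and summarize mileage.

    sysuse auto, clear
    summarize mpg

Next, a scatter plot of mileage against weight, exported as an image.

    scatter mpg weight
    graph export scatter.png, width(500) replace

![Mileage against weight](scatter.png)
```

The narrative is plain Markdown. The indented lines are run by Stata, and their output appears in the final document.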

Once you are done, save the file as filename.stmd. Note the extension. If your operating system hides file extensions by default, make sure you display them. Otherwise the file may end up with a different extension, which is not what we want. Now you can generate the final document, which you can share with colleagues, by typing

markstat using filename.stmd, bundle

The bundle option ensures that any image that is generated is embedded in the HTML file. You can omit it if your stmd file doesn’t have any images, but I find it a good habit to use the option anyway.

Here’s my brief screencast of how it is done. (Sans audio)

The best part is, since the source document is essentially a text file, it can be easily shared with colleagues too.

Note: If you want PDFs to be generated, you need to have LaTeX installed on your system (MiKTeX for Windows, TeX Live for Linux and MacTeX for the Mac).
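With LaTeX in place, the other output formats are a matter of swapping the option. This sketch assumes the same filename.stmd as before and uses markstat's pdf and beamer options. Both rely on a working LaTeX installation, so expect errors if LaTeX cannot be found.

```stata
* Generate a PDF instead of HTML (needs LaTeX)
markstat using filename.stmd, pdf

* Generate Beamer slides from the same kind of source file (needs LaTeX)
markstat using filename.stmd, beamer
```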


