A Common Workflow

In this session we look at a workflow that I find to be pretty common.

Note

This example could be done fairly easily entirely within Stata, but this workflow pattern is common when you start using Python to build your datasets.

Alternatively, if you have built up resources in Stata, you may be more interested in taking data you have already compiled there and exploring some machine learning in Python.

Exploring Data in Jupyter

The work often starts by finding and exploring data in Jupyter notebooks, as in the demonstration notebook below.

Local Notebook Option

You can download the notebook from here

You can get the data files here

Then browse to your download location and launch Jupyter:

jupyter notebook gravity-model-example.ipynb

The data files need to be located in a data folder in the same directory as the notebook.

Cloud Based Option

You can launch the notebook

Warning

Most of this notebook will work, except for the Stata calls via ipystata.

(Optional) Saving the Exploration as a Script

You can then distill the steps needed to build the dataset into a script.

You can download an example Python script

Note

This is a particularly useful step if your data needs to be updated in the future, because the dataset can then be rebuilt by rerunning the script.
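As a rough illustration of what such a script might look like, here is a minimal sketch that uses only the Python standard library. It is not the example script referred to above. The function name `build_dataset`, the column names (`exporter`, `importer`, `year`, `trade`, `distance`) and the idea of adding log columns are all assumptions made for illustration, so adapt them to your own data.

```python
import csv
import math


def build_dataset(raw_path, out_path):
    """Read raw rows, drop unusable ones, add log columns, write a CSV.

    Column names are hypothetical; returns the number of rows written.
    """
    fields = ["exporter", "importer", "year", "trade", "distance",
              "ln_trade", "ln_distance"]
    kept = 0
    with open(raw_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=fields)
        writer.writeheader()
        for row in reader:
            try:
                trade = float(row["trade"])
                distance = float(row["distance"])
            except (KeyError, TypeError, ValueError):
                continue  # skip rows with missing or non-numeric values
            if trade <= 0 or distance <= 0:
                continue  # logs are undefined for non-positive values
            writer.writerow({
                "exporter": row["exporter"],
                "importer": row["importer"],
                "year": row["year"],
                "trade": trade,
                "distance": distance,
                "ln_trade": math.log(trade),
                "ln_distance": math.log(distance),
            })
            kept += 1
    return kept
```

Calling, for example, `build_dataset("data/raw_trade.csv", "data/gravity_dataset.csv")` (both file names are hypothetical) would regenerate the output file whenever the raw data changes. A plain CSV is used here so that the result can be read in Stata with `import delimited`.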

Using the Dataset in Stata

You can then open Stata and continue with your statistical analysis.

../../_images/stata-import-and-regress.png