A Common Workflow¶
In this session we take a look at a workflow that I find to be pretty common.
Note
This example could be done pretty easily entirely within Stata, but this workflow pattern is pretty common when starting to use Python to build your datasets.
However, if you have built up resources in Stata, you may be more interested in taking data that you have already compiled in Stata and exploring some machine learning with it in Python.
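As a minimal sketch of the second case in the note, a dataset compiled in Stata can be read into Python with pandas, assuming pandas is installed. The helper names load_stata_dataset and save_for_stata are hypothetical and only wrap the pandas calls read_stata and to_stata.

```python
import pandas as pd


def load_stata_dataset(path):
    """Read a Stata .dta file into a pandas DataFrame."""
    return pd.read_stata(path)


def save_for_stata(df, path):
    """Write a DataFrame to a .dta file, leaving out the pandas index."""
    df.to_stata(path, write_index=False)
```

Once the data is in a DataFrame it can be passed on to whichever machine learning library you want to explore.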
Exploring Data in Jupyter¶
The work often starts by finding and exploring data in Jupyter notebooks, as in the demonstration notebook below.
Local Notebook Option¶
You can download the notebook from here
You can get the data files here
Then browse to your download location and load jupyter:
jupyter notebook gravity-model-example.ipynb
The data files need to be located in a data folder in the same directory as the notebook.
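If you want to confirm the layout before opening the notebook, a small check like the following may help. It is only a sketch: the function name find_data_files is hypothetical, and it assumes nothing about the data files beyond their living in a folder called data next to the notebook.

```python
from pathlib import Path


def find_data_files(notebook_dir=".", data_dirname="data"):
    """Return the files found in the data folder next to the notebook, sorted by name."""
    data_dir = Path(notebook_dir) / data_dirname
    if not data_dir.is_dir():
        raise FileNotFoundError(f"Expected a data folder at {data_dir}")
    return sorted(p for p in data_dir.iterdir() if p.is_file())
```

Calling find_data_files() from the download location lists what the notebook will be able to see, and raises an error if the data folder is missing.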
Cloud Based Option¶
You can launch the notebook
Warning
Most of this notebook will work, except for the Stata calls made via ipystata.
(Optional) Saving the Exploration as a Script¶
You can then distill the steps needed to build the dataset into a script.
You can download an example Python script
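To give a feel for the shape such a script can take, here is a minimal sketch using only the standard library. It is not the downloadable script: the file names trade.csv, gdp.csv and dataset.csv, the column names, and all function names are hypothetical stand-ins for whatever your exploration produced.

```python
import csv
from pathlib import Path

# Hypothetical output columns for the combined dataset.
FIELDNAMES = ["exporter", "importer", "value", "gdp_exporter", "gdp_importer"]


def read_rows(path):
    """Read a CSV file into a list of dictionaries, one per row."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def build_dataset(trade_rows, gdp_rows):
    """Attach exporter and importer GDP to each trade flow.

    Flows are dropped when either country has no GDP record.
    """
    gdp = {row["country"]: row["gdp"] for row in gdp_rows}
    dataset = []
    for row in trade_rows:
        exporter, importer = row["exporter"], row["importer"]
        if exporter in gdp and importer in gdp:
            dataset.append({
                "exporter": exporter,
                "importer": importer,
                "value": row["value"],
                "gdp_exporter": gdp[exporter],
                "gdp_importer": gdp[importer],
            })
    return dataset


def write_rows(rows, path):
    """Write the combined rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)


def main(data_dir="data", out_file="dataset.csv"):
    """Rebuild the dataset from the raw files in the data folder."""
    data_dir = Path(data_dir)
    trade_rows = read_rows(data_dir / "trade.csv")
    gdp_rows = read_rows(data_dir / "gdp.csv")
    write_rows(build_dataset(trade_rows, gdp_rows), out_file)
```

Calling main() at the bottom of the script rebuilds the dataset in one step, so refreshing it later only means replacing the raw files and running the script again.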
Note
This is a particularly useful step if your data will need to be updated in the future.