I know it's possible to read in either Stata categorical labels or values using the convert_categoricals parameter of pandas.read_stata.
I was looking for a way to write/export a pandas DataFrame to Stata and include the value labels. However, all I could find in DataFrame.to_stata was either
data_label : str, optional
for the dataset label
or
variable_labels : dict
for column names label,
but nothing for the values themselves.
Here is an answer to your question. It is probably not what you were expecting, because I am not using DataFrame.to_stata but the Python integration introduced in Stata 16.
The code below must be executed within Stata (version 16 or later). Briefly, I create a pandas DataFrame (df) and copy its columns into Stata's dataset in memory. The trick is to define the value labels with ValueLabel.setLabelValue() and attach them to the variable with ValueLabel.setVarValueLabel(), both of which come from the sfi module.
clear all
python:
from sfi import ValueLabel, Data
import pandas as pd
data = [['Eren Jaeger', 15, 1, 'Soldier' ] , ['Mikasa Ackerman', 14, 1, 'Soldier'], ['Armin Arlert', 14, 1 , 'Soldier'],['Levi Ackerman', 30, 2, 'Captain']]
#creating DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Age', 'Rank_num', 'Rank'])
## Name Age Rank_num Rank
##0 Eren Jaeger 15 1 Soldier
##1 Mikasa Ackerman 14 1 Soldier
##2 Armin Arlert 14 1 Soldier
##3 Levi Ackerman 30 2 Captain
# Set number of observations in Stata
Data.setObsTotal(len(df))
#Create variables on Stata (from Python)
# String width must fit the longest name ("Mikasa Ackerman" is 15 characters)
Data.addVarStr("Name", 15)
Data.addVarDouble("Age")
Data.addVarInt("Rank_num")
#Store the content of "df" object from Python to Stata
Data.store("Name", None, df['Name'], None)
Data.store("Age", None, df['Age'], None)
Data.store("Rank_num", None, df['Rank_num'], None)
# HERE is where I solve your question!
# 1) Create the labels
ValueLabel.setLabelValue('rank_num_LABEL', 1, 'Soldier')
ValueLabel.setLabelValue('rank_num_LABEL', 2, 'Captain')
# Print the labels to check that they were defined
print(ValueLabel.getValueLabels('rank_num_LABEL'))
# 2) Attach the created label to the variable
ValueLabel.setVarValueLabel('Rank_num', 'rank_num_LABEL')
end
br
* At the end, you will see the following in the Stata Data Browser
* Name Age Rank_num
* Eren Jaeger 15 Soldier
* Mikasa Ackerman 14 Soldier
* Armin Arlert 14 Soldier
* Levi Ackerman 30 Captain
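For comparison, here is a pandas-only sketch that does not need to run inside Stata. It relies on the documented pandas behaviour that a column of Categorical dtype is exported by to_stata as value-labelled data: the category codes become the stored values and the categories become the labels. The file name people.dta and the temporary directory are assumptions for illustration, and note that the stored codes are whatever pandas assigns to the categories, not the 1/2 coding used in Rank_num above.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "Name": ["Eren Jaeger", "Mikasa Ackerman", "Armin Arlert", "Levi Ackerman"],
    "Age": [15, 14, 14, 30],
    "Rank": ["Soldier", "Soldier", "Soldier", "Captain"],
})

# Make Rank categorical: to_stata writes the integer codes as the values
# and the categories as the Stata value labels.
df["Rank"] = pd.Categorical(df["Rank"], categories=["Soldier", "Captain"])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "people.dta")  # hypothetical file name
    df.to_stata(path, write_index=False)

    # Read the file back both ways to confirm the labels were written.
    labelled = pd.read_stata(path, convert_categoricals=True)   # labels
    raw = pd.read_stata(path, convert_categoricals=False)       # stored codes

print(labelled["Rank"].tolist())
print(raw["Rank"].tolist())
```

If you need specific numeric codes rather than the ones pandas derives from the categories, recent pandas versions (1.4 and later, as far as I know) also accept a value_labels dict in to_stata; check the documentation of your installed version before relying on it.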
If you want to better understand the reasoning behind the code above, here are the references I used to learn it.