statastata-macros

Conditional Division in Stata


I have a dataset that looks like the following:

  year      Gender        VP
                
  2023      Total           3                  
  2023       F              2            
  2023       M              1            
  2022      Total           6                 
  2022       F              4                  
  2022       M              2               
  2021      Total           2                
  ...

What I want to do is generate a variable VP_per that gives me the female percentage of VPs for every year.

Basically something that looks like

bysort year gender : gen VP_per = VP if gender==1 / VP if gender==2

where 1 is the numeric value for Female and 2 is the numeric value for Total.


Solution

  • Your data layout -- in which subtotals are held in separate observations -- is not especially suitable for most Stata purposes, but it can be accommodated.

    Here is some technique, which can be modified for numeric variables with value labels.

    clear 
    input int Year str5 Gender byte VP
    2023 "Total" 3
    2023 "F"     2
    2023 "M"     1
    2022 "Total" 6
    2022 "F"     4
    2022 "M"     2
    end
    
    * this hinges on sorting F M Total 
    bysort Year (Gender) : gen wanted = 100 * VP[1] / VP[3] 
    
    * need to modify for numeric variables 
    egen num = total(cond(Gender == "F", VP, .)), by(Year)
    egen den = total(cond(Gender == "Total", VP, .)), by(Year) 
    gen wanted2 = 100 * num/den 
    

    If Gender were numeric with 1 for Female and 2 for Total

    egen num = total(cond(Gender == 1, VP, .)), by(Year)
    egen den = total(cond(Gender == 2, VP, .)), by(Year)