plot graph stata

Filter highest values for a plot


I have this code, and I want to show only the highest values, i.e. occupations with a female share above 10 percent:

    graph hbar (percent) if Sector=="Professional and Personal Services", over(sex) over(Occupation, sort(1) descending)

I was expecting to see only the most relevant occupations.

input str24 Sector str10 Sex str20 Occupation
"Trade and Transportation" "Male" "Driver"
"Trade and Transportation" "Female" "Driver"
"Trade and Transportation" "Female" "Driver"
"Trade and Transportation" "Female" "Driver"
"Trade and Transportation" "Male" "Manager"
"Trade and Transportation" "Female" "Manager"
"Trade and Transportation" "Male" "Clerk"
"Trade and Transportation" "Female" "Clerk"
"Manufacturing" "Male" "Worker"
"Manufacturing" "Female" "Worker"
"Manufacturing" "Female" "Worker"
end

Solution

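  • There are various ways to do this. Here is one: calculate the percent female for each occupation first, then plot only the occupations above the threshold.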

    clear 
    * str24, because "Trade and Transportation" has 24 characters and str20 would truncate it
    input str24 Sector str10 Sex str20 Occupation
    "Trade and Transportation" "Male" "Driver"
    "Trade and Transportation" "Female" "Driver"
    "Trade and Transportation" "Female" "Driver"
    "Trade and Transportation" "Female" "Driver"
    "Trade and Transportation" "Male" "Manager"
    "Trade and Transportation" "Female" "Manager"
    "Trade and Transportation" "Male" "Clerk"
    "Trade and Transportation" "Female" "Clerk"
    "Manufacturing" "Male" "Worker"
    "Manufacturing" "Female" "Worker"
    "Manufacturing" "Female" "Worker"
    end
    
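    * Indicator scaled to 100 for women, so that its mean is the percent female
    gen female = 100 * (Sex == "Female") 
    egen pcfemale = mean(female), by(Occupation)
    
    * Plot only occupations whose female share exceeds 10 percent
    graph hbar pcfemale if pcfemale > 10, ytitle(% female) over(Occupation, sort(1) descending) blabel(bar, format(%2.1f))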
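
    Before graphing, it may help to check the calculated percents against a table. A minimal sketch, assuming the female variable created above: the mean of a 0/100 indicator within each occupation is the percent female. With the example data this should show 75 for Driver, 50 for Manager and Clerk, and about 66.7 for Worker, so every occupation passes the 10 percent cut here.

    ```stata
    * Mean of the 0/100 indicator by occupation = percent female
    * (the means option limits the table to the means column)
    tabulate Occupation, summarize(female) means
    ```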
    

    If you want to look at a particular sector, add a condition on Sector to the if qualifier.
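
    As a sketch of that, the graph command above can be restricted to a single sector. The choice of "Manufacturing" is only for illustration; it is short enough to match whatever string width Sector was read with. Note that pcfemale was calculated by Occupation across all sectors, which is harmless in the example data because each occupation occurs in only one sector.

    ```stata
    * Same graph, limited to one sector ("Manufacturing" is an illustrative choice)
    graph hbar pcfemale if pcfemale > 10 & Sector == "Manufacturing", ytitle(% female) over(Occupation, sort(1) descending) blabel(bar, format(%2.1f))
    ```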

    If you want to look at several sectors together, much depends on what you want to do, but the principle of calculating the percents first is likely to be just as important.
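
    One possible way to do that, offered as a sketch rather than the only approach: calculate the percent female within each combination of sector and occupation, then draw one panel per sector. The variable name pcfemale2 is hypothetical, and the code assumes the female variable created above. The nofill option is meant to drop occupations that have no data in a given panel.

    ```stata
    * Percent female within each sector-occupation cell (pcfemale2 is a hypothetical name)
    egen pcfemale2 = mean(female), by(Sector Occupation)

    * One panel per sector; only cells above 10 percent are plotted
    graph hbar pcfemale2 if pcfemale2 > 10, ytitle(% female) over(Occupation, sort(1) descending) nofill by(Sector) blabel(bar, format(%2.1f))
    ```

    Whether panels by sector or a single combined graph works better depends on how many sectors and occupations remain after filtering.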