countstataunique-values

Count of how many times a unique value appears across all columns and rows of a dataset?


I have unique IDs in rows where columns are the IDs of their 'sent' friends. To get a count of 'received' friends, I need to get a count for how many times an ID appears across all columns and rows of the dataset. This is easy in R, but I'd like to stay with Stata for this project.

ID F1_ID F2_ID F3_ID ID_mentions
1 2 3 4 4
2 4 1 4
3 1 2 3
4 2 1 3 3

Toy data above. Here, there are four mentions of ID #1, three mentions of ID #4, etc.

I want to generate a variable containing the count of how many times each ID value in the first column is mentioned in any column of the data set. This is illustrated in the ID_mentions column.


Solution

  • Turns out I wrote something in this territory. You would need to install this with ssc install tab_chi

    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(id f1_id f2_id f3_id)
    1 2 3 4
    2 4 1 .
    3 1 2 .
    4 2 1 3
    end
    
    tabm *id
    
               |                   values
      variable |         1          2          3          4 |     Total
    -----------+--------------------------------------------+----------
            ID |         1          1          1          1 |         4 
         F1_ID |         1          2          0          1 |         4 
         F2_ID |         2          1          1          0 |         4 
         F3_ID |         0          0          1          1 |         2 
    -----------+--------------------------------------------+----------
         Total |         4          4          3          3 |        14 
    

    EDIT To count all mentions:

    gen mentions = . 
    
    quietly forval i = 1/`=_N' { 
        egen work = anycount(*id), value(`=id[`i']')
        su work, meanonly 
        replace mentions = r(sum) in `i'
        drop work 
    }
    
    list