
R: Iterate through a for loop to print multiple tables


The house price prediction dataset has 81 variables and 1459 observations.
To understand the data better, I have isolated the variables of character type.

char_variables = sapply(property_train, is.character)  
char_names = names(property_train[,char_variables])  
char_names

There are 42 character variables.
I want to count how often each value occurs in each of them.
For a single variable, the code is simple:

table(property_train$Zoning_Class)  

    Commer    FVR    RHD    RLD    RMD 
        10     65     16   1150    218

But repeating this for all 42 variables would be tedious.
The for loops I've tried for printing all the tables give an error.

for (val in char_names){  
    print(table(property_train[[val]]))
    }


    Abnorml AdjLand  Alloca  Family  Normal Partial 
        101       4      12      20    1197     125 

Is there a way to iterate over char_names and print all 42 tables from the data frame?

str(property_train)

    'data.frame':   1459 obs. of  81 variables:  
     $ Id                       : int  1 2 3 4 5 6 7 8 9 10 ...  
     $ Building_Class           : int  60 20 60 70 60 50 20 60 50 190 ...  
     $ Zoning_Class             : chr  "RLD" "RLD" "RLD" "RLD" ...  
     $ Lot_Extent               : int  65 80 68 60 84 85 75 NA 51 50 ...  
     $ Lot_Size                 : int  8450 9600 11250 9550 14260 14115 10084 10382..   
     $ Road_Type                : chr  "Paved" "Paved" "Paved" "Paved" ...  
     $ Lane_Type                : chr  NA NA NA NA ...  
     $ Property_Shape           : chr  "Reg" "Reg" "IR1" "IR1" ...  
     $ Land_Outline             : chr  "Lvl" "Lvl" "Lvl" "Lvl" ...  

Solution

  • Your code runs without error for me (make sure to run all lines of the for loop together):

    property_train <- data.frame(a = 1:10,
                     b = rep(c("A","B"),5),
                     c = LETTERS[1:10],
                     stringsAsFactors = FALSE)  # keeps b and c as character in R < 4.0
    
    char_variables = sapply(property_train, is.character)
    char_names = names(property_train[,char_variables])
    char_names
    
    table(property_train$b)
    
    for (val in char_names){
      print(table(property_train[val])) 
    }
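
    As an alternative to the explicit loop, base R's lapply can build all the tables in one call and return them as a named list. This is only a minimal sketch: the small data frame below is a made-up stand-in for the real property_train, and useNA = "ifany" is an optional addition so that columns with missing values (such as Lane_Type in the question) also report an NA count.

    ```r
    # Hypothetical stand-in for property_train; column names are made up.
    property_train <- data.frame(
      a = 1:10,
      b = rep(c("A", "B"), 5),
      c = c(LETTERS[1:9], NA),
      stringsAsFactors = FALSE
    )

    # Flag the character columns, then tabulate each one.
    char_variables <- sapply(property_train, is.character)
    tables <- lapply(property_train[char_variables], table, useNA = "ifany")

    # Printing the list shows every table under its column name ($b, $c).
    print(tables)
    ```

    Keeping the tables in a list also makes it possible to pick out a single one later, for example tables$b.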
    

    You can also get this result in a slightly more user-friendly form with dplyr and tidyr, by pivoting all the character columns into long format and counting every column-value combination:

    library(dplyr)
    library(tidyr)
    
    property_train %>% 
      select(where(is.character)) %>% 
      pivot_longer(cols = everything(), names_to = "column") %>% 
      group_by(column, value) %>% 
      summarise(freq = n(), .groups = "drop")
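
    If dplyr and tidyr are not available, a similar long-format count can be sketched in base R. This is an illustrative assumption rather than part of the approach above: the data frame is a made-up stand-in for property_train, and unlike the dplyr pipeline, table() drops NA values by default.

    ```r
    # Hypothetical stand-in for property_train; column names are made up.
    property_train <- data.frame(
      a = 1:10,
      b = rep(c("A", "B"), 5),
      c = LETTERS[1:10],
      stringsAsFactors = FALSE
    )

    # Keep only the character columns.
    char_cols <- property_train[sapply(property_train, is.character)]

    # stack() reshapes them into two long columns: values and ind
    # (ind holds the name of the source column).
    long <- stack(char_cols)

    # Cross-tabulate column name by value, then keep combinations that occur.
    freq <- as.data.frame(table(column = long$ind, value = long$values),
                          stringsAsFactors = FALSE)
    freq <- freq[freq$Freq > 0, ]
    freq <- freq[order(freq$column, freq$value), ]
    print(freq, row.names = FALSE)
    ```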