spss

calculation with different values in SPSS


I have a general problem but I will give an example of what I am looking for:

Let's say I want to find the difference between the time it took an examinee to complete a test and the total test time. In SPSS I would write:

COMPUTE test_diff =  total_test_time - examinee_test_time.

This is simple. However, the total test time is different for each language (for example, 60 minutes in English and 70 minutes in Spanish).

What syntax or function in SPSS makes total_test_time change according to the language?

DO IF looks cumbersome, since there are a lot of languages.

My logic says that I need some kind of "table" with all the languages and test_times. SPSS needs to identify which language the test was completed in and take the appropriate value of total test time. But I don't know how to do this in SPSS, or maybe there is a better solution.

I applied the solution suggested, adjusted for SPSS V.25:

GET  FILE='C:\Users\givol\Desktop\languages.sav'.
DATASET NAME languages WINDOW=FRONT.

DATASET ACTIVATE languages.
SORT CASES by language.


GET FILE='C:\Users\givol\Desktop\mydata.sav'.
DATASET NAME mydata WINDOW=FRONT.

DATASET ACTIVATE mydata.
SORT CASES by language.

MATCH FILES  /FILE=*
/TABLE=languages
/BY language.
EXECUTE.

COMPUTE test_diff = total_test_time - examinee_test_time.
EXECUTE.

However, since I have multiple cases with the same language, I get an error message, total_test_time is added but empty, and the computed variable is missing for every case.


Solution

  • If there are only a few languages, you can create a total_test_time variable by recoding the language variable:

    recode language 
         ("English"=60)
         ("Hebrew" "Arabic"=65)
         ("Spanish"=70)
      into total_test_time .
    

    If you have a long list of languages, writing them all into the syntax would be cumbersome. And if the list may change or grow, you need an easier way to keep the values up to date.
    So the second way is to create a dataset with two columns, language and total_test_time (called "languages" in the syntax below), and then match it to your original data (the dataset "mydata" below) by language:

    * in order to match the two datasets they need to be sorted first.
    dataset activate languages.
    sort cases by language.
    dataset activate mydata.
    sort cases by language.
    
    * The original dataset "mydata" contains only the "language" 
    * column and not the "total_test_time" column. The following syntax 
    * will match each row with the right "total_test_time" value according 
    * to the row's "language" value.
    
    match files /file=* /table=languages /by language.
    execute.
    

    Summary: the first method creates the total_test_time variable through syntax, and the second imports it from another table. Either way you end up with the new variable, which you can then use in your original calculation:

    COMPUTE test_diff = total_test_time - examinee_test_time.
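
For illustration, the second method could be sketched end to end as below. This is only a sketch under some assumptions: the lookup values are the example times from the recode above, the A20 string width is a guess that should be adjusted to the width of language in your own data, and a dataset named "mydata" is assumed to be open already. The lookup table should contain each language only once. The last three commands are an optional check that lists any language in mydata that found no match in the lookup table.

```spss
* Build the lookup table inline (example values only, A20 width is an assumption).
DATA LIST LIST /language (A20) total_test_time (F3.0).
BEGIN DATA
"English" 60
"Hebrew" 65
"Arabic" 65
"Spanish" 70
END DATA.
DATASET NAME languages.
SORT CASES BY language.

* Sort the main data by the same key and pull in total_test_time.
DATASET ACTIVATE mydata.
SORT CASES BY language.
MATCH FILES /FILE=* /TABLE=languages /BY language.
EXECUTE.

* Optional check: list languages that did not get a total_test_time.
TEMPORARY.
SELECT IF MISSING(total_test_time).
FREQUENCIES VARIABLES=language.

COMPUTE test_diff = total_test_time - examinee_test_time.
EXECUTE.
```

If the frequency table in the check lists any languages, their spelling or letter case probably differs between the two datasets, since string keys have to match exactly.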