sas

SAS - Modifying key-value pairs in a macro variable


I have a macro variable with key-value pairs, separated by a space. Sometimes, a value itself can consist of key-value pairs, but now separated with a semi-colon (and one semi-colon at the end) and enclosed by double quotes. Single values are sometimes enclosed by double quotes and sometimes not. I would like to be able to adjust the contents of this macro variable in a structured and flexible way.

My macro variable has the following structure:

%LET connection_string = KEY1=value1 KEY2="Value2" KEY3="KEY3A=VALUE3A;KEY3B=VALUE3B;KEY3C=VALUE WITH SPACES3C;" KEY4="KEY4A=VALUE4A;KEY4B=VALUE4B;";

I would like to be able to add, delete or update key-value pairs. The exact number of (sub-)key-value pairs that exist in the connection_string is not known beforehand. The modifications should be condition-based, e.g.:

The key-value pairs that I need to modify differ per project, but suppose that I need to add/delete/update the following key-value pairs:

My idea would be to:

  1. decompose the string in the macro variable into a dataset
  2. decompose any key-value pairs into another dataset, if the value of that pair has at least 1 semi-colon
  3. modify the datasets according to the changes that need to be made
  4. roll-up the datasets and compose the connection_string and store this in a new macro variable connection_string_upd

But perhaps there is a better way as well.

Dataset based on original macro variable:

| key  | value
__________________________________________________________________
| KEY1 | value1
| KEY2 | "Value2"
| KEY3 | "KEY3A=VALUE3A;KEY3B=VALUE3B;KEY3C=VALUE WITH SPACES3C;"
| KEY4 | "KEY4A=VALUES4A;KEY4B=VALUE4B;"

Dataset with modifications

| key  | value
___________________________________________________________________________________
| KEY1 | value1
| KEY2 | "Value2"
| KEY3 | "KEY3A=VALUE3A;KEY3B=NEW_VALUE3B;KEY3C=VALUE WITH SPACES3C;KEY3D=VALUE3D;"
| KEY4 | "KEY4A=VALUE4A;"
| KEY5 | "VALUE5"

The result should be:

%PUT &connection_string_upd.;
KEY1=value1 KEY2="Value2" KEY3="KEY3A=VALUE3A;KEY3B=NEW_VALUE3B;KEY3C=VALUE WITH SPACES3C;KEY3D=VALUE3D;" KEY4="KEY4A=VALUE4A;" KEY5="VALUE5";

How would I best approach this?

It feels like this problem of decomposing/composing seems mostly suited for data steps. I tried to have a go, but these double quotes are making my life hard. Code snippets below didn't get my where I want to go. Hopefully, someone can help.

DATA test;
    INPUT input_text : $1024.;
    INFILE DATALINES DSD;
    input_text_resolved = DEQUOTE(RESOLVE(QUOTE(input_text)));
    PUT input_text_resolved;
    DATALINES;
&connection_string.
    ;
RUN;

/* this gives a lot of errors */
DATA test2;
    /* try to double the double quotes (to escape them) */
    DO i=1 TO COUNTW(COMPRESS("&connection_string.", '"', '""'));
        item = SCAN(COMPRESS("&connection_string.", '"'), i, ' ');
        OUTPUT;
    END;
RUN;

Solution

  • It's clunky but this parses connection string and updates. You should be able to adapt parse the update string.

    %LET connection_string = KEY1=value1 KEY2="Value2" KEY3="KEY3A=VALUE3A;KEY3B=VALUE3B;KEY3C=VALUE WITH SPACES3C;" KEY4="KEY4A=VALUE4A;KEY4B=VALUE4B;";
    data test master(keep=id sid value);
       string = symget('connection_string');
       length pair pair2 $128 id sid $32 value1-value2 $128;
       do i = 1 to countw(string,,'qs');
          pair = scan(string,i,,'sq');
          put pair =;
          p = indexc(pair,'=');
          id = substr(pair,1,p-1);
          value1 = substr(pair,p+1); 
          if indexc(value1,'=') then do;
             value1 = dequote(value1);
             do j = 1 to countw(value1,';');
                pair2 = scan(value1,j,';');
                if not missing(pair2) then do;
                   p2 = indexc(pair2,'=');
                   sid = substr(pair2,1,p2-1);
                   value2 = substr(pair2,p2+1);
                   value = value2;
                   output;
                   end;
                end;
             end;
          else do;
             value = value1;
             output;
             end;
          end; 
       run;
    proc print;
       run;
    
    data trans;
       length id sid $32 value $64;
       input id sid value &;
       cards4;
    KEY3 KEY3B NEW_VALUE3B (update in KEY3)
    KEY3 KEY3D VALUE3D (add in KEY3)
    KEY4 KEY4B _
    KEY5 . "VALUE5" (add as KEY5)
    ;;;;
       run;
    proc print;
       run;
    
    data updates;
       update master trans;
       by id sid;
       run;
    proc print;
       run;
    

    enter image description here