intersystems-cache mumps

String Replacing


In your opinion, what would be the best way to replace something in a string without using $R? I've written a global and I'm trying to replace every PETER with PAUL, but without using $R. Here's the first iteration of what I thought would work, but it only replaces the first PETER. What would you suggest for multiple PETERs on the same line?

Start  
SET ary="^XA"
SET queryary=$QUERY(@ary@(""))
WRITE !,@queryary
FOR   {
SET queryary=$QUERY(@queryary) 
    QUIT:queryary=""  
    w !,$p(@queryary,"PETER",1)_"PAUL"_$p(@queryary,"PETER",2,$l(@queryary,"PETER"))  

}
  QUIT

This is my second try, but I still have to run it multiple times for it to perform all the changes. Is there something missing in my Loop?

  Start  
  N ary
  S ary="^XA"
  S queryary=$Q(@ary@(""))
  S FROM="PETER"
  S TO="PAUL"
  W !,@queryary
  F   S queryary=$Q(@queryary) Q:queryary=""  w !,@queryary   d 
  . f  s $E(@queryary,$F(@queryary,FROM)-$L(FROM),$F(@queryary,FROM))=TO_" "     Q:ary'["PETER"  
  QUIT

Solution

  • If you're working in Cache and want a utility for this, %GCHANGE is a very powerful program for doing exactly what you described. I've only ever used it as a utility and never called it from a program, but I believe there are labels you can call, passing in your parameters.

    The other thing is that you are using multiple indirections in a loop, which will slow down your program. I suggest building the whole command as a single string and running it with the XECUTE (X) command, so that indirection is applied once to the entire string. You can see this in the example provided below.
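
    As a rough illustration of that suggestion, the sketch below contrasts the two styles on a read-only loop. The label INDDEMO and the local variable names are hypothetical, ^XA is assumed to be the global from the question, and every first-level node is assumed to hold data. How much faster the second style is will depend on the implementation.

    ```mumps
    INDDEMO ;HYPOTHETICAL LABEL, FOR ILLUSTRATION ONLY
     N GLB,I,CMD
     S GLB="^XA"
     ;STYLE 1: NAME INDIRECTION IS EVALUATED ON EVERY REFERENCE
     S I="" F  S I=$O(@GLB@(I)) Q:I=""  W !,@GLB@(I)
     ;STYLE 2: THE NAME IS SPLICED INTO THE COMMAND TEXT ONCE, THEN XECUTED
     S CMD="N I S I="""" F  S I=$O("_GLB_"(I)) Q:I=""""  W !,"_GLB_"(I)"
     X CMD
     Q
    ```

    With GLB set to "^XA", the string handed to XECUTE reads N I S I="" F  S I=$O(^XA(I)) Q:I=""  W !,^XA(I), so the loop body contains no indirection at all.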

    I included two different methods of replacing a string. One uses $P and $L, similar to what Evgeny Shvarov suggested, and the second uses $F and $E.

    On average, the second method performed 33% faster on a global of 100000 nodes with 4 replacements per node.

    I have also included the data generation and testing functions I wrote. They are written in legacy MUMPS code so they should work cross-platform.

    UPDATE: I just checked the GTM documentation. %GCE is a similar utility that is available in GTM.

    UPDATE: I changed the REPLACE function to properly account for the LISA to ELISA problem described by C4xuxo. It still performs faster than using $P and $L.

    UPDATE: Made an adjustment to the value of PS in the REPLACE function to fix a bug.

    ;GLOBAL REPLACE METHOD 
     ;BUILDS ONE COMMAND STRING SO THAT XECUTE RESOLVES THE GLOBAL NAME ONCE
    GLBREPLACE(GLB,STR1,STR2) ;(GLOBAL NAME, STRING TO MATCH, STRING TO REPLACE WITH)
     N CMD
     S CMD="N I S I="""" F  S I=$O("_GLB_"(I)) Q:I=""""  S "_GLB_"(I)=$$REPLACE("_GLB_"(I),"""_STR1_""","""_STR2_""")"
     X CMD Q
    
    ;STRING REPLACE METHOD
    REPLACE(STR,V1,V2) ;(INPUT STRING, STRING TO MATCH, STRING TO REPLACE WITH)
     ;F2=POSITION JUST PAST THE MATCH, F1=START OF THE MATCH
     ;PS=START OF THE NEXT SEARCH, I.E. THE FIRST CHARACTER AFTER THE INSERTED V2
     N I,L,F1,F2,PS S PS=0,L=$L(STR,V1) F I=1:1:L-1 S F2=$F(STR,V1,PS),F1=F2-$L(V1),$E(STR,F1,F2-1)=V2,PS=F1+$L(V2) 
     Q STR
    
    
    
    ;======================================================================
    ;ADDITIONAL FUNCTIONS
    
    ;THIS IS AN ALTERNATE METHOD, DOESN'T ADDRESS THE LISA TO ELISA PROBLEM
    REPLACE2(STR,V1,V2) 
     N I F I=1:1:$L(STR,V1)-1 S STR=$P(STR,V1)_V2_$P(STR,V1,2,$L(STR,V1))
     Q STR
    
    TESTGLBREPLACE ;THIS FUNCTION TESTS GLBREPLACE AND MEASURES PERFORMANCE
     ;NOTE: THE TIMED INTERVAL ALSO INCLUDES THE GENDATA CALL
     S STIM=$ZTS S COUNT=100000
     D GENDATA(COUNT),GLBREPLACE("^XA","Peter","PAUL")
     S ETIM=$ZTS,TIMDIF=$P(ETIM,",",2)-$P(STIM,",",2),OCCURS=COUNT*4
     W !,"REPLACED "_OCCURS_" OCCURRENCES IN "_TIMDIF_" SECONDS"
     Q
    
    GENDATA(L) ;THIS FUNCTION GENERATES DATA FOR A GIVEN COUNT (L=INTEGER)
     F I=1:1:L S ^XA(I)="Peter Piper picked a peck of pickled peppers; A peck of pickled peppers Peter Piper picked; If Peter Piper picked a peck of pickled peppers, Where's the peck of pickled peppers Peter Piper picked"
     Q
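
    To make the LISA to ELISA problem concrete, here is a small sketch that runs both functions on the same input. LISADEMO is a hypothetical label and the test string is made up; it assumes the label is added to the same routine as REPLACE and REPLACE2 above.

    ```mumps
    LISADEMO ;HYPOTHETICAL LABEL, NOT PART OF THE ORIGINAL ANSWER
     N S
     S S="LISA and then LISA"
     W !,$$REPLACE(S,"LISA","ELISA")   ;WRITES ELISA and then ELISA
     W !,$$REPLACE2(S,"LISA","ELISA")  ;WRITES EELISA and then LISA
     Q
    ```

    REPLACE2 goes wrong because $P always splits at the first match in the whole string. After the first pass, that first match is the LISA inside the newly inserted ELISA, so the second pass rewrites it again and never reaches the second name. REPLACE avoids this by restarting its $F search after the text it has just inserted.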