string xquery marklogic type-conversion untyped-variables

XQuery XDMP-ARGTYPE error: the expected type of a function input is different from the actual type


I'm trying to remove stop words from a text in MarkLogic 8 using this function:

declare function rec:remove-stop-words($string, $stop_words)  {  
  (: This is a recursive function. :)  
  if(not(empty($stop_words))) then  
    rec:remove-stop-words(  
      replace($string, $stop_words[1], '', 'i'),  
      (: This passes along the stop words after 
         the one just evaluated. :)  
      $stop_words[position() > 1]  
    )  
  else normalize-space($string)  
};  

Here is where I call it:

for $r in /rec:Record
return
  rec:remove-stop-words(data($r/rec:Abstract), $stop_words)

It gives me the following error:

XDMP-ARGTYPE: (err:XPTY0004) fn:replace((xs:untypedAtomic(" chapter utilized asymmetry of n..."), xs:untypedAtomic(" book interrelationship between ...")), "a", "", "i") -- arg1 is not of type xs:string?

The function expects a string, but the actual type is untypedAtomic. I don't know what to do! Note: I don't think the problem is in the function itself, because I tried it on a different text and it worked well.

I tried to fix the code by converting untypedAtomic to string:

return
  <info>{rec:remove-stop-words(data(xs:string($r/rec:Abstract)), $stop_words)}</info>

but I got this error:

XDMP-ARGTYPE: (err:XPTY0004) fn:replace((" chapter utilized asymmetry of n...", " book interrelationship between ..."), "a", "", "i") -- arg1 is not of type xs:string


Solution

  • The problem is that when you iterate over /rec:Record and pass $r/rec:Abstract as input, at least one of your records contains more than one rec:Abstract element. You can see this in the error message, where arg1 of fn:replace is a sequence of two values. The function signature for rec:remove-stop-words is untyped, so it accepts a sequence of values for $string, but the function body passes $string to fn:replace, which only accepts a single value. That is why it throws an argument exception (given xs:string+ and expecting xs:string?). Casting to xs:string does not help, because the cardinality is wrong, not the item type.

    You can handle the sequence by iterating over rec:Abstract before you call the function:

    for $r in /rec:Record
    for $a in $r/rec:Abstract
    return
      rec:remove-stop-words($a, $stop_words)
    

    Stricter function signatures can help avoid problems like this, or at least make them easier to debug. For example, you can declare the function so that the first parameter accepts only a single value:

    declare function rec:remove-stop-words($string as xs:string, $stop_words as xs:string*)
    ...
    

    This will throw a similar exception when a sequence is passed as $string, but it is raised at the call to rec:remove-stop-words rather than inside fn:replace, higher up the call stack, which makes this kind of error a little more obvious.
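
Putting both suggestions together, a minimal sketch of a complete main module might look like the following. The namespace URI, the sample record, and the stop word list are made up for illustration; substitute your own. The recursion is the same as in the question, only with explicit types and with fn:subsequence used to pass along the remaining stop words.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace URI; replace it with the one your documents use. :)
declare namespace rec = "http://example.com/rec";

(: Same logic as the function in the question, with explicit types:
   $string must be exactly one string, $stop_words may be zero or more. :)
declare function rec:remove-stop-words(
  $string as xs:string,
  $stop_words as xs:string*
) as xs:string {
  if (fn:empty($stop_words)) then
    fn:normalize-space($string)
  else
    rec:remove-stop-words(
      (: Each stop word is used as a case-insensitive regex pattern. :)
      fn:replace($string, $stop_words[1], '', 'i'),
      (: Pass along the stop words after the one just evaluated. :)
      fn:subsequence($stop_words, 2)
    )
};

(: Made-up sample data: one record with two abstracts,
   which is the situation that triggered the original error. :)
let $stop_words := ("the", "of")
let $records :=
  <rec:Record>
    <rec:Abstract>The asymmetry of the chapter</rec:Abstract>
    <rec:Abstract>The interrelationship of the book</rec:Abstract>
  </rec:Record>
for $r in $records
for $a in $r/rec:Abstract
return
  (: fn:string() turns each element into a single xs:string. :)
  rec:remove-stop-words(fn:string($a), $stop_words)
```

With this sample data the query should return the two strings "asymmetry chapter" and "interrelationship book". If you pass $r/rec:Abstract directly instead of iterating, the typed signature rejects the two-item sequence at the function call itself, before fn:replace is ever reached.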