python jupyter-notebook ipython-magic jsoniq

Error running a magic programmatically via IPython's run_cell_magic


Consider the following program, which I wrote in two Jupyter Notebook cells.

Cell 1:

import rumbledb as rmbl
%load_ext rumbledb
%env RUMBLEDB_SERVER=http://public.rumbledb.org:9090/jsoniq

Cell 2:

%%jsoniq
parse-json("{\"x\":3}}").x

After executing

spark-submit rumbledb-1.21.0-for-spark-3.5.jar serve -p 9090

in a Git Bash console, when I run these two cells in order, the output of the second cell is

Took: 0.2607388496398926 ms
3

I'd like to rewrite cell 2 so that it invokes the cell magic programmatically, through a function call, rather than literally with the %% syntax. The reason is that I want to encapsulate the query in a function, as I described in this post.

I tried the advice at the end of this answer and rewrote cell 2 as follows:

Cell 3:

from IPython import get_ipython
ipython = get_ipython()
ipython.run_cell_magic('jsoniq', '', 'parse-json("{\"x\":3}}").x')

However, when I ran this cell, I got the following error message:

Took: 2.2189698219299316 ms
There was an error on line 2 in file:/home/ubuntu/:


Code: [XPST0003]
Message: Parser failed. 

Metadata: file:/home/ubuntu/:LINE:2:COLUMN:0:
This code can also be looked up in the documentation and specifications for more information.

  1. Why did I get the error message, and how can I get rid of it?
  2. Why did I not get 3 in the output, and how can I get it?

Solution

  • This one's really subtle! It's one of the rare cases where Stack Overflow's syntax highlighting might directly help in understanding the problem.

    As you can see from the JSONiq docs, the string passed to parse-json must contain literal backslashes that escape the inner double quotes. That is, the argument that reaches the function should read:

    "{\"x\":3}"
    

    However, the third argument in your call is an ordinary Python string literal, so Python interprets each \" as an escape sequence before ipython.run_cell_magic ever sees the text,[1] and the magic ends up receiving this instead:

    "{"x":3}"
    
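The difference can be checked in plain Python, without RumbleDB or IPython. This is only a small illustration; the variable names regular and raw are made up for the example:

```python
# The same source text written as a regular and as a raw string literal.
regular = 'parse-json("{\"x\":3}").x'  # Python turns each \" into a plain "
raw = r'parse-json("{\"x\":3}").x'     # the r prefix keeps the backslashes

print(regular)  # parse-json("{"x":3}").x
print(raw)      # parse-json("{\"x\":3}").x
```

The first printed line is what the jsoniq magic actually received in Cell 3, and it is not valid JSONiq because the string literal ends right after the opening brace.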

    To make the code work properly, you just need to pass in literal backslashes. Any method works, but I'm partial to raw strings:

    ipython.run_cell_magic('jsoniq', '',  r'parse-json("{\"x\":3}").x')
    

    Substituting in that line gave the following output:

    env: RUMBLEDB_SERVER=http://public.rumbledb.org:9090/jsoniq
    Took: 0.3014998435974121 ms
    3
    

    [1] This isn't a problem for the actual cell magic version, which treats your parse-json call as straight-up JSONiq code without any Python processing.
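Since the goal in the question was to encapsulate the magic in a function, here is one way that could look. It is a minimal sketch, not part of rumbledb: the name run_jsoniq and its shell parameter are hypothetical, and it assumes the extension has already been loaded and RUMBLEDB_SERVER set, as in Cell 1.

```python
def run_jsoniq(query, shell=None):
    """Run a JSONiq query through the %%jsoniq cell magic.

    Hypothetical helper: assumes `%load_ext rumbledb` has already been run
    and RUMBLEDB_SERVER is set in the current session.
    """
    if shell is None:
        # Imported lazily so the function can be defined outside IPython.
        from IPython import get_ipython
        shell = get_ipython()
    if shell is None:
        raise RuntimeError("run_jsoniq must be called from an IPython/Jupyter session")
    # The second argument is the magic's line (the text after %%jsoniq); empty here.
    return shell.run_cell_magic("jsoniq", "", query)
```

In a notebook it would be called as run_jsoniq(r'parse-json("{\"x\":3}").x'). The caller still has to pass literal backslashes, so a raw string (or doubled backslashes) is needed at the call site.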