python matplotlib networkx mpld3 mplcursors

mpld3.show() raises TypeError: Object of type int is not JSON serializable


I want to create a network where you can hover over each label to read it interactively.

I am using JupyterLab; the versions are (selected Jupyter core packages):

IPython          : 7.6.1
ipykernel        : 5.1.1
ipywidgets       : 7.6.5
jupyter_client   : 7.0.6
jupyter_core     : 4.8.1
jupyter_server   : not installed
jupyterlab       : 1.0.2
nbclient         : not installed
nbconvert        : 5.5.0
nbformat         : 4.4.0
notebook         : 6.0.0
qtconsole        : 4.5.1
traitlets        : 4.3.2

When I run this code in a jupyter notebook:

import matplotlib.pyplot as plt
import numpy as np
import mpld3

fig, ax = plt.subplots(subplot_kw=dict(facecolor='#EEEEEE'))  # 'axisbg' was removed in newer matplotlib
N = 100

scatter = ax.scatter(np.random.normal(size=N),
                     np.random.normal(size=N),
                     c=np.random.random(size=N),
                     s=1000 * np.random.random(size=N),
                     alpha=0.3,
                     cmap=plt.cm.jet)
ax.grid(color='white', linestyle='solid')

ax.set_title("Scatter Plot (with tooltips!)", size=20)

labels = ['point {0}'.format(i + 1) for i in range(N)]
tooltip = mpld3.plugins.PointLabelTooltip(scatter, labels=labels)
mpld3.plugins.connect(fig, tooltip)

mpld3.show()

which I obtained from here, a new window opens with interactive labels, as expected and identical to the linked example.

My own data is:

index col_A
0     6840
1     6640
2      823
3    57019

index col_B
0     7431
1     5217
2     7431
3    57019

For a network, these are pairs of node labels like this:

col_A  col_B
6840   7431
6640   5217
823    7431
57019  57019

So the output network should have three clusters:

6840-7431-823
6640-5217
57019-57019
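
The expected grouping can be checked directly with networkx. This is only a minimal sketch, assuming the four pairs above are the complete edge list; the names pairs and clusters are illustrative and not part of the original code:

```python
import networkx as nx

# Edge list copied from the col_A / col_B pairs above.
pairs = [(6840, 7431), (6640, 5217), (823, 7431), (57019, 57019)]

G = nx.Graph()
G.add_edges_from(pairs)

# Each connected component is one cluster; sort for a stable printout,
# largest cluster first.
clusters = sorted(
    (sorted(component) for component in nx.connected_components(G)),
    key=len,
    reverse=True,
)
for cluster in clusters:
    print(cluster)
```

Note that 57019-57019 is a self-loop, so that cluster contains a single node.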

When I run this code, which is almost identical to the example code above:

import matplotlib.pyplot as plt
import numpy as np
import mpld3
import mplcursors



import networkx as nx
#G = nx.path_graph(4)
#pos = nx.spring_layout(G)

G = nx.from_pandas_edgelist(final_net,'col_A','col_B',['col_A', 'col_B'])
print(final_net['col_A'][0:10])
print(final_net['col_B'][0:10])

edge_labels = nx.get_edge_attributes(G, "Edge_label")
pos = nx.spring_layout(G)


fig, ax = plt.subplots(subplot_kw=dict(facecolor='#EEEEEE'))
scatter = nx.draw_networkx_nodes(G, pos, ax=ax)
nx.draw_networkx_edges(G, pos, ax=ax)

labels = G.nodes()
tooltip = mpld3.plugins.PointLabelTooltip(scatter, labels=labels)
mpld3.plugins.connect(fig, tooltip)
mplcursors.cursor(hover=True)

mpld3.show()

I do get the correct static image (not shown here).

But I get an error:

TypeError: Object of type int is not JSON serializable

And the network doesn't open in a new window that I can interact with (ideally the interactive network would remain in jupyter anyway).

I converted the columns with pd.to_numeric to see what would happen:

final_net['col_A'] = pd.to_numeric(final_net['col_A'])
final_net['col_B'] = pd.to_numeric(final_net['col_B'])

With the output:

col_A    int64
col_B    int64

But the error remains the same. When I remove the last line, mpld3.show(), the error disappears, but then I only get a static image as output, with no interactivity.

I uninstalled and reinstalled with conda as per here (the error stayed the same), and then I tried dumping to JSON as per here, by doing:

import json
import numpy as np
import pandas as pd  # needed for pd.DataFrame below

data = [[6840, 7431], [6640, 5217], [823, 7431],[57019,57019]]
final_net = pd.DataFrame(data, columns = ['col_A', 'col_B'])

class NumpyEncoder(json.JSONEncoder):
    """ Special json encoder for numpy types """
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

#dumped = json.dumps(final_net, cls=NumpyEncoder)

#with open(path, 'w') as f:
#    json.dump(dumped, f)
    
final_net['col_A'] = json.dumps(final_net['col_A'],cls=NumpyEncoder)
final_net['col_B'] = json.dumps(final_net['col_B'],cls=NumpyEncoder)

When I dump to json and then rerun my network code again, it outputs:

0    "{\"0\":6840,\"1\":6640,\"2\":823,\"3\":57019}"
1    "{\"0\":6840,\"1\":6640,\"2\":823,\"3\":57019}"
2    "{\"0\":6840,\"1\":6640,\"2\":823,\"3\":57019}"
3    "{\"0\":6840,\"1\":6640,\"2\":823,\"3\":57019}"
Name: Entrez Gene Interactor A, dtype: object
0    "{\"0\":7431,\"1\":5217,\"2\":7431,\"3\":57019}"
1    "{\"0\":7431,\"1\":5217,\"2\":7431,\"3\":57019}"
2    "{\"0\":7431,\"1\":5217,\"2\":7431,\"3\":57019}"
3    "{\"0\":7431,\"1\":5217,\"2\":7431,\"3\":57019}"

I also get an image that is wrong (not shown here), and no interactivity.

I'm wondering if someone could show me how to edit my code to make the interactive feature appear (ideally in a Jupyter notebook; if not, it's OK if it opens in a new window).


Solution

  • The problem seems to be that G.nodes() isn't a list of labels: it returns a NodeView object, which is not a plain list. You can get the node numbers or labels by converting it to a list (list(G.nodes())).

    An updated version could look like:

    import matplotlib.pyplot as plt
    import networkx as nx
    import pandas as pd
    import numpy as np
    import mpld3
    
    final_net = pd.DataFrame({'col_A': [6840, 6640, 823, 57019],
                              'col_B': [7431, 5217, 7431, 57019]})
    G = nx.from_pandas_edgelist(final_net, 'col_A', 'col_B', ['col_A', 'col_B'])
    print(final_net['col_A'][0:10])
    print(final_net['col_B'][0:10])
    
    edge_labels = nx.get_edge_attributes(G, "Edge_label")
    pos = nx.spring_layout(G)
    
    fig, ax = plt.subplots(subplot_kw=dict(facecolor='#EEEEEE'))
    scatter = nx.draw_networkx_nodes(G, pos, ax=ax)
    nx.draw_networkx_edges(G, pos, ax=ax)
    
    labels = list(G.nodes())
    tooltip = mpld3.plugins.PointLabelTooltip(scatter, labels=labels)
    mpld3.plugins.connect(fig, tooltip)
    
    mpld3.show()
    

    [image: mpld3 with a networkx graph]
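
To see why the conversion matters, the labels can be checked against Python's standard json module before they are handed to the tooltip plugin. The following is only a sketch under assumptions: the graph is rebuilt from the four example pairs, the names nodeview_ok and encoded are illustrative, and converting to str (rather than list(G.nodes())) is a swapped-in variant that should also cope with node values that are not plain Python ints:

```python
import json
import networkx as nx

G = nx.Graph()
G.add_edges_from([(6840, 7431), (6640, 5217), (823, 7431), (57019, 57019)])

# A NodeView is not a list, so the standard JSON encoder rejects it.
try:
    json.dumps(G.nodes())
    nodeview_ok = True
except TypeError:
    nodeview_ok = False

# Converting every node to str gives one plain-Python label per node,
# in the same order in which the graph iterates over its nodes.
labels = [str(node) for node in G.nodes()]
encoded = json.dumps(labels)  # a list of str is JSON-native

print(nodeview_ok)
print(encoded)
```

If the second json.dumps call succeeds, the same labels list can be passed as labels=labels to PointLabelTooltip in the code above.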