pythonpython-3.xlistmatrixsna

How to convert list of object to sparse matrix in python?


I have list of objects and I want to convert it to sparse matrix in python.

I want the matrix column to be tags and the rows to be titles (There may be duplicate tags between different titles)

I don't know what should I do?

data = [
  {
    'title': 'title1', 'tags': ['tag1', 'tag2', 'tag3']
  },
  {
    'title': 'title2', 'tags': ['tag1']
  }
]

I want something like this: enter image description here


Solution

  • For 0/1 matrix you can use next example:

    data = [
        {"title": "title1", "tags": ["tag1", "tag2", "tag3"]},
        {"title": "title2", "tags": ["tag1"]},
    ]
    
    # using sorted for having tag1 first, tag3 last:
    tags = sorted({t for d in data for t in d["tags"]})
    
    matrix = [[int(tt in d["tags"]) for tt in tags] for d in data]
    print(matrix)
    

    Prints:

    [[1, 1, 1], 
     [1, 0, 0]]
    

    For "pretty" print the matrix:

    data = [
        {"title": "title1", "tags": ["tag1", "tag2", "tag3"]},
        {"title": "title2", "tags": ["tag1"]},
    ]
    
    tags = sorted({t for d in data for t in d["tags"]})
    
    print(("{:<10}" * (len(tags) + 1)).format("", *tags))
    for d in data:
        print(
            ("{:<10}" * (len(tags) + 1)).format(
                d["title"], *[int(tt in d["tags"]) for tt in tags]
            )
        )
    

    Prints:

              tag1      tag2      tag3      
    title1    1         1         1         
    title2    1         0         0