python synset

Merging lists into one as a function output


I created a variable containing a string and a function that iterates over each word in the string, looks up its synonyms, and collects them in a list:

import itertools
from nltk.corpus import wordnet
str_1 = "Help, Describe, AI, biology, data, machine learning, country"
def process_genre(str_1):
    for genre in str_1.split(", "):
        result = []
        for syn in wordnet.synsets(genre):
            for l in syn.lemmas():
                result.append(l.name())
        print(result)

process_genre(str_1)

The issue is that the result contains repeated entries, depending on the number of synonyms WordNet returns for each word, as you can see here:

['aid', 'assist', 'assistance', 'help', 'assistant', 'helper', 'help', 'supporter', 'aid', 'assistance', 'help', 'avail', 'help', 'service', 'help', 'assist', 'aid', 'help', 'aid', 'help', 'facilitate', 'help_oneself', 'help', 'serve', 'help', 'help', 'avail', 'help', 'help']
['describe', 'depict', 'draw', 'report', 'describe', 'account', 'trace', 'draw', 'line', 'describe', 'delineate', 'identify', 'discover', 'key', 'key_out', 'distinguish', 'describe', 'name']
['Army_Intelligence', 'AI', 'artificial_intelligence', 'AI', 'three-toed_sloth', 'ai', 'Bradypus_tridactylus', 'artificial_insemination', 'AI']
['biology', 'biological_science', 'biology', 'biota', 'biology']
['data', 'information', 'datum', 'data_point']
[]
['state', 'nation', 'country', 'land', 'commonwealth', 'res_publica', 'body_politic', 'country', 'state', 'land', 'nation', 'land', 'country', 'country', 'rural_area', 'area', 'country']

What I would like instead:

['account', 'ai', 'AI', 'aid', 'area', 'Army_Intelligence', 'artificial_insemination', 'artificial_intelligence', 'assist', 'assistance', 'assistant', 'avail', 'biological_science', 'biology', 'biota', 'body_politic', 'Bradypus_tridactylus', 'commonwealth', 'country', 'data', 'data_point', 'datum', 'delineate', 'depict', 'describe', 'discover', 'distinguish', 'draw', 'facilitate', 'help', 'help_oneself', 'helper', 'identify', 'information', 'key', 'key_out', 'land', 'line', 'name', 'nation', 'report', 'res_publica', 'rural_area', 'serve', 'service', 'state', 'supporter', 'three-toed_sloth', 'trace']

To summarize, I would like to get a single list of ALL the synonyms of the words in my string (or list) as output, so that I can merge it with the initial list. The idea is to increase the number of words in order to perform some NLP later on.

I have been having a hard time figuring out how to get there and can't find anything satisfying. I believe it has to do with the format of the list of synsets. I haven't managed to use the set() function or to merge the different lists into one as the result of the function.


Solution

  • Don't print inside the loop; return the list instead. You also need to reorganize the code so that result is initialized before the outer loop and returned (or printed) after it.

    def process_genre(str_1):
        result = []
        for genre in str_1.split(", "):
            for syn in wordnet.synsets(genre):
                for l in syn.lemmas():
                    result.append(l.name())
        return result
    
    print(process_genre(str_1))
    

    NB. You can print instead of returning if you really want to.
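The function above returns one list, but it still contains repeats, whereas the desired output in the question is duplicate-free and sorted. A minimal sketch of that last step follows, assuming the per-word lists are already available. The helper name `merge_unique` and the sample data (copied from the question's printed output) are illustrative, not part of the original answer.

```python
from itertools import chain

# Sample data: some of the per-word lists printed in the question.
per_word = [
    ['biology', 'biological_science', 'biology', 'biota', 'biology'],
    ['data', 'information', 'datum', 'data_point'],
    [],  # a word with no synsets, e.g. "machine learning"
]

def merge_unique(lists):
    """Flatten a list of lists, drop duplicates, and sort ignoring case."""
    flat = chain.from_iterable(lists)  # one stream of names from all lists
    unique = set(flat)                 # remove repeated names
    # Sort case-insensitively; the second key element keeps the order of
    # names that differ only by case (such as 'AI' and 'ai') deterministic.
    return sorted(unique, key=lambda name: (name.lower(), name))

merged = merge_unique(per_word)
print(merged)
```

Since `process_genre` already returns a flat list, the same idea should apply to it directly, for example `sorted(set(process_genre(str_1)), key=str.lower)`. Note that names differing only by case may come out in a different relative order than in the question's desired output.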