
Wordnet query to return example sentences


I have a use case where I have a word and I need to know the following things:

  1. Synonyms for the word (just the synonyms are sufficient)
  2. All senses of the word, where each sense contains the synonyms matching the word in that sense, example sentences in that sense (if any), and the part of speech for that sense.

Example: the results of a lookup for the word carry (the linked query and its screenshot are not reproduced here).

For each sense, we have the part of speech (like V), the synonyms matching that sense (transport in the first sense; pack and take in the second; and so on), and example sentences containing the word in that sense ("This train is carrying nuclear waste" and "carry the suitcase to the car" in the first sense, "I always carry money" in the second, and so on).

How do I do this with a WordNet MySQL database? I ran this query, which returns the list of meanings for the word:

SELECT a.lemma, c.definition FROM words a INNER JOIN senses b ON a.wordid = b.wordid INNER JOIN synsets c ON b.synsetid = c.synsetid WHERE a.lemma = 'carry';

How do I get the part of speech, the example sentences, and the synonyms specific to each sense? I queried the vframesentences and vframesentencemaps tables and saw example sentences with placeholders like %s. I tried to match them with the words table based on the wordid column, but the results were badly wrong.

Edit:

For the word carry, if I run these queries, I get synonyms and sense meanings correctly:

1. select * from words where lemma='carry'; -- yields wordid 21354
2. select * from senses where wordid=21354; -- yields 41 synsetids, e.g. 201062889
3. select * from synsets where synsetid=201062889; -- yields the definition "serve as a means for expressing something"
4. select * from senses where synsetid=201062889; -- yields the wordids of all words in that sense, including "carry" itself, e.g. 21354, 29630, 45011
5. select * from words where wordid=29630; -- yields 'convey'

So all I now need is a way of finding the example sentence for the word carry in each of the 41 senses. How do I do it?


Solution

  • You can get the example sentences from the samples table. For example:

    SELECT sample FROM samples WHERE synsetid = 201062889;
    

    yields:

    The painting of Mary carries motherly love

    His voice carried a lot of anger

    So you could extend your query as follows:

    SELECT 
        a.lemma AS `word`,
        c.definition,
        c.pos AS `part of speech`,
        d.sample AS `example sentence`,
        (SELECT 
                GROUP_CONCAT(a1.lemma)
            FROM
                words a1
                    INNER JOIN
                senses b1 ON a1.wordid = b1.wordid
            WHERE
                b1.synsetid = b.synsetid
                    AND a1.lemma <> a.lemma
            GROUP BY b.synsetid) AS `synonyms`
    FROM
        words a
            INNER JOIN
        senses b ON a.wordid = b.wordid
            INNER JOIN
        synsets c ON b.synsetid = c.synsetid
            INNER JOIN
        samples d ON b.synsetid = d.synsetid
    WHERE
        a.lemma = 'carry'
    ORDER BY a.lemma , c.definition , d.sample;
    

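    Note: The subselect with GROUP_CONCAT returns the synonyms of each sense as a comma-separated list in a single column, which cuts down on the number of rows. If you prefer, you could fetch the synonyms in a separate query (or as rows in this query, at the cost of duplicating everything else). Also note that the INNER JOIN on samples drops any sense that has no example sentence; change it to a LEFT JOIN if those senses should still appear (with a NULL example sentence).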
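The separate-query alternative mentioned in the note could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 with a tiny in-memory stand-in for the WordNet tables rather than a real MySQL connection, the helper names `make_toy_db` and `senses_for` are hypothetical, and synset 999 with its words is invented to show a sense without example sentences. Only the table and column names (words, senses, synsets, samples) and the rows for synset 201062889 come from the post.

```python
import sqlite3


def make_toy_db():
    """Build an in-memory stand-in for the WordNet tables (toy rows only)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE words   (wordid INTEGER PRIMARY KEY, lemma TEXT);
        CREATE TABLE synsets (synsetid INTEGER PRIMARY KEY, pos TEXT, definition TEXT);
        CREATE TABLE senses  (wordid INTEGER, synsetid INTEGER);
        CREATE TABLE samples (synsetid INTEGER, sample TEXT);

        -- Rows taken from the post.
        INSERT INTO words VALUES (21354, 'carry'), (29630, 'convey');
        INSERT INTO synsets VALUES
            (201062889, 'v', 'serve as a means for expressing something');
        INSERT INTO senses VALUES (21354, 201062889), (29630, 201062889);
        INSERT INTO samples VALUES
            (201062889, 'The painting of Mary carries motherly love'),
            (201062889, 'His voice carried a lot of anger');

        -- Invented rows: a second sense that has no example sentences.
        INSERT INTO words VALUES (1, 'pack'), (2, 'take');
        INSERT INTO synsets VALUES (999, 'v', 'toy definition without samples');
        INSERT INTO senses VALUES (21354, 999), (1, 999), (2, 999);
    """)
    return con


def senses_for(con, lemma):
    """Return {synsetid: {pos, definition, synonyms, samples}} for a lemma."""
    senses = {}
    # Query 1: one row per sense of the word.
    for synsetid, pos, definition in con.execute(
        """SELECT c.synsetid, c.pos, c.definition
           FROM words a
           JOIN senses b ON a.wordid = b.wordid
           JOIN synsets c ON b.synsetid = c.synsetid
           WHERE a.lemma = ?""",
        (lemma,),
    ):
        senses[synsetid] = {
            "pos": pos, "definition": definition, "synonyms": [], "samples": [],
        }
    if not senses:
        return senses

    ids = list(senses)
    marks = ",".join("?" * len(ids))  # one placeholder per synsetid

    # Query 2: synonyms, i.e. the other words sharing each synset.
    for synsetid, synonym in con.execute(
        f"""SELECT b.synsetid, a.lemma
            FROM senses b JOIN words a ON a.wordid = b.wordid
            WHERE b.synsetid IN ({marks}) AND a.lemma <> ?
            ORDER BY a.lemma""",
        ids + [lemma],
    ):
        senses[synsetid]["synonyms"].append(synonym)

    # Query 3: example sentences; senses without any keep an empty list.
    for synsetid, sample in con.execute(
        f"SELECT synsetid, sample FROM samples "
        f"WHERE synsetid IN ({marks}) ORDER BY sample",
        ids,
    ):
        senses[synsetid]["samples"].append(sample)
    return senses


if __name__ == "__main__":
    for synsetid, sense in senses_for(make_toy_db(), "carry").items():
        print(synsetid, sense)
```

Because synonyms and example sentences are collected independently, nothing is duplicated, and a sense with no example sentences still shows up. With a MySQL driver the placeholder syntax may differ from the `?` used by sqlite3.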

    UPDATE: If you really need synonyms as rows in the results, the following will do it, but I wouldn't recommend it. Synonyms and example sentences both pertain to a particular definition, so the full set of synonyms is repeated for each example sentence. E.g. if there are 4 example sentences and 5 synonyms for a particular definition, the results would have 4 x 5 = 20 rows just for that definition.

    SELECT 
        a.lemma AS `word`,
        c.definition,
        c.pos AS `part of speech`,
        d.sample AS `example sentence`,
        subq.lemma AS `synonym`
    FROM
        words a
            INNER JOIN
        senses b ON a.wordid = b.wordid
            INNER JOIN
        synsets c ON b.synsetid = c.synsetid
            INNER JOIN
        samples d ON b.synsetid = d.synsetid
            LEFT JOIN
        (SELECT 
            a1.lemma, b1.synsetid
        FROM
            senses b1
        INNER JOIN words a1 ON a1.wordid = b1.wordid) subq ON subq.synsetid = b.synsetid
            AND subq.lemma <> a.lemma
    WHERE
        a.lemma = 'carry'
    ORDER BY a.lemma , c.definition , d.sample;
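
To see the row multiplication concretely, here is a small sketch that runs the same join shape against a toy in-memory SQLite database. The helper name `count_rows`, the generated lemmas, and the sample texts are all hypothetical; only the table and column names follow the queries above, and SQLite stands in for MySQL.

```python
import sqlite3


def count_rows(n_synonyms, n_samples):
    """Count result rows for one sense of 'carry' with the given toy sizes."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE words   (wordid INTEGER PRIMARY KEY, lemma TEXT);
        CREATE TABLE synsets (synsetid INTEGER PRIMARY KEY, pos TEXT, definition TEXT);
        CREATE TABLE senses  (wordid INTEGER, synsetid INTEGER);
        CREATE TABLE samples (synsetid INTEGER, sample TEXT);
        INSERT INTO words   VALUES (0, 'carry');
        INSERT INTO synsets VALUES (1, 'v', 'toy definition');
        INSERT INTO senses  VALUES (0, 1);
    """)
    # Invented synonyms that share synset 1 with 'carry'.
    for i in range(1, n_synonyms + 1):
        con.execute("INSERT INTO words VALUES (?, ?)", (i, f"synonym{i}"))
        con.execute("INSERT INTO senses VALUES (?, 1)", (i,))
    # Invented example sentences for synset 1.
    for j in range(n_samples):
        con.execute("INSERT INTO samples VALUES (1, ?)", (f"sample sentence {j}",))

    rows = con.execute("""
        SELECT a.lemma, c.definition, d.sample, subq.lemma
        FROM words a
        JOIN senses b  ON a.wordid = b.wordid
        JOIN synsets c ON b.synsetid = c.synsetid
        JOIN samples d ON b.synsetid = d.synsetid
        LEFT JOIN (SELECT a1.lemma, b1.synsetid
                   FROM senses b1
                   JOIN words a1 ON a1.wordid = b1.wordid) subq
               ON subq.synsetid = b.synsetid AND subq.lemma <> a.lemma
        WHERE a.lemma = 'carry'
    """).fetchall()
    return len(rows)


if __name__ == "__main__":
    print(count_rows(n_synonyms=5, n_samples=4))
```

With 5 synonyms and 4 example sentences the count is 20, matching the 4 x 5 estimate. With no synonyms, the LEFT JOIN still returns one row per example sentence, with a NULL synonym.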