I am trying to find words (specifically physical objects) related to a single word. For example:
Tennis: tennis racket, tennis ball, tennis shoe
Snooker: snooker cue, snooker ball, chalk
Chess: chessboard, chess piece
Bookcase: book
I have tried to use WordNet, specifically the meronym semantic relationship; however, this method is not consistent as the results below show:
Tennis: serve, volley, foot-fault, set point, return, advantage
Snooker: nothing
Chess: chess move, checkerboard (whose own meronym relationships show 'square' & 'diagonal')
Bookcase: shelve
Weighting of terms will eventually be required, but that is not really a concern now.
Anyone have any suggestions on how to do this?
Just an update: Ended up using a mixture of both Jeff's and StompChicken's answers.
The quality of information retrieved from Wikipedia is excellent. Unsurprisingly, it contains a great deal of relevant information, in contrast to some corpora where terms such as 'blog' and 'ipod' do not exist.
The range of results from Wikipedia is the best part. The software is able to match terms such as (lists cut for brevity):
The biggest problem is classifying certain words as physical artefacts; default WordNet is not a reliable resource as many terms (such as 'ipod', and even 'trampolining') do not exist in it.
I think what you are asking for is a source of semantic relationships between concepts. For that, I can think of a number of ways to go:
[...]
Judging by what you say you want to do, I think the last two options are more likely to be successful. If the relationships are not in WordNet, then semantic similarity won't work, and OpenCyc doesn't seem to know much about snooker other than the fact that it exists.
I think a combination of both n-grams and LSA (or something like it) would be a good idea. N-gram frequencies will find concepts tightly bound to your target concept (e.g. tennis ball) and LSA would find related concepts mentioned in the same sentence/document (e.g. net, serve). Also, if you are only interested in nouns, filtering your output to contain only nouns or noun phrases (by using a part-of-speech tagger) might improve results.
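To make the n-gram idea concrete, here is a rough sketch in plain Python. It is not LSA: the second function is only a simple document-level co-occurrence count standing in for it, and the toy corpus, the stopword list, and the function names (`bigram_partners`, `cooccurring_terms`) are all made up for illustration. In practice the documents would come from something like Wikipedia, and a part-of-speech tagger would replace the crude stopword filter.

```python
from collections import Counter
import re

# Hypothetical toy corpus; a real run would use a large corpus such as Wikipedia text.
DOCS = [
    "The tennis ball hit the net before the tennis racket was raised.",
    "She bought a new tennis racket and a tennis ball for the match.",
    "A snooker cue needs chalk, and the snooker ball rolled off the table.",
]

# Crude stand-in for part-of-speech filtering (an assumption, not a real tagger).
STOPWORDS = {"the", "a", "an", "and", "for", "was", "before", "off", "she", "new"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_partners(target, docs):
    """Count words that immediately follow `target` (tightly bound terms)."""
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        for first, second in zip(tokens, tokens[1:]):
            if first == target and second not in STOPWORDS:
                counts[second] += 1
    return counts


def cooccurring_terms(target, docs):
    """Count terms appearing in the same document as `target` (once per document)."""
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        if target in tokens:
            counts.update(
                t for t in set(tokens) if t != target and t not in STOPWORDS
            )
    return counts


if __name__ == "__main__":
    print(bigram_partners("tennis", DOCS).most_common())
    print(cooccurring_terms("tennis", DOCS).most_common())
```

On this toy corpus the bigram count picks up 'ball' and 'racket' for 'tennis', while the co-occurrence count also surfaces looser associations such as 'net' and 'match', along with noise like 'hit' that a noun filter would remove.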