Does anyone know of a morphological realisation tool (preferably one written in Java)? I am working on a project in which I need to generate the correct form of the verb "to be" from a set of input features: male/female, singular/plural, and first/third person. SimpleNLG would be ideal, since it includes a morphological realiser, but it only supports English and French. For example: if the features are male, first person, singular, the result will be "I"; if the features are male, third person, plural, the result will be "they".
You can check out foma, which is a finite-state compiler and C library (it is also available as a standalone executable for Windows). It is based on Kimmo Koskenniemi's computational model, which uses finite-state transducers, and it is an open-source tool compatible with Xerox's xfst. You can see a quick crash course here.
Foma is very easy to use. This repo on GitHub can serve as a sample (check out the spanish.lexc and spanish.foma files). If you put the two scripts in the same directory and start foma there, you can load the file and test the morphological realizer:
foma[0]: source spanish.foma
Opening file 'spanish.foma'.
defined Word: 1.6 kB. 2 states, 64 arcs, Cyclic.
defined Cleanup: 276 bytes. 1 state, 2 arcs, Cyclic.
Root...5, A...2, N...2, V1...65, V2...65, V3...65
Building lexicon...
Determinizing...
Minimizing...
Done!
7.9 kB. 289 states, 441 arcs, 199 paths.
defined Lexicon: 7.9 kB. 289 states, 441 arcs, 199 paths.
9.2 kB. 290 states, 505 arcs, Cyclic.
The good thing about foma is that it is bidirectional: the same transducer can both analyze and realize morphological forms. With apply up it dissects a surface form into its possible analyses, and with apply down it acts as a realizer:
foma[1]: up
apply up> leo
leo+N+Sg
leo+A+Sg
leir+V+3C+PresenteIndicativo+1P+Sg
leer+V+2C+PresenteIndicativo+1P+Sg
lear+V+1C+PresenteIndicativo+1P+Sg
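Since you mentioned Java: each line that apply up prints has the shape lemma+Tag+Tag+..., so it is easy to take apart on the Java side. Here is a minimal sketch of that, assuming the tag format shown in the session above. The class and method names (FomaAnalysis, parse) are made up for illustration and are not part of foma.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of foma: holds one analysis line
// such as "leer+V+2C+PresenteIndicativo+1P+Sg".
class FomaAnalysis {
    final String lemma;
    final List<String> tags;

    FomaAnalysis(String lemma, List<String> tags) {
        this.lemma = lemma;
        this.tags = tags;
    }

    // Splits on '+': the first piece is the lemma, the rest are tags.
    // Assumes '+' never occurs inside a lemma or a tag.
    static FomaAnalysis parse(String line) {
        String[] parts = line.trim().split("\\+");
        List<String> tags = Arrays.asList(parts).subList(1, parts.length);
        return new FomaAnalysis(parts[0], tags);
    }

    public static void main(String[] args) {
        FomaAnalysis a = parse("leer+V+2C+PresenteIndicativo+1P+Sg");
        System.out.println(a.lemma + " " + a.tags);
    }
}
```

You would still need to decide which of the returned analyses you want, since a form like leo gets several.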
For the verb "to be", here is an example of using the transducer as a realizer:
foma[1]: down
apply down> estar+V+1C+PresenteIndicativo+3P+Sg
esta
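If your program holds the features as separate values, the only thing you need to produce for apply down is that tag string. A rough Java sketch of how it could be assembled is below. The class and method names (FomaTagBuilder, lexicalForm) are hypothetical, and the tag names are simply the ones that appear in the session above, so adjust them to whatever your lexc script defines.

```java
import java.util.List;

// Hypothetical helper, not part of foma: builds the lexical-side string
// that you would type at the "apply down>" prompt.
class FomaTagBuilder {

    // Joins the lemma and its feature tags with '+',
    // e.g. estar + [V, 1C, ...] becomes "estar+V+1C+...".
    static String lexicalForm(String lemma, List<String> tags) {
        StringBuilder sb = new StringBuilder(lemma);
        for (String tag : tags) {
            sb.append('+').append(tag);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Tags copied from the example session; the order must match the lexc script.
        List<String> tags = List.of("V", "1C", "PresenteIndicativo", "3P", "Sg");
        System.out.println(lexicalForm("estar", tags));
    }
}
```

Running this prints estar+V+1C+PresenteIndicativo+3P+Sg, the same string used in the apply down example. How you pass it to foma from Java (the C library or an external process) is a separate choice that this sketch does not cover.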
Remember that you define the tags yourself at the start of the lexc script, so you can easily change or extend the existing script in that repo. Once you read through the documentation, you'll quickly get the hang of it. It's very convenient and easy to use. Good luck!