

This thesis is written as part of the preliminary research for a proposed project at the Centre for Text Technology at the North-West University in Potchefstroom, North-West Province, South Africa. It proposes a methodology for constructing a wordnet for Afrikaans, based on the Princeton WordNet developed at Princeton University, USA. All relevant concepts are introduced, starting with an analysis of the prototype wordnet, the Princeton WordNet, after which the focus shifts to its extension to multilingual wordnet databases. Thus, the Morphological Analyzer will pave the way for research on developing various Assamese NLP applications.

Stemming is a technique that reduces an inflected word to its root form. Assamese is a morphologically rich, scheduled Indian language, in which various suffixes are attached to a word depending on context. Normalizing such inflected words helps improve the performance of various Natural Language Processing applications. This paper develops a look-up and rule-based suffix-stripping approach for the Assamese language using WordNet. The authors prepare the dictionary from the root words extracted from the Assamese WordNet and from Named Entities. Appropriate stemming rules for inflected nouns and verbs are set in the rule engine, and the stemmed output is then tested against the morphological root words of the Assamese WordNet and the Named Entities by computing the Hamming distance. The resulting stemmer for the Assamese language achieves an accuracy of 85%. The authors also report the performance of an IR system that applies the Assamese stemmer, showing that it retrieves sense-oriented results for the submitted query.
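
The look-up and rule-based suffix-stripping approach can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_distance` threshold, the longest-suffix-first ordering, and the padding of unequal-length strings before computing the Hamming distance are all assumptions made for illustration. The romanized root and suffix entries are toy data and are not taken from the paper's dictionary or rule engine.

```python
from typing import Collection, Iterable


def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two strings differ.

    Hamming distance is defined for equal-length strings; as an assumption
    of this sketch, the shorter string is padded with spaces.
    """
    width = max(len(a), len(b))
    a, b = a.ljust(width), b.ljust(width)
    return sum(x != y for x, y in zip(a, b))


def stem(word: str,
         roots: Collection[str],
         suffixes: Iterable[str],
         max_distance: int = 1) -> str:
    """Return a root for `word`, or `word` itself if no rule applies."""
    # Step 1 (look-up): the word is already a known root.
    if word in roots:
        return word

    # Step 2 (rule-based stripping): try the longest suffixes first.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            candidate = word[:-len(suffix)]
            if candidate in roots:
                return candidate

            # Step 3 (hypothetical fallback): accept the nearest dictionary
            # root if it lies within `max_distance` of the stripped form.
            best = min(roots,
                       key=lambda r: (hamming_distance(candidate, r), r),
                       default=None)
            if best is not None and hamming_distance(candidate, best) <= max_distance:
                return best

    # No rule produced a dictionary root; leave the word unchanged.
    return word


# Toy romanized data, for illustration only.
ROOTS = {"manuh", "kitap", "kha"}
SUFFIXES = ["bor", "khon", "e"]

if __name__ == "__main__":
    for w in ["manuhbor", "kitapkhon", "manuh", "xyz"]:
        print(w, "->", stem(w, ROOTS, SUFFIXES))
```

In this sketch the dictionary look-up runs before any rule fires, so known roots are never over-stemmed, and a stripped form is accepted only when it matches, or lies close to, a dictionary root.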
