Lingucomponent Sub-Project: Thesaurus Development

The goal of this project is to improve existing thesauri for OpenOffice.org and to create new thesauri for languages that don't have one yet.

This project began by finding an English (US) synonym list whose license was compatible with OpenOffice.org's, and then using that list and some simple software to build a thesaurus for OpenOffice.org 1.x. OpenOffice.org 2.x now uses a thesaurus built automatically from WordNet data, and the internal file format has changed to a text-based one.
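
To give a feel for what a text-based thesaurus file makes possible, here is a minimal Python sketch of a reader. The layout it assumes is not described on this page: an encoding name on the first line, then for each word a "word|count" line followed by that many "(pos)|synonym|synonym..." lines. The function name parse_thesaurus and the sample entries are invented for illustration only, so check the layout against the data file shipped with MyThes before relying on it.

```python
from typing import Dict, List, Tuple

# One meaning: (part-of-speech tag, list of synonyms).
Meaning = Tuple[str, List[str]]


def parse_thesaurus(text: str) -> Tuple[str, Dict[str, List[Meaning]]]:
    """Parse thesaurus text in the ASSUMED layout described above.

    Returns the encoding name from the first line and a mapping from
    each headword to its list of meanings.
    """
    lines = text.splitlines()
    encoding = lines[0].strip()
    entries: Dict[str, List[Meaning]] = {}
    i = 1
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank lines between entries
            continue
        # Headword line: "word|number_of_meanings".
        word, count = lines[i].rsplit("|", 1)
        n = int(count)
        meanings: List[Meaning] = []
        # The next n lines each hold one meaning.
        for raw in lines[i + 1 : i + 1 + n]:
            fields = raw.split("|")
            meanings.append((fields[0], fields[1:]))
        entries[word] = meanings
        i += 1 + n
    return encoding, entries


# Invented sample data, not taken from any real thesaurus file.
SAMPLE = (
    "ISO8859-1\n"
    "simple|2\n"
    "(adj)|elementary|uncomplicated\n"
    "(noun)|herb\n"
)

encoding, entries = parse_thesaurus(SAMPLE)
```

Because the file is plain text, a reader like this needs nothing beyond string splitting, which is one practical advantage of a text-based format over a binary one.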

TODO

Downloads

  • MyThes-1.zip (4.5 MB) - standalone version of the MyThes thesaurus code. It includes an en_US thesaurus in the new format for OOo 2.0 (but not yet the WordNet-based thesaurus).
  • wn2ooo, the script used to create the OOo thesaurus from WordNet data.

Creating a new thesaurus

If you are willing to maintain a website to collect and coordinate a community-developed synonym list for any language, we need your help. Please send an e-mail to dev@lingucomponent.openoffice.org describing your skills and your interest in this project. OpenThesaurus is a web-based tool for building a new thesaurus; it is already used successfully to maintain the German, Polish, and other thesauri. All you need is some knowledge of PHP and MySQL and some server space to run your own instance of OpenThesaurus.



Created: 2001 June.   Last Modified: $Date: 2005/11/04 18:58:26 $, $Revision: 1.15 $