Lingucomponent Sub-Project: Hyphenation

One of the goals of this project is to improve the quality of the ALTLinux hyphenator and to include more hyphenation dictionaries for various languages.

The ALTLinux hyphenator is based on the libhnj library by Raph Levien. It uses TeX hyphenation dictionaries with small corrections. Many languages are currently supported.

It is reasonably easy to port a TeX hyphenation dictionary (usually located in the tex/generic/hyphen/ directory of the TeX tree) to the ALTLinux hyphenation format. To help with hyphenation dictionary creation, there is a standalone version of the hyphenation code (71KB, updated 2005-10-15) with a simple example program that can be used for development and testing. Some hints follow:

  • You need to replace or remove all TeX macros from the dictionary.
  • You need to run substrings.pl (it can be found in the standalone hyphenator linked above).
    Usage: substrings.pl <patterns.dic> <newpatterns.dic>
    This will write the modified file to newpatterns.dic.
  • You need to put an indicator of the character encoding used as the first line of the dictionary file (see hyph_en.dic for an example). Possible values are: ISO8859-1, ISO8859-2, ..., ISO8859-10, KOI8-R.
  • Any user can register their hyphenation dictionary so that it is recognized by OpenOffice.org. Find the file dictionary.lst, either under the main install location in share/dict/ooo/ or under the workstation install location in user/wordbook/, and add a line that specifies your language code, your region code, and the name of your dictionary (do not include the .dic extension). As an example, here is how you would register the English (US) hyphenator:
    HYPH en US hyph_en_US
    Please note that this line is case sensitive and must start with the term HYPH.
  • Now shut down OpenOffice.org (and the QuickStarter, if it is running) and restart it. Then go to Tools -> Options -> Language Settings -> Writing Aids and click the "Edit..." button. Use the pulldown to select your language and make sure the AltLinux Hyphenator box is checked for that language.
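
The encoding and registration hints above can be scripted. The following is only a minimal Python sketch, not part of the standalone hyphenator: the helper names add_encoding_header and register_hyphenator are hypothetical, the files are read as Latin-1 purely so that 8-bit bytes pass through unchanged, and the paths you pass in depend on your own installation.

```python
from pathlib import Path

# Encoding names taken from the hint above (ISO8859-1 ... ISO8859-10, KOI8-R).
KNOWN_ENCODINGS = {f"ISO8859-{n}" for n in range(1, 11)} | {"KOI8-R"}


def add_encoding_header(dic_path, encoding):
    """Hypothetical helper: make `encoding` the first line of a .dic file."""
    if encoding not in KNOWN_ENCODINGS:
        raise ValueError(f"unexpected encoding name: {encoding}")
    path = Path(dic_path)
    # Latin-1 maps every byte to one character, so the patterns are not altered.
    lines = path.read_text(encoding="latin-1").splitlines()
    # Replace an existing encoding line instead of stacking a second one.
    if lines and lines[0].strip() in KNOWN_ENCODINGS:
        lines = lines[1:]
    path.write_text("\n".join([encoding] + lines) + "\n", encoding="latin-1")


def register_hyphenator(lst_path, lang, region, dict_name):
    """Hypothetical helper: add a HYPH line to dictionary.lst if it is missing.

    Returns True if a line was appended, False if it was already present.
    """
    if dict_name.endswith(".dic"):
        raise ValueError("give the dictionary name without the .dic extension")
    # The line is case sensitive and must start with HYPH.
    entry = f"HYPH {lang} {region} {dict_name}"
    path = Path(lst_path)
    text = path.read_text(encoding="latin-1") if path.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line in case the file lacks a trailing newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a", encoding="latin-1") as fh:
        fh.write(f"{separator}{entry}\n")
    return True


# Example calls (paths are placeholders for your own files):
#   add_encoding_header("hyph_en_US.dic", "ISO8859-1")
#   register_hyphenator("dictionary.lst", "en", "US", "hyph_en_US")
```

The registration helper only appends the line when it is absent, so running it twice does not duplicate the entry. OpenOffice.org still has to be restarted afterwards, as described in the last hint.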

Created: 2001 June.   Last Modified: $Date: 2006/01/11 22:14:10 $, $Revision: 1.20 $