How to convert different legacy formats of Tibetan texts into standard Unicode Tibetan

From Digital Tibetan

Online conversion

Encoding                      txt   unicode txt   rtf   html
ACIP Transliteration          yes   yes           yes   yes
ALA-LC Transliteration        yes   yes           yes   yes
Bandrida                      yes   no            yes   yes
Beida Founder                 yes   no            no    no
Huanguang                     yes   no            no    no
LTibetan                      no    no            yes   yes
National Standard Extended    no    yes           yes   yes
Sambhota 1.0 (Sama)           no    no            yes   yes
Sambhota 2.0 (Dedris)         no    no            yes   yes
TCRC Bod-Yig                  no    no            yes   yes
THDL Wylie                    yes   yes           yes   yes
Tibetan Machine               no    no            yes   yes
Tibetan Machine Web           no    no            yes   yes
Tongyuan                      yes   no            yes   yes
Unicode                       no    yes           yes   yes
Wylie Transliteration         yes   yes           yes   yes
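
For scripted lookups, the support table above can be transcribed into a small data structure. The following is only a sketch: the column names are copied from the table as-is, and the helper names (encodings_supporting, columns_for) are made up for this illustration and do not belong to any converter.

```python
# Support matrix transcribed from the table above.
# Column order: txt, unicode txt, rtf, html (names as given in the table).
COLUMNS = ("txt", "unicode txt", "rtf", "html")

SUPPORT = {
    "ACIP Transliteration":       (True,  True,  True,  True),
    "ALA-LC Transliteration":     (True,  True,  True,  True),
    "Bandrida":                   (True,  False, True,  True),
    "Beida Founder":              (True,  False, False, False),
    "Huanguang":                  (True,  False, False, False),
    "LTibetan":                   (False, False, True,  True),
    "National Standard Extended": (False, True,  True,  True),
    "Sambhota 1.0 (Sama)":        (False, False, True,  True),
    "Sambhota 2.0 (Dedris)":      (False, False, True,  True),
    "TCRC Bod-Yig":               (False, False, True,  True),
    "THDL Wylie":                 (True,  True,  True,  True),
    "Tibetan Machine":            (False, False, True,  True),
    "Tibetan Machine Web":        (False, False, True,  True),
    "Tongyuan":                   (True,  False, True,  True),
    "Unicode":                    (False, True,  True,  True),
    "Wylie Transliteration":      (True,  True,  True,  True),
}


def encodings_supporting(column):
    """Return the encodings marked "yes" in the given column, sorted by name."""
    index = COLUMNS.index(column)
    return sorted(name for name, flags in SUPPORT.items() if flags[index])


def columns_for(encoding):
    """Return the columns marked "yes" for the given encoding."""
    return [col for col, ok in zip(COLUMNS, SUPPORT[encoding]) if ok]
```

For example, columns_for("Tongyuan") lists the columns in which Tongyuan is marked "yes".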

Offline conversion

Attu

The makers of PechaMaker have developed Attu, a Windows program that converts RTF documents written in any of a large number of legacy Tibetan fonts into Unicode.

Converter for legacy Tibetan fonts: Attu

Attu currently supports the conversion of the following legacy fonts into Unicode:[2]

  • Monlam: Monlam ouchan 1, Monlam ouchan 2, Monlam ouchan 3, Monlam ouchan 4, Monlam yigchong
  • Nitartha International (Sambhota): Dedris, Drutsa, Ededris, Khamdris, Sama
  • Tibetan Computer Company (TCC): TibetanMachine, TibetanChogyal, TibetanClassic, TibetanCalligraphic, DzongkhaCalligraphic
  • Tibetan Computing Resource Center (TCRC): TCRC Bod-Yig, TCRC Youtso, TCRC Youtsoweb
  • Tibetan Library of Works and Arts (TLWA): TB-Youtso, TB-TTYoutso, TB2-Youtso, TB2-TTYoutso
  • Others: LTibetan, LTibetanExtension, LMantra

See the Attu website for more information.
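
Before running a conversion it can help to check whether the fonts used in a document appear in the list above. The sketch below does this with a plain dictionary; it is not part of Attu, the function name attu_group is invented here, and it assumes that font names match the list exactly apart from letter case (fonts installed under variant names would need extra handling).

```python
# Fonts Attu is listed as supporting, grouped as in the list above.
ATTU_FONTS = {
    "Monlam": [
        "Monlam ouchan 1", "Monlam ouchan 2", "Monlam ouchan 3",
        "Monlam ouchan 4", "Monlam yigchong",
    ],
    "Nitartha International (Sambhota)": [
        "Dedris", "Drutsa", "Ededris", "Khamdris", "Sama",
    ],
    "Tibetan Computer Company (TCC)": [
        "TibetanMachine", "TibetanChogyal", "TibetanClassic",
        "TibetanCalligraphic", "DzongkhaCalligraphic",
    ],
    "Tibetan Computing Resource Center (TCRC)": [
        "TCRC Bod-Yig", "TCRC Youtso", "TCRC Youtsoweb",
    ],
    "Tibetan Library of Works and Arts (TLWA)": [
        "TB-Youtso", "TB-TTYoutso", "TB2-Youtso", "TB2-TTYoutso",
    ],
    "Others": ["LTibetan", "LTibetanExtension", "LMantra"],
}


def attu_group(font_name):
    """Return the group that lists font_name, or None if it is not listed.

    Assumption: names are compared exactly, ignoring case and outer spaces.
    """
    wanted = font_name.strip().lower()
    for group, fonts in ATTU_FONTS.items():
        if any(wanted == font.lower() for font in fonts):
            return group
    return None
```

A result of None only means the name is absent from this list; it says nothing about whether another converter handles the font.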

Although Attu is a Windows program, it also runs on Mac and Linux with Wine.

UDP

For an overview of different available fonts see: Tibetan Fonts.

Tibetan/Dzongkha Font Based Formats[3]

A very useful conversion program for legacy Tibetan text formats is UDP. UDP converts the following legacy Tibetan text formats into Tibetan Unicode:

  • TibetanMachine (free download: www.tibet.dk) and TibetanCalligraphic, TibetanClassic, DzongkhaCalligraphic (commercial fonts: www.tibet.dk)
  • TibetanMachineWeb (free download: www.tibet.dk)
  • Tibetan Modern A (free download: virginia.edu.)
  • Robillard (LTibetan, etc.) (free download: UDP website)
  • Sambhota including Dedris, Eedris, Esama/b/c, Sama/b/c, Samw (free download if only used to view ACIP documents, otherwise commercial license required, see nitartha.org.)
  • TIBETBT (free download via THDL, direct link: www.tibet.cn)
  • Fonts derived from the "P.R.C. National Standard for Tibetan (Extension A)" (aka "Set A"). For documentation, see Chris Fynn's website: Tibetan Extension A. Chris Fynn's Jomolhari font supports this standard.
  • TCRC Bod-Yig, TCRC Youtsoweb, TCRC Youtso (free download: www.tchrd.org)
  • All Tibetan Unicode fonts (of course) (see: Tibetan Fonts)

How to use UDP

First time configuration

  1. Get a copy of UDP from the UDP website and install the application. UDP can be installed on computers running Windows, or on computers running Linux with Wine.
  2. Start UDP and select Options/Font... Select Unicode and choose a Unicode Tibetan font.
  3. Select Options/Advanced... and set "Documents are saved by default in" to Unicode RTF.

From now on, every document you load into UDP is displayed in a Unicode font and saved as Unicode RTF by default. Unicode RTF files can be edited directly in OpenOffice or Microsoft Word. See How to edit Tibetan texts.

Converting files

Note: These conversion procedures work best on Windows, but UDP can also be run on Linux using Wine (see below).

  • TibetDoc documents:
  1. No conversion needed, continue with: Steps common to TibetDoc and Word documents
  • Word documents:
  1. Export as RTF: Save the document containing legacy Tibetan fonts as an RTF document.
  2. Simplify the RTF encoding: Many word processors (such as Microsoft Word) write RTF that is too complex for UDP to parse and that may cause UDP to crash. WordPad (included with Windows) writes a simpler RTF that UDP processes more easily. Steps: (1) Open the RTF file created in step 1 in WordPad. (2) Save it again in WordPad without making any changes.
  • Steps common to TibetDoc and Word documents:
  1. Load into UDP: Load the TibetDoc file, or the RTF file saved in the steps above, into UDP.
  2. Create a Tibetan Unicode RTF file: In UDP, choose File/Save as... and select "Rich text Unicode" as the output format.
  3. Done: Use any Unicode application (e.g. OpenOffice) to work with the resulting file.
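
After the steps above, a quick sanity check is to confirm that the saved file really contains Tibetan Unicode. RTF writes non-ASCII characters as \uN escapes, where N is a decimal (signed 16-bit) code unit, and the Unicode Tibetan block spans U+0F00 to U+0FFF. The sketch below counts such escapes. It is an illustration only: the function name is invented here, it is not a full RTF parser, and it assumes the converter emits Tibetan text as \uN escapes.

```python
import re

# Unicode Tibetan block: U+0F00 .. U+0FFF
TIBETAN_BLOCK = range(0x0F00, 0x1000)

# RTF Unicode escape: backslash, "u", optional minus sign, decimal digits.
# Control words such as \uc1 or \ul do not match, since no digit follows "u".
UNICODE_ESCAPE = re.compile(r"\\u(-?\d+)")


def count_tibetan_escapes(rtf_text):
    """Count RTF Unicode escapes whose code point is in the Tibetan block."""
    count = 0
    for match in UNICODE_ESCAPE.finditer(rtf_text):
        value = int(match.group(1))
        if value < 0:
            # RTF stores values above 32767 as negative 16-bit numbers.
            value += 65536
        if value in TIBETAN_BLOCK:
            count += 1
    return count
```

A count of zero for a file that should contain Tibetan text suggests that the file still uses a legacy font encoding, or that it was written in a form this simple check does not recognize.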

Using UDP in Linux or OS-X with Wine

UDP can be installed on Linux if Wine is installed. Simply run the UDP installation program, which can be downloaded from the UDP website.

Mac OS X users need to install Wine first, for example using MacPorts.
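
Once Wine is installed, a Windows program is normally started by passing its path to the wine command. The sketch below wraps that call. The file name udp-setup.exe used in the comment is a placeholder, not the real installer name, and both function names are invented for this illustration.

```python
import shutil
import subprocess


def wine_command(windows_program, *args):
    """Build the argument list for running a Windows program under Wine."""
    return ["wine", windows_program, *args]


def run_under_wine(windows_program, *args):
    """Run a Windows program under Wine; fail early if Wine is missing."""
    if shutil.which("wine") is None:
        raise RuntimeError("wine was not found on PATH; install Wine first")
    return subprocess.run(wine_command(windows_program, *args), check=True)


# Hypothetical usage (replace the placeholder with the downloaded file):
# run_under_wine("udp-setup.exe")
```

Checking for wine on the PATH first gives a clearer error message than the one raised when the command cannot be found.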

Converting between Wylie and Tibetan Unicode

You can use the following online converter to convert between Tibetan Unicode and Wylie:

Converting Tibetan Unicode into phonetics

The following online converter automatically gives the pronunciation of Tibetan Unicode text according to THDL and Rigpa phonetics systems:

Converting Tibetan fonts

Sources

  1. Table taken from: http://utfc.trace.org/UTFC/resources_tools_UTFC_encoding.html
  2. List taken from the Attu website
  3. Table taken from the UDP website