Formatting rules for Tibetan text

From Digital Tibetan
Revision as of 08:26, 14 September 2008 by Domschl (Talk | contribs)

This style guide defines the formatting, layout and styles used to publish Tibetan booklets, Pechas, and Tibetan-English practice books. A collection of current Unicode standards concerning the encoding and formatting of Tibetan texts can be found here.


General rules for formatting Tibetan texts

Titles and the beginnings of a large text (longer than half a page) open with a yig mgo sequence ༄༅། །. The second and following lines of the heading are indented to the position of the second shad །.

Indented text (like verses)

Headlines use paragraph breaks. Headlines may be colour coded using dark blue.

If a root text contains few comments, comments are written in a smaller font than the root text. If the majority of the text is comment, then the root text is marked by colour (dark red).

A shad ། never starts a new line.

If a shad ། appears after the first syllable of a new line, it is replaced by a rinchen spungs shad ༑. Likewise, the sequence ། ། after the first syllable of a line is replaced by ༑ ༑. This rule does not apply to the first line of a paragraph or to the last shad of a paragraph. It still applies at the end of a section that is followed by yig chung.
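The replacement rule can be sketched in Python as follows. This is a minimal illustration, not the code of the OpenOffice.org Extension: the function name `apply_rin_chen_spungs_shad` is hypothetical, and the sketch assumes the paragraph is already available as a list of laid-out lines, that a syllable is a run of characters in U+0F40..U+0FBC with an optional tsheg, and that "last shad of a paragraph" means a final line consisting only of the syllable and its shad.

```python
import re

SHAD = "\u0F0D"      # TIBETAN MARK SHAD
RIN_SHAD = "\u0F11"  # TIBETAN MARK RIN CHEN SPUNGS SHAD

# First syllable of a line (optionally ending in a tsheg), followed by one
# shad or by the verse sequence "shad, space, shad". The space may be a
# regular space or U+00A0.
_FIRST_SYLLABLE_SHAD = re.compile(
    "^([\u0F40-\u0FBC]+\u0F0B?)" + SHAD + "(?:([ \u00A0])" + SHAD + ")?"
)

def apply_rin_chen_spungs_shad(lines):
    """Return a copy of the paragraph's lines with the rule applied."""
    result = []
    last = len(lines) - 1
    for i, line in enumerate(lines):
        m = _FIRST_SYLLABLE_SHAD.match(line)
        if i == 0 or m is None:
            # The first line of a paragraph is exempt.
            result.append(line)
            continue
        if i == last and not line[m.end():].strip():
            # Assumption: this is the last shad of the paragraph, so keep it.
            result.append(line)
            continue
        replaced = m.group(1) + RIN_SHAD
        if m.group(2) is not None:
            # Double shad: transform both, keeping the original space.
            replaced += m.group(2) + RIN_SHAD
        result.append(replaced + line[m.end():])
    return result
```

Because the result depends on where the lines break, such a function has to be run again whenever the layout changes.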

ཀ and ག are never directly followed by a shad །. Cases with two shad (verses) are written ག ། or ཀ །. This also applies to ཀ and ག with vowel markers, but not if super- or subscripts exist, as in the example སྒ། །.
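A minimal sketch of the two-shad case of this rule, in Python. The function name `drop_shad_after_ka_ga` is hypothetical and the sketch is not taken from the Extension. It relies on how Unicode encodes stacks: a ག under a superscript is stored as the subjoined letter U+0F92, and a subscript follows the base letter as a subjoined character, so a plain U+0F40 or U+0F42 followed only by vowel signs is assumed to be an unstacked ཀ or ག.

```python
import re

SHAD = "\u0F0D"  # TIBETAN MARK SHAD

# Plain KA (U+0F40) or GA (U+0F42), optional vowel signs, then
# "shad, space, shad". Stacked forms never match: subjoined letters
# (U+0F90..U+0FBC) are neither the base letters nor vowel signs.
_KA_GA_DOUBLE_SHAD = re.compile(
    "([\u0F40\u0F42][\u0F71-\u0F7D\u0F80]*)" + SHAD + "([ \u00A0])" + SHAD
)

def drop_shad_after_ka_ga(text):
    """Rewrite 'ga shad space shad' as 'ga space shad' (same for ka)."""
    return _KA_GA_DOUBLE_SHAD.sub(
        lambda m: m.group(1) + m.group(2) + SHAD, text
    )
```

The sketch leaves a single shad after ཀ or ག untouched, because the rule above only spells out the written form for the two-shad case.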

We developed an OpenOffice.org Extension that applies these rules to your document.

Ideally Tibetan text should be left and right justified.

Points to be clarified

A shad ། after the first syllable of a new line is always transformed into a rinchen spungs shad ༑. Is this also the case if the last consonant of the first syllable of a line is a ང་, leading to ང་༑? Assumption: yes, always replace the shad.

Texts in verse form use two shad ། །. If they appear after the first syllable of a line, should only the first be transformed into a rinchen spungs shad, giving ༑ ། (considered formally correct, but inelegant), or both, giving ༑ ༑ (considered formally incorrect, but more elegant)? Assumption: transform both shad.


Difficulties using typesetting software (focus: OpenOffice)

We wrote an OpenOffice.org Extension that handles the problems listed below.

All spaces within Tibetan text must be U+00A0 non-breaking spaces. To type one from the keyboard, use <Ctrl><Space>. (OOo used to handle non-breaking spaces incorrectly; this is fixed in version 2.4.)
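For text that was typed with regular spaces, the conversion can be sketched in Python as below. The function name `use_nbsp_in_tibetan` is hypothetical. The sketch assumes that "within Tibetan text" means a run of spaces with a character from the Tibetan block (U+0F00..U+0FFF) on both sides, so that spaces in English passages of a Tibetan-English book stay untouched.

```python
import re

NBSP = "\u00A0"  # NO-BREAK SPACE

# One or more regular spaces with a Tibetan character on each side.
_SPACES_BETWEEN_TIBETAN = re.compile(
    "(?<=[\u0F00-\u0FFF]) +(?=[\u0F00-\u0FFF])"
)

def use_nbsp_in_tibetan(text):
    """Replace regular spaces between Tibetan characters by U+00A0."""
    return _SPACES_BETWEEN_TIBETAN.sub(
        lambda m: NBSP * len(m.group()), text
    )
```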

A shad ། should never be the first character of a new line. None of the current text editors enforces this, so each reformatting of a text can produce an illegally positioned shad at the beginning of a line. When you have two shad ། །, make doubly sure to use a non-breaking space between them.
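Since editors do not enforce this, a check over the laid-out lines can at least report violations. The following Python sketch uses a hypothetical helper name, `lines_starting_with_shad`, and assumes the lines have been exported from the editor as a list of strings. It also counts a line that begins with a space followed by a shad, on the assumption that this is a double shad broken across two lines.

```python
SHAD = "\u0F0D"  # TIBETAN MARK SHAD

def lines_starting_with_shad(lines):
    """Return the 1-based numbers of lines that begin with a shad.

    Leading regular or non-breaking spaces are ignored, so a line that
    starts with the second half of a broken double shad is reported too.
    """
    return [
        number
        for number, line in enumerate(lines, start=1)
        if line.lstrip(" \u00A0").startswith(SHAD)
    ]
```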

If the first syllable of a new line is followed by a shad །, the shad needs to be replaced by a rinchen spungs shad (རིན་ཆེན་སྤུངས་ཤད་ ) ༑. None of the current text editors enforces this either. Each reformatting of a text can produce an illegally positioned shad ། or rinchen spungs shad ༑.

Because of these problems it is very difficult to finalize correctly formatted documents. The slightest change, for example changing a single letter, reflows the pages and will most likely break the rules mentioned above. To avoid this, run the formatting macro only on the final document, or rerun it after every change.

It is illegal to break a line at a tsheg ་ in the middle of a Tibetan word.

Left and right justification of Tibetan text works only for 'small' font sizes in OpenOffice. This seems to be solved in the Linux version of OOo 2.4. In the Windows version of OOo, the formatting macro works around the problem by inserting a zero-width space (U+200B) after every tsheg. This seems to work fine most of the time. However, if the last character of a line is a non-breaking space (NBspace), justification breaks in some cases. To solve this, replace these NBspaces with regular spaces.
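The two workarounds can be sketched in Python as follows. This is an illustration of the idea rather than the macro itself; both function names are hypothetical, and the second function assumes the laid-out lines are available as a list of strings.

```python
import re

TSHEG = "\u0F0B"  # TIBETAN MARK INTERSYLLABIC TSHEG
ZWSP = "\u200B"   # ZERO WIDTH SPACE
NBSP = "\u00A0"   # NO-BREAK SPACE

# A tsheg that is not already followed by a zero-width space, so that
# running the function twice does not insert a second one.
_TSHEG_WITHOUT_ZWSP = re.compile(TSHEG + "(?!" + ZWSP + ")")

def add_zwsp_after_tsheg(text):
    """Insert U+200B after every tsheg that does not have one yet."""
    return _TSHEG_WITHOUT_ZWSP.sub(TSHEG + ZWSP, text)

def relax_trailing_nbsp(lines):
    """Replace non-breaking spaces at the end of each line by regular ones."""
    return [
        re.sub(NBSP + "+$", lambda m: " " * len(m.group()), line)
        for line in lines
    ]
```

Non-breaking spaces inside a line are left alone by `relax_trailing_nbsp`, so a double shad in the middle of a line stays unbreakable.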