Difference between revisions of "Formatting rules for Tibetan text"

From Digital Tibetan

Revision as of 13:55, 2 February 2010

material collection / work in progress


[1] A short Tibetan text in OpenOffice (unformatted)

Current office programs such as Microsoft Office 2003/2007 and OpenOffice 3.x handle Tibetan script quite well; once they are set up correctly, line breaks come out right in most cases. If your word processor breaks Tibetan syllables in the middle, you either need to update to a newer version or check the setup for Microsoft Office or OpenOffice.

The formatting process is described below using a short Tibetan text as an example.

This example (see image [1]) shows quite a number of formatting shortcomings:

  • There are shads at the beginning of a line, which is forbidden.
  • There is no difference in font size between the headline, the main text and the commentary at the end.
  • There is no yig mgo ༄༅ or sbrul shad ༈ marking the start of the text.
  • There is no justification.

The following section shows how to improve the formatting of this example.

Basic formatting rules for Tibetan text

[2] A short Tibetan text in OpenOffice (simple formatting)

Line-breaking rules

  • Line breaks must not occur in the middle of a syllable. (Your word processor should already take care of that.)
  • Line breaks can appear after the syllable separator dot tsheg ་ (preferably not in the middle of a Tibetan word). Exception: in the sequence nga <tsheg> <shad> ང་། the tsheg is a so-called non-breaking tsheg.
  • Additionally, line breaks are possible after a shad །, a terma sign gter ma ༔ and a visarga ཿ.
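
The line-breaking rules above can be sketched in Python as a function that lists the positions where a break would be allowed. This is only an illustration under assumptions: the name `break_opportunities` is hypothetical, the non-breaking tsheg is detected simply as "a tsheg directly followed by a shad", and spaces after a mark are kept on the same line as that mark.

```python
TSHEG = "\u0f0b"    # ་ inter-syllable marker
SHAD = "\u0f0d"     # །
GTER_MA = "\u0f14"  # ༔ terma sign
VISARGA = "\u0f7f"  # ཿ
SPACES = {" ", "\u00a0"}

# Marks after which the rules above permit a line break.
BREAK_AFTER = {TSHEG, SHAD, GTER_MA, VISARGA}


def break_opportunities(text):
    """Return indices i where a break may be placed before text[i]."""
    positions = []
    for i in range(1, len(text)):
        if text[i - 1] not in BREAK_AFTER:
            continue
        # Keep trailing spaces with the mark they follow.
        j = i
        while j < len(text) and text[j] in SPACES:
            j += 1
        if j >= len(text):
            continue  # end of text, nothing left to move to a new line
        if text[j] == SHAD:
            # A line must not start with a shad; this also covers the
            # non-breaking tsheg in the sequence nga + tsheg + shad.
            continue
        positions.append(j)
    return positions
```

The sketch only reports candidate positions; choosing among them (for example preferring word boundaries over syllable boundaries) is left to the layout engine.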

Inter-syllable marker tsheg ་

  • There is never a tsheg after a visarga. Example: oṃ āḥ huṃ, Wylie: oM AHhU~M, ཨོཾ་ཨཱཿཧཱུྃ་ — there is no tsheg after āḥ.
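
As a small illustration of this rule, the following Python sketch reports and removes a tsheg that directly follows a visarga. The function names are hypothetical and the check is purely textual.

```python
import re

VISARGA = "\u0f7f"  # ཿ
TSHEG = "\u0f0b"    # ་

VISARGA_TSHEG = re.compile(VISARGA + TSHEG)


def find_tsheg_after_visarga(text):
    """Return the index of every visarga that is wrongly followed by a tsheg."""
    return [match.start() for match in VISARGA_TSHEG.finditer(text)]


def drop_tsheg_after_visarga(text):
    """Remove a tsheg that directly follows a visarga."""
    return text.replace(VISARGA + TSHEG, VISARGA)
```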

White spaces

  • Tibetan uses only non-breaking spaces, which do not vary in size on left/right justification.
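
A minimal Python sketch of this rule is shown below. It assumes that only ordinary spaces standing between two characters of the Unicode Tibetan block (U+0F00 to U+0FFF) should be converted, so that embedded Western text keeps its normal spaces; the function names are hypothetical.

```python
NBSP = "\u00a0"  # non-breaking space


def is_tibetan(char):
    """True for characters in the Unicode Tibetan block."""
    return 0x0F00 <= ord(char) <= 0x0FFF


def use_non_breaking_spaces(text):
    """Replace ordinary spaces between Tibetan characters by non-breaking spaces."""
    result = []
    for i, char in enumerate(text):
        between_tibetan = (
            char == " "
            and 0 < i < len(text) - 1
            and is_tibetan(text[i - 1])
            and is_tibetan(text[i + 1])
        )
        result.append(NBSP if between_tibetan else char)
    return "".join(result)
```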

Usage of the punctuation character shad །

  • A line must not start with a shad །.
  • A shad is used as a Tibetan punctuation mark, similar but not identical to a comma.
    • Verses, headlines or ends of longer paragraphs are ended by the sequence <shad> <space> <shad> ། །.
    • Exception: if the last letter of a line is either a ka ཀ or a ga ག, one shad is omitted. This is also the case if ka or ga carry vowel signs. A shad is not omitted if they have a sub- or superscript. Examples:
      • Incorrect: གི།, ཀུ། །
      • Correct: གི, ཀུ །, སྐུ།, གྲུ། །
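
The ka/ga exception can be sketched in Python as follows. The sketch relies on how Unicode encodes Tibetan stacks: a ka or ga carrying a superscript is stored as a subjoined letter (as in སྐུ), and a subscript follows the base letter as a subjoined letter (as in གྲུ), so a "plain" ka or ga is a base letter followed only by vowel signs. The function name `close_with_shad` is hypothetical, and the use of a non-breaking space between the shads is an assumption.

```python
import re

SHAD = "\u0f0d"  # །
NBSP = "\u00a0"  # non-breaking space

# Base letter ka (U+0F40) or ga (U+0F42) at the end of the syllable,
# followed only by vowel signs (U+0F71..U+0F7D, U+0F80).
PLAIN_KA_GA = re.compile("[\u0f40\u0f42][\u0f71-\u0f7d\u0f80]*$")


def close_with_shad(syllable, double=False):
    """Append a single or double shad, omitting one shad after plain ka or ga."""
    omit_one = PLAIN_KA_GA.search(syllable) is not None
    if double:
        # Double shad: <shad> <space> <shad>, or <space> <shad> after ka/ga.
        return syllable + ("" if omit_one else SHAD) + NBSP + SHAD
    return syllable if omit_one else syllable + SHAD
```

Applied to the examples above, the function leaves གི without a shad, gives ཀུ only the second shad of the pair, and keeps all shads after སྐུ and གྲུ.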

Rules for replacing shad by rin chen spungs shad

  • In Tibetan, especially in pechas, it is treated as a special case if the last syllable of an expression terminated by a shad breaks onto a new line. In that case the shad ། or double shad ། ། is replaced by rin chen spungs shad ༑ or ༑ ༑. This serves as an optical indication that the left-over syllable at the beginning of the line actually belongs to the preceding line.
    • Exception: in a line starting with, for example, le'u ལེའུ། །, no rin chen spungs shad is used, since le'u is pronounced as two syllables.
    • Variants: some printed books do not use rin chen spungs shad replacements, but the majority of books seem to apply the same rules as pechas.
    • Sometimes only the first shad of the sequence ། ། is replaced: ༑ །, but this style is considered less beautiful.
  • Correctly using rin chen spungs shad ༑ in long texts can be very time-consuming, because even very small formatting or content changes might move a rin chen spungs shad ༑ into a position where it is no longer correct. The same is true for a shad ། that suddenly needs to become a rin chen spungs shad ༑. The Tibetan formatting OpenOffice.org Extension can apply the rin chen spungs shad ༑ automatically within OpenOffice.
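
The replacement rule can be sketched in Python for a text that has already been broken into lines. This is not the extension's actual implementation, only an illustration under assumptions: the function name and the exception set are hypothetical, a "left-over syllable" is taken to be a line that starts with one syllable directly followed by a shad, and every existing rin chen spungs shad is first reset to a plain shad so that stale marks disappear after a reflow.

```python
import re

SHAD = "\u0f0d"                  # །
RIN_CHEN_SPUNGS_SHAD = "\u0f11"  # ༑

# Hypothetical exception list: le'u counts as two syllables.
TWO_SYLLABLE_EXCEPTIONS = {"\u0f63\u0f7a\u0f60\u0f74"}

# One syllable (no tsheg, shad or space), then a shad,
# optionally followed by a space and a second shad.
LEFTOVER = re.compile("^([^\u0f0b\u0f0d \u00a0]+)\u0f0d(?:([ \u00a0])\u0f0d)?")


def apply_rin_chen_spungs_shad(lines):
    """Return the lines with shads after a left-over first syllable replaced."""
    result = []
    for index, line in enumerate(lines):
        # Reset marks that may have moved to a wrong position.
        line = line.replace(RIN_CHEN_SPUNGS_SHAD, SHAD)
        # The first line has no preceding line to belong to.
        match = LEFTOVER.match(line) if index > 0 else None
        if match and match.group(1) not in TWO_SYLLABLE_EXCEPTIONS:
            syllable, space = match.group(1), match.group(2)
            marks = RIN_CHEN_SPUNGS_SHAD
            if space is not None:
                marks += space + RIN_CHEN_SPUNGS_SHAD
            line = syllable + marks + line[match.end():]
        result.append(line)
    return result
```

Because the function starts from plain shads each time, it can be run again after every change to the line layout.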
[3] A short Tibetan text in OpenOffice (formatting with left/right justification)

Numbers and special signs

  • Numbers: Usually the space character in Tibetan text is quite wide and occurs only after a shad །, gter ma ༔ or visarga ཿ. Exceptions are numbers and embedded Western text. Tibetan numbers are separated from the Tibetan letters to their left and right by smaller spaces: Numbers-1.jpg
  • Terma signs: If a section of text is a gter ma, a single terma sign ༔ replaces both shad ། and double shad ། །. Wood-block pechas sometimes simplify the gter ma so that it looks like a visarga ཿ, but digital texts should use the proper terma sign ༔.
  • Honorific marks: an honorific emphasis can be expressed by a special prefix, by colour, or by circles under the syllable, as in the following example: Honorific-1.jpg.

Head letters, yig mgo ༄༅ and sbrul shad ༈

  • In pechas, the head letter yig mgo ༄༅། ། always marks the upper left corner of the front page of a pecha. It allows the front and back sides of a pecha to be told apart at a glance. This usage does not occur in books. Additionally, in both pechas and books, the yig mgo ༄༅། ། is used at the start of headlines and at the start of the first paragraph of a longer text. (See image [2].)
  • The sbrul shad ༈ is used to indicate:
    • the start of a smaller text or prayer,
    • a chapter boundary,
    • in text or prayer collections, the start of a new text or prayer,
    • in pechas, insertions into a text: a sbrul shad marks both the beginning and the end of the insert.
[4] Left/right justification option for Tibetan in Linux OpenOffice

Small print yig chung

  • Commentaries and annotations in Tibetan books and pechas are printed in a font size about 25-30% smaller. In pechas, headlines are additionally printed in yig chung size. See image [2] for an example. If normal-size Tibetan and yig chung are mixed, it is important to align the letter heads of the different-size characters to the same height. The superscript and subscript function of the word processor can be used to align the letters correctly. Example: Yigchung.jpg

Image [2] shows the same text as image [1] with all formatting rules applied.

Advanced formatting with OpenOffice

  • Left/right justification: Western text is left/right justified by slightly expanding the white spaces between words. This method does not give acceptable results with Tibetan, since a line often contains very few white spaces or none at all. To justify Tibetan correctly, the spacing between all characters should be adapted equally. The width of the white-space character should not change significantly; Tibetan texts therefore use the non-breaking space (Unicode U+00A0) as white space, which does not change width on justification.
    • Linux-based operating systems: OpenOffice on Linux directly supports left/right justification. The option "Paragraph / Expand single words" (see image [4]) needs to be activated.
[5] Compressing characters to avoid ugly white spaces on justification.
    • Unfortunately this does not work with Windows versions of OpenOffice. Windows users need to download the Tibetan formatting OpenOffice.org Extension, which installs a Tibetan menu in OpenOffice. In all paragraphs that are formatted as left/right justified (using OpenOffice's Format / Paragraph / Alignment / Justified), the option "Tibetan / Insert justification characters" inserts invisible zero-width characters after each tsheg; this makes OpenOffice justify Tibetan correctly on Windows. See image [3] for an example.
    • Tip: Left/right justification, especially with the Windows extension and after the automatic rin chen spungs shad insertion, might lead to ugly spaces after tshegs. In many cases this effect can be reduced by slightly condensing all characters by 0.1 to 0.2 pt using the OpenOffice option "Format / Character / Position / Condensed". See image [5].
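
The idea behind the Windows method can be sketched in Python. This is not the extension's code. It assumes that the invisible character is the zero-width space U+200B, and it additionally skips a tsheg that is directly followed by a shad so that the non-breaking tsheg stays unbreakable; both points are assumptions, and the function names are hypothetical.

```python
import re

TSHEG = "\u0f0b"  # ་
SHAD = "\u0f0d"   # །
ZWSP = "\u200b"   # zero-width space (assumed justification character)

# A tsheg that is not directly followed by a shad.
BREAKABLE_TSHEG = re.compile(TSHEG + "(?!" + SHAD + ")")


def remove_justification_characters(paragraph):
    """Strip all zero-width spaces from the paragraph."""
    return paragraph.replace(ZWSP, "")


def insert_justification_characters(paragraph):
    """Insert a zero-width space after each breakable tsheg."""
    cleaned = remove_justification_characters(paragraph)
    return BREAKABLE_TSHEG.sub(TSHEG + ZWSP, cleaned)
```

Removing existing zero-width spaces first means the function can be applied repeatedly to the same paragraph without piling up characters.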

The sample file with justification (Windows method) can be found here:

[6] An OpenOffice extension for rin chen spungs shad handling and justification of text.

OpenOffice tools

The Tibetan formatting OpenOffice.org Extension supports two tasks for Tibetan processing in OpenOffice:

  • Left/right justification on Windows computers
  • Automatic insertion of rin chen spungs shad. Especially for longer texts this can save a lot of formatting time.

See Tibetan formatting OpenOffice.org Extension for more details.


Printing

  • For best results, first create a PDF of any document you intend to print. Some printer drivers do not handle Tibetan stacks correctly and produce inferior results when printing directly from the office program.
  • Additionally, a PDF ensures that the printed output is always the same, regardless of the printer and computer used for printing.
  • When printing from Adobe Reader, make sure to select "Page Scaling: None" in the print dialog to prevent slight compression of the pages. This is especially important for pecha printing.
