ŋaren crîþa 9 vlefto: Ŋarâþ Crîþ v9

Orthography and phonology

The phonology and orthography of Ŋarâþ Crîþ can be divided into eight layers in two modes (writing and speaking):

The conversions from 0 to 1, 1 to 2w, and 2s to 3s are functional: each valid input corresponds to exactly one output. The conversion from 1 to 2s is almost so, except when a ⟨&⟩ is present. In the opposite direction, the conversions from 4w to 3w, from 3w to 2w*, and from 2w* to 2w are functional. Furthermore, for any conversion, it can be determined whether a given input can be converted into a given output without external information.

In addition, the conversion between 1 and 2w is bijective: valid layer-1 and layer-2w representations can be paired with each other.

Layers 0, 1, and 2w: Cenvos and its romanization

Cenvos, the native script of Ŋarâþ Crîþ, is written from right to left. This script can be analyzed on two levels: graphemes, which constitute the abstract level, and glyphs, which are the characters being written. For instance, Cenvos has one grapheme romanized as ⟨c⟩ that corresponds to two different glyphs: the non-final form 𐲀𐲢 (denoted as ²⟨c⟩) and the final form 𐲀 (²⟨c$). As another example, the sequence 𐲌𐲁 (⟨me⟩ = ²⟨me) consists of one glyph but two graphemes.

In this grammar, we primarily use the romanization, whose symbols largely map one-to-one with Cenvos graphemes. Cenvos has four kinds of graphemes:

Of course, there is also the space. Layer 0 also contains the morpheme boundary, ⟦-⟧.

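True letters
Cen | Name | Rom | Cen | Name | Rom | Cen | Name | Rom
𐲀𐲢 | ca | c | 𐲌 | ma | m | 𐲘 | ar | h
𐲁 | e | e | 𐲍 | a | a | 𐲙 | ħo | ħ
𐲂 | na | n | 𐲎 | fa | f | 𐲚 | ên | ê
𐲃𐲢 | ŋa | ŋ | 𐲏 | ga | g | 𐲛 | ôn | ô
𐲄 | va | v | 𐲐 | pa | p | 𐲜 | ân | â
𐲅 | o | o | 𐲑 | ta | t | 𐲝 | uħo | u
𐲆 | sa | s | 𐲒 | ča | č | 𐳀 | cełaŋa | w
𐲇 | þa | þ | 𐲓 | în | î | 𐳁 | avarte | x
𐲈 | ša | š | 𐲔 | ja | j | 𐳂 | priþnos | y
𐲉 | ra | r | 𐲕 | i | i | 𐳃 | telrigjon | z
𐲊 | la | l | 𐲖 | da | d
𐲋 | ła | ł | 𐲗 | ða | ð

Final forms and ligatures (layer 2w)
Cen | Rom | Cen | Rom | Cen | Rom
𐲀 | c$ | 𐲌𐲁 | me | 𐳀𐳀 | ww
𐲃 | ŋ$ | 𐲌𐲌 | mm | 𐳁𐳁 | xx
𐲁𐲁 | ee | 𐲔𐲜 | jâ | 𐳂𐳂 | yy
𐲁𐲌 | em | 𐲜𐲔 | âj | 𐳃𐳃 | zz

Markers
Cen | Name | Rom | Cen | Name | Rom | Cen | Name | Rom
𐲤 | carþ | # | 𐲦 | njor | +* | 𐲨 | nef | *
𐲥 | tor | + | 𐲧 | es | @ | 𐲯 | sen | &

Punctuation
Cen | Name | Rom | Cen | Name | Rom | Cen | Name | Rom
𐲞 | gen | . | 𐲩 | ŋos | ’ | 𐲭 | fos | «
𐲟 | tja | ; | 𐲪 | łil | · | 𐲮 | þos | »
𐲠 | šac | ? | 𐲫 | rin | { | 𐳄 | jedva | /
𐲡 | cjar | ! | 𐲬 | cin | } | 𐲣 | mivaf·ome | -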
Table 1: The graphemes of Ŋarâþ Crîþ. (The columns are read from left to right.)

The letters ⟨w⟩, ⟨x⟩, ⟨y⟩, and ⟨z⟩ are USR letters. These are used in foreign languages written in Cenvos to represent phonemes that are not approximated by the phonology of Ŋarâþ Crîþ. Each foreign orthography is free to assign them as it pleases.

Cenvos has two graphemes that change form at the end of the word: ⟨c⟩ and ⟨ŋ⟩, as well as several ligatures. We do not distinguish these forms in the romanization.

The marker ⟨*⟩ is used for foreign words, such as loanwords and foreign names. ⟨#⟩ is used to prefix given names. ⟨+⟩ is used to prefix surnames passed by native conventions (i.e. from parent to child within the same gender); ⟨+*⟩ marks a surname passed using non-native conventions. Place names are prefixed with ⟨@⟩. ⟨#⟩, ⟨+⟩, ⟨+*⟩, and ⟨@⟩ can all be used with ⟨*⟩, in which case ⟨*⟩ occurs first. Note that ⟨+*⟩ is a single letter of its own and not a ligature.

At the start of a word, ⟨&⟩ indicates reduplication of an unspecified prefix of the rest of the word. For instance, ⟨&cên⟩ can be pronounced as if it were ⟨cêcên⟩ or ⟨cêncên⟩. (⟨&⟩ occurs after all other markers in this case.) This usage is not productive in standard Ŋarâþ Crîþ, but it appears in a few words, as well as in some idiosyncratic cases. At the middle or the end of a word, or alone, it indicates ellipsis of part or all of the word, most often to abbreviate or censor a word. Lastly, ⟨&{}⟩ is used similarly to the ellipsis in Western punctuation.

Markers can be applied to multi-word strings by surrounding the string with the delimiters ⟨{}⟩. In legal language, ⟨{}⟩ are also used around phrases to resolve ambiguities.

The sentence punctuation ⟨.⟩, ⟨?⟩, and ⟨!⟩ are used as expected. ⟨;⟩ is used to separate two independent clause phrases within the same sentence. The quotation marks, ⟨«»⟩, are used around quotations, direct or indirect. A ⟨.⟩ at the end of a quotation embedded within another sentence is omitted.

⟨’⟩ is used to separate clitics from the rest of the word to which they are attached. ⟨·⟩ indicates lenition; it could be described as a “letter modifier”. It is also used as a decimal point: officially, it is used after the most significant digit of an inexact numeral when written with digits, but it is also used unofficially to write non-integers.

⟨/⟩, as its derivation from ⟨i⟩ suggests, is used to separate the number of mjari from the number of edva when writing currency amounts.

Spaces are placed in the following places:

[TODO: cover mentions of letters within the language, corresponding to v7 p17 “When letters or markers are referred to, … but the effects on other glyphs are not standardized”]

Digits are interchangeable with short-form numerals, but not with long-form numerals. They are also written right-to-left in Cenvos, with the most significant digit first: 𐲲𐲺𐲳 is 0x2A3 = 675.

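Cen | # | Cen | # | Cen | # | Cen | #
𐲰 | 0 | 𐲱 | 1 | 𐲲 | 2 | 𐲳 | 3
𐲴 | 4 | 𐲵 | 5 | 𐲶 | 6 | 𐲷 | 7
𐲸 | 8 | 𐲹 | 9 | 𐲺 | A | 𐲻 | B
𐲼 | C | 𐲽 | D | 𐲾 | E | 𐲿 | F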
Table 2: The digits of Ŋarâþ Crîþ. (The columns are read from left to right.)
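
As an illustration of the digit system, the following Python sketch converts between Cenvos numerals and integers. It assumes only what is stated above: digits are hexadecimal and the most significant digit comes first in the text stream. The function names are ours and are not part of the language description.

```python
# Cenvos digit glyphs in the order of Table 2 (values 0 through F).
CENVOS_DIGITS = "𐲰𐲱𐲲𐲳𐲴𐲵𐲶𐲷𐲸𐲹𐲺𐲻𐲼𐲽𐲾𐲿"


def cenvos_to_int(numeral):
    """Read a Cenvos numeral (most significant digit first) as a base-16 integer."""
    value = 0
    for glyph in numeral:
        # str.index raises ValueError for anything that is not a Cenvos digit.
        value = value * 16 + CENVOS_DIGITS.index(glyph)
    return value


def int_to_cenvos(value):
    """Write a non-negative integer with Cenvos digits, most significant first."""
    if value < 0:
        raise ValueError("this sketch handles only non-negative integers")
    digits = CENVOS_DIGITS[value % 16]
    while value >= 16:
        value //= 16
        digits = CENVOS_DIGITS[value % 16] + digits
    return digits
```

With these definitions, the numeral made of the digits 2, A, and 3 from the example above evaluates to 675.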

Phonotactics

We express the phonotactic rules of Ŋarâþ Crîþ in terms of layer 0.

A manifested grapheme phrase is either a true letter not followed by a lenition marker (plain letter), any of ⟦p t d č c g m f v ð⟧ followed by a lenition mark (lenited letter), or, word-initially, one of the digraphs ⟦mp vp dt nd gc ŋg vf ðþ lł⟧ (eclipsed letter). All other graphemes are ignored for the purposes of phonotactics.

A manifested grapheme phrase has a base letter. The base letter of a plain letter is itself. The base letter of a lenited letter is the letter without the lenition mark. The base letter of an eclipsed letter is the second letter of the digraph.

A vowel is any of ⟦e o a î i ê ô â u⟧. ⟦j⟧ is a semivowel. All other manifested grapheme phrases are consonants.

An effective plosive is a manifested grapheme phrase whose base letter is any of ⟦p t d c g⟧. An effective fricative is a manifested grapheme phrase whose base letter is any of ⟦f v þ ð s š h ħ⟧.
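
The definitions above can be sketched in Python as follows. This is a rough illustration, not a reference implementation: the function names are ours, the input is assumed to be a single word, and we assume that "word-initially" means "at the first true letter of the word" (so that leading markers are skipped). Words that combine eclipsis with ⟦&⟧ are not handled.

```python
import unicodedata


def _nfc(text):
    # Normalize so that letters such as "č" or "î" are single code points.
    return unicodedata.normalize("NFC", text)


LENITION_MARK = "·"
TRUE_LETTERS = set(_nfc("cenŋvosþšrlłmafgptčîjidðhħêôâuwxyz"))
LENITABLE = set(_nfc("ptdčcgmfvð"))
ECLIPSED = {_nfc(d) for d in ("mp", "vp", "dt", "nd", "gc", "ŋg", "vf", "ðþ", "lł")}
VOWELS = set(_nfc("eoaîiêôâu"))
PLOSIVES = set("ptdcg")
FRICATIVES = set(_nfc("fvþðsšhħ"))


def manifested_grapheme_phrases(word):
    """Return (phrase, base letter) pairs for one word."""
    word = _nfc(word)
    i = 0
    # Assumption: skip leading markers to find the word-initial letter.
    while i < len(word) and word[i] not in TRUE_LETTERS:
        i += 1
    phrases = []
    if word[i:i + 2] in ECLIPSED:
        # Eclipsed letter: the base letter is the second letter of the digraph.
        phrases.append((word[i:i + 2], word[i + 1]))
        i += 2
    while i < len(word):
        letter, following = word[i], word[i + 1:i + 2]
        if letter in LENITABLE and following == LENITION_MARK:
            phrases.append((letter + LENITION_MARK, letter))  # lenited letter
            i += 2
        elif letter in TRUE_LETTERS and following != LENITION_MARK:
            phrases.append((letter, letter))  # plain letter
            i += 1
        else:
            i += 1  # all other graphemes are ignored for phonotactics
    return phrases


def kind(phrase):
    """Classify a (phrase, base) pair as vowel, semivowel, or consonant."""
    text = phrase[0]
    if text in VOWELS:
        return "vowel"
    return "semivowel" if text == "j" else "consonant"


def is_effective_plosive(phrase):
    return phrase[1] in PLOSIVES


def is_effective_fricative(phrase):
    return phrase[1] in FRICATIVES
```

For example, a word beginning with ⟦nd⟧ yields the eclipsed letter ⟦nd⟧ with base letter ⟦d⟧, which counts as an effective plosive.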

A word consists of one or more syllables, each of which has an initial, a medial, a nucleus, and a coda. An initial consists of one of the following:

The only valid medial, if present, is ⟦j⟧. A nucleus is a vowel.

A coda is either a simple coda or a complex coda. A simple coda is one of ⟦s r n þ rþ l t c f m⟧ or nothing at all. A complex coda is one of ⟦st lt ns ls nþ cþ⟧. While complex codas are allowed in any syllable in layer 0, instances of such codas in the middle of a syntactic word are simplified during the conversion to layer 1, and such instances immediately before a clitic boundary are simplified during the conversion to layer 2. The coda ⟦-m⟧ is used in only a few words.
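
The coda inventory is small enough to check by table lookup. The sketch below is a minimal illustration in Python; the function names and the labels "mid-word" and "before-clitic" are ours.

```python
# Codas permitted in layer 0; the empty string stands for "nothing at all".
SIMPLE_CODAS = {"", "s", "r", "n", "þ", "rþ", "l", "t", "c", "f", "m"}
COMPLEX_CODAS = {"st", "lt", "ns", "ls", "nþ", "cþ"}


def coda_kind(coda):
    """Return "simple", "complex", or None if the coda is not allowed."""
    if coda in SIMPLE_CODAS:
        return "simple"
    if coda in COMPLEX_CODAS:
        return "complex"
    return None


def simplified_during(coda, position):
    """Name the conversion that simplifies a coda in the given position.

    position is "mid-word" (inside a syntactic word) or "before-clitic".
    Simple codas are never simplified, so None is returned for them.
    """
    if coda_kind(coda) != "complex":
        return None
    return {"mid-word": "layer 1", "before-clitic": "layer 2"}.get(position)
```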

In addition, ⟦h⟧ is forbidden word-initially. Doubled consonants and vowels are allowed.

If there is more than one way to split a word into syllables, the maximal-onset principle is used. However, clitic boundaries always start a new syllable.

An onset is an initial plus a medial. A bridge is the coda of one syllable plus the onset of the following syllable.

Conversion from layer 0 to layer 1

The following changes are applied as a part of morphology. They occur only when the subsequence involved in a change (that is, the substring being replaced as well as the environment that triggers the change) crosses a morpheme boundary but not a word boundary. For instance, ⟦*@vav-el⟧ becomes ⟨*@vavel⟩ instead of ⟨*@navel⟩. For clarity, however, we omit any ⟦-⟧s from the rules below. (These changes apply from left to right.)

Here, “V[-creaky]” means any of ⟦e o a i u⟧.

The following changes are made to simplify complex codas within a syntactic word if (and only if) the consonant cluster cannot be reinterpreted to avoid the mid-word complex coda.

Here, the consonant graphemes are considered to be organized in the following way based on their pronunciations (with voiceless/voiced pairs):

 | Labial | Coronal | Dorsal | Other
Obstruent | p, f, p· / v, vp | t, č, þ, s, š, ł, t·, č· / d, ð, dt, d·, ðþ | c, h, c· / g, gc | / ħ, g·
Nasal | / m, mp | / n, nd | / ŋ, ŋg |
Other | | / l, lł, r | |
Table 3: Phonetic features used for complex coda simplification.

Finally, the ⟦j⟧ is removed from any instances of ⟦ji jî ju⟧.
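
This last step is a plain textual substitution. A minimal Python sketch, covering only this final rule and none of the earlier changes, might look as follows; the function name and the sample inputs in the usage note are ours.

```python
import re
import unicodedata


def drop_j_before_i_or_u(text):
    """Remove ⟦j⟧ from any instance of ⟦ji⟧, ⟦jî⟧, or ⟦ju⟧."""
    # Normalize first so that "î" is a single code point.
    text = unicodedata.normalize("NFC", text)
    # The lookahead keeps the vowel and deletes only the ⟦j⟧.
    return re.sub("j(?=[iîu])", "", text)
```

A made-up string such as `"cjiþ"` becomes `"ciþ"`, while `"tja"` is left unchanged because ⟦ja⟧ is not affected.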

Letter numbering

Sometimes, an integer must be assigned to each letter. In this case, the assignment shown in the table below is used. Note that numbers are not assigned fully sequentially. Furthermore, this function is valid only for layer 1 graphemes.

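True letters
Letter | Hex | Dec | Letter | Hex | Dec | Letter | Hex | Dec
c | 0 | 0 | m | 20 | 32 | h | 11 | 17
e | 1 | 1 | a | 9 | 9 | ħ | 12 | 18
n | 2 | 2 | f | A | 10 | ê | 101 | 257
ŋ | 2B | 43 | g | B | 11 | ô | 104 | 260
v | 3 | 3 | p | C | 12 | â | 109 | 265
o | 4 | 4 | t | D | 13 | u | 13 | 19
s | 5 | 5 | č | DE | 222 | w | −1 | −1
þ | 55 | 85 | î | E | 14 | x | −2 | −2
š | 5E | 94 | j | 6E | 110 | y | −3 | −3
r | 6 | 6 | i | F | 15 | z | −4 | −4
l | 7 | 7 | d | 10 | 16
ł | 77 | 119 | ð | 155 | 341

Markers
Letter | Hex | Dec | Letter | Hex | Dec | Letter | Hex | Dec
# | 14 | 20 | +* | 16 | 22 | * | 19 | 25
+ | 15 | 21 | @ | 17 | 23 | & | 1A | 26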
Table 4: Letter numbering in Ŋarâþ Crîþ. (The columns are read from left to right.)

The letter sum of a word is the sum of all of its letters. This value is used in some of the noun declension paradigms.
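
The letter sum can be computed directly from Table 4. The Python sketch below is illustrative only: the names are ours, and we assume that graphemes without an assigned number (such as ⟨·⟩ or ⟨’⟩) contribute nothing to the sum, which the text above does not state.

```python
import unicodedata

# Letter numbers from Table 4, written in hexadecimal as in the table.
_RAW_NUMBERS = {
    "c": 0x0, "e": 0x1, "n": 0x2, "ŋ": 0x2B, "v": 0x3, "o": 0x4,
    "s": 0x5, "þ": 0x55, "š": 0x5E, "r": 0x6, "l": 0x7, "ł": 0x77,
    "m": 0x20, "a": 0x9, "f": 0xA, "g": 0xB, "p": 0xC, "t": 0xD,
    "č": 0xDE, "î": 0xE, "j": 0x6E, "i": 0xF, "d": 0x10, "ð": 0x155,
    "h": 0x11, "ħ": 0x12, "ê": 0x101, "ô": 0x104, "â": 0x109, "u": 0x13,
    "w": -1, "x": -2, "y": -3, "z": -4,
    "#": 0x14, "+": 0x15, "+*": 0x16, "@": 0x17, "*": 0x19, "&": 0x1A,
}
LETTER_NUMBER = {unicodedata.normalize("NFC", k): v for k, v in _RAW_NUMBERS.items()}


def letter_sum(word):
    """Sum the letter numbers of a layer-1 word."""
    word = unicodedata.normalize("NFC", word)
    total = 0
    i = 0
    while i < len(word):
        # ⟨+*⟩ is a single letter, so it must be matched before ⟨+⟩.
        token = "+*" if word.startswith("+*", i) else word[i]
        # Assumption: graphemes without a number are skipped.
        total += LETTER_NUMBER.get(token, 0)
        i += len(token)
    return total
```

For instance, ⟨crîþ⟩ gives 0 + 6 + 14 + 85 = 105.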

It is theorized that letter numbers were assigned in the following manner:

Collation

The true letters and the markers are collated in their respective order, except for ⟨&⟩, which is ignored. Lenited letters are treated as their respective base letters, except when two words differ only by the presence or absence of a lenition mark, in which case the lenited variant is collated after the base letter: ⟨saga⟩ < ⟨sag·a⟩ < ⟨sada⟩ < ⟨saħa⟩. Numerals are collated after all letters.
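
One way to realize this ordering is a two-part sort key: the sequence of base letters first, then the lenition marks as a tie-breaker. The Python sketch below makes two assumptions that go beyond the text: markers sort after all true letters (following the order of Table 1), and numerals and spaces are not handled. The names are ours.

```python
import unicodedata

LENITION_MARK = "·"
# True letters in the order of Table 1, then markers (assumed to come last).
_ORDER = list(unicodedata.normalize("NFC", "cenŋvosþšrlłmafgptčîjidðhħêôâuwxyz"))
_ORDER += ["#", "+", "+*", "@", "*"]
RANK = {token: rank for rank, token in enumerate(_ORDER)}


def collation_key(word):
    """Build a sort key: base-letter ranks, then lenition flags as tie-breaker."""
    word = unicodedata.normalize("NFC", word)
    ranks, lenited = [], []
    i = 0
    while i < len(word):
        token = "+*" if word.startswith("+*", i) else word[i]
        i += len(token)
        if token == LENITION_MARK and lenited:
            lenited[-1] = 1  # mark the preceding letter as lenited
        elif token in RANK:
            ranks.append(RANK[token])
            lenited.append(0)
        # ⟨&⟩ and any other grapheme is ignored.
    return (ranks, lenited)
```

Sorting the four example words with `sorted(words, key=collation_key)` reproduces the order given above.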

In a directory of personal names, entries are collated on surnames, with given names considered only when surnames are identical. Headings in such a list include the prefix up to and including the first true letter: ⟨+merlan #flirora⟩ would be found under ⟨+m⟩.

Ordered items can be labeled using numerals (starting from 0) or letters. In the latter case, only the letters ⟨c e n v o s r l m a f g p t î i d h⟩ are used.

Numquotes

A digit immediately preceding text surrounded by quotation or grouping marks constitutes a numquote. The digit is usually not pronounced in this case. Numquotes are mainly used for secondary purposes that lack any dedicated punctuation.

Numquote | Meaning
B{} | Contains parenthetical information: provides supplementary information. The sentence should still make sense without the parenthetical content.
1{} | Lists an alias of a referent mentioned by name.
2{} | Surrounds a key-value list. Used as such: ⟨2{3{&{}} 4{&{}} 3{&{}} 4{&{}}}⟩
3{} | Used for listing a key inside ⟨2{}⟩.
4{} | Used for listing a value inside ⟨2{}⟩. When not directly inside a ⟨2{}⟩ numquote, marks a list: elements are delimited by spaces, and ⟨{}⟩ can be used to insert multi-word elements.
9{} | Used to contain abbreviated quantities in the traditional currency system.
*9{} | Used to contain abbreviated quantities in a currency system other than the traditional one.
Table 5: Numquotes in Ŋarâþ Crîþ.

Layer 2s

Before the rest of the conversion to layer 2, the complex-coda simplification rules are applied once more, this time to complex codas before clitic boundaries or at the end of a word. (That is, any occurrences of ⟨’⟩ are ignored this time.)

Traditionally, only manifested grapheme phrases are considered to be significant in the conversion from layer 1 to layer 2s. However, other graphemes such as punctuation can affect prosody.

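MGPs | IPA | MGPs | IPA
c | k | p | p
e | e | t | t
n nd | n | č | t͡ʂ
ŋ ŋg | ŋ | î |
v m· vp | v | j | j
o | o | i | i
s | s | d dt | d
þ t· | θ | ð d· ðþ | ð
š č· | ʂ | h c· | x
r | ɹ | ħ g· | ʕ
l lł | l | ê |
ł | ɬ | ô |
m mp | m | â |
a | a | u |
f p· | f | f· v· ð· |
g gc | ɡ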
Table 6: Layer 1 to layer 2s conversions.

Layer 2 has a two-way tone contrast between vowels: the high tone (H) is the default, being contrasted with the low tone (L). For historical reasons, the presence or absence of a low tone on a vowel is called [±creaky].

Layer 3s

The conversion from layer 2s to layer 3s is comparatively more complex.

First, the following changes are made:

Plosives in a coda are unreleased. All unvoiced plosives and affricates outside of a coda are aspirated.

While Ŋarâþ Crîþ has two tone levels phonemically, their realization at the phonetic level is more complex. It is common to describe phonetic tone using seven levels, from 0 (the lowest) to 6 (the highest). Each syllable has one or more tones.

In order to describe tone, we must introduce the concept of “stress”, which is placed according to the following rules:

We also introduce the concept of a tone accounting unit (TAU), which is the level at which tones are realized. That is, the tone of a syllable depends only on the contents of the TAU in which it lies. Instances of content words occupy different TAUs from each other, but some function words occupy the same TAU as the preceding or following word (in particular, such words have no stressed syllable and are confined to a relatively fixed position):

(Stress is accounted by orthographic word, not by TAU.)

First, two adjacent vowels are fused into a diphthong if the vowels are not identical, the first vowel is stressed, the second vowel is [i] or [u̜], and the syllable to which the second vowel belongs can be interpreted as having an empty coda. For purposes of tonekeeping, a diphthong is considered to be composed of two different syllables.

In general, unstressed H and L syllables have tone levels 4 and 2, respectively; stressed H and L syllables have tone levels 5 and 1. However, an open H or L syllable before a stressed syllable gets level 3 or 1, respectively, instead. Diphthongs get different values: 65 for HH, 53 for HL, 13 for LH, and 21 for LL.
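
The initial tone assignment amounts to a small lookup, sketched below in Python. The names are ours. The text does not say what happens when a stressed open syllable itself precedes another stressed syllable, so the sketch rejects that combination rather than guessing. Later adjustments (identical adjacent vowels, contours, and the final lowering) are not covered.

```python
# Default levels, keyed by (phonemic tone, stressed?).
BASE_LEVEL = {
    ("H", False): 4,
    ("L", False): 2,
    ("H", True): 5,
    ("L", True): 1,
}
# Levels for an open syllable before a stressed syllable.
OPEN_BEFORE_STRESS_LEVEL = {"H": 3, "L": 1}
# Diphthongs carry two levels, one per component vowel.
DIPHTHONG_LEVELS = {"HH": (6, 5), "HL": (5, 3), "LH": (1, 3), "LL": (2, 1)}


def base_tone_level(tone, stressed=False, open_before_stress=False):
    """Return the initial tone level (0 to 6) of a non-diphthong syllable."""
    if stressed and open_before_stress:
        # Not specified by the description; refuse rather than guess.
        raise ValueError("combination not covered by this sketch")
    if open_before_stress:
        return OPEN_BEFORE_STRESS_LEVEL[tone]
    return BASE_LEVEL[(tone, stressed)]
```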

If two adjacent copies of an identical vowel have the same tone level at this stage, then the one closer to the stressed syllable rises by one tone level and the one farther from it falls by one level.

A tone level of n is then changed into a tone contour in the following situations, unless doing so would result in an out-of-bounds tone level:

In addition, other syllables change their tone levels:

Finally, if all tones have a level of 4 or higher, then the lowest tone (breaking ties by preferring later tones) is lowered to 3, and all other tones in the same syllable are lowered by the same amount. All level-3 tones are then lowered to level 2.

Isochrony

The isochrony of Ŋarâþ Crîþ falls somewhere between syllable and mora timing, where:

Mutations

Ŋarâþ Crîþ has two kinds of initial mutations: lenition and eclipsis. Neither kind of mutation has any effect on plosive-fricative onsets or any of ⟦r l n ŋ ħ⟧.

Lenition tends to turn plosives into fricatives and is indicated with a middle dot ⟦·⟧ after the consonant affected. In particular, it affects ⟦p t d č c g m f v ð⟧. (See Layer 2 for pronunciation details.) Partial lenition does not affect any of ⟦f v ð⟧; that is, it does not lenite consonants that would become silent. Unless otherwise qualified, lenition refers to total lenition, which affects ⟦f v ð⟧.
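
As a rough illustration, word-initial lenition can be sketched in Python as below. The function name is ours, the input is assumed to be a bare word without markers or ⟦&⟧, and we assume that a "plosive-fricative onset" is a plosive letter immediately followed by a fricative letter.

```python
import unicodedata

LENITION_MARK = "·"
LENITABLE = set(unicodedata.normalize("NFC", "ptdčcgmfvð"))
SKIPPED_BY_PARTIAL = set("fvð")  # not lenited under partial lenition
PLOSIVES = set("ptdcg")
FRICATIVES = set(unicodedata.normalize("NFC", "fvþðsšhħ"))


def lenite(word, partial=False):
    """Lenite the first consonant of a word, if it is eligible."""
    word = unicodedata.normalize("NFC", word)
    first, second = word[:1], word[1:2]
    if first in PLOSIVES and second in FRICATIVES:
        return word  # assumed shape of a plosive-fricative onset
    if first not in LENITABLE:
        return word  # vowels and letters such as r, l, n, ŋ, ħ are unaffected
    if partial and first in SKIPPED_BY_PARTIAL:
        return word
    if second == LENITION_MARK:
        return word  # already lenited
    return first + LENITION_MARK + word[1:]
```

Under these assumptions, `lenite("denfo")` returns the word with ⟦d·⟧ in initial position, and partial lenition leaves a word beginning with ⟦f⟧ untouched.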

In a word containing ⟦&⟧, both instances of the reduplicated prefix are lenited. For example, ⟨&d·enfo⟩ can be pronounced as [ðeðenfo] but not as *[ðedenfo].

Lenition occurs in the following environments:

Eclipsis tends to add voice to voiceless consonants and change voiced stops into nasals. It is indicated by prefixing a consonant: ⟦t d c g f þ ł⟧ become ⟦dt nd gc ŋg vf ðþ lł⟧, respectively. ⟦p⟧ becomes ⟦vp⟧ before any of ⟦i e u î ê⟧ and ⟦mp⟧ elsewhere. If a word starts with a vowel, then it is eclipsed by prefixing ⟦g⟧.
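
The eclipsis rules can likewise be sketched in Python. As before, the function name is ours, the input is assumed to be a bare word without markers or ⟦&⟧, and a "plosive-fricative onset" is assumed to be a plosive letter immediately followed by a fricative letter.

```python
import unicodedata


def _nfc(text):
    return unicodedata.normalize("NFC", text)


ECLIPSIS = {_nfc(k): _nfc(v) for k, v in {
    "t": "dt", "d": "nd", "c": "gc", "g": "ŋg",
    "f": "vf", "þ": "ðþ", "ł": "lł",
}.items()}
VOWELS = set(_nfc("eoaîiêôâu"))
VP_TRIGGERS = set(_nfc("ieuîê"))  # ⟦p⟧ becomes ⟦vp⟧ before these vowels
PLOSIVES = set("ptdcg")
FRICATIVES = set(_nfc("fvþðsšhħ"))


def eclipse(word):
    """Apply eclipsis to the start of a word."""
    word = _nfc(word)
    first, second = word[:1], word[1:2]
    if first in VOWELS:
        return "g" + word
    if first in PLOSIVES and second in FRICATIVES:
        return word  # assumed shape of a plosive-fricative onset
    if first == "p":
        return ("vp" if second in VP_TRIGGERS else "mp") + word[1:]
    if first in ECLIPSIS:
        return ECLIPSIS[first] + word[1:]
    return word  # all other initials are unaffected
```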

In a word containing ⟦&⟧, only the first instance of the reduplicated prefix is eclipsed. For example, ⟨n&denfin⟩ can be pronounced as [nedenfin] but not as *[nenenfin].

Eclipsis occurs in the following environments:

Lenition can happen on any syllabic onset of a word, but eclipsis is limited to word-initial positions.

In this documentation, lenition is sometimes marked with an empty circle ○, and eclipsis with a filled circle ●. Partial lenition is marked with an empty triangle △.

Loanwords

Almost all loanwords in Ŋarâþ Crîþ are nouns. [TODO: we are reworking nouns]

Generally, when borrowing from languages that use the Cenvos script or a script related to it, and whose orthographies in the script in question do not deviate too far from Ŋarâþ Crîþ usage, Ŋarâþ Crîþ prefers to borrow the word graphemically rather than phonemically.

The typography of Ŋarâþ Crîþ

In principle, layer 2w is the highest written layer needed to write in Ŋarâþ Crîþ. (Note that there is only one valid layer-2w representation for each layer-1 string; in other words, changing a valid layer-2w string in a way that preserves the layer-1 representation always results in an invalid layer-2w string.) However, speakers of Ŋarâþ Crîþ tend to value aesthetics, even in writing. Thus, a mastery of handwriting beyond layer 2w is considered crucial.

Even though movable type has been available for a long time, prominent parts of printed materials (such as titles) often continued to use plates engraved from handwriting. Eventually, typography and calligraphy were considered parts of the same discipline, leading to typefaces supporting more features from the latter. Even today, logos often opt for lettering over typefaces. Because of this unification, we use the term typography to refer to the discipline of laying out writing in general.

Although a full treatment of Ŋarâþ Crîþ typography is out of scope for this grammar, this section gives an overview of the concerns at hand.

Kerning

Cenvos is a script that absolutely requires kerning. To start, some glyphs such as ²⟨e⟩ and ²⟨m⟩ have long leftward tails that necessitate kerning with glyphs such as ²⟨s⟩ or ²⟨o⟩, which lack descenders, or even some glyphs with descenders such as ²⟨j⟩.

Other glyphs such as ²⟨j⟩ and ²⟨ê⟩ have shorter leftward descenders that also require kerning with following glyphs.

²⟨â⟩ has a descender in the opposite direction; thus, it must kern with certain preceding glyphs.

Diagonal strokes with matching slopes (such as in ²⟨âv⟩ or ²⟨rj⟩) should be kerned to bring them closer.

Figure 1: Examples of glyph pairs that require kerning: ²⟨es⟩, ²⟨mj⟩, ²⟨jo⟩, ²⟨ên⟩, ²⟨câ⟩, and ²⟨âv⟩.

Moreover, even pairs are sometimes insufficient. Since ²⟨e⟩ and ²⟨i⟩ are kerned so closely, ²⟨ei⟩ must itself kern with glyphs such as ²⟨s⟩.

Figure 2: Kerning of ²⟨eis⟩ and ²⟨eig⟩. In ²⟨eis⟩, ²⟨ei⟩ has room to kern with ²⟨s⟩. ²⟨ei⟩ obviously cannot kern with ²⟨g⟩; that is, in ²⟨eig⟩, ²⟨i⟩ and ²⟨g⟩ are spaced farther apart than usual.

Ligation and shaping

Another important aspect of typography is the use of ligatures (beyond the required ones). The concepts of higher written layers and the hierarchy of graphic variations have been developed to try to formalize this problem.

To explain the idea behind this model, we note that a good ligature will have the end of one glyph near the start of the next. The starting and ending points of a glyph, in turn, depend on the order in which the strokes are written.

Furthermore, natural handwriting tends to join certain strokes together. In some cases, this joining can affect how a glyph ligates; for instance, ³⟨a1α cannot ligate with the previous character (ligating through the middle would cause a stroke collision with stroke 2 of ³⟨a1α), but ³⟨a1β, in which the two strokes are joined without a loop, can do so.

In addition, rapid handwriting often produces stylistic variations of glyphs. For example, ³⟨i2α (“²⟨i⟩ with the stroke going upward”) can often end in a leftward swash at the end of the stroke. Since this deviation does not create any ambiguity, it has been accepted, yielding the stylistic variant ⁴⟨i2αS.

Figure 3: (a) An example of a bad ligature, in which the first glyph ends at the baseline and the second glyph starts at the top line. In the next example, the second glyph starts at the baseline as well, avoiding an awkward joining point. (b) A difference in stroke order (shown with the glyph ²⟨a⟩) can change the starting points (shown as blue dots) and the ending points (shown as red dots) of a glyph. (³⟨a1α does not have a starting point suitable for ligation.) (c) The first stroke of ³⟨a1α blocks ligation from a previous glyph, but such a stroke is absent in ³⟨a1β. (d) The default variant ⁴⟨i2α in comparison to ⁴⟨i2αS (both ligated after ⁴⟨f1α).

We now cover the formalism itself. Layers 2w*, 3w, and 4w are aesthetic layers; the writer decides the precise sequence of glyphs to realize a layer-2w string in higher layers. Nonetheless, not all layer-3w or -4w strings are valid, even those that correspond to valid layer-2w strings; for instance, ³⟨s1i1 is not a valid realization of ²⟨si⟩ because it requires a base-to-top ligation.

Only some glyphs participate in typesetting. Notably, all letters participate, but no numerals do so, nor does the space.

Each participating layer-2w* glyph has a hierarchy of variations as follows:

Layer 2w is transliterated using mostly the same symbols as the layer-1 romanization, but required ligatures are notated with an overline (such as in ²⟨me for 𐲌𐲁), and final forms are written as if they were ligatures with a special $ symbol: ²⟨c$ for 𐲀. Layer 2w* introduces discretionary ligatures, which are similarly marked in our notation. By discretionary ligature, we mean a ligature that the writer may choose to use but is not obligated to do so, and that cannot be derived by simply connecting the ending stroke of one glyph to the starting stroke of another.

Layer 3w works on topological variants. The overline denotes optional ligatures between topological variants; it is now omitted for required and discretionary ligatures, which are their own layer-2w* glyphs in their own right: ³⟨+1αme1αr1αl2βa1αn1α #1αf1αl2δi1βr1αo1αr2αa3β transliterates a particularly fancy realization of ⟨+merlan #flirora⟩.

Figure 4: What ³⟨+1αme1αr1αl2βa1αn1α #1αf1αl2δi1βr1αo1αr2αa3β would look like.

Layer 4w works on stylistic variants. In the transliteration, the overline is used as in 3w.

Layer 3w can be thought of as the ‘ligation layer’; similarly, layer 4w can be thought of as the ‘shaping layer’.

Table 7 describes the canonical stroke order of each glyph, and Table 8 lists the stroke-order variants.

Glyph | Stroke order
c | (1) Counterclockwise
e | (1) From top right to bottom left
n | (1) From top left to bottom right
ŋ | (1) From top right to bottom
v | (1) From right to left
o | (1) From top to bottom left
s | (1) From top right to bottom left
þ | (1) Rightmost stroke from right to left; (2) Leftmost stroke from right to left
š | (1) From top right to bottom left
r | (1a) From bottom to top (1b) to left
l | (1a) r-stroke from bottom to top (1b) to left; (2) Intersecting stroke from right to left
ł | (1a) o-stroke from top to bottom (1b) to left; (2) Intersecting stroke from right to left
m | (1) e-stroke from top right to bottom left; (2) Intersecting stroke from right to left
a | (1) þ-sloping stroke from left to right; (2) f-sloping stroke from right to left
f | (1) Rightmost stroke from right to left; (2) Leftmost stroke from right to left
g | (1) From top right to bottom
p | (1) From right to bottom
t | (1a) v-stroke from right to top (1b) to left; (2) Vertical stroke from top to bottom
č | (1) Ascending stroke from top to bottom; (2) f-sloping stroke from right to left
î | (1) From bottom right to top left
j | (1) From top right to bottom left
i | (1) From top to bottom
d | (1) þ-sloping stroke from left to right; (2) f-sloping stroke from right to left
ð | (1) Leftmost þ-sloping stroke from left to right; (2) Rightmost þ-sloping stroke from left to right; (3) f-sloping stroke from right to left
h | (1) From right to left
ħ | (1) Clockwise, starting and ending at the top
ê | (1) From top right to bottom left
ô | (1) From top to bottom
â | (1) From bottom right to top left
u | (1) o-stroke from top to bottom left; (2) Rightmost dot; (3) Leftmost dot
w | (1) From top to bottom
x | (1) Stroke with descender, starting from the top-right corner and ending on the descender; (2) Wave stroke, from right to left
y | (1) From right to left
z | (1) From right to left
c$ | (1) From right to bottom left
ŋ$ | (1) ŋ-stroke from top right to bottom; (2) Intersecting stroke from right to left
ee | (1) e-stroke from top right to bottom left; (2) Overbar from right to left
em | (1) e-stroke from top right to bottom left; (2) Roof from right to left
me | (1) e-stroke from top right to bottom left; (2) Intersecting stroke from right to left; (3) Overbar from right to left
mm | (1) e-stroke from top right to bottom left; (2) Intersecting stroke from right to left; (3) Roof from right to left
jâ | (1) j-stroke from top right to bottom left; (2) Ring clockwise (starting and ending point unspecified)
âj | (1) â-stroke from bottom right to top left; (2) Ring clockwise (starting and ending point unspecified)
ww | (1) w-stroke, from top to bottom; (2) Ring clockwise (starting and ending point unspecified)
xx | (1) Stroke with descender, starting from the top-right corner and ending on the descender; (2) Wave stroke, from right to left; (3) Bottom-right tick; (4) Top-left tick
yy | (1) y-stroke, from right to left; (2) Tick, from top to bottom
zz | (1) z-stroke, from right to left; (2) Ring clockwise (starting and ending point unspecified)
# | (1) From bottom right to top left
+ | (1) From top right to bottom left
+* | (1) From top right to bottom left; (2) Vertical stroke from top to bottom; (3) f-sloping stroke from top right to bottom left; (4) þ-sloping stroke from bottom right to top left
@ | (1) Vertical stroke from top to bottom; (2) v-stroke from right to left
* | (1) Vertical stroke from top to bottom; (2) Horizontal stroke from right to left; (3) f-sloping stroke from top right to bottom left; (4) þ-sloping stroke from bottom right to top left
& | (1) Sinusoid from right to left; (2) Arrowhead
. | (1) Main stroke from right to left; (2) Arrowhead
; | (1) Main stroke from right to left; (2) Arrowhead
? | (1) Main stroke from right to left; (2) Arrowhead
! | (1) Main stroke from right to left; (2) Arrowhead
{ | (1) From right to left
} | (1) From right to left
« | (1) From top to bottom
» | (1) Vertical stroke from top to bottom; (2) Left cornered edge from top to bottom
/ | (1) From bottom, curving at the top toward the left, then descending while crossing to the right half and possibly to the left again
(ra) | (1) Stroke as in ²⟨r⟩, but with the end extending to the descender line; (2) Stroke intersecting the second part of stroke 1
(ro) | (1a) The stem of the ²⟨r⟩-stroke, from bottom to top (1b) A ²⟨v⟩-stroke from right to left
Table 7: Canonical stroke orders for layer-2w* glyphs. (Glyphs in parentheses are discretionary ligatures.)
Figure 5: Canonical stroke orders of layer-2w glyphs.
Figure 6: Stroke orders of discretionary ligatures.
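Glyph | 1 | 2 | 3 | 4 | 5 | 6
c | 1
e | 1
n | 1 | 1′
ŋ | 1
v | 1
o | 1
s | 1
þ | 1 2
š | 1
r | 1 | 1a′ 1b
l | 1 2 | 1a′ 1b 2
ł | 1 2
m | 1 2 | 2′ 1 | 1 2′
a | 1 2 | 2 1′ | 1′ 2
f | 1 2
g | 1
p | 1
t | 1 2 | 1a+2 1b
č | 1 2
î | 1
j | 1
i | 1 | 1′
d | 1 2 | 2 1′ | 1′ 2
ð | 1 2 3 | 3 1 2
h | 1
ħ | 1
ê | 1
ô | 1
â | 1
u | 1 2 3
w | 1
x | 1 2
y | 1
z | 1
c$ | 1
ŋ$ | 1 2 | 1 2′
ee | 1 2 | 2 1
em | 1 2 | 2 1
me | 1 2 3 | 2′ 1 3 | 1 2′ 3 | 3 1 2 | 3 2′ 1 | 3 1 2′
mm | 1 2 3 | 2′ 1 3 | 1 2′ 3 | 3 1 2 | 3 2′ 1 | 3 1 2′
jâ | 1 2
âj | 1 2
ww | 1 2
xx | 1 2 3 4
yy | 1 2
zz | 1 2
# | 1
+ | 1
+* | 1 2 3 4
@ | 1 2
* | 1 2 3 4
& | 1 2
. | 1 2
; | 1 2
? | 1 2
! | 1 2
{ | 1
} | 1
« | 1
» | 1 2 | 1+2′
/ | 1
(ra) | 1 2
(ro) | 1 | 1a′ 1b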
Table 8: Stroke order variants of glyphs, in reference to the canonical stroke order. The prime symbol denotes the reverse direction; the plus denotes a fused stroke.
GlyphStart joinEnd joinDescriptionUse
c1αMDefaultDefault
e1αMvDDefaultDefault
e1βBvDStem shortened to start at baseAfter glyphs that end at the base
n1αDefaultDefault
n2αBMDefaultBefore glyphs that start at the mid
ŋ1αMDvDefaultDefault
v1αBBDefaultDefault
o1αTvMDefaultDefault
o1βMMLoop on stroke to allow for mid ligation with previous glyphAfter glyphs that end at the mid
s1αMBDefaultDefault
þ1αBMDefaultDefault
þ1βBMStrokes 1 and 2 connectedStylistic
š1αMBvDefaultDefault
r1αDvBDefaultDefault
r2αMvBDefaultRare (β form is more common), but sometimes after glyphs that end at the mid
r2βBvBStroke 1 disconnected from 2 (starts at base instead)After glyphs that end at the base
l1αDvMDefaultDefault
l1βDvMStrokes 1 and 2 connectedStylistic
l2αMvMDefaultRare (β form is more common), but sometimes after glyphs that end at the mid
l2βBvMStroke 1 disconnected from 2 (starts at base instead)After glyphs that end at the base
l2γMvMStrokes 2 and 3 connectedRare (δ form is more common), but stylization of α
l2δBvMStroke 1 disconnected from 2 (starts at base instead), and strokes 2 and 3 connectedStylization of β
ł1αTvBDDefaultDefault
ł1βTvBDStrokes 1 and 2 connectedStylistic
m1αMvDefaultDefault
m2αDDefaultRare; β form is more common
m2βDStrokes 1 and 2 connectedStylistic
m3αMvDefaultRare; β form is more common
m3βMvStrokes 1 and 2 connectedStylistic
a1αDDefaultDefault
a1βMDStrokes 1 and 2 fused, with 2 beginning where 1 ends (without a loop)Stylistic (‘italic’ variant)
a1γDStrokes 1 and 2 connected (with a loop)Stylistic
a2αMMDefaultAfter glyphs that end at the mid
a2βMMStrokes 1 and 2 connected (rare)Stylistic
a3αBDDefaultAfter glyphs that end at the base
a3βBDStrokes 1 and 2 connectedStylistic
f1αMBDefaultDefault
f1βMBStrokes 1 and 2 connectedStylistic
g1αMDvDefaultDefault
p1αBDvDefaultDefault
t1αBDefaultDefault
t2αBBDefaultStylistic
č1αTBDefaultDefault
î1αBMDefaultDefault
j1αMDDefaultDefault
i1αTvBvDefaultDefault
i1βMBvLoop on stroke to allow for mid ligation with previous glyphAfter glyphs that end at the mid
i2αBTDefaultAfter glyphs that end at the base
d1αBDefaultDefault
d2αMMDefaultAfter glyphs that end at the mid
d3αBBDefaultAfter glyphs that end at the base
ð1αBDefaultDefault
ð1βBStrokes 1 and 2 connectedStylistic
ð1γBStrokes 2 and 3 connectedStylistic
ð1δBStrokes 1, 2, and 3 connectedStylistic
ð2αMMDefaultAfter glyphs that end at the mid, or as a stylization
ð2βMMStrokes 2 and 3 connectedStylistic
h1αMMDefaultDefault
ħ1αDefaultDefault
ê1αMDDefaultDefault
ê1βMStroke bends to the right at the end, preventing linkage with the next glyphStylistic
ô1αMDDefaultDefault
â1αDMDefaultDefault
u1αTvDBDefaultDefault
u1βMDBLoop on stroke 1 to allow for mid ligation with previous glyphAfter glyphs that end at the mid
w1αMDvDefaultDefault
x1αMMDefaultDefault
y1αBBDefaultDefault
z1αBBDefaultDefault
c$1αMDDefault (in practice, final forms have no successor to ligate to)Default
ŋ$1αMDBDefaultDefault
ŋ$2αMDefaultRare; β form is more common
ŋ$2βMStrokes 1 and 2 connectedStylistic
ee1αMvMDefaultDefault
ee2αMDDefaultSometimes after a glyph that ends at the mid
ee2βMDStrokes 1 and 2 connected (uncommon)Stylistic
em1αMvMDefaultDefault
em2αMDDefaultStylistic
em2βMDStrokes 1 and 2 connected (uncommon)Stylistic
me1αMvMDefaultDefault
me2αMDefaultStylistic
me2βMStrokes 1 and 2 connectedStylistic
me3αMvMDefaultStylistic
me3βMvMStrokes 1 and 2 connectedStylistic
me3γMStrokes 2 and 3 connectedStylistic
me3δMStrokes 1, 2, and 3 connectedStylistic
me4αMDDefaultSometimes after a glyph that ends at the mid
me4βMDStrokes 1 and 2 connectedStylistic
me5αMDDefaultSometimes after a glyph that ends at the mid
me5βMDStrokes 1 and 2 connectedStylistic
me5γMDStrokes 2 and 3 connectedStylistic
me5δMDStrokes 1, 2, and 3 connectedStylistic
me6αMDefaultSometimes after a glyph that ends at the mid
me6βMStrokes 1 and 2 connectedStylistic
me6γMStrokes 2 and 3 connectedStylistic
me6δMStrokes 1, 2, and 3 connectedStylistic
mm1αMvMDefaultDefault
mm2αMDefaultStylistic
mm2βMStrokes 1 and 2 connectedStylistic
mm3αMvMDefaultStylistic
mm3βMvMStrokes 1 and 2 connectedStylistic
mm3γMStrokes 2 and 3 connectedStylistic
mm3δMStrokes 1, 2, and 3 connectedStylistic
mm4αMDDefaultSometimes after a glyph that ends at the mid
mm4βMDStrokes 1 and 2 connectedStylistic
mm5αMDDefaultSometimes after a glyph that ends at the mid
mm5βMDStrokes 1 and 2 connectedStylistic
mm5γMDStrokes 2 and 3 connectedStylistic
mm5δMDStrokes 1, 2, and 3 connectedStylistic
mm6αMDefaultSometimes after a glyph that ends at the mid
mm6βMStrokes 1 and 2 connectedStylistic
mm6γMStrokes 2 and 3 connectedStylistic
mm6δMStrokes 1, 2, and 3 connectedStylistic
jâ1αMMDefaultDefault
âj1αDDDefaultDefault
ww1αMDefaultDefault
xx1αMDDefaultDefault
yy1αBMDefaultDefault
zz1αBDefaultDefault
#1αMDefaultDefault
+1αMDefaultDefault
+*1αDefaultDefault
@1αTvMDefaultDefault
@1βMMLoop on stroke 1 to allow for mid ligation with previous glyphAfter a glyph that ends at the mid
*1αMDefaultDefault
&1αDefaultDefault
.1αMTDefaultDefault
;1αBDefaultDefault
?1αMTDefaultDefault
!1αMDefaultDefault
{1αTTvDefaultDefault
}1αBvBDefaultDefault
«1αDefaultDefault
»1αDefaultDefault
»2αDefaultStylistic (handwriting variant)
/1αDefaultDefault
ra1αDvDefaultDefault
ro1αDvMDefaultDefault
ro2αMvMDefaultRare (β form is more common), but sometimes after glyphs that end at the mid
ro2βBvMStroke 1 disconnected from 2 (starts at base instead)After glyphs that end at the base
Table 9: Topological variants of glyphs: ligation properties and descriptions. (Stroke numbers are in reference to the stroke-order variant, not the 2w glyph.)

Table 9 lists all topological variants with their possible join positions on each side, with B for base, M for mid (or mean), T for top (ascender line), and D for descender. If more than one position is listed, then any one of them can be used. A v suffix on a position indicates that the stroke end at the appropriate side is vertical.

In general, for two topological variants a and b to ligate to each other (in that order), there must exist a position C such that a can join at C endward and b can join at C startward, with at least one end not being vertical.
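
The general rule can be expressed as a small compatibility check. The Python sketch below assumes that a join specification is written as in Table 9, that is, as a string of position letters (B, M, T, D), each optionally followed by v for a vertical stroke end. The function names are ours, and the exceptions to the rule are not modelled.

```python
def parse_joins(spec):
    """Parse a join specification such as "TvBD" into {position: is_vertical}."""
    joins = {}
    i = 0
    while i < len(spec):
        position = spec[i]
        if position not in "BMTD":
            raise ValueError("unknown join position: " + position)
        vertical = spec[i + 1:i + 2] == "v"
        joins[position] = vertical
        i += 2 if vertical else 1
    return joins


def can_ligate(end_of_a, start_of_b):
    """Check whether variant a (by its end joins) can ligate into variant b.

    There must be a shared position at which at least one of the two
    stroke ends is not vertical.
    """
    a = parse_joins(end_of_a)
    b = parse_joins(start_of_b)
    return any(pos in b and not (a[pos] and b[pos]) for pos in a)
```

For instance, an end join of `"B"` is compatible with a start join of `"Bv"`, whereas `"Mv"` followed by `"Mv"` is not, because both stroke ends are vertical.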

There are a few exceptions to this rule: any topological variant of ²⟨l⟩ can be ligated before ³⟨i2α (see Figure 4 for an example).

Stylistic variants are much less standardized in comparison, but there are some widely recognized variants:

²⟨’⟩ and ²⟨·⟩ are special: they can ligate with any participating glyph on either end, appearing as an extension of the stroke near the ²⟨’⟩ or ²⟨·⟩. Nonetheless, such ligation is not particularly common.

The rules over layers 3w and 4w dictate only what is legal, not what is considered beautiful. (Indeed, it is perfectly legal to use the 1α form of every glyph and abstain from all non-required ligatures.) Nor do they dictate how an eligible pair of glyphs should be ligated. There are some guidelines, however, on what is desirable:

Connotations associated with choices in layer-4w realization

Of course, context also plays a role in deciding how to realize text into layer 4w. First, the purpose of the writing has an influence (text meant for children or language learners will be less embellished, and header text tends to be more embellished than body text).

Another part of context is the expressive connotation that the writer wishes to communicate.

Connotation | Properties of realization
Elegant, refined | Increased use of ligation in general; use of ‘broken ²⟨r⟩-stroke forms’ such as ³⟨r2β and ³⟨l2β
Rational | Use of the non-H stylistic variants of glyphs such as ³⟨r1α after ²⟨e⟩ or ²⟨m⟩ rather than the H variants
Casual, informal | Use of ³⟨a1β
Table 10: Expressive connotations associated with choices in layer-4w realization.

Vertical ligation

Another desirable practice is vertical ligation, in which the strokes of two glyphs in different lines are connected. This is naturally difficult even in handwriting, let alone in type!