Layers 1 and 2w: Cenvos and its romanization
Rather than starting at layer 0, we start at layers 1 and 2w.
Cenvos, the native script of Ŋarâþ Crîþ, is written from right to left. This script can be analyzed on two levels: graphemes, which constitute the abstract level, and glyphs, which are the characters being written. For instance, Cenvos has one grapheme romanized as ⟨c⟩ that corresponds to two different glyphs: the non-final form 𐲀𐲢 (denoted as ²⟨c⟩) and the final form 𐲀 (²⟨c$⟩). As another example, the sequence 𐲌𐲁 (⟨me⟩ = ²⟨me⟩) consists of one glyph but two graphemes.
In this grammar, we primarily use the romanization, whose symbols largely map one-to-one with Cenvos graphemes. Cenvos has four kinds of graphemes:
- True letters are graphemes that represent sounds.
- Markers, while considered letters, do not represent sounds. Instead, they indicate that the words affected are treated specially. They occur on the level of a word and do not actively participate in morphology.
- Punctuation includes the clause-end punctuation ⟨.⟩, ⟨;⟩, ⟨?⟩, and ⟨!⟩; the clitic boundary mark ⟨’⟩; the lenition mark ⟨·⟩; the grouping brackets ⟨{}⟩; and the quotation marks ⟨«»⟩.
- Digits can be used to write short numerals.
Of course, there is also the space.
Cen | Name | Rom | Cen | Name | Rom | Cen | Name | Rom |
---|---|---|---|---|---|---|---|---|
True letters | ||||||||
𐲀𐲢 | ca | c | 𐲌 | ma | m | 𐲘 | ar | h |
𐲁 | e | e | 𐲍 | a | a | 𐲙 | ħo | ħ |
𐲂 | na | n | 𐲎 | fa | f | 𐲚 | ên | ê |
𐲃𐲢 | ŋa | ŋ | 𐲏 | ga | g | 𐲛 | ôn | ô |
𐲄 | va | v | 𐲐 | pa | p | 𐲜 | ân | â |
𐲅 | o | o | 𐲑 | ta | t | 𐲝 | uħo | u |
𐲆 | sa | s | 𐲒 | ča | č | 𐳀 | cełaŋa | w |
𐲇 | þa | þ | 𐲓 | în | î | 𐳁 | avarte | x |
𐲈 | ša | š | 𐲔 | ja | j | 𐳂 | priþnos | y |
𐲉 | ra | r | 𐲕 | i | i | 𐳃 | telrigjon | z |
𐲊 | la | l | 𐲖 | da | d | |||
𐲋 | ła | ł | 𐲗 | ða | ð | |||
Final forms and ligatures (layer 2w) | ||||||||
𐲀 | c$ | 𐲌𐲁 | me | 𐳀𐳀 | ww | |||
𐲃 | ŋ$ | 𐲌𐲌 | mm | 𐳁𐳁 | xx | |||
𐲁𐲁 | ee | 𐲔𐲜 | jâ | 𐳂𐳂 | yy | |||
𐲁𐲌 | em | 𐲜𐲔 | âj | 𐳃𐳃 | zz | |||
Markers | ||||||||
𐲤 | carþ | # | 𐲦 | njor | +* | 𐲨 | nef | * |
𐲥 | tor | + | 𐲧 | es | @ | 𐲯 | sen | & |
Punctuation | ||||||||
𐲞 | gen | . | 𐲩 | ŋos | ’ | 𐲭 | fos | « |
𐲟 | tja | ; | 𐲪 | łil | · | 𐲮 | þos | » |
𐲠 | šac | ? | 𐲫 | rin | { | 𐳄 | jedva | / |
𐲡 | cjar | ! | 𐲬 | cin | } | 𐲣 | mivaf·ome | - |
𐳅 | vas | : |
The letters ⟨w⟩, ⟨x⟩, ⟨y⟩, and ⟨z⟩ are USR letters. These are used in foreign languages written in Cenvos to represent phonemes that are not approximated by the phonology of Ŋarâþ Crîþ. Each foreign orthography is free to assign them as it pleases.
Two Cenvos graphemes, ⟨c⟩ and ⟨ŋ⟩, change form at the end of a word, and the script also has several ligatures. The romanization does not distinguish these forms.
The marker ⟨*⟩ is used for foreign words, such as loanwords and foreign names. ⟨#⟩ is used to prefix given names. ⟨+⟩ is used to prefix surnames passed by native conventions (i.e. from parent to child within the same gender); ⟨+*⟩ marks a surname passed using non-native conventions. Place names are prefixed with ⟨@⟩. ⟨#⟩, ⟨+⟩, ⟨+*⟩, and ⟨@⟩ can all be used with ⟨*⟩, in which case ⟨*⟩ occurs first. Note that ⟨+*⟩ is a single letter of its own and not a ligature.
At the start of a word, ⟨&⟩ indicates reduplication of an unspecified prefix of the rest of the word. For instance, ⟨&cên⟩ can be pronounced as if it were ⟨cêcên⟩ or ⟨cêncên⟩. (⟨&⟩ occurs after all other markers in this case.) This usage is not productive in standard Ŋarâþ Crîþ, but it appears in a few words, as well as in some idiosyncratic cases. At the middle or the end of a word, or alone, it indicates ellipsis of part or all of the word, most often to abbreviate or censor a word. Lastly, ⟨&{}⟩ is used similarly to the ellipsis in Western punctuation.
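The word-initial use of ⟨&⟩ can be illustrated with a minimal Python sketch. The function name `reduplication_candidates` is hypothetical. The sketch assumes that ⟨&⟩ is the first symbol of the string (no other markers before it) and that every non-empty prefix is a possible reading, so it deliberately over-generates: the text only says that the prefix is unspecified.

```python
def reduplication_candidates(word):
    """List the possible readings of a word that starts with the marker &.

    Assumption: any non-empty prefix of the rest of the word may be
    reduplicated; the grammar leaves the choice of prefix unspecified.
    """
    if not word.startswith("&"):
        return [word]
    rest = word[1:]
    # Prepend each prefix rest[:k] to the full remainder of the word.
    return [rest[:k] + rest for k in range(1, len(rest) + 1)]


candidates = reduplication_candidates("&cên")
# The two readings named in the text are among the candidates.
print("cêcên" in candidates, "cêncên" in candidates)  # prints: True True
```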
Markers can be applied to multi-word strings by surrounding the string with the grouping brackets ⟨{}⟩. In legal language, ⟨{}⟩ are also used around phrases to resolve ambiguities.
The sentence punctuation marks ⟨.⟩, ⟨?⟩, and ⟨!⟩ are used as expected. ⟨;⟩ is used to separate two independent clause phrases within the same sentence.
The quotation marks, ⟨«»⟩, are used around quotations, direct or indirect. A ⟨.⟩ at the end of a quotation embedded within another sentence is omitted. In legal language, ⟨«»⟩ are used in contracts around terms that refer to specific entities or places.
⟨:⟩ is used for the following functions:
- listing separate elements of a symbolic identifier, in which case it is not surrounded by spaces,
- separating the usual end of an independent clause from a postposed adjunct, in which case it is preceded by a space,
- or separating principal parts of a lexical entry, in which case it is surrounded by spaces on both sides.
⟨’⟩ is used to separate clitics from the rest of the word to which they are attached. ⟨·⟩ indicates lenition; it could be described as a “letter modifier”. It is also used as a decimal point: officially, it is used after the most significant digit of an inexact numeral when written with digits, but it is also used unofficially to write non-integers.
⟨/⟩, as its derivation from ⟨i⟩ suggests, is used to separate the number of mjari from the number of edva when writing currency amounts.
The morpheme boundary marker, ⟨-⟩, is sometimes used metalinguistically to mark a morpheme boundary, but it is not strictly a part of layer 1.
Spaces are placed in the following places:
- between orthographic words, but not between a clitic and the word to which it is attached
- after (but not before) ⟨.⟩, ⟨;⟩, ⟨?⟩, and ⟨!⟩
- before ⟨«⟩ and after ⟨»⟩ (but not on the other sides)
- around ⟨&{}⟩
[TODO: cover mentions of letters within the language, corresponding to v7 p17 “When letters or markers are referred to, … but the effects on other glyphs are not standardized”]
Digits are interchangeable with short-form numerals, but not with long-form numerals. They are also written right-to-left in Cenvos, with the most significant digit first: 𐲲 is 0x2A3 = 675.
Cen | # | Cen | # | Cen | # | Cen | # |
---|---|---|---|---|---|---|---|
𐲰 | 0 | 𐲱 | 1 | 𐲲 | 2 | | 3 |
| 4 | | 5 | | 6 | | 7 |
| 8 | | 9 | | A | | B |
| C | | D | | E | | F |
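Since the digits run from 0 to F, a short numeral can be read as a hexadecimal number with the most significant digit first. A minimal sketch in Python, working on the romanized digits rather than on the Cenvos glyphs (the function name `parse_short_numeral` is hypothetical):

```python
ROMANIZED_DIGITS = "0123456789ABCDEF"


def parse_short_numeral(digits):
    """Convert a romanized digit string, most significant digit first."""
    value = 0
    for digit in digits:
        # Each further digit shifts the running value by one base-16 place.
        value = value * 16 + ROMANIZED_DIGITS.index(digit)
    return value


print(parse_short_numeral("2A3"))  # prints: 675
```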
Letter numbering
Sometimes, an integer must be assigned to each letter. In this case, the assignment shown in the table below is used. Note that numbers are not assigned fully sequentially. Furthermore, this function is valid only for layer 1 graphemes.
Letter | Hex | Dec | Letter | Hex | Dec | Letter | Hex | Dec |
---|---|---|---|---|---|---|---|---|
True letters | ||||||||
c | 0 | 0 | m | 20 | 32 | h | 11 | 17 |
e | 1 | 1 | a | 9 | 9 | ħ | 12 | 18 |
n | 2 | 2 | f | A | 10 | ê | 101 | 257 |
ŋ | 2B | 43 | g | B | 11 | ô | 104 | 260 |
v | 3 | 3 | p | C | 12 | â | 109 | 265 |
o | 4 | 4 | t | D | 13 | u | 13 | 19 |
s | 5 | 5 | č | DE | 222 | w | −1 | −1 |
þ | 55 | 85 | î | E | 14 | x | −2 | −2 |
š | 5E | 94 | j | 6E | 110 | y | −3 | −3 |
r | 6 | 6 | i | F | 15 | z | −4 | −4 |
l | 7 | 7 | d | 10 | 16 | |||
ł | 77 | 119 | ð | 155 | 341 | |||
Markers | ||||||||
# | 14 | 20 | +* | 16 | 22 | * | 19 | 25 |
+ | 15 | 21 | @ | 17 | 23 | & | 1A | 26 |
The letter sum of a word is the sum of the numbers assigned to each of its letters. This value is used in some of the noun declension paradigms.
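As an illustration, the letter sum can be computed from the romanization with a short Python sketch. The numbers are copied from the table above; the names `LETTER_NUMBERS` and `letter_sum` are hypothetical. The sketch assumes that symbols without an assigned number, such as ⟨·⟩ and ⟨’⟩, contribute nothing, which the text does not state explicitly.

```python
import unicodedata

# Letter numbers copied from the table above (hexadecimal column).
LETTER_NUMBERS = {
    "c": 0x0, "e": 0x1, "n": 0x2, "ŋ": 0x2B, "v": 0x3, "o": 0x4,
    "s": 0x5, "þ": 0x55, "š": 0x5E, "r": 0x6, "l": 0x7, "ł": 0x77,
    "m": 0x20, "a": 0x9, "f": 0xA, "g": 0xB, "p": 0xC, "t": 0xD,
    "č": 0xDE, "î": 0xE, "j": 0x6E, "i": 0xF, "d": 0x10, "ð": 0x155,
    "h": 0x11, "ħ": 0x12, "ê": 0x101, "ô": 0x104, "â": 0x109, "u": 0x13,
    "w": -1, "x": -2, "y": -3, "z": -4,
    "#": 0x14, "+": 0x15, "+*": 0x16, "@": 0x17, "*": 0x19, "&": 0x1A,
}


def letter_sum(word):
    """Sum the letter numbers of a romanized word."""
    # Use precomposed characters so that e.g. "ê" is a single code point.
    word = unicodedata.normalize("NFC", word)
    total = 0
    i = 0
    while i < len(word):
        # The marker +* is a single letter, so match it before plain +.
        if word.startswith("+*", i):
            total += LETTER_NUMBERS["+*"]
            i += 2
            continue
        # Assumption: symbols without a number add nothing to the sum.
        total += LETTER_NUMBERS.get(word[i], 0)
        i += 1
    return total


print(letter_sum("cenvos"))  # 0 + 1 + 2 + 3 + 4 + 5, prints: 15
```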
It is theorized that letter numbers were assigned in the following manner:
- The basic true letters inherited from Necarasso Cryssesa (i.e. those corresponding to ⟨c e n v o s r l m a f g p t î i d h⟩) received sequential numbers from zero. The number of ⟨m⟩ was changed due to superstitions against the number eight.
- ⟨ŋ þ š ł č ð⟩ received numbers based on what letter pairs (or triplets in the case of ⟨ð⟩) they were based on.
- ⟨ê⟩, ⟨ô⟩, and ⟨â⟩ were numbered as 256 + base glyph number.
- The other letters and the markers received sequential numbers after ⟨h⟩, skipping 0x18.
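The theorized procedure can be checked against the table with a small Python sketch. The function name `reconstruct_numbers` is hypothetical, and the order of the remaining letters and markers (⟨ħ u # + +* @ * &⟩) is an assumption read off the table values. The pair-based letters are left out because the exact source pairs are not spelled out here, and the USR letters are left out because the procedure does not mention them.

```python
BASIC_LETTERS = "c e n v o s r l m a f g p t î i d h".split()


def reconstruct_numbers():
    """Rebuild letter numbers from the theorized assignment rules."""
    # Basic letters: sequential numbers from zero.
    numbers = {letter: i for i, letter in enumerate(BASIC_LETTERS)}
    # The number of m was moved away from eight.
    numbers["m"] = 0x20
    # Hatted vowels: 256 plus the number of the base vowel.
    for hatted, base in (("ê", "e"), ("ô", "o"), ("â", "a")):
        numbers[hatted] = 256 + numbers[base]
    # Remaining letters and markers: sequential after h, skipping 0x18.
    # Assumption: this order is inferred from the table, not stated.
    next_number = numbers["h"] + 1
    for symbol in ("ħ", "u", "#", "+", "+*", "@", "*", "&"):
        if next_number == 0x18:
            next_number += 1
        numbers[symbol] = next_number
        next_number += 1
    return numbers


numbers = reconstruct_numbers()
print(hex(numbers["â"]), hex(numbers["*"]), hex(numbers["&"]))
# prints: 0x109 0x19 0x1a
```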
Collation
The true letters and the markers are collated in their respective order, except for ⟨&⟩, which is ignored. Lenited letters are treated as their respective base letters, except when two words differ only by the presence or absence of a lenition mark, in which case the lenited variant is collated after the unlenited one: ⟨saga⟩ < ⟨sag·a⟩ < ⟨sada⟩ < ⟨saħa⟩. Numerals are collated after all letters.
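These rules can be expressed as a two-level sort key; the following Python sketch works on the romanization. The names `collation_key` and `RANK` are hypothetical. Two points are assumptions rather than statements from the text: the markers are ranked after the true letters, and any symbol that is neither a letter, a marker, nor a digit (such as ⟨’⟩ or a space) is skipped.

```python
import re

TRUE_LETTERS = ("c e n ŋ v o s þ š r l ł m a f g p t č î j i d ð "
                "h ħ ê ô â u w x y z").split()
# Assumption: markers rank after true letters; & is ignored, so it is absent.
MARKERS = ["#", "+", "+*", "@", "*"]
# Numerals are collated after all letters.
DIGITS = list("0123456789ABCDEF")
RANK = {symbol: i for i, symbol in enumerate(TRUE_LETTERS + MARKERS + DIGITS)}

# Match the two-character marker +* before any single character.
TOKEN = re.compile(r"\+\*|.", re.DOTALL)


def collation_key(word):
    """Return (base-letter ranks, lenition flags) for sorting."""
    primary, lenited = [], []
    for token in TOKEN.findall(word):
        if token == "·":
            # The lenition mark only flags the letter before it.
            if lenited:
                lenited[-1] = 1
        elif token in RANK:
            primary.append(RANK[token])
            lenited.append(0)
        # Everything else, including &, is skipped.
    return (tuple(primary), tuple(lenited))


words = ["saħa", "sada", "sag·a", "saga"]
print(sorted(words, key=collation_key))
# prints: ['saga', 'sag·a', 'sada', 'saħa']
```

Because the lenition flags form the second element of the key, they only matter when the base letters are identical, which matches the tie-breaking behaviour described above.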
In a directory of personal names, entries are collated on surnames, with given names considered only when surnames are identical. Headings in such a list include the prefix up to and including the first true letter: ⟨+merlan #flirora⟩ would be found under ⟨+m⟩.
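Finding the heading for a directory entry amounts to cutting the entry after its first true letter. A minimal Python sketch (the function name `directory_heading` is hypothetical, and entries with no true letter are returned unchanged as an assumption):

```python
TRUE_LETTER_SET = set("cenŋvosþšrlłmafgptčîjidðhħêôâuwxyz")


def directory_heading(entry):
    """Return the prefix up to and including the first true letter."""
    for i, symbol in enumerate(entry):
        if symbol in TRUE_LETTER_SET:
            return entry[: i + 1]
    # Assumption: with no true letter, the whole entry is its own heading.
    return entry


print(directory_heading("+merlan #flirora"))  # prints: +m
```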
Ordered items can be labeled using numerals (starting from 0) or letters. In the latter case, only the letters ⟨c e n v o s r l m a f g p t î i d h⟩ are used.
Numquotes
A digit immediately preceding text surrounded by quotation or grouping marks constitutes a numquote. The digit is usually not pronounced in this case. Numquotes are mainly used for secondary purposes that lack any dedicated punctuation.
Numquote | Meaning |
---|---|
B{} | Contains parenthetical information: provides supplementary information. The sentence should still be grammatical without the parenthetical content. |
1{} | Lists an alias of a referent mentioned by name. |
2{} | Surrounds a key-value list. Used as follows: ⟨2{3{&{}} 4{&{}} 3{&{}} 4{&{}}}⟩ |
3{} | Used for listing a key inside ⟨2{}⟩. |
4{} | Used for listing a value inside ⟨2{}⟩. When not directly inside a ⟨2{}⟩ numquote, marks a list: elements are delimited by spaces, and ⟨{}⟩ can be used to insert multi-word elements. |
9{} | Used to contain abbreviated quantities in the traditional currency system. |
*9{} | Used to contain abbreviated quantities in a currency system other than the traditional one. |
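To make the key-value layout concrete, here is a minimal Python sketch that reads a romanized ⟨2{}⟩ numquote into a dictionary. The function name `parse_key_value_list` is hypothetical, and the keys and values in the usage example are placeholder strings, not real words. The sketch assumes that keys and values contain no nested brackets, so it does not handle ⟨{}⟩ groups inside a key or value.

```python
import re

# A key in 3{...} followed by a value in 4{...}, with no nested brackets.
PAIR = re.compile(r"3\{([^{}]*)\}\s*4\{([^{}]*)\}")


def parse_key_value_list(text):
    """Parse a romanized 2{...} numquote into a dict of key-value pairs."""
    match = re.fullmatch(r"2\{(.*)\}", text.strip(), re.DOTALL)
    if match is None:
        raise ValueError("not a 2{} numquote")
    return dict(PAIR.findall(match.group(1)))


print(parse_key_value_list("2{3{a} 4{b} 3{c} 4{d}}"))
# prints: {'a': 'b', 'c': 'd'}
```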
Backreferences
Sometimes, repeated sections of a text are notated using backreferences. A backreference definition consists of two ŋoses followed by a string of letters (the identifier) and a space, and then a phrase inside a ⟨{}⟩ pair. The text inside the delimiters is then transcluded wherever a backreference occurs; a backreference consists of two łils followed by the same identifier.
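A minimal Python sketch of backreference expansion on romanized text follows. The function name `expand_backreferences` is hypothetical, and the identifier and phrase in the usage example are placeholders. Two assumptions are made: the defined phrase is also read in place at the point of definition, and the phrase contains no nested ⟨{}⟩ pairs.

```python
import re

# Definition: two ŋoses, an identifier, a space, then a phrase in brackets.
DEFINITION = re.compile(r"’’(\w+) \{([^{}]*)\}")
# Reference: two łils followed by the identifier.
REFERENCE = re.compile(r"··(\w+)")


def expand_backreferences(text):
    """Replace each backreference with the phrase defined for its identifier."""
    definitions = {}

    def define(match):
        definitions[match.group(1)] = match.group(2)
        # Assumption: the phrase is also read in place where it is defined.
        return match.group(2)

    text = DEFINITION.sub(define, text)
    # Unknown identifiers are left untouched.
    return REFERENCE.sub(
        lambda m: definitions.get(m.group(1), m.group(0)), text
    )


print(expand_backreferences("’’ca {mel anor} ce ··ca"))
# prints: mel anor ce mel anor
```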