The Complete Lojban Language (1997)/Chapter 19

Introductory
This chapter is incurably miscellaneous. It describes the cmavo that specify the structure of Lojban texts, from the largest scale (paragraphs) to the smallest (single words). There are fewer examples than are found in other chapters of this book, since the linguistic mechanisms described are generally made use of in conversation or else in long documents.

This chapter is also not very self-contained. It makes passing reference to a great many concepts which are explained in full only in other chapters. The alternative would be a chapter on text structure which was as complex as all the other chapters put together. Lojban is a unified language, and it is not possible to understand any part of it (in full) before understanding every part of it (to some degree).

Sentences: I
The following cmavo is discussed in this section:

    .i     I       sentence separator

Since Lojban is audio-visually isomorphic, there needs to be a spoken and written way of signaling the end of a sentence and the start of the following one. In written English, a period serves this purpose; in spoken English, a tone contour (rising or falling) usually does the job, or sometimes a long pause. Lojban uses a single separator: the cmavo “.i” (of selma'o I):

2.1)  mi klama le zarci .i do cadzu le bisli
      I go to-the store. You walk on-the ice.

The word “separator” should be noted. “.i” is not normally used after the last sentence nor before the first one, although both positions are technically grammatical. “.i” signals a new sentence on the same topic, not necessarily by the same speaker. The relationship between the sentences is left vague, except in stories, where the relationship usually is temporal, and the following sentence states something that happened after the previous sentence.

Note that although the first letter of an English sentence is capitalized, the cmavo “.i” is never capitalized. In writing, it is appropriate to place extra space before “.i” to make it stand out better for the reader. In some styles of Lojban writing, every “.i” is placed at the beginning of a line, possibly leaving space at the end of the previous line.

An “.i” cmavo may or may not be used when the speaker of the following sentence is different from the speaker of the preceding sentence, depending on whether the sentences are felt to be connected or not.

An “.i” cmavo can be compounded with a logical or non-logical connective (a jek or joik), a modal or tense connective, or both: these constructs are explained in Chapter 9, Chapter 10, and Chapter 14. In all cases, the “.i” comes first in the compound. Attitudinals can also be attached to an “.i” if they are meant to apply to the whole sentence: see Chapter 13.

There is a pair of mechanisms for binding a sequence of sentences closely together. If the “.i” (with or without connectives) is followed by “bo” (of selma'o BO), then the two sentences being separated are understood to be more closely grouped than sentences connected by “.i” alone.

Similarly, a group of sentences can be preceded by “tu'e” (of selma'o TUhE) and followed by “tu'u” (of selma'o TUhU) to fuse them into a single unit. A common use of “tu'e ... tu'u” is to group the sentences which compose a poem: the title sentence would precede the group, separated from it by “.i”. Another use might be a set of directions, where each numbered direction might be surrounded by “tu'e ... tu'u” and contain one or more sentences separated by “.i”. Grouping with “tu'e” and “tu'u” is analogous to grouping with “ke” and “ke'e” to establish the scope of logical or non-logical connectives (see Chapter 14).

Paragraphs: NIhO
The following cmavo are discussed in this section:

    ni'o   NIhO    new topic
    no'i   NIhO    old topic
    da'o   DAhO    cancel cmavo assignments

The paragraph is a concept used in writing systems for two purposes: to indicate changes of topic, and to break up the hard-to-read appearance of large blocks of text on the page. The former function is represented in both spoken and written Lojban by the cmavo “ni'o” and “no'i”, both of selma'o NIhO. Of these two, “ni'o” is the more common. By convention, written Lojban is broken into paragraphs just before any “ni'o” or “no'i”, but a very long passage on a single topic might be paragraphed before an “.i”. On the other hand, it is conventional in English to start a new paragraph in dialogue when a new speaker starts, but this convention is not commonly observed in Lojban dialogues. Of course, none of these conventions affect meaning in any way.

A “ni'o” can take the place of an “.i” as a sentence separator, and in addition signals a new topic or paragraph. Grammatically, any number of “ni'o” cmavo can appear consecutively and are equivalent to a single one; semantically, a greater number of “ni'o” cmavo indicates a larger-scale change of topic. This feature allows complexly structured text, with topics, subtopics, and sub-subtopics, to be represented clearly and unambiguously in both spoken and written Lojban. However, some conventional differences do exist between “ni'o” in writing and in conversation.

In written text, a single “ni'o” is a mere discursive indicator of a new subject, whereas “ni'oni'o” marks a change in the context. In this situation, “ni'oni'o” implicitly cancels the definitions of all pro-sumti of selma'o KOhA as well as pro-bridi of selma'o GOhA. (Explicit cancelling is expressed by the cmavo “da'o” of selma'o DAhO, which has the free grammar of an indicator – it can appear almost anywhere.) The use of “ni'oni'o” does not affect indicators (of selma'o UI) or tense references, but “ni'oni'oni'o”, indicating a drastic change of topic, would serve to reset both indicators and tenses. (See Section 8 for a discussion of indicator scope.)

In spoken text, which is inherently less structured, these levels are reduced by one, with “ni'o” indicating a change in context sufficient to cancel pro-sumti and pro-bridi assignment. On the other hand, in a book, or in stories within stories such as “The Arabian Nights”, further levels may be expressed by extending the “ni'o” string as needed. Normally, a written text will begin with the number of “ni'o” cmavo needed to signal the largest scale division which the text contains. “ni'o” strings may be subscripted to label each context of discourse: see Section 6.
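The written and spoken conventions just described can be summarized as a small lookup. The following Python sketch is illustrative only: the function name `niho_resets` and the category labels are hypothetical, and it assumes that resets are cumulative (a longer string resets everything a shorter one does) and that "reduced by one" means a spoken string behaves like a written string that is one “ni'o” longer.

```python
def niho_resets(count, spoken=False):
    """Return what a run of `count` consecutive ni'o is conventionally taken to reset."""
    if count < 1:
        raise ValueError("need at least one ni'o")
    # Assumption: a spoken string acts like a written string one ni'o longer.
    level = count + 1 if spoken else count
    resets = set()
    if level >= 2:
        # Written ni'oni'o: pro-sumti (KOhA) and pro-bridi (GOhA) assignments are cancelled.
        resets.update({"pro-sumti", "pro-bridi"})
    if level >= 3:
        # Written ni'oni'oni'o: indicators (UI) and tense references are reset as well.
        resets.update({"indicators", "tenses"})
    return resets


# Count the ni'o in a written compound and look up its effect.
print(sorted(niho_resets("ni'oni'o".count("ni'o"))))  # ['pro-bridi', 'pro-sumti']
```

Longer strings, as in stories within stories, fall through to the last case here; the sketch does not try to model further levels.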

“no'i” is similar in effect to “ni'o”, but indicates the resumption of a previous topic. In speech, it is analogous to (but much shorter than) such English discursive phrases as “But getting back to the point ... ”. By default, the topic resumed is that in effect before the last “ni'o”. When subtopics are nested within topics, then “no'i” would resume the previous subtopic and “no'ino'i” the previous topic. Note that “no'i” also resumes tense and pro-sumti assignments dropped at the previous “ni'o”.

If a “ni'o” is subscripted, then a “no'i” with the same subscript is assumed to be a continuation of it. A “no'i” may also have a negative subscript, which would specify counting backwards a number of paragraphs and resuming the topic found thereby.

Topic-comment sentences: ZOhU
The following cmavo is discussed in this section:

    zo'u   ZOhU    topic/comment separator

The normal Lojban sentence is just a bridi, parallel to the normal English sentence which has a subject and a predicate:

4.1)  mi klama le zarci
      I went to the market

In Chinese, the normal sentence form is different: a topic is stated, and a comment about it is made. (Japanese also has the concept of a topic, but indicates it by attaching a suffix; other languages also distinguish topics in various ways.) The topic says what the sentence is about:

4.2)  zhe4 xiao1xi2    wo3 zhi1dao le
      this news        I know [perfective]
      As for this news, I knew it.
      I’ve heard this news already.

The wide space in the first two versions of Example 4.2 separates the topic (“this news”) from the comment (“I know already”).

Lojban uses the cmavo “zo'u” (of selma'o ZOhU) to separate topic (a sumti) from comment (a bridi):

4.3)  le nuzba zo'u mi ba'o djuno
      The news : I [perfective] know.

Example 4.3 is the literal Lojban translation of Example 4.2. Of course, the topic-comment structure can be changed to a straightforward bridi structure:

4.4)  mi ba'o djuno le nuzba
      I [perfective] know the news.

Example 4.4 means the same as Example 4.3, and it is simpler. However, often the position of the topic in the place structure of the selbri within the comment is vague:

4.5)  le finpe zo'u citka
      the fish : eat

Is the fish eating or being eaten? The sentence doesn’t say. The Chinese equivalent of Example 4.5 is:

4.6)  yu2 chi1
      fish eat

which is vague in exactly the same way.

Grammatically, it is possible to have more than one sumti before “zo'u”. This is not normally useful in topic-comment sentences, but is necessary in the other use of “zo'u”: to separate a quantifying section from a bridi containing quantified variables. This usage belongs to a discussion of quantifier logic in Lojban (see Chapter 16), but an example would be:

4.7)  roda poi prenu ku'o su'ode zo'u de patfu da
      For-all X which-are-persons, there-exists-a-Y such-that Y is the father of X.
      Every person has a father.

The string of sumti before “zo'u” (called the “prenex”: see Chapter 16) may contain both a topic and bound variables:

4.8)  loi patfu roda poi prenu ku'o
          su'ode zo'u de patfu da
      For-the-mass-of fathers for-all X which-are-persons, there-exists-a-Y such-that Y is the father of X.
      As for fathers, every person has one.

To specify a topic which affects more than one sentence, wrap the sentences in “tu'e ... tu'u” brackets and place the topic and the “zo'u” directly in front. This is the exception to the rule that a topic attaches directly to a sentence:

4.9)  loi jdini zo'u tu'e do ponse .inaja do djica [tu'u]
      The-mass-of money :  ( [if] you possess, then you want )
      Money: if you have it, you want it.

Note: In Lojban, you do not “want money”; you “want to have money” or something of the sort, as the x2 place of “djica” demands an event. As a result, the straightforward rendering of Example 4.9 without a topic is not:

4.10) do ponse loi jdini .inaja do djica ri
      You possess money only-if you desire its-mere-existence.

where “ri” means “loi jdini” and is interpreted as “the mere existence of money”, but rather:

4.11) do ponse loi jdini .inaja do djica tu'a ri
      You possess money only-if you desire something-about it.

namely, the possession of money. But topic-comment sentences like Example 4.9 are inherently vague, and this difference between “ponse” (which expects a physical object in x2) and “djica” is ignored. See Example 9.3 for another topic/comment sentence.

The subject of an English sentence is often the topic as well, but in Lojban the sumti in the x1 place is not necessarily the topic, especially if it is the normal (unconverted) x1 for the selbri. Thus Lojban sentences don’t necessarily have a “subject” in the English sense.

Questions and answers
The following cmavo are discussed in this section:

    xu     UI      truth question
    ma     KOhA    sumti question
    mo     GOhA    bridi question
    xo     PA      number question
    ji     A       sumti connective question
    ge'i   GA      forethought connective question
    gi'i   GIhA    bridi-tail connective question
    gu'i   GUhA    tanru forethought connective question
    je'i   JA      tanru connective question
    pei    UI      attitude question
    fi'a   FA      place structure question
    cu'e   CUhE    tense/modal question
    pau    UI      question premarker

Lojban questions are not at all like English questions. There are two basic types: truth questions, of the form “Is it true that ... ”, and fill-in-the-blank questions. Truth questions are marked by preceding the bridi, or following any part of it specifically questioned, with the cmavo “xu” (of selma'o UI):

5.1)  xu do klama le zarci
      [True or false?] You go to the store
      Are you going to the store/Did you go to the store?

(Since the Lojban is tenseless, either colloquial translation might be correct.) Truth questions are further discussed in Chapter 15.

Fill-in-the-blank questions have a cmavo representing some Lojban word or phrase which is not known to the questioner, and which the answerer is to supply. There are a variety of cmavo belonging to different selma'o which provide different kinds of blanks.

Where a sumti is not known, a question may be formed with “ma” (of selma'o KOhA), which is a kind of pro-sumti:

5.2)  ma klama le zarci
      [What sumti?] goes-to the store
      Who is going to the store?

Of course, the “ma” need not be in the x1 place:

5.3)  do klama ma
      You go-to [what sumti?]
      Where are you going?

The answer is a simple sumti:

5.4)  le zarci
      The store.

A sumti, then, is a legal utterance, although it does not by itself constitute a bridi: it does not claim anything, but merely completes the open-ended claim of the previous bridi.

There can be two “ma” cmavo in a single question:

5.5)  ma klama ma
      Who goes where?

and the answer would be two sumti, which are meant to fill in the two “ma” cmavo in order:

5.6)  mi le zarci
      I, to the store.

An even more complex example, depending on the non-logical connective “fa'u” (of selma'o JOI), which is like the English “and ... respectively”:

5.7)  ma fa'u ma klama ma fa'u ma
      Who and who goes where and where, respectively?

An answer might be

5.8)  la djan. la marcas. le zarci le briju
      John, Marsha, the store, the office.
      John and Marsha go to the store and the office, respectively.

(Note: A mechanical substitution of Example 5.8 into Example 5.7 produces an ungrammatical result, because “* ... le zarci fa'u le briju” is ungrammatical Lojban: the first “le zarci” has to be closed with its proper terminator “ku”, for reasons explained in Chapter 14. This effect is not important: Lojban behaves as if all elided terminators have been supplied in both question and answer before inserting the latter into the former. The exchange is grammatical if question and answer are each separately grammatical.)
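The rule that answer sumti fill the “ma” blanks in order can be sketched in Python. This is a rough illustration rather than a parser: the function name `fill_ma` is hypothetical, words are split on whitespace only, and no elided terminators are supplied, so for questions using “fa'u” it would produce exactly the mechanically substituted (ungrammatical) string that the note above warns about.

```python
def fill_ma(question, answers):
    """Replace each "ma" in the question with the next answer sumti, in order."""
    words = question.split()
    if words.count("ma") != len(answers):
        raise ValueError("number of answers must match number of ma blanks")
    remaining = iter(answers)
    # Each "ma" consumes one answer; every other word is copied unchanged.
    return " ".join(next(remaining) if w == "ma" else w for w in words)


print(fill_ma("ma klama ma", ["mi", "le zarci"]))  # mi klama le zarci
```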

Questions to be answered with a selbri are expressed with “mo” of selma'o GOhA, which is a kind of pro-bridi:

5.9)  la lojban. mo
      Lojban [what selbri?]
      What is Lojban?

Here the answerer is to supply some predicate which is true of Lojban. Such questions are extremely open-ended, due to the enormous range of possible predicate answers. The answer might be just a selbri, or might be a full bridi, in which case the sumti in the answer override those provided by the questioner. To limit the range of a “mo” question, make it part of a tanru.

Questions about numbers are expressed with “xo” of selma'o PA:

5.10) do viska xo prenu
      You saw [what number?] persons.
      How many people did you see?

The answer would be a simple number, another kind of non-bridi utterance:

5.11) vomu
      Forty-five.

Fill-in-the-blank questions may also be asked about:

 * logical connectives (using cmavo “ji” of A, “ge'i” of GA, “gi'i” of GIhA, “gu'i” of GUhA, or “je'i” of JA, and receiving an ek, gihek, ijek, or ijoik as an answer); see Chapter 14
 * attitudes (using “pei” of UI, and receiving an attitudinal as an answer); see Chapter 13
 * place structures (using “fi'a” of FA, and receiving a cmavo of FA as an answer); see Chapter 9
 * tenses and modals (using “cu'e” of CUhE, and receiving any tense or BAI cmavo as an answer); see Chapter 9 and Chapter 10

Questions can be marked by placing “pau” (of selma'o UI) before the question bridi. See Chapter 13 for details.

The full list of non-bridi utterances suitable as answers to questions is:


 * any number of sumti (with elidable terminator “vau”, see Chapter 6)
 * an ek or gihek (logical connectives, see Chapter 14)
 * a number, or any mathematical expression placed in parentheses (see Chapter 18)
 * a bare “na” negator (to negate some previously expressed bridi), or corresponding “ja'a” affirmer (see Chapter 15)
 * a relative clause (to modify some previously expressed sumti, see Chapter 8)
 * a prenex/topic (to modify some previously expressed bridi, see Chapter 16)
 * linked arguments (beginning with “be” or “bei” and attached to some previously expressed selbri, often in a description, see Chapter 5)

At the beginning of a text, the following non-bridi are also permitted:


 * one or more names (to indicate direct address without “doi”, see Chapter 6)
 * indicators (to express a prevailing attitude, see Chapter 13)
 * “nai” (to vaguely negate something or other, see Chapter 15)

Where not needed for the expression of answers, most of these are made grammatical for pragmatic reasons: people will say them in conversation, and there is no reason to rule them out as ungrammatical merely because most of them are vague.

Subscripts: XI
The following cmavo is discussed in this section:

    xi     XI      subscript

The cmavo “xi” (of selma'o XI) indicates that a subscript (a number, a lerfu string, or a parenthesized mekso) follows. Subscripts can be attached to almost any construction and are placed following the construction (or its terminator word, which is generally required). They are useful either to extend the finite cmavo list to infinite length, or to make more refined distinctions than the standard cmavo list permits. The remainder of this section mentions some places where subscripts might naturally be used.

Lojban gismu have at most five places:

6.1)  mi cu klama le zarci le zdani le dargu le karce
      I go to-the market from-the house via-the road using-the car.

Consequently, selma'o SE (which operates on a selbri to change the order of its places) and selma'o FA (which provides place number tags for individual sumti) have only enough members to handle up to five places. Conversion of Example 6.1, using “xe” to swap the x1 and x5 places, would produce:

6.2)  le karce cu xe klama le zarci le zdani le dargu mi
      The car is-a-transportation-means to-the market from-the house via-the road for-me.

And reordering of the place structures might produce:

6.3)  fo le dargu fi le zdani fa mi fe le zarci fu le karce cu klama
      Via the road, from the house, I, to the market, using-the car, go.

Examples 6.1 to 6.3 all mean the same thing. But consider the lujvo “nunkla”, formed by applying the abstraction operator “nu” to “klama”:

6.4)  la'edi'u cu nunkla mi le zarci le zdani le dargu le karce
      The-referent-of-the-previous-sentence is-an-event-of-going by-me to-the market from-the house via-the road using-the car.

Example 6.4 shows that “nunkla” has six places: the five places of “klama” plus a new one (placed first) for the event itself. Performing transformations similar to that of Example 6.2 requires an additional conversion cmavo that exchanges the x1 and x6 places. The solution is to use any cmavo of SE with a subscript “6” (see Chapter 19):

6.5)  le karce cu sexixa nunkla mi
          le zarci le zdani le dargu la'edi'u
      The car is-a-transportation-means-in-the-event-of-going by-me
          to-the market from-the house via-the road which-is-referred-to-by-the-last-sentence.

Likewise, a sixth place tag can be created by using any cmavo of FA with a subscript:

6.6)  fu le dargu fo le zdani fe mi fa la'edi'u
          fi le zarci faxixa le karce cu nunkla
      Via the road, from the house, by me, the-referent-of-the-last-sentence, to the market, using the car, is-an-event-of-going.

Examples 6.4 to 6.6 also all mean the same thing, and each is derived straightforwardly from any of the others, despite the tortured nature of the English glosses. In addition, any other member of SE or FA could be substituted into “sexixa” and “faxixa” without change of meaning: “vexixa” means the same thing as “sexixa”.

Lojban provides two groups of pro-sumti, both belonging to selma'o KOhA. The ko'a-series cmavo are used to refer to explicitly specified sumti to which they have been bound using “goi”. The da-series, on the other hand, are existentially or universally quantified variables. (These concepts are explained more fully in Chapter 16.) There are ten ko'a-series cmavo and three da-series cmavo available.

If more are required, any cmavo of the ko'a-series or the da-series can be subscripted:

6.7)  daxivo
      X sub 4

is the 4th bound variable of the 1st sequence of the da-series, and

6.8)  ko'ixipabi
      something-3 sub 18

is the 18th free variable of the 3rd sequence of the ko'a-series. This convention allows 10 sequences of ko'a-type pro-sumti and 3 sequences of da-type pro-sumti, each with as many members as needed. Note that “daxivo” and “dexivo” are considered to be distinct pro-sumti, unlike the situation with “sexixa” and “vexixa” above. Exactly similar treatment can be given to the bu'a-series of selma'o GOhA and to the gismu pro-bridi “broda”, “brode”, “brodi”, “brodo”, and “brodu”.
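Building a subscripted cmavo is mechanical: the base word, then “xi”, then the subscript spelled with Lojban digit cmavo. The Python sketch below shows this for whole-number subscripts only. The names `DIGITS`, `lojban_number`, and `subscripted` are hypothetical, and the sketch does not insert terminators, so a lerfu word would need “boi” added to the base word by the caller.

```python
# Lojban digit cmavo (selma'o PA) for 0 through 9.
DIGITS = ["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"]


def lojban_number(n):
    """Spell a non-negative integer as a string of Lojban digit cmavo."""
    if n < 0:
        raise ValueError("only non-negative integers are handled")
    return "".join(DIGITS[int(d)] for d in str(n))


def subscripted(word, n):
    """Attach the numeric subscript n to word using "xi"."""
    return word + "xi" + lojban_number(n)


print(subscripted("da", 4))  # daxivo
print(subscripted("se", 6))  # sexixa
```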

Subscripts on lerfu words are used in the standard mathematical way to extend the number of variables:

6.9)  li xy.boixipa du li xy.boixire su'i xy.boixici
      The-number x-sub-1 equals the-number x-sub-2 plus x-sub-3
      x_1 = x_2 + x_3

and can be used to extend the number of pro-sumti as well, since lerfu strings outside mathematical contexts are grammatically and semantically equivalent to pro-sumti of the ko'a-series. (In Example 6.9, note the required terminator “boi” after each “xy.” cmavo; this terminator allows the subscript to be attached without ambiguity.)

Names, which are similar to pro-sumti, can also be subscripted to distinguish two individuals with the same name:

6.10) la djan. xipa cusku lu mi'enai do li'u la djan. xire
      John_1 expresses “I-am-not you” to John_2.

Subscripts on tenses allow talking about more than one time or place that is described by the same general cmavo. For example, “puxipa” could refer to one point in the past, and “puxire” a second point (earlier or later).

You can place a subscript on the word “ja'a”, the bridi affirmative of selma'o NA, to express so-called fuzzy truths. The usual machinery for fuzzy logic (statements whose truth value is not merely “true” or “false”, but is expressed by a number in the range 0 to 1) in Lojban is the abstractor “jei”:

6.11) li pimu jei mi ganra
      The-number .5 is-the-truth-value-of my being-broad.

However, by convention we can attach a subscript to “ja'a” to indicate fuzzy truth (or to “na” if we change the amount):

6.12) mi ja'a xipimu ganra
      I truly-sub-.5 am-broad

Finally, as mentioned in Section 3, “ni'o” and “no'i” cmavo with matching subscripts mark the start and the continuation of a given topic respectively. Different topics can be assigned to different subscripts.

Other uses of subscripts will doubtless be devised in future.

Utterance ordinals: MAI
The following cmavo are discussed in this section:

    mai    MAI     utterance ordinal, -thly
    mo'o   MAI     higher order utterance ordinal

Numerical free modifiers, corresponding to English “firstly”, “secondly”, and so on, can be created by suffixing “mai” or “mo'o” of selma'o MAI to a number or a lerfu string. Here are some examples:

7.1)  mi klama pamai le zarci .e remai le zdani
      I go-to (firstly) the store and (secondly) the house.

This does not imply that I go to the store before I go to the house: that meaning requires a tense. The sumti are simply numbered for convenience of reference. Like other free modifiers, the utterance ordinals can be inserted almost anywhere in a sentence without affecting its grammar or its meaning.

Any of the Lojban numbers can be used with MAI: “romai”, for example, means “all-thly” or “lastly”. Likewise, if you are enumerating a long list and have forgotten which number is wanted next, you can say “ny.mai”, or “Nthly”.

The difference between “mai” and “mo'o” is that “mo'o” enumerates larger subdivisions of a text; “mai” was designed for lists of numbered items, whereas “mo'o” was intended to subdivide structured works. If this chapter were translated into Lojban, it might number each section with “mo'o”: this section would then be introduced with “zemo'o”, or “Section 7.”
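Forming an utterance ordinal from a whole number is a matter of spelling the number in Lojban digits and appending the suffix. The Python sketch below does this for non-negative integers only; the names `DIGITS` and `utterance_ordinal` are hypothetical, and non-digit numbers such as “ro” or lerfu strings such as “ny.” are outside its scope.

```python
# Lojban digit cmavo (selma'o PA) for 0 through 9.
DIGITS = ["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"]


def utterance_ordinal(n, higher_order=False):
    """Return the number n suffixed with "mai", or with "mo'o" for larger subdivisions."""
    if n < 0:
        raise ValueError("only non-negative integers are handled")
    number = "".join(DIGITS[int(d)] for d in str(n))
    return number + ("mo'o" if higher_order else "mai")


print(utterance_ordinal(1))                     # pamai
print(utterance_ordinal(7, higher_order=True))  # zemo'o
```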

Attitude scope markers: FUhE/FUhO
The following cmavo are discussed in this section:

    fu'e   FUhE    open attitudinal scope
    fu'o   FUhO    close attitudinal scope

Lojban has a complex system of “attitudinals”, words which indicate the speaker’s attitude to what is being said. The attitudinals include indicators of emotion, intensity markers, discursives (which show the structure of discourse), and evidentials (which indicate “how the speaker knows”). Most of these words belong to selma'o UI; the intensity markers belong to selma'o CAI for historical reasons, but the two selma'o are grammatically identical. The individual cmavo of UI and CAI are discussed in Chapter 13; only the rules for applying them in discourse are presented here.

Normally, an attitudinal applies to the preceding word only. However, if the preceding word is a structural cmavo which begins or ends a whole construction, then that whole construction is affected by the attitudinal:

8.1)  mi viska le blanu .ia zdani [ku]
      I see the blue [belief] house.
      I see the house, which I believe to be blue.

8.2)  mi viska le blanu zdani .ia [ku]
      I see the blue house [belief].
      I see the blue thing, which I believe to be a house.

8.3)  mi viska le .ia blanu zdani [ku]
      I see the [belief] blue house
      I see what I believe to be a blue house.

8.4)  mi viska le blanu zdani ku .ia
      I see (the blue house) [belief]
      I see what I believe to be a blue house.

An attitudinal meant to cover a whole sentence can be attached to the preceding “.i”, expressed or understood:

8.5)  [.i] .ia mi viska le blanu zdani
      [belief] I see the blue house.
      I believe I see a blue house.

or to an explicit “vau” placed at the end of a bridi.

Likewise, an attitudinal meant to cover a whole paragraph can be attached to “ni'o” or “no'i”. An attitudinal at the beginning of a text applies to the whole text.

However, sometimes it is necessary to be more specific about the range of one or more attitudinals, particularly if the range crosses the boundaries of standard Lojban syntactic constructions. The cmavo “fu'e” (of selma'o FUhE) and “fu'o” (of selma'o FUhO) provide explicit scope markers. Placing “fu'e” in front of an attitudinal disconnects it from what precedes it, and instead says that it applies to all following words until further notice. The notice is given by “fu'o”, which can appear anywhere and cancels all in-force attitudinals. For example:

8.6)  mi viska le fu'e .ia blanu zdani fu'o ponse
      I see the [start] [belief] blue house [end] possessor
      I see the owner of what I believe to be a blue house.

Here, only the “blanu zdani” portion of the three-part tanru “blanu zdani ponse” is marked as a belief of the speaker. Naturally, the attitudinal scope markers do not affect the rules for interpreting multi-part tanru: “blanu zdani” groups first because tanru group from left to right unless overridden with “ke” or “bo”.

Other attitudinals of more local scope can appear after attitudinals marked by FUhE; these attitudinals are added to the globally active attitudinals rather than superseding them.
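The behavior of “fu'e” and “fu'o” can be pictured as a scan over the words of a sentence. The Python sketch below is a simplification under stated assumptions: the function name `fuhe_scope` is hypothetical, the caller supplies the set of words to treat as attitudinals, and attitudinals not introduced by “fu'e” are simply omitted from the result, because the local rule (attach to the preceding word or construction) is not modeled.

```python
def fuhe_scope(tokens, attitudinals):
    """Pair each ordinary word with the fu'e-opened attitudinals in force at that point."""
    active = []      # attitudinals opened by fu'e and not yet cancelled by fu'o
    opening = False  # True while reading the attitudinals that directly follow fu'e
    result = []
    for tok in tokens:
        if tok == "fu'e":
            opening = True
        elif tok == "fu'o":
            # fu'o cancels all in-force attitudinals.
            active = []
            opening = False
        elif opening and tok in attitudinals:
            active.append(tok)
        else:
            opening = False
            if tok in attitudinals:
                continue  # local attitudinal: outside the scope of this sketch
            result.append((tok, tuple(active)))
    return result


sentence = "mi viska le fu'e .ia blanu zdani fu'o ponse".split()
for word, marks in fuhe_scope(sentence, {".ia"}):
    print(word, marks)
```

Run on Example 8.6, only “blanu” and “zdani” are paired with “.ia”; every other word is paired with an empty tuple.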

Quotations: LU, LIhU, LOhU, LEhU
The following cmavo are discussed in this section:

    lu     LU      begin quotation
    li'u   LIhU    end quotation
    lo'u   LOhU    begin error quotation
    le'u   LEhU    end error quotation

Grammatically, quotations are very simple in Lojban: all of them are sumti, and they all mean something like “the piece of text here quoted”:

9.1)  mi pu cusku lu mi'e djan [li'u]
      I [past] express [quote] I-am John [unquote]
      I said, “I’m John”.

But in fact there are four different flavors of quotation in the language, involving six cmavo of six different selma'o. This being the case, quotation deserves some elaboration.

The simplest kind of quotation, exhibited in Example 9.1, uses the cmavo “lu” (of selma'o LU) as the opening quotation mark, and the cmavo “li'u” (of selma'o LIhU) as the closing quotation mark. The text between “lu” and “li'u” must be a valid, parseable Lojban text. If the quotation is ungrammatical, so is the surrounding expression. The cmavo “li'u” is technically an elidable terminator, but it’s almost never possible to elide it except at the end of text.

The cmavo “lo'u” (of selma'o LOhU) and “le'u” (of selma'o LEhU) are used to surround a quotation that is not necessarily grammatical Lojban. However, the text must consist of morphologically correct Lojban words (as defined in Chapter 4), so that the “le'u” can be picked out reliably. The words need not be meaningful, but they must be recognizable as cmavo, brivla, or cmene. Quotation with “lo'u” is essential to quoting ungrammatical Lojban for teaching in the language, the equivalent of the * that is used in English to mark such errors:

9.2)  lo'u mi du do du la djan. le'u na tergerna la lojban.
      [quote] mi du do du la djan. [unquote] is-not a-grammatical-structure in Lojban.

Example 9.2 is grammatical even though the embedded quotation is not. Similarly, “lo'u” quotation can quote fragments of a text which themselves do not constitute grammatical utterances:

9.3)  lu le mlatu cu viska le finpe li'u zo'u lo'u viska le le'u
          cu selbasti .ei lo'u viska lo le'u
      [quote] le mlatu cu viska le finpe [unquote] : [quote] viska le [unquote] is-replaced-by [obligation!] [quote] viska lo [unquote].
      In the sentence “le mlatu cu viska le finpe”, “viska le” should be replaced by “viska lo”.

Note the topic-comment formulation (Section 4) and the indicator applying to the selbri only (Section 8). Neither “viska le” nor “viska lo” is a valid Lojban utterance, and both require “lo'u” quotation.

Additionally, pro-sumti or pro-bridi in the quoting sentence can refer to words appearing in the quoted sentence when “lu ... li'u” is used, but not when “lo'u ... le'u” is used:

9.4)  la tcarlis. cusku lu le ninmu cu morsi li'u
          .iku'i ri jmive
      Charlie says [quote] the woman is-dead [unquote].
          However, the-last-mentioned is-alive.
      Charlie says “The woman is dead”, but she is alive.

In Example 9.4, “ri” is a pro-sumti which refers to the most recent previous sumti, namely “le ninmu”. Compare:

9.5)  la tcarlis. cusku lo'u le ninmu cu morsi le'u
          .iku'i ri jmive
      Charlie says [quote] le ninmu cu morsi [unquote].
          However, the-last-mentioned is-alive.
      Charlie says “le ninmu cu morsi”, but he is alive.

In Example 9.5, “ri” cannot refer to the referent of the alleged sumti “le ninmu”, because “le ninmu cu morsi” is a mere uninterpreted sequence of Lojban words. Instead, “ri” ends up referring to the referent of the sumti “la tcarlis.”, and so it is Charlie who is alive.

The metalinguistic erasers “si”, “sa”, and “su”, discussed in Section 13, do not operate in text between “lo'u” and “le'u”. Since the first “le'u” terminates a “lo'u” quotation, it is not directly possible to have a “lo'u” quotation within another “lo'u” quotation. However, it is possible for a “le'u” to occur within a “lo'u ... le'u” quotation by preceding it with the cmavo “zo”, discussed in Section 10. Note that “le'u” is not an elidable terminator; it is required.
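The termination rule for “lo'u” quotation can be sketched as a scan for the first “le'u” that is not itself quoted by “zo”. The Python below is a rough illustration, not the official grammar: the function name `lohu_span` is hypothetical, the input is assumed to be already split into Lojban words, and the sketch assumes that “zo” inside the quotation protects whatever single word follows it.

```python
def lohu_span(tokens, start):
    """Return the words quoted by the lo'u at index start, up to its closing le'u."""
    if tokens[start] != "lo'u":
        raise ValueError("expected lo'u at the start index")
    i = start + 1
    while i < len(tokens):
        if tokens[i] == "zo":
            i += 2  # zo quotes the next word, so even a le'u there does not terminate
        elif tokens[i] == "le'u":
            return tokens[start + 1:i]
        else:
            i += 1
    # le'u is not elidable, so running off the end is an error.
    raise ValueError("missing le'u")


words = "lo'u viska le le'u cu selbasti".split()
print(lohu_span(words, 0))  # ['viska', 'le']
```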

More on quotations: ZO, ZOI
The following cmavo are discussed in this section:

    zo     ZO      quote single word
    zoi    ZOI     non-Lojban quotation
    la'o   ZOI     non-Lojban name

The cmavo “zo” (of selma'o ZO) is a strong quotation mark for the single following word, which can be any Lojban word whatsoever. Among other uses, “zo” allows a metalinguistic word to be referenced without having it act on the surrounding text. The word must be a morphologically legal (but not necessarily meaningful) single Lojban word; compound cmavo are not permitted. For example:

10.1) zo si cu lojbo valsi
      “si” is a Lojbanic word.

Since “zo” acts on a single word only, there is no corresponding terminator. Brevity, then, is a great advantage of “zo”, since the terminators for other kinds of quotation are rarely or never elidable.

The cmavo “zoi” (of selma'o ZOI) is a quotation mark for quoting non-Lojban text. Its syntax is “zoi X. text .X”, where X is a Lojban word (called the delimiting word) which is separated from the quoted text by pauses, and which is not found in the written text or spoken phoneme stream. It is common, but not required, to use the lerfu word (of selma'o BY) which corresponds to the Lojban name of the language being quoted: 10.2) zoi gy. John is a man .gy. cu glico jufra        “John is a man” is an English sentence. where “gy” stands for “glico”. Other popular choices of delimiting words are “.kuot.”, a Lojban name which sounds like the English word “quote”, and the word “zoi” itself. Another possibility is a Lojban word suggesting the topic of the quotation.

Within written text, the Lojban written word used as a delimiting word may not appear, whereas within spoken text, the sound of the delimiting word may not be uttered. This leads to occasional breakdowns of audio-visual isomorphism: Example 10.3 is fine in speech but ungrammatical as written, whereas Example 10.4 is correct when written but ungrammatical in speech.

10.3)  ?mi djuno fi le valsi po'u zoi gy. gyrations .gy.
       I know about the word which-is “gyrations”.

10.4)  ?mi djuno fi le valsi po'u zoi jai. gyrations .jai
       I know about the word which-is “gyrations”.

The text “gy” appears in the written word “gyrations”, whereas the sound represented in Lojban by “jai” appears in the spoken word “gyrations”. Such borderline cases should be avoided as a matter of good style.
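The written half of this restriction is easy to check mechanically. The following Python sketch builds a “zoi” quotation and rejects a delimiting word that occurs anywhere in the written text. The function name zoi_quote is hypothetical, and the case-insensitive substring test is an assumption made for the sketch; the spoken restriction, which depends on the sound of the quoted text, is not modeled at all.

```python
def zoi_quote(text, delimiter="gy"):
    """Wrap non-Lojban text as 'zoi X. text .X.' (written form only).

    Sketch only: rejects the delimiter if it appears as a substring of the
    written text, ignoring case. The spoken-form rule is not checked.
    """
    if delimiter.lower() in text.lower():
        raise ValueError(
            f"delimiting word {delimiter!r} appears in the quoted text"
        )
    return f"zoi {delimiter}. {text} .{delimiter}."


print(zoi_quote("John is a man"))  # zoi gy. John is a man .gy.
```

With this check, quoting “gyrations” with the delimiter “gy” is refused, as in Example 10.3, while the delimiter “jai” is accepted in writing even though, as Example 10.4 shows, it fails in speech.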

It should be noted particularly that “zoi” quotation is the only way to quote rafsi, specifically CCV rafsi, because they are not Lojban words, and “zoi” quotation is the only way to quote things which are not Lojban words. (CVC and CVV rafsi look like names and cmavo respectively, and so can be quoted using other methods.) For example: 10.5) zoi ry. sku .ry. cu rafsi zo cusku        “sku” is a rafsi of “cusku”. (A minor note on interaction between “lo'u ... le'u” and “zoi”: The text between “lo'u” and “le'u” should consist of Lojban words only. In fact, non-Lojban material in the form of a “zoi” quotation may also appear. However, if the word “le'u” is used either as the delimiting word for the “zoi” quotation, or within the quotation itself, the outer “lo'u” quotation will be prematurely terminated. Therefore, “le'u” should be avoided as the delimiting word in any “zoi” quotation.)

Lojban strictly avoids any confusion between things and the names of things:

10.6)  zo .bab. cmene la bab.
       The-word “Bob” is-the-name-of the-one-named Bob.

In Example 10.6, “zo .bab.” is the word, whereas “la bab.” is the thing named by the word. The cmavo “la'e” and “lu'e” (of selma'o LAhE) convert back and forth between references and their referents:

10.7)  zo .bab. cmene la'e zo .bab.
       The-word “Bob” is-the-name-of the-referent-of the-word “Bob”.

10.8)  lu'e la bab. cmene la bab.
       A-symbol-for Bob is-the-name-of Bob.

Examples 10.6 through 10.8 all mean approximately the same thing, except for differences in emphasis. Example 10.9 is different:

10.9)  la bab. cmene la bab.
       Bob is the name of Bob.

and says that Bob is both the name and the thing named, an unlikely situation. People are not names.

(In Examples 10.6 and 10.7, the name “bab.” was separated from a preceding “zo” by a pause, thus: “zo .bab.”. The reason for this extra pause is that all Lojban names must be separated by pause from any preceding word other than “la”, “lai”, “la'i” (all of selma'o LA) and “doi” (of selma'o DOI). There are numerous other cmavo that may precede a name: of these, “zo” is one of the most common.)

The cmavo “la'o” also belongs to selma'o ZOI, and is mentioned here for completeness, although it does not signal the beginning of a quotation. Instead, “la'o” serves to mark non-Lojban names, especially the Linnaean binomial names (such as “Homo sapiens”) which are the internationally standardized names for species of animals and plants. Internationally known names which can more easily be recognized by spelling rather than pronunciation, such as “Goethe”, can also appear in Lojban text with “la'o”: 10.10) la'o dy. Goethe .dy. cu me la'o ly. Homo sapiens .ly.       Goethe is a Homo sapiens. Using “la'o” for all names rather than Lojbanizing, however, makes for very cumbersome text. A rough equivalent of “la'o” might be “la me zoi”.

Contrastive emphasis: BAhE
The following cmavo are discussed in this section:

     ba'e   BAhE    emphasize next word
     za'e   BAhE    next word is nonce

English often uses strong stress on a word to single it out for contrastive emphasis, thus

11.1)  I saw George.

is quite different from

11.2)  I saw *George*.

The heavy stress on “George” (represented in writing by italics, shown here between asterisks) indicates that I saw George rather than someone else. Lojban does not use stress in this way: stress is used only to help separate words (because every brivla is stressed on the penultimate syllable) and in names to match other languages’ stress patterns. Note that many other languages do not use stress in this way either; typically word order is rearranged, producing something like

11.3)  It was George whom I saw.

In Lojban, the cmavo “ba'e” (of selma'o BAhE) precedes a single word which is to be emphasized:

11.4)  mi viska la ba'e .djordj.
       I saw the-one-named [emphasis] “George”.
       I saw *George*.

Note the pause before the name “djordj.”, which serves to separate it unambiguously from the “ba'e”. Alternatively, the “ba'e” can be moved to a position before the “la”, which in effect emphasizes the whole construct “la djordj.”:

11.5)  mi viska ba'e la djordj.
       I saw [emphasis] the-one-named “George”.
       I saw *George*.

Marking a word with a cmavo of BAhE does not change the word’s grammar in any way. Any word in a bridi can receive contrastive emphasis marking:

11.6)  ba'e mi viska la djordj.
       I, no one else, saw George.

11.7)  mi ba'e viska la djordj.
       I saw (not heard or smelled) George.

Emphasis on one of the structural components of a Lojban bridi can also be achieved by rearranging it into an order that is not the speaker’s or writer’s usual order. Any sumti moved out of place, or the selbri when moved out of place, is emphatic to some degree.

For completeness, the cmavo “za'e” should be mentioned, also of selma'o BAhE. It marks a word as possibly irregular, non-standard, or nonce (created for the occasion):

11.8)  mi klama la za'e .albeinias.
       I go-to so-called Albania

marks a Lojbanization of an English name, where a more appropriate standard form might be something like “la ctiipyris.”, reflecting the country’s name in Albanian.

Before a lujvo or fu'ivla, “za'e” indicates that the word has been made up on the spot and may be used in a sense that is not found in the unabridged dictionary (when we have an unabridged dictionary!).

Parenthesis and metalinguistic commentary: TO, TOI, SEI
The following cmavo are discussed in this section:

     to     TO      open parenthesis
     to'i   TO      open editorial parenthesis
     toi    TOI     close parenthesis
     sei    SEI     metalinguistic bridi marker

The cmavo “to” and “toi” are discursive (non-mathematical) parentheses, for inserting parenthetical remarks. Any text whatsoever can go within the parentheses, and it is completely invisible to its context. It can, however, refer to the context by the use of pro-sumti and pro-bridi: any that have been assigned in the context are still assigned in the parenthetical remarks, but the reverse is not true.

12.1)  doi lisas. mi djica le nu to doi frank. ko sisti toi do viska le mlatu
       O Lisa, I desire the event-of (O Frank, [imperative] stop!) you see the cat.
       Lisa, I want you to (Frank! Stop!) see the cat.

Example 12.1 implicitly redefines “do” within the parentheses: the listener is changed by “doi frank.” When the context sentence resumes, however, the old listener, Lisa, is automatically restored.

There is another cmavo of selma'o TO: “to'i”. The difference between “to” and “to'i” is the difference between parentheses and square brackets in English prose. Remarks within “to ... toi” cmavo are implicitly by the same speaker, whereas remarks within “to'i ... toi” are implicitly by someone else, perhaps an editor: 12.2) la frank. cusku lu mi prami do to'isa'a do du la djein. toi li'u        Frank expresses “I love you [you = Jane]” The “sa'a” suffix is a discursive cmavo (of selma'o UI) meaning “editorial insertion”, and indicating that the marked word or construct (in this case, the entire bracketed remark) is not part of the quotation. It is required whenever the “to'i ... toi” remark is physically within quotation marks, at least when speaking to literal-minded listeners; the convention may be relaxed if no actual confusion results.

Note: The parser believes that parentheses are attached to the previous word or construct, because it treats them as syntactic equivalents of subscripts and other such so-called “free modifiers”. Semantically, however, parenthetical remarks are not necessarily attached either to what precedes them or what follows them.

The cmavo “sei” (of selma'o SEI) begins an embedded discursive bridi. Comments added with “sei” are called “metalinguistic”, because they are comments about the discourse itself rather than about the subject matter of the discourse. This sense of the term “metalinguistic” is used throughout this chapter, and is not to be confused with the sense “language for expressing other languages”.

When marked with “sei”, a metalinguistic utterance can be embedded in another utterance as a discursive. In this way, discursives which do not have cmavo assigned in selma'o UI can be expressed:

12.3)  la frank. prami sei la frank. gleki la djein.
       Frank loves (Frank is happy) Jane.

Using the happiness attitudinal, “.ui”, would imply that the speaker was happy. Instead, the speaker attributes happiness to Frank. It would probably be safe to elide the one who is happy, and say:

12.4)  la frank. prami sei gleki la djein.
       Frank loves (he is happy) Jane.

The grammar of the bridi following “sei” has an unusual limitation: the sumti must either precede the selbri, or must be glued into the selbri with “be” and “bei”:

12.5)  la frank. prami sei gleki be fa la suzn. la djein.
       Frank loves (Susan is happy) Jane.

This restriction allows the terminator cmavo “se'u” to almost always be elided.

Since a discursive utterance is working at a “higher” level of abstraction than a non-discursive utterance, a non-discursive utterance cannot refer to a discursive utterance. Specifically, the various back-counting, reciprocal, and reflexive constructs in selma'o KOhA ignore the utterances at “higher” metalinguistic levels in determining their referent. It is possible, and sometimes necessary, to refer to lower metalinguistic levels. For example, the English “he said” in a conversation is metalinguistic. For this purpose, quotations are considered to be at a lower metalinguistic level than the surrounding context (a quoted text cannot refer to the statements of the one who quotes it), whereas parenthetical remarks are considered to be at a higher level than the context.

Lojban works differently from English in that the “he said” can be marked instead of the quotation. In Lojban, you can say:

12.6)  la djan. cusku lu mi klama le zarci li'u
       John expresses “I go to-the store”.

which literally claims that John uttered the quoted text. If the central claim is that John made the utterance, as is likely in conversation, this style is the most sensible. However, in written text which quotes a conversation, you don’t want the “he said” or “she said” to be considered part of the conversation. If unmarked, it could mess up the anaphora counting. Instead, you can use:

12.7)  lu mi klama le zarci seisa'a la djan. cusku be dei li'u
       “I go to-the store (John expresses this-sentence)”
       “I go to the store”, said John.

And of course other orders are possible:

12.8)  lu seisa'a la djan. cusku be dei mi klama le zarci
       John said, “I go to the store”.

12.9)  lu mi klama seisa'a la djan. cusku le zarci
       “I go”, John said, “to the store”.

Note the “sa'a” following each “sei”, marking the “sei” and its attached bridi as an editorial insert, not part of the quotation. In a more relaxed style, these “sa'a” cmavo would probably be dropped.

The elidable terminator for “sei” is “se'u” (of selma'o SEhU); it is rarely needed, except to separate a selbri within the “sei” comment from an immediately following selbri (or component) outside the comment.

Erasure: SI, SA, SU
The following cmavo are discussed in this section:

     si     SI      erase word
     sa     SA      erase phrase
     su     SU      erase discourse

The cmavo “si” (of selma'o SI) is a metalinguistic operator that erases the preceding word, as if it had never been spoken:

13.1)  ti gerku si mlatu
       This is-a-dog, er, is-a-cat.

means the same thing as “ti mlatu”. Multiple “si” cmavo in succession erase the appropriate number of words:

13.2)  ta blanu zdani si si xekri zdani
       That is-a-blue house, er, er, is-a-black house.

In order to erase the word “zo”, it is necessary to use three “si” cmavo in a row:

13.3)  zo .bab. se cmene zo si si si la bab.
       The-word “Bob” is-the-name-of the word “si”, er, er, Bob.

The first use of “si” does not erase anything, but completes the “zo” quotation. Two more “si” cmavo are then necessary to erase the first “si” and the “zo”.

Incorrect names can likewise cause trouble with “si”: 13.4) mi tavla fo la .esperanto si si .esperanton.        I talk in-language that-named “and” “speranto”, er, er, Esperanto. The Lojbanized spelling “.esperanto” breaks up, as a consequence of the Lojban morphology rules (see Chapter 4) into two Lojban words, the cmavo “.e” and the undefined fu'ivla “speranto”. Therefore, two “si” cmavo are needed to erase them. Of course, “.e speranto” is not grammatical after “la”, but recognition of “si” is done before grammatical analysis.

Even more messy is the result of an incorrect “zoi”:

13.5)  mi cusku zoi fy. gy. .fy. si si si si zo .djan.
       I express [foreign] [quote] “gy” [unquote], er, er, er, er, “John”.

In Example 13.5, the first “fy” is taken to be the delimiting word. The next word must be different from the delimiting word, and “gy.”, the Lojban name for the letter “g”, was chosen arbitrarily. Then the delimiting word must be repeated. For purposes of “si” erasure, the entire quoted text is taken to be a word, so four words have been uttered, and four more “si” cmavo are needed to erase them altogether. Similarly, a stray “lo'u” quotation mark must be erased with “fy. le'u si si si”, by completing the quotation and then erasing it all with three “si” cmavo.

What if less than the entire “zo” or “zoi” construct is erased? The result is something which has a loose “zo” or “zoi” in it, without its expected sequels, and which is incurably ungrammatical. Thus, to erase just the word quoted by “zo”, it turns out to be necessary to erase the “zo” as well:

13.6)  mi se cmene zo .djan. si si zo .djordj.
       I am-named-by the-word “John”, er, er, the-word “George”.

The parser will reject “zo .djan. si .djordj.”, because in that context “djordj.” is a name (of selma'o CMENE) rather than a quoted word.

Note: The current machine parser does not implement “si” erasure.
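The word-level behavior of “si” described in this section, including its interaction with “zo”, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any actual parser: the function name apply_si is hypothetical, the input is assumed to be already split into Lojban words (so the morphological breakup of Example 13.4 is not modeled), and “zoi” and “lo'u” quotations are ignored.

```python
def apply_si(words):
    """Apply si erasure to a list of pre-split words (sketch only)."""
    kept = []         # words that have survived so far
    zo_open = False   # True immediately after an unquoted zo
    for word in words:
        if zo_open:
            # Whatever follows zo is quoted, even if it is si itself.
            kept.append(word)
            zo_open = False
        elif word == "si":
            # Erase the preceding word, if there is one.
            if kept:
                kept.pop()
        else:
            kept.append(word)
            zo_open = (word == "zo")
    return kept


print(" ".join(apply_si("ti gerku si mlatu".split())))  # ti mlatu
```

On Example 13.3 the sketch yields “zo .bab. se cmene la bab.”, and on Example 13.6 it yields “mi se cmene zo .djordj.”. It makes no grammaticality judgment: given “zo .djan. si .djordj.” it simply returns the loose “zo” followed by the name, which is the sequence the text above describes as rejected.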

As the above examples plainly show, precise erasures with “si” can be extremely hard to get right. Therefore, the cmavo “sa” (of selma'o SA) is provided for erasing more than one word. The cmavo following “sa” should be the starting marker of some grammatical construct. The effect of the “sa” is to erase back to and including the last starting marker of the same kind. For example:

13.7)  mi viska le sa .i mi cusku zo .djan.
       I see the  ...  I say the-word “John”.

Since the word following “sa” is “.i”, the sentence separator, its effect is to erase the preceding sentence. So Example 13.7 is equivalent to:

13.8)  mi cusku zo .djan.

Another example, erasing a partial description rather than a partial sentence:

13.9)  mi viska le blanu zdan. sa le xekri zdani
       I see the blue hou ...  the black house.

In Example 13.9, “le blanu zdan.” is ungrammatical, but clearly reflects the speaker’s original intention to say “le blanu zdani”. However, the “zdani” was cut off before the end and changed into a name. The entire ungrammatical “le” construct is erased and replaced by “le xekri zdani”.

Note: The current machine parser does not implement “sa” erasure. Getting “sa” right is even more difficult (for a computer) than getting “si” right, as the behavior of “si” is defined in terms of words rather than in terms of grammatical constructs (possibly incorrect ones) and words are conceptually simpler entities. On the other hand, “sa” is generally easier for human beings, because the rules for using it correctly are less finicky.
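A rough idea of the “erase back to the last starting marker of the same kind” rule can be given in Python, with heavy simplification. Everything named here is an assumption made for illustration: the function apply_sa, the tiny STARTER_KIND table standing in for a real selma'o lookup, and the choice to erase back to the start of the text when no earlier marker of the same kind exists (which is what makes Example 13.7 come out right). Quotations and the interaction with “zo” are not handled.

```python
# Hypothetical, deliberately tiny table mapping starting markers to a "kind".
STARTER_KIND = {".i": "I", "le": "LE", "lo": "LE", "la": "LA"}


def apply_sa(words):
    """Apply sa erasure to a list of pre-split words (rough sketch only)."""
    kept = []
    for i, word in enumerate(words):
        if word == "sa" and i + 1 < len(words):
            kind = STARTER_KIND.get(words[i + 1])
            if kind is None:
                raise ValueError("word after sa is not a known starting marker")
            # Search backward for the last kept marker of the same kind.
            cut = 0  # assumption: with no earlier marker, erase to text start
            for j in range(len(kept) - 1, -1, -1):
                if STARTER_KIND.get(kept[j]) == kind:
                    cut = j
                    break
            del kept[cut:]
        else:
            kept.append(word)
    return kept


print(" ".join(apply_sa("mi viska le blanu zdan. sa le xekri zdani".split())))
# mi viska le xekri zdani
```

For Example 13.7 the sketch returns “.i mi cusku zo .djan.”, that is, Example 13.8 preceded by a leading “.i”, which is unusual but grammatical at the start of a text.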

The cmavo “su” (of selma'o SU) is yet another metalinguistic operator that erases the entire text. However, if the text involves multiple speakers, then “su” will only erase the remarks made by the one who said it, unless that speaker has said nothing. Therefore “susu” is needed to eradicate a whole discussion in conversation.

Note: The current machine parser does not implement either “su” or “susu” erasure.

Hesitation: Y
The following cmavo is discussed in this section:

     .y.    Y       hesitation noise

Speakers often need to hesitate to think of what to say next or for some extra-linguistic reason. There are two ways to hesitate in Lojban: to pause between words (that is, to say nothing) or to use the cmavo “.y.” (of selma'o Y). This resembles in sound the English hesitation noise written “uh” (or “er”), but differs from it in the requirement for pauses before and after. Unlike a long pause, it cannot be mistaken for having nothing more to say: it holds the floor for the speaker. Since vowel length is not significant in Lojban, the “y” sound can be dragged out for as long as necessary. Furthermore, the sound can be repeated, provided the required pauses are respected.

Since the hesitation sound in English is outside the formal language, English-speakers may question the need for a formal cmavo. Speakers of other languages, however, often hesitate by saying (or, if necessary, repeating) a word (“este” in some dialects of Spanish, roughly meaning “that is”), and Lojban’s audio-visual isomorphism requires a written representation of all meaningful spoken behavior. Of course, “.y.” has no grammatical significance: it can appear anywhere at all in a Lojban sentence except in the middle of a word.

No more to say: FAhO
The following cmavo is discussed in this section:

     fa'o   FAhO    end of text

The cmavo “fa'o” (of selma'o FAhO) is the usually omitted marker for the end of a text; it can be used in computer interaction to indicate the end of input or output, or for explicitly giving up the floor during a discussion. It is outside the regular grammar, and the machine parser takes it as an unconditional signal to stop parsing unless it is quoted with “zo” or with “lo'u ... le'u”. In particular, it is not used at the end of subordinate texts quoted with “lu ... li'u” or parenthesized with “to ... toi”.
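The stop-unless-quoted behavior of “fa'o” can be sketched as a simple scan over pre-split words. The function name truncate_at_faho is hypothetical, and the sketch only models the two exemptions named above (“zo” and “lo'u ... le'u”); “zoi” quotations and all other preprocessing are left out.

```python
def truncate_at_faho(words):
    """Return the words up to, but not including, the first unquoted fa'o.

    Sketch only: a fa'o is treated as quoted when it directly follows zo or
    lies between lo'u and le'u.
    """
    kept = []
    in_lohu = False
    i = 0
    while i < len(words):
        word = words[i]
        if word == "zo" and i + 1 < len(words):
            # zo protects the single following word, whatever it is.
            kept.extend(words[i:i + 2])
            i += 2
            continue
        if in_lohu:
            if word == "le'u":
                in_lohu = False
        elif word == "lo'u":
            in_lohu = True
        elif word == "fa'o":
            break  # unquoted fa'o: stop reading the text here
        kept.append(word)
        i += 1
    return kept
```

For instance, on the words “zo fa'o cu valsi fa'o mi klama” the sketch keeps “zo fa'o cu valsi” and discards everything from the second, unquoted “fa'o” onward.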

List of cmavo interactions
The following list gives the cmavo and selma'o that are recognized by the earliest stages of the parser, and specifies exactly which of them interact with which others. All of the cmavo are at least mentioned in this chapter. The cmavo are written in lower case, and the selma'o in UPPER CASE.


 * “zo” quotes the following word, no matter what it is.
 * “si” erases the preceding word unless it is a “zo”.
 * “sa” erases the preceding word and other words, unless the preceding word is a “zo”.
 * “su” is the same as “sa”, but erases more words.
 * “lo'u” quotes all following words up to a “le'u” (but not a “zo le'u”).
 * “le'u” is ungrammatical except at the end of a “lo'u” quotation.
 * ZOI cmavo use the following word as a delimiting word, no matter what it is, but using “le'u” may create difficulties.
 * “zei” combines the preceding and the following word into a lujvo, but does not affect “zo”, “si”, “sa”, “su”, “lo'u”, ZOI cmavo, “fa'o”, and “zei”.
 * BAhE cmavo mark the following word, unless it is “si”, “sa”, or “su”, or unless it is preceded by “zo”. Multiple BAhE cmavo may be used in succession.
 * “bu” makes the preceding word into a lerfu word, except for “zo”, “si”, “sa”, “su”, “lo'u”, ZOI cmavo, “fa'o”, “zei”, BAhE cmavo, and “bu”. Multiple “bu” cmavo may be used in succession.
 * UI and CAI cmavo mark the previous word, except for “zo”, “si”, “sa”, “su”, “lo'u”, ZOI, “fa'o”, “zei”, BAhE cmavo, and “bu”. Multiple UI cmavo may be used in succession. A following “nai” is made part of the UI.
 * “.y.”, “da'o”, “fu'e”, and “fu'o” are the same as UI, but do not absorb a following “nai”.

List of Elidable Terminators
The following list shows all the elidable terminators of Lojban. The first column is the terminator, the second column is the selma'o that starts the corresponding construction, and the third column states what kinds of grammatical constructs are terminated. Each terminator is the only cmavo of its selma'o, which naturally has the same name as the cmavo.

     be'o   BE              sumti attached to a tanru unit
     boi    PA/BY           number or lerfu string
     do'u   COI/DOI         vocative phrases
     fe'u   FIhO            ad-hoc modal tags
     ge'u   GOI             relative phrases
     kei    NU              abstraction bridi
     ke'e   KE              groups of various kinds
     ku     LE/LA           description sumti
     ku'e   PEhO            forethought mekso
     ku'o   NOI             relative clauses
     li'u   LU              quotations
     lo'o   LI              number sumti
     lu'u   LAhE/NAhE+BO    sumti qualifiers
     me'u   ME              tanru units formed from sumti
     nu'u   NUhI            forethought termsets
     se'u   SEI/SOI         metalinguistic insertions
     te'u   various         mekso conversion constructs
     toi    TO              parenthetical remarks
     tu'u   TUhE            multiple sentences or paragraphs
     vau    (none)          simple bridi or bridi-tails
     ve'o   VEI             mekso parentheses