1 Introduction

Downstep is a pervasive phenomenon in tonal languages that lowers the pitch of a tone relative to the same tone earlier in the utterance (Rialland 1997; Connell 2011; Leben 2018). It affects not only the pitch of a particular tone but also the entire pitch range, or register, meaning that all tones following an instance of downstep are also lowered. This is demonstrated in Fig. 1. Register R1 contains the sequence of tones HML. Following the first L, there is downstep, which shifts the register down to its new position R2. Within R2, there is the same sequence of tones, but each is realized at a lower pitch than the same tone in R1. Downstep can theoretically occur an infinite number of times in an utterance, gradually lowering each downstepped tone and all subsequent tones, resulting in many more surface pitches than underlying tonal categories. In practice, though, the number of times downstep occurs is limited by phrase length (Hyman 1979a; Leben 2018).

Fig. 1 Representation of register lowering
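The register shift described above can be illustrated with a toy numeric sketch. The tone offsets, the step size, and the starting register value below are hypothetical, chosen only to make the arithmetic visible; they are not measurements from any language, and the `!` symbol is an assumed stand-in for a downstep marker.

```python
# Toy model of register lowering. All numbers are hypothetical.
TONE_OFFSET = {"H": 4, "M": 2, "L": 0}  # height of each tone within the register
DOWNSTEP_SIZE = 1                        # how far the register drops per downstep


def realize(sequence, register_floor=10):
    """Return surface pitches for a list of tones; '!' marks a downstep."""
    pitches = []
    for symbol in sequence:
        if symbol == "!":
            # Downstep shifts the whole register, so every later tone is lower.
            register_floor -= DOWNSTEP_SIZE
        else:
            pitches.append(register_floor + TONE_OFFSET[symbol])
    return pitches


print(realize(["H", "M", "L", "!", "H", "M", "L"]))
```

Under these assumed values, the three tones after the downstep keep the same spacing as before it, but each sits one unit lower than its counterpart in the first register.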

Downstep can be classified into two categories according to its triggers: automatic and nonautomatic downstep (Stewart 1965). Automatic downstepFootnote1 is often characterized as downstep that occurs when a linked H follows a linked L (e.g., Yip 2002; Gussenhoven 2004; Snider 2007). This definition is limited, making two claims that are inadequate when languages with more than two tones are considered: 1) the (only) trigger of automatic downstep is an L tone, and 2) the (only) target of automatic downstep is an H tone. There are, however, languages where nonlow tones can trigger downstep of following Hs, as well as languages where L tones trigger downstep of nonhigh tones. Examples of the former include Yala-Ikom (Armstrong 1968) and Northern Toussian, both of which have three contrastive level tones. In these languages, M tones cause following H tones to be downstepped. Examples of the latter, languages in which tones other than H can be the target of downstep, include Supyire (Carlson 1994), a four-tone language where the lowest tone downsteps all other tone categories, as well as Seenku (McPherson 2019), a four-tone language where the lowest tone downsteps the two highest tones, but not the second lowest. Yala-Ikom and Northern Toussian are also examples of this, as Ls cause following Ms to be downstepped (Armstrong 1968). A more general definition of automatic downstep, then, is that it is a process where certain linked nonhigh tones downstep certain linked nonlow tones; the sets of tones that act as triggers or targets of downstep are language specific.

Nonautomatic downstep, conversely, is downstep that is not triggered by a preceding linked tone. Often, nonautomatic downstep is attributed to floating L tones. Under this approach, the sequence /HHHH HH/, where circled tones are floating tones, is realized as [HHHHHH]. Other mechanisms have been argued to cause nonautomatic downstep. For example, in Yemba, downstepped Ls occur when a linked L is preceded by the sequence of floating tones   (Hyman 1985).Footnote2 In some languages, nonautomatic downstep has been analyzed as a phonetic effect caused when two Hs are adjacent, without the presence of an L tone. This is the case for Supyire (Carlson 1983) and Shambaa (Odden 1982).

The representation of downstep has been a major locus of study—is it the phonetic realization of sequences of certain tones, or is it a phonological unit in its own right (Stewart 1981; Clements and Ford 1981; Clements 1983; Carlson 1983; Yip 2002; Lionnet 2022b, In press)? If the latter, is it part of the featural representation of tonal categories, or is it distinct from them?

I address these questions through the description and analysis of double downstep in Northern Toussian (Niger-Congo, potentially Gur/Mabia; Burkina Faso), presenting data collected with the speakers Karim Traoré, Daouda Traoré, and Safiatou Diabaté during fieldwork conducted since 2018. There has been very little study of tone in Northern Toussian: except for Struthers-Young (2022), there are no published accounts of downstep in the language. This article presents novel data on hitherto undescribed tonal and morphosyntactic features of the language.

Double downstep is a rare tonal effect where the register is lowered more than is typical of other instances of downstep in the language, leading to a larger drop in pitch. Consider (1).

  1. (1)
    figure b

In both (1a) and (1b), the verb  bears an H tone. When singly downstepped in (1a), it surfaces at a pitch lower than the H of s ‘father’ but higher than the M of the preceding word p is. When doubly downstepped in (1b), the pitch of  is lower than that of the preceding M noun p ‘husband,’ considerably lower than in (1a). I show that this is due to two separate phenomena, both of which individually cause downstep: 1) grammatical tone and 2) a prosodic boundary effect. When the two effects target the same word, their effects are cumulative, leading to double downstep.

I argue that these data provide evidence that downstep is phonologically controlled, rather than being a phonetic effect. To account for these data, I propose a model of tonal representation based on Register Tier Theory (Snider 1990, 2020) that employs subtonal featural representations and has the ability to model complex registral effects like double downstep. In this model, like Snider’s, tones are composed of the tonal features H and L as well as the register features h and l. It differs, however, in that multiple register features can be stacked onto a single tone-bearing unit (TBU), and downstep is caused for each additional l register feature beyond what is lexically specified for that tone. This model allows for the representation of double downstep in an analytically simple manner, while being capable of representing a diverse array of downstepping phenomena attested crosslinguistically.

The paper is structured as follows. I first introduce Northern Toussian, describing the aspects of its morphosyntax and tonology relevant to this study in Sect. 2. Sect. 3 is a description of double downstep in Northern Toussian. In Sect. 4, I discuss other languages with double downstep and summarize how they have been analyzed. In Sect. 5, I discuss different theoretical approaches to the representation of tonal categories and downstep. Following this, in Sect. 6, I present a novel model of tonal representation, with which I analyze the Northern Toussian data. After a discussion of remaining issues and future research in Sect. 7, I conclude in Sect. 8.

2 Northern Toussian

Northern Toussian is a minority language of Burkina Faso spoken to the southwest of Bobo-Dioulasso. The number of speakers of the language is uncertain: national demographic surveys only include data for the seven largest languages spoken in the country. The last linguistic survey of Northern Toussian, conducted by SIL in 1995, estimated the total at just under 20,000 speakers (Eberhard et al. 2024). Northern Toussian, and the Toussian languages in general, are quite vital, so we might predict that the number of Toussian speakers has risen roughly proportionally with the increase in population. On that assumption, there might be around 40,000 speakers currently. There are certain varieties that are not as vital: in particular, the varieties spoken in Moami and Tien, which are quite divergent from Northern Toussian and Southern Toussian and might constitute a third Toussian language. In these villages, speakers appear to be shifting to Dioula, and the variety of these villages might be endangered. Figure 2 is a map of the Toussian languages.

Fig. 2 Map of the Toussian languages

Both Toussian languages are underdescribed. What little research has been published on the languages has focused on Southern Toussian (Prost 1964; Mous 1999; Wiesmann 2004; Barro et al. 2004). Outside of my own work (Struthers-Young 2022, 2023), Zaugg-Coretti (2005) and Bossers and Boone (2023) are the sole publications about Northern Toussian.

2.1 Basic morphosyntax

Some of the tonal phenomena discussed in this paper depend on the properties of certain morphosyntactic markers. For that reason, I give a brief overview of Northern Toussian morphosyntax, focusing on the aspects relevant for the realization of double downstep.

The Toussian languages have SAuxOVX word order. This order is an areal feature that likely originated with the Mande languages, but has spread to many other families within the Macro-Sudan belt like the Senoufo languages and some Songhay, Atlantic, Kru, and Kwa languages, among others (Güldemann 2007). Aux refers to a domain that includes various auxiliary elements, including tense, aspect, mood, and polarity items (TAMP), discourse markers, and auxiliary verbs. X houses adjuncts and oblique arguments, such as postpositional phrases, most adverbials, etc. Multiple auxiliary elements can co-occur within a single phrase, and the combination of auxiliary elements serves to mark grammatical categories—there is almost no inflectional morphology on the verb itself.

The elements of the Aux domain are shown in Fig. 3. The linear order of elements is indicated by the order of the columns, with P standing for ‘position.’ Elements within the same column are either in complementary distribution or can occur in any relative order.

Fig. 3 Auxiliary elements of Northern Toussian

The elements of P6 and P7 share characteristics with main verbs: 1) they are the target of grammatical tone that I discuss in Sect. 3.2, and 2) some of the P6–P7 markers exhibit concordant marking of imperfectivity. For these reasons, I consider these words to be auxiliary verbs, rather than particles like the markers in P1–P5.

Three auxiliary verbs exhibit suppletive aspect marking, including kw/fFootnote3 ‘be able.pfv/ipfv,’ pw/p ‘come.pfv/ipfv,’ and ky/tj~tj~tj ‘go.pfv/ipfv.’ The latter two can also function as main verbs. All other verbs have a single form used in imperfective and perfective contexts. The aspectual interpretation of the phrase depends on the dynamicity of the verb. Unmarked dynamic verbs are interpreted as perfective, whereas unmarked stative (adjectival) verbs can be interpreted either as stative (the sheep is white) or inchoative (the sheep became white). In (2), there is no overt TAMP morphology in both phrases with the dynamic verb  ‘sweep’ and the stative verb w ‘be long.’

  1. (2)
    figure e

For verbs that do not undergo a suppletive aspect alternation, imperfectivity is only indicated by a nasal proclitic that attaches to the left edge of the VP. When the auxiliary verbs p prost ‘again,’ p ‘come.ipfv,’ and ty ‘go.ipfv’ are present, the imperfective marker exhibits multiple exponence, attaching to each auxiliary in addition to the left edge of the VP. It does not attach to p prog or f ‘be able,’ likely for historical reasons: p originated as the copula—in other Toussian languages, in fact, the two markers are homophonous, such as p in Southern Toussian (Prost 1964) or p in the Northern Toussian of Kourinion (personal research)—and copular phrases are not marked for imperfective aspect. The imperfective marker would not have been present at any point in the grammaticalization process from copula to a progressive marker, and it has not yet spread to p through analogy.

The positional classification as, e.g., a P6 vs P5 marker, is based solely on linear order of markers, rather than behavior, which is why f ‘be able.ipfv’ and p prog are considered P6 markers alongside t ‘again’ and p pros even though they do not receive imperfective marking. I have not assigned the imperfective marker to any of these positions because of the multiple exponence it exhibits.

2.2 Basic tonology

Northern Toussian has a complex tonal system with three contrastive level tones: H (), M (), and L (). CV syllables can bear lexical contour tones, including the two-tone contours HL (), HM (), and LH (), as well as the three-tone contours HLH (´), LHM ( ). Monomorphemic verbs are limited to a handful of melodies, namely H, M, L, HM, and HL. Tones on nouns, conversely, are assigned on a per-syllable basis, leading to contrasts like HL.L  ‘orphan,’ H.HL  ‘hyena,’ and H.L  ‘uterus.’ Each of these words has a similar shape, CVCC(C)VC, yet the inflection point of the H and L is at a different location, indicating that there is no consistent melody-mapping process in nouns.

In sequences of Hs and sequences of Ms, there is very little declination, the gradual lowering of the average pitch of the tones across an utterance (Connell and Ladd 1990). This is seen with Hs in (3a) and Ms in (3b). In these, the pitch stays quite stable across words. For sequences of Ls, however, there is some declination accompanied by phrase-final lowering, as seen in (3c). Asymmetries in declination rates across tonal categories are not necessarily unexpected: a similar pattern is attested in Mambila (Connell 2011), a four-tone language in which there is no declination in sequences of like tones, except in the lowest tonal category, which exhibits both declination and final lowering.

  1. (3)
    figure f

There are multiple reasons to consider the downtrend in (3c) to be declination, rather than downstep. First, there are no sudden drops in pitch, as is typical for downstep—the lowering is gradual across the course of the utterance.

Second, the rate of declination varies according to sentence length, decreasing as the length of the utterance increases; this is a characteristic feature of declination (Lindau 1986). Consider the three phrases in (4), which are three (4a), five (4b), and seven (4c) syllables in length, all composed exclusively of L tones. Each of them starts at approximately 140–160 Hz and falls to around 110–120 Hz by the end of the utterance. The slope of declination for (4a) is approximately −50 Hz/s, (4b) is −41 Hz/s, and (4c) is −17 Hz/s. Although more utterances would be needed to determine the average rate of declination by utterance length, there is a clear trend: the slope becomes less steep, i.e., the rate of declination decreases, as the utterance becomes longer.
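The relationship between utterance length and slope can be made concrete with a short sketch. It assumes the slope is computed as the change in pitch divided by the duration of the utterance, which is one common way to estimate it and not necessarily the procedure used for (4). The start pitch, end pitch, and durations below are hypothetical values picked from within the ranges reported above; the utterance durations themselves are not given in the text.

```python
def declination_slope(f0_start_hz, f0_end_hz, duration_s):
    """Average pitch slope in Hz/s between the start and end of an utterance."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return (f0_end_hz - f0_start_hz) / duration_s


# Hypothetical utterances: the same 35 Hz fall spread over increasing durations.
for duration in (0.7, 1.0, 2.0):
    slope = declination_slope(150.0, 115.0, duration)
    print(f"{duration:.1f} s: {slope:.1f} Hz/s")
```

Because the size of the fall stays roughly constant while the duration grows, the slope moves toward zero in longer utterances, which is the trend described for (4a–c).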

  1. (4)
    figure g

Third, the lowering is independent of context, i.e., it is not conditioned by syntactic construction, nor associated with particular lexemes. In (4), there are both intransitive (4a) and transitive sentences (4b and c), and multiple types of constructions, e.g., possessive constructions (4c) and adjectives modifying nouns (4b and c). Across each phrase, there is no discernable difference in behavior of the tones conditioned by any of these various constructions. As I will show throughout Sect. 3, words of other tonal categories exhibit contrastive downstep in some of these same constructions. These three factors point to the downtrend in sequences of L tones being declination, rather than downstep.

Automatic downstep is triggered whenever a lower tone precedes a higher tone. This means that an L tone causes both a following M (5a) and H (5b) to be downstepped, and an M causes a following H (5c) to be downstepped.

  1. (5)
    figure h
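The generalization that automatic downstep applies whenever a lower tone precedes a higher tone can be sketched as a simple comparison of adjacent tones. This is a minimal illustration restricted to level tones; the function name and the numeric ranking of L, M, and H are assumptions made for the example, and contour tones are not handled.

```python
# Relative height of the three level tones (an assumed numeric encoding).
HEIGHT = {"L": 0, "M": 1, "H": 2}


def automatic_downstep_sites(tones):
    """Return the indices of tones that directly follow a lower tone."""
    return [
        i
        for i in range(1, len(tones))
        if HEIGHT[tones[i - 1]] < HEIGHT[tones[i]]
    ]


for sequence in (["L", "M"], ["L", "H"], ["M", "H"], ["H", "M", "L"]):
    print(sequence, automatic_downstep_sites(sequence))
```

The first three sequences correspond to the three trigger and target pairings described for (5a–c); the last contains no lower-to-higher transition, so no index is returned.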

Throughout the rest of this paper, automatic downstep is not represented in transcriptions. Unless otherwise stated, the symbol  can be interpreted as representing nonautomatic downstep.

3 Double downstep in Northern Toussian

Double downstep in Northern Toussian is caused when two separate downstepping processes occur at the same position in the utterance, causing the register to be lowered twice cumulatively. The first of these processes is an instance of prosodically conditioned downstep, where downstep occurs following an M positioned at the right edge of the phonological phrase (Sect. 3.1). The second is an instance of grammatical tone (Sect. 3.2), indicating that a verb lacks a preverbal internal (i.e., nonsubject) argument.

3.1 Prosodically conditioned downstep

Earlier, I showed a sequence of M tones where there is no downstep, repeated in (6).

  1. (6)
    figure i

Consider, however, (7), where the second of two Ms is downstepped.

  1. (7)
    figure j

There are multiple contexts in which an M is downstepped following another M, including, as seen in (7), when an M object follows an M subject, as well as in a possessive construction (8a), within a postpositional phrase (8b), and when the word following an M verb has an initial M tone (8c).

  1. (8)
    figure k

Downstep does not occur when an M nominal determiner or modifier follows an M (9a–c) or when an M verb follows an M object (9d).

  1. (9)
    figure l

What are the factors that condition the downstep? In most cases, the downstep occurs following the right edge of DPs and VPs. Selected examples from above are repeated below, with their syntactic structures shown. (10a–c) are examples of downstep following the right edge of a DP, and (10d) is downstep after a VP.

  1. (10)
    figure m

(11) shows the examples in (9a–c) with their syntactic structures overlaid. These are all examples of nouns followed by their modifiers within DPs—there is no downstep in these contexts.

  1. (11)
    figure n

Based on this, one might be inclined to argue that downstep is purely syntactically conditioned, whereby the element following the right edge of certain XPs like DPs and VPs is downstepped. There are two reasons why this analysis is insufficient. First, the DP internal to the VP does not trigger downstep, seen in (12).

  1. (12)
    figure o

Second, the downstep only occurs following M tones; if there is a sequence of Hs or Ls in environments parallel to those in (8), there is no corresponding nonautomatic downstep. Were the downstep a purely syntactic phenomenon, it would be expected to occur following any tone, not M tones alone. (13) shows that there is no downstep of an object DP following an H or L subject DP, (14) of two H or L nouns in a possessive construction, (15) of a postposition following an H complement, and (16) of a noun following an H or L verb.

  1. (13)
    figure p
  1. (14)
    figure q
  1. (15)
    figure r
  1. (16)
    figure s

Instead, I argue that the location of downstep is better explained by a syntax–prosody interaction where words are downstepped following an M positioned at the right edge of the phonological phrase. Many recent theories of syntax–prosody propose that phonological phrases often correspond to syntactic constituents, such that certain XPs are parsed into a phonological phrase. Moreover, they hold that recursive prosodic phrasing is possible (Truckenbrodt 1999; Selkirk et al. 2011). I adopt these basic assumptions. In Northern Toussian, DPs and VPsFootnote4 are typically parsed into phonological phrases, and the downstep occurs at the right edge of these phrases. This can be seen below: the subject DP (17a), a DP within a possessive construction (17b), or the complement of a postposition (17c) constitute phonological phrases, and the element following them is downstepped. Phonological phrases are marked by parentheses throughout the rest of this paper.

  1. (17)
    figure t

Although DPs correspond to phonological phrases in most contexts, the constituents of the VP are tightly coupled: I analyze VP-internal DPs as not constituting phonological phrases of their own. Instead, they are parsed within the same phrase as the verb, shown in (18). As there is no phonological phrase boundary to the right of the object, the verb is not downstepped.

  1. (18)
    figure u

In addition to the absence of downstep following the object DP, there are two primary pieces of evidence supporting this analysis. First, pausing preferentially occurs before or after the VP, rarely occurring within it. Second, there is L tone spreading restricted to the VP. When the final tone of the object is L, it spreads onto the verb, causing H toned verbs to become LH (19a) and HL verbs to become L (19b).Footnote5 Verbs of other tones are unaffected.

  1. (19)
    figure v

The following example demonstrates that the spreading is restricted to the VP. The L does not spread from the subject to the object (20a), within a postpositional phrase (20b), or within a possessive construction (20c). Likewise, there is no spreading within the DP in contexts where the prosodic boundary effect does not occur, e.g., between a noun and adjective (20d). Northern Toussian is not the only language that does not parse a VP-internal DP into a phonological phrase: see, e.g., Kimatuumbi (Odden 1987), Chitumbuka (Downing 2006), Chichewa (Downing and Mtenje 2011), and Niuean (Clemens 2019).

  1. (20)
    figure w

Since the object and verb are in the same phonological phrase and there is no downstep of the verb when both are M, downstep acts as a diagnostic for determining phonological phrasing. This allows the prosodic structure of the language to be further probed. This diagnostic is crucial for determining the phonological phrasing within the verbal domain, which I now turn to.

The VP, together with its auxiliary modifiers, corresponds to a phonological phrase. This is evidenced by the lack of downstep triggered by M Aux markers: there is no downstep following the discourse marker p is (21a), m ‘more; longer’ (21b), or the prospective aspect auxiliary verb p (21c) when they precede an M word.

  1. (21)
    figure x

If the auxiliary elements were parsed into separate phonological phrases from the VP, we would expect the object to be downstepped, as shown in (22).

  1. (22)
    figure y

It is commonly assumed in indirect reference theories of syntax–prosody interactions that functional elements are not parsed into independent phonological phrases, and instead either occur within the phonological phrase of a lexical element, or are unparsed (Truckenbrodt 1999). The examples above show that this generalization holds in Northern Toussian.

Having established how prosodic phrasing functions in Northern Toussian, let us consider how a longer phrase, such as that in (23), is prosodified and how downstep is realized. The subject DP, parsed into its own phonological phrase, is an M noun, p ‘husband.’ Because of its M tone, the following object n ‘people’ is downstepped. The VP constitutes a single phonological phrase, so there is no downstep within it; however, kj ‘wife,’ the complement of the postposition following it, is downstepped. The M of kj conditions downstep of the postposition tj ‘place.’Footnote6 This example also serves to show that this particular effect is truly downstep, rather than being a local lowering effect that modifies the pitch of the tone, but does not affect the register. If it were a local effect, one of the later M words, e.g., bw, would be expected to rise back to around the pitch of p. The fact that the downstep progressively lowers the pitch of each downstepped tone and the tones following it is a hallmark of downstep (Leben 2018).

  1. (23)
    figure z

Thus far, I have only shown how the downstep behaves with sequences of like tones. This is because sequences of nonidentical tones show no alternations like those seen above. This is the case both for sequences where an alternation might be expected, i.e., when a tone follows an M such as in M H or M L sequences, and for contexts where no alternation is expected, such as L H, L M, H M, and H L sequences. I show the former in (24) and leave the latter in the Appendix (85). In these examples, I note all instances of downstep, both automatic and nonautomatic.

  1. (24)
    figure aa

At the top, we see M M sequences, where, as shown above, there is downstep of the second mid in postpositional phrases and possessive constructions, but not when the noun is modified by an adjective or is internal to the VP. In these same contexts, an H following an M is always downstepped, and an L following an M is never downstepped. These data could be interpreted as evidence that this prosodic effect only occurs between two M tones as a type of OCP effect. I argue that this is not the case, and that instead, the prosodic boundary effect does occur in the M H sequences, but that the contrast is neutralized in the DP-internal and VP-internal contexts because the H tones are downstepped due to automatic downstep. I provide evidence for this in Sect. 3.3.

The lack of downstep with L tones can be explained by a language-wide prohibition against downstepped L tones. Although there is declination with sequences of L tones, as was shown in (3c) and (4), there are no instances where an L L sequence without downstep contrasts with one in which the second L is downstepped. The lack of downstepped L is not surprising, as downstepped L is exceedingly rare crosslinguistically, though it is attested in several languages such as Yemba (Hyman 1985), Paicî (Lionnet 2022b), and Kikuyu (Clements and Ford 1981); see footnote 2 of Lionnet (In press) for a more extensive list of languages with downstepped L.

The following is a summary of the properties of prosodic phrasing relevant for the present study:

M tones positioned at the right edge of phonological phrases downstep the tones that follow them

  • This targets both M and H
  • L are unaffected

DPs as well as VPs and their Aux modifiers constitute phonological phrases

  • Except for the VP-internal DP, which is parsed into the same phonological phrase as the VP
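The summary above can be restated as a small sketch. It assumes a simplified input in which each word carries a single level tone and a flag saying whether it ends a phonological phrase; both the data format and the function name are invented for illustration. Automatic downstep is deliberately left out, so an H after a phrase-internal M is not flagged here even though it would surface downstepped.

```python
def boundary_downstep_sites(words):
    """Return indices of words downstepped by the prosodic boundary effect.

    words: list of (tone, ends_phrase) pairs, one level tone per word.
    A word is flagged when the preceding word is M and phrase-final,
    and the word itself is M or H (L is never downstepped).
    """
    sites = []
    for i in range(1, len(words)):
        prev_tone, prev_ends_phrase = words[i - 1]
        tone, _ = words[i]
        if prev_ends_phrase and prev_tone == "M" and tone in ("M", "H"):
            sites.append(i)
    return sites


# M subject in its own phrase, then an M object and M verb sharing one phrase.
clause = [("M", True), ("M", False), ("M", True)]
print(boundary_downstep_sites(clause))
```

In this configuration only the object is flagged: the subject ends a phrase and is M, whereas the object does not end a phrase, so the verb after it is left alone.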

I now turn to grammatical tone that also conditions downstep.

3.2 Grammatical tone

3.2.1 Basic distribution

There is grammatical tone indicating that a verb lacks a preverbal internal argument, which I gloss as apvia (absent preverbal internal argument).Footnote7 In transitive sentences, verbs surface with their lexical tones, like the H verb  ‘watch’ (25a) or the HL verb  ‘search’ (25b). If the object of these verbs is elided, there is a tonal alternation where  ‘watch’ surfaces as LH (26a) and  ‘search’ as L (26b).

  1. (25)
    figure ab
  1. (26)
    figure ac

This same tonal pattern is seen with intransitive verbs. H pw ‘come’ is realized with an LH tone (27a), and HL by ‘complain’ surfaces as L (27b). The lexical tones of these verbs can be seen in a number of contexts, e.g., in imperative, conditional, negative, or subjunctive clauses, among others. I demonstrate this with imperatives in (28).

  1. (27)
    figure ad
  1. (28)
    figure ae

Example (29) shows this process with all attested verb melodies: H surfaces as LH, HM as LHM, and HL as L; M and L toned verbs are unaffected.Footnote8

  1. (29)
    figure af

The apvia marker targets verbs regardless of the tone of the subject:

  1. (30)
    figure ag

When auxiliaries are present—shown again in Fig. 4—there is a further alternation.

Fig. 4 Auxiliary elements of Northern Toussian

If a P1 or P2 Aux particle precedes an intransitive verb, the effects shown in (29) target the verb. When other auxiliaries are present and the tone of the verb is H, the verb is downstepped. Otherwise, it surfaces with its lexical tone. Example (31) shows the tonal alternation on the verb when P1/P2 auxiliaries are present, including the P1 markers fn ‘also’ (31a) and kwn ‘anyway’ (31b), as well as the P2 auxiliaries  pst (31c), and w evid (31d).

  1. (31) Tonal change with P1/P2 Aux + Verb
    figure ai

Example (32) shows how H verbs are downstepped following other markers. The verb  ‘watch’ is downstepped after the P3 marker r sbjv (32a) and the P4 marker k neg (32b). A pitch track of the latter is shown in (33).

  1. (32) Downstep with P3–P7 and H verb
    figure aj
  1. (33)

Downstep also occurs when the verb is marked for imperfectivity, as in (34). Note that the downstep occurs after the imperfective marker, realized phonetically in this context as a homorganic syllabic nasal bearing the same pitch as the preceding H—this has implications for the representation of the grammatical tonal effect, which I address shortly.

  1. (34)
    figure ak

There is no downstep when verbs of other tones follow one of these particles, such as the HL verb  ‘search’ (35a), M  ‘sweep’ (35b), or HM k ‘walk’ (35c).Footnote9

  1. (35)
    figure al

Based on this distribution, the tonal effect appears to be a floating L positioned at the left edge of the verb when it lacks a preverbal internal argument. When there are no auxiliary elements, or only a P1–P2 Aux precedes the verb, as in (27) or (31), the floating L can dock onto the verb, causing the effects in (29). With P3–P7 auxiliaries it remains floating and downsteps H verbs instead; before verbs of other tones, it is deleted. When proclitics attach to the verb, as in (34), the effects of the tone target the verb alone and not the proclitics attached to it, indicating that the tone is positioned immediately before the verb and after any verbal proclitics. In the following examples, I represent the grammatical tone as a superscript  placed before the verb.

3.2.2 Tonal properties of the copula and progressive auxiliary verb

I characterized the effects in (29), where the apvia marker docks onto the verb when no auxiliary particles are present, as a general process. There are, however, two exceptions: regardless of TAMP context, the copula p and progressive auxiliary verb p are always downstepped by the grammatical tone—it never docks onto them, forming a contour tone, as is the case for other H toned verbs. This can be seen in the following examples. In (36), the two markers are downstepped when there is no TAMP marker before the copula or progressive marker, and in (37) when following the past marker.

  1. (36)
    figure am
  1. (37)
    figure an

To summarize its properties, the apvia marker:

indicates that there is no preverbal internal argument;

causes H verbs to be downstepped when preceded by P3–P7 auxiliaries; it is deleted before verbs of other tones;

cannot dock onto the copula p or the progressive auxiliary verb p—they are always downstepped.
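The distribution of the apvia marker described in this section can be sketched as a lookup plus one condition. The function name, the parameter names, and the encoding of auxiliary positions as integers 1 to 7 are assumptions made for this illustration; the sketch only restates the surface generalizations above and is not the analysis developed later in the paper.

```python
# Surface melody when the floating L docks onto the verb, following (29).
DOCKED = {"H": "LH", "HM": "LHM", "HL": "L", "M": "M", "L": "L"}


def apvia_outcome(verb_tone, aux_positions=(), imperfective=False, never_docks=False):
    """Return (surface_tone, downstepped) for a verb with no preverbal internal argument.

    aux_positions: positions (1-7) of any auxiliaries preceding the verb.
    imperfective:  True if the verb carries the imperfective proclitic.
    never_docks:   True for the copula and the progressive auxiliary verb.
    """
    stays_floating = (
        never_docks
        or imperfective
        or any(3 <= position <= 7 for position in aux_positions)
    )
    if stays_floating:
        # The floating L downsteps H verbs and is deleted before other tones.
        return verb_tone, verb_tone == "H"
    # Otherwise the L docks onto the verb and changes its melody.
    return DOCKED[verb_tone], False


print(apvia_outcome("H"))                     # no auxiliary
print(apvia_outcome("H", aux_positions=[4]))  # after a P4 marker
print(apvia_outcome("HL", aux_positions=[3])) # non-H verb after a P3 marker
```

Treating the copula and the progressive auxiliary through a separate `never_docks` flag mirrors the way the text presents them, as lexical exceptions to an otherwise general process.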

3.3 Double downstep

Double downstep arises when both the prosodic boundary downstep and the apvia marker target the same word. Broadly, this occurs in two contexts: 1) when an M subject is followed by an H verb in an intransitive imperfective phrase, and 2) when an M subject is followed by the copula or progressive auxiliary verb.

Example (38) demonstrates the first context. The subject DP p is parsed into a phonological phrase. Since it has an M tone, it causes the tone of the following word to be downstepped. The downstep is realized on the verb, and not the imperfective marker, because n= is toneless: the downstep targets the following tone. The verb  ‘watch’ is preceded by the imperfective marker, which conditions the apvia marker to trigger downstep, rather than associating with the next TBU. These two downstepping processes each target , causing it to be doubly downstepped.

  1. (38)
    figure ao

The other context where double downstep occurs is with the copula and progressive auxiliary verb. Since the grammatical tone can never dock onto them and always causes downstep, they are doubly downstepped following an M subject:

  1. (39)
    figure ap

We can be certain that this is indeed double downstep, rather than, say, the H of p becoming an M tone, due to the behavior of the words following the doubly downstepped H. Consider (40), repeated from (1b).

  1. (40)
    figure aq

The pitch of n ‘mother’ surfaces at the same pitch as the H of . Had the tone of  changed to , the following H tones should surface at a higher pitch than , as is typical for an H following an M, seen above in, e.g., (1a) and (7).

The following examples demonstrate that double downstep only occurs in the limited circumstances shown in (38) and (39). There is no double downstep if a tone-bearing auxiliary is present because it intervenes between the subject and the apvia marker, as in (41). The M subject p ‘husband’ causes p is to be downstepped, and the grammatical tone downsteps the copula p. As the copula is downstepped only once, p is realized at a pitch higher than p, slightly lower than the pitch of p. I analyze that p is downstepped due to the grammatical tone, but the source of downstep is in fact ambiguous: if there were no floating L tone between p and pp would be lowered due to automatic downstep triggered by the M H sequence.

  1. (41)
    figure ar

If the subject is anything but M, there is no double downstep since only M tones condition the prosodic boundary downstep. In (42a) and (42b), the tone of the subject is H and L, respectively, and in both contexts, the copula p is only downstepped once.

  1. (42)
    figure as

Phonetically, a doubly downstepped high typically surfaces at approximately the same pitch as a downstepped mid, shown with the contrast between p in (43a) and p in (43b).

  1. (43)
    figure at

Double downstep, then, appears to be caused by two different downstepping processes, a grammatical floating L tone and a prosodic effect, which cumulatively lower the register twice. Table 1 shows all the sources of downstep in Northern Toussian.

Table 1 Sources of downstep in Northern Toussian

In Sect. 6, I provide an analysis of this process, but first I describe other cases of double downstep (Sect. 4), and give background on how downstep has been modeled (Sect. 5).

Double downstep in other languages

Double downstep is rarely attested. It is reported in three Grassfields languages, Medmba (Voorhoeve 1971), Yemba (Hyman and Tadadjeu 1976),Footnote10 and Bangante (Hyman and Tadadjeu 1976, 77); in two western Nilotic languages, Kumam (Hieda 2010) and Acooli (Hieda 2011); as well as two closely related Oceanic languages spoken in New Caledonia, Drubea and Num (Lionnet In press).

The reported cases of double downstep in the Eastern Grassfields and Western Nilotic languages are all analyzed as arising due to two floating L tones, but with an important nuance: a floating H intervenes between the two floating Ls. The double downstep in Medmba, Kumam, and Acooli arises from the structure in (44a); Yemba has the structure shown in (44b).

  1. (44)
    figure au

The H intervening between the Ls serves two purposes: 1) it prevents OCP effects that might cause the two Ls to merge or delete, and 2) it provides a target for the downstep. The second point might seem unorthodox, as floating tones are not typically understood to be the target of downstep. I return to this point shortly. In many early works, especially those published before Leben (1973) and Goldsmith (1976), the triggers and targets of downstep were often rigorously defined. Voorhoeve (1971) was explicit in his definition of downstep in Medmba, giving the rule shown in (45).

  1. (45)
    figure av

Voorhoeve uses downstep features ([+d]), which are added to the existing tone and act as instructions to lower the register of the utterance at that point. This rule would be represented as (46) in an autosegmental representation.

  1. (46)
    figure aw

This is a restrictive rule that predicts downstep in (47a), but not (47b), because there are two adjacent Ls between the Hs—downstep is predicted if and only if a single L intervenes between two Hs. This is different from the typical conception of downstep, which is triggered when an L precedes an H, regardless of the tone before the L (Gussenhoven 2004, 100; Yip 2002, 148).

  1. (47)
    figure ax

The rule in (45) is motivated by some of the unique tonal properties of Medmba. Like many other Eastern Grassfields languages, Medmba is a two-tone language with an incredibly complex tonal system due to its many floating tones, both lexically and grammatically conditioned. Example (48) gives a set of words that have floating tones at their edges (Voorhoeve 1971, 50).

  1. (48)
    figure ay

These nouns can be combined by means of an associative construction, which, among other things, marks possession. This construction is formed by juxtaposing the two nouns, where the first word is the possessee, and the second is the possessor. An associative marker intervenes between the two nouns. It can be either a floating H or L, conditioned by the noun class of the first word.Footnote11 In phrases like (49), where there is a series of floating Ls between the two Hs, there is no downstep because the sequence HLH does not occur at any point in the clause.

  1. (49)
    figure az

In (50), HLH does occur, as the H of the associative marker acts as the first H, and the first two tones of  function as the remaining two tones. This results in  being downstepped.

  1. (50)
    figure ba

Double downstep occurs when there is a sequence H  H, such as in (51). The first word of the associative construction is , the associative marker is a floating H, and the tone of the second word is .

  1. (51)
    figure bb

This satisfies the rule in (46) twice—the  of the associative marker is downstepped due to the preceding HL tones, as is the H of . Downstepping a floating H tone might seem unconventional, as downstep is often viewed as the phonetic implementation of a sequence of tones, rather than being a phonological category separate from the target tone that could have nonlocal effects (Yip 2002, 150–151). For Voorhoeve, however, the register-lowering effects of downstep features do not depend on them being associated with linked tones. Instead, downstep could be triggered by floating tones, lowering the register, with the phonetic effects of the register lowering only being perceptible on the next linked tone. Hyman (1985, 53) makes this argument explicit in his analysis of downstepped L tones in Yemba.

The rules that produce downstep in Yemba, as analyzed in Hyman and Tadadjeu (1976), are more complex than those of Medmba, and their details are beyond the scope of this paper. However, the circumstances under which double downstep arises are similar: it is triggered by an alternating sequence of L and H floating tones, both grammatically and lexically specified. First, consider the sentence in (52).

  1. (52)
    figure bc

Hyman and Tadadjeu (1976) propose that hodiernal tense is marked segmentally by a preverbal particle k as well as by a tonal circumfix . Transitive verbs are marked with an object marker suffix. For class 1a, this marker is , which harmonizes with the preceding vowel (Tadadjeu 1980, 175). When contrastive emphasis is placed on the object, an additional object marker occurs between the verb and the object (SVO word order). Like the verbal suffix, the class 1a marker is , realized as  in this context. There are two instances of downstep in this sentence. The floating L of the hodiernal marker, because it is surrounded by H tones, causes the object marker to be downstepped. Similarly, the lexically specified floating L of  ‘child’ results in downstep as well.

Without contrastive emphasis, the object marker particle between the object and verb is absent. In such a context, there is the sequence of floating tones     before the linked H of the object. Each of the floating L tones is flanked by H tones, conditioning register lowering that leads to the object being downstepped twice.Footnote12 This is seen in (53).

  1. (53)
    figure bd

In Acooli and Kumam (Hieda 2010, 2011), there is similarly a set of tonal processes and vowel deletion that results in a sequence of floating  between two linked H tones, causing double downstep.

In all of these cases, double downstep is attributed to an alternating sequence of floating Ls and Hs. Floating Hs can be the target of downstep, causing a register lowering whose phonetic effects are realized on the next tone-bearing unit. Double downstep arises from the cumulative register lowering of two separate instances of downstep, one that targets a floating H and the other a following linked H.

Turning now to Northern Toussian, a downstep formation rule like (45) would not adequately account for the double downstep attested in the language. There is no independent evidence for the presence of a floating H tone intervening between two floating L tones—if the prosodic boundary effect is caused by a floating L tone, the double downstep would have to arise from the structure in (54), where two floating L tones directly precede a linked H.

  1. (54)
    figure be

The downstep formation rules of both Voorhoeve (1971) and Hyman and Tadadjeu (1976) explicitly predict that the structure in (54) would not result in double downstep: this is in line with the widespread assumption that downstep occurs when an L precedes an H, but not when an L precedes another L. Example (49) shows that multiple adjacent L tones do not have a cumulative downstepping effect in Medmba. This is also the case in Acooli. Consider example (55).

  1. (55)
    figure bf

There are two phonological processes that apply to this phrase, derived in (56). The first is a vowel hiatus resolution rule causing the a of p ‘of’ and the o of pj ‘Opiyo’ to coalesce, producing an o. Hieda (2011) argues that this results in the first L delinking and floating. Following the hiatus resolution is a tone-spreading rule where the H of bk ‘book’ spreads onto the following syllable, delinking the L. Assuming this analysis is correct, at this stage in the derivation, there are two floating Ls before the H in pj. This does not result in double downstep—instead the high is only downstepped once.

  1. (56)
    figure bg

The trigger of double downstep for Northern Toussian, then, is novel, caused by the cumulative effects of two separate downstepping processes—not through an alternating sequence of .

The nature of downstep

What triggers downstep? Many, if not most, analyses assume that L tones are involved: either linked Ls in the case of automatic downstep or floating Ls for nonautomatic downstep. This is not universally the case, of course, as some authors have proposed that downstep is a phonological primitive (Stewart 1981) or can be a phonetic effect (Odden 1982; Carlson 1983), but the current mainstream view seems to hold that downstep typically arises from L tones.Footnote13 If this view is maintained, and the prosodic boundary effect in Northern Toussian is attributed to an insertion of a floating L tone, an instance of double downstep such as (57) would arise when the two floating L tones, one exponing apvia, the other from the prosodic boundary effect, each triggers downstep of the following word: in this case, p cop.

  1. (57)
    figure bh

An accurate model of downstep in Northern Toussian must then be able to account for the following patterns:

  1. (58)
    figure bi

Downstep only occurs between the last L of a sequence of linked Ls and a following H, and none of the Ls are themselves downstepped; each floating L before an H, however, triggers downstep. Moreover, we want to maintain the fundamental analytical insight gained by floating L tones: floating Ls are L tones, just like linked Ls, and both floating and linked Ls should have similar behaviors. Therefore, floating Ls and linked Ls should trigger downstep in much the same way, and any rule deriving downstep should be applicable to both linked and floating tones.

There are two analyses that are possible when we restrict the representation of tonal categories to tonal primitives. Hypothesis 1: downstep is a context-dependent phonetic effect caused when an H follows an L, in line with proposals like Yip (2002, 150–151). In this analysis, floating L tones behave identically to linked tones, and will only cause downstep when they occur before an H. Hypothesis 2: L tones in and of themselves act as instructions to lower the register, independent of the phonological category of the following tone.

These hypotheses make two different sets of predictions, neither of which reflects the behavior of downstep in Northern Toussian. Consider the implications of these predictions in (59).

  1. (59)
    figure bj

Under hypothesis 1, the linked sequence /LLH/ surfaces as expected, with only the H downstepped, since the H is downstepped when it follows the second L. However, a sequence of two floating Ls before a linked H would be predicted to surface with only a singly downstepped H, as the first L is not local to the H: it is separated from the H by the second L. Conversely, hypothesis 2 correctly predicts the behavior of the floating L tones, as each floating L individually triggers downstep, causing the H to be downstepped twice. A problem arises in the behavior of linked Ls. Since the downstepping rule states that all L tones cause following tones to be downstepped, regardless of the tonal category of the following tone, this model predicts that the second linked tone should also be downstepped.
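The two sets of predictions can be tallied mechanically. The following is only an illustrative sketch: the encoding of a tone as a (category, linked) pair, the function names, and the assumption that a floating tone passes any register lowering it undergoes on to the next linked tone are conventions of the sketch, not part of the cited analyses.

```python
from typing import Callable, List, Tuple

Tone = Tuple[str, bool]  # (category "H" or "L", True if linked to a TBU)

def hypothesis_1(tones: List[Tone], i: int) -> int:
    """Lower the register once if tone i is an H directly preceded by an L."""
    return int(i > 0 and tones[i][0] == "H" and tones[i - 1][0] == "L")

def hypothesis_2(tones: List[Tone], i: int) -> int:
    """Lower the register once after every L, whatever tone follows."""
    return int(i > 0 and tones[i - 1][0] == "L")

def realize(tones: List[Tone], rule: Callable[[List[Tone], int], int]):
    """Return (category, number of downsteps) for each linked tone."""
    surface = []
    pending = 0  # lowerings not yet audible because the tone is floating
    for i, (category, linked) in enumerate(tones):
        pending += rule(tones, i)
        if linked:
            surface.append((category, pending))
            pending = 0
    return surface

# Hypothetical inputs: two linked Ls before H, and two floating Ls before H.
LINKED = [("L", True), ("L", True), ("H", True)]
FLOATING = [("L", False), ("L", False), ("H", True)]

for name, rule in (("hypothesis 1", hypothesis_1), ("hypothesis 2", hypothesis_2)):
    print(name, realize(LINKED, rule), realize(FLOATING, rule))
```

The pattern to be derived is a single downstep on the H after linked Ls (with no downstepped L) and two downsteps on the H after floating Ls. In this sketch, hypothesis 1 yields the first but not the second, and hypothesis 2 yields the second but not the first.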

Why is there this asymmetry? These data imply that floating L tones and linked L tones do not condition downstep in the same way. This is the problem that Hyman faced in his analysis of Yemba (Hyman 1985). Like its closely related sister Medmba, it has a complex tonology rife with floating tones. In Yemba, nonautomatic downstep occurs and is attributed to floating L tones, but the language lacks automatic downstep. This means that while each floating L causes a linked H to be downstepped (60a), a linked H is not downstepped following a linked L (60b).

  1. (60)
    figure bk

Hyman attributes the difference in downstep to a difference in representation, where an utterance has an autosegmental register tier in addition to the tonal tier, and register tones (H or L) within this tier effectuate registral shifts up or down. In this approach, downstep is not a phonetic effect, and is instead phonologically conditioned through register tones. H tones are represented as is typical for autosegmental diagrams (61a). Downstepped Hs, however, have an added L in the register tier that is associated with the H tone itself—not with the TBU (represented as X). This is shown in (61b). The additional register L causes the downstep.

  1. (61)
    figure bl

Hyman analyzes nonautomatic downstep as arising from the rule in (62).Footnote14 It states that when a floating L precedes a linked H, it attaches itself as a register feature to the H, causing downstep. As this rule specifically applies to floating tones, a linked L will not spread onto a following H, and therefore there is no automatic downstep.

  1. (62)
    figure bm

Snider (1990, 2020) further developed these registral intuitions into Register Tier Theory (RTT), a feature-geometric approach to tonal representation.Footnote15 In it, there are two distinct tiers to a tone: the register tier and the tone tier. The register tier contains the register features h and l, which function as instructions to shift the register up and down, respectively, setting the pitch target of tonal categories to a new level. The tone tier has two tonal features, H and L, which situate the pitch within the register. The former sets the pitch of the tone at the top of the register, the latter at the bottom. Both tiers are linked to a tonal root node, which is then associated to the TBU. The tiers are shown in (63), which represents a tonal category that has an H tone feature and h register feature, equivalent to the high tone in (61a).

  1. (63)
    figure bn

This model permits the four tonal categories in (64). High tones have an H tone feature and h register feature, and Low tones have an L tone feature and l register feature. Tones other than high or low can be represented in one of two ways: they can have an H tone feature and l register feature (64c) or an L tone feature and h register feature (64d). Snider (2020) calls these tones M1 and M2, respectively. Using H and L to refer to subtonal features renders these symbols, as well as the term tone, ambiguous: H could refer to the tonal category H h, or the tonal feature H alone. To avoid this ambiguity, I will use the words High, Mid, and Low to refer to the tonal category, and the letters H and L to refer to the tonal features. In prose, I use the term tone to refer exclusively to tonal categories, not the tonal features H or L; I will always call the latter either tonal features or tone features.

A four-tone language makes use of all four categories, but a three-tone language uses a subset. Snider (1990) states that High tones must always be H and h and Low tones L and l, but allows the language to select between M1 and M2. Lionnet (2022a), however, argues that subtonal featural representations are emergent, and there could be a language with three tonal categories H l for High, L h for Mid, and L l for Low.Footnote16

  1. (64)
    figure bo

Each separate l register feature that is not shared between two tonal root nodes acts as an instruction to lower the register: a sequence of tones where each has its own l register feature will result in every subsequent tone being downstepped. This is seen in (65a), where the three Mid1 tones have separate l register features and each tone is downstepped. When a single l register feature is shared among multiple tonal root nodes, as is the case for (65b), the register is shifted lower at the left edge of the sequence of tones, resulting in the surface mid pitch, but none of the subsequent TBUs are downstepped because they are linked to the same l feature.

  1. (65)
    figure bp
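The contrast between separate and shared l register features can be stated as a small computation. This is a minimal sketch under simplifying assumptions: register levels are abstract integer steps rather than pitch values, and each l feature is identified by a hypothetical integer id, so that tones sharing one feature carry the same id.

```python
from typing import List

def register_levels(l_feature_ids: List[int]) -> List[int]:
    """Register level of each tone, given the id of the l feature it links to.

    The register drops one step whenever a tone is linked to a different
    l feature than the tone before it; a shared feature lowers it only once.
    """
    levels = []
    level = 0
    previous = None
    for feature_id in l_feature_ids:
        if feature_id != previous:
            level -= 1
        levels.append(level)
        previous = feature_id
    return levels

separate = register_levels([1, 2, 3])  # as in (65a): three separate l features
shared = register_levels([1, 1, 1])    # as in (65b): one l feature shared by all
print(separate, shared)
```

With separate features each tone sits one step below the last; with a shared feature all three tones stay on the level set at the left edge of the sequence.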

Automatic downstep is achieved by spreading an l register feature onto a subsequent TBU and delinking the register feature previously associated with it. The Medmba downstep rule, where an H is downstepped only in the sequence HLH, would be formulated in Register Tier Theory in (66). When flanked by two High tones, Low tones spread their l register feature onto following linked High tones, delinking their h register features. This lowers the register at the beginning of the span of ls, causing the second High to be downstepped and all subsequent Highs to surface at the level of the downstepped High. Had the l spreading not occurred, the h register feature would shift the register back to the level of the first High tone, and the two High tones would be phonetically identical.

  1. (66)
    figure bq

While this model is more complex than the Hyman (1985) model, requiring three tiers to represent the tones and long-distance spreading rules to prevent overproliferation of downstep in contexts like (65b), it has two major advantages. First, it has subtonal features, enabling assimilatory processes that are otherwise unmotivated, such as raising Low to Mid before High, or lowering High to Mid after a Low. The former is achieved by spreading either the h or H feature of the High to the preceding Low, the latter by spreading either the l or L feature of the Low to the following High. Subtonal featural representations have been shown to have empirical value for diverse phenomena in languages like Seenku (McPherson 2016), Babanki (Akumbu 2019), and Laal (Lionnet 2022a).Footnote17 There is no representational reason for these types of assimilatory processes to occur in models that employ only tonal primitives. Second, the l register feature borne by Low and Mid tones explains why they trigger downstep, both automatic and nonautomatic. Both are lexically specified for l, allowing for register feature spreading.

However, Register Tier Theory is a restrictive model that faces theoretical challenges. In (66), associating the l register feature with a High tone changes the phonological category of the tone, rendering it identical to the Mid1 tone in (64c). This predicts that downstep in a four-tone language will always result in neutralization of two of the lexical tonal categories. Empirically, this does not hold, as there are languages such as Seenku (McPherson 2019) and Supyire (Carlson 1994) where the highest of the four tones is distinct from the middlemost tones when downstepped. If there were this type of neutralization, phonological processes targeting Mid tones should apply to both lexical Mid tones as well as downstepped High tones. Languages like Supyire and Seenku are therefore evidence that downstepped High and Mid must be representationally distinct.
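The neutralization problem can be made concrete with a lookup over the four feature combinations in (64). The sketch below assumes a simple tuple encoding of (tone feature, register feature); the function name is hypothetical and stands in for the spreading-and-delinking step described for (66).

```python
# The four tonal categories of standard RTT, keyed by (tone feature, register feature).
RTT_CATEGORIES = {
    ("H", "h"): "High",
    ("H", "l"): "Mid1",
    ("L", "h"): "Mid2",
    ("L", "l"): "Low",
}

def spread_l_and_delink(tone_feature: str, register_feature: str):
    """Standard RTT downstep: the incoming l replaces the old register feature."""
    return (tone_feature, "l")

downstepped_high = spread_l_and_delink("H", "h")
print(RTT_CATEGORIES[downstepped_high])  # Mid1
```

Because the downstepped High ends up with exactly the features of Mid1, nothing in the representation can keep the two apart.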

For languages with three tones, the model predicts that Mid tones have a limited behavior. If a downstepped High tone (i.e., H l) is phonologically distinct from a Mid tone, the Mid tone must be L h; this is the case for Babanki, which has two lexical tones but a derived downstepped High tone and Mid tone (Akumbu 2019). However, limiting a Mid tone to the features L h means that a Mid tone cannot trigger automatic downstep, as it has no l feature. As we have seen in Northern Toussian, downstepped High tones are distinct from Mid tones and Mid tones cause automatic downstep; this model therefore does not adequately account for the behavior of the tonal categories in the language.

These issues arise, as noted in Lionnet (In press), because l register features serve two distinct roles in Snider’s model: they have 1) a paradigmatic function, used to define tonal categories, as well as 2) a syntagmatic use, being the source of downstep. Some of these issues could potentially be avoided if tonal features and register features are decoupled in the representation of tonal categories, as is the case, he argues, for Drubea and Num. These languages appear not to have contrastive tone but do have contrastive downstep, which is evidence that they only employ register features, not tone features. This eliminates the need for tonal root nodes and the tonal tier, consequently reducing much of the representational complexity of RTT. If not all languages with contrastive downstep also have contrastive tone, we are left with a view of tonal representation that is much more idiosyncratic and language specific: some languages might only have register features or tonal features, but not both. This raises the question of whether a tonal root node is necessary at all, as Lionnet (2024) suggests: perhaps the tone features and register features link directly to the TBU or the register-bearing unit (RBU), and there is no tonal root node. A consequence of this is that the tonal tier must have more expressive tonal representations than the H and L features of RTT—perhaps additional tonal primitives like M, or the subtonal featural representations of Yip (1980)/Pulleyblank (1986) are used instead. Under this approach, register features are not part of tonal representations and tone and downstep are orthogonal.

However, an association of register and tonal features to a tonal root node (or some other representational equivalence) has empirical value for certain languages. As seen in Northern Toussian, both Mid and Low tones trigger automatic downstep, therefore they belong to a natural class of tonal categories that effectuate registral shifts. Building register features into the representation of tonal categories explains why these types of natural classes exist. In the following section, I provide a model of tonal representation that allows for a connection between register and tonal features, while avoiding the neutralization problem endemic to standard RTT. I use this to argue that there is empirical evidence in support of including register features in the representation of tonal categories, at least for certain languages.

A register feature analysis of Northern Toussian

In this section, I adopt the use of both tonal and register features, as in Register Tier Theory, but propose substantive modifications to the theory that account for less-common, but attested tonal phenomena like double downstep, expanding its empirical adequacy. I propose that register features can stack onto tonal nodes, and that downstep arises for each extra l register feature beyond what is lexically specified. I then reanalyze Northern Toussian double downstep using this model. This model has several analytical benefits. Like standard Register Tier Theory, it allows automatic and nonautomatic downstep to be modeled as a single phenomenon—the association of a register feature with a TBU—and it motivates why both Mid and Low tones trigger automatic downstep. Additionally, it eliminates the locality issues discussed in Sect. 5 and it explains the difference in behavior of the grammatical tone and prosodic boundary effect, i.e., that the grammatical tone can dock onto following syllables, but the boundary effect cannot.

6.1 Details of the model

This model seeks to represent all the lexical and derived tonal categories of Northern Toussian in Table 2, while maintaining that a downstepped tone is featurally more similar to its nondownstepped counterpart than to a different lexical tonal category, e.g., the representation of downstepped High should be closer to High than to Mid, etc. Moreover, it aims to provide a mechanism by which double downstep can occur.

Table 2 Tonal categories and surface forms

To do so, I adopt many of the basic assumptions of Register Tier Theory. Tonal categories are composed of the tonal features H and L as well as register features h and l, which are linked to a tonal root node and can be spread to adjacent tones, causing downstep, tonal assimilation, contour tone formation, etc.

I diverge from Register Tier Theory in two significant ways. First, tonal root nodes are not restricted to hosting a single register feature, and instead multiple register features can stack onto a single tone, achieved when register features spread by association lines onto tonal root nodes without dissociating the existing register feature. This is proposed in Lionnet (In press), who analyzes double downstep in Drubea and Num as being caused by association of two separate l register features with a single register-bearing unit.

Second, register lowering (i.e., downstep) does not occur with every l register feature. Instead, downstep is triggered for each additional l register feature associated with the tonal root node beyond what is lexically specified. This means that, if a High tone has an H tonal feature and h register feature, then a downstepped High tone has both of these features as well as an additional l register feature. Register features, then, have the capacity to effectuate register shifts, but only do so in certain circumstances: they function as paradigmatic features defining tonal categories when lexically specified, but have syntagmatic registral effects in derived contexts.

This approach has both conceptual and theoretical benefits. In Register Tier Theory, register shifting is an inherent property of the lexical representation of a tone, but in practice, it is often a contextual phenomenon that occurs when tonal categories interact with one another. Modeling register shifts as derived processes is in line with how downstep is typically conceptualized, i.e., as triggered by neighboring tones. Theoretically, it is a more permissive model that allows for phenomena like double downstep, while often resulting in more parsimonious analyses. In standard RTT, it is often necessary to posit extensive register feature merging rules to prevent adjacent like tones from being downstepped, as in (65b), and it is assumed that adjacent tonal features are likewise merged. This is empirically warranted for many languages that have word-level tonal melodies, but it might not be appropriate for languages in which tones appear to be assigned per-syllable or per-mora. While such merging rules are a potential tool for a particular analysis in this model, I do not assume a priori that adjacent identical features must merge, as is the case in RTT.

I show how downstep functions in (67a). In this, a Low tone is followed by a High tone. The l register feature of the Low spreads onto the tonal root node of the High without delinking the h feature of the High. This results in the High having three features in total: the lexically specified H tonal feature and the h register feature, as well as the l register feature that has spread onto it. The l feature triggers a lowering of the register, producing a downstepped High. If the l register feature causes the h register feature to be delinked, as in (67b), this alters the phonological category of the tone: in this instance, it would produce an M tone, as the resultant features are H and l.

  1. (67)
    figure br
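The difference between spreading with and without delinking can be sketched as follows. The class and method names are hypothetical, and the bundle notation with a plus sign for each derived l feature anticipates the convention introduced for (69); this is an illustration of the proposal, not a full implementation of it.

```python
from dataclasses import dataclass

@dataclass
class TonalRootNode:
    tone: str            # tonal feature: "H" or "L"
    base_register: str   # register feature that defines the category: "h" or "l"
    derived_l: int = 0   # l register features stacked on by spreading

    def spread_l(self, delink: bool = False) -> None:
        """Associate an incoming l register feature with this node."""
        if delink:
            self.base_register = "l"  # as in (67b): the category itself changes
        else:
            self.derived_l += 1       # as in (67a): the l stacks on top

    def downsteps(self) -> int:
        """One register lowering per l feature beyond the defining one."""
        return self.derived_l

    def features(self) -> str:
        parts = [self.tone, self.base_register] + ["+l"] * self.derived_l
        return "{" + ", ".join(parts) + "}"

stacked = TonalRootNode("H", "h")
stacked.spread_l()
delinked = TonalRootNode("H", "h")
delinked.spread_l(delink=True)
print(stacked.features(), stacked.downsteps())    # {H, h, +l} 1
print(delinked.features(), delinked.downsteps())  # {H, l} 0
```

Calling `spread_l()` a second time on `stacked` gives `{H, h, +l, +l}`, which is how the sketch would encode a doubly downstepped High.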

The reader might question why associating two register features with a TBU would cause double downstep, rather than merging due to the Twin Sisters Convention or the OCP. This is because register features are not binary features in a traditional sense: they are instructions to modify the register, shifting it up or down. The effects of these registral modifications are cumulative in a way that binary features are not. This conception of register features shares similarities with hierarchical models of featural representation like Clements (1991), which defines vowel quality by the values of the [open] feature across multiple register tiers, or the representations of vowels in particle phonology, whose phonetic implementation depends on the combination of the particle features a, u, and i (Schane 1984). In these theories of phonology, as well as the current one, accumulation of certain types of phonological features like register features leads to cumulative effects. Similarly, metrical theories with grids assign different phonological effects, e.g., primary vs secondary vs no stress, depending on the number of grid marks present on a given syllable: the cumulativity of the grid marks determines the phonetic realization of the syllable (Hayes 1995).

6.2 Basic tonal representations and automatic downstep

In Northern Toussian, Mid and Low tones constitute a natural class, as they both trigger automatic downstep. Since register features control registral effects like upstep and downstep, it follows that both Mid and Low have l register features as part of their lexical representations. I propose, then, that Northern Toussian’s tonal categories are structured as in (68): High tones have an h register feature and an H tonal feature; Mid tones are l and H; and Low tones are l and L.

  1. (68)
    figure bs

Under this approach, the Northern Toussian surface forms in Table 2 arise when each category has the features in (69). If the tonal category has one register feature, it is a lexical tone; if it has two or more register features, it is derived. I represent these feature bundles as follows: the tonal and register features are shown in curly braces. Each derived l register feature that triggers downstep is preceded by a plus sign. This means that a High is represented as {H, h} and a downstepped High as {H, h, +l}.

  1. (69)
    figure bt

Automatic downstep occurs through a rule where l register features spread onto subsequent tones, shown in (70). In the diagram, r stands for either register feature.

  1. (70)
    figure bu

This rule is constrained in two important ways. First, there is no downstep between two like tones, i.e., /Mid Mid/ is realized as [Mid Mid], without downstep of the second Mid. This indicates that the spreading only occurs between two tones of different categories.Footnote18 Second, there is a phonology-wide constraint against downstepped Low tones. Low tones are realized at roughly the same pitch regardless of whether they follow a High or Mid tone. If we represent this visually with Chao letters, the sequences /High Low/ and /Mid Low/ surface as [ ] and [ ], respectively. If a Mid caused a subsequent Low tone to be downstepped, we would expect a Low following a Mid to be lower in pitch than a Low following a High, i.e., the output *[ ] would be expected instead. However, there appears to be a prohibition against downstepped Low tones, as they are unattested in the language: there is no contrast between [L L] and a sequence in which the second L is downstepped. It follows that this constraint is maintained when Low tones follow Mids.
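The rule and its two constraints can be sketched as a pass over adjacent tone pairs. This is one possible reading of the prose description, under simplifying assumptions: all tones are linked, only immediately adjacent tones interact, and phrase boundaries and floating tones are ignored. The names `LEXICAL` and `automatic_downstep` are introduced for illustration only.

```python
from typing import List

# Lexical feature bundles proposed in (68): (tonal feature, register feature).
LEXICAL = {"High": ("H", "h"), "Mid": ("H", "l"), "Low": ("L", "l")}

def automatic_downstep(categories: List[str]) -> List[str]:
    """Return the feature bundle of each tone after l spreading."""
    bundles = []
    for i, category in enumerate(categories):
        tone, register = LEXICAL[category]
        features = [tone, register]
        if i > 0:
            previous = categories[i - 1]
            trigger_has_l = LEXICAL[previous][1] == "l"
            # Constraint 1: no spreading between tones of the same category.
            # Constraint 2: Low tones are never downstepped.
            if trigger_has_l and previous != category and category != "Low":
                features.append("+l")
        bundles.append("{" + ", ".join(features) + "}")
    return bundles

print(automatic_downstep(["Mid", "High"]))  # ['{H, l}', '{H, h, +l}']
print(automatic_downstep(["Mid", "Mid"]))   # ['{H, l}', '{H, l}']
print(automatic_downstep(["Mid", "Low"]))   # ['{H, l}', '{L, l}']
```

High tones never trigger spreading in this sketch because their only register feature is h.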

One may question whether this model overgenerates. What combinations of features can constitute lexical tonal categories? How many register features can associate with a single tonal root node? These are open questions that require further study. However, we might begin with the following restrictions: 1) lexical categories must have one tonal feature and one register feature, i.e., all tones with more than one register feature are necessarily derived tones, and 2) there can be at most three register features associated with a single tonal root node. This means that there can be at most four separate level tones per language,Footnote19 and each tone can be downstepped at most twice. A further question might then be, if there are four lexical tonal categories, each of which could potentially be downstepped twice, could this not result in potentially twelve surface tone levels? The model only defines how many phonological categories are possible, not how many phonetic pitches are possible, and there are likely functional pressures that would prevent twelve distinct pitch levels. It would be difficult, if not impossible, for a speaker to differentiate so many distinct pitches at fluent speech rates. It is not necessary that each distinct phonological category have a separate phonetic output: surface tones can be neutralized, i.e., realized at the same pitch. Languages such as the Northern Senoufo languages (Carlson 1994, 42), Northern Mao (Ahland 2012), or Sierra Juárez Zapotec (Bickmore and Broadwell 1998) are analyzed as having two mid tones that have the same phonetic realization, but different behaviors. These tones, therefore, belong to different phonological categories for which distinct representations are required, but are phonetically identical. In other languages, downstepped H and M have identical pitches, as is the case in Babanki (Hyman 1979b; Akumbu 2019) and Bimoba (Snider 1998). Similarly, as discussed in Sect. 3.3, Northern Toussian double downstepped H and downstepped M are phonetically neutralized, produced at roughly the same pitch. They are, however, phonologically distinct, as can be seen from their different behaviors: for example, downstepped M conditions the prosodic boundary effect, whereas double downstepped H does not. Neutralization of the pitches of separate tonal categories, therefore, permits more phonological categories than pitch levels. Even if this model produces more categories than are attested in any single language, this is not necessarily an issue: after all, no language makes use of every consonant or vowel feature combination. It should not be surprising if no language makes use of all possible tonal feature combinations either.
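
The arithmetic behind the figure of twelve can be made explicit with a small enumeration. The Python sketch below is only an illustration of the two restrictions proposed above, not part of the analysis itself. The names (`lexical_tones`, `derived_forms`, `categories`) are hypothetical, and it assumes that every register feature added in the derivation is an l, so added h features (upstep) are not considered.

```python
from itertools import product

TONAL_FEATURES = ("H", "L")       # tonal tier
REGISTER_FEATURES = ("h", "l")    # register tier
MAX_REGISTER_FEATURES = 3         # restriction 2 in the text

# Restriction 1: a lexical tone has exactly one tonal and one register feature.
lexical_tones = [(t, (r,)) for t, r in product(TONAL_FEATURES, REGISTER_FEATURES)]

def derived_forms(tone):
    """Plain form plus the forms with one or two added l register features."""
    tonal, registers = tone
    room = MAX_REGISTER_FEATURES - len(registers)
    return [(tonal, registers + ("l",) * extra) for extra in range(room + 1)]

# Assumption: only l features are added in the derivation (downstep, not upstep).
categories = [form for tone in lexical_tones for form in derived_forms(tone)]

print(len(lexical_tones))  # 4
print(len(categories))     # 12
```

The count concerns phonological categories only. As the paragraph above notes, several of these may share a single phonetic pitch in a given language.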

6.3 Double downstep in Northern Toussian

In this section, I analyze the prosodic boundary effect and apvia grammatical tone using register features, showing how they combine to cause double downstep. Recall that the prosodic boundary effect causes tones to be downstepped when they follow a Mid tone at the right edge of a phonological phrase. This is caused by the spreading rule in (71). When an l register feature is at the right edge of a phonological phrase and is associated to the same TRN as an H feature, it spreads onto the subsequent tonal root node.

  1. (71)
    figure bv

This rule is applied in (73) to the phrase in (72). The Mid-toned subject p ‘husband’ is at the right edge of a phonological phrase, triggering l spreading. This results in n having the features {H, l, +l}, realized as a downstepped Mid tone.

  1. (72)
    figure bw
  1. (73)
    figure bx

The grammatical tone can have two realizations when it targets an H verb depending on the TAMP markers present in the clause and the lexical properties of the verb. Either 1) it docks onto the verb, altering the tone of the verb, e.g., causing a High verb to surface with an LH contour tone (74a), or 2) it causes downstep (74b).

  1. (74)
    figure by

This distribution is evidence that the grammatical tone is a preverbal floating Low tone—it has tonal and registral effects, and therefore has both an l register feature and an L tonal feature. When the grammatical tone docks and creates a contour tone, the tonal root node bearing the L tonal feature and the l register feature associates with the TBU tier of the verb (75).

  1. (75)
    figure bz

The downstep in (74b) is triggered by the register feature of the floating Low tone associating with the following word, shown in (76).Footnote20 When this happens, the L tonal feature and tonal root node are deleted due to stray erasure.

  1. (76)
    figure ca

Whether the docking in (75) or the downstep in (76) occurs is conditioned by phonology-external factors, namely the lexical properties of the verb and the auxiliary present before the verb. These effects are therefore in complementary distribution. There is no apparent reason why there is tonal association in one construction and downstep in the other; it might be related to morphosyntactic phrasing, but there is no obvious explanation.

Example (78) is the derivation of (77), showing how these two downstepping processes combine to cause double downstep. Underlyingly, the grammatical tone is a fully specified Low tone, bearing both an l register feature and an L tonal feature. First, the l register feature spreads from the grammatical tone to p cop, coincident with the deletion of the L tonal feature and the tonal root node. Following this, the prosodic boundary effect applies, spreading the l register feature of the subject onto the verb.

  1. (77)
    figure cb
  1. (78)
    figure cc

At this point, both l register features have associated with p, causing double downstep.
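
The two steps of this derivation can be mimicked in a few lines of Python. This is a rough sketch under stated assumptions rather than a formalization of (71) or (76). The tuple encoding of a tonal root node and the function names are hypothetical, and the feature sets High = {H, h}, Mid = {H, l}, Low = {L, l} are inferred from the surrounding discussion. The index of the phrase edge is passed in by hand, since prosodic phrasing is not modeled.

```python
# A tonal root node (TRN) is modelled as (tonal feature, register features, floating?).
# Feature sets inferred from the text: High = {H, h}, Mid = {H, l}, Low = {L, l}.
MID = ("H", ("l",), False)
HIGH = ("H", ("h",), False)
FLOATING_LOW = ("L", ("l",), True)   # the grammatical tone

def associate_floating_register(trns):
    """Link a floating tone's register feature(s) to the next TRN.

    The floating tone's tonal feature and TRN are dropped (stray erasure).
    A floating tone with nothing after it is simply dropped as well.
    """
    out, pending = [], ()
    for tonal, registers, floating in trns:
        if floating:
            pending += registers
        else:
            out.append((tonal, registers + pending, False))
            pending = ()
    return out

def boundary_spread(trns, edge):
    """Spread l from a phrase-final Mid at index `edge` onto the next TRN."""
    out = list(trns)
    tonal, registers, _ = out[edge]
    if tonal == "H" and registers[0] == "l" and edge + 1 < len(out):
        t, r, f = out[edge + 1]
        out[edge + 1] = (t, r + ("l",), f)
    return out

def downsteps(trn):
    """Each added l beyond the lexical register feature counts as one downstep."""
    return len(trn[1]) - 1

underlying = [MID, FLOATING_LOW, HIGH]       # subject, grammatical tone, following H
step1 = associate_floating_register(underlying)
step2 = boundary_spread(step1, edge=0)
print(step2[-1], downsteps(step2[-1]))       # ('H', ('h', 'l', 'l'), False) 2
```

After the first step the target carries one added l (single downstep). The boundary effect then adds the second, mirroring the order of operations described for (78).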

There is one final piece of this analysis to address. In a phrase such as (79), there is only a single instance of downstep.

  1. (79)
    figure cd

In this environment, double downstep might be expected, as the l register feature of the grammatical tone would associate with , and automatic downstep should cause the l of p to spread onto  as well. It appears, then, that the l spreading caused by automatic downstep can only target a tone with a single register feature, i.e., automatic downstep does not target syllables that are already downstepped. Since  already has a lexically specified h and the derived l from the floating tone, the l feature of p does not spread to it. This constraint appears to be found in other languages with double downstep—all attested cases of double downstep are caused by a combination of 1) grammatically conditioned floating tones, 2) floating tones arising through vowel deletion, or 3) prosodic boundary effects (as is the case with Northern Toussian)—it does not appear that automatic downstep is ever one of the factors that leads to double downstep.Footnote21 There is nothing inherent in the model that predicts that this should be the case. However, with the typological rarity of double downstep, there are too few instances to warrant building this restriction into the model itself. With time and a better understanding of double downstep, the model should be refined to account for this, if needed.
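
The conditions on automatic downstep discussed in this section can be collected into a single predicate. The sketch below is illustrative only. The automatic downstep rule itself is stated elsewhere in the paper, so the trigger condition used here (a lexical l register feature on the preceding tone) is an assumption, as are the tuple encoding and the function name `automatic_downstep`. The three blocking conditions correspond to the ban on downstep between like tones, the ban on downstepped Low tones, and the restriction to targets with a single register feature.

```python
# Tones as (tonal feature, register features); feature sets inferred from the text.
HIGH, MID, LOW = ("H", ("h",)), ("H", ("l",)), ("L", ("l",))

def automatic_downstep(trigger, target):
    """Return the target after automatic l-spreading, or unchanged if blocked."""
    trig_tonal, trig_regs = trigger
    tgt_tonal, tgt_regs = target
    applies = (
        trig_regs[0] == "l"                                          # assumed trigger: lexical l
        and (trig_tonal, trig_regs[0]) != (tgt_tonal, tgt_regs[0])   # no downstep between like tones
        and tgt_tonal != "L"                                         # no downstepped Low tones
        and len(tgt_regs) == 1                                       # target not already downstepped
    )
    return (tgt_tonal, tgt_regs + ("l",)) if applies else target

print(automatic_downstep(MID, HIGH))               # ('H', ('h', 'l'))
print(automatic_downstep(MID, MID))                # ('H', ('l',))
print(automatic_downstep(MID, LOW))                # ('L', ('l',))
print(automatic_downstep(MID, ("H", ("h", "l"))))  # ('H', ('h', 'l'))
```

The last call corresponds to the situation in (79): the target already carries a derived l from the floating tone, so automatic spreading is blocked and only a single downstep results.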

6.4 Summary and explanatory value of the model

By representing downstep as a derived effect caused by the association of l register features, this model allows for downstep and double downstep to be modeled more precisely than is possible when tonal primitives are used—downstep arises from l register features, with double downstep occurring when two extra ls are associated with a single tonal root node. In an approach where downstep is caused by the phonetic implementation of a Low tone before a High tone, there is no mechanism for the two Ls in (80a) to each cause downstep, as the first L is not local to the H. Modeling downstep with autosegmental association lines circumvents this locality problem, since two l register features can each associate with a subsequent tone, as in (80b).

  1. (80)
    figure ce

The model has the flexibility to account for other instances of double downstep, as well as diverse tonal phenomena attested crosslinguistically. First, let us consider another case of double downstep. I summarize the Medumba data presented earlier in (81), as analyzed in Voorhoeve (1971).

  1. (81)
    figure cf

In (82), I present a tentative reanalysis of these data employing the current model. High tones have the features {H, h} and Low tones are {L, l}. What Voorhoeve analyzed as floating L and H tones are floating l and h register features. There is a rule that spreads floating l register features onto the following syllable. This causes the lexically assigned l register feature of  ‘child’ to associate with its tonal root node in both examples. In (82b) the l of  ‘thing’ also associates with the tonal root node of , resulting in double downstep.Footnote22

  1. (82)
    figure cg

In many languages, downstep regularly occurs between two adjacent High tones. In instances where there is High tone spreading, the downstep occurs at the edge of the span, indicating that the process occurs at the boundary of separate autosegments: if a High is associated with multiple TBUs, there is no downstep, whereas two separate adjacent High tones result in the second being downstepped. Such a contrast is found in Shambaa, for example between the words nyk ‘snake’ and ngt ‘sheep’ (Odden 1982):

  1. (83)
    figure ch

This type of process has been variously argued to occur due to the phonetic implementation of two adjacent H tones, as in Shambaa, or through L insertion at the juncture of two adjacent H tones, as in Tiriki (Paster and Kim 2011 and references therein). In the current model, this could be explained in one of two ways. Following Paster and Kim’s (2011) analysis, this could be attributed to l register insertion. Alternatively, High tones in languages like Shambaa could have the features {H, l}, with a rule that spreads the l feature to the following High, causing downstep. As discussed in Sect. 5, unlike in Snider (1990), I take an emergentist perspective of featural representation, meaning that High tones do not necessarily need to have the features {H, h}, but instead have representations that follow from their behavior (Lionnet 2022a).
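
The surface generalization at issue, namely that downstep appears at the boundary between two High autosegments but not inside a multiply linked High, can be illustrated with a short Python sketch. It is deliberately agnostic between the two analyses just mentioned (l insertion versus spreading of a lexical l) and only spells out the resulting pattern. The function name `surface` and the encoding of autosegments as (tone, number of linked TBUs) pairs are hypothetical.

```python
def surface(autosegments):
    """Spell out one tone label per TBU, marking downstep with ꜜ.

    `autosegments` is a list of (tone, number_of_linked_TBUs) pairs.
    A High is downstepped only when the preceding autosegment is also High.
    """
    out, previous = [], None
    for tone, n_tbus in autosegments:
        downstepped = tone == "H" and previous == "H"
        out.append(("ꜜ" if downstepped else "") + tone)
        out.extend([tone] * (n_tbus - 1))   # further TBUs linked to the same autosegment
        previous = tone
    return " ".join(out)

print(surface([("H", 2)]))            # H H
print(surface([("H", 1), ("H", 1)]))  # H ꜜH
```

A single High linked to two TBUs and two adjacent single-TBU Highs therefore receive different outputs, which is the kind of contrast reported for Shambaa.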

In languages like Akan (Stewart 1965; Genzel and Kügler 2011; Genzel 2013) or Igbo (Welmers 1970), downstepped High tones can be lexically specified or produced through phonological processes, but do not otherwise occur when two High tones are adjacent. Under this model, lexical specification of downstepped High tones is simple—the downstepped tone has a floating l register feature before it, which then spreads onto the subsequent tonal root node. This is shown with the Akan word bf ‘messenger’:

  1. (84)
    figure ci

Issues and future research

This model allows for complex downstep without neutralization of distinct categories and can effectively represent double downstep. However, it does not resolve the underlying tension that Lionnet (In press) notes, namely that register features play a role both in paradigmatic definitions of tonal categories and in syntagmatic registral shifts. In RTT, register shifting and tonal categories are strictly linked, such that its empirical coverage is limited. Representing downstep as a derived effect, rather than a purely representational one, has injected needed flexibility into the model to account for a wider array of attested tonal phenomena. However, register features still have the two-fold role of defining both paradigmatic and syntagmatic tonal properties. These roles are instead mediated by the type of association: whether the register feature is part of the lexical specification of the tonal category or associates with a tone at some point in the derivation.

There are instances where the link between tone features and register features is not desirable. Lionnet (In press) shows that there are languages that appear only to have register features and not tonal features. Representing these languages with a fully articulated model such as this is potentially unnecessary. Moreover, the model predicts certain tonal trends that might not hold as the tonal properties of more languages are better studied. The current model would not predict, e.g., a four-tone language in which the lowest three tones trigger automatic downstep of the highest tone, as only two tonal categories would underlyingly have l register features. This could potentially be accounted for if a rule is present that inserts a floating l register feature following the {L, h} tone, but this would be stipulative and conceptually unsatisfactory. I am unaware of such a language, but this reflects the general dearth of data on downstep, especially automatic downstep, for languages with three or more tones.

An adequate model of downstep and tonal representation must be able to handle languages in which registral effects like downstep appear to be linked to tonal categories through phenomena like automatic downstep, while accounting for languages in which there appears to be no such link. I leave such a model to future research, as this paper sought only to address the former problem. I will note, however, that in order to construct a satisfactory model, it is necessary to build a more thorough understanding and typology of the types of registral processes found crosslinguistically—especially for understudied phenomena like double downstep, upstep, and downstep in languages with more than three tonal categories. A better-defined problem space would enable theoreticians to determine what phenomena should and should not be accounted for, and design the theory accordingly.

Conclusion

In this paper, I have shown the mechanisms by which double downstep arises in Northern Toussian. Unlike in other languages, the double downstep does not arise from the sequence   H, but rather from the cumulative effects of two register features targeting a following H. This phenomenon cannot be accounted for using an analysis with tonal primitives where downstep is a local phonetic effect. Instead, I proposed a register feature analysis whereby downstep is triggered by association of autosegmental register features. This avoids undesirable locality effects that arise when only tonal primitives are used, and allows for a unified mechanism for all types of downstep, both automatic and nonautomatic. These data support models where downstep is phonologically controlled, rather than being phonetically conditioned.

The model used is based on Register Tier Theory, but has several important differences: 1) register features can stack onto a single TBU, and 2) downstep arises as a derived effect triggered for each additional l register feature that is spread onto the TRN. These differences allow for cumulative register effects like double downstep as well as for distinct representations of lexical tonal categories and downstepped tones. This model is readily applicable to diverse downstepping phenomena such as double downstep, lexically conditioned downstep, and downstep conditioned by adjacent High tones.