1. Introduction
1.1. Focalizing Ser (FS)
Focus is defined in terms of its relationship with contextual information, as it indicates the new information provided in a sentence (Chomsky, 1971; Jackendoff, 1972). Following Krifka’s (2007) observations about information structure, in this paper we define focus as the constituent that provides new information, often appearing as an answer to a question:
(1)
Salió el perrito2
leave-past the puppy
The puppy left
In many cases, focus can be contrastive, which entails that a given element is selected from a set of alternatives provided in the context. Hence, contrastive focus introduces a new element, compares it with the presupposed possibilities, and entails that the speaker is correcting or adding specific information that was previously mentioned (Krifka, 2007). For instance, the focused element in (2) indicates that in order to give updated information the speaker is introducing a new alternative (el niño ‘the boy’) and is contrasting it to the possible entities that could have left (el perrito ‘the puppy’, la mujer ‘the woman’, etc.):
(2)
In many languages, cleft sentences are employed to provide focus to specific elements in a given sentence. In Spanish, clefts generally show contrastive focus and employ a complex syntactic structure, with a main clause containing the verb ser (‘to be’) and the focalized element, and a subordinate clause containing a relative pronoun (Goldsmith, 1981; Guitart 1989). There are three main types of cleft structures in standard Spanish: clefts (3), inverted clefts (4), and pseudo-clefts (5):
(3) Fue el perrito quien/el que salió ladrando
be-past the puppy rel pron leave-past bark-progr
It was the puppy who came out barking
(4) El perrito fue quien/el que salió ladrando
the puppy be-past rel pron leave-past bark-progr
The puppy was who came out barking
(5) El que/Quien salió ladrando fue el perrito
rel pron leave-past bark-progr be-past the puppy
The one who came out barking was the puppy
Several Spanish varieties (such as Colombian, Dominican, Ecuadorian, Panamanian, and Venezuelan) allow for an alternative structure, often referred to as focalizing ser (FS):
(6) Salió ladrando fue el perrito
leave-past bark-progr be-past the puppy
The one who came out barking was the puppy
FS has been considered a simplified, incomplete version of the pseudo-cleft, in which the relative pronoun is absent (Albor, 1986; Sedano, 1990; Toribio, 2002). However, several researchers have demonstrated that FS is syntactically unrelated to clefts (Bosque, 1999; Camacho, 2006; Curnow & Travis, 2004; Méndez Vallejo, 2009)3.
In this paper, we follow Méndez Vallejo’s (2009) analysis of FS and we view it as a mono-clausal independent syntactic structure, in which ser joins given information (topic) and new information (focus). FS ser is understood here as a discourse link (Pato, 2010), which associates both topic and focus, functioning as a connector between the two pieces of information (both old and new). This allows FS ser to introduce, emphasize, or intensify the presence of new information in the sentence.
Syntactically, based on the fact that FS can only focus elements found in the mid- to low-TP area, we support the claim that FS occupies a TP-internal position in a focus phrase, below T and above vP (Méndez Vallejo, 2009). Also, as has been observed in previous literature (Bosque, 1999; Camacho, 2006; Méndez Vallejo, 2009), FS and the pseudo-cleft behave differently in certain structures (e.g., clitic climbing and negation), which provides further evidence that these two structures are syntactically different4.
Furthermore, the fact that our data originates from spontaneous semi-directed speech supports Méndez Vallejo’s (2015b, 2019) and Escalante & Ortiz-López’s (2017) claims that FS is not a stigmatized structure in Colombian Spanish and that it emerges naturally in the speech of a wide range of speakers.
Finally, the FS structure has also been examined from other linguistic perspectives with data from different communities. Using data from Venezuelan and Colombian Spanish, it has been studied as a sociolinguistic phenomenon (Sedano, 1994; Castro, 2014; Escalante & Ortiz-López, 2017), and from semantic and pragmatic perspectives (Curnow & Travis 2004; Pato, 2010). It has also been examined as a result of linguistic evolution (Pato, 2013), from cross-dialectal perspectives (Méndez Vallejo, 2019, 2015a), and as a marker of Colombian immigrant Spanish in the U.S. (Ramírez, 2003). Furthermore, it has been described as a salient feature in Brazilian Portuguese (Mioto, 2012). This study uses Colombian Spanish data and focuses on the prosodic-syntactic interface only.
1.2. The Current Study
As we noted in the previous section, several characteristics of FS have been analyzed in the literature. However, previous studies have not accounted for the role of prosody in the production of FS. This is an important aspect of the phenomenon, if we consider that several authors have highlighted the need to understand why this alternate form occurs alongside more standard options, such as clefts (Bosque, 1999; Sedano, 1994; Méndez Vallejo, 2009; Pato, 2010, 2013; González Támara, 2017). As suggested by Sedano (1994) and Pato (2010), the answer may not be entirely based on structural or syntactic reasons, but rather on contextual aspects. These contextual reasons may include the role of referential discourse, the discourse function of ser, and other elements that may contribute to working memory and focus processing. Hence, studying the prosodic characteristics of FS may help explain the nature of this form and ultimately give us insight into speakers’ linguistic choices.
This paper seeks to provide an acoustic analysis of FS and to draw prosodic comparisons with cleft structures within contrastive focus declaratives. For instance, given a set of comparable sentences as in (7a) - (7d), we expect to find certain prosodic differences between the standard forms ((7a) - (7c)) and the dialectally-marked form (7d):
(7)
¿Salió el gato?
leave-3sg-pret the cat
Did the cat leave?
a. No, fue el perrito quien salió (Cleft)
no be-3sg-pret the puppy who leave-3sg-pret
No, it was the puppy who left
b. No, el perrito fue quien salió (Inverted cleft)
no the puppy be-3sg-pret who leave-3sg-pret
No, the puppy was the one who left
c. No, quien salió fue el perrito (Pseudo-cleft)
no who leave-3sg-pret be-3sg-pret the puppy
No, it was the puppy who left
d. No, salió fue el perrito (FS)
no leave-3sg-pret be-3sg-pret the puppy
No, it was the puppy who left
Specifically, we will address the following research questions:
What is the prosodic description of FS in Colombian Spanish varieties?
What are the prosodic focus marking differences between FS and other types of focus, such as clefts and pseudo-clefts?
We hypothesize that pitch accents will be used to mark the focused element in all of the sentences, but that there will be prosodic marking differences between the different syntactic strategies (clefts, pseudo-clefts, FS) used by the same speakers. We are particularly interested in learning if ser in FS structures is also marked with a pre-nuclear pitch accent. We expect to find some individual variation and perhaps differences based on city of origin.
The rest of the article is organized as follows: Section 2 presents previous literature on the recent interest in the syntactic-prosodic interface, prosodic research on focus marking and intonation studies on Colombian Spanish, specifically. Section 3 describes the participants, the methods used for collecting the data, and the syntactic and acoustic analysis. Section 4 delivers the results of the study and Section 5 provides a discussion and our conclusions.
2. Previous Literature
Few studies to date examine the syntactic-prosodic interface of specific linguistic phenomena for Spanish (Zubizarreta, 1998; Domínguez, 2004; Gabriel, 2010; Feldhausen & Vanrell, 2015; García García & Uth, 2018), and of these few, the majority examines focus realization. Zubizarreta (1998) pioneered the prosodic approach for studying focus realization in syntactic structures by analyzing phrasal prominence (nuclear stress) and its relation to word order. Research has since examined the placement of pitch accents on focused elements (Face, 2002; Domínguez, 2004; Uth, 2018; Vanrell & Fernández-Soriano, 2018).
2.1 Syntactic Focus Marking
Spanish is an interesting language for examining the syntactic-prosodic interface of focus because it allows flexible word order, within the limits of focus marking. Spanish is generally considered an SVO language, though VSO has been proposed as an unmarked word order as well (Zubizarreta, 1998). Traditionally, Spanish is said to follow a weak-strong focal prominence pattern, where the main stress in broad focus5 declaratives falls at the end of the utterance, or right periphery (Selkirk, 1995; Domínguez, 2004). Given this default order, informational declaratives with no focused elements (8) are indistinguishable prosodically from informational declaratives with sentence-final focused elements (9).
(8)
(9)
Contrastive focus can be shown syntactically by altering the word order of the sentence. For instance, the focused element in (10) moves to the sentence-initial position thereby signaling that the new information is contrary to what was expected.
(10)
In cleft structures, this contrast may be indicated by moving the focused element to the sentence-initial position, particularly in the case of clefts (11a) and inverted clefts (11b). However, in pseudo-clefts (11c) and FS (11d) the contrastive focused element remains in situ.
(11)
¿Llegó Rubén?
arrive-past Rubén
Did Rubén arrive?
a. No, fue el niño quien llegó
no be-past the boy rel pron arrive-past
No, it was the boy who arrived
b. No, el niño fue quien llegó
no the boy be-past rel pron arrive-past
No, the boy was who arrived
c. No, quien llegó fue el niño
no rel pron arrive-past be-past the boy
No, the one who arrived was the boy
d. No, llegó fue el niño
no arrive-past be-past the boy
No, it was the boy who arrived
Looking at the relationship between word order and prosody, Domínguez (2004) analyzed the differences in prosodic marking of the focused element for broad focus, informational focus and contrastive focus sentences in peninsular Spanish. The data came from 5 female participants from Alicante between the ages of 30 and 40. Domínguez found that focused elements in contrastive declaratives can be located in either the left-most periphery (sentence-initial position) or in situ, that is, in sentence-final position. The position of the focused element affected pitch accent patterns, which will be detailed in the next section.
The results of Domínguez (2004) indicated that in contrastive focus sentences where the focused element moves to the left periphery, there was a difference in pitch peak height based on the grammatical role of the focused element. Traditionally, the subject in SVO and VSO word orders cannot be marked with narrow focus; however, Domínguez’s study did reveal focused subjects in SVO structures. Furthermore, if the subject was the focused element in an SVO sentence, it had a higher pitch peak than if the object was the focused element in OVS, hence revealing a prosodic effect for grammatical role of focused elements in the same sentence position. Domínguez explained that because the movement of the object already signals focus, it does not require a secondary cue; however, a focused subject would appear in a canonical word order, which would require a stronger cue for focus marking.
2.2. Prosodic Focus Marking
For this study we used nuclear pitch accents (NPA) as the prosodic correlate of focus. The NPA is the last pitch accent of an utterance, or right-most pitch accent, and is usually perceived as the most prominent (Ladd, 2008). As a stress language, Spanish uses pitch for intonational purposes. The final contour (nuclear pitch accent + final boundary tone) conveys the pragmatic meaning of an utterance, such as the difference between a statement and a question (Pierrehumbert, 1980). Within declaratives, differences between broad and narrow focus in terms of intonational phrasing have generally been studied based on the pre-nuclear pitch accent contours over the focused words. More recent intonation research notes that NPAs can also convey focus type distinctions in Spanish varieties (Prieto & Roseano, 2010). Both pre-nuclear and nuclear pitch accents will be examined in this paper, and focus distinctions are presented in the results.
The Spanish Tones and Break Indices (Sp_ToBI) system (Beckman et al., 2002; Estebas Vilaplana & Prieto, 2008) is a phonetic notation system that helps label prosodic movement over accented syllables. The ToBI system, originally developed for English, is based on the autosegmental-metrical (AM) theory (Pierrehumbert, 1980). AM contributed a way to analyze intonation on its own, separate from phonetic segments. It recognizes the phonological parts of an utterance and the importance of the utterance ending to convey meaning. In the Sp_ToBI system, monotonal accents are flat across the stressed syllable and labeled as either high (H*) or low (L*) in relation to the pitch height of the rest of the utterance. Bitonal pitch accents show rises and falls within or across the accented syllables, where for example H + L* describes a falling tone and L + H* describes a rising tone within the stressed syllable. Spanish is known to have both monotonal H* and L* accents, as well as five bitonal pitch accents: H + L*, L + H*, L* + H, L + >H*, and L + !H* (Aguilar, De-la-Mota, & Prieto, 2009). In this dataset, we only observed the pitch accents described in Table 3. The labeling criteria are described in Section 3.4.
In addition to prosodic movement, pitch peak alignment is also meaningful. Pitch peak alignment, whether it occurs within or immediately following the stressed syllable, is used to mark focus prosodically in Spanish (Face, 2002; Face & Prieto, 2006). Face (2002) found that peninsular Spanish uses two types of pitch accents: in broad (informational) focus the pitch peak occurs after the tonic syllable of the focused word, but in narrow (or contrastive) focus, the pitch peak usually occurs within the stressed syllable. Domínguez (2004) also found that in contrastive focus declaratives, the pitch peak aligns with the tonic syllable of the focused word, as in L + H*. Early peak alignment, therefore, has been shown to signal contrastive focus in Spanish.
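The alignment criterion described above can be sketched as a simple comparison of timestamps. This is only a minimal illustration, not the procedure used in the studies cited: the function name, the argument names, and the time values are hypothetical, and a real analysis would take syllable boundaries and the F0 peak from an annotated recording.

```python
def classify_peak_alignment(syll_start, syll_end, peak_time):
    """Classify an F0 peak relative to the stressed syllable.

    Returns "early" if the peak falls within the stressed syllable
    (the alignment reported for contrastive focus) and "late" if it
    falls after the syllable offset (the alignment reported for
    broad focus). All times are in seconds.
    """
    if peak_time < syll_start:
        raise ValueError("peak precedes the stressed syllable")
    return "early" if peak_time <= syll_end else "late"


# Hypothetical measurements for two tokens (assumed values).
print(classify_peak_alignment(0.42, 0.61, 0.58))
print(classify_peak_alignment(0.42, 0.61, 0.70))
```

Under these assumed values, the first token would count as early alignment and the second as late alignment.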
For Spanish, there are few comprehensive cross-dialectal examinations of intonation, including Sosa (1999), and Prieto & Roseano (2010). Sosa (1999) is the most recent of these studies to include Colombian Spanish. The work describes intonation contours for declaratives and interrogatives in the speech of educated speakers from several cities around the world, including Bogotá. Just as languages have their own intonation system, specific intonation patterns can also differ by dialect and region. For Spanish declaratives in general, the final contour tends to be descending in all dialects (Sosa, 1999).
2.3. Intonation in Colombian Spanish
Colombia’s Spanish varieties can be divided into regions based on lexicon, pronunciation and grammatical structures. Montes Giraldo (1982) makes a general distinction between coastal and central-Andean «superdialects». Lipski (1994) divided the varieties into four zones: Andean center surrounding Bogotá, Caribbean coast that includes Cartagena and Barranquilla, the Pacific coast including Chocó and Cali, and the less populated Amazon region to the east. Many studies that compare the linguistic landscape across Colombian varieties further distinguish between Bogotá and an inland paisa variety around Medellín (Ayala & Dorta Luis, 2015; Méndez Vallejo, 2015a; Velásquez Upegui, 2016).
Prosodic studies on any variety of Colombian Spanish are still quite limited. A few recent intonation studies have worked on describing intonation patterns for specific cities in these regions. Most of the studies examine interrogatives, but a few studies included declarative intonation, such as Sosa (1999) and Ham (2003) for Bogotá, and Ayala and Dorta Luis (2015) for Medellín. Figure 1 illustrates this broad focus declarative pattern found by Sosa (1999) for Bogotá, with two pre-nuclear pitch peaks (L* + H) followed by a falling final boundary tone (H* L %). Ayala and Dorta Luis (2015) found the same intonation pattern in Medellín declaratives as well. In all of those studies, only broad focus declaratives were analyzed6.
Velásquez Upegui (2016) compared the intonation patterns of vocatives in the speech of four cities, one from each of the different regions. Velásquez Upegui observed the canonical falling final boundary tone (L %) for Spanish declaratives for all Colombian regions as well. She found that although the nuclear pitch accents were rising in all the cities, the slope of the rise for Cartagena (northern coast) was considerably smaller than the steeper rises of Bogotá, Cali, and Medellín. The study found that speakers from Cali and Medellín had larger tonal ranges than those from Cartagena and Bogotá. Given these findings, we will also note any dialectal differences in our dataset in terms of intonation contours.
3. Methods
3.1. Participants
The data analyzed in this study was collected in the summer of 2015 in four Colombian cities: Barranquilla, Bogotá, Cali, and Medellín. These cities were selected based on their demographic importance and their geographical characteristics. Specifically, these are the largest urban centers in the country, with populations that range between roughly 1.4 and 8 million inhabitants (see Table 1), and they represent four distinct cultural and geographical regions: the Caribbean region, the Central region, the Southwestern region, and the Northwestern region, respectively.
A total of 40 informants (10 from each city) participated in this study. All of our participants were pre-screened with a linguistic background survey to confirm their geographic origin and to ensure that they had lived in their respective cities most of their lives.
At the time of the data collection, all participants were studying at a higher education institution: in Barranquilla, at University of Norte; in Bogotá, at Caro and Cuervo Institute; in Cali, at ICESI University and Santiago de Cali University; and in Medellín, at University of Antioquia.
Narrowing the participant group to university students allowed us not only to survey a population that shared a similar background and community of practice7, but also to maintain more control over the informants, thereby facilitating comparison across dialects. Table 1 provides an account of our participants and summarizes relevant information about each city.
City | Population | Geographic region | Place of data collection | Number of participants |
---|---|---|---|---|
Barranquilla | 1,386,865 | Caribbean | University of Norte | 10 |
Bogotá | 7,878,783 | Andean - Central | Caro and Cuervo Institute | 10 |
Cali | 2,369,821 | Andean - Southwestern | ICESI University; Santiago de Cali University | 10 |
Medellín | 2,499,080 | Andean - Northwestern | University of Antioquia | 10 |
3.2. Data Collection
Despite the fact that FS is a common structure in Colombian Spanish, it is difficult to elicit naturally occurring data with traditional elicitation methods, such as interviews or unstructured narratives. Since FS (and clefts) are focusing structures, having contextual information (i.e., explicit or implicit presuppositions) is fundamental to their production.
Having this in mind, we designed a semi-production test using the images from «Frog, Where Are You?», an illustrated short story about the adventures of a boy and his dog (Mayer, 1969).8
Before the test, participants received a tutorial to become familiarized with the characters of the story, to train them on the specific task and to prepare them to answer the questions with complete sentences. Afterwards, we formulated the questions carefully in an effort to promote the production of FS and clefts. For example, when looking at the picture in Figure 2, we asked questions that could be confirming or challenging the illustration, hence stimulating the speaker to produce the syntactic structures of interest (see possible questions for this particular picture in (12)-(17)).
Hence, the test provides participants with a particular discourse context, which allows them to produce focusing structures in a more natural way. It is important to mention that similar tests were used successfully in other studies, in which researchers needed to elicit clefts (Vanrell & Fernández-Soriano, 2018) and FS (Méndez Vallejo, 2015b).
(12) ¿Dónde está el perrito? Informative focus
Where is the puppy?
(13) ¿El perrito está en el piso? Contrastive focus: Focus in-situ
Is the puppy on the floor?
(14) ¿Es en el piso donde está el perrito? Contrastive focus: Cleft
Is it on the floor where the puppy is?
(15) ¿En el piso es donde está el perrito? Contrastive focus: Inverted cleft
Is it on the floor where the puppy is?
(16) ¿Dónde está el perrito es en el piso? Contrastive focus: Pseudo-cleft
Where is the puppy is on the floor?
(17) ¿El perrito está es en el piso? Contrastive focus: FS
Is it on the floor that the puppy is?
3.3. Syntactic Analysis
All of the recordings were orthographically transcribed and organized by city and by individual speaker. Each of the tokens included the question asked by the interviewer and the response given by the participant. The responses were then labeled for which syntactic strategy was used: in-situ focus, cleft, inverted cleft, pseudo-cleft, or FS. Table 2 shows examples from this dataset of each syntactic strategy found.
No. | Examples | Strategy |
---|---|---|
(18) | ¿Adónde se fue el niño? ¿El niño se fue para el lago? (‘Where did the boy go? Did the boy go to the lake?’) No, el niño se fue para el árbol. (‘No, the boy went towards the tree.’) | In situ focus |
(19) | ¿Quién se cayó? ¿Fue el niño el que se cayó por la ventana? (‘Who fell? Was it the boy the one who fell off the window?’) No, fue el perrito quien se cayó por la ventana. (‘No, it was the puppy who fell off the window.’) | Cleft |
(20) | ¿Quién está mirando la colmena? ¿El perro es el que está mirando la colmena? (‘Who is looking at the hive? Is the dog the one who is looking at the hive?’) Sí, el perro es el que está mirando la colmena. (‘Yes, the dog is the one who is looking at the hive.’) | Inverted cleft |
(21) | ¿Quién se cayó? ¿El que se cayó fue el perrito? (‘Who fell? Was the one who fell the puppy?’) No, el que se cayó fue el niño. (‘No, the one who fell was the boy.’) | Pseudo-cleft |
(22) | ¿A quién están buscando las abejas? ¿Las abejas están buscando es al niño? (‘Who are the bees looking for? Are the bees looking for the boy?’) No, van es a atacar al perro. (‘No, what they are doing is attacking the dog.’) | FS |
The majority of the tokens had in-situ focus and a predictable prosodic pattern, and therefore were ultimately excluded from the analysis. The cleft and inverted cleft tokens were collapsed into one «cleft» category, due to the similar position of the focused item in both utterance types.
Although FS may yield both contrastive and non-contrastive readings (Curnow and Travis, 2004; Méndez Vallejo, 2009), our data limits the present analysis to contrastive cases of FS. All of our tokens yielded a contrastive reading because the participants were asked clarification questions that presupposed the possibility of answering with a few other options. As shown in Table 2, the questions prompt the speakers to choose from a limited series of possibilities based on what they see in the images. For instance, in (19), there are only a handful of characters that could have fallen (the puppy, the boy, the frog, the groundhog, the owl, and the deer). Hence, asking who fell and whether or not it was the puppy provides a contrastive context9.
3.4. Acoustic Analysis
The pitch tracker in Praat 6.1.14 (Boersma & Weenink, 2020) was used to generate a prosodic description of the tokens. First, the pitch accents in each token (utterance) were identified with the pitch tracker using fundamental frequency (FF, formerly known as F0) measured in hertz. Pitch accents generally indicate the prominent syllables of an utterance (Ladd, 2008). While pitch accents can serve other functions, based on our data’s context, they are interpreted here as markers of prosodically focused words. The task used in this study yields contrastive focus declaratives, thus limiting the options of words that can be emphasized, though not necessarily resulting in pitch accents over the syntactically focused words. The grammatical category of the prosodically focused element was labeled in a spreadsheet and we indicated whether or not it matched the syntactically focused element of the utterance. For example, in an FS token such as No, se cayó fue el perro (‘No, the one that fell was the dog.’), the syntactic focus is on dog in response to the question, ‘Did the boy fall?’ However, the (pre-nuclear) pitch accents on the words no and cayó could have higher pitch peaks than the nuclear pitch accent over perro. We analyzed the initial yes/no reply word as a separate utterance from the rest of the answer and then noted the highest pitch peak in the main utterance. In this example, the syntactic focus occurs at the right edge, but in other cleft structures, that is not the case. We did account for downstep, or lowering of pitch, over the course of the utterance.
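The comparison between prosodic and syntactic focus described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' actual workflow: the function name, the `separate_reply` flag, and the peak values in hertz are hypothetical, and downstep is not modelled.

```python
REPLY_WORDS = {"sí", "no"}


def highest_peak_word(accented_words, separate_reply=True):
    """Return the word carrying the highest pitch peak.

    accented_words: list of (word, peak_hz) tuples in utterance order.
    separate_reply: if True, an initial yes/no reply word is treated
    as a separate utterance and left out of the comparison.
    """
    words = list(accented_words)
    if separate_reply and words and words[0][0].lower() in REPLY_WORDS:
        words = words[1:]
    if not words:
        raise ValueError("no accented words left to compare")
    return max(words, key=lambda pair: pair[1])[0]


# Hypothetical peak values (Hz) for the FS token "No, se cayó fue el perro".
token = [("no", 212.0), ("cayó", 198.0), ("fue", 181.0), ("perro", 174.0)]
syntactic_focus = "perro"

prosodic_focus = highest_peak_word(token)
print(prosodic_focus, prosodic_focus == syntactic_focus)
```

With these assumed values, the highest peak of the main utterance falls on cayó rather than on the syntactically focused perro, which is the kind of mismatch that the spreadsheet labels record.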
The pitch accents and contours over the prosodically marked words were labeled using the Sp_ToBI guidelines (Estebas Vilaplana & Prieto, 2008; Aguilar, De la Mota, & Prieto, 2009; Prieto & Roseano, 2010). Since the prosodic description of focalizing ser is the item of most interest in this paper, the pitch contour over the ser items of each token was also examined and labeled separately. The pitch contours of the prosodically focused words and ser items for each type of syntactic strategy (clefts, pseudo-clefts, and FS) were examined separately and then compared.
Table 3 illustrates the most common pitch accent contours found in the data with their phonetic label, visual schematic and brief description.
While the focus is on pitch, intensity and the duration of the focused elements and conjugated ser item in each token were also examined. However, elongation (duration) as a cue for focus marking was the least used and always as a secondary cue to pitch or intensity cues. The results section will elaborate on the few cases where duration was used.
4. Results
4.1. General Results
The dataset consisted of 962 contrastive focus declarative sentences, where 831 (86 %) were produced with syntactic focus in situ, that is, at the right periphery. See Table 2 for an example. The remaining 131 tokens (14 %) are the focus of this analysis; of these, 42 are clefts (including inverted clefts), 44 are pseudo-clefts, and 45 have FS. Two tokens were eliminated from the study for audio quality issues: one pseudo-cleft and one FS, leaving a total of 129 tokens to analyze. Table 4 shows the raw data organized by city.
Syntactic Focus Structure | Barranquilla | Bogotá | Cali | Medellín | Totals |
---|---|---|---|---|---|
Cleft | 8 | 23 | 7 | 4 | 42 |
Pseudo-cleft | 11 | 26 | 2 | 4 | 43 |
FS | 9 | 19 | 5 | 11 | 44 |
Totals | 28 | 68 | 14 | 19 | 129 |
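The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below simply re-enters the figures from the text and from Table 4; the variable names are our own and are not part of the original analysis.

```python
# Token counts per structure and city, copied from Table 4.
table4 = {
    "Cleft": {"Barranquilla": 8, "Bogotá": 23, "Cali": 7, "Medellín": 4},
    "Pseudo-cleft": {"Barranquilla": 11, "Bogotá": 26, "Cali": 2, "Medellín": 4},
    "FS": {"Barranquilla": 9, "Bogotá": 19, "Cali": 5, "Medellín": 11},
}

# Row totals (per structure) and column totals (per city).
row_totals = {name: sum(counts.values()) for name, counts in table4.items()}
city_totals = {}
for counts in table4.values():
    for city, n in counts.items():
        city_totals[city] = city_totals.get(city, 0) + n
analyzed = sum(row_totals.values())

# Shares of the full dataset reported in the text.
total_tokens, in_situ = 962, 831
in_situ_pct = round(100 * in_situ / total_tokens)
other_pct = round(100 * (total_tokens - in_situ) / total_tokens)

print(row_totals)
print(city_totals)
print(analyzed, in_situ_pct, other_pct)
```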
In this dataset, the strongest pitch accents did not usually coincide with the syntactically focused items in the utterances. The cleft structure had the highest rate of overlap: in 16 out of 42 tokens (38 %), prosodic and syntactic focus occurred together on the grammatical subject. While the syntactic focus in contrastive-focus declaratives is on the lexical item that answers the specific question posed, the perceivable prosodic prominence occurs where there is a pitch peak (and often accompanying intensity peak) in the intonational phrase.
Originally, we separated the yes/no reply word from the rest of the response in anticipation of finding pre-nuclear pitch accents over these words and then later over another word in the token. However, only about half of the utterances, regardless of syntactic structure (cleft, pseudo-cleft, or FS), exhibited a pitch accent on the yes/no word and then usually no other words had a notable pitch accent. Therefore, we are analyzing the yes/no word together with the rest of the utterance and reporting on the strongest pitch accents of the token. As shown in Figures 5 and 6, the majority of the tokens had only one pitch accent amid a fairly monotonous pitch contour.
Table 5 shows where the highest pitch accents occurred in the tokens based on structure and grammatical category. Only 15 tokens in total exhibited multiple pitch accents, with the expected downstep, or lowering, across the utterance (as in Figure 3). These cases occurred in 4-6 tokens from each syntactic focus structure and were produced by the same six speakers.
An interesting finding was that the ser item in the FS structure could receive the highest pitch peak of the utterance; it happened in six cases. The item ser was not prosodically marked in the other two structures. Table 5 summarizes the results.
Syntactic focus structure | Yes-No | Subject | Verb | Ser | Object | Totals |
---|---|---|---|---|---|---|
Cleft | 20 | 16 | 2 | 0 | 4 | 42 |
Pseudo-cleft | 24 | 5 | 7 | 0 | 7 | 43 |
FS | 23 | 4 | 4 | 6 | 7 | 44 |
Totals | 67 | 25 | 13 | 6 | 18 | 129 |
The prosodically focused items were marked with the highest pitch peak (and usually intensity peak) of the utterance as shown in Figure 3. The two most common pitch contours on the focused items for all structure types were H + L* and L + H*, in that order. The most common pattern in the data, the falling pitch (H + L*), is illustrated in Figure 3 over the word no. The line reads No, el que está encima del niño es el perrito (‘No, the one that is on top of the boy is the dog.’).
Figure 4 shows an example of a cleft token where the subject word perro (‘dog’) is focused with a rising pitch contour (L + H*), which appeared as often as the falling contour (H + L*) in the cleft structure tokens. This rising pre-nuclear pitch accent is associated with broad focus (Sosa, 1999) in Spanish, including Colombian Spanish, but the early peak we see here can be used for contrastive focus declaratives as well (see Section 2.2.). This example also illustrates a token where the focused word had both the highest pitch peak and highest intensity peak on a word that occurs in the middle of the utterance.
Table 6 below summarizes the breakdown of the pitch contour patterns over the prosodically marked words in the data.
Syntactic focus structure | H + L* | L + H* | Flat/Other | Totals |
---|---|---|---|---|
Cleft | 19 | 19 | 4 (unclear) | 42 |
Pseudo-cleft | 33 | 7 | 3 (2 H*; 1 unclear) | 43 |
FS | 29 | 10 | 5 (1 H*; 1 unclear; 3 monotone) | 44 |
Totals | 81 | 36 | 12 | 129 |
The two main pitch contours, H + L* and L + H*, account for 91 % of the dataset (117 out of 129 tokens). No patterns emerged among the remaining 9 % (12 out of 129 tokens) to explain these exceptions other than individual speaker variation, which included speakers with fairly monotone productions.
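The shares of each contour type can be recomputed directly from the counts in Table 6. The following sketch only re-enters those counts; the variable names are our own, and the per-structure percentages it prints are derived values that are not reported in the text.

```python
# Counts per structure from Table 6: (H + L*, L + H*, flat/other).
table6 = {
    "Cleft": (19, 19, 4),
    "Pseudo-cleft": (33, 7, 3),
    "FS": (29, 10, 5),
}

falling = sum(row[0] for row in table6.values())
rising = sum(row[1] for row in table6.values())
other = sum(row[2] for row in table6.values())
total = falling + rising + other

# Share of the two main contours versus the remaining tokens.
main_share = 100 * (falling + rising) / total
other_share = 100 * other / total
print(falling, rising, other, total)
print(round(main_share), round(other_share))

# Share of the falling contour within each structure.
for name, (f, r, o) in table6.items():
    print(name, round(100 * f / (f + r + o)))
```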
There were a few more interesting cases where none of the prosodic cues aligned with the syntactic focus, such as the example illustrated in Figure 5: No, llevaba fue una rana en la mano (‘No, (he) was carrying a frog in his hand.’). This is an FS token produced by a male speaker from Medellín, where the pitch peak of the utterance is on the conjugated ser item fue and the intensity peak is over the verb llevaba. Neither of the verbs carries syntactic focus, since the token was in response to the question set: ‘What was the boy carrying? Was he carrying a dog on his head?’ The syntactic focus for this example is the object rana (‘frog’) (and mano (‘hand’) for part 2). Auditorily, llevaba is the most prosodically prominent word in this utterance. This means that either the intensity cue is stronger than the pitch cue, or the rising pitch accent on llevaba marks prosodic prominence even though its peak occurs after the stressed syllable, which is the norm for broad focus declaratives but was unexpected here.
We initially considered all three lexical stress indicators to determine a prosodic description of the FS tokens: pitch, intensity, and duration. Pitch and intensity were the primary prosodic cues, and duration only seemed to be used as a secondary cue in a few cases, mainly over the words sí or no at the beginning of the utterance. However, the data did reveal that ser in FS tokens provides an optional pause point for the speakers where they can elongate the word as they think about the rest of their response. This dataset contains four such cases, all in the token type fue, with durations up to 489 milliseconds, or 17 % of the utterance duration (as shown in Figure 6, produced by a male speaker from Barranquilla). For one of the speakers who produced one of the elongated tokens, the duration of his other five FS fue tokens averaged 190 milliseconds, or 6 % of the utterance duration, making the difference to the elongated token considerable.
Although the data was more limited for Medellín and Cali than Bogotá and Barranquilla, there did not appear to be any significant differences in the intonation patterns of each structure based on the speakers’ city of origin. This is not necessarily surprising given that all three of the focus structures analyzed here are less-common alternatives to the default in situ structure. Perhaps due to the lower frequency or salience of the structure, any dialectal pitch accent differences are leveled out.
4.2. Results for ‘ser’ Items across Structures
The FS tokens showed more prosodic variation overall than the cleft and pseudo-cleft structure tokens. Of particular interest to this paper was the finding that the ser item in the FS structures could also be marked prosodically with a pre-nuclear pitch accent. These ser items are not the main verbs in the utterances and could never be the syntactically focused words. Yet, 17 out of 44 (39 %) of FS tokens exhibited a pitch accent over the ser item, as opposed to 6 out of 43 (14 %) of pseudo-clefts and 0 out of 42 of the cleft tokens. The differences will be explained below. In 12 of the 44 FS tokens (27 %), there was a pre-nuclear pitch accent over the conjugated ser item (see Appendix). Within these twelve, there were four tokens where the ser item had the highest pitch peak of the utterance, thus receiving the prosodic and auditory focus of the utterance. In all four of those tokens, the ser item was conjugated in the past tense, fue. In the other eight FS tokens with a pitch peak over ser, there was another word with an equal or slightly higher pitch peak (as measured in Hz), usually the initial no.
No immediate patterns emerged to separate the twelve marked tokens from the rest of the FS tokens. The twelve FS tokens with a pitch peak over ser came from the data of male and female speakers from all four cities. Due to the nature of the task, there were many instances of the exact same sentence (produced by different speakers) with different intonation patterns. Therefore, the marked tokens were not lexically or structurally distinctive. The preferred pitch contour over these ser items was H + L*, the most common contour overall, with only one exception, which had a flat high (H*) pitch accent (see Appendix for an elaborated description of the twelve marked tokens).
For the cleft tokens, either the initial word no or the grammatical subject was prosodically marked with a pitch accent (see Table 5). The highest pitch peak of the utterance usually aligned with the stressed syllable of the focused content word in the phrase, such as the second syllable of ranita (‘baby frog’) in La ranita fue la que se perdió (‘The baby frog was the one that got lost’). Only one cleft token had a pitch accent over the ser item (though not the highest pitch peak of the utterance).
In the pseudo-cleft tokens, the prosodically marked element was usually the initial word no in the sentence, as in No, el que está nadando es el perrito (‘No, the one swimming is the puppy’). Of the 43 pseudo-cleft tokens, the conjugated verb ser was marked prosodically with a pitch peak in 3 cases (though not the highest peak of the utterance) and with the highest intensity peak of the utterance in 4 cases. Both the pitch and intensity peaks fell on ser in only one case, for a total of 6 pseudo-cleft tokens where the ser item was prosodically marked. A few of the no items also had longer than average durations, meaning that duration was used as a secondary cue to give the no word more prominence. Incidentally, elongation (the duration cue) was not used on any of the ser items in this focus structure. Auditorily, none of the ser items in pseudo-cleft tokens was the most prominent word of the utterance.
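The total of six marked pseudo-cleft tokens follows from counting the pitch-marked and intensity-marked cases while subtracting the one token that carries both cues. A minimal sketch of that count, using only the figures reported in this section (the function name `marked_tokens` is a hypothetical helper, not part of the original analysis):

```python
def marked_tokens(pitch_marked, intensity_marked, both):
    """Tokens marked by pitch or intensity, counting overlapping tokens once."""
    return pitch_marked + intensity_marked - both


# Pseudo-cleft figures reported in the text: 3 pitch peaks, 4 intensity peaks,
# 1 token with both cues, out of 43 pseudo-cleft tokens.
pseudo_cleft_marked = marked_tokens(pitch_marked=3, intensity_marked=4, both=1)
pseudo_cleft_share = round(100 * pseudo_cleft_marked / 43)

# FS figure reported in the text: 17 marked ser items out of 44 FS tokens.
fs_share = round(100 * 17 / 44)

print("Pseudo-cleft ser items marked:", pseudo_cleft_marked, f"({pseudo_cleft_share} %)")
print("FS ser items marked:", 17, f"({fs_share} %)")
```

Counting the overlapping token only once is what reconciles the separate pitch and intensity counts with the 6 out of 43 (14 %) figure reported for pseudo-clefts.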
The distribution of data by city did not allow for testing the influence of the participants’ city of origin on prosodic focus marking. However, there were a few tendencies in the FS tokens worth mentioning. Medellín speakers produced only falling pitch contours over the ser item of FS tokens. Meanwhile, Cali speakers produced both rises (2 L + H*) and falls (3 H + L*), and Bogotá speakers produced four different patterns (rises, falls, flat highs, flat lows) in their 19 FS tokens with an almost equal distribution. By contrast, Barranquilla speakers produced two contours over ser: 3 flat highs (H*) and 6 falling contours.
5. Discussion and Conclusion
Using data from 40 Colombian speakers, we examined three specific focus structures: cleft, pseudo-cleft and focalizing ser. Previous studies (Domínguez, 2004; Vanrell & Fernández-Soriano, 2018) have found that focus in Spanish is achieved by prosody and syntax together. If this is interpreted to mean that the prosodic and syntactic focus occur over the same lexical item, then our results would seem to contradict this finding. However, our interpretation is more global; focus is interpreted using cues from both prosody and syntax, though these need not occur over the same word. The results of this study showed that the same lexical items do not necessarily receive both the prosodic and syntactic focus, as in the example shown in Figure 5.
Our elicitation task allowed us to analyze contrastive focus declaratives, where the syntactic focus is primarily on the subject of the sentences, though prosodic focus in this data rarely fell on the grammatical subjects. We analyzed the prosodic patterns of several syntactic focus structures, including the FS structure, as well as the intonation patterns of the focused words and of the ser item by using the Sp_ToBI labeling system (Estebas Vilaplana & Prieto, 2008). In tokens from all the focus structures, we found that despite syntactic focus, the prosodic focus usually aligned with the initial word yes or no in the responses. Based on our task and specific questions, the no response was produced more often.
In this study we set out to answer two main research questions: (1) What is the prosodic description of FS? and (2) What are the prosodic focus marking differences between the FS structure and other types of focus structures? After the inquiry, we found that the ser in FS can be prosodically marked with a pre-nuclear pitch accent. Intensity peaks can also be used to mark the ser item in FS, leading to an auditory prominence. We found that by contrast, the ser items in cleft structures are not generally emphasized prosodically (pitch, intensity, or duration). While the pseudo-cleft tokens technically did use pitch and intensity cues over a few ser items (only six tokens), these pitch and intensity movements were more minor than those of the FS tokens, in that the peak heights were smaller and/or the ser item was not the only element focused in these six cases. In the pseudo-cleft tokens where the ser item received a minor marking, the other focused element received at least two markings (a pitch accent, intensity peak, and/or longer duration). This confirms our original hypothesis that there are prosodic marking differences between FS and the other syntactic strategies used by the same speakers.
Although ser in FS may be considered a discourse or emphatic marker (Méndez Vallejo, 2009; Pato, 2010), it still behaves like a content word in terms of syntax and prosody. Syntactically, ser in the FS structure retains its morphological features, as it may be conjugated to match the main verb of the sentence and the focused element. For instance, in (23)-(24), ser matches the tense of the main verb and the person and number of the focused element.
(23) Salí fui yo
leave-pret-1sg be-pret-1sg I
The one who left was I
(24) Salía era yo
leave-imp-1sg be-imp-1sg I
The one who used to leave was I
Although this agreement pattern holds for most of the tokens found in this study, it is important to mention that there are a few cases where speakers choose the default form of ser (as in fue). As exemplified in Figures 5 and 6, fue is the selected form and no agreement is established between ser and the main verb (25), or between ser and the focused element (26). This indicates that fue has indeed become the most common form of the verb, and it suggests a certain degree of grammaticalization of the FS structure. In the case of (26), the prosodic elongation of fue may also be explained by the fact that the speaker needed time to think about the next segment in the sentence, with fue serving as the default form of ser.
(25) No, llevaba fue una rana en la mano
no carry-imp-3sg be-pret-3sg a frog in the hand
No, it was a frog that he was carrying in his hand
(26) No, encontraron fue dos ranas
no find-pret-3pl be-pret-3sg two frogs
No, it was two frogs that they found
Prosodically, the analysis that we present in this paper shows that ser in the FS structure can receive prosodic focus marking. Furthermore, the prosodic patterns that we find in the data seem to confirm the idea that ser serves as a resting spot in speech where speakers can elongate it as they think about the rest of the utterance. This brings to mind Pato’s (2010) observation that ser in the FS structure functions as a discourse link, without adding anything new to the meaning of the sentence. In this sense, ser behaves as an equative verb, a bridge between topic (old information) and focus (new information), which allows speakers to process and convey information.
In this study, all of the tokens correspond to contrastive environments. Due to the nature of the structure, the tokens came from an elicited speech task. A future study could replicate the analysis with broad focus declaratives and/or semi-spontaneous speech in order to examine the syntactic-prosodic interface for FS compared to cleft structures in other discourse contexts. The dataset was also smaller for both Medellín and Cali, and thus future work could benefit from adding data from these cities and possibly from other dialectal regions.