They are in danger of landing on one horn of the social science dilemma (the one where your result conforms to expectations “Thanks for nothing, prof, you wasted your time and our money proving what we already knew”).
(The other horn: your result diverges from expectations “typical BS written by out of touch boffins who have no common sense”.)
Thanks for posting. It's interesting to see the results, their discussion/interpretation less so. I've only had a quick/superficial read but won't let that stop me! I appreciate their efforts, I learnt something. Is this a precursor to LLM-produced flamenco?
Some of their prior assumptions about criteria for palos are faulty, i.e. not recognising that it is primarily the cante melody that defines an estilo or palo, not the letra. Just because some letras are more commonly found in some palos doesn't mean that they define the palo.
"Criteria adopted to define palos are rhythmic patterns, chord progressions, lyrics and their poetic structure, and geographical origin"
Their interpretations of the results are perhaps naive. And over-reaching - but they have to try to maximise the worth of their efforts - understandable academic behaviour. (I deal more with physiology/medical research where it's usual to see researchers searching for significance in secondary outcome measures.)
" For example, the majority of confusions of bulerias (39%) are with solea. The same confusion may be seen with 40% of tientos being confused with tangos, or with 53% of malaguenas being confused with fandangos. These particular confusions are enlightening.." "Remarkably, without any knowledge about the musical features or historical origin of the different genres, the algorithm can discern their relationships considering only their lexical content." Yes because letra are interchangeable!
Here they landed firmly on horn one: (“Thanks for nothing, prof, you wasted your time and our money proving what we already knew”).
"... we have identified insightful differences between Flamenco styles. While alegrias are mainly festive and referent to Cadiz, seguiriyas cover deep concerns and embody gipsy terms, while bulerıas covers a wide range of topics and vocabulary maybe due to its use as an instrument of anmmenization and celebration."
Their interpretations of the results are perhaps naive
In their defense, they did use something called the Multinomial Naive Bayes (MNB) model.
OK, while I skimmed it and could be wrong because I missed something, I have some concerns (in addition to the foundational flaw of classifying according to letras differences when letras are known to be used almost randomly and are not definitional to palos):
1. Why infer richness of vocabulary per palo from total words/types when the sizes of the palo letra buckets are very different due to the nature of the dataset, with bulerias having at least 3 times as many letras as alegrias and malaguenas? Compare Fig. 1 with Fig. 2; the only interesting thing to note there is that M and A have about the same "volume" of words but vastly different vocabulary "richness". However, these two also have barely 150 instances in the dataset, so who knows, with so few data points, whether what is in the dataset is representative.
2. Wouldn't finding apparent differences (as they did) between palos be expected just because of the small sample sizes (statistically speaking)? I.e. it is just an artefact of using a relatively small number (and smaller still for some palos) of palo+letra instances from different years. What would have been extraordinary is to find no difference! One way to check for something like this is, say, to randomly divide the bulerias into three equal (or unequal) buckets, then run the same analysis and plot the same thing as if these were potentially different palo buckets. I strongly suspect they would get similar "meaningful" differences. If so, what would that say about their results and interpretations?
3. Their predictions (attempts at classification based on letras) are based on only 15% of the original dataset (since they used 85% for training). It is not clear (they don't say, I think) whether they partition the 85/15 per palo, but it would make sense. Therefore, they are down to about 1/7 of the original number of letras per palo: about 60 bulerias, soleas, etc., down to about 20 alegrias, malaguenas, tientos, etc. for the predictions/results.
Combine all this, and it seems like data artefact results to me.
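The random-bucket check suggested in point 2 could be sketched roughly as below. This is only an illustration, not what the authors did: the function names (`type_token_stats`, `random_buckets`, `top_words`) are made up for this post, words are split naively on whitespace, and the demo corpus is synthetic placeholder text rather than real letras.

```python
import random
from collections import Counter


def type_token_stats(letras):
    """Return (total words, distinct words) for a list of letra strings."""
    words = [w for letra in letras for w in letra.lower().split()]
    return len(words), len(set(words))


def random_buckets(letras, k, seed=0):
    """Shuffle the letras and deal them into k pseudo-palo buckets."""
    rng = random.Random(seed)
    shuffled = list(letras)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def top_words(letras, n=5):
    """Return the n most frequent words in a list of letra strings."""
    counts = Counter(w for letra in letras for w in letra.lower().split())
    return [w for w, _ in counts.most_common(n)]


if __name__ == "__main__":
    # Synthetic stand-in for one palo's letras (hypothetical data):
    # 450 "letras" of 12 tokens each, drawn from a 300-token vocabulary.
    rng = random.Random(1)
    vocab = [f"w{i}" for i in range(300)]
    corpus = [" ".join(rng.choices(vocab, k=12)) for _ in range(450)]

    # All three buckets come from the same source, so any differences
    # between them are pure sampling noise.
    for i, bucket in enumerate(random_buckets(corpus, 3)):
        total, distinct = type_token_stats(bucket)
        print(i, len(bucket), total, distinct, top_words(bucket))
```

If buckets drawn from a single palo already show differences in "richness" or top words comparable to those reported between palos, that would suggest the between-palo differences are not distinguishable from noise at these sample sizes.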
Yes, it looks quite pointless. What they needed to realize first is that the letras can be divided mathematically, ignoring traditional applications. By that I mean the 8-syllable, 4-line verses for Romances, Solea, Tona/martinete etc., tiento tango, and all of the Fandango family are interchangeable. Next come the 3- and 5-line verses of those 8 syllables, including a subset for the zero verse in the 3-line family, as covering Solea/Buleria/Tango etc. The 5-line verse is Fandango-family specific. (The formal structure allows both 4 and 5 lines for the same styles, whereas the Solea derivatives change the form via the line number and delivery, which strongly separates the two families. It is interesting to note that there will be examples of a 4-line verse interchanging between the fandango and Solea families, and we can also find 5-line alterations to a 4-line standard, where we can infer from the poem itself that the verse was adapted for use with a fandango of some sort; I saw this in the Borrow vs Demofilo examples.) The final math category will be the seguidilla/serrana, the 5+7 family.
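As a rough sketch of that first "math" pass, something like the following could sort verses into metric families before any thematic analysis. It assumes the syllable count of each line has already been worked out by hand (automatic Spanish syllable counting is not attempted here), and the function name `metric_family` and the label strings are invented for illustration. Treating any verse whose lines are all 5 or 7 syllables as the seguidilla/serrana family is also a simplifying assumption.

```python
def metric_family(syllables):
    """Classify a verse by its metric shape.

    syllables: list of per-line syllable counts, e.g. [8, 8, 8, 8].
    Returns a descriptive label (hypothetical names, for illustration).
    """
    n = len(syllables)
    if n > 0 and all(s == 8 for s in syllables):
        if n == 4:
            # Shared by Romances, Solea, Tona/martinete, tiento tango, Fandangos
            return "octosyllabic 4-line"
        if n == 3:
            # Solea/Buleria/Tango side
            return "octosyllabic 3-line"
        if n == 5:
            # Fandango family
            return "octosyllabic 5-line"
    if n > 0 and set(syllables) == {5, 7}:
        return "seguidilla/serrana (5+7)"
    return "unclassified"


if __name__ == "__main__":
    for shape in ([8, 8, 8, 8], [8, 8, 8], [8, 8, 8, 8, 8], [7, 5, 7, 5]):
        print(shape, "->", metric_family(shape))
```

Only after grouping letras this way would it make sense to compare vocabulary across palos, since verses in the same metric family can move freely between them.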
Once that math gets clarified then they could in theory run a bunch of letras that combine with Demofilo and Borrow to "modern" ones and try to find old collections of poetry that maps to those models. I suspect that a collection might have once existed that constitutes a "source" from either an individual or a collective. I lean in the direction of Renaissance humanist poets, but have not located a published version that matches what I envision. There was a collection of Romances by Gongora that were interesting to me, and as mentioned earlier the Briseño collection has interesting overlap with our flamenco (namely the cielito lindo seguidilla/serrana), but what I am envisioning is a specific group of poems in a singular collection, more like hand written (MS) rather than published. And more specifically, the Borrow reference of "Los del Aficion" once collected by Luis Lobo (?), from late 1780s (50 years prior to Borrow's encounter with the lottery salesman that memorized a bunch of it).
It would be interesting to see what AI might come up with someday. The interchangeability of the letras between palos (contrafaction) negates the thematic correlations in general, and the math of the seguidilla also negates thematic "correct assignments" of those (because we don't see Solea letras adapted to siguiriyas unless the poetic math itself gets altered, which is harder to do than in the 4 vs 5 line cases, where altering just means adding a line of verse that may or may not rhyme).
In terms of melody, my own brain ran the algorithm over 20 some years and it allowed me to stumble on what I think is the origin of the palos musically (melody and harmony, and loosely, the compas phrasing as well). AI should be able to do the same thing but you have to feed it the better transcriptions and versions accepted by aficionados as interpretations considered "puro". That turns out to be the more difficult challenge. As an example I provided some correlations to Ocon in that thread about Arabic cante.
The interchangeability of the letras between palos (contrafaction)
I still don't understand why the practice became what it is. Contrafaction is one thing (substituting another text/story into a preexisting melody/song), but don't flamencos go one step more restrictive and do this to a single 4-line (or 3-, or 5-line) verse? Meaning no story is longer than a single 4-liner (which also implies that the story has to be complete within the 32 or so syllables). Like a real-life early version of Twitter, but with 32 syllables instead of 140 characters. Why not tell a story spanning several 4-liners? It can still be contrafaction, not at the single 4-liner level but, rather, en banc for as many 4-liners as the original song had.
Not to overlook everything else you said in your reply, which I find very interesting.
Meaning no story is longer than a single 4-liner (which also implies that the story has to be complete within the 32 or so syllables). Like a real-life early version of Twitter, but with 32 syllables instead of 140 characters. Why not tell a story spanning several 4-liners? It can still be contrafaction, not at the single 4-liner level but, rather, en banc for as many 4-liners as the original song had.
Well, they DO do this: they are the "Romances o Corridas", and there is a whole episode on these examples in Rito y Geografia. Estébanez Calderón included an entire performance of one by Planeta (it even uses the word "Soledad" in one tercio). Understanding these as melody driven, what I am seeing are two things. First, the melody creates the formal structure, and this structure may repeat with each new verse; we call that melody a "style", and Norman's site is dedicated to recognizing the "style" of cante being used. The second issue is that this formal structure seems to be understood as singing with no guitar accompaniment; however, now recognizing the role of the vihuela/guitarra, specifically as used for formal structures once called "Romances" by Andalusian musicians in the 16th century and even earlier, the story of "cantes sin guitarra" needs revision IMO.
So these forms are like epic poems that can go on and on with the same story and a very repetitive melody (think Hotel California or American Pie). However, the other palos follow the formal structure (let me use the term "Motet", as this gets at the heart of the formal structure as a general concept, with a villancico, madrigal, or Romance as a specific type of motet), yet use unrelated poetry. As a concrete example from Castro Buendia: he has noted the motet of Pisador called an "endecha", I believe a funeral lament, and his formal structure seems similar to or the same as one used by Fuenllana, yet with completely different lyric sets. Castro has also correlated a specific line of verse (tercio) to the Polo Tobalo as he transcribed it from Pepe de La Matrona (others sing the same style, so I am in agreement with the correlation; read pp. 744-6 of his big dissertation).
So the plug and chug lyrics into a song form are an old concept, and the motet idea of "voices" coming together to create the exotic harmony ("chords") that follows the cante melody as a driving force delivery for poetry, is the basis of the palos. I can envision AI crunching the data to come up with some close fits of the poetic structure that has evolved over time by changing one or two words here or there, but otherwise will point to an origin (collection or collation) of the cante. Some clear Tabs or chord charts might go with the set (something like Briseño but with our specific palo progressions).