Lexicon 語 · 02

The Origin of the Vocabulary

Portuguese is, in essence, Latin transformed by time. Onto that inherited core successive layers were laid — pre-Roman, Germanic, Arabic and modern.

The Portuguese lexicon is, in its overwhelming majority, Latin transformed by time. Roughly four-fifths of the vocabulary is reckoned to go back, directly or indirectly, to the language of Rome. Onto that inherited core, across more than two millennia, successive layers of words from other sources came to settle — some older than Romanisation, others brought by invaders, merchants and, later, by the maritime expansion itself. To trace the origin of the vocabulary is, in good measure, to read that stratigraphy.

The inherited core

The inherited (or popular) words are those that reached Portuguese through unbroken oral transmission from spoken Latin, undergoing every regular sound change of the language. They form the oldest and most central stratum: the names of the body, the house, kinship, the numbers and the commonest verbs. They are recognisable by characteristic developments — the loss of intervocalic consonants, the voicing of voiceless stops, the palatalisation of the clusters pl-, cl-, fl-.

Some regular developments from Latin to Portuguese
| Latin | Portuguese | Change |
| --- | --- | --- |
| LŪNA | *lua* (moon) | loss of intervocalic -n- |
| DOLŌRE | *dor* (pain) | loss of intervocalic -l- |
| LUPU | *lobo* (wolf) | voicing -p- > -b- |
| PLĒNU | *cheio* (full) | *pl-* > *ch-* [ʃ] |
| NOCTE | *noite* (night) | *-ct-* > *-it-* |
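The developments listed above behave like ordered rewrite rules, and a small Python sketch can make that concrete. It is illustrative only: the rule list, the `apply_rules` helper and the macron-free lower-case spellings (`luna` for LŪNA) are assumptions introduced here, not a standard notation. Only the five changes named above are modelled, so for etyma that also underwent vowel changes (DOLŌRE, LUPU, PLĒNU) the output is an intermediate string, not the modern word.

```python
import re

VOWELS = "aeiou"

# (label, regex pattern, replacement), applied in this order.
# Input is a macron-free spelling of the Latin etymon.
RULES = [
    ("pl- > ch-", r"^pl", "ch"),
    ("-ct- > -it-", r"ct", "it"),
    ("loss of intervocalic -n-", rf"(?<=[{VOWELS}])n(?=[{VOWELS}])", ""),
    ("loss of intervocalic -l-", rf"(?<=[{VOWELS}])l(?=[{VOWELS}])", ""),
    ("voicing -p- > -b-", rf"(?<=[{VOWELS}])p(?=[{VOWELS}])", "b"),
]


def apply_rules(etymon):
    """Apply each rule once, in order; return (form, labels of rules that fired)."""
    form = etymon.lower()
    applied = []
    for label, pattern, replacement in RULES:
        changed = re.sub(pattern, replacement, form)
        if changed != form:
            applied.append(label)
            form = changed
    return form, applied


if __name__ == "__main__":
    for etymon in ["luna", "dolore", "lupu", "plenu", "nocte"]:
        form, applied = apply_rules(etymon)
        print(f"{etymon.upper():8} -> {form:8} {applied}")
```

For `luna` and `nocte` the sketch reaches the modern spellings *lua* and *noite* directly. A fuller model would need further rules (final vowels, hiatus resolution) that the list above does not cover.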

Learned words and divergent doublets

Not all Latin-derived vocabulary is inherited. From the Middle Ages on, and above all in the Renaissance, Portuguese drew thousands of words from Classical Latin by the learned route — the cultismos — which entered through writing and so escaped the popular sound changes. When a single Latin word gave rise at once to an inherited form and a learned form, the result is a doublet, or divergent pair: the two coexist today with distinct meanings.

PLĒNU → *cheio* (inherited) and *pleno* (learned) · ARTICULU → *artelho* (ankle) and *artigo* (article) · CLĀVICULA → *chavelha* (peg) and *clavícula* (collarbone)

The popular form evolved over centuries; the learned one was taken up late, almost intact, from written Latin.

Alongside the Latinisms, Greek supplied — almost always through Latin and the modern languages — the bulk of the scientific and technical vocabulary (filosofia, democracia, telefone), in a process that remains productive.

The pre-Latin and Germanic layers

Before Rome, pre-Roman languages were spoken in the western strip of the Peninsula: Celtic, and tongues such as Lusitanian. From them survives a faint but real substratum: words like barro (clay), carro (cart), cerveja (beer), gancho (hook) or lousa (slate) are credited to that base, while Basque seems to have left esquerda (left). The fall of the Empire brought the rule of the Germanic peoples, chiefly Suevi and Visigoths. Their contribution, the Germanic superstratum, clusters around warfare, craft and naming: guerra (war), elmo (helmet), espora (spur), roubar (to rob), ganhar (to win), branco (white), and given names now wholly ordinary, such as Fernando, Rodrigo or Álvaro.

The Arabic legacy

The Islamic presence in the Peninsula, from 711 onward, left the most visible of the post-Latin layers. About a thousand Portuguese words are reckoned to be of Arabic origin, many recognisable by the agglutinated article al-, whose l is often assimilated to the following consonant (as in azeite, açúcar, arroz). They span agriculture and irrigation, administration, commerce and the sciences: azeite (olive oil), açúcar (sugar), alface (lettuce), arroz (rice), alfândega (customs house), armazém (warehouse), álgebra, alfaiate (tailor). Even an everyday interjection, oxalá (from in šā’ Allāh, “God willing”), is an Arabism.

*oxalá* venha a chover esta semana

[ɔʃɐˈla]

“Oxalá”, from Arabic in šā’ Allāh (“God willing”), is perhaps the Arabism most alive in daily speech; the example reads ‘let's hope it rains this week’.

The modern layers

The maritime expansion, from the 15th century, opened the lexicon to languages of three continents. From Tupi and other Amerindian languages came abacaxi (pineapple), mandioca (cassava), caju (cashew); from the Bantu languages of Africa, caçula (youngest child), cafuné (a caress of the hair), moleque (boy, street kid); from the languages of Asia, chá (tea), biombo (folding screen), catana (cutlass, machete). In modern times, French (abajur, garagem, elite) and, especially over the past century, English (futebol, líder, software) have become the great sources of borrowing.

This overlay of origins (Latin at the base, Arabic and Germanic at the margins, global at the surface) is what gives Portuguese its particular texture. The following sections of this chapter examine each layer in close-up.

Sources

  1. José Pedro Machado. Dicionário Etimológico da Língua Portuguesa. Livros Horizonte (1952)
  2. Paul Teyssier. História da Língua Portuguesa. Sá da Costa (1980)
  3. Celso Cunha & Lindley Cintra. Nova Gramática do Português Contemporâneo. Edições João Sá da Costa (1984)