Introduction and Explanation
It is often forgotten that (dictionaries) are artificial
repositories, put together well after the languages they
define. The roots of language are irrational and of a
magical nature.
-Jorge Luis Borges, Prologue to "El otro, el mismo."
Etymologies are not definitions; they're explanations of what our words meant 600 or 2,000 years ago. Think of it as looking at pictures of your friends' parents when they were your age. People will continue to use words as they will, finding wider meanings for old words and coining new ones to fit new situations. In fact, this list is a testimony to that process.
These are histories of words only, not things or ideas. The modern word for something might have replaced old, forgotten words for the same object or concept. (Where possible, I've tried to indicate that.)
ablative, the Latin case of adverbial relation, typically expressing the notion "away from," or the source or place of an action.
Abnaki, Algonquian language of northern New England and eastern Canada.
accusative, typically the case of the direct object, but also sometimes denoting "motion towards." Nouns and adjectives in French, Spanish, and Italian, languages from which English borrowed heavily, generally were formed from the accusative case of a Latin word.
adj., "adjective." A word used to qualify, limit, or define a noun or noun-like part of speech.
adv., "adverb." One of the indeclinable parts of speech, so called from being ordinarily joined to verbs for the purpose of limiting or extending their signification, but used also to qualify adjectives and other adverbs.
Afrikaans, Germanic language of South Africa, an offshoot of Dutch, also known as South African Dutch.
agent noun, form expressing the notion "doer of action." Hunter is an agent noun, and -er is an agentive suffix.
Akkadian, Semitic language group spoken in ancient Mesopotamia, including Babylon and Assyria.
Algonquian, widespread group of North American native languages.
American English, the English language as spoken and written in the United States of America.
American Spanish, the Spanish language as spoken and written in the New World.
Anglian, the Old English dialect of the Angles; the dialect of Old English spoken in the Anglo-Saxon kingdom of East Anglia.
Anglo-French, the French written in England from the Norman Conquest (1066) through the Middle Ages; the administrative and legal language of England 12c.-17c. Derived from the French of Normandy, which was influenced by Germanic.
Anglo-Latin, the form of Medieval Latin used in England during the Middle English period.
Anglo-Norman, the dialect of Anglo-French spoken by the Norman settlers (French-speaking descendants of Scandinavians who settled in Normandy in the 9c.) in England after the Conquest (1066). For purposes of this site, essentially the same as Anglo-French.
aphetic, a form of a word altered by loss of a short, unaccented vowel at the beginning (such as squire from esquire).
Arabic, the Semitic language of the Arabs and the language of Islam.
Aramaic, Semitic language of the Middle East. Aramaic became the lingua franca of the Assyrian empire and later for centuries was the official language of the Persian kingdom and the daily language of Israel at the time of Christ.
Arawakan, native language family of northern South America and the West Indies.
Armenian, the Indo-European language of Armenia.
Assyrian, Akkadian dialect spoken in the empire that flourished on the Tigris River 7c. B.C.E.
asterisk (*): Words beginning with an asterisk are not attested in any written source. Some have been reconstructed by etymological analysis, such as Proto-Indo-European *ped-, the root of words for "foot" in most of its daughter tongues. In other cases they are hypothetical words or forms of words that might have, but didn't, come into use in a modern language (Modern English *astronomian, if Middle English astronomyen had survived). Or they are presumed forms in ancient languages of words that are attested only in oblique or derived forms.
Attic, ancient Greek dialect of Athens and its region.
augmentative, expressing increase in the force of the idea conveyed. It is used as a noun and an adjective in linguistics, and can be applied to a whole word or to an affix. Opposite of diminutive. Medallion is from an Italian augmentative of medal, so "a large medal." Chariot is from a French augmentative of the word that became English car. A squadron is a large squad.
back-formation, the process by which an apparently complex word is erroneously split up and a new, simple form produced from it (burgle is a back formation of burglar).
base, the unanalysable element which is the kernel of a word’s structure.
Basque, non-Indo-European language of the Basque people.
Breton, Celtic language of Brittany, the last surviving Celtic language in continental Europe.
c. "century," when following a number (16c.); "circa" when preceding one (c. 1500).
Cantonese, type of Chinese spoken in the southeastern province of Canton (Guangdong), where Europeans and Chinese often interacted in early modern times.
Carib, South American native language spoken in northern South America and parts of Central America.
Catalan, Romance language of Catalonia and some nearby places; closely related to Provencal.
causative, a form of a verb expressing the notion "cause X to Y." The en- in enrich is a causative prefix.
Caxton, William Caxton (d. 1491), the first English printer, responsible for a number of spelling changes.
Celtic, Indo-European language branch that includes Irish, Scottish Gaelic, Welsh, and Breton. Also the language spoken by the ancestral group during the presumed period of unity.
cf. abbreviation of Latin confer "compare." In other words, "see the following entry for more information."
Cherokee, native Iroquoian language of the southern Appalachians.
Chinook, native language of a North American people who lived along the Columbia River; also Chinook jargon, a lingua franca of the Pacific Northwest made up of English, French and native elements.
Church Latin, Late Latin as used in Christian ritual, discipline, and theology.
cognate (adj.), having the same ancestor. As a noun, a word that has a common ancestor with another.
combining, the form of a word when it combines with other words.
comparative, the second degree of comparison of an adjective or adverb. Longer is the comparative of long.
Coptic, Hamitic language descended from ancient Egyptian, now confined to liturgical use.
conj., "conjunction." A connective particle serving to unite clauses of a sentence or coordinate words in a clause or sentence (and, but, or, etc.).
Cornish, Celtic language spoken until late 18c. in Cornwall.
Coverdale, Myles (or Miles) Coverdale (c.1488–1569), translator who produced the first complete printed translation of the Bible into English.
Croatian, Serbo-Croatian as spoken in Croatia, and written in the Latin alphabet.
Czech, West Slavic language of old Bohemia (modern Czech Republic).
Danish, North Germanic language spoken in Denmark.
DAS "Dictionary of American Slang," by Harold Wentworth and Stuart Berg Flexner, published 1960, revised four times since.
dative, typically the case of the indirect object, but sometimes also denoting "motion toward." In old Germanic languages, the "fourth case," catch-all for I.E. dative, ablative, locative and other cases.
diminutive, a form of a word used to express smallness, as ringlet is the diminutive of ring. The principal English diminutive suffixes are -et, -kin, -ling, -ock, -in, -y or -ie. In many words they have lost their force and no longer are felt as such.
dissimilation, the process by which a word with a repeated sound changes one of the two; Latin peregrinus became French pelerin ("pilgrim") by dissimilation.
Dravidian, non-Indo-European language family centered in southern India, including Tamil and Telugu.
Dutch, West Germanic language spoken in the Netherlands, descended from the Low German dialects of the Franks and Saxons.
Ecclesiastical Greek, Greek as used by the early Christians.
echoic, indicating a word that sounds like what it means.
East Frisian, variant of Frisian spoken on the islands off the North Sea coast of Germany.
e.g. abbreviation of Latin exempli gratia "for the sake of example."
Egyptian, Afroasiatic (Hamitic) language spoken in ancient Egypt.
English, West Germanic language spoken in England after c. 450; after c. 1000 heavily influenced by French and somewhat by Scandinavian.
Estonian, Finno-Ugric (non-Indo-European) language of Estonia.
Etruscan, language spoken by an ancient people of what is now Tuscany, neighbors of the early Romans; the origin of their language is uncertain.
Faeroese, Scandinavian language of the Faeroe Islands.
fem., "feminine," the grammatical gender in highly inflected Indo-European languages that denotes females and is used with many other words to which no distinction of sex is apparent.
Finnish, Finno-Ugric (non-Indo-European) language of Finland.
Flemish, West Germanic dialect spoken in Flanders, generally regarded as the Belgian variant of Dutch rather than as a separate tongue.
Frankish, West Germanic language of the Franks, inhabitants of northern Gaul 5c.-6c. Their descendants ruled France, Germany, and Italy in 9c., and the language contributed hundreds of words to French and strongly influenced the form of it that was brought to England as Anglo-French.
French, Romanic language spoken chiefly in France.
frequentative, verb form denoting recurring action.
Frisian, West Germanic language spoken in Friesland, the lowland coast of the North Sea and nearby islands, closely related to Dutch and Old English.
Fulani, language of northern Nigeria.
future, the verb tense indicating time to come. English lacks a pure future tense, but Latin and other languages have it.
Gallo-Romance or Gallo-Roman, the vernacular language of France c. 500-900 C.E.; intermediate between Vulgar Latin and Old French.
Gaelic, Celtic language of Highland Scotland.
Gaulish, Celtic language of ancient Gaul.
genitive, the case of the complement, typically expressing "possession" or "origin."
German, West Germanic language spoken in Germany, Austria, parts of Switzerland, technically "New High German." A general rule when comparing English and German in their evolution from a common source is that English has been more conservative with its consonants, German with its vowels.
gerund, a verbal noun, in English usually ending in -ing.
Gothic, the East Germanic language of the Goths, extinct since 16c., but because of early missionary work among them we have Gothic texts 200 years earlier than those in any other Germanic language, which are crucial to reconstructing Proto-Germanic.
Greek, Indo-European language spoken in Greece in the classical period, c. 8c. B.C.E.-4c. C.E. Among its dialects were Ionian-Attic (the language of Homer and the Athenian dramatists), Aeolic (used in Thessaly, Boeotia and Lesbos), and Dorian (the language of Sparta).
Germanic, a branch of Indo-European, ancestral language of English, German, Dutch, Frisian, Scandinavian tongues and several extinct languages such as Gothic and Frankish.
Guarani, native South American language of Paraguay, related to Tupi.
Hawaiian, Polynesian (non-Indo-European) language of the Hawaiian Islands.
Hebrew, classical Hebrew, ancient Semitic language of the Israelites.
Hungarian, Finno-Ugric (non-Indo-European) language spoken in Hungary; also known as Magyar.
Ibo, Kwa language of Nigeria.
IE, "Indo-European," the family of languages that includes most of the languages of modern Europe (English among them) and some current and extinct ones in western and southern Asia. All are presumed to share a common ancestor, Proto-Indo-European (PIE).
imitative, "a convenient term to include onomatopoeic and echoic" [Weekley].
imperative, the verbal category expressing commands or orders.
imperfect, tense/aspect category indicating progressive aspect: I was saying is in the "past imperfect" tense.
I-mutation, also known as "i-umlaut."
inceptive, see inchoative.
inchoative, aspect expressing the notion "entering into an action, beginning." Latin verbs ending in -sco, -scere. Also sometimes inceptive.
indicative, the mood expressing assertion.
infinitive, the form of a verb that expresses existence or action.
instrumental, case encoding the notion "means by which x is done."
intensive, giving force or emphasis.
intrans., "intransitive," of verbs, not taking a direct object.
Irish, the Celtic language spoken in Ireland.
Iranian, the branch of Indo-European languages spoken on and around the plateau of Iran, including modern Farsi and Kurdish.
Iroquoian, North American native language family.
Italian, the Romanic language spoken in Italy; it evolved out of the Tuscan dialect in the Renaissance.
iterative, marking repetition; generally identical with frequentative.
Kentish, the dialect of Old English spoken by the Jutes who formed the Anglo-Saxon kingdom of Kent.
Japanese, the native language of Japan, with no known outside connections.
Kurdish, Iranian (Indo-European) language of Kurdistan.
Kwa, Niger-Congo language group spoken along the south coast of West Africa, which was a major supply region for American slavers.
Late Latin, the literary Latin language as spoken and written c. 300-c. 700.
Latin, classical Latin, the Italic language of ancient Rome until about 4c.
Lithuanian, the Baltic language spoken in Lithuania.
loan-translation, a literal piece-by-piece translation from one language to another. Old English ymb-sniþan "around-cut" is a loan-translation of Latin circum-cidere.
locative, the case denoting "location in."
Low German, "plattdeutsch," the modern descendant of Old Saxon.
Malay, Indonesian language of the Malay peninsula.
Mandarin, the Beijing-area dialect of Chinese, now regarded as the standard form.
Mandingo, Niger-Congo language group of West Africa.
masc., "masculine," the grammatical gender in highly inflected Indo-European languages that denotes males and is used with many other words to which no distinction of sex is apparent.
Medieval Latin, Latin as written and spoken c. 700-c. 1500.
Mercian, the Anglian dialect of Old English spoken in the Anglo-Saxon kingdom of Mercia.
metathesis, inversion of segments within a word; Old English þridda became Modern English third through metathesis of -r- and -i-.
Mexican Spanish, Spanish as spoken in Mexico.
Micmac, Algonquian language of the Canadian Maritimes.
Middle Dutch, the Dutch language as it was spoken and written c. 1100-c. 1500.
Middle English, the English language as written and spoken c. 1100-c. 1500.
Middle French, the French language as written and spoken c. 1400-c. 1600.
Middle High German, the High German language as written and spoken c. 1100-c. 1500.
Middle Irish, Irish as written and spoken in the high Middle Ages.
Middle Low German, the Low German language as written and spoken c. 1100-c. 1500.
Modern English, language of Britain, Australia, British America, etc., since mid-16c.
Modern Greek, language of Greece since c. 1500.
Modern Latin, Latin language in use since c.1500, chiefly scientific.
Muskogean, North American native language family originally in the southeastern U.S. and including Choctaw, Chickasaw, Creek, and Seminole.
Nahuatl, Uto-Aztecan language of native Mexico; the language of the Aztecs and their neighbors.
neut., "neuter," the third grammatical gender in highly inflected Indo-European languages.
North Germanic, the subgroup of Germanic comprising Norwegian, Danish, Swedish, Icelandic, Old Norse, etc.; also the language spoken by the ancestral group during the presumed period of unity.
nominative, the case that typically codes the grammatical function of the subject.
North Sea Germanic, the closely related languages of the Germanic tribes along the coastal and lowland regions of the North Sea coast of continental Europe before the period of the Anglo-Saxon migration, comprising Old Low Franconian, Old Saxon, Old Frisian, and Old English.
Northumbrian, the Anglian dialect of Old English spoken in the Anglo-Saxon kingdom of Northumbria.
N.T., New Testament.
objective, designating or of the case of the object of a transitive verb or preposition.
obsolete, a word or meaning or form of a word no longer in use.
Old Celtic, ancestral language of modern Irish, Scottish, Welsh and related languages.
OED, "Oxford English Dictionary," the principal source for modern English etymologies, begun in 1879 (as the "New English Dictionary"); a second edition was published in the 1980s, and the work is ongoing. References to it on this site, unless otherwise noted, are to the 2nd print edition.
Old French, the French language as written and spoken c. 900-1400. More than 90 percent of it was from Vulgar Latin, with a smattering of Celtic and Germanic, plus some Medieval Latin learned terms.
Ojibwa, native Algonquian language of the people on either side of Lake Superior.
Old Church Slavonic, the earliest attested Slavic language, known from 9c. C.E. Used by the Slavs of Macedonia and Bulgaria.
Old Danish, the form of East Norse spoken in Denmark after c. 1000 C.E.
Old Dutch, also known as Old Low Franconian, the Germanic speech used on the North Sea coast of continental Europe c. 700-c. 1000.
Old English, the English language as written and spoken c. 450-c. 1100. It is purely Germanic and had several dialects (West Saxon, Anglian, Kentish, Northumbrian).
Old Frisian, Germanic language akin to English spoken on the North Sea coast of modern Netherlands and Germany before 1500.
Old High German, the ancestor of the modern literary German language, a Germanic language spoken in the upland ("high") regions of Germany; German language as written and spoken from the earliest period to c. 1100.
Old Irish, the Irish language as written and spoken from earliest times to 11c.
Old Italian, the Italian language as written and spoken before 16c.
Old Low German, the Low German language as written and spoken from earliest times to 12c.
Old Norse, the Norwegian language as written and spoken c. 100 to c. 1500 C.E., the relevant phase of it being "Viking Norse" (700-1100), the language spoken by the invaders and colonizers of northern and eastern England c. 875-c. 950. This was before the rapid divergence of West Norse (Norway and the colonies) and East Norse (Denmark and Sweden), so the language of the vikings in England was essentially the same, whether they came from Denmark or from Norway. Only a few of the loan words into English can be distinguished as being from one or the other group.
Old North French, the dialect of northern France before the 1500s, especially that of coastal Normandy and Picardy.
Old Persian, the Persian language as written and spoken from 7c. B.C.E. to 4c. B.C.E.
Old Provençal, Romanic language of the troubadours, spoken in southern France before c. 1500.
Old Prussian, a West Baltic language similar to Lithuanian, extinct since 17c.
Old Saxon, a West Germanic language, the earliest written form of Low German, spoken c. 700-c. 1100.
optative, a mood expressing wishing. The archaic Heaven forfend would be an example of optative, though unlike some I.E. languages English has no specific markers for this mood.
Oscan, the Italic language of the Samnites in middle and southern Italy in pre-Roman times; related to Umbrian and in the same family as Latin.
Old Slavic, another name for Old Church Slavonic (q.v.).
Old Spanish, the Spanish language as written and spoken c. 1145-16c.
Old Swedish, the Swedish language as written and spoken c. 900-c. 1500.
O.T., Old Testament.
participle, a verbal form having some functions of both verbs and adjectives (in English, usually ending in -ing).
Pashto, Iranian (Indo-European) language of Afghanistan.
passive, the form of a verb which indicates that the subject is the recipient of the action. The tree was struck by lightning is a passive construction.
past participle, a form of a verb that can be both a verb and an adjective, and which denotes action which has been completed. In Modern English, it commonly ends in -ed or -en. Thus, asked is the past participle of ask. French past participles commonly were adopted as finite verbs in Middle English.
past tense, indicating an action completed or in progress at a former time.
Pennsylvania Dutch, also, and more properly, Pennsylvania German, High German and Low Franconian dialect mingled with English words and deformed in pronunciation, spoken in the German-settled counties of interior Southeastern Pennsylvania.
perfective, the tense or formation expressing the notion of "completion." To eat is non-perfective; to eat up is perfective.
Persian, also known as Farsi, modern Iranian language spoken in Iran and Afghanistan.
person, the form a verb takes in indicating whether it refers to the person speaking, the person spoken to, or the person or thing spoken about. In Modern English I is the "first person singular," you is the "second person singular," we is the "first person plural," etc.
Phoenician, the extinct Semitic language of the Phoenicians, closely related to Hebrew.
Phrygian, Indo-European language formerly spoken in Anatolia.
PIE, "Proto-Indo-European," the hypothetical reconstructed ancestral language of the Indo-European family. The time scale is much debated, but the most recent date proposed for it is about 5,500 years ago.
Piman, branch of Uto-Aztecan languages spoken in what is now southern Arizona and northern Mexico.
plural, the form of a word that denotes it refers to more than one person or thing. Some languages have a dual number (there are relics of it in Old English), and in those the plural refers to more than two people or things.
Polish, West Slavic language spoken in Poland.
Portuguese, Romanic language spoken chiefly in Portugal and Brazil.
possessive, form of a word designating possession or some similar relationship. Usually formed in English with an -s and an apostrophe; John's is possessive of John.
Pre-Greek, a proposed substrate language spoken in the region of Greece before the arrival of Indo-European peoples. It is supposed to have contributed a number of words and toponyms to Greek.
prep., "preposition," a word that connects a noun to another element of a sentence; in Modern English common prepositions include in, by, for, with, to.
present participle, a form of a verb that can be a verb, an adverb, and even a noun (gerund), and which denotes action which is ongoing. In Modern English, most easily identified by its characteristic ending -ing. Thus, asking is the present participle of ask.
present-preterite, a group of Germanic verbs (mostly auxiliaries such as may, shall, can) whose original past tense forms split off and became separate pres. tense verbs (might, should, could).
preterite, the simple past tense.
privative, indicating negation, absence, or loss, such as the prefix un- or the suffix -less.
Proto-Germanic, hypothetical prehistoric ancestor of all Germanic languages, including English.
Proto-Italic, hypothetical prehistoric ancestor of all Italic languages, including Latin and its descendants, the Romance languages.
Proto-Indo-European, see PIE.
Provençal, Romance language of several dialects in southern France.
Quechua, South American native language spoken in the Andes; the language of the Inca Empire.
q.v., abbreviation of Latin quod vide "which see."
reduplicated, an inflectional device in which a syllable or part of a syllable is copied. Ancient Greek formed its perfect tenses by reduplication: leipo "I leave," le-loipa "I have left." It's rare in English, but examples would be tom-tom and chitchat.
reflexive, form of a word which indicates the subject and object of a verb in a sentence are the same, so that a transitive verb is directed back on its subject. ("John hurt himself" is a reflexive sentence.)
rhotacism, the tendency in spoken language for "r" to take the place of other sounds, especially "s/z." Latin flos "flower" has genitive floris, an instance of rhotacism.
root, a historical term which refers to the etymological source of base elements in languages.
Russian, East Slavic language of Russia.
Sanskrit, the classical Indian literary language from 4c. B.C.E.
Scandinavian, also known as North Germanic, sub-group of Germanic spoken in Scandinavia consisting of Norwegian, Swedish, Danish.
Scottish, the variety of English spoken by the people of Scotland. Not to be confused with Gaelic, which is Celtic. A number of French words entered English through Scotland because of the political alliance and connection of Scotland and France 13c.-16c.
Semitic, major subgroup of Afroasiatic language family, including Hebrew, Aramaic, Akkadian.
Serbian, eastern variant of Serbo-Croatian, a Slavic language, generally written in Cyrillic.
Serbo-Croatian, South Slavic language or group of dialects spoken in Serbia, Croatia, Montenegro, Bosnia, and Herzegovina. Official standard language of the former Yugoslavia.
singular, the form of a word that denotes it refers to only one person or thing.
Sinhalese, Indic language of Sri Lanka.
Siouan, North American native language group of the U.S. upper Midwest, including the language of the Sioux, Crow, Omaha and Osage peoples.
Slavic, a principal branch of the Indo-European language family spoken in Eastern Europe. Includes Russian, Polish, Serbo-Croatian.
Slovak, West Slavic language of Slovakia.
Spanish, also known as Castilian, Romanic language spoken in Spain and Spanish America.
subjunctive, the mood typically denoting notions like unreality, doubt.
Sumerian, non-Indo-European, non-Semitic language of ancient Sumer.
superlative, the third degree of comparison of an adjective or adverb. Longest is the superlative of long.
substrate/substratum, refers to the languages of the non-Indo-European tribes native to mainland Europe before the arrival of the IE peoples. They presumably survived for a time under Indo-European domination. Linguists have begun to see evidence of pre-Indo-European loanwords in IE languages (especially in names of rivers, landscape features, animals, and plants) and even reconstructed some of their phonological qualities, though the process is not without controversy.
Swedish, North Germanic language spoken in Sweden.
Tagalog, Indonesian language of the Philippines.
Tamil, Dravidian (non-Indo-European) language of southern India and Sri Lanka.
Telugu, Dravidian (non-Indo-European) language of southern India.
Thai, Sino-Tibetan language of the Tai group spoken in Thailand.
Tibetan, Sino-Tibetan language of Tibet.
trans., "transitive," of verbs, taking a direct object.
Tupi, native South American language of the Amazon basin; the northern branch of Tupi-Guarani.
Turkish, Turkic (non-Indo-European) language spoken in Turkey.
Turkic, branch of the Altaic language family spoken in Turkey and parts of south-central Asia. It includes Turkish, Uzbek, Kirghiz.
Twi, Kwa (Niger-Congo) language of Ghana.
Ukrainian, East Slavic language spoken in Ukraine.
Urdu, language of the Muslim conquerors of India; originally a dialect of Hindi with a large admixture of Arabic and Persian. From zaban-i-urdu "language of the camp."
U.S., United States.
Uto-Aztecan, native North American language family of the southwestern U.S. and Mexico.
v., "verb." A word that asserts or declares; that part of speech of which the office is predication, and which, either alone or with various modifiers or adjuncts, combines with a subject to make a sentence.
Vulgar Latin, the everyday speech of the Roman people, as opposed to literary Latin.
vocative, the case or expression of "direct address." In English it long ago merged with the nominative.
West African, languages of the Guinea coast and inland regions of Africa, the principal source of slaves for the European colonies in the New World.
West Frisian, dialect variant of Frisian spoken in the Netherlands.
West Germanic, the subgroup of Germanic comprising English, Dutch, German, Yiddish, Frisian, etc.; also the language spoken by the ancestral group during the presumed period of unity.
Wolof, Niger-Congo language of Senegal and Gambia.
West Saxon, the dialect of Old English spoken in the Anglo-Saxon kingdom of Wessex.
Xhosa, Bantu language of South Africa.
Yoruba, West African tonal language of the Kwa group, spoken in Nigeria.
This is the creation of an amateur. Great care has been taken and it's as accurate as I can make it. But if you're a professional linguist or a serious student of linguistics, you shouldn't be doing your homework here. This is for the rest of us.
Etymonline is compiled from published sources and in some points from original material made available on the Internet. The first step was to draw up a list of English words out of an ordinary dictionary and seek their etymologies.
To compile an entry, I look up the word in my major sources: the Oxford English Dictionary (second edition), the Barnhart Dictionary of Etymology (1988), Weekley's "Etymological Dictionary of Modern English" (1921), and Ernest Klein's "Comprehensive Etymological Dictionary of the English Language" (1971).
In most cases, all of these offered substantially the same derivations. My work was to merge them and put them into a consistent and accessible format. In some cases, one text offered information where others were silent. I’ve erred on the side of inclusion. In a few cases, they differed significantly in their derivations. I’ve tried to indicate that. In some cases they varied in their descriptions of the precise path a word took through Latin and French to English. I have not always indicated these differences.
The next step was to take the draft entry to my secondary shelf of sources: principally dictionaries of Old English, Middle English, Latin, Greek, French, and etymology dictionaries for French (Gamillscheg), German (Kluge), Greek (Beekes), and Latin (de Vaan). From these I attempted to flesh out the entries and give them some nuance and answer some questions I had about words that the big books did not notice.
For slang and colloquial usage, I consult the Kipfer/Chapman "Dictionary of American Slang" (which despite its title embraces many Britishisms), DARE, and Farmer. And for navigating the back alleys of English I had as a lantern the always delightful "Dictionary of Buckish Slang, University Wit and Pickpocket Eloquence." The modern definitions are largely from or adopted from the excellent "Century Dictionary" (1889-1902).
The list of books on the secondary shelf has changed somewhat since the work began. At the start, I made use of a couple of books which seemed authoritative but eventually revealed themselves to be whimsical or worse. One was so deadly it became the only book I’ve ever destroyed as a menace to society. I’ve tried to weed out the bad seeds they left in this work. It’s possible a few remain.
Originally I did not intend to include Proto-Indo-European roots, in part because there was such wide disagreement among the sources I consulted, in part because the whole field seems so speculative. But users wrote to me seeking them, so I’ve added them to the best of my ability, mainly based on the Watkins "American Heritage Dictionary of Indo-European Roots" but also by consulting Pokorny and such modern works as are available for Latin, Greek, and Germanic in the Leiden Indo-European Etymological Dictionary Series (see "sources").
With many words, root meanings or sense evolutions remain obscure. I would be content to leave them as such, but readers are curious to know what guesses have been made (or dismissed) by the experts, as well as what facts have been settled. So I’ve included such speculations that have appeared in the sources I consulted. They ought to be prefaced by "perhaps" in this text, even where the sources say "probably."
My tendency at first was to "clump" the words -- have, say, sleep, sleeper, sleepwalking, sleepy, slept all under one head. But the tyranny of search engines has tended to turn me into a "splitter."
In many cases my want of education in this field has led me astray in understanding or interpreting the sources, and in others I’ve simply bungled my notes, or typed things amiss. Many readers, including many academics, have been kind enough to point these out, as they find them, and set me right. Whatever indignation they might feel at my trespass on their demesne, they appreciate the usefulness of a free, ready online resource and have contributed much to make it better than I could.
A NOTE ON DATES: Old English manuscripts are too few and of too uncertain origin for dates to have any meaning. In Middle English, this site generally makes use of the dates in the Barnhart dictionary, whose compilers gave especial attention to this period, and the online version of the University of Michigan's exhaustive Middle English Dictionary. The Oxford English Dictionary is the principal source of dates in modern English up to about 1700.
For the 19th and 20th centuries, the OED print edition's inattention to American sources begins to tell: As recently as the 2001 printing of the hardcover 2nd edition, there was no entry for a good American word like "bloviate," "dribble" was recognized as a term in soccer but not in basketball, and the earliest citation for Dixieland music was from "Punch." The OED is a monumental work, rightly revered, but a suspicion arises that there might be more to the language than what is in it. The OED's editors and stringers are constantly revising and updating their text.
The recent availability of newspaper and magazine archives in computer files, and the flood of material presented in the searchable Google books project, open a vast field for careful research -- careful because the Google publications too often are misdated on the Google introduction pages. The bulk of the 19th and 20th century dates in this work have been found, or confirmed, in such sources.
Internet newsgroup archives -- again, used judiciously -- are a convenient way to find rough early dates for the most contemporary words.
One of my chief sources for this work is Ernest Klein's dictionary. Klein, Rabbi of Nové Zámky in Czechoslovakia from 1931 to 1944, was deported to Dachau and returned home after liberation to find "that my father, my wife, my only child Joseph, and two of my three sisters had suffered martyrdom in Auschwitz." He moved to Canada, and out of his sorrow and urged on by his surviving sister he set down his lifelong love of etymology into a book, and in its introduction he wrote:
May this dictionary, which plastically shows the affinity and interrelationship of the nations of the world in the way in which their languages developed, contribute to bringing them nearer to one another in the sincere pursuit of peace on earth — which was one of my cardinal aims in writing this dictionary.
This work is dedicated to all those who seek the old paths,
the well-worn, unpaved hill-ways;
and especially to those who honor the elder teachers.