Import formats
This document presents the structure of the export and import files of Antidote’s personal dictionaries.
Structure of the Export and Import Files
- Files are in text format (.txt).
- Each line must contain 1, 2 or 8 tokens separated by tabs.
- Lines beginning with “//” are ignored, with some exceptions (see the Word Language and Encoding sections).
- Empty lines are ignored.
- For more details about the file format, examine the example files below:
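As a rough illustration of the rules listed above (this is not one of Antidote's example files), the following Python sketch splits a line into tokens. The function name `split_entry` is hypothetical, and the sketch simply skips every comment line; the special comments described in the Word Language and Encoding sections would need separate handling.

```python
def split_entry(line):
    """Return the tokens of one entry line, or None when the line is skipped."""
    text = line.rstrip("\r\n")
    if not text.strip() or text.startswith("//"):
        # Empty lines and comments hold no entry. The special comments
        # (dictionary name, coding) are not interpreted in this sketch.
        return None
    tokens = text.split("\t")
    if len(tokens) not in (1, 2, 8):
        raise ValueError(f"expected 1, 2 or 8 tokens, found {len(tokens)}")
    return tokens
```

A line with any other number of tokens raises an error in this sketch; how Antidote itself reports such a line is not described here.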
Word Language
Bilingual Antidote
If the words will be imported into a version of Antidote that handles both languages, the language of each of the words must be specified. One of the following three identifiers must be used:
* FR: French
* EN: English
* ML: Multilingual (English and French)
If the dictionary contains words from only one language, the language is specified globally by putting the identifier in parentheses at the beginning of the file in the following comment:
Name: DICTIONARY_NAME (LANGUAGE_IDENTIFIER)
If the dictionary contains both English words and French words, the language identifier is added as the first attribute of each word (see the section below on importing eight tokens).
If the language is not identified according to these specifications, an import error will occur.
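The global declaration can be read with a short pattern match. The Python sketch below is only an illustration: the helper name `global_language` is hypothetical, and accepting an optional leading “//” before “Name:” is an assumption based on the comment syntax described earlier.

```python
import re

# Assumption: the comment may or may not carry a leading "//".
NAME_COMMENT = re.compile(r"^\s*(?://\s*)?Name\s*:\s*(.*?)\s*\((FR|EN|ML)\)\s*$")

def global_language(lines):
    """Return FR, EN or ML if a Name comment declares it, else None."""
    for line in lines:
        match = NAME_COMMENT.match(line)
        if match:
            return match.group(2)
    return None
```

For instance, `global_language(["// Name: My words (FR)"])` returns `"FR"`, while a file without such a comment yields `None`.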
Monolingual Antidote
Specifying the words’ language is not necessary when Antidote handles a single language. By default, the language will correspond to the one handled by your version of Antidote.
Encoding
Antidote tries to guess the file’s encoding based on the content and your system platform. You can also tell Antidote which encoding you are using by inserting the special comment “// coding:”, followed by the encoding identifier, at the beginning of the file.
Choose from the following identifiers:
mac_roman
iso8859_15
cp1252
utf_8
utf_16_le
utf_16_be
Example:
// coding: utf_8
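The identifiers listed above are also valid Python codec names, so a file with an explicit encoding declaration can be produced as in the sketch below. The function name `encode_dictionary` and the use of “\n” as the line terminator are assumptions made for this illustration.

```python
ACCEPTED_ENCODINGS = (
    "mac_roman", "iso8859_15", "cp1252", "utf_8", "utf_16_le", "utf_16_be",
)

def encode_dictionary(lines, encoding="utf_8"):
    """Return the file content as bytes, starting with the coding comment."""
    if encoding not in ACCEPTED_ENCODINGS:
        raise ValueError(f"unsupported encoding identifier: {encoding}")
    # The coding comment goes first, then one entry per line.
    text = f"// coding: {encoding}\n" + "".join(line + "\n" for line in lines)
    return text.encode(encoding)
```

The returned bytes can then be written to a .txt file opened in binary mode.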
Importing One Token
If a line contains a single token, that token is considered the base form; Antidote infers the word’s syntactic category and, if required by the category, its morphology or verb model. In multilingual Antidote, the word’s language must be specified globally at the top of the file with a special comment (see the Word Language section).
BASE_FORM __END_OF_LINE__
Importing Two Tokens
If a line contains two tokens, the first token is considered the base form and the second is considered the syntactic category. Antidote infers the word’s morphology or verb model accordingly.
In multilingual Antidote, the word’s language must be specified globally at the top of the file with a special comment (see the Word Language section).
BASE_FORM __TAB__ CATEGORY __END_OF_LINE__
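In the templates above, __TAB__ stands for a tab character and __END_OF_LINE__ for a line break. A minimal Python sketch that builds one-token and two-token lines follows; the helper name `entry_line` is hypothetical, as is the choice of “\n” for the line break.

```python
def entry_line(base_form, category=None):
    """Build a one-token or two-token line, terminated by a line break."""
    tokens = [base_form] if category is None else [base_form, category]
    for token in tokens:
        # A tab or line break inside a token would change the token count.
        if "\t" in token or "\n" in token:
            raise ValueError("tokens cannot contain tabs or line breaks")
    return "\t".join(tokens) + "\n"
```

For example, `entry_line("courriel")` gives a one-token line, and `entry_line("courriel", "Nom")` gives a two-token line using the French category `Nom` listed under Accepted Categories.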
Importing Eight Tokens
If a line contains eight tokens, these tokens are considered in the following order:
- Base form
- Category
- Inflection 1 field
- Inflection 2 field
- Inflection 3 field
- Inflection 4 field
- Attributes separated by +
- Definition
- To introduce a new definition, insert the “◊” character.
- To introduce a new paragraph, insert the “\r” characters.
- For a multilingual word, it is possible to specify a definition for each language. All definitions are included in the same field. The French definition must be preceded by the keyword |FR| and the English definition by the keyword |EN|. Example:
|FR|Définition de « ABCDEFG »|EN|Definition of “ABCDEFG”
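The eight tokens can be assembled as in the Python sketch below. The helper names are hypothetical, and placing the language identifier first in the attribute list (as in `EN+Thing`) is my reading of the Word Language section rather than a confirmed example.

```python
def definition_field(fr=None, en=None):
    """Combine per-language definitions into a single field."""
    parts = []
    if fr is not None:
        parts.append("|FR|" + fr)
    if en is not None:
        parts.append("|EN|" + en)
    return "".join(parts)

def eight_token_line(base_form, category, inflections, attributes, definition):
    """Join the eight tokens with tabs; inflections must hold four fields."""
    if len(inflections) != 4:
        raise ValueError("exactly four inflection fields are required")
    tokens = [base_form, category, *inflections, "+".join(attributes), definition]
    return "\t".join(tokens)

line = eight_token_line(
    "widget", "Noun", ["widget", "widgets", "", ""],
    ["EN", "Thing"],  # assumption: language identifier as first attribute
    "A small device",
)
```

Unused inflection fields are passed as empty strings so that the line still contains eight tab-separated tokens.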
Accepted Categories
French Words
- Acronyme (acronym or initialism)
- Adj (adjective)
- Adv (adverb)
- Interj (interjection)
- LocutionLatine (Latin expression)
- Nom (common noun)
- NP (proper noun)
- Verbe (verb)
- Pref (prefix)
English Words
- Acronym (acronym or initialism)
- Adj (adjective)
- Adv (adverb)
- Interj (interjection)
- Noun (common noun)
- PN (proper noun)
- Verb (verb)
Multilingual Words
- Acronyme (acronym or initialism)
- Acronym (acronym or initialism)
- NP (proper noun)
- PN (proper noun)
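Before importing, the category token of each line can be checked against the lists above. The Python sketch below is one possible check; the names are hypothetical, and treating the comparison as case-sensitive is an assumption.

```python
ACCEPTED_CATEGORIES = {
    "FR": {"Acronyme", "Adj", "Adv", "Interj", "LocutionLatine",
           "Nom", "NP", "Verbe", "Pref"},
    "EN": {"Acronym", "Adj", "Adv", "Interj", "Noun", "PN", "Verb"},
    "ML": {"Acronyme", "Acronym", "NP", "PN"},
}

def is_accepted_category(language, category):
    """Check a category identifier for a language identifier (FR, EN or ML)."""
    return category in ACCEPTED_CATEGORIES.get(language, set())
```

With this table, `is_accepted_category("FR", "Nom")` is true, while `is_accepted_category("EN", "Nom")` is false because the English identifier is `Noun`.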
Inflection Fields
French Words
- Acronym, adjective, adjectival Latin expression, noun, proper noun
- Inflection 1 field: Masculine singular
- Inflection 2 field: Masculine plural
- Inflection 3 field: Feminine singular
- Inflection 4 field: Feminine plural
- Verb
- Inflection 1 field: Verb model
- Inflection 2 field: Leave empty
- Inflection 3 field: Leave empty
- Inflection 4 field: Leave empty
- Adverb, adverbial Latin expression, interjection, prefix
- Inflection 1 field: Leave empty
- Inflection 2 field: Leave empty
- Inflection 3 field: Leave empty
- Inflection 4 field: Leave empty
English Words
- Noun, acronym
- Inflection 1 field: Singular
- Inflection 2 field: Plural
- Inflection 3 field: Leave empty
- Inflection 4 field: Leave empty
- Adjective
- Inflection 1 field: Singular
- Inflection 2 field: Comparative
- Inflection 3 field: Superlative
- Inflection 4 field: Leave empty
- Verb
- Inflection 1 field: 3rd person
- Inflection 2 field: Past
- Inflection 3 field: Past participle
- Inflection 4 field: Present participle
- Adverb, interjection, proper noun
- Inflection 1 field: Leave empty
- Inflection 2 field: Leave empty
- Inflection 3 field: Leave empty
- Inflection 4 field: Leave empty
Multilingual Words
- Acronym, proper noun
- Inflection 1 field: Masculine singular
- Inflection 2 field: Masculine plural
- Inflection 3 field: Feminine singular
- Inflection 4 field: Feminine plural
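To show how the “Leave empty” fields work in practice, the Python sketch below pads the English inflections to four fields. The table mirrors the English Words list above; the function name `inflection_fields` is hypothetical.

```python
ENGLISH_INFLECTIONS = {
    "Noun": ("Singular", "Plural"),
    "Acronym": ("Singular", "Plural"),
    "Adj": ("Singular", "Comparative", "Superlative"),
    "Verb": ("3rd person", "Past", "Past participle", "Present participle"),
    "Adv": (),
    "Interj": (),
    "PN": (),
}

def inflection_fields(category, forms):
    """Return four inflection fields, leaving the unused ones empty."""
    labels = ENGLISH_INFLECTIONS[category]
    if len(forms) != len(labels):
        raise ValueError(f"{category} expects these forms: {', '.join(labels)}")
    return list(forms) + [""] * (4 - len(forms))
```

For example, `inflection_fields("Noun", ["widget", "widgets"])` returns `["widget", "widgets", "", ""]`.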
Attributes and Sub-Attributes Accepted for Each Category
French Words
- Noun
Chose
Entite
Diplome
Personne
Fonction
Animal
UniteMesure
- Adjective
Habitant
Langue
- Interjection
Sensation
Message
Bruit
- Adverb
Maniere
Temps
Lieu
- Latin Expression
Adjectif
Adverbe
- Proper Noun
Famille
Prenom
Compagnie
Marque
Lieu
VoieDeCirculation
Ville
Ile
Region
CoursDeau
CorpsCeleste
Habitant
Langue
TitreOeuvre
Autre
- Acronym
Compagnie
Chose
NonComptable
Diplome
Personne
FonctionSociale
- Verb
- No attribute is accepted
- Accepted models
Aimer
Finir
Courir
Rendre
- Prefix
FacultTDU
BesoinTDU
ErreurTDU
FacultTDUReforme
BesoinTDUReforme
ErreurTDUReforme
English Words
- Noun
Thing
Entity
Diploma
Person
Profession
Animal
Group
UnitOfMeasurement
- Adjective
Demonym
Language
- Interjection
Emotional
Message
Sound
- Adverb
Manner
Time
Place
- Proper Noun
LastName
FirstName
Company
Brand
Place
Street
Town
Island
Region
CourseOfWater
CelestialBody
Demonym
Language
TitleOfAWork
Other
- Acronym
Company
Thing
Uncountable
Diploma
Person
Profession
- Verb
- No attribute is accepted
Multilingual Words
- Proper Noun
Famille, LastName
Prenom, FirstName
Compagnie, Company
Marque, Brand
Lieu, Place
VoieDeCirculation, Street
Ville, Town
Ile, Island
Region, Region
CorpsCeleste, CelestialBody
Habitant, Demonym
Langue, Language
TitreOeuvre, TitleOfAWork
Autre, Other
- Acronym
Compagnie, Company
Chose, Thing
NonComptable, Uncountable
Diplome, Diploma
Personne, Person
FonctionSociale, Profession
Make sure that each sub-attribute is accompanied by its attribute and that they use the spellings listed above. They must be separated by a “+” symbol, with no spaces.