Using online linguistic resources
1) Creating a corpus
Research
influenza vaccination
Relevance / Most recent
Save in Text format
2) Downloading the MonoConc concordancer
perso.univ-lyon2.fr/~maniezf/c.zip
Word list
Alt QCF
Stop list
Simple search
Ctrl S
Collocate Search
stud?%%
Ctrl F
*demic*
*-*ed
Alt Q A
List of expressions
Alt Q A
Search Term
?????* ?????*
?????* ?????* ?????*
Adjectival suffixes:
????*ic ????*
antigenic shift
Linguee
http://www.linguee.fr/francais-anglais/search?source=anglais&query=antigenic+shift
mutation antigénique
cassure antigénique
Termium
cassure antigénique
????*al ????*
????*ive ????*
????*ar ????*
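The wildcard patterns above can be emulated outside the concordancer. The following is a minimal sketch, assuming that `?` stands for exactly one character, `%` for zero or one character and `*` for zero or more characters (check the concordancer's own documentation, as these semantics are an assumption here). The function name `wildcard_to_regex` and the sample sentence are hypothetical.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a concordancer-style wildcard pattern into a compiled regex.

    Assumed semantics (not confirmed by the handout):
      ?  exactly one word character
      %  zero or one word character
      *  zero or more word characters
    """
    mapping = {"?": r"\w", "%": r"\w?", "*": r"\w*"}
    words = []
    for word in pattern.split():
        # Any other character (letters, hyphens) is matched literally.
        words.append("".join(mapping.get(ch, re.escape(ch)) for ch in word))
    # Words of the pattern are separated by whitespace in the text.
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)

# Hypothetical sample sentence, not taken from the workshop corpus.
text = "Antigenic shift differs from antigenic drift in pandemic influenza."

print(wildcard_to_regex("????*ic ????*").findall(text))
print(wildcard_to_regex("*demic*").findall(text))
```

Under these assumptions, `????*ic ????*` retrieves two-word sequences whose first word has at least six letters and ends in -ic, which is why it surfaces candidate terms such as antigenic shift.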
3) Using a tagged and lemmatized corpus
Google: corpus CRTT
http://perso.univ-lyon2.fr/~maniezf/Corpus/Corpus_medical_FR_CRTT.htm
http://perso.univ-lyon2.fr/~maniezf/MEDFR.zip contains the entire corpus
AnnCardAng_txt.rar
tagged and lemmatized version
File
Tag settings
Part of Speech tags
Collect
Tag Information
Searching for words belonging to a given part of speech:
*_*_V*
*_présenter_V*
Ctrl F
*_diagnostic_N*
Ctrl F
*_évoquer_V*
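The search patterns above suggest that each token of the tagged corpus is written as form_lemma_TAG. The following is a minimal sketch of the same kind of lookup, assuming whitespace-separated tokens in that format. The tag names (DET, NOM, VER), the sample line and the helper names are hypothetical and may not match the actual tagset of the CRTT corpus.

```python
from typing import List, NamedTuple, Optional

class Token(NamedTuple):
    form: str
    lemma: str
    tag: str

def parse_tagged(line: str) -> List[Token]:
    """Split a line of form_lemma_TAG items into Token triples."""
    tokens = []
    for item in line.split():
        # rsplit keeps any underscore inside the word form intact.
        parts = item.rsplit("_", 2)
        if len(parts) == 3:
            tokens.append(Token(*parts))
    return tokens

def find_forms(tokens: List[Token], lemma: Optional[str] = None,
               tag_prefix: str = "") -> List[str]:
    """Return word forms matching a lemma (optional) and a tag prefix."""
    return [t.form for t in tokens
            if (lemma is None or t.lemma == lemma)
            and t.tag.startswith(tag_prefix)]

# Hypothetical tagged sentence, not taken from the corpus.
sample = ("Le_le_DET patient_patient_NOM présentait_présenter_VER "
          "une_un_DET douleur_douleur_NOM")
tokens = parse_tagged(sample)

print(find_forms(tokens, tag_prefix="V"))                     # like *_*_V*
print(find_forms(tokens, lemma="présenter", tag_prefix="V"))  # like *_présenter_V*
print(find_forms(tokens, tag_prefix="N"))                     # all nouns
```

Searching on the lemma rather than the form is what allows a single query to retrieve every inflected form of a verb such as présenter.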
4) Using the Corpus of Contemporary American English
A) Searching for a word or phrase
dissertation
LIST
CHART
KWIC
B) Comparing words or phrases
COMPARE
dissertation thesis
SORT BY
FREQUENCY
RELEVANCE
doctoral dissertation is 5 times more frequent than doctoral thesis
COLLOCATES * 1 0
The comparison shows that doctoral and unpublished are
the only adjectives that qualify dissertation.
committee commission
COLLOCATES *al 1 0
COLLOCATES [jj] 1 0
strange odd
COLLOCATES [nn] 0 1
efficient vs. effective
COLLOCATES [nn] 0 1
Sections : ACAD:Medicine
Minimum Frequency : 1 1
efficacious, potent
COLLOCATES [nn] 1 0
reaction vs. response
COLLOCATES [nn] 0 1
cell vs. cellular
biologic vs. biological
pathologic vs. pathological
Searching for an entire noun phrase:
RESET
LIST
Sections : ACAD:Medicine
*ic.[jj*] [nn*]
*al.[jj*] [nn*]
LIST
CHART
C) Diachronic and diastratic variation
facebook
twitter
hypothesize
D) Comparing syntactic structures
Deletion of that
CLAIM THAT pronoun verb
ACADEMIC
[claim].[v*] that [p*] [v*] 574
[claim].[v*] [p*] [v*] 385
SPOKEN
[claim].[v*] that [p*] [v*] 624
[claim].[v*] [p*] [v*] 1427
KNOW THAT THE noun verb
ACADEMIC
[know] that the [n*] [v*] 304
[know] the [n*] [v*] 257
SPOKEN
[know] that the [n*] [v*] 758
[know] the [n*] [v*] 1182
CONTEND THAT THE noun
ACADEMIC
[contend] that the [nn] 271
[contend] the [nn] 18
SPOKEN
[contend] that the [nn] 17
[contend] the [nn] 28
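The frequencies above can be turned into a that-deletion rate per verb and register: the share of hits without that among all hits for the structure. The following is a minimal sketch using only the counts listed in this section; the dictionary layout and the function name `deletion_rate` are illustrative choices.

```python
# (verb, register) -> (hits with "that", hits without "that"),
# copied from the query results listed above.
counts = {
    ("claim", "ACADEMIC"): (574, 385),
    ("claim", "SPOKEN"): (624, 1427),
    ("know", "ACADEMIC"): (304, 257),
    ("know", "SPOKEN"): (758, 1182),
    ("contend", "ACADEMIC"): (271, 18),
    ("contend", "SPOKEN"): (17, 28),
}

def deletion_rate(with_that: int, without_that: int) -> float:
    """Share of occurrences in which "that" is omitted."""
    return without_that / (with_that + without_that)

rates = {key: deletion_rate(*pair) for key, pair in counts.items()}

for (verb, register), rate in rates.items():
    print(f"{verb:8} {register:9} {rate:.1%}")
```

For all three verbs the rate is higher in SPOKEN than in ACADEMIC. The figures remain approximate, since the queries without that may also match sequences that are not complement clauses.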
E) Using parts of speech
You can also use part of speech tags by selecting them from the drop-down list (click on [POS LIST] to show it).
Syntax | Meaning | Sample matches
[pos] | Part of speech (exact) | going, using
[lemma] | Lemmas (all forms of a word) | sing, singing, sang
[=word] | Synonyms | low, tired, soft, vulnerable, etc.
word|word | Any of these words | stunning, charming, gorgeous
*xx | Wildcard: * = any number of letters | unlikely, unusually
-word | NOT (followed by PoS, lemma, word, etc. Most useful for "multiple slot" queries; see below) | the, in, is
Combinations of preceding (samples)
You can limit to a particular part of speech by adding a period (full stop) and then the part of speech tag in brackets.
word.[pos] | Exact word and part of speech | strike (only as a verb)
word*.[pos] | Substring and part of speech | discovered, disappeared, discussed
[lemma].[pos] | Lemma and part of speech | strike, struck, striking
[=word].[pos] | Synonym and part of speech | hit, strike, defeat
You can add "lemma" to any other type of search, such as synonym or customized list, to see all forms of the matching words. Just use an extra set of brackets.
[[=word]] | Synonym and lemma | announced, circulating, publishes, issue
You can also choose lemma and part of speech by combining the preceding symbols.
[[=word]].[pos] | Synonym and lemma and part of speech | mop, scrubs, polishing
Multiple "slots": create sequences of words, using any of the preceding query types. Note that in each case there is a space between the word "slots" in the query. These are just a few examples from an unlimited number of combinations.
Sample matches:
nooks and crannies
fast food
pretty smart
get her to stay
.|,|; nevertheless [p*] [v*] | . Nevertheless it is
break the law
beat the Yankees
beautiful woman
5) Using an aligned corpus in Access
http://perso.univ-lyon2.fr/~maniezf/Bird_Flu.txt
Importing the data
Occurrences of: spread
Like "*monitor*"
Not Like "*contrôl*"
Not Like "*surveill*"
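The criteria above keep the aligned pairs whose English side contains monitor while the French side contains neither contrôl nor surveill, which brings out the less expected translations. The following is a minimal sketch of the same filter outside Access, assuming a tab-separated file with an `en` and an `fr` column. The column names and the three sample rows are invented for illustration and are not taken from Bird_Flu.txt.

```python
import csv
import io

def like(value: str, fragment: str) -> bool:
    """Rough equivalent of Like "*fragment*": case-insensitive substring test."""
    return fragment.lower() in value.lower()

# Invented sample rows in an assumed two-column, tab-separated layout.
sample = (
    "en\tfr\n"
    "Health authorities monitor the spread of the virus.\t"
    "Les autorités sanitaires surveillent la propagation du virus.\n"
    "Poultry farms are closely monitored.\t"
    "Les élevages de volailles font l'objet d'un suivi étroit.\n"
    "The virus can spread to humans.\t"
    "Le virus peut se transmettre à l'homme.\n"
)

rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))

# English contains "monitor"; French contains neither "contrôl" nor "surveill".
hits = [
    row for row in rows
    if like(row["en"], "monitor")
    and not like(row["fr"], "contrôl")
    and not like(row["fr"], "surveill")
]

for row in hits:
    print(row["en"], "=>", row["fr"])
```

To run the filter on a real file, the `io.StringIO(sample)` object would be replaced by an opened file, provided that file actually follows the assumed layout.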
6) Using an aligned multilingual corpus
Google: Opus Tiedemann
EMEA - European Medicines Agency documents (EMEA0.3.tar.gz - 5.0 GB)
Intersection of the en and fr columns
Google: alinea kraif
Downloading Alinea
Text