|
Corpora and Language Dynamics
April 12-14, 2010, University of Helsinki, Finland
Registration form
(Langnet students only)
Feedback form
(Langnet students only)
Program
Organizers: Langnet, Finland & Doctoral School
of Humanities, Tallinn University, Estonia
Registration deadline March 15th
ECTS: 3-4
Contact: Anna Verschik anna.verschik (at) tlu.ee; Ulla Vanhatalo
ulla.vanhatalo (at) helsinki.fi
Registration
Please send your name and affiliation to ulla.vanhatalo (at) helsinki.fi.
If you would like to present your own corpus-related PhD study
(15min+15min), please let us know at registration. Once registration
is over, we will send more information regarding venues.
Description
This PhD course aims at giving the participants an introduction
to current topics at the forefront of corpus linguistics. Topics
include the following. The World Atlas of Language Structures (WALS)
will be presented by Matti Miestamo. WALS
is a typological database containing information on ca. 2500 languages
and 142 linguistic features. This talk will clarify the nature of
the typological research underlying the database and situate WALS
in the context of other typological databases. The CLARIN
project will be introduced by Antti Arppe. CLARIN aims to create
an infrastructure that makes language resources and technology available
to scholars of all disciplines, especially the social sciences and
humanities, by uniting existing digital archives into a federation
of archives with unified web access. It also provides language and
speech technology tools as web services operating on language data
in archives. Anna Mauranen’s presentation will address
the challenges of compiling manageably small corpora which simultaneously
cover a maximally broad range of desirable variation. It points
out some pitfalls that we tend to forget, which undermine corpus
comparability and scale benefits. On the bright side, the delights
of compiling a new kind of corpus, exemplified by ELFA,
a spoken lingua franca database, will be brought up. Although exploratory
corpora may look risky, they have the advantage of posing novel questions.
The course also includes a series of lectures. Antti Arppe will
give a lecture entitled “Exploiting Multiple Methods in Linguistics”,
which will be followed by a case study. He will also, tentatively,
give a lecture entitled “Basic Explorations with Univariate
Analysis”. Pille Eslon, Annekatrin Kaivapalu and Erika Matsak
will give a lecture on learner language and language corpora. Professor
Raymond Hickey from the University
of Duisburg-Essen will give three talks on using the Internet
as a corpus. There is also room for student presentations. If you
would like to present your own corpus-related PhD study (15min+15min),
please let us know once you are accepted to the course. |