Corpora and Language Dynamics
April 12-14, 2010, University of Helsinki, Finland

Registration form (Langnet students only)
Feedback form (Langnet students only)

Program

Organizers: Langnet, Finland & Doctoral School of Humanities, Tallinn University, Estonia

Registration deadline: March 15th

ECTS: 3-4

Contact: Anna Verschik anna.verschik (at) tlu.ee; Ulla Vanhatalo ulla.vanhatalo (at) helsinki.fi

Registration

Please send your name and affiliation to ulla.vanhatalo (at) helsinki.fi. If you would like to present your own corpus-related PhD study (15 min + 15 min), please let us know when registering. Once registration is over, we will send more information about the venues.

Description

This PhD course aims to give participants an introduction to current topics at the forefront of corpus linguistics. Topics include the following.

The World Atlas of Language Structures (WALS) will be presented by Matti Miestamo. WALS is a typological database containing information on ca. 2500 languages and 142 linguistic features. The talk will clarify the nature of the typological research underlying the database and situate WALS in the context of other typological databases.

The CLARIN project will be introduced by Antti Arppe. CLARIN aims to create an infrastructure that makes language resources and technology available to scholars of all disciplines, especially the social sciences and humanities, by uniting existing digital archives into a federation of archives with unified web access. It also provides language and speech technology tools as web services operating on the language data in the archives.

Anna Mauranen’s presentation will address the challenges of compiling manageably small corpora that nevertheless cover a maximally broad range of desirable variation. It points out some easily forgotten pitfalls that undermine corpus comparability and scale benefits. On the bright side, it will also bring up the delights of compiling a new kind of corpus, exemplified by ELFA, a spoken lingua franca database. Exploratory corpora may look risky, but they have the advantage of posing novel questions.

The course also includes a series of lectures. Antti Arppe will give a lecture entitled “Exploiting Multiple Methods in Linguistics”, which will be followed by a case study. He will also, tentatively, give a lecture entitled “Basic Explorations with Univariate Analysis”. Pille Eslon, Annekatrin Kaivapalu and Erika Matsak will give a lecture on learner language and language corpora. Professor Raymond Hickey from the University of Duisburg-Essen will give three talks on using the Internet as a corpus. There is also room for student presentations. If you would like to present your own corpus-related PhD study (15 min + 15 min), please let us know once you have been accepted to the course.

Site maintenance: Meri Korhonen / University of Eastern Finland