Welcome to YKI-Corpus!

The corpus has been compiled from the examinations of the National Certificates of Language Proficiency. It is intended for research purposes.

The National Certificates of Language Proficiency is a language testing system for adults and is not connected to any curriculum or syllabus. Examinations can be taken in nine languages: English, Finnish, French, German, Italian, North Sami, Russian, Spanish, and Swedish. There are three test levels (Basic, Intermediate, and Advanced), which together cover six levels of proficiency (1-2, 3-4, and 5-6, respectively). The corpus contains data from all nine languages and all levels.

The corpus contains both quantitative and qualitative data. The quantitative data provide corpus users with assessments of the four skills of reading, writing, listening, and speaking (and, prior to 2012, structures and vocabulary), as well as background information given by the candidates during the examinations. Because filling in the background information questionnaire is voluntary, these data are not available for every candidate.

The qualitative data contain oral and written performances, i.e. candidates’ responses to the test tasks. There are three written performances and one oral performance from each candidate.

The data are linked by candidates’ ID numbers, which makes it possible to search across the different types of data. The different types of data are not necessarily equal in size.
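
As a rough illustration of how records that share a candidate ID could be brought together, here is a minimal Python sketch. The record layout, the field names (candidate_id, skill, level, type, text) and the function link_by_candidate are hypothetical and chosen only for this example; they do not describe the actual format or interface of the corpus.

```python
# Hypothetical sample records; field names are illustrative only
# and are not the actual corpus schema.
assessments = [
    {"candidate_id": 101, "skill": "writing", "level": 3},
    {"candidate_id": 102, "skill": "writing", "level": 4},
]
performances = [
    {"candidate_id": 101, "type": "written", "text": "(response text)"},
    {"candidate_id": 103, "type": "oral", "text": "(response text)"},
]


def link_by_candidate(assessments, performances):
    """Group both kinds of records under their shared candidate ID."""
    linked = {}
    for row in assessments:
        entry = linked.setdefault(
            row["candidate_id"], {"assessments": [], "performances": []}
        )
        entry["assessments"].append(row)
    for row in performances:
        entry = linked.setdefault(
            row["candidate_id"], {"assessments": [], "performances": []}
        )
        entry["performances"].append(row)
    return linked


linked = link_by_candidate(assessments, performances)
```

In this sketch a candidate who appears in only one of the two lists still gets an entry, with the other list left empty, which mirrors the fact that the different types of data need not be equal in size.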

The corpus is dynamic in nature: new data are added after each test round.

More information about the National Certificates of Language Proficiency is available at www.oph.fi and www.solki.jyu.fi.