The Santiago UNiversity Corpus Of Discussions in Academic Contexts (SUNCODAC) is a collection of asynchronous online discussions held through the Moodle learning platform as part of an undergraduate course on English-Spanish translation. The discussions represent a highly focused and goal-oriented type of interaction, in which students were prompted to collaborate on the translation of a short passage by offering critical feedback on an initial draft proposed by a classmate.

The compilation spans four consecutive academic years, from 2014 to 2017. During the first two years, most posts were written in Spanish, and only a few of the discussions contain posts in Galician or in English. In the 2016 and 2017 editions of the course, only English was used. As a result, the corpus contains roughly comparable numbers of discussions in Spanish and in English, thus permitting comparative analyses of language features and patterns characteristic of first- and second-language contexts. The regular participation of a small number of exchange students of diverse nationalities defines a context where English in particular plays a genuine role as a lingua franca.

The texts are unedited and have been stored in XML format with minimal annotation, restricted to the relevant metadata (L1 and gender of post author, date and time of post, type of contribution to the forum, etc.). The current version of the online search tool allows the user to search for words, short phrases and words in the proximity of other words, either in the whole corpus or in specific sections selected through a series of filters. It also allows the user to retrieve full texts according to selected criteria for more qualitative analyses.
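For readers who want a concrete picture of this kind of lightly annotated XML and of a proximity search restricted by metadata filters, the following Python fragment is a minimal sketch. It is not the SUNCODAC schema or the code behind the online tool: the element names (discussion, post), the attribute names (author_l1, gender, date, time, type), the sample posts and the functions tokenize, near and search are all assumptions made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are assumptions,
# not the actual SUNCODAC schema. The sample posts are invented.
SAMPLE = """
<discussion year="2016">
  <post author_l1="Spanish" gender="F" date="2016-03-01" time="10:15" type="draft">
    I think the first sentence sounds a bit too literal in English.
  </post>
  <post author_l1="German" gender="M" date="2016-03-02" time="18:40" type="feedback">
    I agree, but I would keep the word order of the second sentence.
  </post>
</discussion>
"""


def tokenize(text):
    """Split a post into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def near(tokens, a, b, window=3):
    """Return True if words a and b occur within `window` tokens of each other."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


def search(xml_text, a, b, window=3, **filters):
    """Return the metadata of posts where a occurs near b.

    Keyword arguments act as metadata filters: a post is kept only if
    each named attribute has exactly the given value.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for post in root.iter("post"):
        if any(post.get(key) != value for key, value in filters.items()):
            continue
        if near(tokenize(post.text or ""), a, b, window):
            hits.append(dict(post.attrib))
    return hits
```

Under these assumptions, search(SAMPLE, "word", "sentence", window=5, type="feedback") would return the metadata of the second sample post only, while the same query with the default window of 3 would return nothing, since the two words are five tokens apart.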

The SUNCODAC corpus was developed with generous financial support from the Spanish Ministry of Education, Innovation and Universities (grant PGC2018-093622-B-100), the European Regional Development Fund, the University of Santiago and the Regional Government of Galicia (grant ED431B2021/02). The compilation was prepared by Mario Cal Varela and Francisco Javier Fernández Polo, members of the SPERTUS research group. The tool interface was developed by Mario Barcala and the NLPGO group, who also converted the texts to XML format. SUNCODAC is now freely available to researchers.


SUNCODAC. 2021. The Santiago University Corpus of Discussions in Academic Contexts. Santiago de Compostela: University of Santiago de Compostela. [] (date of last access)