Frequency of idiomatic word groups in the www: a comparative language study




... wie der Esel zwischen 2 Heuhaufen.
Groups of words often take on a meaning that goes beyond the combined meanings of the individual words (idioms, set phrases). Search engines can identify such groups in large text corpora. By far the largest text corpus is the www, a heterogeneous collection of all sorts of texts, from high art down to the most trivial gossip.

Mentir comme un arracheur de dents

A watched pot never boils
Although the www changes constantly, screening it extensively for word groups returns surprisingly robust results (Berger, J Quant Linguist 26: 81-94, 2019). The proposed project aims to extend this study to more languages by recruiting students of various mother tongues.
Aquila non captat muscas
MEi:CogSci is a joint master’s programme offered by the Comenius University in Bratislava, Eötvös Loránd University Budapest, University of Ljubljana, and University of Vienna & Medical University of Vienna. In March 2019, Viktoriia Vinokurova and Mustafa Mohammed started to extend the study to Russian and to Arabic, respectively.

