Guest speakers

Damon Mayaffre (Univ. Côte d'Azur / CNRS) - The DNA of ADT: at the limits of interdisciplinarity

The DNA of ADT (analyse de données textuelles, i.e. textual data analysis) lies in its interdisciplinary nature. Statistical description informs linguistic interpretation. The quantitative feeds the qualitative. Numbers interact with words. However, the interdisciplinary approach requires a dual awareness: the linguistic awareness of the text object and the mathematical awareness of the statistical index. In this contribution, the hermeneutical dimension of the text is posited, the better to summon a statistic that is not probative but exploratory. The texts, brought together in a corpus, are interpretative paths: ADT integrates the measure, the coefficient, the tree or the vector into these reading paths.

Keywords: text, textual statistics, digital humanities, AFC, CA, textometry, corpus semantics, digital hermeneutics.

Barbara McGillivray (King's College London) - Quantitative analyses of diachronic semantics: corpus annotation, disambiguation, and change detection

The slides are available here: McGillivray_JADT_1.pdf

The digital transformation, combined with advances in computational methods for natural language processing, has paved the way for new quantitative research on textual data in previously underexplored areas such as diachronic semantics, the study of word meaning over time. Leveraging large-scale datasets, we can now quantitatively analyse the evolution of word meanings over both short and long time intervals, not only testing linguistic hypotheses, but also advancing research in disciplines such as history, literary studies, and lexicography.

Recent advances in automatic methods for the detection of semantic change from large corpora offer new quantitative perspectives on diachronic and historical semantics and on the relationship between semantic shifts and the broader cultural and societal context. In parallel, the semantic annotation of corpora, both manual and automatic via word sense disambiguation techniques drawn from natural language processing research, has the potential to reveal nuanced patterns in the distribution of words' meanings in context.

In this presentation I will offer an overview of computational techniques for the annotation, detection, and analysis of semantic shifts in diachronic texts of various periods, from ancient languages to social media. Drawing upon a range of projects across different languages, the talk delves into methodological challenges and the broader implications of this research for quantitative linguistics.

Ashwin Ittoo (ULiège) - LLMs (Large Language Models) and textometry

Textometry, the quantitative analysis of texts, has been a field in constant flux over recent decades. Initially used mainly in linguistics, it quickly extended its influence to other areas such as social science research, the analysis of textual data on the internet, and even digital literature.

This evolution has been greatly facilitated by technological progress. The rise of LLMs has profoundly influenced the field of textometry, offering considerable advantages but also raising significant concerns. Among their strengths is the capacity to analyse and process vast quantities of text efficiently, allowing researchers to explore massive corpora and uncover meaningful trends and information.

My talk aims to demystify LLMs (Large Language Models). These models, such as GPT (Generative Pre-trained Transformer), have aroused both fascination and concern because of their ability to generate fluent, coherent text similar to that produced by humans. We will look at the underlying mechanisms and the pitfalls of these models. We will then turn to the impact of these models on textometry.
