From Context to Meaning

Distributional Models of the Lexicon in Linguistics and Cognitive Science

Special issue of the Italian Journal of Linguistics

edited by Alessandro Lenci


Background and Motivation

The hypothesis that word co-occurrence statistics extracted from text corpora can provide a natural basis for semantic representations has attracted growing attention in both computational linguistics and cognitive science. Some variation of the so-called distributional hypothesis – i.e. words with similar distributional properties have similar semantic properties – lies at the heart of a number of computational approaches. These approaches share the assumption that semantic representations of the lexical space can be built dynamically through the statistical analysis of the contexts in which words occur.
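As a rough illustration of the distributional hypothesis, the sketch below builds co-occurrence count vectors from a toy corpus and compares words by cosine similarity. The corpus, the window size, the function names and the use of raw counts with cosine are all assumptions made for this example; actual distributional models differ widely in context definition, weighting and similarity measure.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words found within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a word is not counted as its own context
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, invented for illustration only.
TOY_CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "the car needs fuel",
]

vectors = cooccurrence_vectors(TOY_CORPUS)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_fuel = cosine(vectors["cat"], vectors["fuel"])
print(f"cat ~ dog:  {sim_cat_dog:.3f}")
print(f"cat ~ fuel: {sim_cat_fuel:.3f}")
```

In this toy setting, "cat" and "dog" occur in largely the same contexts and so receive a high similarity score, while "cat" and "fuel" share no context words at all.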

Distributional models of meaning are directly related to the classical discovery procedures of the structuralist tradition, and to the collocational analysis typical of corpus linguistics. Both have gained new explanatory power thanks to the availability of large-scale textual corpora, the development of more sophisticated mathematical techniques for modeling the statistical co-occurrence of words, and – last but not least – the existence of computational frameworks that have turned the distributional approach into an effective tool for building lexical semantic representations from texts.

Computational analysis has been applied to extract different types of lexical semantic properties, ranging from synonymy relations to argument structure. From an epistemological point of view, distributional approaches raise a twofold question: to what extent lexical properties can be reduced to usage patterns derived from texts, and what role context plays in shaping the semantic space of words. Indeed, computational approaches to meaning claim to provide not only a different way to investigate the semantic properties of words through “text-mining” processes, but also the opportunity to design radically new styles of usage-anchored semantic representations.

In cognitive science, many researchers have strongly argued for the psychological validity of distributional semantic spaces. For instance, corpus-derived measures of semantic similarity have been assessed in a variety of psychological tasks, ranging from similarity judgments to simulations of semantic and associative priming. Distributional techniques have also been applied to model child lexical development as a bootstrapping process in which lexical and grammatical categories are extracted from the statistical distributions in adult input.

Goal of the Special Issue

The aim of this thematic issue of the Italian Journal of Linguistics is to provide an interdisciplinary arena for an in-depth analysis of the potential, as well as the limits, of the distributional paradigm as a viable alternative to more traditional ways of representing meaning. Questions discussed in the papers include:

  • What is the role of distributional approaches in building cognitively and linguistically plausible semantic representations?
  • How can context-derived representations shed new light on semantic conundrums of human lexical competence, such as polysemy, semantic analogy making, the syntax-semantics interface, etc.?
  • What is the impact of distributional models on the psycholinguistic and neuropsychological investigation of the mental lexicon?
  • To what extent can the distributional paradigm be a viable model for the acquisition of semantic categories?
  • What are the intrinsic explanatory limits of “usage-based” approaches to meaning?
  • What are the parameters these models depend on (e.g. context format, mathematical modeling tools, etc.)?
  • Which aspects of meaning are best captured by distributional analysis, and which appear to lie inexorably beyond its limits?
  • Which semantic representations can be built using distributional information, and how do they relate to traditional ways of representing meaning, such as feature structures, frames, etc.?

About the Editor

Alessandro Lenci is a tenured researcher at the University of Pisa, Department of Linguistics, where he teaches Computational Linguistics and directs the Laboratory for Computational Linguistics.

He received his PhD in Linguistics in 1999 from the Scuola Normale Superiore in Pisa. He has published extensively on lexical semantics, computational lexica and natural language processing. His current research focuses on corpus-based semantic models and their application in (computational) linguistics and in the cognitive sciences. In 2008, he organized the ESSLLI Workshop on Distributional Semantics with Marco Baroni and Stefan Evert. In 2009, he co-lectured the ESSLLI Advanced Course on Distributional Semantic Models with Stefan Evert.

