Reflections on the "why" of language processing

It is interesting to reflect, for a moment, on the lengths I have gone to in order to learn, understand and develop natural language processing methods and tools. After all, I didn't start out with the intention of developing a specific expertise in NLP; rather, I always considered it a means to an end. That end was the ability to ask and answer various research questions, and to derive insight into some social or ecological phenomenon of interest.

In my dissertation research, one of my projects focused on measuring the coherence of ideological arguments related to agri-food movements (e.g. local food, slow food, sustainable agriculture). I developed a modified form of discourse analysis to identify and connect statements among food movement actors, in order to better determine the implicit connections between what people say they stand for and the ways they act (or don't act) to materialize their ethical or moral arguments. Looking back, this was, and still is, a rather amazing NLP task. I didn't realize it at the time, but it would help launch an almost obsessive relationship with language processing.

Chapter 2 of the dissertation can be accessed here: Rhetoric and Reality of Social Equity in Agri-Food Movements.
Abstract
Increasing sustainable food consumption is essential to reducing the environmental and social impacts of our current industrial agri-food system. However, public perceptions of sustainability remain contested, fluid, and in some cases incomplete. Groups locked in the fight over the future of our food systems advocate varying perspectives of sustainability that reflect their interests and ideologies. This may complicate people's ability to make adequately informed purchasing decisions, undermining the efficacy of a sustainable consumerism. A sustainable agri-food system is often framed with an emphasis on environmental health and economic viability for farmers. Yet social equity represents the "third pillar" of sustainability, and is an important aspect of sustainable food consumption. This study analyzes the degree to which social equity is included among efforts to shape public understanding of sustainable food and farming systems. In particular, I analyze websites from groups advocating for alternative and conventional agri-food systems in the United States. This study shows that social equity is a difficult concept to communicate. This difficulty has the potential to minimize and obfuscate the role of equity in consumers' food purchasing decisions. The implication is an environmentally and economically sustainable food system that remains inequitable. Results also indicate that advocates of alternative agri-food systems need to be more inclusive of social equity in order to maintain a distinction from conventional actors who are adopting the language of sustainability to capitalize on consumer demand.

Fast forward to my post-doctoral research, and again I found myself processing and connecting language, this time to represent cognitive structures, or frames, about disaster events. The goal was to determine whether specific social learning activities altered these cognitive frames about the preceding disaster event, and whether these frames led to greater community resilience via a shared or collective frame about the event and ways to respond in the future.
Smith, J. G., DuBois, B., & Krasny, M. E. 2015. Framing Resilience through Social Learning: Impacts of Environmental Stewardship on Youth in Post-disturbance Communities. Sustainability Science, 1-13. DOI: 10.1007/s11625-015-0348-y
In both examples, I was less interested in the mechanics of NLP, except insofar as I could ensure I was using the correct methods to generate results relevant to my research questions. However, the deeper I have dived, the more relevant those mechanics have become, and this has meant working through the mathematical formalisms used to describe language models and semantic representations of data. And while I can honestly say I'm a poor student of arithmetic, I have come to find a home in set theory, abstract algebra and the formalism of deductive methods. Besides, computers are great at arithmetic!

Through these experiences, and with a growing number of new research questions driving my intellectual curiosity, NLP as a discipline in human-machine learning has taken on a much more central role in my overall research trajectory.

