Posts

Feature Vectors as Criteria Sets in a Q-Language Model

Following the previous post on cover sets in q-analysis, it is important to consider another way of constructing simple cover sets, where key terms represent criteria for determining ranked “meaning” in a text stream. This is particularly relevant in the automated formation of ontologies from a given set of text documents. The recent proliferation of formalized linked vocabularies for domain-specific knowledge representations provides a valuable input source for generating new cover sets in the q-language system. The elements (vocabulary words) are the most important features because they correspond directly to the terms we are hoping to cluster documents around. In a sense, we can ensure some level of relatedness between terms and our document vectors through simple co-occurrence calculations. In this way, the features operate as "attractors": points around which other terms congregate due to the rules of a given relation. In this case, it ...
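As a rough illustration of the "simple co-occurrence calculations" mentioned in this excerpt, the sketch below counts which tokens fall within a small window of each vocabulary term, so that each term acts as an attractor for its neighbors. The function name `attractor_counts`, the `window` parameter, and the sample sentence are all hypothetical; the post does not specify how its co-occurrence scores are computed.

```python
from collections import Counter, defaultdict


def attractor_counts(tokens, vocabulary, window=2):
    """Count tokens occurring within `window` positions of each vocabulary term.

    Hypothetical helper: a plain windowed co-occurrence count, not
    necessarily the calculation used in the original post.
    """
    vocab = set(vocabulary)
    counts = defaultdict(Counter)
    for i, tok in enumerate(tokens):
        if tok not in vocab:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tok][tokens[j]] += 1
    return counts


# Made-up token stream and vocabulary, for illustration only.
tokens = "soil moisture affects crop yield and soil health".split()
counts = attractor_counts(tokens, ["soil", "crop"], window=1)
```

Each `counts[term]` is a `Counter` of neighboring tokens, which could then be ranked or normalized to decide which terms congregate around a given feature.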
Recent posts

Defining "lenses" for a Q-Language Topology

In my previous post, Q-Analysis of Natural Language, I started to describe a path for applying q-analysis in the study of natural language. One of the particularly interesting aspects of q-analysis is the ability to connect hierarchical data in a rather straightforward (although non-trivial) manner. The process of connecting data is described through the definition of a relational mapping and the rules defined for that mapping. The relational mappings result in a new subset consisting of the combinations of the two input sets. The resulting combinatorial set serves as a cover for constructing q-connected simplices, which allows for inspection of the q-connectivity of sets across hierarchical scales. The example below, described in Beaumont and Gatrell, shows the mappings between elements at different hierarchical levels of N. The structure is the resulting mapping between three interrelated sets defined by the relation. In the language of q-analysis, ...

Q-Analysis of Natural Language

Q-Analysis is a methodological perspective and language that can be applied to study system structure and its dynamics. Indeed, q-analysis has been dubbed the “language of structure” (Legrand 2002), because it provides both a mathematical framework and a particular vocabulary for defining system features and relationships (Atkin & Casti, 1977; Gould 1980). The mathematical framework of q-analysis is built on algebraic topology, a branch of abstract mathematics concerned with space and shape under continuous deformation (e.g. the bending, compressing, and stretching of shapes). In topology, and specifically q-analysis, shape is defined by the relationships between elements in open sets. The relationships between these sets produce new sets representing edges, faces and simplicial complexes that form as a result of the relational mapping λ from some set A and some set B to a new set C. The relation λ represents a rule for defining the condit...
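To make the idea of a relation λ between two sets more concrete, here is a minimal sketch of the standard q-connectivity computation: each element of A is treated as a simplex whose vertices are the elements of B it is related to, and two simplices are q-near when they share at least q + 1 vertices. The function name `q_components`, the dictionary representation of λ, and the example documents and terms are all assumptions made for illustration; they do not come from the post.

```python
def q_components(relation, q):
    """Group simplices into q-connected components.

    `relation` maps each element of A (a simplex) to the set of elements
    of B (its vertices) it is related to under the relation. Two simplices
    are q-near when they share at least q + 1 vertices.
    """
    # Only simplices of dimension >= q can take part at level q.
    eligible = [a for a in sorted(relation) if len(relation[a]) - 1 >= q]
    components, seen = [], set()
    for start in eligible:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            a = stack.pop()
            if a in comp:
                continue
            comp.add(a)
            for b in eligible:
                shared_dim = len(relation[a] & relation[b]) - 1
                if b not in comp and shared_dim >= q:
                    stack.append(b)
        seen |= comp
        components.append(sorted(comp))
    return components


# Hypothetical relation: documents (set A) related to terms (set B).
relation = {
    "doc1": {"soil", "crop", "yield"},
    "doc2": {"crop", "yield", "market"},
    "doc3": {"market", "price"},
}
level0 = q_components(relation, 0)
level1 = q_components(relation, 1)
level2 = q_components(relation, 2)
```

In this toy relation all three documents are connected at q = 0, while at q = 1 only `doc1` and `doc2` remain linked because they share two terms, and at q = 2 every remaining simplex stands alone.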

Reflections: Semantic Representations for Decision-Making (In Financial Markets)

Over the last several months I have been working on an exciting project to improve connections between data and knowledge resources related to climate change and food insecurity. The goal of the project has been to classify and create ontologies of applied agriculture research and open linked data. While necessary, this is not incredibly cutting edge. However, my approach was different and, if successful, has broad application in information retrieval and the semantic web. My proposal was based on a straightforward idea: explore machine learning to infer semantic relationships between open linked data and knowledge resources published by U.S. university extension systems. While the idea is certainly straightforward, it has been anything but simple. Nevertheless I have persisted, hacking away obsessively at this project for almost the past 8 months. During this time, I began working on another idea related to a concept I call 'resilience based investment.' Very...

Reflections on the "why" of language processing

It is interesting to reflect, for a moment, on the lengths I have gone to in order to learn, understand and develop natural language processing methods and tools. After all, I didn't start out with the intention of developing a specific expertise in NLP; rather, I always considered it a means to an end. The end being the ability to ask and answer various research questions, and derive insight into some social or ecological phenomena of interest. In my dissertation research, one of my projects focused on measuring the coherence of ideological arguments related to agri-food movements (e.g. local food, slow food, sustainable agriculture). I developed a modified form of discourse analysis to identify and connect statements among food movement actors, to better determine the implicit connections between what people say they stand for and the ways they act (or don't act) to materialize their ethical or moral arguments. Looking back, this is and was a rather amazing NLP task. I didn't realize ...

Notes on defining a language model

Wikipedia defines "Language Model" as "a probability distribution over sequences of words. Given such a sequence, say of length m, it assigns a probability $P(w_1, \ldots, w_m)$ to the whole sequence." The Stanford NLP Group similarly implies this definition through its description of language modeling in the context of Information Retrieval. This probability is expanded using the chain rule, defined by: $P(w_1, \ldots, w_m) = \prod_{i=1}^{m} P(w_i \mid w_1, \ldots, w_{i-1})$. See the chain-rule definition in the NLP Review of Basic Probability Theory. Generating a probability distribution is one part of building a usable language processing infrastructure. A useful statistical language model typically depends on the specific need, or problem you want to solve, and of course the domain of your problem. Thus the ability to cluster and partition sequences of words based on their likely occurrence given a query as input can serve as the starting point for connecting probability distri...
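As a small worked illustration of assigning a probability to a whole word sequence via the chain rule, the sketch below estimates each conditional term from counts. It makes a simplifying assumption that is not stated in the post: each word is conditioned only on the previous word (a bigram approximation), with unsmoothed maximum-likelihood estimates. The function names and the three-sentence corpus are hypothetical.

```python
from collections import Counter


def train_bigram(sentences):
    """Collect context and bigram counts, with "<s>" marking sentence start."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        contexts.update(tokens[:-1])             # every token used as a context
        bigrams.update(zip(tokens, tokens[1:]))  # (previous word, word) pairs
    return contexts, bigrams


def sequence_probability(sentence, contexts, bigrams):
    """Chain-rule product of P(word | previous word), without smoothing."""
    tokens = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if contexts[prev] == 0:
            return 0.0  # unseen context: the unsmoothed estimate is zero
        prob *= bigrams[(prev, word)] / contexts[prev]
    return prob


# Made-up corpus, for illustration only.
corpus = ["the market fell", "the market rose", "the price rose"]
contexts, bigrams = train_bigram(corpus)
p_seen = sequence_probability("the market rose", contexts, bigrams)
p_unseen = sequence_probability("the price fell", contexts, bigrams)
```

Here `p_seen` works out to 1 × 2/3 × 1/2 = 1/3, while `p_unseen` is zero because "price fell" never occurs in the corpus, which is the usual motivation for adding smoothing in a practical model.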

Introduction to Self

This blog is simply a public account of my personal learning process, experimentation and self-discovery in developing ways to process human discourse to better understand human behavior in politics and financial markets. That said, this blog will likely be a collection of notes rather than a massive brain dump.