Sequences from usage
Introduction: Neural networks democratize dimensions, but the pattern of queries served by a neural network informs more about some of those dimensions than others over long periods of time. These patterns are like the sequences used in transformers, except that the ordering between successive queries is not given; it must be learned over time and curated as idiosyncratic to the user sending the queries. Applications of such learners are significant in spaces such as personal assistants, where the accumulation of queries and the discovery of these sequences offer insights into the interests of the user. The idea behind capturing sequences over large batches of queries is that they are representational rather than literal and are stored as state rather than as dimensions. There is also a need to differentiate the anticipation, prediction, and recommendation of the next query from the determination of user traits from usage over time. The latter plays no role in immediate query responses, but it adjusts the understanding of whether a response will come with high precision and recall for that user. Following this distinction, the approach differs from running a latent semantic classifier on all incoming requests, because such a classifier is short-term: it helps with the next query or response but does not assist with the inference of the user's traits.
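As a minimal sketch of what it means to store these sequences as state rather than as dimensions, the snippet below keeps a bounded, per-user record of queries that is curated outside the response path. The class and method names, and the bound on sequence length, are illustrative assumptions rather than part of any existing system.

```python
from collections import defaultdict, deque

class UserSequenceState:
    """Per-user query sequences kept as state, separate from the
    model's immediate response path (names are illustrative)."""

    def __init__(self, max_len: int = 1000):
        # One bounded sequence per user, stored as state rather than as feature dimensions.
        self._sequences = defaultdict(lambda: deque(maxlen=max_len))

    def record(self, user_id: str, query: str) -> None:
        # Curated asynchronously; never consulted when answering the query itself.
        self._sequences[user_id].append(query)

    def sequence(self, user_id: str) -> list[str]:
        return list(self._sequences[user_id])

# Usage: record queries as they arrive; trait inference reads the sequence later.
state = UserSequenceState()
state.record("user-42", "best hiking trails near Seattle")
state.record("user-42", "lightweight tents under 2 kg")
print(state.sequence("user-42"))
```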
If the neurons in a large language model were to light up for each query and response, there would be many flashes across millions of them. While classifiers help establish the salience of these flashes for understanding the emphasis of the next query and response, it is the capture of repetitive flash sequences over a long enough time span in the same space that indicates recurring themes and a certain way of thinking for the end user. Such repetitions have their origins in the way a person draws upon personal style, habits of thought, and vocational habits to form their queries. Even when the subjects a user brings up are disparate, the way they go about exploring them may still draw upon their beliefs, habits, and way of thinking, leading to sequences that are, at the very least, representative of the user.
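One hedged way to make these repetitive flash sequences concrete is to attach a salience label to each query (for example, the output of a classifier) and count the label n-grams that recur over a long window. The function name, window shape, and thresholds below are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

def repeated_flash_sequences(labels: Iterable[str], n: int = 3,
                             min_count: int = 2) -> Dict[Tuple[str, ...], int]:
    """Count n-grams of salience labels and keep those that recur,
    a stand-in for spotting repetitive flash sequences over time."""
    labels = list(labels)
    grams = Counter(tuple(labels[i:i + n]) for i in range(len(labels) - n + 1))
    return {gram: count for gram, count in grams.items() if count >= min_count}

# Illustrative labels attached to one user's queries over time.
history = ["gear", "trails", "weather", "gear", "trails", "weather", "budget"]
print(repeated_flash_sequences(history))   # {('gear', 'trails', 'weather'): 2}
```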
Curation and storage of these sequences resemble the state stored by transformers in that they must be encoded and decoded. Since these encoded states are independent of the forms in which the input appears, the approach eschews the vector representation and the regression and classification that follow from it, focusing instead on the capture and possible replay of the sequences. This is more along the lines of sequence databases, whose use is well known in the industry and which provide opportunities for interpretation not limited to machine learning.
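A sequence-database-style store can be as simple as an append-only log that captures encoded states in arrival order and replays them later for interpretation. The sketch below assumes a JSON-lines file; the path and record schema are illustrative.

```python
import json
import time
from typing import Iterator, List

class SequenceLog:
    """Append-only, replayable log of encoded states, in the spirit of a
    sequence database (file path and schema are illustrative)."""

    def __init__(self, path: str = "user_sequences.jsonl"):
        self.path = path

    def append(self, user_id: str, encoded_state: List[float]) -> None:
        with open(self.path, "a") as f:
            f.write(json.dumps({"ts": time.time(), "user": user_id,
                                "state": encoded_state}) + "\n")

    def replay(self, user_id: str) -> Iterator[List[float]]:
        # Yield the user's encoded states in arrival order; no regression
        # or classification is implied by the storage itself.
        with open(self.path) as f:
            for line in f:
                event = json.loads(line)
                if event["user"] == user_id:
                    yield event["state"]
```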
Continuous state aggregation for analysis might also be possible but is not part of this discussion. Natural language processing relies on encoding and decoding to capture and replay state from text. This state is discrete and changes from one set of tokenized input texts to another. As the text is transformed into vectors of predefined feature length, it becomes available for regression and classification. The state representation remains immutable and is decoded to generate new text. If, instead, the encoded state could be accumulated with the subsequent text, it is likely to bring out the topic of the text, provided the accumulation is progressive. A progress indicator could be the mutual information value of the resulting state. If information is gained, the state continues to aggregate and can be stored in memory; otherwise, the pairing is discarded. This results in a final aggregated state that becomes increasingly inclusive of the topic in the text.
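The sketch below illustrates this progressive aggregation. A token-count encoder and an entropy-gain test stand in for the transformer encoder and the mutual information criterion; those stand-ins, and the threshold, are assumptions made for illustration rather than the method itself.

```python
import math
from collections import Counter
from typing import Iterable

def encode(text: str) -> Counter:
    # Stand-in encoder: token counts instead of a transformer's hidden state.
    return Counter(text.lower().split())

def entropy(counts: Counter) -> float:
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values()) if total else 0.0

def aggregate_states(texts: Iterable[str], min_gain: float = 0.05) -> Counter:
    """Fold each new encoded state into the running state only if it adds
    information; otherwise discard the pairing. Entropy gain serves here as
    a simple proxy for the mutual-information check described above."""
    state = Counter()
    for text in texts:
        candidate = state + encode(text)
        gain = entropy(candidate) - entropy(state)
        if gain > min_gain:   # information gained: keep aggregating, store in memory
            state = candidate
        # else: the pairing adds nothing about the topic and is dropped
    return state

texts = ["hiking trails near rainier",
         "hiking trails near rainier",          # near-duplicate, likely rejected
         "permits and weather for rainier"]
print(aggregate_states(texts).most_common(5))
```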
NLP has algorithms like BERT that set the precedent for state encoding and decoding, but they act on every input and are required to be contiguous in their processing of the inputs. A secondary pass for selecting and storing the states is suggested in this document.
Reference: https://booksonsoftware.com/text