Transformers work very well because of three components: 1. Positional Encoding, 2. Attention, and 3. Self-Attention. Positional encoding enhances the data itself with positional information rather than encoding word order into the structure of the network. As the network is trained on lots of text, it learns to interpret those positional encodings. This makes transformers easier to train than RNNs, which must process words sequentially.
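To make this concrete, here is a minimal sketch of the sinusoidal positional encoding used in the original Transformer paper, written in NumPy; the sequence length and embedding size are illustrative choices, not anything prescribed by the model.

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of sinusoidal positional encodings."""
    positions = np.arange(seq_len)[:, np.newaxis]         # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]        # (1, d_model/2)
    angles = positions / np.power(10000, dims / d_model)  # broadcast to (seq_len, d_model/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)  # odd dimensions use cosine
    return pe

# The encoding is simply added to the token embeddings, so position is
# carried in the data itself rather than in the network's structure.
embeddings = np.random.randn(10, 64)  # 10 tokens, 64-dim embeddings (toy sizes)
embeddings = embeddings + positional_encoding(10, 64)
```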
Attention is a concept popularized by the aptly titled paper “Attention Is All You Need”. It is a mechanism that allows a text model to look at every single word in the original sentence when deciding how to translate a word in the output. Plotting the attention weights as a heat map shows which source words the model looks at for each output word, which helps in understanding word alignment and grammar.
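As a sketch of where those weights come from, here is scaled dot-product attention in NumPy with toy matrices; the matrix of softmax weights it returns is exactly what an attention heat map plots. The shapes and random inputs are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (m, d_k), K and V: (n, d_k). Returns (output, attention weights)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 output positions
K = rng.standard_normal((5, 8))  # 5 input words
V = rng.standard_normal((5, 8))
output, weights = scaled_dot_product_attention(Q, K, V)
# weights[i, j] is how much output word i "looks at" input word j;
# plotting this matrix gives the familiar attention heat map.
```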
While attention captures the alignment between words, self-attention is about understanding the underlying meaning of a word so it can be disambiguated from other usages. This relies on an internal representation of the word, also referred to as its state. When self-attention is directed at the input text, the model can distinguish between, say, “Server, can I have the check?” and “I crashed the server”, interpreting the reference as a human waiter in one case and a machine in the other. The context of the surrounding words shapes this state.
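A hedged illustration of these context-dependent states, using the Hugging Face Transformers library with PyTorch: we pull BERT's hidden state for the word "server" in each sentence and compare them. The checkpoint and the cosine-similarity comparison are illustrative choices, not a fixed recipe.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def server_state(sentence: str) -> torch.Tensor:
    """Return BERT's contextual hidden state for the token 'server'."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index("server")]

human = server_state("server, can i have the check?")
machine = server_state("i crashed the server.")
# The two states differ because self-attention mixed in different context.
print(torch.cosine_similarity(human, machine, dim=0).item())
```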
BERT, an NLP model, makes heavy use of attention and can be used
for a variety of purposes such as text summarization, question answering,
classification, and finding similar sentences. BERT also powers features of Google Search and Google Cloud AutoML Natural Language. Google has made BERT available for download through TensorFlow, while Hugging Face offers its Transformers library in Python.
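For instance, a couple of the tasks above can be tried in a few lines with Hugging Face's pipeline API. Note that these pipelines download whatever default checkpoints the library ships (BERT-family models for question answering, for example), so the exact models are an assumption here, not a fixed choice.

```python
from transformers import pipeline

# Question answering: extract an answer span from a context passage.
qa = pipeline("question-answering")
print(qa(question="What does BERT use?",
         context="BERT is an NLP model that makes heavy use of attention."))

# Classification: the default pipeline uses a fine-tuned sentiment head.
classifier = pipeline("text-classification")
print(classifier("Transformers are remarkably easy to work with."))
```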
A recent Gartner study on Copilot found that the most
successful pilots focus on demonstrating business potential, not on technical
feasibility. The difference between the two is whether the transformative
potential of this technology is actually realized. Because the technology is still
broad and emerging, IT leaders find it hard to prioritize generative AI use
cases. Mature AI organizations involve business partners and software engineers as
key members of their AI projects. Generative AI allows for faster development cycles
than traditional AI projects, and with those shorter cycles, success depends even more
on rapid testing, refinement, and the elimination of low-priority, low-severity use cases.