The DFCS drone video sensing platform is intended to facilitate writing video sensing applications for a variety of purposes. The platform strives to do one video sensing use case very well with its proprietary model and RAG-enhanced autonomous routing, but the same knowledge base, world catalog, vector search, and image-processing pipeline can also serve other applications.
The analytics engine of the DFCS platform supports natural language queries, and this behavior can be customized. For example, "Contextual Embeddings", popularized by the Anthropic and Microsoft communities for text data, is directly applicable to the DFCS world catalog. Given a document, the technique splits it into chunks of text and prepends to each chunk a chunk-specific explanatory context generated by an LLM, so no information is lost at chunk boundaries. Call this a ContextRetrieval class. Its methods are implemented with LangChain for Azure: one loads a PDF and parses it with AzureAIDocumentIntelligenceLoader, another splits the document efficiently while preserving information at chunk boundaries, and a third generates a context for each chunk using a ChatPromptTemplate that identifies the chunk's main topic, its relation to the broader document, and any key figures, dates, or percentages.
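The following is a minimal sketch of such a ContextRetrieval class, assuming LangChain with Azure AI Document Intelligence and an Azure OpenAI chat deployment. The endpoint, key, deployment names, chunk sizes, and prompt wording are illustrative assumptions rather than part of the DFCS platform itself.

```python
# Sketch of ContextRetrieval; configuration values are placeholders.
from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import AzureChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter


class ContextRetrieval:
    def __init__(self, endpoint: str, key: str, llm: AzureChatOpenAI):
        self.endpoint = endpoint
        self.key = key
        self.llm = llm

    def load_pdf(self, file_path: str):
        """Load and parse a PDF with Azure AI Document Intelligence."""
        loader = AzureAIDocumentIntelligenceLoader(
            api_endpoint=self.endpoint,
            api_key=self.key,
            file_path=file_path,
            api_model="prebuilt-layout",
        )
        return loader.load()

    def split_document(self, docs, chunk_size: int = 1000, overlap: int = 200):
        """Split into chunks; the overlap keeps boundary information intact."""
        splitter = RecursiveCharacterTextSplitter(
            chunk_size=chunk_size, chunk_overlap=overlap
        )
        return splitter.split_documents(docs)

    def contextualize_chunks(self, whole_document: str, chunks):
        """Prepend an LLM-generated, chunk-specific context to every chunk."""
        prompt = ChatPromptTemplate.from_messages([
            ("system",
             "You situate a chunk within the whole document. Identify the "
             "chunk's main topic, how it relates to the broader context, and "
             "any key figures, dates, or percentages. Reply with the succinct "
             "context only."),
            ("human",
             "<document>\n{document}\n</document>\n<chunk>\n{chunk}\n</chunk>"),
        ])
        chain = prompt | self.llm
        contextualized = []
        for chunk in chunks:
            context = chain.invoke(
                {"document": whole_document, "chunk": chunk.page_content}
            ).content
            chunk.page_content = f"{context}\n\n{chunk.page_content}"
            contextualized.append(chunk)
        return contextualized
```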
While the contextualized chunks can be used with RAG for semantic similarity search via dense retrievers, they can also be used with lexical search such as Best Match 25 (BM25) to find exact matches and specific terminology. The class therefore has an additional method to create a BM25 index that supplements retrieval. When a query enters the pipeline, the dense and sparse retrievers quickly narrow the search space to the relevant chunks. For each of these chunks, a large language model generates an answer from a prompt. With the generated answers, the ContextRetrieval class performs a second-stage step called re-ranking, analyzing the deep semantic relationship between the query and each chunk and considering factors such as factual alignment, answer coverage, and contextual relevance. This was demonstrated to produce better matches than the baselines of chunks without contexts or the whole document.
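A sketch of the hybrid retrieval and re-ranking step might look like the following, using LangChain's BM25 retriever alongside a FAISS dense index. The ensemble weights, candidate counts, and the scoring prompt are assumptions chosen for illustration.

```python
# Hybrid sparse+dense retrieval followed by LLM re-ranking (illustrative).
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings


def build_hybrid_retriever(contextualized_chunks, embeddings: AzureOpenAIEmbeddings):
    """Combine a sparse BM25 index with a dense vector index."""
    sparse = BM25Retriever.from_documents(contextualized_chunks)
    sparse.k = 20
    dense = FAISS.from_documents(contextualized_chunks, embeddings).as_retriever(
        search_kwargs={"k": 20}
    )
    # Equal weighting of lexical and semantic scores; tune per workload.
    return EnsembleRetriever(retrievers=[sparse, dense], weights=[0.5, 0.5])


def rerank(llm: AzureChatOpenAI, query: str, chunks, top_k: int = 5):
    """Second-stage re-ranking: score each candidate chunk on factual
    alignment, answer coverage, and contextual relevance."""
    prompt = ChatPromptTemplate.from_messages([
        ("system",
         "Rate from 0 to 100 how well the chunk answers the query, "
         "considering factual alignment, answer coverage, and contextual "
         "relevance. Reply with the number only."),
        ("human", "Query: {query}\n\nChunk:\n{chunk}"),
    ])
    chain = prompt | llm
    scored = []
    for chunk in chunks:
        reply = chain.invoke({"query": query, "chunk": chunk.page_content}).content
        try:
            score = float(reply.strip())
        except ValueError:
            score = 0.0
        scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]
```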
Since the world catalog is organized by location, contextual embeddings can leverage the indexes on location and the associated keypoint features so that responses to a user query are better aligned with the place being asked about.
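As one possible way to exploit those indexes, a dense search can be scoped by location metadata before ranking. The metadata key used below ("tile_id") is hypothetical and not the catalog's actual schema.

```python
# Illustrative sketch: location-scoped dense retrieval over the world catalog.
from langchain_community.vectorstores import FAISS


def location_scoped_search(store: FAISS, query: str, tile_id: str, k: int = 5):
    """Restrict similarity search to chunks tagged with a catalog location,
    so contextual embeddings compete only within the relevant tile."""
    return store.similarity_search(query, k=k, filter={"tile_id": tile_id})
```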