Thursday, March 20, 2025

 An earlier article1 described the creation and usage of a Knowledge Base for LLMs. One of the ideas emphasized there is that the system should provide an end-to-end service, not merely a provisioned vector database. In this regard, it is important to call out that semantic similarity over embeddings alone does not capture the nuances of a query. In vector databases, each data point (document, image, or any object) is often stored along with metadata – structured information that provides additional context. For example, metadata could include attributes like timestamp, author, location, category, etc. During a vector search, filters can be applied on this metadata to narrow down the results, ensuring only relevant items are retrieved. This is particularly helpful when the dataset is large and diverse. This technique is sometimes referred to as “metadata filtering”.

Some examples of where this makes a difference include:

1. Product recommendations: In an e-commerce vector search, product embeddings are used to find similar items. If a customer searches for “lightweight hiking shoes,” the embeddings surface semantically similar products. Adding a metadata filter like gender: female or brand: Columbia ensures the results align with specific requirements (a sketch of this flow follows the list below).

2. Content Moderation or compliance: Imagine a company using vector search to identify similar documents across various teams. By filtering metadata like department: legal or classification: confidential, only the relevant documents are retrieved. This prevents retrieving semantically similar but irrelevant documents from unrelated teams or departments.

3. Geospatial Search: A travel app uses vector embeddings to recommend destinations based on a user’s travel history and preferences. Using metadata filters for location: within 100 miles ensures the recommendations are regionally relevant.

4. Media Libraries: In a vector search for images, combining embeddings with metadata like resolution: >=1080p or author: John Doe helps surface high-quality or specific submissions.
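
As a rough illustration of the product-recommendation case above, the following sketch applies a metadata filter before ranking candidates by cosine similarity. The catalog records, embedding values, and query vector are hypothetical; in practice a vector database applies such filters natively and an embedding model produces the vectors.

```python
# Minimal sketch of metadata filtering during a vector search.
# The catalog and embedding values are hypothetical stand-ins.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy catalog: each item stores an embedding plus structured metadata.
catalog = [
    {"name": "Trail Runner X", "embedding": np.array([0.9, 0.1, 0.3]),
     "metadata": {"gender": "female", "brand": "Columbia"}},
    {"name": "Summit Boot",    "embedding": np.array([0.8, 0.2, 0.4]),
     "metadata": {"gender": "male", "brand": "Columbia"}},
    {"name": "City Sneaker",   "embedding": np.array([0.1, 0.9, 0.2]),
     "metadata": {"gender": "female", "brand": "Acme"}},
]

def search(query_embedding: np.ndarray, filters: dict, top_k: int = 2):
    # Apply the metadata filter first, then rank the survivors by similarity.
    candidates = [item for item in catalog
                  if all(item["metadata"].get(k) == v for k, v in filters.items())]
    candidates.sort(key=lambda item: cosine_similarity(query_embedding, item["embedding"]),
                    reverse=True)
    return [item["name"] for item in candidates[:top_k]]

# Hypothetical query embedding for "lightweight hiking shoes".
query = np.array([0.85, 0.15, 0.35])
print(search(query, {"gender": "female", "brand": "Columbia"}))  # -> ['Trail Runner X']
```

The key point is that filtering happens on structured attributes, so the similarity ranking only ever sees items that already satisfy the customer’s hard constraints.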

And some examples where it doesn’t:

1. Homogeneous Datasets: If the dataset lacks meaningful metadata (e.g., all records have the same category or timestamp), filtering doesn’t add value because the metadata doesn’t differentiate between records.

2. Highly Unstructured Queries: For a generic query like “artificial intelligence” in a research database, metadata filtering might not help much if the user is looking for broad, cross-disciplinary results. Overly restrictive filters could exclude valuable documents.

3. When Metadata is Sparse or Inaccurate: If the metadata is inconsistently applied or missing in many records, relying on filters can lead to incomplete or skewed results.

Another technique that improves query responses is “contextual embeddings”. This improves retrieval accuracy, and failure rates drop further when the results are re-ranked. It combines the well-known Retrieval Augmented Generation approach of semantic search over embeddings with lexical search using sparse retrievers such as BM25. The knowledge base is split into chunks, and both TF-IDF encodings and semantic embeddings are generated for each chunk. Lexical and semantic searches are then run in parallel, their results are combined and ranked, the most relevant chunks are located, and the response is generated with enhanced context. This enhancement over multimodal embeddings and GraphRAG2 is inspired by Anthropic and a Microsoft Community blog.
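
A minimal sketch of that flow is below, assuming reciprocal rank fusion as the combination step (the post does not name a specific fusion method). The chunk texts, toy embeddings, and the term-overlap scorer standing in for BM25 are illustrative assumptions; a production pipeline would use a real BM25 implementation, an embedding model, and typically a re-ranker over the fused results.

```python
# Sketch of hybrid retrieval: a lexical pass and a semantic pass run in
# parallel, then the ranked lists are merged with reciprocal rank fusion (RRF).
import numpy as np

# Toy knowledge-base chunks (hypothetical text).
chunks = [
    "Metadata filtering narrows vector search results using structured fields.",
    "BM25 is a sparse lexical retriever based on term frequencies.",
    "Contextual embeddings combine semantic and lexical search for better recall.",
]

# Toy dense embeddings, one row per chunk (a real system would use an embedding model).
chunk_embeddings = np.array([
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.1],
    [0.5, 0.5, 0.6],
])

def lexical_scores(query: str) -> list:
    # Query-term overlap per chunk: a simple stand-in for a BM25/TF-IDF scorer.
    terms = set(query.lower().split())
    return [sum(term in chunk.lower() for term in terms) for chunk in chunks]

def semantic_scores(query_embedding: np.ndarray) -> list:
    # Cosine similarity between the query embedding and each chunk embedding.
    norms = np.linalg.norm(chunk_embeddings, axis=1) * np.linalg.norm(query_embedding)
    return list(chunk_embeddings @ query_embedding / norms)

def reciprocal_rank_fusion(*rankings, k: int = 60) -> list:
    # Each chunk's fused score is the sum of 1 / (k + rank) across the input rankings.
    fused = {}
    for ranking in rankings:
        for rank, idx in enumerate(ranking):
            fused[idx] = fused.get(idx, 0.0) + 1.0 / (k + rank + 1)
    return sorted(fused, key=fused.get, reverse=True)

query = "lexical and semantic search"
query_embedding = np.array([0.4, 0.6, 0.5])  # hypothetical query embedding

lex = lexical_scores(query)
sem = semantic_scores(query_embedding)
lex_ranking = sorted(range(len(chunks)), key=lambda i: lex[i], reverse=True)
sem_ranking = sorted(range(len(chunks)), key=lambda i: sem[i], reverse=True)

# The fused ordering is what feeds the generation step as enhanced context.
for idx in reciprocal_rank_fusion(lex_ranking, sem_ranking):
    print(chunks[idx])
```

Running both retrievers independently and fusing at the rank level keeps the lexical and semantic signals decoupled, so either one can be swapped out without retraining or re-indexing the other.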

#Codingexercise

https://1drv.ms/w/c/d609fb70e39b65c8/EdJ3VDeiX2hGgAjzKHaFVoYBTCOvDz2W8EjTCUg08hyWkQ?e=BDjivM

