Tuesday, December 10, 2024

There is a growing need for dynamic, dependable, and repeatable infrastructure as the scope of deployment expands from a small footprint to cloud scale. With emerging technologies like Generative AI, best practices for cloud deployment have not yet matured into playbooks. Generative Artificial Intelligence (AI) refers to a subset of AI algorithms and models that can generate new and original content, such as images, text, music, or even entire virtual worlds. Unlike other AI models that rely on pre-existing data to make predictions or classifications, generative AI models create new content based on the patterns and information they have learned from training data. Many organizations continue to face challenges in deploying these applications at production quality: the output must be accurate, governed, and safe.

Data infrastructure trends that have become popular in the wake of Generative AI include data lakehouses, which combine the best of data lakes and data warehouses by allowing for both storage and processing; vector databases for both storing and querying vectors; and the ecosystem of ETL tools, data pipelines, and connectors that move data in and out at scale and even support real-time ingestion.

In terms of infrastructure for data engineering projects, customers usually get started on a roadmap that progressively builds a more mature data function. One approach to drawing this roadmap, which experts observe repeated across deployment stamps, involves building the data stack in distinct stages, with a stack for every phase of the journey. While needs, level of sophistication, maturity of solutions, and budget determine the shape these stacks take, the four phases are more or less distinct and repeated across these endeavors: starter, growth, machine-learning, and real-time. Customers begin with a starter stack whose essential function is to collect the data, often by implementing a drain; a unified data layer at this stage significantly reduces engineering bottlenecks. The second stage is the growth stack, which solves the proliferation of data destinations and independent silos by centralizing data into a warehouse that also becomes the single source of truth for analytics. When this matures, customers want to move beyond historical analytics into predictive analytics; at this stage, a data lake and a machine-learning toolset come in handy to leverage unstructured data and mitigate problems proactively. The final frontier addresses the remaining limitation of this stack: it cannot deliver personalized experiences in real time.
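Of these trends, vector databases are the most closely tied to Generative AI workloads. As a minimal sketch of the core operation they perform, the brute-force cosine-similarity search below uses made-up embeddings and an illustrative `top_k_search` helper; real systems replace the full scan with approximate nearest-neighbor indexes such as HNSW.

```python
# Minimal sketch of the core operation a vector database performs:
# brute-force cosine-similarity search over stored embeddings.
# Names and data here are illustrative; production systems use
# approximate nearest-neighbor indexes rather than a full scan.
import numpy as np

def top_k_search(index: np.ndarray, query: np.ndarray, k: int = 3) -> np.ndarray:
    """Return the indices of the k stored vectors most similar to the query."""
    # Normalize rows so that a dot product equals cosine similarity.
    index_norm = index / np.linalg.norm(index, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    scores = index_norm @ query_norm
    return np.argsort(scores)[::-1][:k]

# Usage: 1,000 stored 384-dimensional embeddings and one query vector.
rng = np.random.default_rng(0)
stored = rng.normal(size=(1000, 384))
query = rng.normal(size=384)
print(top_k_search(stored, query, k=3))
```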

Even though the landscape keeps shifting, the models in question are largely language models, and some serve as the foundation on which increasingly complex techniques are layered. Foundation models commonly refer to large language models that have been trained over extensive datasets to be generally good at some task (chat, instruction following, code generation, etc.), and they largely fall into two categories: proprietary (such as GPT-3.5 and Gemini) and open source (such as Phi, Llama2-70B, and DBRX). DBRX, popular through the Databricks platform that is available across the major public clouds, is a transformer-based decoder large language model trained using next-token prediction. Standard benchmarks such as MMLU, HellaSwag, and HumanEval are available to evaluate foundation models.
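Since next-token prediction is the training objective named above, a minimal PyTorch sketch of it may help: shift the token sequence by one position and score the model's predicted distribution against the true next token with cross-entropy. The tiny single-layer model, dimensions, and random tokens below are illustrative assumptions, not any production architecture.

```python
# Hedged sketch of the next-token-prediction objective used to train
# decoder models. A causal mask makes an encoder layer behave like a
# decoder: position t can only attend to positions <= t.
import torch
import torch.nn as nn

vocab_size, d_model, seq_len, batch = 100, 32, 16, 4

embed = nn.Embedding(vocab_size, d_model)
layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
lm_head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (batch, seq_len))
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # shift by one position

# Causal (square subsequent) mask blocks attention to future tokens.
mask = nn.Transformer.generate_square_subsequent_mask(seq_len - 1)
hidden = layer(embed(inputs), src_mask=mask)
logits = lm_head(hidden)

# Cross-entropy between predicted distributions and the actual next tokens.
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
print(f"next-token prediction loss: {loss.item():.3f}")
```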

Many end-to-end LLM training pipelines are becoming more compute-efficient. This efficiency is the result of a number of improvements, including better architectures, network changes, better optimizations, better tokenization, and, last but not least, better pre-training data, which has a substantial impact on model quality.
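To see why tokenization alone moves the compute needle, consider the common rule of thumb from the scaling-law literature that training cost is roughly 6 FLOPs per parameter per token. The sketch below applies that approximation; the corpus size, model size, and tokens-per-word ratios are hypothetical numbers chosen only for illustration.

```python
# Back-of-the-envelope sketch of why better tokenization reduces training
# compute, using the rough approximation FLOPs ~= 6 * parameters * tokens.
# All quantities below are made-up assumptions for illustration.
def training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

corpus_words = 1e12   # hypothetical corpus: one trillion words
n_params = 7e9        # hypothetical 7B-parameter model

# A tokenizer that emits fewer tokens per word shrinks the token count,
# and the estimated compute shrinks proportionally.
for name, tokens_per_word in [("baseline tokenizer", 1.5),
                              ("more efficient tokenizer", 1.2)]:
    flops = training_flops(n_params, corpus_words * tokens_per_word)
    print(f"{name}: {flops:.2e} FLOPs")
```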

