LLM Toolchain

Vector embeddings and similarity search
GPU acceleration
Unify GenAI and traditional ML
Large file support
The Chalk feature store handles large files, such as documents, images, or videos, that are too large or too costly to keep in traditional databases. With Chalk, you can preprocess petabytes of data, chunk and embed documents, and store the results from diffusion models, all within the Chalk feature store. We’ll automatically scale your jobs to run over multiple nodes, batch data together for efficient processing, and execute your Python logic on our Rust-powered runtime.
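To make the "chunk documents" step concrete, here is a minimal sketch in plain Python of fixed-size chunking with overlap. It is an illustration only, not Chalk's API: the function name `chunk_text` and its parameters are hypothetical, and real pipelines often split on sentences or tokens rather than characters.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`.

    Hypothetical helper for illustration; not part of any Chalk library.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of embedding some text twice.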
Embedding functions
Chalk integrates with third-party and open-source embedding models, so you can turn existing features into vector embeddings in just one line of code. Chalk automatically handles batching, caching, and retry logic, enabling you to run embedding functions in production and at scale. Combined with Chalk branches, you can experiment with new embedding models on all your historical data and compare the results to your existing feature pipeline. Deploy new embedding models with confidence.
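The batching, caching, and retry behavior described above can be sketched generically as follows. This is an assumption-laden illustration rather than Chalk's implementation: `embed_batch` stands in for any third-party embedding call, and `embed_with_cache` and all of its parameters are hypothetical names.

```python
import time
from typing import Callable, MutableMapping, Sequence

Vector = list[float]


def embed_with_cache(
    texts: Sequence[str],
    embed_batch: Callable[[list[str]], list[Vector]],  # hypothetical model call
    cache: MutableMapping[str, Vector],
    batch_size: int = 32,
    max_retries: int = 3,
    backoff_seconds: float = 0.0,
) -> list[Vector]:
    """Embed texts in batches, skipping cached inputs and retrying failures."""
    # Collect each uncached text once, preserving first-seen order.
    missing: list[str] = []
    seen: set[str] = set()
    for text in texts:
        if text not in cache and text not in seen:
            missing.append(text)
            seen.add(text)

    for i in range(0, len(missing), batch_size):
        batch = missing[i:i + batch_size]
        for attempt in range(max_retries):
            try:
                vectors = embed_batch(batch)
                break
            except Exception:
                # Re-raise after the final attempt; otherwise back off and retry.
                if attempt == max_retries - 1:
                    raise
                time.sleep(backoff_seconds * 2 ** attempt)
        for text, vector in zip(batch, vectors):
            cache[text] = vector

    # Return vectors in the same order as the input, duplicates included.
    return [cache[text] for text in texts]
```

A production version would typically catch only transient error types and key the cache on the model name as well as the text, so that switching models does not return stale vectors.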
Vector Search
Chalk stores and indexes vector embeddings, so you can run nearest neighbor similarity search within your feature pipeline. Because vector search is built into the Chalk feature store, you can use your existing features as the vector search query, or combine other features with the vector search results to generate downstream features.
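As a rough illustration of what nearest neighbor similarity search computes, the sketch below does a brute-force cosine-similarity scan over an in-memory dictionary. The function names and the dictionary-based index are assumptions made for this example; they say nothing about how Chalk indexes vectors, and large collections generally need an approximate index instead of a full scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(
    query: list[float], index: dict[str, list[float]], k: int = 3
) -> list[tuple[str, float]]:
    """Return the k (key, similarity) pairs most similar to the query."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In a feature pipeline, `query` would be an embedding feature computed upstream, and the returned keys could feed further features, such as attributes of the matched items.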
Prompt Engineering
When using generative AI, the quality of the prompt can dramatically affect the results. Chalk’s feature store remembers all historical data, so you can safely experiment with new prompts on your data via a branch. Chalk automatically handles querying the model and computing any downstream features (such as quality metrics), so you can easily try out new prompts and figure out what works best. When you’re ready to promote a prompt to production, it takes one command and no code rewrites.
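The idea of comparing prompts on historical data with a quality metric can be sketched in a few lines. Everything here is hypothetical: `model` is any callable that maps a prompt string to a response, `score` is whatever quality metric you choose, and `evaluate_prompts` is not a Chalk function.

```python
from statistics import mean
from typing import Callable


def evaluate_prompts(
    templates: dict[str, str],
    rows: list[dict],
    model: Callable[[str], str],         # hypothetical model call
    score: Callable[[str, dict], float],  # hypothetical quality metric
) -> dict[str, float]:
    """Return the mean score of each prompt template over the same rows."""
    results: dict[str, float] = {}
    for name, template in templates.items():
        # Fill the template from each row, query the model, and score the output.
        scores = [score(model(template.format(**row)), row) for row in rows]
        results[name] = mean(scores) if scores else 0.0
    return results
```

Running every template over the same rows keeps the comparison fair; the best candidate is then `max(results, key=results.get)`.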
Model Inference
With Chalk, you can build a model inference pipeline that seamlessly integrates with your existing features and data. If you have pre- or post-processing steps to sanitize query inputs and validate inference results, you can write these functions in pure Python, and Chalk will efficiently execute these steps via our Rust-powered runtime. We’ll automatically provision and autoscale resources (including GPU nodes), queue and batch inference requests, and compute and store all upstream and downstream features. As all data is stored in the Chalk offline store, it’s easy to compare how different models or prompts perform, annotate inference responses with quality metrics, and use your data to fine-tune and improve your model.
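The pre- and post-processing steps mentioned above are ordinary Python functions wrapped around a model call. The sketch below assumes a text classifier; the names `sanitize_input`, `validate_output`, and `run_inference`, and the `model` callable, are illustrative assumptions and not part of Chalk.

```python
from typing import Callable, Optional


def sanitize_input(text: str, max_chars: int = 1000) -> str:
    """Pre-processing: collapse runs of whitespace and truncate long inputs."""
    cleaned = " ".join(text.split())
    return cleaned[:max_chars]


def validate_output(raw: str, allowed_labels: set[str]) -> Optional[str]:
    """Post-processing: normalize the response and reject unexpected labels."""
    label = raw.strip().lower()
    return label if label in allowed_labels else None


def run_inference(
    text: str,
    model: Callable[[str], str],  # hypothetical model call
    allowed_labels: set[str],
) -> Optional[str]:
    """Sanitize the input, call the model, then validate the response."""
    return validate_output(model(sanitize_input(text)), allowed_labels)
```

Returning `None` for an invalid response, rather than raising, lets a downstream step record the failure as a quality metric alongside the successful results.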