Let's talk about what modern data science actually looks like.
CUDA, scikit-learn, and PyTorch were once nice-to-haves; now they're table stakes. The tools you've mastered are just the entry fee: modern ML systems also need real-time data pipelines, distributed computing, monitoring, and a dozen other things that probably weren't why you got excited about data science in the first place.
Expecting data scientists to manage Kubernetes clusters and build observability systems is like asking a cardiologist to also perform heart surgery – overlapping areas of expertise, but fundamentally different skill sets.
This creates the bottleneck every data scientist knows: promising models sitting in notebooks, waiting for engineering resources that never come, ultimately never making it to production. Welcome to the Jupyter graveyard.
But what if your notebook code was already production code?
With Chalk, you can test new features, run experiments, and ship models to production — all from your Jupyter notebook.
@features
class Review:
    id: int
    created_at: FeatureTime
    review_body: str
    rating: int


@features(tags=["team:DevX"])
class Product:
    id: int
    title: str
    reviews: DataFrame[Review]

    average_rating: float | None = _.reviews[_.rating].mean()
    most_recent_review_rating: int | None = F.max_by(
        _.reviews[_.rating],
        sort=_.created_at,
    )
    review_count_all: int = _.reviews.count()
    review_count: Windowed[int] = windowed(
        "30d",
        "90d",
        expression=_.reviews[_.created_at > _.chalk_window].count(),
    )
Your notebook IS production (in theory)
Forget the traditional workflow where notebook code gets rewritten into another language for production. With Chalk, the same Python you write for exploration runs in production with millisecond latency. Wondering how? We built a Symbolic Python Interpreter that converts ordinary Python into expressions that run natively!
Import your production features with a single line of code, add a new experimental feature, and simulate a prediction or recommendation, all from a single notebook!
from chalk.client import ChalkClient

client = ChalkClient()
client.load_features()  # pulls your production feature definitions into the notebook

# Define experimental features on top of the production ones
Review.is_positive = _.rating >= 4
Product.positive_reviews_percentage = (
    _.reviews[_.is_positive == True].count() / _.review_count_all
)
Experiment without breaking things
Remember those week-old CSV dumps sitting in your folder? We've all been there. You're trying to test a model that will run on live user data, but you're stuck with stale exports that don't capture what's actually happening in production.
With Chalk branches, you can spin up a copy of your production pipeline in seconds — like Git but for ML features. Test against millions of production records, catch edge cases you'd never find in samples, and merge when ready. Iterate at the speed of thought, not infrastructure.
chalk apply --branch  # deploys your model to a branch for testing
chalk diff # prints a diff of changed features
Once you're in a branch, changes hot-reload instantly — edit a feature definition, run a query, see results. Watch your new model predictions update in real-time as you tweak the logic.
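For instance, here's a minimal sketch of querying your branch from the notebook, reusing the Product features defined above (the branch name is illustrative):
from chalk.client import ChalkClient

# Point the client at the branch deployment (branch name is illustrative)
client = ChalkClient(branch="new-rating-features")

# Re-running this after editing a feature definition reflects the new logic
result = client.query(
    input={Product.id: 42},
    output=[Product.average_rating, Product.most_recent_review_rating],
)
result.get_feature_value(Product.average_rating)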
On Chalk, your features ARE your code, version-controlled and shareable just like any codebase. When colleagues need to test your features, whether it's another data scientist reproducing your work or a data engineer ensuring pipeline compatibility, they simply check out your branch.
Production-ready training data
Here's how Chalk helps data scientists create accurate training datasets.
Offline Queries: Every feature computed in production gets stored. Need to debug why your model approved a suspicious transaction last week? Query the exact feature values that went into and came out of every stage of your feature pipeline.
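As a sketch (reusing the Product features defined above; the product IDs and the one-week lookback are illustrative), you can pull the stored values as of a point in the past:
from datetime import datetime, timedelta

from chalk.client import ChalkClient

client = ChalkClient()

# Fetch the feature values as they existed one week ago for a few products
dataset = client.offline_query(
    input={Product.id: [101, 102, 103]},
    input_times=[datetime.now() - timedelta(days=7)] * 3,
    output=[Product.average_rating, Product.review_count_all],
)
df = dataset.to_pandas()  # materialize the results as a dataframe for inspection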
Temporal Consistency: Models often fail because they were trained on future data that leaked into past features. Chalk handles point-in-time correctness automatically:
OfflineQuery(
    name="Reviewed products from 2025 Q1",
    input={Product.id: list(range(10**5))},
    output=[
        Product.average_rating,
        Product.most_recent_review_rating,
        # these are computed within the context of the time window Jan 2025 - March 2025
    ],
    recompute_features=True,
    lower_bound=datetime(2025, 1, 1),
    upper_bound=datetime(2025, 3, 31),
)
Query features as of March 15th and you get exactly what existed on March 15th. No future transactions leaking into historical risk scores, and no manual timestamp juggling.
Backfills Made Simple: Need to add a new feature? Traditionally, that's a week of coordinating with data engineering. With Chalk, it takes minutes.
Change your feature definition, trigger a backfill. Chalk recomputes across your entire history while maintaining temporal consistency. Test "what if we had this feature 6 months ago?" before deploying.
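As a sketch (reusing the client and the experimental positive_reviews_percentage feature defined earlier; the date range is illustrative), you could recompute just that feature over six months of history before committing to a full backfill:
from datetime import datetime

dataset = client.offline_query(
    input={Product.id: list(range(10**5))},
    output=[Product.positive_reviews_percentage],
    # recompute only the new feature; everything else reads from the offline store
    recompute_features=[Product.positive_reviews_percentage],
    lower_bound=datetime(2024, 7, 1),
    upper_bound=datetime(2025, 1, 1),
)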
Ship it
Once you've defined your features and resolvers, deployment is simple:
chalk apply
Your features now serve predictions at scale, and Python gets compiled to C++ for production performance.
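From any service that can reach Chalk, the deployed features are a single query away. A minimal sketch, reusing the Product features from above:
from chalk.client import ChalkClient

client = ChalkClient()

# Online query served by the deployed feature pipeline
result = client.query(
    input={Product.id: 42},
    output=[Product.average_rating, Product.review_count_all],
)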
Integration with your existing ML stack is seamless. Deploy ONNX models directly in Chalk, or plug into SageMaker and Vertex:
@features
class User:
    id: str
    encoded_sagemaker_data: bytes

    should_offer_discount_percentage: float = F.sagemaker_predict(
        _.encoded_sagemaker_data,
        endpoint="discount_model-2025-01-15",
        target_model="model_v2.tar.gz",
        target_variant="blue",
    )
Plus, Chalk's native Iceberg integration enables sharing datasets with other teams — democratizing data access across the entire organization. Analysts can explore them in BI tools, and everyone works from the same source of truth.