Mission Lane demo: Scaling ML with thousands of features

by Linda Zhou, Marketing Manager
November 3, 2025

We just wrapped up a live demo with Mission Lane, where Mike Kuhlen, who leads data science and ML, walked us through how they manage thousands of features in production for more than two million customers. Mike showed us real code and query plans in the Chalk UI, and explained how they handle complex feature dependencies across training, live decisioning, and batch evaluation.

The setup

Mission Lane helps people build better financial futures by making credit more accessible and transparent. Their ML models power credit decisions and fraud detection, pulling features from credit bureau data, internal transaction history, behavioral signals, and fraud providers — all building on each other in complex DAGs.

This creates treacherous territory for feature updates. Before Chalk, shipping a new feature meant coordinating across data science, data engineering, and ML engineering teams. Even a minor change risked breaking downstream dependencies, creating weeks of delays and drift. (Sound familiar? We've written about the common challenges in MLOps before.)

Demo walkthrough

Define once, use everywhere

Mike started by showing us their GitHub repo structure. Mission Lane defines each feature once, then reuses it across real-time apps, monthly batch jobs, and model training.

He walked through an example with TransUnion credit bureau data. Even though they pull this data in different formats at different times, the feature definitions stay the same. Add a new bureau feature, and it automatically works everywhere without reimplementation.
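For readers who haven't seen Chalk's Python SDK, here's a minimal sketch of the define-once pattern. The feature class, field names, and resolver logic are illustrative, not Mission Lane's actual definitions:

```python
from chalk import online
from chalk.features import features


# Hypothetical credit-bureau features, defined once and reused by
# real-time queries, monthly batch jobs, and model training alike.
@features
class Applicant:
    id: int

    # Raw TransUnion attributes (illustrative names)
    tu_credit_score: int
    tu_open_tradelines: int

    # Derived feature computed from the raw bureau data
    tu_score_band: str


# Resolver for the derived feature: downstream consumers just ask for
# Applicant.tu_score_band and never reimplement this logic themselves.
@online
def get_score_band(score: Applicant.tu_credit_score) -> Applicant.tu_score_band:
    return "prime" if score >= 660 else "subprime"
```

Adding a new bureau field is a one-line change to the feature class (plus a resolver if it's derived), and every consumer picks it up from there.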

Decoupling models from infrastructure

When a customer applies for a card, Mission Lane needs to score them in real time. Instead of hardcoding all the features each model needs, they use named queries: request features by model name, and Chalk figures out which 300+ features to fetch.

Whenever a model needs different features, just update the Chalk config — no need to touch the decisioning system code!
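As a rough sketch of what that looks like from the decisioning service's side (assuming Chalk's named-query support; the query name and primary key below are made up), the caller references only a query name while the feature list lives in Chalk configuration:

```python
from chalk.client import ChalkClient

client = ChalkClient()

# The decisioning service only knows the query name. Which 300+ features
# that name expands to is configured in Chalk, so changing a model's
# feature set never touches this code.
scores = client.query(
    input={"applicant.id": 12345},            # illustrative primary key
    query_name="card_application_score_v2",   # hypothetical named query
)
```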

Handling dependencies at scale

Mission Lane scores millions of customers for credit line increases. This means processing hundreds of aggregated features with complex dependencies — aggregations depend on raw data, velocity features depend on aggregations, and so on.

Mike showed us the query plan for one of their production models — a complex tree of hundreds of interdependent features that Chalk resolves together. Chalk's scheduled resolvers pull fresh data incrementally from Snowflake, tracking what's already been ingested and only fetching new records. By morning, the offline store is current and ready for batch scoring.
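A simplified sketch of what such a scheduled incremental pull can look like follows; the schedule, source table, and column names are assumptions, and the exact SQL-source API may differ from what's shown here:

```python
from datetime import datetime

from chalk import offline
from chalk.features import DataFrame, features
from chalk.sql import SnowflakeSource


@features
class Transaction:
    id: int
    applicant_id: int
    amount: float
    posted_at: datetime


snowflake = SnowflakeSource()


# Hypothetical nightly batch resolver. Chalk tracks what has already been
# ingested, so each run pulls only rows added since the last run and the
# offline store is current before morning batch scoring.
@offline(cron="0 5 * * *")
def ingest_transactions() -> DataFrame[
    Transaction.id, Transaction.applicant_id,
    Transaction.amount, Transaction.posted_at,
]:
    return snowflake.query_string(
        "SELECT id, applicant_id, amount, posted_at FROM transactions"
    ).incremental(incremental_column="posted_at")
```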

(Chalk can also handle orchestration natively — some teams use it as a replacement for tools like Airflow and Dagster.)

What stood out

The real win: eliminating coordination overhead. Before Chalk, updating features meant orchestrating across multiple teams and systems, with changes rippling through complex dependency chains. Now data scientists self-serve with Python code that works everywhere, from real-time decisioning to batch evaluation!

Watch the full demo to see the code, query plans, and orchestration: