The Realtime Feature Engine
Tired of Spark? So are we.
Just-in-time data + Hot-reload + Rust compute
Feature pipelines in idiomatic Python. Powered by Rust.
With Chalk, feature pipelines are simple Python functions. Declare data dependencies with Python type signatures, and Chalk will compose and execute your pipelines to compute features in real time.
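To make the idea of declaring dependencies through type signatures concrete, here is a minimal, self-contained sketch in plain Python. It does not use Chalk's API and is not how Chalk is implemented; the `resolver` decorator, the `compute` planner, the marker classes, and the sample data are all hypothetical names chosen for illustration.

```python
from typing import get_type_hints

# Hypothetical feature types: each class names one feature.
class UserId(str): ...
class ReportedIncome(float): ...
class PlaidIncome(float): ...
class IncomeOverEstimate(float): ...

# Illustrative in-memory data sources (assumed values).
REPORTED = {"u1": 24000.0}
PLAID = {"u1": 10302.0}

# Maps an output feature type to the function that produces it.
RESOLVERS = {}

def resolver(fn):
    """Register fn under the feature type named by its return annotation."""
    RESOLVERS[get_type_hints(fn)["return"]] = fn
    return fn

@resolver
def get_reported_income(uid: UserId) -> ReportedIncome:
    return ReportedIncome(REPORTED[uid])

@resolver
def get_plaid_income(uid: UserId) -> PlaidIncome:
    return PlaidIncome(PLAID[uid])

@resolver
def income_over_estimate(
    reported: ReportedIncome, plaid: PlaidIncome
) -> IncomeOverEstimate:
    return IncomeOverEstimate(reported - plaid)

def compute(target, known):
    """Resolve `target` by recursively resolving each annotated parameter.

    `known` maps feature types to values already available. This toy
    planner has no cycle detection and raises KeyError when no resolver
    produces a required feature.
    """
    if target in known:
        return known[target]
    fn = RESOLVERS[target]
    hints = get_type_hints(fn)
    kwargs = {
        name: compute(tp, known)
        for name, tp in hints.items()
        if name != "return"
    }
    known[target] = fn(**kwargs)
    return known[target]
```

Calling `compute(IncomeOverEstimate, {UserId: UserId("u1")})` walks the signatures, runs both income lookups, and then the function that depends on them. The caller never spells out the execution order.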
Scheduling, streaming, caching – all built-in
Powerful data engineering workflows, without the infrastructure headaches.
Automatically composed & queried in real time
Chalk handles the composition of your pipelines to compute the features your models need in real time.
Power real-time decisions with real-time data. Goodbye, ETL.
Make better predictions with fresher data. Don’t pay vendors to pre-fetch data you don’t use. Query data just-in-time for online predictions.
from chalk.api import ChalkClient

client = ChalkClient()
client.query(
    output=[User.income.last_60],
    deployment="jessie-2",
    input={
        Transfer.user.id: "dkjio4n902",
        Transfer.amount: 1200,
    },
)



Served Credit Score: max 812, avg 637
Detect, troubleshoot, and eliminate data issues faster
Monitor feature values, drift, missing data, and pipeline performance. Logs, metrics, and alerting – all built-in. Integrated with Slack, PagerDuty, and Datadog.
Unify training and serving. Iterate faster.
Experiment in Jupyter, then deploy to production. Prevent train-serve skew and speed up development.
Jupyter Notebook
In []:
from datetime import datetime

df = client.offline_query(
    input=labels[[User.uid]],
    input_times=[datetime.now()] * len(labels),
    output=[
        User.name,
        User.credit_report,
        User.plaid_account.mean_balance,
    ],
)
Out[]:
# xgboost train / predict
from xgboost import XGBClassifier

xgb = XGBClassifier(
    eval_metric="logloss",
    use_label_encoder=False,
)
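The `input_times` argument in the notebook cell above suggests that each training row is computed as of a timestamp. One common way to keep training data consistent with what serving would have seen is a point-in-time lookup: for each label, use only the latest feature value observed at or before the label's time. The sketch below shows that general technique with the standard library only. It is not Chalk's implementation, and `BALANCE_HISTORY`, `value_as_of`, and `build_training_rows` are hypothetical names with made-up data.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history per user: (observed_at, value), sorted by time.
BALANCE_HISTORY = {
    "u1": [
        (datetime(2022, 8, 1), 100.0),
        (datetime(2022, 8, 10), 250.0),
        (datetime(2022, 8, 20), 40.0),
    ],
}

def value_as_of(history, as_of):
    """Return the latest value observed at or before `as_of`, else None."""
    times = [observed_at for observed_at, _ in history]
    idx = bisect_right(times, as_of)  # count of observations <= as_of
    return history[idx - 1][1] if idx else None

def build_training_rows(labels):
    """Join labels to features without looking into the future.

    `labels` is a list of (user_id, label_time, label) tuples.
    """
    return [
        (uid, value_as_of(BALANCE_HISTORY.get(uid, []), label_time), label)
        for uid, label_time, label in labels
    ]
```

A label dated after the last observation gets the most recent value, and a label dated before the first observation gets `None` rather than a value from the future.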
Perfect Auditability
Unmatched data provenance. Know everything you computed, and replay any of it.
is_income (python), income_total (python), income_over_estimate (python)
income_over_estimate: Aug 20, 2022 07:13:27+4:00, 2ms
Inputs | |
---|---|
User.self_reported_income | $24,000 |
User.plaid_transaction_income | $10,302 |
Outputs | |
User.plaid_transaction_income | $24,000 |
Definition |
get_user v3: postgres://
get_plaid v1: api.plaid.com (90ms)
Credit Application | |
---|---|
User ID | SPGRAY1980 |
Date | 09/01/22 |
Time | 09:15 |
Status | Declined |
Model | underwriting_model |
Integrations
Integrate with the tools you already use and deploy to your infrastructure
Withdrawal Model
Decide and enforce withdrawal limits with custom hold times.
Income
Compute income from Plaid transactions.
Cache Busting
Bypass the cache with a max-staleness of 0.
Device Data
Easily listen to streaming data and parse messages with custom logic.
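The Cache Busting example above relies on a max-staleness of 0. As a rough illustration of why that bypasses the cache, here is a minimal sketch of a cache that serves an entry only when it is younger than the caller's staleness bound. `StalenessCache` and its `get` method are hypothetical and are not part of Chalk's API.

```python
import time

class StalenessCache:
    """Toy cache where every read states how stale a value it will accept."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock, handy for testing
        self._entries = {}       # key -> (stored_at, value)

    def get(self, key, compute, max_staleness):
        """Return the cached value if its age is below `max_staleness`
        seconds; otherwise call `compute()`, store the result, and return it.
        """
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < max_staleness:
            return hit[1]
        value = compute()
        self._entries[key] = (now, value)
        return value
```

Because no entry can have an age below zero, `max_staleness=0` always falls through to `compute()`, so the caller gets a fresh value and the cache is refreshed for later, more tolerant reads.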