Get Started with Code Examples

Unlock the power of real-time data pipelines

Fraud & Risk

Withdrawal Model

Decide and enforce withdrawal limits with custom hold times.
Credit

Income

Compute income from Plaid transactions.
Caching

Cache Busting

Bypass the cache with a max-staleness of 0.
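
As a rough illustration of why a max-staleness of 0 bypasses a cache, here is a toy sketch in plain Python. It is not Chalk's implementation or API: `StalenessCache`, `compute`, and the seconds-based `max_staleness` argument are all hypothetical names chosen for this example.

```python
import time


class StalenessCache:
    """Toy cache (hypothetical, not Chalk's API) keyed by entity id."""

    def __init__(self, compute):
        self._compute = compute  # function that recomputes a value for a key
        self._store = {}         # key -> (value, time the value was computed)

    def get(self, key, max_staleness):
        """Return a cached value only if it is younger than max_staleness seconds."""
        now = time.monotonic()
        hit = self._store.get(key)
        # With max_staleness == 0 no cached value can be "young enough",
        # so the value is always recomputed.
        if hit is not None and (now - hit[1]) < max_staleness:
            return hit[0]
        value = self._compute(key)
        self._store[key] = (value, now)
        return value
```

Calling `get(key, 0)` always recomputes, while a later call with a larger max-staleness can reuse the freshly stored value.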
Predictive Maintenance

Sensor Streams

Compute streaming window aggregate functions on sensor data.
GitHub Actions

Deploy with Chalk

Deploy to Chalk (either as a preview deployment or to production).
Features

Setup

To run a Chalk resolver in Airflow, add the CHALK_CLIENT_ID, CHALK_CLIENT_SECRET, and CHALK_ENVIRONMENT environment variables to Airflow.
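
One way to catch a missing variable early is a small pre-flight check. The sketch below uses only the standard library; the three variable names come from the text above, while the helper `missing_chalk_vars` is a hypothetical name and not part of Chalk or Airflow.

```python
import os

# Variable names are taken from the setup text; the helper is hypothetical.
REQUIRED_CHALK_VARS = ("CHALK_CLIENT_ID", "CHALK_CLIENT_SECRET", "CHALK_ENVIRONMENT")


def missing_chalk_vars(environ=None):
    """Return the required Chalk variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_CHALK_VARS if not env.get(name)]
```

A task could call `missing_chalk_vars()` first and fail with a clear message if the returned list is not empty.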
Marketplaces

User-Seller Affinity

Create Chalk features for Users and Sellers and evaluate whether a user and seller have matching categories.
Fraud & Risk

Changes in Behavior

Detect changes in user behavior over time.
Testing

Unit tests

Resolvers are just Python functions, so they are easy to unit test.
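
A minimal sketch of what that can look like, assuming a resolver whose body is ordinary Python. The function `get_email_domain` is a hypothetical example; a real resolver would also carry Chalk's decorator and feature type annotations, which are left out here so the snippet runs on its own.

```python
# Hypothetical resolver body: a plain function with no Chalk imports.
def get_email_domain(email: str) -> str:
    """Return the lowercased domain part of an email address."""
    return email.split("@")[-1].lower()


def test_get_email_domain():
    # The function is called directly, like any other Python function.
    assert get_email_domain("jane@Example.com") == "example.com"
    assert get_email_domain("sam@chalk.ai") == "chalk.ai"
```

A test runner such as pytest would pick up `test_get_email_domain` without any extra setup.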
Resolvers

Sharing Resolvers

Resolvers are shared between all models.
Features

Custom Feature Types

Use pydantic, attrs, dataclasses.dataclass, or custom types as feature values.
Resolvers

Downstream DataFrames

Chain a DataFrame resolver with a scalar resolver.
Predictive Maintenance

Failing Sensors

Combine batch, caching, and DataFrames to create a powerful predictive maintenance pipeline.
DataFrame

Self Joins

Join a feature set back to itself.
Features

Mapping Stream

Create features directly from messages on a stream.
Caching

Pre-Fetching

Keep the cache warm by scheduling a resolver to run more frequently than the max-staleness.
Resolvers

Downstream Scalars

Resolvers chain together through their required dependencies and declared outputs.
Features

Feature Time

Access and override the time at which a feature should be recorded.
Credit

Credit Bureau API

Integrate data from credit bureaus like TransUnion.
Features

Define features

First, we define features representing our users and their card transactions:
DataFrame

Filters

Filter the rows of a DataFrame by supplying conditions to the __getitem__() method.
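
To show the general mechanism of filtering through `__getitem__()`, here is a toy sketch in plain Python. `Column` and `MiniFrame` are hypothetical classes written for this illustration; they are not Chalk's DataFrame, and Chalk's actual condition syntax may differ.

```python
class Column:
    """Hypothetical column reference whose comparisons build row predicates."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, threshold):
        return lambda row: row[self.name] > threshold

    def __lt__(self, threshold):
        return lambda row: row[self.name] < threshold


class MiniFrame:
    """Toy frame of dict rows, filtered by indexing with predicates."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __getitem__(self, conditions):
        # frame[a, b] arrives as a tuple; frame[a] arrives as a single predicate.
        if not isinstance(conditions, tuple):
            conditions = (conditions,)
        return MiniFrame(r for r in self.rows if all(c(r) for c in conditions))


frame = MiniFrame([{"amount": 50}, {"amount": 150}, {"amount": 900}])
large = frame[Column("amount") > 100]
mid_sized = frame[Column("amount") > 100, Column("amount") < 500]
```

Here `large` keeps the rows with amounts 150 and 900, and `mid_sized` keeps only the row with amount 150, because multiple conditions are combined with a logical AND.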
Features

Isolated Python Environment

To isolate the chalkpy dependency from your Python environment, you can use Airflow's @task.virtualenv decorator. This is slightly slower because a Python virtual environment is created for the task, but it is useful if you want to avoid conflicts with other Python dependencies.
Feature Discovery

Tags

Tag related features.
Marketplaces

Stream Interaction Data

Enrich User Interaction data with stream data.
Caching

Intermediate Feature Values

Cache intermediate feature values.
GitHub Actions

Preview deployments

Set up preview deployments for all PRs.
Fraud & Risk

Account Takeover

Aggregate failed logins over a Kafka stream.
Caching

Override Cache Values

Supply a feature value in the input to skip the cache and any resolver entirely.
DataFrame

Creating DataFrames

Construct a DataFrame of feature values.
Feature Discovery

Descriptions

Describe features at a feature class or feature level.
DataFrame

Aggregations

Compute aggregates over a DataFrame.
Features

Has Many

Define a has-many relationship between feature classes.
Features

Has One + Has Many

Combine has-one and has-many relationships between feature classes.
Features

Has One

Define a has-one relationship between feature classes.
GitHub Actions

Install Chalk CLI

Install the Chalk CLI in a GitHub Action.
Features

Query Scalars

Query scalars with SQL files or strings.
Caching

Latest Computed Value

Cache the last computed example of the feature.
Features

Models

The example code below shows how to integrate a predictive model into a resolver.
Features

Stream SQL Aggregation

Compute an aggregation on windows using DataFrames.
DataFrame

Projections

Scope down the set of columns available in a DataFrame.
Features

Feature Types

Create a namespaced set of features.
Feature Discovery

Owners

Assign owners to features for monitoring and alerting.
Features

OpenAI

Chalk also makes it easy to integrate LLMs like ChatGPT into your resolvers. In the following example, we use ChatGPT to answer questions about our Users.
Fraud & Risk

Identity Verification

Use vendor APIs to verify identities, and control costs with Chalk's platform.
Features

Polling the Resolver Run

To wait for the resolver run to complete in Airflow, you can use Chalk's get_run_status method to poll the status of the resolver run. One way to accomplish this is with Airflow's Sensor framework.
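
The polling idea can be sketched without Airflow as a plain loop. In the sketch below, `wait_for_run` is a hypothetical helper, `get_status` stands in for whatever call wraps get_run_status, and the status strings "succeeded" and "failed" are assumptions rather than values confirmed by Chalk's documentation.

```python
import time
from typing import Callable, Iterable


def wait_for_run(
    get_status: Callable[[], str],
    done_states: Iterable[str] = ("succeeded",),  # assumed status name
    failed_states: Iterable[str] = ("failed",),   # assumed status name
    poll_interval: float = 30.0,
    timeout: float = 3600.0,
) -> str:
    """Poll get_status() until it reports a terminal state or the timeout passes."""
    done, failed = set(done_states), set(failed_states)
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"resolver run ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("resolver run did not finish before the timeout")
        time.sleep(poll_interval)
```

An Airflow Sensor follows the same pattern, except that the scheduler re-invokes the status check instead of the task sleeping in a loop.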
Resolvers

Tagged Resolvers

Trigger special behavior with tags.
Testing

Integration tests

Test interactions between resolvers with preview deployments.
Resolvers

Scalar Resolvers

Create a resolver that returns a single feature.
Resolvers

Multi-Feature Resolvers

Create a resolver that returns many features.
Features

Primary Keys

Set the primary key for an entity.
DataFrame

Projections with Filters

Compose projections and filters to create a new DataFrame.
Features

Define LLM resolvers

In the rest of this README, we will focus on the LLM-dependent features: completion, clean_memo, category, is_nsf, and is_ach. (You can check out the full code and our documentation to see how we resolve the rest of the features using SQL file resolvers and windowed aggregations.)
Predictive Maintenance

Device Data

Easily listen to streaming data and parse messages with custom logic.
Predictive Maintenance

Historical Data

Access historical sensor data as-of any time in the past.
Features

Stream DataFrame

Compute a streaming window aggregate function using DataFrames.
Features

Stream SQL

Compute a streaming window aggregate function using SQL.
Feature Discovery

Tags & Owners

Assign tags and owners to features.
Caching

Override Max-Staleness

Set max-staleness per-request.
Marketplaces

Track Interactions

Identify the number of interactions that have occurred between users and sellers.
Scheduling

Sampling Cron

Pick exactly the examples that you’d like to run.
Fraud & Risk

Returns

Identify transactions returned for non-sufficient funds.
Features

Constructing Features

Create sets of features from your feature classes.
Caching

Basic Caching

Cache feature values rather than computing them in real time.
Credit

Aggregate Tradelines

Aggregate user statistics across tradelines.
Credit

Multiple Accounts

Identify users with multiple accounts.
Scheduling

Filtered Cron

Run resolvers on a schedule and filter down which examples to consider.
Scheduling

Cron

Run resolvers on a schedule with all possible arguments.
Resolvers

Multi-Tenancy

Serve many end-customers with differentiated behavior.
Features

Query DataFrames

Query many rows and take advantage of push down filters.