Chalk on the Move: April Events Recap

Linda Zhou

Marketing Manager

May 1, 2025

It’s been a busy month! The Chalk crew has gone on a cross-country tour (IRL and URL) to share the exciting updates our team is shipping. We spoke, demoed, led a roundtable, and learned a lot. In case you missed it, here’s what we got up to!

NexGen Banking Summit (NYC)

Our founders and GTM team dove into the fintech ecosystem at NexGen Banking Summit in NYC!

Elliot led a roundtable on one of fintech's biggest challenges: Most fraud models aren’t fast enough to prevent payment losses. He broke down how modern inference architectures are closing this gap, how we achieve sub-10ms decision-making, and when to build custom tooling versus buying off-the-shelf.

Marc spoke about his time as product lead at Google Wallet and founding Index (now Stripe Terminal). Drawing a through-line from the early mobile payments revolution to today’s latency requirements, he shared how infrastructure decisions are critical for creating lasting competitive advantages in the space.

Beyond fraud detection, an interesting shift we observed was how financial institutions are exploring ML to optimize customer experience and HR processing. Banks increasingly view their data infrastructure as a platform that can serve multiple use cases.

VeloxCon at Meta HQ (Menlo Park)

Nathan and Chase traveled to MPK to present an in-depth technical dive on our Symbolic Python Interpreter. Nathan pulled back the curtain on how we transpile Python into highly optimized Velox expressions at query-plan time, showcasing specific optimizations for reducing the performance bottlenecks during expression initialization.
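To make the idea of transpiling concrete, here is a toy sketch that uses Python's standard `ast` module to turn a one-line Python expression into a nested-tuple expression tree. It is not Chalk's interpreter and it does not call Velox; the operator names (`add`, `gt`, `if`) and the tuple layout are assumptions made up for illustration.

```python
import ast

# Illustrative operator names only; a real engine would map to its own function registry.
_BINOPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul", ast.Div: "div"}
_CMPOPS = {ast.Gt: "gt", ast.GtE: "gte", ast.Lt: "lt", ast.LtE: "lte", ast.Eq: "eq"}


def to_expr(source: str):
    """Parse one Python expression and return a nested-tuple expression tree."""
    return _convert(ast.parse(source, mode="eval").body)


def _convert(node):
    if isinstance(node, ast.Name):
        return ("column", node.id)
    if isinstance(node, ast.Constant):
        return ("literal", node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
        return (_BINOPS[type(node.op)], _convert(node.left), _convert(node.right))
    if (
        isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and type(node.ops[0]) in _CMPOPS
    ):
        op = _CMPOPS[type(node.ops[0])]
        return (op, _convert(node.left), _convert(node.comparators[0]))
    if isinstance(node, ast.IfExp):
        return ("if", _convert(node.test), _convert(node.body), _convert(node.orelse))
    # Anything else (calls, attribute access, loops) is out of scope for this sketch.
    raise NotImplementedError(f"unsupported syntax: {type(node).__name__}")


tree = to_expr("amount * 2 if amount > 100 else amount")
```

Because the translation happens once, before any rows are processed, the engine can optimize the resulting tree instead of running Python per row.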

In his talk, Nathan revealed what we discovered while profiling a customer query with 10,000+ expressions: a full 50% of CPU time was being spent just on expression initialization. By caching expression equality patterns and implementing rewrite optimizations, we reduced this overhead to 17%, and pooling ExprSets across queries with the same shape brought it down even further.
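As a rough illustration of why caching on expression equality helps, the sketch below memoizes "initialization" by structural equality, so an identical subexpression is only set up once no matter how many times it appears. The `Expr` and `ExprCache` classes are hypothetical stand-ins, not Velox's `ExprSet` or anything from Chalk's codebase, and the compile step is deliberately trivial.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Expr:
    """Hypothetical expression node; frozen so equal trees hash and compare equal."""
    op: str                          # "col", "const", "add", or "mul"
    args: Tuple["Expr", ...] = ()
    value: object = None             # column name or constant value


class ExprCache:
    """Compiles each structurally distinct expression once and reuses the result."""

    def __init__(self):
        self._compiled: Dict[Expr, Callable[[dict], float]] = {}
        self.init_count = 0          # how many expressions were actually initialized

    def compile(self, expr: Expr) -> Callable[[dict], float]:
        cached = self._compiled.get(expr)
        if cached is not None:
            return cached
        self.init_count += 1
        children = [self.compile(arg) for arg in expr.args]
        if expr.op == "col":
            name = expr.value
            fn = lambda row: row[name]
        elif expr.op == "const":
            constant = expr.value
            fn = lambda row: constant
        elif expr.op == "add":
            left, right = children
            fn = lambda row: left(row) + right(row)
        elif expr.op == "mul":
            left, right = children
            fn = lambda row: left(row) * right(row)
        else:
            raise ValueError(f"unknown op: {expr.op}")
        self._compiled[expr] = fn
        return fn


cache = ExprCache()
x = Expr("col", value="x")
first = Expr("add", (Expr("mul", (x, Expr("const", value=2))), x))
# A separately constructed but structurally identical tree.
second = Expr("add", (Expr("mul", (Expr("col", value="x"), Expr("const", value=2))),
                      Expr("col", value="x")))
f1 = cache.compile(first)
f2 = cache.compile(second)
```

Here `second` costs no additional initialization at all, which is the same effect, in miniature, that matters when a query carries thousands of repeated expression shapes.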

Beyond our own work, VeloxCon hummed with conversations on bridging traditional data engineering with modern ML demands. Many talks explored how to maximize performance across both CPUs and GPUs while maintaining developer accessibility in ecosystems like Spark and Presto. Developers we spoke to wrestled with the same core challenge: how do we make data infrastructure work better for ML without forcing developers to abandon the tools they already know and love?

If you’re curious, Nathan’s full talk is now available on YouTube!

Agents & GenAI Infrastructure & Tooling Summit (Virtual)

Our co-founder Andy presented at the virtual Agents & GenAI Summit, demonstrating how to blend structured data with LLM outputs for fraud detection. Using GitHub star manipulation as our case study (a CMU paper estimates there are 4.5 million fake stars!), he showed how traditional graph algorithms like CopyCatch can detect obvious patterns but fall short with sophisticated fraud. Coordinated networks distributing malware through phishing repositories require more nuanced approaches.
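For a feel of what "obvious patterns" means here, the sketch below flags pairs of accounts that starred several of the same repositories within a short time of each other. This is a much simpler lockstep heuristic in the spirit of CopyCatch, not an implementation of it and not what Andy demoed; the function name, the one-hour window, and the threshold of three shared repos are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def lockstep_pairs(events, window_s=3600, min_shared=3):
    """Return {(user_a, user_b): shared_count} for suspiciously synchronized pairs.

    events: iterable of (user, repo, unix_timestamp) star events.
    A repo counts as shared when both users starred it within window_s seconds.
    """
    # Earliest star time per user, grouped by repo.
    by_repo = defaultdict(dict)
    for user, repo, ts in events:
        prev = by_repo[repo].get(user)
        if prev is None or ts < prev:
            by_repo[repo][user] = ts

    shared = defaultdict(int)
    for stars in by_repo.values():
        # Sorting by user name keeps each pair in a consistent (a, b) order.
        # Note: this is quadratic in the number of stargazers per repo.
        for (u, tu), (v, tv) in combinations(sorted(stars.items()), 2):
            if abs(tu - tv) <= window_s:
                shared[(u, v)] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared}


events = [
    ("bot_a", "repo1", 0), ("bot_b", "repo1", 10),
    ("bot_a", "repo2", 100), ("bot_b", "repo2", 130),
    ("bot_a", "repo3", 500), ("bot_b", "repo3", 505),
    ("carol", "repo1", 90_000), ("carol", "repo2", 200_000),
]
flagged = lockstep_pairs(events)
```

A rule like this catches accounts that move in tight lockstep, but a network that spreads its stars out over days would slip under the window, which is where richer signals come in.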

The demo illustrated how Chalk streamlines fraud detection by abstracting away complex orchestration between data sources. Teams can extract insights from wherever their data lives (Postgres, 3rd party APIs, vector stores) while we handle the parallelization, caching, and execution details. This frees engineers to focus on building and refining models rather than wrestling with infrastructure. The same pattern works equally well for financial fraud, content moderation, and anywhere else that needs both speed and precision.
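To show the kind of orchestration being abstracted away, here is a hand-rolled sketch using plain `asyncio`: three sources are queried concurrently and results are cached per user. None of this is Chalk's API; the source functions, feature names, and returned values are invented placeholders, and the `sleep` calls stand in for network latency.

```python
import asyncio


# Hypothetical data sources; each sleep simulates a network round trip.
async def from_postgres(user_id):
    await asyncio.sleep(0.01)
    return {"account_age_days": 412}


async def from_third_party_api(user_id):
    await asyncio.sleep(0.01)
    return {"email_risk_score": 0.07}


async def from_vector_store(user_id):
    await asyncio.sleep(0.01)
    return {"nearest_known_fraud_distance": 0.83}


class FeatureFetcher:
    """Fetches from all sources concurrently and caches each (source, user) result."""

    def __init__(self, sources):
        self._sources = sources
        self._cache = {}
        self.source_calls = 0        # counts real (uncached) source requests

    async def _fetch_one(self, source, user_id):
        key = (source.__name__, user_id)
        if key not in self._cache:
            self.source_calls += 1
            self._cache[key] = await source(user_id)
        return self._cache[key]

    async def features(self, user_id):
        # gather runs the source coroutines concurrently rather than one by one.
        parts = await asyncio.gather(
            *(self._fetch_one(source, user_id) for source in self._sources)
        )
        merged = {}
        for part in parts:
            merged.update(part)
        return merged


fetcher = FeatureFetcher([from_postgres, from_third_party_api, from_vector_store])
first = asyncio.run(fetcher.features("user_1"))
second = asyncio.run(fetcher.features("user_1"))   # served from the cache
```

Even this small version has to deal with concurrency and cache keys, and it still ignores timeouts, retries, and cache invalidation, which is the sort of plumbing a platform can take off an engineer's plate.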

OptimizedAI Conference (ATL) & Data Council (SF)

Elvis split his time between Atlanta and SF. First, he traveled to OptimizedAI to demo Chalk, share why we built our Symbolic Python Interpreter, and explain how we're thinking about Iceberg and the role it plays in separating storage from compute. Afterwards, he flew back to SF just in time for Data Council 2025!

Open standards dominated discussions at both events. There’s plenty of interest in how Apache Arrow and Iceberg are enabling seamless integration into existing data ecosystems, but this newfound interoperability introduces an interesting paradox: more flexibility often means more complexity as teams navigate an expanding universe of tools.

What connected these conversations was a growing emphasis on simplifying the modern data stack. We saw widespread validation for our Python-first philosophy that democratizes access by meeting practitioners where they are, using languages they already know. This reflects our core approach at Chalk – evolving familiar tools to meet modern needs rather than forcing teams to abandon what already works for them.

What's Next?

After a month of sharing our work at conferences across the country, it's clear that teams everywhere are wrestling with the same challenges we are: building AI-native ML systems that deliver both speed and simplicity.

If you're interested in ML infrastructure and facing similar challenges around latency, developer experience, or bridging structured and unstructured data, we'd love to connect!

Nathan in his element at VeloxCon, breaking down why real-time ML is a real pain