Integrate any data source. Always operate on fresh data.
Only pay for the data that you need.
Orchestrate compute, caching, scheduling, and streaming.
Execute Python on a Rust-based runtime for maximum performance.
Integrate any API or Data Source
Instead of complex ETL jobs and streaming pipelines, make direct calls to your data sources. Chalk makes it fast and easy to call external APIs, query production databases and data warehouses, or fetch Parquet files from S3.
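A minimal sketch of the idea (not Chalk's API): a resolver function queries a production database directly at request time, with no intermediate ETL step. The table, columns, and function name here are invented for illustration, with an in-memory SQLite database standing in for a real source.

```python
import sqlite3

# Hypothetical resolver: compute a feature by querying the source directly.
# Table and column names are invented for this sketch.
def user_account_age_days(user_id: int, conn: sqlite3.Connection) -> int:
    row = conn.execute(
        "SELECT CAST(julianday('now') - julianday(created_at) AS INTEGER) "
        "FROM users WHERE id = ?",
        (user_id,),
    ).fetchone()
    return row[0]

# In-memory database standing in for a production source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.execute("INSERT INTO users VALUES (1, date('now', '-30 days'))")
print(user_account_age_days(1, conn))  # → 30
```

Because the query runs on demand, the feature reflects whatever is in the source at call time rather than the last ETL run.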
Just-in-Time Data
Chalk’s just-in-time (JIT) feature pipelines fetch data on demand so you have fresh data for predictions and don’t waste money on data you don’t need. You get fine-grained control over freshness at a per-model level, so you can get the most accurate data possible for sensitive use cases.
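The per-model freshness control can be sketched as a cache lookup gated by the caller's staleness budget. This is a simplified illustration of the concept, not Chalk's implementation or API; the names `get_feature` and `max_staleness` are ours.

```python
import time

# feature name -> (value, timestamp when it was computed)
_cache: dict[str, tuple[float, float]] = {}

def get_feature(name: str, compute, max_staleness: float) -> float:
    """Serve a cached value if it is fresher than max_staleness, else recompute."""
    now = time.monotonic()
    if name in _cache:
        value, computed_at = _cache[name]
        if now - computed_at < max_staleness:
            return value  # fresh enough for this model
    value = compute()
    _cache[name] = (value, now)
    return value

calls = []
score = lambda: calls.append(1) or 42.0   # counts how often we recompute
get_feature("user.score", score, max_staleness=60.0)  # computes
get_feature("user.score", score, max_staleness=60.0)  # cache hit, no recompute
get_feature("user.score", score, max_staleness=0.0)   # staleness 0 forces recompute
print(len(calls))  # → 2
```

A latency-tolerant batch model can pass a generous staleness budget, while a fraud model on the same feature can pass a tight one, all without duplicating the pipeline.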
Idiomatic Python
Use the libraries and patterns you know and love — SQLAlchemy, DataFrames, Pydantic, Pendulum. Use a requirements.txt or a custom Dockerfile to add any dependencies you need. Chalk automatically distributes and parallelizes Python methods so that you can write business logic without worrying about infrastructure.
Declarative Pipelines
Don’t repeat yourself. Declaratively specify dependencies between pipeline stages and easily compose resolvers. Just add an argument to your resolver to add a new data dependency, and let Chalk automatically re-orchestrate pipelines to supply your function with the data it needs.
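One way to picture this orchestration: a runner infers each resolver's inputs from its argument names and executes resolvers in dependency order. This is a toy sketch of the idea, not Chalk's implementation; `run_pipeline` and the example resolvers are invented here.

```python
import inspect

def run_pipeline(resolvers: dict, seed: dict) -> dict:
    """Run resolvers in dependency order, inferring inputs from argument names."""
    values = dict(seed)
    pending = dict(resolvers)
    while pending:
        for name, fn in list(pending.items()):
            deps = inspect.signature(fn).parameters
            if all(d in values for d in deps):
                values[name] = fn(**{d: values[d] for d in deps})
                del pending[name]
                break
        else:
            raise RuntimeError("unresolvable dependencies")
    return values

# Adding an argument to a resolver is all it takes to declare a new dependency.
def full_name(first: str, last: str) -> str:
    return f"{first} {last}"

def greeting(full_name: str) -> str:
    return f"Hello, {full_name}!"

out = run_pipeline(
    {"full_name": full_name, "greeting": greeting},
    seed={"first": "Ada", "last": "Lovelace"},
)
print(out["greeting"])  # → Hello, Ada Lovelace!
```

If `greeting` later needs another input, adding a parameter to its signature is the entire change; the runner reorders execution to supply it.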
Preview Deploys
Chalk enables you to iterate on feature pipelines quickly with isolated preview deployments for pull requests and git branches. Run integration tests and manual QA against sandboxed deployments to verify that updates perform as expected, and share your in-progress work with other developers.
Rust Compute
Let Chalk do the hard work of making your code execute fast. Chalk executes simple Python in a Rust-based runtime that automatically parallelizes data fetches, multithreads dataframe operations, and pushes operations down to your underlying data sources. The result? Pure Python with native performance.
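To make the parallelization concrete, here is the kind of transformation the runtime performs for you, written out by hand: three independent fetches issued concurrently instead of sequentially. The sources and latencies are simulated for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch(source: str) -> str:
    time.sleep(0.05)  # stand-in for network / database latency
    return f"{source}-data"

sources = ["postgres", "s3", "stripe-api"]
start = time.monotonic()
with ThreadPoolExecutor() as pool:
    # All three fetches run concurrently; total time ≈ the slowest fetch,
    # not the sum of all three.
    results = list(pool.map(fetch, sources))
elapsed = time.monotonic() - start
print(results, round(elapsed, 2))
```

With Chalk, you write only the sequential-looking resolver logic; the runtime discovers that the fetches are independent and overlaps them for you.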
Metrics, Logs, and Traces
Make your infrastructure transparent: Chalk automatically instruments all Python functions with logging, metrics, and data capture so that you can easily debug production issues and understand how your code behaves in real time. You can even export logs and metrics from Chalk to tools like Datadog for a full view of how your data infrastructure integrates with the rest of your system.
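The effect of automatic instrumentation can be sketched as a decorator that records call counts and latency and logs inputs and outputs for every wrapped function. This is an illustrative stand-in, not Chalk's instrumentation layer; `instrumented` and `credit_score` are invented names.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("resolvers")

metrics: dict = {}  # function name -> {"calls": n, "total_s": seconds}

def instrumented(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        stats = metrics.setdefault(fn.__name__, {"calls": 0, "total_s": 0.0})
        stats["calls"] += 1
        stats["total_s"] += elapsed
        log.info("%s%s -> %r in %.4fs", fn.__name__, args, result, elapsed)
        return result
    return wrapper

@instrumented
def credit_score(user_id: int) -> int:
    return 700 + user_id % 50  # placeholder business logic

credit_score(7)
credit_score(12)
print(metrics["credit_score"]["calls"])  # → 2
```

Because every function is wrapped the same way, the logs, timings, and captured values land in one place, ready to export to a tool like Datadog.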
Get Started with Code Examples
Unlock the power of real-time data pipelines.
Cache feature values rather than computing them in real time.
Run resolvers on a schedule with all possible arguments.
Bypass the cache with a max-staleness of 0.
Filtered Cron
Run resolvers on a schedule and filter down which examples to consider.
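The filtered-cron pattern above can be pictured as a scheduled run that applies a predicate before invoking the resolver, so only the entities you care about are recomputed. This is a hedged sketch of the pattern, not Chalk's cron API; the entities, resolver, and `run_cron` helper are invented here.

```python
users = [
    {"id": 1, "active": True},
    {"id": 2, "active": False},
    {"id": 3, "active": True},
]

def refresh_risk_score(user_id: int) -> str:
    return f"refreshed:{user_id}"  # placeholder resolver body

def run_cron(entities, resolver, filter_fn):
    # On each scheduled tick, filter down which examples to consider,
    # then invoke the resolver only for those.
    return [resolver(e["id"]) for e in entities if filter_fn(e)]

ran = run_cron(users, refresh_risk_score, lambda u: u["active"])
print(ran)  # → ['refreshed:1', 'refreshed:3']
```

Filtering before resolution keeps scheduled runs cheap when only a subset of entities needs fresh values.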
Explore All