Ibis 8.0 allows data teams to write code once for different engines



Voltron Data announced the release of Ibis 8.0, the latest version of the popular Python dataframe API. Ibis has been downloaded more than 10 million times. It allows developers to run the same code across multiple data platforms by selecting the most appropriate query engine.

The latest version adds the first dedicated streaming backends, Apache Flink and RisingWave, alongside the existing batch execution engines. This expansion brings batch and streaming together within a single Python dataframe API, giving data analytics teams more flexibility in where their code runs.

“Finally developers can write code once and use it across local, batch, CPU, GPU, and now real-time query engines. Ibis leads the charge in removing the barriers that separate batch and stream processing engines. This is a big step toward a modular and composable data ecosystem across all paradigms,” said Josh Patterson, co-founder and CEO of Voltron Data. 

Ibis is an open-source, independently governed project that enjoys support from Voltron Data. Contributions come from a wide range of entities, including Google and Starburst Data. 

Ibis version 8.0 now supports 20 different query engine types, allowing it to handle a wide range of data processing requirements, from small queries with DuckDB to large-scale, distributed preprocessing and ETL tasks with engines such as BigQuery, Spark and Theseus. It also integrates with two streaming engines, Apache Flink and RisingWave, without requiring users to modify their code.
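To make the "write once, run on different engines" idea concrete, here is a minimal sketch of the general pattern using only the Python standard library. It is not the Ibis API: the `SumBy` class and the `run_in_python` and `run_in_sqlite` helpers are hypothetical names invented for illustration, and SQLite simply stands in for a SQL engine. The point is that one deferred expression is defined once and then handed to two different execution engines.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class SumBy:
    """Hypothetical deferred expression: sum `value` grouped by `key`."""
    table: str
    key: str
    value: str


def run_in_python(expr, tables):
    """'Local' engine: evaluate the expression over in-memory rows."""
    totals = {}
    for row in tables[expr.table]:
        totals[row[expr.key]] = totals.get(row[expr.key], 0) + row[expr.value]
    return sorted(totals.items())


def run_in_sqlite(expr, conn):
    """'SQL' engine: compile the same expression to SQL and execute it."""
    sql = (
        f'SELECT "{expr.key}", SUM("{expr.value}") FROM "{expr.table}" '
        f'GROUP BY "{expr.key}" ORDER BY "{expr.key}"'
    )
    return conn.execute(sql).fetchall()


# Illustrative data, loaded into both engines.
rows = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (:region, :amount)", rows)

# The expression is written once and never mentions an engine.
expr = SumBy(table="sales", key="region", value="amount")
python_result = run_in_python(expr, {"sales": rows})
sqlite_result = run_in_sqlite(expr, conn)
print(python_result == sqlite_result)
```

In this toy version the user code (the `expr` definition) stays the same while the engine behind it changes, which is the property the article attributes to Ibis across its supported backends.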

Development of Ibis is focused on improving user experience and functionality, according to Zhenzhong “Z” Xu, vice president of engineering at Voltron Data. Improvements to the Ibis API, such as the new ML preprocessing features, benefit every supported backend, and users work with a single, familiar dataframe API.

This approach provides a more versatile data processing environment and encourages the community to contribute. It also broadens the scope of Python-based analytics across different data platforms.

“As the Ibis API improves and adds new functionality like ML preprocessing, every backend it supports improves with it. Users can learn one familiar dataframe API, without being locked into a backend. The open source community can add Ibis ecosystem integrations to make working with data in Python better on any data platform Ibis supports,” said Xu.

