
Cloud data warehouse vendor Snowflake has announced full-fledged support for the Apache Iceberg open table format, in a move that reflects a larger trend toward open source, especially as it supports data and artificial intelligence (AI) applications.
Apache Iceberg, which began development in 2017, is well regarded as an open source format for large analytics tables and is commonly used by the largest enterprises. Iceberg is a management layer that operates on top of data files in the cloud, so it offers metadata for large data repositories, which speeds up the data mining process that drives AI and other data-intensive software. Iceberg allows numerous data engines, including Impala, Spark and Hive, to work on the same tables simultaneously, which promotes significant data interoperability, a key Snowflake advantage as a data warehouse provider.
Before the announcement, Snowflake already offered limited functionality with Iceberg. Customers could, for instance, use Snowflake security and governance tools with Iceberg, but not the data vendor's full array of data tools. This meant some users needed to cobble together a solution, or accept a lower level of functionality. Now with Snowflake's full embrace of the open table format, Iceberg can be used in any configuration.
In a larger sense, Snowflake's support of Iceberg moves the company toward greater openness and interoperability. In the tech industry's great debate between closed, proprietary models (still a massive revenue source for many companies) and open source formats, Snowflake is now fully positioning itself on the open side of the debate.
"The future of data is open, but it also needs to be easy," said Christian Kleinerman, EVP of Product, Snowflake. "Customers shouldn't have to choose between open formats and best-in-class performance or business continuity."
Among the performance enhancements: Snowflake touts that it brings "seamless security" to Iceberg tables, arguably an important reassurance for some cautious enterprise users who might feel that open source is less secure than proprietary environments. The company notes that it is extending its data replication and syncing to Iceberg tables (so far still in preview), which it claims will allow customers faced with a cyberattack to rapidly restore their data without disruption.
Snowflake's embrace of Iceberg could also be a broadside against Databricks, its close rival in the data platform market. The competition between these two leaders is high stakes, given that the lucrative AI sector not only requires massive amounts of data but also needs that data to be stored, prepped and cataloged before it can be fully leveraged, which is exactly the market that Snowflake and Databricks serve.
Both companies have included Iceberg as part of their platforms, but neither had completely embraced it until Snowflake's recent announcement. Both had tools and features that offered Iceberg-like functionality, and so presumably hoped customers would adopt their own in-house tools instead of Iceberg. But Iceberg has become increasingly popular among large enterprises because it streamlines data prepping for vast data repositories stored in data lakes.
So Snowflake's news about Iceberg is a major offer for enterprise customers. "With Snowflake's latest Iceberg tables innovations, customers can work with their open data exactly as they would with data stored in the Snowflake platform, all while removing complexity and preserving Snowflake's enterprise-grade performance and security," said Kleinerman.