Hudi Flink Upserts, Materialize 26.28, Airflow Go SDK
Hudi 1.2.0 brings stateless Flink upserts, Materialize v26.28.0 and Redpanda v26.1.10 ship, Airflow adds a Go Task SDK, ClickHouse opens Postgres ClickPipes
DataPrincipal Daily, June 13th, 2026
Streaming upserts and vector-native lakehouse storage anchored a week of incremental releases, while Airflow added Go task authoring and ClickHouse opened a managed Postgres migration path.
🐘 Apache Hudi 1.2.0 moves Flink global upserts off operator state and into the Record Level Index.
💎 Lance’s data-evolution write path adds columns to terabyte-scale multimodal tables by writing only new data.
🌀 Airflow ships a Go Task SDK beta as ClickHouse opens managed Postgres-to-Postgres migrations.
🔬 Deep Dives
Stateless global upserts land for Flink streaming in Apache Hudi 1.2.0 (8 minute read)
Danny Chan and Shuo Cheng describe, in a June 10 post, how Hudi 1.2.0 moves the global index off Flink operator state and into the Record Level Index in the Metadata Table, allowing partition-agnostic upserts when update events arrive without their original partition. Flink buffers records in mini-batches, checks a small local cache for in-flight updates, then queries RLI shards to route each record to the correct write target.
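The routing flow described above can be sketched in a few lines. This is a toy model, not Hudi's implementation: `RecordLevelIndexStub`, `route_mini_batch`, the shard count, and the use of partition strings as write targets are all hypothetical names and simplifications chosen for illustration.

```python
import zlib


class RecordLevelIndexStub:
    """Hypothetical stand-in for the Record Level Index: key -> partition, split into shards."""

    def __init__(self, num_shards=4):
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # Deterministic hash so a key always maps to the same shard.
        return self.shards[zlib.crc32(key.encode()) % len(self.shards)]

    def lookup(self, key):
        return self._shard_for(key).get(key)

    def put(self, key, partition):
        self._shard_for(key)[key] = partition


def route_mini_batch(batch, local_cache, index):
    """Route each (key, partition_hint) record; the hint may be None for updates."""
    routed = []
    for key, partition_hint in batch:
        # 1. In-flight updates from this or earlier mini-batches win.
        target = local_cache.get(key)
        # 2. Otherwise ask the index where the key already lives.
        if target is None:
            target = index.lookup(key)
        # 3. Unknown key: treat as an insert into the partition the event carries.
        if target is None:
            target = partition_hint
        local_cache[key] = target
        routed.append((key, target))
    return routed


index = RecordLevelIndexStub()
index.put("user-1", "2026-06-01")  # key committed in an earlier write
cache = {}
batch = [("user-1", None), ("user-2", "2026-06-13"), ("user-2", None)]
routed = route_mini_batch(batch, cache, index)
print(routed)
```

In this sketch the update for `user-1` arrives without a partition and is resolved through the index, while the second `user-2` event is resolved from the local cache because its insert has not reached the index yet.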
Scalable feature engineering on multimodal datasets with Lance (9 minute read)
In a June 8 post, Prashanth Rao explains how the Lance format’s data-evolution feature adds columns by writing only the new column data and leaving existing blobs, embeddings, and indexes in place. The post claims that adding a few-megabyte column writes only a few megabytes even when the joined table is already terabytes or petabytes in size, and demonstrates the approach on the 1,162,252-row LAION-1M dataset with a 768-dimensional embedding column.
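A quick back-of-the-envelope calculation shows why the write cost tracks the new column rather than the table. The row count and dimensionality come from the summary above; the 4-byte float32 element size and the 1 TiB table size are assumptions made here for illustration, and the figure ignores encoding and compression.

```python
ROWS = 1_162_252          # LAION-1M row count from the post
DIMS = 768                # embedding dimensionality from the post
BYTES_PER_VALUE = 4       # assumption: float32 embeddings
TABLE_BYTES = 1024 ** 4   # assumption: a hypothetical 1 TiB existing table


def column_bytes(rows, dims, bytes_per_value):
    """Raw, uncompressed size of a fixed-width vector column."""
    return rows * dims * bytes_per_value


new_column_bytes = column_bytes(ROWS, DIMS, BYTES_PER_VALUE)
# Share of the table that would be touched if only the new column is written.
fraction_written = new_column_bytes / TABLE_BYTES

print(f"new column: {new_column_bytes:,} bytes ({new_column_bytes / 1024 ** 3:.2f} GiB)")
print(f"fraction of a 1 TiB table written: {fraction_written:.4%}")
```

Under these assumptions the embedding column is about 3.3 GiB of raw data, a small fraction of what a full rewrite of a terabyte-scale table would cost.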
🚀 Launches & Tools
ClickHouse opens managed Postgres-to-Postgres migrations in ClickPipes (8 minute read)
ClickHouse engineer Amogh Bharadwaj announced a public-beta ClickPipes capability on June 11 that runs low-downtime Postgres-to-Postgres migrations inside ClickHouse Cloud as part of Postgres by ClickHouse. It uses PeerDB technology for parallel transfer and change data capture, automates schema dump and restore, and adds independent retry logic with email and UI notifications.
Materialize v26.28.0 cuts steady-state CPU under temporal filters
Materialize released v26.28.0 to self-managed users on June 12, reporting in internal tests that steady-state CPU while using temporal filters dropped from 75% to 4%. The release also lets a single DROP statement remove co-named dependents and adds nested JWT path support to its OIDC authentication.
Redpanda v26.1.10 adds OIDC proxy basic auth and Iceberg schema fixes
Redpanda v26.1.10 shipped June 12 with HTTP Basic auth for the OIDC forward proxy through new oidc_http_proxy_username and oidc_http_proxy_password cluster configs. It fixes schema evolution incorrectly rejecting new optional Iceberg columns that contain structurally required nested fields, patches krb5 CVE-2026-40355 and CVE-2026-40356, and adds an iceberg_default_schema_case_insensitive property.
Apache Airflow ships a Go Task SDK at v1.0.0-beta2
The Apache Airflow Go Task SDK, which lets developers author tasks in Go against the AIP-72 Task Execution Interface rather than Python, was tagged v1.0.0-beta2 on June 11 by contributor jason810496. The SDK runs tasks in the external, isolated execution runtime introduced in Airflow 3.
OpenMetadata 1.12.11 patches MCP, search, and a Netty CVE (4 minute read)
OpenMetadata released 1.12.11 on June 12, a maintenance update to the 1.12 line separate from the 1.13.0 feature release, with MCP fixes (compact entity responses, similarity scores in the search tool, an OAuth state double-encoding fix) and search fixes that restore aliases after reindex and make users searchable by email. It bumps react-router-dom to 6.30.4 and netty-bom to 4.1.135.Final to address CVE-2026-44249.
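For readers unfamiliar with the bug class, the snippet below shows what double-encoding an OAuth `state` value generally looks like. It is a generic illustration using Python's standard library, not OpenMetadata's code, and the `state` payload is a made-up example.

```python
from urllib.parse import quote, unquote

# Hypothetical state payload; real values are application-specific.
state = '{"redirect":"/explore?tab=tables"}'

encoded_once = quote(state, safe="")
# Encoding again turns every '%' into '%25', so one decode no longer recovers the value.
encoded_twice = quote(encoded_once, safe="")

decoded_once = unquote(encoded_twice)
decoded_twice = unquote(decoded_once)

print(encoded_once)
print(encoded_twice)
print(decoded_once == state, decoded_twice == state)
```

A receiver that decodes once sees a still-encoded string and fails to match it against the state it issued, which is the typical symptom of this kind of bug.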
📈 Opinions & Advice
When to use Apache Kafka, and when not to (6 minute read)
Confluent associate solutions architect Mohtasham Sayeed Mohiuddin published a June 12 decision guide that frames Kafka as a distributed commit log rather than a message queue. It lays out five scenarios where Kafka fits (ordered durable streams, fan-out, replay, real-time pipelines, throughput above 10,000 events per second), three anti-patterns (task queues, small-scale workloads, synchronous request-reply), and a five-question test scored against RabbitMQ, AWS SQS and SNS, and PostgreSQL.
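The commit-log framing can be made concrete with a small in-memory model. This is a conceptual sketch only: `CommitLog` and `WorkQueue` are hypothetical classes written for this example and do not use or mirror any Kafka client API.

```python
from collections import deque


class CommitLog:
    """Append-only log; each consumer tracks its own read offset."""

    def __init__(self):
        self._records = []
        self._offsets = {}

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def poll(self, consumer, max_records=10):
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch

    def seek(self, consumer, offset):
        # Rewinding the offset is all that replay requires.
        self._offsets[consumer] = offset


class WorkQueue:
    """Classic queue: taking an item removes it for everyone."""

    def __init__(self):
        self._items = deque()

    def put(self, item):
        self._items.append(item)

    def take(self):
        return self._items.popleft() if self._items else None


log = CommitLog()
for event in ["order-1", "order-2", "order-3"]:
    log.append(event)

billing_first = log.poll("billing")
analytics_first = log.poll("analytics")   # fan-out: same records, independent reader
log.seek("billing", 0)
billing_replay = log.poll("billing")      # replay: records were never removed

queue = WorkQueue()
queue.put("job-1")
first_take = queue.take()
second_take = queue.take()                # nothing left for a second worker
```

The log keeps records after they are read, so ordering, fan-out, and replay fall out of the data structure, while the queue fits the task-queue pattern the guide lists as a poor match for Kafka.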
💎 Gems & Repos
apache/datafusion-comet
Comet is an Apache Spark plugin that swaps Spark physical operators for an Apache DataFusion native execution engine, aiming to speed up existing Spark SQL jobs without query changes. The repository is actively developed, with the 0.16.0 line cited in recent benchmarks against vanilla Spark on Parquet workloads.
⚡ Quick Links
Apache Spark 4.0.3: a June 11 maintenance release on the branch-4.0 line with security and correctness fixes, recommended for all 4.0 users.
Feast adds offline-store RED metrics and SOX audit logging (6 minute read): a June 9 update adding three Prometheus RED metrics for the offline store and a structured JSON audit logger that records entity key names rather than values.
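The idea of logging entity key names rather than values can be sketched as follows. This is not Feast's logger: `build_audit_event` and every field name in the event are hypothetical, chosen only to show the pattern with the standard library.

```python
import json
from datetime import datetime, timezone


def build_audit_event(actor, feature_view, entity_rows):
    """Build a structured audit record that names the entity keys but omits their values."""
    key_names = sorted({name for row in entity_rows for name in row})
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "feature_view": feature_view,
        "entity_key_names": key_names,   # names only, never the values
        "row_count": len(entity_rows),
    }


# Hypothetical retrieval request with sensitive identifiers as values.
rows = [{"driver_id": "D-SENSITIVE-1"}, {"driver_id": "D-SENSITIVE-2"}]
event = build_audit_event("svc-training", "driver_stats", rows)
audit_line = json.dumps(event, sort_keys=True)
print(audit_line)
```

The resulting JSON line records who accessed which feature view, with which key columns and how many rows, without copying identifiers into the audit trail.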


