
Tracking issues of Data Lake with Iceberg Support #12272

Open
4 of 7 issues completed
@Xuanwo

Description

After the close of #11947, Databend has completed all preparation work required for implementing data lake support!

Databend now has multi-catalog support!

We can create a new catalog like:

CREATE CATALOG iceberg_ctl
TYPE=ICEBERG
CONNECTION=(
    URL='s3://testbucket/iceberg_ctl/'
    AWS_KEY_ID='minioadmin'
    AWS_SECRET_KEY='minioadmin'
    ENDPOINT_URL='${STORAGE_S3_ENDPOINT_URL}'
);

And we can show/drop them:

SHOW DATABASES IN iceberg_ctl;
SHOW TABLES IN iceberg_ctl.iceberg_db;
DROP CATALOG IF EXISTS iceberg_ctl;

Databend can now read existing Iceberg tables!

We can query data in an existing Iceberg table like the following:

SELECT count(*) FROM iceberg_ctl.iceberg_db.iceberg_tbl;

We now have a way to add data lake features to Databend. Here are some ideas that we can start working on:

Tasks

Our current goal is to make reading from Iceberg tables fast and reliable.

  • Implement push_down for Iceberg tables
  • Implement Iceberg REST catalog support
  • Work with the Iceberg community to build iceberg-rust
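
To make the push_down task above more concrete, here is a rough sketch of the general idea: use per-file column bounds (which Iceberg manifests can record) to skip data files that cannot match a filter. This is not Databend's or iceberg-rust's implementation. `DataFile`, `may_contain`, and `prune` are hypothetical names invented for illustration, and only three comparison operators are handled.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    # Hypothetical stand-in for a data file entry with per-column bounds.
    path: str
    lower: dict  # column name -> minimum value in the file
    upper: dict  # column name -> maximum value in the file


def may_contain(file, column, op, value):
    """Return False only when the bounds prove that no row can match."""
    lo, hi = file.lower.get(column), file.upper.get(column)
    if lo is None or hi is None:
        return True  # no statistics: the file must still be scanned
    if op == "=":
        return lo <= value <= hi
    if op == ">":
        return hi > value
    if op == "<":
        return lo < value
    return True  # unknown operator: stay conservative and keep the file


def prune(files, column, op, value):
    """Keep only the files that might contain matching rows."""
    return [f for f in files if may_contain(f, column, op, value)]


files = [
    DataFile("a.parquet", {"id": 1}, {"id": 100}),
    DataFile("b.parquet", {"id": 101}, {"id": 200}),
    DataFile("c.parquet", {}, {}),  # no bounds recorded
]

# Roughly what a filter such as `WHERE id > 150` could prune before reading.
selected = [f.path for f in prune(files, "id", ">", 150)]
print(selected)
```

The sketch never drops a file unless the bounds rule it out, so a file without statistics is always kept. Pruning like this affects only how much data is read, not the query result, provided the filter is still applied to the rows that are scanned.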

Future
