Skip to content

[FLINK-24379][Formats] Avro Glue Schema Registry table format#122

Open
nicusX wants to merge 7 commits intoapache:mainfrom
nicusX:gsr-table-format
Open

[FLINK-24379][Formats] Avro Glue Schema Registry table format#122
nicusX wants to merge 7 commits intoapache:mainfrom
nicusX:gsr-table-format

Conversation

@nicusX
Copy link
Copy Markdown

@nicusX nicusX commented Dec 29, 2023

Purpose of the change

Implement Table API support for Avro with AWS Glue Schema Registry

Verifying this change

  • Added unit tests
  • Manually verified by running a Flink application locally against a local Kafka cluster and using GSR.

Significant changes

  • Dependencies have been added or upgraded
  • Public API has been changed (Public API is any class annotated with @Public(Evolving))
  • Serializers have been changed
  • New feature has been introduced
    • If yes, how is this documented? docs

The implementation is mostly based on the original PR to the main Flink repo with minor changes. SQL uber-jar has been added and docs fixed and enriched.

@boring-cyborg
Copy link
Copy Markdown

boring-cyborg Bot commented Dec 29, 2023

Thanks for opening this pull request! Please check out our contributing guidelines. (https://flink.apache.org/contributing/how-to-contribute.html)

@nicusX
Copy link
Copy Markdown
Author

nicusX commented Jan 4, 2024

@dannycranmer please review this PR when you have the chance

Copy link
Copy Markdown
Contributor

@z3d1k z3d1k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this contribution @nicusX!

Please add integration tests to cover usage of the format using SQL.

Comment thread flink-formats-aws/flink-avro-glue-schema-registry/pom.xml Outdated
Comment thread flink-formats-aws/flink-sql-avro-glue-schema-registry/pom.xml Outdated
Fixed param description to match actual default
Fix typo in NOTICE
@nicusX
Copy link
Copy Markdown
Author

nicusX commented May 13, 2024

@z3d1k I addressed most of the comments. But I am not sure how to implement the IT, in particular for a format using an external registry.
Any reference or example I can follow?
Most of formats lack of IT. And this one requires an external system

The only similar format, flink-avro-confluent-registry, doesn't have any IT.

The GRS Serialization/DeserializationSchema IT relies on a AWS account that I assume is available during the CI/CD build. Any reference on how to leverage this?


/** Options for AWS Glue Schema Registry Avro format. */
@PublicEvolving
public class AvroGlueFormatOptions {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about aligning the naming with AWSSchemaRegistryConstants (for all config options below)?
Example:

    public static final ConfigOption<String> AWS_REGION =
            ConfigOptions.key(AWSSchemaRegistryConstants.AWS_REGION)
                    .stringType()
                    .noDefaultValue()
                    .withDescription("AWS region");

public static final ConfigOption<String> REGISTRY_NAME =
ConfigOptions.key("registry.name")
.stringType()
.noDefaultValue()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.withDescription(
"The Schema name under which to register the schema used by this format during serialization.");

public static final ConfigOption<AvroRecordType> SCHEMA_TYPE =
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the serializer/deserializer shipped with this PR, you'll only ever operate with Generic Records. This option should not be part of the API.
Consequently, in the buildConfigMap of the factory, GENERIC_RECORD should be "forced" into the config map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants