Hello everyone,
I'm encountering a small issue that seems to be related to settings, and I would appreciate any guidance in identifying the problem. This pertains to my upcoming videos where I'm covering the Hudi Hive Sync tool in detail.
I've started the Spark Thrift Server using the following command:
spark-submit \
--master 'local[*]' \
--conf spark.executor.extraJavaOptions=-Duser.timezone=Etc/UTC \
--conf spark.eventLog.enabled=false \
--class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 \
--name "Thrift JDBC/ODBC Server" \
--executor-memory 512m \
--packages org.apache.spark:spark-hive_2.12:3.4.0
Additionally, I have Beeline installed and connected to the default database:
beeline -u jdbc:hive2://localhost:10000/default
The Delta Streamer job itself works fine, but I run into problems when using it with the Hive Metastore.
Here's my spark-submit command for the Hudi Delta Streamer:
spark-submit \
--class org.apache.hudi.utilities.streamer.HoodieStreamer \
--packages 'org.apache.hudi:hudi-spark3.4-bundle_2.12:0.14.0,org.apache.hadoop:hadoop-aws:3.3.2' \
--repositories 'https://repo.maven.apache.org/maven2' \
--properties-file /Users/soumilshah/IdeaProjects/SparkProject/apache-hudi-delta-streamer-labs/E5/spark-config.properties \
--master 'local[*]' \
--executor-memory 1g \
/Users/soumilshah/IdeaProjects/SparkProject/apache-hudi-delta-streamer-labs/E5/jar/hudi-utilities-slim-bundle_2.12-0.14.0.jar \
--table-type COPY_ON_WRITE \
--op UPSERT \
--enable-hive-sync \
--source-ordering-field ts \
--source-class org.apache.hudi.utilities.sources.CsvDFSSource \
--target-base-path file:///Users/soumilshah/Downloads/hudidb/ \
--target-table orders \
--props hudi_tbl.props
Hudi Conf:
hoodie.datasource.write.recordkey.field=order_id
hoodie.datasource.write.partitionpath.field=order_date
hoodie.streamer.source.dfs.root=file:////Users/soumilshah/IdeaProjects/SparkProject/apache-hudi-delta-streamer-labs/E5/sampledata/orders
hoodie.datasource.write.precombine.field=ts
hoodie.deltastreamer.csv.header=true
hoodie.deltastreamer.csv.sep=\t
hoodie.datasource.hive_sync.enable=true
hoodie.datasource.hive_sync.mode=jdbc
hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://localhost:10000
hoodie.datasource.hive_sync.database=default
hoodie.datasource.hive_sync.table=orders
hoodie.datasource.hive_sync.partition_fields=order_date
Spark Conf:
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog
spark.sql.hive.convertMetastoreParquet=false
The error I'm encountering is:
Required table missing : "VERSION" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations. Either your MetaData is incorrect, or you need to enable "datanucleus.schema.autoCreateTables"
org.datanucleus.store.rdbms.exceptions.MissingTableException: Required table missing : "VERSION" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations. Either your MetaData is incorrect, or you need to enable "datanucleus.schema.autoCreateTables"
at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:606)
at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3385)
Any assistance in identifying what might be missing or misconfigured would be highly appreciated.
Thank you!