Skip to content

Proper way to insert arrays #163

@blacelle

Description

@blacelle

I'm being provided a bunch of String[] and double[] and I'm looking for the proper way to insert them efficiently into a DuckDB table.

I succeed inserting (primitive) arrays only by generating equivalent concatenated String. Am I missing a way to do so?

Relates with aggregating arrays


Appender: #110 confirms this is not supported by the Appender yet. The only way seems to going through a concatenated String:

DuckDBConnection duckDbConnection = (DuckDBConnection) c;

Statement s = duckDbConnection.createStatement();

s.execute("CREATE TABLE someTableName (color VARCHAR, array_keys STRING[%s], array_values DOUBLE[%s]);"
		.formatted(arrayLength, arrayLength));

try (var appender = duckDbConnection.createAppender(DuckDBConnection.DEFAULT_SCHEMA, tableName)) {
	for (int rowIndex = 0; rowIndex < 16; rowIndex++) {

		double rowFactor = Math.sqrt(rowIndex);

		appender.beginRow();
		appender.append("red");
		appender.append(IntStream.range(0, arrayLength).mapToObj(i -> "d" + i).collect(Collectors.joining(", ", "[", "]")));
		appender.append(IntStream.range(0, arrayLength).mapToDouble(i -> rowFactor * Math.sqrt(i)).mapToObj(d -> Double.toString(d)).collect(Collectors.joining(", ", "[", "]")));
		appender.endRow();
	}
}

BatchWriter: https://duckdb.org/docs/stable/clients/java.html#batch-writer I can not get it working:

DuckDBConnection duckDbConnection = (DuckDBConnection) c;

Statement s = duckDbConnection.createStatement();

s.execute("CREATE TABLE someTableName (color VARCHAR, array_keys STRING[%s], array_values DOUBLE[%s]);"
		.formatted(arrayLength, arrayLength));

PreparedStatement stmt = duckDbConnection
		.prepareStatement("INSERT INTO someTableName (color, array_keys, array_values) VALUES (?, ?, ?);");

stmt.setObject(1, "red");
stmt.setObject(2, IntStream.range(0, 255).mapToObj(i -> "d" + i).toArray(String[]::new));
stmt.setObject(3, IntStream.range(0, 255).mapToDouble(i -> Math.sqrt(i)).toArray());
stmt.addBatch();

stmt.executeBatch();
stmt.close();

I get:

org.jooq.exception.DataAccessException: Error while running ConnectionCallable
	at org.jooq.impl.DefaultDSLContext.connectionResult(DefaultDSLContext.java:663)
	at org.jooq.impl.DefaultDSLContext.connection(DefaultDSLContext.java:695)
	at eu.solven.adhoc.table.duckdb.quantile.TestTableQuery_DuckDb_VaR.feedTable(TestTableQuery_DuckDb_VaR.java:84)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
Caused by: java.sql.SQLException: Invalid Input Error: Unsupported parameter type
	at org.duckdb.DuckDBNative.duckdb_jdbc_execute(Native Method)
	at org.duckdb.DuckDBPreparedStatement.execute(DuckDBPreparedStatement.java:148)
	at org.duckdb.DuckDBPreparedStatement.executeBatchedPreparedStatement(DuckDBPreparedStatement.java:489)
	at org.duckdb.DuckDBPreparedStatement.executeBatch(DuckDBPreparedStatement.java:474)
	at eu.solven.adhoc.table.duckdb.quantile.TestTableQuery_DuckDb_VaR.lambda$1(TestTableQuery_DuckDb_VaR.java:141)
	at org.jooq.impl.DefaultDSLContext.lambda$connection$12(DefaultDSLContext.java:696)
	at org.jooq.impl.DefaultDSLContext.connectionResult(DefaultDSLContext.java:657)
	... 5 more

Arrow:

I tried getting through Arrow and ADBC, but I did not undertand how to insert data into DuckDB through any Arrow option.

https://duckdb.org/docs/stable/clients/adbc.html
https://duckdb.org/docs/stable/clients/java.html#arrow-import

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions