
PostgreSQL type conversion should default to NATIONAL CHARACTER #574

@daniel-skovenborg

Description

PostgreSQL type conversion in PostgreSQLJDBCDatatypeImporter imports varchar as CHARACTER VARYING and text as CHARACTER LARGE OBJECT. However, because SQL:1999 distinguishes between CHARACTER and NATIONAL CHARACTER while PostgreSQL does not, the conversion should default to NATIONAL CHARACTER.
I haven't tried it, but I believe this could break migration of SIARD archives from PostgreSQL databases to databases that distinguish between VARCHAR and NVARCHAR if cells contain non-ASCII characters.
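To make the proposal concrete, here is a minimal sketch of the suggested default mapping. This is illustrative Python only; the real importer is Java code in the Database Preservation Toolkit, and the function name `map_pg_type` and its `national` flag are made up for this example:

```python
def map_pg_type(pg_type: str, national: bool = True) -> str:
    """Map a PostgreSQL character type name to a SQL:1999 type name.

    Since PostgreSQL does not distinguish CHARACTER from NATIONAL
    CHARACTER, the proposal is to default to the NATIONAL variants.
    """
    mapping = {
        # pg type -> (proposed default, current behaviour)
        "varchar": ("NATIONAL CHARACTER VARYING", "CHARACTER VARYING"),
        "text": ("NATIONAL CHARACTER LARGE OBJECT", "CHARACTER LARGE OBJECT"),
    }
    national_type, plain_type = mapping[pg_type]
    return national_type if national else plain_type

print(map_pg_type("varchar"))                  # NATIONAL CHARACTER VARYING
print(map_pg_type("text", national=False))     # CHARACTER LARGE OBJECT
```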

Of course, NATIONAL CHARACTER is not always what you want, e.g. if the database encoding is not a Unicode encoding or the column is just an enum.
I suggest that the type conversion methods take the schema, table, and column as arguments, and that the PostgreSQL importer get an option to run the following query to determine whether a text column holds national characters:

select exists(select from SCHEMA_NAME.TABLE_NAME where COLUMN_NAME::text ~ '[^\x01-\x7F]');

This will of course slow down the import considerably, so it should probably be opt-in.
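For illustration, the same check the query performs can be mirrored client-side: a column needs a NATIONAL type if any cell matches `[^\x01-\x7F]`. A hedged Python sketch of that predicate (the helper name is made up, and in practice you would of course run the check in the database rather than fetch all values):

```python
import re

# Same character class as the SQL query: anything outside \x01-\x7F
# (i.e. any non-ASCII character, excluding NUL) counts as "national".
NON_ASCII = re.compile(r"[^\x01-\x7F]")

def needs_national_character(values):
    """Return True if any cell value contains a non-ASCII character,
    i.e. the column should map to a NATIONAL CHARACTER type."""
    return any(NON_ASCII.search(v) for v in values if v is not None)

print(needs_national_character(["abc", "def"]))    # False
print(needs_national_character(["café", None]))    # True
```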
