When using the Pentaho PostgreSQL Bulk Loader step, you might come across the following error message in the log:
INFO 26-08 13:04:07,005 - PostgreSQL Bulk Loader - ERROR {0} ERROR: invalid byte sequence for encoding "UTF8": 0xf6 0x73 0x63 0x68
INFO 26-08 13:04:07,005 - PostgreSQL Bulk Loader - ERROR {0} CONTEXT: COPY subscriber, line 2
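The bytes in that message are a useful hint. Here is a minimal sketch of why PostgreSQL rejects them, assuming the input was actually written in ISO-8859-1 (Latin-1). That is a guess based on 0xf6 being "ö" in Latin-1; the log itself does not say which encoding was used. The `sample` helper is just an illustration, and `iconv` is assumed to be installed.

```shell
# Emit the four bytes from the log: 0xf6 0x73 0x63 0x68.
# Octal 366 equals hex f6; "sch" covers 0x73 0x63 0x68.
sample() {
  printf '\366sch'
}

# Read as UTF-8, the sequence is invalid, which is what PostgreSQL complains about.
if sample | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
  utf8_valid=yes
else
  utf8_valid=no
fi

# Read as Latin-1 (an assumption about the source data), it converts cleanly to UTF-8.
decoded=$(sample | iconv -f ISO-8859-1 -t UTF-8)

echo "valid as UTF-8: $utf8_valid"
echo "decoded from Latin-1: $decoded"
```

If the second line prints readable text, the data is most likely fine and only the encoding it is being sent in is wrong.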
This is not a problem with Pentaho Kettle itself; it is most likely caused by the default encoding of your Unix/Linux environment. To check which encoding is currently the default, run the following:
$ echo $LANG
en_US
In this case the locale name has no UTF-8 suffix, so the session is not using UTF-8, the encoding that the bulk loader relies on.
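LANG is not the only variable involved: LC_ALL and LC_CTYPE override it when they are set. The following is a rough sketch of a slightly more thorough check. The `is_utf8_locale` function is a hypothetical helper written for this post, not part of any tool, and it only inspects the locale name.

```shell
# Hypothetical helper: succeeds when the locale name ends in a UTF-8 codeset,
# e.g. en_US.UTF-8 or de_DE.utf8. A trailing modifier such as @euro is ignored.
is_utf8_locale() {
  name=${1%%@*}
  case "$name" in
    *.[Uu][Tt][Ff]-8 | *.[Uu][Tt][Ff]8) return 0 ;;
    *) return 1 ;;
  esac
}

# LC_ALL wins over LC_CTYPE, which wins over LANG.
effective="${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"
if is_utf8_locale "$effective"; then
  echo "UTF-8 locale in effect: $effective"
else
  echo "Not a UTF-8 locale: ${effective:-<unset>}"
fi

# Print the character map the system resolved for this session, if `locale` exists.
if command -v locale >/dev/null 2>&1; then
  locale charmap
fi
```

Running `locale -a` lists the locales installed on the machine, which is worth checking before picking a value for LANG.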
To fix this, set the LANG variable to a UTF-8 locale, for example:
$ export LANG=en_US.UTF-8
Note: This applies only to the current session. Add the line to ~/.bashrc or a similar startup file to have it set in every future shell session.
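One way to make the setting permanent without adding the line twice is sketched below. The `persist_lang` function name is made up for this example, and it assumes the startup file is a plain shell script you are allowed to append to.

```shell
# Hypothetical helper: append the export line to the given startup file,
# but only if that exact line is not already present.
persist_lang() {
  rc_file=$1
  line='export LANG=en_US.UTF-8'

  # Create the file if it does not exist yet, so grep has something to read.
  touch "$rc_file"

  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
  grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}
```

Call it as `persist_lang "$HOME/.bashrc"` and then open a new shell (or run `. ~/.bashrc`) for the change to take effect.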
Run the transformation again from that shell session, and the rows should now load without the encoding error.