“RDBMS code 6706: The string contains an untranslatable character.”
Did you get this type of error in DataStage while loading data into Teradata or any other database. If yes, following tutorial can help you to overcome the error you received.
This error comes when data contains any non printable characters. If you can see any Control-M (^M) characters in file, you can delete them directly. But what if we can’t see them?
Sometimes we can’t see non printable characters even through vi editor in unix.
I had same experience with one file and I explained it below with my example.
See the below file content, when I use “vi file1.txt” there is nothing in between LINES and GENERATION words except a ‘-’.
When I investigated this file for above error, I found that something wrong with this record only. I was able to load all the records into database except the above one record.
What I did here is, I used ‘cat’ command to check for any non printable characters in the file. Below command came in handy for me to check non printable characters.
This ‘-A’ is equivalent to -vET
Exact meaning of the cat options from documentation is.
-v, –show-nonprinting
use ^ and M- notation, except for LFD and TAB
-E, –show-ends
display $ at end of each line
-T, –show-tabs
display TAB characters as ^I
Now we found that there are some non printable characters in our file, our next step is to delete these from our file.
I used ‘tr’ command to delete these non printable characters.
This will delete (-d) anything not in the specified set (-c). The set contains 11, 12 and 15, which are tab, LF and CR, and then everything from 40 to 176, which are all ASCII printable characters.
When the -c and -d options of the tr command are used in combination like this, the only characters tr writes to the standard output stream are the characters we’ve specified on the command line.
Although it may not look very attractive, we’re using octal characters in our tr command to make our programming job easier and more efficient. Our command tells tr to retain only the octal characters 11, 12, and 40 through 176 when writing to standard output. Octal character 11 corresponds to the [TAB] character, and octal 12 corresponds to the [LINEFEED] character. The octal characters 40 through 176 correspond to the standard visible keyboard characters, beginning with the [Space] character (octal 40) through the ~ character (octal 176).
These are the only characters retained by tr and the rest are filtered out, leaving us with a clean ASCII file.
After applying the above ‘tr’ command ‘-’ is disappeared from the record, it means it’s not a real hyphen(-) and it was formed by some non printable characters.
Please share your experience through comments if you faced similar problem or if you solved your problem with the above solution.
We can do this using Client character set
Yes we can do. This solution is what we can do from DataStage/Unix end.
Thanks for you comment:)