Which code page is used by the Enc_base64 function

This article is based on Power Center 9.0.0, but it appears to apply to other versions as well. Suppose we need to write a Base64-encoded string to a flat file.

Power Center provides the Expression transformation, in which a string can be encoded with the Enc_base64 function. Everything seems easy until we try to decode the string, for example with the ldapmodify command. The ldapmodify command modifies LDAP entries and can decode a Base64 string back to the bytes of its original code page. But after loading new entries into LDAP, it turns out that some characters are not displayed correctly.

So we start investigating why this is the case. LDAP requires data to be in the UTF-8 code page. Our Integration Service code page is UTF-8, and the data movement mode is Unicode. When we skip Base64 encoding, the target flat file is in the UTF-8 code page.

We run another test and change the Integration Service data movement mode to ASCII. This time the data loaded into LDAP is correct. Comparing the flat files generated by the Integration Service in Unicode and in ASCII data movement mode shows that Power Center generates different output for each mode. Next, we decode both flat files from Base64 back to their original code pages. The string encoded in Unicode data movement mode is in the Unicode code page, while the string encoded in ASCII data movement mode is in UTF-8.
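The decoding step of this comparison can be reproduced outside Power Center. The following is a minimal Python sketch, not part of the original test: the sample word and the helper name inspect_base64 are hypothetical. It decodes a Base64 value and dumps the raw bytes. NUL bytes next to Latin characters point to UTF-16LE, while their absence is consistent with UTF-8.

```python
import base64

def inspect_base64(b64_text):
    """Decode a Base64 value and return (hex dump, contains_nul_bytes)."""
    raw = base64.b64decode(b64_text)
    # UTF-8 output contains 0x00 only if the text itself has a NUL character;
    # UTF-16LE stores every character below U+0100 with a 0x00 high byte.
    return raw.hex(" "), b"\x00" in raw

# Hypothetical sample value, encoded the way each data movement mode
# was observed to behave in the article.
sample = "Łódź"
from_unicode_mode = base64.b64encode(sample.encode("utf-16-le")).decode("ascii")
from_ascii_mode = base64.b64encode(sample.encode("utf-8")).decode("ascii")

print(inspect_base64(from_unicode_mode))
print(inspect_base64(from_ascii_mode))
```

The NUL check is only a heuristic for mostly Latin text; it is not a general code page detector.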

This is evidence that:

· When the Integration Service data movement mode is set to Unicode, the Enc_base64 function encodes the string in the Unicode code page, even if the Integration Service code page is set to UTF-8. On Windows this is the UTF-16LE code page, which is the code page in which strings are stored internally in Unicode data movement mode.

· When the Integration Service data movement mode is set to ASCII, the Enc_base64 function encodes the string in the same code page as the input string, which in our case is UTF-8.
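To see why the Unicode-mode output breaks in LDAP, consider what a consumer that expects UTF-8 does with each payload. This is an illustrative Python sketch of the behaviour described above, not Power Center code. The function names and the sample text are hypothetical, UTF-16LE is assumed as on Windows, and the input in ASCII mode is assumed to be UTF-8 as in our case.

```python
import base64

def enc_base64_observed(text, data_movement):
    """Mimic the observed behaviour: UTF-16LE bytes in Unicode mode,
    UTF-8 bytes in ASCII mode (assuming UTF-8 input)."""
    code_page = "utf-16-le" if data_movement == "unicode" else "utf-8"
    return base64.b64encode(text.encode(code_page)).decode("ascii")

def read_as_utf8(b64_text):
    """Decode Base64 and interpret the bytes as UTF-8, as a consumer
    that expects UTF-8 would; undecodable bytes become U+FFFD."""
    return base64.b64decode(b64_text).decode("utf-8", errors="replace")

text = "Łódź"  # hypothetical sample value
print(repr(read_as_utf8(enc_base64_observed(text, "unicode"))))  # garbled
print(repr(read_as_utf8(enc_base64_observed(text, "ascii"))))    # original text
```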

Because an Integration Service in ASCII data movement mode does not perform any code page conversion, which is required in some cases, it would be useful to have a function like Enc_base64 that takes the code page as a parameter.
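The idea of such a function can be sketched in a few lines of Python. This only illustrates the concept and is not a Power Center function. The names enc_base64_with_code_page and ldif_attribute, the attribute name cn, and the sample value are all hypothetical. The second helper writes the value the way LDIF marks Base64 content, with a double colon after the attribute name.

```python
import base64

def enc_base64_with_code_page(text, code_page="utf-8"):
    """Hypothetical helper: convert text to bytes in the requested
    code page first, then Base64-encode those bytes."""
    return base64.b64encode(text.encode(code_page)).decode("ascii")

def ldif_attribute(name, text):
    """Build an LDIF attribute line with a Base64 value ("name:: value").
    The value is always encoded from UTF-8, which LDAP expects."""
    return f"{name}:: {enc_base64_with_code_page(text, 'utf-8')}"

print(ldif_attribute("cn", "Łódź"))
```

Making the code page explicit removes the dependency on the data movement mode: the caller decides which bytes are encoded, not the engine's internal string representation.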

A solution is available: Base64

Last modified on Sunday, 06 November 2011 13:13