UTL_ENCODE.UUENCODE Does Not Follow The Standard Uuencode Format
Last updated on OCTOBER 25, 2016
Applies to: PL/SQL - Version 10.2.0.3 and later
Information in this document applies to any platform.
Oracle's UTL_ENCODE.uuencode has the following deviations from standard uuencode:
1) Data lines in Oracle's uuencoded output begin with a lower case 'L' ('l'), which is not correct. The standard states that a full uuencoded data line begins with 'M', indicating 45 bytes of original (pre-encoded) data per line: ASCII 'M' = 77, and 77-32 = 45. Oracle's leading 'l' is ASCII 108, and 108-32 = 76, which exceeds the maximum line length the uuencode standard allows. If you uuencode binary data using Oracle, other uuencode utilities that follow the standard will reject the output because of the invalid leading character in the data lines.
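The length-character arithmetic above can be checked with a short Python sketch. The helper names `length_char` and `bytes_from_length_char` are hypothetical and exist only for illustration; `binascii.b2a_uu` is Python's standard-library encoder for a single uuencode line and is used here as a stand-in for a standard-conforming implementation.

```python
import binascii

def length_char(n_bytes: int) -> str:
    """Length character for a data line carrying n_bytes (1..45) original bytes."""
    return chr(32 + n_bytes)

def bytes_from_length_char(ch: str) -> int:
    """Length value that a leading character claims: ASCII code minus 32."""
    return ord(ch) - 32

# A full standard line carries 45 original bytes, so it starts with 'M'.
full_line = binascii.b2a_uu(bytes(45))

print(length_char(45))              # M
print(chr(full_line[0]))            # M
print(bytes_from_length_char("l"))  # 76, beyond the 45-byte maximum
```

Python's encoder also refuses to put more than 45 bytes on one line, which matches the limit described in the standard.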
2) Oracle's leading character for each data line, including the last data line (which is usually shorter than the preceding ones), refers to the length of the POST-ENCODED data in that line. The uuencode standard states that the leading character refers to the number of PRE-ENCODED bytes represented by the encoded data line. In both cases, the leading character is the ASCII character whose code is the referenced byte count plus 32. So a leading 'M' in the standard refers to 77-32=45 PRE-ENCODED data bytes per line, which equates to 60 POST-ENCODED data bytes in the encoded line (uuencoding uses 4 post-encoded bytes to represent 3 pre-encoded bytes, so a 3/4 ratio of pre-encoded to post-encoded bytes). Oracle's leading 'l' (lower case 'L') therefore not only indicates a line that is too long, it also counts POST-ENCODED bytes where the standard counts PRE-ENCODED bytes. An Oracle uuencoded data line has a leading 'l' followed by 76 bytes of post-encoded data, which represent 57 pre-encoded bytes.
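The two conventions can be put side by side in a small Python sketch. The function `oracle_style_prefix` is only a model of the behaviour described in this note, not Oracle code, and all three function names are hypothetical.

```python
def encoded_len(pre_bytes: int) -> int:
    """Post-encoded length: every 3 original bytes become 4 encoded characters."""
    return 4 * ((pre_bytes + 2) // 3)

def standard_prefix(pre_bytes: int) -> str:
    """Standard convention: the prefix counts PRE-ENCODED bytes."""
    return chr(32 + pre_bytes)

def oracle_style_prefix(pre_bytes: int) -> str:
    """Convention described in this note: the prefix counts POST-ENCODED bytes."""
    return chr(32 + encoded_len(pre_bytes))

print(standard_prefix(45), encoded_len(45))      # M 60
print(oracle_style_prefix(57), encoded_len(57))  # l 76
```

Under this model, a 57-byte line is the only full-line size that produces the observed 'l' prefix, which is consistent with the 76/57 figures above.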
3) The uuencode standard states that the grave accent (`) is used in the penultimate line to mark an empty line with zero data bytes. The grave accent is ASCII 96, which is just past the end of the range of valid leading characters, and it is therefore used to identify an empty line. Oracle's uuencode uses a space here instead. This may actually work with some uuencode utilities, since a space is ASCII 32 and 32-32=0, which also represents an empty data line. But the standard states that the grave accent should be used here.
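As an illustration of the two empty-line markers, Python's `binascii.b2a_uu` can emit either form through its `backtick` keyword (available since Python 3.7). This only shows how one tolerant decoder behaves; it says nothing about the utilities that reject the space form.

```python
import binascii

# Zero-length line written the way the standard describes (grave accent).
standard_empty = binascii.b2a_uu(b"", backtick=True)
# Zero-length line written with a space, as this note reports for Oracle.
space_empty = binascii.b2a_uu(b"")

print(standard_empty)  # b'`\n'
print(space_empty)     # b' \n'

# Python's decoder accepts both markers and yields no data bytes.
print(binascii.a2b_uu(standard_empty) == binascii.a2b_uu(space_empty) == b"")  # True
```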
4) In comparing Oracle uuencoded samples vs. standard uuencoded samples of the same binary files, the data bytes from Oracle appear to use a space whereas standard uuencode uses grave accents in the same locations. This is the only difference in the actual data bytes that I could find (although I was not comprehensive in my research here). Again, this may work between uuencode implementations since space is ASCII 32 and 32-32=0, which the grave accent is also intended to represent in the standard. I was able to use the data interchangeably with the implementations I checked. But a more thorough study is required to determine whether Oracle uuencoded space data bytes should always be grave accent data bytes.
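The space versus grave accent difference inside data lines can be reproduced with Python's `binascii` module, where both characters decode to the 6-bit value zero. The helper `spaces_to_backticks` is a hypothetical normalizer shown as a sketch only; it must be applied to data lines alone (never to the `begin` header line, which contains real spaces), and whether it is sufficient for all Oracle output is exactly the open question raised above.

```python
import binascii

def spaces_to_backticks(data_line: bytes) -> bytes:
    """Rewrite every space in a single uuencoded data line as a grave accent."""
    return data_line.replace(b" ", b"`")

sample = bytes(3)  # three zero bytes, so every encoded 6-bit value is zero

with_spaces = binascii.b2a_uu(sample)
with_backticks = binascii.b2a_uu(sample, backtick=True)

print(with_spaces)     # b'#    \n'
print(with_backticks)  # b'#````\n'
print(spaces_to_backticks(with_spaces) == with_backticks)  # True

# Both spellings decode to the same original bytes in this decoder.
print(binascii.a2b_uu(with_spaces) == binascii.a2b_uu(with_backticks) == sample)  # True
```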
Compare Oracle's uuencode output with the output of a standard uuencode implementation such as the website http://www.webutils.pl/index.php?idx=uu. Oracle's data lines begin with 'l' (lower case 'L'), which encodes a line length of 76 (ASCII 'l' = 108, 108-32 = 76; see the uuencode standards). The uuencode standard requires every data line, except a shorter final one, to begin with 'M', which encodes 45 original bytes per line (ASCII 'M' = 77, 77-32 = 45). While a bit strange, this IS THE UUENCODE STANDARD.
The encoded data itself appears to match between Oracle and the above website (see attached files), except for the leading 'M' vs. 'l'. The leading 'l' is 'correct' in the sense that it accurately reflects the line length Oracle's uuencode uses. But that line length is NOT per the uuencode standard, which always uses 'M', or 45 original bytes per full encoded line. If you try to decode test_out.txt with third-party tools (e.g., the above website), they return an error because the file does not follow the standard.
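One way to spot the problem before handing a file to a third-party decoder is to scan the data lines for leading characters beyond 'M'. The following Python sketch assumes the usual `begin` / `end` framing; the function name `nonstandard_lines` and the sample text are hypothetical, and the sample lines use filler characters rather than real encoded data.

```python
def nonstandard_lines(text: str) -> list:
    """Return 1-based numbers of data lines whose length character exceeds 'M'."""
    bad = []
    in_body = False
    for number, line in enumerate(text.splitlines(), start=1):
        if not in_body:
            # Data lines start only after the "begin <mode> <name>" header.
            in_body = line.startswith("begin ")
            continue
        if line == "end":
            in_body = False
            continue
        # A grave accent marks the empty line and is valid, so skip it.
        if line and line[0] != "`" and ord(line[0]) - 32 > 45:
            bad.append(number)
    return bad

sample = "\n".join([
    "begin 644 test.bin",
    "M" + "x" * 60,   # standard-looking full line
    "l" + "x" * 76,   # line shaped like the Oracle output described above
    "`",
    "end",
])
print(nonstandard_lines(sample))  # [3]
```

A non-empty result suggests the file was produced with the non-standard line length and will likely be rejected by strict decoders.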