What this page covers

  • What "raw bytes" are and how they differ from text
  • How Oracle stores binary data (RAW, BLOB)
  • Examples of conversions: STRING ↔ RAW, HEX, Base64
  • Example of sending a BLOB to an HTTP API in chunks using UTL_HTTP
  • What Base64 means when used in APIs and how it differs from HEX

What are "raw bytes"

"Raw bytes" are a sequence of octets (8-bit values). They have no meaning until you explicitly specify how to interpret them (text encoding, image format, numeric layout, etc.).

Key points:

  • They are not text: text = bytes + encoding (UTF-8, CP1251, etc.).
  • Interpreting bytes as text without knowing the encoding yields unpredictable results.
  • In Oracle: RAW / RAW(n) — for small binary fragments, BLOB — for large streams.
  • For display or transfer of binary data people often use HEX or Base64.
  • Conversions are performed by explicit functions: HEX↔RAW, STRING↔RAW (with charset), Base64 encode/decode.

Examples: string → bytes → HEX / Base64

SQL/PL/SQL examples for Oracle.

1) Obtain RAW (UTF-8 bytes) and its HEX representation

SELECT UTL_I18N.STRING_TO_RAW('Hello, world!', 'AL32UTF8') FROM dual;
-- SQL clients typically show RAW as HEX.

SELECT RAWTOHEX(UTL_I18N.STRING_TO_RAW('Hello, world!', 'AL32UTF8')) AS hex_bytes FROM dual;
-- -> 48656C6C6F2C20776F726C6421

This is the hex representation of the bytes that SQL Developer or sqlplus show for RAW values.

2) Decode HEX → string (UTF-8)

SELECT UTL_I18N.RAW_TO_CHAR(HEXTORAW('48656C6C6F2C20776F726C6421'), 'AL32UTF8') AS text
FROM dual;
-- -> Hello, world!

3) Base64 — a convenient format for embedding binary data into JSON/XML

-- Produce a Base64 text string from RAW
SELECT UTL_RAW.CAST_TO_VARCHAR2(
         UTL_ENCODE.BASE64_ENCODE(
           UTL_I18N.STRING_TO_RAW('Hello, world!', 'AL32UTF8')
         )
       ) AS b64_text
FROM dual;
-- -> SGVsbG8sIHdvcmxkIQ==

-- Without CAST you will see RAW (shown as HEX) that contains the base64 bytes
SELECT RAWTOHEX(UTL_ENCODE.BASE64_ENCODE(UTL_I18N.STRING_TO_RAW('Hello, world!', 'AL32UTF8')))
FROM dual;

Short size comparison (example):

SELECT 
  LENGTH(UTL_I18N.STRING_TO_RAW('Hello, world!', 'AL32UTF8')) AS bytes_len,         -- byte length
  LENGTH(RAWTOHEX(UTL_I18N.STRING_TO_RAW('Hello, world!', 'AL32UTF8'))) AS hex_len,   -- HEX length
  LENGTH(UTL_RAW.CAST_TO_VARCHAR2(UTL_ENCODE.BASE64_ENCODE(UTL_I18N.STRING_TO_RAW('Hello, world!', 'AL32UTF8')))) AS b64_len
FROM dual;
-- HEX = 2 * bytes, Base64 ≈ 4/3 * bytes

A note about character sets in Oracle

  • AL32UTF8 — Oracle name for UTF‑8.
  • Windows‑1252 is commonly referenced as WE8MSWIN1252 in Oracle.

Example of converting strings between character sets:

SELECT CONVERT('Hello', 'WE8MSWIN1252', 'AL32UTF8') FROM dual;

When converting between string and RAW use UTL_I18N.STRING_TO_RAW and UTL_I18N.RAW_TO_CHAR with an explicit charset.


Dissection of HEX string 48656C6C6F2C20776F726C6421

This is ASCII/UTF‑8 in hex. Bytes and their corresponding characters:

  • 48 → 'H'
  • 65 → 'e'
  • 6C → 'l'
  • 6C → 'l'
  • 6F → 'o'
  • 2C → ','
  • 20 → space
  • 77 → 'w'
  • 6F → 'o'
  • 72 → 'r'
  • 6C → 'l'
  • 64 → 'd'
  • 21 → '!'

Result: "Hello, world!"

Decode in Oracle:

SELECT UTL_I18N.RAW_TO_CHAR(HEXTORAW('48656C6C6F2C20776F726C6421'), 'AL32UTF8') FROM dual;

Example: sending a BLOB to an HTTP API in chunks — explanation

Below is a working PL/SQL example that builds a BLOB from a CLOB, sends it in chunks using UTL_HTTP.write_raw and reads the response as RAW chunks.

DECLARE
    p_source_data clob := 'some_source_data';
    l_auth_header varchar2(2000) := 'your diadoc Auth header data...';

    v_resp_clob  CLOB;
    l_api_base_url varchar2(2000) := 'http://api.diadoc.ru/v4/endpoint';
    l_blob         BLOB;
    -- LOB/HTTP variables
    l_req          UTL_HTTP.req;
    l_resp         UTL_HTTP.resp;
    l_buffer       RAW(32767);        -- buffer for reading BLOB
    l_amount       BINARY_INTEGER;
    l_offset       INTEGER := 1;
    l_buf_len      BINARY_INTEGER;
BEGIN
    -- Create temporary BLOB and fill it with data from p_source_data
    DBMS_LOB.createtemporary(l_blob, TRUE);
    DBMS_LOB.OPEN(l_blob, DBMS_LOB.LOB_READWRITE);
    DECLARE
        v_pos INTEGER := 1;
        v_chunk VARCHAR2(32767 CHAR);
        v_raw RAW(32767);
        v_chunk_len INTEGER := 32767;
    BEGIN
        LOOP
            v_chunk := DBMS_LOB.SUBSTR(p_source_data, v_chunk_len, v_pos);
            EXIT WHEN v_chunk IS NULL OR LENGTH(v_chunk) = 0;
            -- If a specific encoding is required for bytes, replace UTL_RAW.CAST_TO_RAW with UTL_I18N.STRING_TO_RAW(v_chunk, 'AL32UTF8'/'WE8MSWIN1252')
            v_raw := UTL_RAW.CAST_TO_RAW(v_chunk);
            DBMS_LOB.WRITEAPPEND(l_blob, UTL_RAW.LENGTH(v_raw), v_raw);
            v_pos := v_pos + v_chunk_len;
        END LOOP;
        DBMS_LOB.CLOSE(l_blob);
    END;

    -- Start HTTP request
    l_req := UTL_HTTP.begin_request(l_api_base_url, 'POST', 'HTTP/1.1');
    UTL_HTTP.set_header(l_req, 'Content-Type', 'application/octet-stream'); -- sending raw bytes
    UTL_HTTP.set_header(l_req, 'Accept', 'application/json');
    UTL_HTTP.set_header(l_req, 'Authorization', l_auth_header);
    UTL_HTTP.set_header(l_req, 'Transfer-Encoding', 'chunked'); -- ensure server supports this

    -- Write BLOB in chunks
    l_amount := DBMS_LOB.getlength(l_blob);
    l_offset := 1;
    l_buf_len := 2000; -- safe size for write_raw; can be increased to 32767 for testing
    WHILE l_offset <= l_amount LOOP
        IF l_offset + l_buf_len - 1 > l_amount THEN
            l_buf_len := l_amount - l_offset + 1;
        END IF;
        DBMS_LOB.READ(l_blob, l_buf_len, l_offset, l_buffer);
        UTL_HTTP.write_raw(l_req, l_buffer);
        l_offset := l_offset + l_buf_len;
    END LOOP;

    -- Get response
    l_resp := UTL_HTTP.get_response(l_req);

    -- Read response as RAW chunks and assemble into a CLOB with proper encoding
    DECLARE
        v_resp_chunk_raw RAW(32767);
        v_resp_chunk_clob CLOB;
        v_resp_chunk_len INTEGER;
        v_charset VARCHAR2(50) := 'WE8MSWIN1252'; -- set according to expected response charset
    BEGIN
        DBMS_LOB.createtemporary(v_resp_clob, TRUE);
        LOOP
            UTL_HTTP.read_raw(l_resp, v_resp_chunk_raw, 32767);
            v_resp_chunk_len := UTL_RAW.LENGTH(v_resp_chunk_raw);
            EXIT WHEN v_resp_chunk_len IS NULL OR v_resp_chunk_len = 0;
            -- Correct conversion: RAW -> VARCHAR2 (using DB charset) then convert to AL32UTF8
            v_resp_chunk_clob := CONVERT(UTL_RAW.CAST_TO_VARCHAR2(v_resp_chunk_raw), v_charset, 'AL32UTF8');
            DBMS_LOB.writeappend(v_resp_clob, LENGTH(v_resp_chunk_clob), v_resp_chunk_clob);
        END LOOP;
        -- here you can process v_resp_clob (parse JSON, etc.)
    EXCEPTION
        WHEN UTL_HTTP.end_of_body THEN
            UTL_HTTP.end_response(l_resp);
        WHEN OTHERS THEN
            dbms_output.put_line(sqlerrm);
            dbms_output.put_line(DBMS_UTILITY.FORMAT_ERROR_BACKTRACE());
    END;

EXCEPTION
    WHEN OTHERS THEN
        dbms_output.put_line(sqlerrm);
        dbms_output.put_line(DBMS_UTILITY.FORMAT_ERROR_BACKTRACE());
        BEGIN
            UTL_HTTP.end_response(l_resp);
        EXCEPTION WHEN OTHERS THEN NULL; END;
        -- free temporary LOBs
        BEGIN IF DBMS_LOB.ISTEMPORARY(l_blob) = 1 THEN DBMS_LOB.FREETEMPORARY(l_blob); END IF; END;
END;

What Base64 means in APIs and how it differs

Base64 is a textual representation of binary data, convenient to embed in JSON, XML or other text protocols. When you send a file, image or arbitrary BLOB via an API you usually need to convert bytes to a string because JSON and XML expect text, not raw bytes.

Also, raw bytes may contain values that are not valid characters in text formats (or that interfere with XML/JSON syntax). Base64 avoids encoding and interpretation problems.

Key differences and notes:

  • Size and format
  • HEX encodes every byte as two hexadecimal characters → length = 2 × bytes.
  • Base64 encodes each 3 bytes into 4 characters → length ≈ 4/3 × bytes (with optional padding '=').
  • Base64 is usually more space-efficient than HEX and is preferred for larger binary payloads.

  • Semantics and compatibility

  • Base64 is a de-facto standard for embedding binary data in JSON and XML (MIME, data URIs, many SDKs and services expect base64).
  • HEX is often used for representing hashes (sha1/md5) and debugging, but is less common for embedding files in JSON.

  • Security

  • Base64 is encoding, not encryption. Anyone who can decode the string can read the data. Use HTTPS and/or encrypt before encoding to protect content.

  • URL and URL-safe variants

  • Standard Base64 uses '+' and '/' and padding '='; for URLs use the URL-safe variant where '+' → '-' and '/' → '_'.

  • Practical streaming nuances

  • When chunking a BLOB and encoding to Base64, it is convenient to read chunks whose size is a multiple of 3 to avoid intermediate padding. UTL_ENCODE.BASE64_ENCODE works with RAW chunks and returns RAW.

Oracle example — produce Base64 string from text (UTF-8):

SELECT UTL_RAW.CAST_TO_VARCHAR2(
         UTL_ENCODE.BASE64_ENCODE(
           UTL_I18N.STRING_TO_RAW('Hello, world!', 'AL32UTF8')
         )
       ) AS b64_text
FROM dual;
-- -> SGVsbG8sIHdvcmxkIQ==

Practical sending via JSON: construct an object with a base64 field and send with Content-Type: application/json. Do not use the header Content-Transfer-Encoding: base64 for ordinary HTTP/REST — that header is historically used in MIME.

Example JSON payload:

{
  "fileName": "image.png",
  "content": "iVBORw0KGgoAAAANS..."
}