Character-Hybrid Encoding : Zoned Decimal
Zoned (or "Zoned Decimal") is a common numeric encoding system used on IBM mainframes
based on the EBCDIC codes (and the old Hollerith punch card "sign-overpunch" concept). It is a
common COBOL encoding form used for values for which character input and output is a more
significant factor than calculations.
- Sign encoding
- Remember how the Hollerith codes for punch cards combined the sign punch in the
same column with the last digit of a number
- +365 would have been encoded on a punch card with a plus-punch and a 5-punch in
the same column
- -52 would have been encoded on a punch card with a minus-punch and a 2-punch in
the same column
- Hollerith codes with a plus-punch became EBCDIC codes with a high order nybble =
C
- Hollerith codes with a minus-punch became EBCDIC codes with a high order nybble =
D
- that is, +365 from a Hollerith punch card would become encoded in EBCDIC as F3h
F6h C5h, and -52 from a Hollerith punch card would become encoded in EBCDIC as
F5h D2h
- General Format
A "Zoned Decimal" encoded value has the form
- for every decimal digit except the right-most, each digit is encoded as (hexadecimal)
Fxh with "x" replaced by the actual decimal digit value
- the right-most digit is encoded as (hexadecimal) sxh with "s" replaced by C for positive
values or D for negative values (and with "x" replaced as above)
- (Note: the IBM mainframe will actually accept any of A, C, E, or F as valid
hexadecimal values for a positive sign, the "s" above, and either B or D as valid
hexadecimal values for a negative sign; however, C and D are the "standard" values
expected and generated by the computer.)
- Size limitations
- The IBM mainframe uses a 4-bit value (encoded to represent 1 to 16) to represent the
length when working with a "zoned decimal" encoded value
- Because of this, Zoned Decimal numbers (for example, as used in COBOL) can not
contain more than 16 digits
- Example
- Decode the following dump assuming it contains only zoned decimal values:
- F2 F5 F6 D7 C3 F9 C8 F1
- F4 D6 F3 F1 F7 C5 F0 C0
- Solution:
- -2567 +3 +98 -146 +3175 (+)0
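The format rules and the dump example above can be checked with a short Python sketch. The function names `encode_zoned` and `decode_zoned_dump` are hypothetical, chosen only for illustration. The decoder assumes that only the standard sign codes C and D end a value, and that a zone of F always means "more digits follow"; the alternative sign codes (A, B, E, and F as a sign) are deliberately not handled.

```python
def encode_zoned(value: int, digits: int) -> bytes:
    """Encode a signed integer as `digits` zoned-decimal bytes (sign codes C / D)."""
    if not 1 <= digits <= 16:
        raise ValueError("a zoned decimal field holds 1 to 16 digits")
    text = str(abs(value)).rjust(digits, "0")
    if len(text) > digits:
        raise ValueError("value does not fit in the field")
    sign = 0xD if value < 0 else 0xC
    body = [0xF0 | int(ch) for ch in text[:-1]]   # Fx for every digit but the last
    last = (sign << 4) | int(text[-1])            # sx for the right-most digit
    return bytes(body + [last])


def decode_zoned_dump(data: bytes) -> list:
    """Split a run of bytes into zoned-decimal values.

    Assumption: zone F marks a non-final digit; zone C or D ends a value.
    """
    values = []
    magnitude = 0
    pending = False                               # True while inside a value
    for byte in data:
        zone, digit = byte >> 4, byte & 0x0F
        if digit > 9:
            raise ValueError(f"invalid digit nybble in byte {byte:02X}")
        magnitude = magnitude * 10 + digit
        if zone == 0xF:
            pending = True
            continue
        if zone == 0xC:
            values.append(magnitude)
        elif zone == 0xD:
            values.append(-magnitude)
        else:
            raise ValueError(f"unsupported zone in byte {byte:02X}")
        magnitude, pending = 0, False
    if pending:
        raise ValueError("dump ends in the middle of a value")
    return values


dump = bytes.fromhex("F2 F5 F6 D7 C3 F9 C8 F1 F4 D6 F3 F1 F7 C5 F0 C0")
print(decode_zoned_dump(dump))            # [-2567, 3, 98, -146, 3175, 0]
print(encode_zoned(365, 3).hex(" "))      # f3 f6 c5
print(encode_zoned(-52, 2).hex(" "))      # f5 d2
```

Note that the last value in the dump (F0 C0) decodes as a two-digit zero, which matches the "(+)0" in the solution.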
Character-Hybrid Encoding : Packed Decimal
Zoned Decimal encoding is used as the basis for the Packed Decimal encoding scheme.
The relative merits of Zoned Decimal, Packed Decimal, and 2's Complement encoding are
discussed relative to computations, data storage requirements, and character IO.
Zoned and Packed Decimal are basically restricted to IBM mainframe computers (and their
"clones"). Other systems often use a system called BCD encoding to get the same advantages as
the Packed Decimal system.
Packed Decimal Encoding
- Zoned Decimal Format (reviewed) - Zoned Decimal encoding is a character-based decimal
encoding scheme used on IBM mainframe computers. With the exception of the right-most /
least-significant digit, each digit is encoded using its EBCDIC character code, i.e. Fx (hex)
where x is between 0 and 9 inclusive. The right-most / least-significant digit is encoded by
replacing the hexadecimal F in the top 4 bits of the EBCDIC code for the digit character with a
code indicating whether the entire number is positive or negative. Any of the hexadecimal
values A, C, E, or F is considered a valid code for positive Zoned Decimal encoded values;
however, C (hex) is the only code that is actually generated by machine instructions. Similarly,
either B or D (hex) is considered a valid code for negative values, but machine instructions only
generate negative zoned decimal values using the code D (hex). A zoned decimal value is
considered invalid and will generate an error if the low-order 4 bits of any byte have a value
greater than 9, or if the high-order 4 bits of any byte have a value less than A (hex).
- Conversion from Zoned Decimal to Packed Decimal - Zoned decimal values include a lot of
"wasted" space, namely the high-order F (hex) value in every byte (except the last one).
Packed decimal avoids this waste by "dropping" all the F (hex) values and "packing" two
decimal digits into each byte; the sign code and the decimal digit in the least-significant byte
exchange places.
- Zoned Decimal format: Fd Fd .... Fd sd
- Packed Decimal format: dd ... dd ds
- Examples
The decimal value +37825:
- as a Zoned Decimal value: F3 F7 F8 F2 C5 (internal view; 5 bytes)
- as a Packed Decimal value: 37 82 5C (internal view; 3 bytes)
The decimal value -91:
- as a Zoned Decimal value: F9 D1 (internal view; 2 bytes)
- as a Packed Decimal value: 09 1D (internal view; 2 bytes)
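The conversion described above can be sketched in Python as follows. The names `pack` and `unpack` are hypothetical helpers that only mirror the idea behind the PACK and UNPK instructions; unlike the real instructions, they do not work on fixed-length fields, and they assume the input is already a valid zoned or packed value.

```python
def pack(zoned: bytes) -> bytes:
    """Convert zoned decimal (Fd ... Fd sd) to packed decimal (dd ... ds)."""
    digits = [b & 0x0F for b in zoned]     # keep the digit nybble of every byte
    sign = zoned[-1] >> 4                  # sign code sits in the last zone
    nybbles = digits + [sign]              # sign moves to the very end
    if len(nybbles) % 2:
        nybbles.insert(0, 0)               # pad with a leading zero digit
    return bytes((hi << 4) | lo
                 for hi, lo in zip(nybbles[0::2], nybbles[1::2]))


def unpack(packed: bytes) -> bytes:
    """Convert packed decimal (dd ... ds) back to zoned decimal."""
    nybbles = [n for b in packed for n in (b >> 4, b & 0x0F)]
    *digits, sign = nybbles                # last nybble is the sign code
    body = [0xF0 | d for d in digits[:-1]]
    last = (sign << 4) | digits[-1]        # sign and digit exchange places
    return bytes(body + [last])


print(pack(bytes.fromhex("F3F7F8F2C5")).hex(" "))   # 37 82 5c
print(pack(bytes.fromhex("F9D1")).hex(" "))         # 09 1d
print(unpack(bytes.fromhex("37825C")).hex(" "))     # f3 f7 f8 f2 c5
print(unpack(bytes.fromhex("091D")).hex(" "))       # f0 f9 d1
```

The last line shows that the padding zero added when packing -91 comes back as an extra leading F0 digit in this simplified version.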
Comparative Evaluation
...... note: the IBM mainframe does not support Unsigned Binary
- Arithmetic Operation Speed
- Binary (2's Complement) : very fast
- Packed Decimal: slow
- Zoned Decimal: arithmetic can not be performed until converted to Packed Decimal or
a Binary format
- Memory Requirements & Size Limitations
- Binary (2's Complement): "half-word"/2 bytes with range +/- 32K; "full word"/4 bytes
with a range +/- 2G
- Packed Decimal: 1 to 16 bytes ; sign plus up to 31 decimal digits
- Zoned Decimal: 1 to 16 bytes; sign plus up to 16 decimal digits
- Input / Output
- Binary (2's Complement): slow conversion between Binary and EBCDIC decimal
characters
- Packed Decimal: relatively fast conversion between decimal and EBCDIC decimal
characters
- Zoned Decimal: trivial or no conversion required
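As a rough illustration of the memory figures above, the following Python sketch computes how many bytes each encoding needs for a given number of decimal digits. The function names are hypothetical, and the binary case considers only the half-word and full-word sizes listed above.

```python
def zoned_bytes(digits: int) -> int:
    """One byte per digit; the sign shares the last byte."""
    if not 1 <= digits <= 16:
        raise ValueError("zoned decimal holds 1 to 16 digits")
    return digits


def packed_bytes(digits: int) -> int:
    """Two digits per byte, plus one nybble for the sign."""
    if not 1 <= digits <= 31:
        raise ValueError("packed decimal holds 1 to 31 digits")
    return digits // 2 + 1


def binary_bytes(digits: int):
    """Smallest of half-word / full-word able to hold every value with this
    many digits, or None if neither is large enough."""
    largest = 10 ** digits - 1
    if largest <= 2 ** 15 - 1:
        return 2
    if largest <= 2 ** 31 - 1:
        return 4
    return None


for n in (2, 5, 9, 16):
    print(n, zoned_bytes(n), packed_bytes(n), binary_bytes(n))
# 2 2 2 2
# 5 5 3 4
# 9 9 5 4
# 16 16 9 None
```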
Conversion Process
for EBCDIC input, arithmetic processing, then EBCDIC output (the actual time required to perform
each step will vary considerably between different IBM mainframe computer systems; "relative
timings" are given to provide a "feel" for the advantage/disadvantage of conversion to/from
Binary):
- EBCDIC/Zoned to Packed Decimal - if an EBCDIC sign character is included, decoding of
the sign and adjustment of the last byte to proper Zoned Decimal sign is required (3 relative
timing units); plus a PACK instruction (average: 6 relative timing units)
- Packed Decimal Arithmetic / Conversion to Binary - Packed Decimal arithmetic (average:
10 relative timing units per operation); conversion to Binary (CVB instruction) (10 relative
timing units)
- Binary Arithmetic - (if not direct Packed Decimal arithmetic) (1 relative timing unit per
arithmetic operation)
- Binary to Packed Decimal - conversion to Decimal (CVD instruction) (10 relative timing
units)
- Packed Decimal to Zoned - "unpack" value (UNPK instruction) (average 6 relative timing
units)
- Sign Conversion for EBCDIC - depending upon whether sign character is to be displayed or
not (1 to 5 relative timing units)
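To get a feel for when the detour through Binary pays off, the relative timings above can be added up for each route. The Python sketch below is only an illustration: the constant and function names are hypothetical, it assumes a signed input field, and it uses the worst-case output sign cost of 5 units (the sign costs are the same on both routes, so they do not affect the comparison).

```python
# Relative timing units taken from the list above.
SIGN_IN = 3      # decode EBCDIC sign, adjust last byte
PACK = 6         # zoned -> packed (average)
PACKED_OP = 10   # one packed decimal arithmetic operation (average)
CVB = 10         # packed -> binary
BINARY_OP = 1    # one binary arithmetic operation
CVD = 10         # binary -> packed
UNPK = 6         # packed -> zoned (average)
SIGN_OUT = 5     # assumed worst case of the 1 to 5 range


def packed_path(operations: int) -> int:
    """Total cost when all arithmetic is done in packed decimal."""
    return SIGN_IN + PACK + PACKED_OP * operations + UNPK + SIGN_OUT


def binary_path(operations: int) -> int:
    """Total cost when values are converted to binary for the arithmetic."""
    return (SIGN_IN + PACK + CVB + BINARY_OP * operations
            + CVD + UNPK + SIGN_OUT)


def break_even() -> int:
    """Smallest number of operations for which the binary route is cheaper."""
    operations = 1
    while binary_path(operations) >= packed_path(operations):
        operations += 1
    return operations


for n in (1, 2, 3, 10):
    print(n, packed_path(n), binary_path(n))
# 1 30 41
# 2 40 42
# 3 50 43
# 10 120 50
print(break_even())   # 3
```

With these figures, converting to Binary adds a fixed 20 units (CVB plus CVD) but saves 9 units per operation, so it starts to win at three arithmetic operations per value.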
BCD Encoding
Many non-"IBM mainframe" computer systems use a decimal encoding scheme similar to IBM's
packed decimal form. This is especially true of spreadsheet software, where continual
decimal character IO outweighs the actual computational activity. This system is generally
referred to as Binary Coded Decimal (BCD). BCD suffers from a lack of standardization in the
encoding of signed values.
- General BCD Form - values are stored in decimal with 2 decimal digits encoded in each
byte (one digit in the top 4 bits; one in the bottom 4 bits)
- Non-Standardization Problems - some processors provide secondary support features to
help implement BCD (for example, the Intel 80x86 family provides an "Auxiliary Carry"
flag which is turned on if there is a carry out of the low order 4 bits of an 8-bit byte);
however, very few processors provide direct BCD processing instructions. As a result, BCD
has basically been implemented by software developers, with (for the most part) each
developer creating their own version especially with respect to how signs are encoded. Even
running on the same computer, two software packages may not be able to exchange data in
BCD encoded fields because neither recognizes the way the other has encoded sign values.
Note that "Packed Decimal" is a BCD encoding scheme which (unlike most BCD systems)
enjoys standardization across any IBM mainframe "platform"; however, even this
"standardization" is not supported by other hardware systems.
- Advantages / Disadvantages - BCD encoded values are a compromise; they are decimal
based which makes conversion to decimal characters for output relatively simple; they
"pack" twice as many digits into the same space compared to a true character encoding
scheme; and, with some minimal hardware support, arithmetic operations can be performed
without a huge overhead. On the other hand, arithmetic is definitely slower than for binary
encoded forms (typically by at least a factor of 10); it does take some time to convert to
character format every time IO is required; and, because of the non-standardization
mentioned above, sharing data between two programs almost always requires conversion to
and back from some other (shared) data type.
- BCD Addition - In order to get a feel for the problems in handling BCD arithmetic
without machine level BCD arithmetic instructions, consider 3 cases where we might be
adding together 2 bytes each containing 2 digit BCD values:
  32 (hex) ...representing 32(dec)
+ 16 (hex) ...representing 16(dec)
_________ performed with normal (binary) addition
  48 (hex) ...no problem!

  38 (hex) ...representing 38(dec)
+ 16 (hex) ...representing 16(dec)
_________ performed with normal (binary) addition
  4E (hex) ...problem which is easily detected, since the
(low-order) 4-bit pattern does not contain a decimal value (the solution,
whenever the low-order "nybble" contains a value greater than 9, is to
subtract 10dec and add 10hex, or simply to add 6)

  38 (hex) ...representing 38(dec)
+ 19 (hex) ...representing 19(dec)
_________ performed with normal (binary) addition
  51 (hex) ...problem which is not easily detected, since the
(low-order) 4-bit pattern appears to be valid; because of this
possibility, it may not be possible to perform addition of a pair of
BCD values without splitting the values into single decimal units
(unless, as with the Intel 80x86 processors, an additional "Auxiliary
Carry" flag is provided to indicate that a carry occurred from the
low-order decimal position into the high-order decimal position)
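The three cases above can be reproduced with a small Python sketch of one-byte BCD addition. The function name `bcd_add_byte` is hypothetical. Since Python has no hardware flags, the auxiliary carry is simulated by adding the two low-order nybbles separately; the "add 6" correction is then applied to the low digit and, in the same way, to the high digit.

```python
def bcd_add_byte(a: int, b: int):
    """Add two bytes that each hold two BCD digits.

    Returns (result_byte, carry_out), where carry_out is True when the
    decimal sum is 100 or more.
    """
    total = a + b                                   # normal binary addition
    # Simulated auxiliary carry: did the low-order nybbles overflow 4 bits?
    aux_carry = (a & 0x0F) + (b & 0x0F) > 0x0F
    if (total & 0x0F) > 9 or aux_carry:
        total += 0x06                               # fix the low decimal digit
    if (total >> 4) > 9:
        total += 0x60                               # fix the high decimal digit
    return total & 0xFF, total > 0xFF


for a, b in ((0x32, 0x16), (0x38, 0x16), (0x38, 0x19), (0x99, 0x99)):
    result, carry = bcd_add_byte(a, b)
    print(f"{a:02X} + {b:02X} = {result:02X} carry={carry}")
# 32 + 16 = 48 carry=False
# 38 + 16 = 54 carry=False
# 38 + 19 = 57 carry=False
# 99 + 99 = 98 carry=True
```

In the third case the binary sum 51 (hex) looks valid on its own; only the simulated auxiliary carry reveals that 6 must still be added to reach 57.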