## Binary to decimal convert example


This note describes how record keys are encoded. The encoding is designed such that memcmp can be used to sort the keys into their proper order. A key consists of a table number followed by a list of one or more SQL values. Each SQL value in the list has one of the following types: NULL, numeric, text, or binary. Keys are compared value by value, from left to right, until a difference is found. The first difference determines the key order. The table number is a varint that identifies the table to which the key belongs.

To generate the encoding, the SQL values of the key are visited from left to right. Each SQL value generates one or more bytes that are appended to the encoding. The complete key encoding is the concatenation of the individual SQL value encodings, in the same order as the SQL values. Two key encodings are comparable only if they have the same number of SQL values and if corresponding SQL values have the same sort order. The first byte of a key past the table number will be in the range of 0x This leaves large chunks of key space available for other uses.

For example, the three-byte key 0x00 0x00 0x01 stores the schema cookie for the database as a bit big-endian integer.

Each text value begins with the byte 0x24 and ends with the byte 0x00. There are zero or more intervening bytes that encode the text value. The intervening bytes are chosen so that the encoding will sort in the desired collating order.

The default sequence of bytes is simply UTF8. The intervening bytes may not contain a 0x00 character; the only 0x00 byte allowed in a text encoding is the final byte. Strings must be converted to UTF8 so that equivalent strings in different encodings compare the same and so that the strings contain no embedded 0x00 bytes. In other words, strcmp should be sufficient for comparing two text keys. The text encoding ends in 0x00 to ensure that when one string is a prefix of another, the shorter string sorts first.
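The text rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `encode_text` is hypothetical, the default UTF8 byte sequence is used (no custom collating order), and the leading 0x24 byte is the text marker mentioned elsewhere in this note.

```python
TEXT_TAG = 0x24  # initial byte of a text value, as stated in this note


def encode_text(value: str) -> bytes:
    """Sketch of the default text encoding: tag byte, UTF8 bytes, 0x00 terminator."""
    body = value.encode("utf-8")
    # The only 0x00 byte allowed in a text encoding is the final byte.
    if b"\x00" in body:
        raise ValueError("text value must not contain an embedded 0x00 byte")
    return bytes([TEXT_TAG]) + body + b"\x00"


# Python compares bytes objects lexicographically as unsigned bytes,
# which gives the same ordering as memcmp on the encodings.
words = ["abc", "ab", "b", "", "abd"]
by_encoding = sorted(words, key=encode_text)
```

Because the terminator 0x00 is smaller than any byte that can appear inside the string, `encode_text("ab")` compares less than `encode_text("abc")`, which is the prefix behavior described above.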

### Binary Encoding

The encoding of binary fields differs depending on whether or not the value to be encoded is the last value (the right-most value) in the key.

A binary value that is not the last value in the key ends with a 0x00 byte. There are zero or more intervening bytes that encode the binary value. None of the intervening bytes may be zero. Each of the intervening bytes contains 7 bits of blob content with a 1 in the high-order bit (the 0x80 bit). The final byte before the 0x00 contains any left-over bits of the blob content.

A binary value that is the last value in the key uses an alternative encoding. This alternative encoding is more efficient, but it only works if there are no subsequent values in the key, since there is no termination mark on the BLOB being encoded.
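The 7-bits-per-byte packing for a binary value that is followed by other values can be sketched as below. The function name `pack_blob_7bit` is hypothetical, and the sketch covers only the intervening bytes and the 0x00 terminator, not the initial type byte. It also assumes that the left-over bits are placed in the high-order content bits of the final byte, padded with zeros, and that the final byte also has its 0x80 bit set so that it can never be zero.

```python
def pack_blob_7bit(data: bytes) -> bytes:
    """Pack blob content 7 bits per output byte, high bit set, then append 0x00."""
    out = bytearray()
    acc = 0    # bit accumulator holding content bits not yet emitted
    nbits = 0  # number of valid bits currently in acc
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 7:
            nbits -= 7
            out.append(0x80 | ((acc >> nbits) & 0x7F))
        acc &= (1 << nbits) - 1  # keep only the bits still pending
    if nbits:
        # Assumption: left-over bits are left-aligned and zero-padded.
        out.append(0x80 | ((acc << (7 - nbits)) & 0x7F))
    out.append(0x00)  # terminator; no earlier byte can be zero
    return bytes(out)


samples = [b"", b"\x00", b"\x00\x00", b"\x01", b"\x7f\xff", b"\x80", b"\xff"]
packed_order = sorted(samples, key=pack_blob_7bit)
```

Under these assumptions the packed form sorts in the same order as the raw blobs, so `packed_order` matches `sorted(samples)`.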

The initial byte of a binary value, 0x25 or 0x26, is larger than the initial byte of a text value, 0x24, ensuring that every binary value will sort after every text value.

We assume that numeric SQL values can be both integer and floating point values. If the numeric value is a NaN, then the encoding is a single byte of 0x This causes NaN values to sort prior to every other numeric value.

If the numeric value is a negative infinity then the encoding is a single byte of 0x Since every other numeric value except NaN has a larger initial byte, this encoding ensures that negative infinity will sort prior to every other numeric value other than NaN.

If the numeric value is a positive infinity then the encoding is a single byte of 0x Every other numeric value encoding begins with a smaller byte, ensuring that positive infinity always sorts last among numeric values. If the numeric value is exactly zero then it is encoded as a single byte of 0x Finite negative values will have initial bytes of 0x08 through 0x14 and finite positive values will have initial bytes of 0x16 through 0x For all values, we compute a mantissa M and an exponent E.

The mantissa is a base-100 representation of the value. The exponent E determines where to put the decimal point. Each centimal (base-100) digit of the mantissa is stored in a byte.

This means that the mantissa will never contain a byte with the value 0x If we assume all digits of the mantissa occur to the right of the decimal point, then the exponent E is the power of one hundred by which one must multiply the mantissa to recover the original value.

If E is 11 or more, the value is large. For E between 0 and 10, the value is medium. For E less than zero, the value is small.
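To make the mantissa and exponent concrete, here is a small sketch that splits a positive value into its centimal digits and the exponent E, then classifies it by size. The names `centimal` and `size_class` are hypothetical, and the sketch assumes the mantissa is normalized so that its first centimal digit is non-zero. It does not show how each digit is turned into a byte.

```python
from decimal import Decimal


def centimal(value: Decimal):
    """Return (E, digits) with value == 0.d1 d2 d3... (base 100) * 100**E."""
    if value <= 0:
        raise ValueError("this sketch handles finite positive values only")
    _, digits, exp = value.normalize().as_tuple()
    digits = list(digits)  # decimal digits; value == int(digits) * 10**exp
    if exp % 2:            # make the power of ten even so digits pair up
        digits.append(0)
        exp -= 1
    if len(digits) % 2:    # pad on the left to a whole number of pairs
        digits.insert(0, 0)
    pairs = [10 * digits[i] + digits[i + 1] for i in range(0, len(digits), 2)]
    return len(pairs) + exp // 2, pairs


def size_class(e: int) -> str:
    """Classify by exponent using the thresholds given in the text."""
    if e >= 11:
        return "large"
    if e >= 0:
        return "medium"
    return "small"
```

For instance, `centimal(Decimal("1234.5"))` gives E = 2 with centimal digits 12, 34, 50, because 0.123450 in base 100 multiplied by 100² is 1234.5, and `size_class(2)` reports a medium value.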

Large positive values are encoded as a single byte 0x22 followed by E as a varint and then M. Medium positive values are encoded as a single byte 0x17+E followed by M. Small positive values are encoded as a single byte 0x16 followed by the ones-complement of the varint for -E followed by M. Small negative values are encoded as a single byte 0x14 followed by -E as a varint and then the ones-complement of M. Medium negative values are encoded as a single byte 0x13-E followed by the ones-complement of M.
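The ones-complement is what makes negative values sort in the right direction: a negative number with a larger magnitude must come earlier, so the byte order of its mantissa has to be reversed. A minimal sketch follows, with a hypothetical helper name and made-up mantissa bytes of equal length chosen purely for illustration.

```python
def ones_complement(data: bytes) -> bytes:
    """Flip every bit of every byte."""
    return bytes(b ^ 0xFF for b in data)


# Hypothetical mantissa bytes for two values with the same exponent.
m_smaller = bytes([0x03, 0x10])
m_larger = bytes([0x03, 0x20])

# Uncomplemented, the smaller mantissa sorts first (positive values).
positive_order = m_smaller < m_larger
# Complemented, the order is reversed (negative values).
negative_order = ones_complement(m_smaller) > ones_complement(m_larger)
```

Applying `ones_complement` twice returns the original bytes, so a decoder can recover M by complementing again.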

Large negative values consist of the single byte 0x08 followed by the ones-complement of the varint encoding of E followed by the ones-complement of M.

### Summary

Each SQL value is encoded as one or more bytes. The first byte of the encoding, its meaning, and a terse description of the bytes that follow are given by the following table: