D-String, an answer to P-String and C-String limitations

C-Strings are terminated by a NUL byte. P-Strings are preceded by a length identifier. Both have their downsides, and I’ve developed a solution (it’s called the D-String; D for Data). The C-String’s downfall is that it cannot contain a NUL byte (otherwise C, the interpreting language, will terminate the data prematurely). The P-String’s downfall is that it cannot represent more than 255 bytes (unless, of course, you use a wider length identifier, in which case you’ve also increased the overhead). The D-String overcomes both of these limitations with minimal overhead. Let’s have a look at the specifics (free of charge).
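To make the two limitations concrete, here is a minimal Python sketch. The helper names `c_string_read` and `p_string_write` are made up for illustration; they only mimic how a NUL-terminated reader and a one-byte length prefix behave, and are not taken from any real library.

```python
def c_string_read(buf: bytes) -> bytes:
    """Read a C-style string: everything up to the first NUL byte."""
    end = buf.find(b"\x00")
    return buf if end == -1 else buf[:end]


def p_string_write(data: bytes) -> bytes:
    """Write a classic P-String: one length byte followed by the data."""
    if len(data) > 255:
        raise ValueError("a one-byte length cannot describe more than 255 bytes")
    return bytes([len(data)]) + data


payload = b"abc\x00def"

# The C-String reader stops at the embedded NUL and loses "def".
print(c_string_read(payload + b"\x00"))   # b'abc'

# The P-String keeps the embedded NUL ...
print(p_string_write(payload))            # b'\x07abc\x00def'

# ... but gives up once the data no longer fits a one-byte length.
try:
    p_string_write(b"x" * 256)
except ValueError as exc:
    print("P-String failed:", exc)
```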

NOTE: These are my own internal notes. They will be translated into a full technical explanation in another blog post. However, the fact is that I’ve sat on this technology for 10 years and want to finally make it public, and this is the first step in doing so. Finally, this will serve as a backup should my iPhone crash (it is currently the only machine in the world with a documented example of the methodology). This is not meant to be digested by mere mortals (but if you can, all the more power to you: you’ll have a leg up on the rest of those waiting on the technical discussion).

shxd.rfc04 — dStr data object

Len => dStr (length bytes, 0x00 terminator, encode register, DATA)
0x00 => 0x00
0x01 => 0x01 0x00 0x00 DATA
0xFF => 0xFF 0x00 0x00 DATA
0x0100 => 0x0101 0x00 0x01 DATA
0x0101 => 0x0101 0x00 0x00 DATA
0xFFFF => 0xFFFF 0x00 0x00 DATA
0x010000 => 0x010101 0x00 0x03 DATA
0x010001 => 0x010101 0x00 0x02 DATA
0x010100 => 0x010101 0x00 0x01 DATA
0xFFFFFF => 0xFFFFFF 0x00 0x00 DATA
0x01000000 => 0x01010101 0x00 0x07 DATA
0x01000001 => 0x01010101 0x00 0x06 DATA
0x01000100 => 0x01010101 0x00 0x05 DATA
0x01000101 => 0x01010101 0x00 0x04 DATA
0x01010000 => 0x01010101 0x00 0x03 DATA
0x01010001 => 0x01010101 0x00 0x02 DATA
0x01010100 => 0x01010101 0x00 0x01 DATA
0xFFFFFFFF => 0xFFFFFFFF 0x00 0x00 DATA
0x0100000000 => 0x0101010101 0x00 0x0F DATA
0x0100000001 => 0x0101010101 0x00 0x0E DATA
0x0100000100 => 0x0101010101 0x00 0x0D DATA
0x0100000101 => 0x0101010101 0x00 0x0C DATA
0x0100010000 => 0x0101010101 0x00 0x0B DATA
0x0100010001 => 0x0101010101 0x00 0x0A DATA
0x0100010100 => 0x0101010101 0x00 0x09 DATA
0x0100010101 => 0x0101010101 0x00 0x08 DATA
0x0101000000 => 0x0101010101 0x00 0x07 DATA
0x0101000001 => 0x0101010101 0x00 0x06 DATA
0x0101000100 => 0x0101010101 0x00 0x05 DATA
0x0101000101 => 0x0101010101 0x00 0x04 DATA
0x0101010000 => 0x0101010101 0x00 0x03 DATA
0x0101010001 => 0x0101010101 0x00 0x02 DATA
0x0101010100 => 0x0101010101 0x00 0x01 DATA
0xFFFFFFFFFF => 0xFFFFFFFFFF 0x00 0x00 DATA
0x010000000000 => 0x010101010101 0x00 0x1F DATA
0x010000000001 => 0x010101010101 0x00 0x1E DATA
0x010000000100 => 0x010101010101 0x00 0x1D DATA
0x010000000101 => 0x010101010101 0x00 0x1C DATA
0x010000010000 => 0x010101010101 0x00 0x1B DATA
0x010000010001 => 0x010101010101 0x00 0x1A DATA
0x010000010100 => 0x010101010101 0x00 0x19 DATA
0x010000010101 => 0x010101010101 0x00 0x18 DATA
0x010001000000 => 0x010101010101 0x00 0x17 DATA
0x010001000001 => 0x010101010101 0x00 0x16 DATA
0x010001000100 => 0x010101010101 0x00 0x15 DATA
0x010001000101 => 0x010101010101 0x00 0x14 DATA
0x010001010000 => 0x010101010101 0x00 0x13 DATA
0x010001010001 => 0x010101010101 0x00 0x12 DATA
0x010001010100 => 0x010101010101 0x00 0x11 DATA
0x010001010101 => 0x010101010101 0x00 0x10 DATA
0x010100000000 => 0x010101010101 0x00 0x0F DATA
0x010100000001 => 0x010101010101 0x00 0x0E DATA
0x010100000100 => 0x010101010101 0x00 0x0D DATA
0x010100000101 => 0x010101010101 0x00 0x0C DATA
0x010100010000 => 0x010101010101 0x00 0x0B DATA
0x010100010001 => 0x010101010101 0x00 0x0A DATA
0x010100010100 => 0x010101010101 0x00 0x09 DATA
0x010100010101 => 0x010101010101 0x00 0x08 DATA
0x010101000000 => 0x010101010101 0x00 0x07 DATA
0x010101000001 => 0x010101010101 0x00 0x06 DATA
0x010101000100 => 0x010101010101 0x00 0x05 DATA
0x010101000101 => 0x010101010101 0x00 0x04 DATA
0x010101010000 => 0x010101010101 0x00 0x03 DATA
0x010101010001 => 0x010101010101 0x00 0x02 DATA
0x010101010100 => 0x010101010101 0x00 0x01 DATA
0xFFFFFFFFFFFF => 0xFFFFFFFFFFFF 0x00 0x00 DATA
.
.
.
0x0100000000000000 => 0x0101010101010101 0x00 0x7F DATA
0xFFFFFFFFFFFFFFFF => 0xFFFFFFFFFFFFFFFF 0x00 0x00 DATA
.
.
.
0x01000000000000000000000000000000 => 0x01010101010101010101010101010101 0x00 0x7FFF DATA
0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF => 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF 0x00 0x0000 DATA
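
Reading the table as a rule, an encoder might be sketched in Python as follows. This is one interpretation of the table, not a reference implementation, and it rests on several assumptions: the length is written big-endian in as few bytes as possible; every 0x00 byte inside the length is written as 0x01 and flagged by a set bit in the encode register (bit 0 for the lowest-order length byte); the register takes one byte per 8 length bytes, rounded up, and is also big-endian. The names `dstr_header` and `dstr_encode` are hypothetical.

```python
def dstr_header(n: int) -> bytes:
    """Build the dStr header for a payload of n bytes (assumed layout)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n == 0:
        return b"\x00"                      # empty dStr is a lone 0x00
    length = bytearray(n.to_bytes((n.bit_length() + 7) // 8, "big"))
    width = len(length)
    register = 0
    for i, byte in enumerate(length):
        if byte == 0x00:
            length[i] = 0x01                # keep NUL out of the length field
            register |= 1 << (width - 1 - i)  # remember which byte was zero
    reg_width = (width + 7) // 8            # assumption: ceil(width / 8) bytes
    return bytes(length) + b"\x00" + register.to_bytes(reg_width, "big")


def dstr_encode(data: bytes) -> bytes:
    """Prefix data with its dStr header."""
    return dstr_header(len(data)) + data


print(dstr_encode(b"hi").hex(" "))      # 02 00 00 68 69
print(dstr_header(0x0100).hex(" "))     # 01 01 00 01
print(dstr_header(0x010000).hex(" "))   # 01 01 01 00 03
```

Because the length field never contains a 0x00 byte, a reader can find its end by scanning for the first 0x00, exactly as it would for a C-String, while DATA itself is free to contain anything.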

That’s a 16-byte length identifier, able to describe up to 2^(8*16) - 1 or 2^128 - 1 bytes (about 3.4028236692094e+38, or roughly 340 undecillion bytes). The length identifier is valid with only 3 bytes of overhead preceding the actual DATA (compared to 16 bytes for the length identifier itself): the 0x00 terminator plus a 2-byte encode register.

Scaling this higher (to 512-bit integers, 64 bytes wide), the overhead would be 9 bytes.

The overhead is always the length of the length identifier (in bytes) divided by 8, rounded up, plus one for the 0x00 terminator (with the minimum overhead being two bytes at the low end).
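
As a quick check of that arithmetic, here is a small Python sketch. The function name `dstr_overhead` is hypothetical, and rounding the division up is an assumption (it is what makes the 1-byte case come out at two bytes).

```python
def dstr_overhead(length_width: int) -> int:
    """Overhead in bytes beyond the length identifier itself.

    length_width is the size of the length identifier in bytes.
    Assumption: the encode register needs ceil(length_width / 8) bytes,
    and the 0x00 terminator adds one more.
    """
    if length_width < 1:
        raise ValueError("the length identifier is at least one byte wide")
    register_bytes = (length_width + 7) // 8
    return register_bytes + 1


for width in (1, 8, 16, 64):
    print(width, dstr_overhead(width))
# 1 2
# 8 2
# 16 3
# 64 9
```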

If a dStr contains 0 bytes of data, the dStr will be 0x00.

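If a dStr contains 1-255 bytes of data, the dStr will be 0xLL 0x00 0xNN DATA (header is 3 bytes). LL is the length of DATA. NN is the encode register: going by the table above, each set bit marks a length byte that is really 0x00 but was written as 0x01 so that the length field never contains a NUL (bit 0 is the lowest-order length byte).

If a dStr contains 256-65535 bytes of data, the dStr will be 0xLLLL 0x00 0xNN DATA (header is 4 bytes).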

If a dStr contains 65536-16777215 bytes of data, the dStr will be 0xLLLLLL 0x00 0xNN DATA (header is 5 bytes).

If a dStr contains 16777216-4294967295 bytes of data, the dStr will be 0xLLLLLLLL 0x00 0xNN DATA (header of 6 bytes).

If a dStr contains 4294967296-1099511627775 bytes of data, the dStr will be 0xLLLLLLLLLL 0x00 0xNN DATA (header of 7 bytes).

If a dStr contains 1099511627776-281474976710655 bytes of data, the dStr will be 0xLLLLLLLLLLLL 0x00 0xNN DATA (header of 8 bytes).

If a dStr contains 281474976710656-7.2057594037928e+16 bytes of data, the dStr will be 0xLLLLLLLLLLLLLL 0x00 0xNN DATA (header of 9 bytes).

If a dStr contains 7.2057594037928e+16-1.844674407371e+19 bytes of data, the dStr will be 0xLLLLLLLLLLLLLLLL 0x00 0xNN DATA (header of 10 bytes).

If a dStr contains 1.844674407371e+19-4.7223664828696e+21 bytes of data, the dStr will be 0xLLLLLLLLLLLLLLLLLL 0x00 0xNNNN DATA (header of 12 bytes).

If the dStr contains 4.7223664828696e+21-1.2089258196146e+24 bytes of data, the dStr will be 0xLLLLLLLLLLLLLLLLLLLL 0x00 0xNNNN DATA (header of 13 bytes).

Ad nauseam, ad infinitum.
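
Going the other way, a reader for these layouts could be sketched in Python like this. It is a sketch under assumptions, not a definitive decoder: it assumes the length and the encode register are both big-endian, that the register is one byte per 8 length bytes (rounded up), and that bit 0 of the register refers to the lowest-order length byte. The name `dstr_decode` is hypothetical.

```python
from typing import Tuple


def dstr_decode(buf: bytes) -> Tuple[bytes, bytes]:
    """Return (data, rest) for the dStr at the start of buf (assumed layout)."""
    # The length field holds no 0x00, so the first 0x00 ends it.
    end = buf.index(b"\x00")            # raises ValueError if there is none
    if end == 0:
        return b"", buf[1:]             # a lone 0x00 is the empty dStr

    reg_width = (end + 7) // 8          # assumption: ceil(width / 8) bytes
    reg_start = end + 1
    data_start = reg_start + reg_width
    if len(buf) < data_start:
        raise ValueError("truncated dStr header")
    register = int.from_bytes(buf[reg_start:data_start], "big")

    length = 0
    for i, byte in enumerate(buf[:end]):
        bit = end - 1 - i               # bit 0 is the lowest-order byte
        if (register >> bit) & 1:
            byte = 0x00                 # this byte was an escaped zero
        length = (length << 8) | byte

    data = buf[data_start:data_start + length]
    if len(data) != length:
        raise ValueError("truncated dStr data")
    return data, buf[data_start + length:]


print(dstr_decode(b"\x03\x00\x00a\x00b" + b"tail"))   # (b'a\x00b', b'tail')
```

Note that the embedded NUL in the payload survives, because only the length field is scanned for a terminator; DATA is taken by count.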


4 Responses to D-String, an answer to P-String and C-String limitations

  1. Saladphorq says:

    I think you put a 0 where a 1 should be. Just saying

  2. My husband and i have been really happy when Chris could round up his investigation through your ideas he got in your blog. It is now and again perplexing just to find yourself giving out secrets and techniques many others may have been making money from. And we also grasp we have the writer to be grateful to because of that. The most important illustrations you made, the straightforward site menu, the friendships you can give support to engender – it is mostly wonderful, and it’s really aiding our son and our family recognize that this situation is cool, which is certainly quite pressing. Thank you for everything!

    • devinteske says:

      Wasn’t sure if this was a link-back attempt. If it is, it’s the most clearly written one ever made! (and if not, hey, all the better!) I approve this message even if a link-back attempt because, hey, I advocate clearly written digestible commentary such as this.
