<p>While validating <code>~E</code> isn't technically required, it acts as a useful assertion that the packet being
parsed is valid.</p>
<h2id="schema">The packet schema header</h2>
<p>Following the 4 byte header, is a 4 byte integer containing the length of the next part of the header (this is
technically made redundant as this structure is also terminated).</p>
<p>This part of the header defines the schema that the main payload uses.</p>
<p>A tag definition looks like:</p>
<table>
<thead>
<tr>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td>8</td>
<td>9</td>
<td>10</td>
<td>11</td>
<td>12</td>
<td>13</td>
<td>14</td>
<td>15</td>
</tr>
</thead>
<tr>
<td>Type</td>
<td>nlen</td>
<tdcolspan="7">Tag name</td>
<tdstyle="border-bottom: none"colspan="8"></td>
</tr>
<tr>
<tdstyle="border-top: none;"colspan="15">Attributes and children</td>
<tdcolspan="1"><i>FE</i></td>
</tr>
</table>
<p>Structure names are encoded as densely packed 6 bit values, length prefixed (<code>nlen</code>). The acceptable
alphabet is <code>0123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz</code>, and the packed values
are indecies within this alphabet.</p>
<p>The children can be a combination of either attribute names, or child tags. Attribute names are represented by
the byte <code>0x2E</code> followed by a length prefixed name as defined above. Child tags follow the above
format. Type <code>0x2E</code> must therefore be considered reserved as a possible structure type.</p>
<p>Attributes (type <code>0x2E</code>) represent a string attribute. Any other attribute must be defined as a child
tag. Is it notable that 0 children is allowable, which is how the majority of values are encoded.</p>
<p>All valid IDs, and their respective type, are listed in the following table. The bucket column here will be
used later when unpacking the main data, so we need not worry about it for now, but be warned it exists and is
possibly the least fun part of this format.</p>
<table>
<thead>
<tr>
<td>ID</td>
<td>Bytes</td>
<td>C type</td>
<td>Bucket</td>
<tdcolspan="2">XML names</td>
<td></td>
<td>ID</td>
<td>Bytes</td>
<td>C type</td>
<td>Bucket</td>
<tdcolspan="2">XML names</td>
</tr>
</thead>
<tr>
<td>0x01</td>
<td>0</td>
<td>void</td>
<td>-</td>
<td>void</td>
<td></td>
<td></td>
<td>0x21</td>
<td>24</td>
<td>uint64[3]</td>
<td>int</td>
<td>3u64</td>
<td></td>
</tr>
<tr>
<td>0x02</td>
<td>1</td>
<td>int8</td>
<td>byte</td>
<td>s8</td>
<td></td>
<td></td>
<td>0x22</td>
<td>12</td>
<td>float[3]</td>
<td>int</td>
<td>3f</td>
<td></td>
</tr>
<tr>
<td>0x03</td>
<td>1</td>
<td>uint8</td>
<td>byte</td>
<td>u8</td>
<td></td>
<td></td>
<td>0x23</td>
<td>24</td>
<td>double[3]</td>
<td>int</td>
<td>3d</td>
<td></td>
</tr>
<tr>
<td>0x04</td>
<td>2</td>
<td>int16</td>
<td>short</td>
<td>s16</td>
<td></td>
<td></td>
<td>0x24</td>
<td>4</td>
<td>int8[4]</td>
<td>int</td>
<td>4s8</td>
<td></td>
</tr>
<tr>
<td>0x05</td>
<td>2</td>
<td>uint16</td>
<td>short</td>
<td>s16</td>
<td></td>
<td></td>
<td>0x25</td>
<td>4</td>
<td>uint8[4]</td>
<td>int</td>
<td>4u8</td>
<td></td>
</tr>
<tr>
<td>0x06</td>
<td>4</td>
<td>int32</td>
<td>int</td>
<td>s32</td>
<td></td>
<td></td>
<td>0x26</td>
<td>8</td>
<td>int16[4]</td>
<td>int</td>
<td>4s16</td>
<td></td>
</tr>
<tr>
<td>0x07</td>
<td>4</td>
<td>uint32</td>
<td>int</td>
<td>u32</td>
<td></td>
<td></td>
<td>0x27</td>
<td>8</td>
<td>uint8[4]</td>
<td>int</td>
<td>4s16</td>
<td></td>
</tr>
<tr>
<td>0x08</td>
<td>8</td>
<td>int64</td>
<td>int</td>
<td>s64</td>
<td></td>
<td></td>
<td>0x28</td>
<td>16</td>
<td>int32[4]</td>
<td>int</td>
<td>4s32</td>
<td>vs32</td>
</tr>
<tr>
<td>0x09</td>
<td>8</td>
<td>uint64</td>
<td>int</td>
<td>u64</td>
<td></td>
<td></td>
<td>0x29</td>
<td>16</td>
<td>uint32[4]</td>
<td>int</td>
<td>4u32</td>
<td>vs32</td>
</tr>
<tr>
<td>0x0a</td>
<td><i>prefix</i></td>
<td>char[]</td>
<td>int</td>
<td>bin</td>
<td>binary</td>
<td></td>
<td>0x2a</td>
<td>32</td>
<td>int64[4]</td>
<td>int</td>
<td>4s64</td>
<td></td>
</tr>
<tr>
<td>0x0b</td>
<td><i>prefix</i></td>
<td>char[]</td>
<td>int</td>
<td>str</td>
<td>string</td>
<td></td>
<td>0x2b</td>
<td>32</td>
<td>uint64[4]</td>
<td>int</td>
<td>4u64</td>
<td></td>
</tr>
<tr>
<td>0x0c</td>
<td>4</td>
<td>uint8[4]</td>
<td>int</td>
<td>ip4</td>
<td></td>
<td></td>
<td>0x2c</td>
<td>16</td>
<td>float[4]</td>
<td>int</td>
<td>4f</td>
<td>vf</td>
</tr>
<tr>
<td>0x0d</td>
<td>4</td>
<td>uint32</td>
<td>int</td>
<td>time</td>
<td></td>
<td></td>
<td>0x2d</td>
<td>32</td>
<td>double[4]</td>
<td>int</td>
<td>4d</td>
<td></td>
</tr>
<tr>
<td>0x0e</td>
<td>4</td>
<td>float</td>
<td>int</td>
<td>float</td>
<td>f</td>
<td></td>
<td>0x2e</td>
<td><i>prefix</i></td>
<td>char[]</td>
<td>int</td>
<td>attr</td>
<td></td>
</tr>
<tr>
<td>0x0f</td>
<td>8</td>
<td>double</td>
<td>int</td>
<td>double</td>
<td>d</td>
<td></td>
<td>0x2f</td>
<td>0</td>
<td></td>
<td>-</td>
<td>array</td>
<td></td>
</tr>
<tr>
<td>0x10</td>
<td>2</td>
<td>int8[2]</td>
<td>short</td>
<td>2s8</td>
<td></td>
<td></td>
<td>0x30</td>
<td>16</td>
<td>int8[16]</td>
<td>int</td>
<td>vs8</td>
<td></td>
</tr>
<tr>
<td>0x11</td>
<td>2</td>
<td>uint8[2]</td>
<td>short</td>
<td>2u8</td>
<td></td>
<td></td>
<td>0x31</td>
<td>16</td>
<td>uint8[16]</td>
<td>int</td>
<td>vu8</td>
<td></td>
</tr>
<tr>
<td>0x12</td>
<td>4</td>
<td>int16[2]</td>
<td>int</td>
<td>2s16</td>
<td></td>
<td></td>
<td>0x32</td>
<td>16</td>
<td>int8[8]</td>
<td>int</td>
<td>vs16</td>
<td></td>
</tr>
<tr>
<td>0x13</td>
<td>4</td>
<td>uint16[2]</td>
<td>int</td>
<td>2s16</td>
<td></td>
<td></td>
<td>0x33</td>
<td>16</td>
<td>uint8[8]</td>
<td>int</td>
<td>vu16</td>
<td></td>
</tr>
<tr>
<td>0x14</td>
<td>8</td>
<td>int32[2]</td>
<td>int</td>
<td>2s32</td>
<td></td>
<td></td>
<td>0x34</td>
<td>1</td>
<td>bool</td>
<td>byte</td>
<td>bool</td>
<td>b</td>
</tr>
<tr>
<td>0x15</td>
<td>8</td>
<td>uint32[2]</td>
<td>int</td>
<td>2u32</td>
<td></td>
<td></td>
<td>0x35</td>
<td>2</td>
<td>bool[2]</td>
<td>short</td>
<td>2b</td>
<td></td>
</tr>
<tr>
<td>0x16</td>
<td>16</td>
<td>int16[2]</td>
<td>int</td>
<td>2s64</td>
<td>vs64</td>
<td></td>
<td>0x36</td>
<td>3</td>
<td>bool[3]</td>
<td>int</td>
<td>3b</td>
<td></td>
</tr>
<tr>
<td>0x17</td>
<td>16</td>
<td>uint16[2]</td>
<td>int</td>
<td>2u64</td>
<td>vu64</td>
<td></td>
<td>0x37</td>
<td>4</td>
<td>bool[4]</td>
<td>int</td>
<td>4b</td>
<td></td>
</tr>
<tr>
<td>0x18</td>
<td>8</td>
<td>float[2]</td>
<td>int</td>
<td>2f</td>
<td></td>
<td></td>
<td>0x38</td>
<td>16</td>
<td>bool[16]</td>
<td>int</td>
<td>vb</td>
<td></td>
</tr>
<tr>
<td>0x19</td>
<td>16</td>
<td>double[2]</td>
<td>int</td>
<td>2d</td>
<td>vd</td>
<td></td>
<td>0x38</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0x1a</td>
<td>3</td>
<td>int8[3]</td>
<td>int</td>
<td>3s8</td>
<td></td>
<td></td>
<td>0x39</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0x1b</td>
<td>3</td>
<td>uint8[3]</td>
<td>int</td>
<td>3u8</td>
<td></td>
<td></td>
<td>0x3a</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0x1c</td>
<td>6</td>
<td>int16[3]</td>
<td>int</td>
<td>3s16</td>
<td></td>
<td></td>
<td>0x3b</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0x1d</td>
<td>6</td>
<td>uint16[3]</td>
<td>int</td>
<td>3s16</td>
<td></td>
<td></td>
<td>0x3c</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0x1e</td>
<td>12</td>
<td>int32[3]</td>
<td>int</td>
<td>3s32</td>
<td></td>
<td></td>
<td>0x3d</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0x1f</td>
<td>12</td>
<td>uint32[3]</td>
<td>int</td>
<td>3u32</td>
<td></td>
<td></td>
<td>0x3e</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0x20</td>
<td>24</td>
<td>int64[3]</td>
<td>int</td>
<td>3s64</td>
<td></td>
<td></td>
<td>0x3f</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</table>
<p>Strings should be encoded and decoded according to the encoding specified in the packet header. Null termination is optional, however should be stripped during decoding.</p>
<p>All of these IDs are <code>& 0x3F</code>. Any value can be turned into an array by setting the 7<sup>th</sup> bit
high (<code>| 0x40</code>). Arrays of this form, in the data section, will be an aligned <code>size: u32</code>
immediately followed by <code>size</code> bytes' worth of (unaligned!) values of the unmasked type.</p>
<details>
<summary>Source code details</summary>
<p>The full table for these values can be found in libavs. This table contains the names of every tag, along
with additional information such as how many bytes that data type requires, and which parsing function
<summary>Note about the <code>array</code> type:</summary>
<p>While I'm not totally sure, I have a suspicion this type is used internally as a pseudo-type. Trying to
identify its function as a parsable type has some obvious blockers:</p>
<p>All of the types have convenient <code>printf</code>-using helper functions that are used to emit them when
serializing XML. All except one.</p>
<imgsrc="./images/no_array.png">
<p>If we have a look inside the function that populates node sizes (<code>libavs-win32.dll:0x1000cf00</code>),
it has an explicit case, however is the same fallback as the default case.</p>
<imgsrc="./images/no_array_2.png">
<p>In the same function, however, we can find a second (technically first) check for the array type.</p>
<imgsrc="./images/yes_array.png">
<p>This seems to suggest that internally arrays are represented as a normal node, with the <code>array</code>
type, however when serializing it's converted into the array types we're used to (well, will be after the
next sections) by masking 0x40 onto the contained type.</p>
<p>Also of interest from this snippet is the fact that <code>void</code>, <code>bin</code>, <code>str</code>,
and <code>attr</code> cannot be arrays. <code>void</code> and <code>attr</code> make sense, however
<code>str</code> and <code>bin</code> are more interesting. I suspect this is because konami want to be able
to preallocate the memory, which wouldn't be possible with these variable length structures.
</p>
</details>
<h2id="data">The data section</h2>
<p>This is where all the actual packet data is. For the most part, parsing this is the easy part. We traverse our
schema, and read values out of the packet according to the value indicated in the schema. Unfortunately, konami
decided all data should be aligned very specifically, and that gaps left during alignment should be backfilled
later. This makes both reading and writing somewhat more complicated, however the system can be fairly easily
understood.</p>
<p>Firstly, we divide the payload up into 4 byte chunks. Each chunk can be allocated to either store individual
bytes, shorts, or ints (these are the buckets in the table above). When reading or writing a value, we first
check if a chunk allocated to the desired type's bucket is available and has free/as-yet-unread space within it.
If so, we will store/read our data to/from there. If there is no such chunk, we claim the next unclaimed chunk
for our bucket.</p>
<p>For example, imagine we write the sequence <code>byte, int, byte, short, byte, int, short</code>. The final output should look like:</p>
<table>
<thead>
<tr>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td>8</td>
<td>9</td>
<td>10</td>
<td>11</td>
<td>12</td>
<td>13</td>
<td>14</td>
<td>15</td>
</tr>
</thead>
<tr>
<td>byte</td>
<td>byte</td>
<td>byte</td>
<td></td>
<tdcolspan="4">int</td>
<tdcolspan="2">short</td>
<tdcolspan="2">short</td>
<tdcolspan="4">int</td>
</tr>
</table>
<p>While this might seem a silly system compared to just not aligning values, it is at least possible to intuit that it helps reduce wasted space. It should be noted that any variable-length structure, such as a string or an array, claims all chunks it encroaches on for the <code>int</code> bucket, disallowing the storage of bytes or shorts within them.</p>
<details>
<summary>Implementing a packer</summary>
<p>While the intuitive way to understand the packing algorithm is via chunks and buckets, a far more efficient implementation can be made that uses three pointers. Rather than try to explain in words, hopefully this python implementation should suffice as explanation:<pre><code>class Packer: