PCP and Alpha

Bottersnike 2023-02-17 09:22:27 +00:00
parent b1544d136c
commit c408eff8d0
3 changed files with 299 additions and 37 deletions

BIN
images/pcp-railroad.png Normal file

Binary file not shown.


@ -0,0 +1,232 @@
{% extends "sega.html" %} {% block title %}AlphaDVD{% endblock %} {% block body %}
<h1>AlphaDVD (aka &alpha;DVD)</h1>
<p>
&alpha;DVD is the custom copy-protection scheme SEGA employs for update DVDs. It is handled by
<code>mxAuthDisc.exe</code> on Ring systems. It is present on DVR-* discs, typically the first in a multi-DVD
install process.
</p>
<p>
In order to understand &alpha;DVD, it's important to first have a basic understanding of how data is stored on a
DVD. Unlike random access storage, where the data can be modelled as one large addressable series of
bytes, DVDs are more akin to HDDs in their division into sectors. Unlike HDDs, however, there is no prescribed order
for the sectors! Each sector of data on disc is prefixed by a header identifying that sector, notably including
its sector number. When a DVD reader is asked to read a specific sector, it spins the disc until it reads the
appropriate header, then returns the data following that header. Importantly, nothing here stops a
disc from containing multiple sectors with the same sector number in their prefix!
</p>
<p>
DVD readers will return the first sector that matches the requested sector number, so if we know where the different
duplicates are on disc, we can seek to a known sector a short distance before the instance of the duplicate we wish
to acquire, then ask the DVD reader to read the duplicated sector. Depending on where we first seek, we will receive
different data back.
</p>
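<p>
As a rough illustration of the idea above, here is a toy model of a drive scanning forward for a sector number.
The <code>track</code> list, the <code>read_from</code> helper and the sector numbers are all made up for this
sketch; real drives and real discs are considerably more involved.
</p>

```python
from typing import List, Tuple

Sector = Tuple[int, bytes]  # (sector number from the prefix, payload)


def read_from(track: List[Sector], head: int, wanted: int) -> Tuple[bytes, int]:
    """Scan forward from physical position `head` and return the payload of
    the first sector numbered `wanted`, plus the position just past it."""
    for pos in range(head, len(track)):
        number, payload = track[pos]
        if number == wanted:
            return payload, pos + 1
    raise LookupError(f"sector {wanted} not found after position {head}")


# Hypothetical track: sector number 7 appears three times with different data.
track = [
    (5, b"a"), (6, b"b"), (7, b"copy-1"), (8, b"c"),
    (6, b"b"), (7, b"copy-2"), (8, b"c"),
    (6, b"b"), (7, b"copy-3"), (8, b"c"),
]

# The same request returns different data depending on where we seek first.
first, _ = read_from(track, 0, 7)
second, _ = read_from(track, 4, 7)
third, _ = read_from(track, 7, 7)
```

<p>
In this model, the "seek" is simply the starting value of <code>head</code>; the read that follows picks up whichever
copy comes next.
</p>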
<p>
&alpha;DVD exploits this with 6 duplicated sectors, each of which has three distinct copies. When authenticating a
disc, only one of these 6 duplicates is checked. Which one is checked is random, so in practice all 6 must be
present or the disc will sporadically fail authentication. This is similar to a copy-protection scheme called
TAGES, though more advanced. All three copies <b>must</b> be present, so it is impossible to create a single
flat image that passes authentication!
</p>
<p>
As well as this more hardware-based authentication, there is a level of encryption applied to the disc headers too.
This is, however, much easier to work with. Each disc has a header in sector 16, sector 1, or sector 17 (checked in
that order). There is no indication which sector contains the header, so each sector is read in turn and decryption
is attempted. We can then validate the header magic number.
</p>
<h2>&alpha;DVD Encryption</h2>
<p>
&alpha;DVD encryption is a basic XOR cipher, where the text is XORed with a key, repeating the key as needed. The
key is always 32768 (8000<sub>h</sub>) bytes, and is unmodified during this process.
</p>
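<p>
A repeating-key XOR like the one described above can be sketched as follows. The name <code>xor_crypt</code> is just
for illustration and is not taken from the original binary. Because XOR is its own inverse, the same function both
encrypts and decrypts.
</p>

```python
from itertools import cycle


def xor_crypt(data: bytes, key: bytes) -> bytes:
    """XOR `data` with `key`, repeating the key as often as needed.
    The key itself is left unmodified."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Round trip with a short made-up key (real keys are 0x8000 bytes long).
demo_key = bytes([0x12, 0x34, 0x56])
scrambled = xor_crypt(b"SEGA_DVD", demo_key)
restored = xor_crypt(scrambled, demo_key)
```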
<p>
Keys are derived using a key expansion algorithm that takes as input an unsigned short (16 bit) seed. I'm not
totally sure what expansion algorithm this is, or if it's something totally custom, but for now here's a snippet of
Python code that implements the expansion:
</p>
<pre>
{%highlight "python"%}
def amAuthDiskInitKey(seed):
    # Expand a 16-bit seed into the 0x8000-byte XOR key.
    key = bytearray(0x8000)
    for i in range(0x8000):
        uVar1 = (seed * 2 >> 4 ^ seed * 2) >> 10 & 2 | seed << 2
        uVar2 = uVar1 * 2
        uVar3 = ((seed << 2) >> 4 ^ uVar1) >> 10 & 2 | uVar2
        uVar1 = uVar3 * 2
        uVar3 = (uVar2 >> 4 ^ uVar3) >> 10 & 2 | uVar1
        uVar2 = uVar3 * 2
        uVar3 = (uVar1 >> 4 ^ uVar3) >> 10 & 2 | uVar2
        uVar1 = uVar3 * 2
        uVar3 = (uVar2 >> 4 ^ uVar3) >> 10 & 2 | uVar1
        uVar2 = uVar3 * 2
        uVar3 = (uVar1 >> 4 ^ uVar3) >> 10 & 2 | uVar2
        uVar1 = uVar3 * 2
        uVar2 = (uVar2 >> 4 ^ uVar3) >> 10 & 2 | uVar1
        # The seed is an unsigned short, so keep only the low 16 bits.
        # Without the mask Python's integers grow on every iteration;
        # the output is the same, just much slower to compute.
        seed = (uVar2 | (uVar1 >> 4 ^ uVar2) >> 11 & 1) & 0xFFFF
        key[i] = seed & 0xff
    return key
{% endhighlight %}</pre
>
<h2>&alpha;DVD Headers</h2>
<p>
Now that we know how to decrypt data on &alpha;DVDs we can search for the header. The header will always be
encrypted with a fixed key with seed <code>5369</code>. The header is a sequence of 53 bytes, located at offset 318
if it is in sector 16, and offset 508 if it is in sector 1 or 17.
</p>
<table>
<thead>
<tr>
<td colspan="17">Header Offset</td>
</tr>
<tr>
<td></td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td>8</td>
<td>9</td>
<td>A</td>
<td>B</td>
<td>C</td>
<td>D</td>
<td>E</td>
<td>F</td>
</tr>
</thead>
<tbody>
<tr>
<td><b>0</b></td>
<td colspan="4">Magic = F1FFFF1F<sub>h</sub></td>
<td colspan="2"></td>
<td colspan="2">Auth sector 1</td>
<td colspan="2">Auth sector 2</td>
<td colspan="2">Auth sector 3</td>
<td colspan="6"></td>
</tr>
<tr>
<td><b>1</b></td>
<td colspan="2">Data Offset</td>
<td colspan="4"></td>
<td colspan="2">??</td>
<td colspan="20">DVD Name</td>
</tr>
<tr>
<td><b>2</b></td>
<td colspan="2">Key seed</td>
<td colspan="2">Dummy number</td>
<td colspan="3"></td>
</tr>
</tbody>
</table>
<p>
To validate the decryption of a header, both the magic number and the DVD name are checked. The DVD name must start
with <code>SEGA_DVD</code>.
</p>
<p>
The key seed present in the header is used to generate a new key that will be used to decrypt the authentication
sectors.
</p>
<h2>The Authentication Sectors</h2>
<p>
The three sector addresses in the header are now used to perform a series of seeks and reads. We seek the drive by
requesting a read of 16 sectors, but disregarding the returned data. The first step is to choose the authentication
sector we wish to read. The six duplicates are located at the following offsets:
</p>
<ul>
<li>140<sub>h</sub></li>
<li>150<sub>h</sub></li>
<li>160<sub>h</sub></li>
<li>170<sub>h</sub></li>
<li>180<sub>h</sub></li>
<li>190<sub>h</sub></li>
</ul>
<p>We will refer to our chosen offset, from this list, as <code>n</code>.</p>
<ul>
<li>Seek to <code>[Auth sector 1] - 16</code></li>
<li>Read(16) <code>([Auth sector 1] + (n - 1)) & 0xFFFFFFF0</code></li>
<li>Seek to <code>[Auth sector 1] - 16</code></li>
<li>Seek to <code>[Auth sector 1] + [Auth sector 2] - 8</code></li>
<li>Read(16) <code>([Auth sector 1] + (n - 1)) & 0xFFFFFFF0</code></li>
<li>Seek to <code>[Auth sector 1] + [Auth sector 3] + 8</code></li>
<li>Read(16) <code>([Auth sector 1] + (n - 1)) & 0xFFFFFFF0</code></li>
</ul>
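<p>
The seek and read steps above can be written out as a small planning function. This is only a sketch: the function
name <code>auth_read_plan</code> and the <code>("seek", address)</code> / <code>("read", address)</code> tuples are
invented here, and actually issuing the operations to a drive is left out entirely.
</p>

```python
from typing import List, Tuple

OFFSETS = (0x140, 0x150, 0x160, 0x170, 0x180, 0x190)


def auth_read_plan(auth1: int, auth2: int, auth3: int, n: int) -> List[Tuple[str, int]]:
    """Return the sequence of seeks and 16-sector reads for offset `n`.
    A "seek" is itself a 16-sector read whose data is thrown away."""
    if n not in OFFSETS:
        raise ValueError("n must be one of the six duplicate offsets")
    # Every read targets the same 16-sector aligned address.
    target = (auth1 + (n - 1)) & 0xFFFFFFF0
    return [
        ("seek", auth1 - 16),
        ("read", target),          # authentication block 1
        ("seek", auth1 - 16),
        ("seek", auth1 + auth2 - 8),
        ("read", target),          # authentication block 2
        ("seek", auth1 + auth3 + 8),
        ("read", target),          # authentication block 3
    ]


# Made-up header values, purely to show the arithmetic.
plan = auth_read_plan(0x1000, 0x100, 0x200, 0x140)
```

<p>
Note that all three reads request the same address; only the preceding seeks differ, which is what selects the copy
that comes back.
</p>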
<p>
Each of the three reads is decrypted using the key we generated earlier, giving authentication blocks 1, 2, and 3
respectively. The actual data is at offset 31228 in these 16-sector blocks, and has the following structure:
</p>
<table>
<thead>
<tr>
<td colspan="17">Header Offset</td>
</tr>
<tr>
<td></td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
<td>5</td>
<td>6</td>
<td>7</td>
<td>8</td>
<td>9</td>
<td>A</td>
<td>B</td>
<td>C</td>
<td>D</td>
<td>E</td>
<td>F</td>
</tr>
</thead>
<tbody>
<tr>
<td><b>0</b></td>
<td colspan="4">Magic = F1FFFF1F<sub>h</sub></td>
<td colspan="4">Num magic</td>
<td colspan="4"><code>n</code></td>
<td colspan="4"></td>
</tr>
</tbody>
</table>
<p>
Num magic will be F1FFFF1F<sub>h</sub> in the first sector, F2FFFF2F<sub>h</sub> in the second, and F3FFFF3F<sub
>h</sub
>
in the third.
</p>
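<p>
The three Num magic values differ only in two nibbles, so they can be generated from the block number. The helper
below is a hypothetical convenience, not something taken from the original code, and it relies only on the pattern
visible in the three values listed above.
</p>

```python
def expected_num_magic(block: int) -> int:
    """Return the Num magic expected in authentication block 1, 2 or 3."""
    if block not in (1, 2, 3):
        raise ValueError("block must be 1, 2 or 3")
    # The block number appears in the low nibble of the top byte
    # and in the high nibble of the bottom byte.
    return 0xF0FFFF0F | (block << 24) | (block << 4)


magics = [expected_num_magic(b) for b in (1, 2, 3)]
```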
<p>
There is, however, one extra curveball. One of these three sectors is a dummy sector that contains nonsensical data
(in practice this appears to just be nulls). This is the sector indicated by the lower byte of the
<code>Dummy number</code> field in the alpha header. It is essential that this header is <b>not</b> valid.
</p>
<p>
Assuming we pass these checks, &alpha;DVD authentication has succeeded. The disc will now be read as usual, applying
the data offset from the alpha header before any operations. Incidentally, if an ISO image has been made of an
&alpha;DVD (which will be unable to pass authentication anyway), all sectors preceding this offset can be stripped,
and the ISO then matches that of a non-alpha disc.
</p>
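<p>
Stripping the leading sectors from an image could look something like the sketch below. It assumes the data offset
is counted in sectors and that the image uses the usual 2048-byte DVD user-data sectors; neither is confirmed here,
and <code>strip_alpha_prefix</code> is a made-up name.
</p>

```python
SECTOR_SIZE = 2048  # assumption: standard DVD user-data sector size


def strip_alpha_prefix(image: bytes, data_offset: int) -> bytes:
    """Drop every sector before `data_offset` (assumed to be in sectors)."""
    if data_offset < 0:
        raise ValueError("data_offset must not be negative")
    start = data_offset * SECTOR_SIZE
    if start > len(image):
        raise ValueError("data_offset is beyond the end of the image")
    return image[start:]


# Tiny fake image: two filler sectors followed by one sector of payload.
fake_image = b"\x00" * (2 * SECTOR_SIZE) + b"\x01" * SECTOR_SIZE
stripped = strip_alpha_prefix(fake_image, 2)
```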
{% endblock %}


@ -1,48 +1,71 @@
{% extends "sega.html" %}
{% block title %}PCP{% endblock %}
{% block body %}
{% extends "sega.html" %} {% block title %}PCP{% endblock %} {% block body %}
<h1>PCP</h1>
<h2>What is PCP?</h2>
<p>PCP is the protocol used for inter-process communication between services running on Ring* systems. I have no idea
what it stands for; head-canon it as Process Command Protocol or whatever you want really. The official
implementation is <code>libpcp</code>, which is statically linked in to binaries that make use of the protocll (and
is itself dependent on <code>amLib</code>).</p>
<p>On paper, there are many things the format would at first appear to
support, but is unable to due to the reference implementation in <code>libpcp</code>. Specification of this nature
will be <span class="mark">marked</span>. Custom implementations should be liberal in what they receive, and
conservative in what they transmit; marked specification are key areas of focus for this.</p>
<p>
PCP is the protocol used for inter-process communication between services running on Ring* systems. I have no idea
what it stands for; head-canon it as Process Communication Protocol or whatever you want really. The official
implementation is <code>libpcp</code>, which is statically linked in to binaries that make use of the protocol (and
is itself dependent on <code>amLib</code>).
</p>
<p>
On paper, there are many things the format would at first appear to support, but is unable to due to the reference
implementation in <code>libpcp</code>. Specification of this nature will be <span class="mark">marked</span>. Custom
implementations should be liberal in what they receive, and conservative in what they transmit; marked specification
are key areas of focus for this.
</p>
<h2>Protocol Specification</h2>
<p>We consider two processes communicating: a server, and a consumer. A server need only be capable of processing at
least one consumer concurrently, though implementations may desire the ability to do so.</p>
<p>When a server is ready to process a command, it transmits a single <code>&gt;</code> byte to its connected consumer.
<p>
We consider two processes communicating: a server, and a consumer. A server need only be capable of processing at
least one consumer concurrently, though implementations may desire the ability to process multiple consumers.
</p>
<p>
When a server is ready to process a command, it transmits a single <code>&gt;</code> byte to its connected consumer.
</p>
<p>The consumer responds with a CRLF-terminated payload packet, containing the command.</p>
<p>The server then responds synchronously with a CRLF-terminated payload packet. If a data transfer is being performed,
this packet will contain <code>port</code> and <code>size</code> as parameters.</p>
<p>In a server->consumer data transfer operation, the consumer connects to the provided port, and expects to receive
<p>
The server then responds synchronously with a CRLF-terminated payload packet. If a data transfer is being
performed, this packet will contain <code>port</code> and <code>size</code> as parameters.
</p>
<p>
In a server->consumer data transfer operation, the consumer connects to the provided port, and expects to receive
<code>size</code> bytes of data. It then closes the connection to the data port and transmits a <code>$</code> byte
to the server to acknowledge receipt. The server will only process this acknowledgement after it has successfully
transmitted its data to a consumer.
</p>
<p>The characteristics of a consumer->server transfer are as yet undocumented. I'll get round to it!</p>
<p>If the server is unable to process a request for any reason, it may respond with a single <code>?</code>. This may be
due to a non-existent command being requested, or it may be due to an invalid packet.</p>
<p>
If the server is unable to process a request for any reason, it may respond with a single <code>?</code>. This may
be due to a non-existent command being requested, or it may be due to an invalid packet.
</p>
<h3>Payload format</h3>
<p>Payloads are a non-zero number of <code>&</code> delimited <code>=</code> separated key-value pairs. i.e.
<code>key1=value1&key2=value2</code>. Both the key and the value can contain any alphanumeric character, and any of
the symbols <code>._-:@%/\{}</code>. The value may alternatively be a single <code>?</code>. Leading and trailing
ampersands are illegal, as is more than one ampersand consecutively. Empty strings for either the key or value are
likewise illegal.
<p>
Outside of special packets such as <code>&gt;</code> and <code>$</code>, all communication in both directions
strictly follows the following structure:
</p>
<p>Spaces (ascii 20<sub>h</sub>) and tabs (ascii 09<sub>h</sub>) are allowable whitespace. They may be present anywhere
<img src="{{ROOT}}/images/pcp-railroad.png" class="graphic" />
<p>
<i><code>Text</code></i> is defined as a series of one or more bytes matching <code>[a-zA-Z0-9._:@%/\{}-]</code>.
</p>
<p>
Spaces (ascii 20<sub>h</sub>) and tabs (ascii 09<sub>h</sub>) are allowable whitespace. They may be present anywhere
in the packet surrounding keys, values, any delimiter, or separator, and will be ignored. They are not valid within
a key or a value.</p>
<p>Comments begin and end with the <code>#</code> symbol. They may appear at any point in a packet, and the packet
should be processed as if the comment is not there. <b>The content of comments observe the same restrictions as keys
and values. This notably includes no whitespace.</b></p>
a key or a value.
</p>
<p>
Comments begin and end with the <code>#</code> symbol. They may appear at any point in a packet, and the packet
should be processed as if the comment is not there.
<b
>The content of comments must match <i><code>Text</code></i
>. This notably includes no whitespace.</b
>
</p>
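<p>
For anyone writing their own implementation, a strict payload parser might look roughly like the sketch below. It
is based only on the prose rules on this page (ampersand-delimited <code>key=value</code> pairs, a value of
<code>?</code> being allowed, optional spaces and tabs, and <code>#</code> comments), not on the railroad diagram or
on <code>libpcp</code> itself, and <code>parse_payload</code> is a made-up name.
</p>

```python
import re

TEXT = r"[a-zA-Z0-9._:@%/\\{}-]+"
COMMENT = re.compile("#" + TEXT + "#")
# One key=value pair, with optional spaces/tabs around each token.
PAIR = re.compile(r"[ \t]*(" + TEXT + r")[ \t]*=[ \t]*(" + TEXT + r"|\?)[ \t]*")


def parse_payload(packet: str) -> dict:
    """Parse one payload packet into a dict, raising ValueError if invalid."""
    if packet.endswith("\r\n"):
        packet = packet[:-2]
    # Process the packet as if comments were not there.
    packet = COMMENT.sub("", packet)
    pairs = {}
    # Empty chunks (leading, trailing or doubled ampersands) fail to match.
    for chunk in packet.split("&"):
        match = PAIR.fullmatch(chunk)
        if match is None:
            raise ValueError(f"invalid key-value pair: {chunk!r}")
        pairs[match.group(1)] = match.group(2)
    return pairs


simple = parse_payload("key1=value1&key2=?\r\n")
spaced = parse_payload(" a = b #note# &\tc=d\r\n")
```

<p>
This parser is deliberately on the conservative side. An implementation that wants to be liberal in what it receives
may prefer to tolerate more than this.
</p>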
<h2>Format restrictions</h2>
<ul>
@ -65,24 +88,31 @@
<details>
<summary>libpcp bugs</summary>
<p>When parsing requests, libpcp null-terminates <code>?</code> values with an off-by-one error. This means if you
<p>
When parsing requests, libpcp null-terminates <code>?</code> values with an off-by-one error. This means if you
transmit <code>test=12345</code> followed by <code>test=?</code>, it will be parsed as if you had transmitted
<code>test=?2</code>. This could actually be an issue with how I am inspecting the internal state; I will
update/remove this spoiler once I've had a chance to dig deeper.
</p>
<p>libpcp does not enforce ampersand placement, causing strange memory artifacts. Consumers <b>MUST</b> conform to
the ampersand specification.</p>
<p>libpcp allows empty keys and values. The case of an empty key causes the pair to be keyed with an empty string,
<p>
libpcp does not enforce ampersand placement, causing strange memory artifacts. Consumers <b>MUST</b> conform to
the ampersand specification.
</p>
<p>
libpcp allows empty keys and values. The case of an empty key causes the pair to be keyed with an empty string,
however an empty value causes it to contain a random value, reading from memory where the previous packet was
decoded. In fact, the presence of a <code>=</code> is not validated either, likewise causing it to read
undefined regions of memory. Consumers <b>MUST</b> always provide both the key and the value.</p>
undefined regions of memory. Consumers <b>MUST</b> always provide both the key and the value.
</p>
</details>
<details>
<summary>Using libpcp</summary>
<p>I have reproduced a locally functioning standalone distribution of libpcp, warts and all. Eventually I will
<p>
I have reproduced a locally functioning standalone distribution of libpcp, warts and all. Eventually I will
produce some basic docs for making use of the exported functions, and will hopefully be able to provide a
download for a precompiled library. I'm still unsure if the source code will ever be made available however
because it's a <em>very</em> true to the original reproduction, potentially problematically so!</p>
because it's a <em>very</em> true to the original reproduction, potentially problematically so!
</p>
<p>If rather than implementing your own pcp you wish to use <em>the</em> libpcp, watch this space.</p>
</details>