$Id: ed2k_crc,v 1.1.1.1 2003/05/17 06:11:53 ericprev Exp $

e-donkey 2000 CRC support
-------------------------

Since DCTC 0.85.0, global e-donkey 2000 (ed2k) CRCs are supported. The global
CRC allows detection of corrupted files.

I. How it works
---------------

The CRC is computed on the whole file. The CRC is a sequence of 16 bytes
(currently displayed as 32 hex digits). To see an example of such a value, try
"md5sum /bin/login" in a shell. On my computer, I obtain this:

cd2824ef4794520bcd1f422e4b9a8d17  /bin/login
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ yes I know, it is not a CRC here, it is a hash
                                 also known as a digest.

Modifying the file (even 1 bit), adding bytes (even null bytes) or removing
bytes will produce a different value. Technically, the value is not unique but
the probability of 2 files sharing the same CRC is nearly 0. The CRC acts a bit
like a fingerprint (e-donkey 2000 identifies files by it instead of by
filename).
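
The fingerprint behaviour can be tried without touching a real file. The
following is a minimal sketch using MD5 (as in the md5sum example above); the
helper name file_digest_hex and the sample data are made up for illustration.

```python
import hashlib


def file_digest_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hex digits (md5sum's format)."""
    return hashlib.md5(data).hexdigest()


# Hypothetical "file" content and two slightly modified copies of it.
original = b"hello world\n"
flipped = bytes([original[0] ^ 0x01]) + original[1:]  # exactly 1 bit changed
padded = original + b"\x00"                           # 1 null byte appended

for label, data in (
    ("original", original),
    ("1 bit flipped", flipped),
    ("null byte added", padded),
):
    print(f"{file_digest_hex(data)}  {label}")
```

The three printed values should all differ, even though the inputs are almost
identical.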

  I.1 extended CRC
  ----------------

md5sum uses a 1-level CRC, i.e. the CRC is computed on the file itself. ed2k
uses a 2-level CRC. The file is split into slices of a given size
(approximately 9.25MB), a CRC is computed on each slice and then a CRC is
computed over all the slice CRCs.


                             global CRC        (<= this value is the ed2k CRC)
                            /          \
                           /            \
                          /              \
                    partial CRC 1    partial CRC 2
                         |                |
                         |                |
                         |                |
                     part 1 of        part 2 of
                     the file         the file

With such a computation, a matching global CRC means the file is valid, and
when the global CRC does not match, the partial CRCs tell you which part of the
file is corrupted.
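
The 2-level computation can be sketched as follows. This is an illustration of
the scheme described here, not DCTC's actual code. Assumptions: the function
names are invented; SLICE_SIZE is the value commonly cited for ed2k (this
document only says "approximately 9.25MB"); the hash constructor is a
parameter defaulting to MD5 as in the md5sum example, because MD4 (which the
"$MD4..." command names suggest) is not always available through Python's
hashlib. Edge cases handled by real ed2k clients are not covered.

```python
import hashlib
from typing import Callable, List

# Assumption: slice size commonly cited for ed2k, not taken from this document.
SLICE_SIZE = 9_728_000


def l0_digests(
    data: bytes,
    slice_size: int = SLICE_SIZE,
    new_hash: Callable = hashlib.md5,  # stand-in; swap in an MD4 constructor
) -> List[bytes]:
    """Return one 16-byte digest per slice of data (the partial CRCs)."""
    return [
        new_hash(data[start:start + slice_size]).digest()
        for start in range(0, len(data), slice_size)
    ]


def global_digest(partials: List[bytes], new_hash: Callable = hashlib.md5) -> bytes:
    """Return the digest of the concatenated partial digests (the global CRC)."""
    return new_hash(b"".join(partials)).digest()


# Tiny demonstration with an artificially small slice size.
sample = bytes(range(256)) * 10          # 2560 bytes of made-up content
partials = l0_digests(sample, slice_size=1000)
print(len(partials), "partial digests")
print("global:", global_digest(partials).hex())
```

Passing the hash constructor as a parameter keeps the slicing logic independent
of which 16-byte hash is actually used.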

In the following description, the partial CRC will be named L0 CRC and the 
global CRC will be named CRC :)


II. detecting and correcting 
----------------------------

The ed2k CRC allows detection of corrupted files but it is not enough.
Detecting corrupted files is a good idea but correcting them is a better one.
Unfortunately, without anything else, there is no way to correct a file because
only the (global) CRC is available, not the L0 CRCs.

Thanks to a protocol extension (see below), DCTC and dchub can exchange L0 CRCs
of files.

Once the L0 CRCs have been received, the client computes its own L0 CRC for
each part of the file it wants to check and compares it with the corresponding
received L0 CRC. A mismatch indicates an invalid part (= a part to download
again).
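
The comparison step can be sketched like this. It is only an illustration: the
function names are invented, MD5 stands in for the real hash (passed as a
parameter), and the slice size is left to the caller.

```python
import hashlib
from typing import Callable, List, Sequence


def l0_digests(data: bytes, slice_size: int, new_hash: Callable = hashlib.md5) -> List[bytes]:
    """Return one digest per slice of data (the local L0 CRCs)."""
    return [
        new_hash(data[start:start + slice_size]).digest()
        for start in range(0, len(data), slice_size)
    ]


def corrupted_parts(
    local_data: bytes,
    received_l0: Sequence[bytes],
    slice_size: int,
    new_hash: Callable = hashlib.md5,
) -> List[int]:
    """Return the 0-based indexes of the parts that must be downloaded again."""
    local_l0 = l0_digests(local_data, slice_size, new_hash)
    # Parts present on both sides: compare digest by digest.
    bad = [
        index
        for index, (mine, theirs) in enumerate(zip(local_l0, received_l0))
        if mine != theirs
    ]
    # Parts present on one side only (truncated or oversized local file).
    common = min(len(local_l0), len(received_l0))
    bad.extend(range(common, max(len(local_l0), len(received_l0))))
    return bad


# Made-up example: the reference file and a copy damaged in its second part.
reference = bytes(range(256)) * 12       # 3072 bytes
damaged = bytearray(reference)
damaged[1500] ^= 0xFF
print(corrupted_parts(bytes(damaged), l0_digests(reference, 1000), 1000))
```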

Note: because the L0 CRCs are computed on parts of the file having a defined
      size (~9.25MB), a file may have only 1 partial CRC. In that case, the
      erroneous part can be identified even if the L0 CRC is not available,
      because an incorrect L0 CRC always produces an incorrect global CRC and
      there is only one part to blame.

III. protocol extension
-----------------------

DCTC and dchub provide 2 commands named "$MD4Get0" and "$MD4Set" to give
access to the L0 CRCs (see the Documentation/protocol_extension file in dchub
for a description). Using them, a client can obtain the L0 CRCs of a file from
another client having the file. The querying client must check the validity of
the obtained L0 CRCs by computing a CRC on them: it should obtain the (global)
CRC. dchub (v0.4.0 and above) also acts as an L0 CRC repository. Thus, even if
no connected client has the L0 CRCs of a file, the hub may have them if, at
some point, someone connected to it asked for the L0 CRCs and someone replied.
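
The validity check on received L0 CRCs can be sketched as follows. The wire
format of the commands is not reproduced here (see the dchub documentation);
the sketch assumes the L0 CRCs have already been decoded into a list of byte
strings. The function name is invented and MD5 stands in for the real hash.

```python
import hashlib
from typing import Callable, Sequence


def l0_set_is_valid(
    received_l0: Sequence[bytes],
    expected_global: bytes,
    new_hash: Callable = hashlib.md5,  # stand-in; swap in an MD4 constructor
) -> bool:
    """Return True if hashing the received L0 CRCs yields the known global CRC."""
    digest_size = new_hash().digest_size
    # Reject malformed entries before hashing anything.
    if any(len(crc) != digest_size for crc in received_l0):
        return False
    return new_hash(b"".join(received_l0)).digest() == expected_global


# Made-up example: 3 partial digests and the global digest derived from them.
partials = [hashlib.md5(chunk).digest() for chunk in (b"part 1", b"part 2", b"part 3")]
known_global = hashlib.md5(b"".join(partials)).digest()

print(l0_set_is_valid(partials, known_global))
tampered = [partials[0], partials[2], partials[1]]   # same digests, wrong order
print(l0_set_is_valid(tampered, known_global))
```

A set of L0 CRCs that fails this check should be discarded instead of being
used to decide which parts to download again.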

These 2 commands are available if the hub/client has a capability named "MD4x".
This capability is set on dchub v0.4.0 and above and on DCTC v0.85.0 and above.


