This change concerns only Chibi. The portable implementation is
still kept around because it is... well... portable and can be
used by other Scheme implementations.
We can't use 'integer->hex-string' alone to print SHA-224/256
digests because it rightly converts #x00001234 into "1234", while
we need to keep the leading zero nibbles and get "00001234".
'hex' was renamed to 'hex32' because SHA-512 will need a separate
'hex64' that returns 16-character strings.
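The padding requirement can be sketched in C. This is only an
illustration, not code from this change: hex32_sketch and its buffer
convention are hypothetical stand-ins for the Scheme 'hex32' helper.

```c
/* Illustrative only: hex32_sketch is a hypothetical C analogue of the
   Scheme 'hex32' helper, not code from this change. */

/* Write exactly 8 lowercase hex digits for the low 32 bits of 'word'
   into 'out', keeping leading zero nibbles, then NUL-terminate.
   'out' must have room for 9 chars. */
static void hex32_sketch(unsigned long word, char *out) {
  static const char digits[] = "0123456789abcdef";
  int i;
  /* Fill from the last digit backwards, one nibble at a time. */
  for (i = 7; i >= 0; i--) {
    out[i] = digits[word & 0xFUL];
    word >>= 4;
  }
  out[8] = '\0';
}
```

A 'hex64' counterpart would follow the same pattern with 16 digits.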
* Boundary cases
Both SHA-224 and SHA-256 use 512-bit data chunks and behave specially
when the size of the final chunk is near the 448-bit boundary.
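Why 448 bits matters can be seen from the standard SHA-224/256 padding
rule: the message is followed by one 0x80 byte, zero fill, and a 64-bit
length field. A minimal sketch follows; the function name is
hypothetical and not part of this change.

```c
#include <stddef.h>

/* Number of 64-byte (512-bit) chunks processed for a message of
   'len' bytes: the data itself, one 0x80 byte, zero fill, and an
   8-byte length field. Assumes len + 72 does not overflow size_t. */
static size_t sha256_chunk_count_sketch(size_t len) {
  return (len + 1 + 8 + 63) / 64;
}
```

A 55-byte message still fits into one chunk, while a 56-byte (448-bit)
message spills into a second one, which is why tests cluster around
that length.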
* Source type support
Basic smoke tests for accepting bytevectors and binary input ports
as valid arguments.
The original Scheme implementation is astonishingly slow. Rewriting
SHA-2 in C yields roughly a 10000x speedup for premade strings and
bytevectors. For input ports the gain drops to about 100x.
The implementation is divided into two parts: a native computational
backend and a thin Scheme interface. It is tedious to properly do IO
from C, so the Scheme code handles reading data from an input port
and the C code performs the actual computations on byte buffers (which
are also used to handle strings and bytevectors directly).
The Scheme wrapper reads data in chunks with 'read-bytevector'.
Currently, the chunk size has no significant impact on performance
once it is larger than 64. Also, plain 'read-bytevector' turned out
to be 33% faster than preallocating a buffer and filling it with
'read-bytevector!'.
One tricky part is how to get exact 32-bit integers in C89. We have
no <inttypes.h> there, so instead we use <limits.h> to see whether
we have a standard type with suitable boundaries.
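The <limits.h> approach can be sketched as follows. The type name is
hypothetical and the actual code in this change may order or name the
checks differently.

```c
#include <limits.h>

/* Pick an unsigned type whose maximum is exactly 2^32 - 1 using only
   C89 facilities. The name sha_uint32_sketch is hypothetical. */
#if UINT_MAX == 0xFFFFFFFFUL
typedef unsigned int sha_uint32_sketch;
#elif ULONG_MAX == 0xFFFFFFFFUL
typedef unsigned long sha_uint32_sketch;
#elif USHRT_MAX == 0xFFFFFFFFUL
typedef unsigned short sha_uint32_sketch;
#else
#error "no exact 32-bit unsigned integer type found"
#endif
```

With such a type, the modulo-2^32 additions used by SHA-224/256 wrap
naturally without explicit masking.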
The other one is how to return a properly tagged sha_context from C.
The Chibi FFI currently cannot handle the case when a procedure returns
either a C pointer (which needs to be boxed) or an exception (which
should be left as is). To work around this, sexp_start_sha() receives
a dummy argument of type sha_context; this makes the Chibi FFI put a
proper type tag into 'self', which is then extracted in the C code.
This commit adds a new shared library 'crypto$(SO)' intended to hold
all native code of the (chibi crypto) libraries. Any future native
implementation of SHA-512 or MD5 can then simply go into its own file,
such as md5.c, and be included into crypto.stub.
By convention, a library meant for testing exports "run-tests".
Also by convention, assume the test for (foo bar) is (foo bar-test),
keeping the test in the same directory and avoiding confusion since
(chibi test) is not a test for (chibi).
- Avoids the hack of "load"ing test, with resulting namespace complications.
- Allows keeping tests together with the libraries.
- Allows setting up test hooks before running.
- Allows implicit inference of test locations when using above conventions.
Previous code was losing the carry bit in the 'all ones' case, when
adata[i] = bdata[i] = SEXP_UINT_T_MAX and carry = 1 too. In this case
the expression (SEXP_UINT_T_MAX - bdata[i] - carry) overflows and yields
an incorrect value, SEXP_UINT_T_MAX, which results in carry being
incorrectly set to 0 after the addition.
We need to avoid the second overflow when calculating the new value of
the carry bit. One way to do this is to first check for the overflow in
(adata[i] + bdata[i]), and then throw in the (previous) carry bit.
I have also given "n" a more expressive name and added a comment about
the reason why we need that temporary variable.
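The two-step check described above can be sketched like this. It is an
illustration under assumptions, not the patched function: limb_t stands
in for sexp_uint_t, and the function and variable names are
hypothetical.

```c
#include <stddef.h>

typedef unsigned long limb_t;  /* hypothetical stand-in for sexp_uint_t */

/* Add b into a, limb by limb (least significant limb first).
   Returns the final carry, 0 or 1. */
static limb_t add_limbs_sketch(limb_t *a, const limb_t *b, size_t len) {
  limb_t carry = 0;
  size_t i;
  for (i = 0; i < len; i++) {
    /* Keep the sum in a temporary: a[i] is overwritten only after
       both overflow checks are done. */
    limb_t sum = a[i] + b[i];
    limb_t overflow = (sum < b[i]);  /* did a[i] + b[i] wrap? */
    sum += carry;
    overflow |= (sum < carry);       /* did adding the carry wrap? */
    a[i] = sum;
    carry = overflow;
  }
  return carry;
}
```

In the 'all ones' case the first check already reports the overflow, so
adding the previous carry afterwards cannot hide it.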
Naturally, fixed-width integer arithmetic can overflow. Chibi handles
it pretty well in general, but one case was missing: negation of the
most negative number that can be represented as a fixnum. That is,
sexp_fx_neg() must not be applied to sexp_make_fixnum(SEXP_MIN_FIXNUM)
because it overflows and returns an identical fixnum back.
sexp_fx_neg() itself seems to be used correctly in the current code, but
sexp_fx_abs()--which is defined in terms of sexp_fx_neg()--could be
applied to the forbidden number when used to retrieve an unboxed value
via the sexp_unbox_fixnum(sexp_fx_abs(x)) pattern. So I have added a
separate macro that safely calculates the unboxed absolute value of a
fixnum, and replaced sexp_unbox_fixnum(sexp_fx_abs(x)) usages with it.
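The idea of taking the absolute value after unboxing can be sketched
with a simplified model. Everything here is hypothetical: FIX_MIN,
FIX_MAX and the helper names are not Chibi's macros, and the model
assumes a two's complement long with the payload range described in
this message.

```c
#include <limits.h>

/* Hypothetical model, not Chibi's actual macros: fixnum payloads are
   narrower than the machine word and span FIX_MIN..FIX_MAX. */
#define FIX_MIN (LONG_MIN / 4)
#define FIX_MAX (LONG_MAX / 4)

/* Nonzero if n can be represented as a fixnum payload. */
static int fits_fixnum_sketch(long n) {
  return n >= FIX_MIN && n <= FIX_MAX;
}

/* Absolute value of an already unboxed payload. The result of
   negating FIX_MIN is not a fixnum, but it still fits in a long,
   so doing the negation on the unboxed value is safe. */
static long unboxed_abs_sketch(long payload) {
  return payload < 0 ? -payload : payload;
}
```

In this model -FIX_MIN equals FIX_MAX + 1, which is exactly the value
that cannot be boxed again but is fine as a plain machine integer.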
The current implementation uses a two-bit tag for fixnums, plus we need
one bit for the sign, so fixnums have (machine word - 3) significant
bits. Regression tests cover word sizes of 16, 32, 64, and 128 bits (for
the sake of past- and future-proofness).
sexp_bignum_expt() does not have a regression test because we need to
check it with negative exponents like -2^29, so the base must be at
least 2^(2^29) for the differences to be visible. Fun fact: the bignum
representation of such a number takes around 1/32 of the available
user-space memory, which makes testing on anything except 32-bit
systems unreasonable (4 TB of RAM, anyone?).