This change provides an optimized hand-written strlen function for
SuperH targets. The original plan was to declare the C-based naive
version weak and just let the linker figure out the proper one to use,
but unfortunately static libraries don't work like that; ld
intentionally stops at the first version even if it's weak. Instead,
#ifdef guards in the C-based strlen keep it from being compiled when it
is not needed.
The optimized strlen uses 4-byte accesses and cmp/str.
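For illustration, the word-at-a-time idea can be sketched in portable C.
This is not the hand-written assembly from this change: the function
names are hypothetical, and the bit trick in has_zero_byte() only stands
in for what cmp/str against a zero register tests in one instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in for "cmp/str" against zero: nonzero if any of the
   four bytes of w is 0. */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Hypothetical name; a sketch of the technique, not the real routine. */
size_t demo_strlen_words(const char *s)
{
    const char *p = s;

    /* Walk byte by byte until p is 4-byte aligned. */
    while (((uintptr_t)p & 3) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Scan one aligned 4-byte word at a time. An aligned word never
       crosses a page boundary, but it may include up to 3 bytes past the
       terminator, which is outside strict ISO C guarantees. Assembly does
       not have that concern. */
    for (;;) {
        uint32_t w;
        memcpy(&w, p, sizeof w);
        if (has_zero_byte(w))
            break;
        p += 4;
    }

    /* Locate the exact terminator inside the final word. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

The byte-wise prologue is what makes the 4-byte accesses safe with
respect to alignment; the epilogue avoids any dependence on endianness.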
On sh-generic targets, the headers <bits/cpucap.h> (in C) and
<bits/asm/cpucap.h> (in assembler) provide definitions to access the
__cpucap symbol, which provides information on the CPU.
Currently, a single capability __CPUCAP_SH4ALDSP is defined; it
represents the SH4 extended instructions together with the integrated
DSP instructions. The main uses of this capability are [movua.l]
(unaligned reads) and [ldrc] (built-in tight loops).
Capabilities are initialized to 0 (their safest default) and the runtime
can enable them based on what hardware is running.
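A minimal sketch of how a capability bit might be tested and enabled.
Everything named demo_* or DEMO_* is a stand-in: the real type of
__cpucap and the real value of __CPUCAP_SH4ALDSP are not stated here, so
an int and a 0x01 bit are assumed purely for the example.

```c
/* Stand-ins for the real definitions; the type and the bit value are
   assumptions made for this sketch. */
#define DEMO_CPUCAP_SH4ALDSP 0x01

/* Starts at 0, the safest default: no extended instructions assumed. */
static int demo_cpucap = 0;

enum demo_variant { DEMO_VARIANT_GENERIC, DEMO_VARIANT_SH4ALDSP };

/* What a runtime could do once it has identified the hardware. */
static void demo_runtime_init(int hardware_is_sh4aldsp)
{
    if (hardware_is_sh4aldsp)
        demo_cpucap |= DEMO_CPUCAP_SH4ALDSP;
}

/* Pick a code path based on the capability bit. */
static enum demo_variant demo_select_variant(void)
{
    return (demo_cpucap & DEMO_CPUCAP_SH4ALDSP)
        ? DEMO_VARIANT_SH4ALDSP
        : DEMO_VARIANT_GENERIC;
}
```

Because the default is 0, code that checks the bit before the runtime
has run simply takes the generic path.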
The method is rather naive: the digits are read as an integer, which is
then multiplied by a power of 10 or 2. This does not always give exact
results, but it's close enough for now. Stub support for long double
larger than 64 bits is provided.
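The decimal case of this approach can be sketched as follows. The
function name is hypothetical and the code is a simplified illustration
under stated assumptions, not the implementation from this change: it
handles only an optional sign, digits, one decimal point and an optional
exponent, and ignores the power-of-2 (hexadecimal) case.

```c
#include <ctype.h>
#include <stdint.h>

/* Hypothetical name; sketch of "digits as integer, then scale". */
double demo_parse_decimal(const char *s)
{
    int negative = 0;
    if (*s == '+' || *s == '-')
        negative = (*s++ == '-');

    uint64_t digits = 0;   /* all significant digits, as one integer */
    int exp10 = 0;         /* power of 10 to apply afterwards */
    int seen_point = 0;

    for (;; s++) {
        if (*s == '.' && !seen_point) {
            seen_point = 1;
            continue;
        }
        if (!isdigit((unsigned char)*s))
            break;
        if (digits <= (UINT64_MAX - 9) / 10) {
            digits = digits * 10 + (uint64_t)(*s - '0');
            if (seen_point)
                exp10--;
        } else if (!seen_point) {
            exp10++;       /* digit dropped: one source of inexactness */
        }
    }

    if (*s == 'e' || *s == 'E') {
        int esign = 1, e = 0;
        s++;
        if (*s == '+' || *s == '-')
            esign = (*s++ == '-') ? -1 : 1;
        for (; isdigit((unsigned char)*s); s++)
            if (e < 10000)
                e = e * 10 + (*s - '0');
        exp10 += esign * e;
    }

    if (digits == 0)
        return negative ? -0.0 : 0.0;

    /* Build the power of 10 by repeated multiplication; each step may
       round, which is why the result is not always exact. */
    double scale = 1.0;
    for (int i = (exp10 < 0) ? -exp10 : exp10; i > 0; i--)
        scale *= 10.0;

    double value = (double)digits;
    value = (exp10 < 0) ? value / scale : value * scale;
    return negative ? -value : value;
}
```

For example, "12.5e1" is read as the integer 125 with a net exponent of
0, and "-2.5" as 25 with an exponent of -1.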
The presumed bug, where the value computed without the sign overflows
even though the negative result can be represented, is not actually a
problem: this can only happen with signed results, and the temporary
value is computed as unsigned (and thus has extra range).
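The reasoning can be checked with a small base-10 sketch. The function
name is hypothetical and this is not the code from this change; it only
shows that the magnitude of LONG_MIN, which is LONG_MAX + 1, fits in an
unsigned long temporary. A two's complement long is assumed.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical name; base 10 only, optional sign, no whitespace. */
long demo_strtol10(const char *s)
{
    int negative = 0;
    if (*s == '+' || *s == '-')
        negative = (*s++ == '-');

    /* Largest magnitude the signed result can hold. For negative input
       this is LONG_MAX + 1, which still fits in unsigned long. */
    unsigned long limit = (unsigned long)LONG_MAX + (negative ? 1ul : 0ul);
    unsigned long magnitude = 0;

    for (; *s >= '0' && *s <= '9'; s++) {
        unsigned long d = (unsigned long)(*s - '0');
        /* True exactly when magnitude * 10 + d would exceed limit. */
        if (magnitude > (limit - d) / 10) {
            errno = ERANGE;
            return negative ? LONG_MIN : LONG_MAX;
        }
        magnitude = magnitude * 10 + d;
    }

    if (!negative)
        return (long)magnitude;
    if (magnitude == limit)
        return LONG_MIN;   /* avoids negating a value that does not fit */
    return -(long)magnitude;
}
```

With a 32-bit long, "-2147483648" accumulates 2147483648 in the unsigned
temporary without overflow and is returned as LONG_MIN, with errno left
untouched.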
This is enough to support the standard, and likely the C++ library and
external programs that need porting, but it is also the most we can do
without proper locale data storage and more target-specific developments
that aren't a priority right now.