(*) O_APPEND is *not* functional yet; this is just a hack for write-only
streams. Currently O_APPEND does not reposition the cursor at the end of
the file before every write.
There is no evidence that BFile_Create() keeps the address and writes
to it later on, but I'd rather be overly careful than have to debug a
stack corruption problem in half a year :)
I'm pretty sure it makes no difference because the OS does not rely on
interrupts for most (if not all) of its DMA operations, but it's better
to keep it clean anyway.
This helped locate some bugs:
* read() could read past EOF due to BFile_Read() allowing you to read up
until the end of the last sector, beyond the file size
* pread() did not restore the file offset because the negative seek at
the end is not relative (that was the CASIOWIN fs API), so pread()
could not actually be written without knowing the current position
* lseek() would clamp you to EOF but still return its out-of-bounds
arguments, as a direct result of BFile_Seek() doing that
Benefits:
* Made pread() a generic function
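The read() and lseek() fixes above both come down to clamping against
the file size. A minimal sketch of that clamping, using hypothetical
helper names that belong neither to gint nor to the BFile API:

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes read() may return when the
   underlying API would read up to the end of the last sector. */
size_t clamp_read_size(size_t file_size, size_t pos, size_t requested)
{
    if(pos >= file_size)
        return 0;
    size_t remaining = file_size - pos;
    return (requested < remaining) ? requested : remaining;
}

/* Hypothetical helper: offset lseek() should report when the
   underlying API silently clamps the cursor to EOF. */
long clamp_seek_offset(long file_size, long target)
{
    return (target > file_size) ? file_size : target;
}
```

For a 100-byte file, a 32-byte read at position 90 yields 10 bytes, and
a seek to offset 512 reports 100 instead of 512.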
I saw a crash with the 12 kB stack. Added an error message to diagnose
further similar issues, and bumped the stack to 14 kB. That's a lot of
space just for BFile but stability is queen... :x
* Mark SPU memory as sleep-blocking.
* Perform 4-byte accesses only in dma_memset() and dma_memcpy() (32-byte
accesses freeze as one would expect).
This change does *NOT* implement support for SPU's integrated DMAC.
The checks for VRAM access account for image columns intersecting the
longword before the start of a VRAM line, but not the longword after the
start of a VRAM line. This is now fixed.
Nothing particular to change, simply make sure that the DMA channels
have higher priority than the USB module, otherwise the BEMP interrupt
might be executed before the DMA frees the channel, resulting in the
transfer failing because the channel is still busy.
Also reduce BUSWAIT since it works even on high overclock levels, and
keeping it high won't help increase performance.
This change fixes the way gint uses the FIFO controllers D0F and D1F
to access the FIFO. It previously used D0F in the main thread and D1F
during interrupt handling, but this is incorrect for several reasons,
mainly the possible change of controllers between a write and a commit,
and numerous instances of two FIFOs managing the same pipe caused by
the constant switching.
gint now treats FIFO controllers as resources allocated to pipes for
the duration of a commit-terminated sequence of writes. The same
controller is used for a single pipe in both normal and interrupt
modes, and released when the pipe is committed. If no controller is
available, asynchronous writes fail and synchronous ones wait.
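The allocation rule described above can be sketched as follows. This
is a host-side illustration with hypothetical names (fifo_acquire,
fifo_release, fifo_owner), not the driver's actual code:

```c
enum { FIFO_NONE = -1, FIFO_D0F = 0, FIFO_D1F = 1, FIFO_COUNT = 2 };

/* Pipe currently bound to each controller, or -1 if it is free */
static int fifo_owner[FIFO_COUNT] = { -1, -1 };

/* Return the controller bound to `pipe`, binding a free one if the
   pipe has none. Returns FIFO_NONE if all controllers are busy. */
int fifo_acquire(int pipe)
{
    int free_ct = FIFO_NONE;
    for(int ct = 0; ct < FIFO_COUNT; ct++) {
        if(fifo_owner[ct] == pipe)
            return ct;
        if(fifo_owner[ct] < 0 && free_ct == FIFO_NONE)
            free_ct = ct;
    }
    if(free_ct != FIFO_NONE)
        fifo_owner[free_ct] = pipe;
    return free_ct;
}

/* Release the controller bound to `pipe`, once it is committed */
void fifo_release(int pipe)
{
    for(int ct = 0; ct < FIFO_COUNT; ct++)
        if(fifo_owner[ct] == pipe)
            fifo_owner[ct] = -1;
}
```

With this rule a pipe gets the same controller in normal and interrupt
modes, since the lookup by owner happens before any new binding. An
asynchronous write would fail on FIFO_NONE while a synchronous one
would retry.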
The fxlink API is also added with a small number of functions, namely
to transfer screenshots and raw text. Currently these are synchronous
and do not use the DMA; this will be improved later.
Finally:
* Removed pipe logic from src/usb/setup.c, instead letting pipes.c
handle the special case of the DCP (which might be regularized later)
* Removed the usb_pipe_mode_{read,write} functions as they're actually
about FIFO controllers and it's not clear yet how a pipe with both
read and write should be handled. This is left for the future.
* Clarified end-of-sequence semantics after a successful commit.
The function was designed with multi-threaded concurrency in mind,
where threads can take over while the lock is held and simply block
trying to acquire it, which allows the lock holder to proceed.
However interrupt handlers are different; they have priority, so once
they start they must complete immediately. They cannot afford to block
on the lock as the program would simply freeze. In exchange, they clean
up before they leave, so there are some guarantees on the execution
state even when interrupted.
The correct protection is therefore not a lock but a temporary block on
interrupts. There is no data race on the value of the saved IMASK
because it is preserved during interrupt handling.
This change introduces new sleep_block() and sleep_unblock() functions
that control whether the sleep() function actually sleeps. This type of
behavior was already implemented in the DMA driver, since DMA access to
on-chip memory is paused when sleeping (on-chip memory being paused
itself), which would make waiting for a DMA transfer a freeze.
Because DMA transfers are now asynchronous, and USB transfers that may
involve on-chip memory are coming, this API change allows the DMA and
USB drivers to block the sleep() function so that user code can sleep()
for interrupts without having to worry about asynchronous tasks
requiring on-chip memory to complete.
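One way to picture the mechanism is a counter of pending tasks. The
sketch below assumes a counter-based implementation and uses demo_
names to make clear that it is an illustration, not gint's code; the
sleep itself is replaced with a counter so the logic runs on a host:

```c
/* Number of pending tasks that need on-chip memory to stay active */
static int sleep_block_counter = 0;
/* Stand-in for the CPU sleep instruction, counted for demonstration */
static int times_slept = 0;

void demo_sleep_block(void)
{
    sleep_block_counter++;
}

void demo_sleep_unblock(void)
{
    if(sleep_block_counter > 0)
        sleep_block_counter--;
}

/* Sleep only if no asynchronous task currently forbids it */
void demo_sleep(void)
{
    if(sleep_block_counter > 0)
        return; /* return at once; the caller keeps polling */
    times_slept++;
}
```

A driver would call the block function when it starts a transfer and
the unblock function from its completion handler.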
This change introduces the global "feature function" that can be
enabled in getkey() to receive events, and use them for
application-wide features. This would be useful, for instance, to
toggle screen backlight with a different key combination than the
default, to capture screenshots, or to implement a catalog.
When enabled, the feature function is presented with all new events and
can perform actions, then decide whether or not to return them from
getkey().
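As an illustration, a feature function can be modeled as a callback
that reports whether it consumed the event. The types, names and the
"true means consumed" convention below are assumptions made for this
sketch and may differ from the real getkey() interface:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a key event */
typedef struct { int key; bool shift; } demo_event_t;

/* A feature function returns true if it consumed the event */
typedef bool (*demo_feature_t)(demo_event_t ev);

static demo_feature_t feature = NULL;
static bool backlight = false;

void demo_set_feature(demo_feature_t f)
{
    feature = f;
}

/* Example feature: SHIFT + key 7 toggles the backlight */
bool demo_backlight_feature(demo_event_t ev)
{
    if(!(ev.shift && ev.key == 7))
        return false;
    backlight = !backlight;
    return true;
}

/* Returns true if the event should be returned by getkey() */
bool demo_filter_event(demo_event_t ev)
{
    if(feature && feature(ev))
        return false;
    return true;
}
```

Events the feature function does not recognize pass through unchanged,
so existing programs behave the same when no feature is installed.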
Bounds would be moved before drawing the border, therefore displacing
the border. Since drect() already performs all the necessary checks,
this change doesn't try to save a couple of function calls and drops the
redundant checks.
* Properly define the callback time of a write/commit as the time when
the pipe is available again for further writing.
* Refuse commits when writes are pending; instead, enforce a strict
order of finishing writes before committing, which makes sense since
consecutive writes are ordered this way already.
* Properly support callbacks for writes and for commits.
* Define the synchronous APIs in terms of waiting until the callbacks
for equivalent asynchronous functions are invoked (plus initial
waiting for pipes to be ready).
This change adds asynchronous capabilities to the DMA API. Previously,
transfers would start asynchronously but could only be completed by a
call to dma_transfer_wait(). The API now supports a callback, as well
as the dma_transfer_sync() variant, to be consistent with the upcoming
USB API that has both _sync and _async versions of functions.
The interrupt handler of the DMA was changed to include a return to
userland, which is required to perform the callback.
* dma_transfer() is now an obsolete synonym for dma_transfer_async()
with no callback.
* dma_transfer_noint() is now a synonym for dma_transfer_atomic(), for
consistency with the upcoming USB API.
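The relationship between the asynchronous and synchronous variants can
be sketched on a host by simulating the completion interrupt. All
demo_ names are hypothetical and the signatures are not those of the
DMA API:

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*demo_callback_t)(void *arg);

/* One pending simulated transfer */
static demo_callback_t pending_cb = NULL;
static void *pending_arg = NULL;

/* Start a transfer and return immediately; `cb` runs on completion */
void demo_transfer_async(demo_callback_t cb, void *arg)
{
    pending_cb = cb;
    pending_arg = arg;
}

/* Stand-in for the completion interrupt handler */
void demo_transfer_complete(void)
{
    demo_callback_t cb = pending_cb;
    pending_cb = NULL;
    if(cb)
        cb(pending_arg);
}

static void set_flag(void *arg)
{
    *(volatile bool *)arg = true;
}

/* Synchronous variant: start asynchronously, wait for the callback */
void demo_transfer_sync(void)
{
    volatile bool done = false;
    demo_transfer_async(set_flag, (void *)&done);
    while(!done)
        demo_transfer_complete(); /* on hardware: wait for interrupt */
}
```

The synchronous function adds no transfer logic of its own, which is
what keeps the two variants consistent.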
* Change gint_inth_callback()
* Add intc_handler_function() to use C functions as handlers instead of
writing assembler, and use it in the RTC and USB
* Revisit the TMU handlers, which after moving out the callbacks, now
fit into 3 gates (great!), and adapt the ETMU handler
* Improve the timer driver (less code = better code, removed magic
constants assuming the VBR layout on SH3/SH4, etc.)
* Remove 2 gates and a gap from the compact scheme on SH3
* Define timer_configure() to replace timer_setup(), which could not be
cleanly updated to support GINT_CALL()
* Replace rtc_start/stop_timer with rtc_periodic_enable/disable, which
is less confusing because of ETMU being "RTC timers"
Changes in the driver and world system:
* Rewrite driver logic to include more advanced concepts. The notion of
binding a driver to a device is introduced to formalize wait(); power
management is now built-in instead of being handled by the drivers
(for instance DMA). The new driver model is described in great detail
in <gint/drivers.h>
* Formalized the concept of "world switch" where the hardware state is
saved and later restored. As a tool, the world switch turns out to be
very stable, and allows a lot of hardware manipulation that would be
edgy at best when running in the OS world.
* Added a GINT_DRV_SHARED flag for drivers to specify that their state
is shared between worlds and not saved/restored. This has a couple of
uses.
* Exposed a lot more of the internal driver/world system as there is no
particular downside to it. This includes stuff in <gint/drivers.h>
and the driver's state structures in <gint/drivers/states.h>. This is
useful for debugging and for cracked concepts, but there is no
API stability guarantee.
* Added a more flexible driver level system that allows any 2-digit
level to be used.
Feature changes:
* Added a CPU driver that provides the VBR change as its state save.
Because the whole context switch relied on interrupts being disabled
anyway, there is no longer an inversion of control when setting the
VBR; this is just part of the CPU driver's configuration. The CPU
driver may also support other features such as XYRAM block transfer
in the future.
* Moved gint_inthandler() to the INTC driver under the name
intc_handler(), pairing up again with intc_priority().
* Added a reentrant atomic lock based on the test-and-set primitive.
Interrupts are disabled with IMASK=15 for the duration of atomic
operations.
* Enabled the DMA driver on SH7305-based fx-9860G. The DMA provides
little benefit on this platform because the RAM is generally faster
and buffers are ultimately small. The DMA is still not available on
SH3-based fx-9860G models.
* Solved an extremely obnoxious bug in timer_spin_wait() where the
timer is not freed, causing the callback to be called when interrupts
are re-enabled. This increments a random value on the stack. As a
consequence of the change, removed the long delays in the USB driver
since they are not actually needed.
Minor changes:
* Deprecated some of the elements in <gint/hardware.h>. There really is
no good way to "enumerate" devices yet.
* Deprecated gint_switch() in favor of a new function
gint_world_switch() which uses the GINT_CALL abstraction.
* Made the fx-9860G VRAM 32-aligned so that it can be used for tests
with the DMA.
Some features of the driver and world systems have not been implemented
yet, but may be in the future:
* Some driver flags should be per-world in order to create multiple
gint worlds. This would be useful in Yatis' hypervisor.
* A GINT_DRV_LAZY flag would be useful for drivers that don't want to
be started up automatically during a world switch. This is relevant
for drivers that have a slow start/stop sequence. However, this is
tricky to do correctly as it requires dynamic start/stop and also
tracking which world the current hardware state belongs to.
* Add the power management functions (mostly stable even under
overclock; requires some testing, but no known issue)
* Add a dynamic configuration system where interfaces can declare
descriptors with arbitrary endpoint numbers and additional
parameters, and the driver allocates USB resources (endpoints, pipes
and FIFO memory) between interfaces at startup. This allows
implementations of different classes to be independent from each
other.
* Add responses to common SETUP requests.
* Add pipe logic that allows programs to write data synchronously or
asynchronously to pipes, in a single or several fragments, regardless
of the buffer size (still WIP with a few details to polish and the
API is not public yet).
* Add a WIP bulk IN interface that allows sending data to the host.
This will eventually support the fxlink protocol.
The question of how to handle a partially-restored world state begs for
an elegant symmetrical answer, but that doesn't work unless both kernels
do the save/restore for themselves. So far, things have worked out
because any order works: interrupts are disabled, so
partially-restored drivers are inactive.
However the USB module requires waits that are best performed with
timers, so the order cannot be chosen arbitrarily. This commit enforces
a gint-centric order where code from a gint driver is only run when all
lower-level drivers are active. This solves some pretty bad freezes with
the USB module.
The new allocator uses a segregated best-fit algorithm with exact-size
lists for all sizes between 8 bytes (the minimum) and 60 bytes, one list
for blocks of size 64-252 and one for larger blocks.
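The mapping from block size to free list can be sketched as below. It
assumes block sizes are multiples of 4, which gives 14 exact-size
lists (8, 12, ..., 60); the names and the numbering of lists are
illustrative, not taken from the allocator:

```c
#include <stddef.h>

/* Lists 0-13: exact sizes 8, 12, ..., 60. List 14: 64-252 bytes.
   List 15: larger blocks. Assumes size is a multiple of 4, >= 8. */
enum { LIST_MEDIUM = 14, LIST_LARGE = 15, LIST_COUNT = 16 };

int size_class(size_t size)
{
    if(size < 64)
        return (int)((size - 8) / 4);
    if(size < 256)
        return LIST_MEDIUM;
    return LIST_LARGE;
}
```

A best-fit search is immediate in the exact-size lists, since any block
found there fits perfectly; only the last two lists need to be scanned
for the smallest suitable block.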
Arenas managed by this allocator have built-in statistics that track
used and free memory (accounting for block headers), peak memory, and
various allocation results.
In addition, the allocator has self-checks in the form of integrity
verifications, that can be enabled with -DGINT_KMALLOC_DEBUG=1 at
configuration time or with the :dev configuration for GiteaPC. This is
used by gintctl.
The kmalloc interface is extended with a new arena covering all unused
memory in user RAM, managed by gint's allocator. It spans about 4 kB on
SH3 fx-9860G, 16 kB on SH4 fx-9860G, and 500 kB on fx-CG 50, in addition
to the OS heap. This new arena is now the default arena for malloc(),
except on SH3 where some heap problems are currently known.
This change introduces a centralized memory allocator in the kernel.
This interface can call into multiple arenas, including the default OS
heap and planned arenas managed by a gint algorithm.
The main advantage of this method is that it allows the heap to be
extended over previously-unused areas of RAM such as the end of the
static RAM region (apart from where the stack resides). Not using the OS
heap is also sometimes a matter of correctness since on some OS versions
the heap is known to fragment badly and degrade over time.
I hope the deep control this interface gives over memory allocation
will allow very particular applications like object-specific allocators
in fragmented SPU memory.
This change does not introduce any new algorithm or arena so programs
should behave exactly as before.
The new keyboard device (keydev) interface implements the kernel's view
of a keyboard providing input events. Its main role is to abstract all
the globals of the KEYSC driver and getkey functions into a separate
object: the "keyboard device".
The device implements event transformations such as modifiers and
repeats, instead of leaving them to getkey. While this can seem
surprising at first, a real keyboard controller is responsible for
repeats and modifier actions depend on the state of the keyboard which
is only tracked in real-time.
In this commit, getkey() has not changed yet apart from indirectly using
the keydev interface with pollevent(). It will be changed soon to use
event transforms in keydev_read(), and will be left in charge of
providing repeat profiles, handling return-to-menu, backlight changes
and timeouts, all of which are user convenience features.
* dnsize() works like dsize() but a limit on the number of bytes is
specified. This is useful to obtain the length of a substring.
* drsize() has a reverse limit; the input specifies a number of pixels
and the function determines how much of the input fits. This is useful
for word wrapping algorithms.
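The two limits are duals of each other, which a fixed-width model
makes easy to see. This sketch assumes every glyph is 5 pixels wide
with 1 pixel of spacing; demo_nsize() and demo_rsize() are
hypothetical and do not share the signatures of dnsize() and drsize():

```c
#include <stddef.h>

/* Hypothetical fixed-width metrics */
enum { GLYPH_WIDTH = 5, CHAR_SPACING = 1 };

/* Width in pixels of the first `n` bytes of `str` (or of the whole
   string if it is shorter than `n`) */
int demo_nsize(char const *str, size_t n)
{
    size_t len = 0;
    while(len < n && str[len])
        len++;
    if(len == 0)
        return 0;
    return (int)len * GLYPH_WIDTH + (int)(len - 1) * CHAR_SPACING;
}

/* Number of leading bytes of `str` that fit within `width` pixels */
size_t demo_rsize(char const *str, int width)
{
    size_t n = 0;
    while(str[n] && demo_nsize(str, n + 1) <= width)
        n++;
    return n;
}
```

A word wrapper would call the reverse function with the line width,
then back up to the last space within the returned prefix.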
This parameter controls the maximum number of glyphs to print.
For backwards compatibility, it is automatically inserted by a macro in
older calls with only 7 parameters.
This function performs a more rigorous analysis of the mapped region by
checking continuity. So far all pages mapped in userspace have been
contiguous, so the results are identical to gint[HWURAM].
Page size is now optionally provided in mmu_translate() and its
subfunctions; programs that use this function need to add a second NULL
parameter.
* Create an `src/3rdparty` folder for third-party code (to add the
Grisu2B algorithm soon).
* Split the formatted printer into gint's kprint (src/kprint), its
extension and interface (include/gint/kprint.h), and its use in the
standard stdio functions (src/std/print.c).
* Slightly improve the interface of kformat_geometry() to avoid relying
on knowing format specifiers.
* Add a function to register more formatters, to allow floating-point
formatters without requiring them.
The repeat delays of getkey() are adjusted automatically, however a
repeat that is currently going on might be affected.
Also, repeat delays are always approximated as a whole number of
keyboard scans so an increase in scan frequency can impact the speed at
which repeats are emitted.
When switching to dynamic TLB the counting of mapped memory was no
longer required at boot time. This was restored weirdly for fx-CG 50 and
not at all for fx-9860G; this is now fixed.
Some very trivial applications might not require its symbols explicitly,
thus the need to force a dependency (otherwise OS interrupts such as the
KEYSC are not disabled and crash the handler very quickly).
This change adds a new TMU function timer_spinwait() which waits for a
timer to raise its UNF flag. This makes it possible to wait even when
interrupts are disabled.
This is used by the new CPG function sleep_us_spin() which waits for a
given delay without using interrupts. This is currently used in SPU
initialization.
* Specify a line height for the default fx-CG 50 font so that the height
returned by dsize() is correctly 9, not 11.
* Adjust vertical and horizontal alignment in dtext_opt() and
dprint_opt() by a full pixel (DTEXT_BOTTOM, DTEXT_RIGHT) and half a
pixel (DTEXT_MIDDLE, DTEXT_CENTER) to make sure that the specified
position is within rendered text (as in DTEXT_LEFT and DTEXT_TOP) and
to improve centering of strings with odd width or odd height, for
which there is only one valid position.
This commit introduces custom character spacing with a new fxconv
parameter "char-spacing". Word spacing is also tied to the width of the
space character (0x20). This removes the need for special semantics on
the space character, but requires that its size be specified with gray
pixels for proportional fonts.
This also fixes problems with the size of spaces in dsize() not being
correlated with their size during rendering, since on fx-9860G topti
already used the glyph's width as word spacing.
Since fxconv changes but gint's Makefile does not track updates to
external tools, a full rebuild of gint is required past this commit.
This commit introduces a large architectural change. Unlike previous
models of the fx-9860G series, the G-III models have a new user RAM
address different from 8801c000. The purpose of this change is to
dynamically load GMAPPED functions to this address by querying the TLB,
and call them through a function pointer whose address is determined
when loading.
Because of the overhead of using a function pointer in both assembly and
C code, changes have been made to avoid GMAPPED functions altogether.
Currently, only cpu_setVBR() and gint_inth_callback() are left, the second
being used specifically to enable TLB misses when needed.
* Add a .gint.mappedrel section for the function pointers holding
addresses to GMAPPED functions; add function pointers for
cpu_setVBR() and gint_inth_callback()
* Move rram to address 0 instead of the hardcoded 0x8801c000
* Load GMAPPED functions at their linked address + the physical address
user RAM is mapped to, and compute their function pointers
* Remove the GMAPPED macro since no user function needs it anymore
* Add section flags "ax" (code) or "aw" (data) to every custom .section
in assembler code, as they default to unpredictable values that can
cause the section to be marked NOLOAD by the linker
* Update the main kernel, TMU, ETMU and RTC interrupt handlers to use
the new indirect calling method
This is made possible by new MMU functions giving direct access to the
physical area behind any virtualized page.
* Add an mmu_translate() function to query the TLB
* Add an mmu_uram() function to access user RAM from P1
The exception catching mechanism has been modified to avoid the use of
GMAPPED functions altogether.
* Set SR.BL=0 and SR.IMASK=15 before calling exception catchers
* Move gint_exc_skip() to normal text ROM
* Also fix registers not being popped off the stack before a panic
The timer drivers have also been modified to avoid GMAPPED functions.
* Invoke timer_stop() through gint_inth_callback() and move it to ROM
* Move and expand the ETMU driver to span 3 blocks at 0xd00 (ETMU4)
* Remove the timer_clear() function by inlining it into the ETMU handler
(TCR is provided within the storage block of each timer)
* Also split src/timer/inth.s into src/timer/inth-{tmu,etmu}.s
Additionally, VBR addresses are now determined at runtime to further
reduce hardcoded memory layout addresses in the linker script.
* Determine fx-9860G VBR addresses dynamically from mmu_uram()
* Determine fx-CG 50 VBR addresses dynamically from mmu_uram()
* Remove linker symbols for VBR addresses
Comments and documentation have been updated throughout the code to
reflect the changes.
This change introduces a new getkey_repeat_filter() function that can be
used to individually accept, deny or delay repeat events for specific
keys and timings.
* Turn on GCC's -O3 for bopti files
* Remove the bopti_render_noclip() step
* Use rbox as early as possible to avoid moving memory around
* A lot of local grinding
* Defined the single-column single-position (SCSP) situation where a
single column of the input is blit on a single position of the VRAM.
Provided optimized assembly and a specialized bopti_render_scsp()
function.
* Improved the renderer by reducing the amount of computation and
clarifying the semantics of the rbox.
* Separated rbox setup from clipping by making bopti_render_clip() a
purely abstract superset of bopti_render_noclip().
This new mechanism allows an add-in to be restarted after exiting by
just never exiting in the first place, calling gint_osmenu() instead.
This makes sure that we can relaunch the add-in immediately, which is
normally possible through an option in the OS though no OS-independent
method of setting it is currently known.
Because this is gint_osmenu(), known pitfalls apply. On all platforms,
it is necessary to prepare the first frame before leaving. On fx-CG 50,
the inevitable display border is also there.
Disables spread spectrum by default so that the frequency estimations of
the CPG driver (notably used by the timer driver and libprof) are more
accurate.
This commit changes the interrupt handler arrangement to support the PRI
interrupt on SH3 (a gap is needed between 0xaa0 and its helper).
It also introduces the use of the _gint_inth_callback function for the
callback, which provides dynamic TLB during the interrupt, and revealed
a bug about IMASK not being set automatically on SH3.
Finally, it sets the interrupt settings of the RTC more conservatively,
by wiping RCR1 and the carry, alarm and periodic interrupt flags during
initialization and context restoration.
This change fixes very weird bugs first observed with the RTC, related
to IMASK not being updated when an interrupt occurs to avoid
re-interruption.
On SH4 there is a CPUOPM setting that automatically sets IMASK to the
level of the accepted interrupt, which is so exactly what every kernel
needs that I can't figure out why this isn't the only behavior.
Turns out on SH3 it's not even an option. This commit sets IMASK to 15
when accepting a callback on SH3. This most notably prevents the gray
engine from updating the screen so callbacks need to be made very short.
This change modifies the font_t type to replace the concept of charset
with a more generic list of non-overlapping Unicode blocks defined by a
starting code point and a size.
It also takes advantage of the assembly feature of fxconv, introduced
for libimg long after the first version of topti, to support pointers in
the converted structure rather than having to tediously compute offsets
within a variable-size structure.
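Looking up a glyph in a list of blocks can be sketched as follows. The
structure is a simplified stand-in for the real font_t fields, and it
assumes glyphs are numbered block after block in order:

```c
#include <stdint.h>

/* A block of consecutive code points: starting point and size */
typedef struct { uint32_t start; uint32_t length; } demo_block_t;

/* Return the glyph index of `code_point`, or -1 if no block has it */
int demo_glyph_index(demo_block_t const *blocks, int block_count,
    uint32_t code_point)
{
    int index = 0;
    for(int i = 0; i < block_count; i++) {
        if(code_point >= blocks[i].start &&
           code_point - blocks[i].start < blocks[i].length)
            return index + (int)(code_point - blocks[i].start);
        /* Skip all the glyphs of this block */
        index += (int)blocks[i].length;
    }
    return -1;
}
```

Because the blocks do not overlap, the first match is the only match
and the search order does not affect the result.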
DGRAY_PUSH_ON/OFF will push the current gray engine state to a stack
before transitioning to on/off mode. DGRAY_POP will later recover the
saved state and transition back to it.
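The push/pop behavior amounts to a small stack of booleans. A minimal
sketch, with a hypothetical depth limit and demo_ names that are not
part of the gray engine API:

```c
#include <stdbool.h>

enum { DEMO_STACK_DEPTH = 8 };

static bool gray_on = false;
static bool saved[DEMO_STACK_DEPTH];
static int depth = 0;

/* Save the current state, then switch to `on`. Fails on overflow. */
bool demo_gray_push(bool on)
{
    if(depth >= DEMO_STACK_DEPTH)
        return false;
    saved[depth++] = gray_on;
    gray_on = on;
    return true;
}

/* Return to the most recently saved state. Fails if none is saved. */
bool demo_gray_pop(void)
{
    if(depth <= 0)
        return false;
    gray_on = saved[--depth];
    return true;
}
```

This lets a library function force a mode for the duration of a call
and restore whatever mode the caller was using, without knowing it.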
* Define dgray() to replace gray_start() and gray_stop()
* Introduce a mechanism to override the d*() functions rather than using
another set of functions, namely g*(). Gray rendering should now be
done with d*() (a compatibility macro for g*() is available until v2.1).
* Gray engine now reserves TMU0 at the start of the add-in to prevent
surprises if timers are exhausted, so it never fails to start
* Replace other gray engine functions with dgray_*()
* More general rendering functions (in render/) to lessen the burden of
porting them to the gray engine. As a consequence, dtext_opt(),
dprint_opt() and drect_border() are now available in the gray engine,
which was an omission from 230b796.
* Allow C_NONE in more functions, mainly on fx-CG 50
* Remove the now-unused dupdate_noint()
Since both platforms now have their VBR and gint-specific data loaded
along the add-in's data, the .gint.data section is entirely unused.
The .gint.bss section is still used for uninitialized objects (it has
different semantics than .bss which is initially cleared) and the
.gint.data.sh3 and .gint.bss.sh3 sections that are dropped on the
SH4-only fx-CG 50 are also still used.
This change moves interrupt handlers from VBR + 0x640 to VBR + 0x200, in
the gap between the exception and TLB miss handlers.
This new scheme is not limited to VBR+0x200 .. VBR+0x400 as new large
block numbers can be used to jump over the TLB miss handler and the
interrupt handler entry points.
I have recently discovered that the so-called "rram" section used by gint
to store its VBR space and a couple memory structures gets overwritten
when returning to the main menu. It is thus necessary to get rid of it
and store that data somewhere else.
My current lead is to have it at the start of the static RAM by querying
its address in the TLB. However, the static RAM is very small on SH3
(8k) so the VBR must be made more compact.
This change elaborates the event code translation scheme used on SH3 to
emulate SH4 event codes. It is now used to translate the event codes to
a gint-specific VBR layout that leaves no gaps and thus reduces the size
of the VBR space. The gint_inthandler() method has to be modified for
every new SH3 interrupt to maintain this scheme.
* Reduce the keyboard queue size from 64 to 32, which is more than
enough even for real-time games with multiple key presses.
* Pack the driver_event_t structure of the keyboard driver to make it 4
bytes rather than 6 bytes. Combined with the previous item, this saves
256 bytes off the BSS section (which is 3% of the SH3's static RAM).
* As part of a debugging attempt, updated the watchdog delay code in
iokbd_delay() to make it usable in the current version of gint.
* Restored port registers more aggressively in iokbd_row().
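The 256-byte figure follows from the two queue sizes: 64 events of 6
bytes against 32 events of 4 bytes. The field layouts below are
hypothetical and only chosen to produce those two sizes:

```c
#include <stdint.h>

/* Hypothetical 6-byte layout: three 16-bit fields */
typedef struct {
    uint16_t time;
    uint16_t changed;
    uint16_t row;
} event_wide_t;

/* Hypothetical 4-byte layout: the two small fields shrunk to 8 bits */
typedef struct {
    uint16_t time;
    uint8_t changed;
    uint8_t row;
} event_packed_t;

enum {
    OLD_QUEUE_BYTES = 64 * sizeof(event_wide_t),   /* 384 */
    NEW_QUEUE_BYTES = 32 * sizeof(event_packed_t), /* 128 */
    SAVED_BYTES = OLD_QUEUE_BYTES - NEW_QUEUE_BYTES
};
```

Against 8 kB of static RAM, 256 bytes is 3.125%, matching the 3%
quoted above.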
This change adds optimized versions of the core memory functions,
relying on 4-alignment, 2-alignment, and the SH4's unaligned move
instruction to (hopefully) attain good performance in all situations.
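One plausible way to choose between these variants is to compare the
offsets of both pointers within a longword. The dispatch below is a
guess at the general technique with hypothetical names, not a
transcription of the assembly:

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    COPY_LONGWORDS,       /* 4-byte loads and stores */
    COPY_UNALIGNED_LOADS, /* unaligned 4-byte loads, aligned stores */
    COPY_WORDS,           /* 2-byte loads and stores */
    COPY_BYTES            /* 1-byte loads and stores */
} copy_strategy_t;

/* Pick the widest access usable once dst has been 4-aligned by
   copying a few leading bytes. Only the relative offset of src and
   dst matters, hence the XOR of the low address bits. */
copy_strategy_t demo_pick_strategy(uintptr_t dst, uintptr_t src,
    bool has_unaligned_move)
{
    unsigned diff = (unsigned)((dst ^ src) & 3);
    if(diff == 0)
        return COPY_LONGWORDS;
    if(has_unaligned_move)
        return COPY_UNALIGNED_LOADS;
    if(diff == 2)
        return COPY_WORDS;
    return COPY_BYTES;
}
```

Without the unaligned move, pointers that differ by an odd offset fall
back to byte accesses, which is the slow case the instruction avoids.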
This change adds a new HWCALC model, HWCALC_FXCG_MANAGER, which
identifies Casio's official fx-CG Manager software. Both the Prizm and,
to my surprise, the fx-CG Manager use the old RAM address of 88000000
(P1) and a8000000 (P2) instead of the new fx-CG 50 address of 8c000000
(P1) and ac000000 (P2).
The VRAM is hence adjusted at startup to move hardcoded pointers into
the proper space. Added to the kernel moving the VBR space dynamically
on the Prizm, this allows gint to be fully compatible with these
platforms.
The fx-CG Manager is detected by its product ID made of 0xff.
Also adds a proper interface to the R61524 driver, even though it's not
any more complete than previously, and fixes an oversight where the
HWURAM entry of the kernel data array was no longer computed since the
TLB management change.
As of now, the fx-CG Manager still has a bug regarding return-to-menu
since returning from the main menu doesn't work very well and often
loops. This has been seen occasionally on some Graph 90+E so it's
unlikely to be a platform-specific problem.
This commit minimally changes the signature of timer_setup() to greatly
simplify timer management, allowing the user to let the library choose
available timers dynamically depending on the settings.
Gray quality is better on the Graph 35+E II; it still flickers a lot on
other models (as I remembered). There might be better settings out there
but I'm not sure we can reach the quality of the current Graph 35+E II
defaults. The Graph 75+E with which I tested might also be different
from other T6K11 such as the smaller Graph 35+E.
* Removed .pretext sections since the TLB is now entirely dynamic; left
only .text.entry for the start symbol.
* Reworked the main files of src/core to move the INTC to its own driver
and let the kernel handle only VBR space and CPU (now: VBR & CPUOPM).
* Moved src/core/gint.c to src/core/kernel.c and centralized all driver
loops that save or restore context for more robustness. This leaves
the callbacks of cpu_setVBR() (formerly gint_setvbr()) pretty short.
* Coalesced gint_switch_out() and gint_switch_in() into a single
function callback for cpu_setVBR().
* Added an abstraction of interrupt signals as an enumerated value so
that drivers no longer hardcode the IPR and IMR numbers and bits,
sometimes even with isSH3() due to differences in the SH7705.
* Changed the interrupt blocking method in cpu_setVBR() from SR.BL=1 to
SR.IMASK=15 so that TLB misses are still handled. This removes the
need for callback functions to be GMAPPED.
* Moved gint_osmenu() and its utilities to a new file src/core/osmenu.c.
This change includes three reliability improvements in handlers:
1. TMU handlers now actively check for the UNF flag to go low rather
than expecting it to do so right away.
2. CPUOPM.INTMU is now set so that IMASK is updated at every interrupt
(which is absolutely required for nested interrupts!).
3. gint_inth_callback() no longer performs transfers between user bank
and kernel bank while in user bank, because this is when interrupts
are enabled and thus likely to corrupt the kernel bank; rather, it
now does it while in kernel bank with interrupts disabled.
This change fixes a never-should-have-worked problem where the ETMU
interrupt handler loses track of the timer ID before attempting to call
timer_stop(), resulting in complete nonsense.
And also a similar problem in timer_wait().
This change introduces two new functions dtext_opt() and dprint_opt()
that have both color and alignment options. The regular dtext() and
dprint() have been changed to always use bg=C_NONE which is what most
calls want.
This change removes the RRAM region which was inherited from the fx9860g
memory layout but no longer relevant on fxcg50. This removed one
occurrence of a hardcoded user stack address in the linker script, the
other being the VBR address. But since the VBR only contains
position-independent code that is manually "relocated" at startup, the
linker script need not actually use its value, so this is not a true
dependency.
gint should now more or less be able to boot up on an fxcg20, except for
the hardcoded VRAM addresses which need to be moved to the fxcg20 system
stack.
This change enables interrupts within timer callbacks, making it
possible to load pages to MMU while handling a timer underflow. The call
to TLB_LoadPTEH() has been moved directly into the VBR handler to avoid
jumping to ILRAM for a short call on SH4.
The TMU and ETMU handlers have been changed to callback through a new
function gint_inth_callback() that saves the user bank and a few
registers, then invokes the callback with interrupts enabled and in user
bank; until now, callbacks were invoked with interrupts disabled and in
kernel bank. Note that IMASK is still set so a callback can only be
interrupted by a high-priority interrupt.
A timer_wait() function has also been added to simplify tests that
involve timers. Finally, the priority level of the TMU0 underflow
interrupt has been set to 13 (as per the comments) instead of 7.
This version is the first stable version that handles TLB misses
transparently for large add-ins. It is suitable for every gint
application.
This change ports the TLB management system to fx9860g through %003.
This raises the size limit for add-ins to about 500k.
Because SH3 fx9860g does not have ILRAM, the GMAPPED attribute has been
made to generate content to a .gint.mapped section which is sent to the
P1 RAM section historically dubbed "real ram" in which gint's data and
VBR are installed. (Now that I think about it, gint's data should try to
go to normal RAM instead to reduce pressure on this invasion.)
Return-to-menu was also fixed on both platforms by narrowing down the
need for code to remain mapped to the chance of running it with
interrupts disabled. The natural distribution of GMAPPED under this
criterion showed that _gint_setvbr had been left under TLB control;
moving it to the proper RAM area fixed gint switches.
Finally, an omission in the bound checks for mappable TEA addresses (TEA
>= 0x00300000) prevented the appearance of a non-interactible System
ERROR popup when some unmapped addresses are accessed.
This version still does not enable interrupts in timer callbacks,
exposing any application to a crash if a timer underflows while its
callback is not mapped. It is not suitable for any stable application!
This change adds a TLB miss handler that calls __TLB_LoadPTEH() and
removes the startup mapping of add-in pages in the explore() routine of
src/core/start.c.
Note that calling __TLB_LoadPTEH() manually might unexpectedly lead to a
TLB multihit problem as the requested page might be accidentally loaded
by a TLB miss in the code that loads it. A TLB multihit is a platform
reset, so this function should always be considered unsafe to call
(unless the calling code is in a combination of P1 space and ILRAM).
This change also moves a lot of functions out of the .pretext section,
notably topti, as this was designed to allow panic messages when the
add-in couldn't be mapped entirely. By contrast, a GMAPPED macro has
been defined to mark crucial kernel code and data that must remain
mapped at all times. This currently puts the data in ILRAM because
static RAM is not executable. An alternative will have to be found for
SH3-based fx9860g machines.
This version still does not allow TLB misses in timer callbacks and
breaks return-to-menu in a severe way! It is not suitable for any
stable application!
This change modifies the fx-CG 50 linker script to allow add-ins up to
2M and no longer complains about add-ins that don't fit in the TLB.
It also exposes the __TLB_LoadPTEH() syscall (%003 on fx9860g, %00c on
fxcg50) that answers TLB misses. This syscall can be called manually
from an add-in to load some pages and seems to work without problem.
However, this version does not provide any automatic TLB management,
some key areas of the kernel are still under TLB and some user code
(such as timer callbacks) is not! This version is suitable only for
add-ins smaller than 220k!
When parsing a %% format, the second % character was mistakenly not
skipped over after emitting a '%' output; this resulted in it being
treated as a format specifier. For instance,
printf("%%d", 12);
would print "%12".
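The expected output for that call is "%d". A reduced formatter showing
the step that was missing; demo_format() is a hypothetical function
that only understands %% and %d, and it copies any other specifier
letter literally:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

int demo_format(char *out, size_t size, char const *fmt, ...)
{
    if(size == 0)
        return 0;
    va_list args;
    va_start(args, fmt);
    size_t n = 0;

    while(*fmt && n + 1 < size) {
        if(*fmt != '%') {
            out[n++] = *fmt++;
            continue;
        }
        fmt++; /* skip the first '%' */
        if(*fmt == '%') {
            out[n++] = '%';
            fmt++; /* also skip the second '%' */
        }
        else if(*fmt == 'd') {
            int len = snprintf(out + n, size - n, "%d",
                va_arg(args, int));
            n += (size_t)len;
            if(n >= size)
                n = size - 1; /* output was truncated */
            fmt++;
        }
    }
    va_end(args);
    out[n] = '\0';
    return (int)n;
}
```

Without the second `fmt++` in the %% branch, the loop would see the
second '%' again and parse the following 'd' as a conversion.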
A missing coordinate check in gint_dhline() would allow lines entirely
out of bounds of the screen to write pixels outside of their expected
range, often wrapping up to the next line, but possibly overflowing from
VRAM.
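The kind of check involved can be sketched on a small byte-per-pixel
buffer. The dimensions and demo_ names are made up for the example,
and the real VRAM layout differs:

```c
#include <stdint.h>

enum { DEMO_WIDTH = 16, DEMO_HEIGHT = 4 };
static uint8_t demo_vram[DEMO_WIDTH * DEMO_HEIGHT];

/* Draw a horizontal line from x1 to x2 (inclusive) on row y */
void demo_hline(int x1, int x2, int y, uint8_t color)
{
    if(x1 > x2) {
        int t = x1;
        x1 = x2;
        x2 = t;
    }
    /* Reject lines entirely out of bounds before clamping the ends */
    if(y < 0 || y >= DEMO_HEIGHT || x2 < 0 || x1 >= DEMO_WIDTH)
        return;
    if(x1 < 0)
        x1 = 0;
    if(x2 >= DEMO_WIDTH)
        x2 = DEMO_WIDTH - 1;

    for(int x = x1; x <= x2; x++)
        demo_vram[y * DEMO_WIDTH + x] = color;
}

/* Count the pixels that have been set, for checking the result */
int demo_count_set(void)
{
    int count = 0;
    for(int i = 0; i < DEMO_WIDTH * DEMO_HEIGHT; i++)
        count += (demo_vram[i] != 0);
    return count;
}
```

Clamping alone is not enough: a line with both ends right of the
screen would be clamped to a single pixel on the last column, so the
rejection test has to come first.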