diff --git a/README.md b/README.md
new file mode 100644
index 00000000..7ed9686d
--- /dev/null
+++ b/README.md
@@ -0,0 +1,76 @@
+---
+layout: post
+title: Cyclone Scheme
+---
+
+Cyclone is an experimental Scheme-to-C compiler that uses a variant of the [Cheney on the MTA](http://www.pipeline.com/~hbaker1/CheneyMTA.html) technique to implement full tail recursion, continuations, and generational garbage collection. Unlike previous Cheney on the MTA compilers, Cyclone also allows execution of multiple native threads. An on-the-fly garbage collector is used to manage the second-generation heap and perform major collections without "stopping the world".
+
+Getting Started
+---------------
+
+1. To install Cyclone on your machine for the first time, use [**cyclone-bootstrap**](https://github.com/justinethier/cyclone-bootstrap) to build a set of binaries.
+
+2. After installing, you can run the `cyclone` command to compile a single Scheme file:
+
+        $ cyclone examples/fac.scm
+        $ examples/fac
+        3628800
+
+    And the `icyc` command to start an interactive interpreter:
+
+        $ icyc
+
+                      :@
+                    @@@
+                  @@@@:
+                `@@@@@+
+               .@@@+@@@      Cyclone
+               @@     @@     An experimental Scheme compiler
+              ,@             https://github.com/justinethier/cyclone
+              '@
+              .@
+               @@     #@     (c) 2014 Justin Ethier
+                `@@@#@@@.    Version 0.0.1 (Pre-release)
+                 #@@@@@
+                 +@@@+
+                 @@#
+               `@.
+
+        cyclone> (write 'hello-world)
+        hello-world
+
+    You can use [`rlwrap`](http://linux.die.net/man/1/rlwrap) to make the interpreter more friendly, e.g. `rlwrap icyc`.
+
+3. Read the documentation below for more information on how to use Cyclone.
+
+Documentation
+-------------
+
+- The [User Manual](docs/User-Manual.md) covers in detail how to use Cyclone, and provides information and API documentation on the Scheme language features implemented by Cyclone.
+
+- Cyclone's [Garbage Collector](docs/Garbage-Collector.md) is documented at a high level. This document includes details on extending Cheney on the MTA to support multiple stacks and fusing that approach with a tri-color marking collector.
+
+- The [Benchmarks](docs/Benchmarks.md) page compares the performance of Cyclone with other R7RS Schemes using a common set of benchmarks.
+
+- [Writing the Cyclone Scheme Compiler](docs/Writing-the-Cyclone-Scheme-Compiler.md) provides high-level details on how the compiler was written and how it works.
+
+- Finally, if you need another resource to start learning the Scheme language you may want to try a classic textbook such as [Structure and Interpretation of Computer Programs](https://mitpress.mit.edu/sicp/full-text/book/book.html).
+
+Example Programs
+----------------
+
+Cyclone provides several example programs, including:
+
+- [Game of Life](examples/game-of-life) - The game of life example program and libraries from R7RS.
+
+- [Threading](examples/threading) - Various examples of multi-threaded programs.
+
+- [Tail Call Optimization](examples/tail-call-optimization.scm) - A simple example of Scheme tail call optimization; this program runs forever, calling into two mutually recursive functions.
+
+- Finally, the largest program is the compiler itself. Most of the code is contained in a series of libraries which are used by [`cyclone.scm`](cyclone.scm) and [`icyc.scm`](icyc.scm) to create executables for Cyclone's compiler and interpreter.
+
+License
+-------
+Copyright (C) 2014 [Justin Ethier](http://github.com/justinethier).
+
+Cyclone is available under the [MIT license](http://www.opensource.org/licenses/mit-license.php).
diff --git a/docs/API.md b/docs/API.md
new file mode 100644
index 00000000..bce46692
--- /dev/null
+++ b/docs/API.md
@@ -0,0 +1,35 @@
+
+# R7RS Libraries
+
+- Cyclone runtime
+- [`scheme base`](api/scheme/base.md)
+- [`scheme case-lambda`](api/scheme/case-lambda.md)
+- [`scheme char`](api/scheme/char.md)
+- [`scheme cxr`](api/scheme/cxr.md)
+- [`scheme eval`](api/scheme/eval.md)
+- [`scheme file`](api/scheme/file.md)
+- [`scheme inexact`](api/scheme/inexact.md)
+- [`scheme lazy`](api/scheme/lazy.md)
+- [`scheme load`](api/scheme/load.md)
+- [`scheme process-context`](api/scheme/process-context.md)
+- [`scheme read`](api/scheme/read.md)
+- [`scheme time`](api/scheme/time.md)
+- [`scheme write`](api/scheme/write.md)
+
+# SRFI Support
+
+Cyclone supports the following [Scheme Requests for Implementation (SRFI)](http://srfi.schemers.org/) libraries:
+
+- [`receive`: Binding to multiple values](http://srfi.schemers.org/srfi-8/srfi-8.html) - Included as part of `scheme base`.
+- [`srfi 18`](api/srfi/18.md) - [Multithreading support](http://srfi.schemers.org/srfi-18/srfi-18.html)
+
+# Cyclone-specific
+
+These libraries are used by the Cyclone compiler itself, and are subject to change:
+
+- `scheme cyclone cgen`
+- `scheme cyclone common`
+- `scheme cyclone libraries`
+- `scheme cyclone macros`
+- `scheme cyclone transforms`
+- `scheme cyclone util`
diff --git a/docs/Benchmarks.md b/docs/Benchmarks.md
new file mode 100644
index 00000000..a8e70ddd
--- /dev/null
+++ b/docs/Benchmarks.md
@@ -0,0 +1,49 @@
+[Cyclone Scheme](http://github.com/justinethier/cyclone)
+
+# Benchmarks
+
+The following [benchmarks from Larceny](http://www.larcenists.org/benchmarksGenuineR7Linux.html) give an indication of how well Cyclone performs compared with other R7RS Schemes. These benchmarks were recorded on a system with an Intel Core i5 CPU @ 2.20 GHz and indicate elapsed time in seconds. Longer bars indicate worse performance, although a bar is not displayed if the benchmark could not be completed in a reasonable amount of time.
+
+## Gabriel Benchmarks
+
+
+
+Benchmark | Cyclone | Chibi | Chicken
+--------- | ------- | ----- | -------
+browse | 77 | 439 | 30
+deriv | 39 | 212 | 13
+destruc | 136 | 197 | 20
+diviter | 51 | 122.9 | 8
+divrec | 70 | 108 | 29
+puzzle | 184 | Timeout | 32
+triangl | 95 | 201 | 26.6
+tak | 70 | 105 | 28.9
+takl | 132 | Timeout | 78.7
+ntakl | 152 | 193 | 77.9
+cpstak | 92 | Timeout | 35
+ctak | 7.9 | Timeout | 8.6
+
+## Kernighan and Van Wyk Benchmarks
+
+
+
+Benchmark | Cyclone | Chibi | Chicken
+--------- | ------- | ----- | -------
+ack | 288 | 161 | 116
+array1 | 167 | 130 | 29.4
+string | 1 | 8.478 | 1.584
+sum1 | 27 | 74 | 7.737
+cat | 43.669 | 132 | 55
+tail | 367 | 674 | -
+wc | 202 | 1072 | 36.4
+
+## Garbage Collection Benchmarks
+
+
+
+Benchmark | Cyclone | Chibi | Chicken
+--------- | ------- | ----- | -------
+nboyer | 67.783 | 73.516 | 39.377
+sboyer | 48.044 | 69.243 | 23.628
+gcbench | 143.478 | Timeout | 16.75
+mperm | 328.741 | 260.358 | 57.5
diff --git a/docs/Developer-How-To.md b/docs/Developer-How-To.md
new file mode 100644
index 00000000..9a7514e9
--- /dev/null
+++ b/docs/Developer-How-To.md
@@ -0,0 +1,16 @@
+## Add a primitive
+
+- Add function/definitions to `runtime.h` and `runtime.c`
+- Rebuild and install runtime library.
+- Add to `prim?` section in `transforms.sld`. Some functions may need to be added to the next section in the file so that they are not constant-folded (i.e., evaluated at compile time).
+- Add above the `c-compile-primitive` section in `cgen.sld`. Some functions may need to be added in multiple places to indicate they take additional arguments, call their continuation, etc.
+
+TODO: need to develop this section better to come up with a workable/optimal approach to building things:
+
+- Compile:
+        cyclone scheme/cyclone/cgen.sld
+        cyclone scheme/cyclone/transforms.sld
+- Copy modified files to cyclone-bootstrap, including runtime, `.sld`, and compiled `.c` files.
+- Run `make clean ; ./install` from bootstrap repo
+
+- Add primitives to the list in eval.sld. Rebuild one more time.
diff --git a/docs/Development.md b/docs/Development.md
new file mode 100644
index 00000000..c28b8c93
--- /dev/null
+++ b/docs/Development.md
@@ -0,0 +1,22 @@
+# Development Guide
+
+## Building
+
+Please use cyclone-bootstrap if you are installing Cyclone on a machine for the first time. Otherwise, if you already have a copy of Cyclone installed, you can build from Scheme source.
+
+The following prerequisites are required:
+
+- make
+- gcc
+
+From the source directory, use the following commands to build and install:
+
+    $ make
+    $ make test
+    $ sudo make install
+    $ cyclone
+
+By default everything is installed under `/usr/local`. This may be changed by passing a different `PREFIX`. For example:
+
+    make PREFIX=/home/me install
+
diff --git a/docs/Garbage-Collector.md b/docs/Garbage-Collector.md
new file mode 100644
index 00000000..7210d855
--- /dev/null
+++ b/docs/Garbage-Collector.md
@@ -0,0 +1,322 @@
+[Cyclone Scheme](http://github.com/justinethier/cyclone)
+
+# Garbage Collector
+
+- [Introduction](#introduction)
+- [Terms](#terms)
+- [Code](#code)
+- [Data Structures](#data-structures)
+  - [Heap](#heap)
+  - [Thread Data](#thread-data)
+  - [Object Header](#object-header)
+  - [Mark Buffers](#mark-buffers)
+- [Minor Collection](#minor-collection)
+- [Major Collection](#major-collection)
+  - [Tri-color Marking](#tri-color-marking)
+  - [Handshakes](#handshakes)
+  - [Collection Cycle](#collection-cycle)
+  - [Mutator Functions](#mutator-functions)
+  - [Collector Functions](#collector-functions)
+  - [Cooperation by the Collector](#cooperation-by-the-collector)
+  - [Running the Collector](#running-the-collector)
+- [Looking Ahead](#looking-ahead)
+- [Further Reading](#further-reading)
+
+# Introduction
+
+The goal of this paper is to provide a high-level overview of Cyclone's garbage collector. The explanation is fairly technical; there are some introductory articles on garbage collection in the [further reading section](#further-reading) that may provide more familiarity with the concepts and that are worthwhile to read in their own right.
+
+The collector has the following requirements:
+
+- Efficiently free allocated memory.
+- Allow the language implementation to support tail calls and continuations.
+- Allow the language to support native multithreading.
+
+Cyclone uses generational garbage collection (GC) to automatically free allocated memory using two types of collection. In practice, most allocations consist of short-lived objects such as temporary variables. Minor GC is done frequently to clean up most of these short-lived objects. Some objects will survive this collection because they are still referenced in memory. A major collection runs less often to free longer-lived objects that are no longer being used by the application.
+
+Cheney on the MTA, a technique introduced by Henry Baker, is used to implement the first generation of our garbage collector. Objects are allocated directly on the stack using `alloca` so allocations are very fast, do not cause fragmentation, and do not require a special pass to free unused objects. Baker's technique uses a copying collector for both the minor and major generations of collection. One of the drawbacks of using a copying collector for major GC is that it relocates all the live objects during collection. This is problematic for supporting native threads because an object can be relocated at any time, invalidating any references to the object. To prevent this either all threads must be stopped while major GC is running or a read barrier must be used each time an object is accessed. Both options add a potentially significant overhead so instead another type of collector is used for the second generation.
+
+Cyclone supports native threads by using a tracing collector based on the Doligez-Leroy-Gonthier (DLG) algorithm for major collections. An advantage of this approach is that objects are not relocated once they are placed on the heap. In addition, major GC executes asynchronously so threads can continue to run concurrently even during collections.
+
+# Terms
+- Collector - A thread running the garbage collection code. The collector is responsible for coordinating and performing most of the work for major garbage collections.
+- Continuation - With respect to the collectors, this is a function that is called to resume execution of application code. For more information see [this article on continuation passing style](https://en.wikipedia.org/wiki/Continuation-passing_style).
+- Forwarding Pointer - When a copying collector relocates an object it leaves one of these pointers behind with the object's new address.
+- Mutation - A modification to an object. For example, changing a vector (array) entry.
+- Mutator - A thread running user (or "application") code; there may be more than one mutator running concurrently.
+- Read Barrier - Code that is executed before reading an object. Read barriers have a larger overhead than write barriers because object reads are much more common.
+- Root - During tracing the collector uses these objects as the starting point to find all reachable data.
+- Write Barrier - Code that is executed before writing to an object.
+
+# Code
+
+The implementation code is available here:
+
+- [`runtime.c`](../runtime.c) contains most of the runtime system, including code to perform minor GC. A good place to start would be the `GC` and `gc_minor` functions.
+- [`gc.c`](../gc.c) contains the major GC code.
+
+# Data Structures
+
+## Heap
+
+The heap is used to store all objects that survive minor GC, and consists of a linked list of pages. Each page contains a contiguous block of memory and a linked list of free chunks. When a new chunk is requested the first free chunk large enough to meet the request is found and either returned directly or carved up into a smaller chunk to return to the caller.
+
+Memory is always allocated in multiples of 32 bytes. This helps prevent external fragmentation because many objects end up with the same allocation size, but it incurs internal fragmentation because an object will not always fill all of its allocated memory.
+
+The heap is locked during allocation and sweep operations to protect against concurrent access.
+
+If there is not enough free memory to fulfill a request, a new page is allocated and added to the heap. Unfortunately this is the only option: the collection process is asynchronous, so memory cannot be freed immediately to make room.
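+
+The allocation strategy described in this section can be sketched in C. This is a minimal illustration under stated assumptions, not the code in `gc.c`: the names `free_chunk`, `round_up`, and `first_fit` are hypothetical, locking is omitted, and each page is assumed to be suitably aligned.
+
+```c
+#include <assert.h>
+#include <stddef.h>
+
+/* Hypothetical names throughout; this is not Cyclone's gc.c. */
+#define CHUNK_ALIGN 32u
+
+typedef struct free_chunk {
+    size_t size;              /* bytes available in this chunk */
+    struct free_chunk *next;  /* next free chunk on the same page */
+} free_chunk;
+
+/* Round a request up to the next multiple of 32 bytes. */
+static size_t round_up(size_t n) {
+    return (n + CHUNK_ALIGN - 1) / CHUNK_ALIGN * CHUNK_ALIGN;
+}
+
+/* First fit: unlink and return the first chunk that is large enough.
+ * If the chunk is larger than needed, the unused tail is carved off
+ * and left on the free list. Returns NULL when nothing fits, in which
+ * case the caller would have to add a new page to the heap. */
+static void *first_fit(free_chunk **head, size_t request) {
+    assert(head != NULL);
+    size_t need = round_up(request);
+    for (free_chunk **link = head; *link != NULL; link = &(*link)->next) {
+        free_chunk *c = *link;
+        if (c->size < need) continue;
+        if (c->size - need >= CHUNK_ALIGN) {
+            /* Split: the remainder becomes a smaller free chunk. */
+            free_chunk *rest = (free_chunk *)((char *)c + need);
+            rest->size = c->size - need;
+            rest->next = c->next;
+            *link = rest;
+        } else {
+            /* Exact fit: hand out the whole chunk. */
+            *link = c->next;
+        }
+        return c;
+    }
+    return NULL;
+}
+```
+
+A request for 40 bytes is rounded up to 64, which shows the internal fragmentation trade-off mentioned above: 24 bytes of that chunk go unused.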
+
+## Thread Data
+
+At runtime Cyclone passes the current continuation, number of arguments, and a thread data parameter to each compiled C function. The continuation and arguments are used by the application code to call into its next function with a result. Thread data is a structure that contains all of the necessary information to perform collections, including:
+
+- Thread state
+- Stack boundaries
+- Jump buffer
+- List of mutated objects detected by the minor GC write barrier
+- Major GC parameters - mark buffer, last read/write, etc (see next sections)
+- Call history buffer
+- Exception handler stack
+
+Each thread has its own instance of the thread data structure and its own stack (assigned by the C runtime/compiler).
+
+## Object Header
+
+Each object contains a header with the following information:
+
+- Tag - A number indicating the object type: cons, vector, string, etc.
+- Mark - The status of the object's memory.
+- Grayed - A field indicating the object has been grayed but has not been added to a mark buffer yet (see major GC sections below). This is only applicable for objects on the stack.
+
+## Mark Buffers
+
+Mark buffers are used to hold gray objects instead of explicitly marking objects gray. Each mark buffer is a pointer array that is grown as necessary using `realloc`. Each mutator has a reference to a mark buffer holding its gray objects. A last write variable keeps track of how many entries the mutator has added to the buffer.
+
+The collector updates the mutator's last read variable each time it marks an object from the mark buffer. Marking is finished when last read and last write are equal. The collector also maintains a single mark stack of objects that the collector has marked gray.
+
+An object on the stack cannot be added to a mark buffer because the reference may become invalid before it can be processed by the collector.
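+
+As a rough sketch of the data structure described in this section, a per-mutator mark buffer might look like the following. The struct and function names are hypothetical, the growth policy (doubling) is an assumption, and the synchronization a real collector needs between mutator and collector is left out.
+
+```c
+#include <assert.h>
+#include <stdlib.h>
+
+/* Hypothetical per-mutator mark buffer; field names follow the text. */
+typedef struct {
+    void **mark_buffer;   /* gray objects awaiting marking */
+    size_t capacity;      /* number of allocated slots */
+    size_t last_write;    /* next slot the mutator will fill */
+    size_t last_read;     /* next slot the collector will mark */
+} mutator_marks;
+
+/* Append a gray object, growing the array with realloc when it is full.
+ * Returns 0 on success and -1 if memory could not be obtained. */
+static int mark_buffer_push(mutator_marks *m, void *obj) {
+    assert(m != NULL);
+    if (m->last_write == m->capacity) {
+        size_t new_cap = m->capacity ? m->capacity * 2 : 4;
+        void **grown = realloc(m->mark_buffer, new_cap * sizeof *grown);
+        if (grown == NULL) return -1;
+        m->mark_buffer = grown;
+        m->capacity = new_cap;
+    }
+    m->mark_buffer[m->last_write++] = obj;
+    return 0;
+}
+
+/* Marking for this mutator is finished once the collector has caught up. */
+static int marks_done(const mutator_marks *m) {
+    return m->last_read == m->last_write;
+}
+```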
+
+# Minor Collection
+
+Cyclone converts the original program to continuation passing style (CPS) and compiles it as a series of C functions that never return. At runtime each mutator periodically checks to see if its stack has exceeded a certain size. When this happens a minor GC is started and all live stack objects are copied to the heap.
+
+Root objects are live objects the collector uses to begin the tracing process. Cyclone's minor collector treats the following as roots:
+
+- The current continuation
+- Arguments to the current continuation
+- Mutations contained in the write barrier
+- Closures from the exception stack
+- Global variables
+
+A minor collection is always performed for a single mutator thread, usually by the thread itself. The algorithm is based on Cheney on the MTA:
+
+- Move any root objects on the stack to the heap. For each object moved:
+  - Replace the stack object with a forwarding pointer. The forwarding pointer ensures all references to a stack object refer to the same heap object, and allows minor GC to handle cycles.
+  - Record each moved object in a buffer to serve as the Cheney to-space.
+- Loop over the to-space buffer and check each object moved to the heap. Move any child objects that are still on the stack. This loop continues until all live objects are moved.
+- Cooperate with the collection thread (see next section).
+- Perform a `longjmp` to reset the stack and call into the current continuation.
+
+Any objects left on the stack after `longjmp` are considered garbage. There is no need to clean them up because the stack will just re-use the memory as it grows.
+
+Finally, although not mentioned in Baker's paper, a heap object can be modified to contain a reference to a stack object. For example, by using a `set-car!` to change the head of a list. This is problematic since stack references are no longer valid after a minor GC, and the GC does not check heap objects. We account for these mutations by using a write barrier to maintain a list of each modified object. During GC, these modified objects are treated as roots to avoid dangling references.
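+
+The move-and-scan steps of the algorithm can be illustrated with a small C sketch. Everything here is a simplification made for illustration: the only object type is a hypothetical `pair`, an `on_stack` flag stands in for the stack-boundary check, the to-space buffer has a fixed capacity, and the write barrier and `longjmp` steps are omitted.
+
+```c
+#include <assert.h>
+#include <stdlib.h>
+
+/* Hypothetical object: a pair with room for a forwarding pointer. */
+typedef struct pair {
+    int on_stack;          /* stand-in for a stack-boundary check */
+    struct pair *forward;  /* set once the pair has been moved */
+    struct pair *car, *cdr;
+} pair;
+
+/* Move one object to the heap, or return the copy made earlier. */
+static pair *relocate(pair *p, pair **to_space, size_t *n, size_t cap) {
+    if (p == NULL || !p->on_stack) return p;    /* already on the heap */
+    if (p->forward != NULL) return p->forward;  /* follow forwarding pointer */
+    assert(*n < cap);
+    pair *copy = malloc(sizeof *copy);
+    assert(copy != NULL);
+    *copy = *p;
+    copy->on_stack = 0;
+    copy->forward = NULL;
+    p->forward = copy;          /* leave a forwarding pointer behind */
+    to_space[(*n)++] = copy;    /* record the move for the scan loop */
+    return copy;
+}
+
+/* Move a root, then scan the to-space buffer until no moved object
+ * still refers to a stack object. Returns the root's heap address. */
+static pair *minor_gc(pair *root, pair **to_space, size_t cap) {
+    size_t n = 0;
+    pair *new_root = relocate(root, to_space, &n, cap);
+    for (size_t scan = 0; scan < n; scan++) {
+        pair *obj = to_space[scan];
+        obj->car = relocate(obj->car, to_space, &n, cap);
+        obj->cdr = relocate(obj->cdr, to_space, &n, cap);
+    }
+    return new_root;
+}
+```
+
+Because `relocate` checks the forwarding pointer before copying, a cycle between two stack objects terminates: the second visit to an object returns its existing heap copy instead of copying it again.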
+
+# Major Collection
+
+A single heap is used to store objects relocated from the various thread stacks. Eventually the heap will run too low on space and a collection is required to reclaim unused memory. The collector thread is used to perform a major GC with cooperation from the mutator threads.
+
+## Tri-color Marking
+
+An object can be marked using any of the following colors to indicate the status of its memory:
+
+ - Blue - Unallocated memory.
+ - Red - An object on the stack.
+ - White - Heap memory that has not been scanned by the collector.
+ - Gray - Objects marked by the collector that may still have child objects that must be marked.
+ - Black - Objects marked by the collector whose immediate child objects have also been marked.
+
+Only objects marked as white, gray, or black participate in major collections:
+
+- White objects are freed during the sweep state. White is sometimes also referred to as the clear color.
+- Gray is never explicitly assigned to an object. Instead, objects are grayed by being added to lists of gray objects awaiting marking. This improves performance by avoiding repeated passes over the heap to search for gray objects.
+- Black objects survive the collection cycle. Black is sometimes referred to as the mark color as live objects are ultimately marked black.
+
+## Handshakes
+
+Instead of stopping the world and pausing all threads, when the collector needs to coordinate with the mutators it performs a handshake.
+
+Each of the mutator threads, and the collector itself, has a status variable:
+
+    typedef enum { STATUS_ASYNC
+                 , STATUS_SYNC1
+                 , STATUS_SYNC2
+                 } gc_status_type;
+
+The collector will update its status variable and then wait for all of the mutators to change their status before continuing. The mutators periodically call a cooperate function to check in and update their status to match the collector's. A handshake is complete once all mutators have updated their status.
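+
+A handshake can be sketched as follows. The helper names `mutator_cooperate` and `handshake_complete` are hypothetical, and plain variables are used for brevity where a real implementation would need atomic operations or locks.
+
+```c
+#include <assert.h>
+#include <stddef.h>
+
+typedef enum { STATUS_ASYNC, STATUS_SYNC1, STATUS_SYNC2 } gc_status_type;
+
+/* Called by a mutator from its cooperate function: adopt the
+ * collector's current status if it differs from our own. */
+static void mutator_cooperate(gc_status_type *mutator,
+                              gc_status_type collector) {
+    assert(mutator != NULL);
+    if (*mutator != collector) *mutator = collector;
+}
+
+/* Called by the collector: returns 1 once every mutator has caught
+ * up with the collector's status, and 0 otherwise. */
+static int handshake_complete(gc_status_type collector,
+                              const gc_status_type *mutators, size_t n) {
+    for (size_t i = 0; i < n; i++) {
+        if (mutators[i] != collector) return 0;
+    }
+    return 1;
+}
+```
+
+The collector would poll `handshake_complete` after changing its own status, while each mutator calls `mutator_cooperate` at its next cooperation point.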
+
+## Collection Cycle
+
+During a GC cycle the collector thread transitions through the following states.
+
+### Clear
+The collector swaps the values of the clear color (white) and the mark color (black). This is more efficient than modifying the color on each object in the heap. The collector then transitions to sync 1. At this point no heap objects are marked, as demonstrated below:
+
+
+
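+
+The color swap in the clear state can be sketched in a few lines of C. The variable and type names below are assumptions made for illustration; the point is only that exchanging two global values turns every surviving (black) object white again without visiting the heap.
+
+```c
+#include <assert.h>
+
+/* Hypothetical globals: which value currently means "white" (clear)
+ * and which means "black" (mark). */
+static unsigned char gc_color_clear = 0;
+static unsigned char gc_color_mark = 1;
+
+/* Hypothetical object header holding only the mark field. */
+typedef struct { unsigned char mark; } obj_header;
+
+/* Clear state: swap the meaning of the two values instead of
+ * rewriting the mark on every object in the heap. */
+static void gc_swap_colors(void) {
+    unsigned char old_clear = gc_color_clear;
+    gc_color_clear = gc_color_mark;
+    gc_color_mark = old_clear;
+    assert(gc_color_clear != gc_color_mark);
+}
+
+/* An object is white when its mark equals the current clear color. */
+static int is_white(const obj_header *o) {
+    return o->mark == gc_color_clear;
+}
+```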
+### Mark
+The collector transitions to sync 2 and then async. At this point it marks the global variables and waits for the mutators to also transition to async:
+
+
+
+### Trace
+The collector finds all live objects using a breadth-first search and marks them black:
+
+
+
+### Sweep
+The collector scans the heap and frees memory used by all white objects:
+
+
+
+If the heap is still low on memory at this point, the heap will be increased in size. Also, to ensure a complete collection, data for any terminated threads is not freed until now.
+
+### Resting
+The collector cycle is complete and it rests until it is triggered again.
+
+## Mutator Functions
+
+Each mutator calls the following functions to coordinate with the collector.
+
+### Create
+
+This function is called by a mutator to allocate memory on the heap for an object. This is generally only done during a minor GC when each object is relocated to the heap.
+
+### Update
+
+A write barrier is used to ensure any modified objects are properly marked for the current collection cycle. There are two cases:
+
+- Gray the object's new and old values if the mutator is in a synchronous status.
+- Gray the object's old value if the collector is in the tracing stage.
+
+Because updates can occur at any time a modified object may still live on the stack. In this case the object is tagged to be grayed when it is relocated to the heap.
+
+### Cooperate
+
+Each mutator is required to periodically call this function to cooperate with the collector. During cooperation a mutator will update its status to match the collector's status, to handshake with the collector.
+
+In addition when a mutator transitions to async it will:
+
+- Mark all of its roots gray
+- Use black as the allocation color for any new objects to prevent them from being collected during this cycle.
+
+Cyclone's mutators cooperate after each minor GC, for two reasons: minor GCs are frequent, and immediately afterwards all of the mutator's live objects can be marked because they are on the heap.
+
+### Mark Gray
+
+Mutators call this function to add an object to their mark buffer.
+
+    mark_gray(m, obj):
+      if mark(obj) == clear_color:
+        m->mark_buffer[m->last_write] = obj
+        m->last_write++
+
+## Collector Functions
+
+### Collector Mark Gray
+
+The collector calls this function to add an object to the mark stack.
+
+    collector_mark_gray(obj):
+      if mark(obj) == clear_color:
+        mark_stack->push(obj)
+
+### Mark Black
+
+The collector calls this function to mark an object black and mark all of the object's children gray using Collector Mark Gray.
+
+    mark_black(obj):
+      if mark(obj) != mark_color:
+        for each child(c):
+          collector_mark_gray(c)
+        mark(obj) = mark_color
+
+
+### Empty Collector Mark Stack
+
+This function removes and marks each object on the collector's mark stack.
+
+    empty_collector_mark_stack():
+      while not mark_stack->empty():
+        mark_black(mark_stack->pop())
+
+### Collector Trace
+
+This function performs tracing for the collector by looping over all of the mutator mark buffers. All of the remaining objects in each buffer are marked black, as well as all the remaining objects on the collector's mark stack. This function continues looping until there are no more objects to mark:
+
+    collector_trace():
+      clean = 0
+      while not clean:
+        clean = 1
+        for each mutator(m):
+          while m->last_read < m->last_write:
+            clean = 0
+            mark_black(m->mark_buffer[m->last_read])
+            empty_collector_mark_stack()
+            m->last_read++
+
+## Cooperation by the Collector
+
+In practice a mutator will not always be able to cooperate in a timely manner. For example, a thread can block indefinitely waiting for user input or reading from a network port. In the meantime the collector will never be able to complete a handshake with this mutator and major GC will never be performed.
+
+Cyclone solves this problem by requiring that a mutator keep track of its thread state. With this information the collector can cooperate on behalf of a blocked mutator and do the work itself instead of waiting for the mutator.
+
+The possible thread states are:
+
+- `CYC_THREAD_STATE_NEW` - A new thread not yet running.
+- `CYC_THREAD_STATE_RUNNABLE` - A thread that can be scheduled to run by the OS.
+- `CYC_THREAD_STATE_BLOCKED` - A thread that could be blocked.
+- `CYC_THREAD_STATE_BLOCKED_COOPERATING` - A blocked thread that the collector is cooperating with on behalf of the mutator.
+- `CYC_THREAD_STATE_TERMINATED` - A thread that has been terminated by the application but its resources have not been freed up yet.
+
+Before entering a C function that could block the mutator must call a function to update its thread state to `CYC_THREAD_STATE_BLOCKED`. This indicates to the collector that the thread may be blocked.
+
+When the collector handshakes it will check each mutator to see if it is blocked. Normally in this case the collector can just update the blocked mutator's status and move on to the next one. But if the mutator is transitioning to async all of its objects need to be relocated from the stack so they can be marked. In this case the collector changes the thread's state to `CYC_THREAD_STATE_BLOCKED_COOPERATING`, locks the mutator's mutex, and performs a minor collection for the thread. The mutator's objects can then be marked gray and its allocation color can be flipped. When it is finished cooperating for the mutator the collector releases its mutex.
+
+When a mutator exits a (potentially) blocking section of code, it must call another function to update its thread state to `CYC_THREAD_STATE_RUNNABLE`. In addition, the function will detect if the collector cooperated for this mutator by checking if its status is `CYC_THREAD_STATE_BLOCKED_COOPERATING`. If so, the mutator waits for its mutex to be released to ensure the collector has finished cooperating. The mutator then performs a minor GC again to ensure any additional objects - such as results from the blocking code - are moved to the heap before calling `longjmp` to jump back to the beginning of its stack. Either way, the mutator now calls into its continuation and resumes normal operations.
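+
+The state transitions around a blocking call can be sketched with C11 atomics. This is an illustration only: the function names are hypothetical, and using a compare-and-swap to claim a blocked thread is an assumption of this sketch rather than a description of how Cyclone synchronizes the state change. The mutex and the minor collection mentioned above are not shown.
+
+```c
+#include <assert.h>
+#include <stdatomic.h>
+
+typedef enum {
+    CYC_THREAD_STATE_NEW,
+    CYC_THREAD_STATE_RUNNABLE,
+    CYC_THREAD_STATE_BLOCKED,
+    CYC_THREAD_STATE_BLOCKED_COOPERATING,
+    CYC_THREAD_STATE_TERMINATED
+} cyc_thread_state;
+
+/* Mutator: call before entering code that may block. */
+static void set_blocked(atomic_int *state) {
+    assert(state != NULL);
+    atomic_store(state, CYC_THREAD_STATE_BLOCKED);
+}
+
+/* Collector: try to claim a blocked mutator. Returns 1 if the claim
+ * succeeded, meaning the collector should cooperate on its behalf. */
+static int collector_claim(atomic_int *state) {
+    int expected = CYC_THREAD_STATE_BLOCKED;
+    return atomic_compare_exchange_strong(
+        state, &expected, CYC_THREAD_STATE_BLOCKED_COOPERATING);
+}
+
+/* Mutator: call after the blocking code returns. Returns 1 if the
+ * collector cooperated in the meantime, in which case the mutator
+ * must wait for its mutex and run another minor GC. */
+static int set_runnable(atomic_int *state) {
+    int prev = atomic_exchange(state, CYC_THREAD_STATE_RUNNABLE);
+    return prev == CYC_THREAD_STATE_BLOCKED_COOPERATING;
+}
+```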
+
+## Running the Collector
+
+Cyclone checks the amount of free memory as part of its cooperation code. A major GC cycle is started if the amount of free memory dips below a threshold. The goal is to run major collections infrequently, but at the same time we want to prevent unnecessary allocations.
+
+# Looking Ahead
+
+The garbage collector is by far the most complex component of Cyclone. The primary motivations in developing it were to:
+
+- Extend Baker's approach to support multiple mutators
+- Position Cyclone to potentially support state-of-the-art GCs built on top of DLG (Stopless, Chicken, Clover)
+
+There are a few limitations or potential issues with the current implementation:
+
+- Heap memory fragmentation has not been addressed and could be an issue for long-running programs. Traditionally a compaction process is used to defragment a heap. An alternative strategy has also been suggested by Pizlo:
+
+ > instead of copying objects to evacuate fragmented regions of the heap, fragmentation is instead embraced. A fragmented heap is allowed to stay fragmented, but the collector ensures that it can still satisfy allocation requests even if no large enough contiguous free region of space exists.
+
+- Accordingly, the runtime needs to be able to handle large objects that could potentially span one or more pages.
+- There is probably too much heap locking going on, and this could be an issue for a large heap and/or a large number of mutators. Improvements can likely be made in this area.
+
+Cyclone needs to be tested with large heaps and large allocations. I believe it should work well for large heaps that do not allocate too many objects of irregular size. However, a program regularly allocating large strings or vectors could cause significant heap fragmentation over time.
+
+Ultimately, a garbage collector is tricky to implement and the focus must primarily be on correctness first, with an eye towards performance.
+
+# Further Reading
+
+- [Baby's First Garbage Collector](http://journal.stuffwithstuff.com/2013/12/08/babys-first-garbage-collector/), by Bob Nystrom
+- [Chibi-Scheme](https://github.com/ashinn/chibi-scheme)
+- [CHICKEN internals: the garbage collector](http://www.more-magic.net/posts/internals-gc.html), by Peter Bex
+- [CONS Should Not CONS Its Arguments, Part II: Cheney on the M.T.A.](http://www.pipeline.com/~hbaker1/CheneyMTA.html), by Henry Baker
+- Fragmentation Tolerant Real Time Garbage Collection (PhD Dissertation), by Filip Pizlo
+- [The Garbage Collection Handbook: The Art of Automatic Memory Management](http://gchandbook.org/), by Antony Hosking, Eliot Moss, and Richard Jones
+- Implementing an on-the-fly garbage collector for Java, by Domani et al
+- Incremental Parallel Garbage Collection, by Paul Thomas
+- Portable, Unobtrusive Garbage Collection for Multiprocessor Systems, by Damien Doligez and Georges Gonthier
diff --git a/docs/Scheme-Language-Compliance.md b/docs/Scheme-Language-Compliance.md
new file mode 100644
index 00000000..d8ce6f9c
--- /dev/null
+++ b/docs/Scheme-Language-Compliance.md
@@ -0,0 +1,48 @@
+# R7RS Compliance
+
+This is the status of Scheme programming language features implemented from the [R7RS Scheme Specification](r7rs.pdf):
+
+Section | Status | Comments
+------- | ------ | ---------
+2.2 Whitespace and comments | Yes |
+2.3 Other notations | Yes |
+2.4 Datum labels | No |
+3.1 Variables, syntactic keywords, and regions | Yes |
+3.2 Disjointness of types | Yes |
+3.3 External representations | Yes |
+3.4 Storage model | Yes | No immutable types at this time.
+3.5 Proper tail recursion | Yes |
+4.1 Primitive expression types | Partial | `include` and `include-ci` are not implemented, although `include` may be specified as part of a library definition.
+4.2 Derived expression types | Partial |
+4.2.1 Conditionals | Yes |
+4.2.2 Binding constructs | Partial | Missing `let-values` and `let*-values`
+4.2.3 Sequencing | Yes | `begin` may not sequence properly as part of a library definition
+4.2.4 Iteration | Yes |
+4.2.5 Delayed evaluation | Yes |
+4.2.6 Dynamic bindings | | Not supported yet.
+4.2.7 Exception handling | Yes |
+4.2.8 Quasiquotation | Yes |
+4.2.9 Case-lambda | Yes |
+4.3 Macros | Yes | Support for `syntax-rules` and a lower-level explicit renaming macro system.
+5.1 Programs | Yes |
+5.2 Import declarations | Partial |
+5.3 Variable definitions | Partial | `define-values` is not implemented yet.
+5.4 Syntax definitions | Yes |
+5.5 Record-type definitions | Yes | Located in the `(srfi 9)` library.
+5.6 Libraries | Partial | Support is "good enough" but need to make it more robust
+5.7 The REPL | Yes |
+6.1 Equivalence predicates | Yes | `eqv?` is not implemented separately; it is just an alias for `eq?`
+6.2 Numbers | Partial | Only integers and reals are supported at this time.
+6.3 Booleans | Yes | `#true` and `#false` are not recognized by parser.
+6.4 Pairs and lists | Yes | `member` functions are predicates; `member` and `assoc` do not accept a `compare` argument.
+6.5 Symbols | Yes |
+6.6 Characters | Partial | No unicode support.
+6.7 Strings | Partial | No unicode support.
+6.8 Vectors | Yes |
+6.9 Bytevectors | Yes |
+6.10 Control features | Yes | `dynamic-wind` is limited, and does not work across calls to continuations.
+6.11 Exceptions | Partial | Exceptions are implemented, but error objects (and associated functions `error-object`, etc.) are not at this time.
+6.12 Environments and evaluation | Partial | Only `eval` is implemented at this time.
+6.13 Input and output | Partial | Functions do not differentiate between binary and textual ports. There is no support for input/output strings or bytevectors.
+6.14 System interface | Yes |
+
diff --git a/docs/Writing-the-Cyclone-Scheme-Compiler.md b/docs/Writing-the-Cyclone-Scheme-Compiler.md
new file mode 100644
index 00000000..60458de8
--- /dev/null
+++ b/docs/Writing-the-Cyclone-Scheme-Compiler.md
@@ -0,0 +1,174 @@
+# Writing the Cyclone Scheme Compiler
+
+###### by [Justin Ethier](https://github.com/justinethier)
+
+This document covers some of the background on how Cyclone was written, including aspects of the compiler and runtime system.
+
+Before we get started, I want to say **Thank You** to everyone who has contributed to the Scheme community. At the end of this document is a list of online resources that were the most helpful and influential. Without quality Scheme resources like these it would not have been possible to write Cyclone.
+
+In addition, developing [Husk Scheme](http://justinethier.github.io/husk-scheme) helped me gather much of the knowledge that would later be used to build Cyclone. In fact, the primary motivation in building Cyclone was to go a step further and understand not only how to write a Scheme compiler but also how to build a runtime system. Over time some of the features and understanding gained in Cyclone may be folded back into Husk.
+
+## Table of Contents
+
+- [Overview](#overview)
+- [Source-to-Source Transformations](#source-to-source-transformations)
+- [C Code Generation](#c-code-generation)
+- [C Runtime](#c-runtime)
+- [Data Types](#data-types)
+- [Interpreter](#interpreter)
+- [Macros](#macros)
+- [Scheme Standards](#scheme-standards)
+- [Future](#future)
+- [Conclusion](#conclusion)
+- [References](#references)
+
+## Overview
+
+Cyclone has a similar architecture to other modern compilers:
+
+
+
+First, an input file containing Scheme code is received on the command line and loaded into an abstract syntax tree (AST) by Cyclone's parser. From there a series of source-to-source transformations are performed on the AST to expand macros, perform optimizations, and make the code easier to compile to C. These intermediate representations (IR) can be printed out in a readable format to aid debugging. The final AST is then output as a `.c` file and the C compiler is invoked to create the final executable or object file.
+
+The code is represented internally as an AST of regular Scheme objects. Since Scheme represents both code and data using [S-expressions](https://en.wikipedia.org/wiki/S-expression), our compiler does not have to use custom abstract data types to store the code as would be the case with many other languages.
+
+## Source-to-Source Transformations
+My primary inspiration for Cyclone was Marc Feeley's [The 90 minute Scheme to C compiler](http://churchturing.org/y/90-min-scc.pdf) (also [video](https://www.youtube.com/watch?v=TxOM9Y5YrCs) and [code](https://github.com/justinethier/nugget/tree/master/90-min-scc)). Over the course of 90 minutes, Feeley demonstrates how to compile Scheme to C code using source-to-source transformations, including closure and continuation-passing-style (CPS) conversions.
+
+As outlined in the presentation, some of the difficulties in compiling to C are:
+
+> Scheme has, and C does not have
+> - tail-calls a.k.a. tail-recursion optimization
+> - first-class continuations
+> - closures of indefinite extent
+> - automatic memory management i.e. garbage collection (GC)
+>
+> Implications
+> - cannot translate (all) Scheme calls into C calls
+> - have to implement continuations
+> - have to implement closures
+> - have to organize things to allow GC
+>
+> The rest is easy!
+
+To overcome these difficulties a series of source-to-source transformations is used to remove powerful features not provided by C, add constructs required by the C code, and restructure/relabel the code in preparation for generating C. The final code may be compiled directly to C. Cyclone also includes many other intermediate transformations, including:
+
+- Macro expansion
+- Processing of globals
+- [Alpha conversion](https://wiki.haskell.org/Alpha_conversion)
+- [CPS conversion](https://en.wikipedia.org/wiki/Continuation-passing_style)
+- [Closure conversion](http://matt.might.net/articles/closure-conversion/)
+
+The 90-minute scc ultimately compiles the code down to a single function and uses jumps to support continuations. This is a bit too limiting for a production compiler, so that part was not used.
+
+## C Code Generation
+
+The compiler's code generation phase takes a single pass over the transformed Scheme code and outputs C code to the current output port (usually a `.c` file).
+
+During this phase C code is sometimes returned for later use instead of being output directly, for example when compiling a vector literal or a series of function arguments. In this case, the code is returned as a list of strings that separates variable declarations from C code in the "body" of the generated function.
+
+The C code is carefully generated so that a Scheme library (`.sld` file) is compiled into a C module. Functions and variables exported from the library become C globals in the generated code.
+
+## C Runtime
+A runtime based on Henry Baker's paper [CONS Should Not CONS Its Arguments, Part II: Cheney on the M.T.A.](http://www.pipeline.com/~hbaker1/CheneyMTA.html) was used as it allows for fast code that meets all of the fundamental requirements for a Scheme runtime: tail calls, garbage collection, and continuations.
+
+Baker explains how it works:
+
+> We propose to compile Scheme by converting it into continuation-passing style (CPS), and then compile the resulting lambda expressions into individual C functions. Arguments are passed as normal C arguments, and function calls are normal C calls. Continuation closures and closure environments are passed as extra C arguments. Such a Scheme never executes a C return, so the stack will grow and grow ... eventually, the C "stack" will overflow the space assigned to it, and we must perform garbage collection.
+
+Cheney on the M.T.A. uses a copying garbage collector. By using static roots and the current continuation closure, the GC is able to copy objects from the stack to a pre-allocated heap without having to know the format of C stack frames. To quote Baker:
+
+> the entire C "stack" is effectively the youngest generation in a generational garbage collector!
+
+After GC is finished, the C stack pointer is reset using [`longjmp`](http://man7.org/linux/man-pages/man3/longjmp.3.html) and the GC calls its continuation.
+
+Here is a snippet demonstrating how C functions may be written using Baker's approach:
+
+ object Cyc_make_vector(object cont, object len, object fill) {
+ object v = nil;
+ int i;
+ Cyc_check_int(len);
+
+ // Memory for vector can be allocated directly on the stack
+ v = alloca(sizeof(vector_type));
+
+ // Populate vector object
+ ((vector)v)->tag = vector_tag;
+ ...
+
+ // Check if GC is needed, then call into continuation with the new vector
+ return_closcall1(cont, v);
+ }
+
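+The control flow described above (functions that never return, followed by a stack reset with `longjmp`) can be illustrated with a much smaller sketch. This is not Cyclone's runtime: the names `sum_to`, `run_sum`, and `MAX_CALLS` are made up for this example, a simple call counter stands in for a real stack-limit check, and no objects are copied to a heap. It only shows how the pending continuation is saved and resumed once the stack has been unwound.
+
+```c
+#include <assert.h>
+#include <setjmp.h>
+
+/* Hypothetical stand-in for "the C stack is full". */
+#define MAX_CALLS 1000
+
+static jmp_buf trampoline;           /* saved near the base of the C stack */
+static long pending_n, pending_acc;  /* the continuation to resume after a reset */
+static int  calls_since_reset;
+static int  stack_resets;            /* how many times the stack was unwound */
+static long result;
+static int  done;
+
+/* CPS-style function: it never returns to its caller. */
+static void sum_to(long n, long acc) {
+    if (n == 0) {
+        result = acc;
+        done = 1;
+        longjmp(trampoline, 1);
+    }
+    if (++calls_since_reset >= MAX_CALLS) {
+        /* A real runtime would copy live objects to the heap here. */
+        pending_n = n;
+        pending_acc = acc;
+        longjmp(trampoline, 1);
+    }
+    sum_to(n - 1, acc + n);
+}
+
+/* Driver: establishes the jump target and (re)starts the computation. */
+static long run_sum(long n) {
+    assert(n >= 0);
+    pending_n = n;
+    pending_acc = 0;
+    done = 0;
+    stack_resets = 0;
+    if (setjmp(trampoline) != 0) {
+        if (done) return result;
+        stack_resets++;
+    }
+    calls_since_reset = 0;
+    sum_to(pending_n, pending_acc);
+    return result; /* not reached: sum_to always exits via longjmp */
+}
+```
+
+Calling `run_sum(2500)` in this sketch unwinds the stack twice before the final result is produced, so the recursion depth never exceeds `MAX_CALLS` frames no matter how large `n` is.
+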
+[CHICKEN](http://www.call-cc.org/) was the first Scheme compiler to use Baker's approach.
+
+## Data Types
+
+### Objects
+
+Most Scheme data types are represented as allocated "objects" that contain a tag to identify the object type. For example:
+
+ typedef struct {tag_type tag; double value;} double_type;
+
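+As a rough sketch of how such a tag might be used, the code below inspects the tag through a generic pointer before deciding how to read the rest of the object. The tag values, `integer_type`, `type_of`, and `to_double` are assumptions made for this example and are not taken from Cyclone's headers.
+
+```c
+#include <assert.h>
+
+/* Hypothetical tag values; a real runtime defines its own. */
+typedef int tag_type;
+enum { integer_tag = 1, double_tag = 2 };
+
+typedef struct {tag_type tag; int value;} integer_type;
+typedef struct {tag_type tag; double value;} double_type;
+
+/* Any object can be passed around as a generic pointer. */
+typedef void *object;
+
+/* The tag is the first member of every object, so it can be read
+   without knowing the object's full type. */
+static tag_type type_of(object obj) {
+    return *(tag_type *)obj;
+}
+
+/* Dispatch on the tag to convert a numeric object to a C double. */
+static double to_double(object obj) {
+    switch (type_of(obj)) {
+    case integer_tag: return (double)((integer_type *)obj)->value;
+    case double_tag:  return ((double_type *)obj)->value;
+    default:
+        assert(0 && "unknown tag");
+        return 0.0;
+    }
+}
+```
+
+Keeping the tag as the first member means every primitive can begin with the same cheap check, regardless of how large the object is.
+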
+### Value Types
+
+On the other hand, some data types can be represented using 30 bits or less and can be stored as value types using a technique from Lisp in Small Pieces. On many machines, addresses are multiples of four, leaving the two least significant bits free. [A brief explanation](http://stackoverflow.com/q/9272526/101258):
+
+> The reason why most pointers are aligned to at least 4 bytes is that most pointers are pointers to objects or basic types that themselves are aligned to at least 4 bytes. Things that have 4 byte alignment include (for most systems): int, float, bool (yes, really), any pointer type, and any basic type their size or larger.
+
+Due to the tag field, all Cyclone objects will have (at least) 4-byte alignment.
+
+Cyclone uses this technique to store characters. The nice thing about value types is they do not have to be garbage collected because no extra data is allocated for them.
+
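+The sketch below shows one way the two free low-order bits could be used. The specific encoding (`10` for characters, `00` for ordinary pointers) and the helper names are assumptions chosen for illustration; Cyclone's actual bit layout may differ.
+
+```c
+#include <assert.h>
+#include <stdint.h>
+
+typedef void *object;
+
+/* Assumed encoding for this sketch: low bits 00 = pointer, 10 = character. */
+#define VALUE_TAG_MASK ((uintptr_t)0x3)
+#define CHAR_TAG       ((uintptr_t)0x2)
+
+/* Anything with a nonzero low-bit pattern is an immediate value, not a pointer. */
+static int is_value_type(object x) {
+    return ((uintptr_t)x & VALUE_TAG_MASK) != 0;
+}
+
+static int is_char(object x) {
+    return ((uintptr_t)x & VALUE_TAG_MASK) == CHAR_TAG;
+}
+
+/* Pack a character code into the upper bits of a pointer-sized word. */
+static object make_char(uint32_t c) {
+    return (object)(((uintptr_t)c << 2) | CHAR_TAG);
+}
+
+/* Recover the character code by shifting the tag bits back out. */
+static uint32_t char_value(object x) {
+    assert(is_char(x));
+    return (uint32_t)((uintptr_t)x >> 2);
+}
+```
+
+Because the character lives entirely inside the word, there is nothing for the collector to trace or move; it only needs to test the low bits to know the value is not a heap or stack address.
+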
+## Interpreter
+
+The [Metacircular Evaluator](https://mitpress.mit.edu/sicp/full-text/book/book-Z-H-26.html#%_sec_4.1) from [SICP](https://mitpress.mit.edu/sicp/full-text/book/book.html) was used as a starting point for `eval`.
+
+## Macros
+
+[Explicit renaming](http://wiki.call-cc.org/explicit-renaming-macros) (ER) macros provide a simple, low-level macro system without requiring much more than `eval`. Many ER macros from [Chibi Scheme](https://github.com/ashinn/chibi-scheme) are used to implement the built-in macros in Cyclone.
+
+## Scheme Standards
+
+Cyclone targets the [R7RS-small specification](https://github.com/justinethier/cyclone/raw/master/docs/r7rs.pdf). This spec is relatively new and provides incremental improvements over the popular [R5RS spec](http://www.schemers.org/Documents/Standards/R5RS/HTML/). Library (C module) support is the most important addition, but R7RS also adds exceptions, system interfaces, and a more consistent API.
+
+## Future
+
+Andrew Appel used a similar runtime for [Standard ML of New Jersey](http://www.smlnj.org/) which is referenced by Baker's paper. Appel's book [Compiling with Continuations](http://www.amazon.com/Compiling-Continuations-Andrew-W-Appel/dp/052103311X) includes a section on how to implement compiler optimizations - many of which could be applied to Cyclone.
+
+## Conclusion
+
+From Feeley's presentation:
+
+> Performance is not so bad with NO optimizations (about 6 times slower than Gambit-C with full optimization)
+
+Compared to a similar compiler (CHICKEN), Cyclone's performance is worse but also "not so bad":
+
+ $ time cyclone -d transforms.sld
+
+ real 0m6.802s
+ user 0m4.444s
+ sys 0m1.512s
+
+ $ time csc -t transforms.scm
+
+ real 0m1.084s
+ user 0m0.512s
+ sys 0m0.380s
+
+Thanks for reading!
+
+Want to give Cyclone a try? Install a copy using [cyclone-bootstrap](https://github.com/justinethier/cyclone-bootstrap).
+
+
+## References
+
+- [CONS Should Not CONS Its Arguments, Part II: Cheney on the M.T.A.](http://www.pipeline.com/~hbaker1/CheneyMTA.html), by Henry Baker
+- [CHICKEN Scheme](http://www.call-cc.org/)
+- [Chibi Scheme](https://github.com/ashinn/chibi-scheme)
+- [Compiling Scheme to C with closure conversion](http://matt.might.net/articles/compiling-scheme-to-c/), by Matt Might
+- [Lisp in Small Pieces](http://pagesperso-systeme.lip6.fr/Christian.Queinnec/WWW/LiSP.html), by Christian Queinnec
+- [R5RS Scheme Specification](http://www.schemers.org/Documents/Standards/R5RS/HTML/)
+- [R7RS Scheme Specification](http://trac.sacrideo.us/wg/wiki)
+- [Structure and Interpretation of Computer Programs](https://mitpress.mit.edu/sicp/full-text/book/book.html), by Harold Abelson and Gerald Jay Sussman
+- [The 90 minute Scheme to C compiler](http://churchturing.org/y/90-min-scc.pdf), by Marc Feeley