random-state.net

Nikodemus Siivola

Streams in SBCL, part I
hacking, May 8th 2006

Presently, SBCL streams look something like this (Gray and Simple Streams omitted for clarity).

One thing that is pretty clear from the image is the central role sb-kernel:ansi-stream plays in the implementation: almost all dispatching works through it. What is less clear is that in reality sb-sys:fd-stream is at least as important: almost all interesting non-standard streams encapsulate an fd-stream and talk to the operating system through it.

I find both problematic. First, ansi-stream: even though it nicely centralizes the dispatching needs of the various streams, it is too low-level and includes buffering. Second, fd-stream: (1) it is a subtype of file-stream, (2) it fails to properly differentiate between different kinds of file descriptors (or handles, in Windows parlance), and (3) it isn't a particularly good abstract low-level IO interface, which is what it gets used for.

Stepping back from the implementation details, I believe there are four fundamentally different issues here that, while intimately interleaved, should still not be conflated:

1. ANSI Common Lisp stream interface.
2. User-extensible stream interfaces (Gray and Simple Streams, and future innovations).
3. Stream implementation strategy: built-in streams, and dispatching from 1 to both 2 and built-in streams.
4. Low-level IO interface to the operating system.

Try not to read too much meaning into the layout there, as it is fairly arbitrary: I just wanted to avoid building them into a neat little stack of layers, because I don't think that is the most fruitful way to think about this. The numbering is purely for ease of reference.

Number 1 is a constraint we must satisfy, but can't do much about. The noteworthy aspects here are the (lack of) provisions for non-blocking IO, various bits and pieces of newline magic (not just the CR/LF things, but stuff like fresh-line too), and external formats in their encoding aspect (the non-newline things).
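To make the fresh-line part of that "newline magic" concrete, here is a minimal illustration using only standard Common Lisp. The function name fresh-line-demo is made up for this example. The point is that the stream has to remember whether it is at the start of a line, which is state the implementation must keep somewhere.

```lisp
;; Standard Common Lisp only; FRESH-LINE-DEMO is an illustrative name.
(defun fresh-line-demo ()
  (with-output-to-string (s)
    (write-string "abc" s)
    (fresh-line s)            ; mid-line, so a newline is written
    (fresh-line s)            ; at the start of a line: SBCL writes nothing
    (write-string "def" s)))
```

In SBCL, where string output streams track the current column, this returns "abc", a single newline, and then "def". The standard allows an implementation that cannot determine the column to emit a newline anyway.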

Number 2 is "just" more constraints to satisfy, but while thinking about all this we should keep in mind the things that could be done better here. I firmly believe Gray and Simple Streams are not the end of the story.

Number 3 is where we get to satisfy all these constraints. More about it later.

Number 4 is something else. In order to talk to the operating system we could just hang on to an fd and call the appropriate foreign functions on it when necessary. However, this being Lisp and SBCL, we (or at least I) would like to provide an interface that programmers needing low-level access could use, instead of having to choose between fd-stream and direct foreign calls. I'm not sure about all the details here, but here's a tentative list of requirements for this beast:

  • Both binary and character IO.
  • Needs to understand character encodings, but not newline policies.
  • Needs to understand the fundamental differences between different file descriptors (block devices, sockets, etc). There seem to be five or so different kinds of IO devices around these days, so supporting all of them sensibly doesn't appear too overwhelming.
  • Multiplexed IO.
  • Both blocking and non-blocking IO.
  • Buffering.

Things like byte-order control and IO of arrays of IEEE floats would be nice too, but less crucial in my view.
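As a rough illustration of what such an interface might look like, here is a hypothetical CLOS sketch. None of the names below (io-device, device-write-octets, device-write-string, memory-device) exist in SBCL; they are assumptions made up for this example. The only real SBCL call used is sb-ext:string-to-octets. Only the write side is shown, with an in-memory device standing in for a real file descriptor so the sketch runs without any foreign calls.

```lisp
;;; Hypothetical sketch: these class and function names are NOT part of SBCL.
(defclass io-device ()
  ((kind :initarg :kind :reader device-kind)        ; e.g. :file, :socket, :pipe
   (blocking-p :initarg :blocking-p :initform t
               :accessor device-blocking-p)         ; blocking vs non-blocking
   (external-format :initarg :external-format :initform :utf-8
                    :reader device-external-format)))

(defgeneric device-write-octets (device octets)
  (:documentation "Write OCTETS to DEVICE and return the number written."))

;; Character IO layered on binary IO: encoding only, no newline policy.
(defun device-write-string (device string)
  (device-write-octets
   device
   (sb-ext:string-to-octets
    string :external-format (device-external-format device))))

;; In-memory stand-in for a real descriptor.
(defclass memory-device (io-device)
  ((octets :initform (make-array 0 :element-type '(unsigned-byte 8)
                                   :adjustable t :fill-pointer t)
           :reader device-octets))
  (:default-initargs :kind :memory))

(defmethod device-write-octets ((device memory-device) octets)
  (loop for octet across octets
        do (vector-push-extend octet (device-octets device)))
  (length octets))
```

For example, writing (string (code-char 233)) to a fresh memory-device with the default :utf-8 external format stores the two octets 195 and 169. A real device class would presumably specialize on the kind of descriptor and add reading, buffering, and multiplexing, none of which is attempted here.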

Sounds like a tall order? Good news is that most of the design can