Array Improvements #
hacking, May 18th 2009
I've just about completed a slew of array related optimizations in SBCL. Since the details are not all that interesting, I'll focus on couple of my favorite changes:
MAKE-ARRAY already had some compiler transforms that were able to take care of some aspects of it, but there were some IMO perfectly idiomatic cases that were dog slow:
;; the compiler didn't realize it could elide the call to LIST and use ;; it's arguments directly to initialize the array — ditto for VECTOR ;; and backquoted forms in place of LIST. (defun bar (x y z) (make-array 3 :initial-contents (list x y z)))
Previously this resulted in consing up the list at runtime, and then making a full call to MAKE-ARRAY, including keyword argument parsing and everything. I'll spare you the disassembly — pick up an pre 1.0.28.51 SBCL and see for yourself if you must.
Now, we are instead able to transform the above into something like this:
(lambda (x y z) (initialize-vector (allocate-vector ...) x y z))
..which in turn is open coded, resulting in an almost 90% speedup. If you were aware of this deficiency, and used explicit (SETF AREF) to initialize the array, you were manually doing what the compiler now does automatically — and will not see any performance improvements.
What's more, now we are also able to stack allocate arrays in the presence of :INITIAL-CONTENTS or :INITIAL-ELEMENT, including those allocated with a call to VECTOR.
FILL also now benefits from a better transform. If the array element type is known (and fills a bunch of criteria), we are able to fill much more efficiently by using "bit bashing". The implementation functions for this have existed in SBCL for quite a while now, thanks to Nathan Froyd, but were not used until now.
Consider eg. an array with element type (UNSIGNED-BYTE 8). By first filling a word with 4 (8 on 64 bit platforms) copies of the initial element, we can fill the array a word at a time — poor man's SIMD, if you will. This strategy was already used for filling bit vectors, but now it applies to all types where it is safe.
Upshot: upto 90% faster FILL.