random-state.net

Nikodemus Siivola

<< next | top | previous >>

MAKE-INSTANCE optimizations #
hacking, June 29th 2009

SBCL has always had a fairly efficient MAKE-INSTANCE — as long as the class argument is a quoted symbol and all the keywords in initargs are constant as well.

This is due to what are called ctors. Very briefly, a call such as

(make-instance 'point :x x :y y)

is transformed into something along the lines of

(let ((#:ctor (load-time-value (ensure-ctor 'foo :x :y))))
  (funcall #:ctor x y))

where CTOR is a funcallable instance. The first time it is called, an optimized constructor is generated, saved as the funcallable instance function, and finally called with the provided arguments. Later calls get the optimized constructor ready made.

There are multiple efficiency gains here: initargs can be checked for validity at compile time, and (assuming certain things hold true about the metaclass of the class and its methods) essentially the whole MAKE-INSTANCE -> ALLOCATE-INSTANCE | INITIALIZE-INSTANCE -> SHARED-INITIALIZE call tree can be open coded, and slot accesses can be performed using STANDARD-INSTANCE-ACCESS with constant slot locations. This all works with class redefinitions as ctors are updated automatically.

Unfortunately, this made the performance of non-constant class arguments to MAKE-INSTANCE extremely unattractive in comparison: 10-100 times slower is fairly typical. Ouch!

At ELS 2009 Didier Verna asked if we could do something similar for the non-constant case as well. Turns out we can:

(make-instance class :x x :y y)

can become something like

(let* ((#:cache (load-time-value (cons 'ctor-cache nil)))
       (#:store (cdr #:cache)))
  (multiple-value-bind (#:ctor #:new-store)
      (ensure-cached-ctor class :x :y #:store)
    (setf (cdr #:cache) #:new-store)
    (funcall #:ctor x y)))

where the ctor corresponding to the class argument is first looked for in the cache: if it is not found a new one is generated and cached.

This was implemented in SBCL 1.0.29.41; now the non-constant class arguments performs much better, being roughly only 1.15-1.25 times slower than the constant ones. Hopefully this will make a real difference in some applications out there!

A bit later another ctor optimization improvement was implemented, improving cases where the full open coding could not be done due to defined methods, but where we are still able to check initargs at compile time and replace the MAKE-INSTANCE generic function call with a slightly faster normal function call to a simplified constructor function: this makes such calls 4-10 times slower than the fast path, but still considerably faster than the worst case scenario.

The message to take home is actually not about SBCL optimizations — as much as I like to talk about them: I think ctors are an excellent example of compiler macros doing inline caching and specialization. A nice technique that could probably be used more often than it is.

Note: SBCL inherited its ctor optimizations from CMUCL: they are the work of the estimable Gerd Moellmann. Things described above do not replace his work, just extend it around the edges. Those interested in the gory details can look at src/pcl/ctor.lisp in the SBCL source tree.