random-state.net / Fastish CLOS slot typechecking (September 11th 2007)

Fastish CLOS slot typechecking
hacking, September 11th 2007

SBCL has typechecked CLOS slot writes in safe code since 1.0, but this has been fairly slow, because the typechecking requirement used to disable pretty much every notable optimization that SBCL could otherwise apply:

(declaim (optimize (safety 3)))

(defclass foo () 
  ((n :initform 1 :type fixnum :accessor n-of)))

(defmethod frob1 ((foo foo))
  (dotimes (j 1000)
    (setf (n-of foo) 0)
    (dotimes (i 30000)
      (setf (n-of foo) 
            (+ (the fixnum (n-of foo)) i)))))

(defmethod frob2 ((foo foo))
  (dotimes (j 1000)
    (setf (slot-value foo 'n) 0)
    (dotimes (i 30000)
      (setf (slot-value foo 'n) 
            (+ (the fixnum (slot-value foo 'n)) i)))))

(defvar *foo* (make-instance 'foo))

;; First calls pay for dispatch computation, 
;; which we are not interested in.
(frob1 *foo*)
(frob1 *foo*)
(frob2 *foo*)
(frob2 *foo*)

(time (frob1 *foo*))
(time (frob2 *foo*))

For 1.0.9 this gives:

Evaluation took:
  4.828 seconds of real time
  4.816301 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
Evaluation took:
  6.289 seconds of real time
  6.288393 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.

Yikes! That's slow! Using SAFETY 1, the numbers are:

Evaluation took:
  0.855 seconds of real time
  0.856053 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
Evaluation took:
  0.251 seconds of real time
  0.252016 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.

Quite a difference! Since 1.0.9.57, the most important SLOT-VALUE optimizations can be applied with typechecking as well, giving SAFETY 3 numbers like this:

Evaluation took:
  4.807 seconds of real time
  4.7963 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
Evaluation took:
  0.431 seconds of real time
  0.420026 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.

...which is pretty reasonable for SLOT-VALUE, but still not so great for the accessor. For completeness' sake, SAFETY 1 looks like this:

Evaluation took:
  0.747 seconds of real time
  0.744046 seconds of user run time
  0.004 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
Evaluation took:
  0.194 seconds of real time
  0.196013 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.

So, while slot accesses have become faster across the board, the typechecking accessors are still lagging far behind. Be advised.

Another thing I wanted to discuss was the general efficiency considerations regarding CLOS slot accesses:

SLOT-VALUE inside a method body, using a constant slot name, when its first argument is a specialized parameter of the method, is quite fast: basically two memory indirections, plus a possible typecheck-function call.
SLOT-VALUE using a variable slot name (in or out of a method body) is slow. You definitely want to avoid it in speed-critical code.
SLOT-VALUE using a constant slot name in all other cases is essentially as fast or slow as an accessor call: this applies to using it inside a normal function body, or inside a method body when its first argument is not a specialized parameter of the method. Not horribly slow, but not spectacularly fast either.
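A minimal sketch of the cases above. The class and function names (POINT, BUMP-FAST and friends) are hypothetical and only there for illustration. All of these do the same thing; the expected difference is purely in how well the slot access can be optimized.

```lisp
;; Hypothetical class, used only to illustrate the access patterns.
(defclass point ()
  ((x :initform 0 :type fixnum :accessor x-of)))

;; Case 1: constant slot name, and P is a specialized parameter of
;; the method. This is the fast case.
(defmethod bump-fast ((p point))
  (incf (slot-value p 'x)))

;; Case 2: the slot name is only known at run time. This is the slow
;; case, whether or not it appears inside a method body.
(defun bump-by-name (object slot-name)
  (incf (slot-value object slot-name)))

;; Case 3a: constant slot name, but in an ordinary function.
(defun bump-plain (object)
  (incf (slot-value object 'x)))

;; Case 3b: constant slot name inside a method body, but OTHER is not
;; a specialized parameter of the method.
(defmethod bump-other ((p point) other)
  (incf (slot-value other 'x)))

;; For comparison: a plain accessor call, which cases 3a and 3b
;; should roughly match in speed.
(defun bump-accessor (object)
  (incf (x-of object)))
```

In a hot loop, the first form is the one to aim for; the second is the one to keep out of it.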

This is not a guarantee, of course. While we're trying to make things as fast as possible, these three distinctions are likely to remain for a while at least. (There is definite optimization potential for accessor calls, and method calls in general, inside method bodies when the arguments are specialized parameters of the method, but there is no schedule for it at the moment.)

Finally, I'd like to note the bit of the CLOS spec that is the death-stroke to easy & fast typechecking slot writes: subclasses can have stricter types for their slots than their superclasses. This bit not only breaks is-a relationships, since instances of a subclass can then no longer be used interchangeably with instances of its superclasses, but also makes optimizations a lot harder, and in some cases fairly intractable.
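To make the problem concrete, here is a small sketch with hypothetical names (COUNTER, SMALL-COUNTER, STORE-BIG). The subclass redeclares the slot with a narrower type, so the effective slot type becomes the intersection of the two declarations. Code written against the superclass can then fail on a subclass instance. Note the assumption: the standard leaves a type-violating store undefined, so the second result depends on the implementation and its safety policy; SBCL in safe code would be expected to signal a TYPE-ERROR.

```lisp
(declaim (optimize (safety 3)))

;; Hypothetical classes: the superclass accepts any integer...
(defclass counter ()
  ((value :initform 0 :type integer)))

;; ...while the subclass narrows the same slot to fixnums only.
;; The effective type of VALUE here is (and integer fixnum).
(defclass small-counter (counter)
  ((value :type fixnum)))

;; Written with COUNTER in mind: storing a bignum is perfectly valid
;; for the superclass. Returns :STORED, or :TYPE-ERROR if the
;; implementation checks the slot type and rejects the value.
(defun store-big (object)
  (handler-case
      (progn
        (setf (slot-value object 'value) (1+ most-positive-fixnum))
        :stored)
    (type-error () :type-error)))

(store-big (make-instance 'counter))        ; valid store for an INTEGER slot
(store-big (make-instance 'small-counter))  ; violates the narrowed type
```

The same STORE-BIG has to check against a different type depending on the exact class of the instance, which is why the check cannot simply be fixed when the code is compiled.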

To future Lisp specification writers: please don't allow subclasses to restrict their slot types. It has no real payoff, and it makes many things a lot harder than they need to be.