random-state.net

Nikodemus Siivola

<< next | top | previous >>

What Does Extensible CAS Mean? #
hacking, November 13th 2011

Yesterday I said SBCL now had extensible CAS. Was sind paranormale Tonbandstimmen, what is CAS, and why should you care if it's extensible? Turn off the music and I'll tell you.

CAS is short for compare-and-swap. Compare-and-swap is a fairly common atomic operation. It compares the current value of a memory location with another value, and if they are the same it replaces the value of that memory location with a specified new value.

Depending on the language and the exact design of the interface, it might just return true for success, or it might return the old value. In SBCL it does the latter, which is sometimes very convenient, but also means you need to compare the return value to the expected-old-value you specified to know if CAS succeeded.

Because it is atomic, if you have two threads doing CAS on the same memory location in parallel, only one can succeed:

(let* ((x (list nil))
       (a (join-thread (make-thread (lambda () (cas (car x) nil :a)))))
       (b (join-thread (make-thread (lambda () (cas (car x) nil :b))))))
  ;; Because CAS is atomic, we know that exactly one of the threads
  ;; will succeed — but we can't know which beforehand.
  (cond ((not a)
         ;; A returned NIL, therefore it replaced the CAR with :A
	 ;; and therefore B must return :A and do nothing.
         (assert (and (eq :a (car x)) (eq :a b))))
        ((not b)
         ;; ...and vice versa.
         (assert (and (eq :b (car x)) (eq :b a))))
        (t
         (error "WTF? Broken CAS?"))))

If you have the least bit of threads on your mind, you can imagine how this can be quite useful.

Out of the box current bleeding edge SBCL supports CAS on a number places: car, cdr, first, rest, svref, slot-value, standard-instance-access, funcallable-standard-instance-access, symbol-value, symbol-plist, and defstruct-defined slot accessors with slot types fixnum and t. (Note: slot-value is not currently supported by CAS if slot-value-using-class or friends are involved — that's still in the works.)

With the exception of slot-value all of those pretty much come down to a single LOCK:CMPXCGH instruction on Intel architectures.

...but what it you have a data structure — say a queue of some sort — and want to implement cas-queue-head which does CAS on the first element of the queue. Fine. You can do that without any CAS support from the implementation by using eg. a lock.

...but what if you want to write a macro that operates on a CAS-able place?

(defmacro my-atomic-incf (place &optional (delta 1))
  "Spins till it succeeds in atomically incrementing PLACE by
DELTA. PLACE must be CASable."
  ;; OOPS! We're multiply-evaluating PLACE.
  (with-gensyms (old new n-delta))
    `(let* ((,old ,place)
            (,n-delta ,delta)
            (,new (+ ,old ,n-delta)))
      (loop until (eq ,old
                      (setf ,old (cas ,place ,old ,new)))
            do (setf ,new (+ ,old ,n-delta)))
      ,new))

Now imagine some hapless user doing:

(loop with i = 0
      while (< i (length vec))
      do (my-atomic-incf (svref vec (1- (incf i)))))

Where instead of I increasing by 1 each time through the loop and iterating across the whole vector, it could increase I by who-knows-how-many on a single attempt skipping entries and even running out of bounds. Ouch.

Turns out that to write a macro that operates on a CASable place you need something analogous to get-setf-expansion, except for CAS instead of SETF. As of yesterday, SBCL has sb-ext:get-cas-expansion that you can use to write a macro like my-atomic-incf correctly and safely.

(defmacro my-atomic-incf (place &optional (delta 1) &environment env)
  (multiple-value-bind (vars vals old new cas-form read-form)
      (sb-ext:get-cas-expansion place env)
    (with-gensyms (n-delta)
      `(let* (,@(mapcar 'list vars vals)
               (,old ,read-form)
               (,n-delta ,delta)
               (,new (+ ,old ,n-delta)))
          (loop until (eq ,old (setf ,old ,cas-form))
                do (setf ,new (+ ,old ,n-delta)))
          ,new))))

What's more, we've now have the notion of a generalized CASable place, just like Common Lisp has the notion of a generalized SETFable place.

This means that the person who has written cas-queue-head can use defcas, define-cas-expander, or even just:

(defun (sb-ext:cas queue-head) (old new queue)
  (cas-queue-head queue old new))

to make queue-head a CASable place on equal footing with the baked-in ones — so that

(my-atomic-incf (queue-head queue))

will Just Work. (Assuming your cas-queue-head works and is atomic, of course. The system will not magically make it atomic — meaning you will either have to use CAS internally, or lock, or... well, it depends on what you're doing.)

I think that's pretty nifty. I'm still looking at adding support for (cas slot-value-using-class), which will be even niftier. Who says there's no innovation in open source? (Maybe I'm feeling a bit hubristic right now. I'll come down soon enough when the first bug-reports hit the fan.)

Feel free to turn Laurie Anderson back on now.