lispcommon-lispclisp

In Common Lisp, how would you extend the existing comparator operations to work for new types?


I'm attempting to implement a lisp version of a domain specific language that uses the binary operators and comparators on discrete probability distributions. Is there a safe(ish) and concise way to extend the functionality of these atoms < <= = >= > + - / * to work with new types, without completely bricking the language?

I realize just making new operator names would be the easiest solution, but that answer is not compelling nor fun, and certainly doesn't live up to lisp's reputation as the programmable programming language.

For example, say we have the hash-table:

((4 . 1/4) (3 . 1/4) (2 . 1/4) (1 . 1/4))

which we create with the macro (d 4), and we want to be able to compare it to a number like so

(< (d 4) 3)

which we would want to return a hash-table representing a 50% chance it's true and 50% chance it's false, say in this form:

((0 . 1/2) (1 . 1/2))

but currently it produces the error: *** - <: #S(HASH-TABLE :TEST FASTHASH-EQL (4 . 1/4) (3 . 1/4) (2 . 1/4) (1 . 1/4)) is not a real number


Solution

  • There are three answers to this:

    1. no, you can't do it because + &c are not generic functions in CL and you may not redefine symbols exported from the CL package ...
    2. ... but yes, of course you can do it if you're willing to accept a small amount of inconvenience, because Lisp is a programmable programming language ...
    3. ... but no, in fact you can't completely do it if you want to preserve the semantics of your extended language in fact, and there isn't anything you can do about that.

    So let's look at these in order.

    (1): The arithmetic operators are not generic functions in CL ...

    This means you can't extend them: they work on numbers. And you can't redefine them because the language forbids that.

    (2): ... but of course you can get around that ...

    So, the first trick is we want to define a package in which the arithmetic operators are not the CL versions of them. We can do this manually, but we'll use Tim Bradshaw's conduits system to make life easier:

    (in-package :org.tfeb.clc-user)
    
    (defpackage :cl/generic-arith
      ;; The package people will use
      (:use)
      (:extends/excluding :cl
       #:< #:<= #:= #:= #:> #:+ #:- #:/ #:*)
      (:export
       #:< #:<= #:= #:= #:> #:+ #:- #:/ #:*))
    
    (defpackage :cl/generic-arith/user
      ;; User package
      (:use :cl/generic-arith))
    
    (defpackage :cl/generic-arith/impl
      ;; Implementation package
      (:use :cl/generic-arith))
    

    So now:

    > (in-package :cl/generic-arith/user)
    #<The CL/GENERIC-ARITH/USER package, 0/16 internal, 0/16 external>
    
    > (describe '*)
    
    * is a symbol
    name          "*"
    value         #<unbound value>
    function      #<unbound function>
    plist         nil
    package       #<The CL/GENERIC-ARITH package, 0/1024 internal, 978/1024 external>
    

    So now we can get on with defining these things. There are some problems: because what we've done is create an environment where, for instance * is not cl:*, then anything which uses * as a symbol will not work. So for instance cl:* is bound the the value of the last form evaluated in the repl, and typing * isn't going to get that symbol: you'll need to type cl:*.

    Similarly the + method combination will not work: you'd have to explicitly say

    (defgeneric foo (...)
      ...
      (:method-combination cl:+)
      ...)
    

    So there is some pain. But not, probably, that much pain.

    And now we can start implementing things. The obvious approach is to define generic functions which are 2-arg versions of the functions we need, and then define the functions on top of them. That means we can use things like compiler macros for the actual operators without fear of breaking the underlying generic functions.

    So start by

    (in-package :cl/generic-arith/impl)
    
    (defgeneric +/2 (a b)
      (:method ((a number) (b number))
       (cl:+ a b)))
    
    ...
    

    And now you think for a bit and realise that ...

    (3) ... but there are problems you can't get around, in any language

    The problem is at least this: + is one of the two operators on a field, and the way it is defined in CL in terms of some two-argument function like +/2 above is something like this (this may be wrong implementationally but it's how the maths would go):

    And that looks fine but we've missed a case:

    Oh, wait: what field? For cl:+ this is fine: it's the identities of the field of numbers. But now? Who knows? And there's nothing that can be done about that: if you want + to work the way it does in CL, except on other fields, you can't have that.

    In general if + and * are to represent the two operations of a field, then if zero-argument versions of those operators are allowed those should return the two identities of the field (which they do in CL). But that means that all fields they are defined for must share those two identities (which, again, they do in CL: the fields of rationals & complex numbers share identities). This means in turn that (+ x 3) (from (+ x (*) (*) (*))) must be allowed (and so on). Or you either disallow the zero-argument case, or that they represent the field operations, or both.