
Extending deftype and company for CLR

dmiller edited this page Sep 13, 2010 · 12 revisions

A work in progress

A contemplation

Clojure has been adding a variety of new methods for defining types and interfaces, either directly or indirectly. Here’s a list:

  • proxy
  • gen-class
  • gen-delegate (CLR only) (rename proposed below)
  • gen-interface
  • definterface
  • reify
  • deftype
  • defrecord
  • defprotocol

Although ClojureCLR has implementations of all of these, the implementations are at present inadequate to handle the full complexity of method signatures that the CLR presents.

Expressing just a bit of envy: the biggest complexity that the JVM seems to present here is covariant return types. Some of the functions above are forced to define bridge methods to handle this. ClojureJVM developers should appreciate this more. :)

Here’s what needs to be handled on the CLR side:

  • properties
  • by-ref parameters (ref and out parameters in C#)
  • explicit interface implementation
  • true generics
  • indexers

I have handled properties and by-ref parameters properly in host expressions. I have handled none of these in the macros above.

I have some proposals and some questions. I list them here, explain them below.

  • Replace the use of refparam and outparam with by-ref in host expression syntax.
  • Allow the use of by-ref in method signatures in all of these macros.
  • Allow property definition where we now allow method definition (gen-interface and company).
  • Allow property implementation where we now allow method definition (gen-class, deftype and company).
  • Use |-escaping for symbol names to allow reference to generic types and other problematic type names.
  • Do not allow (for now) the definition of new interfaces and classes that have type parameters (true generic types).
  • Ignore indexers (for now).
  • Rename gen-delegate to something like delegate-proxy to reflect that the parallel is to proxy, not gen-class.
  • Introduce defdelegate to define new delegate types. (Alternate name: gen-delegate.)

Alternatives are proposed for some of the syntactic forms. See below.

Host expressions set the stage

We have handled by-ref parameters in host expressions by means of a special syntactic form:

(. c (m1 x (refparam y)))

This will match a method on c named m1 taking two arguments, the second of which must be by-ref. The y must be a locally bound variable. Type hints on x or y will be used to disambiguate method calls, potentially avoiding reflection at runtime.

Properties are conflated with zero-arity methods. Disambiguation will look for a field, then a property, then a zero-arity method.
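The lookup order just described can be illustrated with plain data. This is a minimal sketch, not the ClojureCLR resolver: the member maps, `lookup-order`, and `resolve-member` are all hypothetical names made up for the illustration.

```clojure
;; Hypothetical data model: each member is a map with :kind and :name,
;; plus :arity for methods. This does not mirror ClojureCLR internals.
(def lookup-order [:field :property :method])

(defn zero-arity-candidate?
  "Fields and properties always qualify; methods only when they take no arguments."
  [member]
  (or (not= :method (:kind member))
      (zero? (:arity member))))

(defn resolve-member
  "Return the first member called member-name, trying fields, then
  properties, then zero-arity methods. Returns nil if nothing matches."
  [members member-name]
  (first
    (for [kind   lookup-order
          member members
          :when  (and (= kind (:kind member))
                      (= member-name (:name member))
                      (zero-arity-candidate? member))]
      member)))

(def sample-members
  [{:kind :method   :name "Count" :arity 0}
   {:kind :property :name "Count"}
   {:kind :method   :name "Add"   :arity 1}])

;; The property entry wins over the zero-arity method of the same name.
(resolve-member sample-members "Count")
```

Because the outer loop runs over `lookup-order`, the order in which members are listed does not affect which kind is chosen.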

See CLR Interop for more details.

Currently, refparam and outparam are both used. This parallels C#. Internally, the CLR only has ByRef parameters, so the C# distinction between ref and out parameters is irrelevant in Clojure. I propose dropping refparam and outparam in favor of by-ref. (Can’t use ref, unfortunately.)
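If the rename goes ahead, existing host expressions could be migrated mechanically. The following is a minimal sketch that treats forms as quoted Clojure data; `normalize-by-ref` and `legacy-by-ref-markers` are hypothetical helpers for illustration, not part of ClojureCLR.

```clojure
(require '[clojure.walk :as walk])

;; The two marker symbols the proposal would retire.
(def legacy-by-ref-markers '#{refparam outparam})

(defn normalize-by-ref
  "Rewrite every (refparam x) or (outparam x) subform into (by-ref x).
  All other forms are returned unchanged."
  [form]
  (walk/postwalk
    (fn [x]
      (if (and (seq? x) (contains? legacy-by-ref-markers (first x)))
        (cons 'by-ref (rest x))
        x))
    form))

;; The host expression from above, rewritten to the proposed syntax.
(normalize-by-ref '(. c (m1 x (refparam y))))
;; => (. c (m1 x (by-ref y)))
```

The rewrite works on any nesting depth because `postwalk` visits every subform.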

Defining new methods

We can classify the macros listed above by whether they define new methods or only refer to methods defined elsewhere, in an underlying interface or base class.

The forms gen-interface, gen-class, definterface and defprotocol are the only ‘definers’. At present, definterface and defprotocol are defined in terms of gen-interface, so compatible solutions to the following questions should be pursued. gen-class is such a mish-mash that I plan to ignore it for now.

When defining a new interface via gen-interface and company:

  1. Should we be able to define properties?
  2. Should we be able to define methods with by-ref parameters?

Defining properties would not be difficult to implement. We need only have a syntactic indication of this. This will take different forms for each of the three. See below for a suggestion.

Defining methods with by-ref parameters is also not difficult to implement.

Here is how properties and by-ref could be handled in each of the three method-defining forms. Note that this solution assumes a getter and a setter are assigned for each property.

(gen-interface :name I1
    :extends [I2 I3]
    :methods [
      [m1 [Object Int32] String]           ; normal
      [m2 [Object (by-ref Int32)] String]  ; taking a by-ref parameter
      [m3 [] String]                       ; zero-arity method
      [m4 String]                          ; property
    ])

(definterface I1
  (#^String m1 [x #^int y])           ; normal
  (#^String m2 [x (by-ref #^int y)])  ; taking a by-ref parameter
  (#^String m3 [])                    ; zero-arity method
  (#^String m4)                       ; property
  )
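To make the gen-interface convention concrete: under the proposal, a two-element entry in :methods would denote a property, a three-element entry a method, and a (by-ref T) item inside the parameter vector a by-ref parameter. The sketch below classifies such entries as data. `classify-member` and the shape of the maps it returns are assumptions made for illustration, not an existing API.

```clojure
(defn by-ref-spec?
  "True when a parameter spec has the proposed (by-ref Type) shape."
  [param]
  (and (seq? param) (= 'by-ref (first param))))

(defn classify-param
  [param]
  (if (by-ref-spec? param)
    {:type (second param) :by-ref true}
    {:type param :by-ref false}))

(defn classify-member
  "Classify one entry of a gen-interface :methods vector.
  [name type]             -> property (proposed syntax)
  [name [params...] type] -> method"
  [spec]
  (case (count spec)
    2 (let [[member-name type] spec]
        {:kind :property :name member-name :type type})
    3 (let [[member-name params return-type] spec]
        {:kind   :method
         :name   member-name
         :return return-type
         :params (mapv classify-param params)})))

;; The four entries from the gen-interface example, as quoted data.
(map classify-member
     '[[m1 [Object Int32] String]
       [m2 [Object (by-ref Int32)] String]
       [m3 [] String]
       [m4 String]])
```

Note that the zero-arity method `m3` and the property `m4` are told apart purely by the presence of the empty parameter vector.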

Protocols are a bit trickier for properties, syntactically. The best solution depends on the answer to the following question. Can there be a method and a property with the same name?

C# does not allow a class or interface to have a method with the same name as a property. CLR/MSIL does.

If the answer is no, then the solution is easy:

(defprotocol [a b]
  (#^String m1
     [x #^int y]           ; normal
     [x (by-ref #^int y)]  ; taking a by-ref parameter
     [])                   ; zero-arity method
                           ; can't overload a property on m1
  (#^String m4)            ; property
)

If the answer is yes, then we can overload a property and a zero-arity method. We cannot use [] as a marker for properties. We need some other indicator. We could do any of a number of things. Here’s one possibility:

(defprotocol [a b]
  (#^String m1
     [x #^int y]           ; normal
     [x (by-ref #^int y)]  ; taking a by-ref parameter
     []                    ; zero-arity method
     :property)            ; property
)

Implementing methods

The remaining macros implement methods. They need to indicate the signature of the method being implemented.

By-ref parameters

Signatures of by-ref methods are handled as in definterface:

(reify P1
  (m1 [x #^int y] ...)           ; normal
  (m2 [x (by-ref #^int y)] ...)  ; taking a by-ref parameter
  (m3 [] ...)                    ; zero-arity method or property
)

Properties

For properties, we would have to distinguish defining getters and setters. We could do this with something along these lines:

(reify  P1 
  ...
  (m4 :get ... )
  (m4 :set [value] ...)
)

or

(reify  P1 
  ...
  (:get m4 ... )
  (:set m4  [value] ...)
)

It would be simpler to define the getters and setters directly:

(reify P1
  ...
  (get_m4 [] ... )
  (set_m4 [value] ...)
)

However, unless we explicitly define the property m4, the resulting class will not have an m4 property and reflection won’t work properly.
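The keyword form and the direct accessor form are closely related: the first could be desugared into the second, with the macro also recording that a property must be emitted. A minimal sketch of that rewrite on quoted forms follows, assuming the `(m4 :get ...)` / `(m4 :set [value] ...)` variant and the CLR `get_`/`set_` accessor naming. `expand-accessor` is a hypothetical helper, and the sketch does not emit the property metadata itself.

```clojure
(defn expand-accessor
  "Rewrite (prop :get body...) to (get_prop [] body...) and
  (prop :set [value] body...) to (set_prop [value] body...).
  Ordinary method forms are returned unchanged."
  [[prop kind & more :as form]]
  (case kind
    :get (list* (symbol (str "get_" prop)) [] more)
    :set (list* (symbol (str "set_" prop)) more)
    form))

(expand-accessor '(m4 :get (current-value)))
;; => (get_m4 [] (current-value))

(expand-accessor '(m4 :set [value] (store! value)))
;; => (set_m4 [value] (store! value))
```

In a real implementation the macro would still need to remember which names came from :get or :set forms, so that the generated class carries an actual `m4` property for reflection.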

Explicit implementation

Explicit implementation is required when you want to give different implementations to methods from different interfaces that have the same signature.

(definterface I1 
   (#^String m1 [#^int x] ))

(definterface I2
   (#^int m1 [#^String x]))

Then the following is not going to work:

(reify 
  I1 
  (m1 [x] ...)
  I2
  (m1 [x] ...))

One of the two needs to be an explicit implementation:

(reify
  I1 
  (m1 [x] ...)
  I2
  (I2.m1 [x] ...))

Note that the name will have to be fully-qualified.
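A macro supporting this would first need to know which method names clash across the listed interfaces. The following sketch finds them from plain data; `colliding-methods` and the interface-to-method-names map are hypothetical, chosen only for illustration, and the sketch compares names only, not full signatures.

```clojure
(defn colliding-methods
  "Given a map of interface symbol -> collection of method-name symbols,
  return a map of method name -> set of interfaces declaring it, restricted
  to names declared by more than one interface."
  [iface->methods]
  (->> (for [[iface method-names] iface->methods
             method-name method-names]
         [method-name iface])
       (group-by first)
       (keep (fn [[method-name pairs]]
               (when (> (count pairs) 1)
                 [method-name (set (map second pairs))])))
       (into {})))

;; m1 is declared by both interfaces, so at least one implementation
;; would have to be written in the explicit I2.m1 style.
(colliding-methods '{I1 [m1 m2], I2 [m1 m3]})
;; => {m1 #{I1 I2}}
```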

Sounds simple enough.

Generics

Definitions of generic types and interfaces

Should definterface, gen-class, and defprotocol allow the introduction of type parameters so that generic types can be defined?

For now, I say no. I’m willing to revisit this, but I haven’t had time to think through all the complications this is likely to present.

References to generic types

References to generic interfaces are required in reify, deftype, defrecord and proxy. Type names, especially those of instantiated generic types, can contain characters that are not legal in symbols, and resolve requires symbols. We therefore need a syntax for writing symbols with such names.

ClojureCLR already has |-escaping for symbols. (Though it is not well-documented.)

|anything except a vertical bar...<>@#$@#$#$|

To include a vertical bar in a symbol name that is |-escaped, use a doubled vertical bar.

|This has a vertical bar in the name ...  || ...<>@#$@#$#$|

Note that the |-escaping defined in ClojureCLR differs from the similar mechanism in Common Lisp in several ways:

  • We only support the use of | to indicate escaping at the beginning of a symbol name. Thus |123abc| has name “123abc”, but abc|1|23 has name “abc|1|23”. (In Common Lisp, the name would be “abc123”.)
  • Common Lisp allows a literal vertical bar in a symbol name with backslash-escaping: abc\|123 has name “abc|123”. ClojureCLR uses a doubled vertical bar, and only within an escaped symbol name. We would write |abc||123| for the symbol with name “abc|123”.
  • Common Lisp allows | to turn escaping on and off within a name. Thus, in Common Lisp you could write abc|<>|123|<>|pqr for the symbol with name “abc<>123<>pqr”. In ClojureCLR you would write |abc<>123<>pqr|.
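The rules above are small enough to express as two string functions. This is a sketch of the rules as stated on this page, not the ClojureCLR reader: `symbol-name-of` and `bar-escape` are hypothetical names, and the code assumes it is handed a complete, well-formed token.

```clojure
(require '[clojure.string :as str])

(defn bar-escaped?
  "True when the token starts and ends with a vertical bar.
  Escaping is only recognized at the beginning of a symbol name."
  [token]
  (and (>= (count token) 2)
       (= \| (first token))
       (= \| (last token))))

(defn symbol-name-of
  "Return the symbol name a token denotes: strip the enclosing bars and
  collapse each doubled bar to a single one. Unescaped tokens are
  returned as-is, embedded bars included."
  [token]
  (if (bar-escaped? token)
    (-> token
        (subs 1 (dec (count token)))
        (str/replace "||" "|"))
    token))

(defn bar-escape
  "Inverse direction: wrap a name in bars, doubling any bar it contains."
  [symbol-name]
  (str "|" (str/replace symbol-name "|" "||") "|"))

(symbol-name-of "|123abc|")     ; => "123abc"
(symbol-name-of "abc|1|23")     ; => "abc|1|23"
(symbol-name-of "|abc||123|")   ; => "abc|123"
(bar-escape "abc<>123<>pqr")    ; => "|abc<>123<>pqr|"
```

Something like `bar-escape` is also what readable printing of such symbols would need, since `(symbol-name-of (bar-escape s))` returns `s` for any string.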

I have not yet implemented print-readably for symbols with bad characters. (Nor has ClojureJVM.) This is straightforward.

Indexers

Indexers are going to create one more level of syntactic confusion. For now, just use the getters and setters directly.

gen-delegate naming problem

gen-delegate is named parallel to gen-interface and gen-class. This makes an incorrect parallel. The latter two define new interfaces and classes, respectively. gen-delegate creates an instance of a delegate. This function is more similar to reify and proxy.

gen-delegate should be renamed to something like reify-delegate or proxy-delegate.

However, there is reason to allow a parallel to gen-interface and gen-class, or to deftype and definterface. We may need to create new delegate types on the fly. Either a gen-delegate or a defdelegate should be introduced. (I vote for the latter name.)
