Extending deftype and company for CLR
Clojure has been adding a variety of new methods for defining types and interfaces, either directly or indirectly. Here’s a list:
- proxy
- gen-class
- gen-delegate (CLR only) (rename proposed below)
- gen-interface
- definterface
- reify
- deftype
- defrecord
- defprotocol
Although ClojureCLR has implementations of all of these, the implementations are at present inadequate to handle the full complexity of method signatures in the CLR.
Here’s a partial list of what needs to be handled on the CLR side:
- properties
- by-ref parameters (ref and out parameters in C#)
- explicit interface implementation
- true generics
- indexers
I have handled properties and by-ref parameters in host expressions. I have handled none of these in the macros above.
I think the following need to be done:
- Get rid of outparam.
- Allow the use of refparam (or its replacement) in method signatures in all of these macros for implementing methods.
- Use |-escaping for symbol names to allow reference to generic types and other problematic type names.
- Rename gen-delegate to something like delegate-proxy (name TBD) to reflect that the parallel is to proxy, not gen-class.
- Introduce defdelegate to define new delegate types. (Alternate name: gen-delegate.)
I think the following should be done, but realize these are more debatable and seek input:
- Rename refparam to be by-ref.
- Allow the use of refparam (or its replacement) in method signatures in all of these macros for defining methods.
- Allow property definition where we now allow method definition (gen-interface and company).
- Allow property implementation where we now allow method definition (deftype and company, gen-class to be ignored for now).
I think there are some things that we should avoid for now:
- Do not allow (for now) the definition of new interfaces and classes that have type parameters (true generic types).
- Ignore indexers (for now).
The syntax for by-ref parameters should follow that for host expressions. The syntax for representing properties is not obvious for some of the macros. Below, I propose alternatives in some cases, and make a recommendation for one alternative in each case.
We have handled by-ref parameters in host expressions by means of a special syntactic form:
(. c (m1 x (refparam y)))
This will match a method on c named m1 taking two arguments, the second of which must be by-ref. The y must be a locally bound variable. Type hints on x or y will be used to disambiguate method calls, potentially avoiding reflection at runtime.
Properties are conflated with fields and zero-arity methods. The lookup method (at compile or runtime) will look for a field, then a property, then a zero-arity method.
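As a rough illustration of that lookup order, here is a minimal sketch in plain Clojure. The map-based type description and the names member-kind, :fields, :properties and :zero-arity-methods are hypothetical; they are not part of ClojureCLR, which works against real CLR reflection data.

```clojure
;; Hypothetical model: a type is described by three sets of member names.
(defn member-kind
  "Returns the kind of member that NAME resolves to, trying fields first,
  then properties, then zero-arity methods. Returns nil if nothing matches."
  [type-info member-name]
  (some (fn [kind]
          (when (contains? (get type-info kind #{}) member-name)
            kind))
        [:fields :properties :zero-arity-methods]))

;; Assumed example data, for illustration only.
(def example-type
  {:fields             #{"count"}
   :properties         #{"count" "Length"}
   :zero-arity-methods #{"Length" "ToString"}})
```

With this data, "count" resolves as a field even though a property of the same name exists, and "Length" resolves as a property before the zero-arity method is considered.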
See CLR Interop for more details.
Proposal: remove outparam.
Currently, refparam and outparam are both used. This parallels C#. Internally, the CLR only has ByRef parameters. The C# distinction between ref and out params is irrelevant in Clojure.
Proposal: rename refparam.
I propose renaming refparam to by-ref. I’m a little hesitant on this one, because of the existence of Refs. If one distinguishes terminologically between parameters and arguments, then refparam is not quite accurate. The CLR designation for this is ByRef. Echoing that makes sense.
Certain of the macros above, namely proxy, reify, deftype and defrecord, implement methods (as opposed to something like definterface, which defines methods for others to implement). They need to indicate the signature of the method being implemented.
The signatures that can be indicated need to be extended.
Proposal: Signatures of by-ref methods are handled as in host expressions.
For example,
(reify P1
(m1 [x #^int y] ...) ; normal
(m2 [x (by-ref #^int y)] ...) ; taking a by-ref parameter
)
Each of these macros must be able to implement arbitrary interfaces. This means any CLR interface. The ability to indicate ByRef parameter positions is necessary.
Extending the syntax to allow properties is optional. The question is whether we allow a natural mode of expression for CLR developers.
If we allow property definitions in definterface and company, certainly we should allow it here. If definterface and company are not extended to properties, it still might be useful here for consistency with externally-defined CLR interfaces.
For properties, we would have to distinguish defining getters and setters. We could do this with something along these lines:
(reify P1
...
(m4 :get ... )
(m4 :set [value] ...)
)
or
(reify P1
...
(:get m4 ... )
(:set m4 [value] ...)
)
It would be simpler to define the getters and setters directly:
(reify P1
...
(get_m4 [] ... )
(set_m4 [value] ...)
)
However, unless we explicitly define the property m4, the resulting class will not have an m4 property and reflection won’t work properly. I prefer the first syntax because it is explicit.
Proposal: Use the first syntax above to define property getters and setters in signatures when implementing methods.
Explicit implementation is required when you want to give different implementations to methods from different interfaces that have the same signature.
(definterface I1 (#^String m1 [#^int x] ))
(definterface I2 (#^int m1 [#^int x]))
Then the following is not going to work:
(reify
I1
(m1 [x] ...)
I2
(m1 [x] ...))
One of the two needs to be an explicit implementation:
(reify
I1
(m1 [x] ...)
I2
(I2.m1 [x] ...))
Note that the name will have to be fully-qualified.
Proposal: Implement explicit interface method implementation.
Sounds simple enough.
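One small piece of this is recognizing an explicit implementation from the method name. The following is a minimal sketch in plain Clojure, assuming that everything before the last dot names the interface; the function name split-explicit-name and the returned map shape are hypothetical.

```clojure
(require '[clojure.string :as str])

(defn split-explicit-name
  "Splits a method name such as \"My.Ns.I2.m1\" at its last dot.
  A name without a dot is treated as an ordinary (implicit) implementation."
  [method-name]
  (if-let [i (str/last-index-of method-name ".")]
    {:interface (subs method-name 0 i)
     :method    (subs method-name (inc i))}
    {:interface nil :method method-name}))
```

For example, (split-explicit-name "My.Ns.I2.m1") yields the interface "My.Ns.I2" and the method "m1", while (split-explicit-name "m1") yields no interface.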
A note for the docs (when I get around to writing them):
When using reify, etc., on the JVM, it is not necessary to implement all the methods in implemented interfaces. The ClojureJVM code does not insert them, and the JVM does not care. Only if you try to invoke an unimplemented method will you discover this, via an exception being thrown.
The CLR cares. If you define a class and do not implement all the methods, either the class must be marked as abstract or you will get an invalid type exception thrown. Abstract classes are useless here, so ClojureCLR must provide dummy implementations of all the methods in all the interfaces being implemented. These dummies need only throw a NotImplementedException.
When generating these dummy methods, we can just go through the list of all interface methods, subtract the ones the user provided implementations for, and dummy up the rest. Where two interfaces define identical methods (name + arg types + return types), only a single implementation will be provided.
This works except for the case where two interfaces provide methods with the same name and argument types but different return types. One can be implemented directly. One must be implemented via explicit implementation. The problem: which one? We have no way to know, and which we choose determines which interface we must cast to in order to pick up the desired implementation. Therefore, we will have to make it an error if the user does not provide an implementation for at least one of the methods in this case.
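The bookkeeping described here can be sketched in plain Clojure. This is only an illustration under stated assumptions: methods are modeled as maps with hypothetical keys :iface, :name, :args and :ret, the user's implementations are given as a set of [name args ret] triples, and plan-dummies is an invented name rather than anything in the ClojureCLR compiler.

```clojure
(defn sig-key  [m] [(:name m) (:args m)])           ; name + argument types
(defn full-key [m] [(:name m) (:args m) (:ret m)])  ; plus return type

(defn plan-dummies
  "Given all interface methods and the set of full keys the user implemented,
  returns {:dummies [...] :errors [...]}. Identical methods from different
  interfaces collapse to one entry. A group that differs only by return type
  and has no user implementation at all is reported as an error."
  [iface-methods implemented]
  (let [unique (vals (into {} (map (juxt full-key identity) iface-methods)))
        groups (group-by sig-key unique)]
    (reduce
      (fn [plan [k ms]]
        (let [missing (remove #(contains? implemented (full-key %)) ms)]
          (if (and (> (count ms) 1) (= (count missing) (count ms)))
            (update plan :errors conj k)
            (update plan :dummies into (map full-key missing)))))
      {:dummies [] :errors []}
      groups)))

;; Assumed example: m1 differs only by return type, m2 is identical in both.
(def example-methods
  [{:iface 'I1 :name "m1" :args ['Int32] :ret 'String}
   {:iface 'I2 :name "m1" :args ['Int32] :ret 'Int32}
   {:iface 'I1 :name "m2" :args []       :ret 'Object}
   {:iface 'I2 :name "m2" :args []       :ret 'Object}])
```

With no user implementations, m1 lands in :errors and m2 gets a single dummy. Once the user implements the String-returning m1, the Int32-returning m1 can be given a dummy as well.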
Certain of the macros above, namely gen-interface, gen-class, definterface and defprotocol, define new methods (as opposed to identifying methods in existing interfaces or classes). At present, definterface and defprotocol are defined in terms of gen-interface, so compatible solutions should be pursued.
The issue is whether we should allow interfaces and protocols to define by-ref parameters and properties.
There is no compelling reason to allow either of these extensions. (As opposed to method implementation, where we must be able to indicate by-ref parameters.)
Argument in favor: We should supply the full range of expression that other CLR languages have. Also, it makes testing the by-ref extensions to reify, etc., much easier.
Argument against: The JVM doesn’t have them. These expressions will be non-portable. We can live without them, so why clutter the implementation with them?
I’ll make the proposals below. Others (Rich) will have to decide.
Proposal: Allow by-ref parameters in gen-interface, definterface and defprotocol.
This should be handled by the same by-ref syntactic form used in host expressions. For example,
(gen-interface
 :name I1
 :extends [I2 I3]
 :methods [
 [m1 [Object Int32] String] ; normal
 [m2 [Object (by-ref Int32)] String] ; taking a by-ref parameter
 ])
(definterface I1
 (#^String m1 [x #^int y] ) ; normal
 (#^String m2 [x (by-ref #^int y)] ) ; taking a by-ref parameter
 )
Proposal: Extend the syntax for gen-interface, definterface and defprotocol to allow properties.
We need a syntax distinguishing properties from zero-arity methods. This will take different forms for each of the three.
Here is how properties could be handled in each of the three method-defining forms. Note that this solution assumes a getter and a setter are to be defined for each property.
(gen-interface
 :name I1
 :extends [I2 I3]
 :methods [
 [m1 [Object Int32] String] ; normal
 [m2 [Object (by-ref Int32)] String] ; taking a by-ref parameter
 [m3 [] String] ; zero-arity method
 [m4 String] ; property
 ])
(definterface I1
 (#^String m1 [x #^int y] ) ; normal
 (#^String m2 [x (by-ref #^int y)] ) ; taking a by-ref parameter
 (#^String m3 [] ) ; zero-arity method
 (#^String m4 ) ; property
 )
There is actually a second way to do by-ref for definterface:
(definterface I1
(#^String m1 [x #^int y] ) ; normal
(#^String m2 [x #^int (by-ref y)] ) ; taking a by-ref parameter
)
For now, I’m sticking with the first version.
Protocols are a bit trickier. Unlike above, we cannot use [] to indicate a property because of confusion with zero-arity methods. Absence of an argument vector is also difficult given defprotocol’s multiple-arities-per-method-name syntax. We could insist that properties appear separately:
(defprotocol [a b]
( m1
[x y] ; normal
[x (by-ref y)] ; taking a by-ref parameter
[] ) ; zero-arity method
; can't overload a property on m1
( m1 ) ; property indicated by lack of arg list
)
Or we could use another indicator in place of the argument vector. Here’s one possibility:
(defprotocol [a b]
(m1
[x y] ; normal
[x (by-ref y)] ; taking a by-ref parameter
[] ; zero-arity method
:property ) ; property
)
If this :property solution is used, it might be better to use a similar notation for definterface and gen-interface:
(gen-interface
 :name I1
 :extends [I2 I3]
 :methods [
 [m1 [Object Int32] String] ; normal
 [m2 [Object (by-ref Int32)] String] ; taking a by-ref parameter
 [m3 [] String] ; zero-arity method
 [m4 :property String] ; property
 ])
(definterface I1
 (#^String m1 [x #^int y] ) ; normal
 (#^String m2 [x (by-ref #^int y)] ) ; taking a by-ref parameter
 (#^String m3 [] ) ; zero-arity method
 (#^String m4 :property) ; property
 )
Proposal: Use the second version for the defprotocol syntax.
Proposal: Do not extend gen-class to support property definitions.
Given its primary use, there does not appear to be much to gain from allowing property definitions in gen-class. Plus, the syntax would be a pain to modify. IMO, it is not worth the effort.
user=> (definterface I1 ( #^Int32 m [ (by-ref #^Int32 x) ]))
user.I1
user=> (def r (reify I1 (#^Int32 m [ _ (by-ref #^Int32 x) ] (set! x (int (inc x))) (+ x 12))))
#'user/r
user=> (let [y (int 30)] [ (.m r (by-ref y)) y])
[43 31]
Proposal: Do not allow the definition of generic interfaces and classes.
Should definterface, gen-class, and defprotocol allow the introduction of type parameters so that generic types can be defined?
For now, I say no. I’m willing to revisit this, but I haven’t had time to think through all the complications this is likely to present. There will be syntactic complications for sure, but also issues regarding instantiating the generic types. Given the use of these mechanisms at the moment, defining non-generic classes will probably suffice.
References to generic interfaces are required in reify, deftype, defrecord and proxy. Because type names, especially instantiated generic types, can contain some bad characters for symbols and symbols are required in resolve, we need a way to specify syntactically symbols with bad names.
ClojureCLR already has |-escaping for symbols. (Though it is not well-documented.)
|anything except a vertical bar...<>@#$@#$#$|
To include a vertical bar in a symbol name that is |-escaped, use a doubled vertical bar.
|This has a vertical bar in the name ... || ...<>@#$@#$#$|
Note that |-escaping as defined in ClojureCLR differs from the similar mechanism in Common Lisp in one significant way:
- Common Lisp allows a literal vertical bar in a symbol name with backslash-escaping: abc\|123 has name “abc|123”. ClojureCLR uses a doubled vertical bar, and only within an escaped symbol name. We would write |abc||123| for the symbol with name “abc|123”.
There is a special interaction of |-escaping with / used to separate namespace from name. Any / appearing |-escaped does not count as a namespace/name separator. Thus,
(namespace 'ab|cd/ef|gh) => nil
(name 'ab|cd/ef|gh) => "abcd/efgh"
(namespace 'ab/cd|ef/gh|ij) => "ab"
(name 'ab/cd|ef/gh|ij) => "cdef/ghij"
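To make these rules concrete, here is a minimal sketch in plain Clojure that splits raw symbol text into a namespace and a name. It is not the ClojureCLR reader. The function name split-escaped-symbol is hypothetical, and the choice to split at the first unescaped slash is an assumption made for the sketch.

```clojure
(defn split-escaped-symbol
  "Takes the raw text of a symbol and returns [namespace name].
  A single | opens or closes an escaped region, || inside an escaped
  region is a literal |, and only a / outside an escaped region can
  separate the namespace from the name."
  [text]
  (loop [cs (seq text), escaped? false, ns-part nil, acc []]
    (if-let [c (first cs)]
      (cond
        ;; doubled bar inside an escaped region: emit one literal bar
        (and escaped? (= c \|) (= (second cs) \|))
        (recur (nnext cs) true ns-part (conj acc \|))

        ;; single bar: toggle the escaped state, emit nothing
        (= c \|)
        (recur (next cs) (not escaped?) ns-part acc)

        ;; first unescaped slash: what we have so far is the namespace
        (and (= c \/) (not escaped?) (nil? ns-part))
        (recur (next cs) false (apply str acc) [])

        :else
        (recur (next cs) escaped? ns-part (conj acc c)))
      [ns-part (apply str acc)])))
```

Applied to the text "ab/cd|ef/gh|ij", this returns the namespace "ab" and the name "cdef/ghij", in line with the examples above.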
With this mechanism we can write a symbol referring to a type such as:
|System.Collections.Generic.IList`1[System.Int32]|
Note that this is the official CLR way of referring to the type that would be referred to in C# (with an import) as IList<int>. I do not propose trying to implement C# or Visual Basic type naming conventions. The existence of characters such as the backquote and square brackets forces some type of escaping mechanism.
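Writing such names by hand is error-prone, so a small helper may be useful. The following is a sketch in plain Clojure that builds the CLR-style name (base name, backquote, arity, bracketed type arguments) and wraps it in vertical bars, doubling any bar inside the name. The helper names clr-generic-name and bar-escape are hypothetical and do not exist in ClojureCLR.

```clojure
(require '[clojure.string :as str])

(defn clr-generic-name
  "Builds a CLR-style constructed generic type name from a base type name
  and a sequence of fully-qualified type argument names."
  [base-name type-args]
  (str base-name "`" (count type-args) "[" (str/join "," type-args) "]"))

(defn bar-escape
  "Wraps a name in vertical bars, doubling any bar that appears inside it."
  [type-name]
  (str "|" (str/replace type-name "|" "||") "|"))
```

For example, (bar-escape (clr-generic-name "System.Collections.Generic.IList" ["System.Int32"])) produces the text of the escaped symbol shown above.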
I have not yet implemented print-readably for symbols with bad characters. (Nor has ClojureJVM.) This is straightforward and should be done.
Proposal: use |-escaping for symbols in order to allow CLR types to be represented.
This allows us to write:
(reify
|AnInterface<Int32,String>|
(m1 [x] ...)
I2
(m2 [x] ...))
Proposal: Do not implement indexers directly at this time.
Indexers are going to create one more level of syntactic confusion. For now, just use the getters and setters directly.
gen-delegate is named parallel to gen-interface and gen-class. This is an incorrect parallel. The latter two define new interfaces and classes, respectively. gen-delegate creates an instance of a delegate. This function is more similar to reify and proxy.
gen-delegate should be renamed to something like reify-delegate or proxy-delegate or delegate-instance or something else.
Proposal: Rename gen-delegate (new name TBD).
However, there is reason to allow a parallel to gen-interface and gen-class, or to deftype and definterface. We may need to create new delegate types on the fly. Either a gen-delegate or a defdelegate should be introduced. (I vote for the latter name.)
Proposal: Introduce a new macro defdelegate to define new delegate types.