Conversation
internal/enum/enum.go.tmpl
Outdated
| "github.com/bufbuild/protocompile/internal/iter" | ||
| ) | ||
|
|
||
| var _ iter.Seq[int] // Mark iter as used. |
There was a problem hiding this comment.
I know this is the simplest way to do this. But it would be nice to just emit only the imports that are actually needed. You could emit:
import (
"fmt"And then loop through methods and if/when you see "all" emit
"github.com/bufbuild/protocompile/internal/iter"
And emit the closing ) after that.
Another option might be not emit this line but still unconditionally emit the imports, and then always run make lintfix after go generate in the Makefile, which should fix up formatting, including organizing imports.
But I think just emitting the correct imports is fine for now.
Another option is, if the All method isn't used in very many places, just make them hand-written instead of generated.
There was a problem hiding this comment.
I don't think it's really worth the complexity to get the imports right. No one is going to read this file (or should, at least).
experimental/ir/presence/kind.go
Outdated
|
|
||
| // The field is singular and presence is equivalent to the field having its | ||
| // zero value. This corresponds to a non-message field not marked "optional" | ||
| // in proto3. |
There was a problem hiding this comment.
Yes, but this comment is mostly meant to be giving an example, not be exhaustive.
| // Type is the element type of this value. For arrays, this is the type of the | ||
| // element; Protobuf does not have a concept of an "array type". |
There was a problem hiding this comment.
It may behoove us to provide an "array" type, even though Protobuf descriptors do not. That could enable simpler code that inspects and works with types, to treat them more uniformly. Though I think doing so requires adding another byte to Type, to put an isArray flag there 🤷 (which may not be desirable since, for alignment, it means likely adding 4 or 8 bytes to the Type struct in practice...).
There was a problem hiding this comment.
I am slightly skeptical of that. My gut instinct would be to not add that for now, and see how painful it makes things.
experimental/ir/option.go
Outdated
| // Unknown means this is a message type. | ||
| // Enum types are predeclared.Int32. | ||
| ty predeclared.Name |
There was a problem hiding this comment.
These seems unfortunate, that you can't inspect the enum type of an enum value. Maybe instead encode enums/messages/groups as negative values here and map to a separate enum for NamedType?
There was a problem hiding this comment.
Well, one thing that you could do is say that predeclared.Unknown represents "is a user type", and determining whether it's a message or enum requires looking at the type.
experimental/ir/option.go
Outdated
|
|
||
| type rawMessageValueEntry struct { | ||
| key intern.ID | ||
| field int32 // Index of the field within rawMessageValue.ty; -1 if unknown. |
There was a problem hiding this comment.
We need more here to handle extensions. I think you could instead include a ref[rawField] here, instead of this field, and then interpret its ref.file field as an index into fields of rawMessageValue.typ if it is negative but not -1. (And in such cases, ref.ptr is ignored.)
There was a problem hiding this comment.
I think the representation for an expanded Any will be kind of funky, but there's really no avoiding that, and I think this can handle that correctly. We just have to resolve a type URL "field name" to a ref[rawType] and associate that with the field's value.
There was a problem hiding this comment.
Yeah, I think that for any we need to have both a ref[rawField] and ref[rawType]. Which is unfortunate, since most messages will only contain an int32.
Perhaps we'll wind up having different kinds of entries... we'll see. I'm kind of trying to punt on extensions in message literals for now, since it's a niche feature and I want to prioritize getting things working first.
There was a problem hiding this comment.
The TODO only mentions Any and not extensions. My original comment was about extensions -- you need more than just the tag number for those since you'll also need to know the file from which it came.
Also, FWIW, you shouldn't need both a ref[rawField] and ref[rawType] because you never need both at the same type. A field in a message literal is exactly one of:
- A normal field
- An extension field
- A reference to a type URL for expanded Any syntax.
So you could overload a single 64-bit value (such as a ref[rawField]). Extension numbers are only 31 bits (usually 29, but 31 for the case of message set wire format). Normal field numbers are only 29 bits. So the scheme I mentioned above would work, and you could even interpret the reference as a type instead of field when the top two highest bits are set. (That allows over a billion types per file, which seems like it should be plenty...).
There was a problem hiding this comment.
Turns out I was being dumb. All message literals know their type, so if you write { [google.com/foo.Bar]: { ... } } that will be typed correctly: the value's field's type will be Any, but the value as a message will be foo.Bar.
Changing to a ref[rawField] makes everything work out.
experimental/ir/option.go
Outdated
|
|
||
| isArray bool // If set, value is an arena.Pointer[[]rawValueBits]. | ||
| isUninterpretedPath bool // If set, value is an intern.ID. | ||
| isUninterpretedAggregate bool // If set, value is an intern.ID. |
There was a problem hiding this comment.
Perhaps this should be an arena.Pointer[string]?
Agreed that it's almost certainly not worth interning the aggregate value. I'm not even sure we need to compute it/store it all. I thin it would be better to leave value unset/nothing and rely solely on the AST expression node to interpret. If we ever do need the aggregate string (which would only be if we were to populate a DescriptorProto before/without interpreting options), it could be computed on-demand from the AST node.
| // We can't just embed With into File, because then it would be an exported | ||
| // field. |
There was a problem hiding this comment.
My issue with this naming convention is that it's unclear what bug number is really referring to. I usually see this in test code to indicate an issue in the owning repo (like a protocompile bug number). I think it makes the code more readable to name it semantically and then put a full link to the bug at its definition.
experimental/ir/option.go
Outdated
|
|
||
| type rawMessageValueEntry struct { | ||
| key intern.ID | ||
| field int32 // Index of the field within rawMessageValue.ty; -1 if unknown. |
There was a problem hiding this comment.
The TODO only mentions Any and not extensions. My original comment was about extensions -- you need more than just the tag number for those since you'll also need to know the file from which it came.
Also, FWIW, you shouldn't need both a ref[rawField] and ref[rawType] because you never need both at the same type. A field in a message literal is exactly one of:
- A normal field
- An extension field
- A reference to a type URL for expanded Any syntax.
So you could overload a single 64-bit value (such as a ref[rawField]). Extension numbers are only 31 bits (usually 29, but 31 for the case of message set wire format). Normal field numbers are only 29 bits. So the scheme I mentioned above would work, and you could even interpret the reference as a type instead of field when the top two highest bits are set. (That allows over a billion types per file, which seems like it should be plenty...).
There was a problem hiding this comment.
I'm a little confused by this commit (2553cb4). Why do the files need an ir_ prefix to group together? What other prefixes do you expect to see in this package in the future?
There was a problem hiding this comment.
I expect there to be a lower_ prefix for the AST->IR lowering code.
This PR is the first draft of the intermediate representation for symbol resolution, type inference (of option types) and eventually, generation of
FileDescriptorProtos.This PR is essentially just data structure boilerplate: it does not contain everything (for example, there is no support for services yet) nor does it include code for constructing from an AST or an imported FDP. The purpose of this PR is to get the first draft checked in so that followup PRs can iterate on the API as symbol resolution, type checking, and option resolution are implemented.