GH-48490: [GLib][Ruby] Add GArrowMakeStructOptions#48491
Conversation
|
|
|
The class is missing the |
|
Could you push the code that raises |
85e630c to
730c87a
Compare
|
I pushed some code. I couldn't figure out what types the properties should have, so I added getters and setters instead, but it doesn't work because of the limitations in gobject-introspection. I'm sure there are more problems with this code, since I can't test it. Perhaps there is a better way to do it? |
|
Thanks. I understand this situation and I also consider API. How about the following API? GArrowMakeStructOptions *
garrow_make_struct_options_new(void);
void
garrow_make_struct_options_add_field(GArrowMakeStructOptions *options, const char *name, gboolean nullability, GHashTable *metadata);
// no getter or
const char *
garrow_make_struct_options_get_field_name(GArrowMakeStructOptions *options, gsize i);
gboolean
garrow_make_struct_options_get_field_nullability(GArrowMakeStructOptions *options, gsize i);
GHashTable *
garrow_make_struct_options_get_field_metadta(GArrowMakeStructOptions *options, gsize i);GArrowMakeStructOptions *options = garrow_make_struct_options_new();
garrow_make_struct_options_add_field(options, name1, nullability1, metadata1);
garrow_make_struct_options_add_field(options, name2, nullability2, metadata2);
... |
|
I implemented your suggested API. I have some code in my own project that automatically generates Ruby code from the function doc, and it wouldn't work with this API. Still, I guess it is the best we can do. |
kou
left a comment
There was a problem hiding this comment.
I have some code in my own project that automatically generates Ruby code from the function doc
Wow! How do you implement it?
BTW, we can provide convenient converter in Ruby like https://github.com/apache/arrow/blob/main/ruby/red-arrow/lib/arrow/equal-options.rb does.
c_glib/arrow-glib/compute.cpp
Outdated
| } | ||
|
|
||
| enum { | ||
| PROP_MAKE_STRUCT_OPTIONS_FIELD_NAMES = 1, |
There was a problem hiding this comment.
Should we keep this? I think that we should not provided this because mixing field_names and add_field() isn't expected. For example, add_field(); field_names = [...] removes information provided by the first add_field().
There was a problem hiding this comment.
I agree it can be confusing, but I think this is more convenient if you don't care about nullability or metadata.
There was a problem hiding this comment.
OK. How about providing convenient APIs like this in Ruby something like the following?
module Arrow
class MakeStructOptions
def field_names=(names)
raise ArgumentError, "mixing #add_field and #field_names= is prohibited" unless n_fields.zero?
names.each do |name|
add_field(name, true, nil)
end
end
end
endThere was a problem hiding this comment.
What if you call #field_names= twice?
The project is not open source, but I extracted the relevant parts and put together an example: https://github.com/stenlarsson/arrow-plan/blob/main/test.rb It provides syntax sugar to make it easier to create and execute Acero plans. Since we generate code for the compute functions, your IDE can show you which functions exist, and what arguments and options they take. Symbols are used to reference fields, and some common Ruby types are automatically wrapped as well. In my opinion it makes it much easier to read the code. However, this only works if the options classes uses properties. If they use any other API, it becomes inconvenient. |
d665a82 to
8e58c8b
Compare
kou
left a comment
There was a problem hiding this comment.
The project is not open source, but I extracted the relevant parts and put together an example: https://github.com/stenlarsson/arrow-plan/blob/main/test.rb
Thanks. It's very helpful!
It provides syntax sugar to make it easier to create and execute Acero plans. Since we generate code for the compute functions, your IDE can show you which functions exist, and what arguments and options they take. Symbols are used to reference fields, and some common Ruby types are automatically wrapped as well. In my opinion it makes it much easier to read the code.
Good point. I want to add similar features to Red Arrow but I want to use dynamic method definitions instead of code generation because Acero functions can be added dynamically.
Does the dynamic method definition work with IDE integration?
However, this only works if the options classes uses properties. If they use any other API, it becomes inconvenient.
We can add convenient API in Ruby.
For example, Arrow::Function#resolve_options provides a try_convert hook:
arrow/ruby/red-arrow/lib/arrow/function.rb
Lines 27 to 49 in bd8ab75
Arrow::MakeStructOptions.try_convert can accept {field_names:, field_nullability:, field_metadata:}:
module Arrow
class MakeStructOptions
class << self
def try_convert(value)
case value
when Hash
options = new
field_names = value[:field_names]
field_nullability = value[:field_nullability] || []
field_metadta = value[:field_metadata] || []
field_names.zip(field_nullability, field_metadata) do |name, nullability, metadata|
options.add_field(name, nullability, metadata)
end
options
else
nil
end
end
end
end
end[field, ...] can also be acceptable:
module Arrow
class MakeStructOptions
class << self
def try_convert(value)
case value
when ::Array
options = new
value.each do |name, nullability, metadata|
options.add_field(name, nullability, metadata)
end
options
when Hash
options = new
field_names = value[:field_names]
field_nullability = value[:field_nullability] || []
field_metadta = value[:field_metadata] || []
field_names.zip(field_nullability, field_metadata) do |name, nullability, metadata|
options.add_field(name, nullability, metadata)
end
options
else
nil
end
end
end
end
end
c_glib/arrow-glib/compute.cpp
Outdated
| } | ||
|
|
||
| enum { | ||
| PROP_MAKE_STRUCT_OPTIONS_FIELD_NAMES = 1, |
There was a problem hiding this comment.
OK. How about providing convenient APIs like this in Ruby something like the following?
module Arrow
class MakeStructOptions
def field_names=(names)
raise ArgumentError, "mixing #add_field and #field_names= is prohibited" unless n_fields.zero?
names.each do |name|
add_field(name, true, nil)
end
end
end
end8e58c8b to
83e908a
Compare
83e908a to
323eef7
Compare
Yes, that makes sense for something included with Arrow. In my case it doesn't matter since I can generate the code afterwards.
No, unfortunately IDE integration for Ruby is pretty bad.
This could work, but I implemented another solution. This adds field-nullability and field-metadata as boxed properties of type. These aren't automatically converted by gobject-introspection, so to work around this, I implemented the getter and setter methods to the Ruby extension. If support is added to gobject-introspection in the future, we can just remove the code from the extension without changing the API. I'm not sure what the type annotations for such properties are supposed to look like however. |
kou
left a comment
There was a problem hiding this comment.
I can understand your motivation (providing the same API as doc for easy to code generation) but non GObject Introspection friendly API isn't expected.
In general, using GArray/GPtrArray in API is one of "Things to avoid": https://gi.readthedocs.io/en/latest/writingbindableapis.html#arrays
In a general-purpose library, it’s best not to expose GLib array and hash types such as GArray, GPtrArray, GByteArray, GList, GSList, GQueue, and GHashTable in the public API. They are fine for internal libraries, but difficult in general for consumers of introspected libraries to deal with.
GLib's main target is Ruby but I want to keep GLib API usable for other languages such as Vala and Lua.
Can we avoid using this approach (using GArray and GPtrArray properties)?
My suggestion:
- Provide
garrow_make_struct_options_add_field()in GLib - Provide
Arrow::FunctionOptions#keywordsor something for code generation in Ruby - Provide
Arrow::MakeStructOptions#fields=in Ruby - Provide
Arrow::MakeStructOptions#keywords(or something) that includesfieldsin Ruby
c_glib/arrow-glib/compute.cpp
Outdated
| } | ||
|
|
||
| enum { | ||
| PROP_MAKE_STRUCT_OPTIONS_FIELD_NAMES = 1, |
I was hoping that GObject Introspection could support this in the future, but I can also see that it will be very difficult to express the types of the key and values inside a hash table inside an array.
That makes sense, but we are already using GHashTable for the field metadata, so I thought that was fine: arrow/c_glib/arrow-glib/field.cpp Lines 328 to 329 in 43273fd
I agree, and again I was hoping this API could be supported by GObject Introspection in the future, but I see now that it won't.
I feels like the keywords will be very specific to my use case, and not very useful in general? If you were to dynamically generate code you wouldn't need to know exactly what the options are. You could just accept I think I need to give up on the idea that I can generate code for all functions. How about this:
|
Ah, I understand. https://gi.readthedocs.io/en/latest/annotations/giannotations.html#type-signature But
Thanks for the suggestion. Let's use the approach! |
2527980 to
796a599
Compare
…Ruby extension" This reverts commit 40e16be0591d592cb705df6cbad4e92b4a7bbbd8.
It causes inconsistencies with field-nullability and field-metadata if add_field is called.
796a599 to
cfc4090
Compare
|
After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 5bda712. There weren't enough matching historic benchmark results to make a call on whether there were regressions. The full Conbench report has more details. |
Rationale for this change
The
MakeStructOptionsclass is not available in GLib/Ruby, and it is used together with themake_structcompute function.What changes are included in this PR?
This adds the
MakeStructOptionsclass to GLib.Are these changes tested?
Yes, with Ruby unit tests.
Are there any user-facing changes?
Yes, a new class.