This repository was archived by the owner on Feb 6, 2025. It is now read-only.

Commit eec75c0

obi1kenobi authored and LWprogramming committed
Add docstring for project_neighbors(). (#955)
* Add docstring for project_neighbors().
* Replace "Common bug..." section

Co-authored-by: Branen Salmon <[email protected]>

* Apply edits from code review.

Co-authored-by: Branen Salmon <[email protected]>
Co-authored-by: Selene Chew <[email protected]>
Co-authored-by: Leon Wu <[email protected]>
1 parent b1455fa commit eec75c0

1 file changed: +181 -6 lines changed


graphql_compiler/interpreter/typedefs.py

Lines changed: 181 additions & 6 deletions
@@ -296,6 +296,8 @@ def project_property(
                            property data needs to be loaded
             current_type_name: name of the vertex type whose property needs to be loaded. Guaranteed
                                to be the name of a type defined in the schema being queried.
+                               (current_type_name names the concrete type of DataToken contained in
+                               the DataContext containers yielded by the data_context iterator.)
             field_name: name of the property whose data needs to be loaded. Guaranteed to refer
                         either to a property that is defined in the supplied current_type_name
                         in the schema, or to the "__typename" meta field that is valid for all
@@ -336,12 +338,185 @@ def project_neighbors(
         neighbor_hints: Optional[Collection[Tuple[EdgeInfo, NeighborHint]]] = None,
         **hints: Any,
     ) -> Iterable[Tuple[DataContext[DataToken], Iterable[DataToken]]]:
-        """Produce the neighbors along a given edge for each of an iterable of input DataTokens."""
-        # TODO(predrag): Add more docs in an upcoming PR.
-        #
-        # If using a generator or a mutable data type for the Iterable[DataToken] part,
-        # be careful! Make sure any state it depends upon
-        # does not change, or that bug will be hard to find.
+        """Produce the neighbors along a given edge for each of an iterable of input DataTokens.
+
+        To support traversing edges, as well as directives such as @optional and @recurse,
+        the interpreter needs to get the neighboring vertices along a particular edge for
+        a series of DataTokens.
+
+        For example, consider the following query:
+        {
+            Foo {
+                out_Foo_Bar {
+                    < ... some fields here ... >
+                }
+            }
+        }
+
+        Once the interpreter has used the get_tokens_of_type() function to obtain
+        an iterable of DataTokens for the Foo type, it will automatically wrap each of them in
+        a "bookkeeping" object called DataContext. These DataContext objects allow
+        the interpreter to keep track of "which data came from where"; only the DataToken value
+        bound to each current_token attribute is relevant to the InterpreterAdapter API.
+
+        Having obtained an iterable of DataTokens and converted it to an iterable of DataContexts,
+        the interpreter needs to find the neighbors along the outbound Foo_Bar edge for each of
+        those DataTokens. To do so, the interpreter calls project_neighbors() with the iterable of
+        DataContexts, setting current_type_name = "Foo" and edge_info = ("out", "Foo_Bar"). This
+        function call requests an iterable of DataTokens representing the neighboring vertices for
+        each current_token contained in a DataContext. If the DataContext's current_token
+        attribute is set to None (which may happen when @optional edges are used), an empty
+        iterable of neighboring DataTokens should be returned.
+
+        A simple example implementation is as follows:
+            def project_neighbors(
+                self,
+                data_contexts: Iterable[DataContext[DataToken]],
+                current_type_name: str,
+                edge_info: EdgeInfo,
+                *,
+                runtime_arg_hints: Optional[Mapping[str, Any]] = None,
+                used_property_hints: Optional[AbstractSet[str]] = None,
+                filter_hints: Optional[Collection[FilterInfo]] = None,
+                neighbor_hints: Optional[Collection[Tuple[EdgeInfo, NeighborHint]]] = None,
+                **hints: Any,
+            ) -> Iterable[Tuple[DataContext[DataToken], Iterable[DataToken]]]:
+                for data_context in data_contexts:
+                    current_token = data_context.current_token
+                    neighbors: Iterable[DataToken]
+                    if current_token is None:
+                        # Evaluating an @optional scope where the optional edge didn't exist.
+                        # There are no neighbors here.
+                        neighbors = []
+                    else:
+                        neighbors = _your_function_that_gets_neighbors_for_a_given_token(
+                            current_type_name, edge_info, current_token
+                        )
+
+                    # Remember to always yield the DataContext alongside the produced value
+                    yield data_context, neighbors
+
+        ## Common bug to avoid in your implementation
+
+        In the previous code example, note that we called a module-scoped function,
+        `_your_function_that_gets_neighbors_for_a_given_token`, instead of one defined in the scope
+        of `project_neighbors`. Because Python evaluates references to variables in outer scopes
+        at the time a function or generator is invoked (not at the time it's defined), it's very
+        easy to introduce subtle race conditions when defining generator factories in a nested
+        scope.
+
+        Because generators may be evaluated in arbitrary order, these bugs can appear only
+        intermittently and can be very difficult to troubleshoot. Always defining generator
+        factories in the module scope is one reliable way to avoid this problem.
+
+        In this example code, we use a for-loop to yield several generators from a generator
+        factory. Notice that we don't pass any arguments to the generator factory; the values its
+        generators yield come from its enclosing scope.
+
+        >>> def yield_generators():
+        ...     for target in range(1, 4):
+        ...         def _generator_factory():
+        ...             while True:
+        ...                 # refers to `target` in the enclosing scope
+        ...                 yield target
+        ...         yield _generator_factory()
+        ...
+        >>> gens = yield_generators()
+        >>> one = next(gens)
+        >>> next(one)
+        1
+        >>> two = next(gens)
+        >>> next(two)
+        2
+        >>> next(one)  # We expect 1, but get 2
+        2
+        >>> three = next(gens)
+        >>> next(three)
+        3
+        >>> next(two)  # We expect 2, but get 3
+        3
+        >>> next(one)  # We expect 1, got 2, and now get 3
+        3
+
+        Although we have three distinct generators, they're all yielding the same `target` from
+        `yield_generators`'s scope, which is also the same `target` that the for-loop advances with
+        each iteration.
+
+        If we define `_generator_factory` in the scope of the module, then we can't refer
+        inadvertently to shared state in an enclosing scope, which saves us from this bug.
+
+        >>> def _generator_factory(target):
+        ...     while True:
+        ...         # refers to the argument `target`, which exists
+        ...         # only in the local scope
+        ...         yield target
+        ...
+        >>> def yield_generators():
+        ...     for target in range(1, 4):
+        ...         yield _generator_factory(target)
+        ...
+        >>> gens = yield_generators()
+        >>> one = next(gens)
+        >>> next(one)
+        1
+        >>> two = next(gens)
+        >>> next(two)
+        2
+        >>> next(one)
+        1
+        >>> three = next(gens)
+        >>> next(three)
+        3
+        >>> next(two)
+        2
+        >>> next(one)
+        1
+
+        ## Hints supplied to this function refer to neighboring vertices
+
+        Hint kwargs in this function, such as used_property_hints, filter_hints, and
+        neighbor_hints, describe the desired structure of the *neighboring* vertices that this
+        function produces (as opposed to the vertices supplied via the data_contexts argument).
+        For example, consider the following query:
+        {
+            Foo {
+                out_Foo_Bar {
+                    name @output(out_name: "name")
+                }
+            }
+        }
+        To traverse the out_Foo_Bar edge, project_neighbors() is called with
+        used_property_hints=frozenset({"name"}) and data_contexts=<Iterable of DataContexts
+        pointing to Foo vertices>. This is because used_property_hints correspond to
+        neighboring vertices, and the neighboring Bar vertices (along the outbound Foo_Bar edge)
+        are being queried for their "name" property.
+
+        Args:
+            data_contexts: iterable of DataContext objects which specify the DataTokens whose
+                           neighboring DataTokens need to be loaded.
+            current_type_name: name of the vertex type whose neighbors need to be loaded. Guaranteed
+                               to be the name of a type defined in the schema being queried.
+            edge_info: direction and name of the edge along which neighboring vertices need to be
+                       loaded. For example, in the query example above, this argument would be set
+                       to ("out", "Foo_Bar").
+            runtime_arg_hints: names and values of any runtime arguments provided to the query
+                               for use in filtering operations (e.g. "$arg_name").
+            used_property_hints: property names of the neighboring vertices being loaded that
+                                 are going to be used in a subsequent filtering or output step.
+            filter_hints: information about any filters applied to the neighboring vertices being
+                          loaded, such as "which filtering operations are being performed?"
+                          and "with which arguments?"
+            neighbor_hints: information about the edges of the neighboring vertices being loaded
+                            that the query will eventually need to expand.
+            **hints: catch-all kwarg field making the function's signature forward-compatible with
+                     future revisions of this library that add more hints.
+
+        Yields:
+            tuples (data_context, iterable_of_neighbor_tokens), providing the tokens of
+            the neighboring vertices together with the DataContext corresponding to those neighbors.
+            The yielded DataContext values must be yielded in the same order as they were received
+            via the function's data_contexts argument.
+        """

     @abstractmethod
     def can_coerce_to_type(
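
The contract described in the new docstring can be sketched end to end against a toy in-memory graph. This is only an illustration under stated assumptions, not the library's implementation: `FakeDataContext`, `GRAPH`, `_neighbors_of`, and the use of plain string ids as DataTokens are all hypothetical stand-ins, and the function is written as a free function rather than an InterpreterAdapter method. It shows the three points the docstring stresses: an empty iterable for a None current_token, each DataContext yielded alongside its neighbors in input order, and a module-scoped generator that receives everything it needs as arguments.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical stand-ins for this sketch; the real DataContext and EdgeInfo
# types are defined by graphql_compiler and are not imported here.
EdgeInfo = Tuple[str, str]  # (direction, edge_name), e.g. ("out", "Foo_Bar")
Token = str  # this sketch uses vertex ids as DataTokens


@dataclass
class FakeDataContext:
    """Minimal stand-in that exposes only the current_token attribute."""

    current_token: Optional[Token]


# Hypothetical adjacency data: (vertex id, edge_info) -> neighboring vertex ids.
GRAPH: Dict[Tuple[Token, EdgeInfo], List[Token]] = {
    ("foo1", ("out", "Foo_Bar")): ["bar1", "bar2"],
    ("foo2", ("out", "Foo_Bar")): ["bar3"],
}


def _neighbors_of(edge_info: EdgeInfo, token: Token) -> Iterator[Token]:
    """Module-scoped generator: its inputs arrive as arguments, not via a closure."""
    yield from GRAPH.get((token, edge_info), [])


def project_neighbors(
    data_contexts: Iterable[FakeDataContext],
    current_type_name: str,
    edge_info: EdgeInfo,
    **hints: Any,
) -> Iterator[Tuple[FakeDataContext, Iterable[Token]]]:
    for data_context in data_contexts:
        token = data_context.current_token
        neighbors: Iterable[Token]
        if token is None:
            # An @optional edge that did not exist: there are no neighbors.
            neighbors = []
        else:
            neighbors = _neighbors_of(edge_info, token)
        # Yield the DataContext together with its neighbors, in input order.
        yield data_context, neighbors


contexts = [FakeDataContext("foo1"), FakeDataContext(None), FakeDataContext("foo2")]
results = list(project_neighbors(contexts, "Foo", ("out", "Foo_Bar")))

# Consume the lazily produced neighbor iterables in reverse order to check
# that each one still refers to its own token.
neighbor_lists = [list(neighbors) for _, neighbors in reversed(results)]
```

Because `_neighbors_of` binds `token` as its own argument, draining the neighbor iterables late or out of order does not change what each one produces, which is the property the "Common bug" section of the docstring is protecting.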
