Propose validators to detect start loops #43
def _message_has_greeting(text: str) -> bool:
    return bool(re.match(r"^hello|^hi|^greetings", text, flags=re.IGNORECASE))

def is_greeting_repeated(dialogues: list[Dialogue]) -> bool:
    """
    Checks whether greeting is repeated within dialogues.
    Returns True if greeting has been repeated, False otherwise.
    """
    for dialogue in dialogues:
        for i, message in enumerate(dialogue.messages):
            if i != 0 and message.participant == "assistant" and _message_has_greeting(message.text):
                return True
    return False
Separately, add a version that uses an LLM.
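To make the behaviour of the hunk above concrete, here is a minimal runnable sketch. The `Message` and `Dialogue` dataclasses are hypothetical stand-ins for the project's own types (only the `messages`, `participant`, and `text` attributes used in the hunk are assumed), and the two sample dialogues are invented for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Message:  # hypothetical stand-in for the project's message type
    participant: str
    text: str


@dataclass
class Dialogue:  # hypothetical stand-in for the project's Dialogue type
    messages: list[Message] = field(default_factory=list)


def _message_has_greeting(text: str) -> bool:
    # Same pattern as in the hunk under review.
    return bool(re.match(r"^hello|^hi|^greetings", text, flags=re.IGNORECASE))


def is_greeting_repeated(dialogues: list[Dialogue]) -> bool:
    # True if any assistant message after the first one starts with a greeting.
    for dialogue in dialogues:
        for i, message in enumerate(dialogue.messages):
            if i != 0 and message.participant == "assistant" and _message_has_greeting(message.text):
                return True
    return False


# Invented sample data: the assistant greets a second time in `looping`.
looping = Dialogue([
    Message("assistant", "Hello! How can I help?"),
    Message("user", "I want to book a table."),
    Message("assistant", "Hi! How can I help?"),
])
clean = Dialogue([
    Message("assistant", "Hello! How can I help?"),
    Message("user", "I want to book a table."),
    Message("assistant", "Sure, for how many people?"),
])
```

With this data, `is_greeting_repeated([looping])` reports a repeated greeting and `is_greeting_repeated([clean])` does not. A user message that starts with a greeting is ignored, because only `assistant` messages are checked.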
def has_loop_to_start(G: BaseGraph) -> bool:
    """
    Checks whether graph has node returning to the start node.
    Returns True if there is a loop to start, False otherwise
    """
    for edge in G.graph.edges:
        if edge[1] == 1:
            return True
    return False
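The check above reduces to "does any edge point at node 1". A minimal sketch of that idea on plain `(source, target)` tuples follows. The helper name `has_edge_into` and the edge lists are hypothetical, and it is only an assumption that `G.graph.edges` yields pairs in this shape.

```python
from typing import Iterable, Tuple


def has_edge_into(edges: Iterable[Tuple[int, int]], target: int = 1) -> bool:
    """Return True if any (source, target) edge points at `target`."""
    return any(dst == target for _src, dst in edges)


# Hypothetical edge lists; node 1 plays the role of the start node.
linear = [(1, 2), (2, 3)]
looping = [(1, 2), (2, 3), (3, 1)]
```

Here `has_edge_into(linear)` is false and `has_edge_into(looping)` is true. Note that a self-loop `(1, 1)` would also count as a loop to start under this definition.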
def _message_has_greeting(text: str) -> bool:
Add this to the sampling and generation classes.
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
dialogue2graph/metrics/no_llm_metrics/metrics.py:614
- Using re.match here only checks the beginning of the string; if the intent is to detect greetings appearing in other positions, consider using re.search or adjusting the regex to cover the entire string.
return bool(re.match(r"^hello|^hi|^greetings", text, flags=re.IGNORECASE))
dialogue2graph/datasets/complex_dialogues/generation.py:233
- The error message for repeated greetings in the generated dialogues currently states 'Opening is repeated' which might be misleading; consider updating it to clarify that a repeated greeting was detected.
if is_greeting_repeated(sampled_dialogues):
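The `re.match` versus `re.search` point in the first suppressed comment can be checked directly. This sketch uses an unanchored pattern and an invented sample sentence purely for comparison. It is not the pattern from the PR.

```python
import re

# Unanchored on purpose, so the difference comes only from match vs. search.
GREETING = r"hello|hi|greetings"

# Invented sample: the greeting appears after the first sentence.
text = "Sure. Hello again, how can I help?"

# re.match only tries the pattern at position 0 of the string.
at_start = bool(re.match(GREETING, text, flags=re.IGNORECASE))

# re.search scans the whole string for the first position that matches.
anywhere = bool(re.search(GREETING, text, flags=re.IGNORECASE))

# With the PR's anchored pattern, re.search is still restricted to the start.
anchored_search = bool(re.search(r"^hello|^hi|^greetings", text, flags=re.IGNORECASE))
```

One caveat if the check is widened with `re.search`: an unanchored `hi` also matches inside words such as "this", so a word boundary would probably be needed.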
Split validators into llm_validators and no_llm_validators like in metrics
Skip tests that use embedders with pytest.mark.skipif:
https://docs.pytest.org/en/6.2.x/skipping.html
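One possible shape for such a skip marker is sketched below. The package name `sentence_transformers` and the test name are placeholders, since the thread does not say what the embedder tests depend on. The condition could just as well be an environment variable or a missing model file.

```python
import importlib.util

import pytest

# Assumption: the embedder tests need an optional package; the name is a placeholder.
EMBEDDER_PACKAGE = "sentence_transformers"
embedder_missing = importlib.util.find_spec(EMBEDDER_PACKAGE) is None

# Reusable marker: tests decorated with it are skipped when the condition is true.
requires_embedder = pytest.mark.skipif(
    embedder_missing,
    reason=f"{EMBEDDER_PACKAGE} is not installed",
)


@requires_embedder
def test_embedding_based_metric():
    # Hypothetical test body; a real test would load the embedder and run the metric.
    module = importlib.import_module(EMBEDDER_PACKAGE)
    assert module is not None
```

Defining the marker once and reusing it keeps the skip condition in one place if several test modules depend on the same embedder.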
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
dialogue2graph/datasets/complex_dialogues/generation.py:150
- Using the literal '1' to identify the start node can be unclear; consider introducing a named constant (e.g., START_NODE = 1) to improve readability and maintainability.
no_start_cycle_requirement = not any([1 in c for c in cycles])
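The named-constant suggestion could look roughly like this. `START_NODE`, the helper name, and the sample `cycles` list are illustrative, and `cycles` is assumed to be a list of node-id lists, as the quoted line implies.

```python
START_NODE = 1  # assumed id of the start node, as in the quoted line


def has_cycle_through_start(cycles: list[list[int]]) -> bool:
    """Return True if any cycle contains the start node."""
    # A generator expression lets any() stop at the first hit
    # without building an intermediate list.
    return any(START_NODE in cycle for cycle in cycles)


# Invented sample: neither cycle passes through the start node.
cycles = [[2, 3, 4], [5, 6]]
no_start_cycle_requirement = not has_cycle_through_start(cycles)
```

With the sample data, `no_start_cycle_requirement` is true. Adding a cycle such as `[1, 2]` would make it false.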
        bool: True if greeting has been repeated, False otherwise.
    """
    if not regex:
        regex = r"^hello|^hi|^greetings"
Each alternative in this pattern carries its own ^, so all three are already anchored at the start of the text. Grouping them as "^(hello|hi|greetings)" matches the same strings, but it states the anchor once and is easier to read and extend.
Suggested change:
- regex = r"^hello|^hi|^greetings"
+ regex = r"^(hello|hi|greetings)"
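A quick way to compare the two patterns in the suggestion is to run both over a few strings. The samples are invented, and the third pattern, with a trailing `\b`, is an additional assumption that is not part of the PR. It is included only to show how a prefix match such as "History" could be excluded.

```python
import re

ORIGINAL = r"^hello|^hi|^greetings"
GROUPED = r"^(hello|hi|greetings)"
BOUNDED = r"^(hello|hi|greetings)\b"  # assumption: accept whole-word greetings only

# Invented samples, including a word that merely starts with "hi".
samples = [
    "Hello there",
    "hi!",
    "Greetings, traveller",
    "History is fun",
    "Well, greetings",
]


def matches(pattern: str, text: str) -> bool:
    return bool(re.match(pattern, text, flags=re.IGNORECASE))


# Per-sample results for each pattern, keyed by the sample text.
results = {
    text: (matches(ORIGINAL, text), matches(GROUPED, text), matches(BOUNDED, text))
    for text in samples
}
```

On these samples the original and grouped patterns agree everywhere, including the false positive on "History is fun". Only the word-boundary variant rejects it.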
- is_greeting_repeated function to detect a greeting appearing in the middle of a dialogue.
- has_loop_to_start function to detect a loop back to the start in the graph (= an edge has target node 1).