
Propose validators to detect start loops #43

Merged
NotBioWaste905 merged 15 commits into dev from feat/validate_cycles on Apr 7, 2025

@anna-a-m
Contributor

  1. Added the is_greeting_repeated function to detect a greeting that appears in the middle of a dialogue.
  2. Added the has_loop_to_start function to detect a loop back to the start of the graph (that is, an edge whose target is node 1).
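
For reference, a minimal self-contained sketch of the first check. The Message and Dialogue dataclasses are stand-ins assumed for illustration (the project's own classes are not shown in this conversation), and the sample dialogues are made up.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Message:
    """Stand-in for the project's message type (assumption)."""
    participant: str
    text: str


@dataclass
class Dialogue:
    """Stand-in for the project's dialogue type (assumption)."""
    messages: list[Message] = field(default_factory=list)


GREETING = re.compile(r"^hello|^hi|^greetings", flags=re.IGNORECASE)


def is_greeting_repeated(dialogues: list[Dialogue]) -> bool:
    """Return True if any assistant message after the first one starts with a greeting."""
    for dialogue in dialogues:
        for i, message in enumerate(dialogue.messages):
            if i != 0 and message.participant == "assistant" and GREETING.match(message.text):
                return True
    return False


# Made-up dialogues: the second one greets the user again at the end.
ok = Dialogue([
    Message("assistant", "Hello! How can I help?"),
    Message("user", "I need a ticket."),
    Message("assistant", "Sure, where to?"),
])
looped = Dialogue(ok.messages + [Message("assistant", "Hello! How can I help?")])
```

With these inputs, only the dialogue that greets the user a second time is flagged.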

Comment on lines 613 to 614
def _message_has_greeting(text: str) -> bool:
    return bool(re.match(r"^hello|^hi|^greetings", text, flags=re.IGNORECASE))
Collaborator

Finish off the farewell detection as well.

Comment on lines 617 to 626
def is_greeting_repeated(dialogues: list[Dialogue]) -> bool:
    """
    Checks whether greeting is repeated within dialogues.
    Returns True if greeting has been repeated, False otherwise.
    """
    for dialogue in dialogues:
        for i, message in enumerate(dialogue.messages):
            if i != 0 and message.participant == "assistant" and _message_has_greeting(message.text):
                return True
    return False
Collaborator

Separately, add a variant that uses an LLM.

Comment on lines 629 to 637
def has_loop_to_start(G: BaseGraph) -> bool:
    """
    Checks whether graph has node returning to the start node.
    Returns True if there is a loop to start, False otherwise
    """
    for edge in G.graph.edges:
        if edge[1] == 1:
            return True
    return False
Collaborator

Check this for validity.

}


def _message_has_greeting(text: str) -> bool:
Collaborator

Add this to the sampling and generation classes.

@kudep changed the title from "Propose metrics to detect start loops" to "Propose validators to detect start loops" on Apr 2, 2025
@kudep requested a review from Copilot on April 3, 2025 22:04
Contributor

Copilot AI left a comment

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

dialogue2graph/metrics/no_llm_metrics/metrics.py:614

  • Using re.match here only checks the beginning of the string; if the intent is to detect greetings appearing in other positions, consider using re.search or adjusting the regex to cover the entire string.
    return bool(re.match(r"^hello|^hi|^greetings", text, flags=re.IGNORECASE))

dialogue2graph/datasets/complex_dialogues/generation.py:233

  • The error message for repeated greetings in the generated dialogues currently states 'Opening is repeated' which might be misleading; consider updating it to clarify that a repeated greeting was detected.
            if is_greeting_repeated(sampled_dialogues):
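
On the re.match point raised for metrics.py:614, a small sketch of the difference, using only the standard re module. The contains_greeting helper and the sample strings are illustrative assumptions, not code from this PR.

```python
import re

PATTERN = r"^hello|^hi|^greetings"  # pattern quoted in the comment above


def starts_with_greeting(text: str) -> bool:
    # re.match only attempts a match at position 0 of the string.
    return bool(re.match(PATTERN, text, flags=re.IGNORECASE))


def contains_greeting(text: str) -> bool:
    # Hypothetical alternative: re.search scans the whole string;
    # \b keeps "hi" from matching inside words such as "This".
    return bool(re.search(r"\b(hello|hi|greetings)\b", text, flags=re.IGNORECASE))
```

Which behaviour is wanted depends on whether a greeting buried later in a message should count as a repeated opening.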

Collaborator

Split validators into llm_validators and no_llm_validators like in metrics

Collaborator

skip tests with embedders using pytest.mark.skipif
https://docs.pytest.org/en/6.2.x/skipping.html
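
A minimal sketch of what such a skip could look like. The package name and the test name are placeholders assumed for illustration; the real condition depends on how the embedders are loaded in this repository.

```python
import importlib.util

import pytest

# Assumption: the embedder relies on an optional package; this name is a placeholder.
EMBEDDER_PACKAGE = "sentence_transformers"
HAS_EMBEDDER = importlib.util.find_spec(EMBEDDER_PACKAGE) is not None


@pytest.mark.skipif(not HAS_EMBEDDER, reason=f"{EMBEDDER_PACKAGE} is not installed")
def test_embedder_based_validator():
    # Placeholder body: a real test would load the embedder and run the validator.
    assert HAS_EMBEDDER
```

The condition is evaluated at collection time, so the test is reported as skipped rather than failed when the package is missing.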

Collaborator

Add README

@anna-a-m requested a review from Copilot on April 7, 2025 15:02
Contributor

Copilot AI left a comment

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

dialogue2graph/datasets/complex_dialogues/generation.py:150

  • Using the literal '1' to identify the start node can be unclear; consider introducing a named constant (e.g., START_NODE = 1) to improve readability and maintainability.
no_start_cycle_requirement = not any([1 in c for c in cycles])
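
A sketch of the named-constant suggestion, written against plain edge and cycle lists so it runs without the project's graph classes. START_NODE and no_cycle_through_start are hypothetical names, and the assumption that the start node always has id 1 is taken from the code quoted above.

```python
START_NODE = 1  # assumption: the start node always has id 1, as in the quoted code


def has_loop_to_start(edges: list[tuple[int, int]]) -> bool:
    """Return True if any (source, target) edge points back at the start node."""
    return any(target == START_NODE for _source, target in edges)


def no_cycle_through_start(cycles: list[list[int]]) -> bool:
    """Return True if none of the given cycles contains the start node."""
    return not any(START_NODE in cycle for cycle in cycles)
```

Passing a generator to any() also avoids building the intermediate list that any([...]) creates.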

        bool: True if greeting has been repeated, False otherwise.
    """
    if not regex:
        regex = r"^hello|^hi|^greetings"
Copilot AI Apr 7, 2025

The regex pattern may inadvertently match 'greetings' anywhere in the text since only the first alternatives are anchored; consider grouping the alternatives as "^(hello|hi|greetings)" to ensure all patterns are anchored at the start.

Suggested change
- regex = r"^hello|^hi|^greetings"
+ regex = r"^(hello|hi|greetings)"
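
A quick way to check how the two patterns behave, using made-up inputs. In Python's re syntax `|` has the lowest precedence, so each alternative in the original pattern carries its own `^`, and under re.match both forms should give the same results. Neither has a word boundary, so a word that merely starts with "hi" also matches. WITH_BOUNDARY is a hypothetical tightening, not part of this PR.

```python
import re

ORIGINAL = r"^hello|^hi|^greetings"
GROUPED = r"^(hello|hi|greetings)"
WITH_BOUNDARY = r"^(hello|hi|greetings)\b"  # hypothetical, not from the PR


def matches(pattern: str, text: str) -> bool:
    # Same call shape as the code under review: re.match with IGNORECASE.
    return bool(re.match(pattern, text, flags=re.IGNORECASE))


samples = ["Hi there!", "Warm greetings to you", "History is my favourite subject"]
results = {
    pattern: [matches(pattern, sample) for sample in samples]
    for pattern in (ORIGINAL, GROUPED, WITH_BOUNDARY)
}
# results[ORIGINAL]      -> [True, False, True]
# results[GROUPED]       -> [True, False, True]
# results[WITH_BOUNDARY] -> [True, False, False]
```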

@NotBioWaste905 merged commit 111c7e6 into dev on Apr 7, 2025
6 checks passed
4 participants