Propose validators to detect start loops by anna-a-m · Pull Request #43 · deeppavlov/dialog2graph

anna-a-m · 2025-03-31T10:04:50Z

Added the is_greeting_repeated function, which detects a greeting appearing in the middle of a dialogue.
Added the has_loop_to_start function, which detects a loop back to the start of the graph (that is, an edge whose target is node 1).

kudep · 2025-03-31T12:36:35Z

dialogue2graph/metrics/no_llm_metrics/metrics.py

+def _message_has_greeting(text: str) -> bool:
+    return bool(re.match(r"^hello|^hi|^greetings", text, flags=re.IGNORECASE))


Handle the farewell (closing) as well.

kudep · 2025-03-31T12:36:52Z

dialogue2graph/metrics/no_llm_metrics/metrics.py

+def is_greeting_repeated(dialogues: list[Dialogue]) -> bool:
+    """
+    Checks whether greeting is repeated within dialogues.
+    Returns True if greeting has been repeated, False otherwise.
+    """
+    for dialogue in dialogues:
+        for i, message in enumerate(dialogue.messages):
+            if i != 0 and message.participant == "assistant" and _message_has_greeting(message.text):
+                return True
+    return False


Add an LLM-based version separately.
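To try the check from this hunk in isolation, here is a minimal, self-contained sketch. The `Message` and `Dialogue` dataclasses are simplified stand-ins (an assumption, since the project's real `Dialogue` class is not shown in this thread); the regex and the loop are the ones from the diff.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Message:
    # Stand-in for the project's message type (assumed fields only).
    participant: str
    text: str


@dataclass
class Dialogue:
    # Stand-in for the project's Dialogue type (assumed fields only).
    messages: list[Message] = field(default_factory=list)


def _message_has_greeting(text: str) -> bool:
    # Same pattern as in the diff: a greeting word at the start of the text.
    return bool(re.match(r"^hello|^hi|^greetings", text, flags=re.IGNORECASE))


def is_greeting_repeated(dialogues: list[Dialogue]) -> bool:
    """Return True if any assistant message after the first starts with a greeting."""
    for dialogue in dialogues:
        for i, message in enumerate(dialogue.messages):
            if i != 0 and message.participant == "assistant" and _message_has_greeting(message.text):
                return True
    return False


clean = Dialogue([
    Message("assistant", "Hello! How can I help?"),
    Message("user", "I need a refund."),
    Message("assistant", "Sure, what is the order number?"),
])
looped = Dialogue([
    Message("assistant", "Hello! How can I help?"),
    Message("user", "Never mind."),
    Message("assistant", "Hi! How can I help you today?"),
])

print(is_greeting_repeated([clean]))          # False
print(is_greeting_repeated([clean, looped]))  # True
```

The first message is skipped by index, so an opening greeting never counts as a repeat; only later assistant turns are inspected.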

kudep · 2025-03-31T12:37:12Z

dialogue2graph/metrics/no_llm_metrics/metrics.py

+def has_loop_to_start(G: BaseGraph) -> bool:
+    """
+    Checks whether graph has node returning to the start node.
+    Returns True if there is a loop to start, False otherwise
+    """
+    for edge in G.graph.edges:
+        if edge[1] == 1:
+            return True
+    return False


Check for validity.
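The edge check can be tried without the project's graph class. The sketch below assumes edges are `(source, target)` tuples, as `edge[1] == 1` in the diff suggests; the function name `has_edge_into_start` and the `start_node` parameter are illustrative and not part of the PR.

```python
from collections.abc import Iterable


def has_edge_into_start(edges: Iterable[tuple[int, int]], start_node: int = 1) -> bool:
    """Return True if any edge targets the start node."""
    return any(target == start_node for _source, target in edges)


linear = [(1, 2), (2, 3)]
with_loop = [(1, 2), (2, 3), (3, 1)]

print(has_edge_into_start(linear))     # False
print(has_edge_into_start(with_loop))  # True
```

Note that under this definition a self-loop on the start node, `(1, 1)`, also counts as a loop to start.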

kudep · 2025-03-31T12:37:33Z

dialogue2graph/metrics/no_llm_metrics/metrics.py

    }
+
+
+def _message_has_greeting(text: str) -> bool:


Add to the sampling and generation classes.

Copilot

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

dialogue2graph/metrics/no_llm_metrics/metrics.py:614

Using re.match here only checks the beginning of the string; if the intent is to detect greetings appearing in other positions, consider using re.search or adjusting the regex to cover the entire string.

    return bool(re.match(r"^hello|^hi|^greetings", text, flags=re.IGNORECASE))
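The difference between `re.match` and `re.search` that this comment points at can be shown with the same pattern. Whether greetings in the middle of a message should count is a design decision the thread does not settle; the sample string and the `UNANCHORED` pattern are made up for illustration.

```python
import re

PATTERN = r"^hello|^hi|^greetings"  # pattern from the diff

text = "Oh, hello there"

# re.match only tries to match at position 0 of the string.
print(bool(re.match(PATTERN, text, flags=re.IGNORECASE)))   # False

# re.search scans the string, but every alternative here starts with "^",
# so it still only succeeds at the very beginning.
print(bool(re.search(PATTERN, text, flags=re.IGNORECASE)))  # False

# Dropping the anchors and using word boundaries finds the greeting anywhere.
UNANCHORED = r"\b(hello|hi|greetings)\b"  # illustrative variant, not from the PR
print(bool(re.search(UNANCHORED, text, flags=re.IGNORECASE)))  # True
```

So switching to `re.search` alone would not change behavior; the anchors in the pattern would have to go as well.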

dialogue2graph/datasets/complex_dialogues/generation.py:233

The error message for repeated greetings in the generated dialogues currently states 'Opening is repeated' which might be misleading; consider updating it to clarify that a repeated greeting was detected.

            if is_greeting_repeated(sampled_dialogues):

NotBioWaste905 · 2025-04-07T12:29:48Z

tests/test_validators.py

Split validators into llm_validators and no_llm_validators, as is done for metrics.

NotBioWaste905 · 2025-04-07T12:30:15Z

tests/test_validators.py

Skip the tests that use embedders with pytest.mark.skipif:
https://docs.pytest.org/en/6.2.x/skipping.html
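A conditional skip along these lines might look like the sketch below. The condition is an assumption: it skips when an optional embedder package cannot be imported, and the package name is a placeholder rather than something taken from the PR. The test body is likewise a placeholder.

```python
import importlib.util

import pytest

# Assumption: the embedder tests depend on an optional package; the package
# name here is a placeholder, not taken from the PR.
EMBEDDER_PACKAGE = "sentence_transformers"
embedder_missing = importlib.util.find_spec(EMBEDDER_PACKAGE) is None


@pytest.mark.skipif(embedder_missing, reason=f"{EMBEDDER_PACKAGE} is not installed")
def test_validator_with_embedder():
    # Placeholder body: a real test would load the embedder and run a validator.
    module = importlib.import_module(EMBEDDER_PACKAGE)
    assert module is not None
```

With this marker pytest reports the test as skipped, with the given reason, instead of failing on the missing dependency.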

NotBioWaste905 · 2025-04-07T12:30:26Z

experiments/exp2025_04_03_selecting_emb_params/README.md

…readme

Copilot

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

dialogue2graph/datasets/complex_dialogues/generation.py:150

Using the literal '1' to identify the start node can be unclear; consider introducing a named constant (e.g., START_NODE = 1) to improve readability and maintainability.

no_start_cycle_requirement = not any([1 in c for c in cycles])
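A minimal sketch of the named-constant version suggested here. How `cycles` is computed is not shown in this excerpt, so the sketch assumes it is a list of cycles, each a list of node ids; the helper name `has_no_start_cycle` is illustrative.

```python
START_NODE = 1  # id of the start node, as used by the checks in this PR


def has_no_start_cycle(cycles: list[list[int]]) -> bool:
    """Return True if none of the given cycles passes through the start node."""
    return not any(START_NODE in cycle for cycle in cycles)


print(has_no_start_cycle([[2, 3, 4], [5, 6]]))  # True
print(has_no_start_cycle([[1, 2, 3], [5, 6]]))  # False
```

Passing a generator to `any` instead of a list also lets the check stop at the first cycle that contains the start node.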

Copilot · 2025-04-07T15:02:42Z

dialogue2graph/metrics/no_llm_validators/validators.py

+        bool: True if greeting has been repeated, False otherwise.
+    """
+    if not regex:
+        regex = r"^hello|^hi|^greetings"


The regex pattern may inadvertently match 'greetings' anywhere in the text since only the first alternatives are anchored; consider grouping the alternatives as "^(hello|hi|greetings)" to ensure all patterns are anchored at the start.

Suggested change

-        regex = r"^hello|^hi|^greetings"
+        regex = r"^(hello|hi|greetings)"
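A quick check of the two patterns in this suggestion. In Python's `re` syntax `|` has the lowest precedence, so each alternative in `^hello|^hi|^greetings` carries its own `^`, and on the made-up sample inputs below both spellings agree. The sketch also shows a separate caveat: neither pattern requires a word boundary after the greeting. The `STRICT` pattern is a hypothetical variant, not part of the PR.

```python
import re

ORIGINAL = r"^hello|^hi|^greetings"
GROUPED = r"^(hello|hi|greetings)"

samples = [
    "Hello, how can I help?",
    "Greetings!",
    "Warm greetings to you",            # greeting not at the start
    "History is my favourite subject",  # starts with "hi" but is not a greeting
]

for s in samples:
    a = bool(re.match(ORIGINAL, s, flags=re.IGNORECASE))
    b = bool(re.match(GROUPED, s, flags=re.IGNORECASE))
    print(f"{s!r}: original={a}, grouped={b}")

# Hypothetical stricter variant: require a word boundary after the greeting.
STRICT = r"^(hello|hi|greetings)\b"
print(bool(re.match(STRICT, "History is my favourite subject", flags=re.IGNORECASE)))  # False
print(bool(re.match(STRICT, "Hi there", flags=re.IGNORECASE)))                         # True
```

The grouped form is easier to read and to extend, even though it matches the same strings as the original here.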

anna-a-m added 2 commits March 31, 2025 12:56

Add no llm metrics to check start loops

7f4a1de

Fix lint

9f65ff5

kudep requested changes Mar 31, 2025

anna-a-m added 2 commits April 2, 2025 14:20

Added closing in the middle check

9b43684

Added checks in the graph generation process

8875195

kudep changed the title ~~Propose metrics to detect start loops~~ Propose validators to detect start loops Apr 2, 2025

Merge remote-tracking branch 'origin/dev' into feat/validate_cycles

205df40

kudep requested a review from Copilot April 3, 2025 22:04

Copilot AI reviewed Apr 3, 2025

anna-a-m added 7 commits April 4, 2025 14:38

Deleted validators from metrics and added validators

e1d08f3

Fix lint and validators import

bb884da

Removed artefacts and changed models call

18ea27c

Added tests

34c5742

Added test skip

34547c1

Changed error messages during generation

060d5b5

Fixed lint and added experiment

9fe45ff

NotBioWaste905 requested changes Apr 7, 2025

anna-a-m added 3 commits April 7, 2025 17:28

Split validators, fix imports, add another test skip, add experiment readme

04cbde5

Merge remote-tracking branch 'origin/dev' into feat/validate_cycles

230731a

Fix lint

f9d6508

anna-a-m requested a review from Copilot April 7, 2025 15:02

Copilot AI reviewed Apr 7, 2025

NotBioWaste905 merged commit 111c7e6 into dev Apr 7, 2025
6 checks passed

NotBioWaste905 merged 15 commits into dev from feat/validate_cycles

4 participants
