
Commit b689f2f

Resolved #8 - Improving the reporting (#9)
* Renewed reporting
* Fixed visualization in plotting of two graphics in 2d
* Fixed visualization in plotting of two graphics in 2d and added test case filters
* Renamed report.py
* added reporting by selecting test cases
* added folder for reports
* added documentation page that describes how to use it.
* deleted unused import
* added 3D visualization
* some small changes in texts
* some small changes in texts
* added embedding's filter for requirements
* added embeddings to test cases and filtering by them in report
* added new and missed tests for embeddings
* fixed small mistakes
* merge with master
* remove unused imports
* fixed formatting
* improved documentation
* updated documentation and README.md
* added with operator for DBClient
* fixed documentation formatting
* added functions for counting entries
* updated pyproject.toml
* fix some small mistakes
* fixed a text wrapping in radio boxes
* fixed pyproject.toml
* fixed formatting
* fixed round distance and removed extra columns
* fixed formatting
* fixed controls page
* removed tc id, added req summary to selectbox
* added labels to plots
* fixed formatting
* removed all sql text to services/db/client.py
* removed all sql text to services/db/client.py from tests
* fixed formatting
* fixed new methods in client.py
* fixed extra fetches
* fixed formatting
* fixed extra fetches
* fixed extra import
* fixed annotation's labels
* fixed annotation's labels
1 parent 3211114 commit b689f2f

22 files changed, +1320 -478 lines changed

README.md

Lines changed: 6 additions & 0 deletions
@@ -20,6 +20,12 @@ To install the dependencies, run the following command:
 uv sync
 ```
 
+To bring a code to a single format:
+
+```bash
+uvx ruff format
+```
+
 ### PyTorch version
 
 PyTorch is default set to CPU distributive:

convert_trace_annos.py

Lines changed: 44 additions & 42 deletions
@@ -13,50 +13,52 @@ def is_empty(value):
     return True if value == EMPTY else False
 
 
-def trace_test_cases_to_annos(db_path: Path, trace_file_path: Path):
-    db = get_db_client()
-
-    insertions = list()
-    logger.info("Reading trace file and inserting annotations into table...")
-    with open(trace_file_path, mode="r", newline="", encoding="utf-8") as trace_file:
-        reader = csv.reader(trace_file)
-        current_tc = EMPTY
-        concat_summary = EMPTY
-        test_script = EMPTY
-        global_columns = next(reader)
-        for row in reader:
-            if row[0] == "TestCaseStart":
-                current_tc = row[1]
-                test_script = EMPTY
-                concat_summary = EMPTY
-                next(reader)
-            elif row[0] == "Summary":
-                continue
-            elif row[0] == "TestCaseEnd":
-                if not is_empty(current_tc) and not is_empty(concat_summary):
-                    case_id = db.test_cases.get_or_insert(
-                        test_script=test_script, test_case=current_tc
-                    )
-                    annotation_id = db.annotations.get_or_insert(summary=concat_summary)
-                    insertions.append(
-                        db.cases_to_annos.insert(
-                            case_id=case_id, annotation_id=annotation_id
+def trace_test_cases_to_annos(trace_file_path: Path):
+    with get_db_client() as db:
+        insertions = list()
+        logger.info("Reading trace file and inserting annotations into table...")
+        with open(
+            trace_file_path, mode="r", newline="", encoding="utf-8"
+        ) as trace_file:
+            reader = csv.reader(trace_file)
+            current_tc = EMPTY
+            concat_summary = EMPTY
+            test_script = EMPTY
+            global_columns = next(reader)
+            for row in reader:
+                if row[0] == "TestCaseStart":
+                    current_tc = row[1]
+                    test_script = EMPTY
+                    concat_summary = EMPTY
+                    next(reader)
+                elif row[0] == "Summary":
+                    continue
+                elif row[0] == "TestCaseEnd":
+                    if not is_empty(current_tc) and not is_empty(concat_summary):
+                        case_id = db.test_cases.get_or_insert(
+                            test_script=test_script, test_case=current_tc
+                        )
+                        annotation_id = db.annotations.get_or_insert(
+                            summary=concat_summary
+                        )
+                        insertions.append(
+                            db.cases_to_annos.insert(
+                                case_id=case_id, annotation_id=annotation_id
+                            )
                         )
-                    )
-            else:
-                if not is_empty(row[global_columns.index("TestCase")]):
-                    if current_tc != row[global_columns.index("TestCase")]:
-                        current_tc = row[global_columns.index("TestCase")]
-                if is_empty(test_script) and not is_empty(
-                    row[global_columns.index("TestScript")]
-                ):
-                    test_script = row[global_columns.index("TestScript")]
-                concat_summary += row[0]
+                else:
+                    if not is_empty(row[global_columns.index("TestCase")]):
+                        if current_tc != row[global_columns.index("TestCase")]:
+                            current_tc = row[global_columns.index("TestCase")]
+                    if is_empty(test_script) and not is_empty(
+                        row[global_columns.index("TestScript")]
+                    ):
+                        test_script = row[global_columns.index("TestScript")]
+                    concat_summary += row[0]
 
-    db.conn.commit()
-    logger.info(
-        f"Inserted {len(insertions)} testcase-annotations pairs to database. Successful: {sum(insertions)}"
-    )
+        logger.info(
+            f"Inserted {len(insertions)} testcase-annotations pairs to database. Successful: {sum(insertions)}"
+        )
 
 
 if __name__ == "__main__":
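
Note on the `with get_db_client() as db:` change: the `DBClient` implementation is not part of this excerpt, so the following is only a minimal sketch of how a client could support the `with` statement. It assumes the client wraps a `sqlite3` connection; the class name `SketchDBClient` and its commit-on-success, rollback-on-error behaviour are hypothetical and may differ from the real `services/db/client.py`.

```python
import sqlite3


class SketchDBClient:
    """Hypothetical stand-in for DBClient; the real class is not shown in this diff."""

    def __init__(self, db_path: str = ":memory:"):
        self.conn = sqlite3.connect(db_path)

    def __enter__(self):
        # Returning self is what lets `with ... as db` bind the client.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Assumption: commit if the block succeeded, roll back otherwise.
        if exc_type is None:
            self.conn.commit()
        else:
            self.conn.rollback()
        self.conn.close()
        # Returning False lets any exception propagate to the caller.
        return False


with SketchDBClient() as db:
    db.conn.execute("CREATE TABLE Annotations (id INTEGER PRIMARY KEY, summary TEXT)")
    db.conn.execute("INSERT INTO Annotations (summary) VALUES (?)", ("login works",))
    row_count = db.conn.execute("SELECT COUNT(*) FROM Annotations").fetchone()[0]
```

With a client shaped like this, the caller no longer needs an explicit `db.conn.commit()`, which would be consistent with that line being dropped in the hunk above.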

main.py

Lines changed: 16 additions & 4 deletions
@@ -1,10 +1,12 @@
 import streamlit as st
 
+from test2text.pages.documentation import show_documentation
 from test2text.pages.upload.annotations import show_annotations
 from test2text.pages.upload.requirements import show_requirements
-from test2text.pages.controls.controls_page import controls_page
-from test2text.pages.report import make_a_report
+from test2text.pages.reports.report_by_req import make_a_report
+from test2text.pages.reports.report_by_tc import make_a_tc_report
 from test2text.services.visualisation.visualize_vectors import visualize_vectors
+from test2text.pages.controls.controls_page import controls_page
 
 
 def add_logo():
@@ -37,21 +39,31 @@ def add_logo():
     )
     add_logo()
 
+    about = st.Page(
+        show_documentation, title="About application", icon=":material/info:"
+    )
+
     annotations = st.Page(
         show_annotations, title="Annotations", icon=":material/database_upload:"
     )
     requirements = st.Page(
         show_requirements, title="Requirements", icon=":material/database_upload:"
     )
     cache_distances = st.Page(controls_page, title="Controls", icon=":material/cached:")
-    report = st.Page(make_a_report, title="Report", icon=":material/publish:")
+    report_by_req = st.Page(
+        make_a_report, title="Requirement's Report", icon=":material/publish:"
+    )
+    report_by_tc = st.Page(
+        make_a_tc_report, title="Test cases Report", icon=":material/publish:"
+    )
     visualization = st.Page(
         visualize_vectors, title="Visualize Vectors", icon=":material/dataset:"
     )
     pages = {
+        "Home": [about],
         "Upload": [annotations, requirements],
         "Update": [cache_distances],
-        "Inspect": [report, visualization],
+        "Inspect": [report_by_req, report_by_tc, visualization],
     }
     pg = st.navigation(pages)
 

pyproject.toml

Lines changed: 2 additions & 1 deletion
@@ -3,7 +3,8 @@ name = "test2text"
 version = "0.1.0"
 description = ""
 authors = [
-    {name = "Nikolai Dorofeev - d0rich",email = "[email protected]"}
+    {name = "Nikolai Dorofeev - d0rich", email = "[email protected]"},
+    {name = "Anna Yamkovaya - anngoroshi", email = "[email protected]"}
 ]
 readme = "README.md"
 requires-python = ">=3.9"

test2text/pages/controls/controls_page.py

Lines changed: 10 additions & 9 deletions
@@ -1,22 +1,23 @@
+from test2text.services.db import get_db_client
+
+
 def controls_page():
     import streamlit as st
     import plotly.express as px
 
-    from test2text.services.embeddings.annotation_embeddings_controls import (
-        count_all_annotations,
-        count_embedded_annotations,
-    )
-
     st.header("Controls page")
     embedding_col, distances_col = st.columns(2)
     with embedding_col:
         st.subheader("Embedding")
 
         def refresh_counts():
-            st.session_state["all_annotations_count"] = count_all_annotations()
-            st.session_state["embedded_annotations_count"] = (
-                count_embedded_annotations()
-            )
+            with get_db_client() as db:
+                st.session_state["all_annotations_count"] = db.count_all_entries(
+                    "Annotations"
+                )
+                st.session_state["embedded_annotations_count"] = (
+                    db.count_notnull_entries("embedding", from_table="Annotations")
+                )
 
         refresh_counts()
 
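
Note on `count_all_entries` and `count_notnull_entries`: their bodies live in `services/db/client.py`, which is not included in this excerpt. The sketch below shows one way such counters could be written against `sqlite3`. The free-function form, the `conn` parameter, and the `ALLOWED_COLUMNS` allow-list are all assumptions made for illustration, not the project's actual code.

```python
import sqlite3

# Assumption: identifiers are checked against a fixed allow-list, because
# SQL table and column names cannot be passed as bound parameters.
ALLOWED_COLUMNS = {"Annotations": {"id", "summary", "embedding"}}


def count_all_entries(conn, from_table):
    """Count every row in an allow-listed table."""
    if from_table not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown table: {from_table}")
    return conn.execute(f'SELECT COUNT(*) FROM "{from_table}"').fetchone()[0]


def count_notnull_entries(conn, column, from_table):
    """Count rows whose given column is not NULL."""
    if column not in ALLOWED_COLUMNS.get(from_table, set()):
        raise ValueError(f"unknown column: {from_table}.{column}")
    # COUNT(column) skips NULL values, unlike COUNT(*).
    query = f'SELECT COUNT("{column}") FROM "{from_table}"'
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Annotations (id INTEGER PRIMARY KEY, summary TEXT, embedding BLOB)"
)
conn.executemany(
    "INSERT INTO Annotations (summary, embedding) VALUES (?, ?)",
    [("embedded annotation", b"\x00\x01"), ("pending annotation", None)],
)
total = count_all_entries(conn, "Annotations")
embedded = count_notnull_entries(conn, "embedding", from_table="Annotations")
```

Here `total` covers both rows while `embedded` counts only the row with a stored embedding, which is the pair of numbers the Controls page keeps in `st.session_state`.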

test2text/pages/documentation.py

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+import streamlit as st
+
+from test2text.services.db import get_db_client
+
+
+def show_documentation():
+    st.markdown("""
+    # Test2Text Application Documentation
+
+    ## About the Application
+
+    **Test2Text** is a tool for computing requirement's coverage by tests and generating relevant reports.
+    The application provides a convenient interface for analysis the relationships between test cases and requirements.
+
+    """)
+    st.divider()
+    st.markdown("""
+    ## HOW TO USE
+
+    ### Upload data
+    Click :gray-badge[:material/database_upload: Annotations] or :gray-badge[:material/database_upload: Requirements] to upload annotations and requirements from CSV files to the app's database.
+    Then Annotations and Requirements are loaded and Test cases are linked to Annotations go to the next chapter.
+
+    ### Renew data
+    Click :gray-badge[:material/cached: Controls] to transform missed and new texts into numeral vectors (embeddings).
+    Update distances by embeddings for intelligent matching of Requirements and Annotations.
+    After distances are refreshed (all Annotations linked with Requirement by distances) go to the next chapter.
+
+    ### Generate reports
+    Click :gray-badge[:material/publish: Requirement's Report] or :gray-badge[:material/publish: Test cases Report] to make a report.
+    Use filters and Smart search based on embeddings to select desired information.
+    Analyze selected requirements or test cases by plotted distances.
+    List of all requirements/test cases and their annotations are shown here.
+
+    ### Visualize saved data
+    Click :gray-badge[:material/dataset: Visualize vectors] to plot distances between vector representations of all requirements and annotations in multidimensional spaces.
+
+    """)
+    st.divider()
+    with get_db_client() as db:
+        st.markdown("""## Database overview""")
+        table, row_count = st.columns(2)
+        with table:
+            st.write("Table name")
+        with row_count:
+            st.write("Number of entries")
+        for table_name, count in db.get_db_full_info.items():
+            with table:
+                st.write(table_name)
+            with row_count:
+                st.write(count)
+    st.divider()
+    st.markdown("""
+    ### Methodology
+    The application use a pre-trained transformer model from the [sentence-transformers library](https://huggingface.co/sentence-transformers), specifically [nomic-ai/nomic-embed-text-v1](https://huggingface.co/nomic-ai/nomic-embed-text-v1), a model trained to produce high-quality vector embeddings for text.
+    The model returns, for each input text, a high-dimensional NumPy array (vector) of floating point numbers (the embedding).
+    This arrays give a possibility to calculate Euclidian distances between test cases annotations and requirements to show how similar or dissimilar the two texts.
+    """)
+
+    st.markdown("""
+    #### Euclidean (L2) Distance Formula
+    The Euclidean (L2) distance is a measure of the straight-line distance between two points (or vectors) in a multidimensional space.
+    It is widely used to compute the similarity or dissimilarity between two vector representations, such as text embeddings.
+    """)
+    st.markdown("""
+    Suppose we have two vectors:
+    """)
+    st.latex(r"""
+    \mathbf{a} = [a_1, a_2, ..., a_n] ,
+    """)
+    st.latex(r"""
+    \mathbf{b} = [b_1, b_2, ..., b_n]
+    """)
+
+    st.markdown("""
+    The L2 distance between **a** and **b** is calculated as:
+    """)
+
+    st.latex(r"""
+    L_2(\mathbf{a}, \mathbf{b}) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}
+    """)
+
+    st.markdown("""
+    Or, more compactly:
+    """)
+
+    st.latex(r"""
+    L_2(\mathbf{a}, \mathbf{b}) = \sqrt{\sum_{i=1}^n (a_i - b_i)^2}
+    """)
+
+    st.markdown("""
+    - A **smaller L2 distance** means the vectors are more similar.
+    - A **larger L2 distance** indicates greater dissimilarity.
+    """)
+
+    st.markdown("""
+    This formula is commonly used for comparing the semantic similarity of embeddings generated from text using models like Sentence Transformers.
+    """)
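
Note on the L2 formula in the documentation page: the sketch below is a plain-Python illustration of that formula on short example vectors. The helper name `l2_distance` and the sample values are made up for this note; this excerpt does not show how the application itself computes distances between embeddings.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Differences are (-3, -4, 0), so the distance is sqrt(9 + 16 + 0) = 5.0.
distance = l2_distance([1.0, 2.0, 3.0], [4.0, 6.0, 3.0])

# Identical vectors are at distance 0, the smallest possible value.
same = l2_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

The two results line up with the bullets in the page text: the identical pair gives the smaller distance (0.0), and the differing pair gives the larger one (5.0).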
