D.I.A.G.R.A.M.: Development of Image Analysis for Graph Recognition And Modeling

This project proposes a system that analyzes different types of hand-drawn diagrams and converts them into cleanly rendered images by way of a textual syntax. The goal is a tool that takes scanned or photographed diagram sketches and automatically generates code that renders the same diagrams digitally, so that they can later be integrated or modified.

Get Started

Install

Install the D2 CLI from the official repository: https://github.com/terrastruct/d2
- Linux: https://d2lang.com/tour/install/
- All releases

Verify the D2 installation:

d2 --version

Install dependencies using Python 3.12

conda create --name diagram python=3.12

conda activate diagram

pip3 install torch opencv-python matplotlib requests pillow pandas torchvision numpy shapely transformers sentencepiece protobuf torchmetrics scikit-learn

Check DIAGRAM installation

python src/main.py -h

Example

The CLI lives in src/main.py. Pass the -h flag to list its options:

python src/main.py -h

The CLI parameters are:

--input path/to/image1.png path/to/image2.png ...: one or more input images
--classifier path/to/classifier_weights.pth: weights of the classifier network
--bbox-detector path/to/object_detector_weights.pth: weights of the object detection network
--outputs-dir-path path/to/output_dir: directory where the outputs are written
--then-compile: flag that compiles the generated markup files into images
--markups d2-lang mermaid: markup languages to generate
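
As a rough illustration of how these flags fit together, the sketch below declares them with Python's argparse. It is not the project's actual src/main.py: the `build_parser` name, which flags are required, and the default for `--markups` are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the real src/main.py)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # One or more input images.
    parser.add_argument("--input", nargs="+", required=True)
    # Weights of the classifier and of the object detection network.
    parser.add_argument("--classifier", required=True)
    parser.add_argument("--bbox-detector", required=True)
    # Directory where outputs are written.
    parser.add_argument("--outputs-dir-path", required=True)
    # When set, markup files are also compiled into images.
    parser.add_argument("--then-compile", action="store_true")
    # Target markup languages; the default here is an assumption.
    parser.add_argument("--markups", nargs="+", default=["d2-lang"])
    # Tunable threshold; its type and default are assumptions.
    parser.add_argument("--element_arrow_distance_threshold", type=float, default=None)
    return parser


args = build_parser().parse_args([
    "--input", "demo/easy_graph.png",
    "--classifier", "demo/classifier_weights.pth",
    "--bbox-detector", "demo/object_detector_weights.pth",
    "--outputs-dir-path", "demo/outcome",
    "--then-compile",
])
print(args.input, args.then_compile, args.markups)
```

argparse turns dashes in option names into underscores, so `--bbox-detector` is read back as `args.bbox_detector`.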

The thresholds have been tuned for common diagrams, but we advise you to adjust them to your own drawing style.

For example:

--element_arrow_distance_threshold 260
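
The README does not define this threshold precisely. One plausible reading, assumed here, is that an arrow endpoint is attached to the nearest diagram element only when that element lies within the given distance in pixels. The sketch below illustrates that idea with a hypothetical `nearest_element` helper and made-up coordinates; the project may measure distances differently (for example, to bounding-box edges rather than centers).

```python
import math


def nearest_element(endpoint, element_centers, threshold):
    """Return the index of the closest center within `threshold` pixels, else None."""
    best_index, best_distance = None, math.inf
    for index, center in enumerate(element_centers):
        distance = math.dist(endpoint, center)  # Euclidean distance
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index if best_distance <= threshold else None


# Illustrative element centers and arrow endpoint (not taken from the demo images).
centers = [(100.0, 100.0), (500.0, 120.0)]
endpoint = (300.0, 100.0)

print(nearest_element(endpoint, centers, threshold=260))  # 0 (distance 200 <= 260)
print(nearest_element(endpoint, centers, threshold=150))  # None (distance 200 > 150)
```

Under this reading, a larger threshold suits sketches whose arrows stop far from the shapes they connect, while a smaller one avoids linking arrows to unrelated elements.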

Demo

Graph

Easy Graph

python src/main.py --input demo/easy_graph.png --classifier demo/classifier_weights.pth --bbox-detector demo/object_detector_weights.pth --outputs-dir-path demo/outcome --then-compile --element_arrow_distance_threshold 150

Hard Graph

python src/main.py --input demo/hard_graph.png --classifier demo/classifier_weights.pth --bbox-detector demo/object_detector_weights.pth --outputs-dir-path demo/outcome --then-compile --element_arrow_distance_threshold 250

Warning

An extra node will be detected, because the self-loop arrow is recognized as a node. You can remove it manually from the generated .d2 markup file.

Non-maximum suppression does not help here, because the "Node" label has a higher score than the "Arrow" label.
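
To see why suppression cannot remove the spurious node, consider a minimal class-agnostic greedy NMS. This is a generic sketch, not the project's detector code, and the boxes and scores are invented for illustration. Greedy NMS keeps the highest-scoring box among overlapping ones, so when the "Node" detection outscores the "Arrow" detection on the same self-loop, it is the arrow that gets dropped.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(detections, iou_threshold=0.5):
    """Keep boxes in descending score order, dropping any that overlap a kept box."""
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if all(iou(det["box"], k["box"]) < iou_threshold for k in kept):
            kept.append(det)
    return kept


# Two overlapping detections on the same self-loop (illustrative values).
detections = [
    {"label": "Arrow", "score": 0.60, "box": (10, 10, 60, 60)},
    {"label": "Node", "score": 0.90, "box": (12, 12, 62, 62)},
]
print([d["label"] for d in greedy_nms(detections)])  # ['Node']
```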

Flowchart

Easy Flowchart

python src/main.py --input demo/easy_flowchart.png --classifier demo/classifier_weights.pth --bbox-detector demo/object_detector_weights.pth --outputs-dir-path demo/outcome --then-compile --element_arrow_distance_threshold 350

Hard Flowchart

python src/main.py --input demo/hard_flowchart.png --classifier demo/classifier_weights.pth --bbox-detector demo/object_detector_weights.pth --outputs-dir-path demo/outcome --then-compile --element_arrow_distance_threshold 350

Project Overview

System components:

Preprocessor: pre-processes images (e.g. straightens them)
Classifier: classifies input images (e.g. graph-diagram, flowchart)
Extractor: extracts and builds an agnostic representation of the input diagram
Transducer: converts the agnostic representation of a diagram into a specific markup language (e.g. Mermaid)
Compiler: produces an image from a markup language file
Orchestrator: manages the other components

The classifier network is used to determine which extraction module to use.

Each extractor is specialized for a single type of diagram.

For example, given an input image of a graph:

The classifier outputs graph-diagram, so the orchestrator forwards the input image to the graph diagram extractor.
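
This routing step can be sketched as a lookup from classifier label to extractor. The function names, the `EXTRACTORS` registry, and the dictionary returned by each extractor are hypothetical stand-ins; only the labels `graph-diagram` and `flowchart` come from this README.

```python
from typing import Callable, Dict


def extract_graph(image_path: str) -> dict:
    """Placeholder for the graph diagram extractor."""
    return {"kind": "graph-diagram", "source": image_path}


def extract_flowchart(image_path: str) -> dict:
    """Placeholder for the flowchart extractor."""
    return {"kind": "flowchart", "source": image_path}


# One specialized extractor per diagram type.
EXTRACTORS: Dict[str, Callable[[str], dict]] = {
    "graph-diagram": extract_graph,
    "flowchart": extract_flowchart,
}


def orchestrate(image_path: str, classify: Callable[[str], str]) -> dict:
    """Classify the image, then forward it to the matching extractor."""
    label = classify(image_path)
    if label not in EXTRACTORS:
        raise ValueError(f"no extractor registered for diagram type {label!r}")
    return EXTRACTORS[label](image_path)


# A stub classifier stands in for the real network.
result = orchestrate("demo/easy_graph.png", classify=lambda _path: "graph-diagram")
print(result["kind"])  # graph-diagram
```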

The graph diagram extractor produces the graph's adjacency matrix, whose rows and columns represent the nodes plus external nodes (used to handle arrows that start from nowhere). Each entry is a non-negative integer giving the number of connections between the source and destination identified by its position. The extractor also generates lookup data structures for arrow annotations and node text.
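
A small example may make this representation concrete. The sketch below assumes that rows are sources and columns are destinations, and that external nodes are numbered after the real nodes; neither convention is confirmed by this README. The `build_adjacency` helper and the lookup dictionaries are illustrative, not the project's data structures.

```python
def build_adjacency(num_nodes, num_external, edges):
    """Count connections per (source, destination) pair.

    External nodes occupy the indices after the real nodes (an assumption).
    """
    size = num_nodes + num_external
    matrix = [[0] * size for _ in range(size)]
    for source, destination in edges:
        matrix[source][destination] += 1
    return matrix


# Two real nodes (0 and 1) plus one external node (2) that models
# an arrow starting from nowhere and pointing at node 0.
edges = [(0, 1), (0, 1), (1, 1), (2, 0)]
matrix = build_adjacency(num_nodes=2, num_external=1, edges=edges)

# Hypothetical lookup structures for node text and arrow annotations.
node_text = {0: "q0", 1: "q1"}
arrow_text = {(0, 1): ["a", "b"], (1, 1): ["c"]}

for row in matrix:
    print(row)
# [0, 2, 0]
# [0, 1, 0]
# [1, 0, 0]
```

Entry `matrix[0][1] == 2` records two parallel arrows from node 0 to node 1, and the diagonal entry `matrix[1][1] == 1` records a self-loop.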

The orchestrator then sends this representation to the transducers for that type of diagram (those selected by the user, or all available ones), which translate it into the target markup language. For example, in Mermaid, the output might look like:

flowchart TD
	0(Y = X
	X =T)
	1{X > Y}
	2(T = Y)
	3(( ))
	4(( ))

	1-->|Else|4
	3-->1
	2-->0
	0-->4
	1-->|Then|2
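
A transducer producing output of this shape could look roughly like the sketch below. The input format (a dictionary of nodes and a list of labeled edges) and the shape names `process`, `decision`, and `terminal` are assumptions for illustration; only the Mermaid bracket syntax is taken from the sample above.

```python
def to_mermaid(nodes, edges):
    """Render nodes and edges as Mermaid flowchart text.

    nodes: {node_id: (shape, text)}
    edges: list of (source, destination, label or None)
    """
    # Mermaid delimiters for each (hypothetical) shape name.
    brackets = {
        "process": ("(", ")"),
        "decision": ("{", "}"),
        "terminal": ("((", "))"),
    }
    lines = ["flowchart TD"]
    for node_id, (shape, text) in nodes.items():
        opening, closing = brackets[shape]
        lines.append(f"\t{node_id}{opening}{text}{closing}")
    lines.append("")
    for source, destination, label in edges:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"\t{source}{arrow}{destination}")
    return "\n".join(lines)


nodes = {1: ("decision", "X > Y"), 2: ("process", "T = Y"), 4: ("terminal", " ")}
edges = [(1, 4, "Else"), (1, 2, "Then"), (2, 4, None)]
print(to_mermaid(nodes, edges))
```

Keeping the shape-to-bracket table separate from the rendering loop means a second transducer (for D2, say) can reuse the same node and edge inputs with its own syntax table.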

Finally, the markup file is compiled with the associated compiler.
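
For D2, the compile step amounts to calling the `d2` CLI with an input file and an output file. The sketch below wraps that call with subprocess; the helper names and the choice of an SVG file named after the markup file are assumptions, not the project's actual compiler component.

```python
import shutil
import subprocess
from pathlib import Path


def d2_command(markup_path: Path, output_dir: Path) -> list[str]:
    """Build the `d2 <input> <output>` command; the .svg output name is an assumption."""
    output_path = output_dir / (markup_path.stem + ".svg")
    return ["d2", str(markup_path), str(output_path)]


def compile_d2(markup_path: Path, output_dir: Path) -> None:
    """Run the D2 CLI, failing early if it is not installed."""
    if shutil.which("d2") is None:
        raise FileNotFoundError("d2 executable not found on PATH")
    output_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if d2 exits with an error.
    subprocess.run(d2_command(markup_path, output_dir), check=True)


# Only builds and prints the command; call compile_d2(...) to actually render.
print(d2_command(Path("demo/outcome/easy_graph.d2"), Path("demo/outcome")))
```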
