Commit bf22529: Update README.md
1 parent b6f5568 commit bf22529
1 file changed: 85 additions, 94 deletions

README.md
_Last updated: December 18, 2025_

## Table of Contents

- [Abstract](#abstract)
- [Repository Purpose](#repository-purpose)
- [Contributions](#contributions)
- [Getting Started](#getting-started)
- [Code Structure](#code-structure)
- [Method Overview](#method-overview)
  - [Problem Setting](#problem-setting)
  - [Manual Annotation Protocol](#manual-annotation-protocol)
  - [Geometric Feature Engineering](#geometric-feature-engineering)
  - [Learning Architecture](#learning-architecture)
  - [Evaluation Protocol](#evaluation-protocol)
- [Qualitative Findings](#qualitative-findings)
- [Future Research Directions](#future-research-directions)
- [Citation](#citation)
- [External Research Collaborators](#external-research-collaborators)
## Abstract

<div style="
  max-width: 1100px;
  margin: auto;
  text-align: justify;
  text-justify: inter-word;
  line-height: 1.6;
">
Semantic segmentation of unstructured 3D point clouds remains a challenging problem, particularly in domains where appearance cues are unavailable. In plant phenotyping, LiDAR-based point clouds provide rich geometric information but suffer from class imbalance, occlusion, and geometric ambiguity between biological and abiotic structures. This work investigates the use of geometry-aware deep learning for organ-level plant segmentation using Dynamic Edge Convolutional Neural Networks (DECNNs). We propose a structured annotation protocol, geometric feature augmentation, and a loss formulation tailored to highly imbalanced plant data. The released code focuses on methodology and reproducibility and accompanies an ongoing research manuscript. Our goal with this applied research project is to demonstrate how computer vision tools have a direct impact on industry problems such as 3D perception, robotics, autonomous systems, medical imaging, and industrial and agricultural inspection.
</div>

---
## Repository Purpose

This repository releases **research code** developed in collaboration with **Oak Ridge National Laboratory** for studying semantic segmentation of 3D plant point clouds.

**Important:**
- No raw or processed data is included.
- No trained model checkpoints are provided.
- The repository focuses on **methodology, architecture, and evaluation**.

---

## Contributions

- Geometry-based semantic segmentation of plant organs from 3D LiDAR point clouds.
- A structured manual annotation protocol tailored to complex biological structures.
- Integration of local geometric descriptors within dynamic graph convolutional networks.
- Robust learning under severe class imbalance in organ-level segmentation tasks.
- A research-oriented, modular codebase designed to support reproducible experimentation.

---
…train with `python train.py`, then evaluate with `python evaluation.py` and visu…

---

## Code Structure

The repository is organized as a modular research pipeline:

```text
├── data/                # Directory structure only (no data included)
│   ├── train/
│   ├── val/
│   └── test/
├── src/
│   ├── dataset.py       # ETL and geometric feature computation
│   ├── model.py         # Dynamic Edge CNN architecture
│   └── inference.py     # Inference and post-processing
├── validations/         # Data integrity and sanity checks
│   ├── check_data.py
│   ├── check_labels.py
│   └── count_nans.py
├── train.py             # Training loop
├── evaluation.py        # Metric computation and bootstrapping
├── visualization.py     # 3D visualization utilities
└── README.md
```

All directories related to raw data, predictions, and model checkpoints are **intentionally excluded** from this repository to comply with data confidentiality and intellectual property constraints associated with Oak Ridge National Laboratory (ORNL).

---
## Method Overview

### Problem Setting

Given a 3D LiDAR point cloud acquired in a controlled phenotyping environment, t…

- **Background**

The task is challenging due to:
- Severe class imbalance.
- Structural similarity between stems and stakes.
- Occlusion and sparse sampling.
- Absence of RGB or spectral information.

---

High-quality ground truth is critical for supervised semantic segmentation of 3D…

The protocol defines a rule-based workflow for point-wise segmentation and labeling of LiDAR point clouds into four semantic classes: **stem**, **leaf**, **stake**, and **background**. It enforces strict completeness, naming conventions, boundary rules, and class assignment guidelines, ensuring that every point in the original scan is assigned a biologically meaningful label.

This annotation strategy was essential for:
- Producing reliable supervision signals for deep learning.
- Reducing label noise in geometrically ambiguous regions.
- Enabling consistent evaluation across samples.
- Supporting reproducibility and future dataset extensions.

A total of 30 fully annotated 3D LiDAR point clouds were generated and used for supervised training and evaluation.

The full annotation procedure, including setup instructions, segmentation steps, labeling rules, and export formats, is documented in detail here: [Manual Annotation Protocol](assets/docs/3D_Plant_Segmentation_Protocol.pdf)

---

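Descriptors of this kind are commonly derived from the eigenvalues of each point's local covariance matrix. The sketch below computes three such features (linearity, planarity, sphericity) with NumPy; this is an illustrative choice only, and the actual feature set computed in `src/dataset.py` may differ.

```python
# Hedged sketch: per-point eigenvalue descriptors from the covariance of a
# k-nearest-neighbor neighborhood. Illustrative only; not the repository's
# exact feature computation.
import numpy as np

def eigen_features(points, k=10):
    """points: (N, 3) -> (N, 3) array of [linearity, planarity, sphericity]."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    knn = np.argsort(d2, axis=1)[:, :k]          # k nearest (incl. self)
    feats = np.empty((len(points), 3))
    for i, idx in enumerate(knn):
        cov = np.cov(points[idx].T)              # 3x3 local covariance
        l1, l2, l3 = np.sort(np.linalg.eigvalsh(cov))[::-1]  # λ1 ≥ λ2 ≥ λ3
        s = l1 + 1e-12
        feats[i] = [(l1 - l2) / s,               # linearity  (stem-like)
                    (l2 - l3) / s,               # planarity  (leaf-like)
                    l3 / s]                      # sphericity (volumetric)
    return feats

rng = np.random.default_rng(0)
line = np.c_[np.linspace(0, 1, 50), np.zeros(50), np.zeros(50)]
line += rng.normal(scale=1e-3, size=line.shape)  # jitter a 1D "stem"
f = eigen_features(line)
print(f[:, 0].mean() > 0.9)   # near-linear neighborhoods → high linearity: True
```

On a near-linear structure the linearity term dominates, which is exactly the cue that helps separate stems and stakes from leaf surfaces.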
These features encode local shape properties critical for organ discrimination.

The core model is a **Dynamic Edge Convolutional Neural Network (DECNN)** that:

- Dynamically constructs neighborhood graphs per layer.
- Learns edge features capturing local geometry.
- Operates directly on unstructured point clouds.

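The mechanism in the bullets above can be sketched in pure NumPy: each layer rebuilds a k-NN graph from its current features, forms edge features `[x_i, x_j - x_i]`, and max-aggregates over neighbors. Random weights stand in for the learned edge MLP here; the actual model lives in `src/model.py` (typically built on a framework such as PyTorch Geometric), so treat this as a sketch of the graph mechanics, not the implementation.

```python
# Toy dynamic edge-convolution layer: the neighborhood graph is recomputed
# from the CURRENT feature space at every call (hence "dynamic").
import numpy as np

def dynamic_edge_conv(x, weight, k=4):
    """x: (N, F) point features; weight: (2F, F_out) stand-in MLP weights."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # exclude self-loops
    knn = np.argsort(d2, axis=1)[:, :k]       # (N, k) neighbor indices
    xi = np.repeat(x[:, None, :], k, axis=1)  # (N, k, F) center features
    xj = x[knn]                               # (N, k, F) neighbor features
    edge = np.concatenate([xi, xj - xi], -1)  # (N, k, 2F) edge features
    h = np.maximum(edge @ weight, 0.0)        # shared "MLP" + ReLU
    return h.max(axis=1)                      # max-aggregate over neighbors

rng = np.random.default_rng(0)
pts = rng.normal(size=(32, 3))                           # toy point cloud
h1 = dynamic_edge_conv(pts, rng.normal(size=(6, 16)))    # graph on coordinates
h2 = dynamic_edge_conv(h1, rng.normal(size=(32, 16)))    # graph rebuilt on h1
print(h2.shape)  # (32, 16)
```

Note that the second layer's graph is built in learned feature space rather than in 3D, which lets semantically similar but spatially distant points become neighbors.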
To address extreme class imbalance, training uses a composite loss combining:

Model performance is evaluated using:

- Intersection over Union (IoU).
- Precision and Recall (per class).
- Sample-averaged metrics.
- Bootstrapped confidence intervals.

Qualitative evaluation is performed via 3D visualization of predicted segmentations.
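Two of the listed quantities, per-class IoU and a percentile-bootstrap confidence interval over sample-level metrics, can be sketched as follows. Helper names and toy data are illustrative assumptions, not the contents of `evaluation.py`.

```python
# Hedged sketch of per-class IoU and a percentile bootstrap CI for the mean
# of sample-level metrics. Illustrative only.
import numpy as np

def per_class_iou(pred, gt, cls):
    """Point-wise IoU for one class: |pred ∩ gt| / |pred ∪ gt|."""
    p, g = pred == cls, gt == cls
    union = np.logical_or(p, g).sum()
    return np.logical_and(p, g).sum() / union if union else np.nan

def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of sample-level metric values."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    means = [rng.choice(values, size=len(values), replace=True).mean()
             for _ in range(n_boot)]
    return np.quantile(means, [alpha / 2, 1 - alpha / 2])

# Toy usage: IoU of one class across three synthetic samples (4 classes,
# predictions correct ~90% of the time).
rng = np.random.default_rng(1)
ious = []
for _ in range(3):
    gt = rng.integers(0, 4, size=500)
    pred = np.where(rng.random(500) < 0.9, gt,
                    rng.integers(0, 4, size=500))
    ious.append(per_class_iou(pred, gt, cls=1))
lo, hi = bootstrap_ci(ious)
print(float(np.mean(ious)), (lo, hi))
```

Bootstrapping over samples (rather than points) reflects that whole annotated point clouds, not individual points, are the independent units here.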

Under the described experimental setup:

- Strong performance is observed on the dominant **Leaf** class.
- High recall is achieved for the biologically critical **Stem** class.
- Qualitative results show coherent reconstruction of plant structure.

Limitations include stem–stake ambiguity and boundary artifacts due to resolution constraints.

For a comprehensive discussion of experimental results, quantitative metrics, and additional analyses, please refer to the full exit report available here: [Exit Report](assets/docs/ORNL_DECNN_Exit_Report.pdf)

---

## Future Research Directions

Potential extensions of this work include:

---

## Citation

If you find this work useful in your research, please consider citing:
}
```

---

## External Research Collaborators

**Oak Ridge National Laboratory (ORNL) | Biosciences Division**

**Dr. John Lagergren**
joynerm@etsu.edu

---

## Acknowledgments

This research used resources of the Advanced Plant Phenotyping Laboratory and the Center for Bioenergy Innovation (CBI), which is a U.S. Department of Energy Bioenergy Research Center supported by the Office of Biological and Environmental Research in the DOE Office of Science. Oak Ridge National Laboratory is managed by UT-Battelle, LLC for the U.S. Department of Energy under Contract Number DE-AC05-00OR22725.

We sincerely thank **Dr. John Lagergren**, **Dr. Larry M. York**, and **Anand Seethepalli** (Oak Ridge National Laboratory, Biosciences Division) for providing access to experimental data, domain expertise, and valuable feedback throughout the project. We also thank **Dr. Jeff R. Knisley**, **Dr. Robert M. Price**, and **Dr. Michele Joyner** (Department of Mathematics & Statistics, East Tennessee State University) for their academic guidance and mentorship, and for making this collaboration possible by enabling meaningful real-world research and development experience in data science.

---

## Disclaimer

The views and conclusions expressed in this repository are those of the authors and do not necessarily represent the views of Oak Ridge National Laboratory or the U.S. Department of Energy. The code is provided for **academic and research purposes only**.
