
Commit ac0c232

📝 update challenge learners docs

1 parent 15f628b · commit ac0c232

File tree: 6 files changed, +312 −748 lines changed

docs/source/learners/llms4ol.rst

Lines changed: 39 additions & 14 deletions
@@ -1,40 +1,61 @@
-LLMs4OL Challenge Learners
-==========================
 
-LLMs4OL is a community development initiative collocated with the 23rd International Semantic Web Conference (ISWC) to explore the potential of Large Language Models (LLMs) in Ontology Learning (OL), a vital process for enhancing the web with structured knowledge to improve interoperability. By leveraging LLMs, the challenge aims to advance understanding and innovation in OL, aligning with the goals of the Semantic Web to create a more intelligent and user-friendly web.
+.. sidebar:: Challenge Series Websites
+
+    * `1st LLMs4OL @ ISWC 2024 <https://sites.google.com/view/llms4ol>`_
+    * `2nd LLMs4OL @ ISWC 2025 <https://sites.google.com/view/llms4ol2025>`_
+
+
+.. raw:: html
+
+    <div align="center">
+        <img src="https://raw.githubusercontent.com/sciknoworg/OntoLearner/refs/heads/dev/docs/source/learners/images/challenge-logo.png" alt="challenge-logo" width="10%"/>
+    </div>
+
+
+LLMs4OL Challenge
+==================================================================================================================
+
+
+
+
+LLMs4OL is a community development initiative collocated with the International Semantic Web Conference (ISWC) to explore the potential of Large Language Models (LLMs) in Ontology Learning (OL), a vital process for enhancing the web with structured knowledge to improve interoperability. By leveraging LLMs, the challenge aims to advance understanding and innovation in OL, aligning with the goals of the Semantic Web to create a more intelligent and user-friendly web.
 
 
 .. list-table::
-    :widths: 20 80
+    :widths: 20 20 60
     :header-rows: 1
 
-    * - **Task**
+    * - **Edition**
+      - **Task**
       - **Description**
-    * - **Text2Onto**
+    * - ``LLMs4OL'25``
+      - **Text2Onto**
       - Extract ontological terms and types from unstructured text.
 
        **ID**: ``text-to-onto``
 
        **Info**: This task focuses on extracting foundational elements (Terms and Types) from unstructured text documents to build the initial structure of an ontology. It involves recognizing domain-relevant vocabulary (Term Extraction, SubTask 1) and categorizing it appropriately (Type Extraction, SubTask 2). It bridges the gap between natural language and structured knowledge representation.
 
        **Example**: **COVID-19** is a term of the type **Disease**.
-    * - **Term Typing**
+    * - ``LLMs4OL'24``, ``LLMs4OL'25``
+      - **Term Typing**
       - Discover the generalized type for a lexical term.
 
        **ID**: ``term-typing``
 
        **Info**: The process of assigning a generalized type to each lexical term involves mapping lexical items to their most appropriate semantic categories or ontological classes. For example, in the biomedical domain, the term ``aspirin`` should be classified under ``Pharmaceutical Drug``. This task is crucial for organizing extracted terms into structured ontologies and improving knowledge reuse.
 
        **Example**: Assign the type ``"disease"`` to the term ``"myocardial infarction"``.
-    * - **Taxonomy Discovery**
+    * - ``LLMs4OL'24``, ``LLMs4OL'25``
+      - **Taxonomy Discovery**
       - Discover the taxonomic hierarchy between type pairs.
 
        **ID**: ``taxonomy-discovery``
 
        **Info**: Taxonomy discovery focuses on identifying hierarchical relationships between types, enabling the construction of taxonomic structures (i.e., ``is-a`` relationships). Given a pair of terms or types, the task determines whether one is a subclass of the other. For example, discovering that ``Sedan is a subclass of Car`` contributes to structuring domain knowledge in a way that supports reasoning and inferencing in ontology-driven applications.
 
        **Example**: Recognize that ``"lung cancer"`` is a subclass of ``"cancer"``, which is a subclass of ``"disease"``.
-    * - **Non-Taxonomic Relation Extraction**
+    * - ``LLMs4OL'24``, ``LLMs4OL'25``
+      - **Non-Taxonomic Relation Extraction**
      - Identify non-taxonomic, semantic relations between types.
 
        **ID**: ``non-taxonomic-re``

@@ -44,13 +65,17 @@ LLMs4OL is a community development initiative collocated with the 23rd Internati
        **Example**: Identify that *"virus"* ``causes`` *"infection"* or *"aspirin"* ``treats`` *"headache"*.
 
 
+.. note::
+
+    * Proceedings of 1st LLMs4OL Challenge @ ISWC 2024 available at `https://www.tib-op.org/ojs/index.php/ocp/issue/view/169 <https://www.tib-op.org/ojs/index.php/ocp/issue/view/169>`_
+    * Proceedings of 2nd LLMs4OL Challenge @ ISWC 2025 available at `https://www.tib-op.org/ojs/index.php/ocp/issue/view/185 <https://www.tib-op.org/ojs/index.php/ocp/issue/view/185>`_
 
 .. toctree::
     :maxdepth: 1
-    :caption: LLMs4OL Learners
+    :caption: LLMs4OL Challenge Series Participants Learners
     :titlesonly:
 
-    rwthdbis_learner
-    skhnlp_learner
-    alexbek_learner
-    sbunlp_learner
+    llms4ol_challenge/rwthdbis_learner
+    llms4ol_challenge/skhnlp_learner
+    llms4ol_challenge/alexbek_learner
+    llms4ol_challenge/sbunlp_learner
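The table above fixes the task IDs that every learner page in this changeset keys on. For orientation, a minimal sketch of the shared load/extract/split preamble those pages use, grounded in the examples elsewhere in this commit; the sample record shape follows the ``{"term": ..., "types": [...]}`` convention the learner docs reference:

.. code-block:: python

    from ontolearner import GeoNames, train_test_split

    # Load an example ontology and extract labeled task data
    ontology = GeoNames()
    ontology.load()
    data = ontology.extract()

    # Split into train/test for any of the task IDs above
    train_data, test_data = train_test_split(data, test_size=0.2, random_state=42)

    # A term-typing record matching the Term Typing example row (illustrative)
    example = {"term": "myocardial infarction", "types": ["disease"]}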

docs/source/learners/alexbek_learner.rst renamed to docs/source/learners/llms4ol_challenge/alexbek_learner.rst

Lines changed: 29 additions & 172 deletions
@@ -51,7 +51,7 @@ For term typing, we use GeoNames as an example ontology. Labeled term–type pai
 
 .. code-block:: python
 
-    from ontolearner import GeoNames, train_test_split, evaluation_report
+    from ontolearner import GeoNames, train_test_split
 
     # Load the GeoNames ontology and extract labeled term-typing data
     ontology = GeoNames()
@@ -74,15 +74,15 @@ The task IDs are: ``term-typing``, ``taxonomy-discovery``, ``non-taxonomic-re``.
 
 .. code-block:: python
 
-    from ontolearner.learner.term_typing import AlexbekRFLearner
-
     task = "term-typing"
 
 We first configure the Alexbek random-forest learner.
 This learner builds features from text embeddings (and optionally graph structure) and trains a random-forest classifier for term typing.
 
 .. code-block:: python
 
+    from ontolearner.learner.term_typing import AlexbekRFLearner
+
     rf_learner = AlexbekRFLearner(
         device="cpu",  # switch to "cuda" if available
         batch_size=16,
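The learner's probability cutoff (``threshold``, set to 0.30 in the example configuration) decides which types get assigned to a term. A minimal sketch of that semantics — illustrative only, not AlexbekRFLearner's internal code:

.. code-block:: python

    # Illustrative only: how a per-type probability cutoff (threshold=0.30)
    # turns classifier probabilities into multi-label type predictions.
    type_labels = ["mountain", "river", "city"]   # hypothetical GeoNames-style types
    probabilities = [0.72, 0.35, 0.05]            # one term's per-type scores
    predicted = [t for t, p in zip(type_labels, probabilities) if p >= 0.30]
    print(predicted)  # ['mountain', 'river']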
@@ -91,6 +91,12 @@ This learner builds features from text embeddings (and optionally graph structur
         use_graph_features=True  # set False for pure text-based features
     )
 
+Learn and Predict
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: python
+
+    from ontolearner import evaluation_report
     # Fit the RF-based learner on the training split
     rf_learner.fit(train_data, task=task)
@@ -102,62 +108,6 @@
     metrics = evaluation_report(y_true=truth, y_pred=predicts, task=task)
     print(metrics)
 
-Pipeline Usage
-~~~~~~~~~~~~~~
-
-The :class:`LearnerPipeline` class integrates the random-forest term-typing learner with a retriever, runs training, and evaluates performance on the test split.
-
-.. code-block:: python
-
-    # Import core modules from the OntoLearner library
-    from ontolearner import GeoNames, train_test_split, LearnerPipeline
-    from ontolearner.learner.term_typing import AlexbekRFLearner  # RF learner over text+graph features
-
-    # Load the GeoNames ontology and extract labeled term-typing data
-    ontology = GeoNames()
-    ontology.load()
-    data = ontology.extract()
-
-    # Split the labeled term-typing data into train and test sets
-    train_data, test_data = train_test_split(
-        data,
-        test_size=0.2,
-        random_state=42,
-    )
-
-    # Configure the RF-based learner (embeddings + optional graph features)
-    rf_learner = AlexbekRFLearner(
-        device="cpu",              # switch to "cuda" if you have a GPU
-        batch_size=16,
-        max_length=512,            # max tokenizer length for embedding model inputs
-        threshold=0.30,            # probability cutoff for assigning each type
-        use_graph_features=True,   # set False for pure RF on text embeddings only
-    )
-
-    # Build the pipeline and pass raw structured objects end-to-end.
-    pipe = LearnerPipeline(
-        retriever=rf_learner,
-        retriever_id="intfloat/e5-base-v2",  # or "Qwen/Qwen3-Embedding-4B" if you have sufficient GPU memory
-        ontologizer_data=True,  # True if data is already {"term": ..., "types": [...], ...}
-        device="cpu",
-        batch_size=16,
-    )
-
-    # Run the full learning pipeline on the term-typing task
-    outputs = pipe(
-        train_data=train_data,
-        test_data=test_data,
-        task="term-typing",
-        evaluate=True,
-        ontologizer_data=True,
-    )
-
-    # Display evaluation summary and runtime
-    print("Metrics:", outputs.get("metrics"))
-    print("Elapsed time:", outputs["elapsed_time"])
-    print(outputs)
-
-
 Term Typing (RAG-based)
 -----------------------
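Both the kept and removed examples end by printing a ``metrics`` mapping from ``evaluation_report``; a removed pipeline later in this diff notes its shape as ``{'precision': ..., 'recall': ..., 'f1_micro': ...}``. A small sketch of consuming such a mapping (values invented for display, actual keys come from ``evaluation_report``):

.. code-block:: python

    # Illustrative values only; real numbers come from evaluation_report.
    metrics = {"precision": 0.81, "recall": 0.77, "f1_micro": 0.79}
    for name, value in sorted(metrics.items()):
        print(f"{name}: {value:.2f}")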

@@ -168,7 +118,7 @@ The RAG-based term-typing setup also uses GeoNames. We again load the ontology a
 
 .. code-block:: python
 
-    from ontolearner import GeoNames, train_test_split, evaluation_report
+    from ontolearner import GeoNames, train_test_split
 
     ontology = GeoNames()
     ontology.load()
@@ -190,15 +140,15 @@ The task IDs are: ``term-typing``, ``taxonomy-discovery``, ``non-taxonomic-re``.
 
 .. code-block:: python
 
-    from ontolearner.learner.term_typing import AlexbekRAGLearner
-
     task = "term-typing"
 
 Next, we configure a Retrieval-Augmented Generation (RAG) term-typing classifier.
 An encoder retrieves top-k similar training examples, and a generative LLM predicts types conditioned on the query term plus retrieved examples.
 
 .. code-block:: python
 
+    from ontolearner.learner.term_typing import AlexbekRAGLearner
+
     rag_learner = AlexbekRAGLearner(
         llm_model_id="Qwen/Qwen2.5-0.5B-Instruct",
         retriever_model_id="sentence-transformers/all-MiniLM-L6-v2",
@@ -211,6 +161,13 @@
     # Load the underlying LLM and retriever for RAG-based term typing
     rag_learner.load(llm_id=rag_learner.llm_model_id)
 
+Learn and Predict
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: python
+
+    from ontolearner import evaluation_report
+
     # Index the training data for retrieval and prepare prompts
     rag_learner.fit(train_data, task=task)
 
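As the surrounding prose explains, the RAG learner first retrieves the top-k most similar training examples for a query term. A minimal sketch of that retrieval step using the same encoder named in the configuration; this illustrates the idea, not AlexbekRAGLearner's implementation, and the terms are hypothetical:

.. code-block:: python

    from sentence_transformers import SentenceTransformer, util

    encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    train_terms = ["Alps", "Danube", "Berlin"]   # hypothetical training terms
    query_term = "Rhine"

    # Retrieve the top-k most similar training examples for the query term
    hits = util.semantic_search(
        encoder.encode(query_term, convert_to_tensor=True),
        encoder.encode(train_terms, convert_to_tensor=True),
        top_k=3,
    )[0]
    for hit in hits:
        print(train_terms[hit["corpus_id"]], round(hit["score"], 3))

The retrieved examples are then packed into the LLM prompt alongside the query term, which is what ``top_k=3`` in the learner configuration controls.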
@@ -222,59 +179,6 @@
     metrics = evaluation_report(y_true=truth, y_pred=predicts, task=task)
     print(metrics)
 
-Pipeline Usage
-~~~~~~~~~~~~~~
-
-We place the RAG learner in the ``llm`` slot of :class:`LearnerPipeline`.
-The pipeline handles retrieval, LLM calls, and evaluation end-to-end.
-
-.. code-block:: python
-
-    # Import core modules from the OntoLearner library
-    from ontolearner import GeoNames, train_test_split, LearnerPipeline
-    from ontolearner.learner.term_typing import AlexbekRAGLearner
-
-    # Load the GeoNames ontology.
-    ontology = GeoNames()
-    ontology.load()
-
-    # Extract labeled items and split into train/test sets for evaluation
-    train_data, test_data = train_test_split(
-        ontology.extract(),
-        test_size=0.2,
-        random_state=42,
-    )
-
-    # Configure a Retrieval-Augmented Generation (RAG) term-typing classifier.
-    rag_learner = AlexbekRAGLearner(
-        llm_model_id="Qwen/Qwen2.5-0.5B-Instruct",
-        retriever_model_id="sentence-transformers/all-MiniLM-L6-v2",
-        device="cuda",
-        top_k=3,
-        max_new_tokens=256,
-        output_dir="./results/",
-    )
-
-    # Build the pipeline and pass raw structured objects end-to-end.
-    pipe = LearnerPipeline(
-        llm=rag_learner,
-        llm_id="Qwen/Qwen2.5-0.5B-Instruct",
-        ontologizer_data=True,
-    )
-
-    # Run the full learning pipeline on the term-typing task
-    outputs = pipe(
-        train_data=train_data,
-        test_data=test_data,
-        task="term-typing",
-        evaluate=True,
-        ontologizer_data=True,
-    )
-
-    # Display the evaluation results and runtime
-    print("Metrics:", outputs.get("metrics"))  # e.g., {'precision': ..., 'recall': ..., 'f1_micro': ..., ...}
-    print("Elapsed time (s):", outputs.get("elapsed_time"))
-
 
 Taxonomy Discovery
 ------------------
@@ -286,7 +190,7 @@ For taxonomy discovery, we again use the GeoNames ontology. It exposes parent–
 
 .. code-block:: python
 
-    from ontolearner import GeoNames, train_test_split, evaluation_report
+    from ontolearner import GeoNames, train_test_split
 
     ontology = GeoNames()
     ontology.load()
@@ -307,15 +211,15 @@ The task IDs are: ``term-typing``, ``taxonomy-discovery``, ``non-taxonomic-re``.
 
 .. code-block:: python
 
-    from ontolearner import AlexbekCrossAttnLearner
-
     task = "taxonomy-discovery"
 
 Next, we configure the Alexbek cross-attention learner.
 It uses embeddings of type labels and a lightweight cross-attention layer to predict *is-a* relations.
 
 .. code-block:: python
 
+    from ontolearner import AlexbekCrossAttnLearner
+
     cross_learner = AlexbekCrossAttnLearner(
         embedding_model="sentence-transformers/all-MiniLM-L6-v2",
         device="cpu",
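The cross-attention learner scores a candidate *is-a* pair from the embeddings of the two type labels. A toy sketch of that mechanism, assuming random stand-in embeddings; it mirrors the idea of a lightweight cross-attention layer, not the learner's actual architecture:

.. code-block:: python

    import torch

    # 384 matches all-MiniLM-L6-v2 embeddings; 8 heads as in the configuration
    dim, num_heads = 384, 8
    attention = torch.nn.MultiheadAttention(dim, num_heads, batch_first=True)

    child = torch.randn(1, 1, dim)    # stand-in embedding for the child type label
    parent = torch.randn(1, 1, dim)   # stand-in embedding for the parent type label

    # Let the child representation attend to the parent, then score the pair
    fused, _ = attention(query=child, key=parent, value=parent)
    is_a_score = torch.sigmoid(fused.mean())  # toy is-a plausibility score
    print(is_a_score.item())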
@@ -329,6 +233,13 @@
         seed=42,
     )
 
+Learn and Predict
+~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: python
+
+    from ontolearner import evaluation_report
+
     # Train the cross-attention model on taxonomic edges
     cross_learner.fit(train_data, task=task)
 
@@ -339,57 +250,3 @@
     truth = cross_learner.tasks_ground_truth_former(data=test_data, task=task)
     metrics = evaluation_report(y_true=truth, y_pred=predicts, task=task)
     print(metrics)
-
-Pipeline Usage
-~~~~~~~~~~~~~~
-
-Here, :class:`LearnerPipeline` trains the cross-attention model on train edges, predicts taxonomic relations on the test set, and reports evaluation metrics.
-
-.. code-block:: python
-
-    from ontolearner import GeoNames, train_test_split, LearnerPipeline
-    from ontolearner.learner.taxonomy_discovery import AlexbekCrossAttnLearner
-
-    # Load & split
-    ontology = GeoNames()
-    ontology.load()
-    data = ontology.extract()
-    train_data, test_data = train_test_split(
-        data,
-        test_size=0.2,
-        random_state=42,
-    )
-
-    # Configure the cross-attention learner
-    cross_learner = AlexbekCrossAttnLearner(
-        embedding_model="sentence-transformers/all-MiniLM-L6-v2",
-        device="cpu",
-        num_heads=8,
-        lr=5e-5,
-        weight_decay=0.01,
-        num_epochs=1,
-        batch_size=256,
-        neg_ratio=1.0,
-        output_dir="./results/crossattn/",
-        seed=42,
-    )
-
-    # Build pipeline
-    pipeline = LearnerPipeline(
-        llm=cross_learner,        # cross-attention learner
-        llm_id="cross-attn",      # label for bookkeeping
-        ontologizer_data=False,
-    )
-
-    # Train + predict + evaluate
-    outputs = pipeline(
-        train_data=train_data,
-        test_data=test_data,
-        task="taxonomy-discovery",
-        evaluate=True,
-        ontologizer_data=False,
-    )
-
-    print("Metrics:", outputs.get("metrics"))
-    print("Elapsed time:", outputs["elapsed_time"])
-    print(outputs)
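Across all three learners, the new "Learn and Predict" sections converge on one fit → predict → evaluate contract. A condensed sketch of that flow for term typing; note the ``predict`` call's signature is assumed by analogy with ``fit``, and ``tasks_ground_truth_former`` is assumed to exist on the RF learner as it does on the cross-attention learner:

.. code-block:: python

    from ontolearner import GeoNames, train_test_split, evaluation_report
    from ontolearner.learner.term_typing import AlexbekRFLearner

    task = "term-typing"
    ontology = GeoNames()
    ontology.load()
    train_data, test_data = train_test_split(ontology.extract(), test_size=0.2, random_state=42)

    learner = AlexbekRFLearner(device="cpu", batch_size=16)
    learner.fit(train_data, task=task)
    predicts = learner.predict(test_data, task=task)                      # assumed signature
    truth = learner.tasks_ground_truth_former(data=test_data, task=task)  # assumed, by analogy
    metrics = evaluation_report(y_true=truth, y_pred=predicts, task=task)
    print(metrics)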
