@@ -50,10 +50,9 @@ Comprehensive documentation for OntoAligner, including detailed guides and examp

## 🚀 Quick Tour

- Below is an example of using Retrieval-Augmented Generation (RAG) for ontology matching:
+ Below is a step-by-step example of using Retrieval-Augmented Generation (RAG) for ontology matching:

``` python
- import json
from ontoaligner.ontology import MaterialInformationMatOntoOMDataset
from ontoaligner.utils import metrics, xmlify
from ontoaligner.ontology_matchers import MistralLLMBERTRetrieverRAG
@@ -80,26 +79,54 @@ retriever_config = {"device": 'cuda', "top_k": 5,}
llm_config = {"device": "cuda", "max_length": 300, "max_new_tokens": 10, "batch_size": 15}

# Step 5: Initialize the RAG-based ontology matcher and generate predictions
- model = MistralLLMBERTRetrieverRAG(retriever_config=retriever_config,
-                                    llm_config=llm_config)
+ model = MistralLLMBERTRetrieverRAG(retriever_config=retriever_config, llm_config=llm_config)
predicts = model.generate(input_data=encoded_ontology)

# Step 6: Apply hybrid postprocessing
hybrid_matchings, hybrid_configs = rag_hybrid_postprocessor(predicts=predicts,
                                                             ir_score_threshold=0.1,
                                                             llm_confidence_th=0.8)

- evaluation = metrics.evaluation_report(predicts=hybrid_matchings,
-                                        references=dataset['reference'])
+ evaluation = metrics.evaluation_report(predicts=hybrid_matchings, references=dataset['reference'])
print("Hybrid Matching Evaluation Report:", evaluation)

# Step 7: Convert matchings to XML format and save the XML representation
xml_str = xmlify.xml_alignment_generator(matchings=hybrid_matchings)
- with open("matchings.xml", "w", encoding="utf-8") as xml_file:
-     xml_file.write(xml_str)
+ open("matchings.xml", "w", encoding="utf-8").write(xml_str)
```
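For intuition, here is a minimal, illustrative sketch of what the hybrid thresholds above do conceptually: a candidate matching is kept only if both its retriever score and its LLM confidence clear their thresholds. The `hybrid_filter` helper, the dictionary keys `score` and `confidence`, and the toy data are hypothetical and do not reflect the actual output format of `rag_hybrid_postprocessor`.

``` python
# Illustrative sketch only (not OntoAligner's API): mimics the idea behind
# hybrid postprocessing, assuming each prediction is a dict with hypothetical
# "score" (retriever) and "confidence" (LLM) keys.
def hybrid_filter(predictions, ir_score_threshold=0.1, llm_confidence_th=0.8):
    return [
        p for p in predictions
        if p["score"] >= ir_score_threshold and p["confidence"] >= llm_confidence_th
    ]

# Toy data: only the first candidate clears both thresholds.
toy_predictions = [
    {"source": "A", "target": "B", "score": 0.42, "confidence": 0.91},
    {"source": "C", "target": "D", "score": 0.05, "confidence": 0.95},
]
print(hybrid_filter(toy_predictions))
```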

+ Alternatively, the same RAG-based matching can be run end-to-end through the ontology alignment pipeline:

+ ``` python
+ import ontoaligner
+
+ pipeline = ontoaligner.OntoAlignerPipeline(
+     task_class=ontoaligner.ontology.MaterialInformationMatOntoOMDataset,
+     source_ontology_path="assets/MI-MatOnto/mi_ontology.xml",
+     target_ontology_path="assets/MI-MatOnto/matonto_ontology.xml",
+     reference_matching_path="assets/MI-MatOnto/matchings.xml",
+ )
+
+ matchings, evaluation = pipeline(
+     method="rag",
+     encoder_model=ontoaligner.ConceptRAGEncoder(),
+     model_class=ontoaligner.ontology_matchers.MistralLLMBERTRetrieverRAG,
+     postprocessor=ontoaligner.postprocess.rag_hybrid_postprocessor,
+     llm_path='mistralai/Mistral-7B-v0.3',
+     retriever_path='all-MiniLM-L6-v2',
+     llm_threshold=0.5,
+     ir_threshold=0.7,
+     top_k=5,
+     max_length=512,
+     max_new_tokens=10,
+     device='cuda',
+     batch_size=32,
+     return_matching=True,
+     evaluate=True
+ )
+
+ print("Matching Evaluation Report:", evaluation)
+ ```
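If the pipeline's matchings should also be written to disk, the `xmlify` helper from the step-by-step example above can be reused. This is a small editorial sketch rather than part of the change above: the output filename is illustrative, and it assumes the pipeline returns matchings in the format `xml_alignment_generator` expects.

``` python
# Sketch: persist the pipeline's matchings, reusing the xmlify helper shown
# earlier. The filename is illustrative; this assumes `matchings` is
# compatible with xml_alignment_generator.
from ontoaligner.utils import xmlify

xml_str = xmlify.xml_alignment_generator(matchings=matchings)
with open("pipeline-matchings.xml", "w", encoding="utf-8") as xml_file:
    xml_file.write(xml_str)
```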
## ⭐ Contribution

We welcome contributions to enhance OntoAligner and make it even better! Please review our contribution guidelines in [CONTRIBUTING.md](CONTRIBUTING.md) before getting started. Your support is greatly appreciated.
@@ -119,7 +146,7 @@ If you use OntoAligner in your work or research, please cite the following:
 title = {OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment},
 version = {1.0.0},
 year = {2024},
- url = {https://github.com/HamedBabaei/OntoAligner},
+ url = {https://github.com/sciknoworg/OntoAligner},
}
```