anonimoustt/anonymous1

Introduction

In this research, we present a short text classification method for digital forensic analysis that computes probability scores for target topics within a corpus of conversational texts. State-of-the-art text classification methods typically depend on a trained model, extensive training data, human input, and a large corpus for efficient inference; our approach does not require these. We use Sentence Transformer to generate the embeddings and compare our model's performance against other embedding techniques, such as Word2Vec and FastText. We also evaluate our method against zero-shot and few-shot models. Our experiments use two benchmarks: DailyDialog {Lhoest_Datasets_A_Community_2021} and DialogSum {chen-etal-2021-dialogsum}. The empirical results show that our model outperforms traditional text classification techniques on these benchmarks.
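To make the idea of per-topic probability scores concrete, here is a minimal sketch. It is not the implementation from the notebooks: it assumes, purely for illustration, that a text is scored by the cosine similarity between its embedding and each topic embedding, and that a softmax turns those similarities into probabilities. The function names, the `temperature` parameter, and the model name `all-MiniLM-L6-v2` are hypothetical choices, not taken from this repository.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def topic_probabilities(
    text_vec: Sequence[float],
    topic_vecs: Dict[str, Sequence[float]],
    temperature: float = 0.1,  # hypothetical knob: smaller values sharpen the distribution
) -> Dict[str, float]:
    """Softmax over cosine similarities (an assumed scoring rule, not the authors')."""
    if not topic_vecs:
        raise ValueError("at least one topic is required")
    sims = {topic: cosine(text_vec, vec) for topic, vec in topic_vecs.items()}
    top = max(sims.values())  # subtract the maximum for numerical stability
    exps = {topic: math.exp((s - top) / temperature) for topic, s in sims.items()}
    total = sum(exps.values())
    return {topic: e / total for topic, e in exps.items()}


def embed(texts: List[str], model_name: str = "all-MiniLM-L6-v2") -> List[List[float]]:
    """Embed texts with sentence-transformers; the model name is an assumption."""
    from sentence_transformers import SentenceTransformer  # imported lazily, optional dependency

    model = SentenceTransformer(model_name)
    return model.encode(texts).tolist()


# Toy 2-D vectors stand in for real embeddings so the sketch runs without a model download.
toy_topics = {"travel": [1.0, 0.0], "food": [0.0, 1.0]}
toy_scores = topic_probabilities([0.9, 0.1], toy_topics)
```

With real data, one would pass the vectors returned by `embed(...)` for the topic labels and for each utterance into `topic_probabilities`. No classifier is trained at any point, which is the property the introduction emphasizes.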

Data

The DialogSum {chen-etal-2021-dialogsum} data are available at:

https://drive.google.com/drive/folders/1VnW2__6D2RtI0TMP7Ggsyp20TKeevDvq?usp=sharing

The DailyDialog {Lhoest_Datasets_A_Community_2021} data are available at:

https://huggingface.co/datasets/peandrew/dialy_dialogue_with_recoginized_concept_raw

Code

  1. The DailyDialog {Lhoest_Datasets_A_Community_2021} application is available in the following Google Colab notebook:

https://colab.research.google.com/drive/1QJY60RVnX5etwU0wLImPXU6ra5NpEDVk?usp=sharing

  2. The DialogSum {chen-etal-2021-dialogsum} application is available in the following Google Colab notebook:

https://colab.research.google.com/drive/15D03KLZTzLk0M5vSlEeUWiNhYRJvU33f?usp=sharing

  3. The comparison with current state-of-the-art embedding methods is available in the following Google Colab notebook:

https://colab.research.google.com/drive/1D6TUZLvSrbIJXcb3Z7l4KDYbWWZPrFJR?usp=sharing
