Introduction

In this research, we present a short-text classification method for digital forensic analysis that computes probability scores for target topics within a corpus of conversational texts. State-of-the-art text classification methods typically depend on a trained model, extensive training data, human input, and a large corpus for efficient inference; our approach does not depend on any of these. We use the Sentence Transformer to generate embeddings and compare our model's performance with other embedding techniques, such as Word2Vec and FastText. We also evaluate our method against zero-shot and few-shot models. Our experiments use two benchmarks: DailyDialog {Lhoest_Datasets_A_Community_2021} and DialogSum {chen-etal-2021-dialogsum}. The empirical results show that our model outperforms traditional text classification techniques on these benchmarks.
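To make the idea of embedding-based topic scoring concrete, here is a minimal sketch. It is not the implementation from the notebooks. It assumes that a topic probability can be obtained by embedding the text and each topic label, taking cosine similarities, and normalizing them with a softmax. The names `topic_probabilities`, `make_bow_embedder`, and the `temperature` parameter are hypothetical, and the bag-of-words embedder is only a stand-in so that the example runs without downloading a model.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical type alias: any function that maps a text to a vector.
Embedder = Callable[[str], Sequence[float]]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores: Sequence[float], temperature: float = 0.1) -> List[float]:
    """Numerically stable softmax; `temperature` is an assumed knob."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def topic_probabilities(
    text: str, topics: Sequence[str], embed: Embedder, temperature: float = 0.1
) -> Dict[str, float]:
    """Score each topic by similarity to the text, then normalize to sum to 1."""
    text_vec = embed(text)
    sims = [cosine(text_vec, embed(topic)) for topic in topics]
    return dict(zip(topics, softmax(sims, temperature)))


def make_bow_embedder(vocabulary: Sequence[str]) -> Embedder:
    """Toy stand-in for a sentence-embedding model: word counts over a fixed vocabulary."""
    index = {word: i for i, word in enumerate(vocabulary)}

    def embed(text: str) -> List[float]:
        vec = [0.0] * len(index)
        for token in text.lower().split():
            if token in index:
                vec[index[token]] += 1.0
        return vec

    return embed


# Example with made-up data (not taken from DailyDialog or DialogSum).
embed = make_bow_embedder(["football", "match", "bank", "loan"])
probs = topic_probabilities(
    "did you watch the football match", ["football match", "bank loan"], embed
)
print(probs)
```

To use a real sentence-embedding model, the toy embedder could be replaced with one from the sentence-transformers library, for example by loading a model once with `SentenceTransformer(...)` and passing `lambda t: model.encode(t).tolist()` as `embed`. Which model the notebooks use is not stated here, so that choice is left open.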

Data

The DialogSum {chen-etal-2021-dialogsum} data are available at:

https://drive.google.com/drive/folders/1VnW2__6D2RtI0TMP7Ggsyp20TKeevDvq?usp=sharing

The DailyDialog {Lhoest_Datasets_A_Community_2021} data are available at:

https://huggingface.co/datasets/peandrew/dialy_dialogue_with_recoginized_concept_raw

Code

The DailyDialog {Lhoest_Datasets_A_Community_2021} application is available in the following Google Colab notebook:

https://colab.research.google.com/drive/1QJY60RVnX5etwU0wLImPXU6ra5NpEDVk?usp=sharing

The DialogSum {chen-etal-2021-dialogsum} application is available in the following Google Colab notebook:

https://colab.research.google.com/drive/15D03KLZTzLk0M5vSlEeUWiNhYRJvU33f?usp=sharing

The comparison with current state-of-the-art embedding methods is available in the following Google Colab notebook:

https://colab.research.google.com/drive/1D6TUZLvSrbIJXcb3Z7l4KDYbWWZPrFJR?usp=sharing
