
Project : Beer Sentiment Classifier and Keyphrase Extraction

Demo video

ํ”„๋กœ์ ํŠธ ์†Œ๊ฐœ

  • ๋ณธ ํ”„๋กœ์ ํŠธ๋Š” ๋ฆฌ๋ทฐ ๋ฐ์ดํ„ฐ ๊ฐ์ •๋ถ„์„๊ณผ ํ•ต์‹ฌ ๋ฌธ๊ตฌ๋ฅผ ์ถ”์ถœํ•˜๋Š” ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค.
  • ๋ฐ์ดํ„ฐ ์ˆ˜์ง‘, ๋ผ๋ฒจ๋ง, ๋ชจ๋ธ๋ง, ๋ฐ์ดํ„ฐ ์ฆ๊ฐ•, ๋ฌธ๊ตฌ ์ถ”์ถœ๋ฐฉ๋ฒ•์„ ์ „๋ถ€ ๋‹ค๋ฃน๋‹ˆ๋‹ค.

ํ”„๋กœ์ ํŠธ ๋ชฉํ‘œ

  • ๋ชจ๋ธ์˜ ๊ฐ์ • ๋ถ„์„ ์„ฑ๋Šฅ์„ Precision ๊ธฐ์ค€ 0.9 ์ด์ƒ ๋‹ฌ์„ฑ
  • ์‚ฌ์šฉ์ž๊ฐ€ ์„ ํƒํ•œ ๋งฅ์ฃผ์— ๋Œ€ํ•œ ํ•ต์‹ฌ ๋ฌธ๊ตฌ์„ ์ถ”์ถœํ•˜๋Š” ๋ฐฉ๋ฒ• ์ œ๊ณต
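Precision, the target metric above, can be computed with scikit-learn; a minimal sketch using made-up labels (1 = positive, 0 = negative — not the project's real data):

```python
from sklearn.metrics import precision_score

# Hypothetical gold labels and model predictions (1 = positive, 0 = negative)
y_true = [1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1]

# Precision = true positives / (true positives + false positives)
precision = precision_score(y_true, y_pred)
print(f"Precision: {precision:.2f}")  # 2 TP, 1 FP -> 0.67
```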

๊ฐ์ • ๋ถ„๋ฅ˜ ๋ฐ ํ•ต์‹ฌ ๋ฌธ๊ตฌ ์ถ”์ถœ Demo (์˜์–ด๋งŒ ์ง€์›)

(Demo screenshot)

Setup

!git clone https://github.com/sgr1118/Bert_beer_sentiment_anlysis.git
%cd Bert_beer_sentiment_anlysis
pip install -r requirements.txt

Usage Example

Using the pre-trained model [Colab]

import torch
from transformers import BertTokenizerFast, BertForSequenceClassification
from torch.nn.functional import softmax
import matplotlib.pyplot as plt

# ๋ชจ๋ธ ๋กœ๋“œ
model = BertForSequenceClassification.from_pretrained('GiRak/beer-sentiment-bert') # HuggingFace ์‚ฌ์ „ ํ•™์Šต ๋ชจ๋ธ ์—…๋กœ๋“œ
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)

# ํ† ํฌ๋‚˜์ด์ € ์ดˆ๊ธฐํ™”
tokenizer = BertTokenizerFast.from_pretrained('GiRak/beer-sentiment-bert') # HuggingFace ์‚ฌ์ „ ํ•™์Šต ๋ชจ๋ธ ์—…๋กœ๋“œ

def analyze_sentiment(sentence):
    # Tokenize the sentence and move the tensors to the model's device
    inputs = tokenizer(sentence, return_tensors='pt')
    inputs = inputs.to(device)

    # Run sentiment classification without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    probabilities = softmax(logits, dim=1)

    # Map each class probability to its sentiment label
    sentiment_labels = ['Negative', 'Positive']
    sentiment_probabilities = {label: probability.item() for label, probability in zip(sentiment_labels, probabilities[0])}

    return sentiment_probabilities

sentences = ['I took a sip and immediately discarded it. How could a beer have such a strong cinnamon flavor?']

# Lists to store probabilities
positive_probs = []
negative_probs = []

for sentence in sentences:
    sentiment_probabilities = analyze_sentiment(sentence)
    positive_prob = sentiment_probabilities['Positive'] * 100
    negative_prob = sentiment_probabilities['Negative'] * 100

    positive_probs.append(positive_prob)
    negative_probs.append(negative_prob)

    print("Sentence:", sentence)
    print("Positive Probability:", int(positive_prob), "%")
    print("Negative Probability:", int(negative_prob), "%")

# Plotting
x = ['Positive', 'Negative']

plt.bar(x, [positive_probs[0], negative_probs[0]], color=['green', 'red'])
plt.xlabel('Sentiment')
plt.ylabel('Probability (%)')
plt.title('Sentiment Analysis Result')
plt.tight_layout()
plt.show()
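The softmax call above is what turns the model's raw logits into the percentages that get printed and plotted; a minimal NumPy illustration of that step (the logit values here are made up):

```python
import numpy as np

def softmax_np(logits):
    # Subtract the max for numerical stability before exponentiating
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

# Hypothetical logits for [Negative, Positive]
logits = np.array([2.0, 0.0])
probs = softmax_np(logits)
print(probs)  # probabilities sum to 1; the larger logit gets the larger share
```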

Keyword and Keyphrase Extraction

# Imports needed before this snippet runs
from keybert import KeyBERT
from keyphrase_vectorizers import KeyphraseCountVectorizer
from tqdm import tqdm

# Load KeyBERT on top of a sentence-transformers backbone
kw_model = KeyBERT('all-mpnet-base-v2')

def apply_keybert(sentence):
    # Keep up to 3 keyphrases, ordered by descending relevance
    keywords = kw_model.extract_keywords(sentence, vectorizer=KeyphraseCountVectorizer(), stop_words='english', top_n=3)
    return ', '.join([keyword for keyword, score in keywords])

# Create a 'keywords' column on the review DataFrame (df) with a progress bar
tqdm.pandas()
df['keywords'] = df['Review'].progress_apply(apply_keybert)

# Counting keywords and keyphrases

from collections import Counter

# Combine the 'keywords' column across all rows and count each phrase
def count_all_keywords(dataframe):
    all_keywords = dataframe['keywords'].str.split(', ').sum()
    keyword_counts = Counter(all_keywords)
    sorted_keyword_counts = keyword_counts.most_common()  # sort by frequency, descending
    return sorted_keyword_counts

# Count the keywords across every row of the iStout review DataFrame
sorted_all_keyword_counts = count_all_keywords(beer_Wired_iStout_pre)
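A self-contained illustration of the counting step above, using a toy DataFrame in place of the real review data:

```python
from collections import Counter
import pandas as pd

# Toy stand-in for the real review DataFrame's 'keywords' column
df = pd.DataFrame({'keywords': [
    'roasted malt, smooth finish',
    'roasted malt, coffee aroma',
]})

# Split each row into phrases, concatenate all rows, and count
all_keywords = df['keywords'].str.split(', ').sum()
counts = Counter(all_keywords).most_common()
print(counts)  # [('roasted malt', 2), ('smooth finish', 1), ('coffee aroma', 1)]
```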

Project Deliverables

| No | Content | GitHub |
|----|---------|--------|
| 1 | Data collection | 📂 |
| 2 | Data labeling | 📂 |
| 3 | Fine-tuning | 📂 |

ํ”„๋กœ์ ํŠธ ๊ฐœ์„  ์š”๊ตฌ ์‚ฌํ•ญ

1. ์ค‘๋ฆฝ ๋ผ๋ฒจ๋ง ์ถ”๊ฐ€

  • ์ข€๋” ์„ธ๋ถ„ํ™”๋œ ๊ฐ์ • ๋ถ„๋ฅ˜๋ฅผ ์ˆ˜ํ–‰ํ•˜๊ธฐ์œ„ํ•ด ์ค‘๋ฆฝ ๋ผ๋ฒจ๋ง ๊ธฐ์ค€์„ ํ™•๋ฆฝํ•˜๊ณ  ์ ์šฉํ•  ์˜ˆ์ •

2. ํ•ต์‹ฌ ๋ฌธ๊ตฌ ์ถ”์ถœ ์†๋„ ์ฆ๊ฐ€

  • KeyBERT๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ•ต์‹ฌ ๋ฌธ๊ตฌ ์ถ”์ถœ ์‹œ๊ฐ„์ด ๋ฐ์ดํ„ฐ๊ฐ€ ๋งŽ์•„์งˆ์ˆ˜๋ก ๊ธธ์–ด์ง„๋‹ค. ์‹ค์‹œ๊ฐ„ ์‘๋‹ต์œผ๋กœ ํ‚ค์›Œ๋“œ ์ถ”์ถœ ๊ฒฐ๊ณผ๋ฅผ ๋ณด์—ฌ์ฃผ๊ธฐ ํž˜๋“ค๋‹ค๋Š” ๋‹จ์ ์ด์žˆ๋‹ค.
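One low-effort mitigation (a sketch only, not part of the project) is to cache extraction results so duplicate reviews are processed once; here the expensive KeyBERT call is replaced by a hypothetical stub so the example is self-contained:

```python
from functools import lru_cache

calls = 0  # track how often the expensive extractor actually runs

@lru_cache(maxsize=None)
def extract_keywords_cached(sentence):
    # Stand-in for kw_model.extract_keywords(...); the real call is the slow part
    global calls
    calls += 1
    return ', '.join(sorted(set(sentence.lower().split()))[:3])

reviews = ['Great hoppy beer', 'Too bitter for me', 'Great hoppy beer']
keywords = [extract_keywords_cached(r) for r in reviews]
print(calls)  # only the 2 unique reviews reach the extractor
```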

ํ”„๋กœ์ ํŠธ ํ›„์› : (์ฃผ)๋ชจ๋‘์˜์—ฐ๊ตฌ์†Œ, K-๋””์ง€ํ„ธ ํ”Œ๋žซํผ

๋ณธ ํ”„๋กœ์ ํŠธ๋Š” ๋ชจ๋‘์˜์—ฐ๊ตฌ์†Œ์™€ K-๋””์ง€ํ„ธ ํ”Œ๋žซํผ์œผ๋กœ๋ถ€ํ„ฐ ์ง€์›๋ฐ›์•˜์Šต๋‹ˆ๋‹ค.
