
Commit bc3334e

Added Research Profile Summarizer
1 parent 797b396 commit bc3334e


3 files changed: 206 additions, 0 deletions

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
# **Research Profile Summarizer**

### 🎯 **Goal**

The primary goal of **Research Profile Summarizer** is to give researchers a single tool to gather, summarize, and analyze academic profiles. The app retrieves an author's affiliation, research interests, citation metrics, and top publications, which makes this information easier to access and streamlines the research process.

### 🧵 **Dataset**

**Research Profile Summarizer** does not rely on a pre-existing dataset. Instead, it retrieves live Google Scholar data through the **scholarly** library, giving users up-to-date information on authors and their works.
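
The live lookup described above can be sketched as follows. This is a minimal illustration, assuming `scholarly.search_author` and `scholarly.fill` behave as they are used in `app.py` in this commit; `pick_fields` and `fetch_profile` are hypothetical helper names that do not appear in the project.

```python
from typing import Any, Dict, Optional


def pick_fields(author: Dict[str, Any]) -> Dict[str, Any]:
    """Keep a few profile fields, falling back to 'N/A' or an empty list."""
    return {
        "name": author.get("name", "N/A"),
        "affiliation": author.get("affiliation", "N/A"),
        "interests": author.get("interests", []),
        "citedby": author.get("citedby", "N/A"),
        "hindex": author.get("hindex", "N/A"),
    }


def fetch_profile(author_name: str) -> Optional[Dict[str, Any]]:
    """Look up the first matching author; return None when nothing matches."""
    from scholarly import scholarly  # imported lazily: needs network access

    first_hit = next(scholarly.search_author(author_name), None)
    if first_hit is None:
        return None
    return pick_fields(scholarly.fill(first_hit))
```

Keeping the field selection separate from the network call means the fallback logic can be checked offline with a plain dictionary.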

### 🧾 **Description**

**Research Profile Summarizer** lets users enter an author's name, retrieve the author's academic profile, and generate a summary using automated text processing and generative AI. The application returns concise, informative output, making academic research more efficient and accessible.

### 🧮 **What I have done**

- Integrated a **data retrieval system** using the **scholarly** library to gather information on authors.
- Used **Google Generative AI** to generate concise summaries of author profiles.
- Built the user-facing web interface with **Streamlit**.
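
The summarization step in the list above can be sketched as a prompt builder plus one model call. This is an illustration under assumptions: `build_prompt` and `summarize` are hypothetical names, the prompt wording is simplified, and the model name `gemini-pro` is taken from `app.py` in this commit.

```python
from typing import Dict


def build_prompt(profile: Dict[str, object], max_words: int = 200) -> str:
    """Turn profile fields into a single summarization prompt."""
    facts = "\n".join(f"{key}: {value}" for key, value in profile.items())
    return (
        f"Write a concise {max_words}-word summary of this researcher "
        f"using only the information below:\n{facts}"
    )


def summarize(profile: Dict[str, object], api_key: str) -> str:
    """Send the prompt to Gemini and return the generated text."""
    import google.generativeai as genai  # imported lazily: needs an API key

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-pro")  # model name used in app.py
    return model.generate_content(build_prompt(profile)).text
```

Asking the model to use only the supplied fields is a design choice in this sketch; it is meant to discourage details that were never retrieved.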

### 🚀 **Models Implemented**

- **Google Generative AI** (`gemini-pro`): Chosen for its natural language understanding and its ability to generate meaningful summaries from the retrieved data.

### 📚 **Libraries Needed**

- `streamlit`
- `pandas`
- `scholarly`
- `google-generativeai`
- `python-dotenv`

### 📊 **Exploratory Data Analysis Results**

This project does not involve traditional exploratory data analysis, as it focuses on real-time data retrieval and summarization. If relevant visualizations or processing statistics are generated (e.g., citation counts, summary lengths), they can be displayed here.

### 📈 **Performance of the Models based on the Accuracy Scores**

No quantitative accuracy scores are reported here. The system can be evaluated qualitatively on:
- **Response accuracy**: How well the system retrieves and summarizes relevant information from author profiles.
- **Summary quality**: The clarity and conciseness of the generated summaries.
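
One simple, partial proxy for conciseness is whether a summary stays close to the requested length. The sketch below assumes the 200-word target from the prompt in `app.py`; the 25% tolerance and the function names are hypothetical choices, not part of the project.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def within_length(summary: str, target: int = 200, tolerance: float = 0.25) -> bool:
    """Return True if the word count is within +/- tolerance of the target."""
    low = target * (1 - tolerance)
    high = target * (1 + tolerance)
    return low <= word_count(summary) <= high
```

A length check says nothing about factual accuracy, so it would complement a manual review of the summaries rather than replace it.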

### 💻 How to run

To get started with **Research Profile Summarizer**, follow these steps:

1. Navigate to the project directory:

```bash
cd Research-Profile-Summarizer
```

2. (Optional) Create and activate a virtual environment:

```bash
conda create -n venv python=3.10
conda activate venv
```

3. Install dependencies:

```bash
pip install -r requirements.txt
```

4. Configure environment variables:

```
Rename `.env-sample` to `.env`.
Set `GOOGLE_API_KEY` in `.env` to your Google API key.
```

See this page to get [your own key](https://ai.google.dev/tutorials/setup).
<br/>

5. Run the application:

```bash
streamlit run app.py
```

PS: Explore other functionalities within the app as well.
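
Step 4 above can be sketched in code as follows. This is a minimal example, assuming the key is stored under `GOOGLE_API_KEY` (the name `app.py` reads) and that `python-dotenv` is installed; `read_api_key` and `load_key_from_dotenv` are hypothetical helper names.

```python
import os
from typing import Mapping


def read_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Google API key, failing early with a clear message."""
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file.")
    return key


def load_key_from_dotenv() -> str:
    """Load variables from .env into the environment, then read the key."""
    from dotenv import load_dotenv  # provided by the python-dotenv package

    load_dotenv()
    return read_api_key()
```

Failing early on a missing key gives a clearer error than a rejected API call later in the run.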

### 📢 **Conclusion**

**Research Profile Summarizer** combines live data retrieval with AI-powered summarization to help researchers navigate academic profiles. Summaries are generated by Google Generative AI from the retrieved profile data, giving users an accessible research tool.

### ✒️ **Signature**

**[J B Mugundh]**
GitHub: [GitHub](https://github.com/J-B-Mugundh)
LinkedIn: [LinkedIn](https://www.linkedin.com/in/mugundhjb/)
Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
import streamlit as st
from scholarly import scholarly
import pandas as pd
import google.generativeai as genai
import os
from dotenv import load_dotenv

# Load GOOGLE_API_KEY from the .env file described in the README
load_dotenv()

# Configure Google Generative AI
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# Streamlit App
st.title("🤖 Research Profile Summarizer")

# Input for author name
author_name = st.text_input("Enter the author's name:", "Steven A Cholewiak")

if st.button("Generate Summary"):
    # Retrieve the author's data; stop early if the search returns no match
    search_query = scholarly.search_author(author_name)
    first_author_result = next(search_query, None)
    if first_author_result is None:
        st.error(f"No author found for '{author_name}'.")
        st.stop()
    author = scholarly.fill(first_author_result)

    # Initialize a string to store all textual data
    summary_text = ""

    # Display author's name and affiliation
    author_info = [
        f"**Name:** {author['name']}",
        f"**Affiliation:** {author.get('affiliation', 'N/A')}"
    ]

    st.subheader("Author Information")
    for info in author_info:
        st.write(info)  # Display each piece of information as a separate line
        summary_text += info + "\n"

    # Display research interests as a list
    st.subheader("Research Interests")
    interests = author.get('interests', [])
    if interests:
        interests_list = "- " + "\n- ".join(interests)  # Display interests as a bullet list
        st.write(interests_list)
        summary_text += f"**Research Interests:**\n{interests_list}\n"
    else:
        st.write('N/A')
        summary_text += "**Research Interests:** N/A\n"

    # Citations overview
    st.subheader("Citations Overview")
    citations = {
        "Total Citations": author.get('citedby', 'N/A'),
        "Citations (Last 5 Years)": author.get('citedby5y', 'N/A')
    }
    for citation_name, citation_value in citations.items():
        st.write(f"**{citation_name}:** {citation_value}")
        summary_text += f"**{citation_name}:** {citation_value}\n"

    # Citations per year
    citations_per_year = author.get('cites_per_year', {})
    if citations_per_year:
        citations_df = pd.DataFrame(list(citations_per_year.items()), columns=['Year', 'Citations'])
        st.subheader("Citations Per Year")
        st.line_chart(citations_df.set_index('Year'))
        summary_text += "Citations data is available.\n"
    else:
        st.write("No citation data available for the past years.")
        summary_text += "No citation data available for the past years.\n"

    # Indexes
    st.subheader("Indexes")
    indexes = {
        "H-Index": author.get('hindex', 'N/A'),
        "H-Index (Last 5 Years)": author.get('hindex5y', 'N/A'),
        "i10-Index": author.get('i10index', 'N/A'),
        "i10-Index (Last 5 Years)": author.get('i10index5y', 'N/A')
    }

    # Displaying indexes in a more structured format
    for index_name, index_value in indexes.items():
        st.write(f"**{index_name}:** {index_value}")
        summary_text += f"**{index_name}:** {index_value}\n"

    # Display the five most-cited publications
    st.subheader("Top Publications")
    top_publications = sorted(author.get('publications', []), key=lambda x: x.get('num_citations', 0), reverse=True)[:5]
    top_publications_text = ""
    for pub in top_publications:
        pub_filled = scholarly.fill(pub)  # One extra request per publication
        publication_info = f"- **{pub_filled['bib']['title']}** (Citations: {pub_filled.get('num_citations', 0)})"
        st.write(publication_info)
        top_publications_text += publication_info + "\n"

    summary_text += f"**Top Publications:**\n{top_publications_text}\n"

    # Generate summary using Google Generative AI
    model = genai.GenerativeModel("gemini-pro")
    chat = model.start_chat(history=[])

    # Function to generate summary using Gemini Pro model
    def generate_summary(data):
        # Ask only for details that are present in the collected data
        summary_prompt = f"Write a concise 200-word summary based on the following information:\n{data}\nInclude key details like research interests, citations, H-index, and notable publications."
        response = chat.send_message(summary_prompt)
        return response.text

    # Generate and display the summary
    generated_summary = generate_summary(summary_text)
    st.subheader("Profile Summary")
    st.write(generated_summary)
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
streamlit
pandas
scholarly
google-generativeai
python-dotenv
