Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
# **Research Profile Summarizer**

### 🎯 **Goal**

The primary goal of **Research Profile Summarizer** is to provide a comprehensive tool for researchers to gather, summarize, and analyze academic profiles. The app retrieves key information about authors, their research interests, citations, and top publications, enhancing accessibility and streamlining the research process.

### 🧵 **Dataset**

**Research Profile Summarizer** does not rely on a pre-existing dataset. Instead, it utilizes live data retrieved from academic databases through the **scholarly** library, allowing users to access real-time information on various authors and their works.

### 🧾 **Description**

**Research Profile Summarizer** enables users to input author names, retrieve their academic profiles, and generate summaries using automated text processing and generative AI. The application is designed for seamless interaction, providing users with concise and informative outputs, making academic research more efficient and accessible.

### 🧮 **What I had done!**

- Integrated a **data retrieval system** using the **scholarly** library to gather information on authors.
- Utilized **Google Generative AI** for generating concise summaries of author profiles.
- Deployed the application using **Streamlit** to create a user-friendly web interface for interaction.

### 🚀 **Models Implemented**

- **Google Generative AI**: Chosen for its advanced natural language understanding and high accuracy in generating meaningful summaries based on the retrieved data.

### 📚 **Libraries Needed**

- `streamlit`
- `pandas`
- `scholarly`
- `google.generativeai`
- `dotenv`

### 📊 **Exploratory Data Analysis Results**

This project does not involve traditional exploratory data analysis, as it focuses on real-time data retrieval and summarization. However, if relevant visualizations or processing statistics are generated (e.g., citation counts, summary lengths), they can be displayed here.

### 📈 **Performance of the Models based on the Accuracy Scores**

The performance of the system can be evaluated based on:
- **Response accuracy**: How well the system retrieves and summarizes relevant information from author profiles.
- **Summary quality**: The clarity and conciseness of the generated summaries.

### 💻 How to run

To get started with **Research Profile Summarizer**, follow these steps:

1. Navigate to the project directory:

```bash
cd Research-Profile-Summarizer
```

2. (Optional) Activate a virtual environment:

```bash
conda create -n venv python=3.10+
conda activate venv
```

3. Install dependencies:

```bash
pip install -r requirements.txt
```

4. Configure environment variables:

```
Rename `.env-sample` to `.env` file.
Replace with your Google API Key.
```

Kindly refer to this site for getting [your own key](https://ai.google.dev/tutorials/setup).
<br/>

5. Run the application:

```bash
streamlit run app.py
```

PS: Explore other functionalities within the app as well.

### 📢 **Conclusion**

**Research Profile Summarizer** successfully integrates data retrieval and AI-powered summarization to assist researchers in navigating academic profiles. It ensures high interaction accuracy by leveraging state-of-the-art models like Google Generative AI, providing a reliable and accessible research tool for its users.

### ✒️ **Signature**

**[J B Mugundh]**
GitHub: [Github](https://github.com/J-B-Mugundh)
LinkedIn: [Linkedin](https://www.linkedin.com/in/mugundhjb/)
109 changes: 109 additions & 0 deletions Algorithms and Deep Learning Models/Research-Profile-Summarizer/app.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,109 @@
import streamlit as st
from scholarly import scholarly
import pandas as pd
import google.generativeai as genai
import os

# Configure Google Generative AI
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# Streamlit App
st.title("🤖 Research Profile Summarizer")

# Input for author name
author_name = st.text_input("Enter the author's name:", "Steven A Cholewiak")

if st.button("Generate Summary"):
# Retrieve the author's data
search_query = scholarly.search_author(author_name)
first_author_result = next(search_query)
author = scholarly.fill(first_author_result)

# Initialize a string to store all textual data
summary_text = ""

# Display author's name and affiliation
author_info = [
f"**Name:** {author['name']}",
f"**Affiliation:** {author.get('affiliation', 'N/A')}"
]

st.subheader("Author Information")
for info in author_info:
st.write(info) # Display each piece of information as a separate line
summary_text += info + "\n"


# Display research interests as a list
st.subheader("Research Interests")
interests = author.get('interests', [])
if interests:
interests_list = "- " + "\n- ".join(interests) # Display interests as a bullet list
st.write(interests_list)
summary_text += f"**Research Interests:**\n{interests_list}\n"
else:
st.write('N/A')
summary_text += "**Research Interests:** N/A\n"

# Citations overview
st.subheader("Citations Overview")
citations = {
"Total Citations": author.get('citedby', 'N/A'),
"Citations (Last 5 Years)": author.get('citedby5y', 'N/A')
}
for citation_name, citation_value in citations.items():
st.write(f"**{citation_name}:** {citation_value}")
summary_text += f"**{citation_name}:** {citation_value}\n"

# Citations per year
citations_per_year = author.get('cites_per_year', {})
if citations_per_year:
citations_df = pd.DataFrame(list(citations_per_year.items()), columns=['Year', 'Citations'])
st.subheader("Citations Per Year")
st.line_chart(citations_df.set_index('Year'))
summary_text += "Citations data is available.\n"
else:
st.write("No citation data available for the past years.")
summary_text += "No citation data available for the past years.\n"

# Indexes
st.subheader("Indexes")
indexes = {
"H-Index": author.get('hindex', 'N/A'),
"H-Index (Last 5 Years)": author.get('hindex5y', 'N/A'),
"i10-Index": author.get('i10index', 'N/A'),
"i10-Index (Last 5 Years)": author.get('i10index5y', 'N/A')
}

# Displaying indexes in a more structured format
for index_name, index_value in indexes.items():
st.write(f"**{index_name}:** {index_value}")
summary_text += f"**{index_name}:** {index_value}\n"

# Display top publications
st.subheader("Top Publications")
top_publications = sorted(author['publications'], key=lambda x: x.get('num_citations', 0), reverse=True)[:5]
top_publications_text = ""
for pub in top_publications:
pub_filled = scholarly.fill(pub)
publication_info = f"- **{pub_filled['bib']['title']}** (Citations: {pub_filled.get('num_citations', 0)})"
st.write(publication_info)
top_publications_text += publication_info + "\n"

summary_text += f"**Top Publications:**\n{top_publications_text}\n"

# Generate summary using Google Generative AI
model = genai.GenerativeModel("gemini-pro")
chat = model.start_chat(history=[])

# Function to generate summary using Gemini Pro model
def generate_summary(data):
summary_prompt = f"Write a concise 200-word summary based on the following information:\n{data}\nInclude key details like research interests, citations, H-index, co-authors, and notable publications."
response = chat.send_message(summary_prompt)
summary = "".join([chunk.text for chunk in response])
return summary

# Generate and display the summary
generated_summary = generate_summary(summary_text)
st.subheader("Profile Summary")
st.write(generated_summary)
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
streamlit
pandas
scholarly
google-generativeai
python-dotenv
Loading