Commit b3d99cd

feat: add more translation languages
1 parent a34f40b commit b3d99cd

4 files changed, 109 additions and 20 deletions
README.md

Lines changed: 36 additions & 7 deletions
@@ -1,6 +1,6 @@
 # Easy Webpage Summarizer
 
-A Python script designed to summarize webpages from specified URLs using the LangChain framework and the ChatOllama model. It leverages advanced language models to generate detailed summaries, making it an invaluable tool for quickly understanding the content of web-based documents.
+A Python script designed to summarize webpages from specified URLs using the LangChain framework and the ChatOllama model. It leverages advanced language models to generate detailed summaries and translate them to multiple languages, making it an invaluable tool for quickly understanding the content of web-based documents.
 
 ## Requirements
 
@@ -17,30 +17,59 @@ pip install -r requirements.txt
 ## Features
 
 - Summarization of webpages and youtube videos directly from URLs.
-- Translates to Turkish language (other languages will be added soon!)
-- Integration with LangChain and ChatOllama for state-of-the-art summarization.
+- **Translation to multiple languages** with language selection
+- Integration with LangChain and ChatOllama for state-of-the-art summarization and translation.
 - Command-line interface for easy use and integration into workflows.
+- Web interface with language selection dropdown.
 
 ## Usage
 
+### Command Line Interface
+
 To use the webpage summarizer, run the script from the command line, providing the URL of the document you wish to summarize:
 
 ```bash
-python summarizer.py -u "http://example.com/document"
+# Basic summarization only
+python app/summarizer.py -u "http://example.com/document"
+
+# Summarize and translate to Spanish (default)
+python app/summarizer.py -u "http://example.com/document" -t "Spanish"
+
+# Summarize and translate to French
+python app/summarizer.py -u "http://example.com/document" -t "French"
+
+# Summarize and translate to German
+python app/summarizer.py -u "http://example.com/document" -t "German"
 ```
 
 Replace `http://example.com/document` with the actual URL of the document you want to summarize.
 
+#### Available Languages
+
+The following languages are supported for translation:
+- Spanish (default)
+- French
+- German
+- Italian
+- Portuguese
+- Turkish
+- English
+
 ### Web UI
 
-To use the webpage summarizer in you web browser, you can also try gradio app.
+To use the webpage summarizer in your web browser, you can also try the gradio app:
 
 ```bash
 python app/webui.py
 ```
 
 ![gradio](assets/gradio.png)
 
+The web interface includes:
+- URL input for summarization
+- Language selection dropdown (appears after generating summary)
+- Translate button to convert summary to selected language
+
 ## Docker
 
 ```bash
@@ -51,14 +80,14 @@ docker run -p 7860:7860 web_summarizer
 docker run -d --network='host' -p 7860:7860 web_summarizer
 ```
 
-
 ## Development
 
 To contribute to the development of this script, clone the repository, make your changes, and submit a pull request. We welcome contributions that improve the script's functionality or extend its capabilities.
 
 - [x] Summarize youtube videos
 - [x] Dockerize project
-- [ ] Translate to different languages
+- [x] Translate to different languages
+- [x] Language selection for translations
 - [ ] Streaming text output on gradio
 - [ ] Serve on web
 

app/summarizer.py

Lines changed: 48 additions & 2 deletions
@@ -9,11 +9,15 @@
 def setup_argparse():
     """Setup argparse to parse command line arguments."""
     parser = argparse.ArgumentParser(
-        description="Summarize a document from a given URL."
+        description="Summarize a document from a given URL and optionally translate it."
     )
     parser.add_argument(
         "-u", "--url", required=True, help="URL of the document to summarize"
     )
+    parser.add_argument(
+        "-t", "--translate",
+        help="Target language for translation (if specified, translation is enabled)"
+    )
     return parser.parse_args()
 
 
@@ -50,12 +54,54 @@ def setup_summarization_chain():
     return llm_chain
 
 
+def setup_translation_chain(target_language="Spanish"):
+    """Setup the translation chain with a prompt template and ChatOllama."""
+    prompt_template = PromptTemplate(
+        template="""Translate the following text into {target_language}. Provide only the translation without any quotes, headers, or additional text. The output should be clean and direct:
+
+        {text}
+
+        TRANSLATION:""",
+        input_variables=["text", "target_language"],
+    )
+
+    llm = ChatOllama(model="llama3:instruct", base_url="http://127.0.0.1:11434")
+    llm_chain = LLMChain(llm=llm, prompt=prompt_template)
+    return llm_chain
+
+
+def translate_text(text, target_language="Spanish"):
+    """Translate text to the specified target language."""
+    llm_chain = setup_translation_chain(target_language)
+    result = llm_chain.run({"text": text, "target_language": target_language})
+    return result
+
+
 def main():
     args = setup_argparse()
     docs = load_document(args.url)
 
+    # Generate summary
     llm_chain = setup_summarization_chain()
-    result = llm_chain.run(docs)
+    summary = llm_chain.run(docs)
+
+    print("=" * 60)
+    print("SUMMARY")
+    print("=" * 60)
+    print(summary)
+    print()
+
+    # Translate if language is specified
+    if args.translate:
+        print("=" * 60)
+        print(f"TRANSLATION TO {args.translate.upper()}")
+        print("=" * 60)
+        try:
+            translation = translate_text(summary, args.translate)
+            print(translation)
+        except Exception as e:
+            print(f"Translation failed: {e}")
+            print("Make sure Ollama is running with the llama3:instruct model.")
 
 
 if __name__ == "__main__":
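
Reading the `app/summarizer.py` hunks above, the new `-t` argument is declared without a `default=`, so `args.translate` should be `None` when the flag is omitted and `main()` would skip the translation branch. The "Spanish" default appears only on the `translate_text` keyword argument, not on the command line. The following is a minimal sketch of that behaviour using only `argparse`; `build_parser` and `plan_steps` are hypothetical names introduced for illustration and are not part of the commit.

```python
import argparse


def build_parser():
    """Mirror the two flags from the diff: required -u and optional -t."""
    parser = argparse.ArgumentParser(
        description="Summarize a document from a given URL and optionally translate it."
    )
    parser.add_argument(
        "-u", "--url", required=True, help="URL of the document to summarize"
    )
    # No default is given, so args.translate is None when -t is omitted.
    parser.add_argument(
        "-t", "--translate", help="Target language for translation"
    )
    return parser


def plan_steps(argv):
    """Hypothetical helper: list the steps main() would run for these arguments."""
    args = build_parser().parse_args(argv)
    steps = ["summarize"]
    if args.translate:  # None (flag omitted) and "" both skip translation
        steps.append(f"translate to {args.translate}")
    return steps


if __name__ == "__main__":
    print(plan_steps(["-u", "http://example.com/document"]))
    print(plan_steps(["-u", "http://example.com/document", "-t", "French"]))
```

Under these assumptions, running the script with only `-u` summarizes without translating, and a language name has to be passed explicitly to get a translation.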

app/translator.py

Lines changed: 10 additions & 3 deletions
@@ -3,17 +3,24 @@
 from langchain_community.chat_models import ChatOllama
 
 
-def setup_translator_chain():
+def setup_translator_chain(target_language="Turkish"):
     """Setup the translation chain with a prompt template and ChatOllama."""
     prompt_template = PromptTemplate(
-        template="""As a professional translator, provide a detailed and comprehensive translation of the provided text into turkish, ensuring that the translation is accurate, coherent, and faithful to the original text.
+        template="""As a professional translator, provide a detailed and comprehensive translation of the provided text into {target_language}, ensuring that the translation is accurate, coherent, and faithful to the original text.
 
         "{text}"
 
         DETAILED TRANSLATION:""",
-        input_variables=["text"],
+        input_variables=["text", "target_language"],
     )
 
     llm = ChatOllama(model="llama3:instruct", base_url="http://127.0.0.1:11434")
     llm_chain = LLMChain(llm=llm, prompt=prompt_template)
     return llm_chain
+
+
+def translate_text(text, target_language="Turkish"):
+    """Translate text to the specified target language."""
+    llm_chain = setup_translator_chain(target_language)
+    result = llm_chain.run({"text": text, "target_language": target_language})
+    return result
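
The change above replaces the hard-coded "turkish" in the prompt with a `{target_language}` placeholder, which is why `translate_text` now passes both `text` and `target_language` to the chain. As a rough illustration of what the filled-in prompt might look like, the sketch below uses plain `str.format` as a stand-in for `PromptTemplate`. This assumes the template's two simple placeholders are substituted the same way, and `render_prompt` is a hypothetical name, not a function from the commit.

```python
# Template text copied from the app/translator.py hunk (indentation omitted).
TEMPLATE = """As a professional translator, provide a detailed and comprehensive translation of the provided text into {target_language}, ensuring that the translation is accurate, coherent, and faithful to the original text.

"{text}"

DETAILED TRANSLATION:"""


def render_prompt(text, target_language="Turkish"):
    """Hypothetical stand-in for PromptTemplate: fill both placeholders."""
    return TEMPLATE.format(text=text, target_language=target_language)


if __name__ == "__main__":
    # Same default as translate_text in app/translator.py.
    print(render_prompt("The page describes a webpage summarizer."))
    print()
    print(render_prompt("The page describes a webpage summarizer.", "German"))
```

Because the language is now an input variable rather than part of the template text, one template can serve every entry in the language dropdown.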

app/webui.py

Lines changed: 15 additions & 8 deletions
@@ -1,7 +1,7 @@
 import gradio as gr
 
 from summarizer import load_document, setup_summarization_chain
-from translator import setup_translator_chain
+from translator import translate_text
 from yt_summarizer import check_link, summarize_video
 
 
@@ -13,12 +13,11 @@ def summarize(url):
         llm_chain = setup_summarization_chain()
         result = llm_chain.run(docs)
 
-    return [result, gr.Button("🇹🇷 Translate ", visible=True)]
+    return [result, gr.Button("Translate ", visible=True), gr.Dropdown(visible=True)]
 
 
-def translate(text):
-    llm_chain = setup_translator_chain()
-    result = llm_chain.run(text)
+def translate(text, target_language):
+    result = translate_text(text, target_language)
     return result
 
 
@@ -35,7 +34,15 @@ def translate(text):
     btn_generate = gr.Button("Generate")
 
     summary = gr.Markdown(label="Summary")
-    btn_translate = gr.Button(visible=False)
+
+    with gr.Row():
+        btn_translate = gr.Button(visible=False)
+        language_dropdown = gr.Dropdown(
+            choices=["Spanish", "French", "German", "Italian", "Portuguese", "Turkish", "English"],
+            value="Turkish",
+            label="Target Language",
+            visible=False
+        )
 
     gr.Examples(
         [
@@ -54,7 +61,7 @@ def translate(text):
         Repo: github.com/mertcobanov/easy-web-summarizer
         ```"""
     )
-    btn_generate.click(summarize, inputs=[url], outputs=[summary, btn_translate])
-    btn_translate.click(translate, inputs=[summary], outputs=[summary])
+    btn_generate.click(summarize, inputs=[url], outputs=[summary, btn_translate, language_dropdown])
+    btn_translate.click(translate, inputs=[summary, language_dropdown], outputs=[summary])
 
 demo.launch(server_name="0.0.0.0")
