Commit d7bc07d

proxy added and enhance gnewsdecoder functionality
1 parent 836bac1 commit d7bc07d

11 files changed: +383 -95 lines changed

Lines changed: 28 additions & 26 deletions
@@ -1,39 +1,41 @@
-# This workflow will upload a Python Package using Twine when a release is created
-# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries
-
-# This workflow uses actions that are not certified by GitHub.
-# They are provided by a third-party and are governed by
-# separate terms of service, privacy policy, and support
-# documentation.
-
 name: Upload Python Package

 on:
+  push:
+    tags:
+      - "[0-9]+.[0-9]+.[0-9]+"
+      - "[0-9]+.[0-9]+.[0-9]+a[0-9]+"
+      - "[0-9]+.[0-9]+.[0-9]+b[0-9]+"
+      - "[0-9]+.[0-9]+.[0-9]+rc[0-9]+"
+    branches:
+      - main
   release:
     types: [published]
-
 permissions:
   contents: read

 jobs:
   deploy:
-
     runs-on: ubuntu-latest

     steps:
-    - uses: actions/checkout@v4
-    - name: Set up Python
-      uses: actions/setup-python@v3
-      with:
-        python-version: '3.x'
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install build
-    - name: Build package
-      run: python -m build
-    - name: Publish package
-      uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
-      with:
-        user: __token__
-        password: ${{ secrets.PYPI_API_TOKEN }}
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v3
+        with:
+          python-version: "3.12"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install build
+
+      - name: Build package
+        run: python -m build
+
+      - name: Publish package
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          user: __token__
+          password: ${{ secrets.PYPI_API_TOKEN }}
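
The four tag filters added under `on.push.tags` can be sanity-checked offline with a rough sketch like the one below. It assumes GitHub's filter-pattern rules, in which `[0-9]+` means one or more digits and `.` is a literal dot, and translates each filter by hand into a Python regular expression. The names `TAG_REGEXES` and `tag_matches_release_filter` are hypothetical and are not part of the workflow or the package.

```python
import re

# Hand-translated equivalents of the workflow's tag filters.
# Assumption: in GitHub filter patterns "." is literal and "+" means
# "one or more of the preceding character", and the whole tag must match.
TAG_REGEXES = [
    r"[0-9]+\.[0-9]+\.[0-9]+",         # final release, e.g. 0.1.7
    r"[0-9]+\.[0-9]+\.[0-9]+a[0-9]+",  # alpha pre-release, e.g. 0.1.7a1
    r"[0-9]+\.[0-9]+\.[0-9]+b[0-9]+",  # beta pre-release, e.g. 0.1.7b1
    r"[0-9]+\.[0-9]+\.[0-9]+rc[0-9]+", # release candidate, e.g. 0.1.7rc1
]


def tag_matches_release_filter(tag: str) -> bool:
    """Return True if the tag fully matches any of the translated filters."""
    return any(re.fullmatch(pattern, tag) for pattern in TAG_REGEXES)
```

Under these assumptions a tag such as `v0.1.7` would not match, because none of the filters allows a leading `v`.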

.gitignore

Lines changed: 2 additions & 4 deletions
@@ -1,18 +1,16 @@
-# Ignore Python bytecode
 __pycache__/
 *.pyc
 *.pyo

-# Ignore macOS system files
 .DS_Store
 .venv
 .ignore
 .backup_readme.md
 .backup_readme2.md

-# Ignore build artifacts
 dist/
 build/

-# Ignore egg info
 *.egg-info/
+
+test.py

README.md

Lines changed: 78 additions & 43 deletions
@@ -1,32 +1,29 @@
 [![PyPI version](https://badge.fury.io/py/googlenewsdecoder.svg)](https://badge.fury.io/py/googlenewsdecoder)
-[![Python Versions](https://img.shields.io/badge/python-3.9-blue)](https://pypi.org/project/facebook-pages-scraper/)
+[![Python Versions](https://img.shields.io/badge/python-3.9%20|%203.10%20|%203.11%20|%203.12%20|%203.13-blue)](https://pypi.org/project/googlenewsdecoder/)
 [![Downloads](https://static.pepy.tech/badge/googlenewsdecoder)](https://pepy.tech/project/googlenewsdecoder)
-[![Downloads](https://static.pepy.tech/badge/googlenewsdecoder/month)](https://pepy.tech/project/googlenewsdecoder)
 [![Downloads](https://static.pepy.tech/badge/googlenewsdecoder/week)](https://pepy.tech/project/googlenewsdecoder)

 # Google News Decoder

 Google News Decoder is a Python package that can decode Google News links or Google News URLs to their original URLs. It is a simple tool that saves you time and effort. If you find it useful, please support the package by hitting the star on GitHub. Your support helps keep the project going!

-[Pypi Package](https://pypi.org/project/googlenewsdecoder/)
-
 ## Update

-- Version 0.1.6:
-
-  - Improved: Enhanced error handling with a fallback mechanism for decoding parameters.
-  - Refined: Optimized get_decoding_params to try decoding via https://news.google.com/articles first, falling back to https://news.google.com/rss/articles if needed
-  - Updated: Reduced occurrences of HTTP 429 (Too Many Requests).
-  - Removed: Logging functionality for a cleaner codebase.
-  - Fixed: Resolved time delay issue between requests.
+- **Version 0.1.7**:
+  - **New Feature**: Added **proxy support** to handle rate limiting and bypass restrictions.
+  - **Improved**: Enhanced error handling with a fallback mechanism for decoding parameters.
+  - **Refined**: Optimized `get_decoding_params` to try decoding via `https://news.google.com/articles` first, falling back to `https://news.google.com/rss/articles` if needed.
+  - **Updated**: Reduced occurrences of HTTP 429 (Too Many Requests).
+  - **Removed**: Logging functionality for a cleaner codebase.
+  - **Fixed**: Resolved time delay issue between requests.

 ## Demo

 ![Google News Decoder](https://github.com/user-attachments/assets/3a3c3279-1c54-4e19-96cb-6f22f889aa2a)

 ## Installation

-- You can install this package using pip:
+You can install this package using pip:

 ```sh
 pip install googlenewsdecoder
@@ -38,31 +35,70 @@ pip install googlenewsdecoder
 pip install googlenewsdecoder --upgrade
 ```

+## Supported Proxy Formats
+
+- **HTTP/HTTPS Proxy**:
+
+  - **With authentication**: `http://user:pass@host:port` or `https://user:pass@host:port`
+  - **Without authentication**: `http://host:port` or `https://host:port`
+
+- **SOCKS5 Proxy**:
+
+  - **With authentication**: `socks5://user:pass@host:port`
+  - **Without authentication**: `socks5://host:port`
+
+- **IP and Port Only**:
+  - **HTTP**: `http://127.0.0.1:8080`
+  - **SOCKS5**: `socks5://127.0.0.1:1080`
+
 ## Usage

 Here is an example of how to use this package with different decoders:

-### Using new_decoderv1
+### Using gnewsdecoder

 ```python
-from googlenewsdecoder import new_decoderv1
+from googlenewsdecoder import gnewsdecoder

 def main():
-
-    interval_time = 5 # default interval is 1 sec, if not specified
+    interval_time = 1 # interval is optional, default is None

     source_url = "https://news.google.com/read/CBMi2AFBVV95cUxPd1ZCc1loODVVNHpnbFFTVHFkTG94eWh1NWhTeE9yT1RyNTRXMVV2S1VIUFM3ZlVkVjl6UHh3RkJ0bXdaTVRlcHBjMWFWTkhvZWVuM3pBMEtEdlllRDBveGdIUm9GUnJ4ajd1YWR5cWs3VFA5V2dsZnY1RDZhVDdORHRSSE9EalF2TndWdlh4bkJOWU5UMTdIV2RCc285Q2p3MFA4WnpodUNqN1RNREMwa3d5T2ZHS0JlX0MySGZLc01kWDNtUEkzemtkbWhTZXdQTmdfU1JJaXY?hl=en-US&gl=US&ceid=US%3Aen"

     try:
-        decoded_url = new_decoderv1(source_url, interval=interval_time)
+        decoded_url = gnewsdecoder(source_url, interval=interval_time)
+
         if decoded_url.get("status"):
             print("Decoded URL:", decoded_url["decoded_url"])
         else:
             print("Error:", decoded_url["message"])
     except Exception as e:
         print(f"Error occurred: {e}")

-    # Output: decoded_urls - [{'status': True, 'decoded_url': 'https://healthdatamanagement.com/articles/empowering-the-quintuple-aim-embracing-an-essential-architecture/'}]
+if __name__ == "__main__":
+    main()
+```
+
+### Using gnewsdecoder with proxy
+
+```python
+from googlenewsdecoder import gnewsdecoder
+
+def main():
+    interval_time = 1 # interval is optional, default is None
+    proxy = "http://user:pass@localhost:8080" # proxy is optional, default is None
+
+    source_url = "https://news.google.com/read/CBMi2AFBVV95cUxPd1ZCc1loODVVNHpnbFFTVHFkTG94eWh1NWhTeE9yT1RyNTRXMVV2S1VIUFM3ZlVkVjl6UHh3RkJ0bXdaTVRlcHBjMWFWTkhvZWVuM3pBMEtEdlllRDBveGdIUm9GUnJ4ajd1YWR5cWs3VFA5V2dsZnY1RDZhVDdORHRSSE9EalF2TndWdlh4bkJOWU5UMTdIV2RCc285Q2p3MFA4WnpodUNqN1RNREMwa3d5T2ZHS0JlX0MySGZLc01kWDNtUEkzemtkbWhTZXdQTmdfU1JJaXY?hl=en-US&gl=US&ceid=US%3Aen"
+
+    try:
+        decoded_url = gnewsdecoder(source_url, interval=interval_time, proxy=proxy)
+
+        if decoded_url.get("status"):
+            print("Decoded URL:", decoded_url["decoded_url"])
+        else:
+            print("Error:", decoded_url["message"])
+    except Exception as e:
+        print(f"Error occurred: {e}")

 if __name__ == "__main__":
     main()
@@ -71,54 +107,53 @@ if __name__ == "__main__":
 ### Using a for loop to decode multiple URLs

 ```python
-from googlenewsdecoder import new_decoderv1
+from googlenewsdecoder import gnewsdecoder

 def main():
+    interval_time = 1 # interval is optional, default is None

-    interval_time = 5 # default interval is None, if not specified
-
-    source_urls = ["https://news.google.com/read/CBMilgFBVV95cUxOM0JJaFRwV2dqRDk5dEFpWmF1cC1IVml5WmVtbHZBRXBjZHBfaUsyalRpa1I3a2lKM1ZnZUI4MHhPU2sydi1nX3JrYU0xWjhLaHNfU0N6cEhOYVE2TEptRnRoZGVTU3kzZGJNQzc2aDZqYjJOR0xleTdsemdRVnJGLTVYTEhzWGw4Z19lR3AwR0F1bXlyZ0HSAYwBQVVfeXFMTXlLRDRJUFN5WHg3ZTI0X1F4SjN6bmFIck1IaGxFVVZyOFQxdk1JT3JUbl91SEhsU0NpQzkzRFdHSEtjVGhJNzY4ZTl6eXhESUQ3XzdWVTBGOGgwSmlXaVRmU3BsQlhPVjV4VWxET3FQVzJNbm5CUDlUOHJUTExaME5YbjZCX1NqOU9Ta3U?hl=en-US&gl=US&ceid=US%3Aen","https://news.google.com/read/CBMiiAFBVV95cUxQOXZLdC1hSzFqQVVLWGJVZzlPaDYyNjdWTURScV9BbVp0SWhFNzZpSWZxSzdhc0tKbVlHMU13NmZVOFdidFFkajZPTm9SRnlZMWFRZ01CVHh0dXU0TjNVMUxZNk9Ibk5DV3hrYlRiZ20zYkIzSFhMQVVpcTFPc00xQjhhcGV1aXM00gF_QVVfeXFMTmtFQXMwMlY1el9WY0VRWEh5YkxXbHF0SjFLQVByNk1xS3hpdnBuUDVxOGZCQXl1QVFXaUVpbk5lUGgwRVVVT25tZlVUVWZqQzc4cm5MSVlfYmVlclFTOUFmTHF4eTlfemhTa2JKeG14bmNabENkSmZaeHB4WnZ5dw?hl=en-US&gl=US&ceid=US%3Aen"]
+    source_urls = [
+        "https://news.google.com/read/CBMilgFBVV95cUxOM0JJaFRwV2dqRDk5dEFpWmF1cC1IVml5WmVtbHZBRXBjZHBfaUsyalRpa1I3a2lKM1ZnZUI4MHhPU2sydi1nX3JrYU0xWjhLaHNfU0N6cEhOYVE2TEptRnRoZGVTU3kzZGJNQzc2aDZqYjJOR0xleTdsemdRVnJGLTVYTEhzWGw4Z19lR3AwR0F1bXlyZ0HSAYwBQVVfeXFMTXlLRDRJUFN5WHg3ZTI0X1F4SjN6bmFIck1IaGxFVVZyOFQxdk1JT3JUbl91SEhsU0NpQzkzRFdHSEtjVGhJNzY4ZTl6eXhESUQ3XzdWVTBGOGgwSmlXaVRmU3BsQlhPVjV4VWxET3FQVzJNbm5CUDlUOHJUTExaME5YbjZCX1NqOU9Ta3U?hl=en-US&gl=US&ceid=US%3Aen",
+        "https://news.google.com/read/CBMiiAFBVV95cUxQOXZLdC1hSzFqQVVLWGJVZzlPaDYyNjdWTURScV9BbVp0SWhFNzZpSWZxSzdhc0tKbVlHMU13NmZVOFdidFFkajZPTm9SRnlZMWFRZ01CVHh0dXU0TjNVMUxZNk9Ibk5DV3hrYlRiZ20zYkIzSFhMQVVpcTFPc00xQjhhcGV1aXM00gF_QVVfeXFMTmtFQXMwMlY1el9WY0VRWEh5YkxXbHF0SjFLQVByNk1xS3hpdnBuUDVxOGZCQXl1QVFXaUVpbk5lUGgwRVVVT25tZlVUVWZqQzc4cm5MSVlfYmVlclFTOUFmTHF4eTlfemhTa2JKeG14bmNabENkSmZaeHB4WnZ5dw?hl=en-US&gl=US&ceid=US%3Aen"
+    ]

     for url in source_urls:
         try:
-            decoded_url = new_decoderv1(url, interval=interval_time)
+            decoded_url = gnewsdecoder(url, interval=interval_time)
             if decoded_url.get("status"):
                 print("Decoded URL:", decoded_url["decoded_url"])
             else:
                 print("Error:", decoded_url["message"])
         except Exception as e:
             print(f"Error occurred: {e}")

-    # Output: decoded_url - {'status': True, 'decoded_url': 'https://healthdatamanagement.com/articles/empowering-the-quintuple-aim-embracing-an-essential-architecture/'}
-
-
 if __name__ == "__main__":
     main()
 ```

-
-
-### Using a proxy to deal with rate limiting
+### Using a for loop to decode multiple URLs with Proxy

 ```python
-from googlenewsdecoder import new_decoderv1
+from googlenewsdecoder import gnewsdecoder

 def main():
+    interval_time = 1 # interval is optional, default is None
+    proxy = "http://user:pass@localhost:8080" # proxy is optional, default is None

-    interval_time = 5 # default interval is 1 sec, if not specified
-
-    source_url = "https://news.google.com/read/CBMi2AFBVV95cUxPd1ZCc1loODVVNHpnbFFTVHFkTG94eWh1NWhTeE9yT1RyNTRXMVV2S1VIUFM3ZlVkVjl6UHh3RkJ0bXdaTVRlcHBjMWFWTkhvZWVuM3pBMEtEdlllRDBveGdIUm9GUnJ4ajd1YWR5cWs3VFA5V2dsZnY1RDZhVDdORHRSSE9EalF2TndWdlh4bkJOWU5UMTdIV2RCc285Q2p3MFA4WnpodUNqN1RNREMwa3d5T2ZHS0JlX0MySGZLc01kWDNtUEkzemtkbWhTZXdQTmdfU1JJaXY?hl=en-US&gl=US&ceid=US%3Aen"
-
-    try:
-        decoded_url = new_decoderv1(source_url, proxy="http://user:pass@localhost:8080")
-        if decoded_url.get("status"):
-            print("Decoded URL:", decoded_url["decoded_url"])
-        else:
-            print("Error:", decoded_url["message"])
-    except Exception as e:
-        print(f"Error occurred: {e}")
+    source_urls = [
+        "https://news.google.com/read/CBMilgFBVV95cUxOM0JJaFRwV2dqRDk5dEFpWmF1cC1IVml5WmVtbHZBRXBjZHBfaUsyalRpa1I3a2lKM1ZnZUI4MHhPU2sydi1nX3JrYU0xWjhLaHNfU0N6cEhOYVE2TEptRnRoZGVTU3kzZGJNQzc2aDZqYjJOR0xleTdsemdRVnJGLTVYTEhzWGw4Z19lR3AwR0F1bXlyZ0HSAYwBQVVfeXFMTXlLRDRJUFN5WHg3ZTI0X1F4SjN6bmFIck1IaGxFVVZyOFQxdk1JT3JUbl91SEhsU0NpQzkzRFdHSEtjVGhJNzY4ZTl6eXhESUQ3XzdWVTBGOGgwSmlXaVRmU3BsQlhPVjV4VWxET3FQVzJNbm5CUDlUOHJUTExaME5YbjZCX1NqOU9Ta3U?hl=en-US&gl=US&ceid=US%3Aen",
+        "https://news.google.com/read/CBMiiAFBVV95cUxQOXZLdC1hSzFqQVVLWGJVZzlPaDYyNjdWTURScV9BbVp0SWhFNzZpSWZxSzdhc0tKbVlHMU13NmZVOFdidFFkajZPTm9SRnlZMWFRZ01CVHh0dXU0TjNVMUxZNk9Ibk5DV3hrYlRiZ20zYkIzSFhMQVVpcTFPc00xQjhhcGV1aXM00gF_QVVfeXFMTmtFQXMwMlY1el9WY0VRWEh5YkxXbHF0SjFLQVByNk1xS3hpdnBuUDVxOGZCQXl1QVFXaUVpbk5lUGgwRVVVT25tZlVUVWZqQzc4cm5MSVlfYmVlclFTOUFmTHF4eTlfemhTa2JKeG14bmNabENkSmZaeHB4WnZ5dw?hl=en-US&gl=US&ceid=US%3Aen"
+    ]

-    # Output: decoded_urls - [{'status': True, 'decoded_url': 'https://healthdatamanagement.com/articles/empowering-the-quintuple-aim-embracing-an-essential-architecture/'}]
+    for url in source_urls:
+        try:
+            decoded_url = gnewsdecoder(url, interval=interval_time, proxy=proxy)
+            if decoded_url.get("status"):
+                print("Decoded URL:", decoded_url["decoded_url"])
+            else:
+                print("Error:", decoded_url["message"])
+        except Exception as e:
+            print(f"Error occurred: {e}")

 if __name__ == "__main__":
     main()
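
The proxy formats listed in the README section added by this diff can be checked before a proxy string is handed to `gnewsdecoder`. The following is only a minimal sketch using the standard library; `SUPPORTED_SCHEMES` and `describe_proxy` are hypothetical names, and nothing here reflects how the package itself parses or validates the `proxy` argument.

```python
from urllib.parse import urlsplit

# Schemes listed under "Supported Proxy Formats" in the README.
SUPPORTED_SCHEMES = {"http", "https", "socks5"}


def describe_proxy(proxy_url: str) -> dict:
    """Split a proxy URL into its parts.

    Raises ValueError if the URL does not look like one of the
    README's formats (scheme://[user:pass@]host:port).
    """
    parts = urlsplit(proxy_url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    # Accessing parts.port raises ValueError for a malformed port.
    if not parts.hostname or parts.port is None:
        raise ValueError("proxy URL must include a host and a port")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "has_auth": parts.username is not None,
    }
```

A caller could run `describe_proxy("http://user:pass@localhost:8080")` first and only pass the string on as `proxy=` if no `ValueError` is raised.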

googlenewsdecoder/__init__.py

Lines changed: 32 additions & 1 deletion
@@ -1,5 +1,36 @@
-from .new_decoderv1 import decode_google_news_url as new_decoderv1
 from .decoderv1 import decode_google_news_url as decoderv1
 from .decoderv2 import decode_google_news_url as decoderv2
 from .decoderv3 import decode_google_news_url as decoderv3
 from .decoderv4 import decode_google_news_url as decoderv4
+from .new_decoderv1 import decode_google_news_url as new_decoderv1
+from .new_decoderv2 import GoogleDecoder
+from .__version__ import __version__
+
+
+def gnewsdecoder(source_url, interval=None, proxy=None):
+    """
+    Decodes a Google News article URL into its original source URL.
+    This is a convenience function that uses the GoogleDecoder class internally.
+
+    Parameters:
+        source_url (str): The Google News article URL.
+        interval (int, optional): Delay time in seconds before decoding to avoid rate limits.
+        proxy (str, optional): Proxy to be used for all requests.
+
+    Returns:
+        dict: A dictionary containing 'status' and 'decoded_url' if successful,
+              otherwise 'status' and 'message'.
+    """
+    decoder = GoogleDecoder(proxy=proxy)
+    return decoder.decode_google_news_url(source_url, interval=interval)
+
+
+__all__ = [
+    "decoderv1",
+    "decoderv2",
+    "decoderv3",
+    "decoderv4",
+    "new_decoderv1",
+    "GoogleDecoder",
+    "gnewsdecoder",
+]
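
The docstring above describes the result as a dict with `status` and `decoded_url` on success, or `status` and `message` on failure. One way to consume that shape for many URLs is sketched below. `decode_all` and its `decode` parameter are hypothetical and not part of the package; the decoder is passed in as a callable (for example `gnewsdecoder`) so the helper can be exercised without network access.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def decode_all(
    urls: Iterable[str],
    decode: Callable[..., dict],
    interval: Optional[int] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Decode each URL and separate successes from failures.

    `decode` is expected to accept (url, interval=...) and return the
    dict shape described in the gnewsdecoder docstring.
    """
    decoded: List[str] = []
    errors: Dict[str, str] = {}
    for url in urls:
        try:
            result = decode(url, interval=interval)
        except Exception as exc:  # mirror the README's broad except
            errors[url] = str(exc)
            continue
        if result.get("status"):
            decoded.append(result["decoded_url"])
        else:
            errors[url] = result.get("message", "unknown error")
    return decoded, errors
```

To route requests through a proxy, the callable could be built with `functools.partial(gnewsdecoder, proxy=...)`, since `proxy` is a keyword argument in the signature shown in the diff.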

googlenewsdecoder/__version__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+__version__ = "0.1.7"
