News Reader-Selenium Project

Web scraping is the automated gathering of content and data from a website or any other resource available on the internet. Unlike screen scraping, web scraping extracts the HTML code underlying the webpage. Users can then process that HTML to extract data and carry out data cleaning, manipulation, and analysis.
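
As a small illustration of processing a page's HTML to pull out data, the sketch below uses Python's standard-library `html.parser` to collect the text of heading tags from an HTML string. The sample markup and the choice of `<h3>` as the headline tag are assumptions made for this example; real news pages use their own structure.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect the text found inside a given tag (assumed here: <h3>)."""

    def __init__(self, tag="h3"):
        super().__init__()
        self.tag = tag
        self.depth = 0        # > 0 while inside the target tag
        self.parts = []       # text fragments of the current headline
        self.headlines = []   # cleaned headlines found so far

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Basic cleaning: collapse runs of whitespace.
                text = " ".join("".join(self.parts).split())
                if text:
                    self.headlines.append(text)
                self.parts = []

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def extract_headlines(html, tag="h3"):
    """Return the cleaned text of every `tag` element in `html`."""
    parser = HeadlineParser(tag)
    parser.feed(html)
    parser.close()
    return parser.headlines


# Made-up sample page, not taken from any real site.
SAMPLE_HTML = """
<html><body>
  <h3><a href="/a">  First   sample headline </a></h3>
  <p>Some unrelated paragraph.</p>
  <h3>Second sample <b>headline</b></h3>
</body></html>
"""

print(extract_headlines(SAMPLE_HTML))
```

With Selenium the HTML comes from a live browser session rather than a string, but the extraction and cleaning step follows the same idea.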

Project Description

In this project, a web browser is driven with Selenium. News headlines are fetched from a specified source (e.g. Hindustan Times), printed, and saved to a text file. The headlines are also converted to speech using Google Text-to-Speech (gTTS) and saved as an audio file (mp3).
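
The workflow described above could be sketched roughly as follows. This is not the code in `news_scraperBOT.py`: the function names, the `h3` CSS selector, and any output file names are hypothetical and chosen only for illustration. The Selenium and gTTS imports sit inside the functions that need them, so the text-file helpers can be used without a browser.

```python
from pathlib import Path


def fetch_headlines(url, css_selector="h3"):
    """Open `url` in Firefox and return the text of matching elements.

    The selector is an assumption; adjust it to the target site's markup.
    """
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        elements = driver.find_elements(By.CSS_SELECTOR, css_selector)
        return [e.text.strip() for e in elements if e.text.strip()]
    finally:
        driver.quit()  # always close the browser


def format_headlines(headlines):
    """Number the headlines, one per line."""
    return "\n".join(f"{i}. {h}" for i, h in enumerate(headlines, start=1))


def save_headlines(headlines, text_path):
    """Write the numbered headlines to a UTF-8 text file."""
    text_path = Path(text_path)
    text_path.parent.mkdir(parents=True, exist_ok=True)
    text_path.write_text(format_headlines(headlines) + "\n", encoding="utf-8")
    return text_path


def speak_headlines(headlines, audio_path, lang="en"):
    """Convert the headlines to speech with gTTS and save an mp3."""
    from gtts import gTTS

    if not headlines:
        raise ValueError("no headlines to convert to speech")
    audio_path = Path(audio_path)
    audio_path.parent.mkdir(parents=True, exist_ok=True)
    gTTS(text=". ".join(headlines), lang=lang).save(str(audio_path))
    return audio_path
```

A typical call sequence would be `headlines = fetch_headlines(url)`, then `print(format_headlines(headlines))`, `save_headlines(headlines, "docs/headlines/news.txt")`, and `speak_headlines(headlines, "audio/news.mp3")`, where `url` is the news site's address and both file names are placeholders. Fetching the page and generating speech both need network access.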

Designed for Linux. Not yet tested on Windows or macOS!

Installation

STEP1: Clone this repository

~$ git clone https://github.com/Devansh-Seth-DEV/News_Scraping.git

STEP2: Create a virtual environment

Open your favourite terminal

~$ cd <path to cloned repository>/News_Scraping
News_Scraping:~$ pip3 install virtualenv
News_Scraping:~$ virtualenv <venv_name>

STEP3: Activate virtual environment

News_Scraping:~$ source <venv_name>/bin/activate

STEP4: Give execute permission to the Firefox driver (geckodriver)

(<venv_name>) News_Scraping:~$ chmod +x drivers/FirefoxDriver/geckodriver
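
The same permission change can also be made from Python, which may be handy if a script should check the driver before starting the browser. The sketch below is an illustration only: the helper names are hypothetical, `make_firefox_driver` assumes Selenium 4's `Service` class, and it is not confirmed that the project starts the driver this way.

```python
import os
import stat


def is_executable(path):
    """Return True if the owner execute bit is set on `path`."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def ensure_executable(path):
    """Add execute bits for user, group and others (similar to `chmod +x`)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def make_firefox_driver(driver_path="drivers/FirefoxDriver/geckodriver"):
    """Start Firefox using the geckodriver binary shipped in the repository."""
    # Imported here so the permission helpers work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    if not is_executable(driver_path):
        ensure_executable(driver_path)
    return webdriver.Firefox(service=Service(executable_path=driver_path))
```

The driver path is relative, so `make_firefox_driver()` is meant to be called from the `News_Scraping` directory.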

Install Selenium and gTTS (Google-Text-To-Speech)

Assuming that the virtual environment is activated

(<venv_name>) News_Scraping:~$ pip3 install selenium
(<venv_name>) News_Scraping:~$ pip3 install webdriver-manager
(<venv_name>) News_Scraping:~$ pip3 install gTTS
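
To confirm that the three packages landed in the active virtual environment, a quick check along these lines may help. It is a minimal sketch: `missing_modules` is a hypothetical helper, and the only fixed facts it relies on are the import names of the packages (`selenium`, `webdriver_manager`, `gtts`).

```python
from importlib.util import find_spec

# Import names differ from the pip package names in two cases:
# webdriver-manager -> webdriver_manager, gTTS -> gtts.
REQUIRED_MODULES = ("selenium", "webdriver_manager", "gtts")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level modules from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Run it with the same `python3` that the virtual environment provides; a package installed outside the environment will be reported as missing.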

RUN

Assuming that the virtual environment is activated

(<venv_name>) News_Scraping:~$ python3 news_scraperBOT.py

OUTPUT

News Text Files Directory

(<venv_name>) News_Scraping:~$ cd ./docs/headlines

News Audio Files Directory

(<venv_name>) News_Scraping:~$ cd ./audio/
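
Instead of browsing the two directories by hand, the most recent output can be located from Python. The sketch below assumes only the directory names given above (`docs/headlines` and `audio`); how the script names its output files is not specified here, so the helper (a hypothetical `newest_entry`) simply picks the most recently modified entry.

```python
from pathlib import Path


def newest_entry(directory):
    """Return the most recently modified entry in `directory`, or None."""
    directory = Path(directory)
    if not directory.is_dir():
        return None
    entries = list(directory.iterdir())
    if not entries:
        return None
    return max(entries, key=lambda p: p.stat().st_mtime)


# Paths are relative to the News_Scraping directory.
for folder in ("docs/headlines", "audio"):
    print(folder, "->", newest_entry(folder))
```

If a directory does not exist yet (for example before the first run), the helper returns `None` rather than raising an error.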

Deactivating Virtual Environment

(<venv_name>) News_Scraping:~$ deactivate

