Commit c4dbd1b

Merge pull request #1940 from SyedImtiyaz-1/imgScrape
Added `Image Scrapper`
2 parents 820d8f3 + cc6dc44 commit c4dbd1b

4 files changed: +78 −0 lines changed


Auto-Linkedin /AutoLinkedIn.py

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
from selenium import webdriver  # connects Python with the Chrome web browser
from selenium.webdriver.common.keys import Keys
import pyautogui as pag

driver = None  # shared WebDriver instance, created in main() and used by the helpers below


def main():
    global driver
    url = "http://linkedin.com/"  # URL of LinkedIn
    network_url = "http://linkedin.com/mynetwork/"  # URL of the LinkedIn network page
    driver = webdriver.Chrome(r'F:\Argha\WebDriver\chromedriver.exe')  # path to the browser web driver
    driver.get(url)


def login():
    username = driver.find_element_by_id("login-email")  # Getting the login element
    username.send_keys("username")  # Sending the keys for the username
    password = driver.find_element_by_id("login-password")  # Getting the password element
    password.send_keys("password")  # Sending the keys for the password
    driver.find_element_by_id("login-submit").click()  # Clicking the submit button


def goto_network():
    driver.find_element_by_id("mynetwork-tab-icon").click()  # open the "My Network" tab


def send_requests():
    n = int(input("Number of requests: "))  # Number of requests you want to send
    for i in range(n):
        pag.click(880, 770)  # position (in px) of the connect button
    print("Done!")


if __name__ == "__main__":
    # Workflow described in the README: open LinkedIn, log in,
    # open the network tab, then send the connection requests.
    main()
    login()
    goto_network()
    send_requests()

Auto-Linkedin /README.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
## Auto Linkedin

- It imports the necessary modules: webdriver from Selenium to control the web browser, Keys from Selenium to handle keyboard keys, and pyautogui as pag to simulate mouse clicks.
- The main() function sets up the Selenium WebDriver with the Chrome browser and opens the LinkedIn website.
- The login() function finds the login elements on the LinkedIn page, enters the provided username and password, and clicks the submit button to log in.
- The goto_network() function clicks the "My Network" tab on the LinkedIn page.
- The send_requests() function prompts the user for the number of connection requests to send, then uses the PyAutoGUI library to simulate mouse clicks on the connect button (at the specified screen position) that many times.

- Install these before running:
1. pip install selenium
2. pip install pyautogui

Once you have installed the necessary libraries and downloaded the Chrome WebDriver, you should be able to run the code successfully. A Selenium 4 variant of the login flow is sketched below.
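Note that Selenium 4 removed the `find_element_by_*` helpers used in `AutoLinkedIn.py`. The following is a minimal sketch of the same login flow with the `By` locator API; the element IDs and placeholder credentials are copied from the script above and may not match the current LinkedIn markup, so treat it as a starting point rather than a drop-in replacement.

```python
# Sketch only: Selenium 4 version of the login flow from AutoLinkedIn.py.
# The element IDs ("login-email", "login-password", "login-submit") are the
# ones used in the original script and may need updating for today's LinkedIn.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # Selenium 4.6+ locates chromedriver automatically


def login(email, password):
    driver.get("https://www.linkedin.com/")
    driver.find_element(By.ID, "login-email").send_keys(email)        # username field
    driver.find_element(By.ID, "login-password").send_keys(password)  # password field
    driver.find_element(By.ID, "login-submit").click()                # submit button


login("username", "password")  # placeholder credentials, as in the original script
```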

Image-Scraper/README.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
## Image Scraper

The aim of the provided script is to scrape all HTML <img> tags from a given URL.

It imports the necessary modules: BeautifulSoup from the bs4 (Beautiful Soup) library for parsing HTML, and requests for making HTTP requests.
The code checks the length of the command-line arguments. If the length is not equal to 2 (indicating that a URL was not provided), it exits with an error message.
It uses the requests.get() function to make an HTTP GET request to the provided URL. The User-Agent header is set to mimic a web browser to avoid any potential blocking or filtering.
The response from the request is then passed to BeautifulSoup to parse the HTML content of the page.
The find_all() method is used on the parsed HTML data to find all <img> tags with a valid src attribute. The src=True parameter filters out <img> tags without the src attribute.
A loop iterates over the list of found images, and each image is printed.

In summary, the script allows you to scrape and print all HTML <img> tags (along with their attributes) from a given URL.
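For reference, a small variation on the same approach prints only each image's `src` attribute, resolved to an absolute URL with `urllib.parse.urljoin`. This is a sketch, not part of the committed script; the `base_url` value stands in for the URL normally passed on the command line.

```python
# Sketch: print the src of each <img> tag as an absolute URL instead of the whole tag.
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

base_url = "https://example.com/"  # stand-in for the URL argument (sys.argv[1] in the script)
response = requests.get(base_url, headers={"User-Agent": "Mozilla/5.0"})

html_data = BeautifulSoup(response.text, "html.parser")
for image in html_data.find_all("img", src=True):
    print(urljoin(base_url, image["src"]))  # resolve relative src values against the page URL
```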
### Installation Requirements

1. pip install beautifulsoup4
2. pip install requests

Image-Scraper/scrape_images.py

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
# Scrape all HTML <img> tags from a provided URL.

from bs4 import BeautifulSoup
import requests
import sys

if len(sys.argv) != 2:
    sys.exit("Usage: python scrape_images.py {url}")

response = requests.get(
    sys.argv[1],
    headers={
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"
    }
)

html_data = BeautifulSoup(response.text, 'html.parser')
images = html_data.find_all('img', src=True)

for image in images:
    print(image)
