This repository was archived by the owner on Jul 3, 2023. It is now read-only.

Commit e0406ec

Merge pull request #2 from jaywad/support-scraper-leads
Support ScraperAPI and LeadsAPI
2 parents 1899dfe + a536da3 commit e0406ec

6 files changed: 109 additions, 8 deletions

README.md

Lines changed: 43 additions & 1 deletion
@@ -18,7 +18,7 @@ Or install it yourself as:

 $ gem install proxycrawl

-## Usage
+## Crawling API Usage

 Require the gem in your project

@@ -130,6 +130,48 @@ puts response.original_status
 puts response.pc_status
 ```

+## Scraper API usage
+
+Initialize the Scraper API using your normal token and call the `get` method.
+
+```ruby
+scraper_api = ProxyCrawl::ScraperAPI.new(token: 'YOUR_TOKEN')
+```
+
+Pass the url that you want to scrape plus any options from the ones available in the [Scraper API documentation](https://proxycrawl.com/docs/scraper-api/parameters).
+
+```ruby
+api.get(url, options)
+```
+
+Example:
+
+```ruby
+begin
+  response = scraper_api.get('https://www.amazon.com/Halo-SleepSack-Swaddle-Triangle-Neutral/dp/B01LAG1TOS')
+  puts response.status_code
+  puts response.body
+rescue => exception
+  puts exception.backtrace
+end
+```
+
+## Leads API usage
+
+Initialize with your Leads API token and call the `get` method.
+
+```ruby
+leads_api = ProxyCrawl::LeadsAPI.new(token: 'YOUR_TOKEN')
+
+begin
+  response = leads_api.get('stripe.com')
+  puts response.status_code
+  puts response.body
+rescue => exception
+  puts exception.backtrace
+end
+```
+
 If you have questions or need help using the library, please open an issue or [contact us](https://proxycrawl.com/contact).

 ## Development

lib/proxycrawl.rb

Lines changed: 5 additions & 1 deletion
@@ -1,5 +1,9 @@
-require "proxycrawl/version"
+# frozen_string_literal: true
+
+require 'proxycrawl/version'
 require 'proxycrawl/api'
+require 'proxycrawl/scraper_api'
+require 'proxycrawl/leads_api'

 module ProxyCrawl
 end

lib/proxycrawl/api.rb

Lines changed: 8 additions & 5 deletions
@@ -1,4 +1,5 @@
 # frozen_string_literal: true
+
 require 'net/http'
 require 'json'
 require 'uri'
@@ -7,8 +8,6 @@ module ProxyCrawl
   class API
     attr_reader :token, :body, :status_code, :original_status, :pc_status, :url

-    BASE_URL = 'https://api.proxycrawl.com'
-
     INVALID_TOKEN = 'Token is required'
     INVALID_URL = 'URL is required'

@@ -58,15 +57,19 @@ def post(url, data, options = {})

     private

+    def base_url
+      'https://api.proxycrawl.com'
+    end
+
     def prepare_uri(url, options)
-      uri = URI(BASE_URL)
+      uri = URI(base_url)
       uri.query = URI.encode_www_form({ token: @token, url: url }.merge(options))

       uri
     end

     def prepare_response(response, format)
-      if format == 'json'
+      if format == 'json' || base_url.include?('/scraper')
         @status_code = response.code.to_i
         @body = response.body
       else
@@ -78,4 +81,4 @@ def prepare_response(response, format)
       end
     end
   end
-end
+end
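
The change above swaps the `BASE_URL` constant for a private `base_url` method, which lets a subclass point the same request-building code at a different endpoint. The following is a minimal sketch of that pattern, not the gem's actual classes: `SketchAPI`, `SketchScraperAPI`, and the public `request_uri` wrapper are hypothetical names introduced here so the resulting URI can be inspected without making a network call.

```ruby
require 'uri'

# Hypothetical stand-in for ProxyCrawl::API, reduced to URL building.
class SketchAPI
  def initialize(token)
    @token = token
  end

  # Public wrapper that exists only in this sketch, for inspection.
  def request_uri(url, options = {})
    prepare_uri(url, options)
  end

  private

  # Subclasses override this method to target another endpoint.
  def base_url
    'https://api.proxycrawl.com'
  end

  def prepare_uri(url, options)
    uri = URI(base_url)
    uri.query = URI.encode_www_form({ token: @token, url: url }.merge(options))
    uri
  end
end

# Hypothetical stand-in for ProxyCrawl::ScraperAPI.
class SketchScraperAPI < SketchAPI
  private

  def base_url
    'https://api.proxycrawl.com/scraper'
  end
end

crawl_uri = SketchAPI.new('YOUR_TOKEN').request_uri('https://example.com')
scraper_uri = SketchScraperAPI.new('YOUR_TOKEN').request_uri('https://example.com')

puts crawl_uri
puts scraper_uri
```

A constant would not work the same way here: a method body that references `BASE_URL` resolves it lexically in the class where the method was written, so a subclass redefining the constant would not affect the inherited `prepare_uri`.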

lib/proxycrawl/leads_api.rb

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+# frozen_string_literal: true
+
+require 'net/http'
+require 'json'
+require 'uri'
+
+module ProxyCrawl
+  class LeadsAPI
+    attr_reader :token, :body, :status_code
+
+    INVALID_TOKEN = 'Token is required'
+    INVALID_DOMAIN = 'Domain is required'
+
+    def initialize(options = {})
+      raise INVALID_TOKEN if options[:token].nil?
+
+      @token = options[:token]
+    end
+
+    def get(domain)
+      raise INVALID_DOMAIN if domain.empty?
+
+      uri = URI('https://api.proxycrawl.com/leads')
+      uri.query = URI.encode_www_form({ token: token, domain: domain })
+
+      response = Net::HTTP.get_response(uri)
+
+      @status_code = response.code.to_i
+      @body = response.body
+
+      self
+    end
+  end
+end
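
One detail in `get` may be worth a second look: `domain.empty?` raises `NoMethodError` when `domain` is `nil`, so the `'Domain is required'` message is only reached for an empty string. Below is a minimal sketch of a stricter check together with the same query construction. The helper names `validate_domain!` and `leads_uri` are hypothetical and not part of the gem, and raising `ArgumentError` (instead of the `RuntimeError` a bare string raise produces) is a choice made for this sketch.

```ruby
require 'uri'

INVALID_DOMAIN = 'Domain is required'

# Hypothetical helper: rejects nil, empty, and whitespace-only input.
def validate_domain!(domain)
  raise ArgumentError, INVALID_DOMAIN if domain.nil? || domain.to_s.strip.empty?

  domain
end

# Hypothetical helper: builds the Leads API URI the same way the diff does,
# without sending the request.
def leads_uri(token, domain)
  uri = URI('https://api.proxycrawl.com/leads')
  uri.query = URI.encode_www_form({ token: token, domain: validate_domain!(domain) })
  uri
end

uri = leads_uri('YOUR_TOKEN', 'stripe.com')
puts uri

nil_error =
  begin
    leads_uri('YOUR_TOKEN', nil)
    nil
  rescue ArgumentError => e
    e
  end
puts nil_error.message
```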

lib/proxycrawl/scraper_api.rb

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# frozen_string_literal: true
+
+module ProxyCrawl
+  class ScraperAPI < ProxyCrawl::API
+
+    def post
+      raise 'Only GET is allowed for the ScraperAPI'
+    end
+
+    private
+
+    def base_url
+      'https://api.proxycrawl.com/scraper'
+    end
+  end
+end
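
Because the `post` override above takes no parameters, a caller who uses the parent's signature, `post(url, data, options = {})`, gets Ruby's `ArgumentError` for the arity mismatch before the method body runs, so the "Only GET is allowed" message is never shown. The sketch below illustrates the difference using hypothetical classes (`SketchBase`, `StrictOverride`, `TolerantOverride`) and a small hypothetical helper `error_for`. It is one possible adjustment, not the gem's code.

```ruby
# Hypothetical parent with the same post signature as in the diff.
class SketchBase
  def post(url, _data, _options = {})
    "POST #{url}"
  end
end

# Mirrors the diff: a zero-arity override.
class StrictOverride < SketchBase
  def post
    raise 'Only GET is allowed for the ScraperAPI'
  end
end

# Accepts and ignores any arguments so the custom message is always reached.
class TolerantOverride < SketchBase
  def post(*_args)
    raise 'Only GET is allowed for the ScraperAPI'
  end
end

# Hypothetical helper: returns the error raised by the block, or nil.
def error_for
  yield
  nil
rescue StandardError => e
  e
end

strict_error = error_for { StrictOverride.new.post('https://example.com', {}) }
tolerant_error = error_for { TolerantOverride.new.post('https://example.com', {}) }

puts strict_error.class
puts tolerant_error.message
```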

lib/proxycrawl/version.rb

Lines changed: 3 additions & 1 deletion
@@ -1,3 +1,5 @@
+# frozen_string_literal: true
+
 module ProxyCrawl
-  VERSION = "0.2.0"
+  VERSION = '0.2.1'
 end
