
A tool to scrape Google Scholar search results with a simple API. Get academic articles, research papers, citations, authors, and publication information.


serpapi/google-scholar-scraper


Google Scholar Scraper


Google Scholar Scraper - A tool to scrape Google Scholar search results with a simple API. Get academic articles, research papers, citations, authors, and publication information.

Results are returned as structured JSON, so you don't need to parse HTML, manage proxies, or deal with other web scraping headaches.

How to scrape Google Scholar?

Using a simple GET request, you can retrieve Google Scholar search results:

https://serpapi.com/search.json?engine=google_scholar&q=machine+learning&api_key=YOUR_API_KEY
  • Register for free at SerpApi to get your API Key
  • q parameter: defines the search query. Supports author: and source: syntax.
  • as_ylo/as_yhi parameters (optional): filter by year range.
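As a minimal sketch, the request URL above can be assembled with Python's standard library. The helper name build_scholar_url and the example years are illustrative assumptions, not part of SerpApi; only the endpoint and the q, as_ylo, as_yhi, engine, and api_key parameters come from this README.

```python
from urllib.parse import urlencode

BASE_URL = "https://serpapi.com/search.json"


def build_scholar_url(query, api_key, year_low=None, year_high=None):
    """Return a Google Scholar search URL (hypothetical helper)."""
    params = {"engine": "google_scholar", "q": query, "api_key": api_key}
    # Only send the optional year filters that were actually provided.
    if year_low is not None:
        params["as_ylo"] = year_low
    if year_high is not None:
        params["as_yhi"] = year_high
    # urlencode escapes the values, e.g. spaces become "+".
    return f"{BASE_URL}?{urlencode(params)}"


url = build_scholar_url("machine learning", "YOUR_API_KEY", year_low=2020, year_high=2023)
print(url)
```

Letting urlencode handle the escaping avoids hand-building query strings, which matters once a query contains spaces, quotes, or colons.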

Code examples

Here are code examples in several popular programming languages.

cURL Integration

curl --get https://serpapi.com/search \
 -d engine="google_scholar" \
 -d q="machine learning" \
 -d api_key="YOUR_API_KEY"

Python Integration

Step 1: Create a new main.py file.

Step 2: Install the requests package:

pip install requests

Step 3: Add this code to your file:

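import requests

SERPAPI_API_KEY = "YOUR_SERPAPI_API_KEY"

# Query parameters for the Google Scholar engine
params = {
    "api_key": SERPAPI_API_KEY,
    "engine": "google_scholar",
    "q": "machine learning",
}

# Send the GET request and fail fast on HTTP errors
search = requests.get("https://serpapi.com/search", params=params, timeout=30)
search.raise_for_status()

# Parse the JSON body into a dictionary
response = search.json()
print(response)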

If you're only interested in the organic_results, you can print them from the response directly:

print(response["organic_results"])
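Indexing the response directly raises a KeyError if organic_results is missing. The sketch below reads the key defensively and pulls out just the titles. The helper name extract_titles and the sample dictionary are hypothetical, and the idea that a response can lack organic_results (for example, when a query has no matches) is an assumption rather than documented behavior.

```python
def extract_titles(response):
    """Return the titles of all organic results, or an empty list."""
    # .get() returns the fallback instead of raising KeyError.
    results = response.get("organic_results", [])
    return [item.get("title", "") for item in results]


# Hypothetical sample shaped like a parsed API response.
sample = {"organic_results": [{"position": 1, "title": "Example paper"}]}

print(extract_titles(sample))
print(extract_titles({}))
```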

JavaScript Integration

Step 1: Install the SerpApi JavaScript package:

npm install serpapi

Step 2: Create a new index.js file.

Step 3: Add this to your file:

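const { getJson } = require("serpapi");

// Replace with your SerpApi API key
const API_KEY = "YOUR_SERPAPI_API_KEY";

// Run the search and print the organic results
getJson({
  api_key: API_KEY,
  engine: "google_scholar",
  q: "machine learning"
}, (json) => {
  console.log(json["organic_results"]);
});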

Other Programming Languages

You can call our APIs from any programming language with a simple GET request. We also provide ready-to-use libraries: SerpApi Integrations.

Google Scholar Scraper Parameters

The Google Scholar API accepts the following parameters:

| Name | Description | Requirement |
|---|---|---|
| q | Parameter defines the search query. Supports author: and source: syntax | Required* |
| Advanced Search | | |
| cites | Parameter defines unique article ID for "Cited By" searches | Optional |
| cluster | Parameter defines unique article ID for "All Versions" searches | Optional |
| Year Range | | |
| as_ylo | Parameter defines the minimum year for results | Optional |
| as_yhi | Parameter defines the maximum year for results | Optional |
| Localization | | |
| hl | Parameter defines the two-letter language code | Optional |
| lr | Parameter defines language restrictions using lang_{code} format | Optional |
| Pagination | | |
| start | Parameter defines result offset for pagination | Optional |
| num | Parameter defines results per page (1-20, default 10) | Optional |
| Filtering | | |
| scisbd | Parameter to sort by recent articles | Optional |
| as_vis | Parameter to include/exclude citations | Optional |
| as_rr | Parameter to show review articles only | Optional |
| as_sdt | Parameter defines search type (case law, patents) or filtering | Optional |

*q is optional only when using the cites parameter.

Visit our documentation for more information on all available parameters.
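To illustrate how start and num combine, here is a small sketch that computes the request parameters for a given page. The helper name page_params is hypothetical, and it assumes start is a zero-based offset (0 for the first page). Check the documentation before relying on that.

```python
def page_params(query, api_key, page, per_page=10):
    """Build request parameters for a zero-based page index (hypothetical helper)."""
    # The parameter table lists 1-20 as the allowed range for num.
    if not 1 <= per_page <= 20:
        raise ValueError("per_page must be between 1 and 20")
    if page < 0:
        raise ValueError("page must be non-negative")
    return {
        "engine": "google_scholar",
        "q": query,
        "api_key": api_key,
        "start": page * per_page,  # assumed zero-based result offset
        "num": per_page,
    }


# Offsets for the first three pages at 20 results per page.
for page in range(3):
    print(page_params("machine learning", "YOUR_API_KEY", page, per_page=20)["start"])
```

The resulting dictionary can be passed as params to requests.get in the same way as in the Python example earlier.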

Available data on Google Scholar (JSON Response)

The fields returned can vary from one query or result to another. Here is what the organic_results array may contain:

 "organic_results": [
    {
      "position": "Integer - Position of the result",
      "title": "String - Article title",
      "result_id": "String - Unique identifier",
      "link": "String - URL to the article",
      "snippet": "String - Brief excerpt from the article",
      "publication_info": {
        "summary": "String - Authors, source, year",
        "authors": [
          {
            "name": "String - Author name",
            "link": "String - Link to author profile",
            "author_id": "String - Google Scholar author ID"
          }
        ]
      },
      "inline_links": {
        "cited_by": {
          "total": "Integer - Number of citations",
          "link": "String - URL to citing articles",
          "cites_id": "String - ID for cited by search"
        },
        "versions": {
          "total": "Integer - Number of versions",
          "link": "String - URL to all versions",
          "cluster_id": "String - ID for cluster search"
        },
        "related_pages_link": "String - URL to related articles"
      },
      "resources": [
        {
          "title": "String - Resource title (e.g., PDF)",
          "file_format": "String - File type",
          "link": "String - URL to resource"
        }
      ]
    }
  ],
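Since any of these fields may be absent, a defensive reader is useful. The sketch below flattens one organic result into a compact record. The function name summarize_result, the output keys, and the sample values are illustrative assumptions; only the input field names come from the structure above.

```python
def summarize_result(result):
    """Flatten one organic result into a small dictionary (hypothetical helper)."""
    pub = result.get("publication_info", {})
    cited_by = result.get("inline_links", {}).get("cited_by", {})
    return {
        "title": result.get("title"),
        "link": result.get("link"),
        # Author entries are optional, so default to an empty list.
        "authors": [a.get("name") for a in pub.get("authors", [])],
        # Treat a missing citation block as zero citations.
        "citations": cited_by.get("total", 0),
        "cites_id": cited_by.get("cites_id"),
    }


# Hypothetical sample data, not a real API response.
sample_result = {
    "position": 1,
    "title": "Example paper",
    "link": "https://example.org/paper",
    "publication_info": {"authors": [{"name": "A. Author"}]},
    "inline_links": {"cited_by": {"total": 42, "cites_id": "12345"}},
}

print(summarize_result(sample_result))
```

The cites_id value collected here is presumably what the cites parameter expects for a "Cited By" search, though that pairing is inferred from the field descriptions rather than confirmed.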

The API response also includes:

  • related_searches: Suggested related queries
  • pagination: Navigation links for result pages

Use cases

Here are some use cases for the Google Scholar API:

  • Build academic research tools for literature reviews.
  • Track citations and measure research impact (h-index analysis).
  • Monitor publications from specific authors or institutions.
  • Create bibliography generators for academic writing.
  • Analyze research trends in specific fields.
  • Build tools to discover related papers and research gaps.
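For the citation-tracking use case, the h-index can be computed from a list of citation counts, such as the inline_links.cited_by.total values collected for one author's papers. This is a generic sketch of the standard h-index definition, not a SerpApi feature. The function name is hypothetical, and gathering an author's complete publication list is left out.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            # Counts are sorted, so later papers cannot qualify either.
            break
    return h


print(h_index([10, 8, 5, 4, 3]))  # prints 4
```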

Blog tutorial

Video tutorial

Contacts

Feel free to reach out via contact@serpapi.com.

Check other Google Scrapers from SerpApi.
