api_or_bulk_downloads Bulk
citation Arts S, Hou J, Gomez JC. (2020). Natural language processing to identify the creation and impact of new technologies in patent text: code, data, and new measures. Forthcoming Research Policy. (https://doi.org/10.1016/j.respol.2020.104144)
code https://github.com/sam-arts/respol_patents_code
contributors
Sam Arts
Jianan Hou
Juan Carlos Gomez
cost None
datasets_and_publications_using_this_dataset Arts S, Hou J, Gomez JC. (2020). Natural language processing to identify the creation and impact of new technologies in patent text: code, data, and new measures. Forthcoming Research Policy. (https://doi.org/10.1016/j.respol.2020.104144)
description Open-access data files derived from the text of USPTO patent documents: 1) for each US patent, a list of processed, cleaned, and stemmed keywords; 2) for each patent, the 1,000 most similar patents (by cosine similarity) from the entire population of US patents; 3) for each US patent, the average cosine similarity with all patents from the previous 5 years and with all patents from the following 5 years; 4) each new keyword (unigram), bigram (sequence of two adjacent keywords), trigram, and pairwise keyword combination introduced for the first time by a US patent, together with the number of the patent introducing it and the total number of patents in the entire population that use it.
documentation https://zenodo.org/record/3515985
doi https://doi.org/10.5281/zenodo.3515985
error_metrics Yes
last_edit Fri, 01 Dec 2023 17:56:16 GMT
location https://zenodo.org/record/3515985
maintained_by Sam Arts
open_access TRUE
related_projects
supercedes
patenttext
related_publications Arts S, Hou J, Gomez JC. (2020). Natural language processing to identify the creation and impact of new technologies in patent text: code, data, and new measures. Forthcoming Research Policy. (https://doi.org/10.1016/j.respol.2020.104144)
shortname patent_text_new_measures
superseded_by Fri, 25 Feb 2022 23:35:52 GMT
tags
patent measures
text
natural language processing
novelty
impact
USPTO
technological progress
terms_of_use Open Data Commons Attribution License v1.0
timeframe 1969-2018
title Patent text: code, data, and new measures
uuid 44f33a6f-5099-4481-abed-af9aadf0bd4f
versioning FALSE
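
The measures named in the description can be illustrated with a small sketch. This is not the authors' code (see the `code` field for that); it assumes each patent is represented by its list of stemmed keywords, treats keyword vectors as binary, and counts the introducing patent among the patents using a bigram. The function names `cosine_similarity`, `bigrams`, and `first_uses` and the sample patent numbers are hypothetical.

```python
from math import sqrt


def cosine_similarity(keywords_a, keywords_b):
    """Cosine similarity of two patents, assuming binary keyword vectors."""
    a, b = set(keywords_a), set(keywords_b)
    if not a or not b:
        return 0.0
    # For binary vectors the dot product is the number of shared keywords.
    return len(a & b) / sqrt(len(a) * len(b))


def bigrams(keywords):
    """Sequences of two adjacent keywords, in document order."""
    return list(zip(keywords, keywords[1:]))


def first_uses(patents):
    """Map each bigram to the patent that introduced it and to its usage count.

    `patents` is an iterable of (patent_number, keyword_list) pairs,
    assumed to be sorted chronologically.
    """
    introduced_by = {}
    usage_count = {}
    for number, keywords in patents:
        # Count each bigram at most once per patent.
        for bigram in set(bigrams(keywords)):
            introduced_by.setdefault(bigram, number)
            usage_count[bigram] = usage_count.get(bigram, 0) + 1
    return introduced_by, usage_count


# Hypothetical sample data, oldest patent first.
sample = [
    ("P1", ["laser", "diod", "arrai"]),
    ("P2", ["laser", "diod"]),
]
introduced_by, usage_count = first_uses(sample)
similarity = cosine_similarity(sample[0][1], sample[1][1])
```

The same loop extends to unigrams, trigrams, and pairwise keyword combinations by swapping out `bigrams` for the corresponding extractor.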