forked from sfu-db/deeperlib

Deep Web Crawler for Data Enrichment

Nekokir/deeperlib

 
 

DeepER - Deep Entity Resolution

DeepER is a web data integration tool built on a novel framework for deep entity resolution. It aims to find pairs of records that describe the same entity between a local database and a hidden database, a task with many applications in data enrichment and data cleaning. It is easy to configure, fully functional, and offers a smooth interface.
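
To make the matching task concrete, here is a minimal sketch of pairing local records with records returned from a hidden database. It assumes token-level Jaccard similarity with a fixed threshold, which is an illustrative choice and not necessarily the matching method DeepER uses. The function names, the `name` and `id` fields, and the threshold value are all hypothetical.

```python
def tokens(text):
    """Split a string into a set of lowercase word tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match_records(local, hidden, threshold=0.6):
    """Return (local_id, hidden_id) pairs whose names are similar enough.

    Assumption: every record is a dict with hypothetical "id" and "name" keys.
    """
    pairs = []
    for l_rec in local:
        for h_rec in hidden:
            if jaccard(l_rec["name"], h_rec["name"]) >= threshold:
                pairs.append((l_rec["id"], h_rec["id"]))
    return pairs


local_db = [{"id": "l1", "name": "Thai Noodle House"}]
hidden_db = [
    {"id": "h1", "name": "thai noodle house vancouver"},
    {"id": "h2", "name": "Pizza House"},
]
matches = match_records(local_db, hidden_db)
```

The nested loop compares every local record with every hidden record, which is only practical for small inputs. With a hidden database the records are not available up front and have to be retrieved through keyword queries first.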

API Support

DeepER supports the following APIs out of the box:

  • DBLP (DataBase systems and Logic Programming)
  • Yelp (Yelp Fusion API)
  • AMiner (ArnetMiner)

Custom

To integrate a new API and collect more data, implement a subclass of deeper.api.simapi and pass it to deeper.core.smartcrawl.
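
The subclass-and-pass pattern described above might look roughly like the following sketch. It does not import DeepER: `SimApiSketch` and `crawl_sketch` are hypothetical stand-ins for deeper.api.simapi and deeper.core.smartcrawl, and the real base class and crawler may expose different method names and arguments. Consult the documentation for the actual interface.

```python
from abc import ABC, abstractmethod


class SimApiSketch(ABC):
    """Hypothetical stand-in for deeper.api.simapi (real interface may differ)."""

    @abstractmethod
    def search(self, keywords):
        """Return a list of record dicts matching all given keywords."""


class InMemoryApi(SimApiSketch):
    """Toy 'hidden database' that answers keyword queries from a local list."""

    def __init__(self, records, top_k=2):
        self.records = records
        self.top_k = top_k  # mimic an API that returns only the top-k results

    def search(self, keywords):
        wanted = {k.lower() for k in keywords}
        hits = [
            r for r in self.records
            if wanted <= set(r["title"].lower().split())
        ]
        return hits[: self.top_k]


def crawl_sketch(api, queries):
    """Hypothetical stand-in for deeper.core.smartcrawl.

    Issues each query through the API object and deduplicates results by id.
    """
    seen = {}
    for query in queries:
        for rec in api.search(query):
            seen.setdefault(rec["id"], rec)
    return list(seen.values())


records = [
    {"id": 1, "title": "deep web crawling"},
    {"id": 2, "title": "entity resolution survey"},
    {"id": 3, "title": "deep entity resolution"},
]
crawled = crawl_sketch(InMemoryApi(records), [["deep"], ["entity", "resolution"]])
```

The crawler only depends on the `search` method, so swapping in a different data source means writing one new subclass and leaving the crawl logic untouched.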

Documentation

Documentation is available at https://sfu-db.github.io/deeperlib/

Requirements

  • pqdict>=1.0.0
  • requests>=2.18.4
  • simplejson>=3.11.1
  • rauth>=0.7.3

DeepER officially supports Python 2.7.13 and runs well on PyPy.

Installation and Update

pip install deeperlib
pip install --upgrade deeperlib

Changelog

v0.2a

  • 2017/09/19: added support for Windows (32-bit/64-bit), Linux (32-bit/64-bit), and macOS (64-bit), plus CSV and pickle input

v0.1a

  • 2017/09/14: DeepER's birthday

Team

  • Jiannan Wang, Assistant Professor at Simon Fraser University
  • Eugene Wu, Assistant Professor at Columbia University
  • Ryan Shea, Research Associate at Simon Fraser University
  • Pei Wang, Ph.D. Student at Simon Fraser University
  • Yongjun He, Undergraduate Student at Nanjing University

Discussion

Maintainer email
Yongjun He 141250047@smail.nju.edu.cn
