Database_Creation

Objectives:

  1. Scrape all the relevant information from the GEO website for the provided GSE IDs and store it in a database of your choice.
  2. After creating the database, annotate the biological keywords in each summary and store the annotated keywords of each dataset in the database.
  3. Write a query to get all dataset IDs that contain a disease keyword.

GSE IDs: GSE63312, GSE78224, GSE74018, GSE50734, GSE114644, GSE60477, GSE53599, GSE80582, GSE109493, GSE35200
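
As a starting point for the scraping step, the sketch below turns the GSE IDs above into GEO accession page URLs. It assumes the public `acc.cgi` accession endpoint on the NCBI GEO site; the names `GEO_ACC_URL`, `GSE_IDS`, and `geo_url` are illustrative and not taken from this repository. It only builds the URLs and does not perform any network request.

```python
import re
from urllib.parse import urlencode

# Assumption: the standard GEO accession viewer endpoint.
GEO_ACC_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"

# The GSE IDs listed in this README.
GSE_IDS = [
    "GSE63312", "GSE78224", "GSE74018", "GSE50734", "GSE114644",
    "GSE60477", "GSE53599", "GSE80582", "GSE109493", "GSE35200",
]


def geo_url(gse_id: str) -> str:
    """Return the GEO accession page URL for a series ID such as 'GSE63312'."""
    # Reject anything that is not 'GSE' followed by digits.
    if not re.fullmatch(r"GSE\d+", gse_id):
        raise ValueError(f"not a GSE accession: {gse_id!r}")
    return f"{GEO_ACC_URL}?{urlencode({'acc': gse_id})}"


# Map every ID to the page that a scraper would fetch.
urls = {gse_id: geo_url(gse_id) for gse_id in GSE_IDS}

for gse_id, url in urls.items():
    print(gse_id, url)
```

Each URL can then be passed to whichever HTTP client and HTML parser the scraper uses.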

Milestones achieved

  1. Fetched basic information from the NCBI website, using the GSE IDs, for some sequencing projects.
  2. Annotated the summaries from the scraped data using the becas API. (Currently not working because of a server error on their side.)
  3. Created a database and added the fetched data to it (e.g. queries to store and retrieve data).
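
The database and query steps can be sketched as follows. This is a minimal example that assumes SQLite as the database of choice and a two-table layout (`datasets` and `annotations`); the schema, the `category` values, and the function names are assumptions made for illustration, not this repository's actual design. The rows inserted here are made-up placeholders, not real GEO records or becas output.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create one table for scraped datasets and one for annotated keywords."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS datasets (
            gse_id  TEXT PRIMARY KEY,
            title   TEXT,
            summary TEXT
        );
        CREATE TABLE IF NOT EXISTS annotations (
            gse_id   TEXT NOT NULL REFERENCES datasets(gse_id),
            keyword  TEXT NOT NULL,
            category TEXT NOT NULL,  -- assumed labels, e.g. 'disease' or 'gene'
            UNIQUE (gse_id, keyword, category)
        );
        """
    )


def datasets_with_category(conn: sqlite3.Connection, category: str) -> list[str]:
    """Return the sorted IDs of datasets with at least one keyword in `category`."""
    rows = conn.execute(
        "SELECT DISTINCT gse_id FROM annotations WHERE category = ? ORDER BY gse_id",
        (category,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")  # in-memory database, for demonstration only
create_schema(conn)

# Placeholder rows (hypothetical IDs and text).
conn.executemany(
    "INSERT INTO datasets (gse_id, title, summary) VALUES (?, ?, ?)",
    [
        ("GSE_DEMO_A", "demo title A", "demo summary A"),
        ("GSE_DEMO_B", "demo title B", "demo summary B"),
    ],
)
conn.executemany(
    "INSERT INTO annotations (gse_id, keyword, category) VALUES (?, ?, ?)",
    [
        ("GSE_DEMO_A", "demo disease term", "disease"),
        ("GSE_DEMO_B", "demo gene term", "gene"),
    ],
)
conn.commit()

disease_ids = datasets_with_category(conn, "disease")
print(disease_ids)
```

Keeping the keywords in a separate table, one row per keyword, lets the disease query stay a plain `WHERE` filter instead of a text search over the summaries.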