Developing a more modular approach

Proposed design:

ProteoFAV's main features:
1 - Reading/parsing formatted files to pandas DataFrames (e.g. mmCIF, PDB, SIFTS XML, DSSP files)
2 - Downloading data files on the fly (e.g. mmCIF, PDB, SIFTS XML, DSSP files)
3 - Fetching sequence annotations (features) (e.g. variants from Ensembl and UniProt)
4 - Merging all the previous data onto a main DataFrame
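
To make feature 4 concrete, here is a minimal sketch of merging per-residue tables onto one main DataFrame with plain pandas. The tables, column names, and the variant shown are toy placeholders invented for illustration, not ProteoFAV's actual schema or real data.

```python
import pandas as pd

# Toy stand-ins for already-parsed tables; every column name is an assumption.
structure = pd.DataFrame({
    "pdb_chain": ["A", "A", "A"],
    "pdb_resnum": [10, 11, 12],
    "resname": ["MET", "LYS", "GLY"],
})
sifts = pd.DataFrame({
    "pdb_chain": ["A", "A", "A"],
    "pdb_resnum": [10, 11, 12],
    "uniprot_pos": [1, 2, 3],
})
variants = pd.DataFrame({
    "uniprot_pos": [2],
    "variant": ["K2R"],
})

# Left joins keep every structure residue, even those without an annotation.
main = (
    structure
    .merge(sifts, on=["pdb_chain", "pdb_resnum"], how="left")
    .merge(variants, on="uniprot_pos", how="left")
)
print(main)
```

Residues with no matching variant simply get a missing value in the `variant` column, so the main table keeps one row per residue.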

With this in mind, I think it would be great to have a structure like this:

```
proteofav.mmCIF.read()
proteofav.mmCIF.write()
proteofav.mmCIF.download()
proteofav.mmCIF.select()
proteofav.PDB.read()
proteofav.PDB.write()
proteofav.PDB.download()
proteofav.PDB.select()
proteofav.DSSP.read()
proteofav.DSSP.download()
proteofav.DSSP.select()
proteofav.SIFTS.read()
proteofav.SIFTS.download()
proteofav.SIFTS.select()
proteofav.Validation.read()
proteofav.Validation.download()
proteofav.Validation.select()
proteofav.Annotations.read()
proteofav.Annotations.download()
proteofav.Annotations.select()
proteofav.Variants.fetch()
proteofav.Variants.select()
proteofav.Tables.merge()
proteofav.Tables.generate()
```
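
One possible way to get this uniform layout is a shared base class with one subclass per format. The sketch below is only an illustration: `TableFormat` and `ToyFormat` are hypothetical names, and the whitespace-delimited "format" is a stand-in for real mmCIF/PDB/DSSP parsing.

```python
import io
from abc import ABC, abstractmethod

import pandas as pd


class TableFormat(ABC):
    """Hypothetical base class: one subclass per file format."""

    @staticmethod
    @abstractmethod
    def read(handle) -> pd.DataFrame:
        """Parse an open file handle into a DataFrame."""

    @staticmethod
    def select(table: pd.DataFrame, **filters) -> pd.DataFrame:
        """Keep only rows whose columns equal the given values."""
        mask = pd.Series(True, index=table.index)
        for column, value in filters.items():
            mask &= table[column] == value
        return table[mask]


class ToyFormat(TableFormat):
    """Stand-in for a real format: a whitespace-delimited table."""

    @staticmethod
    def read(handle) -> pd.DataFrame:
        return pd.read_csv(handle, sep=r"\s+")

    @staticmethod
    def write(table: pd.DataFrame, handle) -> None:
        table.to_csv(handle, sep=" ", index=False)


text = "chain resnum resname\nA 1 MET\nA 2 LYS\nB 1 GLY\n"
table = ToyFormat.read(io.StringIO(text))
chain_a = ToyFormat.select(table, chain="A")

out = io.StringIO()
ToyFormat.write(chain_a, out)
print(out.getvalue())
```

Because `select` only depends on the DataFrame, it can live in the base class, while `read` and `write` stay format-specific.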
Classes would generally have the following basic methods:

* read - read/parse from file
* write - write output to a file
* download - download data to a file (mmCIF, etc.)
* fetch - retrieve data into memory (a handle) rather than to a file; the response can be cached (JSON, etc.)
* merge - merge any set of DataFrames; for this to work, each DataFrame should be aware of what type of data it contains
* generate - automatically generate a table from an input identifier (i.e. a PDB ID/chain ID or a UniProt ID)



Developing a more modular approach #45
