TODO
hrybacki edited this page Mar 21, 2013 · 5 revisions
- Add logging to detect unknown errors, unknown reference types, etc. -> parsers.py
- Write pretty-print repr for Document
- Resolve given names -> period-delimited initials
- Resolve journal title abbreviations
- Improve document merging / conflict resolution
- Need to know who, when, and from which batch, i.e. user.datetime.query.pointers to all documents/raw data collected.
  - local storage?
  - database?
  - used in conjunction with merges/conflict resolutions
- Consider Bloom filter vs hash for DB queries
- controllers.py -- Store meta-collection data i.e. query used, source obtained from, and timestamp
- Think about an optimistic insert or random ID
- Think about saving all captured information to disk -- json?
- Document controller -- unsure about resolutionToken lifespan
- Determine first date articles added to OAI and modify default DATE_FROM in controllers.py
- Should db.add_or_update() return the objectID of the document inserted into the DB?
- db.py should take a param=DB, where DB is the database the controller should interact with
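
For the "given names -> initials" item above, a minimal sketch of one possible approach. The function name `given_names_to_initials` is hypothetical and does not exist in the codebase. The output format (`J.R.` with no spaces) and the choice to treat hyphenated names as two initials are assumptions, not decided behavior.

```python
import re


def given_names_to_initials(given_names: str) -> str:
    """Collapse given names into period-delimited initials.

    Hypothetical helper: the exact output format is an assumption.
    """
    # Split on whitespace, periods, and hyphens so that inputs such as
    # "J. R." and "Jean-Paul" are treated the same way as full names.
    parts = [p for p in re.split(r"[\s.\-]+", given_names.strip()) if p]
    # Take the first character of each part, uppercase it, and append a period.
    return "".join(p[0].upper() + "." for p in parts)


print(given_names_to_initials("John Ronald Reuel"))
print(given_names_to_initials("J. R."))
```

Names that are already abbreviated pass through in normalized form, so the same helper could be applied to every parsed author before merging documents.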