Import a new catalog
Marco Fossati edited this page Jul 3, 2019 · 1 revision
- Ensure you have the test environment up and running;
- create a model file for the database you want to import in ${PROJECT_ROOT}/soweego/importer/models/;
- call it ${NEW_DATABASE}_entity.py and paste the snippet below. It is enough to replace ${NEW_DATABASE} with your database name. Other variables (marked with a leading $) are optional;
- optional: you can define database-specific columns, see the TODO comment in the snippet. Column names must be unique across classes: no overlaps.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""${NEW_DATABASE} SQL Alchemy ORM model"""

__author__ = '${YOUR_NAME_HERE}'
__email__ = '${YOUR_EMAIL_HERE}'
__version__ = '1.0'
__license__ = 'GPL-3.0'
__copyright__ = 'Copyleft ${YEAR}, ${YOUR_NAME_HERE}'

from sqlalchemy import Column, ForeignKey, String
from sqlalchemy.ext.declarative import declarative_base

from soweego.importer.models.base_entity import BaseEntity
from soweego.importer.models.base_link_entity import BaseLinkEntity

BASE = declarative_base()


class ${NEW_DATABASE}Entity(BaseEntity, BASE):
    __tablename__ = '${NEW_DATABASE}'
    __mapper_args__ = {
        'polymorphic_identity': __tablename__,
        'concrete': True}

    # TODO Optional: define database-specific columns here
    # For instance:
    # birth_place = Column(String(255))


class ${NEW_DATABASE}LinkEntity(BaseLinkEntity, BASE):
    __tablename__ = '${NEW_DATABASE}_link'
    __mapper_args__ = {
        'polymorphic_identity': __tablename__,
        'concrete': True}

    catalog_id = Column(String(32), ForeignKey(${NEW_DATABASE}Entity.catalog_id),
                        index=True)

- create the file ${PROJECT_ROOT}/soweego/importer/${NEW_DATABASE}_dump_downloader.py;
- define a class ${NEW_DATABASE}DumpDownloader(BaseDumpDownloader);
- override BaseDumpDownloader methods:
  - import_from_dump creates ${NEW_DATABASE}Entity and ${NEW_DATABASE}LinkEntity instances for each entity and stores them in the database. See the instructions below;
  - dump_download_url computes and returns the latest dump URL. The override is optional: if you don't implement it, you'll always have to pass the --download-url option when importing your database (see later).
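The two overrides listed above could be sketched roughly as follows. This is only an illustration: the real BaseDumpDownloader lives in soweego and its exact method signatures are not shown on this page, so the stand-in base class, the dump_file_path and today parameters, the MyCatalog names, the tab-separated dump format, and the example.org URL pattern are all assumptions.

```python
from datetime import date
from typing import List, Optional, Tuple


class BaseDumpDownloader:
    """Stand-in for soweego's base class (assumption: the real one
    exposes these two methods; its signatures may differ)."""

    def import_from_dump(self, dump_file_path: str) -> None:
        raise NotImplementedError

    def dump_download_url(self) -> str:
        raise NotImplementedError


class MyCatalogDumpDownloader(BaseDumpDownloader):
    """Hypothetical downloader for a catalog called 'mycatalog'."""

    def __init__(self) -> None:
        # Collects parsed rows, for illustration only
        self.imported: List[Tuple[str, str]] = []

    def import_from_dump(self, dump_file_path: str) -> None:
        # Assumption: the dump holds one 'id<TAB>name' record per line
        with open(dump_file_path, encoding='utf-8') as dump:
            for line in dump:
                catalog_id, name = line.rstrip('\n').split('\t', 1)
                # In soweego you would build entity instances here
                # and add them to a database session instead
                self.imported.append((catalog_id, name))

    def dump_download_url(self, today: Optional[date] = None) -> str:
        # Hypothetical URL pattern: one dump on the first day of each month
        today = today or date.today()
        return f'https://example.org/dumps/mycatalog-{today:%Y%m}01.tsv'
```

Passing the date as an optional argument is just a convenience for testing the URL logic; a real override would likely take no extra arguments.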
Setup:
db_manager = DBManager()
db_manager.drop(${NEW_DATABASE}Entity)
db_manager.create(${NEW_DATABASE}Entity)
Creating a transaction:
session = db_manager.new_session()
Adding an entity to a transaction:
current_entity = ${NEW_DATABASE}Entity()
...
session.add(current_entity)
Committing a transaction:
session.commit()
Keep your sessions as small as possible!
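One possible way to keep sessions small is to commit in fixed-size batches rather than once at the very end. The sketch below only illustrates that pattern and is not taken from soweego: FakeSession is a stand-in that records what a real session from db_manager.new_session() would receive, and import_in_batches and batch_size are hypothetical names.

```python
from typing import Iterable, List


class FakeSession:
    """Stand-in for a database session (assumption: only add() and
    commit() are needed, as in the snippets above)."""

    def __init__(self) -> None:
        self.pending: List[object] = []
        self.committed: List[List[object]] = []

    def add(self, entity: object) -> None:
        self.pending.append(entity)

    def commit(self) -> None:
        # Record the pending entities as one committed batch
        if self.pending:
            self.committed.append(self.pending)
            self.pending = []


def import_in_batches(session, entities: Iterable[object],
                      batch_size: int = 1000) -> int:
    """Add entities to the session, committing every batch_size additions."""
    total = 0
    for entity in entities:
        session.add(entity)
        total += 1
        if total % batch_size == 0:
            session.commit()
    session.commit()  # commit the last, possibly partial, batch
    return total


session = FakeSession()
count = import_in_batches(session, range(5), batch_size=2)
print(count, [len(batch) for batch in session.committed])  # 5 [2, 2, 1]
```

A suitable batch size depends on the dump and the database, so treat the default of 1000 as a placeholder.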
${PROJECT_ROOT}/soweego/importer/importer.py contains the following CLI command:
@click.command()
@click.argument('catalog', type=click.Choice(['discogs', 'musicbrainz']))
@click.option('--download-url', '-du', default=None)
@click.option('--output', '-o', default='output', type=click.Path())
def import_cli(catalog: str, download_url: str, output: str) -> None:
    """Check if there is an updated dump in the output path;
    if not, download the dump"""
    importer = Importer()
    downloader = BaseDumpDownloader()

    if catalog == 'discogs':
        downloader = DiscogsDumpDownloader()
    elif catalog == 'musicbrainz':
        downloader = MusicBrainzDumpDownloader()

    importer.refresh_dump(
        output, download_url, downloader)

Add an elif case for your database and make sure you set the appropriate downloader for your database.
The database name you use in the new elif branch must also be added to the list of choices: @click.argument('catalog', type=click.Choice(['discogs', 'musicbrainz'])).
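After the change, the dispatch logic might look like the sketch below. To stay runnable outside soweego it leaves out the click decorators and uses empty stand-in classes; mycatalog, MyCatalogDumpDownloader, SUPPORTED_CATALOGS, and select_downloader are hypothetical names introduced for illustration.

```python
class BaseDumpDownloader:
    """Stand-in for soweego's base class."""


class DiscogsDumpDownloader(BaseDumpDownloader):
    """Stand-in for the existing Discogs downloader."""


class MusicBrainzDumpDownloader(BaseDumpDownloader):
    """Stand-in for the existing MusicBrainz downloader."""


class MyCatalogDumpDownloader(BaseDumpDownloader):
    """Hypothetical downloader for the new database."""


# The same names must appear in click.Choice([...]) in importer.py
SUPPORTED_CATALOGS = ['discogs', 'musicbrainz', 'mycatalog']


def select_downloader(catalog: str) -> BaseDumpDownloader:
    """Mirror the if/elif chain of import_cli, with one extra branch."""
    downloader = BaseDumpDownloader()
    if catalog == 'discogs':
        downloader = DiscogsDumpDownloader()
    elif catalog == 'musicbrainz':
        downloader = MusicBrainzDumpDownloader()
    elif catalog == 'mycatalog':  # new branch for the new database
        downloader = MyCatalogDumpDownloader()
    return downloader
```

If the name is missing from the choices list, click rejects the argument before the function body runs, so the new branch would never be reached.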
- Ensure you are in test or production mode;
- run
python -m soweego importer import_catalog ${YOUR_DATABASE_NAME}
You have the following options:
  - --output, -o, for setting the output folder in which the dump will be stored;
  - --download-url, -du, for specifying a dump URL to download.