|
3 | 3 | Defining New Filetypes |
4 | 4 | ====================== |
5 | 5 |
|
| 6 | +Implementing support for a new Graphtage filetype entails extending the :class:`graphtage.Filetype` class. Subclassing :class:`graphtage.Filetype` automatically registers the new filetype with Graphtage. |
| 7 | + |
| 8 | +Filetype Matching |
| 9 | +----------------- |
| 10 | + |
| 11 | +Graphtage matches each input file to a :class:`graphtage.Filetype` by MIME type. Each :class:`graphtage.Filetype` registers one or more MIME types for which it is responsible, and the MIME type of an input file is determined using the :mod:`mimetypes` module. Some filetypes have no standardized MIME type, or are not recognized by the :mod:`mimetypes` module. For example, the pickle files handled by Graphtage's :class:`graphtage.pickle.Pickle` filetype have neither a standardized MIME type nor a :mod:`mimetypes` mapping. You can add support for such a filetype as follows: |
| 12 | + |
| 13 | +.. code-block:: python |
| 14 | +
|
| 15 | +    import mimetypes
| 16 | +
|
| 17 | +    if '.pkl' not in mimetypes.types_map and '.pickle' not in mimetypes.types_map:
| 18 | +        mimetypes.add_type('application/x-python-pickle', '.pkl')
| 19 | +        mimetypes.suffix_map['.pickle'] = '.pkl'  # treat .pickle as an alias of .pkl
| 20 | +
|
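As a quick check of the registration pattern above, the following sketch applies the same two mappings and then classifies two filenames. It assumes a private `mimetypes.MimeTypes` instance, chosen here so the example does not modify the process-wide database. The filenames are hypothetical.

```python
import mimetypes

# A private database, so this example has no global side effects.
db = mimetypes.MimeTypes()

# Map the .pkl extension to the (non-standard) pickle MIME type.
db.add_type('application/x-python-pickle', '.pkl')
# Treat .pickle as an alias for .pkl.
db.suffix_map['.pickle'] = '.pkl'

# Hypothetical filenames; guess_type() returns a (type, encoding) tuple.
pkl_type, _ = db.guess_type('model.pkl')
pickle_type, _ = db.guess_type('model.pickle')
print(pkl_type, pickle_type)
```

The module-level `mimetypes.add_type` and `mimetypes.suffix_map` used in the diff act on a shared global database instead, which is what makes the mapping visible to other code in the same process.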
| 21 | +Implementing a New Filetype |
| 22 | +--------------------------- |
| 23 | + |
| 24 | +With the MIME type registered, here is a sketch of how one might define the Pickle filetype: |
| 25 | + |
| 26 | +.. code-block:: python |
| 27 | +
|
| 28 | +    import os
| 29 | +    from typing import Optional, Union
| 30 | +
|
| 31 | +    from graphtage import BuildOptions, Filetype, GraphtageFormatter, TreeNode
| 32 | +
|
| 33 | +    class PickleDecodeError(RuntimeError):
| 34 | +        """Placeholder error type for this sketch; raise it from build_tree() on malformed input."""
| 35 | +
|
| 36 | +    class Pickle(Filetype):
| 37 | +        def __init__(self):
| 38 | +            super().__init__(
| 39 | +                "pickle",                       # a unique identifier
| 40 | +                "application/python-pickle",    # the primary MIME type
| 41 | +                "application/x-python-pickle"   # an optional secondary MIME type
| 42 | +            )
| 43 | +
|
| 44 | +        def build_tree(self, path: str, options: Optional[BuildOptions] = None) -> TreeNode:
| 45 | +            # return the root node of the tree built from the given pickle file
| 46 | +            raise NotImplementedError()
| 47 | +
|
| 48 | +        def build_tree_handling_errors(self, path: str, options: Optional[BuildOptions] = None) -> Union[str, TreeNode]:
| 49 | +            # the same as the build_tree() function,
| 50 | +            # but on error return a string containing the error message
| 51 | +            #
| 52 | +            # for example:
| 53 | +            try:
| 54 | +                return self.build_tree(path=path, options=options)
| 55 | +            except PickleDecodeError as e:
| 56 | +                return f"Error deserializing {os.path.basename(path)}: {e!s}"
| 57 | +
|
| 58 | +        def get_default_formatter(self) -> GraphtageFormatter:
| 59 | +            # return the formatter associated with this file type
| 60 | +            raise NotImplementedError()
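The error-handling contract of `build_tree_handling_errors()` (return the result on success, or a message string on failure) can be tried out without Graphtage installed. The sketch below is a stand-in built only on the standard library, not Graphtage's implementation. The function names `load_pickle` and `load_pickle_handling_errors` are hypothetical, and it assumes `pickle.UnpicklingError` and `EOFError` in place of the `PickleDecodeError` named in the diff.

```python
import os
import pickle
from typing import Any, Union


def load_pickle(path: str) -> Any:
    """Hypothetical stand-in for build_tree(): load the object stored at `path`.

    pickle.load() can execute arbitrary code, so only use it on trusted files.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def load_pickle_handling_errors(path: str) -> Union[str, Any]:
    """Same as load_pickle(), but return an error message string on failure."""
    try:
        return load_pickle(path)
    except (pickle.UnpicklingError, EOFError) as e:
        # Mirror the message format used in the diff above.
        return f"Error deserializing {os.path.basename(path)}: {e!s}"
```

A caller can distinguish the two outcomes with `isinstance(result, str)`, which is presumably how the `Union[str, TreeNode]` return type in the diff is meant to be consumed.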