Getting Started

Nithin Krishna edited this page Aug 10, 2016 · 9 revisions

Polar Deep Insights contains two major components:

  • Insight Generator
  • Insight Visualizer

## Insight Generator

The insight generator is a Python library that provides an interface for extracting entities, locations, file metadata, and measurements from documents.

Given a file path as an argument, the main.py script recurses down the directory tree, extracts the metadata listed above from each file, and saves it to an Elasticsearch index.
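The traversal step can be pictured with a short sketch. This is not the library's code: `iterate_and_perform` and `collect_paths` are hypothetical names used only for illustration, and the sketch assumes the traversal amounts to visiting every regular file under a base directory and calling a function on its path.

```python
import os


def iterate_and_perform(base_path, action):
    """Hypothetical stand-in for the traversal: call action(path) once per file."""
    for dirpath, _dirnames, filenames in os.walk(base_path):
        for name in sorted(filenames):
            action(os.path.join(dirpath, name))


def collect_paths(base_path):
    """Example action: gather every visited file path into a sorted list."""
    found = []
    iterate_and_perform(base_path, found.append)
    return sorted(found)
```

In the real script, the action passed in would run the extraction on each path instead of just recording it.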

The main.py script is extensible: users can supply a custom implementation to handle the extracted metadata, for example:

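def customProcessor(metadata):
  # Do something with the extracted metadata
  pass

def process(PATH):
  # Extract the metadata for one file and pass it to the custom handler
  md = InformationExtractor(PATH).extract()
  customProcessor(md)

DirTreeTraverser(BASE_PATH).iterateAndPerform(process)  # run process on every file under BASE_PATH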
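
As one possible custom handler, the sketch below appends each metadata record to a JSON Lines file instead of an index. It rests on an assumption the page does not confirm: that the extracted metadata is a JSON-serializable dict. `make_jsonl_processor` is a hypothetical helper, not part of the library.

```python
import json


def make_jsonl_processor(out_path):
    """Build a handler that appends each metadata record to out_path, one JSON object per line."""
    def custom_processor(metadata):
        # Assumption: metadata is a JSON-serializable dict
        with open(out_path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(metadata, sort_keys=True) + "\n")
    return custom_processor
```

The returned function takes a single metadata argument, so it could be used in place of `customProcessor` in the snippet above.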