This repository was archived by the owner on Nov 8, 2022. It is now read-only.

Commit c07fc0a

Merge branch 'master' into set_expansion_PR
2 parents: 72ac75e + 0a3e3a2


49 files changed: +22527 / -565 lines

.gitignore

Lines changed: 3 additions & 1 deletion
@@ -18,8 +18,9 @@ generated
 *.hdf5
 *.h5
 *.html
-!server/web_service/visualizer/displacy/*.html
 !solutions/set_expansion/ui/templates/*.html
+.vscode
+!server/web_service/static/*.html
 !tests/fixtures/data/server/*.gz
 *.log
 .idea/
@@ -31,3 +32,4 @@ pylint.txt
 flake8.txt
 nlp_architect/pipelines/bist-pretrained/*
 venv
+nlp_architect/api/ner-pretrained/*

Makefile

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ test_prepare: test_requirements.txt $(ACTIVATE)

 test: test_prepare $(ACTIVATE) dev
 	@. $(ACTIVATE); spacy download en
+	@. $(ACTIVATE); python -c 'from nlp_architect.api.ner_api import NerApi; NerApi(prompt=False)'
 	@. $(ACTIVATE); py.test -rs -vv tests

 flake: test_prepare
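The new recipe line pre-fetches the pre-trained NER model before the test run; `prompt=False` skips the interactive license prompt so CI does not block on input. The equivalent call from a Python session:

# Non-interactive download of the pre-trained NER model (the same call the
# Makefile one-liner above makes).
from nlp_architect.api.ner_api import NerApi

NerApi(prompt=False)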

doc/source/api.rst

Lines changed: 15 additions & 0 deletions
@@ -89,6 +89,8 @@ these will be placed into a central repository.
    nlp_architect.data.babi_dialog.BABI_Dialog
    nlp_architect.data.wikimovies.WIKIMOVIES

+
+
 ``nlp_architect.pipelines``
 ---------------------------
 .. py:module:: nlp_architect.pipelines
@@ -103,3 +105,16 @@ NLP pipelines modules using models implemented from ``nlp_architect.models``.
    nlp_architect.pipelines.spacy_np_annotator.NPAnnotator
    nlp_architect.pipelines.spacy_np_annotator.SpacyNPAnnotator

+
+
+``nlp_architect.server``
+------------------------
+.. py:module:: server
+
+.. autosummary::
+   :toctree: generated/
+   :nosignatures:
+
+   server.serve
+   server.service
+
doc/source/assets/bist_service.png

Mode changed 100755 → 100644; binary image modified (-258 KB).

doc/source/assets/ner_service.png

Binary image modified: 77.1 KB → 43.1 KB.

doc/source/service.rst

Lines changed: 42 additions & 31 deletions
@@ -28,30 +28,37 @@ Running NLP Architect server
 ============================
 Some of the components for which we provide pre-trained models are exposed through this server. In order to run the server, a user needs to specify which service to run, so the NLP Architect server will only load the needed model.

-Currently we provide 2 services:
+Currently we provide 3 services:

 1. `bist` service, which provides BIST dependency parsing
 2. `spacy_ner` service, which provides Spacy NER annotations.
+3. `ner` service, which provides NER annotations without Spacy.

-To run the server, simply run `serve.py` with the Parameter `--name` as the name of the service you wish to serve.
-Once the model is loaded, the server will run on `http://localhost:8080/{service_name}`.
+The server code is split into two pieces:

-If you wish to use the server's visualization - enter `http://localhost:8080/{service_name}/demo.html`
+1. :py:class:`Service <server.service>`, which is a representation of each model's API
+2. :py:mod:`Server <server.serve>`, which handles the processing of HTTP requests
+
+To run the server, simply run ``hug -p 8080 -f server/serve.py`` from the root directory; the server will run on `http://localhost:8080`.
+
+If you wish to use the server's visualization, go to `http://localhost:8080`.

 Otherwise, the expected request to the server is the following:

 .. code:: json

-    {"docs":
-        [
-            {"id": 1,
-             "doc": "Time flies like an arrow. fruit flies like a banana."},
-            {"id": 2,
-             "doc": "the horse passed the barn fell"},
-            {"id": 3,
-             "doc": "the old man the boat"}
-        ]
-    }
+    {
+        "model_name": "ner" | "spacy_ner" | "bist",
+        "docs":
+        [
+            {"id": 1,
+             "doc": "Time flies like an arrow. fruit flies like a banana."},
+            {"id": 2,
+             "doc": "the horse passed the barn fell"},
+            {"id": 3,
+             "doc": "the old man the boat"}
+        ]
+    }

 Request Headers
 ---------------
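As a quick sanity check, a minimal Python client for this request format might look like the sketch below (assuming the server is running locally on port 8080 and the `requests` package is installed):

import requests

# Pick one of the three supported services: "ner", "spacy_ner" or "bist".
payload = {
    "model_name": "ner",
    "docs": [
        {"id": 1, "doc": "Time flies like an arrow. fruit flies like a banana."},
    ],
}

# "Response-Format: json" asks the server for a plain JSON response.
headers = {"Response-Format": "json", "Content-Type": "application/json"}

response = requests.post("http://localhost:8080/inference",
                         json=payload, headers=headers)
print(response.json())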
@@ -64,49 +71,53 @@ The server supports 2 types of Responses (see `Annotation Structure Types - Server Responses`)

 Examples for running NLP Architect server
 =========================================
-We currently support only 2 services:
+We currently support 3 services:

 - BIST parser - Core NLP models annotation structure

-.. code:: python
-
-    python server/serve.py --name bist
-
-Once the server is up and running you can go to `http://localhost:8080/bist/demo.html`
+Once the server is up and running you can go to `http://localhost:8080`
 and check out a few test sentences, or you can send a POST request (as described above)
-to `http://localhost:8080/bist`, and receive a `CoreNLPDoc` annotation structure response.
+to `http://localhost:8080/inference`, and receive a `CoreNLPDoc` annotation structure response.

 .. image :: assets/bist_service.png

-- Spacy NER - High-level models annotation structure
-
-.. code:: python
-
-    python server/serve.py --name spacy_ner
+- Spacy NER, NER - High-level models annotation structure

-Once the server is up and running you can go to `http://localhost:8080/spacy_ner/demo.html`
+Once the server is up and running you can go to `http://localhost:8080`
 and check out a few test sentences, or you can send a POST request (as described above)
-to `http://localhost:8080/spacy_ner`, and receive a `HighLevelDoc` annotation structure response.
+to `http://localhost:8080/inference`, and receive a `HighLevelDoc` annotation structure response.
+
+Spacy NER:

 .. image :: assets/spacy_ner_service.png

+NER:
+
+.. image :: assets/ner_service.png
+
 You can also take a look at the tests (tests/nlp_architect_server) to see more examples.

 Example CURL request
 --------------------

+Running the `ner` model:
+
+.. code:: bash
+
+    curl -i -H "Response-Format:json" -H "Content-Type:application/json" -d '{"model_name": "ner", "docs": [{"id": 1,"doc": "Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, in the Silicon Valley."}]}' http://{localhost_ip}:8080/inference
+
 Running the `spacy_ner` model:

 .. code:: bash

-    curl -i -H "Response-Format:json" -H "Content-Type:application/json" -d '{"docs": [{"id": 1,"doc": "Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, in the Silicon Valley."}]}' http://{localhost_ip}:8080/spacy_ner
+    curl -i -H "Response-Format:json" -H "Content-Type:application/json" -d '{"model_name": "spacy_ner", "docs": [{"id": 1,"doc": "Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, in the Silicon Valley."}]}' http://{localhost_ip}:8080/inference

 Running the `bist` model:

 .. code:: bash

-    curl -i -H "Response-Format:json" -H "Content-Type:application/json" -d '{"docs":[{"id": 1,"doc": "Time flies like an arrow. fruit flies like a banana."},{"id": 2,"doc": "the horse passed the barn fell"},{"id": 3,"doc": "the old man the boat"}]}' http://10.13.133.120:8080/bist
+    curl -i -H "Response-Format:json" -H "Content-Type:application/json" -d '{"model_name": "bist", "docs":[{"id": 1,"doc": "Time flies like an arrow. fruit flies like a banana."},{"id": 2,"doc": "the horse passed the barn fell"},{"id": 3,"doc": "the old man the boat"}]}' http://{localhost_ip}:8080/inference

 Annotation Structure Types - Server Responses
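For the `ner` and `spacy_ner` services the response is a High-level annotation structure. As an illustration (not actual server output), judging from `NerApi.pretty_print` in the new `nlp_architect/api/ner_api.py` below, a `high_level` response body has roughly this shape:

# Hypothetical example of a "high_level" response body, inferred from
# NerApi.pretty_print (nlp_architect/api/ner_api.py, below); field values
# are illustrative only.
example_response = {
    "doc": {
        "doc_text": "Intel Corporation is headquartered in Santa Clara",
        "annotation_set": ["org"],                   # unique entity types, lower-cased
        "spans": [
            {"start": 0, "end": 17, "type": "ORG"},  # character offsets into doc_text
        ],
        "title": "None",
    },
    "type": "high_level",
}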
@@ -173,7 +184,7 @@ In order to add a new service to the server you need to go over 3 steps:

 1. Choose the type of your service: Core NLP models or High-level models

-2. Create API for your service. Create the file under `nlp_architect/api/abstract_api` folder. Make sure your class inherits from `AbstractApi` (`from nlp_architect.api.abstract_api import AbstractApi`) and implements all its methods. Notice that your `inference` class_method must return either "CoreNLPDoc" or "HighLevelDoc".
+2. Create an API for your service. Create the file under the `nlp_architect/api` folder. Make sure your class inherits from :py:class:`AbstractApi <nlp_architect.api.abstract_api>` and implements all its methods. Notice that your `inference` class method must return either "CoreNLPDoc" or "HighLevelDoc".

 3. Add new service to `services.json` in the following template:
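As a sketch of step 2, a hypothetical service API could look like the skeleton below (the class name and annotation logic are placeholders; the exact set of methods to implement is defined by `AbstractApi` — compare the new `NerApi` in `nlp_architect/api/ner_api.py` below):

from nlp_architect.api.abstract_api import AbstractApi


class MyServiceApi(AbstractApi):
    """Hypothetical skeleton for a new High-level service API."""

    def __init__(self, prompt=True):
        self.model = None  # placeholder: fetch/locate model artifacts here

    def load_model(self):
        # Build the network and load pre-trained weights, as NerApi does.
        pass

    def inference(self, doc):
        # Must return either a "CoreNLPDoc" or (as here) a "HighLevelDoc"-style dict.
        return {
            "doc": {
                "doc_text": doc,
                "annotation_set": [],
                "spans": [],
                "title": "None",
            },
            "type": "high_level",
        }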

licenses/ipdb-license.txt

Lines changed: 35 additions & 0 deletions
https://github.com/gotcha/ipdb/blob/master/COPYING.txt

Copyright (c) 2007-2016 ipdb development team
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

Neither the name of the ipdb Development Team nor the names of its
contributors may be used to endorse or promote products derived from this
software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

nlp_architect/api/ner_api.py

Lines changed: 153 additions & 0 deletions
# ******************************************************************************
# Copyright 2017-2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ******************************************************************************
import pickle
import sys
from os import path, makedirs

import numpy as np
from keras.preprocessing.sequence import pad_sequences

from nlp_architect.api.abstract_api import AbstractApi
from nlp_architect.models.ner_crf import NERCRF
from nlp_architect.utils.io import download_unlicensed_file
from nlp_architect.utils.text import SpacyInstance

nlp = SpacyInstance(disable=['tagger', 'ner', 'parser', 'vectors', 'textcat'])


class NerApi(AbstractApi):
    """
    NER model API
    """
    dir = path.dirname(path.realpath(__file__))
    pretrained_model = path.join(dir, 'ner-pretrained', 'model.h5')
    pretrained_model_info = path.join(dir, 'ner-pretrained', 'model_info.dat')

    def __init__(self, ner_model=None, prompt=True):
        self.model = None
        self.model_info = None
        self.model_path = NerApi.pretrained_model
        self.model_info_path = NerApi.pretrained_model_info
        self._download_pretrained_model(prompt)

    def encode_word(self, word):
        # Unknown words map to index 1 (OOV).
        return self.model_info['word_vocab'].get(word, 1.0)

    def encode_word_chars(self, word):
        return [self.model_info['char_vocab'].get(c, 1.0) for c in word]

    def encode_input(self, text_arr):
        sentence = []
        sentence_chars = []
        for word in text_arr:
            sentence.append(self.encode_word(word))
            sentence_chars.append(self.encode_word_chars(word))
        encoded_sentence = pad_sequences(
            [np.asarray(sentence)], maxlen=self.model_info['sentence_len'])
        chars_padded = pad_sequences(
            sentence_chars, maxlen=self.model_info['word_len'])
        if self.model_info['sentence_len'] - chars_padded.shape[0] > 0:
            # Left-pad the character matrix with all-zero rows up to sentence_len.
            chars_padded = np.concatenate((np.zeros(
                (self.model_info['sentence_len'] - chars_padded.shape[0],
                 self.model_info['word_len'])), chars_padded))
        encoded_chars = chars_padded.reshape(1, self.model_info['sentence_len'],
                                             self.model_info['word_len'])
        return encoded_sentence, encoded_chars

    def _prompt(self):
        response = input('\nTo download \'{}\', please enter YES: '.format('ner'))
        res = response.lower().strip()
        if res == "yes" or (len(res) == 1 and res == 'y'):
            print('Downloading {}...'.format('ner'))
            responded_yes = True
        else:
            print('Download declined. Response received {} != YES|Y. '.format(res))
            responded_yes = False
        return responded_yes

    def _download_pretrained_model(self, prompt=True):
        """Downloads the pre-trained NER model if non-existent."""
        dir_path = path.join(self.dir, 'ner-pretrained')
        if not path.isfile(path.join(dir_path, 'model.h5')):
            print('The pre-trained models to be downloaded for the NER dataset '
                  'are licensed under Apache 2.0. By downloading, you accept the terms '
                  'and conditions provided by the license')
            makedirs(dir_path, exist_ok=True)
            if prompt is True:
                agreed = self._prompt()
                if agreed is False:
                    sys.exit(0)
            download_unlicensed_file('http://nervana-modelzoo.s3.amazonaws.com/NLP/NER/',
                                     'model.h5', self.model_path)
            download_unlicensed_file('http://nervana-modelzoo.s3.amazonaws.com/NLP/NER/',
                                     'model_info.dat', self.model_info_path)
            print('Done.')

    def load_model(self):
        with open(self.model_info_path, 'rb') as fp:
            self.model_info = pickle.load(fp)
        self.model = NERCRF()
        self.model.build(
            self.model_info['sentence_len'],
            self.model_info['word_len'],
            self.model_info['num_of_labels'],
            self.model_info['word_vocab'],
            self.model_info['vocab_size'],
            self.model_info['char_vocab_size'],
            word_embedding_dims=self.model_info['word_embedding_dims'],
            char_embedding_dims=self.model_info['char_embedding_dims'],
            word_lstm_dims=self.model_info['word_lstm_dims'],
            tagger_lstm_dims=self.model_info['tagger_lstm_dims'],
            dropout=self.model_info['dropout'],
            external_embedding_model=self.model_info['external_embedding_model'])
        self.model.load(self.model_path)

    def pretty_print(self, text, tags):
        # Map predicted tag ids back to label strings, keeping only the last
        # len(text) positions (the rest of the sequence is padding).
        tags_str = [self.model_info['labels_id_to_word']
                    .get(t, None) for t in tags[0]][-len(text):]
        mapped = [
            {'index': idx, 'word': el, 'label': tags_str[idx]} for idx, el in enumerate(text)
        ]
        counter = 0
        spans = []
        for obj in mapped:
            if obj['label'] != 'O':
                spans.append({
                    'start': counter,
                    'end': counter + len(obj['word']),
                    'type': obj['label']
                })
            counter += len(obj['word']) + 1
        # Unique entity types, lower-cased.
        ents = dict((obj['type'].lower(), obj) for obj in spans).keys()
        ret = {}
        ret['doc_text'] = ' '.join(text)
        ret['annotation_set'] = list(ents)
        ret['spans'] = spans
        ret['title'] = 'None'
        return {'doc': ret, 'type': 'high_level'}

    def process_text(self, text):
        input_text = ' '.join(text.strip().split())
        return nlp.tokenize(input_text)

    def inference(self, doc):
        text_arr = self.process_text(doc)
        words, chars = self.encode_input(text_arr)
        tags = self.model.predict([words, chars])
        tags = tags.argmax(2)
        return self.pretty_print(text_arr, tags)
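For reference, a minimal end-to-end use of this API might look like the sketch below (it assumes the pre-trained model is present locally or can be downloaded):

from nlp_architect.api.ner_api import NerApi

api = NerApi(prompt=False)  # downloads the pre-trained model on first use
api.load_model()            # builds the NERCRF network and loads the weights

result = api.inference('Intel Corporation is headquartered in Santa Clara.')
print(result['type'])          # 'high_level'
print(result['doc']['spans'])  # entity spans with start/end offsets and label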
