Commit cbff0bb

docs: python pipelines (#1613)
1 parent 5556550 commit cbff0bb

4 files changed: +146 -1 lines changed

README.md

Lines changed: 3 additions & 0 deletions
@@ -181,6 +181,9 @@ from deeppavlov import evaluate_model
 model = evaluate_model(<config_path>, install=True, download=True)
 ```
 
+DeepPavlov also [allows](https://docs.deeppavlov.ai/en/master/features/python.html) to build a model from components for
+inference using Python.
+
 ## License
 
 DeepPavlov is Apache 2.0 - licensed.

deeppavlov/_meta.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-__version__ = '1.0.1'
+__version__ = '1.0.2'
 __author__ = 'Neural Networks and Deep Learning lab, MIPT'
 __description__ = 'An open source library for building end-to-end dialog systems and training chatbots.'
 __keywords__ = ['NLP', 'NER', 'SQUAD', 'Intents', 'Chatbot']

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ Welcome to DeepPavlov's documentation!
 QuickStart <intro/quick_start>
 General concepts <intro/overview>
 Configuration file <intro/configuration>
+Python pipelines <intro/python.ipynb>
 Models overview <features/overview>
 
 

docs/intro/python.ipynb

Lines changed: 141 additions & 0 deletions
@@ -0,0 +1,141 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "6d5cd16b",
+   "metadata": {},
+   "source": [
+    "#### Python pipelines"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "da10fd80",
+   "metadata": {},
+   "source": [
+    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/deeppavlov/DeepPavlov/blob/master/docs/intro/python.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d55ebe35",
+   "metadata": {},
+   "source": [
+    "Python models could be used without .json configuration files.\n",
+    "\n",
+    "The code below is an alternative to building [insults_kaggle_bert](https://github.com/deepmipt/DeepPavlov/blob/master/deeppavlov/configs/classifiers/insults_kaggle_bert.json) model and using it with\n",
+    "\n",
+    "```python\n",
+    "from deeppavlov import build_model\n",
+    "\n",
+    "model = build_model('insults_kaggle_bert', download=True)\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fa1db63b",
+   "metadata": {},
+   "source": [
+    "At first, define variables for model components and download model data."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9d6671e2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from deeppavlov.core.commands.utils import expand_path\n",
+    "from deeppavlov.download import download_resource\n",
+    "\n",
+    "\n",
+    "classifiers_path = expand_path('~/.deeppavlov/models/classifiers')\n",
+    "model_path = classifiers_path / 'insults_kaggle_torch_bert'\n",
+    "transformer_name = 'bert-base-uncased'\n",
+    "\n",
+    "download_resource(\n",
+    "    'http://files.deeppavlov.ai/deeppavlov_data/classifiers/insults_kaggle_torch_bert_v5.tar.gz',\n",
+    "    {classifiers_path}\n",
+    ")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "332d644e",
+   "metadata": {},
+   "source": [
+    "Then, initialize model components."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "809c31ad",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from deeppavlov.core.data.simple_vocab import SimpleVocabulary\n",
+    "from deeppavlov.models.classifiers.proba2labels import Proba2Labels\n",
+    "from deeppavlov.models.preprocessors.torch_transformers_preprocessor import TorchTransformersPreprocessor\n",
+    "from deeppavlov.models.torch_bert.torch_transformers_classifier import TorchTransformersClassifierModel\n",
+    "\n",
+    "\n",
+    "preprocessor = TorchTransformersPreprocessor(\n",
+    "    vocab_file=transformer_name,\n",
+    "    max_seq_length=64\n",
+    ")\n",
+    "\n",
+    "classes_vocab = SimpleVocabulary(\n",
+    "    load_path=model_path/'classes.dict',\n",
+    "    save_path=model_path/'classes.dict'\n",
+    ")\n",
+    "\n",
+    "classifier = TorchTransformersClassifierModel(\n",
+    "    n_classes=classes_vocab.len,\n",
+    "    return_probas=True,\n",
+    "    pretrained_bert=transformer_name,\n",
+    "    save_path=model_path/'model',\n",
+    "    optimizer_parameters={'lr': 1e-05}\n",
+    ")\n",
+    "\n",
+    "proba2labels = Proba2Labels(max_proba=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "87e8ec20",
+   "metadata": {},
+   "source": [
+    "Finally, create model from components. ``Element`` is a wrapper for a component. ``Element`` receives the component and the names of the incoming and outgoing arguments. ``Model`` combines ``Element``s into pipeline."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "acfe29de",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from deeppavlov import Element, Model\n",
+    "\n",
+    "model = Model(\n",
+    "    x=['x'],\n",
+    "    out=['y_pred_labels'],\n",
+    "    pipe=[\n",
+    "        Element(component=preprocessor, x=['x'], out=['bert_features']),\n",
+    "        Element(component=classifier, x=['bert_features'], out=['y_pred_probas']),\n",
+    "        Element(component=proba2labels, x=['y_pred_probas'], out=['y_pred_ids']),\n",
+    "        Element(component=classes_vocab, x=['y_pred_ids'], out=['y_pred_labels'])\n",
+    "    ]\n",
+    ")\n",
+    "\n",
+    "model(['you are stupid', 'you are smart'])"
+   ]
+  }
+ ],
+ "metadata": {},
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
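
Note on the final notebook cell: each `Element` names the arguments it reads (`x`) and writes (`out`), and `Model` chains the elements through those names. The sketch below illustrates only that name-based wiring in plain Python. It is not DeepPavlov's implementation: `ToyElement`, `ToyModel`, and the four toy functions are hypothetical stand-ins invented for this example, and the real classes may behave differently in detail.

```python
from typing import Callable, Dict, List, Sequence

LABELS = ['short', 'long']  # hypothetical class names for the toy pipeline


class ToyElement:
    """Hypothetical stand-in: a callable plus the names it reads and writes."""

    def __init__(self, component: Callable, x: Sequence[str], out: Sequence[str]) -> None:
        self.component = component
        self.x = list(x)
        self.out = list(out)


class ToyModel:
    """Hypothetical stand-in: runs elements in order over a shared name -> value store."""

    def __init__(self, x: Sequence[str], out: Sequence[str], pipe: List[ToyElement]) -> None:
        self.x = list(x)
        self.out = list(out)
        self.pipe = pipe

    def __call__(self, *args):
        if len(args) != len(self.x):
            raise ValueError(f'expected {len(self.x)} input(s), got {len(args)}')
        # Bind positional inputs to the declared input names.
        memory: Dict[str, object] = dict(zip(self.x, args))
        for element in self.pipe:
            # Look up each input by name, call the component, store outputs by name.
            result = element.component(*(memory[name] for name in element.x))
            if len(element.out) == 1:
                memory[element.out[0]] = result
            else:
                memory.update(zip(element.out, result))
        values = [memory[name] for name in self.out]
        return values[0] if len(values) == 1 else values


# Toy components playing the roles of preprocessor, classifier, argmax step and vocabulary.
def tokenize(texts):
    return [text.split() for text in texts]


def score(token_lists):
    return [[1.0, 0.0] if len(tokens) <= 2 else [0.0, 1.0] for tokens in token_lists]


def argmax(probas):
    return [max(range(len(p)), key=p.__getitem__) for p in probas]


def to_label(ids):
    return [LABELS[i] for i in ids]


toy_model = ToyModel(
    x=['x'],
    out=['y_pred_labels'],
    pipe=[
        ToyElement(component=tokenize, x=['x'], out=['features']),
        ToyElement(component=score, x=['features'], out=['y_pred_probas']),
        ToyElement(component=argmax, x=['y_pred_probas'], out=['y_pred_ids']),
        ToyElement(component=to_label, x=['y_pred_ids'], out=['y_pred_labels']),
    ],
)

print(toy_model(['hello there', 'this is a longer sentence']))
```

Because every element reads and writes by name, a step can be swapped or reordered without changing the others, as long as the names it consumes have already been produced earlier in the pipe.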
