README.md: 25 additions & 17 deletions
@@ -58,20 +58,19 @@ Doccano can be deployed to AWS ([Cloudformation](https://docs.aws.amazon.com/AWS

 > Notice: (1) An EC2 KeyPair cannot be created automatically, so make sure you have an existing EC2 KeyPair in the region you deploy to, or [create one yourself](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair). (2) If you want to access doccano via HTTPS in AWS, see [these instructions](https://github.com/chakki-works/doccano/wiki/HTTPS-setting-for-doccano-in-AWS).

-
 ## Features

-* Collaborative annotation
-* Multi-Language support
-* Emoji :smile: support
-* (future) Auto labeling
+- Collaborative annotation
+- Multi-Language support
+- Emoji :smile: support
+- (future) Auto labeling

 ## Requirements

-* Python 3.6+
-* Django 2.1.7+
-* Node.js 8.0+
-* Google Chrome(highly recommended)
+- Python 3.6+
+- Django 2.1.7+
+- Node.js 8.0+
+- Google Chrome(highly recommended)

 ## Installation

@@ -162,7 +161,9 @@ Finally, to start the server, run the following command:
 ```bash
 python manage.py runserver
 ```
+
 Optionally, you can change the bind IP and port with the following command:
+
 ```bash
 python manage.py runserver <ip>:<port>
 ```
@@ -197,28 +198,34 @@ After creating a project, you will see the "Import Data" page, or click `Import
-- `CSV file`: file must contain a header with a `text` column or be one-column csv file.
-- `JSON file`: each line contains a JSON object with a `text` key. JSON format supports line breaks rendering.
+You can upload the following types of files (depending on project type):
+
+- `Text file`: file must contain one sentence/document per line separated by new lines.
+- `CSV file`: file must contain a header with `"text"` as the first column or be a one-column CSV file. If labels are used, the second column must contain the labels.
+- `Excel file`: file must contain a header with `"text"` as the first column or be a one-column Excel file. If labels are used, the second column must contain the labels. Multiple sheets are supported as long as they share the same format.
+- `JSON file`: each line contains a JSON object with a `text` key. JSON format supports line breaks rendering.

 > Notice: Doccano won't render line breaks on the annotation page for sequence labeling tasks due to an indentation problem, but the exported JSON file still contains line breaks.

-`example.txt` (or `example.csv`)
-```python
+`example.txt/csv/xlsx`
+
+```txt
 EU rejects German call to boycott British lamb.
 President Obama is speaking at the White House.
 He lives in Newark, Ohio.
 ...
 ```
+
 `example.json`
+
 ```JSON
 {"text": "EU rejects German call to boycott British lamb."}
 {"text": "President Obama is speaking at the White House."}
 {"text": "He lives in Newark, Ohio."}
 ...
 ```

-Any other columns (for csv) or keys (for json) are preserved and will be exported in the `metadata` column or key as is.
+Any other columns (for csv/excel) or keys (for json) are preserved and will be exported in the `metadata` column or key as is.

 Once you select a TXT/JSON file on your computer, click the `Upload dataset` button. After the dataset file is uploaded, you will see the `Dataset` page (or click the `Dataset` button in the left bar). This page displays all the documents uploaded to the project.

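The import formats described in this hunk can be produced with the Python standard library alone. The following is a minimal sketch, not part of doccano: the helper names `to_jsonl` and `to_csv` and the `label` header name are assumptions for illustration. Only the `text` key, the `text` first column, and the labels-in-second-column layout come from the README text.

```python
import csv
import io
import json


def to_jsonl(records):
    """Serialize dicts as line-delimited JSON: one object with a "text" key per line."""
    lines = []
    for record in records:
        if "text" not in record:
            raise ValueError('every record needs a "text" key')
        # ensure_ascii=False keeps non-ASCII text (for example emoji) readable.
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


def to_csv(rows):
    """Serialize (text, label) pairs as CSV with "text" as the first column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # "label" is an assumed header name; the README only fixes the first column.
    writer.writerow(["text", "label"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print both serializations instead of touching the file system.
    print(to_jsonl([{"text": "He lives in Newark, Ohio."}]), end="")
    print(to_csv([("He lives in Newark, Ohio.", "location")]), end="")
```

Writing the returned strings to `example.json` and `example.csv` with `open(..., "w", encoding="utf-8")` would give files in the shape shown above. The `csv` module quotes any text that contains commas, so sentences do not need to be escaped by hand.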
@@ -228,7 +235,6 @@ Click `Labels` button in left bar to define your own labels. You should see the
 Now you are ready to annotate the texts. Click the `Annotate Data` button in the navigation bar to start annotating the documents you uploaded.
@@ -249,11 +255,14 @@ by adding `external_id` to the imported file. For example:

 The input file may look like this:
 `import.json`
+
 ```JSON
 {"text": "EU rejects German call to boycott British lamb.", "meta": {"external_id": 1}}
 ```
+
 and the exported file will look like this:
 `output.json`
+
 ```JSON
 {"doc_id": 2023, "text": "EU rejects German call to boycott British lamb.", "labels": ["news"], "username": "root", "meta": {"external_id": 1}}
 ```
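Because `external_id` is carried through to the export unchanged, exported annotations can be matched back to records kept in another system. The following is a minimal sketch, assuming the export is line-delimited JSON shaped like `output.json` above. `index_by_external_id` and `source_rows` are hypothetical names used for illustration, not part of doccano.

```python
import json


def index_by_external_id(jsonl_text):
    """Map meta.external_id to the exported record for each line that carries one."""
    index = {}
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        external_id = record.get("meta", {}).get("external_id")
        if external_id is not None:
            index[external_id] = record
    return index


# One exported line, copied from the output.json example.
exported = (
    '{"doc_id": 2023, "text": "EU rejects German call to boycott British lamb.", '
    '"labels": ["news"], "username": "root", "meta": {"external_id": 1}}\n'
)

# Hypothetical source table keyed by the same external_id.
source_rows = {1: {"url": "https://example.com/articles/1"}}

index = index_by_external_id(exported)
for external_id, row in source_rows.items():
    annotation = index.get(external_id)
    if annotation is not None:
        # Copy the labels onto the matching source row.
        row["labels"] = annotation["labels"]
```

Keying on `external_id` rather than `doc_id` keeps the join stable, since `doc_id` is assigned by doccano at import time while `external_id` comes from the imported file.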
@@ -270,7 +279,6 @@ As with any software, doccano is under continuous development. If you have reque

 Here are some tips that might be helpful: [How to Contribute to Doccano Project](https://github.com/chakki-works/doccano/wiki/How-to-Contribute-to-Doccano-Project)

-
 ## Contact

 For help and feedback, please feel free to contact [the author](https://github.com/Hironsan).