Doccano

Latest version: v1.8.4


Page 4 of 6

1.3.0

This release mainly improves the upload and download features:

Upload

- [x] Support large file uploads
- [x] Support folder uploads
- [x] Support uploading multiple files
- [x] Support asynchronous uploads
- [x] Show upload progress

Ingestion

- [x] Support batch import to speed up the process
- [x] Support file validation
- [x] Import as much of the file's content as possible
- [x] Report which file and which line is wrong, and why
- [x] Support over 90 encodings
- [x] Support automatic encoding detection
- [x] Support saving the filename
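
The "import as much as possible, then report what failed" behaviour can be illustrated with a minimal sketch. This is not doccano's importer: the function name `parse_jsonl`, the `text_column` parameter, and the error record shape are assumptions made for illustration only.

```python
import json


def parse_jsonl(filename, lines, text_column="text"):
    """Keep every valid record and collect an error entry for each bad line.

    Hypothetical helper: the names and the error format are illustrative.
    """
    records, errors = [], []
    for line_no, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue  # blank lines are skipped without an error
        try:
            row = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(
                {"filename": filename, "line": line_no,
                 "message": f"invalid JSON: {exc.msg}"}
            )
            continue
        if not isinstance(row, dict) or text_column not in row:
            errors.append(
                {"filename": filename, "line": line_no,
                 "message": f"missing column '{text_column}'"}
            )
            continue
        records.append(row)
    return records, errors
```

A bad line does not abort the import; it only adds an entry to `errors`, so the caller can show the user the file name, line number, and reason for each failure.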

File format

- [x] Expand the available formats
- [x] Support for specifying the columns for labels and text
- [x] Support for specifying the CoNLL tagging scheme (IOB2, IOE2, IOBES, BILOU)
- [x] Support for specifying the separator (CSV, CoNLL)
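
The CoNLL tagging schemes listed above differ only in how they mark entity boundaries. As a rough illustration (a sketch, not doccano's parser; the function name `iob2_to_iobes` is hypothetical), the following converts IOB2 tags to IOBES, and to BILOU by swapping the end and single-token prefixes:

```python
def iob2_to_iobes(tags, end="E", single="S"):
    """Convert a valid IOB2 tag sequence to IOBES.

    Pass end="L", single="U" to get BILOU instead.
    """
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt == f"I-{label}"  # does the entity extend past this token?
        if prefix == "B":
            out.append(f"B-{label}" if continues else f"{single}-{label}")
        else:  # prefix == "I"
            out.append(f"I-{label}" if continues else f"{end}-{label}")
    return out


iob2 = ["B-PER", "I-PER", "O", "B-LOC"]
iobes = iob2_to_iobes(iob2)
bilou = iob2_to_iobes(iob2, end="L", single="U")
```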

Download

- [x] Support large file downloads
- [x] Support asynchronous download
- [x] Support zip download
- [x] Support JSON download

Others

- [x] Support project tags

1.2.4

Fix the dependency problem (#1278)

1.2.3

Bugfix

- #1274 Replace TokenAuthentication with SessionAuthentication
- #1271 Update libpq-dev version to avoid build failure
- #1260 Add v-shortkey to text classification page

Enhancement

- #1277 Update getting-started.md
- #1263 Display shortcut keys
- #1261 Support single label classification
- #1253 Change the maximum length of a label name from 30 to 100 characters

1.2.2

- Remove vuex as much as possible
- Refactor the frontend code
- Typescript support
- Fix a few bugs

1.2.1

- Add `auto-labeling-pipeline` to setup.py as a dependency (fixes #1208)

1.2.0

Support Auto Labeling

This PR allows users to label text automatically. I think this enables users to speed up annotation.

![](https://i.gyazo.com/20776641e232fa8102f82e1f7889d4f2.gif)

How it works

This feature labels text automatically by calling a Web API from doccano. You can therefore use any commercial service (e.g. Google Natural Language API, Amazon Comprehend, Watson) or your own server for labeling, as long as doccano can call its API. Note that there is no training function yet; that is left as future work.
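
To make the idea concrete, here is a minimal sketch of the kind of request such a setup might send. Everything in it is an assumption for illustration: the endpoint URL, the `{"text": ...}` body, the bearer-token header, and the function name `build_labeling_request` are hypothetical and do not describe doccano's internals or any particular service.

```python
import json
import urllib.request


def build_labeling_request(url, text, api_key=None):
    """Build (but do not send) a JSON POST request for a labeling API.

    Hypothetical request shape; real services define their own schema.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_labeling_request("https://example.com/predict", "Tokyo is in Japan.")
# urllib.request.urlopen(req) would send it; omitted here because the URL is a placeholder.
```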

How to use

Configuration

1. Select "Settings" on the side menu.

![image](https://user-images.githubusercontent.com/6737785/108456951-e533f780-72b4-11eb-9a5b-cf557bf313d6.png)

2. Select the "Auto Labeling" tab and press the "Create" button.

![image](https://user-images.githubusercontent.com/6737785/108457020-08f73d80-72b5-11eb-9579-7ef5ba9e4c52.png)

3. Select a configuration template. Some tasks have predefined templates to simplify the configuration.

![image](https://user-images.githubusercontent.com/6737785/108457067-22988500-72b5-11eb-8620-409d37775120.png)

4. Enter the parameters required to use the API.

![image](https://user-images.githubusercontent.com/6737785/108457117-44920780-72b5-11eb-8590-5440455a5189.png)

5. Write a mapping template to extract labels from API responses. If you select the predefined template, you can skip this process.

![image](https://user-images.githubusercontent.com/6737785/108457186-6f7c5b80-72b5-11eb-84f7-7d1e9089590c.png)

6. Map the label fetched from the API to a label defined by you.

![image](https://user-images.githubusercontent.com/6737785/108457273-9fc3fa00-72b5-11eb-80da-05986ea4d957.png)
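
Steps 5 and 6 amount to two transformations: pull the labels out of the API response, then rename them to labels defined in your project. The sketch below shows this in plain Python rather than in doccano's template syntax. The response shape (`entities` with `type`, `begin`, `end`), the output keys, and the function name `extract_labels` are all assumptions made for illustration.

```python
def extract_labels(response, label_map):
    """Extract span annotations from a hypothetical API response and
    rename each API label to a project label via label_map."""
    annotations = []
    for entity in response.get("entities", []):
        label = label_map.get(entity["type"])
        if label is None:
            continue  # API labels without a mapping are dropped
        annotations.append(
            {"label": label,
             "start_offset": entity["begin"],
             "end_offset": entity["end"]}
        )
    return annotations


response = {
    "entities": [
        {"type": "LOCATION", "begin": 0, "end": 5},
        {"type": "DATE", "begin": 9, "end": 13},
    ]
}
annotations = extract_labels(response, {"LOCATION": "LOC"})
```

Dropping unmapped labels is one possible design choice; it keeps labels that the project does not define out of the annotations.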

Turn on the feature

1. Go to the annotation page.
2. Select the settings button.
3. Turn on the feature.

Note that you can't use this feature unless you have created at least one configuration.

Future work

- Assign a configuration to each user
- Allow the admin to set throttling for each user
- Allow merging of responses from multiple APIs
- Implement the training feature
- Increase the number of predefined templates (https://github.com/doccano/auto-labeling-pipeline)

Closes #191

