
Latest version: v0.24.0



- [x] :fire: 296

- [x] 527
- [x] automate `NoFolding` support, easy API usage :fire: 466
- [x] :wrench: 417
- [x] :wrench: 489
- [x] :wrench: 502 because of 501 were fixed
- [x] :wrench: 510
- [x] 503
- [x] 507
- [x] remove everything related to applications and related framework if everything will be OK with the paper (0.23.1 as well)
- [x] 520
- [x] :wrench: 526

- [x] Long delayed issue: 284
- [x] API: doc renaming 457
- [x] 522

Changes and Simplifications
- [x] :x: 517
- [x] :x: Drop support of reading grouped Opinions (491 and 492 related) (the related unit-test was optional and has been removed as well)
- [x] :x: 483
- [x] :x: 376

- [x] 521
- [x] 505
- [x] 514
- [x] 534
- [x] Make to string for `doc_now` parameter:

Main Updates

- [x] 439
- [x] 447
- [x] fixed : 440
- [x] new: :fire: 459
- [x] moving `evaluation` module outside :wrench: 449 (new separate project)
- [x] utils: 467
- [x] universal API for proof-of-concept

[Full Changelog](

**Implemented enhancements:**

- `NativeCsvWriter` -- sync delimiter with other CSV formatters [\486](

[v0.23.1-rc]( (2023-06-02)

[Full Changelog](

**Implemented enhancements:**

- `filters=[]` -- consider the case of None by default \[Paper feedback\] [\479](
- `opinions=[]` -- simplify usage of API \[paper feedback\] [\478](
- `BaseSerializerPipelineItem` -- required by `arekit-ss` [\476](
- `Neural Network Serializer` -- `rows_provider` should be declared outside \[paper backlog/arekit\_ss project\] [\475](
- Streaming -- support `JSON` output format [\474](
- `RuAttitudesDocumentProvider` -- refactor to follow the structure of the rest resources [\470](
- Support `None` for `get_doc_existed_opinion_func` \[user/paper feedback\] [\469](
- `SynonymsCollection` -- setup default value of `iter_group_values_lists` to `[]` [\468](
- `DOC_ID` column -- remove `int` type limitation [\463](
- Streaming -- provide header column names for CSV [\462](
- `tqdm` -- display amount of processed documents in progress-bar \[Project Gutenberg backlog\] [\461](
- `OpinionCollection` -- `iter_sentiment` method is not in use anymore [\456](
- `OpinionCollection` -- the case of `None` for `opinion` results in incomplete initialization [\455](
- `OpinionCollection` -- `copy` method is not in use anymore [\454](
- `OpinionCollection` -- consider `opinions=[]` by default, i.e. an empty collection. [\453](
- -- is empty and might be removed \[QUICK check and fix\] [\451](
- Pandas -- completely remove dependencies [\450](
- `BertTextBTemplates` -- switch name to prompts [\446](
- RuSentRel -- embed train and test indices in collection [\444](
- SentiNEREL -- entity filter [\443](
- SentiNEREL -- move from another project \[NIVTS project backlog, RuSentNE competitions\] [\439](
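
The streaming entries above (`JSON` output format, header column names for CSV, and a `DOC_ID` column without the `int` limitation) can be pictured with a minimal sketch. The function names `stream_csv` and `stream_jsonl`, the tab delimiter, and the one-JSON-object-per-line layout are assumptions made for illustration; they are not AREkit's actual writer API.

```python
import csv
import json
from typing import Dict, Iterable, List, TextIO


def stream_csv(rows: Iterable[Dict[str, object]], header: List[str], out: TextIO) -> None:
    """Write the header once, then one line per row as rows arrive."""
    writer = csv.writer(out, delimiter="\t")  # delimiter choice is an assumption
    writer.writerow(header)
    for row in rows:
        writer.writerow([row[name] for name in header])


def stream_jsonl(rows: Iterable[Dict[str, object]], out: TextIO) -> None:
    """Write each row as a standalone JSON object on its own line."""
    for row in rows:
        out.write(json.dumps(row, ensure_ascii=False) + "\n")
```

Both functions consume any iterable, so rows can come from a generator and never need to be held in memory at once. Document identifiers are written as given, whether strings or integers.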

**Fixed bugs:**

- `Network` module -- context constant has a predefined `text` value which is limited for networks only [\485](
- `read_ruattitudes_to_brat_in_memory` -- case of `keep_doc_ids_only==True` causes exception [\482](
- `prompt` -- object non subscriptable [\481](
- `fill` -- in case of `None` rows count `tqdm` throws exception [\458](
- `create_sample_provider` -- misused parameter [\445](
- `CroppedBertSampleRowProvider` -- might crop with references outside of the bounds \[googletranslate-feedback\] [\440](

**Closed issues:**

- Shortening to `RuSentRelOpinions.iter_from_doc` [\480](
- `InputTextOpinionProvider` -- rename to `ContentsProvider` [\473](
- `` -- rename method `read_collection` to `read` \[paper feedback\] [\472](
- `DocumentOperation` -- provide directory-based document provider by default \[Project Gutenberg feedback\] [\467](
- Stream writing [\459](
- `dist_in_sent=0` by default [\452](
- Evaluation -- is not a part of the AREkit soon [\449](
- Prompting -- collect base classes that allows such input processing [\447](
- SentiNEREL -- move `split_fixed.txt` into the data SentiNEREL data archive. [\442](
- What's new in 0.23.0 [\401](

**Merged pull requests:**

- CVE-2007-4559 Patch [\412]( ([TrellixVulnTeam](
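
CVE-2007-4559 concerns path traversal in Python's `tarfile` extraction: a member named like `../evil.txt` can be written outside the target directory. The sketch below shows the general shape of a guard against that. It is an illustration only, not the patch merged here; the helper names `is_within_directory` and `safe_extract` are assumptions.

```python
import os
import tarfile


def is_within_directory(directory: str, target: str) -> bool:
    """Return True if `target` resolves to a path inside `directory`."""
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extract(tar: tarfile.TarFile, path: str = ".") -> None:
    """Extract all members, refusing any whose name escapes `path`."""
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            raise ValueError(f"Blocked path traversal attempt: {member.name}")
    tar.extractall(path)
```

This check only looks at member names and does not cover symlink or hardlink members. On recent Python versions, the `filter="data"` argument of `extractall` is the more complete option.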

\* *This Changelog was automatically generated by [github_changelog_generator](*


What's new: Globalization and Internalization


Globalization for any language is the major aspect of 0.23.0, since we announce `AREnets` and `sample-transfer`.
We generalize some aspects in order to consider **languages other than the original one** (Russian).
We introduce `CompoundEntities` which may include other entities.

- [x] Nested/Compound entities support! 398
- [x] Detaching `networks` contrib module 423 -> AREnets
- [x] Appearance of transfer:

Fixed bugs
- [x] Refactored BRAT parser, fixed bugs for other languages/collections.

- [x] 375
- [x] Internalization (435)

[Full Changelog](

**Implemented enhancements:**

- `PipelineContext` -- support `parent` contexts in case of the nested pipelines. [\433](
- Idle mode -- provide such flag into main pipeline [\432](
- `MapPipelineItem` -- provide `ctx` parameter in order to reach out parent Pipeline Context \[Idle mode\] [\431](
- NetworkSerializer -- support the case of `Vectorizers==Null` \[Without embedding, google-trans-sampler backlog\] [\430](
- ParsedRow -- depends on `pandas`, while it might be switched to `dict` type instead \[AREnets backlog\] [\427](
- Remove unused code after AREnets movement [\425](
- `AREnets` -- separated project for `networks` contrib part, which provides NN implementation based on Tensorflow [\423](
- `Entity` -- Adopt `DisplayValue` property for CSV serialization [\419](
- TsvWriter -- Remove `Dataframe` dependency [\408](
- OpenNREJsonWriter -- `df.sort` is not an inplace by default [\407](
- NeuralNetworkModelIO -- simplify implementation [\406](
- Brat -- support nested entities \(`CompoundEntity` type\) \[simple implementation\] [\398](
- What's New -- 0.22.1 Release [\323](
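
The `PipelineContext` entry above mentions `parent` contexts for nested pipelines. One way to picture the idea is a context that falls back to its parent when a key is missing locally. The class `SketchContext` and its `provide` method are hypothetical names for this illustration and do not reproduce AREkit's implementation.

```python
from typing import Any, Dict, Optional


class SketchContext:
    """Hypothetical context: a dict of parameters plus an optional parent."""

    def __init__(self, params: Dict[str, Any],
                 parent: Optional["SketchContext"] = None):
        self._params = dict(params)
        self._parent = parent

    def provide(self, key: str) -> Any:
        """Look the key up locally, then walk up the chain of parents."""
        ctx: Optional["SketchContext"] = self
        while ctx is not None:
            if key in ctx._params:
                return ctx._params[key]
            ctx = ctx._parent
        raise KeyError(key)


# A nested pipeline sees its own values and those of the outer pipeline.
outer = SketchContext({"idle_mode": True})
inner = SketchContext({"doc_id": 7}, parent=outer)
```

Here `inner.provide("idle_mode")` is resolved through `outer`, which is roughly how a flag such as idle mode could reach items of a nested pipeline without being copied into every context.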

**Fixed bugs:**

- Brat -- incorrect parsing approach may sometimes result in a mismatched value \(use `t`\) [\437](
- `VocabRepositoryUtils` -- `numpy` API considers `` by default in vocabulary on load [\428](
- LabelsScaler -- uint dict and dict might have different sizes [\426](

**Closed issues:**

- `read_ruattitudes_to_brat_in_memory` -- no need to pass label scaler [\436](
- `PosTags` -- make them optional parameter for neural networks [\435](
- RuSentiFrames -- clarify `tqdm` caption when loading \(ARElight backlog\) [\434](
- Sync with AREnets updates [\429](
- `BERT` -- provide cropped sampler [\422](
- `googletrans` -- move to a separate project [\421](
- `_provide_sentence_terms` -- consider `s_ind` and `t_ind` as well since they may be combined with and modified at the same time \[nivts\_project backlog\] [\420](
- Entity -- provide DisplayValue property \(which is `Value` by default\) [\418](
- `googletrans` -- TranslatorPipelineItem for parsed texts [\416](
- Instant downloading -- simplify data downloading [\413](
- PandasBasedRowsStorage -- implement the nested type from the `BaseRowsStorage` [\410](
- Readers/Writers -- make a part of the contrib [\409](
- TextOpinion Annotation -- particular filtering rules for SentiNEREL and Russian texts. \[pipeline items\] [\404](
- Evaluation -- enhancing error log analysis [\400](
- Statistical Folding provided via file [\399](
- Balancing as a side part of the Storage [\380](

**Merged pull requests:**

- CVE-2007-4559 Patch [\412]( ([TrellixVulnTeam](



Release Notes :tada:

[Full Changelog](

- :notebook: Provide `BRAT-based reader` (refactoring) of documents and mentioned entities in it! :partying_face:
- :wrench: Provide verbose treatment of values for SynonymsCollection (327)
- :wrench: Fixed embedding issues for `Entity` type for neural networks (308)
- :wrench: Refactoring `RuSentRel` reader, which now represents a build on top of BRAT. (287)
- :wrench: Attitude annotation performed on the fly within a pipeline! (281)
- :wrench: Opinion annotation does not depend on the experiment (250)
- :wrench: 347
- :new: added `utils` contrib part, into which the following were moved :partying_face:
  - evaluation (2-3 scale)
  - cv-splittings (324)
  - entity formatters
  - synonyms collections templates: stemmer-based
  - experiment handlers (325)
  - np_utils -- utils to interact with np-serialized data (348)
- **pipelines** :loop: for opinions extraction and data serialization, text processing: we are now able to declare a custom pipeline and adopt serialization for a variety of RE tasks
- :new: API for conversion of external `text_opinions` into `parsed_news` (338)
- :new: API for a variety of pipelines for data preparation, depending on `DataType` (343)
- :new: `DataType` now includes `Dev` and `Etalon` by default (345)
- :new: Evaluation refactoring, and support `TextOpinion` level results evaluation (355)
- :wastebasket: `experimential_rusentrel` contrib part removed (321)
- :wastebasket: `OpinionRowsProvider` should be removed [ARElight backlog] (282)
- fixed: 356

**Implemented enhancements:**

- RuSentiFrames stat -- move script from `source` to the related UnitTest dir [\391](
- Vocabulary for Embedding -- save it in `.txt` format. [\388](
- BratSentence -- entities should be initialized via parameter [\383](
- ModelIO -- move vocab and embedding related API to EmbeddingIO [\382](
- BERT -- formatter differs only in TextB. [\381](
- Provide JSON writer for OpenNRE library [\378](
- ExperimentSerializationContext -- some parameters might be optional \[Remove them\] [\369](
- `ExperimentSerializationContext` -- `Annotator` property is not used. [\368](
- DocumentOperations -- `iter_doc_ids` actually wraps the ExperimentContext functionality [\367](
- `iter_tagget_doc_ids` -- this might be treated as `iter_doc_ids` of an another instance [\366](
- `ExperimentIterationHandler` -- switch to the PipelineItem for NN and BERT serialization \[Remove `ExperimentEngine` and `ExperimentHandler`\] [\365](
- `FixedFolding` -- intersected parts are not supported \[NIVTS project backlog\] [\364](
- `InputDataSerializationHelper` -- refactoring [\362](
- `exp_io.balance_samples`-- remove Dependency from `DataType.Train` [\360](
- NeuralNetwork -- for the fine-tuning it is impossible to pick a default embedding/vocabulary. [\359](
- Evaluation -- support results evaluation for `TextOpinion` [\355](
- `DefaultOpinionAnnotator` -- `etalon_opinion` logic might be moved outside \[Remove `DataType` dependency, backlog\] [\354](
- `StatesCount`, `StateIndex` and `iter_states` of `BaseDataFolding` -- this is a part of CV-based method [\353](
- Evaluator refactoring [\352](
- Processing module -- Multiple Languages Scaling \[Eng/Rus\] \[Contents Relocation\] [\351](
- ExperimentContext -- remove Evaluator from the base class. [\349](
- `np_utils` -- move from `networks` to `utils` contrib part [\348](
- `StringWithEmbeddingNetworkTermMapping` -- has hard-coded algorithms for tokens and terms embedding creation. [\347](
- Existed in Embedding -- log \(remove print\) [\346](
- DataType -- provide `Dev` and `Etalon` default types \[QUICK fix\] [\345](
- Data Serialization -- update API that allow to provide a particular pipeline processor for each `DataType` \[Backlog\] [\343](
- Model io utils -- move into `contrib` part [\342](
- `Engine` -- provide states iterator as a parameter instead of `DataFolding` [\341](
- Brat -- provide stability [\340](
- BaseParsedNewsServiceProvider -- support conversion from `Entity` to `DocumentEntity` [\338](
- OpinionEntityType -- this should be generalized [\335](
- BratTextEntitiesParser and StringPartitioning -- nested entities are not supported. \[Temp fix\] [\334](
- RuAttitudesLabelConverter -- required only for conversion \(not for parsing\) [\332](
- SentenceOpinion -- no need to store entity values [\331](
- Utils -- provide opinion converters from brat [\330](
- RuAttitudes -- move `SentenceOpinion` to brat [\329](
- BratEntityCollectionHelper -- `extract_entities` considering for rows prefixed with `T` [\328](
- SynonymsCollection -- `value_to_group_id_func` does not support expansion by default. [\327](
- BERT and Network Serialization -- refactoring duplicated serialization implementations [\322](
- `exp_joined` -- removed such experiment at `experiment_rusentrel` contrib [\321](
- `rusentrel_experiment` -- organize a separated python project [\320](
- "Uknown}" -- specific to RuSentRel entity case [\319](
- `BertExperimentInputSerializerIterationHandler` -- Simplify API \[Blog example backlog\] [\318](
- BaseRowsStorage -- consider rows shuffling \[ARElight backlog\] [\316](
- EntityIds -- expected to be a part of the BaseSampleRowProvider \[ARElight backlog\] [\312](
- `iter_synonym_groups` \[Sources\]-- refactor to common method \[ARElight backlog\] [\310](
- term-embedding-pairs -- refactor chain of the parameter dependencies. [\304](
- Move EntityFormatters outside [\302](
- Sources -- RusentRel collection based on brat toolkit serialization format [\287](
- `BaseOpinionsRowProvider` -- useless class and hence should be removed \[refactoring IOUtils\] [\282](
- IOUtils -- replace `experiment` instance \(and dependency\) with string provider. [\252](
- Annotator and algorithm is not related to experiment. [\250](
- DocumentOperations -- parsed docs related API is not related to the experiment concepts. [\249](
- Remove `sep_doc_id` variable [\131](
- Update Framework Description [\74](

**Fixed bugs:**

- `StringWithEmbeddingNetworkTermMapping` -- `map_token` is expected a particular type of embedding which return embedding only [\395](
- NetworksTrainingPipelineItem -- pass labels count [\379](
- `BertDefaultStringTextTermsMapper` -- non masked entity values might be with ` ` separation between words [\377](
- `iter_rows_linked_by_text_opinions` -- fixed bug with incorrect check. Removed doc-related check. [\356](
- TextOpinion should be a part of a single sentence -- this limitation is not emphasized in any way of exceptions and assertions [\339](
- BaseParsedNewsServiceProvider -- incorrect IDs assignation [\337](
- Example -- Documents become mixed \[RuAttitudes Affection\] [\292](
- RuAttitudes -- `extract_text_opinions_linkages` utilizes a different approach which is not covered by the common implementation. [\232](

**Closed issues:**

- `SamplesIO` -- view always initialized from `tsv` [\397](
- `SamplesIO` -- make optional writer [\396](
- NoLabel -- allow to customize so for annotators. [\393](
- Source -- remove `common` labels [\392](
- Tutorials [\390](
- Embed SentiNEREL collection [\389](
- RuSentRel and RuAttitudes data pipelines -- provide at `utils` contrib [\387](
- Serialization pipelines -- move them to `utils` contrib \[pipeline part\] [\386](
- Lexicons -- move to the `utils` contrib project [\385](
- Remove Gensim dependency [\384](
- Evaluation -- ability to extract errors \[Backlog\] [\375](
- BaseSampleRowProvider -- has BERT dependencies from contrib [\374](
- `BaseIOUtils` -- remove `write_opinion_collection` [\373](
- `BaseExperiment` -- remove this class. [\372](
- `ExperimentTrainingContext` -- this could be removed. [\371](
- BaseTensorflowModel -- provide `DataType` parameter for fitting [\370](
- ExperimentSerializationContext -- remove EntityFormatter \[Backlog\] [\361](
- `TextOpinion` -- id may be a variety of types [\358](
- `TextOpinion` -- remove `owner` field [\357](
- Experiment `pipelines` to `contrib.utils` [\326](
- Experiment `handlers` to `contrib.utils` [\325](
- Experiment `cv` to `contrib.utils` [\324](
- RuSentRelOpinionCollectionWriter -- provide encoding parameter \[ARElight backlog\] [\317](
- LabelsFormatter for TextB \[BERT\] -- labels might be not supported \[ARElight backlog\] [\315](
- RuSentRel experiment -- TextParser could not be customized \[ARElight backlog\] [\314](
- InputSerializers \(BERT/Networks\) -- `__init__` should not depend on data-related information \[ARElight backlog\] [\313](
- StringEntitiesFormatter -- rename EntityType to OpinionEntityType \[QUICK\] [\307](
- Annotation -- Opinion annotation should be implemented at `OpinionOperations.iter_opinions_for_extraction` [\281](
- SampleView -- adopt multiple views provider \[Refactoring\] [\269](

[v0.22.0-rc-p1]( (2022-04-02)

[Full Changelog](

**Implemented enhancements:**

- Remove non utilized flags in IterationHandlers \[ARElight backlog\] [\309](

**Fixed bugs:**

- BertExperimentInputSerializerIterationHandler -- missed `value_to_group_id_func` parameter [\311](

[v0.22.0-rc-p0]( (2022-03-29)

[Full Changelog](

**Fixed bugs:**

- Remove `,` presence assertion from Opinion `__init__` class method [\306](
- ModuleNotFoundError: No module named '' [\301](

**Closed issues:**

- What's New -- Release 0.22.0 [\227](

Release Notes :tada:
* Pipelines integration!
* Utilized now in text processing, which now could be divided into tokenization, entities assignation, frames assignation stages.
* Repositories for opinions and network input samples!
* Storage kernel customizations support for opinion and samples! Using Pandas by default.
* Opinion-related service turn into providers: pairs, opinions, text-opinions, etc.

> **NOTE:** issue [\232]( has been moved to the next release.
**This version does not support RuAttitudes collection news parsing!**
Will be fixed in the [upcoming project](


[v0.22.0-rc]( (2022-03-17)

[Full Changelog](


**Implemented enhancements:**

- `create_term_embedding` -- Embedding algorithm based on parts requires useless check [\298](
- UnitTests -- BertOntoNotes is no longer below the core processing [\293](
- SingleLabelScaler -- provide \[QUICK\] [\291](
- BRAT visualization -- support processing in case of multiple documents. [\286](
- Entity -- IDs Refactoring [\280](
- BaseSampleRowProvider -- provide sentence id [\279](
- BRAT tool -- adopt ui as a callback for the predict pipeline [\275](
- ExperimentIterationHandler -- add Labeled Output Samples conversion to OpinionCollection [\270](
- InferenceContext -- split bags and samples extraction from a single method \[Quick\] [\268](
- DataFolding -- organize united data folding. [\267](
- BaseDataFolding -- iter\_index is not related to the base implementation [\266](
- DataFolding -- move into experiment context [\264](
- DataIO \(exp\_data var\) -- rename it to `ExperimentContext` [\263](
- ExperimentIterationHandler \(Callback before\) -- organize ExperimentEvaluationCallback [\262](
- NetworkCallback -- this callback should not inherit experiment base Callback [\261](
- Neural Network Hidden states writers and providers refactoring [\260](
- TrainingCallback -- separate onto `TrainingTerminationCallback` and `HiddenWriterCallback` classes. [\259](
- BaseTensorflowModel -- simplify `fit` and `predict` operations. [\258](
- LabeledCollection -- remove `is_empty` and `reset_labels` api [\257](
- NetworkCallback -- move train/predict notification info into callback [\256](
- Tensorflow saver -- move the related logic outside of the model implementation [\255](
- DefaultSingleLabelAnnotationAlgorithm -- single label is not a part of the algo [\244](
- `ThreeScaleTaskAnnotator` -- rename and move into core. [\243](
- Data/output -- create pipelines directory with the related output processing [\240](
- Examples -- document parsing executes twice [\239](
- Might be utilized pipeline implementation [\238](
- OpinionsProvider -- performs two actions, including ids assignation [\236](
- entity\_to\_group\_func -- `BaseExperiment` should not provide this method. [\235](
- TextOpinionHelper -- to news/parsed/providers \(implement the latter as a provider\) [\233](
- DefaultSingleLabelAnnotationAlgorithm -- iter\_opinion duplicates the generalized pair opinion pair creation approach [\231](
- Common `languages` dir -- move its contents into processing contrib. [\229](
- Linked Text Opinions Refactoring. [\228](
- Lemmatization should be a part of the frames processing pipeline stage [\226](
- DefaultTextParser -- this class is actually a Tokenizer [\225](
- News -- text-opinions provider and entities access API might be a part of a `ParsedNews` by means of `NewsParser` \(new class\) [\224](
- StringLabelsFormatter -- switch to label\_types instead of label instances. [\223](
- AnnotationAlgorithm -- iter\_opinions requires EntitiesCollection while the latter utilized for entities iteration [\222](
- TextParseOptions -- add `keep_tokens` [\221](
- FrameVariantsParser -- return modified terms only [\218](
- FramesAnnotation -- `is_inverted` flag and processing should be a pipeline item [\217](
- FramesCollection -- use `FrameConnotationProvider` instead [\216](
- FrameVariantsParser -- move into processing subfolder. [\215](
- OpinionOperations -- remove `try_read_annotated_opinion_collection` [\213](
- DocumentOperation -- unify iter\_doc\_ids operation into one with `tag` parameter. [\212](
- OpinionOperations -- move readers\* into IO. [\211](
- OpinionCollectionsProvider -- serialization should not be a part of this class [\210](
- data -- separate data-related information from the experiment [\209](
- BaseInputReader -- class stores `_df`, however it should be replaced with `BaseRowsStorage` [\207](
- Repositories -- fill method should be a part of a `storage` rather than provider. [\204](
- BaseStorage -- exclude `save` method into separated class BaseRowsWriter [\202](
- Experiments -- rename `formats` to `api` \(QUICK\) [\201](
- Embedding and Vocabulary -- organize Storage/Repository with `serialize`/`load` operations. [\200](
- Sample -- remove dependency from DefaultNetworkConfig. [\199](
- BaseOutputFormatter -- both provider and formatter mixes `df` usage [\198](
- OpinionProvider -- remove dependency from Opinion and Document Operation instances. [\197](
- Repositories -- add this class which unites all the providers for data writing [\195](
- Add column providers [\194](
- NetworkSampleFormatter -- switch to provider [\193](
- BaseSampleStorage -- use `store_labels` instead of `data_type` passing \(QUICK\) [\192](
- NetworkOutputEncoder -- separate formatting from serialization. [\191](
- BaseSampleFormatter -- `__create_row` is not related to the Formatter, should be moved. [\190](
- BaseDocumentStatGenerator -- provider depends on IO files. [\189](
- OpinionFormatter -- use the latter in experiment io. [\188](
- News -- remove `return_text` parameter from iter\_sentences method \(QUICK\) [\187](
- BaseRowsFormatter -- move `format` method in another class [\185](
- BaseSampleFormatter -- `_iter_sentence_terms` should not be a part of this class. \(QUICK\) [\184](
- BaseSampleFormatter -- `_provide_rows` behavior depends on row\_ids\_provider instance type. [\182](
- BaseSampleFormatter -- remove `data_type` parameter from ctor [\181](
- BaseObjectParser -- `parse` method should return object of the same type as `sentence` [\179](
- News -- remove `entities_parser` instance from News class. [\178](
- BaseEntitiesParser -- generalize to BaseObjectsParser. [\177](
- Provide SHA checksums utilization for downloaded resources. [\176](
- OpinionCollectionsFormatter -- use it as instance, created within `with` block [\175](
- BaseOutput -- move `_csv_to_dataframe` out of this class. [\174](
- DataIO -- remove `Stemmer` instance [\172](
- BaseRowsFormatter -- `formatter_type_log_name` method should be removed. [\171](
- BaseOpinionsFormatter -- leave `save` method implementation for inheritor classes. [\170](
- BaseSampleFormatter -- leave `save` method implementation for inheritor classes. [\169](
- BaseIOUtils -- remove dependencies from file/\(path\) based data storage format [\168](
- BaseIOUtils -- `get_input_sample_filepath` and `get_input_opinions_filepath` limit possible storage abilities. [\166](
- perform\_reading\_and\_initialization -- provide samples reader. [\165](
- perform\_reading\_and\_initialization -- remove dependency from `doc_ops` [\164](
- NetworkInputSampleReader -- remove inheritance from TSV-based reader. [\163](
- OpinionCollectionsFormatter -- use `save_to` and `load_from` notation for method names with source provider \(file/archive/storage, etc.\). [\142](
- RuSentRelOpinionCollectionFormatter -- move all the opinion iteration during saving/loading into base class [\141](
- news\_id or doc\_id -- normalize class and field names [\133](
- embeddings subdir -- considered to be a part of networks contrib [\132](
- Sentiment frame polarity \(A0-\>A1\) considered to be a part of the related experiment. [\118](
- EnumServices -- provide a base class with string to Enum conversion functionality [\117](
- EntityFormatters -- Move formatters into the particular experiment implementation [\116](
- \_create\_parse\_options -- remove this method from DocumentOperations across all the experiments. [\112](
- NewsParseOptions -- provide this options for the particular `DefaultParser` derived from `TextParser` [\111](
- TextParser -- Provide a separated class with a text processing algorithm implementation API [\75](
- Providing all the logging information into log\ [\30](
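
For the "Provide SHA checksums utilization for downloaded resources" entry in the list above, a checksum check after download might look like the following sketch. SHA-256 is an assumption here (the entry does not name a variant), and `sha256_of_file` and `verify_download` are hypothetical helper names, not AREkit functions.

```python
import hashlib


def sha256_of_file(filepath: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with open(filepath, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(filepath: str, expected_sha256: str) -> bool:
    """Compare the computed digest with the expected hexadecimal digest."""
    return sha256_of_file(filepath) == expected_sha256.lower()
```

A caller would typically delete the file and retry the download when `verify_download` returns `False`.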

**Fixed bugs:**

- ModuleNotFoundError: No module named '' [\301](
- UnitTests -- Discard RuAttitudes-v1.2 support due to `index out of range` exception on reading [\295](
- text\_opinions\_iter\_pipeline -- ids assignments vary after multiple calls [\278](
- EntitiesParser -- provide doc\_level ids [\277](
- DeepPavlovNER -- BertOntoNotes entities annotation \[Treating string and list-based text representation simultaneously\] [\274](
- Examples -- get\_index\_by\_term of Vocabulary failed [\271](
- Annotator Performance -- keeps all possible pairs between entities. [\253](
- Network SampleID -- has type `unicode`, but expected to be integer type [\248](
- Example -- given two sentences results in samples of only last of them. [\246](
- UnitTests -- Incorrect labels formatter \(QUICK\) [\186](
- test\_samples\ -- incorrect API usage in Tensorflow contrib. [\158](

**Closed issues:**

- Transfer examples folder into separated project \[ARElight\] [\300](
- RuSentRel Experiment -- Text is lemmatized irrespective of the save\_lemmas parameter in parser \[OK\] [\297](
- Experiment -- refactor inference pipeline implementation [\290](
- Example -- reorganize infer folder \(experiment\) [\289](
- Experiment -- Organize pipeline stages as items of the BasePipeline [\285](
- BaseSampleRowProvider -- provide entity values and entity types. \[QUICK\] [\283](
- DeepPavlov NER -- adopt BERTontonotes. [\272](
- NeuralNetworks -- graph and tf session should be initialized before the `predict` method call. [\247](
- NewsServiceCollection -- implement [\245](
- numpy 1.19.5 -- returns int64 by default [\242](
- Organize unit tests for Output to Opinion conversion pipeline [\241](
- Iter\_opinions\_collection -- complicated, considering pipeline processing instead [\237](
- EntitiesCollection -- provide `value_to_group` function instead of SynonymsCollection. [\230](
- BaseTextParser -- `parse_news` is not related to the text parsing concepts and should be a part of the another class [\220](
- DocumentOperations -- `_get_text_parser` should not be a part of this API [\219](
- Create simple parser for text with mentioned \[entities\] [\214](
- NetworkInputHelper -- performing `serialize_missed_collections` during writing process [\208](
- RowIDs -- should be common for input and output [\206](
- SampleRowBalancerHelper -- simplify by using `pandas` group sampling [\203](
- convert\_output\_to\_opinion\_collections -- pass opinion reader into parameters. [\167](
- Experiment -- Separate TSV-based formatter from the base one for samples and opinions [\162](
- Switch to Python3.6 [\160](
- RuSentRel Experiment Contrib -- update description [\153](
- Provide Cache for data sources [\151](
- SynonymsCollection considered in ReadOnly mode only [\5](

**Merged pull requests:**

- 0.21.1 rc [\234]( ([nicolay-r](
- 0.21.1 rc [\196]( ([nicolay-r](
- 0.21.0 rc [\159]( ([nicolay-r](
- 0.21.0 rc [\157]( ([nicolay-r](
- 0.21.0 rc [\152]( ([nicolay-r](



[v0.21.0-rc]( (2021-08-15)

[Full Changelog](

**Implemented enhancements:**

- Sources -- clarify `do_overwrite` and refactor `check_uniqueness` flags RuSentiFrames [\150](
- Compose Python Library [\145](
- Sources -- provide local storage at home directory [\144](
- Enum -- clarify enum34 package using instead of the enum. [\143](
- OpinionCollectionsFormatter -- support to save/load only supported by label\_formatter opinions [\139](
- UnitTests -- gather all tests into single folder [\125](
- BaseAnnotator -- initialize method is useless as the passed parameters are required only at `serialize_missed_collections` method. [\123](
- NeutralAnnotator -- Rename to annotator, as neutral prefix is related to a specifics of the particular task [\122](
- NeutralAnnot -- use a predefined template for names, based on labels count, instead of Name property [\121](
- DefaultNeutralAlgo -- provide dist in sentence parameter [\120](
- NeutralAnnot -- Two/Three scale annotators considered to be a part of the related experiment [\119](
- Evaluation Metrics -- such functions considered to be a part of the particular experiment [\115](
- Embedding -- set\_stemmer method is not declared in base class [\114](
- FrameVariantsCollection -- remove stemmer from \_\_init\_\_ params. [\113](
- Bag \(NeuralNetworks\) -- label could be presented as uint. [\110](
- experiment\_rusentrel -- Group all folders by a single `exp` prefix [\108](
- BaseModel -- Replace epochs\_count parameter with generalized parameter structure. [\107](
- OpinionCollection -- provide set of supported labels \(opinion filtration by labels\) [\106](
- LabelCalculationMode -- make it enum [\105](
- BaseModel -- replace epochs\_count with model options [\104](
- ThreeLabelsScaler -- remove dependencies of the latter in NeuralNetwork contrib [\103](
- RuAttitudes -- use int\_to\_label function instead of label scaler [\102](
- Labels -- Move Scaler into common/labels [\101](
- Labels -- Provide unique labels for the particular experiment in contrib [\100](
- Experiments -- reorganize rusentrel experiments data within the related new folder [\97](

**Fixed bugs:**

- RuAttitudes-v1.2. -- fix downloading link [\155](
- sources -- Remove data folder [\149](
- Entity -- type could be `None` while there is no restriction for that [\148](
- RuSentRelOpinionCollectionFormatter -- label could not be found during neural network training. [\137](
- frame\_variant -- label scaler receives `NoLabel` while experiment based on `NeutralLabel` [\136](
- BaseEvaluator -- opinion labels might be incompatible with the one utilized in ResultEvaluator. [\124](

**Closed issues:**
- UnitTests -- Run all unit tests via bash script [\156](
- Remove release\ file and move the related content into Releases descriptions. [\146](
- Tutorial -- Clarify on how we perform optimization [\90](



* Using custom check of duplicated opinions during `OpinionCollection` initialization.
* Speed-up and engine optimizations:
* Optionally loading neutral annotator.
* Multi-Instance networks: we now assume that the next context to appear always continues the prior one
(check out multi-instance bags creation for details).
* Shuffling in models is now performed for bags, not for bag groups.
* Networks: added `allow_growth=True` flag for tensorflow based neural networks.
Memory fraction parameter has been removed.

The collection of parsed news is now dispatched from the text opinions collection.
* **News parsing now is assumed to be performed using `TextParser.parse(news, options)` call. Related refactoring.**
* Stemmer application from `RuAttitudes` parser has been removed.
* Removed dependency from `RelatedParsedNewCollection` in TextOpinionCollection.
* Labeling is now separated from LinkedTextOpinion collection.
* `ParsedText` class has been refactored, removed unused methods. Keep tokens has been discarded.
* BERT tsv-format-encoders are now in a Factory (at contrib directory).
* Fixed: `RuSentRelTextOpinion` replaced with `TextOpinion`, and independent from `OpinionRef`.
* `Single`/`Multi` models no longer exist, as these prefixes affect only the batch type selection. Refactoring.


Release Notes

* Labels conversion `to_str` and `from_str` are now a part of external formatters (unique for each source, experiment, etc.).
* Added labels-scaler, and labels casting (to int or uint) now depends on scaler.
* Added bert exporter in contribution folder: with related formatters according to the [[paper]](
* **NLI** -- (Natural language inference) format, which assumes providing an additional sentence that describes which attitude should be extracted.
* **QA** -- (Question answering) provides an additional question about the attitude sentiment.
With label encoding in the following formats:
* **Multiple** -- all the supported sentiment labels (positive, negative, neutral)
* **Binary** -- (YES, NO) according to mention (additional sentence), provided by **NLI** and **QA** formatters.
* Refactoring experiments in order to apply the latter also for classifiers (models from scikit-learn)
* Updated nn-engine API
* Refactoring tf-based neural network implementation.
* Bert now moved into separated folder from `contrib` directory.
* frame_variants moved to `frames` directory.
* Frame variants labeling in news now performed during `parse` operation.
* `DataType` is now an enumeration. The list of supported data types is now a part of the experiment.
The latter were moved onto the sample level.
* `Service` folder removed as the latter assumes to be apart of this repository.
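
The **NLI**/**QA** formats and the **Multiple**/**Binary** label encodings listed above can be illustrated with a small sketch. The template wording, the function names, and the row layout `(text_a, text_b, label)` are all assumptions chosen for illustration. The actual templates are defined by the BERT formatters in the contrib part and by the referenced paper.

```python
from typing import List, Tuple

# Sentiment labels named in the release notes.
LABELS = ["positive", "negative", "neutral"]

Row = Tuple[str, str, str]  # (text_a, text_b, label)


def nli_text_b(subject: str, obj: str) -> str:
    """NLI style: an auxiliary statement naming the attitude pair (made-up wording)."""
    return f"{subject} towards {obj}"


def qa_text_b(subject: str, obj: str) -> str:
    """QA style: an auxiliary question about the attitude (made-up wording)."""
    return f"What is the attitude of {subject} towards {obj}?"


def multiple_rows(text_a: str, text_b: str, label: str) -> List[Row]:
    """Multiple encoding: a single row labeled with one of the sentiment classes."""
    return [(text_a, text_b, label)]


def binary_rows(text_a: str, text_b: str, label: str) -> List[Row]:
    """Binary encoding: one row per candidate label, answered with YES or NO."""
    return [
        (text_a, f"{text_b} {candidate}", "YES" if candidate == label else "NO")
        for candidate in LABELS
    ]
```

With the binary encoding, a three-class sample expands into three rows, one per candidate label, and exactly one of them is marked `YES`.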

Release Notes
* Experiments now support two-scale and three-scale task representation with the related evaluation formats.
