Volume 33 - 2020 - Special issue CRI 2019

1. Named Entity Recognition in Low-resource Languages using Cross-lingual distributional word representation

Paulin Melatagia Yonta ; Michael Franklin Mbouopda.
Named Entity Recognition (NER) is a fundamental task in many NLP applications, which seeks to identify and classify expressions such as person, location, and organization names. Many NER systems have been developed, but the annotated data needed for good performance is not available for low-resource languages such as Cameroonian languages. In this paper, we exploit the low frequency of named entities in text to define a new cross-lingual distributional representation suited to named entity recognition. We build the first named entity recognizer for Ewondo (a low-resource Bantu language of Cameroon) by projecting named entity tags from English using our word representation. The results, measured in terms of recall, precision, and F-score, show the effectiveness of the proposed distributional representation of words.
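As a reminder of how the three reported measures relate, here is a minimal sketch of entity-level precision, recall, and F-score. It is not the authors' evaluation script: the (start, end, label) span format, the function name precision_recall_f1, and the toy gold and predicted sets are assumptions made for illustration only.

```python
from typing import Set, Tuple

# Hypothetical entity format: (first token index, last token index, label).
Entity = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Exact-match scores: an entity counts only if span and label both match."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    # F-score (F1) is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, invented for this example.
gold = {(0, 1, "PER"), (4, 4, "LOC"), (7, 8, "ORG")}
pred = {(0, 1, "PER"), (4, 4, "ORG"), (7, 8, "ORG"), (10, 10, "LOC")}

p, r, f = precision_recall_f1(gold, pred)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy case two of the four predicted entities are correct and two of the three gold entities are found, so precision is 1/2, recall is 2/3, and the F-score is 4/7.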

2. Extraction of lexico-grammatic features and coupling of CRF (Conditional Random Field) units to the deep neural network for aspect extraction

Saint Germes Bienvenu Bengono Obiang ; Norbert Tsopze.
The Internet contains a wealth of information in the form of unstructured text, such as customer comments on products, events, and more. By extracting and analyzing in detail the opinions expressed in customer comments, it is possible to obtain valuable information and opportunities for both customers and companies. The aspect extraction model proposed by Jebbara and Cimiano, winner of the SemEval2016 competition, lacks lexico-grammatic input features and performs poorly at detecting compound aspects. We propose a model based on a recurrent neural network for extracting the aspects of an entity for sentiment analysis. The proposed model improves on the Jebbara and Cimiano model in two ways: it adds a CRF to take into account the dependencies between labels, and it extends the feature space with grammatical-level and lexical-level features. Experiments on the two SemEval2016 data sets show an improvement in F-score of about 3.5%.
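To illustrate why modelling dependencies between labels helps with compound aspects, the following minimal sketch decodes a label sequence with the Viterbi algorithm, the decoding step typically used with a linear-chain CRF. It is not the authors' model: the O/B/I label set, the per-token scores (standing in for the output of a neural network), and the transition penalty are invented values chosen for illustration.

```python
from typing import Dict, List, Tuple

# Assumed tagging scheme: B = first token of an aspect, I = inside, O = outside.
LABELS = ["O", "B", "I"]


def greedy(emissions: List[Dict[str, float]]) -> List[str]:
    """Pick the best label for each token independently (no label dependencies)."""
    return [max(LABELS, key=lambda y: scores[y]) for scores in emissions]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the label sequence maximizing emission plus transition scores."""
    # best[t][y]: score of the best path ending with label y at position t.
    best = [{y: emissions[0][y] for y in LABELS}]
    back = []  # back[t-1][y]: previous label on the best path to y at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in LABELS:
            prev = max(LABELS, key=lambda yp: best[-1][yp] + transitions.get((yp, y), 0.0))
            scores[y] = best[-1][prev] + transitions.get((prev, y), 0.0) + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Invented scores for the tokens "battery life is great".
emissions = [
    {"O": 1.0, "B": 0.9, "I": 0.1},  # battery
    {"O": 0.2, "B": 0.3, "I": 1.0},  # life
    {"O": 2.0, "B": 0.0, "I": 0.0},  # is
    {"O": 2.0, "B": 0.0, "I": 0.0},  # great
]
# Invented penalty: an aspect cannot continue (I) right after a non-aspect token (O).
transitions = {("O", "I"): -10.0}

greedy_path = greedy(emissions)
crf_path = viterbi(emissions, transitions)
print(greedy_path)
print(crf_path)
```

With these scores, independent per-token decisions yield O I O O, which starts an aspect in the middle, whereas the sequence-level decoding yields B I O O and recovers the compound aspect "battery life" as a whole.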