Extraction des caractéristiques lexico-grammaticales et couplage des unités CRF (Conditional Random Field) au réseau de neurones profond pour l'extraction des aspects

Saint Germes Bienvenu Bengono Obiang; Norbert Tsopze

doi:10.46298/arima.6438

arima:6438 - Revue Africaine de Recherche en Informatique et Mathématiques Appliquées, July 28, 2021, Volume 33 - Special issue CRI 2019 - 2020/2021 - https://doi.org/10.46298/arima.6438

Authors: Saint Germes Bienvenu Bengono Obiang ^1,^2,^3; Norbert Tsopze ^4,^5,^2,^3

1 Département d'informatique, Faculté des Sciences, Université de Yaoundé 1
2 Unité de modélisation mathématique et informatique des systèmes complexes [Bondy]
3 University of Yaoundé 1 = Université de Yaoundé I
4 Département d'informatique, Faculté des Sciences, Université de Yaoundé 1
5 Sorbonne Université

[en]
The Internet contains a wealth of information in the form of unstructured text, such as customer comments on products, events and more. Extracting and analyzing in detail the opinions expressed in these comments can uncover valuable opportunities and information for both customers and companies. The model proposed by Jebbara and Cimiano for aspect extraction, winner of the SemEval2016 competition, lacks lexico-grammatical input features and performs poorly at detecting compound aspects. We propose a model based on a recurrent neural network for the task of extracting the aspects of an entity for sentiment analysis. The proposed model improves on the Jebbara and Cimiano model in two ways: a CRF is added to take into account the dependencies between labels, and the feature space is extended with grammatical-level and lexical-level features. Experiments on the two SemEval2016 data sets show an improvement in F-score of about 3.5%.

[fr, English translation]
Opinion analysis consists of extracting knowledge from the comments users leave about a product, service, text, and so on. Aspect-based opinion analysis goes further by decomposing a comment in order to extract the aspects that the user evaluated. The model proposed by Jebbara and Cimiano, winner of the SemEval2016 competition, does not correctly extract compound aspects and does not take into account the lexico-grammatical features of the input texts, which also limits its performance in aspect detection. We propose an improvement of the Jebbara and Cimiano model that introduces CRF units to take into account the dependencies between labels and adds lexico-grammatical features to the model's inputs. Experiments on the two SemEval2016 data sets tested this approach and showed an improvement in F-score of about 3.5%.
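
To make concrete what "taking into account the dependencies between labels" can mean, here is a minimal sketch of Viterbi decoding for a linear-chain CRF over BIO aspect tags. It is not the authors' implementation: the tag set, the hand-set transition scores, the example sentence and the emission scores (which would normally come from the recurrent network) are all assumptions chosen for illustration, and a real CRF layer would learn its transition scores from data.

```python
# Illustrative Viterbi decoding for a linear-chain CRF layer over BIO tags.
# Every score below is a made-up number, not a value from the paper.
TAGS = ["O", "B", "I"]
NEG = -1e4  # large penalty standing in for a strongly disfavoured transition

# TRANSITION[prev][cur]: score of moving from tag prev to tag cur.
TRANSITION = {
    "O": {"O": 0.0, "B": 0.0, "I": NEG},  # "I" may not follow "O"
    "B": {"O": 0.0, "B": 0.0, "I": 0.5},
    "I": {"O": 0.0, "B": 0.0, "I": 0.5},
}
START = {"O": 0.0, "B": 0.0, "I": NEG}  # a sentence may not start with "I"


def viterbi(emissions):
    """Return the highest-scoring tag sequence for per-token emission scores."""
    # best[t][tag] = (score of the best path ending in tag at step t, previous tag)
    best = [{tag: (START[tag] + emissions[0][tag], None) for tag in TAGS}]
    for scores in emissions[1:]:
        column = {}
        for cur in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p][0] + TRANSITION[p][cur])
            total = best[-1][prev][0] + TRANSITION[prev][cur] + scores[cur]
            column[cur] = (total, prev)
        best.append(column)
    # Backtrack from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


def greedy(emissions):
    """Per-token argmax, i.e. tagging that ignores label dependencies."""
    return [max(TAGS, key=lambda t: scores[t]) for scores in emissions]


# Hypothetical emission scores for the tokens "the battery life is great".
EMISSIONS = [
    {"O": 2.0, "B": 0.1, "I": 0.0},  # the
    {"O": 1.0, "B": 0.9, "I": 0.2},  # battery
    {"O": 0.3, "B": 0.4, "I": 1.5},  # life
    {"O": 2.0, "B": 0.0, "I": 0.1},  # is
    {"O": 2.0, "B": 0.1, "I": 0.0},  # great
]

print(greedy(EMISSIONS))   # ['O', 'O', 'I', 'O', 'O']
print(viterbi(EMISSIONS))  # ['O', 'B', 'I', 'O', 'O']
```

In this toy example, per-token tagging produces an "I" with no preceding "B", so the compound aspect "battery life" is never opened. Decoding with transition scores recovers the two-token span, which is the kind of compound-aspect error the abstract attributes to the baseline model.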

Source: HAL:hal-02557636v4

Volume: Volume 33 - Special issue CRI 2019 - 2020/2021

Published: July 28, 2021

Accepted: June 14, 2021

Submitted: April 29, 2020

Keywords: [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE], [en] Deep learning, Gated Recurrent Unit, Sentiment analysis, ABSA; [fr] Analyse des sentiments, ABSA, Apprentissage profond, Unité récurrente à portes
