Clustering-based Graph Numbering using Execution Traces for Cache Misses Reduction in Graph Analysis Applications

Régis Audran Mogo Wafo; Thomas Messi Nguelé; Armel Jacques Nzekon Nzeko'o; Djam Youh Xaviera

doi:10.46298/arima.13538

Régis Audran Mogo Wafo ; Thomas Messi Nguelé ; Armel Jacques Nzekon Nzeko'o ; Djam Youh Xaviera - Clustering-based Graph Numbering using Execution Traces for Cache Misses Reduction in Graph Analysis Applications

arima:13538 - Revue Africaine de Recherche en Informatique et Mathématiques Appliquées, 8 mars 2025, Volume 43 - 2025 - https://doi.org/10.46298/arima.13538

Clustering-based Graph Numbering using Execution Traces for Cache Misses Reduction in Graph Analysis ApplicationsArticle

Auteurs : Régis Audran Mogo Wafo ¹; Thomas Messi Nguelé ¹; Armel Jacques Nzekon Nzeko'o ¹; Djam Youh Xaviera

1 Université de Yaoundé I [UY1]

Social graph analysis is generally based on a local exploration of the underlying graph. That is, the analysis of a node of the graph is often done after having analyzed nodes located in its vicinity. However, over the time, networks are bound to grow with the addition of new members, which inevitably leads to the enlargement of the corresponding graphs. At this level we therefore have a problem because more the size of the graph increases, more the execution time of graph analysis applications too. This is due to the very large number of nodes that will need to be treated. Some recent work in-faces this problem by exploiting the properties of social networks such as the community structure to renumber the nodes of the graph in order to reduce cache misses. Reducing cache misses in an application allows to reduce the execution time of this application. In this paper, we argue that combining existing graph ordering with a new numbering that exploit execution traces analysis can allow to improve cache misses reduction and hence execution time reduction. The idea is to build graph numbering using execution traces of graph analysis applications and then combine it with an existing graph numbering (such as cn-order). To build this new ordering, we define a new distance and then used it to analyse execution traces with well known clustering algorithms K-means (for Kmeans-order) and hierarchical clustering (for cl-hier-order). Experiments on a user machine (dual-core) and four cores of Grid'5000 node (Neowise) show that this combination improves slightly existing graph ordering (cn-order, numbaco, rabbit and gorder) in almost all the cases (the two cores of dual-core, all the four cores of neowise), with PageRank graph application and astro-ph dataset. For example, on neowise with one thread and Astro-ph dataset, the best performance is given with the combination kmeans-order_cn-order which allows to reduce by 42.59% the cache misses (compared to the second numbaco with 40.79%) and therefore by 7.27 % the time of execution (compared to 6.89% for the second numbaco).

https://doi.org/10.46298/arima.13538

Source : HAL:hal-03687348v4

Volume : Volume 43 - 2025

Publié le : 8 mars 2025

Accepté le : 24 février 2025

Soumis le : 3 mai 2024

Mots-clés : [INFO]Computer Science [cs], [STAT.ML]Statistics [stat]/Machine Learning [stat.ML], [en] Graph analysis, Execution trace, Machine learning, Cache misses reduction

Licence : Hal authorisation v1

Références bibliographiques

Partager et exporter

Statistiques de consultation

Cette page a été consultée 532 fois.

Le PDF de cet article a été téléchargé 280 fois.