Leveraging Graph Databases for Real-Time Analytics in Data Engineering Pipelines

Authors:
Prasad Sundaramoorthy

Addresses:
Department of Data and AI Analytics, Nordstrom, Seattle, Washington, United States of America.

Abstract:

Relational databases are struggling to meet real-time analytics expectations as data grows in volume and complexity. This article examines whether graph databases can alleviate these concerns in data engineering pipelines. Graph databases store data as nodes, edges, and properties, a structure that models complex connections between data points better than relational tables do. Social networking sites, fraud detection, and recommendation systems benefit from the fast, flexible querying this design allows. Real-time data engineering and analytics enable organizations to process and analyze data for fast insights and informed decision-making. Incorporating graph databases into data pipelines can enable real-time analysis by speeding up query responses and simplifying schema modification. The study uses an e-commerce dataset of 100,000 transactions, each with a user ID, product ID, amount, and timestamp, to demonstrate the real-time functionality of graph databases. Neo4j is compared with MySQL to examine how graph databases handle real-time analytics tasks such as fraud detection. According to the tests, graph databases enhance real-time analytics by accelerating response times and reducing resource requirements for complex queries. Finally, the article discusses how graph databases in data pipelines can influence future research and development.
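
As an illustration of the node, edge, and property model described above, the following minimal Python sketch loads a few transaction rows into an in-memory graph and runs a multi-hop traversal of the kind that would need repeated self-joins in a relational schema. It is not the study's implementation and does not use Neo4j or MySQL: the sample rows, the function names build_graph and users_within_hops, and the choice to treat users and products as nodes with transactions as edges are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical sample rows: (user_id, product_id, amount, timestamp).
# These are illustrative values, not data from the study.
transactions = [
    ("u1", "p1", 20.0, "2024-01-01T10:00:00"),
    ("u2", "p1", 35.5, "2024-01-01T10:05:00"),
    ("u2", "p2", 12.0, "2024-01-01T10:06:00"),
    ("u3", "p2", 99.9, "2024-01-01T10:07:00"),
    ("u4", "p3", 5.0, "2024-01-01T10:08:00"),
]


def build_graph(rows):
    """Build an undirected adjacency map of user and product nodes.

    Each edge stores the transaction's amount and timestamp as properties.
    In this simplified sketch a repeat purchase overwrites the earlier edge.
    """
    adjacency = defaultdict(dict)
    for user_id, product_id, amount, timestamp in rows:
        props = {"amount": amount, "timestamp": timestamp}
        adjacency[("user", user_id)][("product", product_id)] = props
        adjacency[("product", product_id)][("user", user_id)] = props
    return adjacency


def users_within_hops(adjacency, start_user, max_hops):
    """Breadth-first traversal: other users reachable within max_hops edges."""
    start = ("user", start_user)
    seen = {start}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return sorted(
        name for kind, name in seen if kind == "user" and name != start_user
    )


graph = build_graph(transactions)
print(users_within_hops(graph, "u1", 2))  # users sharing a product with u1
print(users_within_hops(graph, "u1", 4))  # one further level of shared products
```

Each additional level of connection costs two more hops in the traversal, whereas an equivalent SQL query over a single transactions table would typically add another self-join per level. A fraud-detection check could, under the same assumptions, filter the traversed edges by their amount or timestamp properties.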

Keywords: Graph Databases; Real-Time Analytics; Data Engineering; Data Pipelines; Performance Optimization; Customer Experience; Business Organizations; Data Management.

Received on: 01/08/2024, Revised on: 11/10/2024, Accepted on: 22/11/2024, Published on: 07/03/2025

DOI: 10.69888/FTSIN.2025.000367

FMDB Transactions on Sustainable Intelligent Networks, 2025 Vol. 2 No. 1, Pages: 22-32
