Architectural Patterns for Immutable Financial Data Lakes on AWS S3 for Regulatory Compliance and Real-Time Analytics

Authors

  • Gunjan Kumar, Independent Research

DOI:

https://doi.org/10.56127/ijst.v2i1.2316

Keywords:

Immutable Data Lake, Financial Data, AWS S3, Regulatory Compliance, Real-Time Analytics, Object Lock, Data Governance, Fraud Detection, Cloud-Native Architecture, Streaming Data

Abstract

The financial services industry increasingly relies on the secure and dependable storage of large volumes of structured and unstructured data. As global regulatory requirements such as SEC Rule 17a-4, the Sarbanes-Oxley Act (SOX), and the General Data Protection Regulation (GDPR) grow stricter, organizations face the twin demands of maintaining compliance and immutability while staying agile enough in analytics to support decision-making. This paper explores the design of an immutable financial data lake built on Amazon Web Services (AWS) Simple Storage Service (S3) and a set of AWS-native tools. In particular, it examines how S3 Versioning, S3 Object Lock, and write-once-read-many (WORM) policies can be combined with AWS Glue, Amazon Athena, Amazon Redshift, and Amazon Kinesis to create a secure, compliant, and high-performing architecture. By applying architectural patterns based on cloud-native principles, the study illustrates how financial institutions can meet strict auditability standards while still benefiting from real-time analytics for fraud detection, risk management, and market research. It also presents the governance, lineage, and security structures required to maintain data integrity while promoting innovation. The results indicate that properly designed, cloud-based immutable data lakes can provide both compliance assurance and a stable foundation for modern financial intelligence.
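
As a rough illustration of the WORM building blocks named in the abstract, the sketch below assembles the request parameters for enabling S3 Versioning and a default Object Lock retention rule in compliance mode. It is not taken from the paper. The bucket name, the retention period, and the helper names `worm_bucket_requests` and `apply_requests` are hypothetical, and the sketch assumes Object Lock has already been enabled for the bucket (for example, at creation time). The dictionary keys follow the boto3 S3 client methods `put_bucket_versioning` and `put_object_lock_configuration`.

```python
from typing import Any, Dict


def worm_bucket_requests(bucket: str, retention_years: int) -> Dict[str, Dict[str, Any]]:
    """Build keyword arguments for the two S3 calls that give a bucket
    versioning plus a default WORM retention rule.

    The returned mapping is keyed by boto3 S3 client method name.
    retention_years is an illustrative input; the right value depends on
    the regulation and record type and is not specified here.
    """
    if retention_years <= 0:
        raise ValueError("retention_years must be positive")
    return {
        # Object Lock only works on versioned buckets, so this goes first.
        "put_bucket_versioning": {
            "Bucket": bucket,
            "VersioningConfiguration": {"Status": "Enabled"},
        },
        # COMPLIANCE mode: locked object versions cannot be overwritten or
        # deleted by any user until the retention period expires.
        "put_object_lock_configuration": {
            "Bucket": bucket,
            "ObjectLockConfiguration": {
                "ObjectLockEnabled": "Enabled",
                "Rule": {
                    "DefaultRetention": {
                        "Mode": "COMPLIANCE",
                        "Years": retention_years,
                    }
                },
            },
        },
    }


def apply_requests(s3_client: Any, requests: Dict[str, Dict[str, Any]]) -> None:
    """Send each prepared request through an S3 client, in insertion order.

    s3_client is expected to be something like boto3.client("s3"); it is
    passed in so the request-building step stays testable offline.
    """
    for method_name, kwargs in requests.items():
        getattr(s3_client, method_name)(**kwargs)


if __name__ == "__main__":
    import json

    # Hypothetical bucket name and retention period, for illustration only.
    print(json.dumps(worm_bucket_requests("example-ledger-archive", 7), indent=2))
```

Keeping request construction separate from the client call lets the retention policy be reviewed and unit-tested without AWS credentials. In a real deployment the same parameters would more likely live in infrastructure-as-code templates.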

Published

2023-02-27

How to Cite

Gunjan Kumar. (2023). Architectural Patterns for Immutable Financial Data Lakes on AWS S3 for Regulatory Compliance and Real-Time Analytics. International Journal Science and Technology, 2(1), 80–94. https://doi.org/10.56127/ijst.v2i1.2316
