Challenge:

A banking customer was looking for a solution to handle the large data volumes and data processing issues it faced on its on-premises infrastructure.

Solution:

We implemented a Snowflake cloud-based architecture. The key components of our solution included:

Data Sources:

Compressed files, CSV files, AWS S3 data sources, and local files.

Log Collection:

Amazon CloudWatch was used to collect the logs.

Data Ingestion:

AWS Glue workflows and Amazon EMR access the data from S3. An AWS Glue crawler populates the AWS Glue Data Catalog with databases and tables.
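As a rough illustration of the crawler step, the sketch below builds the request for a Glue crawler that scans an S3 prefix and registers what it finds in the Data Catalog. It assumes boto3 and valid AWS credentials for the actual call. The crawler name, IAM role ARN, database name, and S3 path are hypothetical placeholders, not values from this project.

```python
def build_crawler_request(name, role_arn, database, s3_path):
    """Return keyword arguments for the boto3 Glue `create_crawler` call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,  # Data Catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(request):
    """Create the crawler and start it (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# All values below are hypothetical placeholders.
crawler_request = build_crawler_request(
    name="banking-raw-crawler",
    role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    database="banking_raw",
    s3_path="s3://example-bank-landing/raw/",
)
```

Calling `create_and_start_crawler(crawler_request)` would then create and run the crawler in the target account. Keeping the request builder separate from the AWS call makes the configuration easy to review and test offline.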

Data Processing:

AWS Glue workflows were used to create and visualize ETL activities involving multiple crawlers and jobs, with triggers to manage the execution and monitoring of all components.
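The chaining of crawlers and jobs through triggers could look something like the following sketch. It assumes a single crawler followed by a single ETL job. The workflow, crawler, and job names are hypothetical, and the actual deployment function needs boto3 and AWS credentials.

```python
def build_workflow_triggers(workflow, crawler, job):
    """Return two trigger definitions: start the crawler, then run the job."""
    start = {
        "Name": f"{workflow}-start",
        "WorkflowName": workflow,
        "Type": "ON_DEMAND",  # fired when the workflow run is started
        "Actions": [{"CrawlerName": crawler}],
    }
    after_crawl = {
        "Name": f"{workflow}-after-crawl",
        "WorkflowName": workflow,
        "Type": "CONDITIONAL",  # fires only when the predicate holds
        "StartOnCreation": True,
        "Predicate": {
            "Conditions": [
                {
                    "LogicalOperator": "EQUALS",
                    "CrawlerName": crawler,
                    "CrawlState": "SUCCEEDED",
                }
            ]
        },
        "Actions": [{"JobName": job}],
    }
    return [start, after_crawl]


def deploy_workflow(workflow, triggers):
    """Create the workflow and its triggers, then start a run (needs boto3)."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_workflow(Name=workflow)
    for trigger in triggers:
        glue.create_trigger(**trigger)
    glue.start_workflow_run(Name=workflow)


# Hypothetical names, used for illustration only.
triggers = build_workflow_triggers(
    workflow="banking-etl", crawler="banking-raw-crawler", job="banking-transform"
)
```

The conditional trigger is what lets the ETL job wait for a successful crawl, so the job always reads an up-to-date catalog.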
Amazon EMR is a cloud big data platform for processing large datasets using popular distributed frameworks such as Apache Spark, Hadoop, and others. It processes and analyzes vast amounts of data quickly and cost-effectively.
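To illustrate how a Spark job might be handed to an existing EMR cluster, the sketch below builds a step definition that runs `spark-submit` through EMR's `command-runner.jar`. The step name, script location, and arguments are hypothetical. Submitting the step assumes boto3, AWS credentials, and a running cluster whose ID is known.

```python
def build_spark_step(name, script_uri, args=()):
    """Return an EMR step definition that runs a PySpark script via spark-submit."""
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",  # keep the cluster alive if the step fails
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri, *args],
        },
    }


def submit_step(cluster_id, step):
    """Add the step to a running cluster and return its step ID (needs boto3)."""
    import boto3  # imported lazily so the builder above works without boto3

    emr = boto3.client("emr")
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]


# Hypothetical script location and arguments.
spark_step = build_spark_step(
    name="daily-transactions-aggregate",
    script_uri="s3://example-bank-code/jobs/aggregate.py",
    args=["--run-date", "2024-01-01"],
)
```

A call such as `submit_step(cluster_id, spark_step)` would queue the job on the cluster identified by `cluster_id`.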

Visualization and Reporting:

BI tools such as Tableau and Power BI were used to create dashboards and reports, leveraging AWS querying capabilities for both real-time and historical data.

Outcome:

By leveraging a hybrid architecture, we ensured that real-time data was stored and processed in the cloud without the challenges the customer had faced on-premises.

Benefits:

  • Real-time monitoring and incident response with on-premises storage.
  • Efficient long-term storage and querying of historical data.
  • Scalable solution capable of handling large volumes of security events.