Data Lake for demand planning

Challenge

The aim of the project was to build an efficient environment for analyzing energy consumption data. The scope covered the architecture of the Data Lake environment and the data flows from several source systems. The business goal was efficient forecasting of energy consumption; the technical goal was to optimize the existing data transformation and reporting processes.

Solution

  • Apache Hadoop (HDFS, Hive, Spark)
  • Airflow
  • Zeppelin Notebook
  • Oracle (source system)
  • SAS
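
The components listed above are typically chained into a scheduled pipeline. The following is a minimal sketch of that idea in plain Python, not the project's actual implementation: the task names are hypothetical, and the standard-library graphlib module stands in for a scheduler such as Airflow to show how step dependencies determine the execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical task names: the case study does not list the real pipeline steps.
# Each key maps to the set of tasks that must finish before it can run.
PIPELINE = {
    "extract_oracle": set(),
    "load_to_hdfs": {"extract_oracle"},
    "transform_with_spark": {"load_to_hdfs"},
    "publish_hive_tables": {"transform_with_spark"},
    "refresh_reports": {"publish_hive_tables"},
}


def execution_order(pipeline):
    """Return task names in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    for step, task in enumerate(execution_order(PIPELINE), start=1):
        print(f"{step}. {task}")
```

In a real deployment each task would be an operator in an Airflow DAG rather than a dictionary entry; the dependency structure is the part this sketch is meant to illustrate.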

Result

  • Cluster installation and configuration (OS, cluster components, security)
  • Integration of the cluster with external systems
  • Reports for analyzing energy consumption data
  • Development of a data repository built on distributed processing technology
  • Reduction of billing report preparation time from several hours to several seconds
  • More effective sales analyses from a cross-channel perspective