Objectives
The objective of my project is to build a connector between the HPCC world and the Spark world.
Motivation
HPCC is the heart of LN's business, and almost all of the important data is stored in HPCC systems. Hence, if an analyst or statistician wants to build a model on some or all of that data using Spark, she needs to either download the data to her local system or move it to a cluster. This can be time-consuming, and she must take great care to stay compliant with strict data rules.
A possible solution to this problem is a bridge between these two worlds (Spark and HPCC).
Expected Output
We expect our output to be a connector that, once installed, enables ECL programmers to use Spark algorithms on data stored in HPCC and enables PySpark programmers to use HPCC data.
GitHub Repo