Apache Spark™ is an open-source cluster computing framework for data analytics, originally developed in the AMPLab at UC Berkeley. It is a fast, general-purpose engine for large-scale data processing. It can be thought of as an engine that broadens the range of computing workloads a Hadoop cluster can handle, while also improving performance by keeping data in memory during execution. It is a standalone project, but it is designed to work with, or on top of, the Hadoop Distributed File System (HDFS).
The ecosystem of Spark projects. Source: Databricks
Below is a link to a useful training video for Spark beginners by Intellipaat (copyrighted by Intellipaat and shared through their YouTube channel). It should help you get an initial idea of what Spark is.