Apache Spark: A promising framework for the Big Data world!

Apache Spark™ is an open-source cluster computing framework for data analytics, originally developed in the AMPLab at UC Berkeley. It is a fast, general-purpose engine for large-scale data processing. You can think of it as an engine that broadens the range of computing workloads a Hadoop cluster can handle, while also improving performance by keeping data in memory during execution. It is a standalone project, but it is designed to work with, or on top of, the Hadoop Distributed File System (HDFS).

The ecosystem of Spark projects. Source: Databricks


Below is a very useful training video for Spark beginners by Intellipaat (copyrighted by Intellipaat and shared through their YouTube channel). I hope it helps you get an initial idea of Spark!
