Hadoop is synonymous with Big Data. In this course we learn about Hadoop storage system, learn to extract data and also develop various machine learning models. We work with a number of applications that run on hadoop.

Apache Kafka is a streaming platform that enables one to organize and manage data from many different sources with one reliable, high performance system. A streaming platform provides not only the system to transport data, but all of the tools needed to connect data sources, applications, and data sinks to the platform. We carry out experiments to understand the complete ecosystem in depth.