Prakash is an experienced Data Engineer with a background in both traditional data warehousing and Big Data and Analytics. Before Intersys, Prakash played a lead role in successfully implementing an enterprise-level ETL architecture and a Hadoop data lake in the healthcare vertical. Prakash enjoys learning and exploring new open source technologies in the NoSQL, Data Analytics, and Hadoop areas.

Posts by Prakash Singh

Apache NiFi as an Orchestration Engine

Orchestration of services is a pivotal part of Service Oriented Architecture (SOA). Apache NiFi provides a simple, highly configurable web-based user interface for designing an orchestration framework that addresses enterprise-level data flow and orchestration needs together. While most other frameworks are primarily for service orchestration only, NiFi can …

Spark RDD vs Spark SQL Performance comparison using Spark Java APIs

Resilient Distributed Dataset (RDD) is the main abstraction of the Spark framework, while Spark SQL (a Spark module for structured data processing) gives Spark more information about the structure of both the data and the computation being performed, and uses that information to perform extra optimizations. Up until Spark 1.6, RDDs used to perform …