Mohamed Tarek

Mohamed Tarek is a Data Engineer with a great passion about data and all technologies around data Ecosystem, he specialized in data integration and data streaming technologies, he is a part of a development team who delivers end to end solutions including data integration, data model designs, and data analytics.

Mohamed Tarek

Introduction to Impala .. Architecture and Components | Impala

Cloudera Impala provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS or HBase. In addition to using the same unified storage platform, Impala also uses the same metadata, SQL syntax (Hive SQL), ODBC driver, and user interface (Cloudera Impala query UI in Hue) as Apache Hive. This provides a familiar …

Introduction to Impala .. Architecture and Components | Impala Read More »

Aggregation Queries in Apache Hive | Apache Hive

Introduction Data aggregation is the process of gathering and expressing data in a summary to get more information about particular groups based on specific conditions. HiveQL offers several built-in aggregate functions, such as max, min, avg,..etc. It also supports advanced aggregation using keywords such as Variance and Standard Deviation and different types of window functions. …

Aggregation Queries in Apache Hive | Apache Hive Read More »

Apache Kafka and Apache Spark Integration | Apache Kafka | Apache Spark

Introduction Apache Kafka is a scalable, high performance, low latency platform that allows reading and writing streams of data like a messaging system. We can start writing Kafka applications using Java fairly easily, check our previous article on how to design a Kafka pipeline in Java. If you research the variety of real-world use-cases for Kafka, you will very …

Apache Kafka and Apache Spark Integration | Apache Kafka | Apache Spark Read More »

Create a Kafka Pipeline using Java Application | Apache Kafka

Introduction This Article is about Programming Apache Kafka producer and consumer using Java language, as we’ll see, using Java we’ll be able to reproduce what the CLI does and even more. Prerequisites Kafka Installation and configuration article ( To setup cluster will be used in this article) Any java programming editor Ex. (Netbeans – IntelliJ …

Create a Kafka Pipeline using Java Application | Apache Kafka Read More »

Setup Apache Kafka Environment | Apache Kafka

Introduction This article is about configuring and starting an Apache Kafka server on a Windows OS and Linux. This guide will also provide instructions to set up Java and Apache Zookeeper, and after the setup we will create a simple pipeline to test our installation. Kafka on windows Make sure you have the following prerequisites …

Setup Apache Kafka Environment | Apache Kafka Read More »