The Knowledge Hub

Ensemble Learning | Machine Learning | Data Science

By Ahmed Abulkhair / November 15, 2020 / 0 Comments

Docker Commands | Dockers

By Mahmoud Fetiha / August 22, 2020 / 0 Comments

Introduction to Apache Flink | Apache Flink

By Ahmed Hesham / May 10, 2020 / 0 Comments

Functions in Scala – Part 1 | Scala

By Ahmed Ibrahem / April 12, 2020 / 0 Comments

Setup Free AWS RDS Instance | AWS RDS

By Ahmed Ibrahem / April 5, 2020 / 0 Comments

Apache Hive Table Types | Apache Hive

Apache Hive is designed to give data engineers and data scientists SQL-like access to the big data available in a Hadoop cluster, so we can think of it as a normal RDBMS. In a normal RDBMS we have database, and ...
Data Science Roadmap .. Concepts, Tools, and Technologies

In this article, we will outline some skills and concepts that must be learned on the journey to becoming a data scientist. But first, what is data science? Data Science is the art of uncovering the insights and trends in ...
Introduction to Hive | Apache Hive

Hive was initially developed by Facebook in 2007 to help the company handle massive amounts of new data. At the time Hive was created, Facebook had a 15TB dataset they needed to work with. A few short years later, that ...
Setup Talend Open Studio on Linux

Introduction: Talend is an open-source data integration platform. It provides different solutions and services for data integration, data quality, cloud storage, and Big Data. According to the latest Gartner report, Talend was named in the Leaders quadrant among other data integration ...
How to choose your ETL solution | Data Integration

ETL, which stands for Extract, Transform, Load, is a common concept in data engineering. As the name implies, it involves three types of operations: Extract, which refers to the process of extracting data ...
Apache Kafka and Apache Spark Integration | Apache Kafka | Apache Spark

Introduction: Apache Kafka is a scalable, high-performance, low-latency platform that allows reading and writing streams of data like a messaging system. We can start writing Kafka applications in Java fairly easily; check our previous article on how to design a Kafka pipeline ...