Blogs - Alephys

Designing a Scalable Data Loading and Custom Logging Framework for ETL Jobs using Hive and PySpark

June 3, 2025 Cloudera

Introduction: Efficient ETL (Extract, Transform, Load) pipelines are the backbone of modern data processing architectures. However, building reliable pipelines requires more than just moving data; it...

Creating a Custom HTTP Source Connector for Kafka

May 15, 2025 Confluent

Introduction: Apache Kafka has become the backbone of modern data pipelines, enabling real-time data streaming at scale. While Kafka provides many built-in connectors through its Connect API, sometimes...

Unlocking the Power of Databricks Serverless Compute for Everyone: A Game-Changer for Data Teams

May 14, 2025 Databricks

As cloud computing has transformed the technology landscape, we keep searching for better, faster, and cheaper ways to manage resources. Databricks Serverless Compute offers a practical solution for...

Cloudera Navigator to Apache Atlas Migration

April 26, 2025 Cloudera

Introduction: Organizations using CDH for their Big Data requirements typically rely on Cloudera Navigator for features like search, auditing, and data lifecycle management. However, with the advent of...

Our Locations: Hyderabad, Texas, Singapore
