Data Engineering | Beginner
The 'Small Files' Problem in Data Lakes: Why Your Kafka Sink is Slow
The 'Small Files' Problem: The Data Lake Killer

Streaming data from Kafka into a data lake (like Amazon S3 or Azure Blob Storage) seems simple. However, if you write data as soon as it arrives, you will quickly hit the S…
Apr 20, 2026 | 2 min read
Deep Dive
#data-lake #s3 #kafka-connect