Apache Hudi: The Definitive Guide (for Duc ka) by Shiyan Xu Prashant Wason Bhavani Sudha Saktheeswaran Rebecca Bilbro

Author: Shiyan Xu, Prashant Wason, Bhavani Sudha Saktheeswaran, Rebecca Bilbro
Language: eng
Format: epub
Publisher: O'Reilly Media, Inc.
Published: 2024-10-02T00:00:00+00:00


Figure 1-6. Illustration to show how Hudi Merge-on-Read tables work.

Figure 1-6 illustrates how different queries work against a Merge-on-Read Hudi table. We focus on snapshot queries for now and discuss the other query types in later sections. A snapshot query that runs just after commit time T10 merges the data from the base file and the append logs on the fly for file groups 1 and 2. For file group 3, only the base file data is returned. For file group 4, only the merged data from the append logs is returned, since there is no base file. The periodic compaction process on the write side (discussed in Chapter 6) reconciles these changes to produce a new version of the base file that contains the latest updates, which keeps query performance in check. In Figure 1-6, compaction ran at commit time T5 and produced the T5 version of the base file in all relevant file groups.
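The per-file-group behavior described above can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not Hudi's reader implementation: the `FileGroup` class, the `snapshot_read` function, integer commit times, and the "latest commit wins" merge rule are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Record = Dict[str, object]


@dataclass
class FileGroup:
    """Toy file group (hypothetical): an optional base file plus append-log entries."""
    base_file: Optional[Dict[str, Record]] = None  # record key -> record
    # Each log entry is (commit_time, record_key, record).
    log_entries: List[Tuple[int, str, Record]] = field(default_factory=list)


def snapshot_read(group: FileGroup, as_of: int) -> Dict[str, Record]:
    """Merge base-file records with log entries committed at or before `as_of`."""
    merged = dict(group.base_file or {})  # start from the base file, if there is one
    for commit_time, key, record in sorted(group.log_entries, key=lambda e: e[0]):
        if commit_time <= as_of:
            merged[key] = record  # assumption: the later commit overwrites the earlier one
    return merged


# Base file plus logs: the two are merged on the fly (like file groups 1 and 2).
with_logs = FileGroup(
    base_file={"a": {"v": 1}, "b": {"v": 1}},
    log_entries=[(8, "a", {"v": 2}), (12, "b", {"v": 3})],
)
# Base file only: the base data is returned unchanged (like file group 3).
base_only = FileGroup(base_file={"c": {"v": 1}})
# Logs only: the result is built from log entries alone (like file group 4).
logs_only = FileGroup(log_entries=[(6, "d", {"v": 1}), (9, "d", {"v": 2})])

result_with_logs = snapshot_read(with_logs, as_of=10)
result_base_only = snapshot_read(base_only, as_of=10)
result_logs_only = snapshot_read(logs_only, as_of=10)
```

In this sketch the log entry written at commit time 12 is ignored by a read as of T10, which mirrors the idea that a snapshot query sees only what was committed before it ran.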

For the fraud detection team's use case described previously, the snapshot query issued at 10:47 am should return all the latest trips that lasted less than 3 minutes and took place in San Francisco between 10:32 am and 10:47 am. To answer it, the query merges the updates from the log files onto the base files on the fly, as needed, for the relevant file groups.
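To make the fraud detection query concrete, here is a small Python sketch of why the merge matters before filtering. Everything in it is an assumption made for illustration: the field names (`city`, `start_time`, `duration_min`), the trip IDs, the sample values, and the calendar date are hypothetical, and the merge is a plain dictionary overwrite rather than Hudi's actual record merging.

```python
from datetime import datetime
from typing import Dict, List

Trip = Dict[str, object]


def merge_snapshot(base: Dict[str, Trip], log_updates: Dict[str, Trip]) -> Dict[str, Trip]:
    """Overlay log-file updates on base-file records (simplified: log value wins)."""
    merged = dict(base)
    merged.update(log_updates)
    return merged


def short_sf_trips(trips: Dict[str, Trip], start: datetime, end: datetime,
                   max_minutes: float = 3) -> List[str]:
    """Return sorted IDs of San Francisco trips in [start, end] shorter than max_minutes."""
    return sorted(
        trip_id
        for trip_id, t in trips.items()
        if t["city"] == "San Francisco"
        and start <= t["start_time"] <= end
        and t["duration_min"] < max_minutes
    )


day = (2024, 1, 1)  # arbitrary date, only the time of day matters here
base = {
    "t1": {"city": "San Francisco", "start_time": datetime(*day, 10, 35), "duration_min": 12.0},
    "t2": {"city": "San Francisco", "start_time": datetime(*day, 10, 40), "duration_min": 2.0},
    "t3": {"city": "Oakland", "start_time": datetime(*day, 10, 41), "duration_min": 1.0},
}
log_updates = {
    # t1 was corrected after the base file was written.
    "t1": {"city": "San Francisco", "start_time": datetime(*day, 10, 35), "duration_min": 2.5},
    # t4 and t5 exist only in the log files so far.
    "t4": {"city": "San Francisco", "start_time": datetime(*day, 10, 45), "duration_min": 1.0},
    "t5": {"city": "San Francisco", "start_time": datetime(*day, 10, 20), "duration_min": 1.0},
}

window_start = datetime(*day, 10, 32)
window_end = datetime(*day, 10, 47)

base_only_result = short_sf_trips(base, window_start, window_end)
snapshot_result = short_sf_trips(merge_snapshot(base, log_updates), window_start, window_end)
```

Filtering the base files alone finds only `t2`, while filtering the merged view also finds the corrected `t1` and the newly logged `t4`, which is the freshness the snapshot query is meant to provide.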


