Statistical Data Cleaning with Applications in R by Mark van der Loo & Edwin de Jonge

Author: Mark van der Loo & Edwin de Jonge
Language: eng
Format: epub
ISBN: 9781118897133
Publisher: Wiley
Published: 2018-01-25


6.5.2 Validating in the Pipeline

The pipe operator (%>%) of the magrittr package (Bache and Wickham, 2014) makes it easy to chain consecutive data manipulations on a dataset. The functions check_that and confront have been designed to work with the pipe operator: both take the dataset as their first argument. For example, we can do the following.

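    library(validate)   # check_that(), validator(), confront(); also ships the retailers data
    library(magrittr)   # the pipe operator %>%
    data(retailers)

    retailers %>% check_that(turnover >= 0, staff >= 0) %>% summary()
    ##   rule items passes fails nNA error warning               expression
    ## 1   V1    60     56     0   4 FALSE   FALSE (turnover - 0) >= -1e-08
    ## 2   V2    60     54     0   6 FALSE   FALSE    (staff - 0) >= -1e-08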

For more involved checks, it is more convenient to define a validator object first.

    v <- validator(turnover >= 0, staff >= 0)
    retailers %>% confront(v) %>% summary()
    ##   rule items passes fails nNA error warning               expression
    ## 1   V1    60     56     0   4 FALSE   FALSE (turnover - 0) >= -1e-08
    ## 2   V2    60     54     0   6 FALSE   FALSE    (staff - 0) >= -1e-08
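To see why these functions fit into a pipeline, recall that x %>% f(y) is evaluated as f(x, y), so any function that takes the dataset as its first argument can be piped into. The following is a minimal sketch of that equivalence. It assumes the validate and magrittr packages are installed, and it uses a small made-up data frame (the name toy and its values are hypothetical, not taken from the retailers data).

```r
library(validate)
library(magrittr)

# Hypothetical toy data, invented for illustration only
toy <- data.frame(
  turnover = c(100, -5, NA, 20),
  staff    = c(3, 2, 1, -1)
)

v <- validator(turnover >= 0, staff >= 0)

# Piped form: toy is passed as the first argument of confront(),
# and the resulting object is passed on to summary()
piped <- toy %>% confront(v) %>% summary()

# Equivalent nested form without the pipe
nested <- summary(confront(toy, v))

print(piped)
print(nested)
```

Both calls should print the same summary. The piped form reads in the order in which the steps are carried out, which becomes more useful as more steps are added.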


