Modularising code

coding-practices
Author

Shaun Nielsen

Published

November 27, 2023

I’ve been working on a project where I’ve had to incorporate better development practices. One issue was the long-winded functions that required breaking up into into smaller pieces for unit testing, and easy digestion by future developers.

Background

  • Working with non-programming but data/stat experts
  • Long functions
  • No testing

Example function

this_long_function <- function(data){
  
  data_tmp <-
    data |>  
    do_this |> 
    then_do_that |> 
    and_that
  
  population_dataset <-
    read_this_file()
  
  data_tmp <-
    data_tmp |> 
    join(population_dataset)
  
  data_model <- model_this(data_tmp)
  
  data_model_estimates <-
    data_model |> 
    predict()
  
  
  data_final <-
    data_model_estimates |> 
    do_this |> 
    then_do_that |> 
    and_that
  
  data_final
  
}

Issues: How do we easily understand what each step is doing and further ensure they’re doing them correctly?

Modular code

structure_input_data <- function(data){
data |>  
do_this |> 
then_do_that |> 
and_that
}

join_population_data <- function(data){

population_dataset <-
read_this_file()

data |> 
join(population_dataset)
}

# ...

this_long_function <- function(data){

data_tmp <-
structure_input_data()

data_tmp <-
join_population_data()

# ...

}

Testing modular code

Now we can test the functionality and expected outputs of the functions

test_that("structure_input_data works", {
  
  # A test dataset
  test_data =
    data.frame(
      year = c(2022, 2021, 2022, 2021),
      sex = c('male', 'male', 'female', 'female'),
      value = c(4,3,2,1)
    )
  
  # What is to be expected
  expected_data =
    data.frame(
      ReportingValue1 = c(2022, 2021, 2022, 2021),
      ReportingValue2 = c('male', 'male', 'female', 'female'),
      Value = c(4,3,2,1)
    )
  
  # The output from the function
  result <- 
    structure_input_data(test_data)
  
  # A number of tests
  testthat::expect_true(
    names(result) == names(expected_data)
  )
  
   testthat::expect_true(
    result$ReportingValue1 == test_data$year
  )
  
  
})