statistics | BioStatMatt

After an embarrassing teleconference in which I presented a series of percentages that did not sum to 100 (as they should have), I found some R code on stackoverflow.com to help me avoid this in the future.

In general, the sum of rounded numbers (e.g., rounded with the base::round function) is not necessarily equal to the rounded sum of the original numbers. For example:

> sum(c(0.333, 0.333, 0.334))
[1] 1
> sum(round(c(0.333, 0.333, 0.334), 2))
[1] 0.99

The stackoverflow solution applies the following algorithm:

1. Round each number down to the specified number of decimal places.
2. Order the numbers by their remainders (the part discarded by rounding down).
3. Increment the specified decimal place of the 'k' values with the largest remainders, where 'k' is the number of increments needed for the rounded values to add up to the rounded sum of the original numbers.

Here's the corresponding R function:

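round_preserve_sum <- function(x, digits = 0) {
  # Scale so that the target decimal place becomes the units place
  up <- 10 ^ digits
  x <- x * up
  # Round every value down
  y <- floor(x)
  # Pick the k values with the largest remainders, where
  # k = (rounded sum) - (sum of the rounded-down values)
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  # Increment those values, then undo the scaling
  y[indices] <- y[indices] + 1
  y / up
}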
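
To make the three steps concrete, here is a rough trace of the intermediate values for the example vector with two decimal places. The variable names (scaled, remainders, k, result) are mine, chosen for illustration. The values in the comments are approximate because of floating-point arithmetic.

```r
x <- c(0.333, 0.333, 0.334)
digits <- 2

up <- 10 ^ digits                  # 100
scaled <- x * up                   # approximately 33.3 33.3 33.4

# Step 1: round down
y <- floor(scaled)                 # 33 33 33

# Step 2: order by remainder
remainders <- scaled - y           # approximately 0.3 0.3 0.4

# Step 3: increment the k values with the largest remainders
k <- round(sum(scaled)) - sum(y)   # 100 - 99 = 1
indices <- tail(order(remainders), k)
y[indices] <- y[indices] + 1       # 33 33 34

result <- y / up                   # 0.33 0.33 0.34
```

Only the third value is bumped up, since it has the largest remainder and a single increment is enough to recover the rounded sum.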

Continuing with the example:

> sum(c(0.333, 0.333, 0.334))
[1] 1
> sum(round(c(0.333, 0.333, 0.334), 2))
[1] 0.99
> sum(round_preserve_sum(c(0.333, 0.333, 0.334), 2))
[1] 1
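
The same idea applies to the percentage problem that prompted this post. The sketch below uses made-up counts (not data from the teleconference) and repeats the function definition so that it runs on its own. Here plain rounding overshoots rather than undershoots.

```r
round_preserve_sum <- function(x, digits = 0) {
  up <- 10 ^ digits
  x <- x * up
  y <- floor(x)
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y / up
}

# Hypothetical category counts, for illustration only
counts <- c(2, 2, 2, 1)
pct <- 100 * counts / sum(counts)   # approximately 28.57 28.57 28.57 14.29

plain <- round(pct)                 # 29 29 29 14
sum(plain)                          # 101

fixed <- round_preserve_sum(pct)
sum(fixed)                          # 100
```

Which of the tied values gets rounded up depends on how order() breaks ties, so I would not read anything into that choice. Each adjusted percentage still differs from its exact value by less than one unit of the last retained digit.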

Round values while preserving their rounded sum in R