Round values while preserving their rounded sum in R

After an embarrassing teleconference in which I presented a series of percentages that did not sum to 100 (as they should have), I found some R code on stackoverflow.com to help me avoid this in the future.

In general, the sum of rounded numbers (e.g., using the base::round function) is not the same as their rounded sum. For example:

> sum(c(0.333, 0.333, 0.334))
[1] 1
> sum(round(c(0.333, 0.333, 0.334), 2))
[1] 0.99

The Stack Overflow solution applies the following algorithm:

    1. Round down to the specified number of decimal places
    2. Order numbers by their remainder values
    3. Increment the specified decimal place of the 'k' values with the largest remainders, where 'k' is the number of values that must be incremented to preserve their rounded sum
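
To make the three steps concrete, here is a rough step-by-step trace on the example vector from above. The variable names (scaled, floored, remainders, ranking, bump, result) are mine and are not part of the Stack Overflow answer; treat this as a sketch of the idea rather than the original code.

```r
x <- c(0.333, 0.333, 0.334)
digits <- 2
scaled <- x * 10^digits                  # roughly 33.3 33.3 33.4

# Step 1: round down at the chosen decimal place.
floored <- floor(scaled)                 # 33 33 33

# Step 2: order the values by their remainders (smallest first).
remainders <- scaled - floored           # roughly 0.3 0.3 0.4
ranking <- order(remainders)

# Step 3: k is the gap between the rounded total and the floored total.
k <- round(sum(scaled)) - sum(floored)   # 100 - 99 = 1
bump <- tail(ranking, k)                 # position of the largest remainder
floored[bump] <- floored[bump] + 1       # 33 33 34

result <- floored / 10^digits            # 0.33 0.33 0.34
```

Only the third value is incremented, because it has the largest remainder and only one increment is needed to reach the rounded total.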

Here's the corresponding R function:

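round_preserve_sum <- function(x, digits = 0) {
  # Scale so that the target decimal place becomes the units place
  up <- 10 ^ digits
  x <- x * up
  # Step 1: round everything down
  y <- floor(x)
  # Steps 2 and 3: take the k values with the largest remainders, where
  # k = (rounded total) - (sum of the rounded-down values)
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  # Scale back to the original units
  y / up
}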

Continuing with the example:

> sum(c(0.333, 0.333, 0.334))
[1] 1
> sum(round(c(0.333, 0.333, 0.334), 2))
[1] 0.99
> sum(round_preserve_sum(c(0.333, 0.333, 0.334), 2))
[1] 1
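
The same function can be applied to the case that prompted this post: percentages that should add up to 100. The snippet below is a small sketch with made-up survey counts (the counts and the names yes/no/unsure are hypothetical); the function is repeated only so that the snippet runs on its own.

```r
# Copy of the function from this post, repeated so the snippet is self-contained.
round_preserve_sum <- function(x, digits = 0) {
  up <- 10 ^ digits
  x <- x * up
  y <- floor(x)
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y / up
}

# Hypothetical survey counts: three categories with equal counts.
counts <- c(yes = 1, no = 1, unsure = 1)
pct <- 100 * counts / sum(counts)    # 33.33... for each category

naive <- round(pct)                  # each category rounds to 33
fixed <- round_preserve_sum(pct)     # one category is bumped to 34

sum(naive)   # 99
sum(fixed)   # 100
```

Note that when remainders are tied, as they are here, one of the equal inputs still has to receive the extra unit, so identical values can end up with different rounded results. Which one is bumped depends on how order() breaks ties.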