Round to Integer Keeping the Sum Fixed
roundfixS.RdGiven a real numbers \(y_i\) with the particular property that
\(\sum_i y_i\) is integer, find integer numbers \(x_i\)
which are close to \(y_i\) (\(\left|x_i - y_i\right| < 1 \forall i\)), and have identical “marginal”
sum, sum(x) == sum(y).
As I found later, the problem is known as “Apportionment Problem” and it is quite an old problem with several solution methods proposed historically, but only in 1982, Balinski and Young proved that there is no method that fulfills three natural desiderata.
Note that the (first) three methods currently available here were all (re?)-invented by M.Maechler, without any knowledge of the litterature. At the time of writing, I have not even checked to which (if any) of the historical methods they match.
Usage
roundfixS(x, method = c("offset-round", "round+fix", "1greedy"))Details
Without hindsight, it may be surprising that all three methods give
identical results (in all situations and simulations considered),
notably that the idea of ‘mass shifting’ employed in the
iterative "1greedy" algorithm seems equivalent to the much simpler
idea used in "offset-round".
I am pretty sure that these algorithms solve the \(L_p\) optimization problem, \(\min_x \left\|y - x\right\|_p\), typically for all \(p \in [1,\infty]\) simultaneously, but have not bothered to find a formal proof.
Value
a numeric vector, say r, of the same length as x, but
with integer values and fulfulling sum(r) == sum(x).
References
Michel Balinski and H. Peyton Young (1982) Fair Representation: Meeting the Ideal of One Man, One Vote;
https://en.wikipedia.org/wiki/Apportionment_paradox
https://www.ams.org/samplings/feature-column/fcarc-apportionii3
See also
round etc
Examples
## trivial example
kk <- c(0,1,7)
stopifnot(identical(kk, roundfixS(kk))) # failed at some point
x <- c(-1.4, -1, 0.244, 0.493, 1.222, 1.222, 2, 2, 2.2, 2.444, 3.625, 3.95)
sum(x) # an integer
#> [1] 17
r <- roundfixS(x)
stopifnot(all.equal(sum(r), sum(x)))
m <- cbind(x=x, `r2i(x)` = r, resid = x - r, `|res|` = abs(x-r))
rbind(m, c(colSums(m[,1:2]), 0, sum(abs(m[,"|res|"]))))
#> x r2i(x) resid |res|
#> [1,] -1.400 -1 -0.400 0.400
#> [2,] -1.000 -1 0.000 0.000
#> [3,] 0.244 0 0.244 0.244
#> [4,] 0.493 1 -0.507 0.507
#> [5,] 1.222 1 0.222 0.222
#> [6,] 1.222 1 0.222 0.222
#> [7,] 2.000 2 0.000 0.000
#> [8,] 2.000 2 0.000 0.000
#> [9,] 2.200 2 0.200 0.200
#> [10,] 2.444 2 0.444 0.444
#> [11,] 3.625 4 -0.375 0.375
#> [12,] 3.950 4 -0.050 0.050
#> [13,] 17.000 17 0.000 2.664
chk <- function(y) {
cat("sum(y) =", format(S <- sum(y)),"\n")
r2 <- roundfixS(y, method="offset")
r2. <- roundfixS(y, method="round")
r2_ <- roundfixS(y, method="1g")
stopifnot(all.equal(sum(r2 ), S),
all.equal(sum(r2.), S),
all.equal(sum(r2_), S))
all(r2 == r2. & r2. == r2_) # TRUE if all give the same result
}
makeIntSum <- function(y) {
n <- length(y)
y[n] <- ceiling(y[n]) - (sum(y[-n]) %% 1)
y
}
set.seed(11)
y <- makeIntSum(rnorm(100))
chk(y)
#> sum(y) = -12
#> [1] TRUE
## nastier example:
set.seed(7)
y <- makeIntSum(rpois(100, 10) + c(runif(75, min= 0, max=.2),
runif(25, min=.5, max=.9)))
chk(y)
#> sum(y) = 1128
#> [1] TRUE
if (FALSE) { # \dontrun{
for(i in 1:1000)
stopifnot(chk(makeIntSum(rpois(100, 10) +
c(runif(75, min= 0, max=.2),
runif(25, min=.5, max=.9)))))
} # }