The bit of R code below illustrates the principal curves methods as described inĀ The Elements of Statistical Learning, by Hastie, Tibshirani, and Friedman (Ch. 14; the book is freely available from the authors' website). Specifically, the code generates some bivariate data that have a nonlinear association, initializes the principal curve using the first (linear) principal component, and then computes three iterations of the algorithm described in section 14.5.2. I used the 'animation' package to generate the following animated GIF, which illustrates these steps.
## generate some bivariate data
set.seed(42)
x1 <- seq(1,10,0.3)
w = .6067;
a0 = 1.6345;
a1 = -.6235;
b1 = -1.3501;
a2 = -1.1622;
b2 = -.9443;
x2 = a0 + a1*cos(x1*w) + b1*sin(x1*w) + a2*cos(2*x1*w) +
b2*sin(2*x1*w) + rnorm(length(x1),0,3/4)
x <- scale(cbind(x1,x2))
alim <- extendrange(x, f=0.1)
alim_ <- range(x)
## plot centered data
plot(x[,1], x[,2], bty='n',
xlab=expression(x[1]),
ylab=expression(x[2]),
xlim=alim, ylim=alim)
legend("topleft", legend=c("Initialize"), bty="n")
## plot first principal component line
svdx <- svd(x)
clip(alim_[1],alim_[2],alim_[1],alim_[2])
with(svdx, abline(a=0, b=v[2,1]/v[1,1]))
## plot projections of each point onto line
z1 <- with(svdx, x%*%v[,1]%*%t(v[,1]))
segments(x0=x[,1],y0=x[,2],
x1=z1[,1],y1=z1[,2])
## compute initial lambda (arc-lengths associated with
## orthogonal projections of data onto curve)
lam <- with(svdx, as.numeric(u[,1]*d[1]))
for(itr in 1:3) {
#### step (a) of iterative algorithm ####
## compute scatterplot smoother in either dimension
## increase 'df' to make the curve more flexible
fit1 <- smooth.spline(x=lam, y=x[,1], df=4)
fit2 <- smooth.spline(x=lam, y=x[,2], df=4)
## plot data and the principal curve for a sequence of lambdas
plot(x[,1], x[,2], bty='n',
xlab=expression(x[1]),
ylab=expression(x[2]),
xlim=alim, ylim=alim)
legend("topleft", legend=c("Step (a)"), bty="n")
seq_lam <- seq(min(lam),max(lam),length.out=100)
lines(predict(fit1, seq_lam)$y, predict(fit2, seq_lam)$y)
## show points along curve corresponding
## to original lambdas
z1 <- cbind(predict(fit1, lam)$y, predict(fit2, lam)$y)
segments(x0=x[,1],y0=x[,2],
x1=z1[,1],y1=z1[,2])
#### step (b) of iterative algorithm ####
## recompute lambdas
euc_dist <- function(l, x, f1, f2)
sum((c(predict(f1, l)$y, predict(f2, l)$y) - x)^2)
lam <- apply(x,1,function(x0) optimize(euc_dist,
interval=extendrange(lam, f=0.50),
x=x0, f1=fit1, f2=fit2)$minimum)
## show projections associated with recomputed lambdas
plot(x[,1], x[,2], bty='n',
xlab=expression(x[1]),
ylab=expression(x[2]),
xlim=alim, ylim=alim)
legend("topleft", legend=c("Step (b)"), bty="n")
seq_lam <- seq(min(lam),max(lam),length.out=100)
lines(predict(fit1, seq_lam)$y, predict(fit2, seq_lam)$y)
z1 <- cbind(predict(fit1, lam)$y, predict(fit2, lam)$y)
segments(x0=x[,1],y0=x[,2],
x1=z1[,1],y1=z1[,2])
}
After optimizating the euclidean distances, could one say that this is a non-parametric equivalent to orthogonal nonlinear least-squares?
Cheers,
Andrej
I don't think so, but I'm not certain. After reading the 'onls' vignette (https://cran.r-project.org/web/packages/onls/vignettes/onls.pdf), it seems that orthogonal NLS is used to estimate a function of one of the inputs, whereas the principal curve need not be a function of either variable (the principal function can be defined in higher dimensions as well).