Existem dois elementos no exemplo de @ Peter, que podem ser úteis para desemaranhar:
(1) Especificação incorreta do modelo. Os modelos
yi=β0+β1xi+εi(1)
&
wi=γ0+γ1zi+ζi(2)
wi=yixi−−√zi=xi−−√
wi=β0z2i+β1+εiz2i−−−−−−−−−−−√(1)
yi=(γ0x−−√i+γ1x−−√i+ζix−−√i)2(2)
YXβ1=0WZ
YXWZW
EYx−−√=EY−−√z≈β0−−√+VarY8β3/20z
z
Seguindo o exemplo ...
set.seed(123)
x <- rnorm(100, 20, 2)
y <- rnorm(100, 20, 2)
w <- (y/x)^.5
z <- x^.5
wrong.model <- lm(w~z)
right.model <- lm(y~x)
x.vals <- as.data.frame(seq(15,25,by=.1))
names(x.vals) <- "x"
z.vals <- as.data.frame(x.vals^.5)
names(z.vals) <- "z"
plot(x,y)
lines(x.vals$x, predict(right.model, newdata=x.vals), lty=3)
lines(x.vals$x, (predict(wrong.model, newdata=z.vals)*z.vals)^2, lty=2)
abline(h=20)
legend("topright",legend=c("data","y on x fits","w on z fits", "truth"), lty=c(NA,3,2,1), pch=c(1,NA,NA,NA))
plot(z,w)
lines(z.vals$z,sqrt(predict(right.model, newdata=x.vals))/as.matrix(z.vals), lty=3)
lines(z.vals$z,predict(wrong.model, newdata=z.vals), lty=2)
lines(z.vals$z,(sqrt(20) + 2/(8*20^(3/2)))/z.vals$z)
legend("topright",legend=c("data","y on x fits","w on z fits","truth"),lty=c(NA,3,2,1), pch=c(1,NA,NA,NA))
yxwzw against z, might be tempted to think that intervening to increase z will reduce w—we can only hope & pray they don't succumb to a temptation we've all been incessantly warned against; that of confusing correlation with causation.
Aldrich (2005), "Correlations Genuine and Spurious in Pearson and Yule", Statistical Science, 10, 4 provides an interesting historical perspective on these issues.