Variance of the product of k correlated random variables


Answers:


12

More information on this topic than you probably need can be found in Goodman (1962): "The Variance of the Product of K Random Variables", which derives formulas for both independent random variables and potentially correlated random variables, along with some approximations. An earlier paper (Goodman, 1960) derives the formula for the product of exactly two random variables, which is somewhat simpler (though still fairly complicated), so it may be a better place to start if you want to understand the derivation.

For completeness, though, it goes like this.

Two variables

Assume the following:

  • $x$ and $y$ are two random variables
  • $X$ and $Y$ are their (non-zero) expectations
  • $V(x)$ and $V(y)$ are their variances
  • $\delta_x = (x - X)/X$ (and likewise for $\delta_y$)
  • $D_{i,j} = E\left[(\delta_x)^i (\delta_y)^j\right]$
  • $\Delta_x = x - X$ (and likewise for $\Delta_y$)
  • $E_{i,j} = E\left[(\Delta_x)^i (\Delta_y)^j\right]$
  • $G(x)$ is the squared coefficient of variation: $V(x)/X^2$ (and likewise for $G(y)$)

Then:

$$V(xy) = (XY)^2\left[G(y) + G(x) + 2D_{1,1} + 2D_{1,2} + 2D_{2,1} + D_{2,2} - D_{1,1}^2\right]$$

or equivalently:

$$V(xy) = X^2 V(y) + Y^2 V(x) + 2XY E_{1,1} + 2X E_{1,2} + 2Y E_{2,1} + E_{2,2} - E_{1,1}^2$$
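As a sanity check, the second form can be evaluated exactly on a small discrete joint distribution and compared with the variance computed directly from its definition. This is only a sketch: the support points `xs`, `ys` and the probability table `p` are made up for illustration, and the helper names `expect` and `E` are not from the papers.

```python
import numpy as np

# Hypothetical joint distribution of (x, y); values and probabilities
# are invented for illustration only.
xs = np.array([1.0, 2.0, 4.0])
ys = np.array([0.5, 1.5, 3.0])
p = np.array([[0.10, 0.05, 0.05],
              [0.05, 0.30, 0.10],
              [0.05, 0.10, 0.20]])  # p[a, b] = P(x = xs[a], y = ys[b])

def expect(f):
    """Expectation of f(x, y) under the joint distribution p."""
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    return float(np.sum(p * f(gx, gy)))

X = expect(lambda x, y: x)
Y = expect(lambda x, y: y)
Vx = expect(lambda x, y: (x - X) ** 2)
Vy = expect(lambda x, y: (y - Y) ** 2)

def E(i, j):
    """Central mixed moment E_{i,j} = E[(x - X)^i (y - Y)^j]."""
    return expect(lambda x, y: (x - X) ** i * (y - Y) ** j)

# Two-variable formula in terms of the E_{i,j}.
formula = (X**2 * Vy + Y**2 * Vx + 2 * X * Y * E(1, 1)
           + 2 * X * E(1, 2) + 2 * Y * E(2, 1) + E(2, 2) - E(1, 1) ** 2)

# Direct definition: V(xy) = E[(xy)^2] - E[xy]^2.
direct = expect(lambda x, y: (x * y) ** 2) - expect(lambda x, y: x * y) ** 2

print(formula, direct)
```

The two printed numbers should agree up to floating-point rounding, because the formula is an exact identity rather than an approximation.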

More than two variables

The 1960 paper suggests that this is an exercise for the reader (which appears to have motivated the 1962 paper!).

The notation is similar, with a few extensions:

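  • $(x_1, x_2, \ldots, x_k)$ are the random variables instead of $x$ and $y$
  • $M = E\left(\prod_{i=1}^k x_i\right)$
  • $A = \left(M / \prod_{i=1}^k X_i\right) - 1$
  • $s_i$ = 0, 1, or 2 for $i = 1, 2, \ldots, k$
  • $u$ = number of 1's in $(s_1, s_2, \ldots, s_k)$
  • $m$ = number of 2's in $(s_1, s_2, \ldots, s_k)$
  • $D(u,m) = 2^u - 2$ for $m = 0$ and $2^u$ for $m \ge 1$
  • $C(s_1, s_2, \ldots, s_k) = D(u,m) \cdot E\left(\prod_{i=1}^k \delta_{x_i}^{s_i}\right)$
  • $\sum_{s_1 \cdots s_k}$ indicates summation over the $3^k - k - 1$ sets $(s_1, s_2, \ldots, s_k)$ for which $2m + u > 1$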

Then, at long last:

$$V\left(\prod_{i=1}^k x_i\right) = \prod_{i=1}^k X_i^2 \left(\sum_{s_1 \cdots s_k} C(s_1, s_2, \ldots, s_k) - A^2\right)$$

See the papers for details and slightly more tractable approximations!
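The general formula can be checked numerically by treating a data set as an equally weighted population, in which case the identity should hold up to rounding. The sketch below is one possible reading of the notation, not code from the papers: the function name `goodman_variance` is hypothetical, the data are simulated, and $D(u,m)$ is taken to be $2^u$ whenever $m \ge 1$, which is the choice that reproduces the two-variable formula.

```python
import itertools
import numpy as np

def goodman_variance(samples):
    """Variance of the row-wise product of `samples` via the general-k
    formula, treating the rows as an equally weighted population."""
    samples = np.asarray(samples, dtype=float)
    k = samples.shape[1]
    means = samples.mean(axis=0)            # X_i (assumed non-zero)
    delta = (samples - means) / means       # delta_{x_i}
    M = samples.prod(axis=1).mean()         # E(prod_i x_i)
    A = M / means.prod() - 1.0

    total = 0.0
    for s in itertools.product((0, 1, 2), repeat=k):
        u, m = s.count(1), s.count(2)
        if 2 * m + u <= 1:                  # skip the k + 1 excluded sets
            continue
        D = 2**u - 2 if m == 0 else 2**u
        moment = np.prod(delta ** np.array(s), axis=1).mean()
        total += D * moment                 # this term is C(s_1, ..., s_k)

    return means.prod() ** 2 * (total - A**2)

# Simulated correlated data; the mean vector and covariance are made up.
rng = np.random.default_rng(0)
cov = [[1.0, 0.5, 0.2], [0.5, 2.0, -0.3], [0.2, -0.3, 1.5]]
data = rng.multivariate_normal([3.0, -2.0, 1.5], cov, size=2000)

direct = data.prod(axis=1).var()            # population variance (ddof=0)
via_formula = goodman_variance(data)
print(direct, via_formula)
```

The loop visits all $3^k$ index sets, so this is only practical for small $k$.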


Please note that the above answer from Matt Krause contains a mistake, as does the paper itself: in the definition of the function $C(s_1,\ldots,s_k)$ there should be a product instead of a sum.
Nicolas Gisler

Could you elaborate a little bit more..? "Because I - an anonymous person from the Internet - say so" is not really an answer...
Tim

If you try to get the variance var(x*y) for independent random variables, via the formula for arbitrary k you can see that only a product and not a sum gives you the correct answer. In addition, if you look at the paper you can see it as well, on page 59 of the paper (at least in my version) he used a product instead of a sum.
Nicolas Gisler

1
For the case of two random variables, an easier-to-read formula for the variance of the product of two correlated random variables can be found in this answer by @macro. This answer also points out the essential problem in
$$V(xy) = X^2 V(y) + Y^2 V(x) + 2XY E_{1,1} + 2X E_{1,2} + 2Y E_{2,1} + E_{2,2} - E_{1,1}^2,$$
viz., the thicket of notation conceals the essential fact that there are terms in it whose value cannot be determined unless we know cov(x2,y2), or enough about the joint density of the two random variables to determine this quantity.
Dilip Sarwate

An edit suggestion, that should really have been a comment, suggested that the original paper contained a typo where a sum and product were mixed up and this answer should be amended. See stats.stackexchange.com/review/suggested-edits/83662
Silverfish

4

Just to add to the awesome answer of Matt Krause (in fact easily derivable from there). If x, y are independent then,

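$$\begin{aligned}
E_{1,1} &= E[(x - E[x])(y - E[y])] = \operatorname{Cov}(x,y) = 0 \\
E_{1,2} &= E[(x - E[x])(y - E[y])^2] = E[x - E[x]]\,E[(y - E[y])^2] = (E[x] - E[x])\,E[(y - E[y])^2] = 0 \\
E_{2,1} &= 0 \\
E_{2,2} &= E[(x - E[x])^2 (y - E[y])^2] = E[(x - E[x])^2]\,E[(y - E[y])^2] = V[x]\,V[y] \\
V[xy] &= E[x]^2 V[y] + E[y]^2 V[x] + V[x]\,V[y]
\end{aligned}$$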
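A quick exact check of the independent case, as a sketch: the two marginal distributions below are invented for illustration, and independence is imposed by building the joint probabilities as an outer product.

```python
import numpy as np

# Made-up marginal distributions for two independent variables.
xs, px = np.array([1.0, 2.0, 5.0]), np.array([0.2, 0.5, 0.3])
ys, py = np.array([-1.0, 3.0]), np.array([0.4, 0.6])

Ex, Ey = px @ xs, py @ ys
Vx, Vy = px @ (xs - Ex) ** 2, py @ (ys - Ey) ** 2

# Independence: the joint pmf is the outer product of the marginals.
joint = np.outer(px, py)
prod = np.outer(xs, ys)                     # value of x*y at each support point
direct = np.sum(joint * prod**2) - np.sum(joint * prod) ** 2

closed_form = Ex**2 * Vy + Ey**2 * Vx + Vx * Vy
print(direct, closed_form)
```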

1
The result for the case of n independent random variables has been discussed here.
Dilip Sarwate

3

In addition to the general formula given by Matt it may be worth noting that there is a somewhat more explicit formula for zero mean Gaussian random variables. It follows from Isserlis' theorem, see also Higher moments for the centered multivariate normal distribution.

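Suppose that $(x_1,\ldots,x_k)$ follows a multivariate normal distribution with mean 0 and covariance matrix $\Sigma$. If the number of variables $k$ is odd, $E\left(\prod_i x_i\right)=0$ and

$$V\left(\prod_i x_i\right)=E\left(\prod_i x_i^2\right)=\sum\prod\tilde{\Sigma}_{i,j}$$
where $\sum\prod$ means sum over all partitions of $\{1,\ldots,2k\}$ into $k$ disjoint pairs $\{i,j\}$ with each term being a product of the corresponding $k$ $\tilde{\Sigma}_{i,j}$'s, and where
$$\tilde{\Sigma}=\begin{pmatrix}\Sigma & \Sigma\\ \Sigma & \Sigma\end{pmatrix}$$
is the covariance matrix for $(x_1,\ldots,x_k,x_1,\ldots,x_k)$. If $k$ is even,
$$V\left(\prod_i x_i\right)=\sum\prod\tilde{\Sigma}_{i,j}-\left(\sum\prod\Sigma_{i,j}\right)^2.$$
In the case $k=2$ we get
$$V(x_1x_2)=\Sigma_{1,1}\Sigma_{2,2}+2(\Sigma_{1,2})^2-\Sigma_{1,2}^2=\Sigma_{1,1}\Sigma_{2,2}+(\Sigma_{1,2})^2.$$
If $k=3$ we get
$$V(x_1x_2x_3)=\sum\Sigma_{i,j}\Sigma_{k,l}\Sigma_{r,t},$$
where there are 15 terms in the sum.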
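For small $k$ the sum over pair partitions can be written out by brute force. The following is a minimal sketch under the zero-mean Gaussian assumption; the helper names `pairings`, `isserlis_moment` and `gaussian_product_variance` are hypothetical, and the recursive enumeration is just one simple way to list the partitions.

```python
import numpy as np

def pairings(items):
    """Yield every partition of `items` (even length) into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def isserlis_moment(cov):
    """E(prod_i z_i) for a zero-mean Gaussian vector z with covariance `cov`."""
    n = cov.shape[0]
    if n % 2:                               # odd number of factors
        return 0.0
    return sum(np.prod([cov[i, j] for i, j in pairing])
               for pairing in pairings(list(range(n))))

def gaussian_product_variance(sigma):
    """V(prod_i x_i) for zero-mean Gaussian x with covariance `sigma`."""
    sigma = np.asarray(sigma, dtype=float)
    # Covariance of (x_1, ..., x_k, x_1, ..., x_k).
    sigma_tilde = np.block([[sigma, sigma], [sigma, sigma]])
    return isserlis_moment(sigma_tilde) - isserlis_moment(sigma) ** 2

# k = 2 with Sigma_11 = 2, Sigma_22 = 3, Sigma_12 = 0.5 (arbitrary values):
print(gaussian_product_variance([[2.0, 0.5], [0.5, 3.0]]))  # 2*3 + 0.5**2 = 6.25
```

Because `isserlis_moment` returns 0 for an odd number of factors, the same function covers both the odd and the even case.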

It is, in fact, possible to implement the general formula. The most difficult part appears to be the computation of the required partitions. In R, this can be done with the function setparts from the package partitions. Using this package it was no problem to generate the 2,027,025 partitions for k=8, the 34,459,425 partitions for k=9 could also be generated, but not the 654,729,075 partitions for k=10 (on my 16 GB laptop).
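The partition counts quoted above follow from the fact that $2k$ labelled items can be split into $k$ unordered pairs in $(2k-1)!! = 1\cdot 3\cdot 5\cdots(2k-1)$ ways. A small sketch of that arithmetic (the function name `num_pairings` is just illustrative):

```python
def num_pairings(k):
    """Number of ways to split 2k labelled items into k unordered pairs."""
    count = 1
    for odd in range(1, 2 * k, 2):          # 1 * 3 * 5 * ... * (2k - 1)
        count *= odd
    return count

for k in (8, 9, 10):
    print(k, num_pairings(k))
# 8 2027025
# 9 34459425
# 10 654729075
```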

A couple of other things are worth noting. First, for Gaussian variables with non-zero mean it should be possible to derive an expression as well from Isserlis' theorem. Second, it is unclear (to me) if the above formula is robust against deviations from normality, that is, if it can be used as an approximation even if the variables are not multivariate normally distributed. Third, though the formulas above are correct, it is questionable how much the variance tells about the distribution of the products. Even for k=2 the distribution of the product is quite leptokurtic, and for larger k it quickly becomes extremely leptokurtic.


Neat approach! For what it's worth, the formula in my answer also has a combinatorial blow-up: the summation over $C$ involves summing $O(3^k)$ terms.
Matt Krause
Licensed under cc by-sa 3.0 with attribution required.