Edit: This answer is based on an incorrect assumption: that the likelihood of the marginal counts given $p_{x,y}$ is only a function of the marginal probabilities $p_x = \sum_y p_{x,y}$ and $p_y = \sum_x p_{x,y}$. I'm still thinking about it.
Wrong stuff follows:
As mentioned in a comment, the problem with finding "the" maximum-likelihood estimator for $p_{x,y}$ is that it's not unique. For instance, consider the case with binary $X, Y$ and marginals $S_1 = S_2 = T_1 = T_2 = 10$. The two estimators
$$p = \begin{pmatrix} \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} \end{pmatrix}, \qquad p = \begin{pmatrix} \tfrac{1}{4} & \tfrac{1}{4} \\ \tfrac{1}{4} & \tfrac{1}{4} \end{pmatrix}$$
have the same marginal probabilities $p_x$ and $p_y$ in all cases, and hence have equal likelihoods (both of which maximize the likelihood function, as you can verify).
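As a quick numerical sanity check that the two matrices really do share their marginals (a minimal sketch; the numpy usage and variable names are mine, not part of the original answer):

```python
import numpy as np

# The two candidate joint distributions from the example above.
p1 = np.array([[0.5, 0.0],
               [0.0, 0.5]])
p2 = np.array([[0.25, 0.25],
               [0.25, 0.25]])

for p in (p1, p2):
    # Row sums give the marginal over X, column sums the marginal over Y.
    print("rows:", p.sum(axis=1), "cols:", p.sum(axis=0))
# Both print [0.5 0.5] for rows and columns: the marginals coincide.
```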
Indeed, no matter what the marginals are (as long as two of them are nonzero in each dimension), the maximum-likelihood solution is not unique. I'll prove this for the binary case. Let $p = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be a maximum-likelihood solution. Without loss of generality suppose $0 < a \le d$. Then $p' = \begin{pmatrix} 0 & b+a \\ c+a & d-a \end{pmatrix}$ has the same marginals and is thus also a maximum-likelihood solution.
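The same mass-shifting construction as a small sketch (the function name and test matrix are mine, for illustration only):

```python
import numpy as np

def shift_mass(p):
    """Given a 2x2 joint p = [[a, b], [c, d]] with 0 < a <= d, return
    [[0, b+a], [c+a, d-a]], which has the same row and column marginals."""
    (a, b), (c, d) = p
    assert 0 < a <= d
    return np.array([[0.0, b + a],
                     [c + a, d - a]])

p = np.array([[0.1, 0.3],
              [0.2, 0.4]])
q = shift_mass(p)
print(p.sum(axis=1), q.sum(axis=1))  # identical row marginals: [0.4 0.6]
print(p.sum(axis=0), q.sum(axis=0))  # identical column marginals: [0.3 0.7]
```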
If you want to additionally apply a maximum-entropy constraint, then you do get a unique solution, which, as F. Tussell stated, is the solution in which $X, Y$ are independent. You can see this as follows:
The entropy of the distribution is $H(p) = -\sum_{x,y} p_{x,y} \log p_{x,y}$; maximizing subject to $\sum_x p_{x,y} = p_y$ and $\sum_y p_{x,y} = p_x$ (equivalently, $\vec{g}(p) = 0$, where $g_x(p) = \sum_y p_{x,y} - p_x$ and $g_y(p) = \sum_x p_{x,y} - p_y$) using Lagrange multipliers gives the equation
$$\nabla H(p) = \sum_{k \in X \cup Y} \lambda_k \nabla g_k(p).$$
Every entry of each $\nabla g_k$ is $0$ or $1$ (the $(x,y)$ entry of $\nabla g_{x'}$ is $1$ exactly when $x = x'$, and likewise for $\nabla g_{y'}$), and $\partial H / \partial p_{x,y} = -1 - \log p_{x,y}$, so coordinate-wise this works out to
$$-1 - \log p_{x,y} = \lambda_x + \lambda_y \implies p_{x,y} = e^{-1 - \lambda_x - \lambda_y},$$
plus the original constraints $\sum_x p_{x,y} = p_y$ and $\sum_y p_{x,y} = p_x$. You can verify that these are satisfied when $e^{-1/2 - \lambda_x} = p_x$ and $e^{-1/2 - \lambda_y} = p_y$, giving
$$p_{x,y} = p_x p_y.$$
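A numerical cross-check of this conclusion (a sketch under my own parametrization, not from the original derivation): for binary $X, Y$ with row marginal $(r, 1-r)$ and column marginal $(c, 1-c)$, the joints with those marginals form a one-parameter family indexed by $t = p_{1,1}$, and scanning the entropy over that family puts the maximum at $t = rc$, i.e. at the independent distribution:

```python
import numpy as np

r, c = 0.3, 0.7                            # marginal probabilities of X=1 and Y=1
lo, hi = max(0.0, r + c - 1), min(r, c)    # feasible range for t = p_{1,1}
ts = np.linspace(lo + 1e-9, hi - 1e-9, 10001)

def entropy(t):
    # Joint with marginals (r, 1-r) and (c, 1-c), parametrized by t = p_{1,1}.
    p = np.array([t, r - t, c - t, 1 - r - c + t])
    return -(p * np.log(p)).sum()

H = np.array([entropy(t) for t in ts])
print(ts[H.argmax()], r * c)   # both ~0.21: the entropy maximizer is t = r*c
```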