Algoritmo para minimizar a área de superfície, dado o volume

Considere a seguinte tarefa algorítmica:

Entrada: um número inteiro positivo $n$ , juntamente com sua fatoração primária.
Encontre: números inteiros positivos $x,y,z$ que minimizam $xy+yz+xz$ , sujeitos à restrição de que $xyz=n$

Qual é a complexidade desse problema? Existe um algoritmo de tempo polinomial? É NP-difícil?

Esse problema basicamente pergunta: de todos os sólidos retangulares cujo volume é e cujas dimensões são todas inteiras, qual possui a menor área de superfície? $n$

Esse problema foi proposto por Dan Meyer, sob o título O problema de matemática que mil professores de matemática não conseguiram resolver . Até agora, nenhum dos professores de matemática com quem ele trabalhou encontrou um algoritmo razoável para esse problema. No seu contexto, a definição de "razoável" é um pouco imprecisa, mas, como cientistas da computação, podemos fazer uma pergunta mais precisa sobre a complexidade desse problema.

A abordagem óbvia é enumerar todas as possibilidades para , mas isso leva tempo exponencial. Os comentaristas do blog de Dan Meyer propuseram muitos algoritmos candidatos eficientes que, infelizmente, todos estavam incorretos. Martin Strauss sugere que esse problema parece vagamente remanescente de 3 partições $x,y,z$ , mas não vejo uma redução.

Deixe-me também esclarecer alguns conceitos errôneos que eu vi nos comentários / respostas:

Você não pode reduzir a partir de 3 partições simplesmente substituindo cada número por sua potência , pois as funções objetivas dos dois problemas são diferentes. A redução óbvia simplesmente não funciona. $q$ $2^q$
Não é verdade que a solução ideal envolva escolher um de para ser o divisor mais próximo de a $x,y,z$ $n$ . Vejo várias pessoas que estão assumindo que isso é verdade, mas, na verdade, isso não está correto. Isso já foi refutado no blog de Dan Meyer. Por exemplo, considere; $\sqrt[3]{n}$ $n=68$ e 4 divide 68, então você pode pensar que pelo menos um dedeve ser 4; no entanto, isso não está correto. A solução ideal é,,. Outro contra-exemplo é, $\sqrt[3]{68} \approx 4$ $x,y,z$ $x=2$ $y=2$ $z=17$ $n=222$ , mas a solução ideal é,,. (Elapodeser verdade que, para todos os, a solução ideal envolve fazer, pelo menos, um deser igual quer a menor divisor demaior do que $\sqrt[3]{222}\approx 6$ $x=37$ $y=3$ $z=2$ $n$ $x,y,z$ $n$ ouo maior divisor demenor que $\sqrt[3]{n}$ $n$ - não tenho um contra-exemplo no momento - mas se você acha que essa afirmação é verdadeira, seria necessária uma prova. Você absolutamente não pode assumir que é verdade.) $\sqrt[3]{n}$
"Fazer ter o mesmo tamanho" não parece necessariamente produzir a resposta ideal em todos os casos; veja a publicação no blog de Dan Meyer para obter contra-exemplos. Ou, pelo menos, para algumas interpretações razoáveis da frase "torná-las aproximadamente do mesmo tamanho", há contra-exemplos que mostram que essa estratégia não é de fato ideal. Se você quiser tentar alguma estratégia desse tipo, declare com precisão a afirmação e forneça uma prova matemática cuidadosa. $x,y,z$
Um tempo de execução de não é polinomial. Para que esse problema esteja em P, o tempo de execução deve ser um polinômio no comprimento da entrada . O comprimento da entrada é algo como , não . O algoritmo óbvio de força bruta pode ser executado no tempo ou , mas isso é exponencial em e, portanto, conta como um algoritmo de tempo exponencial. Portanto, isso não ajuda. $O(n^3)$ $\lg n$ $n$ $O(n^3)$ $O(n^2)$ $\lg n$

algorithms optimization integers

— DW
fonte

Interessante. Minha abordagem ingênua seria "tornar

aproximadamente do mesmo tamanho", generalizando a idéia de que o cubo é o sólido retangular com a menor área de superfície para um determinado volume. Isso funcionaria? E se sim: não vejo como fazer isso com eficiência, mas há uma redução mais fácil de conseguir, talvez?

x, y, z

$x,y,z$

— G. Bach

Uma redução será um pesadelo, pois você precisa de uma maneira de gerar números primos adequados. O melhor que você poderia esperar é uma redução aleatória, usando algo como o Teorema de Dirichlet para gerar números primos adequados, mas mesmo isso parece improvável.

— Tom van der Zanden

@ G.Bach, acho que o artigo do blog considera várias heurísticas dessa veia (por exemplo, comece com cada um de

como o número inteiro mais próximo de

x, y, z

$x,y,z$

\sqrt[3]{n}

$\sqrt[3]{n}$ and then adjust them a tiny bit), and shows explicit counterexamples for each. But maybe you have an algorithm that they haven't considered?

— D.W.

oeis.org/A075777 seems to claim an algorithm, but it appears to be incorect (n=1332 generates 9,4,37 instead of 6,6,37 for example)

— Scott Farrar

Here's an observation that may be useful. Given

x

$x$ , the optimal

y, z

$y,z$ do in fact satisfy the "naive dream": they must be the pair of factors of

n / x

$n/x$ closest to

\sqrt{n / x}

$\sqrt{n/x}$ . (This is easy to prove.) At an optimal solution

x^{*}, y^{*}, z^{*}

$x^*,y^*,z^*$ , this condition must hold for all three variables simultaneously:

x^{*}, y^{*}

$x^*,y^*$ are the pair corresponding to

z^{*}

$z^*$ , etc. One implication: given

z

$z$ , there is only one possible pair

x, y

$x,y$ with which it can be optimal. Unfortunately, (1) this condition does not uniquely identify the optimal triple; (2) I don't see how to find the corresponding pair fast.

— usul

Respostas:

Aqui está uma versão modificada do algoritmo "escolher divisor próximo à raiz do cubo". Ele ainda deve ter força bruta em muitos casos, por isso não tenho certeza do quanto de uma melhoria real é sábia em relação à enumeração de todos os casos. No entanto, enviei-o como uma correção ao algoritmo no OEIS (o que gerou resultados incorretos) porque acredito que deve ser preciso pelo menos.

Aqui está o algoritmo para encontrar (s1, s2, s3) e a área de superfície de um prisma retangular, dado seu volume n:

Dado n, encontre a raiz do cubo.
Defina um valor inicial inteiro s1 no teto dessa raiz do cubo.
Teste para ver se s1 é um divisor de n e, se não, reduza s1 em 1.
Se um divisor s1 for encontrado, defina um s2 inicial como o teto da raiz quadrada de (n / s1).
Em seguida, teste para ver se s2 é um divisor de n / s1 e, se não, reduza s2 em 1.
Quando um divisor s2 é encontrado, s3 é definido como n / (s1 * s2).
A área atual da superfície é calculada por 2 * (s1 * s2 + s1 * s3 + s2 * s3).
O SA atual é comparado com o mínimo atual. Se for a primeira área de superfície calculada, ela será armazenada como minSA. Após o primeiro, testamos para ver se o SA atual é menor que minSA e, em caso afirmativo, armazenamos em minSA.

Esse algoritmo enumera alguns dos triplos (s1, s2, s3), mas precisa apenas testar os divisores sob a raiz do cubo. (Como nem todos os três divisores podem estar acima da raiz do cubo). De maneira semelhante, s2 precisa apenas testar os divisores de n / s1 sob a raiz quadrada de n / s1, pois nem ambos os divisores podem estar acima da raiz quadrada)

Uma observação na etapa 3: se a raiz do cubo é um divisor, n é um cubo e podemos parar aí com uma área superficial mínima 6 * s1 ^ 2 da caixa (s1, s1, s1).

Python:

import math
def minSArectprism(n):
    s1_0 = int(math.ceil(n ** (1 / 3.0))) 
    minSA=-1
    s1 = s1_0
    while s1>=1:
        while n % s1 > 0:  
            s1 = s1 - 1
        s1quot = int(n/s1) 
        s2_0 = int(math.ceil(math.sqrt(n/s1)))
        s2 = s2_0
        while s2>=1:
            while s1quot % s2 > 0:
                s2 = s2 - 1
            s3 = int(n / (s1 * s2))  
            SA = 2*(s1*s2 + s1*s3 + s2*s3)  
            if minSA==-1:
                minSA=SA
            else:
                if SA<minSA:
                    minSA=SA
            s2 = s2 - 1
        s1 = s1 - 1    
    return minSA

— Scott Farrar
fonte

Seu algoritmo leva tempo exponencial. Cada loop examina cerca de

\sqrt[3]{n}

$\sqrt[3]{n}$ possible candidates, so the running time is

O ({\sqrt[3]{n}}^{2}) = O (n^{2 / 3})

$O(\sqrt[3]{n}^2) = O(n^{2/3})$ , which is exponential, not polynomial time. Thus, this algorithm doesn't answer the question. (I already mentioned an exponential-time algorithm in the question.)

— D.W.

Hmm, y is not confined under the cube root of n, for example, n=1332, we will eventually test s1=2, meaning s2 will be under the square root of 1332/2 ~= 26. Indeed (2,18,37) is tested with y and z above the cube root.

— Scott Farrar

@ScottFarrar, yes, I know. I did not include all the gory details of the complexity analysis; there wasn't space in a single comment. If you do include the gory details, I think you'll find that you get the running time I quoted. You can either trust me :-), or read our reference question to learn more about those gory details. In any case, even if you removed the inner loop, the outer loop still does

Θ (n^{1 / 3})

$\Theta(n^{1/3})$ iterations, so the running time of your algorithm is at least

Ω (n^{1 / 3})

$\Omega(n^{1/3})$ -- i.e., it is certainly exponential.

— D.W.

The problem would of course be related to factoring complexity if prime decompositions were not given. Given the factors, and taking logs of all the prime factors, this problem appears to be about the same as minimizing the deviation-from-mean of $k$ partition sums (exercise, maybe either analytically or experimentally, find how closely this intuitive approximation of the problem holds).

Here this is the 3-way case (partition sums are $\log(x), \log(y), \log(z)$ ). the 2-way case has been extensively studied and is NP hard (from 1^st ref). (This 2-way case is not quite the same as the known NP-complete 2-way Partition problem where partition sums are equal. Note equal partition sums implies 0 deviation in partition sums and vice versa.) The 2^nd ref studies 3-way and $n$ -way partitioning, partly empirically, where there is not as much study as the 2-way case.

The Easiest Hard Problem: Number Partitioning / Mertens
Multi-Way Number Partitioning / Korf

— vzn
fonte

This answer is not helpful and does not answer the question. 1. I am looking for proofs or evidence, not speculation. There is no evidence that minimizing deviation yields an optimal solution. Even if that were true, it wouldn't answer the question: it would not tell us the complexity of minimizing the deviation. 2. The first reference is about 2-partition. Pointing me to a reference on 2-partition is not helpful. I already explained in the question why my problem is not just 3-partition (or 2-partition). A paper on a variant of a problem that I didn't ask about isn't helpful.

— D.W.

Counterexample to the claim that you should minimize the absolute deviation from the mean:

n = 68

$n=68$ . Then

1, 4, 17

$1,4,17$ yields an absolute deviation of

2.85342

$2.85342$ , which is the lowest possible absolute deviation. However

2, 2, 17

$2,2,17$ is the correct (optimal) solution, and has smaller surface area. [By absolute deviation from the mean, I specifically mean

| \log (x) - μ | + | \log (y) - μ | + | \log (z) - μ |

$|\log(x)-\mu|+|\log(y)-\mu|+|\log(z)-\mu|$ (where

μ = (\log (x) + \log (y) + \log (z)) / 3

$\mu=(\log(x)+\log(y)+\log(z))/3$ ).]

— D.W.

ok! there was never any claim this algorithm was correct, it was based on inspection of some examples & others suggestions in comments. this is only a single counterexample (you indicated the minimizing deviation method is flawed in the revised post). the question of "how often" this algorithm gives a correct solution is interesting because it could give some clues to a correct optimization metric. conjecture this algorithm "often" gives the correct answer. the 2-way ref is to show a deviation version of the problem which is different than the typical exact version on wikipedia etc

— vzn

Edit

Here’s an informal argument for why a fast algorithm is unlikely to exist. That sentence hasn't changed, but I’ve taken out what used to be here because it was structured too much like the formal proof in the next section and the discussion was getting sidetracked onto its bugs, some of which I noticed myself and one of which D.W. kindly pointed out to me. Let me instead try to express the intuition behind it.

The clever approaches to this problem seem to be focusing on finding ways to partition the prime factors of $N$ into three lists and quickly prune most of the combinations without needing to check them. So suppose this approach works, that is, it gives you a P-time algorithm for finding the optimal box.

What happens when we translate the same steps into a different algebra, such as addition and subtraction instead of multiplication and division? We know (see lemma below) that our algorithm will find a 3-partition whose products are equal, if one exists, or else determine that no such 3-partition exists. So, if we could translate the same techniques into the additive group, we could find a 3-partition whose sums are equal, or determine that no such partition exists. In other words, we could solve 3-partition in polynomial time. That’s not very plausible.

So, why might such an algorithm work for multiplication and fail for addition? One possible reason is that every integer has a unique prime factorization under multiplication, but is cyclic under addition. Another is that multiplication forms a ring with addition, so you have another set of operations you can use. Another is that you would have to generalize the algorithm to work for non-primes, and it might depend on their primality. The one D.W. pointed out is that the specific method of translation might exponentially increase the size of your inputs. And maybe P = NP after all.

But if those are the loopholes that let a fast algorithm work, I think it’s still useful to know, because it suggests where we should focus our efforts. We should be looking for something that would break if we tried to apply it to a NP-complete problem. An approach that would generalize to other algebras is probably barking up the wrong tree. I suspect, though, that multiplication is not really different enough for that to work, but that’s just a hunch.

Lemma

Let $m = \sqrt[3]{N}$ . For any solution $(am, bm, \frac{m}{ab})$ , where $a$ and $b$ are positive, the weight of the solution is $(ab + \frac{1}{a} + \frac{1}{b})m^2$ and is minimized when $a = b = 1$ .

Proof: We see immediately from the restriction $xyz = N$ that any arbitrary solution has this form.

Its weight is $(am)(bm) + (am)(\frac{m}{ab}) + (bm)(\frac{m}{ab}) = abm^2 + \frac{m^2}{b} + \frac{m^2}{a} = (ab+\frac{1}{a}+\frac{1}{b})m^2$ .

Let $f(a,b) = ab + \frac{1}{a} + \frac{1}{b}$ . To minimize this function, we take the partial derivatives $\frac{\delta f}{\delta a} = b - \frac{1}{a^2}$ and $\frac{\delta f}{\delta b} = a - \frac{1}{b^2}$ . The critical point where these partial derivatives are zero comes where $a = b^2, b = a^2$ , and therefore, since $a$ and $b$ must be real numbers greater than 0, $a = b = 1$ . We see from the matrix of second-order partial derivatives that this is a global minimum.

My immediate motivation to prove this was to fill in a hand wave in my proof above that, if a perfect-cube solution exists, it is optimal. However, this formula could be useful for pruning the search tree.

Assorted Thoughts

I don’t see any obvious symmetry except the interchangeability of x, y and z, which only gives us at best a constant factor of 6. We do have some speedups for the 2-partition that basically say we’d want both terms to be as close to each other as possible, but I don’t immediately see an application to this problem.

Off the top of my head, simply taking the log of all the numbers immediately reduces this to a classic 3-partition problem using addition, or equivalently, taking some number to the power of the numbers in any 3-partition addition problem turns it into a multiplication problem such as this. That implies this problem should not be any easier. We do have here the prime factorization, whereas that factorization would not be prime, but does that help us?

Graphically, the surface xyz = 0 would look like the union of the xy-, yz- and xz-planes, and for any positive n, the equation would look like y = n/xz, so each slice along a coordinate plane would be a hyperbola. We can generally say that the quantity we’re trying to minimize is lowest at the origin and grows most rapidly along the line where x = y = z. If we can search only along this manifold, we might be able to reduce to a two-dimensional problem.

— Davislor
fonte

If x+y+z=n, 2^n=2^(x+y+z)=2^x*2^y*2^z, which is an instance of this problem minus the restriction that the inputs are a prime decomposition of the product. They would instead all be powers of two.

— Davislor

It’s true that the weight to minimize will be different, but if x=y=z in the original problem, won’t x'y'+x'z'+y'z' be minimized in the corresponding problem where each w is replaced by w'=2^w, meaning that if a solution to the original problem exists, the reduction would find it? We might get a spurious solution from the transformed problem, but we can detect that in linear time when converting back by verifying the sums.

— Davislor

as above comment by GBach suggests, maximizing

x y + y z + x z

$xy + yz + xz$ subject to

x y z = n

$xyz=n$ likely happens when

x, y, z

$x,y,z$ are "close together" or have low deviation (from average). this is not necessarily the same as "close to

\sqrt[3]{n}

$\sqrt[3]{n}$ ". the numerical examples given by Meyer on his page appear to fit this pattern.

— vzn

@vzn: We’re trying to minimize surface area, not maximize it. If the 3-partition problem has a solution, that translates into a modified box-dimension problem where the solution is a perfect cube. A hypothetical poly-time algorithm would find the factors of the sides of that cube, and we could then translate it back into the original domain, while checking for spurious solutions, in linear time. That suggests an algorithm for a slightly-relaxed problem could serve as an oracle for a hard problem, making it unlikely a better-than-exponential algorithm exists.

— Davislor

? am not disagreeing with you. arent we saying the same thing? plz drop by Computer Science Chat to untangle/ sort this out further. also cant follow @D.W.s claim that the logarithmic transformation doesnt work, can you? am using some of your (seemingly on-target) analysis as basis for my own answer.

— vzn