Please explain this formal definition of computation

7

I am trying to tackle TAOCP once again; given the literal heaviness of the volumes, I have trouble committing to it seriously. In TAOCP 1, Knuth writes, page 8, basic concepts:

Let $A$ be a finite set of letters. Let $A^*$ be the set of all strings in $A$ (the set of all ordered sequences $x_1$ $x_2$ ... $x_n$ where $n \ge 0$ and $x_j$ is in $A$ for $1 \le j \le n$). The idea is to encode the states of the computation so that they are represented by strings of $A^*$. Now let $N$ be a non-negative integer and $Q$ (the state) be the set of all $(\sigma, j)$, where $\sigma$ is in $A^*$ and $j$ is an integer $0 \le j \le N$; let $I$ (the input) be the subset of $Q$ with $j=0$ and let $\Omega$ (the output) be the subset with $j = N$. If $\theta$ and $\sigma$ are strings in $A^*$, we say that $\theta$ occurs in $\sigma$ if $\sigma$ has the form $\alpha \theta \omega$ for strings $\alpha$ and $\omega$. To complete our definition, let $f$ be a function of the following type, defined by the strings $\theta_j$, $\psi_j$ and the integers $a_j$, $b_j$ for $0 \le j \le N$:

$f((\sigma, j)) = (\sigma, a_j)$ if $\theta_j$ does not occur in $\sigma$

$f((\sigma, j)) = (\alpha \psi_j \omega, b_j)$ if $\alpha$ is the shortest possible string for which $\sigma = \alpha \theta_j \omega$

$f((\sigma,N)) = (\sigma, N)$

Not being a computer scientist, I have trouble understanding the whole passage. I kind of get the idea behind a system of opcodes, but I have not made real progress in understanding it. I think the main problem is that I don't know how to read it effectively.

Would it be possible to explain the passage above so that I can understand it, and to give me a strategy for getting into the logic when interpreting these statements?

formal-languages turing-machines computation-models

— Stefano Borini

Then you should not include your comment in the supposed quote, confusing anybody who does not have the book at hand. -.- I hope my answer helps ...

— Raphael

@Raphael: the quote is literally from the book. I just added a parenthetical explanation of the symbols for $I$ and $\Omega$.

— Stefano Borini

@StefanoBorini: But it is not "explanation", it is wrong. I see how you can read the original text and arrive at the conclusion you did, but it is still not helpful. If you quote something and add comments, mark them as such so that people can take them with a grain of salt.

— Raphael

There is some context missing here: which computation, and which states?

— Reinierpost

8

We are missing some context, so I have no idea what point Knuth is trying to make, but here is how to interpret a Turing machine in this way. Maybe this will help you understand what is going on. In general, a good way to understand a concept is to play with it. In the case of programming paradigms, that means writing a program. In this case, I will show how to write any program.

Suppose that the Turing machine's tape has symbols $\{0,1,\epsilon\}$ (where $\epsilon$ stands for "empty"), and add one more symbol $H$ which represents the location of the head. Your states are going to be pairs of the form $(q,\alpha)$, where $q$ is a state of the Turing machine and $\alpha \in \{0,\ldots,14\}$. We also identify $(F,0)$ with $N$ for any final state $F$.

Given a (non-empty) input $x$, your starting point will be $(Hx,(s,0))$, where $s$ is the starting state. The difficult part is to encode states. Suppose that at state $q$, upon reading input $x$, you replace it with $a(q,x)$, move in direction $D(q,x) \in \{L,R\}$, and switch to state $\sigma(q,x)$. For the $\theta$s, we have

$$\begin{aligned} \theta_{q,0} &= 0H0, & \theta_{q,1} &= 0H1, & \theta_{q,2} &= 0H\epsilon, \\ \theta_{q,3} &= 1H0, & \theta_{q,4} &= 1H1, & \theta_{q,5} &= 1H\epsilon, \\ \theta_{q,6} &= \epsilon H0, & \theta_{q,7} &= \epsilon H1, & \theta_{q,8} &= \epsilon H\epsilon, \\ \theta_{q,9} &= H0, & \theta_{q,10} &= H1, & \theta_{q,11} &= H\epsilon, \\ \theta_{q,12} &= 0H, & \theta_{q,13} &= 1H, & \theta_{q,14} &= \epsilon H. \end{aligned}$$

For the $a$s, we have $a_{q,i} = (q,i+1)$ for $i < 14$, and $a_{q,14} = (q,14)$, though we should never really get that far. For the $b$s, we have

$$\begin{aligned} &b_{q,0} = b_{q,3} = b_{q,6} = b_{q,9} = (\sigma(q,0),0), \\ &b_{q,1} = b_{q,4} = b_{q,7} = b_{q,10} = (\sigma(q,1),0), \\ &b_{q,2} = b_{q,5} = b_{q,8} = b_{q,11} = b_{q,12} = b_{q,13} = b_{q,14} = (\sigma(q,\epsilon),0). \end{aligned}$$

Now it remains to determine the $\psi$s. Let $a_0 = a(q,0)$. If $D(q,0) = L$ then

$$\begin{aligned} \psi_{q,0} &= H0a_0, & \psi_{q,3} &= H1a_0, & \psi_{q,6} &= \psi_{q,9} = H\epsilon a_0. \end{aligned}$$

If $D(q,0) = R$ then

$$\begin{aligned} \psi_{q,0} &= 0a_0H, & \psi_{q,3} &= 1a_0H, & \psi_{q,6} &= \epsilon a_0 H, & \psi_{q,9} &= a_0H\epsilon. \end{aligned}$$

Next, let $a_1 = a(q,1)$. If $D(q,1) = L$ then

$$\begin{aligned} \psi_{q,1} &= H0a_1, & \psi_{q,4} &= H1a_1, & \psi_{q,7} &= \psi_{q,10} = H\epsilon a_1. \end{aligned}$$

If $D(q,1) = R$ then

$$\begin{aligned} \psi_{q,1} &= 0a_1H, & \psi_{q,4} &= 1a_1H, & \psi_{q,7} &= \epsilon a_1 H, & \psi_{q,10} &= a_1 H\epsilon. \end{aligned}$$

Finally, let $a_\epsilon = a(q,\epsilon)$. If $D(q,\epsilon) = L$ then

$$\begin{aligned} \psi_{q,2} &= H0a_\epsilon, & \psi_{q,5} &= H1a_\epsilon, & \psi_{q,8} &= \psi_{q,11} = H\epsilon a_\epsilon, \\ \psi_{q,12} &= H0a_\epsilon, & \psi_{q,13} &= H1a_\epsilon, & \psi_{q,14} &= H\epsilon a_\epsilon. \end{aligned}$$

If $D(q,\epsilon) = R$ then

$$\begin{aligned} \psi_{q,2} &= 0a_\epsilon H, & \psi_{q,5} &= 1a_\epsilon H, & \psi_{q,8} &= \epsilon a_\epsilon H, & \psi_{q,11} &= a_\epsilon H\epsilon, \\ \psi_{q,12} &= 0a_\epsilon H, & \psi_{q,13} &= 1a_\epsilon H, & \psi_{q,14} &= \epsilon a_\epsilon H. \end{aligned}$$

Now apply $f$ repeatedly until you get stuck. If you follow the construction, you will see that we have simulated the running of the Turing machine.

— Yuval Filmus

understood: nothing. Not your fault. Thank you anyway :(

3

"We are missing some context." It's: we should have some precise description of what we mean by a 'method of computation'; here's one given by A.A. Markov; there are other equivalent ones, such as Turing machines.

— rgrig

6

Let us break it down bit by bit. First of all, remember what Knuth wrote on page 7:

Let us formally define a computational method to be a quadruple $(Q,I,\Omega,f)$, in which $Q$ is a set containing subsets $I$ and $\Omega$, and $f$ is a function from $Q$ into itself. [...] The four quantities $Q$, $I$, $\Omega$, $f$ are intended to represent respectively the state of the computation, the input, the output, and the computational rule.

This is the outline. You have to read "represent" as "contain"; $Q$ is going to contain states (some of which are in $I$ , some are in $\Omega$ ) and $f$ is going to be a transition function between states; think of it as a program.
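To make the outline concrete, here is a minimal Python sketch of a quadruple $(Q,I,\Omega,f)$. It assumes the usual reading that a computation starts from an input state and applies $f$ until a state in $\Omega$ is reached. The names `run`, `is_output`, `max_steps` and the gcd example are made up for this illustration and are not taken from the book.

```python
def run(f, is_output, x, max_steps=10_000):
    """Apply f repeatedly, starting from x, until a state in Omega is reached."""
    for _ in range(max_steps):
        if is_output(x):
            return x
        x = f(x)
    raise RuntimeError("no output state reached within max_steps")


# Illustrative computational method (an assumption, not Knuth's own example):
# Q = pairs of non-negative integers, I = Q, Omega = pairs whose second entry is 0.
def gcd_step(state):
    m, n = state
    return (n, m % n) if n != 0 else state  # f leaves output states unchanged


def gcd_is_output(state):
    return state[1] == 0


result = run(gcd_step, gcd_is_output, (544, 119))
print(result)  # (17, 0)
```

The step limit only guards against computational methods that never reach $\Omega$; it is not part of the definition.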

Let $A$ be a finite set of letters. Let $A^*$ be the set of all strings in $A$ (the set of all ordered sequences $x_1$ $x_2$ ... $x_n$ where $n \ge 0$ and $x_j$ is in $A$ for $1 \le j \le n$ ).

This is just a reiteration of what $A^*$ is. See also here.

The idea is to encode the states of the computation so that they are represented by strings of $A^*$ .

This is probably the key sentence. We are talking about computations, that is, execution sequences of some (programming language) statements that manipulate some state, which can be thought of as values in memory cells or valuations of variables. Knuth says here that he wants to encode these states in an abstract way, namely as words over some alphabet.

Example: Consider a program that uses (at most) $k$ variables, each of which stores an integer. That is, a state is given by the tuple of values $(x_1, \dots, x_k)$ where $x_i$ is the (current) value of the $i$-th variable. In order to encode states of this form in a formal language, we can choose $A = \{0,1,\#\}$ with $\#$ a separator. Now model such a state by $\#\overline{x_1}\#\cdots\#\overline{x_k}\#$ where $\overline{x_i}$ is the binary encoding of $x_i$.

Specifically, $(3,5,0)$ would be $\#11\#101\#0\#$ .
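The encoding in this example can be sketched in a few lines of Python. The function names `encode_state` and `decode_state` are hypothetical, and the sketch assumes non-negative integers and at least one variable.

```python
def encode_state(values):
    """Encode a tuple of non-negative integers as a string over {0, 1, #}."""
    return "#" + "#".join(format(v, "b") for v in values) + "#"


def decode_state(word):
    """Inverse of encode_state (assumes at least one variable)."""
    return tuple(int(bits, 2) for bits in word.strip("#").split("#"))


print(encode_state((3, 5, 0)))     # #11#101#0#
print(decode_state("#11#101#0#"))  # (3, 5, 0)
```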

Now let $N$ be a non-negative integer and Q be the set of all $(\sigma, j)$ , where $\sigma$ is in $A^*$ and j is an integer $0 \le j \le N$ ; let $I$ be the subset of Q with $j=0$ and let $\Omega$ be the subset with $j = N$ .

You misquoted there (bad Stefano!); the parentheses are not in the original text, and they were misleading (see above). Knuth defines $Q$ here as the set of all possible states ( $\sigma \in A^*$ ) at all possible places in the computation ( $j$ can be understood as program counter). Therefore, $Q$ contains all statement-indexed states any computation of the algorithm given by $f$ can assume. By definition, we start with program counter $0$ and end in $N$ , thus states indexed $0$ are input states and those indexed $N$ are output states.

If $\theta$ and $\sigma$ are strings in $A^*$ , we say that $\theta$ occurs in $\sigma$ if $\sigma$ has the form $\alpha \theta \omega$ for strings $\alpha$ and $\omega$ .

I hope that this is clear; it is just a (re)definition of substrings.
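As a small illustration, the decomposition $\sigma = \alpha \theta \omega$ can be computed as follows. The helper name `split_leftmost` is made up here; it picks the shortest possible $\alpha$, which is the choice the definition of $f$ relies on.

```python
def split_leftmost(sigma, theta):
    """Return (alpha, omega) with sigma == alpha + theta + omega and alpha
    as short as possible, or None if theta does not occur in sigma."""
    i = sigma.find(theta)  # index of the leftmost occurrence, or -1
    if i == -1:
        return None
    return sigma[:i], sigma[i + len(theta):]


print(split_leftmost("abcabc", "bc"))  # ('a', 'abc')
print(split_leftmost("abcabc", "ca"))  # ('ab', 'bc')
print(split_leftmost("abcabc", "d"))   # None
```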

To complete our definition, let $f$ be a function of the following type, defined by the strings $\theta_j$, $\psi_j$ and the integers $a_j$, $b_j$ for $0 \le j \le N$:

$f((\sigma, j)) = (\sigma, a_j)$ if $\theta_j$ does not occur in $\sigma$

$f((\sigma, j)) = (\alpha \psi_j \omega, b_j)$ if $\alpha$ is the shortest possible string for which $\sigma = \alpha \theta_j \omega$

$f((\sigma,N)) = (\sigma, N)$

This is a small programming language; if you fix $\theta_j, \psi_j, a_j, b_j$, you have a program. At program counter $j$, $f$ replaces the left-most occurrence of $\theta_j$ in the state with $\psi_j$ and goes to statement $b_j$. If there is no $\theta_j$ in the current state, it goes to statement $a_j$. The program loops if statement $N$ is reached, modelling termination.
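To see what "fixing" the tables means, here is a hedged Python sketch of an interpreter for this little language together with a one-statement program. The helper `make_f` and the example program are assumptions made for illustration, not Knuth's example. The tables are indexed by $0 \le j < N$ only, since the third case of $f$ already covers $j = N$.

```python
def make_f(thetas, psis, a, b):
    """Build f from the tables theta_j, psi_j, a_j, b_j (0 <= j < N)."""
    N = len(thetas)

    def f(state):
        sigma, j = state
        if j == N:
            return (sigma, N)                     # output states are fixed
        i = sigma.find(thetas[j])                 # leftmost occurrence
        if i == -1:
            return (sigma, a[j])                  # theta_j does not occur
        alpha, omega = sigma[:i], sigma[i + len(thetas[j]):]
        return (alpha + psis[j] + omega, b[j])    # rewrite and jump

    return f, N


# Hypothetical one-statement program: while "ba" occurs, rewrite it to "ab".
f, N = make_f(thetas=["ba"], psis=["ab"], a=[1], b=[0])

state = ("babab", 0)          # an input state: program counter 0
while state[1] != N:          # stop once an output state is reached
    state = f(state)
print(state)                  # ('aabbb', 1)
```

Here $\theta_0 = ba$, $\psi_0 = ab$, $b_0 = 0$ (try the same statement again) and $a_0 = 1 = N$ (halt), so the program sorts a string of a's and b's.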

On the upper half of page 8, there is a more concrete example of a "program" $f$ . Keep in mind that Knuth is going to use assembly language later on; this informs how he looks at programs (atomic statements connected by jumps).

— Raphael

1

Now I have a somewhat better understanding of what is going on. However, two things are still not clear, and I would really appreciate it if you could expand your answer. First, $\theta_j, \psi_j, a_j, b_j$: what are these strings and numbers? What do they represent? If I understand correctly, $a_j$ and $b_j$ represent the step number or command counter for state $j+1$. But I am not sure what the strings $\theta_j, \psi_j$ mean. Can you explain what you mean by "if you fix $\theta_j, \psi_j, a_j, b_j$, you have a program"? Or rather, how would I fix them for some example?

— Georgy Bolyuba

@GeorgyBolyuba: You are right about $a_j$ and $b_j$. The program's state is a string $\sigma$ and a "program counter" $j$. $\theta_j$ and $\psi_j$ are used to modify that state (see second case of $f$). They can have all kinds of shapes; it really depends on how you encode state as a string. See the book for an example.

— Raphael

5

That text describes the following (Python) pseudocode:

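    subs = []      # list of string pairs (theta_j, psi_j), one per pc < N
    As = []        # list of integers a_j: next pc when theta_j does not occur
    Bs = []        # list of integers b_j: next pc after a replacement
    N = len(subs)  # the halting program counter

    def f(state, pc):
        if pc == N:
            return (state, pc)  # halting states are left unchanged
        theta, psi = subs[pc]
        if state.find(theta) != -1:
            # replace only the leftmost occurrence of theta
            return (state.replace(theta, psi, 1), Bs[pc])
        else:
            return (state, As[pc])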

The function f is presumably going to be applied repeatedly.

The last three bullet points are all you really need once you understand the notation. Everything that comes before is a bit like explaining how Python works before giving the Python code.

— rgrig

Ah ok, it's a Turing machine.

— Stefano Borini

1

Rather, it is a different model of computation with the same power as a Turing machine.

— Yuval Filmus

Well, three lines below your quote Knuth says that this is equivalent to Turing machines, so presumably you already knew this when you asked. I thought you were asking for help with the notation. Now I have no idea what it is that you wanted to ask.

— rgrig