Add++, 146 bytes
D,g,@~~,L2_|*;;*|_2L,@,g,D
D,ff,@^^,BG€gBF;;FBg€GB,@D1:?:
xx:?
aa:1
`bb
Bxx;;B
Waa*bb,`yy,$ff>xx,`aa,xx|yy,`bb,Byy,xx:yy
O;;O:,B,`,|,`,>$,`,*W`
Try it online!
Fun fact: This was 272 bytes long before the explanation was started; now it beats Java.
Outputs True for perfectly balanced strings, and False otherwise.
To my great satisfaction, this beats the boring palindromize version by 2 bytes, to prevent the result being printed twice. I have also aimed to have as little dead code as possible; nevertheless, there are still some commented-out sections, and the code exits with an error code of 1 after printing the correct value.
NB: A bug with the BF commands was fixed while this answer was in development.
How it works
The code starts by defining the two key functions, ff and g. These two functions are used to calculate the next step in the process of removing pairs, and work entirely from ff i.e. only ff is called from the main program, never g. If we define the input string as S, ff(S) modifies S in the following way:
First, identical adjacent characters in S are grouped together. For an example of abbbaabacc, this yields the array [[a],[bbb],[aa],[b],[a],[cc]]. Over each of the sublists (i.e. the identical groups), we run the function g, and replace the sublists with the result of the function.
g starts by unpacking the group, splatting the characters onto the stack. It then pushes the number of characters on the stack and takes the absolute difference between 2 and that number. We'll call this difference x. Let's see how this transforms the respective inputs of [a], [bb] and [ccc]:
[a]⇒[a,1]
[bb]⇒[b,b,0]
[ccc]⇒[c,c,c,1]
As you can see, x indicates how many of the next character we wish to keep. For simple pairs, we remove them entirely (yielding 0 of the next character); for lone characters, we leave them untouched, yielding 1 of them; and for groups of n>2 characters, we want n−2 of the character, which is exactly x. In order to generate x of the character, we repeat the character with *, and the function naturally returns the top element of the stack: the repeated string.
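As a rough illustration of what g computes, here is a minimal Python sketch. It is a model of the behaviour described above, not the Add++ code itself; representing a group as a Python string and reusing the name g are assumptions made for the example.

```python
def g(group: str) -> str:
    """Model of g: for a run of n identical characters, keep |2 - n| of them."""
    if not group:
        return ""
    # x is the absolute difference between 2 and the run length
    x = abs(2 - len(group))
    # Repeat the character x times, mirroring the * command
    return group[0] * x


print([g(s) for s in ["a", "bb", "ccc"]])  # ['a', '', 'c']
```

A lone character survives, a pair vanishes, and a run of three leaves a single character behind.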
After g(s) has been mapped over each group s, we splat the array to the stack to get each individual result with BF. Finally, the ^ flag at the function definition (D,ff,@^^,) tells the return function to concatenate the strings in the stack and return them as a single string. For pairs, which yielded the empty string from g, this essentially removes them, as the empty string concatenated with any string r results in r. Anything after the two ;; is a comment, and is thus ignored.
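The whole of ff (group, map g, concatenate) can be sketched in Python as follows. This is only a model of the steps described above: itertools.groupby stands in for the grouping command, and "".join stands in for the ^ flag; neither is part of the original answer.

```python
from itertools import groupby


def g(group: str) -> str:
    """Keep |2 - n| copies of the character in a run of length n."""
    return group[0] * abs(2 - len(group)) if group else ""


def ff(s: str) -> str:
    """Group identical adjacent characters, apply g to each run, concatenate."""
    runs = ["".join(run) for _, run in groupby(s)]
    return "".join(g(run) for run in runs)


print(ff("abbbaabacc"))  # abba
```

Applying ff again to abba gives aa, and once more gives the empty string, so the example input from earlier is fully reduced after three steps.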
The first two lines define the two functions, ff and g, but don't execute ff just yet. We then take input and store it in the first of our 4 variables. Those variables are:
- xx : The initial input and previous result of applying ff
- yy : The current result of applying ff
- aa : The loop condition
- bb : Whether yy is truthy
As you can see, all variables and functions (aside from g) have two letter names, which allows them to be removed from the source code fairly quickly, rather than having a comment with a significant amount of xyab. g doesn't do this for one main reason:
If an operator, such as €, is run over a user defined function abc, the function name needs to be enclosed in {...}, so that the entire name is taken by the operator. If, however, the name is a single character, such as g, the {...} can be omitted. In this case, if the function name was gg, the code for ff and g would have to change to
D,gg,@~~,L2_|*;;*|_2L,@D (NB: -2 bytes)
D,ff,@^^,BG€{gg}BF;;FB}gg{€GB,@D?: (NB: +6 bytes)
which is 4 bytes longer.
An important term to introduce now is the active variable. All commands except assignment assign their new value to the active variable. If the active variable is the one being operated on, it can be omitted from the arguments. For example, if the active variable is x=5, then we can set x=15 by
x+10 ; Explicit argument
+10 ; Implicit argument, as x is active
The active variable is x by default, but that can be changed with the ` command. When changing the active variable, it is important to note that the new active variable doesn't have to exist beforehand: it is created automatically with the value 0.
So, after defining ff and g, we assign the input to xx with xx:?. We then need to manipulate our loop conditions ever so slightly. First, we want to make sure that we enter the while loop, unless xx is empty. Therefore, we assign a truthy value to aa with aa:1, the shortest such value being 1. We then assign the truthiness of xx to bb with the two lines
`bb
Bxx
This first makes bb the active variable, then runs the boolean command on xx. The respective choices of aa:=1 and bb:=¬¬xx matter, as will be shown later on.
Then we enter our while loop:
Waa*bb,`yy,$ff>xx,`aa,xx|yy,`bb,Byy,xx:yy
A while loop is a construct in Add++: it operates directly on code, rather than variables. Constructs take a series of code statements, separated with , which they operate on. While and if statements also take a condition directly before the first , which consists of a single valid statement, such as an infix command with variables. One thing to note: the active variable cannot be omitted from the condition.
The while loop here consists of the condition aa*bb. This means to loop while both aa and bb are truthy. The body of the code first makes yy the active variable, in order to store the result of ff(xx). This is done with
`yy,$ff>xx
We then activate our loop condition aa. We have two conditions for continued looping:
- 1) The new value doesn't equal the old value (loop while unique)
- 2) The new value isn't the empty string
One of Add++'s biggest drawbacks is the lack of compound statements, which necessitates having a second loop variable. We assign our two variables:
aa:=xx≠yy
bb:=¬¬(yy)
With the code
`aa,xx|yy,`bb,Byy
Here, | is the inequality operator, and B converts to boolean. We then update the xx variable to be the yy variable with xx:yy, in preparation for the next loop.
This while loop eventually reduces the input into one of two states: the empty string, or a string that is left unchanged when ff is applied to it. When this happens, either aa or bb becomes False, breaking out of the loop.
The loop can therefore break for one of the two reasons stated above. We then output the value of aa. If the loop was broken because xx=yy, then both the output and aa are False. If the loop was broken because yy was equal to the empty string, then bb is falsy and aa and the output are truthy.
We then reach our final statement:
O
The program can now be in one of three states, in all of which the active variable is bb:
- 1) The input was empty. In this case, the loop didn't run, aa=1 and bb=False. The correct output is False.
- 2) The input was perfectly balanced. If so, the loop ran, aa=True and bb=False. The correct output is False.
- 3) The input was not perfectly balanced. If so, the loop ran, aa=False and bb=True. The correct output is True.
As you can see, bb is equal to the expected output (albeit reversed from the logical answer), so we simply output it. The final bytes that help us beat Java come from the fact that bb is the active variable, so it can be omitted from the argument, leaving us to output either True or False, depending on whether the input is perfectly balanced or not.
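To check the three end states listed above, the loop can be modelled in Python. This is a sketch under assumptions, not a transcription of the Add++ program: the helper name final_state is hypothetical, ff is re-implemented with itertools.groupby, and the function returns the pair (aa, bb) rather than printing anything.

```python
from itertools import groupby


def ff(s: str) -> str:
    """One reduction step: a run of n identical characters keeps |2 - n| of them."""
    return "".join(ch * abs(2 - len(list(run))) for ch, run in groupby(s))


def final_state(xx: str):
    """Run the while loop and return (aa, bb) as they stand when it exits."""
    aa = 1            # aa:1, guarantees entry unless the input is empty
    bb = bool(xx)     # Bxx with bb active
    while aa * bb:    # Waa*bb
        yy = ff(xx)   # `yy,$ff>xx
        aa = xx != yy # `aa,xx|yy
        bb = bool(yy) # `bb,Byy
        xx = yy       # xx:yy
    return aa, bb


print(final_state(""))      # (1, False)
print(final_state("abba"))  # (True, False)
print(final_state("abc"))   # (False, True)
```

The three calls correspond to states 1), 2) and 3) respectively: an empty input never enters the loop, abba reduces to the empty string, and abc is left unchanged by ff.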