A word is formed by a number of letters of the alphabet. We show words with capital letters. A sentence is formed by a number of words. For example if $A=aa$ and $B=ab$ then the sentence $AB$ is equivalent to $aaab$. In this language, $A^n$ indicates $\underbrace{AA \cdots A}_{n}$. We have an equation when two sentences are equal. For example $XYX=YZ^2$ and it means that if we write the alphabetic letters forming the words of each sentence, we get two equivalent sequences of alphabetic letters. An equation is simplified, if the words of the left and the right side of the sentences of the both sides of the equation are different. Note that every word contains one alphabetic letter at least. $\text{a})$We have a simplified equation in terms of $X$ and $Y$. Prove that both $X$ and $Y$ can be written in form of a power of a word like $Z$.($Z$ can contain only one alphabetic letter). $\text{b})$ Words $W_1,W_2,\cdots , W_n$ are the answers of a simplified equation. Prove that we can produce these $n$ words with fewer words. $\text{c})$ $n$ words $W_1,W_2,\cdots , W_n$ are the answers of a simplified system of equations. Define graph $G$ with vertices ${1,2 \cdots ,n}$ such that $i$ and $j$ are connected if in one of the equations, $W_i$ and $W_j$ be the two words appearing in the right side of each side of the equation.($\cdots W_i = \cdots W_j$). If we denote by $c$ the number of connected components of $G$, prove that these $n$ words can be produced with at most $c$ words. Proposed by Mostafa Einollah Zadeh Samadi
Problem
Source: Iranian 3rd round Combinatorics exam P4 - 2014
Tags: combinatorics proposed, combinatorics
alibez
28.09.2014 16:45
this problem proposed by Mostafa Einollah Zadeh Samadi
G81928128
10.09.2019 19:12
This problem took me forever to parse, but it's cool. I think the wording should be changed a little, though; part (a) should be, for instance, "Prove that for every solution $(X,Y)$, both $X$ and $Y$ can be written in form of a power of a word like $Z$." Likewise for the other two parts. Because equations like $XY=YX$ have the general solution $X=Z^a,Y=Z^b$ for any positive integers $a,b$, which doesn't exactly match what is given in the original question.
Parts (a) and (b)Note (a) is a special case of (b), so it suffices to just prove (b).
To do this, consider the rightmost words on either sides of the equation, say $W_i$ and $W_j$. If both have equal length, they are equal and so we are done. Otherwise, WLOG assume $W_i$ is longer than $W_j$. In this case, we can re-express $W_i$ as $W_i'W_j$ and substitute this back into the equation, replacing *all* copies of $W_i$, to yield $\ldots W_i'W_j = \ldots W_k W_j$, where $1\leq k\leq n$, and $k$ may possibly be equal to $j$. Regardless, cancelling the $W_j$ from each side gives another simplified equation in $n$ words, $\ldots W_i' = \ldots W_k$ (note that $W_i'$ cannot be the rightmost term on the RHS). Note that $W_i'$ is strictly shorter than $W_i$. Repeating this process, we must eventually reach a point where we can no longer perform this substitution (as the sum of the words' lengths is strictly decreasing), which is when the rightmost words on each side are of equal length, so we can equate them, expressing the substituted words in the form of $n-1$ words. Reversing the substitutions lets us express the original $n$ words in terms of these $n-1$ words.
RemarkI believe there are ways to solve (a) without (b). One way would be to just forcibly construct the shared word: WLOG let $X$ be shorter than $Y$, so $Y$ both begins and ends in $X$. You can then show that the first letter in $Y$ appears every $d$ letters in $X$ and $Y$, where $d$ is the $\gcd$ of the lengths of $X$ and $Y$. Repeat with the second, third, ..., $d^\text{th}$ letters of $Y$ and you've constructed $Z$. Related problem: https://artofproblemsolving.com/community/c6h91083p532713
Part (c)General idea: perform part (b) repeatedly and keep substituting the results into the remaining equations, each time reducing the number of word-variables. Then, you just need to check that each substitution doesn't render an unused equation "useless", increasing your degrees of freedom.
Consider a spanning tree of each component of $G$; WLOG delete all equations whose edges are not in a spanning tree. Now there are $n-c$ edges in total, so $n-c$ remaining equations.
Pick one of the equations (involving, say, $W_1,\ldots,W_k$) and perform the algorithm from part (b) on it. This lets you express $W_1,\ldots,W_k$ in terms of fewer words, say $W_1',\ldots,W_{k-1}'$. We want to see how this affects $G$.
We can group the vertices of $G$ into $k$ groups: one group for the vertices representing words which, after the substitution, become expressed in the form $\ldots W_i'$ for each $1\leq i \leq k-1$, and one group for the remaining vertices. We want to construct a graph $G'$ that corresponds to the situation after all the $W_1,\ldots,W_k$ have been replaced by all the $W_1',\ldots,W_{k-1}'$; this graph will have $k-1$ vertices representing the words $W_1',\ldots,W_{k-1}'$, as well as $n-k$ vertices representing the unaffected words in the last group. It suffices to show that the number of components of $G'$ is at most $c$.
Let's look at edges that join two vertices from two different groups in $G$, say the groups corresponding to $W_i'$ and $W_j'$. Then the equation this edge represents will become $\ldots W_i' = \ldots W_j'$ after substitution, which means that there will be an edge drawn between the vertices representing $W_i'$ and $W_j'$ in $G'$.
Now, we will show that each group contains at least one vertex. Let's look at the algorithm again. Suppose we start by writing "$W_1=W_1$, ..., $W_k=W_k$" on a whiteboard and update it as each substitution $W=XY$ is made (so after the first substitution, one of the equations becomes $W_i=W_i'W_j$). Note that until the process terminates, there will always be $k$ distinct terms on all the RHSs; after it terminates, there will be $k-1$, which are the $W_1',\ldots,W_k'$.
We claim that after the first substitution, all but one of the $k$ existing terms on the RHSs is the last term in at least one of the equations; the last remaining term will be the rightmost word on one side of the equation after cancellations are made (for example, after the first substitution $W_i=W_i'W_j$ is made, this word in concern is $W_i'$). It suffices to show that each successive substitution-cancellation step maintains this condition; call the one special word $W^*$.
Now, given some equation $\ldots W^* = \ldots W$, we either make the substitution $W^* = (W^*)'W$, or $W = W' W^*$. In the first case, all words on the RHSs remain the last term in the expression of some $W_i$, except for $(W^*)'$, which, after cancelling $W$ from both sides, is the rightmost term of one of the sides, and the condition is met. In the other case, $W^*$ becomes the last term of one of the expressions on the whiteboard (the one that $W$ was formerly the last term of), while $W'$ takes the place of $W^*$ after cancellation.
In the final step, we are equating $W^*$, which is not the last term of any of the expressions on the whiteboard, with some other $W$, which is. Thus, in the end, all the $n-1$ new words $W_1',\ldots,W_{k-1}'$ are the rightmost terms in the expression of at least one $W_i$ (and in fact, the only repeat occurs between the expressions of $W_i$ and $W_j$, i.e. those involved in the first step).
This means that each group actually only contains one vertex, except for one group, which contains two (the two representing $W_i$ and $W_j$). This means that the new graph $G'$ is identical to the original $G$, except that the two vertices representing $W_i$ and $W_j$ are merged into one new vertex, and everything joined to either of them (this is an exclusive either, otherwise $G$ has a cycle) is joined to this new vertex.
This does not change the number of components of the graph, and so, repeating this process, we eventually get $c$ isolated vertices (as the number of edges is strictly decreasing). Reversing the substitutions as before, we can express the $n$ words in terms of $c$ words.