AoPS - Problem

Problem

Source:

Tags: function, induction, combinatorics proposed, combinatorics

Omid Hatami

09.06.2010 04:08

$S,T$ are two trees without vertices of degree 2. To each edge is associated a positive number which is called length of this edge. Distance between two arbitrary vertices $v,w$ in this graph is defined by sum of length of all edges in the path between $v$ and $w$. Let $f$ be a bijective function from leaves of $S$ to leaves of $T$, such that for each two leaves $u,v$ of $S$, distance of $u,v$ in $S$ is equal to distance of $f(u), f(v)$ in $T$. Prove that there is a bijective function $g$ from vertices of $S$ to vertices of $T$ such that for each two vertices $u,v$ of $S$, distance of $u,v$ in $S$ is equal to distance of $g(u)$ and $g(v)$ in $T$.

JBL

12.06.2010 20:53

Here's an outline of a proof that I believe works, but that I don't find satisfying: First, we rephrase the question in the following way: we are given the pairwise distances between a set of nodes (the leaves) that are known a priori to be defined by a tree in the way described, and we wish to show that this tree is unique. Let $n$ be the number of leaves. Now, we consider small cases: if there are two leaves at distance $a$, then the only tree of the desired sort has a single edge of length $a$. If there are three leaves at pairwise distances $a, b, c$, then there is a single tree of the desired sort with one internal node connected by edges of length $\frac{a + b - c}{2}, \frac{a - b + c}{2}, \frac{-a + b + c}{2}$ to the three leaves. If there are four leaves $1, 2, 3, 4$ at distance $a_{12}, a_{13}, \ldots, a_{34}$ from each other, we have two possibilities: * if there exist a partition $\{i, j\} \cup \{k, \ell\} = \{1, 2, 3, 4\}$ such that $a_{ij} + a_{k\ell} = a_{i \ell} + a_{jk} = a_{ik} + a_{j\ell}$, then the tree is a star graph with four leaves. Otherwise, * the graph has two internal nodes connected to each other, with each connected to two of the leaves. In both of these cases, the edge lengths are uniquely determined by the leaf distances. (These claims require checking, which I omit here.) Now suppose we have $n > 4$ leaves. There are two cases: 1) There are four leaves, $t, u, v, w$, such that $d(t, u) + d(v, w) = d(t, w) + d(u, w) = d(t, v) + d(u, w)$. It follows from the discussion of the case $n = 4$ that there is an internal node $z$ such that the paths connecting $t, u, v$ and $w$ to $z$ are pairwise disjoint, and moreover we can compute the lengths of these four paths and so also the distances $d(t, z), d(u, z), d(v, z)$ and $d(w, z)$. Now for any leaf $x$, by comparing the distances $d(x, y)$ for $y \in \{t, u, v, w\}$ we may deduce whether the path from $x$ to $z$ contains any nodes in common with one of the paths connecting $z$ to an element of $\{t, u, v, w\}$. It follows that we may decompose our tree into five smaller subtrees rooted at $z$: four containing one element each of $\{t, u, v, w\}$ and a fifth that contains none of these vertices. From here, we may proceed by induction; each of the five smaller trees is uniquely determined, so the larger tree is as well. 2) There are no such vertices. Then every internal node has degree exactly 3. It follows that in this situation, the leaves come in pairs such that the two leaves in each pair share their unique neighbor. We can figure out this pairing as follows: $u$ and $v$ are paired if and only if there is a constant $c$ such that $d(u, z) = d(v, z) + c$ for every other leaf $z$. In this case, the common neighbor $t$ of $u$ and $v$ satisfies $d(u, t) + d(v, t) = d(u, v)$ and $d(u, t) - d(v, t) = c$, so we may compute the lengths of the edges $tu$ and $tv$ and so the distances from $t$ to all leaves. Now erase $u$ and $v$ and proceed by induction. Obviously there are lots of technical details missing; is there a nicer proof? Also, I would be a little shocked if this were original to the Iranian TST; does anyone know where it has appeared before? (It feels very much like there should be some implications from this problem for phylogenetic trees, but perhaps this is a misleading emotion; for one thing, these trees aren't directed in the way that phylogenetic trees are.)

ocha

22.09.2010 10:22

@JBL: nice solution. I think your final observation is the strongest; enough to prove the problem without the previous cases. Here's a sketch First, instead of assuming no vertecies have degree two, allow such vertecies to exist iff the pairwise distances between them and the other nodes are known. Hence, whenever they occur we can simply ignore them and show that the rest of the tree is uniquely defined. It will follow that the vertecies of degree two are uniquely defined as well. Consider the set, $L_0$, of nodes with their pairwise distances given. Among the leaves group pairs, $\{u,v\}$ together if $d(u,z)=d(v,z)+c$ for all other $z\in L_0$ and some constant $c$. If there are pairs $\{u,v\},\{v,w\},\{w,u\}$ then make that one group $\{u,v,w\}$. We will then have $L_0$ partitioned into several disjoint subsets. For each subset the nodes are all connected to one unique vertex. Now it can be shown that 1) For each subset with more than one element the distances from those elements to that unique vertex are uniquely defined. So we can erase that subset, which leaves the unique vertex, (which is now a node). So the set of nodes goes from $L_i \to L_{i+1}$ with decreased cardinality, and their pairwise distances still known. 2) At any stage we can always find a subset of $L_i$ with more than one element. Suppose the set $L_i$ of nodes are part of a tree $T_i$, if we remove all the nodes then $T_i\to T_i'$ is still a tree, so there exists at least two nodes, say $x,y$, in $T_i'$. Then back in $T$, $x$ and $y$ have degree at least $3$ because we are ignoring vertecies of degree $2$, and they can't have degree $1$. Eventually $L_i$ will become either one or two vertecies, which uniquely define a tree. Then working backwards, reattaching the nodes erased at each stage, $L_0$ must uniquely define a tree.

JBL

23.09.2010 05:55

Looks very nice!

MellowMelon

24.05.2012 14:28

Here's a full solution that's not spread between multiple posts. Hidden for length.

Let $d_G(u,v)$ be the distance between vertices $u,v$ in a graph $G$ as defined in the problem. We begin with a lemma on trees. Lemma: For any three leaves $x,y,z$ of a tree $T$, consider the path $P_z$ from $x$ to $y$, $P_y$ from $x$ to $z$, and $P_x$ from $y$ to $z$. Then there exists exactly one vertex $w$, which we call the center, in the intersection $P_x \cap P_y \cap P_z$, where the paths are taken to contain both vertices and edges. Furthermore we have $d_T(x,y) = d_T(x,w) + d_T(w,y)$ and similarly for the other pairs. Proof: Consider the intersection of $P_y \cap P_z$. Both $P_y, P_z$ are paths starting from $x$, which is a leaf. At some point, they diverge, and do not rejoin since $T$ is a tree. If $w$ is the last vertex the paths share in common, then $P_y \cap P_z$ is a path from $x$ to $w$; let it be $Q_x$. Similarly define $Q_y, Q_z$. Notice that $Q_x \cap Q_y$, $Q_y \cap Q_z$, and $Q_z \cap Q_x$ are all equal to $\{w\}$. Furthermore, since a path from $x$ to $y$ is $Q_x \cup Q_y$ and there is only one such path, $P_z = Q_x \cup Q_y$ and similarly for $P_y, P_x$. So \[ P_x \cap P_y \cap P_z = Q_x \cap Q_y \cap Q_x = \{w\}. \] Finally from $P_z = Q_x \cup Q_y$ we deduce $d_T(x,y) = d_T(x,w) + d_T(w,y)$, and similarly for the other pairs. $\blacksquare$ Notice that for any leaf $x$ and non-leaf $w$, we can find $y,z$ such that $w$ is the center of $x,y,z$. Since $w$ is a non-leaf it has degree at least 3, so deleting it creates at least three connected components. Choose three vertices from different components with one of them $x$; this gives us our $x,y,z$ with $w$ as the center. Next we show that if $w$ is a non-leaf vertex and the center of $x,y,z$ in $S$, and $W$ is the center of $f(x),f(y),f(z)$ in $T$, then $d_S(x,w) = d_T(f(x),W)$ and similarly for $y,z$. Define $a = d(y,z)$, $b = d(z,x)$, and $c = d(x,y)$. From the lemma we know there exist positive $a', b', c'$ with $a = b' + c'$, $b = c' + a'$, and $c = a' + b'$; such a representation is unique. Let $a'' = d_T(f(x), W)$, $b'' = d_T(f(y), W)$, and $c'' = d_T(f(z), W)$. Since $W$ is the center of $f(x), f(y), f(z)$, we have $b'' + c'' = d_T(f(y),f(z)) = d_S(y,z) = a$. Similarly, $c'' + a'' = b$ and $a'' + b'' = c$. Therefore $a' = a''$, $b' = b''$, and $c' = c''$. Then $d_S(x,w) = a' = a'' = d_T(f(x), W)$, as desired. We now define $g$. For a leaf $v \in S$, set $g(v) = v$. For $w$ a non-leaf vertex of $S$, choose $x,y,z$ with $w$ as the center. Define $g(w)$ as the center of $f(x), f(y), f(z)$. By the previous paragraph this is well-defined. All that is left to show is that $d_S(u,v) = d_T(g(u),g(v))$ for all $u,v \in S$. Delete $u$ from $S$ and choose two leaves $x_u, y_u$ that are not in the same component as $v$ or eachother. Likewise choose $x_v, y_v$. Then $u$ is the center of $x_u, x_v, y_u$ and $v$ is the center of $x_u, x_v, y_v$, with the path from $x_u$ to $x_v$ passing through $u,v$ in that order. Therefore $d_S(x_u,v) - d_S(x_u,u) = d_S(u,v)$ (1). In $T$, $g(u)$ is the center of $g(x_u, x_v, y_u)$ and $g(v)$ is the center of $g(x_u, x_v, y_v)$, so $g(u), g(v)$ are still on the path between $x_u, x_v$. Since $d_T(g(x_u), g(u)) = d_S(x_u,u) < d_S(x_u,v) = d_T(g(x_u), g(v))$, $g(u)$ is on the path between $g(x_u)$ and $g(v)$. Therefore, $d_T(g(x_u), g(v)) - d_T(g(x_u), g(u)) = d_T(g(u), g(v))$. Combining this with (1) shows $d_S(u,v) = d_T(g(u), g(v))$ as desired.