Let $A_1$, $A_2$, $\cdots$, $A_m$ be $m$ subsets of a set of size $n$. Prove that $$ \sum_{i=1}^{m} \sum_{j=1}^{m}|A_i|\cdot |A_i \cap A_j|\geq \frac{1}{mn}\left(\sum_{i=1}^{m}|A_i|\right)^3.$$
Problem
Source: China, Wuhan. Dec 31, 2017
Tags: combinatorics, inequalities, China TST
02.01.2018 18:43
Is there any probabilistic argument for this?
05.01.2018 06:43
A heuristic for why this problem is true is that the LHS counts the number of walks of length three in a bipartite graph with $m$ vertices on one side and $n$ on the other (the edge $v_iw_j$ is present iff $i\in A_j$). For a triple of edges chosen at random, the probability that the first two share a vertex on the side with $m$ vertices is $\frac 1m$, and the probability that the last two share a vertex on the side with $n$ vertices is $\frac 1n$, so you would expect there to be at least $\frac{e^3}{mn}$ such walks, where $e=\sum_i |A_i|$ is the number of edges; this is exactly the RHS. Unfortunately, we cannot just multiply the probabilities because the events are not independent, but maybe this can provide some insight into the solution.
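If it helps to see the correspondence concretely, here is a small numerical sanity check (not a proof, just an illustration): it builds a random incidence matrix, confirms that the LHS equals the number of length-3 walks counted through the biadjacency matrix, and compares both against $e^3/(mn)$. The sizes, density, and seed below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 7

# Random incidence (biadjacency) matrix: B[x, i] = 1 iff element x lies in A_i.
B = (rng.random((n, m)) < 0.4).astype(int)

sizes = B.sum(axis=0)                # |A_i| for each i
inter = B.T @ B                      # inter[i, j] = |A_i ∩ A_j|
lhs = int((sizes[:, None] * inter).sum())

# Walks of length 3 of the form x - A_i - y - A_j: sum of the entries of B B^T B.
walks = int((B @ B.T @ B).sum())

e = int(sizes.sum())                 # number of edges = sum |A_i|
print(lhs, walks, e**3 / (m * n))
assert lhs == walks
assert lhs * m * n >= e**3
```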
05.01.2018 06:51
Good instinct there. The missing ingredient you need to wrap this up is Hölder's inequality.
05.01.2018 07:35
Thanks
12.01.2018 06:25
\[\sum_{i=1}^m\sum_{j=1}^m |A_i|\cdot|A_i\cap A_j|=\sum_{A_i}\sum_{A_j}|A_i|\cdot \sum_{y\in A_i\cap A_j} 1=\sum_{A_i}\sum_{y\in A_i}\left(\sum_{A_j\ni y}1\right)|A_i|\]\[\ge \frac{\left(\sum_{A_i}\sum_{y\in A_i} 1\right)^3}{\left(\sum_{A_i}\sum_{y\in A_i} \frac{1}{\sum_{A_j\ni y} 1}\right)\left(\sum_{A_i}\sum_{y\in A_i} \frac{1}{|A_i|}\right)}=\frac{\left(\sum_{i=1}^m |A_i|\right)^3}{\left(\sum_{y=1}^n\frac{1}{\sum_{A_j\ni y} 1}\sum_{A_i\ni y} 1\right)\left(\sum_{A_i}\frac{1}{|A_i|}\sum_{y\in A_i} 1\right)}\]\[=\frac{\left(\sum_{i=1}^m |A_i|\right)^3}{\left(\sum_{y=1}^n 1\right)\left(\sum_{A_i} 1\right)}=\frac{1}{mn}\left(\sum_{i=1}^m |A_i|\right)^3\]where the inequality follows by Hölder. (If some $A_i$ is empty or some element lies in no $A_j$, the corresponding terms simply vanish, the two factors in the denominator are at most $m$ and $n$, and the bound only gets stronger.)
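As a quick illustration of why the two sums in the denominator collapse to $n$ and $m$, here is a tiny check on a made-up instance (the sets below are arbitrary; the exact identities hold whenever every set is nonempty and every element is used):

```python
from fractions import Fraction

# A concrete instance with every set nonempty and every element used.
A = [{1, 2, 3}, {2, 4}, {1, 4, 5}, {3, 5}]        # subsets of {1, ..., 5}
m, n = len(A), 5

# d(y) = number of sets containing y  (the inner sum over A_j containing y).
d = {y: sum(1 for S in A if y in S) for y in range(1, n + 1)}

# Sum over incidences (A_i, y) with y in A_i of 1/d(y), and of 1/|A_i|.
sum_inv_d    = sum(Fraction(1, d[y])   for S in A for y in S)
sum_inv_size = sum(Fraction(1, len(S)) for S in A for y in S)

print(sum_inv_d, sum_inv_size)   # n and m
assert sum_inv_d == n and sum_inv_size == m
```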
12.01.2018 08:18
Nice. Thank you.
15.01.2018 19:50
It is just some calculation. I think it is an easy question for a China TST.
20.01.2018 04:14
As ABCDE says above, the problem (after some translation) is asking us to show that the number of walks of length 3 in a bipartite graph with $m$ vertices on one side, $n$ vertices on the other, and $e$ edges is at least $\frac{e^3}{mn}$. Here's a proof of this fact (not original, I took it from somewhere). Notationally, let $P(G)$ denote the number of walks of length 3 in a graph $G$.

Lemma 1: If $G$ is bipartite with $m$ vertices on one side, $n$ on the other, and $e$ edges, then $P(G) \ge \frac{1}{27} \frac{e^3}{mn}.$

Proof: Let the sides be $A, B$ with $|A| = m, |B| = n.$ Define \[ A' = \{ v \in A : d(v) \ge \frac{e}{3m} \} \text{ and } B' = \{ v \in B : d(v) \ge \frac{e}{3n} \}. \]Now we count the edges with one endpoint in $A'$ and the other in $B'$. The number of edges NOT of this form is at most \[ \sum_{v \in A \setminus A'} d(v) + \sum_{v \in B \setminus B'} d(v) \le \sum_{v \in A \setminus A'} \frac{e}{3m} + \sum_{v \in B \setminus B'} \frac{e}{3n} \le \frac{2e}{3}. \]So at least $\frac{e}{3}$ edges have one endpoint in $A'$ and the other in $B'.$ Each such edge, used as the middle edge of a walk, contributes at least $\frac{e}{3n}\cdot\frac{e}{3m}$ walks of length 3 (choose any neighbor of each endpoint), and different middle edges give different walks, so in total we get at least \[ \frac{e}{3} \cdot \frac{e}{3n} \cdot \frac{e}{3m} = \frac{1}{27} \frac{e^3}{mn} \]walks of length 3.

Now we use this to finish the problem. For a bipartite graph $G$ with sides $A$ and $B$, define the bipartite graph $G^{\otimes k}$ whose two sides consist of the tuples $(a_1, a_2, \dots, a_k)$ with $a_i \in A$ for all $i$ and the tuples $(b_1, b_2, \dots, b_k)$ with $b_i \in B$ for all $i$; two such tuples are adjacent if and only if $a_i$ and $b_i$ are adjacent for every $i$. The two sides of $G^{\otimes k}$ have sizes $m^k$ and $n^k$, the graph has $e^k$ edges, and it is easy to check that $P(G^{\otimes k}) = P(G)^k.$ By the lemma, $P(G)^k = P(G^{\otimes k}) \ge \frac{1}{27} \cdot \frac{e^{3k}}{m^k n^k}.$ Taking $k$-th roots and letting $k \to \infty$ completes the proof.

Remark: This is called the tensor-power trick. It is really useful.
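Here is a short numerical sketch of both ingredients on an arbitrary random graph: Lemma 1, and the fact that the biadjacency matrix of $G^{\otimes 2}$ is a Kronecker square, which is why the walk count is multiplicative. Walks are counted through the biadjacency matrix, which is not how the proof argues but is convenient for checking; this is only a sanity check, not the proof.

```python
import numpy as np

def walks3(B):
    """Number of length-3 walks x - u - y - v in the bipartite graph whose
    biadjacency matrix is B (rows = one side, columns = the other)."""
    return int((B @ B.T @ B).sum())

rng = np.random.default_rng(1)
m, n = 4, 6
B = (rng.random((n, m)) < 0.5).astype(int)
e = int(B.sum())

# Lemma 1: walks3(G) >= e^3 / (27 m n).
assert 27 * m * n * walks3(B) >= e**3

# Tensor power G ⊗ G: its biadjacency matrix is the Kronecker product,
# so the walk count is multiplicative, which is what the k-th root argument uses.
B2 = np.kron(B, B)
assert B2.shape == (n * n, m * m) and B2.sum() == e * e
assert walks3(B2) == walks3(B) ** 2
```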
07.02.2018 11:59
Looking at the sum from the viewpoint of the elements: for each $i$, every element $a_r \in A_i$ contributes $f(a_r) \cdot |A_i|$ to the sum, where $f(a_r)$ is the number of subsets containing $a_r$. Construct a bipartite graph with the elements on the left side and the subsets on the right side; then this sum is exactly the number of walks of length 3. Now look at the middle edge of each walk: if its endpoints are $a_x$ on the left and $b_y$ on the right, the total walk count is $\sum d(a_x)d(b_y)$, the sum running over all edges. If the number of edges is $e$, then by Hölder's inequality, $\left[ \sum \frac{1}{d(a_x)} \right] \left[ \sum \frac{1}{d(b_y)} \right] \left[ \sum d(a_x)d(b_y) \right] \ge e^3$, where each sum is taken over all edges. Since $\sum \frac{1}{d(a_x)} \le n$ and $\sum \frac{1}{d(b_y)} \le m$, this gives that the required sum is at least $e^3/(mn)$.
23.03.2018 00:50
What does the double summation mean? Could someone give an example?
08.09.2018 09:40
Construct a bipartite graph and count the number of walks of length 3 in it.
07.11.2020 00:49
Here is another nice proof of this problem, using entropy, from a YouTube video of Tim Gowers. It suffices to show that a bipartite graph with $m$ vertices on one side, $n$ vertices on the other side, and $e$ edges has at least $\frac{e^3}{mn}$ walks of length 3. Let the graph have sides $X$ and $Y$ with $|X|=m$ and $|Y|=n$, edge set $E$, and set $P$ of walks of length 3. We wish to show that $|P|\cdot|X|\cdot|Y|\ge|E|^3$.

Consider the following joint distribution on a walk, a vertex in $X$, and a vertex in $Y$: choose an edge in $E$ uniformly at random, say it is $u_1-v_1$ with $u_1\in X,v_1\in Y$; then choose a uniform random neighbor $v_2$ of $u_1$ and a uniform random neighbor $u_2$ of $v_1$. Take the random walk and vertices to be $(v_2,u_1,v_1,u_2)$, $u_1$, $v_1$.

The first observation is that the edges $u_1-v_2$ and $u_2-v_1$ are also distributed uniformly. Indeed, conditioned on $u_1$, the edge $u_1-v_2$ has the same distribution as the edge $u_1-v_1$ (both are uniform over the edges at $u_1$), and $u_1$ is the $X$-endpoint of a uniform random edge; the same argument works for $u_2-v_1$. Furthermore, conditioned on $u_1$, $v_2$ is independent of $(u_2,v_1)$, and conditioned on $v_1$, $u_2$ is independent of $(u_1,v_2)$.

We now compute the entropy of $(u_1,u_2,v_1,v_2)$ using these observations: \[H[u_1,u_2,v_1,v_2]=H[u_1,v_2]+H[v_1\mid u_1,v_2]+H[u_2\mid u_1,v_1,v_2]=H[u_1,v_2]+H[v_1\mid u_1]+H[u_2\mid v_1]\]\[=H[u_1,v_2]+H[u_1,v_1]+H[u_2,v_1]-H[u_1]-H[v_1].\]Because $u_1-v_2$, $u_1-v_1$, and $u_2-v_1$ are all distributed as uniform random edges, this equals $3\log|E|-H[u_1]-H[v_1]\ge 3\log|E|-\log m-\log n$, since entropy is maximized by the uniform distribution. On the other hand, $(v_2,u_1,v_1,u_2)$ always forms a walk of length 3, so its entropy is at most $\log|P|$. Hence $\log|P|\ge3\log|E|-\log m-\log n$, i.e. $|P|\cdot|X|\cdot|Y|\ge|E|^3$, as desired.
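If it helps, the argument can be checked numerically on a toy graph: enumerate the distribution described above, compute its entropy, and compare it with $3\log|E|-\log m-\log n$ and with $\log$ of the number of walks. The particular edge list below is an arbitrary small example, not anything from the proof.

```python
import math
from collections import defaultdict
from itertools import product

# Toy bipartite graph: X = {0,1,2} (size m), Y = {0,1} (size n); edges as pairs (u, v).
E = [(0, 0), (0, 1), (1, 0), (2, 1)]
m, n = 3, 2
nbr_X = defaultdict(list)   # neighbors in Y of a vertex u in X
nbr_Y = defaultdict(list)   # neighbors in X of a vertex v in Y
for u, v in E:
    nbr_X[u].append(v)
    nbr_Y[v].append(u)

# Distribution of (u1, u2, v1, v2): pick a uniform edge u1-v1, then a uniform
# neighbor v2 of u1 and a uniform neighbor u2 of v1, independently.
p = defaultdict(float)
for u1, v1 in E:
    for v2, u2 in product(nbr_X[u1], nbr_Y[v1]):
        p[(u1, u2, v1, v2)] += (1 / len(E)) * (1 / len(nbr_X[u1])) * (1 / len(nbr_Y[v1]))

H = -sum(q * math.log(q) for q in p.values())
walks = sum(1 for u1, v1 in E
              for v2 in nbr_X[u1] for u2 in nbr_Y[v1])   # walks v2 - u1 - v1 - u2

assert H >= 3 * math.log(len(E)) - math.log(m) - math.log(n) - 1e-9
assert math.log(walks) >= H - 1e-9      # entropy is at most log of the support size
print(walks, math.exp(H), len(E) ** 3 / (m * n))
```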
04.09.2021 00:29
Just double counting: WLOG let the ground set be $\{1,2,\ldots,n\}$, and let $a_{x}$ denote the number of indices $i$ with $x\in A_{i}$. Observe that \[\sum_{j=1}^{m} |A_{i}\cap A_{j}| = \sum_{x\in A_{i}} a_{x},\]because each $x\in A_{i}$ lies in exactly $a_{x}$ of the sets $A_{j}$. Now, we have \[mn\sum_{i=1}^{m}|A_{i}|\sum_{x\in A_{i}} a_{x} \geq mn \sum_{i=1}^{m}\left(\sum_{x\in A_{i}} \sqrt{a_{x}}\right)^2\geq n\left(\sum_{i=1}^{m} \sum_{x\in A_{i}} \sqrt{a_{x}}\right)^2 = n\left(\sum_{x=1}^{n} a_{x}^{3/2}\right)^2,\]where both inequalities follow from Cauchy-Schwarz and the last step uses the fact that each $\sqrt{a_{x}}$ is counted $a_{x}$ times. On the other hand, by Hölder's inequality (with $n$ ones), \[(\underbrace{1 + 1 + \cdots + 1}_{n})^{\frac{1}{3}} \left(\sum_{x=1}^{n} a_{x}^{\frac{3}{2}}\right)^{\frac{2}{3}} \geq \sum_{x=1}^{n} a_{x} = \sum_{i=1}^{m}|A_{i}|.\]Cubing both sides gives $n\left(\sum_{x=1}^{n} a_{x}^{3/2}\right)^2 \geq \left(\sum_{i=1}^{m}|A_{i}|\right)^3$, which combined with the chain above is the desired inequality.
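Each link of this chain can be checked numerically; the sketch below does so on a random instance (the sizes, density, and seed are arbitrary), computing the successive quantities in order.

```python
import random

random.seed(0)
m, n = 6, 8
A = [{x for x in range(1, n + 1) if random.random() < 0.5} for _ in range(m)]
a = {x: sum(1 for S in A if x in S) for x in range(1, n + 1)}   # a_x = #{i : x in A_i}

S1 = m * n * sum(len(S) * sum(a[x] for x in S) for S in A)      # m*n times the LHS
S2 = m * n * sum(sum(a[x] ** 0.5 for x in S) ** 2 for S in A)   # after Cauchy-Schwarz in each set
S3 = n * sum(sum(a[x] ** 0.5 for x in S) for S in A) ** 2       # after Cauchy-Schwarz over the sets
S4 = n * sum(v ** 1.5 for v in a.values()) ** 2                 # same as S3, regrouped by element
S5 = sum(a.values()) ** 3                                       # (sum |A_i|)^3, via Hölder

assert S1 >= S2 - 1e-6 and S2 >= S3 - 1e-6
assert abs(S3 - S4) < 1e-6 * max(S4, 1.0)
assert S4 >= S5 - 1e-6
print(S1, S2, S3, S5)
```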
15.01.2023 11:51
adamov1 wrote: \[\sum_{i=1}^m\sum_{j=1}^m |A_i|\cdot|A_i\cap A_j|=\sum_{A_i}\sum_{A_j}|A_i|\cdot \sum_{y\in A_i\cap A_j} 1=\sum_{A_i}\sum_{y\in A_i}\left(\sum_{A_j\ni y}1\right)|A_i|\]\[\ge \frac{\left(\sum_{A_i}\sum_{y\in A_i} 1\right)^3}{\left(\sum_{A_i}\sum_{y\in A_i} \frac{1}{\sum_{A_j\ni y} 1}\right)\left(\sum_{A_i}\sum_{y\in A_i} \frac{1}{|A_i|}\right)}=\frac{\left(\sum_{i=1}^m |A_i|\right)^3}{\left(\sum_{y=1}^n\frac{1}{\sum_{A_j\ni y} 1}\sum_{A_i\ni y} 1\right)\left(\sum_{A_i}\frac{1}{|A_i|}\sum_{y\in A_i} 1\right)}\]\[=\frac{\left(\sum_{i=1}^m |A_i|\right)^3}{\left(\sum_{y=1}^n 1\right)\left(\sum_{A_i} 1\right)}=\frac{1}{mn}\left(\sum_{i=1}^m |A_i|\right)^3\]where the inequality follows by Holder.

I have a similar solution, but it looks a bit cleaner. Let $\left[x\in A_i\right]=\begin{cases}1&x\in A_i\\0&x\notin A_i\end{cases}$; then $|A_i|=\sum\limits_x\left[x\in A_i\right]$ and $|A_i\cap A_j|=\sum\limits_x\left[x\in A_i\right]\left[x\in A_j\right]$. Let $B_y=\sum_{j=1}^m\left[y\in A_j\right]$.

$\therefore\begin{aligned}LHS&=\sum_{i=1}^m\sum_{j=1}^m\left(\sum\limits_x\left[x\in A_i\right]\right)\left(\sum\limits_y\left[y\in A_i\right]\left[y\in A_j\right]\right)\\&=\sum_{i=1}^m\sum_{j=1}^m\sum\limits_x\sum\limits_y\left[x\in A_i\right]\left[y\in A_i\right]\left[y\in A_j\right]\\&=\sum_{i=1}^m\sum\limits_y\left[y\in A_i\right]\left(\sum\limits_x\left[x\in A_i\right]\right)\left(\sum_{j=1}^m\left[y\in A_j\right]\right)\\&=\sum_{i=1}^m\sum\limits_y\left[y\in A_i\right]\cdot |A_i|\cdot B_y\end{aligned}$

$\because m=\sum_{i=1}^m1=\sum_{i=1}^m\sum\limits_y\frac{\left[y\in A_i\right]}{|A_i|},\quad n=\sum_{i=1}^m\sum\limits_y\frac{\left[y\in A_i\right]}{B_y}$

$\therefore$ By Hölder's inequality, $LHS\cdot n\cdot m\geqslant\left(\sum_{i=1}^m\sum\limits_y\left[y\in A_i\right]\right)^3=\left(\sum_{i=1}^{m}|A_i|\right)^3$
06.09.2023 04:00
Construct a grid with $n$ rows and $m$ columns numbered from $1$ to $n$ and $1$ to $m$ respectively. For a column $i$, take each $x \in A_i$ and color the cell in row $x$ and column $i$. Let $r_a$ and $c_b$ denote the number of colored cells in row $a$ and column $b$ respectively. The LHS clearly counts the number of ways we can pick a colored cell, and then pick a colored cell in its row and another one in its column (possibly equal to the colored cell itself), so we want to prove $$mn\sum_{(i,j)\text{ colored}} r_ic_j \geq \left(\sum_{(i,j)\text{ colored}}1\right)^3.$$The key insight is that $\sum_{(i,j)\text{ colored}} r_i^{-1}\le n$, with equality when every row contains a colored cell (counting by row, each row with at least one colored cell contributes exactly $1$), and similarly $\sum_{(i,j)\text{ colored}} c_j^{-1}\le m$. Hence $mn \ge \left(\sum_{(i,j)\text{ colored}} r_i^{-1}\right)\left(\sum_{(i,j)\text{ colored}} c_j^{-1}\right)$, so it suffices to prove $$\left(\sum_{(i,j)\text{ colored}} r_i^{-1}\right)\left(\sum_{(i,j)\text{ colored}} c_j^{-1}\right)\left(\sum_{(i,j)\text{ colored}} r_ic_j\right)\geq \left(\sum_{(i,j)\text{ colored}} 1\right)^3,$$which is just Hölder's inequality. $\blacksquare$
26.11.2023 23:19
Very nice problem! WLOG all subsets are nonempty. We interpret the sets as cells in an $m \times n$ grid colored black. Let the $A_i$ be subsets of the set $\{1, 2, \dots, n\}$. Color the cell $(a, b)$ black if $b \in A_a$, and white otherwise. Denote by $r_i$ the number of black cells in the $i$th row and by $c_j$ the number of black cells in the $j$th column. Claim: We have \[ \sum_{(i, j) \ \text{black}} r_i^{-1} = m \qquad \text{and} \qquad \sum_{(i, j) \ \text{black}} c_j^{-1} \le n. \]Proof. For the first sum, order the $n$ columns uniformly at random. A given black cell in row $i$ is then the leftmost black cell of its row with probability $r_i^{-1}$, so by Linearity of Expectation the sum is the expected number of black cells that are leftmost in their row. Since every row contains a black cell, that number is always exactly $m$, so its expected value is $m$, as desired. The same argument for columns shows that the second sum equals the number of columns containing a black cell, which is at most $n$. Realize that the desired inequality is notationally equivalent to \[ \sum_{(i, j) \ \text{black}} r_ic_j \geq \frac{1}{mn} \left(\sum_{(i, j) \ \text{black}} 1\right)^3. \]By the Claim, $mn \ge \left(\sum_{(i, j) \ \text{black}} r_i^{-1}\right)\left(\sum_{(i, j) \ \text{black}} c_j^{-1}\right)$. Thus it suffices to show that \[ \left(\sum_{(i, j) \ \text{black}} r_ic_j\right) \left(\sum_{(i, j) \ \text{black}} r_i^{-1}\right) \left(\sum_{(i, j) \ \text{black}} c_j^{-1} \right)\geq \left(\sum_{(i, j) \ \text{black}} 1\right)^3. \]However, this holds by Hölder's inequality, as desired.
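The random-column-order step in the claim can also be checked empirically. The sketch below compares the Monte Carlo frequency with which each black cell comes first in its row against $1/r_i$; the grid, seed, and trial count are arbitrary choices for illustration.

```python
import random
from collections import Counter

random.seed(2)
m, n = 4, 6
# black[i][j] = True iff element j lies in A_i (row i = set A_i); keep every row nonempty.
black = [[random.random() < 0.5 for _ in range(n)] for _ in range(m)]
for row in black:
    if not any(row):
        row[0] = True
r = [sum(row) for row in black]                  # r_i = |A_i|

# Exact value of the claim's first sum.
assert abs(sum(1 / r[i] for i in range(m) for j in range(n) if black[i][j]) - m) < 1e-9

# Monte Carlo: under a uniformly random ordering of the columns,
# a black cell (i, j) is the first black cell of its row with probability 1 / r_i.
trials = 50000
first = Counter()
for _ in range(trials):
    order = random.sample(range(n), n)           # random column order
    for i in range(m):
        j = next(j for j in order if black[i][j])
        first[(i, j)] += 1

for (i, j), cnt in first.items():
    assert abs(cnt / trials - 1 / r[i]) < 0.02   # within Monte Carlo noise
```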
13.06.2024 00:25
Storage/Write-up Practice
13.06.2024 07:18
@above clever use of indicator function!