TY - JOUR
T1 - Exact and Soft Successive Refinement of the Information Bottleneck
AU - Charvin, Hippolyte
AU - Catenacci Volpi, Nicola
AU - Polani, Daniel
A2 - Lewandowsky, Jan
A2 - Bauch, Gerhard
N1 - © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY), https://creativecommons.org/licenses/by/4.0/
PY - 2023/9/19
Y1 - 2023/9/19
N2 - The information bottleneck (IB) framework formalises the essential requirement for efficient information processing systems to achieve an optimal balance between the complexity of their representation and the amount of information extracted about relevant features. However, since the representation complexity affordable by real-world systems may vary in time, the processing cost of updating the representations should also be taken into account. A crucial question is thus the extent to which adaptive systems can leverage the information content of already existing IB-optimal representations for producing new ones, which target the same relevant features but at a different granularity. We investigate the information-theoretic optimal limits of this process by studying and extending, within the IB framework, the notion of successive refinement, which describes the ideal situation where no information needs to be discarded for adapting an IB-optimal representation’s granularity. Thanks in particular to a new geometric characterisation, we analytically derive the successive refinability of some specific IB problems (for binary variables, for jointly Gaussian variables, and for the relevancy variable being a deterministic function of the source variable), and provide a linear-programming-based tool to numerically investigate, in the discrete case, the successive refinement of the IB. We then soften this notion into a quantification of the loss of information optimality induced by several-stage processing through an existing measure of unique information. Simple numerical experiments suggest that this quantity is typically low, though not entirely negligible. These results could have important implications for (i) the structure and efficiency of incremental learning in biological and artificial agents, (ii) the comparison of IB-optimal observation channels in statistical decision problems, and (iii) the IB theory of deep neural networks.
AB - The information bottleneck (IB) framework formalises the essential requirement for efficient information processing systems to achieve an optimal balance between the complexity of their representation and the amount of information extracted about relevant features. However, since the representation complexity affordable by real-world systems may vary in time, the processing cost of updating the representations should also be taken into account. A crucial question is thus the extent to which adaptive systems can leverage the information content of already existing IB-optimal representations for producing new ones, which target the same relevant features but at a different granularity. We investigate the information-theoretic optimal limits of this process by studying and extending, within the IB framework, the notion of successive refinement, which describes the ideal situation where no information needs to be discarded for adapting an IB-optimal representation’s granularity. Thanks in particular to a new geometric characterisation, we analytically derive the successive refinability of some specific IB problems (for binary variables, for jointly Gaussian variables, and for the relevancy variable being a deterministic function of the source variable), and provide a linear-programming-based tool to numerically investigate, in the discrete case, the successive refinement of the IB. We then soften this notion into a quantification of the loss of information optimality induced by several-stage processing through an existing measure of unique information. Simple numerical experiments suggest that this quantity is typically low, though not entirely negligible. These results could have important implications for (i) the structure and efficiency of incremental learning in biological and artificial agents, (ii) the comparison of IB-optimal observation channels in statistical decision problems, and (iii) the IB theory of deep neural networks.
KW - information bottleneck
KW - deep learning
KW - unique information
KW - incremental learning
KW - coarse-graining
KW - successive refinement
KW - Blackwell order
UR - http://www.scopus.com/inward/record.url?scp=85172309555&partnerID=8YFLogxK
U2 - 10.3390/e25091355
DO - 10.3390/e25091355
M3 - Article
C2 - 37761653
SN - 1099-4300
VL - 25
SP - 1
EP - 51
JO - Entropy
JF - Entropy
IS - 9
M1 - 1355
ER -