Next:Multivariate EntropiesUp:Shannon's Entropy
Previous:Measures of Uncertainty: Shannon's Go to:Table of Contents

Properties of Shannon's Entropy


The measure of uncertainty $ H_n(p_1,p_2,...,p_n)$ given by (1.7) satisfy many interesting properties. For simplicity, we shall take $ H(p_1,p_2,...,p_n)$ or $ H(P)$ instead of $ H_n(p_1,p_2,...,p_n)$. Unless otherwise specified, it is understood that $ P=(p_1,p_2,...,p_n) \in \Delta_n,$$ Q=(q_1,q_2,$$ ...,q_m) \in \Delta_m,$ and $ P*Q =(p_1q_1,...,p_1q_m,$$ p_2q_1,...,p_2q_m,...,p_nq_1, ...,p_nq_m) \in \Delta_{nm},V=(v_{11},$$ v_{12},...,v_{1m},v_{21},...,v_{2m},$$ ...,v_{n1},...,v_{nm})\ \in\ \Delta_{nm}\ {\rm and}\W_i=(w_{i1},...,w_{im})\ \in\ \Delta_m,\ \forall\ i=1,2,...,n$. Also it is understood that $ 0\log 0=0$ and all the logarithms are with base 2.

Property 1.1. (Nonnegativity)$ H(P) \geq0$ with equality iff $ P=P^\circ$, where$ P^\circ=(0,0,...,1^{i)},0,$$ ...,0)\ \in\ \Delta_n$.

Property 1.2. (Continuity)$ H(P)$ is a continuous function of P.

Property 1.3. (Symmetry)$ H(P)$ is a symmetric function of its arguments, i.e.,

$\displaystyle H(p_1,...,p_n)=H(p_{\tau(1)},...,p_{\tau(n)}),$
where $ \tau$ is any permutation from 1 to n.

Property 1.4. (Expansible). We have

$\displaystyle H(p_1,p_2,...,p_n,0)=H(p_1,p_2,...,p_n).$

Property 1.5. (Decisive). We have

$\displaystyle H(1,0)=H(0,1)=0.$

Property 1.6. (Normality). We have

$\displaystyle H({1\over 2},{1\over 2})=1.$

Property 1.7. (Sum Representation). We can write 

$\displaystyle H(P)=\sum_{i=1}^n{f(p_i)},$
where 
$\displaystyle f(p)=-p\log p,\ 0\leq p \leq 1.$

Property 1.8. (Recursivity). We have

$\displaystyle H(p_1,p_2,...,p_n)=H(p_1+p_2,p_3,...,p_n)+(p_1+p_2)H\Big({p_1\overp_1+p_2},{p_2\over p_1+p_2}\Big),$
$ p_1+p_2>0,\ n\geq3.$

Property 1.9. (Additivity). We have

$ H(p_1q_1,...,p_1q_m,p_2q_1,...,p_2q_m,...,p_nq_1,...,p_nq_m)$

$\displaystyle =H(p_1,p_2,...,p_n)+H(q_1,q_2,...,q_m).$

Property 1.10. (Strongly Additive). We have

$ H(p_1w_{11},...,p_1w_{1m},p_2w_{21},...,p_2w_{2m},...,p_nw_{n1},...,p_nw_{nm})$

$\displaystyle =H(p_1,p_2,...,p_n)+\sum_{i=1}^n{p_iH(w_{i1},w_{i2},...,w_{im})}.$
Property 1.11. (Grouping). We have

$ H(p_1,p_2,...,p_n)=H(p_1+p_2+...+p_\sigma+p_{\sigma+1}+...+p_n)$

$\displaystyle +\,\sum_{k=1}^\sigma{p_k H\Big(p_1\Big/\sum_{k=1}^\sigma{p_k},...,p_\sigma\Big/\sum_{k=1}^\sigma{p_k}\Big)}$
$\displaystyle +\sum_{k=\sigma+1}^n{p_k H\Big(p_{\sigma+1}\Big/\sum_{k=\sigma+1}^n{p_k},...,p_n\Big/\sum_{k=\sigma+1}^n{p_k}\Big)}.$

Property 1.12. (Generalized Grouping). We have

$ H(p_1,...,p_{\sigma_1},p_{\sigma_1+1},...,p_{\sigma_2},...,p_{\sigma_{n-1}+1},...,p_{\sigma_n})$

$ =H(p_1+...+p_{\sigma_1},p_{\sigma_1+1}+...+p_{\sigma_2},...,p_{\sigma_{n-1}+1}+...+p_{\sigma_n})$

$\displaystyle +\,\sum_{i=1}^n{(p_{\sigma_{i-1}+1}+...+p_{\sigma_i})H\Big(p_{\......igma_i}{p_j},...,p_{\sigma_i}\Big/\sum_{j=\sigma_{i-1}+1}^{\sigma_i}{p_j}\Big)}$

Property 1.13. (Binary-Entropic). Let

$\displaystyle \psi(p)=H(p,1-p),\ 0 \leq p \leq 1.$
    (1.8)

Then

(i) $ \psi(p)=\psi(1-p).$
(ii) $ \psi(1)=\psi(0).$
(iii) $ \psi\Big({1\over 2}\Big)=1.$
(iv) $ \psi(p)+(1-p)\psi\Big({q\over 1-p}\Big)=\psi(q)+\psi\Big({p\over 1-q}\Big),\ p,q \in [0,1),\ p+q\leq 1.$
(v) there exists a K such that $ \psi(p)\leq K$.
(vi) the function $ p \to \psi(p)$ is non-decreasing on the interval $ \bigl(0,{1\over 2}\bigr].$
(vii) for every $ p_0\ \in\ ]0,1[$, we have
$\displaystyle \lim_{q\to {0^+}}H(p_0,q)=h(p_0).$
(viii) $ \lim_{p\to {0^+}} \psi(p)=0.$
(ix) $ H(P)=\sum_{t=2}^n{(p_1+p_2+...+p_t)\psi\Big({p_t\over p_1+p_2+...+p_t}\Big)}.$
Property 1.14. (Shannon-Gibbs Inequality). For all $ P=(p_1,p_2,...,p_n)\ \in\ \Delta_n$ and $ Q=(q_1,$$ q_2,...,q_n) \in\ \Delta_n$, we have
$\displaystyle - \sum_{i=1}^n{p_i\log p_i}\leq - \sum_{i=1}^n{p_i\log q_i},$
    (1.9)

with equality iff $ p_i=q_i,\ \forall\ i$ or iff $ P=Q$ (whenever$ q_i=0$ for some $ i$, the corresponding $ p_i$ is also zero).

Property 1.15. (Maximality).$ H(p_1,p_2,...,p_n)$ is maximum when all the probabilities are equal i.e., 

$\displaystyle H(p_1,p_2,...,p_n)\leq H\Big({1\over n},{1\overn},...,{1\over n}\Big),$
with equality iff $ p_i={1\over n},\\forall\ i=1,2,...,n.$

Property 1.16. (Uniform distribution). Let 

$\displaystyle \phi(n)=H\Big({1\over n},{1\over n},...,{1\over n}\Big),\n\geq 2,\ n\ \in\ I\!\!N.$

Then
(i) $ H \Big({n-n_1\over n},{n_1\over n}\Big)=-{n_1\over n}[\phi (n_1) - \phi(n)]-{n-n_1\over n}[\phi(n-n_1)-\phi(n)],$
$ 0.\phi(0)=0,\ 1\leq n_1\leq n.$
(ii) $ \phi(n)\leq \phi(n+1).$
(iii) $ n\phi(n) \leq (n+1)\phi(n+1).$
(iv) $ \lim_{n\to \infty}\Big[\phi(n+1)-{n+1\over n}\phi(n)\Big]=0.$
Property 1.17. (Sub-additivity). We have

$ H(v_{11},v_{12},...,v_{1m},v_{21},...,v_{2m},...,v_{n1},...,v_{nm}\Big)$

$\displaystyle \leqH\Big(\sum_{i=1}^n{v_{i1}},\sum_{i=1}^n{v_{i2}},...,\sum_{i......+H\Big(\sum_{j=1}^m{v_{1j}},\sum_{j=1}^m{v_{2j}},...,\sum_{j=1}^m{v_{nj}}\Big).$

Property 1.18. (Independence Inequality). If 

$\displaystyle p_i=\sum_{j=1}^m{v_{ij}}, \forall\ i=1,2,...,n,$
and
$\displaystyle q_j=\sum_{i=1}^n{v_{ij}}, \forall\ j=1,2,...,m$
then
$\displaystyle H(v_{11},v_{12},...,v_{1m},v_{21},...,v_{2m},...,v_{n1},...,v_{nm})\leq H(p_1q_1,...,p_1q_m,...,p_nq_1,...,p_nq_m)$

Property 1.19. (Concavity).$ H(P)$ is a concave function of $ P$ in $ \Delta_n$.

Property 1.20. (Schur-concavity).$ H(P)$ is a Shur-concave function of $ P$ in $ \Delta_n$.

Property 1.21. Let $ Q=(q_1,...,q_n)\ \in\\Delta_n$ be a probability distribution such that $ q_1\geq q_2\geq\cdots \geq q_n$. Let us define $ P=(p_1,...,p_n)\ \in\ \Delta_n$ such that $ p_1=q_1-\Delta_u,\ p_2=q_2+\Delta_u, \ {\rm and} \p_i=q_i, \ \forall\ i=3,4,...,n.$ Then $ H(Q)\geq H(P)$.

Property 1.22. (Difference among two entropies). If 

$\displaystyle \sum_{i=1}^n{\vert p_i-q_i\vert}\leq \theta \leq {1\over2},$
then 
$\displaystyle \vert H(P)-H(Q)\vert\leq -\theta \log {\theta\over n},$
for all $ P,Q\ \in\ \Delta_n$.

Property 1.23. Bounds on H(P). For $ 1\leq\sigma \leq n$, we have

(i) $ \displaystyle H\Big(\sum_{i=1}^\sigma{p_i},1-\sum_{i=1}^\sigma{p_i}\Big)\leq H(p_1,p_2,...,p_n).$
(ii) $ H(p_1,p_2,...,p_n)\leq H\Big(\underbrace{\sum_{i=1}^{\sigma}{{p_i\over\sig......=1}^{\sigma}{{p_i\over\sigma}}}_{\sigma-times}, p_{\sigma + 1},...,p_n \Big).$
Property 1.24. (Relative to maximum probability). Let $ p_{max}=max \{p_1,...,p_n \}$. Then
(i) $ H(p_{max},1-p_{max})\leq H(P).$
(ii) $ H(P)\leq H\Big(\underbrace{{1-p_{max}\over n-1},...,{1-p_{max}\over n-1}}_{(n-1)-times},p_{max}\Big).$
(iii) $ H(k\,p_{max},1-k\,p_{max}) + k\,p_{max} \log k \leq H(P),$ where $ k$ is a positive integer satisfying $ {1\over k+1}\leqp_{max} < {1\over k}.$
(iv) $ 1-p_{max} \leq {1\over 2}H(P),\, p_{max}\geq {1\over 2}.$
Property 1.25. Let $ (p,1-p) \in \Delta_2$ and$ (u,1-u) \in \Delta_2$ be two probability distributions. If $ u>max\{ p,1-p \}$, then $ \psi(u)<\psi(p)$, where $ \psi$ is as given in (1.8).

Property 1.26. If $ \{ H_n \}$ is strongly additive (property 1.10), then it is additive (property 1.9).

Property 1.27. If $ \{ H_n \}$ is additive (property 1.9) for $ n=m=2$ and is expansible (property 1.4), then it is also additive (property 1.9).

Property 1.28. If $ \{ H_n \}$ is expansible (property 1.4) and strongly additive (property 1.10), then it is recursive (property 1.8).

Property 1.29. If $ \{ H_n \}$ is recursive (property 1.8) for $ n=3$ and symmetric (property 1.3) for $ n=3$, then it is symmetric (property 1.3) for $ n=2$ and decisive (property 1.5).

Property 1.30. If $ \{ H_n \}$ is recursive (property 1.8), symmetric (property 1.3) for $ n=3$, then it is symmetric (property 1.3) and expansible (property 1.4).

Property 1.31. If $ \{ H_n \}$ is symmetric (property 1.3) and recursive (property 1.8), then

$\displaystyle H_{n+m-1}(p_1q_{11},p_1q_{12},...,p_1q_{1m},p_2,...,p_n)=H_n(p_1,p_2,...,p_n)+p_1H_m(q_{11},...,q_{1m}).$
Property 1.32. If $ \{ H_n \}$ is recursive (property 1.8) and symmetric (property 1.3) for $ n=3$, then it is also strongly additive (property 1.10).

Property 1.33. If $ \{ H_n \}$ is expansible (property 1.4) and subadditive (property 1.17) for $ n=m$, then it is nonnegative (property 1.1).

Property 1.34. If $ \{ H_n \}$ is branching of the form

$\displaystyle H(p_1,p_2,...,p_n)-H(p_1+p_2,p_3,...,p_n)=\phi(p_1,p_2),$
where
$\displaystyle \phi : \left\{ (x,y)\vert x \in [0,1], y \in [0,1], x+y\leq 1\right\} \rightarrow !\!R,$
then 
$\displaystyle H(p_1,p_2,...,p_n)=H(p_1 +...+p_{n-1},p_n) + \sum_{k=2}^{n-1}{\phi(p_1 +...+ p_{k-1},p_k)}.$






Property 1.35. If $ \{ H_n \}$ is recursive (property 1.8), and $ \psi(p)=H(p,1-p), \ 0\leq p \leq 1$, then

$\displaystyle H(p_1,p_2,...,p_n)=\sum_{t=2}^n{(p_1+...+p_t)\psi \Big({p_t\overp_1+...+p_t} \Big)},$
with the convention that $ 0\,\psi\bigl({0\over 0} \bigr)=0.$

Property 1.36. If $ \{ H_n \}$ is normalized (property 1.6), symmetric (for $ n=3$) (property 1.3) and recursive (for $ n=3$) (property 1.8), then the function $ \psi(x)=H(1-x,x),\x \in [0,1]$ satisfies the property 1.13(i)-(iv).

Property 1.37. Binary entropic properties given by 1.13(i)-(iv) implies that $ \{ H_n \}$ is symmetric (property 1.3), normalized (property 1.6), expansible (property 1.4), decisive (property 1.5), recursive (property 1.8), strongly additive (property 1.9) and additive (property 1.10).

Note 1.1. Some of the properties given above can be seen in Aczél and Daróczy (1975) [2] and Mathai and Rathie (1975) [71]
 


21-06-2001
Inder Jeet Taneja
Departamento de Matemática - UFSC
88.040-900 Florianópolis, SC - Brazil