Chapter 4
Statistics for Communications and Security

 4.1 Binary Values
 4.2 Counting
 4.3 Permutations and Combinations
 4.4 Probability
 4.5 Collisions

File: crypto/statistics.tex, r1791

This chapter presents a selection of definitions and examples of mathematical properties that may be useful in learning computer communications and security.

Presentation slides that accompany this chapter can be downloaded in the following formats: slides only (PDF); slides with notes (PDF, ODP, PPTX).

4.1 Binary Values

Applying several properties of exponentials and logarithms can make it easier when dealing with large binary values. Consider the following properties:

nx × ny = nx+y

nx ny = nx-y

log n x × y = log n(x) + log n(y)

log n x y = log n(x) - log n(y)

Example 4.1 (Properties of Exponentials). Properties can be applied to simplify calculations:

212 = 22+10 = 22 × 210 = 4 × 1024 = 4096

With this property of exponentials, if you can remember the values of 21 to 210 then you can approximate most values of 2b that you come across in communications and security. Table 4.1 gives the exact or approximate decimal value for b-bit numbers.





Exponent, b
2b
(bits) Exact Value Approx. Value






0 1 -
1 2 -
2 4 -
3 8 -
4 16 -
5 32 -
6 64 -
7 128 -
8 256 -
9 512 -
10 1,024 1,000 = 103
11 - 2,000
12 - 4,000
13 - 8,000
14 - 16,000
19 - 512,000
20 - 1,000,000 = 106
21 - 2 × 106
22 - 4 × 106
23 - 8 × 106
29 - 512 × 106
30 - 109
31 - 2 × 109
32 - 4 × 109
33 - 8 × 109
39 - 512 × 109
40 - 1012
50 - 1015
60 - 1018
70 - 1021
x × 10 - 103x




Table 4.1: Useful Exact and Approximate Values in Binary

Example 4.2 (Properties of Exponentials with Binary Values). Properties and approximations can be used to perform large calculations:

2128 2100 = 2128-100 = 228 = 28 × 220 256 × 106 108

Example 4.3 (Properties of Logarithms). The number of bits needed to represent a decimal number can be found using logarithms:

log 2(20, 000) = log 2(20 × 103) = log 2(20) + log 2(103) 4 + 10 14

4.2 Counting

Definition 4.1 (Number of Binary Values). Given an n-bit number, there are 2n possible values.

Example 4.4 (Number of Sequence Numbers). Consider a sliding-window flow control protocol that uses an 16-bit sequence number. There are 216 = 65, 536 possible values of the sequence number, ranging from 0 to 65,535 (after which it wraps back to 0).

Example 4.5 (Number of IP Addresses). An IP address is a 32-bit value. There are 232 or approximately 4 × 109 possible IP addresses.

Example 4.6 (Number of Keys). If choosing a 128-bit encryption key randomly, then there are 2128 possible values of the key.

Video: Number of Binary Values (5 min; Jan 2015)

Definition 4.2 (Fixed Length Sequences). Given a set of n items, there are nk possible k-item sequences, assuming repetition is allowed.

Example 4.7 (Sequences of PINs). A user chooses a 4-digit PIN for a bank card. As there are 10 possible digits, there are 104 possible PINs to choose from.

Example 4.8 (Sequences of Keyboard Characters). A standard keyboard includes 94 printable characters (a–z, A–Z, 0–9, and 32 punctuation characters). If a user must select a password of length 8, then there are 948 possible passwords that can be selected.

Video: Fixed Length Sequences (7 min; Jan 2015)

Definition 4.3 (Pigeonhole Principle). If n objects are distributed over m places, and if n > m, then some places receive at least two objects.

Video: Pigeonhole Principle (2 min; Jan 2015)

Example 4.9 (Pigeonhole Principle on Balls). There are 20 balls to be placed in 5 boxes. At least one box will have at least two balls. If the balls are distributed in a uniform random manner among the boxes, then on average there will be 4 balls in each box.

Video: Pigeonhole Principle with Uniform Random Distribution (1 min; Jan 2015)

Example 4.10 (Pigeonhole Principle on Hash Functions). A hash function takes a 100-bit input value and produces a 64-bit hash value. There are 2100 possible inputs distributed to 264 possible hash values. Therefore at least some input values will map to the same hash value, that is, a collision occurs. If the hash function distributes the input values in a uniform random manner, then on average, there will be 2100 264 6.4 × 1010 different input values mapping to the same hash value.

Video: Pigeonhole Principle and Hash Functions (5 min; Jan 2015)

4.3 Permutations and Combinations

Definition 4.4 (Factorial). There are n! different ways of arranging n distinct objects into a sequence.

Example 4.11 (Factorial and Balls). Consider four coloured balls: Red, Green, Blue and Yellow. There are 4! = 24 arrangements (or permutations) of those balls:

RGBY, RGYB, RBGY, RBYG, RYGB, RYBG,

GRBY, GRYB, GBRY, GBYR, GYRB, GYBR,

BRGY, BRYG, BGRY, BGYR, BYRG, BYGR,

YRGB, YRBG, YGRB, YGBR, YBRG, YBGR

Video: Factorial and arranging balls (2 min; Jan 2015)

Example 4.12 (Factorial and English Letters). The English alphabetic has 26 letters, a–z. There are 26! 4 × 1026 ways to arrange those 26 letters.

Video: Arranging English Letters (2 min; Jan 2015)

Example 4.13 (Factorial and Plaintext Messages). An encryption algorithm takes a 64-bit plaintext message and a key as input and then maps that to a 64-bit ciphertext message as output. There are 264 1.6 × 1019 possible input plaintext messages. There are 264! 101088 different reversible mappings from plaintext to ciphertext, i.e. 264! possible keys.

Video: Number of keys for ideal block cipher (6 min; Jan 2015)

Definition 4.5 (Combinations). The number of combinations of items when selecting k at a time from a set of n items, assuming repetition is not allowed and order doesn’t matter, is:

n! k! n - k!

The following definition is just a specific instance of number of combinations (Definition 4.5) when k = 2. However the formula is simplified.

Definition 4.6 (Number of Pairs). The number of pairs of items in a set of n items, assuming repetition is not allowed and order doesn’t matter, is:

n n - 1 2

Example 4.14 (Pairs of Coloured Balls). There are four coloured balls: Red, Green, Blue and Yellow. The number of different coloured pairs of balls is 4 × 32 = 6. They are: RG, RB, RY, GB, GY, BY. Repetitions are not allowed (as they won’t produce different coloured pairs), meaning RR is not a valid pair. Ordering doesn’t matter, meaning RG is the same as GR.

Example 4.15 (Pairs of Network Devices). A computer network has 10 devices. The number of links needed to create a full-mesh topology is 10 × 92 = 45.

Example 4.16 (Pairs of Key Sharers). There are 50 users in a system, and each user shares a single secret key with every other user. The number of keys in the system is 50 × 492 = 1, 225.

Video: Number of Pairs from n Items (5 min; Jan 2015)

4.4 Probability

In this chapter when referring to a “random” number it means taken from a uniform random distribution. That means there is equal probability of selecting each value from the set.

Definition 4.7 (Probability of Selecting a Value). Probability of randomly selecting a specific value from a set of n values is 1n.

Example 4.17 (Probability of Selecting Coloured Ball). There are five coloured balls in a box: red, green, blue, yellow and black. The probability of selecting the yellow ball is 1/5.

Example 4.18 (Probability of Selecting Backoff Value). IEEE 802.11 (WiFi) involves a station selecting a random backoff from 0 to 15. The probability of selecting 5 is 1/16.

Video: Probability of Selecting a Particular Value from a Set (2 min; Jan 2015)

Definition 4.8 (Total Expectation). For a set of n events which are mutually exclusive and exhaustive, where for event i the expected value is Ei given probability Pi, then the total expected value is:

E = i=1nE iPi

Video: Total Expectation Definition (1 min; Jan 2015)

Example 4.19 (Total Expectation of Packet Delay). Average packet delay for packets in a network is 100 ms along path 1 and 150 ms along path 2. Packets take path 1 30% of the time, and take path 2 70% of the time. The average packet delay across both paths is: 100 × 0.3 + 150 × 0.7 = 135 ms.

Video: Total Expectation and Packet Delay (3 min; Jan 2015)

Example 4.20 (Total Expectation of Password Length). In a network with 1,000 users, 150 users choose a 6-character password, 500 users choose a 7-character password, 250 users choose 9-character password and 100 users choose a 10-character password. The average password length is 7.65 characters.

Video: Total Expectation and Password Selection (3 min; Jan 2015)

Definition 4.9 (Number of Attempts). If randomly selecting values from a set of n values, then the number of attempts needed to select a particular value is:

best case: 1

worst case: n

average case: n2

Video: Number of Attempts Needed to Randomly Select a Value (1 min; Jan 2015)

Example 4.21 (Number of Attempts in Choosing Number). One person has chosen a random number between 1 and 10. Another person attempts to guess the random number. The best case is that they guess the chosen number on the first attempt. The worst case is that they try all other numbers before finally getting the correct number, that is 10 attempts. If the process is repeated 1000 times (that is, one person chooses a random number, the other guesses, then the person chooses another random number, and the other guesses again, and so on), then on average 10% of time it will take 1 attempt (best case), 10% of the time it will take 2 attempts, 10% of the time it will take 3 attempts, …, and 10% of the time it will take 10 attempts (worst case). The average number of attempts is therefore 5.

Video: Attempts to select a value between 1 and 10 (5 min; Jan 2015)

Example 4.22 (Number of Attempts in Choosing Key). A user has chosen a random 128-bit encryption key. There are 2128 possible keys. It takes an attacker on average 21282 = 2127 attempts to find the key. If instead a 129-bit encryption key was used, then the attacker would take on average 21292 = 2128 attempts. (Increasing the key length by 1 bit doubles the number of attempts required by the attacker to guess the key).

Video: Attempts to guess a secret key (3 min; Jan 2015)

4.5 Collisions

Definition 4.10 (Birthday Paradox). Given n random numbers selected from the range 1 to d, the probability that at least two numbers are the same is:

p(n; d) 1 - d - 1 d nn-12

Example 4.23 (Two People Have Same Birthday). Given a group of 10 people, the probability of at least two people have the same birth date (not year) is:

p(10; 365) 1 - 364 365 1092 = 11.6%

Defintion 4.10 can be re-arranged to find the number of values needed to obtain a specified probability that at least two numbers are the same:

n(p; d) 2d ln 1 1 - p

Example 4.24 (Group Size for Birthday Matching). How many people in a group are needed such that the probability of at least two of them having the same birth date is 50%?

n(0.5; 365) 2 × 365 × ln 1 1 - 0.5 = 22.49

So 23 people in a group means there is 50% chance that at least two have the same birth date.

Example 4.25 (Group Size for Hash Collision). Given a hash function that outputs a 64-bit hash value, how many attempts are need to give a 50% chance of a collision?

n(0.5; 264) 2 × 264 × ln 1 1 - 0.5 264 = 232

Following Example 4.25, the number of attempts to produce a collision when using an n-bit hash function is approximately 2n2.