File: nsl/statistics.tex, r1669
This chapter presents a selection of definitions and examples of mathematical properties that may be useful in learning computer communications and security.
Applying several properties of exponentials and logarithms can make it easier when dealing with large binary values. Consider the following properties:
For example:
With this property of exponentials, if you can remember the values of to then you can approximate most values of that you come across in communications and security. Table B.1 gives the exact or approximate decimal value for -bit numbers.
Exponent, | ||
(bits) | Exact Value | Approx. Value |
0 | 1 | - |
1 | 2 | - |
2 | 4 | - |
3 | 8 | - |
4 | 16 | - |
5 | 32 | - |
6 | 64 | - |
7 | 128 | - |
8 | 256 | - |
9 | 512 | - |
10 | 1,024 | 1,000 = |
11 | - | 2,000 |
12 | - | 4,000 |
13 | - | 8,000 |
14 | - | 16,000 |
… | ||
19 | - | 512,000 |
20 | - | 1,000,000 = |
21 | - | |
22 | - | |
23 | - | |
… | ||
29 | - | |
30 | - | |
31 | - | |
32 | - | |
33 | - | |
… | ||
39 | - | |
40 | - | |
50 | - | |
60 | - | |
70 | - | |
- | ||
Another example:
Similar with logarithms:
Example 1.1. Consider a sliding-window flow control protocol that uses an 16-bit sequence number. There are possible values of the sequence number, ranging from 0 to 65,535 (after which it wraps back to 0).
Example 1.3. If choosing a 128-bit encryption key randomly, then there are possible values of the key.
Video: Number of Binary Values (5 min; Jan 2015)
Definition 2 (Fixed Length Sequences). Given a set of items, there are possible -item sequences, assuming repetition is allowed.
Example 2.1. A user chooses a 4-digit PIN for a bank card. As there are 10 possible digits, there are possible PINs to choose from.
Example 2.2. A standard keyboard includes 94 printable characters (a–z, A–Z, 0–9, and 32 punctuation characters). If a user must select a password of length 8, then there are possible passwords that can be selected.
Video: Fixed Length Sequences (7 min; Jan 2015)
Definition 3 (Pigeonhole Principle). If objects are distributed over m places, and if , then some places receive at least two objects.
Video: Pigeonhole Principle (2 min; Jan 2015)
Example 3.1. There are 20 balls to be placed in 5 boxes. At least one box will have at least two balls. If the balls are distributed in a uniform random manner among the boxes, then on average there will be 4 balls in each box.
Video: Pigeonhole Principle with Uniform Random Distribution (1 min; Jan 2015)
Example 3.2. A hash function takes a 100-bit input value and produces a 64-bit hash value. There are possible inputs distributed to possible hash values. Therefore at least some input values will map to the same hash value, that is, a collision occurs. If the hash function distributes the input values in a uniform random manner, then on average, there will be different input values mapping to the same hash value.
Video: Pigeonhole Principle and Hash Functions (5 min; Jan 2015)
Example 4.1. Consider four coloured balls: Red, Green, Blue and Yellow. There are arrangements (or permutations) of those balls:
RGBY, RGYB, RBGY, RBYG, RYGB, RYBG,
GRBY, GRYB, GBRY, GBYR, GYRB, GYBR,
BRGY, BRYG, BGRY, BGYR, BYRG, BYGR,
YRGB, YRBG, YGRB, YGBR, YBRG, YBGR
Video: Factorial and arranging balls (2 min; Jan 2015)
Example 4.2. The English alphabetic has 26 letters, a–z. There are ways to arrange those 26 letters.
Video: Arranging English Letters (2 min; Jan 2015)
Example 4.3. An encryption algorithm takes a 64-bit plaintext message and a key as input and then maps that to a 64-bit ciphertext message as output. There are possible input plaintext messages. There are different reversible mappings from plaintext to ciphertext, i.e. possible keys.
Video: Number of keys for ideal block cipher (6 min; Jan 2015)
Definition 5 (Combinations). The number of combinations of items when selecting at a time from a set of items, assuming repetition is not allowed and order doesn’t matter, is:
The following definition is just a specific instance of number of combinations (Definition 5) when . However the formula is simplified.
Definition 6 (Number of Pairs). The number of pairs of items in a set of items, assuming repetition is not allowed and order doesn’t matter, is:
Example 6.1. There are four coloured balls: Red, Green, Blue and Yellow. The number of different coloured pairs of balls is . They are: RG, RB, RY, GB, GY, BY. Repetitions are not allowed (as they won’t produce different coloured pairs), meaning RR is not a valid pair. Ordering doesn’t matter, meaning RG is the same as GR.
Example 6.2. A computer network has 10 devices. The number of links needed to create a full-mesh topology is .
Example 6.3. There are 50 users in a system, and each user shares a single secret key with every other user. The number of keys in the system is .
Video: Number of Pairs from n Items (5 min; Jan 2015)
In this chapter when referring to a “random” number it means taken from a uniform random distribution. That means there is equal probability of selecting each value from the set.
Definition 7 (Probability of Selecting a Value). Probability of randomly selecting a specific value from a set of values is .
Example 7.1. There are five coloured balls in a box: red, green, blue, yellow and black. The probability of selecting the yellow ball is 1/5.
Example 7.2. IEEE 802.11 (WiFi) involves a station selecting a random backoff from 0 to 15. The probability of selecting 5 is 1/16.
Video: Probability of Selecting a Particular Value from a Set (2 min; Jan 2015)
Definition 8 (Total Expectation). For a set of events which are mutually exclusive and exhaustive, where for event the expected value is given probability , then the total expected value is:
Video: Total Expectation Definition (1 min; Jan 2015)
Example 8.1. Average packet delay for packets in a network is 100 ms along path 1 and 150 ms along path 2. Packets take path 1 30% of the time, and take path 2 70% of the time. The average packet delay across both paths is: ms.
Video: Total Expectation and Packet Delay (3 min; Jan 2015)
Example 8.2. In a network with 1,000 users, 150 users choose a 6-character password, 500 users choose a 7-character password, 250 users choose 9-character password and 100 users choose a 10-character password. The average password length is 7.65 characters.
Video: Total Expectation and Password Selection (3 min; Jan 2015)
Definition 9 (Number of Attempts). If randomly selecting values from a set of values, then the number of attempts needed to select a particular value is:
Video: Number of Attempts Needed to Randomly Select a Value (1 min; Jan 2015)
Example 9.1. One person has chosen a random number between 1 and 10. Another person attempts to guess the random number. The best case is that they guess the chosen number on the first attempt. The worst case is that they try all other numbers before finally getting the correct number, that is 10 attempts. If the process is repeated 1000 times (that is, one person chooses a random number, the other guesses, then the person chooses another random number, and the other guesses again, and so on), then on average 10% of time it will take 1 attempt (best case), 10% of the time it will take 2 attempts, 10% of the time it will take 3 attempts, …, and 10% of the time it will take 10 attempts (worst case). The average number of attempts is therefore 5.
Video: Attempts to select a value between 1 and 10 (5 min; Jan 2015)
Example 9.2. A user has chosen a random 128-bit encryption key. There are possible keys. It takes an attacker on average attempts to find the key. If instead a 129-bit encryption key was used, then the attacker would take on average attempts. (Increasing the key length by 1 bit doubles the number of attempts required by the attacker to guess the key).
Video: Attempts to guess a secret key (3 min; Jan 2015)
Definition 10 (Birthday Paradox). Given random numbers selected from the range 1 to , the probability that at least two numbers are the same is:
Example 10.1. Given a group of 10 people, the probability of at least two people have the same birth date (not year) is:
Defintion 10 can be re-arranged to find the number of values needed to obtain a specified probability that at least two numbers are the same:
Example 10.2. How many people in a group are needed such that the probability of at least two of them having the same birth date is 50%?
So 23 people in a group means there is 50% chance that at least two have the same birth date.
Example 10.3. Given a hash function that outputs a 64-bit hash value, how many attempts are need to give a 50% chance of a collision?
Following Example 10.3, the number of attempts to produce a collision when using an -bit hash function is approximately .