What are random and pseudorandom number generators: what are the differences and why in computer science we can only talk about seconds. The importance in cryptography and in multiple application fields.

Operations such as flipping a physical coin or seeing where a roulette ball lands are generally regarded as examples of true randomness.

John von Neumann, famous mathematician, physicist, computer scientist and philosopher (1903-1957) who provided the mathematical foundations for the development of computers (his homonymous “architecture” is famous, said that “*anyone who considers arithmetical methods to produce random digits is in a state of sin*“. That is why tossing a coin is cited by many as an example of **natural randomness**.

In reality it is demonstrable that knowing the environment and the tools used, therefore having a precise chronology of past events available, it is possible to predict the path of a roulette ball or the outcome of “heads” or “tails” following the toss of a coin.

I **random number generators** “real” ones use unpredictable physical processes such as thermal noise, radioactive decay, or atmospheric turbulence.

However, observing the events from a truly omniscient perspective, probably nothing, not even the physical world, is truly random because the events that gradually occur are the offspring of a chain of previous events.

For most events regarded as random it is impossible to gain such an omniscient perspective which is therefore something unfeasible and in fact impossible.

## Because we talk about pseudorandom number generators

In computer science, and we have observed it in many articles, we are talking about **pseudorandom number generation**: It is not possible to generate truly random numbers using a computer or a deterministic mathematical formula.

Computer generated numbers are based on a **mathematical algorithm** and on a set of input values known as “*seed*“. Even if these numbers seem random, if you know the algorithm and the **seed** (a bit like the cryptographic key in cryptography) used, you can reproduce the same numbers. Not knowing the *seed*the output is very difficult to predict.

Precisely for this reason, numbers generated by a computer are called “pseudorandom”: they appear to be random but are actually generated in a deterministic way.

The difficulty in **output prediction** it depends on the functioning details of the generator: in some cases it can be very complex, in others (for example due to “weaknesses” of the algorithm itself) even trivial.

Pseudo-random number generation is used in many contexts, such as cryptography, password generation tasks, simulation of physical phenomena, game design and applications of all kinds.

Modern cryptography, a technology also used to secure online transactions, relies on the complexity of making predictions about the behavior of the random number generator.

Suppose you are playing a computer game that relies solely on luck: it obviously includes a random number generator. Here, that generator could have been developed so that system managers can know for sure whether the user will win or not (acquiring the omniscient perspective we referred to earlier…). Conversely, for many natural phenomena, it is impossible for anyone to have an omniscient perspective on events.

## Codebook and pseudorandom number generation

Speaking of pseudorandom number generation, a **codebook** is a set of predefined values or sequences of values used as a starting point for pseudorandom number generation.

A codebook is a sort of “rule book” containing a list of numbers or sequences of numbers selected so that, when used in conjunction with a deterministic algorithm, they produce numbers that appear to be random. The generator chooses a sequence from the codebook as *seed* and then uses a mathematical algorithm to transform the *seed* in a sequence of numbers,

A good codebook must contain sufficiently complex and diverse values to ensure that the generated sequence is unpredictable. It must also be large and contain pages with significant data variations to prevent the generated sequences from becoming predictable.

## Bit out of phase

In a pseudorandom number generator, the “bits” refer to the amount of information that is used to represent each generated pseudorandom number – take a look at our article on binary code. If a generator uses **32 bit**means that each pseudorandom number is represented using 32 “units” of information.

Il **period** instead it refers to the length of the sequence of numbers that the generator is able to generate before repeating an identical sequence. A pseudorandom number generator with a longer period can therefore generate a longer sequence of pseudorandom numbers before it starts repeating a sequence. This parameter depends on how the generator is internally sized.

A 32-bit generator with a period of 2^{32} can be compared to a codebook with a **dimension** equal to 16GB.

A 32-bit generator with a period of 2^{64} it would be as if it used a real book such as to require, in order to be printed, cellulose that can be derived from all the forests available on Earth.

In the case of random number generators that we use, the sequence or the codebook is known: to avoid predictions it is necessary to keep the **secret** on the book page that the generator is drawing on. If the sequence is large enough it is impossible to tell which numbers will come next.

We took the case of the safety of the RSA algorithm to say that some implementations use “weak” pseudorandom prime number generators. There **safety** dell’**cryptographic algorithm** it relies largely on the generated prime numbers: if they are not random enough it is much easier for attackers to factor them and break the encryption. That’s why the pseudorandom number generator must be cryptographically secure.

## What is PCG scheme, *Permuted Congruential Generator*

The generation scheme **PCG ( Permuted Congruential Generator)** is a pseudorandom number generation algorithm that combines a linear congruential generator (LCG) with a permutation function.

PCG uses a **mathematical formula** to produce a sequence of numbers that appear to be random, but which are actually deterministic, since – as we saw earlier – they still derive from a *seed*.

The PCG was introduced in 2014 by Melissa O’Neill and has become one of the most popular and recommended pseudorandom number generators for modern computing applications.

The basic PCG generator is a 32-bit generator with 64 status bits and 63 bits used to select the stream: it’s like having 9,223,372,036,854,775,808 codes, each filled with 18,446,744,073,709,551,616 numbers (integers 32-bit).

These are “numbers” that make the generated data really close to “real” randomness: it’s probably easier to predict dice rolls or guess the numbers that will come up in roulette.

The main feature of the PCG is that it combines the computational efficiency of the LCG algorithm with the good statistical quality of the permutation function. The latter, in particular, allows to remove the correlations between the numbers generated with the LCG algorithm, improving the distribution of pseudorandom numbers.

An interesting comparison of pseudorandom number generators also developed by O’Neill is available on the Web, highlighting how his PCG outperforms the various alternatives in terms of statistical quality, difficulty of prediction, reproducible results, multiple streams, period, useful features , performance, space usage, code size and complexity.