thesis/chap-proofs.tex

\chapter{Security Proofs in Cryptography}

Provable security is a subfield of cryptography in which constructions are proven secure with respect to a security model.
To illustrate this notion, let us take the example of public-key encryption schemes.
This primitive consists of three algorithms: \textit{key generation}, \textit{encryption} and \textit{decryption}.
These algorithms act according to their names.
Naturally, the question of ``how to define the security of this set of algorithms'' arises.
To answer this question, we have to define the power of the adversary and its goal.
In cryptography, several frameworks have been used to define this (random oracle model, universal composability ($\UC$)~\cite{Can01}\ldots), and they give rise to security guarantees of different strengths.
Although one may want the strongest possible security for a construction, there are known impossibility results in strong models.
For instance, in the $\UC$ model, it is impossible to realize two-party computation~\cite{Yao86} without honest set-up~\cite{CKL06}, while it is possible in the standard model~\cite{LP07}.

In this chapter, we first focus on the elements of computational complexity needed to properly define the security models used in this thesis.
We then define these security models.

%%%%%%%%%%%%%%%%%%%%%%%
% Security Reductions %
%%%%%%%%%%%%%%%%%%%%%%%
\section{Security Reductions}

Provable security aims at providing constructions whose security is guaranteed by a security proof, or \emph{security reduction}.
The name ``reduction'' comes from computational complexity.
In this field of computer science, research focuses on defining equivalence classes for problems, based on the necessary amount of resources to solve them.
A classical way to establish a lower bound on the complexity of a problem is to provide a transformation from any instance of a problem $A$ to an instance of a problem $B$ such that a solution to the instance of $B$ yields a solution to the instance of $A$.
This amounts to saying that problem $B$ is at least as hard as problem $A$, up to the complexity of the transformation.
For instance, Cook showed that satisfiability of Boolean formulas is at least as hard as every problem in $\NP$~\cite{Coo71}, up to a polynomial-time transformation.

Let us now define more formally the notions of reduction and computability using the computational model of Turing machines.

\begin{definition}[Turing Machine] \label{de:turing-machine} \index{Turing machine}
  A $k$-tape Turing Machine (TM) is described by a triple $M = (\Gamma, Q, \delta)$ containing:
  \begin{itemize}
    \item A finite set $\Gamma$, called the \textit{tape alphabet}, that contains symbols that the TM uses in its tapes. In particular, $\Gamma$ contains a \textit{blank symbol} ``$\square$'', and ``$\triangleright$'' that denotes the beginning of a tape.
    \item A finite set $Q$ called the \textit{states} of the TM. It contains special states $q_{start}$, $q_{halt}$, called respectively the \textit{initial state} and the \textit{halt state}.
    \item A function $\delta: (Q \backslash \{q_{halt}\}) \times \Gamma^{k-1} \to Q \times \Gamma^{k-1} \times \{ \leftarrow, \downarrow, \rightarrow \}^k$, called the \textit{transition function}, that describes the behaviour of the internal state of the machine and the TM heads.\\
      \smallskip
      Namely, $\delta(q, a_1, \ldots, a_{k-1}) = (r, b_2, \ldots, b_k, m_1, \ldots, m_k)$ means that upon reading symbols $(a_1, \ldots, a_{k-1})$ on tapes $1$ to $k-1$ (where the first tape is the input tape, and the $k$-th tape is the output tape) in state $q$, the TM moves to state $r$, writes $b_2, \ldots, b_k$ on tapes $2$ to $k$ and moves its heads according to $m_1, \ldots, m_k$.
  \end{itemize}

  A TM $M$ is said to \emph{compute} a function $f: \Sigma^\star \to \Gamma^\star$, if for any finite input $x \in \Sigma^\star$ on tape $T_1$, blank tapes $T_2, \ldots, T_k$ with a beginning symbol $\triangleright$ and initial state $q_{start}$, $M$ halts in a finite number of steps with $f(x)$ written on its output tape $T_k$.

  A TM $M$ is said to \emph{recognize} a language $L \subseteq \Sigma^\star$ if on a finite input $x \in \Sigma^\star$ written on its input tape $T_1$, blank tapes $T_2, \ldots, T_k$ with a beginning symbol $\triangleright$ and initial state $q_{start}$, the machine $M$ eventually ends on the state $q_{halt}$ with $1$ written on its output tape if and only if $x \in L$.

  A TM $M$ is said to run in $T(n)$-time if, on any input $x$, it halts within $T(|x|)$ steps.

  A TM $M$ is said to run in $S(n)$-space if, on any input $x$, it eventually halts and has written to at most $S(|x|)$ memory cells of its working tapes.
\end{definition}
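
\scbf{Example.} As a small illustration of Definition~\ref{de:turing-machine}, here is a sketch of a $2$-tape machine computing the identity function on $\bit^\star$. The state name $q_{copy}$ and the alphabet $\Gamma = \{0, 1, \square, \triangleright\}$ are choices made only for this example.
With $k = 2$, the transition function reads the input tape and writes on the output tape:
\[ \begin{cases}
    \delta(q_{start}, \triangleright) = (q_{copy}, \triangleright, \rightarrow, \rightarrow)\\
    \delta(q_{copy}, a) = (q_{copy}, a, \rightarrow, \rightarrow) & \text{for } a \in \bit\\
    \delta(q_{copy}, \square) = (q_{halt}, \square, \downarrow, \downarrow)
\end{cases} \]
The remaining transitions are never used and can be set arbitrarily.
On an input $x$ of length $n$, this machine performs one step on $\triangleright$, then $n$ copying steps and one halting step, so it runs in time $T(n) = n + 2$; it has no working tape.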

Turing machines are a computational model that has proved useful in complexity theory, as it is convenient to evaluate the running time of a Turing machine: this amounts to bounding the number of steps the machine can make.
Similarly, the working tapes play the role of the memory of a program, so counting the number of cells the machine uses amounts to evaluating the amount of memory the program requires.

From these considerations, it is possible to describe the time and space complexity of a program from the definition of Turing machines.
In our context, we will work with Turing machines that run in polynomial time and space, as polynomials enjoy good stability properties (sum, product, composition, \ldots{}).
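
\scbf{Example.} To make these stability properties concrete, take the monomials $p(n) = n^a$ and $q(n) = n^b$ with constants $a, b \geq 1$ (a choice made here only for illustration). Then
\[ p(n) + q(n) = \bigO(n^{\max(a,b)}), \qquad p(n) \cdot q(n) = n^{a+b}, \qquad p(q(n)) = n^{ab}, \]
which are all polynomially bounded. By contrast, the class of functions bounded by $2^{\bigO(n)}$ is not stable under composition, since $2^{2^n}$ is not in $2^{\bigO(n)}$.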

\begin{definition}[\textsf{P}~\cite{Rab60}] \index{Complexity classes!P@\textsf{P}}
  The class \textsf{P} describes the set of languages that can be recognized by a Turing machine running in time $T(n) = \bigO(\poly)$.
\end{definition}

In theoretical computer science, the class \textsf{P} is often considered as the set of ``easy'' problems.
These problems are considered easy in the sense that the cost of solving them grows asymptotically much more slowly than functions such as exponentials.
In this context, it is reasonable to consider the computational power of an adversary as polynomial (or quasi-polynomial) in time and space.
As cryptographic algorithms are not deterministic, we also have to consider the probabilistic version of the computation model.

\begin{definition}[Probabilistic Turing machine] \label{de:probabilistic-tm} \index{Turing machine!Probabilistic Turing machine}
  A \emph{probabilistic Turing machine} is a Turing machine with two different transition functions $\delta_0$ and $\delta_1$, where at each step, a random coin is tossed to pick $\delta_0$ or $\delta_1$ with probability $1/2$ independently of all the previous choices.

  The machine only outputs \texttt{accept} and \texttt{reject} depending on the content of the output tape at the end of the execution.
  We denote by $M(x)$ the random variable corresponding to the value $M$ writes on its output tape at the end of its execution.
\end{definition}

\begin{definition}[\textsf{PP}~{\cite{Gil77}}] \index{Complexity classes!PP@\textsf{PP}}
  The class \textsf{PP} describes the set of languages $L \subseteq \Sigma^\star$ that a Turing machine $M$ recognizes such that the TM $M$ stops in time $\poly[|x|]$ on every input $x$ and
  \[ \begin{cases}
      \Pr\left[ M(x) = 1 \mid x \in L \right] > \frac12\\
      \Pr\left[ M(x) = 0 \mid x \notin L \right] > \frac12
  \end{cases}. \]

  In the following $\ppt$ stands for ``probabilistic polynomial time''.
\end{definition}

We have defined complexity classes that correspond to natural sets of programs of interest to us.
To compare problems with respect to these classes, we now define the notion of polynomial-time reduction.

\begin{definition}[Polynomial time reduction] \label{de:pt-reduction} \index{Reduction!Polynomial time}
  A language $A \subseteq \bit^\star$ is \emph{polynomial-time reducible to} a language $B \subseteq \bit^\star$, denoted by $A \redto B$, if there is a \emph{polynomial-time computable} function $f: \bit^\star \to \bit^\star$ such that for every $x \in \bit^\star$, $x \in A$ if and only if $f(x) \in B$.
\end{definition}

\begin{figure}
  \centering
  \input fig-poly-red
  \caption{Illustration of a polynomial-time reduction~{\cite[Fig. 2.1]{AB09}}.} \label{fig:poly-reduction}
\end{figure}

In other words, a polynomial-time reduction from $A$ to $B$ is the description of a polynomial-time algorithm (also called ``\emph{the reduction}'') that uses an algorithm for $B$ in a black-box manner to solve $A$.
This is illustrated in Figure~\ref{fig:poly-reduction}.

Note that \textsf{P} and \textsf{PP} are both closed under polynomial-time reduction.
Namely, if a problem reduces in polynomial time to a problem in \textsf{P} (resp. \textsf{PP}), then this problem is also in \textsf{P} (resp. \textsf{PP}).
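
\scbf{Example.} The closure of \textsf{P} can be checked by a short running-time computation; the polynomials $p$ and $q$ below are names introduced for this sketch.
Assume that $A \redto B$ through a function $f$ computable in time $p(n)$, and that $B$ is recognized in time $q(n)$, with $q$ non-decreasing.
Since a machine running in time $p(n)$ writes at most $p(n)$ symbols, we have $|f(x)| \leq p(|x|)$, so deciding whether $x \in A$ by computing $f(x)$ and then running the machine for $B$ on $f(x)$ takes at most
\[ p(|x|) + q(p(|x|)) \]
steps, which is polynomial in $|x|$ because polynomials are stable under sum and composition.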

Until now, we have mainly focused on the running time of algorithms.
In cryptology, it is also important to consider the success probability of algorithms:
an attack is successful if the probability that it succeeds is noticeable.

\index{Negligible function}
\scbf{Notation.} Let $f : \NN \to [0,1]$ be a function. The function $f$ is said to be \emph{negligible} if $f(n) = n^{-\omega(1)}$, and this is written $f(n) = \negl[n]$.
Non-negligible functions are also called \emph{noticeable} functions.
If $f = 1 - \negl[n]$, then $f$ is said to be \emph{overwhelming}.
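
\scbf{Example.} The following instances may help to fix ideas.
The function $f(n) = 2^{-n}$ is negligible, since $2^{-n} = n^{-n/\log_2 n}$ for $n \geq 2$ and $n / \log_2 n = \omega(1)$.
The function $g(n) = n^{-3}$ is not negligible, as its exponent is constant.
Negligible functions are also stable under multiplication by a polynomial: if $f(n) = \negl[n]$, then for any constant $c$,
\[ n^c \cdot f(n) = n^{c - \omega(1)} = n^{-\omega(1)}, \]
so, by a union bound, repeating an attack polynomially many times does not turn a negligible success probability into a noticeable one.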

Now that we have defined the notions at the core of the proofs, we have to define the objects we work with, namely what we want to prove and the hypotheses on which we rely, also called ``hardness assumptions''.

The details of the hardness assumptions we use are given in Chapter~\ref{chap:structures}.
Nevertheless, some notions are common to all of them and are discussed here.

The confidence one can put in a hardness assumption depends on many criteria.

First of all, a weaker assumption is preferred to a stronger one whenever possible.
To illustrate this, let us consider the following two assumptions:

\begin{definition}[Discrete logarithm] \label{de:DLP}
  \index{Discrete Logarithm!Assumption}
  \index{Discrete Logarithm!Problem}
  The \emph{discrete logarithm problem} is defined as follows. Let $(\GG, \cdot)$ be a cyclic group of order $p$.
  Given a generator $g$ of $\GG$ and $h \in \GG$, the goal is to find an integer $a \in \Zp$ such that $g^a = h$.

  The \textit{discrete logarithm assumption} is the intractability of this problem.
\end{definition}

\begin{definition}[Decisional Diffie-Hellman] \label{de:DDH} \index{Discrete Logarithm!Decisional Diffie-Hellman}
  Let $\GG$ be a cyclic group of order $p$. The \emph{decisional Diffie-Hellman} ($\DDH$) problem is the following.
  Given the tuple $(g, g_1, g_2, g_3) = (g, g^a, g^b, g^c) \in \GG^4$, the goal is to decide whether $c = ab$ or $c$ is sampled uniformly in $\Zp$.

  The \textit{$\DDH$ assumption} is the intractability of this problem for any $\ppt$ algorithm.
\end{definition}

For instance, the discrete logarithm assumption is implied by the decisional Diffie-Hellman assumption.
Indeed, if one is able to solve the discrete logarithm problem, then it suffices to compute the discrete logarithm of $g_1$, say $\alpha$, and then check whether $g_2^\alpha = g_3$.
This is why it is preferable to work with the discrete logarithm assumption whenever possible.
This is not always the case, however: for instance, there is no known security proof for the El Gamal encryption scheme from $\DLP$.
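
\scbf{Example.} The reduction from $\DDH$ to the discrete logarithm problem described above can be traced on a toy group, far too small to be secure and chosen here only for illustration: the subgroup of order $p = 5$ of the multiplicative group of integers modulo $11$, generated by $g = 3$, whose elements are $3, 9, 5, 4, 1$.
For $a = 2$ and $b = 3$, we get $g_1 = 3^2 = 9$, $g_2 = 3^3 \bmod 11 = 5$ and $g^{ab} = 3^{6} \bmod 11 = 3$.
On the tuple $(3, 9, 5, 3)$, a discrete logarithm solver returns $\alpha = 2$ for $g_1 = 9$, and the test $g_2^\alpha = 5^2 \bmod 11 = 3 = g_3$ succeeds.
On the tuple $(3, 9, 5, 4)$, where $g_3 = g^4$ and $4 \neq ab \bmod 5$, the same test fails since $3 \neq 4$.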

Another criterion to evaluate the security of an assumption is whether the assumption is ``simple'' or not.
For instance, it is harder to evaluate the security of an assumption such as $q$-Strong Diffie-Hellman, a variant of the Diffie-Hellman problem where the adversary is given the tuple $(g, g^a, g^{a^2}, \ldots, g^{a^q})$ and has to compute $g^{a^{q+1}}$.
The security of this assumption inherently depends on the parameter $q$.
Cheon proved that, for large values of $q$, this assumption is no longer trustworthy~\cite{Che06}.
These parameterized assumptions are called \emph{$q$-type assumptions}.
There are also other kinds of non-static assumptions, such as interactive assumptions.
An example is the ``\emph{$1$-more-\textsf{DL}}'' assumption.
Given oracle access to $n$ discrete logarithm queries ($n$ is not known in advance), the $1$-more-\textsf{DL} problem is to solve an $(n+1)$-th discrete logarithm.

Non-interactive and constant-size assumptions are sometimes called ``\textit{standard}''.

The next step in the study of a security proof is the \emph{security model}, in other words the context in which the proofs are made.
This is the topic of the next section.

\section{Random-Oracle Model and Standard Model} \label{se:models}

The most general model for security proofs is the \textit{standard model}.
In this model, nothing special is assumed, and all assumptions are explicit.

For instance, cryptographic hash functions enjoy several different associated security notions~\cite{KL07}.
The strongest is collision resistance, which states that it is intractable to find two distinct strings that map to the same digest.
A weaker notion is second pre-image resistance, which states that given $x \in \bit^\star$, it is not possible for a $\ppt$ algorithm to find $\tilde{x} \in \bit^\star$ with $\tilde{x} \neq x$ such that $h(x) = h(\tilde{x})$.
Similarly to what we saw in the previous section about $\DDH$ and $\DLP$, we can see that collision resistance implies second pre-image resistance.
Indeed, if there is an attacker against second pre-image resistance, then one can choose a string $x \in \bit^\star$ and obtain from this attacker a second string $\tilde{x} \neq x$ such that $h(x) = h(\tilde{x})$, that is, a collision. So a hash function that is collision resistant is also second pre-image resistant.

\index{Random Oracle Model}
The \textit{random oracle model}~\cite{FS86,BR93}, or \ROM, is an idealized security model where hash functions are assumed to behave as truly random functions.
This implies collision resistance (if the codomain of the hash function is large enough, which should be the case for a cryptographic hash function) and other security notions related to hash functions.
In this model, accesses to the hash function are modeled as oracle queries (which can then be reprogrammed by the reduction).
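
\scbf{Example.} To quantify the collision resistance of a random oracle, assume that it has codomain $\bit^\ell$ and that an adversary makes $q$ distinct queries ($\ell$ and $q$ are notation introduced for this sketch).
Each pair of queries collides with probability $2^{-\ell}$, so a union bound over the $\binom{q}{2}$ pairs gives
\[ \Pr[\text{collision}] \leq \binom{q}{2} \cdot 2^{-\ell} \leq \frac{q^2}{2^{\ell+1}}, \]
which is negligible as soon as $q$ is polynomial and $\ell$ is linear in the security parameter.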

We can notice that this security model is unrealistic~\cite{CGH04}. Let us construct a \emph{counter-example}.
Let $\Sigma$ be a secure signature scheme, and let $\Sigma_y$ be the scheme that returns $\Sigma(m)$ as a signature if $h(0) \neq y$, and $0$ as a signature otherwise.
In the \ROM, $h$ behaves as a random function.
Hence, the probability that $h(0) = y$ is negligible with respect to the security parameter for any fixed $y$, so $\Sigma_y$ remains secure in this model.
On the other hand, when $h$ is instantiated with a real-world hash function, $\Sigma_{h(0)}$ is completely insecure as a signature scheme. \hfill $\square$

In this context, one may wonder why the \ROM is still used in cryptographic proofs~\cite{LMPY16,LLM+16}.
One reason is that some constructions are not yet known to exist in the standard model.
One example is non-interactive zero-knowledge (\NIZK) proofs from lattice assumptions~\cite{Ste96,Lyu08}.
\NIZK proofs form an elementary building block for privacy-based cryptography, and forbidding the use of the \ROM may slow down research in this direction~\cite{LLM+16}.
Another reason to use the \ROM in cryptography is that it is a sufficient guarantee in real-world cryptography~\cite{BR93}.
The example we built earlier is artificial, and in practice there are no known attacks against the \ROM.
Moreover, security in the standard model implies security in the \ROM.
As a consequence, constructions in the \ROM are at least as efficient as in the standard model, and in practice they are usually more efficient.
For instance, the scheme we present in Chapter~\ref{ch:sigmasig} adapts to the \ROM the construction of dynamic group signatures in the standard model from Libert, Peters and Yung~\cite{LPY15}.
This transformation reduces the signature size from $32$ elements in $\GG$, $14$ elements in $\Gh$ and \textit{one} scalar in the standard model~\cite[App. J]{LPY15} down to $7$ elements in $\GG$ and $3$ scalars in the \ROM.

We have now defined the framework in which we work and the basic tools that enable security proofs.
The following section explains how to define the security of a cryptographic primitive.

\section{Security Games and Half-Simulatability}

Up to now, we have defined the framework in which security proofs work. Let us now define what we are proving.
An example of what we are proving has been shown in Section~\ref{se:models} with cryptographic hash functions.

In order to define security properties, a common approach is to define security \emph{games} (or \emph{experiments})~\cite{GM84}.

Two examples of security games are given in Figure~\ref{fig:sec-game-examples}: the \emph{indistinguishability under chosen-plaintext attacks} (\indcpa) for public-key encryption (\PKE) schemes and the \emph{existential unforgeability under chosen-message attacks} (EU-CMA) for signature schemes.

\begin{figure}
  \centering
  \subfloat[\indcpa{} game for \PKE]{
    \fbox{\procedure{$\Exp{\indcpa}{\adv, b}(\lambda)$}{%
        (pk,sk) \gets \mathcal E.\mathsf{keygen}(1^\lambda)\\
        (m_0, m_1) \gets \adv(pk, 1^\lambda)\\
        \mathsf{ct} \gets \mathcal E.\mathsf{enc}(pk, m_b)\\
        b' \gets \adv(pk, 1^\lambda, \mathsf{ct})\\
        \pcreturn b'
    }}
  } \hspace{1cm}
  \subfloat[EU-CMA game for signatures]{
    \fbox{
      \procedure{$\Exp{\mathrm{EU-CMA}}{\adv}(\lambda)$}{
        (vk,sk) \gets \Sigma.\mathsf{keygen}(1^\lambda)\\
        \mathsf{st} \gets \emptyset\\
        \pcwhile \adv(\texttt{query}, vk, \mathsf{st}, \mathcal O^{\mathsf{sign}}) \pcdo
        ;\\
        (m^\star, \sigma^\star) \gets \adv(\texttt{forge}, vk, \mathsf{st}) \\
        \pcreturn (m^\star, \sigma^\star)
    }}
  }
  \caption{Examples of security games} \label{fig:sec-game-examples}
\end{figure}

\index{Reduction!Advantage}
The \indcpa{} game is an \emph{indistinguishability} game, meaning that the goal of the adversary $\adv$ in this game is to distinguish between two distributions, here the encryptions of two messages of its choice.
To model this, for any adversary $\adv$, we define a notion of \emph{advantage} for the $\indcpa$ game as
\[ \advantage{\indcpa}{\adv}(\lambda) = \left| \Pr[ \Exp{\indcpa}{\adv,1}(\lambda) = 1 ] - \Pr[ \Exp{\indcpa}{\adv, 0}(\lambda) = 1] \right|.\]

We say that a $\PKE$ scheme is $\indcpa$-secure if for any $\ppt$ adversary $\adv$, the advantage of $\adv$ in the $\indcpa$ game is negligible with respect to $\lambda$.
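
\scbf{Example.} An equivalent and common way to present this advantage is to let the challenger sample the bit $b$ uniformly and to measure how much better than a blind guess the adversary does; the derivation below is a sketch under this assumption.
Writing $p_b = \Pr[ \Exp{\indcpa}{\adv,b}(\lambda) = 1 ]$ (a shorthand introduced here), we have
\[ \Pr[b' = b] = \frac12 \, p_1 + \frac12 \, (1 - p_0) = \frac12 + \frac12 \, (p_1 - p_0), \]
and therefore
\[ \left| \Pr[b' = b] - \frac12 \right| = \frac12 \, \advantage{\indcpa}{\adv}(\lambda). \]
In particular, an adversary that always outputs $b' = 1$ has $p_0 = p_1 = 1$ and advantage $0$.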