\documentclass[english]{article}
\usepackage[T1]{fontenc}
\usepackage[latin9]{inputenc}
\usepackage{float}
\usepackage{url}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{stackrel}
\usepackage{graphicx}

\makeatletter

\usepackage[dvipsnames,table]{xcolor}

\usepackage{lineno}
\linenumbers
\linespread{1.25}

\usepackage{longtable}
\usepackage{pdflscape}

\usepackage[bookmarks=true]{hyperref}
\usepackage{orcidlink}
\usepackage{booktabs}
\usepackage{caption}
\usepackage{longtable}
% \usepackage[T1]{fontenc}
\usepackage{geometry}
\geometry{verbose,tmargin=2cm,bmargin=2cm,lmargin=2cm,rmargin=2cm}
\usepackage{array}
\usepackage{url}
\usepackage{multirow}
\usepackage{stackrel}
\usepackage{rotating}

\usepackage{bbold}

\usepackage{array} % for ExtraRowHeight

\usepackage{graphicx}
\usepackage{siunitx}
\usepackage[normalem]{ulem}
\usepackage{colortbl}

\usepackage{hhline}
\usepackage{calc}
\usepackage{tabularx}
\usepackage{threeparttable}
\usepackage{wrapfig}
\usepackage{adjustbox}

\usepackage{hyperref}

\newcolumntype{L}{>{\raggedright\arraybackslash}}
\newcolumntype{R}{>{\raggedleft\arraybackslash}}
\newcolumntype{C}{>{\centering\arraybackslash}}

% \renewcommand{\arraystretch}{1.5}
\renewcommand{\tabcolsep}{0.2cm}

\setlength{\LTpre}{6pt}
\setlength{\LTpost}{6pt}

\makeatother

\usepackage{babel}

% https://tex.stackexchange.com/questions/151241/remove-metadata-of-pdf-generated-by-latex
\hypersetup{
    bookmarks=true, % show bookmarks bar?
    unicode=false, % non-Latin characters in Acrobat's bookmarks
    pdftoolbar=true, % show Acrobat's toolbar?
    pdfmenubar=true, % show Acrobat's menu?
    pdffitwindow=false, % window fit to page when opened
    % pdfstartview={FitW}, % fits the width of the page to the window
    pdftitle={Defeating a Black Marble Flood Against Monero: Best Options for Ring Size and Transaction Fee}, % title
    pdfauthor={Rucknium}, % author
    pdfsubject={}, % subject of the document
    pdfcreator={Rucknium}, % creator of the document
    pdfproducer={}, % producer of the document
    pdfkeywords={}, % list of keywords
    pdfnewwindow=true, % links in new window
    colorlinks=false, % false: boxed links; true: colored links
    linkcolor=red, % color of internal links
    citecolor=green, % color of links to bibliography
    filecolor=magenta, % color of file links
    urlcolor=cyan % color of external links
}

\begin{document}

\title{Defeating a Black Marble Flood Against Monero: Best Options for Ring Size and Transaction Fee\\\vspace{.3cm}
\large Draft v0.1\vspace{-.715cm}}
\author{Rucknium\orcidlink{0000-0001-5999-8950} }
\date{May 28, 2024}

\maketitle

\section{Summary}

Increasing Monero's ring size and minimum transaction fee are two
options for defeating black marble flooding. This document attempts
to answer the question: Is it better to increase the ring size or
the transaction fee, or some combination of the two? Cost-Effectiveness
Analysis is used to analyze this question. It considers the additional
costs imposed on transacting users and node operators compared to
the benefit of stronger resistance to black marble flooding.

Consider an adversary with a daily budget of 12.5 XMR, five times
higher than the daily expenditure of the suspected March 2024 black
marble flooder. Given the constraints considered, the most cost-effective
combination of defense parameters is ring size 60 and a minimum fee
of 70 nanoneros per byte. Effective ring size would be 22.8 if the adversary
spent his entire budget every day. The 2in/2out reference transaction
with ring size 60 would be about 140\% larger than the transaction
with current ring size 16. The user's cost to send this transaction
would be about 4.4 USD cents. The total time to verify all transactions
in a block of normal transaction volume would increase from 0.5 seconds
to 1.8 seconds. An unpruned node would grow 59 GB in a year instead
of 25 GB. Pruned nodes would grow 14 GB instead of 8 GB.

\section{Black marble flooding as a game}

We will analyze the problem as a game with two players. One player
aims to flood the Monero blockchain with black marble outputs. This
player is limited by his budget. The other player aims to deter the
first player, or at least limit the damage, by choosing minimum fee
and ring size. This player is limited by the costs that fees and ring
size impose on transacting users and node operators.

Sam is a privacy adversary. His goal is to reduce Monero's effective
ring size by flooding the Monero blockchain with black marble outputs
that he owns. He has some budget $b$ denominated in XMR to spend
on transaction fees per block.

Alice wishes to defeat Sam. She can set Monero's ring size and minimum
transaction fee to try to accomplish her goal. Sam would have to spend
more XMR per output if the minimum fee per byte were higher. A larger
ring size would require Sam to own a larger share of outputs to achieve
a specified effective ring size. (Without changing the minimum fee
per byte, a larger ring size also requires Sam to spend more XMR to
produce each output because transaction size is larger.)

Larger ring sizes and fees help Alice accomplish her goal of defeating
Sam, but Alice cannot raise ring size and fee without limit. Users
who send Monero transactions need to pay a higher fee when the minimum
transaction fee is higher. Larger ring sizes mean that transactions
are larger. At a given transaction volume, larger transactions make
the blockchain grow faster. People who operate Monero nodes need to
store the blockchain on their storage media such as Solid State Drives
(SSDs). Alice needs to balance the benefit of a stronger defense against
Sam with the cost imposed on transacting users and node operators.

These are the factors on Alice's mind:
\begin{itemize}
\item I do not know Sam's budget $b$. I do not know what effective ring
size he hopes to achieve. If I set ring size and fee so that he cannot
achieve his desired effective ring size with his budget $b$, he will
choose not to flood the blockchain with black marbles. This is the
deterrence outcome.
\item If I fail to deter Sam, at least I can hold him to a specific effective
ring size when he spends his budget $b$. This is the fallback outcome.
\item I do not want to set ring size and transaction fee unnecessarily high
because transacting users and node operators pay higher costs when
these parameters increase.
\end{itemize}
We will simplify the problem:
\begin{itemize}
\item Sam may actually change the budget he is willing to spend based on
the effective ring size he is able to achieve. In other words, Sam
may have a tradeoff function between budget and effective ring size.
We will ignore this complication and assume that Sam's budget is fixed,
but unknown to Alice.
\item We will use the fallback outcome to measure the effectiveness of Alice's
options. When the fallback outcome is better for Alice, the deterrence
outcome is more likely. Therefore, it is a little redundant to compute
the probability of the deterrence outcome as an effectiveness metric.
\item Transaction volume by normal users is assumed to be constant and unaffected
by changes in the transaction fee. In other words, we will assume
that the demand for Monero transactions is completely inelastic with
respect to transaction fee.
\item We will assume that Sam's black marble transactions are 1in/2out because
the suspected black marble flood of March 2024 used this type of transaction.
Sam could produce black marble outputs more cheaply with 1in/16out
transactions, but the flood transactions would be easier for an observer
to identify.
\end{itemize}
Alice will use Cost-Effectiveness Analysis (CEA) to evaluate her ring
size and fee options. Cost effectiveness is the ratio of cost to effectiveness:

\begin{equation}
CE=\dfrac{\mathrm{Cost}}{\mathrm{Effectiveness}}
\end{equation}

A lower value of $CE$ is better. Alice must define cost and effectiveness
as functions of ring size, transaction fee, and the adversary's budget.
Let $n$ be nominal ring size, $f$ be the fee per byte in nanonero
units, and $b$ be the adversary's budget. Costs will be measured
in terms of XMR per block.

The cost has two components: cost to transacting users and cost to
node operators. The $i$th transaction has some number of inputs and
outputs. Changing the ring size $n$ changes the total size of the
$i$th transaction, which affects the total minimum fee to send the
transaction. Changing the minimum fee per byte also changes the total
fee. Let $w_{i}\left(n\right)$ be the weight of transaction
$i$ when ring size is $n$. When a transaction has two outputs, transaction
weight is equal to transaction size in bytes. Weight is larger than
size when the number of outputs is greater than two.\footnote{See Section 7.3.2 of koe, Alonso, K. M., \& Noether, S. (2020). \textit{Zero
to Monero: Second Edition}.} The block is assumed to contain an average set of transactions $T$.
The average is based on observed transactions confirmed on the blockchain
in the four weeks before the March 2024 suspected black marble flooding:
February 5 -- March 3. $C_{u}\left(f,n\right)$ is the aggregate
users' cost to send transactions for an average block:

\begin{equation}
C_{u}\left(f,n\right)=\underset{i\in T}{\sum}f\cdot w_{i}\left(n\right)
\end{equation}

The cost to node operators is a function of ring size. Node operators
do not pay higher costs when the minimum transaction fee is higher.
All units of computer storage in this document will be SI units, i.e.
a kilobyte, megabyte, gigabyte and terabyte are $10^{3}$, $10^{6}$,
$10^{9}$, and $10^{12}$ bytes, respectively. The retail price of
one consumer 1 TB SATA SSD is about 1 XMR.\footnote{In April 2024, the median retail price of a 1TB SATA SSD on \url{https://ssd.userbenchmark.com/}
was 114.50 USD. The exchange rate at the time was 120 USD per XMR.} A node operator's cost $C_{SSD}$ to store one byte of Monero blockchain
data is $10^{-12}$ XMR (a piconero). According to \texttt{monero.fail/map},
there were about 20,000 Monero nodes on the network in April 2024.
Storing one byte on every node therefore costs about 20,000 piconeros.
Currently the minimum relay fee is also 20,000 piconeros (20 nanoneros)
per byte. Therefore, by coincidence Monero transactions pay for their
own storage space on the node network when users pay the minimum fee
per byte.

Let $d_{n}$, the number of nodes (daemons), be 20,000. $z_{i}\left(n\right)$
is the size of the $i$th transaction in the $T$ set when ring size
is $n$. The parameter $m$ is an adjustment factor that raises or lowers
total node operators' costs linearly to adjust for uncertainty
about the true number of nodes and to add costs that are more difficult
to compute like CPU and RAM use. In the analysis below $m$ will be
set to 2. We will assume that each node is an unpruned node that stores
all transaction data in full. The total cost to node operators is
the sum of the size of transactions in the $T$ set multiplied by
the storage cost on a single SSD, the number of nodes on the network,
and the $m$ adjustment parameter:

\begin{equation}
C_{d}\left(n,m\right)=m\cdot d_{n}\cdot C_{SSD}\cdot\underset{i\in T}{\sum}z_{i}\left(n\right)
\end{equation}
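
As a rough illustration of this formula, consider the node operators'
cost of a single byte of transaction data. The calculation below only
plugs in the values stated above ($m=2$, $d_{n}=20{,}000$, and $C_{SSD}=10^{-12}$
XMR). It is a sketch for intuition, not an additional result.

\begin{equation*}
m\cdot d_{n}\cdot C_{SSD}=2\cdot20{,}000\cdot10^{-12}\:\mathrm{XMR}=4\times10^{-8}\:\mathrm{XMR}=40\:\mathrm{nanoneros\:per\:byte}
\end{equation*}

Under these assumptions, each byte added to the blockchain costs node
operators about twice the current minimum fee of 20 nanoneros per byte.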

Notice that $C_{d}\left(n,m\right)$ is the cost to node operators
under normal transaction volume, i.e. when there is no black marble
flooding. Total cost is the sum of $C_{u}\left(f,n\right)$ and $C_{d}\left(n,m\right)$:

\begin{equation}
C\left(f,n,m\right)=C_{u}\left(f,n\right)+C_{d}\left(n,m\right)
\end{equation}

With budget $b$, Sam can afford to place $\frac{b}{f}$ bytes of
transaction data in a block. Sam would create transactions with one
input and two outputs. The formula for the number of bytes of a transaction
like this in terms of the ring size $n$ is $975+35n$. The $975$
bytes is the size of the transaction except for the linear cost of
the ring size, i.e. an (invalid) 1in/2out transaction with ring size
0 would have $975$ bytes composed of the input's key image, other
input data that does not scale up with ring size, the outputs' Bulletproofs+,
the outputs' public keys, and \texttt{tx\_extra} data. The $35$ coefficient
on $n$ is the sum of the bytes of the ``$s$'' component of the
CLSAG ring signature of each ring member (32 bytes) and $3$ bytes
of the key offset integer that is used to create the output indices
of the ring members. The $3$ bytes is an empirical average of the
bytes used by each key offset integer. The number of outputs per byte
that Sam produces is $2/\left(975+35n\right)$ because each of his
transactions has two outputs. To calculate the number of outputs per
block that Sam can afford with budget $b$ when fee is $f$ and nominal
ring size is $n$, we compute the product of $\frac{b}{f}$ and $2/\left(975+35n\right)$,
producing the formula for $s\left(b,f,n\right)$:

\begin{equation}
s\left(b,f,n\right)=\dfrac{2b}{f\cdot\left(975+35n\right)}
\end{equation}
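
A worked example may help to fix the units. It assumes that Sam's
budget is 2.5 XMR per day, that there are 720 blocks per day (one
block every 120 seconds), that the fee is the current minimum of $f=20$
nanoneros per byte, and that the ring size is the current $n=16$.
The conversion to a per-block budget and the rounding are illustrative
arithmetic, not figures taken from the tables.

\begin{align*}
b & =\dfrac{2.5\times10^{9}}{720}\approx3.47\times10^{6}\text{ nanoneros per block}\\
975+35\cdot16 & =1535\text{ bytes per 1in/2out transaction}\\
s\left(b,f,n\right) & \approx\dfrac{2\cdot3.47\times10^{6}}{20\cdot1535}\approx226\text{ outputs per block}
\end{align*}

With the same budget and fee but $n=60$, each 1in/2out transaction
would be $975+35\cdot60=3075$ bytes, so Sam could afford only about
113 outputs per block.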

Let $r$ be the number of real user outputs. When the number of outputs
owned by Sam is $s\left(b,f,n\right)$, the long-term mean effective
ring size\footnote{For a derivation of mean effective ring size, see Section 3 of Draft
v0.2 of Rucknium (2024) ``March 2024 Suspected Black Marble Flooding
Against Monero: Privacy, User Experience, and Countermeasures'' \url{https://github.com/Rucknium/misc-research/blob/main/Monero-Black-Marble-Flood/pdf/monero-black-marble-flood.pdf}} is

\begin{equation}
n_{e}\left(b,f,n\right)=1+\left(n-1\right)\cdot\dfrac{r}{r+s\left(b,f,n\right)}\label{eq:expectation-n_e}
\end{equation}
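
Three simple cases illustrate how (\ref{eq:expectation-n_e}) behaves.
If Sam produces no outputs, then $s=0$ and $n_{e}=n$. If Sam's outputs
vastly outnumber real outputs, then $n_{e}$ approaches 1 because
only the real spend remains in each ring. As a hypothetical midpoint
chosen only for easy arithmetic, suppose Sam owns exactly as many
outputs as real users, so that $s=r$:

\begin{equation*}
n_{e}=1+\left(n-1\right)\cdot\dfrac{r}{r+r}=1+\dfrac{n-1}{2}
\end{equation*}

In that case the effective ring size would be 8.5 when $n=16$ and
30.5 when $n=60$.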

Alice wants to have a larger $n_{e}$ when Sam is producing black
marbles. $n_{e}$ is the desired outcome in the cost-effectiveness
analysis:

\begin{equation}
CE=\dfrac{C\left(f,n,m\right)}{n_{e}\left(b,f,n\right)}\label{eq:cost-effectiveness-full}
\end{equation}

Alice's goal is to choose minimum fee per byte $f$ and nominal ring
size $n$ to minimize $CE$ when Sam spends his budget $b$ producing
black marbles and the node cost multiplier is some specified $m$.
In game theory, a player's \textit{best response} in a two-player
game is a strategy that gives the player the best payoff when the
other player plays some specified strategy. Alice's best response
to Sam playing some $b$ as a strategy is to set $f$ and $n$ to
minimize $CE$. Alice does not know what value of $b$ Sam intends
to play, but reasonable values of $b$ can be analyzed to guide reasonable
choices of $f$ and $n$. In game theory terms, Alice's uncertainty
about Sam's $b$ means that this is a game of incomplete information.
Sam's player ``type'' is the unknown $b$. Sam has some probability
of being each type. In this document I will not explicitly declare
some probability distribution of Sam's type, but one could determine
Alice's single best response for the expected value of her cost effectiveness
when Sam's type has some probability distribution.

Define $f_{\min}$ and $f_{\max}$ as the minimum and maximum $f$
that Alice is willing to set. Let $n_{\min}$ and $n_{\max}$ be the
minimum and maximum $n$ that Alice is willing to set. Assume that
Alice wants to make sure that the effective ring size does not fall
below some specified minimum acceptable limit $\check{n}_{e}$. Alice
will try to minimize (\ref{eq:cost-effectiveness-full}) except when
the effective ring size would be below $\check{n}_{e}$ at the minimum
of (\ref{eq:cost-effectiveness-full}). In that case, Alice will exclude
the values of $n$ and $f$ that cause effective ring size to be below
$\check{n}_{e}$, then choose $n$ and $f$ to minimize (\ref{eq:cost-effectiveness-full})
from the set of $n$ and $f$ values that remain.

Alice's best response correspondence given Sam's choice of $b$ and
the node cost multiplier $m$ is the solution to

\begin{equation}
\begin{array}{l}
\underset{f,n}{\arg\min}\dfrac{C\left(f,n,m\right)}{n_{e}\left(b,f,n\right)}\\
\mathrm{subject\,to}\\
f_{\min}\leq f,\,f\leq f_{\max}\\
n_{\min}\leq n,\,n\leq n_{\max}\\
\check{n}_{e}\leq n_{e}\left(b,f,n\right)
\end{array}\label{eq:best-response-correspondence}
\end{equation}

The problem in (\ref{eq:best-response-correspondence}) is a nonlinear
minimization problem with nonlinear inequality constraints. Note that
the constraint set is convex, but the objective function is neither
globally convex nor globally concave.\footnote{The full proof of this statement is TODO. The first four constraints
form a convex set because they are affine. The $\check{n}_{e}\leq n_{e}\left(b,f,n\right)$
constraint is more complicated. The Hessian matrix of the second-order
partial derivatives of $n_{e}$ with respect to $f$ and $n$ is negative
definite as long as $n>1$. That means that its superlevel set for
some $\check{n}_{e}$ is convex. (The $\check{n}_{e}\leq n_{e}\left(b,f,n\right)$
inequality defines the superlevel set.) The intersection of two convex
sets is convex, so the constraint set of (\ref{eq:best-response-correspondence})
is convex.} The necessary conditions for the solution could be found analytically
by checking the Karush-Kuhn-Tucker conditions. I will solve it numerically
with a grid search. The grid is formed by evaluating (\ref{eq:cost-effectiveness-full})
many times at different values of $f$ and $n$. The values of $f$
are 40 equally spaced values between $f_{\min}$ and $f_{\max}$.
The values of $n$ are each integer between $n_{\min}$ and $n_{\max}$.

We will start with a simple example. Assume that the adversary's budget
is 2.5 XMR per day. This is approximately the expenditure rate of
the suspected black marble flooder in March 2024. We will evaluate
cost-effectiveness at each combination of $f\in\left\{ 10,20,40,100,200\right\} $
nanoneros per byte and $n\in\left\{ 16,30,45,60\right\} $ ring size.

Table \ref{table-2_5-budget} contains the cost effectiveness (CE)
computations with other metrics like transaction size, total projected
growth of the blockchain, and estimated transaction verification time.
Note that the cost to send a 2in/2out transaction increases when ring
size increases even if the fee per byte does not increase because
users have to pay for larger total transaction size. The numerator
of CE has been scaled to millineros. The lowest value in the CE column
is 0.48 when nominal ring size is 60 and minimum fee is 40 nanoneros
per byte. Sam can achieve a 37.5 effective ring size with a 2.5 XMR/day
budget when nominal ring size is 60 and minimum fee is 40 nanoneros
per byte. Estimation of transaction and block verification time is
explained in Appendix \ref{sec:Appendix:-Transaction-verification}.

Figure \ref{fig-contour-plot-50-budget} is a shaded contour plot
of cost effectiveness when Sam has a budget of 50 XMR per day. Lighter
colors on the plot indicate lower CE values at the specified minimum
fee and ring size values. The blue triangle indicates the fee and
ring size values that minimize the CE when the minimum acceptable
effective ring size of 5 is disregarded. When we allow only fee and
ring size values that produce effective ring size above the minimum
acceptable effective ring size, the green circle indicates the fee and
ring size values that minimize the CE. In this plot the triangle and
circle are at the same location because the minimum CE produces an
effective ring size of 12.8, above the minimum effective ring size
of 5.

Table \ref{table-scenarios-budget} shows the values of minimum fee
and ring size that produce optimal cost effectiveness when Sam has
different budgets. The maximum budget, 500 XMR per day, exceeds Monero's
daily security budget provided by tail emission. An adversary's budget
higher than 500 might imply that the adversary could directly 51 percent
attack the blockchain by renting CPU hashpower. It seems unnecessary
to consider a black marble flooder's budget greater than 500 XMR per
day because an adversary with a higher budget might be able to do
more damage to Monero than flooding the blockchain with black marble
outputs.

\begin{landscape}
\footnotesize{
\input{tables/permuted-cost-effectiveness-2_5-budget.tex}
}
\end{landscape}

\begin{figure}
\caption{Most cost-effective minimum fee and ring size when adversary budget
is 50 XMR per day}

\label{fig-contour-plot-50-budget}

\includegraphics[scale=0.65]{images/cost-effective-contour-plot-50-budget}
\end{figure}

\begin{landscape}
\footnotesize{
\input{tables/cost-effectiveness-budget-scenarios.tex}
}
\end{landscape}

\section{Discussion}

What have we learned? According to this analysis, raising the ring
size is a more cost-effective strategy against a black marble
attack than raising fees. A combination of a large increase in ring
size and a modest increase in fee seems to provide a good, cost-effective
defense.

Consider an adversary with a daily budget of 12.5 XMR, five times
higher than the daily expenditure of the suspected March 2024 black
marble flooder. Table \ref{table-scenarios-budget} says the most
cost-effective combination of defense parameters is ring size 60
and a minimum fee of 70 nanoneros per byte. Effective ring size would be
22.8 if Sam spent his entire budget every day. The 2in/2out reference
transaction with ring size 60 would be about 140\% larger than the
transaction with current ring size 16. The user's cost to send this
transaction would be about 4.4 USD cents. The total time to verify
all transactions in a block of normal transaction volume would increase
from 0.5 seconds to 1.8 seconds. An unpruned node would grow 59 GB
in a year instead of 25 GB. Pruned nodes would grow 14 GB instead
of 8 GB.

Put these storage requirements into perspective. Recall that we use
base-10 (SI) units to measure bytes in this document. As of May 2024,
an unpruned Monero blockchain is 206 GB. A pruned Monero node takes
79 GB of storage space. The 2023 Ultra 4K edition of Call of Duty
requires 229 GB of storage.\footnote{{\scriptsize{}\url{https://web.archive.org/web/20231214215231/https://www.callofduty.com/blog/2023/10/call-of-duty-modern-warfare-III-specs-preloading-pc-trailer}}}
An unpruned BTC node requires 650 GB of storage and grows about 89
GB per year.\footnote{\url{https://bitcoin.stackexchange.com/a/116350} and \url{https://transactionfee.info/charts/block-size/}}
Therefore, with ring size 60 the Monero blockchain would still grow
more slowly than the BTC blockchain (59 GB versus 89 GB per year).
An unpruned Monero node would reach about 265 GB after one year, crossing
the 229 GB Call of Duty storage threshold within that year.

Encouraging node operators to prune their nodes and implementing a
coinbase consolidation transaction type could reduce the impact of
increasing the minimum fee and ring size. Pruning could be encouraged
by setting pruning as the default in more Monero software interfaces,
as proposed in Monero GUI wallet Pull Request \#4320, and through public information
campaigns.\footnote{\url{https://github.com/monero-project/monero-gui/pull/4320}}
A coinbase consolidation type would reduce the transaction size for
small coinbase outputs.\footnote{\url{https://github.com/monero-project/research-lab/issues/108}}

\section{Summary: Downsides and benefits of options}
\begin{enumerate}
\item Increase the minimum relay fee per byte
\begin{enumerate}
\item Downsides:
\begin{enumerate}
\item Users may make fewer transactions. That would reduce Monero's total
anonymity set because the rate of creation of new outputs would fall.
\item Users could move to another means of payment.
\item Monero might lose its reputation as a low-cost means of payment.
\item Large changes in Monero's fiat exchange rate could make the purchasing
power of the minimum fee much higher or lower than anticipated.
\end{enumerate}
\item Benefits:
\begin{enumerate}
\item Miners would earn more from fees. This would increase Monero's resistance
to a 51 percent attack because its mining security budget would increase
a little.
\item Higher fees would increase the cost of all spam regardless of motivation.
(Increasing the ring size only negatively affects spammers that want
to reduce the effective ring size.)
\end{enumerate}
\end{enumerate}
\item Increase the ring size
\begin{enumerate}
\item Downsides:
\begin{enumerate}
\item Greater storage requirements for operating a Monero node could cause
some node operators to stop running their nodes. This would make the
Monero network less decentralized.
\item Some Monero wallet users may stop running local nodes and switch to
remote nodes. This would increase the load on public remote nodes
and potentially expose the wallet users to some privacy risks from
malicious remote nodes.\footnote{See \url{https://docs.featherwallet.org/guides/nodes}}
\item Verification time per transaction would increase. During normal operation,
the Monero node would use more CPU resources. During initial blockchain
download, total sync time would be greater. Syncing a Monero node
on an HDD, which is already very difficult, might become completely
nonviable because of the necessary random reads for ring signature
verification.
\item At extremes, long verification times can threaten network stability.
In 2023 Pirate Chain suffered a transaction spam attack that caused
chain splits because of long transaction verification times.\footnote{\url{https://web.archive.org/web/20230803171107/https://old.reddit.com/user/SignificantRoof5656/comments/15h9reh/pirate_chains_045_spam_attack_2_months_later/}}
Monero uses the Fluffy Blocks protocol to verify transactions as they
arrive in the txpool instead of bottlenecking verification at the
time new blocks are mined. It is unclear if Pirate Chain, a code fork
of Zcash, uses a compact block protocol.\footnote{See \url{https://github.com/zcash/zips/issues/360}}
As long as the time to verify each block's transactions does not become
a large fraction of mean time between blocks (120 seconds), this is
probably not a threatening issue, \textit{in theory}. In practice,
the Monero node performs many more actions than just verifying the
cryptography of transactions. There may be hidden bottlenecks. Recently,
spikes of transactions with large numbers of inputs have seemed to
cause excess RAM usage of nodes, shutting down nodes in some cases.\footnote{\url{https://github.com/monero-project/monero/issues/9317}}
\end{enumerate}
\item Benefits:
\begin{enumerate}
\item Increasing the ring size increases the anonymity set of all transaction
inputs. Other statistical attacks unrelated to black marble flooding
like EAE attacks and timing analysis would become more difficult.
\end{enumerate}
\item Neutral:
\begin{enumerate}
\item Increasing the ring size has very little effect on wallet sync times.
The bandwidth costs for syncing transactions in mined blocks are only
about three bytes per ring member for the ring offset data. No additional
computation is required. However, ring signature data is sent from
nodes to wallets when transactions are still in the txpool.\footnote{Thanks to jeffro256 for explaining this.}
\end{enumerate}
\end{enumerate}
\item Encourage blockchain pruning
\begin{enumerate}
\item Downsides:
\begin{enumerate}
\item New unpruned nodes may have to connect to more nodes to create an
unpruned copy of the blockchain.
\item Each pruned node keeps one-eighth of the transaction data that is designated
``prunable''. If all nodes on the network are pruned, there is an
extremely small chance that one of the eight pruning slices will not
exist on the whole network. That would mean that not all signature
data on the blockchain could be verified. When blockchain pruning
is enabled, a Monero node randomly chooses one of eight possible pruning
seeds independently of the pruning seeds that other nodes have chosen.
By chance, the network could be missing one of the eight slices of
the prunable part of the blockchain because the choice of pruning
seed is not coordinated between nodes. This chance is extremely small.
If the network only has pruned nodes and the total number of nodes
on the network is 681, the probability of missing one of the eight
pruning slices is less than $2^{-128}$, which is the probability
of guessing a specific 12-word BIP39 bitcoin seed phrase with a single
guess. See Appendix \ref{sec:Probability-of-recovering-complete-blockchain-data}
for how to compute this probability. This probability does not consider
the challenge of new nodes finding their pruning slices by connecting
to multiple nodes throughout the network.
\end{enumerate}
\item Benefits:
\begin{enumerate}
\item Pruned nodes consume much less storage space.
\end{enumerate}
\item Neutral:
\begin{enumerate}
\item ``There are no privacy or security downsides when using a pruned
node.''\footnote{\url{https://web.getmonero.org/resources/moneropedia/pruning.html}}
\end{enumerate}
\end{enumerate}
\item Implement ``Coinbase Consolidation Tx Type''\footnote{\url{https://github.com/monero-project/research-lab/issues/108}}
|
|
\begin{enumerate}
|
|
\item Downsides:
|
|
\begin{enumerate}
|
|
\item koe, the original proper of this protocol modification, said, ``After
|
|
thinking more, I am not sure this proposal is the right direction.
|
|
Enote consolidation being statistically significant, and consolidating
|
|
enotes with small amounts being expensive, is a general problem not
|
|
specific to coinbase enotes. Implementing a specific solution for
|
|
coinbase enotes amounts to elevating the circumstances of miners to
|
|
first-class status in the protocol, without solving the more general
|
|
problem. If another major project on the scale of p2pool becomes active
|
|
in Monero and would benefit from specific protocol changes (not trivial
|
|
benefits - privacy and scaling benefits even), should we hard fork
|
|
to accommodate them? To support protocol longevity by reducing hardforks
|
|
(and not setting precedents that would justify a relatively higher
|
|
rate of future hardforks), it seems better to aim for general solutions
|
|
to problems. In this case, one general solution to the privacy impacts
|
|
of consolidation would be a global membership proof. The cost of consolidations
|
|
might be addressed by using aggregate membership proofs that scale
|
|
sub-linearly with the number of memberships being proven (i.e. number
|
|
of tx inputs).''\footnote{\url{https://github.com/monero-project/research-lab/issues/108\#issuecomment-1379288635}}
|
|
\item There is a small privacy impact to some miners. Most centralized mining
|
|
pools already publish the blocks that they mine, so the ownership
|
|
of mining pools' coinbases is usually publicly known already.\footnote{Wijaya, D. A., Liu, J. K., Steinfeld, R., \& Liu, D. (2021) ``Transparency
|
|
or anonymity leak: Monero mining pools data publication''. Paper
|
|
presented at Information Security and Privacy - 26th Australasian
|
|
Conference, ACISP 2021, Virtual Event, December 1-3, 2021, Proceedings.} P2Pool payout addresses are public on the P2Pool side chain, allowing
|
|
good guesses about which transactions are consolidating coinbases
|
|
to specific mining addresses.\footnote{\url{https://p2pool.observer/sweeps}}
|
|
The P2Pool README recommends that miners use a separate mining wallet.\footnote{\url{https://github.com/SChernykh/p2pool?tab=readme-ov-file\#general-considerations}}
|
|
Therefore, a coinbase consolidation transaction type would not have
|
|
a large negative impact on the privacy of most miners because the
|
|
on-chain privacy for miners is low to begin with. The privacy of solo
|
|
miners could be negatively impacted, however. With the new coinbase
consolidation transaction type, those miners could send coins to themselves once
|
|
to create outputs that would enter the non-coinbase anonymity set.
|
|
\end{enumerate}
|
|
\item Benefits:
|
|
\begin{enumerate}
|
|
\item If the coinbase consolidation transaction type is implemented at the
|
|
same time as much larger rings, coinbase consolidations would not
take up as much storage. In the 60 ring member scenario, annual blockchain
growth would be 2.7 GB less if all coinbase outputs were spent by inputs
with ring size one.
|
|
\item If the ring size and/or fee per byte increases a lot, P2Pool mining
|
|
may become uncompetitive compared to centralized pool mining, especially
|
|
for the P2Pool mini chain. Consider the 10th percentile of multi-output
|
|
coinbase outputs during February 2024: 0.000272 XMR. (10\% of the
|
|
likely P2Pool outputs are below this amount.) With the status quo
|
|
ring size and minimum fee per byte, consolidating this P2Pool payout
|
|
by adding an input to a transaction costs the miner about 5 percent
|
|
of the value of that output. With the ring size 60 and 70 nanoneros
|
|
per byte scenario considered above, about 57 percent of the value
|
|
of that output would be consumed by the cost to spend the output in
a transaction's input. But if coinbase outputs only have to have
ring size 1, then even paying 60 nanoneros per byte would cost the
miner only 4.2 percent of the output's value when spending it in
a 1-ring-member input. (The costs quoted here do not include the bytes
|
|
contributed by outputs or other transaction data.)
|
|
\item Coinbase outputs can behave like black marbles in the rings of transactions
|
|
that do not spend coinbase outputs. See the ``Avoiding selecting
|
|
coinbase outputs as decoys'' Monero Research Lab issue.\footnote{\url{https://github.com/monero-project/research-lab/issues/109}}
|
|
Implementing a coinbase consolidation transaction type would prevent
|
|
coinbase outputs from being included in the rings of transactions
|
|
that do not spend coinbase outputs. This would improve the privacy
|
|
of those transactions.
|
|
\end{enumerate}
|
|
\end{enumerate}
|
|
\end{enumerate}
|
|
|
|
\appendix
|
|
\newpage{}
|
|
|
|
\section{Appendix: Transaction verification time estimates\label{sec:Appendix:-Transaction-verification}}
|
|
|
|
The verification time estimates are based on performance tests developed
|
|
by koe. I modified the test parameters to produce estimates of a large
|
|
set of ring sizes, input counts, and output counts in \url{https://github.com/Rucknium/monero-tx-performance}.
|
|
koe provides interpretation of the performance tests in Monero Research
|
|
Lab issue $\#91$.\footnote{\url{https://github.com/monero-project/research-lab/issues/91}}
|
|
I used the same machine as koe for the tests. The verification performance
|
|
tests do not include the time to read data from storage media. The
|
|
2in/2out reference transaction and the assumed 1in/2out black marble
|
|
transaction type could be tested directly, but there were too many
|
|
permutations of the real transaction in/out types in the February-March
|
|
2024 sample to test those directly. Estimates of the real transaction
verification times were necessary to estimate the verification time
|
|
for an average real block. All tests were in batches of one because
|
|
setting the batching parameter did not seem to affect the verification
|
|
time of inputs (it did affect verification time of outputs, but the
|
|
research question is about varying different ring sizes of inputs,
|
|
not outputs).
|
|
|
|
A linear regression model was fit by Ordinary Least Squares (OLS)
|
|
to interpolate the estimated verification times for the set of real
|
|
transactions at different ring sizes. The performance tests developed
by koe were originally designed to only compute ring sizes that are
powers of two. Therefore, ring size performance was tested for ring
sizes 1, 2, 4, 8, 16, 32, and 64. The number of outputs tested was
every integer between 2 and 16 because these are the numbers of
transaction outputs allowed by blockchain consensus rules. The maximum
|
|
number of inputs in a single transaction that a standard Monero wallet
|
|
can produce is 150. The number of inputs I used for the performance
|
|
estimates was:
|
|
|
|
$\{1,2,3,4,5,6,7,8,9,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150\}$
|
|
|
|
Taking all combinations of these sets gives $7\cdot15\cdot24=2520$
combinations. Tests with these combinations produce the dataset used
in the OLS regression. We can guess a good functional form for the
regression equation based on knowledge of the time complexity of the
algorithms used to cryptographically verify transaction components.
I included ring size and inputs, the base-2 log of each, and two
interaction terms: the product of ring size and inputs, and the product
of their base-2 logs. Dummy variables for the ceiling of the base-2 log of
the number of outputs were included in the regression equation since
Bulletproofs verification time is a function of the number of outputs
rounded up to the next power of two. The
|
|
full regression equation is below.
|
|
|
|
\begin{equation}
|
|
\begin{array}{cl}
|
|
time= & \beta_{0}+\beta_{1}ring\_size+\beta_{2}inputs+\beta_{3}\log_{2}(ring\_size)+\beta_{4}\log_{2}(inputs)+\beta_{5}\mathbb{{1}}\left\{ \lceil\log_{2}(outputs)\rceil=2\right\} \\
|
|
& +\beta_{6}\mathbb{{1}}\left\{ \lceil\log_{2}(outputs)\rceil=3\right\} +\beta_{7}\mathbb{{1}}\left\{ \lceil\log_{2}(outputs)\rceil=4\right\} \\
|
|
& +\beta_{8}ring\_size\times inputs+\beta_{9}\log_{2}(ring\_size)\times\log_{2}(inputs)+\epsilon
|
|
\end{array}\label{eq:verification-time-regression}
|
|
\end{equation}
|
|
|
|
$\lceil x\rceil$ means the integer ceiling of $x$. $\mathbb{{1}}\left\{ x\right\} $
|
|
is an indicator function. Its value is $1$ when the statement in
|
|
braces is true and is $0$ otherwise. The results of the regression
|
|
are in Table \ref{table-regression-results}. The adjusted $R^{2}$
is extremely high (0.9998), indicating that the model fits the data
well. However, a model may have a high $R^{2}$ simply because the
observations differ vastly in scale, which is the case
here.
|
|
|
|
Given the estimated parameters from (\ref{eq:verification-time-regression}),
|
|
predicted values of the verification time for all types of transactions
|
|
and all ring sizes can be computed for the February-March 2024 sample
|
|
by plugging the number of inputs, outputs, and ring size into the
|
|
regression equation with the $\hat{\boldsymbol{\beta}}$ estimated
|
|
parameters.
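
As a sketch of this plug-in step, take the 2in/2out reference transaction
at ring size 16 (a value chosen here only for illustration). Because
$\lceil\log_{2}(2)\rceil=1$, all three output dummy variables in
(\ref{eq:verification-time-regression}) are zero, and $\log_{2}(16)=4$
and $\log_{2}(2)=1$, so the predicted verification time would be

\[
\widehat{time}=\hat{\beta}_{0}+16\hat{\beta}_{1}+2\hat{\beta}_{2}+4\hat{\beta}_{3}+\hat{\beta}_{4}+32\hat{\beta}_{8}+4\hat{\beta}_{9}
\]

The coefficient values themselves come from Table \ref{table-regression-results};
they are left symbolic here.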
|
|
|
|
\begin{table}[H]
|
|
\caption{CLSAG transaction verification time OLS regression. Units are milliseconds.}
|
|
|
|
\input{tables/verification-time-regression.tex}
|
|
|
|
\label{table-regression-results}
|
|
\end{table}
|
|
|
|
\newpage{}
|
|
|
|
\section{Probability of recovering complete blockchain data from a network
|
|
with only pruned nodes\label{sec:Probability-of-recovering-complete-blockchain-data}}
|
|
|
|
The problem of collecting all eight of Monero's pruning slices
|
|
is a type of coupon collector's problem. Holst (1986) provides the
|
|
formula to find the probability that you need $n$ pruned nodes on
|
|
the network to be able to recover the intact blockchain from the eight
|
|
unique slices.\footnote{Holst, L. (1986). ``On Birthday, Collectors\textquoteright , Occupancy
|
|
and Other Classical Urn Problems.'' International Statistical Review
|
|
/ Revue Internationale de Statistique, 54(1), 15--27. \url{https://doi.org/10.2307/1403255}} Holst says, ``In this paper we will consider problems connected
|
|
with drawing with replacement from an urn with $r$ balls of different
|
|
colours.....The inverse of the occupancy problem is sometimes called
|
|
the coupon collector's problem. It reads: how many draws are necessary
|
|
for obtaining $k$ different balls?'' Holst gives the general problem
|
|
when $r$ is not necessarily equal to $k$. In the pruned node problem,
|
|
we only need one copy of each unique slice, so $r=k$ in our case.
|
|
Holst says that the probability of needing exactly $n$ draws for
|
|
obtaining $k$ different balls when the urn has $r$ balls of different
|
|
colors is
|
|
|
|
\begin{equation}
|
|
\mathrm{P}\left(W_{k:r}=n\right)=r_{(k)}S(n-1,k-1)/r^{n}\label{eq:coupon-collectors}
|
|
\end{equation}
|
|
|
|
Holst defines $r_{(k)}\equiv r(r-1)\ldots(r-k+1)$. When $r=k$, this
|
|
is the factorial $r_{(k)}=r!$. $S(n,k)$ is a Stirling number of
|
|
the second kind:
|
|
|
|
\[
|
|
S(n,k)=\frac{1}{k!}\stackrel[j=0]{k}{\sum}(-1)^{j}\dbinom{k}{j}(k-j)^{n}
|
|
\]
|
|
|
|
Equation (\ref{eq:coupon-collectors}) gives the probability that
you need exactly $n$ nodes on the network to have all eight distinct
slices. We want to know the probability that you need more than $n$
nodes to have all the slices. With $k=r=8$, this probability is $1-\sum_{i=8}^{n}\mathrm{P}\left(W_{k:r}=i\right)$.
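
Assuming each node draws one of the eight slices uniformly and independently,
the same tail probability can also be written by inclusion-exclusion
over the slices that are missing. This is a standard coupon collector
identity, not a formula quoted from Holst, and is offered only as
a cross-check:

\[
\mathrm{P}\left(W_{8:8}>n\right)=\stackrel[j=1]{7}{\sum}(-1)^{j+1}\dbinom{8}{j}\left(1-\frac{j}{8}\right)^{n}\approx8\left(\frac{7}{8}\right)^{n}
\]

The approximation keeps only the $j=1$ term, which dominates once
$n$ is large because the remaining terms shrink at least as fast
as $(6/8)^{n}$. As a small check against (\ref{eq:coupon-collectors}),
setting $n=8$ in both expressions gives $\mathrm{P}\left(W_{8:8}>8\right)=1-8!/8^{8}\approx0.9976$.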
|
|
|
|
To avoid limitations of floating point computer arithmetic, when computing
|
|
these values it is recommended to use a software library that uses
|
|
arbitrary-precision numbers such as the GNU Multiple Precision Arithmetic
|
|
Library.
|
|
|
|
Figure \ref{fig-pruned-node-collectors-problem} plots the probability
|
|
that a Monero network would not contain all 8 pruned node slices.
|
|
When there are 100 nodes on the network, the probability is about
|
|
0.001 percent. When the number of nodes is 681, the probability of
|
|
not having all 8 pruned node slices is less than $2^{-128}$, which
|
|
is the probability of guessing a specific 12-word BIP39 bitcoin seed
|
|
phrase with a single guess.\footnote{\url{https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki}}
|
|
These probabilities correspond to a network at a single point in time.
If we consider that a network will have many ``draws'' in its lifetime,
then the probability of missing one of the eight slices at some
point in its lifetime is higher. If the whole set of $n$ nodes re-draws
|
|
its random pruning seed $q$ times, the probability of never missing
|
|
one of the eight slices is $\left(1-\mathrm{P}\left(\mathrm{missing\,at\,least\,one\,slice}\right)\right)^{q}$
|
|
because the draws are independent.
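
These figures can be roughly reproduced from the leading-order approximation
$p_{n}\approx8\left(7/8\right)^{n}$ for the probability that $n$
nodes miss at least one slice, which assumes that each node draws
its slice uniformly and independently:

\[
\begin{array}{rl}
p_{100} & \approx8\left(7/8\right)^{100}\approx1.27\times10^{-5}\\
\log_{2}p_{680} & \approx3+680\log_{2}(7/8)\approx-127.9987\\
\log_{2}p_{681} & \approx3+681\log_{2}(7/8)\approx-128.19
\end{array}
\]

so 681 is the smallest number of nodes for which the approximation
falls below $2^{-128}$. For the lifetime calculation, suppose purely
for illustration that a 100-node network re-draws every pruning seed
$q=365$ times. The probability of missing a slice at least once would
then be $1-\left(1-p_{100}\right)^{365}\approx365\,p_{100}\approx0.0046$,
or about 0.46 percent.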
|
|
|
|
\begin{figure}
|
|
|
|
\caption{Probability of not having all 8 distinct pruned slices on the Monero
|
|
network}
|
|
|
|
\label{fig-pruned-node-collectors-problem}
|
|
|
|
\includegraphics[scale=0.4]{images/pruned-node-collectors-problem-to-100}\includegraphics[scale=0.4]{images/pruned-node-collectors-problem-to-1000}
|
|
|
|
\end{figure}
|
|
|
|
\end{document}
|