wallet: mitigate statistical dependence for decoy selection within rings

Since we are required to check for uniqueness of decoy picks within any given
ring, and since some decoy picks may fail due to unlock time or malformed EC points,
the wallet2 decoy selection code was building up a larger than needed *unique* set of
decoys for each ring according to a certain distribution *without replacement*. After
filtering out the outputs that it couldn't use, it chooses from the remaining decoys
uniformly random *without replacement*.

The problem with this is that the picks later in the picking process are not independent
from the picks earlier in the picking process, and the later picks do not follow the
intended decoy distribution as closely as the earlier picks. To understand this
intuitively, imagine that you have 1023 marbles. You label 512 marbles with the letter A,
label 256 with the letter B, so on and so forth, finally labelling one marble with the
letter J. You put them all into a bag, shake it well, and pick 8 marbles from the bag,
but everytime you pick a marble of a certain letter, you remove all the other marbles
from that bag with the same letter. That very first pick, the odds of picking a certain
marble are exactly how you would expect: you are twice as likely to pick A as you are B,
twice as likely to pick B as you are C, etc. However, on the second pick, the odds of
getting the first pick are 0%, and the chances for everything else is higher. As you go
down the line, your picked marbles will have letters that are increasingly more unlikely
to pick if you hadn't remove the other marbles. In other words, the distribution of the
later marbles will be more "skewed" in comparison to your original distribution of marbles.

In Monero's decoy selection, this same statistical effect applies. It is not as dramatic
since the distribution is not so steep, and we have more unique values to choose from,
but the effect *is* measureable. Because of the protocol rules, we cannot have duplicate
ring members, so unless that restriction is removed, we will never have perfectly
independent picking. However, since the earlier picks are less affected by this
statistical effect, the workaround that this commit offers is to store the order that
the outputs were picked and commit to this order after fetching output information over RPC.
This commit is contained in:
jeffro256 2023-10-15 17:25:15 -07:00
parent 8123d945f8
commit b2eb47d875
No known key found for this signature in database
GPG key ID: 6F79797A6E392442

View file

@ -8616,6 +8616,26 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
COMMAND_RPC_GET_OUTPUTS_BIN::request req = AUTO_VAL_INIT(req);
COMMAND_RPC_GET_OUTPUTS_BIN::response daemon_resp = AUTO_VAL_INIT(daemon_resp);
// The secret picking order contains outputs in the order that we selected them.
//
// We will later sort the output request entries in a pre-determined order so that the daemon
// that we're requesting information from doesn't learn any information about the true spend
// for each ring. However, internally, we want to prefer to construct our rings using the
// outputs that we picked first versus outputs picked later.
//
// The reason why is because each consecutive output pick within a ring becomes increasing less
// statistically independent from other picks, since we pick outputs from a finite set
// *without replacement*, due to the protocol not allowing duplicate ring members. This effect
// is exacerbated by the fact that we pick 1.5x + 75 as many outputs as we need per RPC
// request to account for unusable outputs. This effect is small, but non-neglibile and gets
// worse with larger ring sizes.
std::vector<get_outputs_out> secret_picking_order;
// Convenience/safety lambda to make sure that both output lists req.outputs and secret_picking_order are updated together
// Each ring section of req.outputs gets sorted later after selecting all outputs for that ring
const auto add_output_to_lists = [&req, &secret_picking_order](const get_outputs_out &goo)
{ req.outputs.push_back(goo); secret_picking_order.push_back(goo); };
std::unique_ptr<gamma_picker> gamma;
if (has_rct)
gamma.reset(new gamma_picker(rct_offsets));
@ -8750,7 +8770,7 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
if (out < num_outs)
{
MINFO("Using it");
req.outputs.push_back({amount, out});
add_output_to_lists({amount, out});
++num_found;
seen_indices.emplace(out);
if (out == td.m_global_output_index)
@ -8772,12 +8792,12 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
if (num_outs <= requested_outputs_count)
{
for (uint64_t i = 0; i < num_outs; i++)
req.outputs.push_back({amount, i});
add_output_to_lists({amount, i});
// duplicate to make up shortfall: this will be caught after the RPC call,
// so we can also output the amounts for which we can't reach the required
// mixin after checking the actual unlockedness
for (uint64_t i = num_outs; i < requested_outputs_count; ++i)
req.outputs.push_back({amount, num_outs - 1});
add_output_to_lists({amount, num_outs - 1});
}
else
{
@ -8786,7 +8806,7 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
{
num_found = 1;
seen_indices.emplace(td.m_global_output_index);
req.outputs.push_back({amount, td.m_global_output_index});
add_output_to_lists({amount, td.m_global_output_index});
LOG_PRINT_L1("Selecting real output: " << td.m_global_output_index << " for " << print_money(amount));
}
@ -8894,7 +8914,7 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
seen_indices.emplace(i);
picks[type].insert(i);
req.outputs.push_back({amount, i});
add_output_to_lists({amount, i});
++num_found;
MDEBUG("picked " << i << ", " << num_found << " now picked");
}
@ -8908,7 +8928,7 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
// we'll error out later
while (num_found < requested_outputs_count)
{
req.outputs.push_back({amount, 0});
add_output_to_lists({amount, 0});
++num_found;
}
}
@ -8918,6 +8938,10 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
[](const get_outputs_out &a, const get_outputs_out &b) { return a.index < b.index; });
}
THROW_WALLET_EXCEPTION_IF(req.outputs.size() != secret_picking_order.size(), error::wallet_internal_error,
"bug: we did not update req.outputs/secret_picking_order in tandem");
// List all requested outputs to debug log
if (ELPP->vRegistry()->allowed(el::Level::Debug, MONERO_DEFAULT_LOG_CATEGORY))
{
std::map<uint64_t, std::set<uint64_t>> outs;
@ -9035,18 +9059,21 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
}
}
// then pick others in random order till we reach the required number
// since we use an equiprobable pick here, we don't upset the triangular distribution
std::vector<size_t> order;
order.resize(requested_outputs_count);
for (size_t n = 0; n < order.size(); ++n)
order[n] = n;
std::shuffle(order.begin(), order.end(), crypto::random_device{});
// While we are still lacking outputs in this result ring, in our secret pick order...
LOG_PRINT_L2("Looking for " << (fake_outputs_count+1) << " outputs of size " << print_money(td.is_rct() ? 0 : td.amount()));
for (size_t o = 0; o < requested_outputs_count && outs.back().size() < fake_outputs_count + 1; ++o)
for (size_t ring_pick_idx = base; ring_pick_idx < base + requested_outputs_count && outs.back().size() < fake_outputs_count + 1; ++ring_pick_idx)
{
size_t i = base + order[o];
const get_outputs_out attempted_output = secret_picking_order[ring_pick_idx];
// Find the index i of our pick in the request/response arrays
size_t i;
for (i = base; i < base + requested_outputs_count; ++i)
if (req.outputs[i].index == attempted_output.index)
break;
THROW_WALLET_EXCEPTION_IF(i == base + requested_outputs_count, error::wallet_internal_error,
"Could not find index of picked output in requested outputs");
// Try adding this output's information to result ring if output isn't invalid
LOG_PRINT_L2("Index " << i << "/" << requested_outputs_count << ": idx " << req.outputs[i].index << " (real " << td.m_global_output_index << "), unlocked " << daemon_resp.outs[i].unlocked << ", key " << daemon_resp.outs[i].key);
tx_add_fake_output(outs, req.outputs[i].index, daemon_resp.outs[i].key, daemon_resp.outs[i].mask, td.m_global_output_index, daemon_resp.outs[i].unlocked, valid_public_keys_cache);
}