On the data: these are public citizen-led efforts to crowd-source the names of the missing - hosted on websites and spreadsheets. There is no official verification process behind the individual entries, which is part of why the duplicate problem existed in the first place.
One issue we have only now realised is "bad actors" trying to access the data...
Happy to answer anything - methodology, false positives, data handling, whatever.