Talk Abstract
In d-dimensional space, a point p "dominates" another point q if the coordinate of p is at least that of q on every dimension. A "monotone classifier" is a function h mapping each point to {0, 1}, subject to the condition that h(p) >= h(q) holds whenever p dominates q. Now, fix a set P of points in d-dimensional space, where each point carries a label 0 or 1. A classifier h "misclassifies" a point p in P if h(p) is different from the label of p. The "error" of h is the number of points in P misclassified by h.
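To make the definitions concrete, below is a minimal sketch in Python of dominance, monotonicity, and classification error; the function names and the example data are illustrative, not part of the talk.

```python
def dominates(p, q):
    """p dominates q if p's coordinate is at least q's on every dimension."""
    return all(pi >= qi for pi, qi in zip(p, q))

def is_monotone(h, points):
    """Brute-force check that h(p) >= h(q) whenever p dominates q."""
    return all(h(p) >= h(q) for p in points for q in points if dominates(p, q))

def error(h, labeled_points):
    """Number of points in P whose label disagrees with h."""
    return sum(1 for p, label in labeled_points if h(p) != label)

# Example: a labeled set P and a monotone threshold classifier.
P = [((1, 2), 1), ((0, 1), 0), ((2, 3), 1)]
h = lambda p: 1 if sum(p) >= 3 else 0
print(is_monotone(h, [p for p, _ in P]))  # True
print(error(h, P))                        # 0
```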
In active monotone classification, we are given an input set P whose labels are all hidden at the beginning. An algorithm must pay a unit cost to reveal the label of a point in P. The challenge is to find the best monotone classifier (i.e., one with the smallest error) while incurring the smallest cost. The problem is fundamental to numerous applications such as entity matching, record linkage, and duplicate detection.
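The cost model can be captured by a simple label oracle, sketched below under the assumption that labels are stored in a dictionary; the class and attribute names are hypothetical and only illustrate how an algorithm's labeling budget would be tracked.

```python
class LabelOracle:
    """Reveals hidden labels one point at a time, charging unit cost per query."""

    def __init__(self, hidden_labels):
        self._labels = hidden_labels  # dict: point -> 0/1, hidden from the algorithm
        self.cost = 0                 # total labeling cost paid so far

    def query(self, p):
        self.cost += 1                # each revealed label costs one unit
        return self._labels[p]

# An active algorithm would call oracle.query(p) on selected points and is
# judged by the error of its output classifier and the final oracle.cost.
```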
This talk will provide in-depth coverage of the latest results on the problem. Our discussion will involve both algorithms with non-trivial guarantees and lower bounds on the best performance achievable by any algorithm, deterministic or randomized.