For the feature selection problem, we propose an efficient privacy-preserving
algorithm. Let $D$, $F$, and $C$ be data, feature, and class sets,
respectively, where the feature value $x(F_i)$ and the class label $x(C)$ are
given for each $x \in D$ and $F_i \in F$. For a triple $(D,F,C)$, the feature
selection problem is to find a consistent and minimal subset $F' \subseteq F$,
where `consistent' means that, for any $x,y \in D$, $x(C)=y(C)$ if
$x(F_i)=y(F_i)$ for all $F_i \in F'$, and `minimal' means that any proper subset of
$F'$ is no longer consistent. On distributed datasets, we consider feature
selection as a privacy-preserving problem: assume that semi-honest parties
$\textsf{A}$ and $\textsf{B}$ hold their own private datasets $D_{\textsf{A}}$ and
$D_{\textsf{B}}$. The goal is to solve the feature selection problem for
$D_{\textsf{A}} \cup D_{\textsf{B}}$ without revealing either party's private data. In this
paper, we propose a secure and efficient algorithm based on fully homomorphic
encryption, and we implement it to demonstrate its effectiveness on a variety
of practical datasets. The proposed algorithm is the first that can directly
simulate, on ciphertext, the CWC (Combination of Weakest Components)
algorithm, one of the best performers for the feature selection problem on
plaintext.
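
To make the definitions of `consistent' and `minimal' concrete, the following is a minimal plaintext sketch in Python. It is not the encrypted protocol proposed here, and it is not claimed to reproduce CWC; it is a plain backward-elimination pass that relies only on the fact that consistency is preserved when features are added. The function names, the dictionary-based row format, and the toy dataset are assumptions introduced for illustration.

```python
from typing import Hashable, List, Mapping, Optional, Sequence

# Hypothetical row format: one dict per x in D, mapping each feature
# name F_i to x(F_i) and the key "C" to the class label x(C).
Row = Mapping[str, Hashable]


def is_consistent(data: Sequence[Row], features: Sequence[str],
                  label: str = "C") -> bool:
    """True if rows that agree on all `features` also agree on `label`."""
    seen = {}
    for x in data:
        key = tuple(x[f] for f in features)
        # The first row with this key fixes the label; later rows must match.
        if seen.setdefault(key, x[label]) != x[label]:
            return False
    return True


def minimal_consistent_subset(data: Sequence[Row], features: Sequence[str],
                              label: str = "C") -> Optional[List[str]]:
    """Backward elimination: drop each feature whose removal keeps consistency.

    Returns None when even the full feature set is inconsistent.
    """
    if not is_consistent(data, features, label):
        return None
    selected = list(features)
    for f in features:
        trial = [g for g in selected if g != f]
        if is_consistent(data, trial, label):
            selected = trial
    return selected


# Toy data (assumed): the class label equals F2, so {F2} alone is consistent.
D_A = [
    {"F1": 0, "F2": 0, "F3": 1, "C": 0},
    {"F1": 0, "F2": 1, "F3": 1, "C": 1},
]
D_B = [
    {"F1": 1, "F2": 0, "F3": 0, "C": 0},
    {"F1": 1, "F2": 1, "F3": 0, "C": 1},
]

# Plaintext union of both parties' data; the privacy-preserving setting
# would never materialize this union in the clear.
print(minimal_consistent_subset(D_A + D_B, ["F1", "F2", "F3"]))
```

Because any superset of a consistent set is consistent, a single pass suffices: each surviving feature was necessary when it was tested, so it remains necessary in the smaller final set. Which minimal subset is returned depends on the order in which features are tried.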

