DNA microarray technology is a revolutionary method that makes it possible to measure the expression levels of thousands of genes in a single experiment under diverse experimental conditions. This technology has found numerous applications in research and applied areas such as biology, drug discovery, toxicological studies and disease diagnosis.

DNA microarray data is typically represented by a matrix in which each cell gives the expression level of a gene under a particular experimental condition. One important analysis task for microarray data is the simultaneous identification of groups of genes that show similar expression patterns across specific groups of experimental conditions (samples) [1]. This task can be addressed by biclustering, whose aim is to discover coherent biclusters. A bicluster is a subset of the genes and conditions of the original expression matrix in which the selected genes behave coherently under all the experimental conditions contained in the bicluster.

More generally, biclustering also has applications in other domains such as text mining [2, 3], target marketing [4, 5], market research [6], database search [7, 8] and the analysis of foreign exchange data [9].

Formally, let *I* = {1, 2, ..., *n*} denote the index set of *n* genes and *J* = {1, 2, ..., *m*} the index set of *m* conditions. A *data matrix M*(*I*, *J*) associated with *I* and *J* is an *n* × *m* matrix in which the *i*^{th} row, *i* ∈ *I*, represents the *i*^{th} gene (or attribute), the *j*^{th} column, *j* ∈ *J*, represents the *j*^{th} condition (or individual), and the entry *m*_{ij} in the *i*^{th} row and the *j*^{th} column is the value of the *j*^{th} condition for the *i*^{th} gene. A *bicluster* in a data matrix *M*(*I*, *J*) is a couple (*I*', *J*') such that *I*' ⊆ *I* and *J*' ⊆ *J*. The biclustering problem can be formulated as follows: given a data matrix *M*, construct a group of biclusters *B*_{opt} associated with *M* such that:

$$f(B_{opt}) = \max_{B \in BC(M)} f(B)$$

where *f* is an *objective function* measuring the *quality*, i.e., the degree of coherence, of a group of biclusters and *BC*(*M*) is the set of all the possible groups of biclusters associated with *M*.
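To make the definitions concrete, the following minimal sketch represents a data matrix as a list of rows and a bicluster as a pair of index sets (*I*', *J*'). The type aliases, the helper names `submatrix` and `is_bicluster`, and the toy expression values are illustrative assumptions, not part of the formal definition.

```python
from typing import FrozenSet, List, Tuple

Matrix = List[List[float]]
# A bicluster is a couple (I', J') of 1-based row and column index sets.
Bicluster = Tuple[FrozenSet[int], FrozenSet[int]]


def is_bicluster(M: Matrix, bicluster: Bicluster) -> bool:
    """Check that I' is a subset of I and J' is a subset of J for the n x m matrix M."""
    n, m = len(M), len(M[0])
    rows, cols = bicluster
    return rows <= frozenset(range(1, n + 1)) and cols <= frozenset(range(1, m + 1))


def submatrix(M: Matrix, bicluster: Bicluster) -> Matrix:
    """Return the entries m_ij of M restricted to rows I' and columns J'."""
    rows, cols = bicluster
    return [[M[i - 1][j - 1] for j in sorted(cols)] for i in sorted(rows)]


# Toy 4 x 3 matrix with made-up values: rows are genes, columns are conditions.
M = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [9.0, 1.0, 5.0],
    [3.0, 6.0, 9.0],
]

# Genes 1, 2 and 4 under conditions 1 and 3.
B = (frozenset({1, 2, 4}), frozenset({1, 3}))
print(is_bicluster(M, B))  # True
print(submatrix(M, B))     # [[1.0, 3.0], [2.0, 6.0], [3.0, 9.0]]
```

In this toy example the three selected genes all increase from condition 1 to condition 3, which is the kind of coherent behavior an objective function *f* is meant to reward.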

Clearly, biclustering is a highly combinatorial problem with a search space of order *O*(2^{|*I*|+|*J*|}). In the general case, biclustering is known to be NP-hard [1]. Consequently, most of the algorithms used to discover biclusters rely on heuristics that explore only part of the combinatorial search space. The existing biclustering algorithms can roughly be classified into two large families: systematic search methods and stochastic search methods (also called metaheuristic methods). Representative examples of systematic search methods include, among others, greedy algorithms [1, 10–14], divide-and-conquer algorithms [7, 15] and enumeration algorithms [16–18]. Among the metaheuristic methods, we can mention neighbourhood-based algorithms such as simulated annealing [19], GRASP [20], and evolutionary and hybrid algorithms [21–24]. A recent review of various biclustering algorithms for biological data analysis is provided in [25].
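The size of the search space can be checked by brute force on a tiny instance. The sketch below enumerates every pair of a row subset and a column subset (empty sets included) and confirms that the count equals 2^(n+m); the function names and the chosen sizes are illustrative assumptions.

```python
from itertools import chain, combinations
from typing import Iterator, Tuple


def subsets(indices: range) -> Iterator[Tuple[int, ...]]:
    """Yield every subset of `indices`, including the empty one."""
    items = list(indices)
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))


def count_candidate_biclusters(n: int, m: int) -> int:
    """Count all couples (I', J') by explicit enumeration."""
    return sum(1 for _ in subsets(range(1, n + 1)) for _ in subsets(range(1, m + 1)))


n, m = 4, 3
assert count_candidate_biclusters(n, m) == 2 ** (n + m)
print(count_candidate_biclusters(n, m))  # 128

# Closed form only: explicit enumeration is already impractical at modest sizes.
print(2 ** (20 + 10))  # 1073741824
```

Even 20 genes and 10 conditions give over a billion candidate couples, which is why practical algorithms prune or sample the space rather than enumerate it in full.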

Since biclustering is an NP-hard problem and no single existing algorithm is completely satisfactory for solving it, it is useful to seek more effective algorithms that yield better solutions. In this paper, we introduce a new enumeration algorithm for biclustering of DNA microarray data, called *BiMine*. Our algorithm is based on three original features. First, *BiMine* relies on a new evaluation function called *Average Spearman's rho* (ASR), which is used to guide the exploration of the search space effectively. Second, *BiMine* uses a new tree structure, called *Bicluster Enumeration Tree* (BET), to represent conveniently the different biclusters discovered during the enumeration process. Third, to avoid the combinatorial explosion of the search tree, *BiMine* introduces a parametric rule that allows the enumeration process to cut tree branches that cannot lead to good biclusters.
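The name Average Spearman's rho suggests a score built from pairwise rank correlations. The sketch below is one plausible reading under that assumption, not the ASR definition itself, which this section does not give: it averages Spearman's rho over all pairs of gene rows in a bicluster. The helper names, the toy values and the treatment of constant rows are all illustrative assumptions.

```python
from itertools import combinations
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upwards, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant profile is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def mean_pairwise_rho(rows: Sequence[Sequence[float]]) -> float:
    """Average Spearman's rho over all unordered pairs of rows (hypothetical score)."""
    pairs = list(combinations(rows, 2))
    if not pairs:
        raise ValueError("at least two rows are needed")
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)


# Made-up gene profiles: the first three rise together, the fourth falls.
coherent = [[1, 2, 3, 4], [2, 4, 6, 8], [10, 20, 25, 90]]
mixed = coherent + [[4, 3, 2, 1]]

print(mean_pairwise_rho(coherent))  # 1.0
print(mean_pairwise_rho(mixed))     # 0.0
```

A rank-based score of this kind depends only on the ordering of expression values, so genes that rise and fall together score highly even when their magnitudes differ. A threshold on such a score is one conceivable way to decide which candidate biclusters to keep during enumeration.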

To assess the performance of the proposed *BiMine* algorithm, we show computational results obtained on both synthetic and real datasets and compare our results with those from four state-of-the-art biclustering algorithms. Moreover, to evaluate the biological relevance of our resulting biclusters, we carry out a practical validation with respect to a specific Gene Ontology (GO) annotation with the help of a popular web tool.