Table 1 Characteristics of the experimental datasets

From: ProtNN: fast and accurate protein 3D-structure classification in structural and topological space

| Dataset | SCOP ID | Family name | Pos. | Neg. | Avg. ∣V∣ | Avg. ∣E∣ | Max. ∣V∣ | Max. ∣E∣ |
|---|---|---|---|---|---|---|---|---|
| DS1 | 48623 | Vertebrate phospholipase A2 | 29 | 29 | 160 | 628 | 451 | 1812 |
| DS2 | 52592 | G-proteins | 33 | 33 | 246 | 971 | 897 | 3544 |
| DS3 | 48942 | C1-set domains | 38 | 38 | 238 | 928 | 768 | 2962 |
| DS4 | 56437 | C-type lectin domains | 38 | 38 | 185 | 719 | 775 | 3016 |
| DS5 | 56251 | Proteasome subunits | 35 | 35 | 231 | 929 | 897 | 3544 |
| DS6 | 88854 | Protein kinases, catalytic subunits | 41 | 41 | 275 | 1077 | 775 | 3016 |

  1. The columns SCOP ID, Family name, Pos., Neg., Avg. ∣V∣, Avg. ∣E∣, Max. ∣V∣ and Max. ∣E∣ give, respectively, the SCOP identifier of the positive protein family, its name, the number of positive examples, the number of negative examples, the average number of nodes, the average number of edges, the maximum number of nodes and the maximum number of edges in each dataset.
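
As an illustration of how the per-dataset columns defined above could be derived, here is a minimal Python sketch. It is not the authors' code: the function name `dataset_characteristics`, the representation of each protein graph as a `(nodes, edges)` pair, and the choice to average over all graphs in a dataset (positive and negative together) are assumptions made for this example, and the toy data is invented.

```python
from statistics import mean


def dataset_characteristics(graphs, labels):
    """Summarize a graph dataset with the column names used in the table.

    graphs: list of (nodes, edges) pairs, one per protein graph (hypothetical format).
    labels: list of booleans, True for a positive example.
    """
    node_counts = [len(nodes) for nodes, _ in graphs]
    edge_counts = [len(edges) for _, edges in graphs]
    positives = sum(1 for is_positive in labels if is_positive)
    return {
        "Pos.": positives,
        "Neg.": len(labels) - positives,
        # Assumption: averages and maxima are taken over every graph in the dataset.
        "Avg. |V|": mean(node_counts),
        "Avg. |E|": mean(edge_counts),
        "Max. |V|": max(node_counts),
        "Max. |E|": max(edge_counts),
    }


# Invented toy dataset: two tiny graphs, one positive and one negative.
toy_graphs = [
    ({"A", "B", "C"}, {("A", "B"), ("B", "C")}),
    ({"A", "B"}, {("A", "B")}),
]
toy_labels = [True, False]
stats = dataset_characteristics(toy_graphs, toy_labels)
```

With real data, each entry of `graphs` would hold the nodes and edges of one protein graph, and the resulting dictionary would correspond to one row of the table.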