
Multiclass cancer diagnosis using tumor gene expression signatures





Experiment Description:    The optimal treatment of patients with cancer depends on establishing accurate diagnoses by using a complex combination of clinical and histopathological data. In some instances, this task is difficult or impossible because of atypical clinical presentation or histopathology. To determine whether the diagnosis of multiple common adult malignancies could be achieved purely by molecular classification, we subjected 218 tumor samples, spanning 14 common tumor types, and 90 normal tissue samples to oligonucleotide microarray gene expression analysis. The expression levels of 16,063 genes and expressed sequence tags were used to evaluate the accuracy of a multiclass classifier based on a support vector machine algorithm. Overall classification accuracy was 78%, far exceeding the accuracy of random classification (9%). Poorly differentiated cancers resulted in low-confidence predictions and could not be accurately classified according to their tissue of origin, indicating that they are molecularly distinct entities with dramatically different gene expression patterns compared with their well differentiated counterparts. Taken together, these results demonstrate the feasibility of accurate, multiclass molecular cancer classification and suggest a strategy for future clinical implementation of molecular cancer diagnostics.

*Experiment Identifier*:    golub-00228

*Assay Type*:    Gene Expression

*Provider*:    Affymetrix

*Array Designs*:    Hu35KsubA, Hu6800

*Organism*:    Homo sapiens (ncbitax)

*Tissue Sites*:

*Material Types*:    synthetic_RNA, organism_part, whole_organism, total_RNA

*Cell Types*:

*Disease States*:    bladder transitional cell carcinoma, B-cell ALL, Colorectal Adenocarcinoma, Medulloblastoma, Normal, pancreatic adenocarcinoma, Lung Adenocarcinoma, T-cell ALL, Ovarian Adenocarcinoma, Acute Myeloid Leukemia, prostate adenocarcinoma, Breast Adenocarcinoma, Mesothelioma, renal cell carcinoma, Melanoma, Follicular Lymphoma, uterine adenocarcinoma, large B-cell lymphoma, Glioblastoma

==================== Included Files ====================

*Experiment Filename*:

*MAGE-TAB (IDF and SDRF) files*:

The experiment folder includes both imported and exported MAGE-TAB files. In some cases data was submitted in several batches, so the folder includes several imported IDF and SDRF files. The folder contains only one exported IDF file and one exported SDRF file; the word "export" appears in the name of each exported MAGE-TAB file.
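The naming rule described above can be used to pick out the exported files programmatically. The following is a minimal sketch in Python, not part of the original record: the function name `split_mage_tab` and the example file names are hypothetical, and matching "export" case-insensitively is an assumption, since the record only states that the word appears in the exported file names.

```python
from pathlib import Path


def split_mage_tab(filenames):
    """Partition file names into (exported, imported) lists.

    A file counts as exported when the word "export" appears in its
    base name. Case-insensitive matching is an assumption.
    """
    exported, imported = [], []
    for name in filenames:
        base = Path(name).name.lower()
        if "export" in base:
            exported.append(name)
        else:
            imported.append(name)
    return exported, imported


# Hypothetical file names, for illustration only; the real folder
# contents are not listed in this record.
example_files = [
    "batch1.idf.txt",
    "batch1.sdrf.txt",
    "batch2.idf.txt",
    "batch2.sdrf.txt",
    "experiment.export.idf.txt",
    "experiment.export.sdrf.txt",
]

exported, imported = split_mage_tab(example_files)
print("exported:", exported)
print("imported:", imported)
```

Given the description above, the exported list should hold exactly one IDF and one SDRF file; any other count suggests the folder does not follow the stated naming rule.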

Cite this work

Researchers should cite this work as follows:

  • (2014), "Multiclass cancer diagnosis using tumor gene expression signatures,"



Mervi Heiskanen, Ishwar Chandramouliswaran

National Cancer Institute