Cancer-testis or germ cell antigens (GCAs) are a category of tumor antigens expressed by male germ cells and by cancers of diverse histological origin, but not usually by normal adult somatic tissue. These antigens include products of the MAGE, BAGE, GAGE, SSX, and LAGE/NY-ESO-1 families, which encode antigenic peptides recognized by T lymphocytes. In this study, we exploit oligonucleotide technology to identify genes in melanoma and soft tissue sarcoma (STS) that display a cancer-testis/GCA expression profile. We identified 59 such genes, including GCAs already known to be recognized by T lymphocytes. Among our findings are the expression of PRAME in monophasic synovial sarcoma, PRAME and NY-ESO-1 in myxoid/round cell liposarcoma, and SSX2 and members of the GAGE family in malignant fibrous histiocytoma. Furthermore, the proto-oncogene DBL/MCF2 was identified as encoding a novel candidate GCA expressed by clear cell sarcoma/melanoma of soft parts (MSP). DBL/MCF2 peptides that bind HLA-A*0201 were identified and were recognized by T lymphocytes. These results show the utility of high-throughput expression analysis in the efficient screening and identification of GCA candidates in cancer, and its application to the discovery of candidate targets for T cell immunity against GCAs expressed by cancer.
This article was published in Cancer Immunity, a Cancer Research Institute journal that ceased publication in 2013 and is now provided online in association with Cancer Immunology Research.
Members of the cancer-testis family of antigens are encoded by genes expressed by cancers of diverse histological origin and are typically absent from normal adult tissues, with the exception of male germ cells. Expression of these antigens in germ cells does not seem to induce autoimmunity, possibly because germ cells are mainly sequestered in an immune privileged site and lack MHC class I and II expression (1). Genes encoding GCAs are frequently located on the X chromosome, and their expression can be suppressed in somatic cells by methylation and histone deacetylation.
The prototype GCA, MAGE-A1 (melanoma antigen 1), was originally described by Boon et al. (2, 3) as a tumor antigen that was recognized by CD8+ T cells from an autologous melanoma patient. T cell recognition has also been demonstrated for other GCAs, including MAGE-A3, NY-ESO-1, and SSX2. Today, 44 cancer-testis genes or gene families have been identified by cell transfection assays, SEREX (serological analysis of recombinant tumor cDNA expression libraries) (4), and subtractive hybridization and database mining (5, 6), including MAGE-A, -B, -C, -D, BAGE, GAGE, LAGE-1/NY-ESO-1, and SSX, as well as GAGE-like genes such as PAGE (7, 8) and XAGE (9, 10), which are detected in normal adult tissues (including lung, uterus, and prostate) (8, 9). Only a subset of these gene products (19/44) have been characterized as GCAs, based on recognition by T lymphocytes or antibodies (6), and only 7 GCA families are recognized by T lymphocytes (Table 1) [for review, see Scanlan et al. (6)]. These observations show that the immune repertoire is capable of recognizing GCAs, and in several cases immunogenicity following active immunization has been demonstrated (for example, for MAGE-A3, NY-ESO-1, SSX2). The number of known cancer-testis genes, including potentially immunogenic GCAs, probably underrepresents the total universe of cancer-testis genes, and especially antigenic GCAs.
GCAs are particularly attractive for cancer vaccine development for several reasons. They characteristically have a restricted pattern of expression by adult somatic cells that might favor immune recognition with limited autoimmunity. GCAs are expressed across different cancer diagnoses, which would enable the development of vaccines not limited to individual cancer types. Furthermore, these antigens are frequently coexpressed together by cancer cells in a host, which may facilitate the development of polyvalent vaccines to limit the potential occurrence of antigen escape variants.
Advances in high-throughput gene expression profiling using the oligonucleotide microarray platform have enabled the simultaneous characterization of thousands of gene transcripts. We have explored an approach to identify candidate GCAs in melanoma and sarcoma by predicting the expression parameters for known members of this class. Melanoma is known to express GCAs at particularly high frequencies (6). Using data from a panel of STS, melanoma, and normal tissues, we identified 59 cancer-testis genes, including known GCAs recognized by T cells or antibodies, as well as genes encoding potentially novel GCAs.
We examined 104 tissue and cell line specimens, including 81 cancer specimens, 15 adult normal somatic tissues, pooled testes (germ cells), and 7 fetal, neonatal, and adult brain specimens. Cancer specimens included 34 melanomas (13 tumor specimens and 21 established cell lines) (45) and 47 STSs (all tumor specimens). The STS and melanoma specimens, recently presented by us in independent bioinformatic reports of STS tumor classification (45, 46), included leiomyosarcoma, gastrointestinal stromal tumor, synovial sarcoma, malignant fibrous histiocytoma, conventional fibrosarcoma, pleomorphic liposarcoma, dedifferentiated liposarcoma, and myxoid/round cell liposarcoma. Melanoma specimens included conventional melanoma as well as clear cell sarcoma/MSP. Normal adult tissues included lung, heart, skeletal muscle, small intestine, colon, stomach, liver, spleen, pancreas, kidney, skin, bone marrow, prostate, adrenal gland, and connective tissue.
Identification of candidate GCAs
Examination of the Affymetrix®-assigned detection call, which indicates whether a value is reliably detected ("Present") or not ("Absent") (47), identified 9399 genes expressed in 1 or more normal tissues, 10,328 genes expressed in 1 or more cancers, and 6968 genes expressed in pooled testes. Of these, 304 genes were expressed in cancer and testes but were not reliably detected in adult normal somatic tissues (Figure 1).
We applied an additional criterion, signal intensity, to the selected list. We detected 61/304 gene transcripts in testes and cancer that had an at least 2-fold higher abundance compared to the highest signal value in adult normal somatic tissues. These 61 gene transcripts represented 59 distinct genes (Figure 2). The 61 gene transcripts encoding candidate GCAs were examined more closely. These genes included NY-ESO-1 and PRAME, which were both expressed by synovial sarcoma and myxoid/round cell liposarcoma. In the melanoma panel, 3/9 tumor specimens (33%) and 13/20 cell lines (65%) displayed expression of MAGE and/or GAGE. Notably, BAGE expression was not detected in melanoma using this approach.
In addition, two ESTs were expressed by cancer and germ cells but not by adult normal tissues. C1orf15 was expressed predominantly in round cell liposarcoma and brain. CXorf6, whose locus lies in close proximity to MAGE-A4 and MAGE-A7 to -A9 on chromosome X, was expressed by melanoma and STS specimens, fetal brain, and newborn melanocytes.
Validation of selection criteria
We sought to validate our methodology by measuring a selection of known cancer-testis genes and by assessing genes found on the X chromosome for enrichment. The number of genes is approximate because a single gene can be represented by more than one probe set (as in the case of SSX2). Calculations were performed according to the number of probe sets, rather than distinct genes, because duplicate genes are not excluded in the Affymetrix® U95A probe set annotation.
We found 21 probe sets representing 19 known cancer-testis genes on the Affymetrix U95A GeneChip® (Table 1), out of a total of 12,533 probe sets (0.17%). Gene selection by "Absolute call" yielded 14 probe sets corresponding to known cancer-testis genes, out of 304 genes (4.6%). Selection by signal intensity yielded the same known cancer-testis genes in the list of 61 genes (23.0%) (Table 2). According to the NetAffx™ annotation, we found that 463 of the original 12,533 genes (3.7%) were X-linked. Gene selection by "Absolute call" yielded 32 probe sets corresponding to X-linked genes out of 304 genes (10.5%). Further stringent selection by high signal intensity showed enrichment for X-linked genes in candidate cancer-testis genes, revealing 20 X-linked probe sets out of 61 genes (32.8%). In summary, the application of selection criteria by both "Absolute call" and signal intensity resulted in an approximately 137-fold enrichment of known cancer-testis genes, and an approximately 9-fold enrichment of X-linked genes in a final gene list of 61 probe sets representing 59 different genes.
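The enrichment figures quoted above can be reproduced from the probe set counts given in this section. The following is a minimal sketch of that arithmetic, not the authors' worksheet; the function names are illustrative, and the only inputs are the counts stated in the text.

```python
def fraction(hits, total):
    """Share of probe sets in a list that belong to the category of interest."""
    return hits / total


def fold_enrichment(hits_selected, n_selected, hits_array, n_array):
    """Share in the selected list divided by the share on the whole array."""
    return fraction(hits_selected, n_selected) / fraction(hits_array, n_array)


ARRAY_PROBE_SETS = 12533  # probe sets on the U95A array (from the text)
FINAL_LIST = 61           # probe sets passing both selection steps (from the text)

# Known cancer-testis genes: 21 probe sets on the array, 14 in the final list.
ct_fold = fold_enrichment(14, FINAL_LIST, 21, ARRAY_PROBE_SETS)

# X-linked genes: 463 probe sets on the array, 20 in the final list.
x_fold = fold_enrichment(20, FINAL_LIST, 463, ARRAY_PROBE_SETS)

print(f"known cancer-testis genes: {ct_fold:.0f}-fold")
print(f"X-linked genes: {x_fold:.1f}-fold")
```

Working from the unrounded fractions gives roughly 137-fold and 8.9-fold, in line with the approximate values reported above.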
Immune recognition of novel candidate GCAs identified by transcriptional profiling
DBL/MCF2, a transforming phosphoprotein originally identified as a proto-oncogene expressed by B cell lymphoma, came to our attention because it was expressed by clear cell sarcoma/MSP. We assessed whether T cells from healthy HLA-A*0201 donors recognized DBL/MCF2 peptides (designated DBL180, DBL217, DBL247) presented by HLA-A*0201. All three candidate peptides were shown to bind HLA-A*0201 by T2 stabilization assays (data not shown). CD8+ T cells enriched to >98% purity were used to detect CD8+ T cells that produced IFN-gamma in response to any one of the three DBL/MCF2 peptides. CD8+ T cells were incubated with peptide (or control peptides) for 20 h, and activated T cells were quantitated by ELISPOT assay. Because of the short incubation period in vitro, it is unlikely that T cells doubled more than twice after plating. CD8+ T cells responding with IFN-gamma production were scored as spots per 100,000 CD8+ T cells plated. Thus, the number of specific spots can be used to estimate the frequency of precursor CD8+ T cells that secrete IFN-gamma in response to each DBL/MCF2 peptide when activated in vitro (Figure 3). The estimated frequencies of CD8+ T cells recognizing DBL/MCF2 peptides ranged from 1:1807 to 1:600 (Table 3). These results demonstrate that CD8+ T cells recognize DBL/MCF2 peptides.
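The relationship between spot counts and precursor frequency described above can be written out explicitly. This is a minimal sketch that assumes the frequency is simply the number of peptide-specific spots divided by the number of CD8+ T cells plated (100,000 per well); how background was handled is not detailed here, so the input is taken to be an already background-corrected count. The function names are illustrative, and the spot counts printed are back-calculated from the reported frequencies, not measured values.

```python
CELLS_PER_WELL = 100_000  # CD8+ T cells plated per well (from the text)


def precursor_frequency(specific_spots, cells_plated=CELLS_PER_WELL):
    """Return N such that the precursor frequency is 1:N.

    Returns None when there are no peptide-specific spots, since no
    frequency can be estimated in that case.
    """
    if specific_spots <= 0:
        return None
    return cells_plated / specific_spots


def spots_for_frequency(one_in_n, cells_plated=CELLS_PER_WELL):
    """Inverse calculation: specific spots implied by a frequency of 1:N."""
    return cells_plated / one_in_n


# Back-calculate the spot counts implied by the reported frequency range.
for one_in_n in (600, 1807):
    spots = spots_for_frequency(one_in_n)
    print(f"1:{one_in_n} -> about {spots:.0f} specific spots per {CELLS_PER_WELL:,} cells")
```

Under this assumption, the reported range of 1:1807 to 1:600 corresponds to roughly 55 to 167 specific spots per 100,000 CD8+ T cells.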
Oligonucleotide microarray technology provides unprecedented access to transcription profiles of thousands of genes in multiple specimens. In this analysis, we surveyed the gene expression profile of 12,533 probe sets in over 100 specimens, including STS, melanoma, and adult and fetal normal tissues. GCAs are attractive tumor antigens for the development of cancer vaccines. In trials with MAGE-A3.A1 peptide in patients with metastatic melanoma, no evidence of T cell responses was found in the blood of the four patients who were analyzed, including two who had tumor regression (51). In contrast, a monoclonal T cell response against a MAGE-A3 antigen was reported in a melanoma patient who had partial regression of a lesion after vaccination (52). In another trial by Jäger et al. (53), immune responses were identified against NY-ESO-1, and CD8+ T cell responses were observed in 4/7 patients without pretreatment immune responses to NY-ESO-1. Induction of a specific CD8+ T cell response to NY-ESO-1 in this study was associated with disease stabilization and regression of individual metastases.
We describe the use of high-throughput oligonucleotide microarray technology to identify new cancer-testis genes by their transcriptional expression profile, specifically expression that is restricted to cancer and adult testes and is absent from adult normal somatic tissue. Using these parameters, we filtered the expression profiles of 12,533 genes represented on the U95A GeneChip® in two empiric steps, with subsequent validation.
Our initial selection by "Absolute call" selected genes that were reliably detected on the oligonucleotide array in both cancer and testes, and filtered out genes expressed by adult normal somatic tissues represented on our panel. We applied a second level of stringency to the initial gene list by filtering for genes that were detected at a signal intensity at least 2-fold higher in both cancer and testes as compared to that in the normal tissue panel. Signal intensity is a quantitative metric calculated for each probe set that represents the relative level of expression of a transcript.
The current bioinformatic analysis identified 59 genes that satisfy criteria intended to identify potential GCAs by their expression pattern. Proof of concept to support our approach was provided by the identification of 13 known GCAs represented on the oligonucleotide array, including MAGE-A3 and NY-ESO-1. Our observations were further supported by the enrichment for genes located on the X chromosome.
Several observations are worthy of further comment. Our results confirm expression of NY-ESO-1 in monophasic synovial sarcoma, as previously demonstrated by Jungbluth et al. using immunohistochemistry (54), as well as members of the SSX family of genes, in particular SSX2 and SSX3. Furthermore, we detected expression of PRAME in this sarcoma that occurs predominantly in adolescents.
Myxoid/round cell liposarcoma was found to express both PRAME and NY-ESO-1, which had not been identified in this tumor type previously. Expression of an EST designated C1orf15 was revealed in all four specimens of this sarcoma subtype, in addition to two melanomas and adult and fetal brain. C1orf15, located on chromosome 1q25, had previously been shown by both nucleotide homology and functional assays to belong to the nicotinamide mononucleotide adenylyltransferase (NMNAT) enzyme family, members of which catalyze an essential step in the nicotinamide adenine dinucleotide (NAD) biosynthetic pathway (55). The predilection of this gene for myxoid/round cell liposarcoma and its candidacy as a GCA also expressed in the brain suggest that it may be a novel candidate germ cell-brain antigen in this STS subtype. CXorf6, located on chromosome Xq28 adjacent to loci encoding members of the MAGE family, was identified in several STS and melanoma specimens, neonatal melanocytes, and fetal brain. This EST has been reported to have weak homology to a previously described autoantigen, Ge-1, identified in a patient with Sjögren's syndrome (56).
DBL/MCF2 was identified in clear cell sarcoma/MSP. It is an X-linked proto-oncogene encoding a protein of as yet undetermined function that is transforming in NIH3T3 bioassays following loss of carboxyl sequences. PCR analysis of mouse tissues shows expression restricted to the gonads and to tissues of neuroectodermal origin (57). This gene is of particular interest for further study because it is implicated in oncogenesis and is detected in clear cell sarcoma/MSP, including tumors from HLA-A*0201 patients.
We tested the CD8+ T cell repertoire for recognition of three candidate HLA-A*0201-restricted peptides in DBL/MCF2, revealing CD8+ T cell recognition of all peptides at varying precursor frequency, in particular AMLDLLKSV (DBL217) at the remarkable frequency of 1:600. Further studies into the processing and presentation of these peptides by MSP are warranted.
Chang et al. originally reported that STSs express class I MHC molecules, and a subset of these express class II MHC molecules (58). Both molecules can be upregulated on STS by interferon-gamma (58, 59). A recent paper by Valmori et al. describes a survey of sarcomas, including STSs, for the expression of a dozen known cancer-testis genes, as determined by PCR and immunohistochemistry (59). They found that >70% of sarcomas expressed one or more cancer-testis genes, with evidence for clustering of coexpression of cancer-testis genes by individual tumors. Valmori's results, along with our observations, support the development of active immunization strategies against GCAs expressed by STSs.
In conclusion, this study provides a strategy for the use of high-throughput oligonucleotide technology in the simultaneous screening of multiple tumors and in the identification of tumor antigens. Furthermore, we provide new evidence for the expression of PRAME in monophasic synovial sarcoma, NY-ESO-1 and PRAME in myxoid/round cell liposarcoma, SSX2 and members of the GAGE family in malignant fibrous histiocytoma, and the novel antigen DBL/MCF2 in clear cell sarcoma/MSP. The microarray screen for transcriptional profiling is an effective strategy for identifying cancer-testis genes and candidate GCAs based on relative cancer and tissue expression.
Materials and methods
Tumor specimens were obtained from 60 patients undergoing surgery at the Memorial Sloan-Kettering Cancer Center (MSKCC, NY, USA), including 13 melanomas and 47 STSs, as previously reported by our group in a tumor classification study (45). All specimens were collected under a tissue procurement protocol reviewed and approved by the MSKCC Institutional Review Board. Tumor tissues were freshly embedded in OCT compound and frozen as tissue blocks using liquid nitrogen. Tumor specimens were selected for analysis following validation of histologic diagnosis. Twenty-one cell lines established at MSKCC were also used: 20 melanoma cell lines derived from regional and distant metastases of 20 independent patients (60), and one clear cell sarcoma/MSP cell line that was derived in primary culture from a tumor specimen included in this study. Normal tissue RNA was extracted from normal tissue adjacent to tumor or was obtained from Stratagene (La Jolla, CA, USA) or Clontech (San Jose, CA, USA).
Cryopreserved tumor sections were homogenized under liquid nitrogen by mortar and pestle. Total RNA was extracted in Trizol reagent and purified using the Qiagen RNeasy kit (Qiagen, Valencia, CA, USA). RNA quality was assessed by ethidium bromide agarose gel electrophoresis. Synthesis of cDNA was performed in the presence of oligo(dT)24-T7 (Genset, La Jolla, CA, USA). Complementary RNA was prepared using biotinylated UTP and CTP and hybridized to HG_U95A oligonucleotide arrays (Affymetrix, Santa Clara, CA, USA). Fluorescence was measured by laser confocal scanner (Agilent, Palo Alto, CA, USA) and converted to signal intensity using Affymetrix Microarray Suite v5.0 software.
Data analysis was performed using Microsoft® Excel 2000. Three data sets were established. The first comprised all adult normal somatic tissues except for brain, as the brain may be considered an immune privileged site. The second consisted of melanoma and STS specimens, and the third consisted of pooled testes (from multiple individuals, n = 19) obtained from Clontech (San Jose, CA, USA). First, we used the concatenate function to combine all "Absolute call" information for each separate data set. We selected genes with at least one "Present call" in both the cancer and testes data sets, and zero "Present calls" in the normal tissue data set, allowing for both "Absent" and "Marginal" calls in the latter. After measuring the maximum signal in the selected cancer and normal tissue data sets, we selected genes for which the ratio of maximum cancer signal to maximum normal tissue signal and the ratio of testes signal to maximum normal tissue signal were both greater than or equal to 2. For the maximum cancer signal, only values with a "Present call" were considered.
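The two selection steps described above can be expressed as a short program. The following is a minimal Python sketch of that logic rather than the spreadsheet procedure actually used. The data layout (a list of (call, signal) pairs per data set, with calls abbreviated "P", "M", and "A"), the function names, and the three toy probe sets are all assumptions made for illustration; none of the values are measurements from this study.

```python
def passes_detection_call(cancer, testes, normal):
    """Step 1: at least one Present call in cancer and in testes, none in normal.

    Absent ("A") and Marginal ("M") calls are both tolerated in normal tissue.
    """
    has_present = lambda data: any(call == "P" for call, _ in data)
    return has_present(cancer) and has_present(testes) and not has_present(normal)


def passes_signal_ratio(cancer, testes, normal, min_ratio=2.0):
    """Step 2: maximum cancer and testes signals are >= min_ratio x maximum normal.

    Only cancer signals with a Present call count toward the cancer maximum.
    """
    present_cancer = [signal for call, signal in cancer if call == "P"]
    if not present_cancer:
        return False
    max_normal = max(signal for _, signal in normal)
    max_cancer = max(present_cancer)
    max_testes = max(signal for _, signal in testes)
    # Multiplying instead of dividing avoids a division by zero.
    return max_cancer >= min_ratio * max_normal and max_testes >= min_ratio * max_normal


def select_candidates(probe_sets):
    """Return the names of probe sets that pass both steps, in input order."""
    return [
        name
        for name, (cancer, testes, normal) in probe_sets.items()
        if passes_detection_call(cancer, testes, normal)
        and passes_signal_ratio(cancer, testes, normal)
    ]


# Hypothetical probe sets: name -> (cancer, testes, normal) lists of (call, signal).
toy_probe_sets = {
    "toy_ct_like": ([("P", 900), ("A", 40)], [("P", 1500)], [("A", 60), ("M", 110)]),
    "toy_housekeeping": ([("P", 800)], [("P", 700)], [("P", 750), ("A", 90)]),
    "toy_weak": ([("P", 150)], [("P", 300)], [("A", 100)]),
}

candidates = select_candidates(toy_probe_sets)
print(candidates)
```

In this toy input, only the first probe set survives: the second is removed at step 1 because of a Present call in normal tissue, and the third at step 2 because its cancer signal is less than twice the highest normal tissue signal.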
PBMC and CD8+ T cell isolation
PBMCs from the heparinized blood of healthy HLA-A*0201 donors were isolated by centrifugation on a Ficoll-Paque™ gradient (Pharmacia, Piscataway, NJ, USA) after obtaining the informed consent of the donors using a protocol approved by the MSKCC Institutional Review Board. CD8+ subsets were selected using magnetic beads (MACS; Miltenyi Biotec, Auburn, CA, USA). The efficiency of depletion was monitored by flow cytometry. Purified CD8+ T cells were tested directly in the ELISPOT assay.
The DBL/MCF2 amino acid sequence was analyzed for potential peptide sequences binding to HLA-A*0201, using canonical primary anchor motifs (61). The influenza matrix peptide, GILGFVFTL, that is known to bind to HLA-A*0201 was used as a positive control in the ELISPOT assay. Peptides were obtained from Research Genetics (Invitrogen, Carlsbad, CA, USA) and were purified by HPLC to >90% purity. Peptides were resuspended in DMSO at a concentration of 5 mg/ml and were stored in aliquots at -20°C. Binding to HLA-A*0201 molecules was measured using stabilization of MHC expression by the T2 cell line, according to a previously described protocol (62).
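A motif-based scan of the kind described above can be sketched in a few lines. The exact anchor definition used in this study is given only by reference (61), so the sketch assumes a commonly cited HLA-A*0201 nonamer motif: leucine or methionine at position 2 and valine or leucine at position 9. The function name and the flanking residues in the toy sequence are hypothetical; only the embedded peptide AMLDLLKSV (DBL217) and the influenza matrix peptide come from the text.

```python
# Assumed primary anchor residues for HLA-A*0201 nonamers (not taken from this study).
P2_ANCHORS = {"L", "M"}
C_TERM_ANCHORS = {"V", "L"}


def scan_nonamers(sequence, p2=P2_ANCHORS, c_term=C_TERM_ANCHORS):
    """Return (1-based start position, peptide) for each 9-mer matching both anchors."""
    hits = []
    for start in range(len(sequence) - 8):
        peptide = sequence[start:start + 9]
        if peptide[1] in p2 and peptide[8] in c_term:
            hits.append((start + 1, peptide))
    return hits


# Toy sequence: DBL217 (AMLDLLKSV) placed between made-up flanking residues.
toy_sequence = "GSGS" + "AMLDLLKSV" + "GSGS"
hits = scan_nonamers(toy_sequence)
print(hits)

# The influenza matrix control has isoleucine at position 2, so this strict
# motif does not select it even though it is a known HLA-A*0201 binder.
control_hits = scan_nonamers("GILGFVFTL")
print(control_hits)
```

The control peptide illustrates why a motif scan is only a prefilter: candidates still need to be confirmed experimentally, here by the T2 stabilization assay.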
IFN-gamma ELISPOT assay
IP-Multiscreen plates (Millipore, Bedford, MA, USA) were coated with 100 µl of mouse antihuman IFN-gamma antibody (10 µg/ml; clone 1-D1K, Mabtech, Sweden) in PBS, incubated overnight at 4°C, washed with PBS to remove any unbound antibody, and blocked with RPMI/human serum for 2 h at 37°C. Purified CD8+ T cells were plated at a concentration of 1 x 10^5/well. For antigen presentation, 1 x 10^4 irradiated T2 cells per well were pulsed with 1 µg/ml peptide in a final volume of 100 µl/well. After incubation for 20 h at 37°C, plates were washed with PBS/0.05% Tween-20 and 100 µl/well biotinylated detection antibody against human IFN-gamma (2 µg/ml; clone 7-B6-1, Mabtech, Sweden) was added. Plates were incubated for an additional 2 h at 37°C, and spot development was performed as previously described (63). Spots were counted by an automated ELISPOT reader (Carl Zeiss, Thornwood, NY, USA).
This work has been supported by the Etta Weinheim Memorial Fund, Swim Across America, the Kennedy Family Fund, and by NIH grants CA47179 and CA33049. We would like to thank Juan Li and Liliana Villafania of the Memorial Sloan-Kettering Cancer Center Genomic Core Laboratory and Dr. Rodica Stan for their critical reviews of the manuscript.
- Received November 24, 2004.
- Accepted November 24, 2004.
- Copyright © 2005 by Neil H. Segal