age: 14; sex: M; disease status: control subject; Filter number: 86; Usage number: 1; Exposures: 3
Growth protocol
Study participants were recruited with informed consent under a protocol approved by the institutional review board. A skin punch biopsy was used to establish a fibroblast cell culture. Unaffected control subjects were selected from patients visiting dermatology clinics. All of the cell strains (between passages 2 and 6 in the array group, and at passage 2 in the New Subject group) were grown to confluence in SmGM2 + 10% FBS + bullet kit containing recombinant EGF, FGF and insulin (BioWhittaker). After the cells reached confluence, the medium was modified to include ascorbic acid at 50 µg/ml [Jain MK, Layne MD, Watanabe M, Chin MT, Feinberg MW, Sibinga NE et al.: In vitro system for differentiating pluripotent neural crest cells into smooth muscle cells. J Biol Chem 1998, 273: 5993-5996]. Fresh medium was added after 24 hours.
Extracted molecule
total RNA
Extraction protocol
At 48 h, total RNA was isolated using a guanidinium isothiocyanate-phenol-chloroform extraction protocol [Chomczynski P, Sacchi N: Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 1987, 162: 156-159]. Each sample was assessed for quality and quantity by UV spectroscopy and gel electrophoresis.
Label
33P
Label protocol
For hybridization to individual arrays, 1 µg of total RNA was used to synthesize 33P-dCTP-labeled first-strand cDNA with Invitrogen SuperScript II.
Hybridization protocol
Each sample was analyzed twice on duplicate arrays. The Research Genetics GF211 arrays were hybridized with 30-60 million cpm of probe per array for 18 h, then washed extensively at 50 °C in 0.5X SSC + 0.1% SDS.
Scan protocol
Multiple exposures were collected on a Storm phosphorimager. Images were imported using Research Genetics Pathways 3 software.
Description
Individual GF211 filters were stripped and reused several times; the `Filter Number/Use Number' characteristic columns show which filter and which reuse correspond to each hybridization. Each hybridization was exposed on the phosphorimager multiple times with varying durations; each exposure was scanned once; the `Exposures' characteristic shows how many exposures were combined to produce the normalized intensity estimates. This information is also encoded in the `Sample name'. Un-normalized values from all exposures/scans of all hybridizations are provided in one of the two attached data files (one for normal fibroblast samples, one for MFS samples). These two files do *not* use the GEO-defined gene IDs for platform GPL538; instead, correlate entries via GenBank Accession.
Data processing
See Yao et al., `A Marfan syndrome gene expression phenotype in cultured skin fibroblasts', submitted. In outline, we do the following. The median of log10-transformed expression values for each gene across all exposures of all arrays is calculated, and serves as a common baseline for comparison between experiments. Then, for each exposure, we compute a smooth `local' (loess) regression line between its log intensities and the median levels. The resulting regression function transforms the expression values of each exposure to the scale of the common baseline while capturing nonlinear effects. Finally, we combine data from multiple exposures of an array by taking the per-gene median across those exposures. This technique builds on earlier work [Dozmorov I, Galecki A, Chang Y, Krzesicki R, Vergara M, Miller RA: Gene expression profile of long-lived Snell dwarf mice. J Gerontol A Biol Sci Med Sci 2002, 57: B99-108], [Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J et al.: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 2002, 30: e15.] and more details are reported in [Mulvihill ER, Jaeger J, Sengupta R, Ruzzo WL, Reimer C, Lukito S et al.: Atherosclerotic plaque smooth muscle cells have a distinct phenotype. Arterioscler Thromb Vasc Biol 2004, 24: 1283-1289]. One innovation introduced here is the following: to reduce the impact of outliers and differentially expressed genes on the inferred levels of other genes, the loess regression is based on only a `stable' subset of genes uniformly chosen across all intensity levels. Specifically, for each gene we calculate the rank of its measured intensity in each exposure, and the median absolute deviation (MAD) of its rank across exposures. We sort all n genes by increasing median intensity and partition them into n/5 groups of 5 consecutive genes. From each group of 5, the gene with the minimum MAD statistic is chosen.
This selection of n/5 genes comprises the stable subset of genes on which our regression is based. Other approaches to normalization based on stable genes have been proposed [Kepler TB, Crosby L, Morgan KT: Normalization and analysis of DNA microarray data by self-consistency and local regression. Genome Biol 2002, 3: RESEARCH0037.], [Schadt EE, Li C, Ellis B, Wong WH: Feature extraction and normalization algorithms for high-density oligonucleotide gene expression array data. J Cell Biochem Suppl 2001, Suppl 37: 120-125.] but those techniques seem more complex, and choose stable genes in ways that potentially yield sparse coverage of some regions of the intensity spectrum, consequently increasing the influence of anomalous or differentially expressed genes on the normalization in those regions. As intensity-dependent non-linearity is a key concern in our normalization, uniform representation of intensities in the stable subset is valuable.
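The procedure outlined above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used to produce the submitted values. All function names (`stable_subset`, `local_linear`, `normalize`, `combine`) are hypothetical. The smoother is a simplified tricube-weighted local linear fit without robustness iterations, standing in for whatever loess implementation and span were actually used. Three further points are assumptions: the direction of the regression (baseline as a function of exposure intensity), the handling of a trailing group of fewer than 5 genes, and the `frac` value of 0.75.

```python
from statistics import median

def ranks(values):
    """Rank of each value within one exposure (0 = lowest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order):
        out[i] = rank
    return out

def stable_subset(log_x, group_size=5):
    """log_x[e][g] is the log10 intensity of gene g in exposure e.
    Returns (per-gene baseline, indices of the stable genes)."""
    n = len(log_x[0])
    # Common baseline: per-gene median over all exposures.
    baseline = [median(row[g] for row in log_x) for g in range(n)]
    rank_rows = [ranks(row) for row in log_x]
    mad = []
    for g in range(n):
        r = [row[g] for row in rank_rows]
        mid = median(r)
        mad.append(median(abs(v - mid) for v in r))  # MAD of the ranks
    by_level = sorted(range(n), key=lambda g: baseline[g])
    stable = []
    # Assumption: a trailing partial group (n not divisible by 5) is skipped.
    for start in range(0, n - group_size + 1, group_size):
        group = by_level[start:start + group_size]
        stable.append(min(group, key=lambda g: mad[g]))
    return baseline, stable

def local_linear(x, y, x0, frac):
    """Tricube-weighted local linear fit of y on x, evaluated at x0."""
    k = min(len(x), max(2, round(frac * len(x))))
    # Bandwidth slightly beyond the k-th nearest point so it keeps weight > 0.
    h = 1.01 * sorted(abs(xi - x0) for xi in x)[k - 1] + 1e-12
    w = [max(0.0, 1 - (abs(xi - x0) / h) ** 3) ** 3 for xi in x]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    if sxx <= 1e-12 * sw:  # degenerate neighbourhood: use the local mean
        return ybar
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return ybar + sxy / sxx * (x0 - xbar)

def normalize(log_x, frac=0.75):
    """Map every exposure onto the baseline scale using stable genes only."""
    baseline, stable = stable_subset(log_x)
    y = [baseline[g] for g in stable]
    out = []
    for row in log_x:
        x = [row[g] for g in stable]
        out.append([local_linear(x, y, v, frac) for v in row])
    return out, stable

def combine(rows):
    """Per-gene median over the normalized exposures of one array."""
    return [median(col) for col in zip(*rows)]

# Toy data: 20 genes, three exposures of one array with different scales.
truth = [1.0 + 0.1 * g for g in range(20)]
exposures = [[0.9 * t - 0.3 for t in truth],
             list(truth),
             [1.1 * t + 0.2 for t in truth]]
exposures[0][7], exposures[2][7] = 0.0, 5.0  # gene 7 behaves erratically
normalized, stable = normalize(exposures)
combined = combine(normalized)
```

In this toy example the erratic gene 7 has a large rank MAD and is therefore never selected into the stable subset, so it does not influence the regression applied to the other genes.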