The C. acetobutylicum ATCC824 ChIP-on-chip array v1 includes over 20K 60-mer oligonucleotide probes, designed from the official genome sequence and located in the lower 50% or 500 bp (whichever is shorter) of each ORF. This is a proof-of-principle array and not a definitive design.
The programs Comm_Oligo (Li, He et al. 2005), ROSO (Reymond, Charles et al. 2004), YODA (Nordberg 2005), ArrayOligoSelector (Bozdech, Zhu et al. 2003), OligoWiz 2.0 (Wernersson and Nielsen 2005) and Picky (Chou, Hsia et al. 2004) were used to generate several 60-mers for each ORF of the Clostridium acetobutylicum ATCC824 chromosome and pSOL1 megaplasmid (Nölling, Breton et al. 2001). Whenever possible, the DNA sequences of the ribosomal RNAs, the tRNAs and the intergenic regions of the whole genome were used as a negative set (i.e. no match allowed). The maximum identity to any other sequence was set to 75-85%, and all other parameters were left at the program defaults. On average, 32 60-mers per ORF were generated.
Melting temperatures and DeltaG values for each generated oligomer and its complementary sequence were recalculated using Hybrid 2.5 (Markham and Zuker 2005) (included in (Rouillard, Zuker et al. 2003)). For each 60-mer, the best four non-specific matches against the Clostridium acetobutylicum ATCC824 genome were determined using FASTA (Pearson and Lipman 1988; Pearson 1990). The melting temperatures of the heterodimers formed by a 60-mer and the complementary sequence of each of its non-specific matches were also calculated. The difference between the melting temperature of the 60-mer with its own complement and that of each heterodimer was calculated, and the smallest of these differences (the minimal DeltaT) was recorded. The 60-mers targeting each ORF were then ranked in descending order of this minimal DeltaT. Control features are automatically included on the array by Agilent and follow their naming convention.
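The DeltaT ranking described above can be illustrated with a minimal sketch. It is not the pipeline used for the array: the class and function names are hypothetical, the melting temperatures are assumed to have been computed beforehand (e.g. with Hybrid 2.5), and rank 1 is assumed to denote the 60-mer with the largest minimal DeltaT.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    """One candidate 60-mer for an ORF (hypothetical structure for illustration)."""
    name: str                  # arbitrary label for the candidate
    tm_perfect: float          # Tm of the 60-mer with its exact complement (deg C)
    tm_offtarget: List[float]  # Tm of heterodimers with the best non-specific matches


def min_delta_t(c: Candidate) -> float:
    """Smallest gap between the perfect-match Tm and any heterodimer Tm."""
    return min(c.tm_perfect - t for t in c.tm_offtarget)


def rank_candidates(cands: List[Candidate]) -> List[Tuple[int, str, float]]:
    """Return (rank, name, minimal DeltaT) tuples, sorted by descending minimal DeltaT.

    Assumption: rank 1 is the candidate with the largest minimal DeltaT.
    """
    ordered = sorted(cands, key=min_delta_t, reverse=True)
    return [(i, c.name, min_delta_t(c)) for i, c in enumerate(ordered, start=1)]
```

A 60-mer whose closest off-target heterodimer melts far below its perfect duplex gets a large minimal DeltaT and therefore sorts first; a single close off-target match is enough to push a candidate down the list, because only the smallest difference is kept.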
The name of each custom 60-mer is composed of the ORF name (CACXXXX or CAPXXXX), the 60-mer number (1, 2 or 3), a character (a, b or c) indicating whether it is the first (a), second (b) or third (c) occurrence of this specific 60-mer, and a two-letter code (Ch, Co or Tr). A Ch 60-mer (shorthand for ChIP-on-chip) is a 60-mer located in the lower 50% or 500 bp (whichever is shorter) of the target ORF that has a rank of four (4) or greater. A Co 60-mer (shorthand for Common) is a 60-mer located in the lower 50% or 500 bp (whichever is shorter) of the target ORF that has a rank of three (3) or smaller. A Tr 60-mer (shorthand for Transcriptional) is any 60-mer that does not meet the requirements of a Ch or Co 60-mer regarding location and/or rank.
Orientation: Features are numbered Left-to-Right, Top-to-Bottom as scanned by an Agilent scanner (barcode on the left, DNA on the back surface, scanned through the glass), matching the FeatureNum output from Agilent's Feature Extraction software. The ID column represents the Agilent Feature Extraction feature number. Rows and columns are numbered as scanned by an Axon Scanner (barcode on the bottom, DNA on the front surface).
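The Ch/Co/Tr assignment can be sketched as a small decision rule. This is an illustration under stated assumptions, not the code used to build the array: the function names are hypothetical, the Co threshold is read as rank 3 or smaller so that the Ch and Co classes do not overlap, half of an odd-length ORF is rounded down, the probe offset is measured from the ORF end that the text calls "lower", and a probe counts as located in the window only if it lies entirely inside it.

```python
def target_region_length(orf_length: int) -> int:
    """Length (bp) of the 'lower 50% or 500 bp, whichever is shorter' window.

    Assumption: half of an odd-length ORF is rounded down.
    """
    return min(orf_length // 2, 500)


def classify(orf_length: int, probe_start: int, rank: int,
             probe_length: int = 60) -> str:
    """Return the two-letter code 'Ch', 'Co' or 'Tr' for a 60-mer.

    probe_start is a 0-based offset from the 'lower' end of the ORF
    (assumption); rank is the position in the minimal-DeltaT ranking.
    """
    # Assumption: the whole probe must fit inside the target window.
    in_region = probe_start + probe_length <= target_region_length(orf_length)
    if not in_region:
        return "Tr"  # fails the location requirement
    return "Ch" if rank >= 4 else "Co"
```

For a 1200 bp ORF the window is 500 bp long, so a probe starting at offset 0 falls inside it and is labeled Ch or Co depending on its rank, while a probe starting at offset 600 is labeled Tr whatever its rank.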
Identification information for the custom 60-mers and control features.
ORF
The ORF targeted by each 60-mer according to the NC_003030.1 (chromosome) and NC_001988.1 (pSOL1 plasmid) sequences and the original annotation files provided by Genome Therapeutics Corp. For ORFs that have been deleted in successive versions of the genome annotation, the GI entry has been set to N/A and the GeneID is appended at the end of the annotation entry.
SPOT_ID
ANNOTATION
The original annotation provided by Genome Therapeutics Corp.