Series GSE135464
Status Public on Dec 01, 2019
Title Model-driven generation of artificial yeast promoters
Organism synthetic construct
Experiment type Other
Summary Promoters play a central role in controlling gene regulation; however, a small set of promoters is used for most genetic construct design in the yeast Saccharomyces cerevisiae. The ability to generate and utilize models that accurately predict protein expression from promoter sequence may enable rapid generation of novel useful promoters, facilitating synthetic biology efforts in this model organism. We measured the activity of over 675,000 unique sequences in a constitutive promoter library, and over 327,000 sequences in a library of inducible promoters. Training an ensemble of convolutional neural networks jointly on the two datasets enabled very high (R² > 0.79) predictive accuracies on multiple prediction tasks. We developed model-guided design strategies which yielded large, sequence-diverse sets of novel promoters exhibiting activities similar to current best-in-class sequences. In addition to providing large sets of new promoters, our results show the value of model-guided design as an approach for generating DNA parts.
 
Overall design Promoter activity was measured using a “FACS-seq” reporter-based assay. Libraries of yeast cells harboring a plasmid in which mCherry expression was driven by PTEF1 (as a control for expression noise) and GFP expression was driven by a member of a sequence library were grown in synthetic complete media containing 2% dextrose and lacking uracil, and were FACS-sorted on the basis of the ratio of GFP to mCherry expression. Plasmid DNA was extracted from each bin, and bin-specific barcodes were applied by PCR. PCR amplicons were pooled and sequenced to derive read counts for each sequence in each bin; these data were used to extract quantitative estimates of promoter activity for each sequence. This was first done for a library of constitutive promoters based on natural pGPD, then for a library of beta-estradiol-inducible promoters based on the pZEV system (McIsaac et al. 2014, PMID 24445804). Neural network models of promoter activity were then trained on results from the first two libraries and leveraged to generate sets of novel promoters designed to fulfill a variety of objectives. These designs, and control promoters from the first two libraries, were assayed in the third experiment. In the first two experiments, samples were sequenced on Illumina MiSeq (2x300) and NextSeq (1x75) platforms. In the third, only MiSeq was used. In the GPD experiment, 2 replicates of 12 bins each were collected; in the following experiments, 12 bins each in the presence and absence of 1 µM beta-estradiol inducer were collected. In some experiments, aliquots of the original library were also prepared for sequencing.
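The step from per-bin read counts to a per-sequence activity estimate can be illustrated with a minimal sketch. This record does not describe the estimator the submitters actually used, so the approach below (a depth-normalized, read-weighted mean bin index) is an assumption chosen for simplicity, and the names `estimate_activity`, `counts`, and `min_reads` are hypothetical.

```python
def estimate_activity(counts, min_reads=10):
    """Estimate activity as a depth-normalized, read-weighted mean bin index.

    counts: dict mapping sequence -> list of raw read counts, one per sort bin,
            with bins ordered from lowest to highest GFP/mCherry ratio.
    min_reads: sequences with fewer total reads are skipped (assumed cutoff).
    """
    n_bins = len(next(iter(counts.values())))
    # Total reads per bin, used to correct for unequal sequencing depth.
    bin_totals = [sum(c[b] for c in counts.values()) for b in range(n_bins)]

    activity = {}
    for seq, c in counts.items():
        if sum(c) < min_reads:
            continue
        # Fraction of each bin's reads that belong to this sequence.
        freqs = [c[b] / bin_totals[b] if bin_totals[b] else 0.0
                 for b in range(n_bins)]
        total = sum(freqs)
        if total == 0:
            continue
        # Weighted mean of bin indices (0 = lowest bin).
        activity[seq] = sum(b * f for b, f in enumerate(freqs)) / total
    return activity


if __name__ == "__main__":
    toy = {"seqA": [10, 0, 0], "seqB": [0, 0, 10], "seqC": [5, 0, 5]}
    print(estimate_activity(toy))
```

With 12 bins, as in this series, the result would range from 0 to 11. Mapping bin indices to actual fluorescence ratios would require the sort gate boundaries, which are not given in this record.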
 
Contributor(s) Kotopka BJ, Smolke CD
Citation(s) 32355169
Submission date Aug 06, 2019
Last update date May 04, 2020
Contact name Christina Smolke
Organization name Stanford University
Department Bioengineering
Lab Smolke lab
Street address 443 Via Ortega
City Stanford
State/province CA
ZIP/Postal code 94305
Country USA
 
Platforms (2)
GPL17769 Illumina MiSeq (synthetic construct)
GPL19424 Illumina NextSeq 500 (synthetic construct)
Samples (102)
GSM4010730 GPD_miseq_BK724
GSM4010731 GPD_miseq_BK725
GSM4010732 GPD_miseq_BK726
Relations
BioProject PRJNA558976
SRA SRP217548

Download family                      Format
SOFT formatted family file(s)        SOFT
MINiML formatted family file(s)      MINiML
Series Matrix File(s)                TXT

Supplementary file                                Size      Download      File type/resource
GSE135464_filtered_read_table_miseq_GPD.txt.gz    77.1 Mb   (ftp)(http)   TXT
GSE135464_filtered_read_table_miseq_ZEV.txt.gz    39.0 Mb   (ftp)(http)   TXT
GSE135464_final_means_ids_added.csv.gz            5.1 Mb    (ftp)(http)   CSV
GSE135464_means_nextseq_GPD.csv.gz                24.6 Mb   (ftp)(http)   CSV
GSE135464_means_nextseq_ZEV.csv.gz                6.8 Mb    (ftp)(http)   CSV
Raw data are available in SRA
Processed data are available on Series record
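
The supplementary files listed above can be retrieved programmatically. The sketch below assumes the usual GEO FTP layout, in which a series directory is named by replacing the last three digits of the accession with "nnn"; the helper names `geo_suppl_url` and `peek_gz_table` are hypothetical, and the column layout of each file is not documented in this record, so the second helper only returns the first few rows for inspection.

```python
import csv
import gzip


def geo_suppl_url(accession, filename):
    """Build the assumed HTTPS URL of a GEO series supplementary file."""
    digits = accession[3:]
    if not accession.startswith("GSE") or not digits.isdigit():
        raise ValueError(f"not a GEO series accession: {accession!r}")
    # Assumed directory convention: GSE135464 lives under GSE135nnn.
    stem = "GSE" + digits[:-3] + "nnn"
    return (f"https://ftp.ncbi.nlm.nih.gov/geo/series/"
            f"{stem}/{accession}/suppl/{filename}")


def peek_gz_table(path, n_rows=5, delimiter=","):
    """Return the first n_rows rows of a local gzip-compressed delimited file."""
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        return [row for _, row in zip(range(n_rows), reader)]


if __name__ == "__main__":
    print(geo_suppl_url("GSE135464", "GSE135464_final_means_ids_added.csv.gz"))
```

After downloading a file with any HTTP client, `peek_gz_table("GSE135464_final_means_ids_added.csv.gz")` shows its header and first rows. The `.txt.gz` read tables may use a different delimiter, so `delimiter` is left as a parameter.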
