Sample GSM7838351
Status Public on Oct 17, 2023
Title mEB_scQer_scv2_mBC_rep1_S4
Sample type SRA
 
Source name N/A
Organism Mus musculus
Characteristics tissue: N/A
cell line: WD44 (mouse embryonic stem cells)
cell type: Mouse embryoid bodies
genotype: monoclonal CRISPRi (dCAS9-KRAB) cell line, high MOI (polyclonal) single-cell reporter cassette (89% devCRE library + 10% promoter series + 1% EEF1A1-mCherry) delivered by piggyBac
time: Day 23
Growth protocol 7 scQer libraries were pooled in the following proportions (pluripotent up::endoderm down high 13%, pluripotent up::endoderm down low 13%, endoderm up::pluripotent down high 13%, endoderm up::pluripotent down low 13%, mutated CREs 33%, literature-selected CREs 12%, exogenous promoters 3%) prior to transfection in mESCs (biological triplicates, 2M cells per transfection, Lipofectamine 2000). Puromycin selection (2 ug/mL) was applied following post-transfection cell recovery (day 3), and mEB induction (4 plates of EBs per replicate, one set of mEBs per transfection biological replicate) was initiated on day 10 post-transfection, after dilution of unintegrated plasmid. On day 23 post mEB induction, 2 plates' worth of mEB cells were processed and sequencing libraries constructed. EB induction: exponentially growing mESCs are lifted from the plate (aspirate medium, wash with PBS, add 2.5 mL [for a 10 cm plate] 0.05% trypsin, incubate 2 minutes at 37C, deactivate trypsin and triturate to a single-cell suspension with 10 mL pre-warmed SL medium). Cells are then counted and spun down (5 min at 300 g). Supernatant is aspirated and cells are resuspended to 2 M/mL in CA medium (medium for EB induction: DMEM, 10% FBS, 1x MEM non-essential amino acids, 1x Glutamax, 10-5 beta-mercaptoethanol). Cells are counted again, and the density is adjusted to 1 M/mL with CA medium. 3 mL (3 M cells) are added to 12 mL of CA medium in 10 cm plates (non-gelatinized, non-adherent). On the next day, plates are gently agitated to promote cell aggregation. Following induction, embryoid bodies (mEBs) are passaged every two days (no daily medium change). mEBs are collected using a serological pipette and transferred to a 50 mL conical tube (typically three plates are pooled). Leftover mEBs on plates are recovered with a CA medium wash and pooled in the conical tube. mEBs are left to settle (initially up to 15-20 min, faster as the mEBs grow in size).
Once mEBs have settled, medium is aspirated from the top, carefully avoiding disturbing the loose pellet. Fresh, pre-warmed, CA medium is then added to 15 mL/plate and mEBs redistributed to plates.
Extracted molecule total RNA
Extraction protocol mEBs were processed at the three-week end point as follows (for each replicate): 2 plates of mEBs were pooled into a 50 mL conical tube and left to settle. Medium was aspirated and mEBs were washed twice with PBS, resuspended in 3 mL PBS in the second wash, and split into two 1.5 mL aliquots in 2 mL tubes. PBS was aspirated from the tubes, and 500 uL of 0.25% trypsin was added per tube. Tubes were then mixed on a thermomixer at 37C and 650 rpm for 4 minutes. Cells were then gently dissociated by pipetting up and down 10 times, and placed back on the thermomixer for 2 min. 1 mL of SL medium was then added per sample and pipetted to obtain a single-cell suspension; the two samples were combined in a 15 mL conical tube and passed through a 100 um strainer. The strained single-cell suspension was counted, and cells were spun down (300 g, 5 min), resuspended to 4 M/mL, and taken to FACS. >600k cells were then FACS sorted (in <50 min) into pre-warmed SL medium to ensure the single-cell nature of the suspension (no gating on fluorescent proteins) prior to generating the emulsions for single-cell RNA-seq. Sorted cells were then spun down at 400 g at 4C for 5 min, the medium was gently aspirated, and cells were resuspended to an expected 2.5 M cells/mL (based on FACS sort event counts) in ice-cold PBS + 0.04% BSA. Cells were counted and the volume adjusted to 1200 k/uL with ice-cold PBS+BSA. Single-cell suspensions in PBS+BSA were taken as the starting point for the 10x Genomics protocol (v3.1 with feature barcoding). Emulsion and reverse transcription were performed per the manufacturer’s instructions. Given prior empirical experience with mEB processing, each 10x lane was slightly overloaded (by an additional 20%) to approach the expected recovery of 10k cells/lane. Each replicate was profiled with 1 lane of 10x, for a total of 3 lanes.
For single-cell reporters, three libraries are generated: the standard 3’ gene expression library from 10x (GEx), and two custom derived libraries, one for each reporter RNA (oBC and mBC), obtained by nested PCRs from the amplified cDNA. Briefly, single-cell library preparation followed the manufacturer's protocol (v3.1 manual CG000205 Rev D, 10x Genomics), with some modifications. For cDNA amplification, primers specific to the mBC (oSR38) and oBC (o246) reporter transcripts were spiked into the reaction (similar to TAP-seq) at a final concentration of 0.5 uM to boost capture. Following cDNA amplification, both the bead- and supernatant-derived material (steps 2.3Ax and 2.3Bxiv respectively) were saved for downstream processing. Gene expression libraries for all replicates were prepared following the manufacturer’s protocol from 25% of the bead-fraction amplified cDNA. oBC-enriched libraries were prepared as follows. 25% of the supernatant fraction of the amplified cDNA was taken as input for a semi-nested inner PCR performed with Kapa Robust (50 uL 2x master mix, supernatant cDNA, 5 μL 10 μM NextP5_index1 primer, 5 μL 10 μM indexed primers [o501-o506, one per sample], 0.5 uL SYBR Green, and water to 100 μL; run parameters: 3 min at 95C, and cycles of 20 s at 95C, 20 s at 60C, 20 s at 72C), tracked by qPCR and stopped before the inflection point. Libraries were purified by 1.5x AMPure. To avoid loop-the-loop products in the oBC libraries, the lowest band in the circularized ladder amplicons was size selected on PAGE (6% TBE, 180V, 30 min) for each library and used for sequencing. Only the poly-dT captured libraries were generated for the mBC for the mouse embryoid body experiment.
25% of the bead fraction of the purified amplified cDNA was used as template for PCR with Kapa Robust (50 uL 2x master mix, supernatant cDNA, 5 μL 10 μM o324 primer, 5 μL 10 μM o529, 0.5 uL SYBR Green, and water to 100 μL; run parameters: 3 min at 95C, and cycles of 20 s at 95C, 20 s at 65C, 50 s at 72C), with tracking by qPCR and purification by 1x AMPure. A final PCR (same conditions as above) was performed to index amplicons with primers o076 and P7-indexed primers (o530-o533), and the resulting amplicons were purified by 1x AMPure.
Read structures:
GEx: read 1, cell-barcode+UMI (28 cycles); index 1, library index (10 cycles); read 2, transcriptome (90 cycles)
oBC: read 1, cell-barcode+UMI (28 cycles, no custom primers); index 1, library index (10 cycles, primer o432); read 2, oBC (90 cycles, primer o433)
mBC: read 1, cell-barcode+UMI (28 cycles, no custom primers); index 1, library index (10 cycles, primer o534); read 2, mBC (90 cycles, primer o334)
GEx (reseq): read 1, cell-barcode+UMI (34 cycles); index 1, library index (15 cycles); read 2, transcriptome (79 cycles)
scRNA-seq (with custom reporter libraries)
 
Library strategy RNA-Seq
Library source transcriptomic single cell
Library selection cDNA
Instrument model Illumina NextSeq 500
 
Description 10X Genomics (custom mBC library)
Data processing GEx libraries: Fastq files were generated using the mkfastq command from cellranger (v6.0.1), and the gene expression count matrices were then generated with the cellranger count command, with transcriptome reference mm10-3.0.0. Raw count matrices were then imported as a Seurat object (filtering out genes expressed in fewer than 3 cells, and cell barcodes with fewer than 50 genes measured). Cell barcodes in the high total UMI mode with low mitochondrial RNA proportion were retained as likely bona fide cells (fraction of mitochondrial UMI >1% and <12.5%, total gene expression UMI > 450). The filtered count matrices were then used to evaluate doublet scores using scrublet (scrub_doublets command, 30 principal components, mean_center=true, normalize_variance=true), and cell barcodes with doublet score > 0.3 (separating the two modes of the simulated doublet distribution from scrublet) were filtered out. Datasets from all replicates were then combined in a single Seurat object, dimensionally reduced and clustered (NormalizeData, normalization.method= “LogNormalize”, scale.factor=10000; FindVariableFeatures with selection.method = “vst”, nfeatures=1000; ScaleData with all genes as features; RunPCA with identified variable features and 100 principal components; FindNeighbors, dims=1:50; FindClusters, resolution=0.2; RunUMAP, dims=1:50, n.neighbors=50) without batch correction, given the good correspondence between replicates. The following additional quality filtering steps were applied to retain high-confidence singlet cells. Clusters comprising less than 1% of cells were considered likely doublets/artifacts, and the corresponding cells were removed. Cells belonging to each identified cluster were separately sub-clustered with the same procedure as above (except resolution 0.5 in FindClusters). Any sub-cluster with a median doublet score above 0.15 was deemed composed of likely doublets, and the corresponding cells were removed.
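The per-barcode thresholds quoted above (mitochondrial UMI fraction, total UMI, doublet score) can be illustrated with a minimal sketch. This is not the submitters' code, which used Seurat and scrublet: the function name `passes_gex_qc` and its argument names are hypothetical, and the doublet scores are assumed to have been computed beforehand.

```python
import numpy as np

def passes_gex_qc(total_umi, mito_umi, doublet_score,
                  min_umi=450, mito_lo=0.01, mito_hi=0.125, max_doublet=0.3):
    """Boolean mask of cell barcodes passing the thresholds quoted in the record.

    Hypothetical helper: the inputs are per-barcode arrays of equal length.
    """
    total_umi = np.asarray(total_umi, dtype=float)
    mito_umi = np.asarray(mito_umi, dtype=float)
    doublet_score = np.asarray(doublet_score, dtype=float)
    # Mitochondrial fraction; barcodes with zero UMIs get a fraction of 0.
    mito_frac = np.divide(mito_umi, total_umi,
                          out=np.zeros_like(total_umi), where=total_umi > 0)
    return ((total_umi > min_umi)
            & (mito_frac > mito_lo) & (mito_frac < mito_hi)
            & (doublet_score <= max_doublet))

# Made-up values for four barcodes.
mask = passes_gex_qc(total_umi=[1000, 400, 2000, 3000],
                     mito_umi=[50, 20, 10, 150],
                     doublet_score=[0.1, 0.1, 0.1, 0.5])
```

In the record the UMI and mitochondrial filters are applied first and scrublet is run on the filtered matrices; the sketch combines the three thresholds in one mask only for brevity.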
Cells with anomalously high gene expression UMI counts (>8k) were removed. Finally, cells with an estimated MOI > 110 (roughly corresponding to the top 0.1% of the distribution, MOI estimated through oBC UMI > 10) were filtered out. In the end, n=43799 cells passed all these quality filters (6124 replicate 1, 6442 replicate 2, 6442 replicate 3). mBC libraries: Data were converted to fastq using bcl2fastq, and fastqs were minimally processed (e.g., trimming read 1 to 28 cycles) to be compatible with cellranger (version 6.0.1, 10x Genomics), which was run to perform error correction on cell barcodes. The resulting position-sorted bam files were then parsed for the mBC reads as follows using a custom python script. Reads aligning to the reference genome, or lacking either a corrected cell barcode or UMI (tags CB and UB in the bam file), were discarded. Only reads with the exact expected 7 nt sequence (TCGACAA) downstream of the mBC (positions 16 to 22) were retained. The list of all UMIs corresponding to each cell barcode and mBC pair was stored, discarding chimeric UMIs (taken to be UMIs for which the proportion of reads associated with a given mBC vs all other mBCs in the specified cell barcode falls below 0.2). mBCs consisting entirely of Gs (empty reads) were discarded. Finally, the UMI count was error corrected as follows. For each given mBC and cell barcode, the Hamming distance between all UMIs was calculated, a graph was created by connecting UMIs at a Hamming distance ≤ 1, and the resulting number of connected components in the graph was taken as the error-corrected UMI count for a given cell barcode-mBC pair. These error-corrected UMI counts were used for the per single-cell quantification of the reporter mRNA expression.
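The UMI error-correction step described above (connect UMIs at Hamming distance ≤ 1, count connected components) can be sketched as follows. This is an illustration under stated assumptions, not the submitters' custom script: `corrected_umi_count` is a hypothetical name, and the input is assumed to be the list of UMIs already observed for one cell barcode-mBC pair.

```python
from itertools import combinations

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def corrected_umi_count(umis, max_dist=1):
    """Count connected components of the graph linking UMIs within max_dist."""
    umis = sorted(set(umis))
    parent = {u: u for u in umis}  # union-find forest, one node per unique UMI

    def find(u):
        # Follow parent pointers to the component root, compressing the path.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in combinations(umis, 2):
        if len(a) == len(b) and hamming(a, b) <= max_dist:
            parent[find(a)] = find(b)  # merge the two components
    return len({find(u) for u in umis})

# AAAA-AAAT and AAAT-AATT are each one mismatch apart, so the first three
# UMIs form a single component; GGGG stands alone.
n = corrected_umi_count(["AAAA", "AAAT", "AATT", "GGGG", "AAAA"])
```

The pairwise comparison is quadratic in the number of unique UMIs, which is usually acceptable because the count per cell barcode-reporter pair is small.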
oBC libraries: processed in an entirely analogous way to the strategy for mBC, with the following modifications: two sequencing runs were combined in a single fastq prior to processing, read 2 was trimmed to 23 cycles, and only reads with GCTTTAA (constant region after the oBC) at positions 17 to 23 were retained. The number of UMIs per oBC per cell barcode was also taken as the error-corrected (1 Hamming distance) count and as our measure of oBC expression in single cells (see below for a normalization strategy to correct for gene expression UMIs). Given that cell barcodes derived from the capture sequence vs. the poly-dT reverse transcription primer are different (bases 8 and 9 reverse complemented) on the same bead (and not error corrected by cellranger in our application), we converted the CS1 cell barcode to its poly-dT counterpart to enable matching across the different libraries. Only cell barcodes passing the QC filters from the GEx analysis were retained in the final count tables. Only mBCs and oBCs present in the subassembly library were retained.
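Two of the string operations described above, the constant-region check on read 2 and the conversion between capture-sequence and poly-dT cell barcodes, can be sketched as below. The function names are hypothetical. The record says bases 8 and 9 are "reverse complemented"; the sketch reads this literally as the reverse complement of that 2-mer, and offers `reverse=False` to complement the two bases in place instead. Which variant matches the 10x barcode whitelist is an assumption to verify against that file.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def has_constant(read, const, start):
    """True if `const` occurs in `read` beginning at 1-based position `start`."""
    return read[start - 1:start - 1 + len(const)] == const

def convert_cell_barcode(cb, start=8, end=9, reverse=True):
    """Swap a barcode between its capture-sequence and poly-dT forms.

    Positions are 1-based and inclusive. Both variants are their own inverse,
    so the same call converts in either direction.
    """
    i, j = start - 1, end
    segment = cb[i:j].translate(_COMPLEMENT)
    if reverse:
        segment = segment[::-1]
    return cb[:i] + segment + cb[j:]

# Made-up reads: 15 nt of mBC followed by TCGACAA, 16 nt of oBC then GCTTTAA.
mbc_ok = has_constant("A" * 15 + "TCGACAA", "TCGACAA", start=16)
obc_ok = has_constant("A" * 16 + "GCTTTAA", "GCTTTAA", start=17)
converted = convert_cell_barcode("AAACCCAAGAAACACT")
```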
Assembly: mm10
Supplementary files format and content: The four processed files are:
GEx_obj_sc_rep_mEB_series_v2: Seurat processed object containing transcriptome quantification and cell annotation metadata (combined for all replicates).
oBC_counts_sc_rep_mEB_v2: Raw oBC count table (per cell barcode, restricted to oBC from the list determined in the subassembly and UMI counts > 1). Column 1: cell barcode; column 2: replicate ID; column 3: oBC; column 4: read counts.
mBC_poly_dT_counts_sc_rep_mEB_v2: Raw mBC count table (per cell barcode, restricted to mBC from the list determined in the subassembly). Column 1: cell barcode; column 2: replicate ID; column 3: mBC; column 4: read counts; column 5: UMI counts; capture modality.
assigned_oBC_CRE_mBC_joined_counts_sc_rep_mEB_series_v2: Final assigned joined cell-oBC-CRE-mBC table, restricted to oBC with >10 UMI counts and to uniquely matchable oBC-CRE-mBC triplets. Column 1: cell barcode; column 2: replicate ID; column 3: oBC; column 4: mBC; column 5: CRE class (promoters [exogenous series] or devCRE); column 6: CRE right identity; column 7: CRE right orientation; column 8: CRE left identity; column 9: CRE left orientation; column 10: read counts oBC; column 11: UMI counts oBC; column 12: read counts mBC; column 13: UMI counts mBC.
Note: all but the pairwise CREs will have NA for columns 8 and 9 (CRE left).
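As an illustration of the 13-column layout of the joined table, the following sketch parses rows into dictionaries. The record does not state the file's delimiter or whether it has a header line, so a tab-delimited, header-less file is assumed here; the column names are descriptive labels invented for this example, and the sample row is made up.

```python
import csv
import io

# Labels follow the column order given in the record; the names are hypothetical.
COLUMNS = [
    "cell_barcode", "replicate_id", "oBC", "mBC", "CRE_class",
    "CRE_right_identity", "CRE_right_orientation",
    "CRE_left_identity", "CRE_left_orientation",
    "read_counts_oBC", "UMI_counts_oBC", "read_counts_mBC", "UMI_counts_mBC",
]
COUNT_COLUMNS = COLUMNS[9:]

def read_joined_table(handle, delimiter="\t"):
    """Parse the joined cell-oBC-CRE-mBC table into a list of dicts."""
    rows = []
    for fields in csv.reader(handle, delimiter=delimiter):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
        row = dict(zip(COLUMNS, fields))
        for name in COUNT_COLUMNS:
            row[name] = int(row[name])
        # Non-pairwise CREs carry NA in the two "CRE left" columns.
        for name in ("CRE_left_identity", "CRE_left_orientation"):
            if row[name] == "NA":
                row[name] = None
        rows.append(row)
    return rows

# Made-up example row, not taken from the dataset.
example = "AAACCCAAGAAACACT\trep1\tOBC1\tMBC1\tdevCRE\tCRE_A\tfwd\tNA\tNA\t40\t12\t9\t3\n"
rows = read_joined_table(io.StringIO(example))
```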
 
Submission date Oct 12, 2023
Last update date Oct 17, 2023
Contact name Jean-Benoit Lalanne
E-mail(s) [email protected]
Organization name University of Washington
Department Genome Sciences
Lab Jay Shendure
Street address 3720 15th Ave NE
City Seattle
State/province WA
ZIP/Postal code 98195
Country USA
 
Platform ID GPL19057
Series (2)
GSE217690 Multiplex profiling of developmental enhancers with quantitative, single-cell expression reporters
GSE245189 Multiplex profiling of developmental enhancers with quantitative, single-cell expression reporters [scQer_mEB, v2]
Relations
BioSample SAMN37799148
SRA SRX22079287

Supplementary data files not provided
Raw data are available in SRA
