Massively Parallel Reporter Assays in Cultured Mammalian Cells

Published: August 17, 2014
doi:10.3791/51719

Summary

The genetic reporter assay is a well-established and powerful tool for dissecting the relationship between DNA sequences and their gene regulatory activities. Coupling candidate regulatory elements to reporter genes that carry identifying sequence tags enables massive parallelization of these assays.

Abstract

The genetic reporter assay is a well-established and powerful tool for dissecting the relationship between DNA sequences and their gene regulatory activities. The potential throughput of this assay has, however, been limited by the need to individually clone and assay the activity of each sequence of interest using protein fluorescence or enzymatic activity as a proxy for regulatory activity. Advances in high-throughput DNA synthesis and sequencing technologies have recently made it possible to overcome these limitations by multiplexing the construction and interrogation of large libraries of reporter constructs. This protocol describes implementation of a Massively Parallel Reporter Assay (MPRA) that allows direct comparison of hundreds of thousands of putative regulatory sequences in a single cell culture dish.

Introduction

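Massively parallel reporter assays (MPRA) allow multiplexed measurement of the transcriptional regulatory activities of thousands to hundreds of thousands of DNA sequences 1-7. In their most common implementation, multiplexing is achieved by coupling each sequence of interest to a synthetic reporter gene that contains an identifying sequence tag downstream of an open reading frame (ORF; Figure 1). Following transfection, RNA isolation and deep sequencing of the 3' ends of the reporter gene transcripts, the relative activities of the coupled sequences can be inferred from the relative abundances of their identifying tags.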

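Figure 1. Overview of MPRA. A library of MPRA reporter constructs is built by coupling putative regulatory sequences to a synthetic reporter gene that consists of an "inert" ORF (such as GFP or luciferase) followed by an identifying sequence tag. The library is transfected en masse into a population of cultured cells and the transcribed reporter mRNAs are subsequently recovered. Deep sequencing is used to count the occurrences of each tag among the reporter mRNAs and the transfected plasmids. The ratio of the mRNA counts over the plasmid counts can be used to infer the activity of the corresponding regulatory sequence. Adapted with permission from Melnikov, et al. 2.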

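MPRA can be adapted to a wide variety of experimental designs, including 1) comprehensive mutagenesis of individual gene regulatory elements, 2) scanning for novel regulatory elements across a locus of interest, 3) testing the effect of natural genetic variation in a set of putative promoters, enhancers or silencers, and 4) semi-rational engineering of synthetic regulatory elements. Libraries of sequence variants can be generated using a variety of methods, including oligonucleotide library synthesis (OLS) on programmable microarrays 2,3,6,7, assembly of degenerate oligonucleotides 1,4, combinatorial ligation 8 and fragmentation of genomic DNA 5.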

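This protocol describes construction of a library of promoter variants using OLS and the pMPRA1 and pMPRAdonor1 vectors (Addgene IDs 49349 and 49352, respectively; http://www.addgene.org), transient transfection of this library into cultured mammalian cells and subsequent quantification of the promoter activities by deep sequencing of their associated tags (Tag-Seq). Earlier versions of this protocol were used in the research reported in Melnikov, et al. Nature Biotechnology 30, 271-277 (2012) and Kheradpour, et al. Genome Research 23, 800-811 (2013).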

Protocol

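1. Sequence Design and Synthesis

1.1. Begin MPRA with the design and synthesis of the sequences to be assayed for regulatory activity. For compatibility with the pMPRA vector series, design each sequence using the following template: 5'-ACTGGCCGCTTCACTG-var-GGTACCTCTAGA-tag-AGATCGGAAGAGCGTCG-3', where var denotes the sequence to be assayed and tag denotes one or more identifying tags.
1.2. The two variable regions (var and tag) are separated by a pair of KpnI (GGTACC) and XbaI (TCTAGA) restriction sites to facilitate directional ligation of a reporter gene fragment between them. In addition, two distinct SfiI (GGCCNNNNNGGCC) sites will be added by PCR to allow directional ligation into the pMPRA backbone. The variable regions must not contain additional copies of any of these restriction sites. If necessary, one or more of these restriction enzymes can be substituted, as long as the replacement 1) cuts efficiently near the ends of DNA molecules, and 2) does not cut elsewhere in the vector sequence. See the Discussion for additional details.
1.3. Obtain the required primers, as listed in Table 1, from a commercial vendor.

| Primer | Sequence (5'-3') |
| --- | --- |
| MPRA_SfiI_F | GCTAAGGGCCTAACTGGCCGCTTCACTG |
| MPRA_SfiI_R | GTTTAAGGCCTCCGTGGCCGACGCTCTTC |
| TagSeq_P1 | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT |
| TagSeq_P2 | CAAGCAGAAGACGGCATACGAGAT[index]GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCGAGG |

Table 1. Primer sequences. [index] denotes a 6 to 8 nucleotide index sequence used for multiplexed sequencing. Obtain at least 8 TagSeq_P2 primers with different indices. All primers should be HPLC or PAGE purified.

1.4. Oligonucleotide libraries can be obtained from a commercial vendor or core facility, or generated using a programmable microarray-based oligonucleotide synthesizer as described by the manufacturer. Resuspend the OLS product in 100 µl TE 0.1 buffer.
1.5. Run the oligonucleotide library on a 10% TBE-urea denaturing polyacrylamide gel. Load approximately 1 pmol per lane and stain with a single-stranded DNA-sensitive fluorescent stain to assess its quality (Figure 2A).
1.6. If a discrete band corresponding to the full-length oligonucleotides can be visualized (Figure 2A, lanes 1 and 2), excise the corresponding acrylamide gel slice and elute the oligonucleotides into 100 µl TE 0.1 overnight with shaking at room temperature. Otherwise, proceed with the raw OLS suspension.
1.7. Amplify the oligonucleotide library and add SfiI tails using emulsion PCR 9.
1.8. Set up a 50 µl PCR reaction mix as described in Table 2.

| Reagent | 1x volume (µl) |
| --- | --- |
| Herculase II Fusion DNA Polymerase | 0.5 |
| 5x Herculase II reaction buffer | 10 |
| dNTPs (10 mM each) | 1.25 |
| BSA (20 mg/ml) | 1.25 |
| Primer MPRA_SfiI_F (25 µM) | 0.25 |
| Primer MPRA_SfiI_R (25 µM) | 0.25 |
| OLS template (1-10 amol) | varies |
| Nuclease-free water | to 50 |

Table 2. Emulsion PCR reaction mix (aqueous phase). Use the template concentration that gives the maximum yield of amplification products without the appearance of any artifacts (see step 1.12).

1.9. Combine the reaction mix with 300 µl of the oil-surfactant mixture from the Micellula DNA Emulsion & Purification Kit and vortex at 4 °C for 5 min.
1.10. Distribute the resulting emulsion into PCR tubes and perform 20 PCR cycles (2 min at 95 °C, 20 x [20 sec at 95 °C, 20 sec at 55 °C, 30 sec at 72 °C], 3 min at 95 °C).
1.11. Pool and break each 350 µl emulsion PCR reaction by adding 1 ml isobutanol and vortexing briefly, then purify the amplification products using a DNA purification column.
1.12. Run an aliquot on a 2% or 4% agarose gel to verify artifact-free amplification (Figure 2B).

Figure 2. Preparation of synthesized oligonucleotide libraries. A) Three different raw oligonucleotide library synthesis (OLS) products run on a denaturing 10% TBE-urea polyacrylamide gel. Bands corresponding to the full-length oligonucleotides (*) can be visualized in, and excised from, libraries 1 and 2. Library 3 contains contaminants that interfere with PAGE purification. If this is the case, proceed directly to emulsion PCR amplification. B) Products of open and emulsion PCR amplification of the same oligonucleotide library, run on an agarose gel. PCR amplification of complex oligonucleotide libraries often produces chimeric products and other artifacts that appear as higher and lower bands. Emulsion PCR can minimize these artifacts.

2. Library Construction

2.1. Prepare the linearized plasmid backbone by digesting the pMPRA1 vector with SfiI at 50 °C for 2 hr. Run the digest on a 1% agarose gel, excise the backbone band (2.5 kb in this case) and purify it using a gel purification spin column.
2.2. Digest the purified emulsion PCR product from section 1 with SfiI at 50 °C for 2 hr and purify it using a DNA purification column.
2.3. To ligate the promoter-tag library into the linearized vector backbone, set up a reaction containing 100 ng of the digested PCR product, 50 ng of the linearized vector backbone and T4 DNA ligase. Incubate overnight at 16 °C, then heat to 65 °C for 20 min to inactivate the ligase.
2.4. Transform E. coli with the ligation reaction. To preserve library complexity, aim to obtain at least 10-fold more CFU than there are distinct promoter-tag combinations in the designed library.
2.5. When the total number of tags is about 100,000, the target number of colonies can typically be reached by performing 6-8 parallel transformations per library.
2.6. For each transformation, combine 50 µl of electrocompetent E. coli cells with 2 µl of the ligation reaction mix on ice. Transform by electroporation, recover the cells in 800 µl SOC medium at 37 °C with shaking for 1 hr, then combine the cells from the parallel transformations.
2.7. To estimate the transformation efficiency, plate serial dilutions of an aliquot of the recovered cells on LB agar plates with 50-100 µg/ml carbenicillin and incubate overnight at 37 °C. Estimate the total CFU as (observed CFU) x (dilution factor) x (V - v)/v, where V is the total volume of the recovered cells and v is the volume of the aliquot taken for serial dilution.
2.8. In parallel, use the remainder of the recovered cells to inoculate 200 ml LB supplemented with 100 µg/ml carbenicillin. Grow the cells overnight at 37 °C in a shaking incubator, then isolate the plasmid DNA following standard procedures.
2.9. Digest an aliquot of the isolated plasmid library with SfiI at 50 °C for 2 hr and run it on a 1% agarose gel to verify the presence of inserts.
2.10. Linearize the plasmid library by cutting between the promoter variants and the tags with KpnI and XbaI. To maximize digestion efficiency, perform a serial digest as follows.
2.11. First, digest with KpnI at 37 °C for 1 hr and purify using magnetic beads. Second, digest with XbaI, adding 1 U shrimp alkaline phosphatase, at 37 °C for 2 hr, heat inactivate at 65 °C for 5 min, then purify using magnetic beads.
2.12. Run an aliquot on a 1% agarose gel to verify complete linearization. If uncut plasmid is visible, the linearized fragment should be gel purified.
2.13. To generate an MPRA library suitable for transfection into mammalian cells, ligate an ORF with KpnI/XbaI-compatible ends into the linearized intermediate library.
2.14. To prepare a compatible luc2 ORF fragment from pMPRAdonor1, digest this plasmid with KpnI and XbaI at 37 °C for 1 hr and run on a 1% agarose gel. Excise the ORF fragment (1.7 kb in this case) and purify it using a gel purification spin column.
2.15. Clone the ORF fragment into the linearized intermediate library as described in steps 2.3-2.8.
2.16. Digest an aliquot of the MPRA library (approximately 1-2 µg) with KpnI at 37 °C for 1 hr and run it on a 1% agarose gel. If the digested library runs as a single band, proceed to section 3. If additional bands are observed (typically corresponding to carry-over of the intermediate construct), the library can be further purified as follows.
2.17. Digest 3-5 µg of the contaminated library with KpnI at 37 °C for 1 hr and run on a 0.8% agarose gel overnight at 4 °C.
2.18. Excise the correct library band, purify the DNA using a gel purification spin column, and self-ligate using high-concentration T4 DNA ligase for 1 hr at 37 °C. Then repeat the transformation and final library DNA isolation as described in steps 2.4-2.8.

3. Transfection, Perturbation and RNA Isolation

3.1. For each independent transfection, culture the required number of cells (as determined by the complexity of the MPRA library; see Discussion) in an appropriate medium. For example, culture HEK293T/17 cells in DMEM supplemented with 10% FBS and L-glutamine/penicillin/streptomycin. Culture cells for at least two independent transfections for each library and experimental condition.
3.2. Transfect the cultured cells with the MPRA plasmids. The transfection method and conditions must be optimized for each cell type. For each transfected sample, retain an aliquot (50-100 ng) of the plasmid DNA as a matched control.
3.3. For example, transfect 0.5 x 10^7 HEK293T/17 cells grown to ~50% confluence in a 10 cm culture dish with 10 µg plasmid DNA in 1 ml Opti-MEM I Reduced Serum Medium, using 30 µl Lipofectamine LTX and 10 µl Plus reagent. Remove the transfection mix after 5 hr and allow the cells to recover for 24-48 hr.
3.4. Optionally, perform any perturbations required to activate context- or signal-dependent regulatory sequences in the designed library.
3.5. Harvest the cells and isolate poly(A)+ mRNA using standard oligo(dT) cellulose columns or magnetic beads according to their manufacturer's instructions.
3.6. Ensure that the maximum binding capacity of the column or beads exceeds the total amount of mRNA expected from the harvested cells. For example, the expected yield from step 3.3 is approximately 0.5-2.5 µg.

4. Tag-Seq

4.1. To eliminate carry-over of vector DNA from the transfected cell lysates, treat each 20 µl mRNA sample with 1 µl TURBO DNase (2 U) and 2.3 µl 10x TURBO DNase buffer at 37 °C for 1 hr. Add 2.4 µl TURBO DNase Inactivation Reagent and incubate at RT for 5 min with mixing, centrifuge at 10,000 x g for 90 sec, then transfer the solution to a fresh tube.
4.2. Verify purity by performing PCR on 60-100 ng of the mRNA sample as described in step 4.6, then run the product on an agarose gel. If specific amplification is visible (0.25 kb if using pMPRA1 with luc2), column purify the treated mRNA and repeat the DNase treatment.
4.3. To generate Tag-Seq sequencing libraries, convert the reporter mRNA to cDNA and add sequencing adapters by PCR.
4.4. Set up the mRNA/RT primer mix as described in Table 3. Incubate at 65 °C for 5 min, then place on ice. In parallel, set up the cDNA synthesis reaction mix as described in Table 4.

| Reagent | 1x volume (µl) |
| --- | --- |
| mRNA sample (400-700 ng total) | 8 |
| Oligo-dT (50 µM) | 1 |
| dNTPs (10 mM each) | 1 |

Table 3. mRNA/RT primer mix for cDNA synthesis.

| Reagent | 1x volume (µl) |
| --- | --- |
| 10x SuperScript III RT buffer | 2 |
| MgCl2 (25 mM) | 4 |
| DTT (0.1 M) | 2 |
| RNaseOUT (40 U/µl) | 1 |
| SuperScript III (200 U/µl) | 1 |

Table 4. cDNA synthesis reaction mix.

4.5. Gently combine the cDNA synthesis mix with the mRNA/RT primer mix. Incubate at 50 °C for 50 min and at 85 °C for 5 min, then place on ice. Finally, incubate with 2 U RNase H at 37 °C for 20 min.
4.6. Set up PCR reactions as described in Table 5, using 4-6 µl of the cDNA reaction mix or 50 ng of reporter plasmid as template. Then perform PCR amplification (95 °C for 2 min, 26 x [95 °C for 30 sec, 55 °C for 30 sec, 72 °C for 30 sec], 72 °C for 3 min).

| Reagent | 1x volume (µl) |
| --- | --- |
| 2x PfuUltra II Hotstart PCR Master Mix | 25 |
| Primer TagSeq_P1 (25 µM) | 0.5 |
| Primer TagSeq_P2 (25 µM) | 0.5 |
| Template (mRNA/cDNA reaction mix or plasmid DNA) | varies |
| Nuclease-free water | to 50 |

Table 5. Tag-Seq PCR reaction mix.

4.7. Run the PCR products on a 2% agarose gel, excise the bands corresponding to the Tag-Seq library amplicons (0.25 kb for pMPRA1 with luc2) and purify them using gel purification spin columns.
4.8. Pool, denature and directly sequence the purified Tag-Seq amplicons on an Illumina sequencing instrument.
4.9. Filter out low-quality reads by removing all reads that a) contain a Phred quality score below 30 at one or more positions in the sequenced tag, or b) do not exactly match a designed tag. Count the number of times each remaining tag appears in each library. Normalize the tag counts to TPM (tags per million sequenced tags), then compute the ratio of mRNA-derived tag counts over plasmid-derived tag counts for each pair of sequencing libraries. If multiple distinct tags were linked to each sequence variant, use their mean ratio for downstream analyses.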
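The read filtering, TPM normalization and ratio calculation that close the Tag-Seq section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis pipeline: the function names, the input layout (tag sequence plus a list of per-base Phred scores) and the choice to skip tags with zero plasmid counts are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from statistics import mean

MIN_PHRED = 30  # quality threshold stated in the protocol


def count_tags(reads, designed_tags):
    """Count reads whose tag passes both filters.

    reads: iterable of (tag_sequence, phred_scores) pairs (hypothetical layout),
           with one integer Phred score per tag position.
    designed_tags: set of tag sequences present in the library design.
    """
    counts = Counter()
    for tag, quals in reads:
        if min(quals) < MIN_PHRED:      # filter a) any position below Q30
            continue
        if tag not in designed_tags:    # filter b) not an exact design match
            continue
        counts[tag] += 1
    return counts


def to_tpm(counts):
    """Normalize raw tag counts to tags per million sequenced tags."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n * 1e6 / total for tag, n in counts.items()}


def variant_activity(mrna_counts, plasmid_counts, tag_to_variant):
    """Mean mRNA/plasmid TPM ratio over the tags linked to each variant."""
    mrna_tpm = to_tpm(mrna_counts)
    plasmid_tpm = to_tpm(plasmid_counts)
    ratios = defaultdict(list)
    for tag, variant in tag_to_variant.items():
        # Assumption: tags never seen in the plasmid pool are skipped,
        # because their ratio is undefined.
        if plasmid_tpm.get(tag, 0) > 0:
            ratios[variant].append(mrna_tpm.get(tag, 0.0) / plasmid_tpm[tag])
    return {variant: mean(r) for variant, r in ratios.items()}
```

In practice the reads would come from a FASTQ parser and the tag-to-variant map from the library design file; both are left out here to keep the sketch self-contained.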

Representative Results

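MPRA facilitates high-resolution, quantitative dissection of the sequence-activity relationships of transcriptional regulatory elements. A successful MPRA experiment will typically yield highly reproducible measurements for the majority of sequences in the transfected library (Figure 3A). If poor reproducibility is observed (Figure 3B), this indicates that the concentration of reporter mRNAs in the recovered RNA samples was too low, due to either 1) low absolute activity among the assayed sequences, or 2) low transfection efficiency.

Figure 4 shows a representative "information footprint" 1,2 generated by assaying ~37,000 random variants of a 145 bp sequence upstream of the human IFNB gene in HEK293 cells with or without exposure to Sendai virus. The TATA box and the known proximal enhancer 10 of the promoter can be clearly identified as information-rich regions in a virus-dependent manner.

Figure 3. Tag-Seq reproducibility. Scatter plots showing examples of Tag-Seq data from two independent replicate transfections with high (A) and low (B) reproducibility. The latter plot shows many outlier tags with high mRNA counts in only one of the two replicates. Such artifacts usually indicate that the concentration of reporter mRNAs was too low for quantitative PCR amplification, due to either low absolute activity of the reporter constructs or low transfection efficiency.

Figure 4. Information footprint of the human IFNB transcription start site and proximal enhancer. About 37,000 random variants of a 145 nucleotide (nt) region upstream of the human IFNB gene were assayed using MPRA in HEK293 cells with (A) and without (B) exposure to Sendai virus. The blue bars show the mutual information between the reporter output and the nucleotide at each position. The proximal enhancer and TATA box stand out as regions of high information content upon virus infection.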
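One simple way to put a number on the replicate agreement shown in the scatter plots is to correlate the log-transformed mRNA/plasmid ratios of tags measured in both transfections. The article does not state which statistic was used, so the Pearson correlation below, the function names and the dictionary inputs are assumptions made purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def replicate_correlation(rep1, rep2):
    """Correlate log2 ratios of tags with a positive ratio in both replicates.

    rep1, rep2: dicts mapping tag -> mRNA/plasmid ratio (hypothetical inputs).
    """
    shared = sorted(t for t in rep1
                    if t in rep2 and rep1[t] > 0 and rep2[t] > 0)
    x = [math.log2(rep1[t]) for t in shared]
    y = [math.log2(rep2[t]) for t in shared]
    return pearson(x, y)
```

A coefficient close to 1 would correspond to the tight diagonal of panel A, whereas many tags with high counts in only one replicate, as in panel B, pull the value down. What threshold counts as acceptable is left to the experimenter.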

Discussion

MPRA is a flexible and powerful tool for dissection of sequence-activity relationships in gene regulatory elements. The success of MPRA experiments depends on at least three factors: 1) careful design of the sequence library, 2) minimization of artifacts during amplification and cloning, and 3) high transfection efficiency.

The possible lengths of the variable regions in the reporter constructs are largely determined by the synthesis or cloning technology used. Standard OLS is generally limited to about 200 nt, but this protocol is compatible with inserts up to at least 1,000 nt. Note that variable regions that are highly repetitive or contain strong secondary structures may end up underrepresented due to PCR and cloning biases. The tags that identify each of the variable regions should be 10-20 nt long, and the collection of tags should ideally be designed such that there are at least two nucleotide differences between any pair. Tags that contain the seed sequences of known microRNAs or other factors that might influence mRNA stability should also be avoided when possible.
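The tag constraints above, together with the oligonucleotide template and restriction-site rules given in the Protocol, lend themselves to an automated design check. The sketch below is an assumption-laden illustration rather than the authors' design tool: the function names are invented, the pairwise comparison only applies to tags of equal length, and only the KpnI, XbaI and SfiI sites named in the protocol are screened (microRNA seed matches are not).

```python
from itertools import combinations

# Constant regions of the oligonucleotide template given in the Protocol.
PREFIX = "ACTGGCCGCTTCACTG"
SPACER = "GGTACCTCTAGA"          # KpnI site followed by XbaI site
SUFFIX = "AGATCGGAAGAGCGTCG"
KPNI, XBAI = "GGTACC", "TCTAGA"


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def has_sfii(seq):
    """True if seq contains GGCCNNNNNGGCC (the pattern is palindromic,
    so scanning one strand is sufficient)."""
    return any(seq[i:i + 4] == "GGCC" and seq[i + 9:i + 13] == "GGCC"
               for i in range(len(seq) - 12))


def check_tags(tags, min_distance=2):
    """Return a list of human-readable problems found in a tag set."""
    problems = []
    for t in tags:
        if not 10 <= len(t) <= 20:
            problems.append(f"{t}: length outside 10-20 nt")
    # Pairwise scan is quadratic; fine for a sketch, slow for ~100,000 tags.
    for a, b in combinations(tags, 2):
        if len(a) == len(b) and hamming(a, b) < min_distance:
            problems.append(f"{a}/{b}: fewer than {min_distance} differences")
    return problems


def build_oligo(var, tag):
    """Assemble one oligo and reject it if it carries extra restriction sites."""
    oligo = PREFIX + var + SPACER + tag + SUFFIX
    if oligo.count(KPNI) != 1 or oligo.count(XBAI) != 1 or has_sfii(oligo):
        raise ValueError("extra KpnI, XbaI or SfiI site in oligo")
    return oligo
```

Checking the assembled oligo rather than the variable regions alone also catches sites that arise at the junctions with the constant regions.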

A key parameter in the design of MPRA experiments is the total number of distinct reporter constructs to be included in the library (the design complexity, denoted CD). In practice, CD is limited by the number of cultured cells that can be transfected. As a rule of thumb, the total number of transfected cells should be at least 50-100 times greater than CD. For example, if 20 million cells can be transfected with a transfection efficiency of 50%, then CD should be no more than ~200,000. Note that CD is equal to the number of distinct regulatory sequence variants multiplied by the number of distinct tags per sequence. The more distinct tags that are linked to each regulatory sequence, the more accurate the estimate of that sequence's activity (because measurements from distinct tags can be averaged), but the fewer distinct variants that can be assayed in one experiment. The optimal choice depends on the experimental design. In a simple "promoter bashing" experiment, where a mathematical model will be fitted to the aggregated measurements, a single tag per variant is usually sufficient. In a screen for single-nucleotide polymorphisms that cause changes in regulatory activities, it may be necessary to use 20 or more tags per allele to obtain statistically robust results, because comparing each pair of alleles requires a separate hypothesis test.
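The rule of thumb above reduces to simple arithmetic, sketched here in Python. The function and parameter names are illustrative assumptions; `coverage` stands for the 50-100 fold excess of transfected cells over CD.

```python
def max_design_complexity(cells, transfection_efficiency, coverage=50):
    """Largest design complexity CD supported by one transfection.

    cells: number of cells exposed to the transfection mix.
    transfection_efficiency: fraction of cells transfected (0-1).
    coverage: required excess of transfected cells over CD (50-100).
    """
    transfected = cells * transfection_efficiency
    return int(transfected // coverage)


def design_complexity(n_variants, tags_per_variant):
    """CD = distinct sequence variants x distinct tags per variant."""
    return n_variants * tags_per_variant


# Worked example from the text: 20 million cells at 50% efficiency.
limit = max_design_complexity(20_000_000, 0.5, coverage=50)   # 200000
strict = max_design_complexity(20_000_000, 0.5, coverage=100)  # 100000
```

With the 200,000 construct limit, a library could hold 200,000 variants with a single tag each or 10,000 alleles with 20 tags each, which illustrates the trade-off between tag depth and the number of variants.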

If the sequences to be assayed are not expected to contain transcription start sites, a constant promoter can also be added in the same fragment. For example, pMPRAdonor2 (Addgene ID 49353) includes a minimal TATA-box promoter that is useful when the upstream variable region is expected to have significant enhancer activity, while pMPRAdonor3 (Addgene ID 49354) includes a modified, strong SV40 viral promoter that is useful when the variable region is expected to contain silencer activity or other negative regulatory elements.

Raw OLS products often contain a significant fraction of truncated oligonucleotides. These may interfere with accurate PCR amplification of the designed sequences, particularly when there is significant homology between them. Using PAGE purification to remove truncated synthesis products and emulsion PCR to minimize amplification artifacts are effective techniques for ensuring high library quality. If either step is impractical, it is imperative to minimize the number of PCR cycles used at each amplification step. Selection and expansion of the cloned library in liquid culture is generally sufficient to maintain the design complexity, but if recombination-prone vectors are to be used or significant representation bias is observed, the recovered cells can instead be plated directly onto large LB agar plates, expanded as individual colonies and then scraped off for DNA isolation. It is also important to consider the potential impact of synthesis errors, which are typically found at a rate of 1:100-500 in OLS. Full-length sequencing of the reporter constructs prior to transfection is recommended to identify and correct for such errors.
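To get a feel for what the quoted synthesis error rate means for a library, the fraction of error-free oligonucleotides can be estimated under a simple model. The sketch assumes that the 1:100-500 figure is a per-nucleotide rate and that errors occur independently at each position; neither assumption is stated in the text, and the function name is invented for the example.

```python
def error_free_fraction(length_nt, errors_per_nt):
    """Expected fraction of oligos with no synthesis error.

    Assumes independent errors at each position with probability
    errors_per_nt (an assumption, not a statement from the protocol).
    """
    return (1.0 - errors_per_nt) ** length_nt


# 200 nt oligos at the two ends of the quoted range.
worst = error_free_fraction(200, 1 / 100)   # roughly 0.13
best = error_free_fraction(200, 1 / 500)    # roughly 0.67
```

Under these assumptions a substantial share of 200 nt constructs would carry at least one error, which is consistent with the recommendation to sequence the reporter constructs in full before transfection.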

It is not necessary to introduce reporter constructs into every cell in the transfected culture, but transfection efficiencies below ~50% may lead to poor signal-to-noise ratios. It is advisable to optimize transfection conditions prior to performing MPRA experiments in a new cell type. When working with hard-to-transfect cell types, MPRA signals can be boosted by pre-selecting transfected cells. The pMPRA vector series includes variants that constitutively express a truncated cell surface marker that can be used to physically enrich for transfected cells prior to RNA isolation (for example, Addgene IDs 49350 and 49351).

Disclosures

The authors have nothing to disclose.

Acknowledgements

This work was supported by the National Human Genome Research Institute of the National Institutes of Health under award number R01HG006785.

Materials

| Name | Company | Catalog Number | Comments |
| --- | --- | --- | --- |
| Oligonucleotide library synthesis | Agilent, CustomArray or other OLS vendors | custom | If using OLS construction method |
| pMPRA1 | Addgene | 49349 | MPRA plasmid backbone |
| pMPRAdonor1 | Addgene | 49352 | luc2 ORF donor plasmid |
| TE 0.1 Buffer (10 mM Tris-HCl, 0.1 mM EDTA, pH 8.0) | Generic | n/a | OLS buffer |
| Novex TBE-Urea Gels, 10% | Life Technologies | EC6875BOX | PAGE purification of OLS products |
| Novex TBE-Urea Sample Buffer | Life Technologies | LC6876 | PAGE purification of OLS products |
| SYBR Gold Nucleic Acid Gel Stain | Life Technologies | S-11494 | PAGE purification of OLS products |
| Micellula DNA Emulsion & Purification Kit | EURx/CHIMERx | 3600-01 | Library amplification by emulsion PCR |
| Herculase II Fusion DNA Polymerase | Agilent | 600675 | Polymerase for emulsion PCR |
| SfiI | New England Biolabs | R0123S | Library cloning with pMPRA vectors |
| KpnI-HF | New England Biolabs | R3142S | Library cloning with pMPRA vectors |
| XbaI | New England Biolabs | R0145S | Library cloning with pMPRA vectors |
| T4 DNA Ligase (2,000,000 units/ml) | New England Biolabs | M0202T | Library cloning with pMPRA vectors |
| One Shot TOP10 Electrocomp E. coli | Life Technologies | C4040-50 | Library cloning with pMPRA vectors |
| LB agar and liquid media with carbenicillin | Generic | n/a | Growth media for cloning |
| E-Gel EX Gels, 1% | Life Technologies | G4010-01 | Library verification and purification |
| E-Gel EX Gels, 2% | Life Technologies | G4010-02 | Library verification and purification |
| MinElute Gel Extraction Kit | Qiagen | 28604 | Library and backbone purification |
| EndoFree Plasmid Maxi Kit | Qiagen | 12362 | Library DNA isolation |
| Cell culture media | n/a | n/a | Experiment-specific |
| Transfection reagents | n/a | n/a | Experiment-specific |
| MicroPoly(A)Purist Kit | Life Technologies | AM1919 | mRNA isolation |
| TURBO DNA-free Kit | Life Technologies | AM1907 | Plasmid DNA removal |
| SuperScript III First-Strand Synthesis System | Life Technologies | 18080-051 | cDNA synthesis |
| PfuUltra II Hotstart PCR Master Mix | Agilent | 600850 | Polymerase for Tag-Seq PCR |
| Primers (see text) | IDT | custom | PAGE purify Tag-Seq primers |

References

  1. Kinney, J., Murugan, A., Callan, C. G., Cox, E. C. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proceedings of the National Academy of Sciences USA. 107 (20), 9158-9163 (2010).
  2. Melnikov, A., et al. Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nature Biotechnology. 30 (3), 271-277 (2012).
  3. Patwardhan, R. P., Lee, C., Litvin, O., Young, D. L., Pe’er, D., Shendure, J. High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nature Biotechnology. 27 (12), 1173-1175 (2009).
  4. Patwardhan, R. P., et al. Massively parallel functional dissection of mammalian enhancers in vivo. Nature Biotechnology. 30 (3), 265-270 (2012).
  5. Arnold, C. D., Gerlach, D., Stelzer, C., Boryń, Ł. M., Rath, M., Stark, A. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science (New York, N. Y.). 339 (6123), 1074-1077 (2013).
  6. Kheradpour, P., et al. Systematic dissection of regulatory motifs in 2000 predicted human enhancers using a massively parallel reporter assay. Genome Research. 23 (5), 800-811 (2013).
  7. White, M. A., Myers, C. A., Corbo, J. C., Cohen, B. A. Massively parallel in vivo enhancer assay reveals that highly local features determine the cis-regulatory function of ChIP-seq peaks. Proceedings of the National Academy of Sciences USA. 110 (29), 11952-11957 (2013).
  8. Mogno, I., Kwasnieski, J. C., Cohen, B. A. Massively parallel synthetic promoter assays reveal the in vivo effects of binding site variants. Genome Research. 23 (11), 1908-1915 (2013).
  9. Schütze, T., et al. A streamlined protocol for emulsion polymerase chain reaction and subsequent purification. Analytical Biochemistry. 410 (1), 155-157 (2011).
  10. Panne, D., Maniatis, T., Harrison, S. C. An atomic model of the interferon-beta enhanceosome. Cell. 129 (6), 1111-1123 (2007).

Cite This Article
Melnikov, A., Zhang, X., Rogov, P., Wang, L., Mikkelsen, T. S. Massively Parallel Reporter Assays in Cultured Mammalian Cells. J. Vis. Exp. (90), e51719, doi:10.3791/51719 (2014).
