We present a genome engineering workflow for the generation of new in vitro models for HIV-1 infection that recapitulate proviral integration at selected genomic sites. Targeting of HIV-derived reporters is facilitated by CRISPR-Cas9-mediated, site-specific genome manipulation. Detailed protocols for single-cell clone generation, screening, and correct targeting verification are provided.
Human immunodeficiency virus (HIV) integrates its proviral DNA non-randomly into the host cell genome at recurrent sites and genomic hotspots. Here we present a detailed protocol for the generation of novel in vitro models for HIV infection with chosen genomic integration sites using CRISPR-Cas9-based genome engineering technology. With this method, a reporter sequence of choice can be integrated into a targeted, chosen genomic locus, reflecting clinically relevant integration sites.
In the protocol, the design of an HIV-derived reporter and the selection of a target site and gRNA sequence are described. A targeting vector with homology arms is constructed and transfected into Jurkat T cells. The reporter sequence is targeted to the selected genomic site by homologous recombination facilitated by a Cas9-mediated double-strand break at the target site. Single-cell clones are generated and screened for targeting events by flow cytometry and PCR. Selected clones are then expanded, and correct targeting is verified by PCR, sequencing, and Southern blotting. Potential off-target events of CRISPR-Cas9-mediated genome engineering are analyzed.
By using this protocol, novel cell culture systems that model HIV infection at clinically relevant integration sites can be generated. Although the generation of single-cell clones and verification of correct reporter sequence integration is time-consuming, the resulting clonal lines are powerful tools to functionally analyze proviral integration site choice.
Integration of proviral DNA into the host genome upon infection is a critical step in the life cycle of human immunodeficiency virus (HIV). Following integration, HIV persists by establishing latency in long-lived CD4+ T cell subsets such as memory CD4+ T cells. HIV integration appears to be non-random1,2. A number of genomic hotspots with recurrently integrated proviral DNA have been detected in several studies through the sequencing of integration sites in acutely and chronically infected individuals2,3,4,5,6,7,8. Interestingly, at some of these integration sites, the same locus was detected in a large fraction of infected cells, leading to the idea that integration at recurrent sites might positively affect clonal expansion1.
To advance our understanding of the significance of recurrent integration sites, proviral integration site choice must be explored. However, several technical aspects hamper the study of HIV integration site choice and its consequences. Broadly used cell culture models for HIV latency, such as JLat cell lines, do not reflect clinically relevant recurrent integration sites9. Studies on primary patient-derived cells enable description of the integration site landscape by sequencing but do not allow for functional analyses. To our knowledge, no adequate experimental model is available to functionally analyze selected clinically relevant integration sites.
Here we present a detailed workflow to generate novel models for HIV infection using CRISPR-Cas9-based genome engineering technology. The workflow described herein can be used to generate T cell-derived reporter cell lines that model HIV infection, carrying a genomically integrated proviral reporter at a chosen integration site. These cell lines thus serve as new tools to explore how the proviral integration site can impact HIV biology and how the provirus responds to different treatment strategies (e.g., inducibility by latency-reversing agents). Our method uses the advantages of CRISPR-Cas9-based genome engineering, in which integration of the reporter sequence by homologous recombination is facilitated by a Cas9 nuclease-induced double-strand break at the target site. Target sites for integration are chosen according to proximity to the described recurrent integration sites from studies on HIV-infected individuals and the presence of suitable PAM motifs for Cas9-mediated genome engineering.
In our exemplary results, we have focused on the BACH2 gene locus, which codes for the BTB And CNC Homology transcriptional regulator 2. In chronically HIV-infected individuals on antiretroviral therapy, BACH2 is one of the loci showing enrichment of integrated HIV-1 sequences3,6,7,8,10. We have chosen a minimal HIV-derived reporter consisting of HIV-1-derived long terminal repeat (LTR), tdTomato coding sequence, and bovine growth hormone (BGH) polyadenylation signal (PA), which we have targeted to two specific sites in BACH2 intron 5. The presented protocol is optimized for Jurkat cells, a human CD4+ T cell-derived suspension cell line, but other cell lines may be used and the protocol adapted accordingly. We present a detailed workflow for selection of target site, construction of target vector with homology arms, CRISPR-Cas9-mediated targeting of the reporter into the chosen genomic site, generation and selection of clonal lines, and comprehensive verification of newly generated, targeted reporter cell lines.
1. Targeting Strategy for Genome Engineering and Targeting Vector (tv) Design
NOTE: The first step of genome engineering involves selection and generation of the necessary tools for CRISPR-Cas9-mediated targeting. Selection of a genomic integration site locus, choice of cell type for targeting, and design of an HIV-derived reporter for integration should precede this step. This protocol describes targeting of an HIV-LTR_tdTomato_BGH-PA minimal reporter into Jurkat target cells. A flow chart of the workflow for CRISPR-Cas9-based targeting, generation, screening and verification of clonal lines is depicted in Figure 1. The described targeting strategy uses the S. pyogenes Cas9 (SpCas9) to generate gRNA-directed dsDNA breaks at a selected integration site. The reporter is then targeted into the chosen genomic locus through homologous recombination by providing a non-linearized targeting vector (tv) that contains the reporter sequence flanked by so-called 5’ and 3’ homology arms (HA)11.
Figure 1: Workflow for CRISPR-Cas9-mediated targeting, generation, and selection of clonal reporter lines with defined integration site. (A) Generate the target vector and transfect Jurkat T cells with the target vector and Cas9/gRNA expression plasmid. (B) Enrich the transfected cells 72 h post transfection by FACS. (C) Let the cells grow for 10 to 14 days and confirm the occurrence of targeting events by PCR and flow cytometry. (D) Generate single-cell clones by limiting dilution and let clones grow for 3 weeks. (E) Screen the clones for correct targeting by PCR and flow cytometry in 96-well format. Expand selected clones. (F) Verify correct targeting in selected clones by Southern blot, PCR and sequencing, and analysis of off-target events of Cas9 endonuclease activity.
Figure 2: Targeting strategy and vector design. (a) gRNA and choice of homology arms. The 20 nt gRNA is homologous to the chosen genomic target site and situated adjacent to a PAM. Homology arms are complementary to 1,000 bp up- and downstream of the gRNA and should not include the gRNA sequence. (b) Schematics of targeting vector and gRNA/Cas9 vector. The targeting vector consists of the chosen reporter sequence that is 5' and 3' flanked by the homology arms. The gRNA/Cas9 vector is based on the pX330-U6-Chimeric_BB-cBh-hSpCas9 backbone. (c) Schematic of targeting by homologous recombination. Target vector and gRNA/Cas9 vector are transfected into Jurkat cells. Cas9 mediates a double-strand break at the genomic target site (indicated by *) and facilitates homologous recombination and integration of the reporter sequence into the genomic target locus.
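The design rules summarized in Figure 2a can be illustrated with a short Python sketch. It scans a locus sequence for 20 nt protospacers followed by an NGG PAM and extracts homology arms flanking a chosen protospacer. The function names are hypothetical, only the forward strand is searched, and excluding just the 20 nt protospacer (not the PAM) from the arms is an assumption of this sketch rather than a statement of the protocol. It does not score gRNA quality.

```python
import re


def find_protospacers(seq, guide_len=20):
    """Return (start, end, protospacer, pam) for forward-strand NGG sites.

    Assumption: only the forward strand is scanned; reverse-strand sites
    (CCN followed by the protospacer) are not reported by this sketch.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for match in pattern.finditer(seq):
        start = match.start()
        hits.append((start, start + guide_len, match.group(1), match.group(2)))
    return hits


def homology_arms(seq, start, end, arm_len=1000):
    """Return (5' arm, 3' arm) flanking seq[start:end] without including it."""
    if start < arm_len or end + arm_len > len(seq):
        raise ValueError("locus sequence too short for the requested arm length")
    return seq[start - arm_len:start], seq[end:end + arm_len]


if __name__ == "__main__":
    # Toy locus (hypothetical sequence, not the BACH2 locus).
    guide = "ACGTACGTACGTACGTACGT"
    locus = "A" * 30 + guide + "TGG" + "A" * 30
    for start, end, protospacer, pam in find_protospacers(locus):
        five_prime, three_prime = homology_arms(locus, start, end, arm_len=30)
        print(start, end, protospacer, pam, len(five_prime), len(three_prime))
```

For a real design, the arm length would be left at the 1,000 bp used in this protocol, and candidate sites would still need to be ranked with a dedicated gRNA design tool.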
2. CRISPR-Cas9-Based Targeting of Jurkat Cells
3. Generation of Clonal Lines and Screening for Correct Targeting
NOTE: After confirmation of the targeting events in the mixed targeted cell population by flow cytometry and PCR (sections 2.2–2.4), generate single-cell clones (duration: 28 to 35 days) and screen for correct integration of the reporter sequence.
In this representative experiment we have chosen to target a minimal HIV-1-derived reporter consisting of an LTR, tdTomato-coding sequence, and polyA-signal sequence to two loci in intron 5 of the BACH2 gene17. The loci for targeting were chosen according to proximity to published recurrent integration sites found in different studies on primary T cells from HIV-infected patients2,4,5,6,7,8 and the presence of a PAM motif NGG necessary for Cas9-mediated induction of double-strand breaks. Target vectors were constructed and transfected according to the described protocol. The workflow for transfection, checking for targeting events, generation of single-cell clones, and screening and selection of clones is schematically shown in Figure 1.
Two weeks after FACS enrichment of transfected cells, we detected reporter gene expression after PMA-Iono induction by flow cytometry in 4 to 12% of cells, depending on the integration site (data not shown). PCR on genomic DNA yielded products spanning the entire 5' and 3' integration junctions, from upstream of the 5' HA into the reporter sequence and from downstream of the 3' HA into the reporter sequence, respectively (Figure 3a and 3b). Having confirmed that targeting events occurred, we went on to generate single-cell clones by limiting dilution plating. In our hands, plating 5 to 10 96-well plates per targeting construct was sufficient to obtain enough clones for a successful screen. Section 3.2 of the protocol describes how to screen single-cell clones in duplicated 96-well plates. This way, only positive clones need to be expanded, which saves both time and effort. In Figure 3c, example data of FACS screening after PMA-Iono induction and PCR screening are shown. We observed a number of clones with high, low, and no fluorescent reporter gene expression (Figure 3c). For PCR screening, it is important to include a control PCR that amplifies a genomic locus of choice to determine whether the quality of cell lysates is adequate for PCR. In our case, we have chosen a 630 bp sequence in the NUP188 gene locus for control PCR (for primer sequences, see Table 5). Primers for screening PCR were then designed to amplify a shorter sequence located in the reporter. Additionally, PCR for a sequence on the target vector backbone was performed to exclude any clones that had nonspecifically integrated target vector backbone sequences (backbone PCR, data not shown).
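When planning how many plates to seed for limiting dilution, a simple Poisson model can give a rough idea of how many wells are expected to receive exactly one cell. The sketch below is an illustration under stated assumptions: the mean number of cells per well is a hypothetical input (no seeding density is given in this section), cells are assumed to distribute independently, and every seeded cell is assumed to grow out, which likely overestimates clone yield for Jurkat cells.

```python
import math


def poisson_well_fractions(mean_cells_per_well):
    """Return the expected fractions of (empty, single-cell, multi-cell) wells."""
    lam = mean_cells_per_well
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    p_multi = 1.0 - p_empty - p_single
    return p_empty, p_single, p_multi


def expected_single_cell_wells(mean_cells_per_well, plates, wells_per_plate=96):
    """Expected number of wells seeded with exactly one cell."""
    _, p_single, _ = poisson_well_fractions(mean_cells_per_well)
    return p_single * plates * wells_per_plate


def clonal_fraction_of_occupied_wells(mean_cells_per_well):
    """Fraction of non-empty wells that were seeded with exactly one cell."""
    p_empty, p_single, _ = poisson_well_fractions(mean_cells_per_well)
    return p_single / (1.0 - p_empty)


if __name__ == "__main__":
    density = 0.5  # hypothetical mean cells per well
    for plates in (5, 10):
        wells = expected_single_cell_wells(density, plates)
        print(f"{plates} plates: about {wells:.0f} single-cell wells expected")
    print(f"clonal fraction of occupied wells: "
          f"{clonal_fraction_of_occupied_wells(density):.2f}")
```

Lower seeding densities raise the fraction of occupied wells that are truly clonal but reduce the number of clones obtained per plate, which is one reason several plates per construct are needed.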
Pre-selected clones were then expanded and further analyzed for correct targeting by Southern blot, PCR, and sequencing. Southern blot was performed using a reporter-specific probe and a probe specific for the genomic locus binding outside the reporter but not within the homology arms of the targeting vector. Interestingly, all the clones that were tested by Southern blot were positive in screening PCR beforehand, but only a portion showed correct band sizes in Southern blot analysis and had heterozygously integrated the reporter (Figure 4b). It is therefore necessary to not only rely on screening PCR results but also verify correct targeting by Southern blotting. To verify correct targeting at the DNA sequence level, integration junctions were amplified by PCR and products were subjected to Sanger sequencing (Figure 4a). Notably, sequencing of the target site on the homologous allele, where the reporter had not integrated, revealed Cas9-mediated changes. In Figure 4a, an example of a Cas9-mediated deletion is shown. To test for Cas9-mediated off-target effects, a list of the highest-ranked off-target sites was generated as described in section 3.5. The ten highest-ranked off-target sites were PCR-amplified from genomic DNA of the clones, and the products were subjected to Sanger sequencing. In the targeted single-cell clones we generated, no variations from Jurkat wild-type sequences were observed at the ten highest-ranked off-target sites.
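Comparing a sequencing result with the expected sequence, as done for the integration junctions, the untargeted allele, and the off-target sites, can be sketched with Python's standard library. The example below uses difflib.SequenceMatcher, a general-purpose text comparison tool rather than a biological aligner, so it is only a rough illustration; the function name and the toy sequences are hypothetical, and real Sanger data would normally be inspected with dedicated alignment software and the chromatograms.

```python
from difflib import SequenceMatcher


def sequence_differences(expected, observed):
    """List differences between an expected and an observed base-called sequence.

    Each entry is (kind, position_in_expected, expected_bases, observed_bases),
    where kind is 'replace', 'delete', or 'insert'.
    """
    expected = expected.upper()
    observed = observed.upper()
    # autojunk=False disables the heuristic that ignores frequent characters,
    # which would otherwise be triggered by a four-letter alphabet in long reads.
    matcher = SequenceMatcher(None, expected, observed, autojunk=False)
    differences = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind != "equal":
            differences.append((kind, i1, expected[i1:i2], observed[j1:j2]))
    return differences


if __name__ == "__main__":
    wild_type = "ACGTACGTACTTTGGATCCGGAT"  # toy reference, not a real locus
    clone_read = "ACGTACGTACGGATCCGGAT"    # toy read with a 3 bp deletion
    for entry in sequence_differences(wild_type, clone_read):
        print(entry)
    # An unchanged off-target site yields an empty list.
    print(sequence_differences(wild_type, wild_type))
```

An empty result corresponds to the situation reported here for the off-target sites, where no variation from the wild-type sequence was observed.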
Figure 3: Screening of single-cell clones for targeting events. (a) Schematic of primer design for detection of targeting events by PCR in mixed targeted cell population. Primer pairs for detection of 5' integration junction (P1, P2) and 3' integration junction (P3, P4) are indicated as arrows. Primers P1 and P4 also serve to amplify the targeted locus of the allele without reporter integrant. (b) Genomic DNA of mixed targeted cell population was prepared 10 days after FACS enrichment of transfected cells and analyzed for 5' integration junction (P1, P2), 3' integration junction (P3, P4; data not shown) and wild-type allele of the targeted locus (P1, P4). Wild-type Jurkat cells (wt) served as control. (c) First screen of single-cell clones in 96-well plates. 96-well plates with single-cell clones were screened for correct targeting by flow cytometry after PMA-Iono induction and PCR analysis. Results for example clones are shown. PCR was performed on cell lysates with reporter-specific primers (screening PCR; P5, P6) and primers amplifying a genomic locus (control PCR, here: NUP188 locus; P7, P8). Cell lysates of wild-type Jurkat cells (wt) and genomic DNA from HIVisB2 clone (B2)18 prepared with a commercially available kit served as a negative control for screening PCR and positive control for control PCR, respectively.
Figure 4: Analysis of selected single-cell clones for correct reporter integration. (a) Sequence analysis of integration junctions and targeted locus on the allele without reporter integrant. Primer pairs for detection of 5' integration junction (P1, P2), 3' integration junction (P3, P4) and targeted locus of allele without reporter integrant (P1, P4) were used. PCR products were amplified from genomic DNA of single-cell clones and subjected to Sanger sequencing. Sequencing results were aligned to expected sequences. Shown are example data of one clone. Sequence chromatograms from genome/HA junction and HA/reporter junction are shown for both 5' and 3' integration junction. Matches are indicated as dots. Binding site of gRNA is marked with a box. Cas9-induced mutations are highlighted in red. Schematic of primer design is shown below. Arrows indicate primer positions used for amplification. (b) Southern blot analysis of selected targeted single-cell clones and Jurkat wild-type cells. Southern blot analysis was carried out on genomic DNA with a reporter-specific probe and a probe recognizing a genomic sequence in both targeted and wild-type allele (genomic probe). Data of 7 example clones are shown. Clones with correct band sizes are marked with boxes. Diluted target vector plasmid served as positive control (+). Schematic for Southern blot analysis is shown below.
Reagent | Add per reaction |
Forward Primer (20 µM) | 1 µL |
Reverse Primer (20 µM) | 1 µL |
10x Taq buffer | 5 µL |
dNTPs (2.5 mM each) | 1 µL |
MgCl2 (50 mM) | 1 µL |
DMSO | 1.5 µL |
gDNA (50 – 100 ng/µL) | 2 µL |
Nuclease-free water | Fill up to 49.5 µL |
Taq DNA polymerase (5 U/µL) | 0.5 µL |
reaction volume | 50 µL |
Table 1: Recipe for PCR using a polymerase with proofreading activity. The amount of each reagent to be added per PCR reaction is indicated. PCR is intended for amplification using genomic DNA as template. The recipe is used in step 1.2.2.2 (amplification of homology arms), step 3.3.3 (verification of integration sites), step 3.4.1.5 (generation of genomic probe for Southern blot), and step 3.5.4 (analysis of off-target events).
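Because Table 1 specifies water as "fill up to 49.5 µL", the water volume has to be derived from the other components. The following Python sketch computes it and scales the recipe to a master mix. The per-reaction volumes are taken from Table 1; the function names and the 10% pipetting excess are assumptions made for illustration.

```python
# Per-reaction volumes in µL from Table 1 (water and polymerase listed separately).
TABLE_1_COMPONENTS = {
    "forward primer (20 µM)": 1.0,
    "reverse primer (20 µM)": 1.0,
    "10x Taq buffer": 5.0,
    "dNTPs (2.5 mM each)": 1.0,
    "MgCl2 (50 mM)": 1.0,
    "DMSO": 1.5,
    "gDNA (50-100 ng/µL)": 2.0,
}
POLYMERASE_UL = 0.5   # added last, bringing the reaction to 50 µL
FILL_TO_UL = 49.5     # volume before polymerase addition, as stated in Table 1


def water_volume(components, fill_to=FILL_TO_UL):
    """Water needed per reaction so that all components reach `fill_to` µL."""
    used = sum(components.values())
    if used > fill_to:
        raise ValueError("components already exceed the fill-to volume")
    return fill_to - used


def master_mix(components, n_reactions, excess=0.10):
    """Scale the full recipe to n reactions plus a pipetting excess.

    The 10% default excess is a hypothetical choice, not part of the protocol.
    """
    factor = n_reactions * (1.0 + excess)
    recipe = dict(components)
    recipe["nuclease-free water"] = water_volume(components)
    recipe["DNA polymerase"] = POLYMERASE_UL
    return {name: round(volume * factor, 2) for name, volume in recipe.items()}


if __name__ == "__main__":
    print(f"water per reaction: {water_volume(TABLE_1_COMPONENTS)} µL")
    for name, volume in master_mix(TABLE_1_COMPONENTS, n_reactions=8).items():
        print(f"{name}: {volume} µL")
```

If different templates are amplified in parallel, the gDNA entry would be left out of the mix and added to each tube individually.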
Steps | Temperature | Time | Cycles |
Initial Denaturation | 95 °C | 2 min | 1 |
Denaturation | 95 °C | 40 s | 30–35 |
Annealing | 58 °C | 45 s | |
Extension | 72 °C | 1 min/kb | |
Final Extension | 72 °C | 10 min | 1 |
Table 2: Cycling conditions for PCR using a polymerase with proofreading activity. Cycling conditions correspond to PCR recipe listed in Table 1.
Reagent | Add per reaction |
Forward Primer (20 µM) | 1 µL |
Reverse Primer (20 µM) | 1 µL |
5x high-fidelity buffer | 5 µL |
dNTPs (2.5 mM each) | 1 µL |
gDNA (50–100 ng/µL) or plasmid DNA (50 ng/µL) | 2 µL gDNA or 1 µL plasmid DNA |
Nuclease-free water | Fill up to 49.5 µL |
High-Fidelity DNA polymerase | 0.5 µL |
reaction volume | 50 µL |
Table 3: Recipe for PCR using a high-fidelity polymerase. The amount of each reagent to be added per PCR reaction is indicated. PCR is intended for robust amplification of DNA templates. The recipe is used in step 2.4.2 (detection of targeting events) and step 3.4.1.4 (generation of reporter-specific probe for Southern blot).
Steps | Temperature | Time | Cycles |
Initial Denaturation | 98 °C | 30 s | 1 |
Denaturation | 98 °C | 10 s | 30–35 |
Annealing | 58 °C | 30 s | |
Extension | 72 °C | 30 s/kb | |
Final Extension | 72 °C | 10 min | 1 |
Table 4: Cycling conditions for PCR using a high-fidelity polymerase. Cycling conditions correspond to PCR recipe listed in Table 3.
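The extension steps in Tables 2 and 4 are given per kilobase, so the actual step length depends on the amplicon. A minimal Python sketch of that arithmetic is shown below. The temperatures are omitted and the step durations are copied from the two tables; the dictionary and function names are hypothetical, and the total time ignores ramping between steps, so a real run will take longer.

```python
# Step durations in seconds, copied from Table 2 and Table 4.
PROOFREADING_PROFILE = {
    "initial_denaturation_s": 120,
    "denaturation_s": 40,
    "annealing_s": 45,
    "extension_s_per_kb": 60,
    "final_extension_s": 600,
}
HIGH_FIDELITY_PROFILE = {
    "initial_denaturation_s": 30,
    "denaturation_s": 10,
    "annealing_s": 30,
    "extension_s_per_kb": 30,
    "final_extension_s": 600,
}


def extension_seconds(amplicon_bp, seconds_per_kb):
    """Extension time for an amplicon of the given length."""
    return amplicon_bp / 1000 * seconds_per_kb


def program_seconds(amplicon_bp, profile, cycles=30):
    """Total programmed hold time, ignoring temperature ramping."""
    extension = extension_seconds(amplicon_bp, profile["extension_s_per_kb"])
    per_cycle = profile["denaturation_s"] + profile["annealing_s"] + extension
    return (profile["initial_denaturation_s"]
            + cycles * per_cycle
            + profile["final_extension_s"])


if __name__ == "__main__":
    amplicon = 1000  # e.g., a 1,000 bp homology arm
    for label, profile in (("proofreading", PROOFREADING_PROFILE),
                           ("high-fidelity", HIGH_FIDELITY_PROFILE)):
        minutes = program_seconds(amplicon, profile, cycles=30) / 60
        print(f"{label}: about {minutes:.0f} min of programmed hold time")
```

The cycle count can be set anywhere in the 30 to 35 range given in the tables.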
Primer | Sequence (5' – 3') |
P7 control PCR F | CTTTGTTGGGTAAGCATGGAGGTC |
P8 control PCR R | CAGTTACTCACCTTTGCACATAGG |
Table 5: Oligonucleotide sequences for control PCR. Forward and reverse primers for the amplification of a 630 bp fragment of NUP188 locus are indicated.
Reagent | Add per reaction |
ready-to-use PCR Master Mix | 8 µL |
Forward Primer (20 µM) | 1 µL |
Reverse Primer (20 µM) | 1 µL |
Nuclease-free water | 8 µL |
cell lysate (1:12 dilution) | 2 µL |
total reaction volume | 20 µL |
Table 6: Recipe for screening-PCR. The amount of each reagent to be added per PCR reaction is indicated. PCR recipe is intended for screening PCR in step 3.2.11.
Steps | Temperature | Time | Cycles |
Pre-PCR: Denaturation | 94 °C | 2 min | 1 |
Pre-PCR: Annealing | 58 °C | 1 min | |
Pre-PCR: Extension | 72 °C | 1 min | |
Denaturation | 94 °C | 30 s | 38 |
Annealing | 58 °C | 30 s | |
Extension | 72 °C | 1 min | |
Final Extension | 72 °C | 10 min | 1 |
Table 7: Cycling conditions for screening-PCR. Cycling conditions correspond to PCR recipe listed in Table 6.
Here, we describe a protocol to generate HIV-1-derived Jurkat reporter models with chosen proviral integration sites applying CRISPR-Cas9-based genome engineering.
Several points of the protocol require careful attention during the planning stage. First, the locus to be targeted should be chosen carefully, as some loci might be easier to target than others (e.g., depending on the chromatin status of the region and the target sequence itself). Repetitive sequences are hard to clone into the targeting vector and are often not unique within the genome. Regions of repressive chromatin are harder to target with the CRISPR-Cas9 system19,20.
Second, the choice of gRNA is crucial for CRISPR-Cas9-mediated targeting. For the purpose of generating models for HIV infection with representative integration sites, one would want the gRNA to bind as close as possible to an integration hotspot. However, this site might not be ideal for Cas9 recruitment; therefore, a compromise must be made between proximity of the gRNA sequence to the integration site of choice and gRNA quality. We have found the E-CRISP webtool reliable in predicting functional gRNAs. It is also possible to carry out a gRNA pre-test by transiently expressing several gRNAs together with Cas9 in the cell type to be targeted, followed by a screening for mutations. A suitable gRNA will direct Cas9 efficiently to the target site and the gRNA complementary site will show mutations. The length of HAs (1,000 bp upstream and downstream of the integration site) was chosen according to previously published studies11. Generally, reducing the length of homology arms will result in reduced targeting frequency. A length of 1,000 bp of homology arm presents a good compromise between sufficient targeting frequency and ease of target vector construction.
Third, enough time must be spent on a good design of the reporter construct. The minimal reporter used in this protocol, which contains an HIV-1 NL4-3-derived LTR and a tdTomato coding sequence, was designed based on the following principle: single LTRs have been described as the sole remaining proviral fragments in several cases of clonal expansion in chronically HIV-1-infected individuals21. It is expected that the LTR strongly influences the chromatin status at integration sites and organizes the recruitment of cellular complexes. We have chosen a minimal HIV reporter focusing on the HIV LTR as the main regulatory genetic element, then introduced tdTomato as a fluorescent LTR activity marker instead of further HIV-1 genes, as genome engineering frequency was reported to be higher with smaller targeting inserts11. Considering the time-consuming steps of tv cloning, clonal selection, screening, and verification of the clones, it is advised to carefully consider the design of the HIV reporter in the context of functional studies that will eventually be carried out on targeted cell lines. One might, for example, consider the inclusion of gag leader sequence, 3' HIV LTR, and/or other viral gene sequences in the reporter. The protocol can be readily adapted to such different reporter constructs.
Generally, it is important to consider the choice of cell type to be targeted as single-cell clone generation can be difficult with certain cell lines. We found that Jurkat cells are not very efficient in single clone generation; however, we decided to choose this cell type for its previously known use in latent HIV infection models9. We obtained the best results in Jurkat clonal dilution plating with 50% conditioned medium, when plates were left undisturbed in an incubator that was opened no more than 3 times a week. If using a different cell line is desired, it is advised to pre-test the possibility of generating single-cell clones by carrying out a dilution plating experiment. Another point to keep in mind is the variation of transfection rates of different cell lines. If transfection is not feasible for the chosen cell line, electroporating the cells for targeting may be necessary. Note that the choice of cell line used for targeting may not be determined by technical aspects only, such as efficiency for transfection or single clone generation. Functional aspects may also be considered, such as transcriptional activity or inducibility of the targeted gene locus. This might require screening of different cell lines prior to the targeting workflow.
It should be emphasized that the described protocol is time-intensive. It should be expected to take 3 to 6 months from the first step to final clonal cell lines. The Southern blot analysis, which is used to check for site-specific single integration events, may seem cumbersome, but in our experience it is highly important, as only a subset of single-cell clones that showed the expected integration junctions by PCR showed the correct band pattern in the Southern blot. Ideally, experimenters should generate a number of clones with the same insert targeted to the same integration site to control for clonal effects in any subsequent experiments that make use of the clones. Both heterozygous and homozygous targeting are possible. In heterozygous reporter cell lines, the allele without an integrant is likely to show modifications at the Cas9 targeting site, which can be screened by PCR (Figure 4a). For homozygous targeting, we suggest that both alleles be targeted consecutively.
Taking into account these considerations, this workflow provides a means to generate powerful cellular models that can be used to increase the understanding of chronic HIV infection. For example, proviral activity in response to latency-reversing agents can be tested in the context of recurrent integration sites. This may be of particular interest to researchers in the field, since position effects have been postulated to impact HIV latency and reversal22.
The authors have nothing to disclose.
We thank Britta Weseloh and Bettina Abel for technical assistance. We also thank Arne Düsedau and Jana Hennesen (flow cytometry technology platform, Heinrich Pette Institut) for technical support.
Name | Company | Catalog Number | Comments |
pX330-U6-Chimeric_BB-cBh-hSpCas9 | Addgene | 42230 | vector for expression of SpCas9 and gRNA |
pMK | GeneArt | | mammalian expression vector for cloning |
pcDNA3.1 | Invitrogen | V79020 | mammalian expression vector for cloning |
BbsI | New England Biolabs | R0539S | restriction enzyme |
NEBuilder Hifi DNA Assembly Cloning Kit | New England Biolabs | E5520S | Assembly cloning kit used for target vector generation |
TaqPlus Precision PCR System | Agilent Technologies | 600210 | DNA polymerase with proofreading activity used for amplification of homology arms (step 1.2.2.2), verification of integration site and reporter sequence (step 3.3.3 and 3.3.5), generation of genomic probe for Southern blot (step 3.4.1.5) and analysis of off-target events (step 3.5.4) |
96-well tissue culture plate (round-bottom) | TPP | 92097 | tissue culture plates for dilution plating |
Phusion High-Fidelity DNA polymerase | New England Biolabs | M0530 L | DNA polymerase used for detection of targeting events (step 2.4.2) and generation of reporter-specific probe for Southern blot (step 3.4.1.4) |
Dimethyl sulfoxide (DMSO) | Sigma-Aldrich | D9170 | dimethyl sulfoxide as PCR additive |
Magnesium Chloride (MgCl2) Solution | New England Biolabs | B9021S | MgCl2 solution as PCR additive |
Deoxynucleotide (dNTP) Solution Mix | New England Biolabs | N0447S | dNTP mixture with 10 mM of each nt for PCR reactions |
5PRIME HotMasterMix | 5PRIME | 2200400 | ready-to-use PCR mix used for screening PCR (step 3.2.11) |
QIAamp DNA blood mini kit | Qiagen | 51106 | DNA isolation and purification kit |
QIAquick PCR Purification Kit | Qiagen | 28106 | PCR Purification Kit |
RPMI 1640 without glutamine | Lonza | BE12-167F | cell culture medium |
Fetal Bovine Serum South Africa Charge | PAN Biotech | P123002 | cell culture medium supplement |
L-glutamine | Biochrom | K 0282 | cell culture medium supplement |
Penicillin/Streptomycin 10,000 U/mL / 10,000 µg/mL | Biochrom | A 2212 | cell culture medium supplement |
Gibco Opti-MEM Reduced Serum Media | Thermo Fisher Scientific | 31985062 | cell culture medium with reduced serum concentration optimized for transfection |
TransIT-Jurkat | Mirus Bio | MIR2125 | transfection reagent |
phorbol 12-myristate 13-acetate | Sigma-Aldrich | P8139-1MG | cell culture reagent |
Ionomycin | Sigma-Aldrich | I0634-1MG | cell culture reagent |
Syringe-driven filter unit, PES membrane, 0.22 µm | Millex | SLGP033RB | filter unit for sterile filtration |
Heracell 150i incubator | Thermo Fisher Scientific | 51026280 | tissue culture incubator |
Amersham Hybond-N+ | GE Healthcare | RPN1520B | positively charged nylon membrane for DNA and RNA blotting |
Stratalinker 1800 | Stratagene | 400072 | UV crosslinker |
High Prime | Roche | 11585592001 | kit for labeling of DNA with radioactive dCTP using random oligonucleotides as primers |
illustra ProbeQuant G-50 Micro Columns | GE Healthcare | 28-9034-08 | chromatography spin-columns for purification of labeled DNA |