CRISPR/Cas9 targeting of passenger single nucleotide variants in haploinsufficient or essential genes expands cancer … – Nature.com

Posted: March 29, 2024 at 2:42 am

TCGA data acquisition and pre-processing

TCGA SNV data for 16 cancer types (BLCA, BRCA, COAD, GBM, HNSC, KIRC, KIRP, LIHC, LUAD, LUSC, OV, PAAD, PRAD, READ, STAD, and UCEC) were downloaded from the GDC data portal (https://portal.gdc.cancer.gov/, DR-7.0). The mutation files were initially collected as VarScan2 processed protected mutation annotation format (MAF) files. To eliminate low-quality and potential germline variants, we further processed the files according to the guidelines provided by the GDC portal (https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/) to generate high-confidence somatic mutation files. For gene expression analysis, we obtained fragments per kilobase of exon per million mapped fragments (FPKM) data using the TCGAbiolinks19 R package (version 2.26.0). The gene expression values were then normalized to log2(FPKM+1).

The DepMap CRISPR/Cas9 screen dataset20 (https://depmap.org/portal/, DepMap Public 21Q2) was used to collect essential genes. Haploinsufficient genes were compiled from three sources: (1) the study by Vinh T. Dang et al.11, (2) ClinGen12 (https://clinicalgenome.org, genes with haploinsufficiency scores of 2 or 3, downloaded on January 20, 2021), and (3) DECIPHER13 (https://deciphergenomics.org, genes in the top 5% of haploinsufficiency probability scores, version 3). Oncogenes were obtained from the COSMIC Cancer Gene Census9 (https://cancer.sanger.ac.uk/census, v94) by applying the filter "Somatic=yes" and including genes with the role of "oncogene" in cancer. Hotspot mutations were annotated using data from the Cancer Hotspots portal3 (https://www.cancerhotspots.org, Hotspot Results V2).

To generate targetable SNVs and the corresponding sgRNA sequences from the SNV list of a given sample, we proceeded as follows. First, we identified the SNVs located within essential or haploinsufficient genes. For SNVs located in essential genes, only homozygous SNVs were analyzed further. Next, we calculated the allele frequency (AF) threshold \({AF}_{cut}\) using the following equation:

$${AF}_{cut}={AF}_{M}+MAD(hetAF)$$

(1)

where \({AF}_{M}\) is the median of the AFs of SNVs from the sample, and \(MAD(hetAF)\) is the median absolute deviation (MAD) of the AFs of heterozygous SNVs from the patient or sample. SNVs with an AF below the sample's \({AF}_{cut}\) were filtered out. We then considered the expression of the gene in which each SNV was located and retained SNVs for which the gene expression (log2(FPKM+1)) was greater than 1.
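As an illustration only, the AF threshold of Eq. (1) and the subsequent expression filter could be sketched in Python as follows. The function names, the dictionary keys (`af`, `log2_fpkm`), and the use of an unscaled MAD are assumptions made for this sketch; the source does not state how heterozygous SNVs were defined or whether the MAD was scaled, so the heterozygous AFs are simply passed in by the caller.

```python
from statistics import median


def mad(values):
    """Unscaled median absolute deviation around the median (an assumption)."""
    m = median(values)
    return median(abs(v - m) for v in values)


def af_cutoff(all_afs, het_afs):
    """AF_cut = median(AFs of all SNVs) + MAD(AFs of heterozygous SNVs)."""
    return median(all_afs) + mad(het_afs)


def filter_snvs(snvs, het_afs, min_log2_fpkm=1.0):
    """Keep SNVs whose AF is not below AF_cut and whose gene has log2(FPKM+1) > 1.

    `snvs` is a list of dicts with hypothetical keys "af" and "log2_fpkm".
    """
    cut = af_cutoff([s["af"] for s in snvs], het_afs)
    return [s for s in snvs if s["af"] >= cut and s["log2_fpkm"] > min_log2_fpkm]
```

For example, with sample AFs of 0.2, 0.4, 0.5, 0.9, and 1.0 and heterozygous AFs of 0.2, 0.4, and 0.5, the median is 0.5 and the MAD is 0.1, so only SNVs with AF of at least 0.6 and sufficient expression would be retained.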

To identify SNVs that generate a novel and specific targetable site for the CRISPR/Cas9 approach, we searched for a PAM sequence (NGG, where N represents any nucleotide) within a 12-base pair region around the SNV or checked if the SNV itself created a new PAM sequence. For SNVs meeting either criterion, a 20-nucleotide sgRNA sequence was obtained.
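The PAM search described above could look roughly like the following Python sketch. It is one possible reading of the text, not the authors' implementation: it scans only the forward strand, uses a 0-based SNV position, treats "within a 12-base pair region" as the SNV lying in the 12 protospacer bases adjacent to the PAM, and takes the 20 nucleotides immediately 5′ of the PAM as the sgRNA. The function name, arguments, and labels are hypothetical.

```python
def candidate_guides(ref, pos, alt, seed_len=12, guide_len=20):
    """Return (protospacer, pam, kind) tuples for the mutant allele.

    ref : reference sequence surrounding the SNV (forward strand)
    pos : 0-based index of the SNV within `ref`
    alt : alternate base carried by the tumor
    """
    mut = ref[:pos] + alt + ref[pos + 1:]
    hits = []
    # The PAM occupies indices p..p+2; the protospacer is mut[p-guide_len:p].
    for p in range(guide_len, len(mut) - 2):
        if mut[p + 1:p + 3] != "GG":
            continue
        if pos in (p + 1, p + 2):
            # The SNV falls on one of the two G's: the PAM is new
            # only if the reference did not already carry GG here.
            if ref[p + 1:p + 3] != "GG":
                hits.append((mut[p - guide_len:p], mut[p:p + 3], "novel_pam"))
        elif p - seed_len <= pos < p:
            # The SNV lies within the PAM-proximal region of the protospacer.
            hits.append((mut[p - guide_len:p], mut[p:p + 3], "seed_snv"))
    return hits
```

Reverse-strand sites could be handled by running the same function on the reverse complement of the context with the position mirrored accordingly.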

To obtain sgRNAs with precise knockout efficiency and low potential off-target effects, we calculated the on- and off-target scores and applied strict cutoffs as follows. First, on-target scores were calculated using the Azimuth 2.0 method15 implemented in the crisprScore21 R package (version 1.2.0). sgRNAs with on-target scores greater than 0.5 were examined for possible off-target sites using CasOFFinder16 (offline version 2.4). The UCSC human reference genome assembly (GRCh38) was used as a reference, and off-target sites with a maximum of three mismatches were searched. If an sgRNA was found to have off-target sites, the off-target score was calculated using the CFD15 method, which was also implemented in the crisprScore21 R package. If off-target sites with scores > 0.175 were present, the sgRNA was filtered out to mitigate potential off-target risks. Finally, the SNVs were reported along with their corresponding sgRNAs, on-target scores, and off-target scores.
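The two cutoffs can be summarized in a short Python sketch. It assumes that the on-target score and the CFD scores of any off-target sites have already been computed elsewhere (the study used the crisprScore R package and CasOFFinder for this); the function and constant names are hypothetical.

```python
ON_TARGET_MIN = 0.5     # sgRNAs must score above this (Azimuth 2.0 in the text)
OFF_TARGET_MAX = 0.175  # any off-target CFD score above this rejects the sgRNA


def passes_filters(on_target, off_target_cfd_scores):
    """Apply the on-target and off-target cutoffs to one sgRNA.

    off_target_cfd_scores: CFD scores of off-target sites found with up to
    three mismatches; an empty list means no off-target site was found.
    """
    if on_target <= ON_TARGET_MIN:
        return False
    return all(score <= OFF_TARGET_MAX for score in off_target_cfd_scores)
```

An sgRNA with no detected off-target sites passes the second check trivially, which matches the text: CFD scores were only calculated when off-target sites were found.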

All cells were maintained at 37 °C in a 5% CO2 atmosphere. Human embryonic kidney 293T (HEK293T) cells were purchased from ATCC and cultured in Dulbecco's modified Eagle's medium (DMEM) (Gibco, USA) supplemented with 10% fetal bovine serum (FBS) (Gibco) and 1% penicillin-streptomycin (Invitrogen, USA). Human colorectal cancer cell lines (SNUC4, SW620, and NCIH498) were purchased from the Korean Cell Line Bank and cultured in RPMI-1640 medium (Gibco) supplemented with 10% FBS and 1% penicillin-streptomycin.

The lentiviral plasmids lentiCas9-Blast and lentiGuide-puro were purchased from Addgene, USA (#52962, #52963). The sgRNA sequences were cloned following the lentiCRISPR v2 cloning protocol22,23. For transfection, 7.5 × 10^5 HEK293T cells were seeded in 60-mm plates one day before transfection. Transfection was performed using Opti-MEM I Reduced Serum Medium (Gibco) with 1 µg of lentiviral plasmid, 0.25 µg of pMD2.G (#12259; Addgene), 0.75 µg of psPAX2 (#12260; Addgene), and 6 µL of FuGENE (Promega, USA). The medium was changed after 16 h of incubation at 37 °C under 5% CO2. Viral supernatants were collected 48 and 72 h after transfection, filtered through a 0.45-µm membrane (Corning, USA), and stored at −80 °C. Cells were transduced with lentivirus encoding lentiCas9-Blast to establish stable Cas9-expressing cells, followed by selection with blasticidin (10 µg/mL) (Invitrogen) for seven days.

Stable Cas9-expressing SNUC4 and SW620 cells were transduced with a lentivirus encoding either control sgRNA (non-targeting sgRNA, GCGAGGTATTCGGCTCCGCG) or sgRNA targeting the RRP9 SNV of SNUC4 (sgRRP9-SNV). After selection with puromycin (SNUC4: 10 µg/mL, SW620: 2 µg/mL, Invitrogen) for 72 h, 1 × 10^3 cells/well were seeded into six-well plates. The medium was replaced every 72 h. After 14 days, the medium was removed, and the cells were stained with 0.05% crystal violet solution in a 6% glutaraldehyde solution for 30 min. The crystal violet solution was then removed, and the cells were washed with H2O and allowed to dry. Colonies comprising more than 50 cells were counted using the ImageJ software24.

Parental or stable Cas9-expressing SNUC4 and SW620 cells were transduced with a lentivirus encoding either control sgRNA (non-targeting sgRNA, GCGAGGTATTCGGCTCCGCG) or sgRRP9-SNV. After selection with puromycin (SNUC4: 10 µg/mL, SW620: 2 µg/mL) for 72 h, 1 × 10^5 cells/well were seeded into six-well plates. After 3 days, cells were trypsinized, stained with trypan blue (Bio-Rad, USA), and counted. All harvested cells were seeded onto 60-mm plates. After 3 days of incubation, cells were trypsinized and counted with trypan blue again. The subculture was repeated once more using 100-mm plates. Growth curves were generated using cell counts obtained during the subculture.

Total RNA was extracted from the SW620 cell line using the RNeasy Plus Mini Kit (QIAGEN, Germany) following the manufacturer's instructions. cDNA was synthesized with PrimeScript RT Master Mix (Takara Korea Biomedical Inc, Korea), and full-length RRP9 cDNA was PCR-amplified with CloneAmp HiFi PCR Premix (Takara Korea Biomedical Inc). The PCR-amplified RRP9 wild-type cDNA was cloned into pcDNA3 Flag HA (#10792, Addgene) using the In-Fusion HD Cloning Kit (Takara Korea Biomedical Inc). The RRP9 sequence was confirmed by Sanger sequencing.

Stable Cas9-expressing SNUC4 cells were transduced with lentivirus encoding either control sgRNA (non-targeting sgRNA, GCGAGGTATTCGGCTCCGCG) or sgRRP9-SNV. After selection with puromycin (10 µg/mL) for 72 h, 3 × 10^3 cells/well were seeded into 96-well plates. After a 24-h incubation, 2 µg of empty or RRP9 plasmids were transfected with FuGene HD (Promega) according to the manufacturer's protocol. Cell viability was assessed after 4 days using Cell Titer Glo (Promega), and relative luminescence units (RLU) were measured using an EnVision plate reader (Perkin-Elmer, USA).

Stable Cas9-expressing NCIH498 and SW620 cells were transduced with a lentivirus encoding either control sgRNA (non-targeting sgRNA, GCGAGGTATTCGGCTCCGCG) or sgRNA targeting the SMG6 SNV of NCIH498 (sgSMG6-SNV). After selection with puromycin (NCIH498: 10 µg/mL, SW620: 2 µg/mL) for 72 h, 3 × 10^3 cells/well were seeded into 96-well plates. After 6 days, cell viability was determined with Cell Titer Glo according to the manufacturer's protocol, and RLU were measured using an EnVision plate reader.

Cells and tissues were harvested, washed with phosphate-buffered saline (PBS), and lysed on ice for 15 min in a radioimmunoprecipitation assay buffer (R0278; Sigma, USA) supplemented with a protease and phosphatase inhibitor cocktail (GenDEPOT, USA). Cell lysates were centrifuged at 4 °C for 10 min at 15,000 rpm. Protein concentrations were determined using the Bradford assay (Bio-Rad). Equal amounts of total protein were separated via sodium dodecyl sulfate gel electrophoresis and transferred to polyvinylidene difluoride membranes (Bio-Rad). The membranes were blocked with 5% skim milk for 1 h at 22 °C and then incubated overnight at 4 °C with a primary antibody against the target protein in a buffer containing 0.1% Tween 20. Subsequently, the membranes were washed with Tween-PBS buffer three times for 10 min each and incubated with a secondary antibody (anti-rabbit IgG or anti-mouse IgG) diluted in a blocking buffer containing 0.1% Tween 20 for 1 h at 22 °C. The membranes were then washed with Tween-PBS three times for 10 min each. The immunoreactive bands were visualized using Pierce enhanced chemiluminescence western blotting substrate (32106; Thermo Fisher Scientific, USA). Mouse monoclonal anti-Cas9 (#14697; Cell Signaling Technology, USA), rabbit polyclonal anti-RRP9 (#ab168845; Abcam, UK), rabbit polyclonal anti-FLAG (DYKDDDDK) (#2368; Cell Signaling Technology), and rabbit monoclonal anti-heat shock protein 90 (HSP90) (#4877; Cell Signaling Technology) antibodies were used at a 1:1000 dilution. Anti-rabbit IgG (#111-035-144; Jackson ImmunoResearch, USA) was used at a 1:5000 dilution, except for anti-FLAG, which was used at a 1:10,000 dilution. Anti-mouse IgG (#115-035-146; Jackson ImmunoResearch) was used at a 1:10,000 dilution.

Genomic DNA was extracted using the QIAamp DNA Mini Kit (QIAGEN) following the manufacturer's instructions. Libraries were prepared with a two-step PCR reaction, in which the first step uses target-specific primers, and the second step uses primers containing unique barcodes and Illumina sequencing adaptor sequences. The primers used here are listed in Supplementary Data 4. PCR reactions were performed with KAPA HiFi HotStart Ready Mix (Roche Molecular Systems, Inc., USA). For the first PCR step, 100 ng of genomic DNA was denatured at 95 °C for 5 min, followed by 30 cycles of (98 °C for 20 s, 61 °C for 15 s, and 72 °C for 15 s), and a final extension at 72 °C for 1 min. Primers with unique barcodes and Illumina sequencing adaptor sequences were added to the PCR product from step 1 for the second PCR reaction, where denaturation at 95 °C for 5 min was followed by 12 cycles of (98 °C for 20 s, 61 °C for 15 s, and 72 °C for 15 s), and a final extension at 72 °C for 1 min. PCR products were verified with 2% agarose gel electrophoresis and extracted using the Zymoclean Gel DNA Recovery Kit (Zymo Research, USA) according to the manufacturer's instructions. The barcoded PCR products were pooled and subjected to paired-end sequencing (2 × 150 bp reads) on an Illumina NovaSeq-6000 instrument (Macrogen, Korea). InDel quantification was conducted using the CRISPResso2 software25 with default parameters.

Genomic DNA was extracted from colorectal cancer cell lines using the QIAamp DNA Mini Kit (QIAGEN) following the manufacturer's instructions. Target regions were PCR-amplified with nTaq (Mg2+ plus) (Enzynomics, Korea) with the following primers: sgRRP9-SNV region (Forward: 5′-TCAAGGCCCTCGTTGATTCC-3′, Reverse: 5′-TTTTTGGGCTTTGTGGCTGC-3′), sgSMG6-SNV region (Forward: 5′-TCTGCATCGAAAGTGACACGA-3′, Reverse: 5′-CTATCAGCCTGGACGACGTTT-3′). PCR products were purified with the PureLink Quick PCR Purification Kit (Invitrogen). Purified PCR product (200 ng) was denatured at 95 °C for 10 min and re-annealed with a temperature ramp of 2 °C per second to 85 °C, followed by a ramp of 1 °C per second to 25 °C. T7E1 enzyme (1 µL; Enzynomics) was added to the heterocomplexed PCR product and incubated at 37 °C for 15 min. Products were electrophoresed on a 2% agarose gel using TAE buffer. Band intensities were measured with ImageJ, and the estimated non-homologous end joining (NHEJ) event was calculated with the following formula:

$$NHEJ\left(\%\right)=100\times \left[1-{\left(1-\text{fraction cleaved}\right)}^{\left(\frac{1}{2}\right)}\right]$$

(2)

where the fraction cleaved is \(\frac{\text{Density of digested products}}{\text{Density of digested products}+\text{undigested parental band}}\).
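As a worked illustration of Eq. (2), the following Python sketch converts band densities into an estimated NHEJ percentage. The function and argument names are hypothetical, and the densities are assumed to be background-corrected intensities in arbitrary units.

```python
import math


def nhej_percent(digested_density, undigested_density):
    """Estimate NHEJ (%) from T7E1 band densities, following Eq. (2).

    digested_density   : summed density of the cleaved bands
    undigested_density : density of the undigested parental band
    """
    fraction_cleaved = digested_density / (digested_density + undigested_density)
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))
```

For instance, if the cleaved bands account for 75% of the total density, the fraction cleaved is 0.75 and the estimate is 100 × (1 − √0.25) = 50%.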

All animal procedures were approved by the Institutional Animal Care and Use Committee of Yonsei University, Seoul, Korea (2021-0106). All methods were performed in accordance with the relevant guidelines and regulations for the care and use of laboratory animals. Six-week-old female BALB/c-nu Slc mice were purchased from Orient Bio (Korea) and SLC Inc. (Japan). The mice were housed in individual ventilation cages equipped with a computerized environmental control system (Tecniplast, Italy). The animal room temperature was maintained at 22 ± 2 °C with a relative humidity of 50 ± 10%. Before the experiments, the animals were acclimated for seven days under a 12-h light-dark cycle.

Stable Cas9-expressing SNUC4 cells were transduced with lentivirus encoding either the control sgRNA or sgRNA targeting the RRP9 SNV in SNUC4 cells. After selection with 10 µg/mL puromycin for 72 h, 3 × 10^6 cells were subcutaneously injected into the left (control sgRNA) or right (sgRNA targeting the RRP9 SNV of SNUC4) flanks of 10 mice. Similarly, stable Cas9-expressing SW620 cells were transduced with lentivirus encoding either the control sgRNA or sgRNA targeting the RRP9 SNV in SNUC4 cells. After selection with 2 µg/mL puromycin for 72 h, 2 × 10^6 cells were subcutaneously injected into the left (control sgRNA) and right (sgRNA targeting the RRP9 SNV of SNUC4) flanks of 10 mice. Mice with no observable tumor growth in the left flank (control sgRNA) were excluded from further analysis.

Tumor sizes were measured using a caliper, and the volume was calculated using the formula: 0.5 × length × width^2. Mice were sacrificed when the largest tumor reached a volume of 1000 mm^3. Each tumor was considered an experimental unit. The sample size was determined to be sufficient to identify statistically significant differences between groups.
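The volume formula and the 1000 mm^3 endpoint can be expressed in a few lines of Python. This is only a sketch: the function names are hypothetical, and the caliper measurements are assumed to be in millimeters.

```python
ENDPOINT_MM3 = 1000.0  # volume of the largest tumor at which mice were sacrificed


def tumor_volume_mm3(length_mm, width_mm):
    """Tumor volume = 0.5 x length x width^2, with both measurements in mm."""
    return 0.5 * length_mm * width_mm ** 2


def endpoint_reached(tumor_measurements):
    """True if the largest tumor among (length, width) pairs reaches the endpoint."""
    return max(tumor_volume_mm3(l, w) for l, w in tumor_measurements) >= ENDPOINT_MM3
```

A tumor measuring 10 mm by 8 mm has a volume of 0.5 × 10 × 64 = 320 mm^3, which is below the endpoint.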

Genomic DNA was extracted from colorectal cancer cell lines using the QIAamp DNA Mini Kit (QIAGEN) following the manufacturer's instructions. Whole-exome capture was performed using the SureSelect Human All Exon V4 51 Mb Kit (Agilent Technologies, USA). The captured DNA was then sequenced on the HiSeq 2500 platform (Illumina, USA), generating a minimum of 98.9 million paired-end sequencing reads of 100 bp per sample.

The Burrows-Wheeler Alignment26 tool was used with the default parameters to align the paired-end reads to the UCSC human reference genome assembly (GRCh37/hg19). An average of 98.3% of the reads were successfully aligned to the human genome. Duplicate reads were removed using the Picard software package. The Genome Analysis Tool Kit (GATK) version 3.446 was used for read quality score recalibration and local realignment to identify short InDels using the HaplotypeCaller27 package. The variants were filtered using the GATK Best Practices quality control filters.

SNVs were identified using Mutect28, specifically the tumor-only option, with default parameters. Variants supported by at least five high-quality reads (Phred-scaled quality score > 30) and detected with at least 20% AF were selected for further analysis. The detected SNVs and InDels were annotated using various databases, including the single nucleotide polymorphism (SNP) database (dbSNP29, build 147), 1000 Genomes Project30 (Phase 3), Korean dbSNP (build 20140512), and somatic mutations in TCGA colon adenocarcinoma (COAD), using the Variant Effect Predictor software31 (version 87). ANNOVAR32 was used to annotate regions of known germline chromosomal segmental duplications and tandem repeats.
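The read-support and AF thresholds could be applied as in the Python sketch below. The function signature is hypothetical, and two details are assumptions not stated in the text: the AF is computed here from all variant-supporting reads over total depth, and the quality score is treated as a single per-read value.

```python
def keep_variant(alt_read_quals, depth, min_reads=5, min_qual=30, min_af=0.20):
    """Keep a variant with >= 5 high-quality supporting reads and AF >= 20%.

    alt_read_quals : Phred-scaled quality of each read supporting the variant
    depth          : total number of reads covering the position
    """
    if depth <= 0:
        return False
    high_quality = sum(1 for q in alt_read_quals if q > min_qual)
    allele_frequency = len(alt_read_quals) / depth
    return high_quality >= min_reads and allele_frequency >= min_af
```

A site with six supporting reads of quality 35 at a depth of 20 would pass (AF 30%), whereas the same six reads at a depth of 50 would fail on AF (12%).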

Several steps were performed to filter variants. Variants corresponding to germline polymorphisms or located in chromosomal segmental duplications or tandem repeats were excluded. The variants were then filtered to include known somatic mutations observed in at least one sample from the TCGA COAD dataset. Additionally, nonsynonymous mutations observed in genes belonging to the Cancer Gene Census, as reported in at least ten samples in the COSMIC9 database (version 87), were included in the analysis.

Total RNA was extracted from colorectal cancer cell lines using the RNeasy Plus Mini Kit (QIAGEN) following the manufacturer's instructions. The TruSeq RNA Sample Prep Kit v2 (Illumina) was used to generate mRNA-focused libraries. Libraries were sequenced on the HiSeq 2500 platform, generating at least 40 million paired-end reads of 100 bp per sample.

The TopHat-Cufflinks33 pipeline was used to align the reads to the reference genome and calculate normalized gene expression values in FPKM. TopHat aligned the reads to the reference genome, and Cufflinks processed the resulting alignments to estimate transcript abundance and calculate FPKM values, which account for exon length and the total number of mapped reads.

R (ver. 4.2.1) (R Foundation, Austria) and the ImageJ software were used to analyze the data.

The figures were generated using the R software, and statistical analyses were performed using RStudio software (version 2022.07.2+576). Specific statistical tests are identified in the figure legends for each experiment.

The study design, animal use and all experimental methods were conducted and reported in accordance with ARRIVE guidelines (https://arriveguidelines.org).
