Genome Assemblies and CRISPR/Cas in Echinoderms: Difference between pages

From EchinoWiki
(Difference between pages)
imported>Echinobase
 
imported>Ctelmer
=Echinoderm Genome Assemblies by Species=


__TOC__
Welcome to the Echinobase CRISPR/Cas resource. A brief literature and method review is followed by tables of gRNA spacer sequences.


Updated December 2020


== '''''Strongylocentrotus purpuratus''''' ==


=== '''Assembly_3.1 (Spur_3.1)'''===
'''''S. purpuratus'' genome editing to create insertions and deletions'''
=== '''Assembly 2.6(Spur 2.6)'''===
=== '''Assembly_2.5(Spur_2.5)'''===
=== '''Assembly_2.1(Spur_2.1)'''===
=== '''Assembly_0.5(Spur_0.5)'''===


== '''''Patiria miniata''''' ==
To date, CRISPR/Cas9 has been used to introduce insertion and deletion mutations (indels) into ''S. purpuratus nodall'' ([https://www.echinobase.org/literature/article.do?method=display&articleId=44372 Lin and Su 2016]), ''polyketide synthase 1'', ''gcml'' ([https://www.echinobase.org/literature/article.do?method=display&articleId=48855 Oulhen and Wessel 2016]), ''nanos2l'' ([https://www.echinobase.org/literature/article.do?method=display&articleId=45207 Oulhen et al. 2017]) and ''dll1'' (''delta'') ([https://www.echinobase.org/literature/article.do?method=display&articleId=45720 Mellott et al. 2017]) genes. Attempts to mutate ''foxy'' ([https://www.echinobase.org/literature/article.do?method=display&articleId=47178 Oulhen et al. 2019]) were unsuccessful. A number of different methods were used for gRNA synthesis (several using pT7-gRNA), and NLS-SpCas9-NLS was supplied from either pCS2-nCas9n (zebrafish codon-optimized) or pCS2-3xFLAG-NLS-SpCas9-NLS (human codon-optimized, with a 3xFLAG tag) in these studies (see below for details). The gRNAs and mRNAs were microinjected into fertilized eggs.


=== '''V2.0 Assembly'''===


We sought to improve the Patiria miniata genome assembly with additional PacBio sequences. We generated a new PacBio read dataset at the Duke University Sequencing Center using DNA from our reference individual. The read dataset contains 2 million reads and 15.8 billion bp. The read N50 is 10.4 kb. We used PBJelly2 to combine the PacBio reads with the previously assembled contigs. The result was fewer and larger contigs, with only a small reduction in the number of scaffolds (Table). The P. miniata Gene v2.0 set was generated using the MAKER2 pipeline from the v2.0 genome assembly.
'''Single nucleotide edits'''


Additional studies fused a deaminase to two mutants of SpCas9 to achieve targeted single-nucleotide edits in ''[https://www.echinobase.org/gene/showgene.do?method=display&geneId=23094247& alx1]'', ''[https://www.echinobase.org/gene/showgene.do?method=display&geneId=23083023 segment polarity protein dishevelled homolog DVL-3]'' (''Dsh'') and ''[https://www.echinobase.org/gene/showgene.do?method=display&geneId=23139056 polyketide synthase 1]'' (''Pks1'') that produce STOP codons ([https://www.echinobase.org/literature/article.do?method=display&articleId=45725 Shevidi et al. 2017]).




{| class="wikitable"
'''Reviews'''
!
! Pm v1.0
! Pm v2.0
|-
| Scaffold number
| 60,183
| 57,698
|-
| Scaffold N50
| 52,6141
| 76,341
|-
| Contig number
| 179,756
| 131,779
|-
| Contig N50
| 9,466
| 18,676
|}
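
The N50 values in the table can be read as follows: when sequences are sorted from longest to shortest, N50 is the length of the sequence at which the running total first reaches half of the total assembled length. The sketch below is a minimal illustration of that definition, not the tool used to produce the table; the function name <code>n50</code> and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example (hypothetical contig lengths, not Pm v1.0 or v2.0 data).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

A higher N50 with fewer contigs, as in the Pm v2.0 column, indicates that more of the assembly sits in long contiguous sequences.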


=== '''V1.0 Assembly'''===
Reviews of the methods are available ([https://www.echinobase.org/literature/article.do?method=display&articleId=45567 Cui et al. 2017]; [https://www.echinobase.org/literature/article.do?method=display&articleId=47096 Lin et al. 2019]).
<u>What's New</u>


Pmin_1.0 is the latest (as of Apr 11, 2012) assembly of the genome of Patiria miniata. The assembly tools CABOG (Celera Assembler), Newbler, ATLAS-Link, and ATLAS-GapFill were used to assemble a combination of 454 reads (fragment and 2.5 kb insert paired ends; ~15x coverage) and Illumina reads (300 bp insert and 2.5 kb insert paired ends; ~70x coverage).


<u>Introduction</u>
'''Editing other echinoderm species'''


This information is for the first release (Pmin_1.0) of the draft genome sequence of Patiria miniata. This is a draft sequence and may contain errors, so users should exercise caution. Typical errors in draft genome sequences include misassemblies of repeated sequences, collapses of repeated regions, and unmerged overlaps (e.g. due to polymorphisms) creating artificial duplications.
Editing technology has also been used in ''Hemicentrotus pulcherrimus'' ([https://www.echinobase.org/literature/article.do?method=display&articleId=47348 Liu et al. 2019]; [https://www.echinobase.org/literature/article.do?method=display&articleId=48597 Wessel et al. 2020]) and ''Temnopleurus reevesii'' ([https://www.echinobase.org/literature/article.do?method=display&articleId=48853 Yaguchi et al. 2020]).


With the goal of resolving the polymorphism issues in the data while maintaining sequence continuity, the Pmin_1.0 assembly was generated in the following steps:


1) 454 reads were assembled by CABOG using settings less stringent than the default (unitigger=bog utgErrorRate=0.03 ovlErrorRate=0.08 cnsErrorRate=0.08 cgwErrorRate=0.14 doExtendClearRanges=0)
'''Design overview'''


2) Both contig and degenerate sequences from the previous step were chopped into fake reads with ~11x coverage (500 bp long; 460 bp overlap; 80 bp minimal length) for contigs and 8x coverage (450 bp long; 400 bp overlap; 80 bp minimal length) for degenerates. The fake reads were then assembled by Newbler with the -large option.
CRISPR systems in nature are composed of the Cas9 nuclease and two RNAs: the CRISPR RNA (crRNA), which binds to a complementary DNA sequence, and the trans-activating crRNA (tracrRNA), which pairs with the crRNA and binds to a specific Cas9 protein. For ease of use, the crRNA and tracrRNA have been combined into a single guide RNA (sgRNA) molecule for use with the ''Streptococcus pyogenes'' Cas9 (SpCas9). The sgRNA has the target gene '''spacer''' sequence and the '''scaffold''' sequence that interacts with the Cas9 protein. The target-gene-specific spacer sequence can be designed using online tools such as CRISPRscan. Briefly, the software scans for the NGG protospacer adjacent motif (PAM) sequence, then evaluates the 20 nucleotides immediately 5' of the PAM site for their suitability as a spacer sequence. Favorable spacer sequences are more than 50% GC, 20 nucleotides in length and do not have "off-target" binding sites. Additionally, if T7 RNA polymerase is used to produce the sgRNA, a '''5' GG''' at the start of the spacer should be considered: editing occurred with a frequency of 80% with GG-, 75% with NG-, 60% with GN- and 37.5% with NN- ([https://www.echinobase.org/literature/article.do?method=display&articleId=48856 Thomas et al. 2014]).


3) Both 454 and Illumina paired-end reads were mapped to the contigs from the previous step. We used BLAT to map the 454 data and bwa (aln+samse) to map the Illumina data, both with the default options. Based on the mapping locations of the paired ends, contigs were then ordered and oriented into scaffolds using ATLAS-Link.


4) ATLAS-GapFill was then used to assemble the reads locally in an attempt to fill the gaps among the contigs within the scaffolds. This final step produced 770.5 Mb of sequence with a contig N50 size of 9.5 kb and a scaffold N50 size of 50.3 kb.
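
The "fake read" chopping in step 2 can be pictured as a sliding window over each assembled sequence. The sketch below is one possible reading of the stated parameters (read length, overlap, minimal length), not the script actually used for Pmin_1.0. The function name <code>chop_into_fake_reads</code> and the handling of the sequence tail (shorter end pieces are kept until they fall below the minimal length) are assumptions.

```python
def chop_into_fake_reads(seq, read_len=500, overlap=460, min_len=80):
    """Cut a sequence into overlapping pseudo-reads.

    Consecutive reads start (read_len - overlap) bases apart. Pieces at
    the end of the sequence may be shorter than read_len; they are kept
    as long as they reach min_len (an assumed interpretation).
    """
    if overlap >= read_len:
        raise ValueError("overlap must be smaller than read_len")
    step = read_len - overlap
    reads = []
    for start in range(0, len(seq), step):
        piece = seq[start:start + read_len]
        if len(piece) < min_len:
            break  # every later piece would be shorter still
        reads.append(piece)
    return reads


# Toy 1,000 bp sequence chopped with the contig settings from step 2.
toy_contig = "ACGT" * 250
fake_reads = chop_into_fake_reads(toy_contig)
print(len(fake_reads), len(fake_reads[0]), len(fake_reads[-1]))
```

With the degenerate-sequence settings the call would be <code>chop_into_fake_reads(seq, read_len=450, overlap=400)</code>.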
'''Method overview'''


<u>Conditions for use</u>
Published methods have used microinjection of RNA into embryos.


These data are made available before scientific publication with the following understanding:
The capped mRNA for ''Streptococcus pyogenes'' Cas9 with nuclear localization signals was produced using either linearized pCS2-nCas9n or pCS2-3XFLAG-NLS-SpCas9-NLS as the template for the MEGAscript SP6 Transcription Kit or the mMESSAGE mMACHINE SP6 Transcription Kit. The RNA was then purified.


*The data may be freely downloaded, used in analyses, and repackaged in databases.
To make the gRNAs, several approaches have been used. The pT7-gRNA vector was designed so that the gene-specific spacer/target sequence can be cloned into BsmBI restriction sites. The vector contains the T7 promoter and the gRNA scaffold, followed by a restriction site for linearization prior to RNA production. More recently, the pT7-gRNA plasmid has been used as the template for a primer containing the T7 promoter, spacer sequence and an overlap sequence to prime the PCR and add the scaffold. This overlap approach has also been used with a synthesized scaffold oligo for PCR (e.g. 5' AAAAGCACCG ACTCGGTGCC ACTTTTTCAA GTTGATAACG GACTAGCCTT ATTTTAACTT GCTATTTCTA GCTCTAAAAC 3' where the overlap sequence is underlined). The sgRNA is then produced using the MEGAshortscript T7 Transcription Kit and the RNA is purified.


*Users are free to use the data in scientific papers analyzing particular genes and regions if the providers of this data (Baylor College of Medicine Human Genome Sequencing Center) are properly acknowledged. Please cite the BCM-HGSC web site or publications from BCM-HGSC referring to the genome sequence.
For microinjection, Cas9 mRNA at 500-1000 ng/µl and sgRNA at 150-400 ng/µl are mixed (literature varies). The NLS-Cas9-NLS protein is approximately 4.4X the mass of gRNAs, so sgRNAs are in excess. If 50 pl is injected, this is on the order of 10^7 molecules of Cas9 mRNA and 10^8 molecules of sgRNA.
 
*The BCM-HGSC plans to publish the assembly and genomic annotation of the dataset, including large-scale identification of regions of evolutionary conservation and other features.
 
*This is in accordance with the understandings reached at the Fort Lauderdale meeting on Community Resource Projects and with the resulting NHGRI policy statement (http://www.genome.gov/page.cfm?pageID=10506537).
 
*Any redistribution of the data should carry this notice.
 
== '''''Lytechinus variegatus''''' ==
 
=== '''Assembly LvPtE5C'''===
=== '''Assembly LvMSCB'''===
=== '''Assembly 2.2 (Lvar_2.2)'''===
=== '''Assembly 0.4 (Lvar_0.4)'''===

Revision as of 11:14, 1 December 2020

Welcome to the Echinobase CRISPR/Cas resource. A brief literature and method review is followed by tables of gRNA spacer sequences.

Updated December 2020


S. purpuratus genome editing to create insertions and deletions

To date, CRISPR/Cas9 has been used to introduce insertion and deletion mutations (indels) into S. purpuratus nodall (Lin and Su 2016), polyketide synthase 1, gcml (Oulhen and Wessel 2016), nanos2l (Oulhen et al. 2017) and dll1 (delta) (Mellott et al. 2017) genes. Attempts to mutate foxy (Oulhen et al. 2019) were unsuccessful. A number of different methods were used for gRNA synthesis (several using pT7-gRNA), and NLS-SpCas9-NLS was supplied from either pCS2-nCas9n (zebrafish codon-optimized) or pCS2-3xFLAG-NLS-SpCas9-NLS (human codon-optimized, with a 3xFLAG tag) in these studies (see below for details). The gRNAs and mRNAs were microinjected into fertilized eggs.


Single nucleotide edits

Additional studies fused a deaminase to two mutants of SpCas9 to achieve targeted single-nucleotide edits in alx1, segment polarity protein dishevelled homolog DVL-3 (Dsh) and polyketide synthase 1 (Pks1) that produce STOP codons (Shevidi et al. 2017).


Reviews

Reviews of the methods are available (Cui et al. 2017; Lin et al. 2019).


Editing other echinoderm species

Editing technology has also been used in Hemicentrotus pulcherrimus (Liu et al. 2019; Wessel et al. 2020) and Temnopleurus reevesii (Yaguchi et al. 2020).


Design overview

CRISPR systems in nature are composed of the Cas9 nuclease and two RNAs: the CRISPR RNA (crRNA), which binds to a complementary DNA sequence, and the trans-activating crRNA (tracrRNA), which pairs with the crRNA and binds to a specific Cas9 protein. For ease of use, the crRNA and tracrRNA have been combined into a single guide RNA (sgRNA) molecule for use with the Streptococcus pyogenes Cas9 (SpCas9). The sgRNA has the target gene spacer sequence and the scaffold sequence that interacts with the Cas9 protein. The target-gene-specific spacer sequence can be designed using online tools such as CRISPRscan. Briefly, the software scans for the NGG protospacer adjacent motif (PAM) sequence, then evaluates the 20 nucleotides immediately 5' of the PAM site for their suitability as a spacer sequence. Favorable spacer sequences are more than 50% GC, 20 nucleotides in length and do not have "off-target" binding sites. Additionally, if T7 RNA polymerase is used to produce the sgRNA, a 5' GG at the start of the spacer should be considered: editing occurred with a frequency of 80% with GG-, 75% with NG-, 60% with GN- and 37.5% with NN- (Thomas et al. 2014).
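
The scanning step described above can be sketched in a few lines of Python. This is a minimal illustration only and is not how CRISPRscan is implemented: it searches a single strand, does not check for off-target sites, and the function names find_spacers and classify_5prime are hypothetical. The reading of the 5' classes (NG- as "G at position 2 only", GN- as "G at position 1 only") is also an assumption.

```python
import re


def classify_5prime(spacer):
    """Label the first two spacer bases using the GG-/NG-/GN-/NN- classes."""
    first, second = spacer[0], spacer[1]
    if first == "G" and second == "G":
        return "GG-"
    if second == "G":
        return "NG-"  # assumed meaning: G only at position 2
    if first == "G":
        return "GN-"  # assumed meaning: G only at position 1
    return "NN-"


def find_spacers(seq, spacer_len=20):
    """List candidate spacers: the 20 nt immediately 5' of each NGG PAM.

    Only the strand given in `seq` is scanned; a real design would also
    scan the reverse complement and screen for off-target sites.
    """
    seq = seq.upper()
    candidates = []
    # Zero-width lookahead so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough sequence upstream of the PAM
        spacer = seq[pam_start - spacer_len:pam_start]
        if set(spacer) - set("ACGT"):
            continue  # skip spacers containing ambiguous bases
        gc = (spacer.count("G") + spacer.count("C")) / spacer_len
        candidates.append({
            "spacer": spacer,
            "pam": seq[pam_start:pam_start + 3],
            "gc_fraction": gc,
            "five_prime": classify_5prime(spacer),
        })
    return candidates


# Toy target (hypothetical sequence, not an echinoderm gene).
hits = find_spacers("GGACTGACTGACTGACTGACTGG")
for hit in hits:
    print(hit)
```

Candidates could then be filtered on gc_fraction > 0.5 and ranked by the five_prime class, following the criteria listed above.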


Method overview

Published methods have used microinjection of RNA into embryos.

The capped mRNA for Streptococcus pyogenes Cas9 with nuclear localization signals was produced using either linearized pCS2-nCas9n or pCS2-3XFLAG-NLS-SpCas9-NLS as the template for the MEGAscript SP6 Transcription Kit or the mMESSAGE mMACHINE SP6 Transcription Kit. The RNA was then purified.

To make the gRNAs, several approaches have been used. The pT7-gRNA vector was designed so that the gene-specific spacer/target sequence can be cloned into BsmBI restriction sites. The vector contains the T7 promoter and the gRNA scaffold, followed by a restriction site for linearization prior to RNA production. More recently, the pT7-gRNA plasmid has been used as the template for a primer containing the T7 promoter, spacer sequence and an overlap sequence to prime the PCR and add the scaffold. This overlap approach has also been used with a synthesized scaffold oligo for PCR (e.g. 5' AAAAGCACCG ACTCGGTGCC ACTTTTTCAA GTTGATAACG GACTAGCCTT ATTTTAACTT GCTATTTCTA GCTCTAAAAC 3' where the overlap sequence is underlined). The sgRNA is then produced using the MEGAshortscript T7 Transcription Kit and the RNA is purified.
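
As a rough sketch of the overlap approach, the gene-specific primer can be assembled from a T7 promoter, the spacer and a stretch complementary to the 3' end of the scaffold oligo. Two things in the example are assumptions not stated in the text: the T7 promoter sequence used (the common minimal promoter, with the spacer supplying the initiating Gs) and the overlap length of 20 nt taken from the 3' end of the scaffold oligo. The function names are hypothetical.

```python
# Scaffold oligo exactly as given in the text (spaces removed).
SCAFFOLD_OLIGO = (
    "AAAAGCACCG ACTCGGTGCC ACTTTTTCAA GTTGATAACG "
    "GACTAGCCTT ATTTTAACTT GCTATTTCTA GCTCTAAAAC"
).replace(" ", "")

# Assumption: minimal T7 promoter; the spacer's own 5' GG starts the transcript.
T7_PROMOTER = "TAATACGACTCACTATA"

# Assumption: the primer anneals to the last 20 nt of the scaffold oligo.
OVERLAP_LEN = 20

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gene_specific_primer(spacer):
    """T7 promoter + spacer + overlap that anneals to the scaffold oligo."""
    overlap = revcomp(SCAFFOLD_OLIGO[-OVERLAP_LEN:])
    return T7_PROMOTER + spacer.upper() + overlap


def sgrna_template(spacer):
    """Top strand of the filled-in PCR product used for T7 transcription."""
    return T7_PROMOTER + spacer.upper() + revcomp(SCAFFOLD_OLIGO)


# Toy spacer (hypothetical, not a published echinoderm gRNA).
primer = gene_specific_primer("GGACTGACTGACTGACTGAC")
print(primer)
print(len(sgrna_template("GGACTGACTGACTGACTGAC")))
```

The primer and the scaffold oligo share only the overlap, so extension from both 3' ends during PCR yields the full template.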

For microinjection, Cas9 mRNA at 500-1000 ng/µl and sgRNA at 150-400 ng/µl are mixed (literature varies). The NLS-Cas9-NLS protein is approximately 4.4X the mass of gRNAs, so sgRNAs are in excess. If 50 pl is injected, this is on the order of 10^7 molecules of Cas9 mRNA and 10^8 molecules of sgRNA.
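
The order-of-magnitude molecule counts can be checked with a short calculation. The RNA lengths used here are assumptions (about 4,500 nt for the Cas9 mRNA and about 100 nt for the sgRNA), as is the common approximation of 320.5 g/mol per nucleotide plus 159 g/mol for single-stranded RNA; the function name rna_molecules is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def rna_molecules(conc_ng_per_ul, volume_pl, length_nt):
    """Estimate the number of RNA molecules in an injected volume.

    Uses the approximation MW = length * 320.5 + 159 g/mol for
    single-stranded RNA (an assumption, not a value from the text).
    """
    # ng/ul * pl = 1e-9 g/ul * 1e-6 ul = 1e-15 g per unit product
    mass_g = conc_ng_per_ul * volume_pl * 1e-15
    mol_weight = length_nt * 320.5 + 159.0
    return mass_g / mol_weight * AVOGADRO


INJECTION_PL = 50        # injected volume from the text
CAS9_MRNA_NT = 4500      # assumed transcript length
SGRNA_NT = 100           # assumed sgRNA length

cas9_low = rna_molecules(500, INJECTION_PL, CAS9_MRNA_NT)
sgrna_low = rna_molecules(150, INJECTION_PL, SGRNA_NT)
print(f"Cas9 mRNA: {cas9_low:.1e} molecules")
print(f"sgRNA:     {sgrna_low:.1e} molecules")
```

Under these assumptions the low end of each concentration range gives roughly 10^7 Cas9 mRNA molecules and 10^8 sgRNA molecules per 50 pl, consistent with the figures above.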