ECB-FEAT-23138504

From EchinoWiki
Latest revision as of 16:04, 2 December 2024

From Matt Glasenapp:

The ebr1 gene annotation in Spur_5.0 and Echinobase spans 56,133 base pairs. The gene contains more than 40 exons spread across this region.

From previous cDNA sequencing, we know that the S. purpuratus ebr1 mRNA is 12,074 base pairs long (https://www.ncbi.nlm.nih.gov/nuccore/NM_214665.1) and that the mature protein is 3,712 amino acids, which corresponds to 11,136 base pairs of coding sequence (https://www.ncbi.nlm.nih.gov/protein/47551295).

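However, in the Spur_5.0 gff/gtf annotation files, the annotated exons sum to only 9,421 base pairs, and the annotated CDS features sum to only 8,486 base pairs. Compared with the sequenced cDNA, that leaves 2,653 base pairs of transcript (12,074 - 9,421) and 2,650 base pairs of coding sequence (11,136 - 8,486) unaccounted for.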
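The exon and CDS totals above can be recomputed from an annotation file with a few lines of Python. The following is only a minimal sketch, not the procedure used for the numbers in this note: it assumes a tab-delimited GFF/GTF layout with 1-based inclusive coordinates, and the function name `feature_length_totals`, the toy records, and the `gene=toy1` attribute tag are all hypothetical. The real attribute key for ebr1 in the Spur_5.0 files may differ.

```python
from collections import defaultdict


def feature_length_totals(gff_lines, attribute_substring):
    """Sum feature lengths per feature type (exon, CDS, ...) for rows whose
    attributes column contains `attribute_substring`.

    Assumes the standard 9-column GFF/GTF layout with 1-based, inclusive
    start/end coordinates, so each feature is (end - start + 1) bases long.
    The substring match is deliberately crude and is an assumption of this
    sketch; a real script should parse the attributes column properly.
    """
    totals = defaultdict(int)
    for line in gff_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        feature_type = fields[2]
        start, end = int(fields[3]), int(fields[4])
        if attribute_substring in fields[8]:
            totals[feature_type] += end - start + 1
    return dict(totals)


# Hypothetical toy records (not real Spur_5.0 data).
toy_gff = [
    "##gff-version 3",
    "chrToy\tsketch\tgene\t100\t449\t.\t-\t.\tID=gene-toy1;Name=toy1",
    "chrToy\tsketch\texon\t100\t199\t.\t-\t.\tParent=rna-toy1;gene=toy1",
    "chrToy\tsketch\texon\t300\t449\t.\t-\t.\tParent=rna-toy1;gene=toy1",
    "chrToy\tsketch\tCDS\t120\t199\t.\t-\t0\tParent=rna-toy1;gene=toy1",
    "chrToy\tsketch\tCDS\t300\t449\t.\t-\t0\tParent=rna-toy1;gene=toy1",
    "chrToy\tsketch\texon\t900\t999\t.\t+\t.\tParent=rna-other2;gene=other2",
]

toy_totals = feature_length_totals(toy_gff, "gene=toy1")
print(toy_totals)  # {'exon': 250, 'CDS': 230}
```

With a real file, the lines would come from `open(path)` instead of the toy list. Note that if a gene has several annotated transcripts, shared exons appear once per transcript, so the rows should be restricted to a single transcript before summing.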

The Spur_5.0 gene prediction therefore appears to be missing many exons and CDS bases. The missing sequence appears to be at the 5' end of the gene (ebr1 is on the (-) strand). This is evident in the Echinobase CDS Gene Model, which does not begin with a start codon.
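The start-codon check can be illustrated for a (-) strand gene as follows. This is a hedged sketch on a made-up 17-base sequence, not the Echinobase gene model: the helper names `assemble_cds` and `starts_with_start_codon` and the toy coordinates are hypothetical. For a (-) strand gene, the CDS segments are joined in genomic order and then reverse-complemented, so the 5' end of the coding sequence comes from the segment with the highest genomic coordinates.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def assemble_cds(genome_seq, intervals, strand):
    """Join CDS segments into one coding sequence.

    `intervals` are 1-based, inclusive (start, end) pairs on `genome_seq`,
    as in GFF/GTF. For the (-) strand the joined segments are
    reverse-complemented so the result reads 5' to 3' in the sense direction.
    """
    pieces = [genome_seq[start - 1:end] for start, end in sorted(intervals)]
    joined = "".join(pieces)
    return reverse_complement(joined) if strand == "-" else joined


def starts_with_start_codon(cds):
    """True if the coding sequence begins with ATG."""
    return cds[:3].upper() == "ATG"


# Hypothetical toy genome: two (-) strand CDS segments separated by an intron.
toy_genome = "CCTTAGTGTGGCCATCC"
full_model = [(3, 5), (10, 15)]   # complete model, includes the 5'-most segment
truncated_model = [(3, 5)]        # same model with the 5'-most segment missing

full_cds = assemble_cds(toy_genome, full_model, "-")
truncated_cds = assemble_cds(toy_genome, truncated_model, "-")

print(full_cds, starts_with_start_codon(full_cds))            # ATGGCCTAA True
print(truncated_cds, starts_with_start_codon(truncated_cds))  # TAA False
```

In the toy case, dropping the highest-coordinate segment removes the ATG, which mirrors the pattern described above: a (-) strand model that lacks its 5' exons will typically not begin with a start codon.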