Feature comparison

The table below lists the features of Persephone and shows their availability in its two versions, Windows and Web.


Feature Windows Web Notes
Show genetic maps with marker mapping

+

+

Show maps based on sequence (chromosomes and scaffolds)

+

+

Some map sets can contain millions of scaffolds
Show multiple maps on one screen with synteny visualization

+

+

Connections are based on orthologous genes, common markers, or sequence regions
Link genetic and physical maps by common markers

+

+

 
Maps can be shown vertically and horizontally

+

+

 
Sequence is shown for entire chromosomes

+

+

Implemented slightly differently in the Windows and Web versions
Track types      
– sequence

+

+

E.g., the wheat genome is 16 GB; the track makes it possible to see the nucleotides
– gene models

+

+

Typically, tens of thousands of features per track
– markers (in general, a position on a map: SNP markers, repeats, regions of interest, etc.)

+

+

Tested with millions of markers per track
– quantitative tracks (RNA-seq coverage, methylation or conservation levels, etc.)

+

+

Example: the human genome has a conservation track (phyloP100) in which each nucleotide has a value. The Web version can merge several quantitative tracks into one.
– QTLs

+

+

 
– Variation (SNPs and indels)

+

+

A SNP track can have multiple sub-tracks (e.g., one per variety line).
Examples:
Human samples have 80 million SNPs per patient.
3000 rice accessions with 5 million SNPs each.
– Cytobands

+

 
– TBLASTN or BLASTN tracks

+

E.g., tblastn of SwissProt proteins vs. entire chromosomes
Marker details form

+

+

Shows properties and sequences, and lists all locations of the marker on other maps
Gene details form

+

+

 
– basic properties

+

+

 
– spliced, protein, unspliced sequence with color marking

+

+

 
– metrics tab with coordinates and sizes

+

+

 
– transcripts view

+

+

All gene models that overlap with the selected gene are combined in one zoomable view. Coordinates and sizes of exons and introns are shown. Differences in splice pattern are highlighted. Genes with common CDS can be collapsed. Sorting can be done by unspliced sequence length, transcript name or CDS length.
– CDS part only

+

+

 
– spliced alignment is shown in detail

+

see BAM track

Results of spliced alignment (hisat2, MagicBLAST) are shown. The sequence of a transcript (EST) is displayed alongside the genomic sequence, making it possible to see the flanking regions of introns, mismatches, indels, etc.
       
       
Realtime BLAST

+

+

The interface allows selection of multiple genomes or individual chromosomes. Results are shown graphically, allowing analysis of each HSP.
Bookmarks

+

+

Remembers the screen layout; can be shared
Find best matching syntenic chromosome for a given map

+

+

Click a track (markers or genes) and see which tracks in which genomes have the most matching features. A dot plot is shown for each pair of maps under consideration.
Synteny matrix

+

+

Zoomable dot plot for all vs. all maps of two genomes (based on orthologs or markers)
Create new marker tracks by filtering existing tracks

+

in pipeline

Extract markers based on some criteria and create a new track
Create new tracks by filtering existing tracks of ANY type

in pipeline

An interface still needs to be defined for specifying criteria that filter the existing features by name, length, qualifier values, etc.
Search powered by Apache Solr

+

+

Add tracks from external files      
– markers

+

+

 
– VCF

+

 
– BAM

+

Allows loading BAM files from a URL or the local disk. A BAI file is required. When loading from a URL, only the BAI file is transferred and processed; the rest of the data is fetched on demand. The BAI file is analyzed and used to display the density of the read alignment. Tested on files of 200 GB. Works with BAM files without sending the data to the server.
– CRAM

+

 
– bedgraph

+

+

 
– gff

+

 
– QTLs

+

 
– BED files

+

+

The data can be added to the database or as a private track (Windows only). Supports colors.
Users can add genomic sequences

+

Users can drag and drop FASTA files with multiple sequences and create a new genome entry
Expression interface

+

Specialized interface for analysis of expression data. One value per gene per sample is considered.
Load BAM files to the database

+

In addition to the BAM files loaded by end users, BAM data can be added to the database to make it available to all users
Gene models in the annotation track can carry quality marks

+

+

The marks have different colors depending on the quality of the prediction.
       
Export

+

+

Outputs genes, markers, or QTLs with their properties and sequences
– Export of ortholog pairs

+

Export all orthologs for a pair of genomes
Instant genomic sequence comparison (<1 Mbp regions)

+

Show two maps, zoom into a region of 1 Mbp, and create ribbons for identical sequences on the fly by clicking a button that runs BLASTN (results are typically shown in less than 1 second)
Instant genomic sequence comparison (full chromosomes)

+

Use ‘minimap2’ to align entire maps and produce ribbons of synteny.
Genomic sequence extraction

+

+

Sequence motif search

+

 
Inventory of tracks

+

Activate or deactivate tracks by selecting them from a large collection of tracks
Highlight and label genomic regions

+

Select a region of a map and save it for future use, giving the area a label and a specific color.
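
The track-filtering rows in the table mention criteria based on a feature's name, length, or qualifier values. As an illustration only, the sketch below shows one way such criteria could be expressed. The `Feature` and `FilterCriteria` classes and the `filter_track` function are hypothetical names introduced for this example; they are not part of Persephone's interface or data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Feature:
    """Hypothetical stand-in for one feature on a track (not Persephone's data model)."""
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    qualifiers: dict[str, str] = field(default_factory=dict)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


@dataclass
class FilterCriteria:
    """Hypothetical criteria; a field left as None (or empty) is not checked."""
    name_contains: Optional[str] = None
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    qualifiers: dict[str, str] = field(default_factory=dict)

    def matches(self, feature: Feature) -> bool:
        if self.name_contains is not None and self.name_contains not in feature.name:
            return False
        if self.min_length is not None and feature.length < self.min_length:
            return False
        if self.max_length is not None and feature.length > self.max_length:
            return False
        # Every requested qualifier must be present with exactly the given value.
        return all(feature.qualifiers.get(k) == v for k, v in self.qualifiers.items())


def filter_track(features: list[Feature], criteria: FilterCriteria) -> list[Feature]:
    """Return the features that satisfy the criteria, i.e. the content of a new track."""
    return [f for f in features if criteria.matches(f)]


track = [
    Feature("geneA", 100, 1099, {"biotype": "protein_coding"}),
    Feature("geneB", 2000, 2199, {"biotype": "lncRNA"}),
    Feature("markerX", 5000, 5000),
]
criteria = FilterCriteria(min_length=500, qualifiers={"biotype": "protein_coding"})
new_track = filter_track(track, criteria)
print([f.name for f in new_track])  # prints ['geneA']
```

Leaving a criterion unset means it is not applied, so the same structure covers filtering by name only, by length only, or by any combination of the three.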