Features

The table below lists the features of the Persephone application.


Feature | Notes
Show genetic maps with marker mapping | Marker positions are given in centimorgans (cM)
Show maps based on sequence (chromosomes and scaffolds) | Some map sets can contain millions of scaffolds
Show multiple maps on one screen with synteny visualization | Connections are based on orthologous genes, common markers, or regions of sequence similarity
Link genetic and physical maps by common markers | Identical markers located on different maps are automatically linked
Maps can be shown vertically and horizontally | The vertical layout is common for genetic maps. Sequence maps are typically shown in the horizontal orientation
Sequence is shown for entire chromosomes | When the entire map is shown at low zoom, the genomic sequence is represented by a histogram based on the GC content. Zoom in to see individual base pairs. A text window with a sequence view of 2 Mbp is synchronized with the graphics. The text selection is reflected on the map.
Track types |
– sequence | For example, the wheat genome is 16 Gbp; after zooming in, a track can show individual nucleotides
– gene models | Typically, tens of thousands of features per track
– markers (in general, a position on a map: SNP markers, repeats, regions of interest, etc.) | Tested with millions of markers per track
– quantitative tracks (RNA-seq coverage, methylation or conservation levels, etc.) | Example: the human genome has a conservation track (phyloP100) in which each nucleotide has a value. Several quantitative tracks can be merged into one
– QTLs | Each QTL is assigned to a trait and a study
– variation (SNPs and indels) | The variants track can have multiple sub-tracks (e.g., one per sample). Examples: human samples with 80 million SNPs per patient; 3000 rice accessions with 5 million SNPs each.
Marker details form | Shows properties and sequences, and lists all locations of the marker on other maps
Gene details form |
– basic properties | The prediction method, coordinates on the map, etc.
– spliced, unspliced, protein sequence with color decoration | The text selection is synchronized between the tabs
– metrics tab with exon coordinates and sizes | Selection is synchronized with the sequence tabs
– transcripts view | All gene models that overlap with the selected gene are combined in one zoomable view. Coordinates and sizes of exons and introns are shown. Differences in splice patterns are highlighted. Genes with common CDS can be collapsed. Sorting can be done by the unspliced sequence length, transcript name, or CDS length.
Realtime BLAST | The interface allows selection of multiple genomes or individual chromosomes. Results are shown graphically, allowing analysis of each HSP. The raw BLAST output is also available. The BLAST parameters can be customized
Bookmarks | Remember the screen layout and can be shared with other users
Find the best matching syntenic chromosome for a given map | Click a track (markers or genes) and see which tracks in which genomes have the most matching features. A dot plot is shown for each pair of maps under consideration.
Synteny matrix | Zoomable dot plot for all vs. all maps of two genomes (based on orthologs, markers, or syntenic regions)
Create new tracks by filtering existing tracks | Extract the features that meet given criteria into a new track. Features can also be assigned distinct colors according to criteria
Create new tracks by filtering and converting existing tracks | Example: find genes that have ‘cancer’ in their description and show them as a marker track with labels
Import tracks from external files |
– markers | Read from a simple tab-delimited file with 3 columns.
– VCF | In progress
– BAM | Allows loading BAM files from a URL or the local disk. A BAI file is required. When loading from a URL, only the BAI file is transferred and processed; the rest of the data is fetched on demand. The BAI file is analyzed and used to display the density of read alignments. Tested on files of 200 GB. Works with BAM files without sending the data to the server.
– CRAM | Requires a CRAI index file
– bedGraph | Quantitative data, such as RNA-seq coverage plots
– GFF | Gene models or multi-part matches
– BED files | BED tracks can be added to the database or loaded as a private track. Features can have distinct colors
– PAF or ribbon files | The standard PAF files or proprietary ribbon files store syntenic ribbons
– FASTA files | Users can drag and drop FASTA files with multiple sequences and create a new genome entry
Text search powered by Apache Solr | External user data is also indexed
Load BAM files to the database | In addition to the BAM files loaded by end users for themselves, BAM data can be added to the database to make it available to all users
Gene models in the annotation track can carry quality marks | The marks are colored according to the quality of the prediction.
Export | Outputs genes, markers, or QTLs with their properties and sequences for a map segment or an entire map set
– markers and qualifiers |
– gene models and qualifiers |
– bedGraph tracks |
– BLAST search results |
– genomic sequence |
– text search results |
– ortholog pairs | Export all orthologs for a pair of genomes. In pipeline
Instant genomic sequence comparison (<1 Mbp regions) | Show two maps, zoom in to a region of less than 1 Mbp, and create ribbons for identical sequences on the fly by clicking a button that runs BLASTN (results are typically shown in less than 1 second)
Instant genomic sequence comparison (full chromosomes) | Uses minimap2 to align entire maps and produce synteny ribbons.
DNA motif search | Motifs with wildcards can be searched for across the entire sequence. The sequence view and graphics are synchronized
Inventory of tracks | Activate/deactivate tracks by selecting them from a large collection of tracks
Highlight and label genomic regions | Select a region of a map and save it for the future, giving the area a label and a specific color.
Statistics of maps of a genome or gene models in a track | The gene structure statistics can be shown for one track or all tracks of a map set
Link in and link out | The object properties can be shown as URLs to external resources. The web application accepts a set of parameters in the URL to navigate directly to a region or an object of interest
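
The GC-content histogram used for the low-zoom sequence view can be illustrated with a short sketch. This is not Persephone's code: the function name `gc_histogram` and the fixed, non-overlapping window are assumptions chosen for illustration only.

```python
def gc_histogram(sequence: str, window: int) -> list[float]:
    """Return the GC fraction of each consecutive, non-overlapping window.

    Hypothetical helper for illustration; the real application may bin
    the sequence differently.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = sequence.upper()
    fractions = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]  # the last chunk may be shorter
        gc = chunk.count("G") + chunk.count("C")
        # Ambiguous bases such as N count toward the chunk length here.
        fractions.append(gc / len(chunk))
    return fractions


if __name__ == "__main__":
    print(gc_histogram("GGCCATATGC", 4))
```

Each value maps to one histogram bar; a larger window gives a coarser view that suits whole-chromosome zoom levels.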
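
The idea of linking maps by common markers and finding the best matching map can be sketched as a count of shared marker names. The table does not describe how Persephone scores matches, so the simple shared-name count below, the function name `rank_maps_by_shared_markers`, and the `(marker, map)` input shape are all assumptions.

```python
from collections import Counter


def rank_maps_by_shared_markers(query_map, placements):
    """Rank other maps by how many marker names they share with query_map.

    placements: iterable of (marker_name, map_name) pairs (hypothetical
    input shape). Returns a list of (map_name, shared_count), best first.
    """
    markers_by_map = {}
    for marker, map_name in placements:
        markers_by_map.setdefault(map_name, set()).add(marker)

    query_markers = markers_by_map.get(query_map, set())
    counts = Counter()
    for map_name, markers in markers_by_map.items():
        if map_name == query_map:
            continue
        shared = len(query_markers & markers)  # identical names on both maps
        if shared:
            counts[map_name] = shared
    return counts.most_common()


if __name__ == "__main__":
    example = [
        ("m1", "genetic_1"), ("m2", "genetic_1"), ("m3", "genetic_1"),
        ("m1", "chr1"), ("m2", "chr1"),
        ("m3", "chr2"), ("m4", "chr2"),
    ]
    print(rank_maps_by_shared_markers("genetic_1", example))
```

The same per-pair sets of shared markers would also supply the points for a dot plot between two maps.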
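
A wildcard motif search of the kind listed under "DNA motif search" can be approximated by translating the motif into a regular expression. The table does not say which wildcard syntax Persephone accepts; the sketch assumes IUPAC nucleotide codes, searches the forward strand only, and uses a hypothetical function name `find_motif`.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def find_motif(sequence: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every motif occurrence.

    Assumes IUPAC wildcards; raises KeyError on any other motif character.
    """
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead with a capture group reports overlapping matches too.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    for start, hit in find_motif("ATGCATGA", "ATGN"):
        print(start, hit)
```

The returned start positions are what a viewer would need to highlight hits in both the sequence text and the graphical map.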