Welcome to Persephone

The Windows version of Persephone has many useful features; here are some highlights to get you started.

Zooming in/out

Persephone easily switches the view from showing an entire chromosome to showing individual bases, revealing more detail as the zoom level increases. That is why it is important to master zooming, the workhorse of the application.

Scaling the maps can be done in several ways.

  1. Resize an individual map.
    (a) – Roll the mouse wheel while the cursor is over a map, similar to Google Maps. Hold SHIFT while rolling the wheel to zoom in smaller increments. Note that the map grows around the location of the mouse cursor.
    (b) – Define a region of the map by dragging the mouse while holding SHIFT. Once the mouse button is released, the program zooms in so that the selected area smoothly grows to the size of the screen.
    (c) – Double-clicking a map increases the zoom by a certain factor. SHIFT-click resets the zoom and shows the entire map.
  2. Resize all maps at once.
    Put the mouse cursor outside of any map and roll the wheel. This is useful if you have more than one map and you want to concentrate on a particular region that shows good alignment.
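
To make the "growing around the cursor" behavior in 1(a) concrete, here is a minimal Python sketch of cursor-anchored zooming. It is only an illustration of the idea, not Persephone's actual code; the function name zoom_about_cursor and the zoom factor are assumptions made for this example.

```python
def zoom_about_cursor(view_start, view_end, cursor_pos, factor):
    """Return a new (start, end) window scaled by `factor` around cursor_pos.

    factor > 1 zooms in, factor < 1 zooms out. The coordinate under the
    cursor keeps the same relative position inside the window.
    All names here are hypothetical; this is not Persephone's API.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Shrink (or grow) the distance from the cursor to each window edge.
    new_start = cursor_pos - (cursor_pos - view_start) / factor
    new_end = cursor_pos + (view_end - cursor_pos) / factor
    return new_start, new_end


if __name__ == "__main__":
    print(zoom_about_cursor(0, 1000, 250, 2))
```

With a window of 0 to 1000 and the cursor at 250, a factor of 2 gives a window of 125 to 625, and position 250 stays a quarter of the way along the map.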

Physical and genetic maps are aligned on the same screen.

You can visualize entire chromosomes and display several maps on one screen. Persephone can show genetic and physical (sequence-based) maps. When maps are added to the main stage, Persephone arranges them vertically. This orientation is typical for genetic maps and facilitates an overview of the chromosomes and their syntenic relations. Note that you can also display a portion of a physical map in a horizontal view, which is more traditional for sequences (described later).

The maps are grouped into map sets, which usually correspond to a particular version of a genome. The map sets are organized into a tree shown on the left.

Align genetic and physical maps of Zea mays.

Find the map set called Maize B73 RefGen_v1 in the tree on the left. Select a map called Chr.4 from the grid below. Click the checkbox in the selected line. This will fix the map, so that it stays on screen when another map is added.

Find the map set Maize IBMn 2008 and select the map chrom 4. Persephone will automatically link the maps by connecting common markers. Note that the distance between markers within the genetic and physical maps can differ significantly.

Fragment of human chromosome with multiple tracks

The maps are shown in the form of movable “plates” with tracks of different types.

The plates can be scaled with the mouse wheel. You can move them around by dragging with the left mouse button. Rolling the mouse wheel outside of any map, on an empty background, scales all maps together. Dragging the mouse while holding Shift defines a region of the map to zoom into.

The tracks can contain different types of data:

  • gene models with CDS information;
  • markers and other mapped features;
  • quantitative tracks (RNA-seq, methylation, etc.);
  • diversity data (SNPs);
  • expression values;
  • QTLs;
  • cytobands;
  • BLAST hits, etc.

 

Turn tracks on/off.

 

Define default tracks

Open any map of the map set Homo sapiens/Human GRCh37.p13. It will be shown with several tracks visible by default. Move the mouse over the set of LED-like bubbles in the track panel on the left.

The list of available tracks will automatically expand on mouse-over. A click on a track name will turn the track on or off. Alternatively, you can control the track visibility via the context menu, opened by right-clicking a track. Note that you can also define which tracks appear by default. To do this, right-click the map set in the tree and select “Define default tracks”.

Viewing annotated sequence in horizontal view

To switch to the horizontal view, which is more traditional for sequences, select a portion of the physical map by dragging the mouse over the map while holding the Ctrl key. Click the selection in the ruler track to open the horizontal view. Alternatively, you can use the Horizontal view item from the top menu and move the mouse over the map. In this mode, a highlighted region of 1 Mbp will follow the mouse until you click the mouse button again.

In the sequence view, the shortcuts are the same as in the vertical view: Shift-drag the mouse to select the region to zoom in, Ctrl-drag the mouse to select a sequence fragment, and use the mouse wheel to zoom in/out. The selections in the graphical part are synchronized with the selections in the text.

When the mouse is positioned at the left edge of the screen, additional track controls are displayed.

Now you can change the display mode of tracks (blue buttons), remove tracks (red button), re-order them by dragging the handle, or resize some of the tracks by moving their bottom edge. You can also collapse the tracks if a “summary” view is enough for you. Find the little ‘-’ or ‘+’ button on the left to collapse multiple genes into a one-level track. All features that have been cut out will produce a shadow with a density reflecting the number of hidden elements.

Use the Find Motif text box to search for sequence motifs (IUPAC codes allowed). The results of the search will be shown in the top graphics and as highlights in the text.
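
If you are not familiar with IUPAC nucleotide codes, the following Python sketch shows how a motif written with them can be expanded and matched against a sequence. It is an illustration only, not Persephone's implementation: the function name find_motif is hypothetical, and the sketch scans the forward strand only, which may differ from what the Find Motif box does.

```python
import re

# Standard IUPAC nucleotide codes expanded to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def find_motif(motif, sequence):
    """Return 0-based start positions of `motif` in `sequence`.

    Hypothetical helper: forward strand only, case-insensitive.
    """
    pattern = "".join(IUPAC[code] for code in motif.upper())
    # A lookahead lets overlapping occurrences be reported as well.
    lookahead = re.compile("(?=" + pattern + ")")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_motif("TATAWA", "GGTATAAAGGTATATACC"))
```

Here W stands for A or T, so the motif TATAWA matches both TATAAA and TATATA in the example sequence.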

Genomic sequence fragment with annotation in horizontal orientation

Displaying detailed information

A click on a gene, marker or other object displays detailed information. It includes associated sequences, orthologs, all locations of the feature and other properties. Various types of gene sequences are generated on the fly from genomic sequence and available exon and CDS coordinates.

For maps, map sets or tracks, the properties are available via right-click: see the Properties menu item.

Results of TBLASTN search

BLAST

Real-time BLAST provides a quick similarity search for a protein or genomic region. Interactive graphical output of BLAST results facilitates analysis of the matches. Click the Alignment tab to see which parts of the query sequence produce high-scoring segment pairs (HSPs). If you prefer looking at long text strings, you can also view the raw BLAST output.

Run BLAST

Open the Arabidopsis thaliana Chr.1 map and zoom to any gene in a CDS track. The Gene Details form will show three tabs with sequences. Let’s take a protein sequence. Each sequence tab has a Run BLAST button, a shortcut that transfers the sequence to the BLAST form. Click this button. Choose TBLASTN as the program and find Arabidopsis thaliana in the list of available map sets. (We know for sure that the selected protein was generated from this map set, so we expect to see a good match.) Note that you can control the BLAST command-line parameters by checking the “Override arguments” check box.

Click the Run button to start the search, which usually takes a few seconds.

The results of the search are presented in graphical form. If you would like to return to these results later, use the Save Results button at the bottom of the form.

RNA-seq values for cold treatment experiment on Arabidopsis thaliana

Visualize quantitative tracks

Numerical values plotted vs. genomic coordinates are shown as quantitative tracks. Persephone can show charts at single-nucleotide resolution. RNA-seq data is a good example of values shown in quantitative tracks.

The values in the quantitative tracks can be normalized based either on the maximum value across the entire chromosome or on the values in the visible part of the track. To choose between the two, see the check box “Dynamically scale BedGraph values to viewport” on the Tools/Settings/General display tab.
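
The difference between the two normalization modes can be illustrated with a short Python sketch. This is a simplified model under stated assumptions, not Persephone's code: the function name scale_track is hypothetical, values are assumed to be non-negative with one value per position, and scaling is assumed to be a plain division by the maximum.

```python
def scale_track(values, view_start, view_end, dynamic=True):
    """Scale the visible slice of a quantitative track to the range 0..1.

    dynamic=True  -> divide by the maximum of the visible part (viewport).
    dynamic=False -> divide by the maximum over the whole chromosome.
    Hypothetical helper for illustration only.
    """
    visible = values[view_start:view_end]
    if not visible:
        return []
    reference = max(visible) if dynamic else max(values)
    if reference == 0:
        return [0.0 for _ in visible]
    return [v / reference for v in visible]


if __name__ == "__main__":
    coverage = [1, 2, 4, 100, 3]
    print(scale_track(coverage, 0, 3, dynamic=True))
    print(scale_track(coverage, 0, 3, dynamic=False))
```

In this example a single peak of 100 elsewhere on the chromosome flattens the visible values to a few percent of the track height, while scaling to the viewport lets the visible values fill the track.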

Importing an Excel file

You can display your tracks without loading them into the database.

Persephone will recognize Excel files (.xlsx, .csv) with mapped features.

Quantitative tracks can be imported as BedGraph files.

You can also visualize SNPs from VCF files.

The import procedure is initiated by dragging & dropping the files onto the main Persephone screen.

Create a new track with markers from an Excel file

Download a sample Excel file from http://persephonesoft.com/clickonce/sampleFiles/2tracks.xlsx

It has 5 columns:

map name, feature name, start position, end position (optional) and track name (optional).
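
If your features come from a script rather than from a spreadsheet, a file with the same five columns can be produced programmatically. The Python sketch below writes such a table as CSV text. Treat it as an assumption-laden illustration: the header labels, the presence of a header row, and the map and feature names are all made up for this example, so compare the output with the sample file before importing.

```python
import csv
import io

# Column order taken from the description above; the header labels
# themselves are an assumption and may differ in the sample file.
COLUMNS = ["map name", "feature name", "start", "end", "track name"]


def features_to_csv(features):
    """Serialize feature dictionaries to CSV text.

    Each feature needs "map", "name" and "start"; "end" and "track"
    are optional and are written as empty cells when missing.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for f in features:
        writer.writerow(
            [f["map"], f["name"], f["start"], f.get("end", ""), f.get("track", "")]
        )
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values, not real markers.
    demo = [
        {"map": "chr1", "name": "marker_a", "start": 100, "end": 250, "track": "My markers"},
        {"map": "chr1", "name": "marker_b", "start": 900},
    ]
    with open("my_tracks.csv", "w", newline="") as handle:
        handle.write(features_to_csv(demo))
```

The resulting my_tracks.csv can then be dragged onto the Map tab in the same way as the sample file.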

Drag and drop this file onto the main Persephone screen (Map tab) – make sure that you close Excel before the procedure, otherwise the locked file will not be recognized.

A form will open, prompting you to select the map set where the tracks will appear. Note that the map names in the file should match the names in the database.

Select Sorghum bicolor Sbi1 from the map set tree. The coordinates in the graphics at the bottom will be adjusted according to the selected map. Click the chromosome graphical element in the bottom panel and press the Show Selected Maps button. The map with the newly added tracks will be shown.

External files imported in this way will be listed in the map set tree under the External Data node, so that you can always recall your data without leaving Persephone.

 

 

Search rules

The search for Persephone objects is partitioned into several sections: search for a marker, an annotation, a QTL or a map. The search keyword can be entered as a whole word or as a partial token with the wild card (*). For example, your search criterion could be the full name of a marker, like ‘txp71’, or a name mask ‘txp*’, which should find markers whose names start with ‘txp’. Using the wild card is sometimes important because, for example, gene names quite often include extra numbers specifying the splice variant (At1g01010.1). In this case, to find the transcript by gene name you will need to enter the name with the asterisk: ‘At1g01010*’.
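
The wild card rule can be mimicked outside the program with a few lines of Python. This is a rough sketch for illustration only: the helper names are hypothetical, matching is assumed to be case-sensitive, and Persephone's own search may treat case or other special characters differently.

```python
import re


def mask_to_regex(mask):
    """Turn a name mask with '*' wild cards into a compiled regex.

    Every character except '*' is matched literally; '*' matches any
    run of characters, including an empty one.
    """
    literal_parts = [re.escape(part) for part in mask.split("*")]
    return re.compile(".*".join(literal_parts))


def search_names(names, mask):
    """Return the names that match the whole mask (hypothetical helper)."""
    pattern = mask_to_regex(mask)
    return [name for name in names if pattern.fullmatch(name)]


if __name__ == "__main__":
    names = ["txp71", "txp2", "At1g01010.1", "At1g01010.2", "At1g01020.1"]
    print(search_names(names, "txp*"))
    print(search_names(names, "At1g01010"))
    print(search_names(names, "At1g01010*"))
```

The second search returns nothing because the stored names carry the splice-variant suffix, which is exactly why the trailing asterisk is needed.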

In general, most Persephone objects, besides required attributes like the name for markers or the genomic location for genes, have optional properties called qualifiers. Some of them have a special meaning, like gene name or functional annotation. You can search through all available qualifiers or, if you are searching for a keyword in gene functions only, you can limit the search space by selecting ‘Gene Function’ in the search options.

In addition to the QTLs tab in the Search form, which is suited to a quick search for QTLs by name, there is a separate QTLs tab with a QTL browser on the main screen.

Note that an additional way to display information about a list of objects is available via the main menu item Tools/Get details for the feature list. This interface allows you to compose and export the output in a format that contains only the fields you need.

Search for genes with specific function

Sometimes the function can be described by different but similar words. For example, to search for genes related to transposons you can use a keyword with the wild card: ‘transpos*’, which should detect the genes that have ‘transposon’, ‘transposase’ or ‘transposable element’ in their description.

Select the Annotation tab in the Search form. Choose the ‘Gene Function’ radio button in the Options section. Type ‘transpos*’ in the text box. Click the button near the Map Set selection to specify the “Tomato SL2.40” map set. This will narrow down the search results to the selected map set. Click Search. The results will be shown in the grid and in the graphics below.

SNPs tab

The results of resequencing, normally provided in the form of VCF files, can be loaded into the system and shown in the SNPs tab. The data for genotyping samples can reside in the database or be loaded dynamically from external VCF files provided by the user. The sample information coming from the database or from the files will be listed in the same grid, which opens once you select a map set of interest from the drop-down box.

Each sample has a name, a number of SNPs and, optionally, extra properties that are useful for making a selection. These properties are listed on the right and can be included in the grid, where they can be sorted or filtered.

Add several samples of interest by selecting the corresponding rows in the grid and pressing the “Add” button.

Each added sample will create a sub-track in the graphics part of the page showing the SNPs in three colors: blue for the same base as the reference genome, red where all alleles differ from the reference, and green for heterozygous calls. Choose a chromosome from the list above. Select a coloring mode. By default, the colors used to draw SNPs follow the scheme just described. Two more coloring modes are available.

Reference genotype: one of the samples can be nominated as the reference. Accordingly, all SNPs that are identical to the SNPs in that genotype will be shown in blue, and red will show where they differ. To enable this mode, drag one of the sample rows to the box that says “Drop parent 1 here”.

Two parents: similarly, instead of one reference genotype, you can use two samples. In that case, red will show variants identical to the base calls in the second genotype. Enable this mode by dragging and dropping another sample row from the grid to the box that says “Drop parent 2 here”.
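
The three coloring modes can be summarized as a small decision procedure. The Python sketch below is an interpretation of the rules described above, not Persephone's code. The function names are hypothetical, and two cases that the text leaves open are handled by assumption: a heterozygous call in which both alleles differ from the reference is colored red, and in the two-parent mode a call matching neither parent is reported as "unspecified".

```python
def default_color(alleles, ref_base):
    """Default mode: compare a sample's called alleles with the reference base.

    alleles is a tuple such as ("A", "G"). Assumption: when every allele
    differs from the reference the call is red, even if it is heterozygous.
    """
    if all(a == ref_base for a in alleles):
        return "blue"   # same base as the reference genome
    if all(a != ref_base for a in alleles):
        return "red"    # all alleles differ from the reference
    return "green"      # heterozygous: one reference and one other allele


def parent_color(call, parent1_call, parent2_call=None):
    """Reference-genotype mode (one parent) or two-parent mode."""
    if call == parent1_call:
        return "blue"   # identical to parent 1
    if parent2_call is None:
        return "red"    # one-parent mode: different from the reference sample
    if call == parent2_call:
        return "red"    # identical to parent 2
    return "unspecified"  # matches neither parent; not covered by the text


if __name__ == "__main__":
    print(default_color(("A", "G"), "A"))
    print(parent_color("T", "G", "T"))
```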

The grid with the chosen samples can be used to sort the samples, which affects the order of the sub-tracks in the track.

The values in the selected properties can be used to color the background of the sub-tracks. To do that, select the radio button under the column with the property values. If there are no more than 6 distinct values, each value will be assigned a separate color (controlled via a right click on the radio button). If there are more than 6 distinct values, the range of the values will be shown as a gray gradient.
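
The rule of "up to 6 distinct values get separate colors, otherwise a gray gradient" can be sketched in Python as follows. This is a hedged illustration rather than the program's logic: the palette, the assignment of colors in sorted order, and the direction of the gradient (low values dark, high values light) are all assumptions, and the gradient branch assumes the property values are numeric.

```python
# Hypothetical palette; the real colors are chosen in Persephone
# via the right click on the radio button.
PALETTE = ["red", "orange", "yellow", "green", "blue", "purple"]


def background_colors(values):
    """Return one background color per sample property value."""
    distinct = sorted(set(values))
    if len(distinct) <= len(PALETTE):
        # Few distinct values: one separate color per value.
        lookup = dict(zip(distinct, PALETTE))
        return [lookup[v] for v in values]
    # Many distinct values: map the numeric range onto a gray scale.
    lowest, highest = min(values), max(values)
    span = highest - lowest  # non-zero because there are more than 6 distinct values
    colors = []
    for v in values:
        level = round(255 * (v - lowest) / span)
        colors.append("#{0:02x}{0:02x}{0:02x}".format(level))
    return colors


if __name__ == "__main__":
    print(background_colors(["indica", "japonica", "indica"]))
    print(background_colors([0, 1, 2, 3, 4, 5, 10]))
```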

Here we selected 16 samples from the 3,000 rice genomes. We color the sub-tracks by variety group, which has 4 values. Selecting the radio button below this column assigns distinct colors to the different variety groups.

Now that the coloring options are selected, a click on the chromosome with sub-tracks in the bottom part will open this map in the vertical view, where you can zoom in to see the individual SNPs and continue the analysis in the horizontal view.