Regul@tionSpotter

Documentation

Analysing a VCF file

Input

VCF file

The gateway to RegulationSpotter is our QueryEngine. Here, you can upload any standard VCF file for analysis with RegulationSpotter. Note that the chromosomal positions must refer to GRCh37. For a tutorial on how to use the QueryEngine, please click here.

Analysis settings

In our QueryEngine interface, you can determine the following properties:

RECESSIVE TRAIT

Tick this box if you want to only consider homozygous variants in your analysis.

FILTER POLYMORPHISMS

Enter thresholds for discarding variants as polymorphisms, based on the frequency observed in 1000G and ExAC. The default thresholds are 4 homozygous individuals in 1000G and 10 homozygous individuals in ExAC. For non-recessive traits, you can also filter for variants present in any state (heterozygous or homozygous) using the second row. Set both values to zero if you do not want any filtering. All values refer to the number of individuals carrying the respective genotype.
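
As a rough illustration of how such a threshold behaves, the following Python sketch discards a variant once the number of homozygous carriers reaches the threshold in either database. The function name, its parameters and the "count reaches threshold" comparison are assumptions made for this example; they do not describe RegulationSpotter's actual implementation.

```python
def is_filtered_as_polymorphism(hom_1000g, hom_exac,
                                max_hom_1000g=4, max_hom_exac=10):
    """Return True if a variant would be discarded as a polymorphism.

    hom_1000g / hom_exac: number of individuals homozygous for the allele.
    A threshold of 0 switches the respective filter off.
    Assumption: a variant is discarded when a count reaches its threshold.
    """
    if max_hom_1000g > 0 and hom_1000g >= max_hom_1000g:
        return True
    if max_hom_exac > 0 and hom_exac >= max_hom_exac:
        return True
    return False


if __name__ == "__main__":
    print(is_filtered_as_polymorphism(5, 0))          # filtered by 1000G count
    print(is_filtered_as_polymorphism(1, 2))          # kept
    print(is_filtered_as_polymorphism(100, 100, 0, 0))  # filtering disabled
```

Setting both thresholds to zero keeps every variant, which mirrors the "no filtering" setting described above.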

MINIMUM COVERAGE

Enter the minimum value for your variant's coverage. The default is 4.
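
To check in advance which lines of a VCF file would pass a coverage threshold, a sketch like the following can help. It assumes that coverage is stored as `DP` in the INFO column, which is common but not guaranteed, and that variants without a `DP` value are kept; both are assumptions of this example, not documented RegulationSpotter behaviour.

```python
def read_depth(vcf_line):
    """Return the DP value from the INFO column of a VCF data line, or None."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]  # INFO is the 8th tab-separated column
    for entry in info.split(";"):
        if entry.startswith("DP="):
            return int(entry[3:])
    return None


def passes_coverage(vcf_line, min_coverage=4):
    """Return True if the line meets the minimum coverage."""
    depth = read_depth(vcf_line)
    # Assumption: lines without a DP value are not discarded
    return depth is None or depth >= min_coverage


if __name__ == "__main__":
    # Hypothetical example line
    line = "1\t12345\t.\tA\tG\t50\tPASS\tDP=3;AF=0.5"
    print(read_depth(line), passes_coverage(line))
```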

ANALYSE THE FOLLOWING REGIONS OR GENES

Select whether you wish to analyse the entire VCF file, or custom regions or genes.

Synopsis and display settings

This page will appear once RegulationSpotter has finished its analysis. Here you will find a synopsis of your analysis as well as options to filter and sort your results.

PROJECT ID

The project ID allocated by RegulationSpotter. If you want to access your results easily later on, note it down now. Please don't alter or delete it, as RegulationSpotter needs it!

Synopsis

Most often, RegulationSpotter will not analyse each and every line of your VCF file, either because you have set certain filters, or because certain variants were not suitable for analysis with RegulationSpotter. The synopsis gives you an idea of how your analysis went.

SUBMITTED VARIANTS

Number of alterations (lines) in VCF file.

DISCARDED BEFORE ANALYSIS

Number of variants that were filtered out according to your settings (coverage below the minimum, not homozygous, outside the specified region or chromosome) or because of input or format errors (e.g. the variant equals the reference sequence, the reference allele equals the alternative allele, the InDel is too long, or neither genotype nor frequency is supplied).

ALL ANALYSED VARIANTS

Number of variants which were suitable for analysis. These can be significantly more than the lines in the VCF, because sometimes one line in the VCF contains more than one alternative allele. Additionally, if you choose to combine neighbouring variants, the number will rise even more.
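
The difference between submitted lines and analysed variants can be seen by counting the comma-separated entries in the ALT column, as in this minimal sketch. The function name is hypothetical, and the count ignores any filtering or combination of neighbouring variants.

```python
def count_lines_and_alleles(vcf_lines):
    """Return (number of data lines, number of alternative alleles)."""
    submitted = 0
    alleles = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        submitted += 1
        alt = fields[4]  # ALT is the 5th tab-separated column
        # Several alternative alleles are separated by commas; "." means none
        alleles += len([a for a in alt.split(",") if a and a != "."])
    return submitted, alleles


if __name__ == "__main__":
    # Hypothetical example lines
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "1\t100\t.\tA\tG,T\t.\t.\t.",
        "2\t200\t.\tC\tCA\t.\t.\t.",
    ]
    print(count_lines_and_alleles(example))
```

In this example, two VCF lines yield three alternative alleles.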

VARIANTS MAPPED TO A GENE AND ANALYSED WITH MT

Number of analyses which were done by MutationTaster. These will normally be significantly more than the analysable variants, because for most variants more than one (suitable) transcript will be found.

TYPE OF AMINO ACID EXCHANGE (AAE)

Gives the number of observed amino acid exchanges for each of the three types used in MutationTaster. One type is for alterations that do not cause any amino acid exchange (without_aae), one for simple substitutions (simple_aae) and one for changes that have more complex effects on the amino acid sequence of the resulting peptide, such as a frameshift or a shifted start ATG (complex_aae). "Type n/a" indicates annotation problems in MT.

PREDICTION

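Gives an overview of the predictions generated by MutationTaster. Only applies to intragenic cases. The four options are disease causing (ClinVar), disease causing, polymorphism and polymorphism automatic; each is described under RESULT in the results table section.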

More information on the classifications can be found in MutationTaster's documentation.

Analysis options

For your convenience, we display here the options you chose in the last step.

Display settings

To make your analyses as convenient as possible, RegulationSpotter offers great flexibility in the results display. Finally, you can decide whether you would like to download or display your results.

Output: Overview

Screenshot of the results overview output of RegulationSpotter. To see the results overview in detail click here.

The different elements of the output are named and described below.

Results table - data

When you display your results, RegulationSpotter shows a summary table. Here, you can find each variant together with crucial information such as the gene it is associated with, the type of alteration etc.
This table gives you a quick graphical overview of each variant and its effect. Affected regulatory features are colour-coded: the stronger the colour, the more strongly a feature might be affected.

CHR, POSITION, REF, ALT

Information on the location and nature of the alteration.

RESULT

Likely effect of the alteration. Depending on whether the variant is considered to be intragenic or extragenic, the options are:

intragenic variants

disease causing (ClinVar): known disease mutation listed in ClinVar.
disease causing: predicted by MutationTaster as disease causing.
polymorphism: predicted by MutationTaster as harmless.
polymorphism automatic: known to be harmless from databases.

More information on the classifications can be found in MutationTaster's documentation.

extragenic variants

likely regulatory variant: Based on the available data, RegulationSpotter considers it likely that the variant has a regulatory function.
possible regulatory variant: Based on the available data, RegulationSpotter considers it possible that the variant has a regulatory function.
polymorphism: Based on the available data, RegulationSpotter considers the variant not to be located in a regulatory region.

VARIANT FREQUENCY

Information on the presence of the variant in genetic frequency databases (dbSNP, 1000G).

Results table - colour-coded matrix

The second part of the results table is displayed as a colour-coded matrix. For various properties, each column indicates the severity of the alteration and its likelihood of being located in a regulatory region. Less transparency signifies stronger evidence for a regulatory function or functional impact.

MOST SEVERE RESULT

Most severe RegulationSpotter result across all available transcripts. This result is used for sorting.

XSCORE

RegulationSpotter score for the variant. The score integrates all evidence found for the functionality of the variant. Higher values indicate a higher probability of functionality. To place the score of a specific variant within the range of possible scores, and for a description of how the score is calculated, please see our statistics section.

TYPE

Type of alteration: Single nucleotide variant (SNV), Insertion/Deletion (InDel). InDels can be long (>10 bp) or short.
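
The distinction can be illustrated from the REF and ALT alleles of a VCF line. This sketch is an assumption-laden illustration, not RegulationSpotter's own logic: it measures InDel length as the number of inserted or deleted bases, and treats equal-length multi-base substitutions as InDels measured by allele length.

```python
def classify_variant(ref, alt, long_threshold=10):
    """Classify a variant as 'SNV', 'short InDel' or 'long InDel'."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    # Assumption: length = number of inserted/deleted bases; for
    # equal-length substitutions, fall back to the allele length.
    length = abs(len(ref) - len(alt)) or len(ref)
    return "long InDel" if length > long_threshold else "short InDel"


if __name__ == "__main__":
    print(classify_variant("A", "G"))
    print(classify_variant("A", "ATG"))
    print(classify_variant("A" + "C" * 11, "A"))
```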

INTRAGENIC VARIANT

Indicates whether the variant is also located within a gene.

NMD / PTC / frameshift / truncated

Indicates whether the variant is a highly deleterious one, e.g. leading to nonsense-mediated decay (NMD), a premature termination codon (PTC), a frameshift or a truncation.

AMINO ACID SUBSTITUTION(S)

Displays whether an amino acid exchange occurs.

WITHIN PROTEIN DOMAINS

Indicates whether the variant is located within a protein domain.

ALTERED SPLICING

Indicates whether the variant leads to the alteration of a splice site.

KOZAK SEQUENCE ALTERED

Indicates whether the variant leads to the alteration of a Kozak sequence.

POLY-A SIGNAL CHANGED

Indicates whether the variant leads to the alteration of a poly-A signal.

miRNA BINDING SITE

Indicates whether the variant leads to the alteration of a miRNA binding site.

FANTOM5/VISTA

Indicates whether the variant is located within a site listed in FANTOM5/VISTA data.

MULTICELL REGULATORY FEATURE

Indicates whether the variant is located within a multicell regulatory feature from Ensembl.

WITHIN PROMOTER

Indicates whether the variant is located within a promoter (-500bp/+50bp around a TSS).
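
The promoter window can be expressed as a simple interval check. The sketch below assumes that the window is strand-aware (upstream means lower coordinates on the plus strand and higher coordinates on the minus strand) and that both boundaries are inclusive; neither detail is stated in this documentation, and the function name is hypothetical.

```python
def in_promoter(position, tss, strand="+", upstream=500, downstream=50):
    """Return True if position lies within -upstream/+downstream bp of the TSS."""
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    else:
        # On the minus strand, upstream corresponds to higher coordinates
        start, end = tss - downstream, tss + upstream
    return start <= position <= end


if __name__ == "__main__":
    print(in_promoter(9600, 10000))        # 400 bp upstream, plus strand
    print(in_promoter(10051, 10000))       # 51 bp downstream, plus strand
    print(in_promoter(10400, 10000, "-"))  # 400 bp upstream, minus strand
```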

H3K4me3 POSITIVE

Indicates whether the variant is located within an H3K4me3-positive region, which is indicative of active transcription.

DNAse1 HYPERSENSITIVE SITE

Indicates whether the variant is located within a DNAse1 hypersensitive site, which is indicative of active transcription (ENCODE/Ensembl).

WITHIN TFBS

Indicates whether the variant is located within a transcription factor binding site (TFBS) from Ensembl.

GENOMIC INTERACTION

Indicates whether the variant is located within a genomic interaction site according to Rao et al. [1].

PHYLOP / PHASTCONS MAX

Indicates the highest PhyloP and PhastCons scores, respectively.

CADD (SCALED)

Indicates the scaled CADD score for the alteration.

VARIANT FREQUENCY

Indicates the variant frequency in dbSNP and 1000G. Unknown/rare alleles are marked with a bright red colour.

Output: detailed

Clicking on the blue "extragenic results" or "intragenic results" link of a variant takes you to a more detailed view of the results for that single variant.

For intragenic alterations and known disease causing variants, you will be redirected to our conventional MutationTaster output. More information can be found in the MutationTaster documentation.

For a detailed explanation of an extragenic result, please visit the single query documentation, where you can also find an explanation of the interaction plot.

Contact

In case you discover bugs, have suggestions or questions, please write an e-mail to Jana Marie Schwarz (jana-marie.schwarz AT charite.de) or to Dominik Seelow (dominik.seelow AT charite.de).
We also appreciate hearing about your general experiences using RegulationSpotter.

References

[1] Rao SS, Huntley MH, Durand NC, Stamenova EK et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 2014. PMID: 25497547

[2] Zerbino DR, Wilder SP, Johnson N, Huettemann T, Flicek PR. The Ensembl Regulatory Build. Genome Biology 2015. PMID: 25887522

[3] 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 2012. PMID: 23128226

[4] Visel A, Minovitsky S, Dubchak I, Pennacchio LA. VISTA Enhancer Browser - a database of tissue-specific human enhancers. Nucleic Acids Res. 2007. PMID: 17130149

[5] Vlachos IS, Paraskevopoulou MD, Karagkouni D et al. DIANA-TarBase v7.0: indexing more than half a million experimentally supported miRNA:mRNA interactions. Nucleic Acids Res. 2014. PMID: 25416803

[6] Pollard KS, Hubisz MJ, Siepel A. Detection of non-neutral substitution rates on mammalian phylogenies. Genome Res. 2009. PMID: 19858363

[7] Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005. PMID: 16024819