Examples & Tutorial
VCF file
The gateway to RegulationSpotter is our Query Engine. Here, you can upload any standard VCF file for analysis with RegulationSpotter. Note that the chromosomal positions must refer to GRCh37.
Analysis settings
In the Query Engine interface, you can set the following properties:
SEARCH FOR HOMOZYGOUS VARIANTS
Tick this box if you want to only consider homozygous variants in your analysis.
FILTER AGAINST 1000G
Enter a threshold for filtering variants out. The default is 4 for homozygous and 20 for heterozygous cases.
Enter the minimum value for your variant's coverage. The default is 4.
ANALYSE THE FOLLOWING REGIONS OR GENES
Select whether you wish to analyse the entire VCF file, or custom regions or genes.
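The homozygosity and coverage settings above can be pictured with a minimal Python sketch. It is not RegulationSpotter's own code. It assumes that coverage is read from the per-sample DP value, that "homozygous" means two identical non-reference alleles in the GT field, and that only the first sample column is inspected. The function name `passes_filters` and its parameters are hypothetical.

```python
def passes_filters(vcf_line, homozygous_only=False, min_coverage=4):
    """Return True if the first sample on a VCF data line passes both filters.

    Assumptions (not taken from RegulationSpotter): coverage is the DP value
    in the sample column, and lines without a sample column are rejected.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no FORMAT/sample columns available

    # Pair the FORMAT keys (e.g. GT:DP) with the first sample's values.
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))

    # Coverage filter: reject missing or too-low read depth.
    depth = sample.get("DP", ".")
    if not depth.isdigit() or int(depth) < min_coverage:
        return False

    if homozygous_only:
        # Treat phased (|) and unphased (/) genotypes alike.
        alleles = sample.get("GT", ".").replace("|", "/").split("/")
        same = len(alleles) == 2 and alleles[0] == alleles[1]
        if not same or alleles[0] in ("0", "."):
            return False
    return True
```

With the default minimum coverage of 4, a line whose sample column reads `1/1:12` (GT:DP) passes with or without the homozygous option, `0/1:30` passes only when the homozygous option is off, and `1/1:3` is rejected for low coverage.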
This page will appear after RegulationSpotter is done with its analysis. You will find a synopsis of your analysis here as well as the option to filter and sort your results.
The project ID allocated by RegulationSpotter. If you want to easily access your results later on, you can note it now. Please don't alter or delete it as RegulationSpotter needs it!
Synopsis
Most often, RegulationSpotter will not analyse each and every line of your VCF file, either because you have set certain filters, or because certain variants were not suitable for analysis with RegulationSpotter. The synopsis gives you an idea of how your analysis went.
Number of alterations (lines) in VCF file.
Number of variants which were filtered out according to user input (below coverage, not homozygous, outside the specified region / chromosome) or due to input / format errors (e.g. variant equals refseq, reference allele equals alternative allele, InDel is too long, or neither genotype nor frequency is supplied).
Number of variants which were suitable for analysis. These can be significantly more than the lines in the VCF, because sometimes one line in the VCF contains more than one alternative allele. Additionally, if you choose to combine neighbouring variants, the number will rise even more.
Number of variants ignored for analysis due to their presence in the 1000 Genomes Project (TGP). This applies only if at least one of the two options for filtering against TGP is set.
Total number of variants which were analysed with RegulationSpotter.
INTRAGENIC VARIANTS ANALYSED WITH MT
Number of analysable variants which were intragenic and thus analysed with MutationTaster.
Number of variants which were analysed with MutationTaster. These will normally be significantly more than the analysable variants, because for most variants more than one (suitable) transcript will be found.
TYPE OF AMINO ACID EXCHANGE (AAE)
Gives the number of observed amino acid exchanges for each of the three types used in MutationTaster: alterations that do not cause any amino acid exchange (without_aae), simple substitutions (simple_aae), and alterations that cause more complex changes in the amino acid sequence of the resulting peptide, such as a frameshift or a shifted start ATG (complex_aae).
Gives an overview of the predictions generated by MutationTaster. Only applies to intragenic cases. The four options are:
More information on the classifications can be found in MutationTaster's documentation.
For your convenience, we display here the options you chose in the last step.
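The synopsis notes that the number of suitable variants can exceed the number of VCF lines, because one line may carry several alternative alleles. The following Python sketch illustrates that counting effect only. It is not RegulationSpotter's implementation, and the function name `count_lines_and_variants` is hypothetical. It ignores the optional combination of neighbouring variants and all filters.

```python
def count_lines_and_variants(vcf_text):
    """Count VCF data lines and the alternative alleles they contain."""
    n_lines = 0
    n_variants = 0
    for line in vcf_text.splitlines():
        # Skip blank lines, meta lines (##...) and the column header (#CHROM...).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # malformed line: no ALT column
        n_lines += 1
        # ALT is a comma-separated list; "." means no alternative allele.
        alts = [a for a in fields[4].split(",") if a not in (".", "*")]
        n_variants += len(alts)
    return n_lines, n_variants


example = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t.\t.\t.",
    "1\t2000\t.\tC\tT,G\t.\t.\t.",
])
print(count_lines_and_variants(example))  # (2, 3)
```

Here two data lines yield three variants, because the second line lists two alternative alleles.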
Display settings
To make your analyses as convenient as possible, RegulationSpotter offers great flexibility in the results display.
Results table - data
When your results are displayed, RegulationSpotter gives you a summary table. Here, you can find each variant together with crucial information such as the gene it is associated with, the type of alteration, etc.
CHR, POSITION, REF, ALT
Information on the location and nature of the alteration.
Likely effect of the alteration. Depending on whether the variant is considered to be intragenic or extragenic, the options are:
intragenic variants
disease causing (ClinVar): known disease mutation listed in ClinVar.
extragenic variants
likely regulatory variant: Based on the available data, RegulationSpotter considers it likely that the variant has a regulatory function.
Information on the availability of the variant in genetic frequency databases (dbSNP, 1000G).
Results table - colour-coded matrix
The second part of the results table is displayed as a colour-coded matrix. For various properties, each column indicates the severity of the alteration and how likely it is to be located in a regulatory region. Less transparency indicates stronger evidence for a regulatory function or functional impact.
MOST SEVERE RESULT
Most severe RegulationSpotter result across all available transcripts. This value is used for sorting.
RegulationSpotter score for the variant
Type of alteration: Single nucleotide variant (SNV), Insertion/Deletion (InDel). InDels can be long (>10 bp) or short.
NMD / PTC / frameshift / truncated
Indicates whether the variant is a highly deleterious one, e.g. leading to nonsense-mediated decay (NMD), premature termination codon (PTC), frameshift or truncation.
AMINO ACID SUBSTITUTION(S)
Displays whether an amino acid exchange occurs.
WITHIN PROTEIN DOMAINS
Indicates whether the variant is located within a protein domain.
Indicates whether the variant leads to the alteration of a splice site.
KOZAK SEQUENCE ALTERED
Indicates whether the variant leads to the alteration of a Kozak sequence.
POLY-A SIGNAL CHANGED
Indicates whether the variant leads to the alteration of a poly-A signal.
miRNA BINDING SITE
Indicates whether the variant leads to the alteration of a miRNA binding site.
Indicates whether the variant is located within a site listed in FANTOM5/VISTA data.
MULTICELL REGULATORY FEATURE
Indicates whether the variant is located within a multicell regulatory feature from Ensembl.
Indicates whether the variant is located within a promoter (-500bp/+50bp around a TSS).
Indicates whether the variant is located within an H3K4me3-positive region indicative of active transcription.
DNAse1 HYPERSENSITIVE SITE
Indicates whether the variant is located within a DNAse1 hypersensitive site indicative of active transcription (ENCODE/Ensembl).
Indicates whether the variant is located within a transcription factor binding site (TFBS) from Ensembl.
Indicates whether the variant is located within a genomic interaction site according to Rao et al.
PHYLOP / PHASTCONS MAX
Indicates the highest PhyloP and PhastCons scores, respectively.
Indicates the scaled CADD score for the alteration.
Indicates the variant frequency in dbSNP and 1000G. Unknown/rare alleles are marked with a bright red colour.
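The promoter column above uses a window of -500bp/+50bp around a transcription start site (TSS). As a rough illustration, the Python sketch below tests whether a position falls into such a window. It assumes that the window is strand-aware (upstream means lower coordinates on the forward strand and higher coordinates on the reverse strand) and that both boundaries are inclusive. Neither detail is confirmed by this documentation, and the function name `in_promoter` is hypothetical.

```python
def in_promoter(variant_pos, tss_pos, strand, upstream=500, downstream=50):
    """Return True if variant_pos lies in the promoter window around tss_pos.

    strand is 1 (forward) or -1 (reverse). Strand-aware orientation and
    inclusive boundaries are assumptions of this sketch.
    """
    if strand == 1:
        start, end = tss_pos - upstream, tss_pos + downstream
    elif strand == -1:
        # On the reverse strand, "upstream" points to higher coordinates.
        start, end = tss_pos - downstream, tss_pos + upstream
    else:
        raise ValueError("strand must be 1 or -1")
    return start <= variant_pos <= end
```

For a forward-strand TSS at position 10,000, position 9,600 is inside the window and position 10,100 is outside. For a reverse-strand TSS at the same position, 10,300 is inside.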
Result
Likely effect of an alteration. RegulationSpotter treats alterations differently depending on whether they are located within a gene or not. For intragenic alterations, it relies on MutationTaster, which classifies an alteration as one of four possible types:
Alteration (phys. location)
The alteration at the "physical", i.e. chromosomal, level (e.g. chr7:91623937_91623938insGGCAAT).
Alteration type
Is either a base exchange, a combination of insertion and deletion, an insertion or a deletion.
Alteration region
Extragenic by definition.
Known variant
Any known polymorphism(s) or known disease variant(s) that have been found at the position in question. Our database contains all single nucleotide polymorphisms (SNPs) from the NCBI SNP database (dbSNP). Moreover, we have stored all HapMap genotype frequencies as well as variants from the 1000 Genomes Project (TGP). If an alteration is located at the same position as a known dbSNP entry, MutationTaster provides the SNP ID (or rs ID) and a link together with the HapMap genotype frequencies, if available. If each of the three possible genotypes is observed in at least one HapMap population, the alteration is automatically regarded as a polymorphism and predicted as polymorphism automatic. Please note that there may be differences between your alteration and the alleles in dbSNP.
ENSEMBL multicell regulatory features
Indicates whether the alteration is located within an ENSEMBL multicell regulatory feature.
Regulatory features from VISTA and FANTOM5
Regulatory data from Ensembl Regulatory build, b37, published in  (FANTOM5) and  (VISTA).
TarBase miRNA binding sites
MicroRNA binding sites which are affected by the alteration as annotated in DIANA TarBase [2;5].
PhyloP/PhastCons
Indicates the conservation of the alteration site. Data from phyloP and PhastCons.
Chromosome
The chromosome the alteration is located on.
Strand
Is either 1 for forward strand or -1 for reverse strand.
Chromosomal position
Gives the last wild-type base before the alteration and the first wild-type base after the alteration in chromosomal sequence context (positions are relative to the start of the chromosomal reference sequence). For example, 154,372,337 / 154,372,339 means that the altered base is at position 154,372,338.
Original chrDNA sequence snippet
Original DNA sequence with the original nucleotide marked in blue.
Altered chrDNA sequence snippet
Altered DNA sequence with the original nucleotide marked in blue.
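The "Chromosomal position" entry above reports the last wild-type base before and the first wild-type base after an alteration. The Python sketch below shows one way such a pair could be derived. It assumes 1-based chromosomal coordinates, and it assumes that a pure insertion is flanked by the two neighbouring reference bases it is placed between. All function names are hypothetical and do not come from RegulationSpotter.

```python
def flanks_for_substitution_or_deletion(first_altered, last_altered):
    """Flanking wild-type positions when reference bases
    first_altered..last_altered (1-based, inclusive) are changed or removed."""
    if last_altered < first_altered:
        raise ValueError("last_altered must not be smaller than first_altered")
    return first_altered - 1, last_altered + 1


def flanks_for_insertion(base_before):
    """Flanking wild-type positions for an insertion placed directly
    after reference position base_before (no reference base is altered)."""
    return base_before, base_before + 1


def format_flanks(flanks):
    """Render a pair of positions with thousands separators."""
    return f"{flanks[0]:,} / {flanks[1]:,}"


# Single altered base at 154,372,338, as in the example above.
print(format_flanks(flanks_for_substitution_or_deletion(154372338, 154372338)))
# prints "154,372,337 / 154,372,339"
```

Under the same assumptions, an insertion written as chr7:91623937_91623938insGGCAAT would be flanked by positions 91,623,937 and 91,623,938, i.e. `flanks_for_insertion(91623937)`.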