EpiDiff: a computational tool for Epigenetic Difference Quantification and Identification by Entropy
   
Tutorials and FAQ about QDCMR

   
QDCMR, the first module in EpiDiff, quantifies chromatin modification differences and identifies DCMRs from genome-wide chromatin modification profiles by adapting Shannon entropy. Because it is platform-free and species-free, computational biologists can readily use it to analyze the epigenetic regulation related to DCMRs across temporal and spatial chromatin modifications.

This tutorial contains the following sections:
Getting Started Describes how to start local and online QDCMR.
Import Data Describes how to preprocess and import chromatin modification data.
Quantify Difference Describes how to quantify chromatin modification difference across various samples.
Identify DCMRs Describes how to identify DCMRs by the threshold embedded in QDCMR.
Measure Specificity Describes how to measure sample specificity for each DCMR.
Export Results Describes how to save results.
Visualization Describes how to display chromatin modification levels, DCMR distribution and UCSC links.
Getting Started
This section describes how to start online and local QDCMR.

Before starting QDCMR, start EpiDiff in one of two ways:
1. Start EpiDiff directly via Java Web Start (advantage: no installation required);
2. Download the EpiDiff installation package, then install and run it locally on your machine (advantage: no network required).

When first started, QDCMR displays the home page.

Import Data
There are three ways to load data and start your analysis:

  1) Loading chromatin modification data by clicking the "Import Data" button on the left panel.
  2) Loading chromatin modification data via "File->Import chromatin modification Data".
  3) Loading chromatin modification data by shortcut keys "Ctrl+I".

QDCMR provides a visual interface through which users can import two types of data.

The first type is data already processed by the user, in which each region has a chromatin modification level in each sample.


i There are two ways to specify your data file:

1. Type or paste the absolute path of the data file into the "Input File" text field.
2. Click the "Browser" button to select the data file.

ii Important Notices:
1. Before importing your own data, it is suggested that you look at the example data by clicking the "example Data 1 or 2" button.
2. The columns describing each region of interest should come before the chromatin modification data for that region.
3. You can adjust the start column of the chromatin modification data.
4. Select the check box named "Transfer first row as column names" if the first row of your file contains column names.
5. Column names will be generated automatically if you do not check the box named "Transfer first row as column names".
6. You must ensure that there are no missing values in your data.
7. The first 20 rows of data are shown in the data file preview window and refreshed whenever a setting changes.
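
The notices above can be checked before import with a small script. The following Python sketch is not part of QDCMR; the function name `check_table`, the tab delimiter and the example column names are assumptions made for illustration. It verifies that every row has the same number of fields, that no chromatin modification value is missing or non-numeric, and it builds the same kind of 20-row preview.

```python
import csv
import io

def check_table(text, data_start_col, has_header=True, delimiter="\t"):
    """Return (column_names, rows); raise ValueError on a malformed table.

    data_start_col is the 1-based column where modification levels begin
    (hypothetical parameter mirroring the "start column" setting).
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        raise ValueError("file is empty")
    if has_header:
        names, body = rows[0], rows[1:]
    else:
        # Generate placeholder names when the first row is data.
        names = [f"Column{i + 1}" for i in range(len(rows[0]))]
        body = rows
    first_line = 2 if has_header else 1
    for line_no, row in enumerate(body, start=first_line):
        if len(row) != len(names):
            raise ValueError(f"line {line_no}: expected {len(names)} fields")
        for value in row[data_start_col - 1:]:
            if value.strip() == "":
                raise ValueError(f"line {line_no}: missing value")
            float(value)  # raises ValueError if the value is not numeric
    return names, body

# Made-up example: three region columns followed by two sample columns.
example = (
    "chrom\tstart\tend\tsampleA\tsampleB\n"
    "chr1\t1000\t2000\t0.8\t0.1\n"
    "chr2\t500\t900\t0.4\t0.5\n"
)
names, body = check_table(example, data_start_col=4)
preview = body[:20]  # same idea as the 20-row preview window
```

To check a real file, read its contents into a string first and pass the column number where your sample columns begin.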

Before importing data, please make sure that the data is saved in a txt or xls file in the format shown in this example data:

Now import the chromatin modification data from the txt or xls file:

The second type is raw ChIP-Seq data, imported through the panel "Region and raw data". The input parameters are detailed as follows:

Before importing data, please make sure that the data is saved in the right format. The region file should be saved as a txt or xls file in the format shown in this example data:

The raw files, which contain chromatin modification reads that have been aligned to the genome, can be saved in bed format or in the txt.gz format that is widely used for ChIP-Seq data. These files can be placed in a folder or in a compressed file (zip or the like). Examples of these two formats follow.

The first example is for bed files in a folder.

The second example is for txt.gz files in a compressed file (.zip).


Now import the chromatin modification data.

Take Example1 as an example. Select the files you want to analyze, then click the "Process" button at the bottom.

QDCMR will then align the regions submitted by the user with the chromatin modification reads. This process may be time-consuming because every ChIP-Seq read is compared pairwise with the submitted regions.
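
To see why this step can take a while, the pairwise comparison can be sketched in Python. This is only an illustration, not QDCMR's implementation: the function names are hypothetical, reads are assumed to be bed lines with half-open coordinates, and the modification level of a region is assumed to be a simple count of overlapping reads.

```python
from collections import defaultdict

def parse_bed_line(line):
    """Return (chrom, start, end) from the first three bed columns."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return chrom, int(start), int(end)

def count_reads_per_region(regions, reads):
    """Count the reads overlapping each region (half-open coordinates).

    regions: list of (chrom, start, end)
    reads:   iterable of (chrom, start, end)
    """
    by_chrom = defaultdict(list)
    for index, (chrom, start, end) in enumerate(regions):
        by_chrom[chrom].append((start, end, index))
    counts = [0] * len(regions)
    for chrom, read_start, read_end in reads:
        # Each read is compared with every region on its chromosome,
        # which is the costly pairwise step.
        for start, end, index in by_chrom.get(chrom, ()):
            if read_start < end and read_end > start:
                counts[index] += 1
    return counts

# Made-up regions and reads for illustration.
regions = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 0, 50)]
bed_lines = [
    "chr1\t150\t186\tread1",
    "chr1\t190\t310\tread2",
    "chr2\t60\t96\tread3",
]
counts = count_reads_per_region(regions, map(parse_bed_line, bed_lines))
```

With millions of reads and many regions the number of comparisons grows quickly, which is consistent with the waiting time described above.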

Import Region of Interest and raw Chromatin Modification by ChIP-Seq
i Two kinds of data are needed in this module: the regions of interest and the raw chromatin modification data from ChIP-Seq.
ii There are two ways to specify your data file:

1. Type or paste the absolute path of the data file into the "Input File" text field.
2. Click the "Browser" button to select the data file.
iii Important Notices:
1. Before importing your own data, it is suggested that you look at the example data by clicking the "example Data 1 or 2" button.
2. The information describing the regions of interest can be defined by the user.
3. Please make sure that the first row of the region file contains column names.
4. Please select the species you need and the corresponding columns for chromosome, region start, and region end.
5. The current version of QDCMR accepts only raw chromatin modification data produced by ChIP-Seq.
6. The raw chromatin modification data in bed or gz format should be provided in a folder or a compressed file (zip or the like).
7. Please select the chromatin modification file in the file tree on the right panel.
8. You must ensure that there are no missing values in your data.
9. The first 20 rows of each data file above are shown in the data file preview window and refreshed whenever a setting changes.

When the alignment is completed, the software switches to the following interface:

After confirmation, click the Import button to import the data. The following interface will be shown for a short while.

The chromatin modification data has been successfully loaded and is shown in the data table. The first column, named "RegionID", is generated automatically as the ID for each region in QDCMR.
By default, the chromatin modification levels in the first row are shown in RegionMethyView.
Next you can do each of the following operations:
  1) Click a row to view the chromatin modification level across samples and set the image properties by right click.
  2) Double-click a row to view the region information in the UCSC Genome Browser.
  3) Click "Quantify Difference" button to quantify chromatin modification difference by calculating entropy for all regions.
Quantify Difference
Click "Quantify Difference" button to quantify chromatin modification difference by calculating entropy for all regions.
The following interface about quantified chromatin modification difference will be shown when the progress bar reaches 100%.

The chromatin modification difference has been quantified and shown in the entropy table. The first few columns contain the region information. The column named "Entropy" contains the entropy for each region. The last columns are the raw chromatin modification data for each region imported by user.
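
As a rough illustration of what the "Entropy" column expresses, the Python sketch below computes the plain Shannon entropy of one region's levels across samples. This is an assumption made for illustration: QDCMR adapts Shannon entropy, and its exact formula may differ from this textbook form. The function name is hypothetical.

```python
import math

def shannon_entropy(levels):
    """Shannon entropy (in bits) of one region's levels across samples."""
    if any(level < 0 for level in levels):
        raise ValueError("levels must be non-negative")
    total = sum(levels)
    if total <= 0:
        raise ValueError("at least one level must be positive")
    entropy = 0.0
    for level in levels:
        p = level / total  # relative level of this sample
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy

# Uniform levels across 4 samples give the maximum entropy, log2(4).
uniform = shannon_entropy([0.5, 0.5, 0.5, 0.5])
# A region modified in a single sample gives the minimum entropy.
specific = shannon_entropy([0.9, 0.0, 0.0, 0.0])
```

Under this textbook form, a region with similar levels in all samples has high entropy and a region modified in only a few samples has low entropy, which matches the later use of a low-entropy threshold to call DCMRs.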

You can do each of the following operations:
  1) Click a row to view the chromatin modification level across samples and set the image properties by right click.
  2) Double-click a row to view the region information in the UCSC Genome Browser.
  3) Save entropy and raw data via "File->Save Analysis Result->Entropy Table".
  4) Click the "Identify DCMRs" button to identify DCMRs and N-DCMRs for your analysis.
Identify DCMRs
Click the "Identify DCMRs" button to identify DCMRs and N-DCMRs for your analysis.

The interface for setting the DCMR threshold is shown as follows:

Based on the quantified chromatin modification difference among multiple samples, the thresholds for identifying DCMRs are obtained as Schug et al. did when selecting tissue-specifically expressed genes from gene expression profiles [1]. Here, the log base 2 of the fold change between the replicate-dependent difference from the average level across replicates and the theoretical maximum range of chromatin modification was assumed to follow a normal distribution with mean zero and standard deviation s. Setting s = 0.25, the mean modification intensity was sampled from the distribution of observed mean chromatin modification intensities, and 5000 regions with uniform chromatin modification across 11 samples were modeled. The entropy of each simulated region was then calculated by QDCMR. These 5000 entropy values follow a normal distribution, from which a threshold was determined at p = 0.05 (one-sided). This process was repeated 10 times, and the mean of the 10 thresholds was defined as the reference threshold for identifying DCMRs. A similar description can be found in our QDMR paper [2].
[1] Schug J, Schuller WP, Kappen C, Salbaum JM, Bucan M, Stoeckert CJ, Jr.: Promoter features related to tissue specificity as measured by Shannon entropy. Genome Biol 2005, 6:R33.
[2] Zhang Y, Liu H, Lv J, Xiao X, Zhu J, Liu X, Su J, Li X, Wu Q, Wang F, Cui Y: QDMR: a quantitative method for identification of differentially methylated regions by entropy. Nucleic Acids Res 2011, 39:e58.
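
The simulation described above can be sketched in Python under simplifying assumptions. This is not QDCMR's code: the noise model (each sample's level is the region mean multiplied by 2 raised to a normal deviate with standard deviation s), the use of plain Shannon entropy, the lower-tail cutoff z = 1.645 for one-sided p = 0.05, and all function names are assumptions made for illustration.

```python
import math
import random
import statistics

def shannon_entropy(levels):
    """Plain Shannon entropy in bits (assumed stand-in for QDCMR's entropy)."""
    total = sum(levels)
    probs = [level / total for level in levels]
    return -sum(p * math.log2(p) for p in probs if p > 0)

def simulate_threshold(observed_means, rng, n_regions=5000, n_samples=11,
                       s=0.25, z=1.645):
    """One simulation round: entropy cutoff at one-sided p = 0.05."""
    entropies = []
    for _ in range(n_regions):
        # Draw a mean level from the observed means, then add log2-normal
        # noise (assumed model) to mimic a uniformly modified region.
        mean_level = rng.choice(observed_means)
        levels = [mean_level * 2 ** rng.gauss(0.0, s)
                  for _ in range(n_samples)]
        entropies.append(shannon_entropy(levels))
    # Lower tail of a fitted normal distribution: uniformly modified
    # regions should rarely have entropy below this value.
    return statistics.mean(entropies) - z * statistics.stdev(entropies)

def reference_threshold(observed_means, repeats=10, seed=0, **kwargs):
    """Average the cutoff over several simulation rounds."""
    rng = random.Random(seed)
    rounds = [simulate_threshold(observed_means, rng, **kwargs)
              for _ in range(repeats)]
    return statistics.mean(rounds)

# Made-up observed mean levels; fewer regions than 5000 to keep it quick.
threshold = reference_threshold([0.2, 0.5, 0.8], n_regions=500)
```

The resulting value lies slightly below log2(11), the entropy of a perfectly uniform region across 11 samples. The threshold QDCMR proposes may differ, because its entropy and noise model may not match the assumptions made here.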




QDCMR will identify DCMRs among the imported regions using the DCMR threshold defined by the user. The following interface showing the DCMRs appears when the progress bar reaches 100%.

DCMRs and N-DCMRs have been identified and are shown in the DCMR table and the N-DCMR table, respectively. The DCMR table contains regions whose entropy is less than the DCMR threshold. The N-DCMR table contains regions whose entropy is greater than the N-DCMR threshold. The Statistics table contains summary statistics of the regions. The DCMR distribution across chromosomes can be shown by clicking the "DCMR Distribution" button.
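
The two threshold rules can be written as a short Python sketch. The function name, the region IDs, the entropy values and both threshold values are made up for illustration. Placing regions that fall between the two thresholds in a separate group is an assumption, since the tutorial does not describe how such regions are handled.

```python
def classify_regions(entropy_by_region, dcmr_threshold, n_dcmr_threshold):
    """Split region IDs into DCMRs, N-DCMRs and the remainder by entropy."""
    if dcmr_threshold > n_dcmr_threshold:
        raise ValueError("DCMR threshold must not exceed N-DCMR threshold")
    tables = {"DCMR": [], "N-DCMR": [], "unclassified": []}
    for region_id, entropy in entropy_by_region.items():
        if entropy < dcmr_threshold:
            tables["DCMR"].append(region_id)      # low entropy
        elif entropy > n_dcmr_threshold:
            tables["N-DCMR"].append(region_id)    # high entropy
        else:
            tables["unclassified"].append(region_id)
    return tables

# Hypothetical entropies and thresholds.
tables = classify_regions(
    {"R1": 1.2, "R2": 3.44, "R3": 3.1},
    dcmr_threshold=3.0,
    n_dcmr_threshold=3.4,
)
```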



Next you can do each of the following operations:

  1) Click a row to view the chromatin modification level across samples and set the image properties by right click.
  2) Double-click a row to view the region information in the UCSC Genome Browser.
  3) Save DCMRs and N-DCMRs via "File->Save Analysis Result->DCMR or N-DCMR Table".
  4) See the statistics information and DCMR distribution in chromosomes in Statistics Table.
  5) Click the "Measure Specificity" button to measure the sample specificity for all DCMRs.
Measure Specificity
Click the "Measure Specificity" button to measure the sample specificity for all DCMRs. QDCMR will calculate the categorical specificity for each region in each sample. The following interface about sample-specificity of each DCMR will be shown when the progress bar reaches 100%.

The measurement of sample specificity has finished and the results are shown in the Specificity Table. The first few columns contain the region information. The column named "Entropy" contains the entropy for each region. The columns whose names begin with "CS_" contain the specificity of each region in every sample. The last columns are the raw chromatin modification data for each region imported by the user.
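
One way to picture a per-sample specificity value is the categorical specificity defined by Schug et al. [1], Q = H - log2(p), where H is the entropy of the region and p is the relative level of the region in a given sample. The Python sketch below implements that definition. Whether the "CS_" columns in QDCMR use exactly this formula is an assumption, and the function name is hypothetical.

```python
import math

def categorical_specificity(levels):
    """Return Q = H - log2(p) for each sample (Schug et al. style).

    Lower values mean the region is more specific to that sample.
    """
    total = sum(levels)
    if total <= 0:
        raise ValueError("at least one level must be positive")
    probs = [level / total for level in levels]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    # A sample with zero level gets an infinite (least specific) score.
    return [entropy - math.log2(p) if p > 0 else math.inf for p in probs]

# Made-up region that is mostly modified in the first sample.
scores = categorical_specificity([8.0, 1.0, 1.0])
# Uniform region across 4 samples: every score equals 2 * log2(4).
flat_scores = categorical_specificity([1.0, 1.0, 1.0, 1.0])
```

In this example the first sample gets the lowest score, which marks it as the sample the region is most specific to.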

Next you can do each of the following operations:
  1) Click a row to view the chromatin modification level across samples and set the image properties by right click.
  2) Double-click a row to view the region information in the UCSC Genome Browser.
  3) Save Specificity Table via "File->Save Analysis Result->Specificity Table".
  4) Save All Results by clicking "Export All Results" or via "File->Save Analysis Result->All Results".
Export Results
Save all results by clicking "Export All Results" or via "File->Save Analysis Result->All Results". The results are saved in a file containing the Entropy Table, DCMR Table, N-DCMR Table, Specificity Table and Statistics. You can then analyze the DCMRs and N-DCMRs in the biological process in which you are interested.

Visualization
First, to make chromatin modification data easier to inspect, QDCMR provides the visualization module RegionMethyView. RegionMethyView shows the chromatin modification levels across samples and the DCMR distribution on each chromosome. Users can reset and save the figure by right-clicking.

Second, to show the information near each genomic region in the user's data, QDCMR provides a link to the UCSC Genome Browser. Users can view the genome features near each region, such as genes, miRNAs, SNPs, GC percent, CpG islands, and histone modifications. This view facilitates the study of the relationship between DCMRs and other regulatory elements in the genome.
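
If you want to open the same kind of link outside QDCMR, a UCSC Genome Browser URL can be built from a region's coordinates. The Python sketch below uses the browser's `hgTracks` page with its `db` and `position` parameters. The function name and the default assembly `hg19` are assumptions, and the exact link QDCMR generates is not documented here.

```python
from urllib.parse import urlencode

def ucsc_link(chrom, start, end, db="hg19"):
    """Build a UCSC Genome Browser URL for one region.

    db is the genome assembly name (hg19 is only an example default).
    """
    query = urlencode({"db": db, "position": f"{chrom}:{start}-{end}"})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"

# Made-up region for illustration.
url = ucsc_link("chr1", 1000, 2000)
```

Choose the assembly that matches the species and genome build used for your regions.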

FAQ
1. Public Server/General
  1) Where can I find the hardware and software prerequisites for QDCMR?
  A. The hardware requirements, supported operating systems, and supported browsers are listed on the Download site.
  2) How can I get help with QDCMR or provide feedback?
  A. The tutorial fully describes QDCMR and how to use it.
To provide feedback or ask a question please do not hesitate to contact Hongbo Liu (hongbo919@gmail.com) or Yan Zhang (yanyou1225@yahoo.com.cn).

2. Usage

  1) How can I find QDCMR in EpiDiff?
  A. QDCMR is embedded in EpiDiff as the second module. Instructions for starting QDCMR can be found in the Getting Started section of the tutorial.
  2) How is the DCMR threshold in QDCMR determined?
  A. QDCMR is a quantitative approach that quantifies chromatin modification difference and identifies DCMRs from genome-wide chromatin modification profiles by adapting Shannon entropy. The reference threshold is obtained by simulation, following the approach Schug et al. used to select tissue-specifically expressed genes [1]: 5000 regions with uniform chromatin modification across 11 samples are modeled with s = 0.25, a threshold is determined from their entropy values at p = 0.05 (one-sided), and the mean of the thresholds from 10 repetitions is taken as the reference threshold. The full procedure and references [1] and [2] are given in the Identify DCMRs section of the tutorial. A similar description can be found in our QDMR paper [2].


3. Data Formats
  1) What file formats does QDCMR support?
  A. Currently, QDCMR supports Txt and Xls files. More file formats will be supported in later versions.
  2) Where can I find information about file formats used by QDCMR?
  A. Information on the file formats supported by the modules currently in QDCMR is available in the Tutorial.


4. Other
If you have any problems or suggestions, please do not hesitate to contact Hongbo Liu (hongbo919@gmail.com) or Yan Zhang (yanyou1225@yahoo.com.cn).

Our Lab:Group of computational epigenetic research
CopyRight © Group of Computational Epigenetic Research
College of Bioinformatics Science and Technology, Harbin Medical University, China