Mouse Developmental Methylome Database

Tutorial for using DevMouse
Contents
  Introduction: Introduces the DevMouse database.
  Data Statistics: Describes the data processing and statistics for the current DevMouse database.
  Workflow of DevMouse: Describes the workflow of the DevMouse database.
  Search Methylation for Genes/Regions: Describes how to search the methylation patterns of genes/regions in mouse development.
  Analysis Tools for DNA Methylation: Describes how to use the tools in DevMouse to analyze DNA methylation dynamics and functions.
  MethyBrowser for View of Methylation: Describes how to view methylation patterns during mouse development in genome context.
  Download data and Save results from DevMouse: Describes how to save and download data/results from DevMouse.
  Submit New Data for DevMouse: Describes how to submit new methylation data about mouse development to DevMouse.
  Implementation and software support: Describes the design and implementation of DevMouse, and its hardware and software support.
Introduction

DNA methylation undergoes dynamic changes during mouse development and plays crucial roles in embryogenesis, cell-lineage determination, and genomic imprinting. Recently, high-throughput technologies that combine bisulfite conversion with second-generation sequencing, such as RRBS and BS-Seq, have been used to map genome-wide DNA methylation profiles in cells and tissues from different developmental phases. Although bisulfite sequencing enables profiling of mouse developmental methylomes on an unprecedented scale, experimental biologists face challenges in integrating and mining these data. Integrating and mining these data in temporal order should benefit studies, from a global perspective, of how dynamic DNA methylation regulates genes during development. Hence, there is a need for a specialized and comprehensive database that holds methylomes in the temporal order of mouse development. We therefore collected the methylation data scattered across multiple labs and built an integrated database to store these important data and share them with biologists.

To this end, we developed the mouse developmental methylome database (DevMouse), which focuses on the efficient storage of DNA methylomes in temporal order and on the quantitative analysis of methylation dynamics during mouse development. The latest release of DevMouse incorporates thirty-two normalized and temporally ordered methylomes across fifteen developmental stages, together with related genome information. A flexible query engine retrieves methylation profiles for genes, miRNAs, lncRNAs, and genomic intervals of interest across selected developmental stages. To facilitate in-depth mining of these profiles, DevMouse offers online analysis tools for quantification of methylation variation, identification of differentially methylated genes, hierarchical clustering, and gene function annotation and enrichment. Moreover, a configurable MethyBrowser is provided to view the base-resolution methylomes in genomic context. All search and analysis results can be downloaded or saved as figures or tables for further analysis. In brief, DevMouse hosts comprehensive mouse developmental methylome data and provides online tools to explore the relationship between DNA methylation and development. We hope our continuous efforts on the database will contribute to the understanding of epigenetic regulation in mouse development and to the modeling of human development and disease.

Data Statistics

DevMouse was designed to store high-throughput DNA methylation data from mouse development in temporal order. The current version of DevMouse consists of thirty-two single-base-resolution DNA methylomes across fifteen mouse developmental stages, collected from public DNA methylation resources, together with genome information (genes, microRNAs, lincRNAs, CpG islands and others) obtained from public genome databases.

These methylomes were profiled by RRBS or BS-Seq, next-generation sequencing technologies combined with bisulfite conversion. In these methylomes, a methylated cytosine is distinguished from an unmethylated cytosine by the presence of a cytosine rather than a thymine residue at that position during sequencing. The proportion of methylated cytosine is treated as the methylation level, which ranges from 0% (unmethylated cytosine) to 100% (fully methylated cytosine). All methylation data were subsequently normalized and standardized into a consistent interval (0–100%) according to the same procedure. Before being stored, the methylomes from assemblies other than the UCSC July 2007 mouse reference sequence (mm9, NCBI build 37) were converted to mm9 with the LiftOver tool from UCSC. All available data can be downloaded from the download page, which lists detailed information about the methylomes including experiment name, experimental technology, cell/tissue type, developmental stage, sex, author information, download links and external database links.
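The methylation level described above can be illustrated with a short sketch. It is not the DevMouse normalization procedure, which is not detailed here; the function name and the read-count inputs are assumptions made for illustration only.

```python
def methylation_level(c_reads: int, t_reads: int) -> float:
    """Return the methylation level (0-100%) of a single cytosine.

    c_reads: reads showing C at this position (methylated, not converted)
    t_reads: reads showing T at this position (unmethylated, converted)
    Both inputs are hypothetical per-site read counts.
    """
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no read coverage at this cytosine")
    return 100.0 * c_reads / total


# 18 of 20 reads keep the cytosine, so the site is 90% methylated.
print(methylation_level(18, 2))
```

A site covered only by T reads gives 0%, and a site covered only by C reads gives 100%, matching the interval used throughout DevMouse.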

The data content and statistics in DevMouse as of 1 August 2013 are listed as follows:

Workflow of DevMouse

Based on these high-throughput cytosine methylomes, DevMouse provides four basic operations: search, analysis, view and download. A flexible query engine is provided for retrieving and investigating the methylation profiles of genes/regions of interest. Analysis tools written in Java facilitate in-depth mining of novel knowledge about DNA methylation and development. Moreover, the methylation information and novel findings can be viewed using visualization modules based on the Apache Batik SVG toolkit, and can be downloaded before exiting the browser.

Search Methylation for Genes/Regions

DevMouse is designed to help users explore DNA methylation dynamics during mouse development. Searches for a single gene/region and for multiple genes/regions are both supported. Users can search the methylation patterns of various genome items such as genes, microRNAs, lncRNAs, CpG islands, and other genome regions, which should benefit a broad range of molecular biology researchers whose interests span genes and specific genome regions. To search for a list of genes/regions, copy and paste the identifiers or search terms into the box, one per line. View the sample queries to see how you can execute wildcard searches.

In addition, you should select the developmental stages you are interested in before submitting the search. For example, you can select the whole developmental course of a specific tissue or organ: gametes, zygote, pre-implantation, post-implantation, and after birth. Three example developmental courses are provided to illustrate how searches can be run. The first is the whole developmental course of Primordial Germ Cells (PGC) in male mouse. The second is the development of Primordial Germ Cells (PGC) in female mouse. The last is the whole developmental course of astrocytes.

Genes: The search tools on the DevMouse homepage can be used to retrieve the methylation states of any genes of interest across multiple developmental stages. The first step is finding those genes in DevMouse. To make this as easy as possible, we have indexed most of the major mouse gene identifiers, including gene symbol, RefSeq gene, MGI ID, etc. For a gene longer than 500bp, the methylation state is calculated as the mean methylation level of the cytosines located in the promoter region, from 2kb upstream to 500bp downstream of the transcriptional start site. The methylation state of a gene shorter than 500bp is calculated as the mean methylation level of the cytosines located in the whole gene region. When you search by gene symbol, please keep in mind that symbols and aliases are not unique in the databases, so multiple gene entities may be returned, from which you can select the transcripts of interest as shown in the following figure.

Non-coding RNAs: Recent studies have revealed the essential roles of non-coding RNAs in regulating development and differentiation. Thus, DevMouse also allows users to search and view the dynamic methylation of non-coding RNAs such as microRNAs and lncRNAs. For a non-coding RNA longer than 500bp, the methylation state is calculated as the mean methylation level of the cytosines located in the promoter region, from 2kb upstream to 500bp downstream of the transcriptional start site. The methylation state of a non-coding RNA shorter than 500bp is calculated as the mean methylation level of the cytosines located in the whole gene region.

Genome regions: You can also search by genome location by specifying a chromosomal position. Sometimes users are interested in non-gene regions such as CpG islands, differentially methylated regions, imprinting control regions, DNA methylation valleys, or regions they define themselves. The methylation state of a genome region in a developmental stage is calculated as the mean methylation level of the cytosines located in the whole region.
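The rules above for genes, non-coding RNAs and genome regions can be sketched as follows. This is a minimal illustration, not the DevMouse code: the function names, the 1-based inclusive coordinates, the strand handling, and the treatment of a feature of exactly 500bp as "short" are all assumptions.

```python
from statistics import mean

UPSTREAM = 2000    # bp upstream of the transcriptional start site (TSS)
DOWNSTREAM = 500   # bp downstream of the TSS
MIN_LENGTH = 500   # features longer than this are scored on the promoter


def scoring_region(start, end, strand):
    """Return the (start, end) interval whose cytosines are averaged.

    Coordinates are assumed to be 1-based and inclusive.
    """
    length = end - start + 1
    if length <= MIN_LENGTH:
        # Short feature: use the whole gene region.
        return start, end
    if strand == "+":
        tss = start
        return max(1, tss - UPSTREAM), tss + DOWNSTREAM
    # Minus strand (assumed): the TSS is at the right-hand end.
    tss = end
    return max(1, tss - DOWNSTREAM), tss + UPSTREAM


def methylation_state(cytosines, region):
    """Mean methylation level of the cytosines inside `region`.

    cytosines: dict mapping position -> methylation level (0-100).
    Returns None when no cytosine falls inside the region.
    """
    lo, hi = region
    levels = [level for pos, level in cytosines.items() if lo <= pos <= hi]
    return mean(levels) if levels else None


# Hypothetical plus-strand gene at 10000-15000 with three measured cytosines.
region = scoring_region(10000, 15000, "+")
print(region, methylation_state({8500: 80.0, 10200: 60.0, 12000: 10.0}, region))
```

For a user-defined genome region, the same `methylation_state` function can be applied directly to the region's own start and end coordinates.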

The search result is displayed by default as an overview table that summarizes the methylation profiles of genes across multiple developmental stages, together with gene information and chromosomal location. The table contains links to a methylation pattern panel, in which the methylation profile of a selected gene can be viewed, and links to the MethyBrowser, in which users can view the methylation profile alongside genomic information according to the specified view parameters. The whole query result can be downloaded to your local computer from the download links in the overview panel.

In brief, these search options offer users an easy way to find the methylation patterns of the genes/regions they are looking for.

Analysis Tools for DNA Methylation

DNA methylation is highly dynamic during mouse development, and hypermethylation of gene promoters inhibits some genes, including developmental genes, at specific developmental stages. However, few genes have been documented to exhibit differential promoter methylation over a whole developmental process, owing to the limited availability of DNA methylation data. Although bisulfite sequencing now enables profiling of mouse developmental methylomes on an unprecedented scale, experimental biologists with limited bioinformatics experience face challenges in integrating and mining these data.

To facilitate integrative analysis of methylomes, DevMouse offers online tools for quantitative analysis, as shown in the following figure.


(i) Entropy-based quantification of methylation variation
An entropy-based approach, QDMR, is integrated to quantify the methylation variation of a gene across multiple developmental stages, with lower entropy indicating greater methylation variation, as described in our previous study. Based on the sorted entropy, users can analyze the quantitative interplay between DNA methylation and chromatin modifications. The entropy values can also be used to investigate the quantitative roles of DNA methylation in regulating gene expression.

(ii) Identification of differentially methylated genes/regions
Based on the methylation difference quantified by entropy, the genes/regions with great variation across multiple developmental stages can be identified. Genes/regions with entropy lower than a threshold are identified as differentially methylated genes/regions and are marked in red in the result table. These differentially methylated genes/regions are potential biological markers for specific developmental stages of mouse. The regulatory mechanisms that DNA methylation induces for these genes can be verified by further 'wet' experiments.
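The idea behind (i) and (ii) can be sketched with plain Shannon entropy. This is a deliberately simplified illustration and not the QDMR method, which uses its own adapted entropy model; the function names, the example profiles and the threshold value are hypothetical.

```python
import math


def profile_entropy(levels):
    """Shannon entropy (in bits) of a methylation profile across stages.

    levels: methylation levels (0-100), one per developmental stage.
    A profile concentrated in few stages gives low entropy; a uniform
    profile gives the maximum, log2(number of stages).
    """
    if not levels:
        raise ValueError("empty methylation profile")
    total = sum(levels)
    if total == 0:
        # Unmethylated in every stage: no variation, so maximum entropy.
        return math.log2(len(levels))
    entropy = 0.0
    for level in levels:
        if level > 0:
            p = level / total
            entropy -= p * math.log2(p)
    return entropy


def differentially_methylated(profiles, threshold):
    """Names of genes/regions whose entropy is below `threshold`."""
    return sorted(
        name for name, levels in profiles.items()
        if profile_entropy(levels) < threshold
    )


# Hypothetical profiles across four stages.
profiles = {"GeneA": [50, 50, 50, 50], "GeneB": [0, 0, 0, 100]}
print(differentially_methylated(profiles, threshold=1.0))
```

Plain Shannon entropy on raw levels has known weaknesses (for example, it cannot tell a uniformly high profile from a uniformly low one), which is one reason a dedicated method is used in practice.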

(iii) Hierarchical clustering analysis of methylation profiles
Hierarchical clustering helps users view the methylation patterns of various genes/regions across multiple samples. Euclidean distance is used as the distance measure among genes/regions or developmental stages. The clustering result is shown as a heat map at the bottom and can be downloaded to a local computer for further analysis. In the heat map, rows represent the genes/regions and columns represent the developmental stages. The heat map can be used to identify modules in which the methylation values are similar to each other. The hierarchical clustering tool can be used to study the methylation similarity among genes or stages, and to identify genes with similar methylation patterns across developmental stages. Genes with similar methylation patterns across developmental stages may share similar regulatory mechanisms and functions.
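As a rough illustration of clustering methylation profiles by Euclidean distance, the sketch below performs a small agglomerative clustering in pure Python. The linkage rule used by DevMouse is not stated here, so average linkage is an assumption, as are the function names and the example profiles.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two methylation profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hierarchical_clustering(profiles):
    """Agglomerative clustering with average linkage (assumed).

    profiles: dict mapping gene/region name -> list of methylation levels.
    Returns the merges in order, each as (cluster_a, cluster_b, distance),
    where clusters are tuples of names.
    """
    def linkage(c1, c2):
        dists = [euclidean(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical profiles across three developmental stages.
profiles = {
    "GeneA": [90, 85, 10],
    "GeneB": [88, 80, 12],
    "GeneC": [5, 10, 95],
}
for left, right, dist in hierarchical_clustering(profiles):
    print(left, right, round(dist, 2))
```

GeneA and GeneB are merged first because their profiles are close, and GeneC joins last. Clustering developmental stages instead of genes amounts to running the same function on the transposed table.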

(iv) Gene function annotation and enrichment
FunctionAnnotation integrates the functional annotation tool of DAVID Bioinformatics Resources 6.7. It can be used to analyze the functions of genes with methylation data across multiple stages. The functional annotation results are shown in the table at the bottom and can be downloaded to a local computer for further analysis. The annotated fields include Gene Name, Species, SP_COMMENT_TYPE, SP_PIR_KEYWORDS, UP_SEQ_FEATURE, GOTERM_BP_FAT, GOTERM_CC_FAT, GOTERM_MF_FAT, GENERIF_SUMMARY, KEGG_PATHWAY, and INTERPRO. FunctionEnrichment can be used to carry out gene function enrichment analysis of gene sets. The functional enrichment results are likewise shown in the table at the bottom and can be downloaded to a local computer for further analysis.

All analysis results produced by these tools can be downloaded as figures or tables for further analysis. One merit of these tools is that they analyze the given genes in a highly automated way, which facilitates focused analyses such as the identification of functional gene sets, for example potentially novel developmental genes.

MethyBrowser for View of Methylation

A configurable methylation browser, MethyBrowser, was developed using the Apache Batik SVG toolkit so that users can view the methylomes and genome information simultaneously. The DevMouse home page provides access to MethyBrowser. To get started, click the MethyBrowser link on the top navigation bar. This will take you to a MethyBrowser page where you can search for a gene/region to display. To get oriented, try viewing a gene or region of the genome with which you are already familiar, or use the default position.
The search mechanism embedded in MethyBrowser is not a site-wide search engine. Instead, it primarily searches gene symbols (for example Pou5f1) and RefSeq mRNAs (NM_013633). Searches on genome locations are also supported.

The structure of MethyBrowser page
The MethyBrowser tracks page displays a genome location with five main features: a set of navigation controls, a chromosome ideogram, the annotation tracks image, display configuration buttons, and a set of track display controls. The application default values are used to configure the data tracks display. You can customize the annotation tracks display to suit your needs by manipulating the navigation, configuration and display controls.

Zooming in and out
Click the zoom in and zoom out buttons at the top of the MethyBrowser page to zoom in or out on the center of the annotation tracks window by 1.5-, 3- or 10-fold. Alternatively, you can zoom in 3-fold by clicking anywhere on the Base Position track. In this case, the zoom is centered on the coordinate of the mouse click. Note that the base sequence is shown only when the region is shorter than 150bp.
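The zoom behavior described above can be sketched as simple window arithmetic. This is an illustration only, not MethyBrowser's code: the function names, the 1-based inclusive coordinates and the rounding are assumptions, and clamping at the chromosome end is left out.

```python
def zoom(start, end, factor, zoom_in=True, center=None):
    """Return the new (start, end) window after zooming by `factor`.

    The window is centered on `center` (e.g. a click position) or, when
    `center` is None, on the middle of the current window.
    """
    width = end - start + 1
    new_width = max(1, round(width / factor if zoom_in else width * factor))
    if center is None:
        center = (start + end) // 2
    new_start = max(1, center - new_width // 2)
    return new_start, new_start + new_width - 1


def shows_base_sequence(start, end, limit=150):
    """The base sequence is drawn only for windows shorter than `limit` bp."""
    return end - start + 1 < limit


# Zoom in 3-fold on the center of a 3,000 bp window.
print(zoom(1000, 3999, 3))
# Zoom in 3-fold around a click at position 1200.
print(zoom(1000, 3999, 3, center=1200))
```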

DNA methylation track
MethyBrowser connects to a MySQL backend to show the methylation profiles of the selected gene/region across multiple developmental stages at single-base precision. A color gradient from green (methylation value=0%) to red (methylation value=100%) is used to display the numeric methylation states of the cytosines. Users can select the developmental stages they want to view with the set of track display controls at the bottom.
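A green-to-red gradient of this kind can be sketched as a linear interpolation between the two colors. The exact mapping used by MethyBrowser is not specified here, so the linear RGB interpolation and the function name are assumptions.

```python
def methylation_color(level):
    """Map a methylation level (0-100%) to an RGB hex color.

    0% maps to pure green and 100% to pure red, with a linear blend
    in between (assumed).
    """
    if not 0 <= level <= 100:
        raise ValueError("methylation level must be between 0 and 100")
    fraction = level / 100
    red = round(255 * fraction)
    green = round(255 * (1 - fraction))
    return f"#{red:02x}{green:02x}00"


for level in (0, 50, 100):
    print(level, methylation_color(level))
```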

Genomic information track
MethyBrowser also visualizes other available genomic annotations, including the chromosome ideogram, base sequence, gene structure, CpG islands, lncRNAs, miRNAs, transcription factor binding sites (TFBS), gene expression, single-nucleotide polymorphisms (SNP), repeat elements, sequence tagged sites (STS) along the mouse draft assembly, and the alignment of the region to the human genome (hg19). These genomic annotations are explained below.


Chromosome ideogram: The chromosome ideogram shows the relative location on the chromosome of the genome region in the current window.
Base sequence: MethyBrowser provides the base sequence so that users can examine the cytosines and related sequence information. Note that the base sequence is shown only when the region is shorter than 150bp.
Gene structure: This track shows the RefSeq genes and their structures obtained from the UCSC Table Browser. For a RefSeq gene, the window accurately displays the width of exons and introns, and indicates the direction of transcription (using arrowheads) for multi-exon features. At a grosser scale, certain features, such as thin exons, may disappear.
CpG island: MethyBrowser shows the CpG islands, which are enriched in CpG dinucleotides in mammalian species. The CpG islands shown here are predicted by Gardiner-Garden's method (Gardiner-Garden M et al. (1987)) and by Su's CpG_MI (Jianzhong Su et al. (2010)), respectively.
lncRNAs: The lncRNAs are obtained from the MGI database (June 2013).
miRNAs: The miRNAs are obtained from miRBase (Release 20: June 2013).
Transcription factor binding site (TFBS): This track displays the binding sites of nine transcription factors (Dax1, Klf4, Myc, Nac1, Nanog, Oct4, Rex1, Sox2, Zfp281) in embryonic stem cells, as profiled by Jonghwan Kim et al. (Jonghwan Kim et al. (2008)).
Gene expression: This track shows expression data from the GNF Gene Expression Atlas 2. This contains two replicates each of 61 mouse tissues run over Affymetrix microarrays. By default, averages of related tissues are shown. As is standard with microarray data, red indicates overexpression in the tissue and green indicates underexpression.
Single-nucleotide polymorphism (SNP): This track contains information about single nucleotide polymorphisms from dbSNP build 128.
Repeat elements: This track shows the repeat elements of the mouse genome, downloaded from the UCSC Table Browser.
Sequence tagged site (STS): This track shows the locations of Sequence Tagged Sites (STS) along the mouse draft assembly, downloaded from the UCSC Table Browser.
The alignment in human genome (hg19) of this region: The chain track shows alignments of human (Feb. 2009 (GRCh37/hg19)) to the mouse genome using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. The links to the alignment in human genome (hg19) may be helpful for exploring the potential roles of orthologous genes/regions in human genome.

Save MethyBrowser as PDF Figure
With the “Save as PDF” button, the browser graphic can be saved as a PDF image, which can be printed with Acrobat Reader and edited in many drawing programs such as Adobe Illustrator. Other features of MethyBrowser include the ability to view a region by specifying its genomic coordinates, to zoom and move the displayed region, and to show or hide individual feature tracks with a mouse click. Links to MethyBrowser are also available on the search result page.

Download data and Save results from DevMouse

To facilitate further analysis of mouse developmental methylomes, DevMouse provides download links to all the available data in the database. The data provided on the download page include the DNA methylation data that we normalized. All data are listed with detailed experimental information, including experimental technology, sample source, developmental stage, sex of the sample mouse, and author information. Moreover, links to the corresponding raw data in public data resources such as Gene Expression Omnibus (GEO) and DNA Data Bank of Japan (DDBJ) are provided for users who are interested in the raw data.

In addition, all search and analysis results can be downloaded as figures or tables for further analysis. With the “Download” button on the search result page, users can save the differentially methylated genes/regions, the uniformly methylated genes/regions, and all genes with methylation profiles. With the “Download” button in the analysis panel, users can save the analysis results. With the “Save as PDF” button, the browser graphic can be saved as a PDF image, which can be printed with Acrobat Reader and edited in many drawing programs such as Adobe Illustrator.

Submit New Data for DevMouse
The current version of DevMouse is the first release of our database. Although it contains a wealth of development-specific DNA methylomes in mouse, which will be useful for both experimental and bioinformatics researchers, the available data and functionality are still limited. Because our aim is to build a DNA methylome database focused on mouse development, we will continue to update the DevMouse data, add more methylation analysis tools, and improve the functionality of the database and MethyBrowser. As high-throughput bisulfite sequencing rapidly profiles DNA methylomes in more and more samples, we will continuously collect the latest datasets from different developmental stages of mouse to keep DevMouse up to date.

We invite and encourage the scientific community to submit methylation data about mouse development in order to keep DevMouse current and comprehensive. If you would like to submit your methylation data about mouse development to DevMouse, you can send us the data information, such as PubMed ID, Data Link, Development stages, Technology, E-mail and other comments, through our submit page. This information can also be sent to Hongbo Liu (hongbo919@gmail.com).

Implementation and software support

DevMouse was constructed from three major software components: an Apache Tomcat web server, a MySQL relational database, and Java-based computational services. The back-end processing programs were written in Java and are available upon request. The web services were developed using Apache Struts2, a Java web application framework, and iBATIS, a persistence framework that automates the mapping between MySQL databases and Java objects; both help ensure the performance and stability of the web services. Browser-based interfaces were built in JSP and AJAX. The Apache Batik SVG toolkit was used to render, generate, and manipulate Scalable Vector Graphics (SVG) dynamically. Currently, web browsers that can display SVG images on web pages include Firefox, Internet Explorer (IE 9+), Opera, Safari and Google Chrome. Among these, Firefox is recommended for the best experience.

Together, these design choices allow users to access all of the key features of DevMouse through major modern web browsers and mobile devices without installing any other software.

Recommended Browser: Mozilla Firefox (1024*768)