
Hands-On Training in Methylation Sequencing Analysis

Time / Topic / Speaker

09:00~10:30  WGBS Dataset QC & Mapping (Dr. 張益峰)

  • Introduction to the test data (human IMR90 and H1 WGBS datasets)

  • Running short-read QC

  • Introduction to meth-pipe (website, download, installation)

  • Running meth-pipe for short-read mapping and processing the mapped raw data

  • Computing the bisulfite conversion rate

  • Computing HMR / DMR / PMD

  • Generating BED-format files

10:45~11:00  Q & A / Break

11:00~12:30  Basic WGBS Analysis and UCSC Genome Browser (Dr. 林依璿)

  • Introduction to BEDtools

  • BEDtools hands-on (1): distribution of HMR / DMR / PMD on Chromosome X

  • BEDtools hands-on (2): computing DNA methylation of CpG Islands

  • BEDtools hands-on (3): comparing methylation on Chromosome X between male (H1) and female (IMR90)

  • Presenting these methylation differences with R and the ggplot2 package

  • Introduction to and use of the UCSC Genome Browser

  • Introduction to and use of the MethBase Public TrackHub in the UCSC Genome Browser

12:30~13:00  Q & A / Lunch

This slide is available at http://www.slideshare.net/YiFengChang


Required Software in Your Laptop

• Linux console
  • Putty: http://the.earth.li/~sgtatham/putty/latest/x86/putty.exe

• SCP/SFTP/FTP client
  • Winscp: http://winscp.net/download/winscp556.zip

• PDF viewer
  • http://get.adobe.com/tw/reader/


NCHC ALPS1 for NRPB Users

• Login node: alps1.nchc.org.tw

• Computing nodes
  • 48 (we can use 4) x 48 cores, 128GB RAM
  • 1 x 64 cores, 1TB RAM

• Storages
  • Users home: 200GB (Temp Account: 1GB)
  • /work3: 42TB
  • /work5: 200TB


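What is a queue?

A queue is a virtual computing unit set up for jobs; one queue can carry one computing job. If a machine is assigned two queues, it may run two jobs at the same time. Queues are designed to manage computing resources.

Queuing system provided by NCHC

NCHC provides IBM Load Sharing Facility (LSF).

• All jobs must be run through the queuing system.
• Eight queues are currently available: Test, 4G, 16G, 48G, 128G, 192core, 384core, and 1T.
• Job priority order: Test > 4G > 16G > 48G > 128G > 192core > 384core > 1T.
• A job that uses more memory than its queue's limit will be terminated.
• You can first run a trial job in the 1T queue to find out how much memory it needs, then choose a suitable queue.
• The limits and rules for each queue are as follows:

Currently available queues

Queue name                Test [1]  4G  16G  48G  128G  192core     384core     1T
Memory limit (GB)         2         4   16   48   100   100/48core  100/48core  1024
Core limit                1         2   8    24   48    192         384         64
Recommended core limit    1         1   6    20   40    192         384         60
Job priority [2]          90        85  80   50   30    20          20          10

[1] Each user may submit only one job to the Test queue, and its run time is limited to 10 minutes.
[2] A larger number means a higher priority.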
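As a rough aid for reading the queue limits above, the sketch below picks the highest-priority queue whose memory and core limits cover a job. The function name `pick_queue` is hypothetical and not part of LSF. The limits are copied from the slide; the Test queue is skipped because of its 10-minute run-time limit, and the 192core/384core queues are left out because their memory limit is given per 48 cores.

```shell
# Hypothetical helper (not an LSF command): choose a queue from the
# memory and core limits listed on the slide.
# Usage: pick_queue <memory_GB> <cores>   (whole numbers only)
pick_queue() {
    need_mem=$1
    need_cores=$2
    # Entries are name:memory_limit_GB:core_limit, highest priority first.
    for entry in 4G:4:2 16G:16:8 48G:48:24 128G:100:48 1T:1024:64; do
        name=${entry%%:*}
        rest=${entry#*:}
        limit_mem=${rest%%:*}
        limit_cores=${rest#*:}
        if [ "$need_mem" -le "$limit_mem" ] && [ "$need_cores" -le "$limit_cores" ]; then
            echo "$name"
            return 0
        fi
    done
    echo "none"
    return 1
}

pick_queue 14 1    # a 14 GB single-core job fits the 16G queue
```

The result can then be passed to `bsub -q`. The memory figure itself still has to come from a trial run, as the slide suggests.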


Submit a Job

s00yao25@alps1:~> cd

s00yao25@alps1:~> bsub -q 4G -o stdout -e stderr "ls"

Job <422673> is submitted to queue <4G>.

s00yao25@alps1:~> cat stdout

s00yao25@alps1:~> bjobs

JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME

422673 s00yao2 RUN 4G alps1 2*alps1-25 ls Dec 18 23:16

s00yao25@alps1:~> bjobs

No unfinished job found
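When jobs are submitted from a script, it can help to capture the job ID from the confirmation line that `bsub` prints (as in the session above). The following is a minimal sketch; `job_id_from_bsub` is a hypothetical helper name, and it assumes the confirmation line has the form shown on the slide.

```shell
# Hypothetical helper: extract the numeric job ID from bsub's
# "Job <ID> is submitted to ..." confirmation line read on stdin.
job_id_from_bsub() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted to .*$/\1/p'
}

# Demonstration with the line shown on the slide (no cluster needed).
echo 'Job <422673> is submitted to queue <4G>.' | job_id_from_bsub
```

On the cluster the helper would sit at the end of the submission, for example `jid=$(bsub -q 4G -o stdout -e stderr "ls" | job_id_from_bsub)`, after which `bjobs "$jid"` reports on that job only.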


Check Job Status


Kill a Job

>bkill JOBID

# change the queue priority of pending jobs

> btop


Effect and Problems of Bisulfite Treatment of DNA

Krueger, F., Kreck, B., Franke, A. & Andrews, S.R. DNA methylome analysis using short bisulfite sequencing data. Nat Methods 9, 145-51 (2012).

Mapping bisulfite reads to 4 possible bisulfite strands (OT/CTOT/OB/CTOB) is equivalent to mapping the bisulfite read and its reverse complementary read to both Top/Bottom strands of the original reference sequence.
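The equivalence stated above can be illustrated with in-silico three-letter conversion. This is only a sketch of the idea, not the method of any particular aligner, and the helper names `c_to_t`, `g_to_a`, and `revcomp` are hypothetical. It shows that G-to-A converting a read and then reverse-complementing it gives the same sequence as reverse-complementing first and then C-to-T converting, which is why handling a read and its reverse complement covers all four strands.

```shell
# Hypothetical helpers for in-silico bisulfite conversion of sequences on stdin.
c_to_t() { tr 'Cc' 'Tt'; }    # unmethylated C reads as T
g_to_a() { tr 'Gg' 'Aa'; }    # the same event seen from the opposite strand
revcomp() {
    # Reverse each line with awk, then complement the bases.
    awk '{ out = ""; for (i = length($0); i > 0; i--) out = out substr($0, i, 1); print out }' |
        tr 'ACGTacgt' 'TGCAtgca'
}

read_seq=TCGATCGT                      # toy sequence, not real data
echo "$read_seq" | c_to_t              # prints TTGATTGT
echo "$read_seq" | revcomp | c_to_t    # prints ATGATTGA
```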


How to Align BS Reads Against Reference Genome?

Krueger, F. & Andrews, S.R. Bismark: A flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics (2011).

Bock, C. Analysing and interpreting DNA methylation data. Nat Rev Genet 13, 705-19 (2012)

[Figure: example alignments of bisulfite reads. Recoverable labels: "Y = C or T"; sequences TCGA, TCGT, ACGTATGA, TTGT, ATGT, TCGA, ATGA; "Multiple hits" (shown twice).]


Why RMAPBS/RMAPBS-PE

http://smithlabresearch.org/manuals/rmap_manual.pdf


Analysis Pipeline

• Allele-specific Methylated Regions: amrfinder, allelicmeth

• Differential Methylation Region: dmr

• Large Hypo/Hyper-Methylation Domains: pmd

• Hypo/Hyper-Methylation Regions: hmr, hypermr, pmr

• Methylation Calling: methcounts + error correction

• Bisulfite Conversion Rate: bsrate

• Remove Duplicate Reads: duplicate-remover

• Mapping: rmapbs, rmapbs-pe

• Quality Trimming: fastq_masker

• Cross-species Comparison of Methylomes: liftOver

• Calculating Methylation Ratio for Regions: bigWigAverageOverBed, roimethstat, bwtool

• Generate Methylation BED file: Bedtools, bedGraphToBigWig

Software:

• fastx toolkit: http://hannonlab.cshl.edu/fastx_toolkit/

• MethPipe: http://smithlabresearch.org/software/methpipe/

• Bedtools: https://github.com/arq5x/bedtools2

• Programs from UCSC Genome Browser: http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64

• bwtool: https://github.com/CRG-Barcelona/bwtool/wiki


H1 (male): human embryonic stem cells (107GB)

IMR90 (female): fetal lung fibroblasts (154GB)

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE16256


Convert SRA to Fastq (DO NOT RUN)

sra-toolkit: https://github.com/ncbi/sratoolkit

> fastq-dump --split-3 SRR018975.sra

> ls

SRR018975.fastq


Quality Trimming (DO NOT RUN)

# Example input: SRR018975.fastq.gz
# For each gzip file, one by one: decompress it, mask bases whose quality
# is below 30 as N (-Q33 selects Phred+33 quality encoding), then split the
# FASTQ output into smaller files of 6,000,000 lines (1,500,000 reads) each.

> for f in *.gz
do
  b=`basename $f .gz`    # e.g. SRR018975.fastq
  echo $f
  bsub -q 4G -o $f.stdout -e $f.stderr "\
gzip -dc $f | \
fastq_masker -q 30 -Q33 | \
split -dl 6000000 - $b-"
done

> ls

SRR018975.fastq-00

SRR018975.fastq-01

SRR018975.fastq-02


Mapping

bsub -q 4G -o rmapbs.stdout -e rmapbs.stderr "\
/work3/NRPB1219/bin/rmapbs-pe \
-c /work3/NRPB1219/methpipe-data/data/genome \
-o /home/s00yao00/Output/test.mr \
-m 3 -L 400 -C AGATCGGAAGAGC:GCTCTTCCGATCT \
/work3/NRPB1219/methpipe-data/data/snippet_1.fq \
/work3/NRPB1219/methpipe-data/data/snippet_2.fq"

# head /home/s00yao00/Output/test.mr

chr22 379487 379588 FREDDYKRUEGER_0001:1:1:160:969#0/2 2 + AAGTAATATATATGTTTTGGGATTAGTGAATTTAAGTTAGTATTAAGAATTTTATTATTATTTTTTTATTATATTTTAGAGAGTTATTTTTTTATTTTTAA B_c\\^cbbPfbfbLc]][[Xcb\cc`cbd`bbc\bdZUdbfOffffdffdSSTTQbZbbb`\df[bbbbbdffYffffffbbbZbZX^\^a\fffffcff
chr22 379487 379588 FREDDYKRUEGER_0001:1:1:160:969#0/1 3 + AAGTAATATTTATGTTTTGGGATTAGTGAATTTAAGTTAGTATTATGAATTTTATTATTATTTTTTTATTATATTTTAGAGAGTTATTTTTTTCTTTTTAA `dd``addddbcUcccdcfffaf_ZdddOVM_[bOZdb`dbffZ`Ob\Obffaeff^ffdfffffffaffdWf_bdebdaU\[^[ffa_f_b`OUK^`UZT
chr22 568970 569071 FREDDYKRUEGER_0001:1:1:249:1215#0/2 0 - TTTTATTTGATGGATAATATTAAGAAATTTGTAGTATTGTTTTGGAATTTTTTGTGAGGGATAAATAAATAGAATATAGTAGTATTGTTTTATAATTTTTT BcXe^cecccgbabfadadcf^cgggegTggefgeggggggggggggggggggggggggegaggg^gggbggggggggggggegggggggggfggfcdggg
chr22 310957 311058 FREDDYKRUEGER_0001:1:1:303:856#0/2 4 + TTGTAATTATGTTGATTTTATGTGTAGTTATTGATGTTGTTGTATGGTAGTTTATGGTTTTTTAGGAATTTAGAATTTGAGTTTTATTTTTGTTTTATAGT Z]\]`\W[J`]fdPcefbf^fgg\gggWcggaeaedSSOQT\cggdgcgggcdgffdeaccaddcadfbacfaffcaaecadggbdggggcgcgggggccg
chr22 568970 569071 FREDDYKRUEGER_0001:1:1:249:1215#0/1 2 + AAAGGATTGTAGAATAGTGTTATTGTGTTTTGTTTGTTTGTTTTTTATAAAGGATTTTAGAGTAGTGTTGTAGATTTTTTAGTGTTGTTTATTAGATAGGT ggagcfggdgggbegffgfgggggggggfgggggggggggggfggggfcf^d^Pfggggebafbbgfgfgge^^ggggggccggfegefggfggeP\^ZW[
chr22 581983 582084 FREDDYKRUEGER_0001:1:1:359:1280#0/2 2 + GAGGTAATTTAGAGTGTTGTTTTTGGTTTTTGAGGGTTTGTTTTTTAAATAGGATATTATTATATTGTTACGATAGTTTGAATGTTTGTTCGTGATAAATG Bffeeegg_fgeegggeeeeagcggggfgaggggggggggggggggggggfgffdeggggggfgfgggffgggffgggggfg\gfgggggbgbggggggcf
chr22 310918 311019 FREDDYKRUEGER_0001:1:1:303:856#0/1 0 + TATATAGAGTTAGGTTTTATAGTTTATTTTTTTATTATTTTGTAATTTTGTTAATTTTATGTTTAGTTATTGATGTTTTTGTATGGTAGTTTATGGTTTTT faf_f_ggeggcg^ggggbgdggggcgggggggdggdggggggg^ggggggdg[ggggdggcgdccggagggNggcdgggcbYddTcKRcde[ddYgcggg
chr22 581975 582076 FREDDYKRUEGER_0001:1:1:359:1280#0/1 1 + TATTTTATAAGGTAATTTAGAGTGTTGTTTTTGGTTTTTGAGGGTTTGTTTTTTAAATAGGATATTATTATATTGTTACGATAGTTTGAATGTTTGTTCGT hggggfggghggegggggcg]gggggggggggggggggggefgeeggggggfggggfgggceggggggegggggggfeaeag\gdgggWgegegecgegfg
chr22 578161 578262 FREDDYKRUEGER_0001:1:1:871:393#0/2 3 + TTAGAATAGTGTTGTTTGTATTCGAGTGTTTGTTTTTTATATAGGATTATAGAATATTTTTACGAGTGTTCGAATGTTTGTTTTTTAGATAGGATTTTGGA ^ac]eL]]R\_Jee\eLegggVbggeeeeVeefceegaggggggggefgdgggggfggggdgggggggggggggggggggegegggggggfgggggggggg
chr22 578102 578203 FREDDYKRUEGER_0001:1:1:871:393#0/1 0 + AAGAAAGGATTTTAGAATAGTGTTGTTGTGTTTTGAGTGTTTGTTATTTTTGAATGATTTTAGATTATTGTTGTTGGTATTCGAGTGTTTGTTTTTTATAT cce^aVeaaecdeefO\adfegcgg_gdggagggghgggggdggfbgggggbg_fgWgeggWgLgggggeggcgdePa`fff\afOe`_egggdg_g_gae
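Each record in the `head` output above has eight whitespace-separated fields, which appear to be chromosome, start, end, read name, a small integer score, strand, sequence, and quality string. Under that assumption, the sketch below tallies reads per strand and counts records where end minus start differs from the sequence length. `mr_summary` is a hypothetical helper, not a MethPipe program, and the demonstration uses shortened toy records rather than real data.

```shell
# Hypothetical helper: summarize mapped-read (.mr) records read on stdin.
# Assumed fields: chrom start end name score strand sequence quality
mr_summary() {
    awk '{
        n++
        strand[$6]++
        if ($3 - $2 != length($7)) bad++    # interval should span the read
    }
    END {
        printf "reads=%d plus=%d minus=%d length_mismatch=%d\n",
               n, strand["+"], strand["-"], bad
    }'
}

# Toy records with 4-base reads.
printf '%s\n' \
    'chr22 100 104 readA/1 0 + ACGT hhhh' \
    'chr22 200 204 readB/1 1 - TTGA hhhh' | mr_summary
# prints: reads=2 plus=1 minus=1 length_mismatch=0
```

On the cluster it could be run as `mr_summary < /home/s00yao00/Output/test.mr`.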


Sorting mr file

bsub -q 16G -o stdout -e stderr "\

LC_ALL=C sort -S 14G -k 1,1 -k 2,2n -k 3,3n -k 6,6 \

-o /work3/s00yao25/h1.chrX.mr.sorted_start \

/work3/NRPB1219/h1.chrX.mr"
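To see what the sort keys above do, the same keys can be tried on a few toy records. This is a small sketch: `sort_mr_by_start` is a hypothetical wrapper, the `-S` buffer option is dropped because the input is tiny, and the records are made up for illustration.

```shell
# Hypothetical wrapper around the slide's sort keys:
# chromosome (field 1), start (field 2, numeric), end (field 3, numeric), strand (field 6).
sort_mr_by_start() {
    LC_ALL=C sort -k 1,1 -k 2,2n -k 3,3n -k 6,6 "$@"
}

# Toy records: chrom start end name score strand
printf '%s\n' \
    'chrX 300 400 r3 0 +' \
    'chrX 100 200 r1 0 -' \
    'chr1 500 600 r4 0 +' \
    'chrX 100 200 r2 0 +' | sort_mr_by_start
# Output order of the names: r4, r2, r1, r3
```

`LC_ALL=C` makes the comparison byte-wise, so `chr1` sorts before `chrX` and `+` before `-` regardless of the login locale.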


Remove Duplicates

export PATH=$PATH:/pkg/biology/methpipe/methpipe-3.3.1/bin/

bsub -q 16G -o stdout -e stderr "\

duplicate-remover -S /work3/s00yao25/h1.chrX_dremove_stat.txt \

-o /work3/s00yao25/h1.chrX.mr.dremove \

/work3/s00yao25/h1.chrX.mr.sorted_start "

Successfully completed.

Resource usage summary:

CPU time : 167.80 sec.

Max Processes : 3

Max Threads : 4

TOTAL READS IN: 24350707
GOOD BASES IN: 1987943796
TOTAL READS OUT: 22884736
GOOD BASES OUT: 1867152730
DUPLICATES REMOVED: 1465971
READS WITH DUPLICATES: 1219174
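The statistics above can be turned into a duplication percentage with a one-line calculation. A minimal sketch follows; `duplicate_percent` is a hypothetical helper, and the two numbers are TOTAL READS IN and TOTAL READS OUT from the slide.

```shell
# Hypothetical helper: percentage of reads removed as duplicates.
# Usage: duplicate_percent <total_reads_in> <total_reads_out>
duplicate_percent() {
    awk -v reads_in="$1" -v reads_out="$2" \
        'BEGIN { printf "%.2f\n", 100 * (reads_in - reads_out) / reads_in }'
}

duplicate_percent 24350707 22884736    # prints 6.02
```

The difference between the two read counts, 1465971, matches the DUPLICATES REMOVED line.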


Estimating bisulfite conversion rate

bsub -q 16G -o stdout -e stderr "\

bsrate -c /work3/NRPB1219/hg18 \

-o /home/s00yao25/Output/h1.chrX.bsrate \

/work3/s00yao25/h1.chrX.mr.dremove"

# head -n 16 /home/s00yao25/Output/h1.chrX.bsrate

OVERALL CONVERSION RATE = 0.980192
POS CONVERSION RATE = 0.980204 96942555
NEG CONVERSION RATE = 0.980179 96821402
BASE PTOT PCONV PRATE NTOT NCONV NRATE BTHTOT BTHCONV BTHRATE ERR ALL ERRRATE
1 1798190 1762518 0.98016 1796291 1760655 0.98016 3594481 3523173 0.98016 36327 3630808 0.01001
2 1654252 1617801 0.97797 1649805 1613025 0.97771 3304057 3230826 0.97784 41299 3345356 0.01235
3 1646403 1615036 0.98095 1644710 1613525 0.98104 3291113 3228561 0.98099 48231 3339344 0.01444
4 1699787 1666286 0.98029 1695105 1662078 0.98052 3394892 3328364 0.98040 50697 3445589 0.01471
5 1663363 1631006 0.98055 1658397 1626045 0.98049 3321760 3257051 0.98052 52464 3374224 0.01555
6 1720978 1687130 0.98033 1716036 1682351 0.98037 3437014 3369481 0.98035 45366 3482380 0.01303
7 1677561 1644979 0.98058 1677119 1644343 0.98046 3354680 3289322 0.98052 53873 3408553 0.01581
8 1714426 1681206 0.98062 1714378 1681339 0.98073 3428804 3362545 0.98068 34491 3463295 0.00996
9 1702891 1668424 0.97976 1700092 1665742 0.97980 3402983 3334166 0.97978 34861 3437844 0.01014
10 1681522 1648092 0.98012 1680471 1647068 0.98012 3361993 3295160 0.98012 45776 3407769 0.01343
11 1664207 1631036 0.98007 1664386 1631083 0.97999 3328593 3262119 0.98003 46055 3374648 0.01365
12 1651326 1618334 0.98002 1649370 1616514 0.98008 3300696 3234848 0.98005 44139 3344835 0.01320
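The per-base rates in the bsrate table above can be checked by hand: the BTHRATE column appears to be BTHCONV divided by BTHTOT. The sketch below recomputes it under that assumption; `recompute_bsrate` is a hypothetical helper, and it expects the 13-column rows shown on the slide.

```shell
# Hypothetical helper: recompute the both-strand conversion rate
# (column 9 / column 8) for each per-base row of bsrate output on stdin.
recompute_bsrate() {
    # Only rows whose first field is a base position are used; the summary
    # lines and the column header are skipped.
    awk '$1 ~ /^[0-9]+$/ && NF == 13 { printf "%d %.5f\n", $1, $9 / $8 }'
}

# Header plus the first two data rows from the slide.
printf '%s\n' \
    'BASE PTOT PCONV PRATE NTOT NCONV NRATE BTHTOT BTHCONV BTHRATE ERR ALL ERRRATE' \
    '1 1798190 1762518 0.98016 1796291 1760655 0.98016 3594481 3523173 0.98016 36327 3630808 0.01001' \
    '2 1654252 1617801 0.97797 1649805 1613025 0.97771 3304057 3230826 0.97784 41299 3345356 0.01235' |
    recompute_bsrate
# prints:
# 1 0.98016
# 2 0.97784
```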


Computing single-site methylation levels

# sorting… again

bsub -q 16G -o stdout -e stderr "\

LC_ALL=C sort -S 14G -k 1,1 -k 3,3n -k 2,2n -k 6,6 \

-o /work3/s00yao25/h1.chrX.mr.sorted_end_first \

/work3/s00yao25/h1.chrX.mr.dremove"

# methylation calling

bsub -q 16G -o stdout -e stderr "\

methcounts -c /work3/NRPB1219/hg18 \

-o /work3/s00yao25/h1.chrX.meth \

/work3/s00yao25/h1.chrX.mr.sorted_end_first"

#extract CpG sites

bsub -q 16G -o stdout -e stderr "\

symmetric-cpgs \

-o /work3/s00yao25/h1.chrX_CpG.meth /work3/s00yao25/h1.chrX.meth"

chrX 0 + CHH 0 0
chrX 4 + CHH 0 0
chrX 5 + CHH 0 0
chrX 6 + CHH 0 0
chrX 10 + CHH 0 0
chrX 11 + CHH 0 0
chrX 12 + CHH 0 0
chrX 16 + CHH 0 0
chrX 17 + CHH 0 0
chrX 18 + CHH 0 0

chrX 152 + CpG 0 0
chrX 232 + CpG 0 0
chrX 330 + CpG 0 0
chrX 334 + CpG 0 0
chrX 336 + CpG 0 0
chrX 364 + CpG 0 0
chrX 366 + CpG 0 0
chrX 374 + CpG 0 0
chrX 376 + CpG 0 0
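The six columns in the output above look like chromosome, position, strand, context, methylation level, and read coverage; that reading is an assumption here. Under it, a coverage-weighted mean methylation level can be sketched as follows. `weighted_mean_meth` is a hypothetical helper, not a MethPipe program, and the demonstration values are made up.

```shell
# Hypothetical helper: coverage-weighted mean methylation level.
# Assumed fields: chrom position strand context level coverage
weighted_mean_meth() {
    awk '$6 > 0 { meth += $5 * $6; cov += $6 }    # skip uncovered sites
         END {
             if (cov > 0) printf "%.4f\n", meth / cov
             else print "NA"
         }'
}

# Toy values: (0.5*10 + 1*10) / 20 = 0.75; the uncovered site is ignored.
printf '%s\n' \
    'chrX 152 + CpG 0.5 10' \
    'chrX 232 + CpG 1 10' \
    'chrX 330 + CpG 0 0' | weighted_mean_meth
# prints 0.7500
```

Sites with zero coverage, like all the rows shown on the slide, carry no information and are left out of the mean.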


Computation of methylation level statistics

bsub -q 16G -o stdout -e stderr "\

levels -o /home/s00yao25/Output/h1.chrX.levels \

/work3/s00yao25/h1.chrX.meth"


Hypomethylated (hmr), hypermethylated (hypermr), and partially methylated (pmr) regions

bsub -q 16G -o stdout -e stderr "\

hmr -o /work3/s00yao25/h1.chrX.hmr /work3/s00yao25/h1.chrX_CpG.meth"

bsub -q 16G -o stdout -e stderr "\

hmr -partial -o /work3/s00yao25/h1.chrX.pmr /work3/s00yao25/h1.chrX_CpG.meth"

bsub -q 16G -o stdout -e stderr "\

pmd -o /work3/s00yao25/h1.chrX.pmd /work3/s00yao25/h1.chrX_CpG.meth"

chrX 2727656 2728600 HYPO0 18 +
chrX 2731108 2731952 HYPO1 14 +
chrX 2732390 2733303 HYPO2 23 +
chrX 2740632 2740962 HYPO3 9 +
chrX 2756524 2758153 HYPO4 139 +
chrX 2817685 2817980 HYPO5 8 +
chrX 2855757 2857708 HYPO6 127 +
chrX 2890571 2890884 HYPO7 9 +
chrX 3004371 3004626 HYPO8 9 +
chrX 3238227 3238677 HYPO9 9 +
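Because the region output above is in BED layout (chromosome, start, end, name, score, strand), simple summaries need only `awk`. The sketch below counts regions and reports their total and mean length; `bed_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: number of regions, total length, and mean length
# for BED records read on stdin (length = end - start).
bed_summary() {
    awk '{ n++; total += $3 - $2 }
         END {
             if (n > 0) printf "regions=%d total_bp=%d mean_bp=%.1f\n", n, total, total / n
         }'
}

# First two HMRs from the slide: 944 bp and 844 bp.
printf '%s\n' \
    'chrX 2727656 2728600 HYPO0 18 +' \
    'chrX 2731108 2731952 HYPO1 14 +' | bed_summary
# prints: regions=2 total_bp=1788 mean_bp=894.0
```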


Differential Methylation Analysis

bsub -q 16G -o stdout -e stderr "\

methdiff -o /work3/s00yao25/h1.imr90.chrX.methdiff /work3/NRPB1219/h1.chrX_CpG.meth /work3/NRPB1219/imr90.chrX_CpG.meth"

Columns 5 to 9: Probability, Un-meth, meth, Un-meth, meth

chrX 2709681 + CpG 0.749276 7 2 12 7
chrX 2709727 + CpG 0.917633 4 1 9 12
chrX 2709774 + CpG 0.894737 3 1 6 10
chrX 2709871 + CpG 0.742424 0 16 0 48
chrX 2709890 + CpG 0.857575 3 20 3 47
chrX 2709982 + CpG 0.999354 10 2 7 19
chrX 2710014 + CpG 0.704043 3 6 3 10
chrX 2710023 + CpG 0.600782 4 3 4 4
chrX 2710146 + CpG 0.523077 1 2 8 14
chrX 2710155 + CpG 0.234026 3 3 17 9
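Sites whose probability in column 5 is close to 0 or 1 are the ones that differ most clearly between the two samples. The sketch below pulls out such sites; `methdiff_extremes` is a hypothetical helper, and the cutoffs 0.05 and 0.95 are illustrative assumptions, not values from the slides.

```shell
# Hypothetical helper: print chrom, position, and probability for sites
# whose probability (column 5) is below <low> or above <high>.
# Usage: methdiff_extremes <low> <high> < file.methdiff
methdiff_extremes() {
    awk -v low="$1" -v high="$2" \
        '($5 + 0 < low + 0) || ($5 + 0 > high + 0) { print $1, $2, $5 }'
}

# Three rows from the slide; only the second passes the 0.05/0.95 cutoffs.
printf '%s\n' \
    'chrX 2709871 + CpG 0.742424 0 16 0 48' \
    'chrX 2709982 + CpG 0.999354 10 2 7 19' \
    'chrX 2710155 + CpG 0.234026 3 3 17 9' | methdiff_extremes 0.05 0.95
# prints: chrX 2709982 0.999354
```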


Differentially methylated region (DMR)

bsub -q 16G -o stdout -e stderr "\

dmr /work3/s00yao25/h1.imr90.chrX.methdiff /work3/NRPB1219/h1.chrX.hmr /work3/NRPB1219/imr90.chrX.hmr DMR_h1_lt_imr90 DMR_imr90_lt_h1"

==> DMR_h1_lt_imr90 <==
chrX 2727656 2728600 X:18 10 +
chrX 2731108 2731952 X:15 4 +
chrX 2732390 2733303 X:37 8 +
chrX 2740632 2740962 X:9 0 +
chrX 2758131 2758153 X:3 0 +
chrX 2817685 2817980 X:9 0 +
chrX 2855757 2855890 X:1 1 +
chrX 2890571 2890884 X:9 4 +
chrX 3004371 3004626 X:9 0 +
chrX 3238227 3238677 X:24 0 +

==> DMR_imr90_lt_h1 <==
chrX 2825454 2826947 X:37 17 +
chrX 2857708 2857760 X:2 0 +
chrX 3272822 3273033 X:13 3 +
chrX 3275527 3275594 X:1 0 +
chrX 3287038 3289160 X:36 9 +
chrX 3643168 3643374 X:7 0 +
chrX 4016033 4022054 X:47 29 +
chrX 4028369 4042000 X:79 54 +
chrX 4051286 4059878 X:52 39 +
chrX 4079778 4087714 X:45 26 +

Score column (column 5): number of significantly differentially methylated CpGs
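Many of the candidate regions above contain few or no significant CpGs, so a filter on that count is a natural next step. The sketch below keeps regions with at least a given number; `dmr_filter` is a hypothetical helper, the threshold is an illustrative choice, and it assumes that column 5 holds the number of significant CpGs and that the number after `X:` in column 4 is the number of CpGs in the region.

```shell
# Hypothetical helper: keep DMR candidates with at least <min_sig>
# significant CpGs. Prints chrom, start, end, CpGs in region, significant CpGs.
# Usage: dmr_filter <min_sig> < DMR_file
dmr_filter() {
    awk -v min_sig="$1" '{
        split($4, name, ":")    # name field looks like X:18
        if ($5 + 0 >= min_sig + 0) print $1, $2, $3, name[2], $5
    }'
}

# Two rows from the slide; only the first has at least 5 significant CpGs.
printf '%s\n' \
    'chrX 2727656 2728600 X:18 10 +' \
    'chrX 2740632 2740962 X:9 0 +' | dmr_filter 5
# prints: chrX 2727656 2728600 18 10
```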