Duchenne Muscular Dystrophy (DMD) - coding DNA reference sequence

(used for mutation description)

(last modified November 10, 2009)


NOTE: to match the current human genome reference sequence, the DMD reference sequence was updated recently (Aug. 2009). As a consequence, nucleotide numbering in some introns has changed considerably. We are still checking the old descriptions and, where required, changing them from the numbering of the previous reference sequence to the new numbering.

This file was created to facilitate the description of sequence variants in the DMD gene based on a coding DNA reference sequence following the HGVS recommendations. The sequence was taken from NG_012232.1. The DMD gene contains several additional promoter/exon 1 sequences located upstream (Dp427c), in intron 1 (Dp427p), intron 29 (Dp260), intron 44 (Dp140), intron 55 (Dp116) and intron 62 (Dp70/Dp40). The Dp40 transcript ends in intron 70.
NOTE: the ancient dystrophin, lacking exon 78, encodes a protein that has a different, longer C-terminal end. Consequently, variants up to nucleotide c.*86 may affect the protein.
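
As an illustration of the coding DNA numbering used in this file, the sketch below classifies a simple position label and finds the codon that contains a coding nucleotide. It assumes the usual HGVS conventions: a "-" prefix counts back from the A of the ATG start codon, a "*" prefix counts forward from the stop codon, and there is no position 0. The function names are hypothetical and are not part of this reference file; intronic positions such as c.31+1 are deliberately not handled.

```python
import re

# Hypothetical helpers for illustration only; not part of the reference file.
# Matches simple exonic labels such as "c.-241", "c.60" or "c.*86".
_POSITION = re.compile(r"^c\.([-*]?)(\d+)$")


def classify_position(label):
    """Return (region, number) for a simple coding DNA position label."""
    match = _POSITION.match(label)
    if match is None:
        raise ValueError(f"not a simple c. position: {label!r}")
    prefix, digits = match.groups()
    number = int(digits)
    if number == 0:
        raise ValueError("coding DNA numbering has no position 0")
    region = {
        "-": "5' of ATG",         # upstream of the translation start
        "": "coding",             # c.1 is the A of the ATG start codon
        "*": "3' of stop codon",  # downstream of the translation stop
    }[prefix]
    return region, number


def codon_number(c_position):
    """Number of the codon (amino acid) containing a coding nucleotide."""
    if c_position < 1:
        raise ValueError("only coding positions map to a codon")
    return (c_position + 2) // 3
```

For example, codon_number(60) gives 20, which agrees with the c.60 / p.20 labels on the first coding row of the listing, and classify_position("c.*86") reports a position 3' of the stop codon.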

On January 1, 2003, the coding DNA reference sequence was introduced, replacing the older reference sequence. This new reference sequence was based on GenBank file NM_004006.1 (with one difference, 12505G>A) and contains the Dp427m (muscle) isoform of dystrophin. The gene-flanking and intronic sequences were derived from a range of GenBank files (see Genomic reference sequence of the DMD gene).
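
The protein line printed under each coding row can be checked against the nucleotide line. The sketch below is a minimal example that assumes the standard genetic code (NCBI translation table 1). It translates nucleotides c.1 to c.60, copied from the first coding row of this file. The names translate and FIRST_60 are hypothetical and exist only for this illustration.

```python
# Standard genetic code with bases ordered T, C, A, G (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence into one-letter amino acid codes."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# c.1 to c.60 as printed in the listing: the coding part of exon 1 (31 nt)
# followed by the first 29 nt of exon 2.
FIRST_60 = "ATGCTTTGGTGGGAAGAAGTAGAGGACTGTT" + "ATGAAAGAGAAGATGTTCAAAAGAAAACA"

print(translate(FIRST_60))  # MLWWEEVEDCYEREDVQKKT
```

The result matches the amino acids shown for p.1 to p.20. Codon 11 (TAT, Y) is split between exons 1 and 2, which is why the exon bar falls inside a codon on that row.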

Please note that introns are available by clicking on the exon numbers above the sequence.
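
The g., c. and p. labels at the right of each row appear to refer to the last nucleotide (or amino acid) on that row, and a "|" followed by a number marks the start of the next exon. Under that reading, which is an assumption drawn from the listing itself, the sketch below derives the c. position of the first nucleotide after an exon bar and converts between c. and g. numbers within one uninterrupted stretch of sequence. Both function names are hypothetical.

```python
def first_position_after_bar(row, end_c):
    """c. position of the first nucleotide after the exon bar in one row.

    row   -- the nucleotide line, including its single "|" exon marker
    end_c -- the c. number printed at the right of that row
    """
    _, bar, after = row.partition("|")
    if not bar:
        raise ValueError("row contains no exon bar")
    n_after = len("".join(after.split()))  # nucleotides after the bar
    return end_c - n_after + 1


def g_from_c(c_position, anchor_c, anchor_g):
    """Genomic position of c_position, given one labelled (c., g.) pair.

    Assumes both positions lie in the same exon (no intron in between).
    """
    def gapless(c):
        # There is no c.0, so shift negative positions up by one
        # (c.-1 -> 0, c.1 -> 1) before taking differences.
        if c == 0:
            raise ValueError("coding DNA numbering has no position 0")
        return c + 1 if c < 0 else c

    return anchor_g + gapless(c_position) - gapless(anchor_c)


row = "ATGCTTTGGTGGGAAGAAGTAGAGGACTGTT | ATGAAAGAGAAGATGTTCAAAAGAAAACA"
print(first_position_after_bar(row, 60))  # 32
print(g_from_c(-241, -1, 133297))         # 133057
```

With the row labelled c.60, the first nucleotide of exon 2 comes out as c.32, so exon 1 ends at c.31. Anchoring on c.-1 = g.133297 reproduces the g.133057 label printed for c.-241 in the upstream sequence.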


 (upstream sequence)
                                                                    g.133057
                                                         tcct       c.-241

 .         .         .         .         .         .                g.133117
 ggcatcagttactgtgttgactcactcagtgttgggatcactcactttccccctacagga       c.-181

 .         .         .         .         .         .                g.133177
 ctcagatctgggaggcaattaccttcggagaaaaacgaataggaaaaactgaagtgttac       c.-121

 .         .         .         .         .         .                g.133237
 tttttttaaagctgctgaagtttgttggtttctcattgtttttaagcctactggagcaat       c.-61

 .         .         .         .         .         .                g.133297
 aaagtttgaagaacttttaccaggttttttttatcgctgccttgatatacacttttcaaa       c.-1

          .         .         .  | 02      .         .         .    g.324438
 ATGCTTTGGTGGGAAGAAGTAGAGGACTGTT | ATGAAAGAGAAGATGTTCAAAAGAAAACA    c.60
 M  L  W  W  E  E  V  E  D  C  Y |   E  R  E  D  V  Q  K  K  T      p.20
                                   ^ alternative starts Dp427c, Dp427p

          .         .         .    | 03    .         .         .    g.494816
 TTCACAAAATGGGTAAATGCACAATTTTCTAAG | TTTGGGAAGCAGCATATTGAGAACCTC    c.120
 F  T  K  W  V  N  A  Q  F  S  K   | F  G  K  Q  H  I  E  N  L      p.40

          .         .         .         .         .         .       g.494876
 TTCAGTGACCTACAGGATGGGAGGCGCCTCCTAGACCTCCTCGAAGGCCTGACAGGGCAA       c.180
 F  S  D  L  Q  D  G  R  R  L  L  D  L  L  E  G  L  T  G  Q         p.60

        | 04 .         .         .         .         .         .    g.499803
 AAACTG | CCAAAAGAAAAAGGATCCACAAGAGTTCATGCCCTGAACAATGTCAACAAGGCA    c.240
 K  L   | P  K  E  K  G  S  T  R  V  H  A  L  N  N  V  N  K  A      p.80

          .         .     | 05   .         .         .         .    g.521258
 CTGCGGGTTTTGCAGAACAATAAT | GTTGATTTAGTGAATATTGGAAGTACTGACATCGTA    c.300
 L  R  V  L  Q  N  N  N   | V  D  L  V  N  I  G  S  T  D  I  V      p.100

          .         .         .         .         .        | 06.    g.527972
 GATGGAAATCATAAACTGACTCTTGGTTTGATTTGGAATATAATCCTCCACTGGCAG | GTC    c.360
 D  G  N  H  K  L  T  L  G  L  I  W  N  I  I  L  H  W  Q   | V      p.120

          .         .         .         .         .         .       g.528032
 AAAAATGTAATGAAAAATATCATGGCTGGATTGCAACAAACCAACAGTGAAAAGATTCTC       c.420
 K  N  V  M  K  N  I  M  A  G  L  Q  Q  T  N  S  E  K  I  L         p.140

          .         .         .         .         .         .       g.528092
 CTGAGCTGGGTCCGACAATCAACTCGTAATTATCCACAGGTTAATGTAATCAACTTCACC       c.480
 L  S  W  V  R  Q  S  T  R  N  Y  P  Q  V  N  V  I  N  F  T         p.160

          .         .         .         .         . | 07       .    g.535008
 ACCAGCTGGTCTGATGGCCTGGCTTTGAATGCTCTCATCCATAGTCATAG | GCCAGACCTA    c.540
 T  S  W  S  D  G  L  A  L  N  A  L  I  H  S  H  R  |  P  D  L      p.180

          .         .         .         .         .         .       g.535068
 TTTGACTGGAATAGTGTGGTTTGCCAGCAGTCAGCCACACAACGACTGGAACATGCATTC       c.600
 F  D  W  N  S  V  V  C  Q  Q  S  A  T  Q  R  L  E  H  A  F         p.200

          .         .         .         .          | 08        .    g.645327
 AACATCGCCAGATATCAATTAGGCATAGAGAAACTACTCGATCCTGAAG | ATGTTGATACC    c.660
 N  I  A  R  Y  Q  L  G  I  E  K  L  L  D  P  E  D |   V  D  T      p.220

          .         .         .         .         .         .       g.645387
 ACCTATCCAGATAAGAAGTCCATCTTAATGTACATCACATCACTCTTCCAAGTTTTGCCT       c.720
 T  Y  P  D  K  K  S  I  L  M  Y  I  T  S  L  F  Q  V  L  P         p.240

          .         .         .         .         .         .       g.645447
 CAACAAGTGAGCATTGAAGCCATCCAGGAAGTGGAAATGTTGCCAAGGCCACCTAAAGTG       c.780
 Q  Q  V  S  I  E  A  I  Q  E  V  E  M  L  P  R  P  P  K  V         p.260

          .         .         .         .         .  | 09      .    g.646620
 ACTAAAGAAGAACATTTTCAGTTACATCATCAAATGCACTATTCTCAACAG | ATCACGGTC    c.840
 T  K  E  E  H  F  Q  L  H  H  Q  M  H  Y  S  Q  Q   | I  T  V      p.280

          .         .         .         .         .         .       g.646680
 AGTCTAGCACAGGGATATGAGAGAACTTCTTCCCCTAAGCCTCGATTCAAGAGCTATGCC       c.900
 S  L  A  Q  G  Y  E  R  T  S  S  P  K  P  R  F  K  S  Y  A         p.300

          .         .         .         .         .         .       g.646740
 TACACACAGGCTGCTTATGTCACCACCTCTGACCCTACACGGAGCCCATTTCCTTCACAG       c.960
 Y  T  Q  A  A  Y  V  T  T  S  D  P  T  R  S  P  F  P  S  Q         p.320

  | 10       .         .         .         .         .         .    g.699517
  | CATTTGGAAGCTCCTGAAGACAAGTCATTTGGCAGTTCATTGATGGAGAGTGAAGTAAAC    c.1020
  | H  L  E  A  P  E  D  K  S  F  G  S  S  L  M  E  S  E  V  N      p.340

          .         .         .         .         .         .       g.699577
 CTGGACCGTTATCAAACAGCTTTAGAAGAAGTATTATCGTGGCTTCTTTCTGCTGAGGAC       c.1080
 L  D  R  Y  Q  T  A  L  E  E  V  L  S  W  L  L  S  A  E  D         p.360

          .         .         .         .         .         .       g.699637
 ACATTGCAAGCACAAGGAGAGATTTCTAATGATGTGGAAGTGGTGAAAGACCAGTTTCAT       c.1140
 T  L  Q  A  Q  G  E  I  S  N  D  V  E  V  V  K  D  Q  F  H         p.380

           | 11        .         .         .         .         .    g.700347
 ACTCATGAG | GGGTACATGATGGATTTGACAGCCCATCAGGGCCGGGTTGGTAATATTCTA    c.1200
 T  H  E   | G  Y  M  M  D  L  T  A  H  Q  G  R  V  G  N  I  L      p.400

          .         .         .         .         .         .       g.700407
 CAATTGGGAAGTAAGCTGATTGGAACAGGAAAATTATCAGAAGATGAAGAAACTGAAGTA       c.1260
 Q  L  G  S  K  L  I  G  T  G  K  L  S  E  D  E  E  T  E  V         p.420

          .         .         .         .         .         .       g.700467
 CAAGAGCAGATGAATCTCCTAAATTCAAGATGGGAATGCCTCAGGGTAGCTAGCATGGAA       c.1320
 Q  E  Q  M  N  L  L  N  S  R  W  E  C  L  R  V  A  S  M  E         p.440

          .  | 12      .         .         .         .         .    g.730205
 AAACAAAGCAA | TTTACATAGAGTTTTAATGGATCTCCAGAATCAGAAACTGAAAGAGTTG    c.1380
 K  Q  S  N  |  L  H  R  V  L  M  D  L  Q  N  Q  K  L  K  E  L      p.460

          .         .         .         .         .         .       g.730265
 AATGACTGGCTAACAAAAACAGAAGAAAGAACAAGGAAAATGGAGGAAGAGCCTCTTGGA       c.1440
 N  D  W  L  T  K  T  E  E  R  T  R  K  M  E  E  E  P  L  G         p.480

          .         .         .         .   | 13     .         .    g.748751
 CCTGATCTTGAAGACCTAAAACGCCAAGTACAACAACATAAG | GTGCTTCAAGAAGATCTA    c.1500
 P  D  L  E  D  L  K  R  Q  V  Q  Q  H  K   | V  L  Q  E  D  L      p.500

          .         .         .         .         .         .       g.748811
 GAACAAGAACAAGTCAGGGTCAATTCTCTCACTCACATGGTGGTGGTAGTTGATGAATCT       c.1560
 E  Q  E  Q  V  R  V  N  S  L  T  H  M  V  V  V  V  D  E  S         p.520

          .         .         .         .   | 14     .         .    g.770781
 AGTGGAGATCACGCAACTGCTGCTTTGGAAGAACAACTTAAG | GTATTGGGAGATCGATGG    c.1620
 S  G  D  H  A  T  A  A  L  E  E  Q  L  K   | V  L  G  D  R  W      p.540

          .         .         .         .         .         .       g.770841
 GCAAACATCTGTAGATGGACAGAAGACCGCTGGGTTCTTTTACAAGACATCCTTCTCAAA       c.1680
 A  N  I  C  R  W  T  E  D  R  W  V  L  L  Q  D  I  L  L  K         p.560

          .         .     | 15   .         .         .         .    g.771008
 TGGCAACGTCTTACTGAAGAACAG | TGCCTTTTTAGTGCATGGCTTTCAGAAAAAGAAGAT    c.1740
 W  Q  R  L  T  E  E  Q   | C  L  F  S  A  W  L  S  E  K  E  D      p.580

          .         .         .         .         .         .       g.771068
 GCAGTGAACAAGATTCACACAACTGGCTTTAAAGATCAAAATGAAATGTTATCAAGTCTT       c.1800
 A  V  N  K  I  H  T  T  G  F  K  D  Q  N  E  M  L  S  S  L         p.600

          .   | 16     .         .         .         .         .    g.778776
 CAAAAACTGGCC | GTTTTAAAAGCGGATCTAGAAAAGAAAAAGCAATCCATGGGCAAACTG    c.1860
 Q  K  L  A   | V  L  K  A  D  L  E  K  K  K  Q  S  M  G  K  L      p.620

          .         .         .         .         .         .       g.778836
 TATTCACTCAAACAAGATCTTCTTTCAACACTGAAGAATAAGTCAGTGACCCAGAAGACG       c.1920
 Y  S  L  K  Q  D  L  L  S  T  L  K  N  K  S  V  T  Q  K  T         p.640

          .         .         .         .         .         .       g.778896
 GAAGCATGGCTGGATAACTTTGCCCGGTGTTGGGATAATTTAGTCCAAAAACTTGAAAAG       c.1980
 E  A  W  L  D  N  F  A  R  C  W  D  N  L  V  Q  K  L  E  K         p.660

          .   | 17     .         .         .         .         .    g.799323
 AGTACAGCACAG | ATTTCACAGGCTGTCACCACCACTCAGCCATCACTAACACAGACAACT    c.2040
 S  T  A  Q   | I  S  Q  A  V  T  T  T  Q  P  S  L  T  Q  T  T      p.680

          .         .         .         .         .         .       g.799383
 GTAATGGAAACAGTAACTACGGTGACCACAAGGGAACAGATCCTGGTAAAGCATGCTCAA       c.2100
 V  M  E  T  V  T  T  V  T  T  R  E  Q  I  L  V  K  H  A  Q         p.700

          .         .         .         .         .         .       g.799443
 GAGGAACTTCCACCACCACCTCCCCAAAAGAAGAGGCAGATTACTGTGGATTCTGAAATT       c.2160
 E  E  L  P  P  P  P  P  Q  K  K  R  Q  I  T  V  D  S  E  I         p.720

          | 18         .         .         .         .         .    g.826530
 AGGAAAAG | GTTGGATGTTGATATAACTGAACTTCACAGCTGGATTACTCGCTCAGAAGCT    c.2220
 R  K  R  |  L  D  V  D  I  T  E  L  H  S  W  I  T  R  S  E  A      p.740

          .         .         .         .         .         .       g.826590
 GTGTTGCAGAGTCCTGAATTTGCAATCTTTCGGAAGGAAGGCAACTTCTCAGACTTAAAA       c.2280
 V  L  Q  S  P  E  F  A  I  F  R  K  E  G  N  F  S  D  L  K         p.760

          .   | 19     .         .         .         .         .    g.842815
 GAAAAAGTCAAT | GCCATAGAGCGAGAAAAAGCTGAGAAGTTCAGAAAACTGCAAGATGCC    c.2340
 E  K  V  N   | A  I  E  R  E  K  A  E  K  F  R  K  L  Q  D  A      p.780

          .         .         .         . | 20       .         .    g.853111
 AGCAGATCAGCTCAGGCCCTGGTGGAACAGATGGTGAATG | AGGGTGTTAATGCAGATAGC    c.2400
 S  R  S  A  Q  A  L  V  E  Q  M  V  N  E |   G  V  N  A  D  S      p.800

          .         .         .         .         .         .       g.853171
 ATCAAACAAGCCTCAGAACAACTGAACAGCCGGTGGATCGAATTCTGCCAGTTGCTAAGT       c.2460
 I  K  Q  A  S  E  Q  L  N  S  R  W  I  E  F  C  Q  L  L  S         p.820

          .         .         .         .         .         .       g.853231
 GAGAGACTTAACTGGCTGGAGTATCAGAACAACATCATCGCTTTCTATAATCAGCTACAA       c.2520
 E  R  L  N  W  L  E  Y  Q  N  N  I  I  A  F  Y  N  Q  L  Q         p.840

          .         .         .         .         .         .       g.853291
 CAATTGGAGCAGATGACAACTACTGCTGAAAACTGGTTGAAAATCCAACCCACCACCCCA       c.2580
 Q  L  E  Q  M  T  T  T  A  E  N  W  L  K  I  Q  P  T  T  P         p.860

          .         .         .         .   | 21     .         .    g.859528
 TCAGAGCCAACAGCAATTAAAAGTCAGTTAAAAATTTGTAAG | GATGAAGTCAACCGGCTA    c.2640
 S  E  P  T  A  I  K  S  Q  L  K  I  C  K   | D  E  V  N  R  L      p.880

          .         .         .         .         .         .       g.859588
 TCAGGTCTTCAACCTCAAATTGAACGATTAAAAATTCAAAGCATAGCCCTGAAAGAGAAA       c.2700
 S  G  L  Q  P  Q  I  E  R  L  K  I  Q  S  I  A  L  K  E  K         p.900

          .         .         .         .         .         .       g.859648
 GGACAAGGACCCATGTTCCTGGATGCAGACTTTGTGGCCTTTACAAATCATTTTAAGCAA       c.2760
 G  Q  G  P  M  F  L  D  A  D  F  V  A  F  T  N  H  F  K  Q         p.920

          .         .         .         .    | 22    .         .    g.872317
 GTCTTTTCTGATGTGCAGGCCAGAGAGAAAGAGCTACAGACAA | TTTTTGACACTTTGCCA    c.2820
 V  F  S  D  V  Q  A  R  E  K  E  L  Q  T  I |   F  D  T  L  P      p.940

          .         .         .         .         .         .       g.872377
 CCAATGCGCTATCAGGAGACCATGAGTGCCATCAGGACATGGGTCCAGCAGTCAGAAACC       c.2880
 P  M  R  Y  Q  E  T  M  S  A  I  R  T  W  V  Q  Q  S  E  T         p.960

          .         .         .         .         .         .       g.872437
 AAACTCTCCATACCTCAACTTAGTGTCACCGACTATGAAATCATGGAGCAGAGACTCGGG       c.2940
 K  L  S  I  P  Q  L  S  V  T  D  Y  E  I  M  E  Q  R  L  G         p.980

           | 23        .         .         .         .         .    g.875950
 GAATTGCAG | GCTTTACAAAGTTCTCTGCAAGAGCAACAAAGTGGCCTATACTATCTCAGC    c.3000
 E  L  Q   | A  L  Q  S  S  L  Q  E  Q  Q  S  G  L  Y  Y  L  S      p.1000

          .         .         .         .         .         .       g.876010
 ACCACTGTGAAAGAGATGTCGAAGAAAGCGCCCTCTGAAATTAGCCGGAAATATCAATCA       c.3060
 T  T  V  K  E  M  S  K  K  A  P  S  E  I  S  R  K  Y  Q  S         p.1020

          .         .         .         .         .         .       g.876070
 GAATTTGAAGAAATTGAGGGACGCTGGAAGAAGCTCTCCTCCCAGCTGGTTGAGCATTGT       c.3120
 E  F  E  E  I  E  G  R  W  K  K  L  S  S  Q  L  V  E  H  C         p.1040

          .         .         .         .   | 24     .         .    g.879928
 CAAAAGCTAGAGGAGCAAATGAATAAACTCCGAAAAATTCAG | AATCACATACAAACCCTG    c.3180
 Q  K  L  E  E  Q  M  N  K  L  R  K  I  Q   | N  H  I  Q  T  L      p.1060

          .         .         .         .         .         .       g.879988
 AAGAAATGGATGGCTGAAGTTGATGTTTTTCTGAAGGAGGAATGGCCTGCCCTTGGGGAT       c.3240
 K  K  W  M  A  E  V  D  V  F  L  K  E  E  W  P  A  L  G  D         p.1080

          .         .         .       | 25 .         .         .    g.881039
 TCAGAAATTCTAAAAAAGCAGCTGAAACAGTGCAGA | CTTTTAGTCAGTGATATTCAGACA    c.3300
 S  E  I  L  K  K  Q  L  K  Q  C  R   | L  L  V  S  D  I  Q  T      p.1100

          .         .         .         .         .         .       g.881099
 ATTCAGCCCAGTCTAAACAGTGTCAATGAAGGTGGGCAGAAGATAAAGAATGAAGCAGAG       c.3360
 I  Q  P  S  L  N  S  V  N  E  G  G  Q  K  I  K  N  E  A  E         p.1120

          .         .         .         .         .         .       g.881159
 CCAGAGTTTGCTTCGAGACTTGAGACAGAACTCAAAGAACTTAACACTCAGTGGGATCAC       c.3420
 P  E  F  A  S  R  L  E  T  E  L  K  E  L  N  T  Q  W  D  H         p.1140

          .   | 26     .         .         .         .         .    g.889825
 ATGTGCCAACAG | GTCTATGCCAGAAAGGAGGCCTTGAAGGGAGGTTTGGAGAAAACTGTA    c.3480
 M  C  Q  Q   | V  Y  A  R  K  E  A  L  K  G  G  L  E  K  T  V      p.1160

          .         .         .         .         .         .       g.889885
 AGCCTCCAGAAAGATCTATCAGAGATGCACGAATGGATGACACAAGCTGAAGAAGAGTAT       c.3540
 S  L  Q  K  D  L  S  E  M  H  E  W  M  T  Q  A  E  E  E  Y         p.1180

          .         .         .         .         .         .       g.889945
 CTTGAGAGAGATTTTGAATATAAAACTCCAGATGAATTACAGAAAGCAGTTGAAGAGATG       c.3600
 L  E  R  D  F  E  Y  K  T  P  D  E  L  Q  K  A  V  E  E  M         p.1200

     | 27    .         .         .         .         .         .    g.896028
 AAG | AGAGCTAAAGAAGAGGCCCAACAAAAAGAAGCGAAAGTGAAACTCCTTACTGAGTCT    c.3660
 K   | R  A  K  E  E  A  Q  Q  K  E  A  K  V  K  L  L  T  E  S      p.1220

          .         .         .         .         .         .       g.896088
 GTAAATAGTGTCATAGCTCAAGCTCCACCTGTAGCACAAGAGGCCTTAAAAAAGGAACTT       c.3720
 V  N  S  V  I  A  Q  A  P  P  V  A  Q  E  A  L  K  K  E  L         p.1240

          .         .         .         .         .         .       g.896148
 GAAACTCTAACCACCAACTACCAGTGGCTCTGCACTAGGCTGAATGGGAAATGCAAGACT       c.3780
 E  T  L  T  T  N  Y  Q  W  L  C  T  R  L  N  G  K  C  K  T         p.1260

        | 28 .         .         .         .         .         .    g.903349
 TTGGAA | GAAGTTTGGGCATGTTGGCATGAGTTATTGTCATACTTGGAGAAAGCAAACAAG    c.3840
 L  E   | E  V  W  A  C  W  H  E  L  L  S  Y  L  E  K  A  N  K      p.1280

          .         .         .         .         .         .       g.903409
 TGGCTAAATGAAGTAGAATTTAAACTTAAAACCACTGAAAACATTCCTGGCGGAGCTGAG       c.3900
 W  L  N  E  V  E  F  K  L  K  T  T  E  N  I  P  G  G  A  E         p.1300

          .         .  | 29      .         .         .         .    g.906258
 GAAATCTCTGAGGTGCTAGAT | TCACTTGAAAATTTGATGCGACATTCAGAGGATAACCCA    c.3960
 E  I  S  E  V  L  D   | S  L  E  N  L  M  R  H  S  E  D  N  P      p.1320

          .         .         .         .         .         .       g.906318
 AATCAGATTCGCATATTGGCACAGACCCTAACAGATGGCGGAGTCATGGATGAGCTAATC       c.4020
 N  Q  I  R  I  L  A  Q  T  L  T  D  G  G  V  M  D  E  L  I         p.1340

          .         .         .         .         .  | 30      .    g.932705
 AATGAGGAACTTGAGACATTTAATTCTCGTTGGAGGGAACTACATGAAGAG | GCTGTAAGG    c.4080
 N  E  E  L  E  T  F  N  S  R  W  R  E  L  H  E  E   | A  V  R      p.1360
                                                     ^ alternative start Dp260

          .         .         .         .         .         .       g.932765
 AGGCAAAAGTTGCTTGAACAGAGCATCCAGTCTGCCCAGGAGACTGAAAAATCCTTACAC       c.4140
 R  Q  K  L  L  E  Q  S  I  Q  S  A  Q  E  T  E  K  S  L  H         p.1380

          .         .         .         .         .         .       g.932825
 TTAATCCAGGAGTCCCTCACATTCATTGACAAGCAGTTGGCAGCTTATATTGCAGACAAG       c.4200
 L  I  Q  E  S  L  T  F  I  D  K  Q  L  A  A  Y  I  A  D  K         p.1400

          .         .         .    | 31    .         .         .    g.954455
 GTGGACGCAGCTCAAATGCCTCAGGAAGCCCAG | AAAATCCAATCTGATTTGACAAGTCAT    c.4260
 V  D  A  A  Q  M  P  Q  E  A  Q   | K  I  Q  S  D  L  T  S  H      p.1420

          .         .         .         .         .         .       g.954515
 GAGATCAGTTTAGAAGAAATGAAGAAACATAATCAGGGGAAGGAGGCTGCCCAAAGAGTC       c.4320
 E  I  S  L  E  E  M  K  K  H  N  Q  G  K  E  A  A  Q  R  V         p.1440

          .         .     | 32   .         .         .         .    g.954971
 CTGTCTCAGATTGATGTTGCACAG | AAAAAATTACAAGATGTCTCCATGAAGTTTCGATTA    c.4380
 L  S  Q  I  D  V  A  Q   | K  K  L  Q  D  V  S  M  K  F  R  L      p.1460

          .         .         .         .         .         .       g.955031
 TTCCAGAAACCAGCCAATTTTGAGCAGCGTCTACAAGAAAGTAAGATGATTTTAGATGAA       c.4440
 F  Q  K  P  A  N  F  E  Q  R  L  Q  E  S  K  M  I  L  D  E         p.1480

          .         .         .         .         .         .       g.955091
 GTGAAGATGCACTTGCCTGCATTGGAAACAAAGAGTGTGGAACAGGAAGTAGTACAGTCA       c.4500
 V  K  M  H  L  P  A  L  E  T  K  S  V  E  Q  E  V  V  Q  S         p.1500

          .         | 33         .         .         .         .    g.958186
 CAGCTAAATCATTGTGTG | AACTTGTATAAAAGTCTGAGTGAAGTGAAGTCTGAAGTGGAA    c.4560
 Q  L  N  H  C  V   | N  L  Y  K  S  L  S  E  V  K  S  E  V  E      p.1520

          .         .         .         .         .         .       g.958246
 ATGGTGATAAAGACTGGACGTCAGATTGTACAGAAAAAGCAGACGGAAAATCCCAAAGAA       c.4620
 M  V  I  K  T  G  R  Q  I  V  Q  K  K  Q  T  E  N  P  K  E         p.1540

          .         .         .         .         .     | 34   .    g.963935
 CTTGATGAAAGAGTAACAGCTTTGAAATTGCATTATAATGAGCTGGGAGCAAAG | GTAACA    c.4680
 L  D  E  R  V  T  A  L  K  L  H  Y  N  E  L  G  A  K   | V  T      p.1560

          .         .         .         .         .         .       g.963995
 GAAAGAAAGCAACAGTTGGAGAAATGCTTGAAATTGTCCCGTAAGATGCGAAAGGAAATG       c.4740
 E  R  K  Q  Q  L  E  K  C  L  K  L  S  R  K  M  R  K  E  M         p.1580

          .         .         .         .         .         .       g.964055
 AATGTCTTGACAGAATGGCTGGCAGCTACAGATATGGAATTGACAAAGAGATCAGCAGTT       c.4800
 N  V  L  T  E  W  L  A  A  T  D  M  E  L  T  K  R  S  A  V         p.1600

          .         .         .         .      | 35  .         .    g.979425
 GAAGGAATGCCTAGTAATTTGGATTCTGAAGTTGCCTGGGGAAAG | GCTACTCAAAAAGAG    c.4860
 E  G  M  P  S  N  L  D  S  E  V  A  W  G  K   | A  T  Q  K  E      p.1620

          .         .         .         .         .         .       g.979485
 ATTGAGAAACAGAAGGTGCACCTGAAGAGTATCACAGAGGTAGGAGAGGCCTTGAAAACA       c.4920
 I  E  K  Q  K  V  H  L  K  S  I  T  E  V  G  E  A  L  K  T         p.1640

          .         .         .         .         .         .       g.979545
 GTTTTGGGCAAGAAGGAGACGTTGGTGGAAGATAAACTCAGTCTTCTGAATAGTAACTGG       c.4980
 V  L  G  K  K  E  T  L  V  E  D  K  L  S  L  L  N  S  N  W         p.1660

          .         .         .         .      | 36  .         .    g.979914
 ATAGCTGTCACCTCCCGAGCAGAAGAGTGGTTAAATCTTTTGTTG | GAATACCAGAAACAC    c.5040
 I  A  V  T  S  R  A  E  E  W  L  N  L  L  L   | E  Y  Q  K  H      p.1680

          .         .         .         .         .         .       g.979974
 ATGGAAACTTTTGACCAGAATGTGGACCACATCACAAAGTGGATCATTCAGGCTGACACA       c.5100
 M  E  T  F  D  Q  N  V  D  H  I  T  K  W  I  I  Q  A  D  T         p.1700

          .         .         .         .         .     | 37   .    g.981657
 CTTTTGGATGAATCAGAGAAAAAGAAACCCCAGCAAAAAGAAGACGTGCTTAAG | CGTTTA    c.5160
 L  L  D  E  S  E  K  K  K  P  Q  Q  K  E  D  V  L  K   | R  L      p.1720

          .         .         .         .         .         .       g.981717
 AAGGCAGAACTGAATGACATACGCCCAAAGGTGGACTCTACACGTGACCAAGCAGCAAAC       c.5220
 K  A  E  L  N  D  I  R  P  K  V  D  S  T  R  D  Q  A  A  N         p.1740

          .         .         .         .         .         .       g.981777
 TTGATGGCAAACCGCGGTGACCACTGCAGGAAATTAGTAGAGCCCCAAATCTCAGAGCTC       c.5280
 L  M  A  N  R  G  D  H  C  R  K  L  V  E  P  Q  I  S  E  L         p.1760

          .         .         .         .      | 38  .         .    g.996096
 AACCATCGATTTGCAGCCATTTCACACAGAATTAAGACTGGAAAG | GCCTCCATTCCTTTG    c.5340
 N  H  R  F  A  A  I  S  H  R  I  K  T  G  K   | A  S  I  P  L      p.1780

          .         .         .         .         .         .       g.996156
 AAGGAATTGGAGCAGTTTAACTCAGATATACAAAAATTGCTTGAACCACTGGAGGCTGAA       c.5400
 K  E  L  E  Q  F  N  S  D  I  Q  K  L  L  E  P  L  E  A  E         p.1800

          .         .         .         .         | 39         .    g.998541
 ATTCAGCAGGGGGTGAATCTGAAAGAGGAAGACTTCAATAAAGATATG | AATGAAGACAAT    c.5460
 I  Q  Q  G  V  N  L  K  E  E  D  F  N  K  D  M   | N  E  D  N      p.1820

          .         .         .         .         .         .       g.998601
 GAGGGTACTGTAAAAGAATTGTTGCAAAGAGGAGACAACTTACAACAAAGAATCACAGAT       c.5520
 E  G  T  V  K  E  L  L  Q  R  G  D  N  L  Q  Q  R  I  T  D         p.1840

          .         .         .         .         .         .       g.998661
 GAGAGAAAGCGAGAGGAAATAAAGATAAAACAGCAGCTGTTACAGACAAAACATAATGCT       c.5580
 E  R  K  R  E  E  I  K  I  K  Q  Q  L  L  Q  T  K  H  N  A         p.1860

        | 40 .         .         .         .         .         .    g.1001377
 CTCAAG | GATTTGAGGTCTCAAAGAAGAAAAAAGGCTCTAGAAATTTCTCATCAGTGGTAT    c.5640
 L  K   | D  L  R  S  Q  R  R  K  K  A  L  E  I  S  H  Q  W  Y      p.1880

          .         .         .         .         .         .       g.1001437
 CAGTACAAGAGGCAGGCTGATGATCTCCTGAAATGCTTGGATGACATTGAAAAAAAATTA       c.5700
 Q  Y  K  R  Q  A  D  D  L  L  K  C  L  D  D  I  E  K  K  L         p.1900

          .         .         .          | 41        .         .    g.1002348
 GCCAGCCTACCTGAGCCCAGAGATGAAAGGAAAATAAAG | GAAATTGATCGGGAATTGCAG    c.5760
 A  S  L  P  E  P  R  D  E  R  K  I  K   | E  I  D  R  E  L  Q      p.1920

          .         .         .         .         .         .       g.1002408
 AAGAAGAAAGAGGAGCTGAATGCAGTGCGTAGGCAAGCTGAGGGCTTGTCTGAGGATGGG       c.5820
 K  K  K  E  E  L  N  A  V  R  R  Q  A  E  G  L  S  E  D  G         p.1940

          .         .         .         .         .         .       g.1002468
 GCCGCAATGGCAGTGGAGCCAACTCAGATCCAGCTCAGCAAGCGCTGGCGGGAAATTGAG       c.5880
 A  A  M  A  V  E  P  T  Q  I  Q  L  S  K  R  W  R  E  I  E         p.1960

          .         .         .         .   | 42     .         .    g.1034351
 AGCAAATTTGCTCAGTTTCGAAGACTCAACTTTGCACAAATT | CACACTGTCCGTGAAGAA    c.5940
 S  K  F  A  Q  F  R  R  L  N  F  A  Q  I   | H  T  V  R  E  E      p.1980

          .         .         .         .         .         .       g.1034411
 ACGATGATGGTGATGACTGAAGACATGCCTTTGGAAATTTCTTATGTGCCTTCTACTTAT       c.6000
 T  M  M  V  M  T  E  D  M  P  L  E  I  S  Y  V  P  S  T  Y         p.2000

          .         .         .         .         .         .       g.1034471
 TTGACTGAAATCACTCATGTCTCACAAGCCCTATTAGAAGTGGAACAACTTCTCAATGCT       c.6060
 L  T  E  I  T  H  V  S  Q  A  L  L  E  V  E  Q  L  L  N  A         p.2020

          .         .         .         .         .        | 43.    g.1056911
 CCTGACCTCTGTGCTAAGGACTTTGAAGATCTCTTTAAGCAAGAGGAGTCTCTGAAG | AAT    c.6120
 P  D  L  C  A  K  D  F  E  D  L  F  K  Q  E  E  S  L  K   | N      p.2040

          .         .         .         .         .         .       g.1056971
 ATAAAAGATAGTCTACAACAAAGCTCAGGTCGGATTGACATTATTCATAGCAAGAAGACA       c.6180
 I  K  D  S  L  Q  Q  S  S  G  R  I  D  I  I  H  S  K  K  T         p.2060

          .         .         .         .         .         .       g.1057031
 GCAGCATTGCAAAGTGCAACGCCTGTGGAAAGGGTGAAGCTACAGGAAGCTCTCTCCCAG       c.6240
 A  A  L  Q  S  A  T  P  V  E  R  V  K  L  Q  E  A  L  S  Q         p.2080

          .         .         .         .         . | 44       .    g.1127556
 CTTGATTTCCAATGGGAAAAAGTTAACAAAATGTACAAGGACCGACAAGG | GCGATTTGAC    c.6300
 L  D  F  Q  W  E  K  V  N  K  M  Y  K  D  R  Q  G  |  R  F  D      p.2100

          .         .         .         .         .         .       g.1127616
 AGATCTGTTGAGAAATGGCGGCGTTTTCATTATGATATAAAGATATTTAATCAGTGGCTA       c.6360
 R  S  V  E  K  W  R  R  F  H  Y  D  I  K  I  F  N  Q  W  L         p.2120

          .         .         .         .         .         .       g.1127676
 ACAGAAGCTGAACAGTTTCTCAGAAAGACACAAATTCCTGAGAATTGGGAACATGCTAAA       c.6420
 T  E  A  E  Q  F  L  R  K  T  Q  I  P  E  N  W  E  H  A  K         p.2140

          .         | 45         .         .         .         .    g.1376137
 TACAAATGGTATCTTAAG | GAACTCCAGGATGGCATTGGGCAGCGGCAAACTGTTGTCAGA    c.6480
 Y  K  W  Y  L  K   | E  L  Q  D  G  I  G  Q  R  Q  T  V  V  R      p.2160
                    ^ alternative start Dp140

          .         .         .         .         .         .       g.1376197
 ACATTGAATGCAACTGGGGAAGAAATAATTCAGCAATCCTCAAAAACAGATGCCAGTATT       c.6540
 T  L  N  A  T  G  E  E  I  I  Q  Q  S  S  K  T  D  A  S  I         p.2180

          .         .         .         .         .         .       g.1376257
 CTACAGGAAAAATTGGGAAGCCTGAATCTGCGGTGGCAGGAGGTCTGCAAACAGCTGTCA       c.6600
 L  Q  E  K  L  G  S  L  N  L  R  W  Q  E  V  C  K  Q  L  S         p.2200

          .     | 46   .         .         .         .         .    g.1412428
 GACAGAAAAAAGAG | GCTAGAAGAACAAAAGAATATCTTGTCAGAATTTCAAAGAGATTTA    c.6660
 D  R  K  K  R  |  L  E  E  Q  K  N  I  L  S  E  F  Q  R  D  L      p.2220

          .         .         .         .         .         .       g.1412488
 AATGAATTTGTTTTATGGTTGGAGGAAGCAGATAACATTGCTAGTATCCCACTTGAACCT       c.6720
 N  E  F  V  L  W  L  E  E  A  D  N  I  A  S  I  P  L  E  P         p.2240

          .         .         .         .   | 47     .         .    g.1414882
 GGAAAAGAGCAGCAACTAAAAGAAAAGCTTGAGCAAGTCAAG | TTACTGGTGGAAGAGTTG    c.6780
 G  K  E  Q  Q  L  K  E  K  L  E  Q  V  K   | L  L  V  E  E  L      p.2260

          .         .         .         .         .         .       g.1414942
 CCCCTGCGCCAGGGAATTCTCAAACAATTAAATGAAACTGGAGGACCCGTGCTTGTAAGT       c.6840
 P  L  R  Q  G  I  L  K  Q  L  N  E  T  G  G  P  V  L  V  S         p.2280

          .         .         .         .         .         .       g.1415002
 GCTCCCATAAGCCCAGAAGAGCAAGATAAACTTGAAAATAAGCTCAAGCAGACAAATCTC       c.6900
 A  P  I  S  P  E  E  Q  D  K  L  E  N  K  L  K  Q  T  N  L         p.2300

          .   | 48     .         .         .         .         .    g.1469284
 CAGTGGATAAAG | GTTTCCAGAGCTTTACCTGAGAAACAAGGAGAAATTGAAGCTCAAATA    c.6960
 Q  W  I  K   | V  S  R  A  L  P  E  K  Q  G  E  I  E  A  Q  I      p.2320

          .         .         .         .         .         .       g.1469344
 AAAGACCTTGGGCAGCTTGAAAAAAAGCTTGAAGACCTTGAAGAGCAGTTAAATCATCTG       c.7020
 K  D  L  G  Q  L  E  K  K  L  E  D  L  E  E  Q  L  N  H  L         p.2340

          .         .         .         .         .         .       g.1469404
 CTGCTGTGGTTATCTCCTATTAGGAATCAGTTGGAAATTTATAACCAACCAAACCAAGAA       c.7080
 L  L  W  L  S  P  I  R  N  Q  L  E  I  Y  N  Q  P  N  Q  E         p.2360

          .         | 49         .         .         .         .    g.1507832
 GGACCATTTGACGTTCAG | GAAACTGAAATAGCAGTTCAAGCTAAACAACCGGATGTGGAA    c.7140
 G  P  F  D  V  Q   | E  T  E  I  A  V  Q  A  K  Q  P  D  V  E      p.2380

          .         .         .         .         .         .       g.1507892
 GAGATTTTGTCTAAAGGGCAGCATTTGTACAAGGAAAAACCAGCCACTCAGCCAGTGAAG       c.7200
 E  I  L  S  K  G  Q  H  L  Y  K  E  K  P  A  T  Q  P  V  K         p.2400

  | 50       .         .         .         .         .         .    g.1524586
  | AGGAAGTTAGAAGATCTGAGCTCTGAGTGGAAGGCGGTAAACCGTTTACTTCAAGAGCTG    c.7260
  | R  K  L  E  D  L  S  S  E  W  K  A  V  N  R  L  L  Q  E  L      p.2420

          .         .         .         .          | 51        .    g.1570428
 AGGGCAAAGCAGCCTGACCTAGCTCCTGGACTGACCACTATTGGAGCCT | CTCCTACTCAG    c.7320
 R  A  K  Q  P  D  L  A  P  G  L  T  T  I  G  A  S |   P  T  Q      p.2440

          .         .         .         .         .         .       g.1570488
 ACTGTTACTCTGGTGACACAACCTGTGGTTACTAAGGAAACTGCCATCTCCAAACTAGAA       c.7380
 T  V  T  L  V  T  Q  P  V  V  T  K  E  T  A  I  S  K  L  E         p.2460

          .         .         .         .         .         .       g.1570548
 ATGCCATCTTCCTTGATGTTGGAGGTACCTGCTCTGGCAGATTTCAACCGGGCTTGGACA       c.7440
 M  P  S  S  L  M  L  E  V  P  A  L  A  D  F  N  R  A  W  T         p.2480

          .         .         .         .         .         .       g.1570608
 GAACTTACCGACTGGCTTTCTCTGCTTGATCAAGTTATAAAATCACAGAGGGTGATGGTG       c.7500
 E  L  T  D  W  L  S  L  L  D  Q  V  I  K  S  Q  R  V  M  V         p.2500

          .         .         .         .   | 52     .         .    g.1614879
 GGTGACCTTGAGGATATCAACGAGATGATCATCAAGCAGAAG | GCAACAATGCAGGATTTG    c.7560
 G  D  L  E  D  I  N  E  M  I  I  K  Q  K   | A  T  M  Q  D  L      p.2520

          .         .         .         .         .         .       g.1614939
 GAACAGAGGCGTCCCCAGTTGGAAGAACTCATTACCGCTGCCCAAAATTTGAAAAACAAG       c.7620
 E  Q  R  R  P  Q  L  E  E  L  I  T  A  A  Q  N  L  K  N  K         p.2540

          .         .         .         . | 53       .         .    g.1665043
 ACCAGCAATCAAGAGGCTAGAACAATCATTACGGATCGAA | TTGAAAGAATTCAGAATCAG    c.7680
 T  S  N  Q  E  A  R  T  I  I  T  D  R  I |   E  R  I  Q  N  Q      p.2560

          .         .         .         .         .         .       g.1665103
 TGGGATGAAGTACAAGAACACCTTCAGAACCGGAGGCAACAGTTGAATGAAATGTTAAAG       c.7740
 W  D  E  V  Q  E  H  L  Q  N  R  R  Q  Q  L  N  E  M  L  K         p.2580

          .         .         .         .         .         .       g.1665163
 GATTCAACACAATGGCTGGAAGCTAAGGAAGAAGCTGAGCAGGTCTTAGGACAGGCCAGA       c.7800
 D  S  T  Q  W  L  E  A  K  E  E  A  E  Q  V  L  G  Q  A  R         p.2600

          .         .         .         .         .         .       g.1665223
 GCCAAGCTTGAGTCATGGAAGGAGGGTCCCTATACAGTAGATGCAATCCAAAAGAAAATC       c.7860
 A  K  L  E  S  W  K  E  G  P  Y  T  V  D  A  I  Q  K  K  I         p.2620

          .   | 54     .         .         .         .         .    g.1686513
 ACAGAAACCAAG | CAGTTGGCCAAAGACCTCCGCCAGTGGCAGACAAATGTAGATGTGGCA    c.7920
 T  E  T  K   | Q  L  A  K  D  L  R  Q  W  Q  T  N  V  D  V  A      p.2640

          .         .         .         .         .         .       g.1686573
 AATGACTTGGCCCTGAAACTTCTCCGGGATTATTCTGCAGATGATACCAGAAAAGTCCAC       c.7980
 N  D  L  A  L  K  L  L  R  D  Y  S  A  D  D  T  R  K  V  H         p.2660

          .         .         .         .        | 55.         .    g.1716760
 ATGATAACAGAGAATATCAATGCCTCTTGGAGAAGCATTCATAAAAG | GGTGAGTGAGCGA    c.8040
 M  I  T  E  N  I  N  A  S  W  R  S  I  H  K  R  |  V  S  E  R      p.2680

          .         .         .         .         .         .       g.1716820
 GAGGCTGCTTTGGAAGAAACTCATAGATTACTGCAACAGTTCCCCCTGGACCTGGAAAAG       c.8100
 E  A  A  L  E  E  T  H  R  L  L  Q  Q  F  P  L  D  L  E  K         p.2700

          .         .         .         .         .         .       g.1716880
 TTTCTTGCCTGGCTTACAGAAGCTGAAACAACTGCCAATGTCCTACAGGATGCTACCCGT       c.8160
 F  L  A  W  L  T  E  A  E  T  T  A  N  V  L  Q  D  A  T  R         p.2720

          .         .         .         .         .        | 56.    g.1837159
 AAGGAAAGGCTCCTAGAAGACTCCAAGGGAGTAAAAGAGCTGATGAAACAATGGCAA | GAC    c.8220
 K  E  R  L  L  E  D  S  K  G  V  K  E  L  M  K  Q  W  Q   | D      p.2740
                                                           ^ alternative start Dp116

          .         .         .         .         .         .       g.1837219
 CTCCAAGGTGAAATTGAAGCTCACACAGATGTTTATCACAACCTGGATGAAAACAGCCAA       c.8280
 L  Q  G  E  I  E  A  H  T  D  V  Y  H  N  L  D  E  N  S  Q         p.2760

          .         .         .         .         .         .       g.1837279
 AAAATCCTGAGATCCCTGGAAGGTTCCGATGATGCAGTCCTGTTACAAAGACGTTTGGAT       c.8340
 K  I  L  R  S  L  E  G  S  D  D  A  V  L  L  Q  R  R  L  D         p.2780

          .         .         .         .         . | 57       .    g.1847675
 AACATGAACTTCAAGTGGAGTGAACTTCGGAAAAAGTCTCTCAACATTAG | GTCCCATTTG    c.8400
 N  M  N  F  K  W  S  E  L  R  K  K  S  L  N  I  R  |  S  H  L      p.2800

          .         .         .         .         .         .       g.1847735
 GAAGCCAGTTCTGACCAGTGGAAGCGTCTGCACCTTTCTCTGCAGGAACTTCTGGTGTGG       c.8460
 E  A  S  S  D  Q  W  K  R  L  H  L  S  L  Q  E  L  L  V  W         p.2820

          .         .         .         .         .         .       g.1847795
 CTACAGCTGAAAGATGATGAATTAAGCCGGCAGGCACCTATTGGAGGCGACTTTCCAGCA       c.8520
 L  Q  L  K  D  D  E  L  S  R  Q  A  P  I  G  G  D  F  P  A         p.2840

          .         .        | 58.         .         .         .    g.1865539
 GTTCAGAAGCAGAACGATGTACATAGG | GCCTTCAAGAGGGAATTGAAAACTAAAGAACCT    c.8580
 V  Q  K  Q  N  D  V  H  R   | A  F  K  R  E  L  K  T  K  E  P      p.2860

          .         .         .         .         .         .       g.1865599
 GTAATCATGAGTACTCTTGAGACTGTACGAATATTTCTGACAGAGCAGCCTTTGGAAGGA       c.8640
 V  I  M  S  T  L  E  T  V  R  I  F  L  T  E  Q  P  L  E  G         p.2880

          .         .         | 59         .         .         .    g.1866267
 CTAGAGAAACTCTACCAGGAGCCCAGAG | AGCTGCCTCCTGAGGAGAGAGCCCAGAATGTC    c.8700
 L  E  K  L  Y  Q  E  P  R  E |   L  P  P  E  E  R  A  Q  N  V      p.2900

          .         .         .         .         .         .       g.1866327
 ACTCGGCTTCTACGAAAGCAGGCTGAGGAGGTCAATACTGAGTGGGAAAAATTGAACCTG       c.8760
 T  R  L  L  R  K  Q  A  E  E  V  N  T  E  W  E  K  L  N  L         p.2920

          .         .         .         .         .         .       g.1866387
 CACTCCGCTGACTGGCAGAGAAAAATAGATGAGACCCTTGAAAGACTCCAGGAACTTCAA       c.8820
 H  S  A  D  W  Q  R  K  I  D  E  T  L  E  R  L  Q  E  L  Q         p.2940

          .         .         .         .         .         .       g.1866447
 GAGGCCACGGATGAGCTGGACCTCAAGCTGCGCCAAGCTGAGGTGATCAAGGGATCCTGG       c.8880
 E  A  T  D  E  L  D  L  K  L  R  Q  A  E  V  I  K  G  S  W         p.2960

          .         .         .         .         .        | 60.    g.1899985
 CAGCCCGTGGGCGATCTCCTCATTGACTCTCTCCAAGATCACCTCGAGAAAGTCAAG | GCA    c.8940
 Q  P  V  G  D  L  L  I  D  S  L  Q  D  H  L  E  K  V  K   | A      p.2980

          .         .         .         .         .         .       g.1900045
 CTTCGAGGAGAAATTGCGCCTCTGAAAGAGAACGTGAGCCACGTCAATGACCTTGCTCGC       c.9000
 L  R  G  E  I  A  P  L  K  E  N  V  S  H  V  N  D  L  A  R         p.3000

          .         .         .         .         .         .       g.1900105
 CAGCTTACCACTTTGGGCATTCAGCTCTCACCGTATAACCTCAGCACTCTGGAAGACCTG       c.9060
 Q  L  T  T  L  G  I  Q  L  S  P  Y  N  L  S  T  L  E  D  L         p.3020

          .         .     | 61   .         .         .         .    g.1996011
 AACACCAGATGGAAGCTTCTGCAG | GTGGCCGTCGAGGACCGAGTCAGGCAGCTGCATGAA    c.9120
 N  T  R  W  K  L  L  Q   | V  A  V  E  D  R  V  R  Q  L  H  E      p.3040

          .         .         .         .    | 62    .         .    g.2020968
 GCCCACAGGGACTTTGGTCCAGCATCTCAGCACTTTCTTTCCA | CGTCTGTCCAGGGTCCC    c.9180
 A  H  R  D  F  G  P  A  S  Q  H  F  L  S  T |   S  V  Q  G  P      p.3060

          .         .         .         .     | 63   .         .    g.2083609
 TGGGAGAGAGCCATCTCGCCAAACAAAGTGCCCTACTATATCAA | CCACGAGACTCAAACA    c.9240
 W  E  R  A  I  S  P  N  K  V  P  Y  Y  I  N  |  H  E  T  Q  T      p.3080
                                              ^ alternative start Dp71/Dp40

          .         .         .         .       | 64 .         .    g.2121502
 ACTTGCTGGGACCATCCCAAAATGACAGAGCTCTACCAGTCTTTAG | CTGACCTGAATAAT    c.9300
 T  C  W  D  H  P  K  M  T  E  L  Y  Q  S  L  A |   D  L  N  N      p.3100

          .         .         .         .         .         .       g.2121562
 GTCAGATTCTCAGCTTATAGGACTGCCATGAAACTCCGAAGACTGCAGAAGGCCCTTTGC       c.9360
 V  R  F  S  A  Y  R  T  A  M  K  L  R  R  L  Q  K  A  L  C         p.3120

   | 65      .         .         .         .         .         .    g.2134969
 T | TGGATCTCTTGAGCCTGTCAGCTGCATGTGATGCCTTGGACCAGCACAACCTCAAGCAA    c.9420
 L |   D  L  L  S  L  S  A  A  C  D  A  L  D  Q  H  N  L  K  Q      p.3140

          .         .         .         .         .         .       g.2135029
 AATGACCAGCCCATGGATATCCTGCAGATTATTAATTGTTTGACCACTATTTATGACCGC       c.9480
 N  D  Q  P  M  D  I  L  Q  I  I  N  C  L  T  T  I  Y  D  R         p.3160

          .         .         .         .         .         .       g.2135089
 CTGGAGCAAGAGCACAACAATTTGGTCAACGTCCCTCTCTGCGTGGATATGTGTCTGAAC       c.9540
 L  E  Q  E  H  N  N  L  V  N  V  P  L  C  V  D  M  C  L  N         p.3180

          .         .    | 66    .         .         .         .    g.2137979
 TGGCTGCTGAATGTTTATGATAC | GGGACGAACAGGGAGGATCCGTGTCCTGTCTTTTAAA    c.9600
 W  L  L  N  V  Y  D  T  |  G  R  T  G  R  I  R  V  L  S  F  K      p.3200

          .         .         .         .          | 67        .    g.2140502
 ACTGGCATCATTTCCCTGTGTAAAGCACATTTGGAAGACAAGTACAGAT | ACCTTTTCAAG    c.9660
 T  G  I  I  S  L  C  K  A  H  L  E  D  K  Y  R  Y |   L  F  K      p.3220

          .         .         .         .         .         .       g.2140562
 CAAGTGGCAAGTTCAACAGGATTTTGTGACCAGCGCAGGCTGGGCCTCCTTCTGCATGAT       c.9720
 Q  V  A  S  S  T  G  F  C  D  Q  R  R  L  G  L  L  L  H  D         p.3240

          .         .         .         .         .         .       g.2140622
 TCTATCCAAATTCCAAGACAGTTGGGTGAAGTTGCATCCTTTGGGGGCAGTAACATTGAG       c.9780
 S  I  Q  I  P  R  Q  L  G  E  V  A  S  F  G  G  S  N  I  E         p.3260

          .         .        | 68.         .         .         .    g.2161738
 CCAAGTGTCCGGAGCTGCTTCCAATTT | GCTAATAATAAGCCAGAGATCGAAGCGGCCCTC    c.9840
 P  S  V  R  S  C  F  Q  F   | A  N  N  K  P  E  I  E  A  A  L      p.3280

          .         .         .         .         .         .       g.2161798
 TTCCTAGACTGGATGAGACTGGAACCCCAGTCCATGGTGTGGCTGCCCGTCCTGCACAGA       c.9900
 F  L  D  W  M  R  L  E  P  Q  S  M  V  W  L  P  V  L  H  R         p.3300

          .         .         .         .         .         .       g.2161858
 GTGGCTGCTGCAGAAACTGCCAAGCATCAGGCCAAATGTAACATCTGCAAAGAGTGTCCA       c.9960
 V  A  A  A  E  T  A  K  H  Q  A  K  C  N  I  C  K  E  C  P         p.3320

          .     | 69   .         .         .         .         .    g.2164174
 ATCATTGGATTCAG | GTACAGGAGTCTAAAGCACTTTAATTATGACATCTGCCAAAGCTGC    c.10020
 I  I  G  F  R  |  Y  R  S  L  K  H  F  N  Y  D  I  C  Q  S  C      p.3340

          .         .         .         .         .         .       g.2164234
 TTTTTTTCTGGTCGAGTTGCAAAAGGCCATAAAATGCACTATCCCATGGTGGAATATTGC       c.10080
 F  F  S  G  R  V  A  K  G  H  K  M  H  Y  P  M  V  E  Y  C         p.3360

        | 70 .         .         .         .         .         .    g.2165858
 ACTCCG | ACTACATCAGGAGAAGATGTTCGAGACTTTGCCAAGGTACTAAAAAACAAATTT    c.10140
 T  P   | T  T  S  G  E  D  V  R  D  F  A  K  V  L  K  N  K  F      p.3380

          .         .         .         .         .         .       g.2165918
 CGAACCAAAAGGTATTTTGCGAAGCATCCCCGAATGGGCTACCTGCCAGTGCAGACTGTC       c.10200
 R  T  K  R  Y  F  A  K  H  P  R  M  G  Y  L  P  V  Q  T  V         p.3400

          .         .    | 71    .         .         .         .    g.2166676
 TTAGAGGGGGACAACATGGAAAC | TCCCGTTACTCTGATCAACTTCTGGCCAGTAGATTCT    c.10260
 L  E  G  D  N  M  E  T  |  P  V  T  L  I  N  F  W  P  V  D  S      p.3420
                         ^ 3'-terminal exon Dp40

    | 72     .         .         .         .         .         .    g.2171063
 GC | GCCTGCCTCGTCCCCTCAGCTTTCACACGATGATACTCATTCACGCATTGAACATTAT    c.10320
 A  |  P  A  S  S  P  Q  L  S  H  D  D  T  H  S  R  I  E  H  Y      p.3440

          | 73         .         .         .         .         .    g.2172248
 GCTAGCAG | GCTAGCAGAAATGGAAAACAGCAATGGATCTTATCTAAATGATAGCATCTCT    c.10380
 A  S  R  |  L  A  E  M  E  N  S  N  G  S  Y  L  N  D  S  I  S      p.3460

          .     | 74   .         .         .         .         .    g.2175054
 CCTAATGAGAGCAT | AGATGATGAACATTTGTTAATCCAGCATTACTGCCAAAGTTTGAAC    c.10440
 P  N  E  S  I  |  D  D  E  H  L  L  I  Q  H  Y  C  Q  S  L  N      p.3480

          .         .         .         .         .         .       g.2175114
 CAGGACTCCCCCCTGAGCCAGCCTCGTAGTCCTGCCCAGATCTTGATTTCCTTAGAGAGT       c.10500
 Q  D  S  P  L  S  Q  P  R  S  P  A  Q  I  L  I  S  L  E  S         p.3500

          .         .         .         .         .    | 75    .    g.2197098
 GAGGAAAGAGGGGAGCTAGAGAGAATCCTAGCAGATCTTGAGGAAGAAAACAG | GAATCTG    c.10560
 E  E  R  G  E  L  E  R  I  L  A  D  L  E  E  E  N  R  |  N  L      p.3520

          .         .         .         .         .         .       g.2197158
 CAAGCAGAATATGACCGTCTAAAGCAGCAGCACGAACATAAAGGCCTGTCCCCACTGCCG       c.10620
 Q  A  E  Y  D  R  L  K  Q  Q  H  E  H  K  G  L  S  P  L  P         p.3540

          .         .         .         .         .         .       g.2197218
 TCCCCTCCTGAAATGATGCCCACCTCTCCCCAGAGTCCCCGGGATGCTGAGCTCATTGCT       c.10680
 S  P  P  E  M  M  P  T  S  P  Q  S  P  R  D  A  E  L  I  A         p.3560

          .         .         .         .         .         .       g.2197278
 GAGGCCAAGCTACTGCGTCAACACAAAGGCCGCCTGGAAGCCAGGATGCAAATCCTGGAA       c.10740
 E  A  K  L  L  R  Q  H  K  G  R  L  E  A  R  M  Q  I  L  E         p.3580

          .         .         .         .         .        | 76.    g.2198198
 GACCACAATAAACAGCTGGAGTCACAGTTACACAGGCTAAGGCAGCTGCTGGAGCAA | CCC    c.10800
 D  H  N  K  Q  L  E  S  Q  L  H  R  L  R  Q  L  L  E  Q   | P      p.3600

          .         .         .         .         .         .       g.2198258
 CAGGCAGAGGCCAAAGTGAATGGCACAACGGTGTCCTCTCCTTCTACCTCTCTACAGAGG       c.10860
 Q  A  E  A  K  V  N  G  T  T  V  S  S  P  S  T  S  L  Q  R         p.3620

          .         .         .         .         .         .       g.2198318
 TCCGACAGCAGTCAGCCTATGCTGCTCCGAGTGGTTGGCAGTCAAACTTCGGACTCCATG       c.10920
 S  D  S  S  Q  P  M  L  L  R  V  V  G  S  Q  T  S  D  S  M         p.3640

   | 77      .         .         .         .         .         .    g.2210474
 G | GTGAGGAAGATCTTCTCAGTCCTCCCCAGGACACAAGCACAGGGTTAGAGGAGGTGATG    c.10980
 G |   E  E  D  L  L  S  P  P  Q  D  T  S  T  G  L  E  E  V  M      p.3660

          .         .         .     | 78   .         .         .    g.2217962
 GAGCAACTCAACAACTCCTTCCCTAGTTCAAGAG | GAAGAAATACCCCTGGAAAGCCAATG    c.11040
 E  Q  L  N  N  S  F  P  S  S  R  G |   R  N  T  P  G  K  P  M      p.3680
                                    ^ differentially spliced exon
        | 79 .                                                      g.2222733
 AGAGAG | GACACAATGTAG                                              c.11058
 R  E   | D  T  M  X                                                p.3685
            H  N  V  G                                              p.3672+4
      (C-terminal end ancient dystrophin, -ex78 transcript)

          .         .         .         .         .         .       g.2222793
 gaagtcttttccacatggcagatgatttgggcagagcgatggagtccttagtatcagtca       c.*60
   S  L  F  H  M  A  D  D  L  G  R  A  M  E  S  L  V  S  V  M       p.3672+24

          .         .         .         .         .         .       g.2222853
 tgacagatgaagaaggagcagaataaatgttttacaactcctgattcccgcatggttttt       c.*120
   T  D  E  E  G  A  E  *                                         p.3672+31

          .         .         .         .         .         .       g.2222913
 ataatattcatacaacaaagaggattagacagtaagagtttacaagaaataaatctatat       c.*180

          .         .         .         .         .         .       g.2222973
 ttttgtgaagggtagtggtattatactgtagatttcagtagtttctaagtctgttattgt       c.*240

          .         .         .         .         .         .       g.2223033
 tttgttaacaatggcaggttttacacgtctatgcaattgtacaaaaaagttataagaaaa       c.*300

          .         .         .         .         .         .       g.2223093
 ctacatgtaaaatcttgatagctaaataacttgccatttctttatatggaacgcattttg       c.*360

          .         .         .         .         .         .       g.2223153
 ggttgtttaaaaatttataacagttataaagaaagattgtaaactaaagtgtgctttata       c.*420

          .         .         .         .         .         .       g.2223213
 aaaaaaagttgtttataaaaacccctaaaaacaaaacaaacacacacacacacacataca       c.*480

          .         .         .         .         .         .       g.2223273
 cacacacacacaaaactttgaggcagcgcattgttttgcatccttttggcgtgatatcca       c.*540

          .         .         .         .         .         .       g.2223333
 tatgaaattcatggctttttctttttttgcatattaaagataagacttcctctaccacca       c.*600

          .         .         .         .         .         .       g.2223393
 caccaaatgactactacacactgctcatttgagaactgtcagctgagtggggcaggcttg       c.*660

          .         .         .         .         .         .       g.2223453
 agttttcatttcatatatctatatgtctataagtatataaatactatagttatatagata       c.*720

          .         .         .         .         .         .       g.2223513
 aagagatacgaatttctatagactgactttttccattttttaaatgttcatgtcacatcc       c.*780

          .         .         .         .         .         .       g.2223573
 taatagaaagaaattacttctagtcagtcatccaggcttacctgcttggtctagaatgga       c.*840

          .         .         .         .         .         .       g.2223633
 tttttcccggagccggaagccaggaggaaactacaccacactaaaacattgtctacagct       c.*900

          .         .         .         .         .         .       g.2223693
 ccagatgtttctcattttaaacaactttccactgacaacgaaagtaaagtaaagtattgg       c.*960

          .         .         .         .         .         .       g.2223753
 atttttttaaagggaacatgtgaatgaatacacaggacttattatatcagagtgagtaat       c.*1020

          .         .         .         .         .         .       g.2223813
 cggttggttggttgattgattgattgattgatacattcagcttcctgctgctagcaatgc       c.*1080

          .         .         .         .         .         .       g.2223873
 cacgatttagatttaatgatgcttcagtggaaatcaatcagaaggtattctgaccttgtg       c.*1140

          .         .         .         .         .         .       g.2223933
 aacatcagaaggtattttttaactcccaagcagtagcaggacgatgatagggctggaggg       c.*1200

          .         .         .         .         .         .       g.2223993
 ctatggattcccagcccatccctgtgaaggagtaggccactctttaagtgaaggattgga       c.*1260

          .         .         .         .         .         .       g.2224053
 tgattgttcataatacataaagttctctgtaattacaactaaattattatgccctcttct       c.*1320

          .         .         .         .         .         .       g.2224113
 cacagtcaaaaggaactgggtggtttggtttttgttgcttttttagatttattgtcccat       c.*1380

          .         .         .         .         .         .       g.2224173
 gtgggatgagtttttaaatgccacaagacataatttaaaataaataaactttgggaaaag       c.*1440

          .         .         .         .         .         .       g.2224233
 gtgtaaaacagtagccccatcacatttgtgatactgacaggtatcaacccagaagcccat       c.*1500

          .         .         .         .         .         .       g.2224293
 gaactgtgtttccatcctttgcatttctctgcgagtagttccacacaggtttgtaagtaa       c.*1560

          .         .         .         .         .         .       g.2224353
 gtaagaaagaaggcaaattgattcaaatgttacaaaaaaacccttcttggtggattagac       c.*1620

          .         .         .         .         .         .       g.2224413
 aggttaaatatataaacaaacaaacaaaaattgctcaaaaaagaggagaaaagctcaaga       c.*1680

          .         .         .         .         .         .       g.2224473
 ggaaaagctaaggactggtaggaaaaagctttactctttcatgccattttatttcttttt       c.*1740

          .         .         .         .         .         .       g.2224533
 gatttttaaatcattcattcaatagataccaccgtgtgacctataattttgcaaatctgt       c.*1800

          .         .         .         .         .         .       g.2224593
 tacctctgacatcaagtgtaattagcttttggagagtgggctgacatcaagtgtaattag       c.*1860

          .         .         .         .         .         .       g.2224653
 cttttggagagtgggttttgtccattattaataattaattaattaacatcaaacacggct       c.*1920

          .         .         .         .         .         .       g.2224713
 tctcatgctatttctacctcactttggttttggggtgttcctgataattgtgcacacctg       c.*1980

          .         .         .         .         .         .       g.2224773
 agttcacagcttcaccacttgtccattgcgttattttctttttcctttataattctttct       c.*2040

          .         .         .         .         .         .       g.2224833
 ttttccttcataattttcaaaagaaaacccaaagctctaaggtaacaaattaccaaatta       c.*2100

          .         .         .         .         .         .       g.2224893
 catgaagatttggtttttgtcttgcatttttttcctttatgtgacgctggaccttttctt       c.*2160

          .         .         .         .         .         .       g.2224953
 tacccaaggatttttaaaactcagatttaaaacaaggggttactttacatcctactaaga       c.*2220

          .         .         .         .         .         .       g.2225013
 agtttaagtaagtaagtttcattctaaaatcagaggtaaatagagtgcataaataatttt       c.*2280

          .         .         .         .         .         .       g.2225073
 gttttaatctttttgtttttcttttagacacattagctctggagtgagtctgtcataata       c.*2340

          .         .         .         .         .         .       g.2225133
 tttgaacaaaaattgagagctttattgctgcattttaagcataattaatttggacattat       c.*2400

          .         .         .         .         .         .       g.2225193
 ttcgtgttgtgttctttataaccaccaagtattaaactgtaaatcataatgtaactgaag       c.*2460

          .         .         .         .         .         .       g.2225253
 cataaacatcacatggcatgttttgtcattgttttcaggtactgagttcttacttgagta       c.*2520

          .         .         .         .         .         .       g.2225313
 tcataatatattgtgttttaacaccaacactgtaacatttacgaattatttttttaaact       c.*2580

          .         .         .         .         .         .       g.2225373
 tcagttttactgcattttcacaacatatcagacttcaccaaatatatgccttactattgt       c.*2640

          .         .         .         .         .                 g.2225424
 attatagtactgctttactgtgtatctcaataaagcacgcagttatgttac                c.*2691

 (downstream sequence)
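
The alternative C-terminal end annotated above for the -ex78 transcript can be reproduced by joining the last nucleotide of exon 77 (c.11014) directly to exon 79 and reading on into the 3' UTR. The sketch below is an illustration only: the names CODON_TABLE and translate are hypothetical helpers, the standard genetic code is assumed, and the sequence fragments are copied from the listing above (the 3' UTR is truncated shortly after the new stop codon).

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Fragments copied from the listing above.
EXON77_LAST_NT = "G"       # c.11014, first base of codon 3672
EXON79 = "GACACAATGTAG"    # c.11047 to c.11058
UTR3_START = (             # c.*1 onward, truncated after the new stop codon
    "gaagtcttttccacatggcagatgatttgggcagagcgatggagtccttagtatcagtca"
    "tgacagatgaagaaggagcagaataaatg"
)


def translate(seq: str) -> str:
    """Translate codon by codon from the first base, stopping at the first stop ('*')."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3].upper()]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Exon 78 (c.11015 to c.11046, 32 nucleotides) is skipped, which shifts the frame.
tail = translate(EXON77_LAST_NT + EXON79 + UTR3_START)

# Last nucleotide of the new stop codon, counted in 3' UTR (c.*) coordinates.
stop_end = 3 * len(tail) - len(EXON77_LAST_NT) - len(EXON79)

print(tail)             # G (p.3672) followed by the 31 new residues and '*'
print(f"c.*{stop_end}")
```

Under these assumptions the translation starts with the unchanged glycine at p.3672, continues with the 31 residues shown as p.3672+1 to p.3672+31, and ends with a stop codon whose last nucleotide is c.*86.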

Legend:
Nucleotide numbering follows the HGVS rules for a 'Coding DNA Reference Sequence' and is shown to the right of the sequence, with the A of the ATG translation initiation codon counted as nucleotide 1. Every 10th nucleotide is marked by a "." above the sequence. The dystrophin (DMD) protein sequence is shown below the coding DNA sequence; its numbering is shown at the right, starting with 1 for the translation-initiating methionine, and every 10th amino acid is shown in bold. Intron positions are indicated by a vertical line separating the two flanking exons, and the exon number is given above the first nucleotide(s) of each exon. The start of the first exon (transcription initiation site) is indicated by a '\' and the end of the last exon (poly-A addition site) by a '/'. To aid the description of frameshift mutations, all stop codons in the +1 frame are shown in bold and all stop codons in the +2 frame are underlined.
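
Because both numbering schemes start at the A of the ATG, a coding DNA position maps to an amino acid number by simple integer arithmetic. The sketch below illustrates this; the function name codon_of is hypothetical, and the coding-region end (c.11058, the last nucleotide of the stop codon) is taken from the listing above.

```python
CDS_END = 11058  # last nucleotide of the TAG stop codon in the listing above


def codon_of(c_pos: int) -> tuple[int, int]:
    """Map a coding position (c.1 = A of the ATG) to (codon number, position 1..3 in the codon)."""
    if not 1 <= c_pos <= CDS_END:
        # Positions such as c.-241 or c.*86 lie outside the coding region.
        raise ValueError("position is outside the coding region c.1 to c.11058")
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1


print(codon_of(7260))   # (2420, 3): c.7260 is the last base of codon p.2420
print(codon_of(11053))  # (3685, 1): first base of the final methionine codon
print(codon_of(11058))  # (3686, 3): last base of the stop codon
```

This matches the pairing of c. and p. labels at the right of each row, for example c.7260 with p.2420. Intronic and untranslated positions are deliberately rejected here, since they have no codon number.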


©2004-2009 Leiden University Medical Center