| M W F; 10:10 - 11:00 pm |
Molecular Biology |
Douglas W. Smith |
| York 2722 |
BIMM 100 |
5254 Muir Biology Building |
| Fall, 2000 |
x42620; dsmith@ucsd.edu |
| BIMM100 | Syllabus | Sections / Off Hrs | Grading Policy | DNASYSTEM |
| Lectures | Journal Articles | Study Qs | Lab Techniques | Exams |
Outline:
1. Maxam-Gilbert Chemical
Sequencing ... [Brown,
Box 4.1]
no in vitro DNA polymerase reaction
1. Use DNA, ds or ss, with radioactive label at one end ONLY
2. In at least 4 separate reactions,
treat the DNA with base-specific chemicals that
cleave the DNA strand at that base
Example: DMS (dimethyl sulfate) for G's, hydrazine for pyrimidines
3. Get a nested set of labeled DNA fragments ...
4. Analyse as a ladder on a
DNA sequencing gel:
a. Polyacrylamide denaturing gel ... resolution of short ssDNA
fragments
b. Denaturing gel: 8 M urea ... keep DNA denatured during electrophoresis
c. Analyse only the labeled DNA fragments via Autoradiography
or Fluorescence analysis
5. Read the DNA sequence
from bottom of gel to top by examination or "reading"
of the ladder of DNA bands ... gives sequence 5'
to 3' (5' -> 3')
2. Sanger dideoxy DNA Sequencing ... [Brown,
Fig 4.2]
comes from DNA Polymerase properties:
1. Have a DNA template and
a DNA primer
Cloning vehicles often have universal primers for sequencing
into DNA cloned into one of the MCS (Multiple Cloning Sites) sites
or Polylinkers ...[Brown, Fig 4.5]
2. Execute 4 separate polymerization
reactions, each containing all 4 dNTPs plus one of the four
dideoxynucleoside TPs (ddNTPs): ddGTP, ddATP, ddTTP, or ddCTP ...
To assay the product DNA, one of the 4 dNTPs is radioactively
labeled ...
or the Primer is labeled, radioactively or fluorescently ...
or the ddNTP is labeled fluorescently (see below)
Dideoxy means 3'-H (in place of the 3'-OH) as well as 2'-H
When a ddNTP is incorporated, it acts as a Chain Terminator:
DNA synthesis stops since the growing DNA strand no longer has a 3'-OH
Primer Terminus
3. Thus, get as reaction products, a nested set of fragments, each terminated at one of the four bases: G in the ddGTP reaction, A in the ddATP reaction, ...
4. When "run" on
a DNA sequencing gel (polyacrylamide, with DNA denatured),
the "nested set" of fragments forms a ladder of
DNA bands corresponding to the positions of the bases ...
5. Read the DNA sequence by reading the 4 lanes, one for each base, from bottom up, to correspond to 5' -> 3' sequence
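Steps 2 - 5 can be illustrated with a small Python sketch. This is an idealized model, not a description of any real instrument: it assumes every possible termination product is present, and the function names and the 7-base example strand are hypothetical.

```python
def sanger_lanes(new_strand):
    """Return, for each ddNTP reaction, the lengths of the terminated fragments.

    new_strand is the sequence (5' -> 3') synthesized downstream of the primer.
    In the ddGTP reaction every fragment ends at a G, and so on.
    Assumption: every possible termination position is represented.
    """
    lanes = {"G": [], "A": [], "T": [], "C": []}
    for position, base in enumerate(new_strand, start=1):
        # Fragment length = position of the terminating base.
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Read the ladder from the bottom (shortest fragment) to the top (longest)."""
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    return "".join(base for length, base in sorted(bands))


lanes = sanger_lanes("GATTACA")
print(lanes)            # one list of fragment lengths per lane
print(read_gel(lanes))  # sequence read 5' -> 3'
```

Each lane holds only the fragment lengths ending in its base; sorting all bands by length is the "bottom up" reading of the gel.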
3. Automated DNA Sequencing
with Fluorescent Labels ...
[Brown, Fig 4.7]
use Sanger dideoxy sequencing ...
But with either fluorescent primers or fluorescent dideoxy
chain terminators
Use a different fluor for each
of the four nucleotide types ...
This permits analysis of ALL FOUR nucleotide reactions for a given
DNA sample in ONE LANE of the sequencing gel ... => 4-fold
increase in analysis capability per gel run ...
Most recently: capillary gel electrophoresis is used
in the Perkin Elmer - ABI 3700 automated sequencing machines rather
than slab gel electrophoresis ... thus, have separate thin capillary
gel for each DNA sample ...
Advantages:
1. Better resolution, no running over from one lane to another
...
2. Separation of bands occurs much faster ... => 10- to 15-fold
increase in speed ...
Both advantages very important for Celera sequencing of the human
genome ...
1. The Major Problem in
DNA Sequencing:
can only Sequence 500-700
nucleotides from a given DNA sample (!?!?)
This is due to convergence
of the DNA bands ... [Brown, Fig 4.1]
That is, the percent size difference between bands of 9
and 10 nucleotides is 10% ...
BUT this percent size difference between bands of 99 and
100 nucleotides is only 1% !!
... thus: bands corresponding to 99 and 100 nucleotides are 10-fold
closer to each other than the bands corresponding to DNA fragments
of length 9 and 10 nucleotides ...
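The convergence argument is simple arithmetic, sketched here in Python. The function name is hypothetical; the calculation follows the 9-vs-10 and 99-vs-100 examples above (a one-nucleotide difference expressed as a percent of the longer fragment).

```python
def percent_size_difference(n):
    """Percent difference in length between fragments of n - 1 and n nucleotides."""
    return 100.0 * 1 / n


for n in (10, 100, 500, 700):
    print(f"{n - 1} vs {n} nt: {percent_size_difference(n):.2f}%")
```

By 500 - 700 nucleotides the difference between neighboring bands has dropped to a fraction of a percent, which is where the bands can no longer be told apart.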
Thus, in Genome Sequencing or Sequencing of DNA molecules much longer than 500-700 nucleotides, one must obtain the sequences of many overlapping ~500 bp fragments and then join them together by determining how they overlap each other ...
2. DNA Sequence Assembly - Joining of Overlapping Sequences to form Contigs
When overlapping Sequences
are properly joined to form a single Sequence, this single sequence
is called a Contig.
In Sequence Assembly for an entire Genome, ultimately one should
end up with a single Contig for each Chromosome, since each Chromosome
is composed of a single DNA molecule.
In practice, this is VERY
difficult, due to repetitive DNA sequences ...
Repetitive DNA sequences present two major problems for
Sequence Assembly:
1. Number of Repeat Copies:
if the length of a repetitive DNA sequence region is long
compared to the sequenced DNA length of ~500 bp, it is nearly
impossible to determine how many copies of the DNA repeat there
are ...
2. Correct Assembly: ...
[Brown, Fig 2.2]
if such a long repetitive DNA sequence region is present
at several sites on a Genome, then it is nearly impossible
to determine what DNA sequences should be properly joined on either
side of each repetitive DNA sequence region ...
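The copy-number problem can be shown at toy scale in Python. The flanking sequences, the CAG repeat unit, and the read lengths are all made up for illustration, and the "shotgun" here is idealized (every possible error-free read is obtained).

```python
def all_reads(genome, read_length):
    """Every distinct read of the given length (idealized, error-free shotgun)."""
    return {genome[i:i + read_length] for i in range(len(genome) - read_length + 1)}


# Hypothetical toy sequences: same flanks, different numbers of repeat copies.
left_flank, right_flank, unit = "TTGACC", "GGATCA", "CAG"
six_copies = left_flank + unit * 6 + right_flank
nine_copies = left_flank + unit * 9 + right_flank

# Reads shorter than the repeat region: the two "genomes" look identical.
print(all_reads(six_copies, 8) == all_reads(nine_copies, 8))

# Reads long enough to span the whole repeat region: now they differ.
print(all_reads(six_copies, 24) == all_reads(nine_copies, 24))
```

The first comparison prints True and the second prints False: only a read that crosses the entire repeat region, flank to flank, carries the copy number.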
For these reasons (and a few others), in Genome Sequencing of higher Eukaryotes, much repetitive DNA is left unsequenced, and a given chromosome sequence will be split among several Contigs in the "final assembly" ...
3. Methodologies for DNA Sequence Assembly
There are three main methodologies
(with variations on a theme):
a. Shotgun Sequencing ...
b. Clone Contig Assembly ...
c. Directed Shotgun Assembly ...
a. Shotgun Sequencing ... [Brown, Fig 2.1, 4.10]
One sequences a given long DNA
with high redundancy: 10- to 15-fold
One then uses computer programs to find the correct overlaps
and join individual sequence reads into long Contigs
...
To do this uniquely and correctly, one needs 1) little repetitive
DNA, and 2) overlaps between reads of 20 - 40 nucleotides ...
such overlaps necessitate the high degree of redundancy ...
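A minimal Python sketch of overlap-based joining follows. It uses a simple greedy strategy (always merge the pair with the longest overlap), which is one textbook approach and not necessarily what any production assembler does. The function names and toy reads are hypothetical, the reads are assumed error-free and from one strand, and the minimum overlap is set to 4 only because the example is tiny (real overlaps are 20 - 40 nucleotides, as noted above).

```python
def overlap_length(left, right, min_overlap):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no such overlap of at least min_overlap exists.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best = (0, None, None)
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            # No overlaps remain: what is left are the contigs, separated by gaps.
            return contigs
        merged = contigs[i] + contigs[j][size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


reads = ["AAGTCTAGC", "ATGGCGTACC", "TACCTGAAGT"]  # toy reads, given out of order
print(greedy_assemble(reads, min_overlap=4))
```

With all three reads the result is a single contig; leave out the middle read and two contigs remain, which is exactly a Gap. Repetitive DNA breaks this scheme because a repeat creates overlaps between reads that do not actually adjoin.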
Closure of Gaps:
One will still have
some Gaps that need closing ... this is done via:
1) Use of a Second Clone Library ... often using a different
Cloning Vehicle
This will often yield a Clone which will cover the Gap ...
[Brown, Fig 4.11]
2) Use of Directed Sequencing:
From this Clone, obtain initial ~500 bp of sequence from one
end.
Then use this sequence to construct an oligonucleotide to use
as primer to extend the sequence further into the clone: Internal
Primer ... [Brown, Fig 4.11]
Continue doing this until one has walked across the Gap,
thereby closing the Gap ...
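Directed sequencing ("walking") can be sketched as a loop in Python. This is a simplified model under stated assumptions: every read is exactly 500 nucleotides and error-free, each new read starts immediately after the internal primer, and the function name, the 20-nucleotide primer length, and the random 1800-nucleotide toy clone are all hypothetical.

```python
import random


def primer_walk(clone, read_length=500, primer_length=20):
    """Walk along `clone` one read at a time, as in directed sequencing.

    Returns the sequence obtained and the internal primers that were designed.
    """
    known = clone[:read_length]  # initial read from one end of the clone
    primers = []
    while len(known) < len(clone):
        # Design an internal primer from the end of the sequence known so far.
        primers.append(known[-primer_length:])
        # The next read extends the known sequence further into the clone.
        known += clone[len(known):len(known) + read_length]
    return known, primers


rng = random.Random(0)  # fixed seed so the toy clone is reproducible
clone = "".join(rng.choice("ACGT") for _ in range(1800))
sequence, primers = primer_walk(clone)
print(len(sequence), len(primers))
```

Each round requires synthesizing a new oligonucleotide and running a new reaction, so walking is slow compared with shotgun reads and is reserved for closing Gaps.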
b. Clone Contig Assembly ...
First generate a collection of Mapped Clone Fragments:
1) Examples: YACs, BACs, PACs, Cosmids which are mapped relative
to each other, forming a set of overlapping large cloned fragments
... often with a high degree of redundancy: 5 - 15 fold.
2) From among these, choose a minimum set of overlapping clones
... this is sometimes called a minimum tiling clone set
3) For each of these overlapping clones, do Shotgun Sequencing:
Re-clone or subclone each large overlapping clone as small
sequencing fragments
Sequence these, ultimately forming a single contig corresponding
to the sequence of each large clone
4) Join each Contig for each large Clone together via overlapping
Sequence and knowledge of the Map of the Clones, yielding the
final Sequence of the entire DNA molecule, eg a chromosome
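Step 2, choosing a minimum tiling clone set, can be sketched as a greedy walk along the map. This is an illustration only: the function name, the (name, start, end) clone representation, the map positions in kb, and the `min_overlap` parameter are all assumptions of the sketch.

```python
def minimum_tiling_path(clones, region_start, region_end, min_overlap=0):
    """Pick a small set of mapped clones that covers the region end to end.

    clones: list of (name, start, end) map positions.
    Greedy rule: among the clones that overlap what is already covered,
    take the one that reaches furthest along the map.
    """
    path = []
    covered_to = region_start
    while covered_to < region_end:
        # The first clone must reach the region start; later clones must
        # overlap the previous pick by at least min_overlap.
        reach = covered_to if not path else covered_to - min_overlap
        candidates = [c for c in clones if c[1] <= reach and c[2] > covered_to]
        if not candidates:
            raise ValueError(f"no clone covers position {covered_to}: gap in the map")
        best = max(candidates, key=lambda c: c[2])
        path.append(best[0])
        covered_to = best[2]
    return path


# Hypothetical mapped clones, positions in kb.
clones = [("A", 0, 150), ("B", 50, 220), ("C", 120, 300),
          ("D", 200, 380), ("E", 280, 450), ("F", 330, 500)]
print(minimum_tiling_path(clones, 0, 500, min_overlap=10))
```

Only the clones on the returned path need to be shotgun sequenced; the others are redundant for coverage, although they remain useful for confirming the map.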
How is the Collection of Mapped Clone Fragments Generated?
1. Generate a Genome Library
as YACs, BACs, PACs, Cosmids, etc
2. Locate Markers to specific Clones in the Library.
These Markers can be Genetic markers, e.g. genes, or physical
DNA markers, e.g. restriction sites (R.sites), STSs (sequence-tagged
sites), RFLPs, etc ... [Brown, Fig 4.16]
3. Identify Overlapping Clones by identifying pairs of
Clones uniquely containing the same Markers, e.g. genes
or shared R.fragments ... [Brown, Fig 4.16]
When these Markers are STSs, they can serve as physical anchors in the sequence assembly process: the map position of each STS fixes the position of its sequence, so a correct assembly must place that sequence at that position ...
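Identifying overlapping clones from shared markers amounts to intersecting marker sets, sketched here in Python. The clone names, STS names, and function name are hypothetical, and the sketch assumes each marker occurs at a unique site in the genome (the "uniquely" requirement in step 3).

```python
from itertools import combinations


def overlapping_clone_pairs(clone_markers):
    """Map each pair of clones that share markers to the markers they share.

    clone_markers: dict of clone name -> set of markers detected in that clone.
    """
    overlaps = {}
    for a, b in combinations(sorted(clone_markers), 2):
        shared = clone_markers[a] & clone_markers[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


# Hypothetical marker content of four clones.
clone_markers = {
    "BAC1": {"sts1", "sts2"},
    "BAC2": {"sts2", "sts3"},
    "BAC3": {"sts3", "sts4"},
    "BAC4": {"sts9"},
}
print(overlapping_clone_pairs(clone_markers))
```

Here BAC1-BAC2 and BAC2-BAC3 are linked into one chain of overlapping clones, while BAC4 shares no marker with the others and stays unplaced.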
Chromosome Walking:
One can also determine overlapping clones without specific
markers by hybridizing one clone DNA to the DNA of other
clones ... [Brown, Fig 4.13] ...
However, use of STSs to provide anchors is very desirable with
very large DNA fragments ...
c. Directed Shotgun Assembly ...
This is Shotgun DNA Sequencing and Assembly of very large genomes, e.g. Drosophila or human, coupled with the use of anchored DNA markers, e.g. STSs
Example: Celera approach to Sequencing the Human Genome
Three genomic DNA libraries
are used for sequencing:
1) Plasmid library with ~2 kb inserts
2) Library with ~ 10 kb inserts ... different Cloning Vehicle
used ... 10 kb is large compared with most Repetitive DNA regions
in the human genome, thereby avoiding much of this problem
3) BAC clone library with ~250 kb inserts ...
DNA Sequencing is done on both ends of the
inserts present in each of these clones ... [see Brown, Fig 4.5A]
Computerized Assembly of sequence into Contigs is
greatly helped by the fact that the distance between the two
end sequences of each insert is nearly constant, at 2 kb or 10 kb or
250 kb: the two end sequences from one clone are called a Sequence Pair
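The known spacing of the two end sequences gives the assembly program a check it can apply to any proposed Contig, sketched below in Python. The function name, the coordinate convention, and the 20% tolerance are assumptions for illustration, not values from the Celera pipeline.

```python
def pair_is_consistent(left_start, right_end, insert_size, tolerance=0.2):
    """Check whether two paired end sequences sit the expected distance apart.

    left_start / right_end: positions in the assembled contig of the outer
    ends of the two reads from one clone (same units as insert_size).
    tolerance: allowed fractional deviation from the insert size (assumed).
    """
    span = right_end - left_start
    return abs(span - insert_size) <= tolerance * insert_size


# A pair from the ~2 kb library placed 2050 bp apart is consistent ...
print(pair_is_consistent(1000, 3050, insert_size=2000))
# ... but the same pair placed 8000 bp apart signals a misassembly.
print(pair_is_consistent(1000, 9000, insert_size=2000))
```

A join that puts many pairs at the wrong spacing is suspect, which is how the paired ends help assembly across repetitive DNA regions.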
The Sequences from the BAC
clone ends are used in three steps:
1. they become STSs
2. these STSs are used to map the BACs and the STSs
3. the mapped STSs become Anchored Sequences for the
subsequent Assembly into Contigs
Note the difference here
from Clone Contig Assembly:
1) In Clone Contig
Assembly, mapping is done first, a minimum set of YACs or BACs
is determined, and then each of these YACs or BACs is subjected
to Shotgun Sequencing.
2) Here, in Directed Shotgun Assembly, the entire genome is
subjected to Shotgun Sequencing via the two smaller insert
libraries, the 2 kb and 10 kb insert libraries.
These methodologies then yield the ultimate Physical Map of the Genome: its DNA Sequence
This DNA Sequence is
then the ultimate foundational information about the organism ...
with this information, one knows completely the enzymatic and
molecular capabilities of the organism ... what it can do
... and what it cannot do ...
If you have problems or comments, send email to Doug Smith