This article discusses: 1. Introduction to the Organization of DNA in Eukaryotes 2. Quantities of DNA Present in Eukaryotes 3. Active and Inactive Genes 4. Gene Cloning 5. “Sequencing” DNA.
1. Introduction to the Organization of DNA in Eukaryotes:
In eukaryotes the DNA is never a naked molecule; it is always associated with proteins which, together with the DNA, are permanent components of chromatin, the complex substance of which the chromosomes are made. The chromosomes are the units of organization of the chromatin in eukaryotes. The DNA double helix together with its associated proteins is called a chromonema.
In prokaryotes, in bacteria in particular, the “chromosome” is essentially a naked, two-stranded DNA molecule. Proteins may be associated with this molecule, in the form of repressors or enzymes serving for the replication of the DNA or for DNA-dependent synthesis of RNA.
These protein molecules may be present or absent at different times, but are not essential components of the structure of the DNA. Furthermore, most of the DNA in a bacterial cell is contained in one chromosome, in which the DNA molecule is in the form of a closed ring. The virus “chromosome” is even more of a naked DNA molecule.
The proteins associated with the DNA in the chromosomes fall into two main groups. The first group consists of the basic proteins, the histones. Histones carry a positive charge because they contain a large proportion of basic amino acids, such as arginine and lysine. The second group consists of the non-histone proteins, including the acidic proteins, which have an overall negative charge and contain a greater proportion of acidic amino acids.
Histone Proteins:
Histones are associated with the DNA only in eukaryotes, and only in those that are multicellular. No true histones have been found in bacteria, in blue-green algae, in yeasts, or in Protozoa (trypanosomes).
In the chromosomes of multicellular eukaryotes the histones are not diffused throughout their length, but are present in the form of small, very regular clumps, each clump consisting of exactly eight molecules, two each of four kinds of histone, designated as histones H2A, H2B, H3, and H4.
Together these eight molecules form a disc-shaped particle 110 Å in diameter, and 57 Å thick. The double-stranded DNA chain is wrapped around this disc in a spiral. Together with the histones this forms a unit of chromatin structure, which has been designated a nucleosome.
The DNA involved in one nucleosome is 140 base pairs long, and forms one complete and one incomplete turn around the histone particle, called the nucleosome core. The double-stranded DNA is continuous from one nucleosome to the next, forming a linker section, which is of variable length, sometimes even within the same tissue (from 14 base pairs to over 100 base pairs according to some data).
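The figures above lend themselves to a small calculation. The following is a minimal sketch, assuming the 140-base-pair core and the 14 to 100 base-pair linker range quoted in the text; the function names and the one-million-base-pair example are illustrative assumptions, not values from the source.

```python
CORE_BP = 140  # DNA wrapped around one histone core (figure quoted in the text)

def repeat_length(linker_bp):
    """Base pairs in one nucleosome repeat: core DNA plus one linker."""
    return CORE_BP + linker_bp

def nucleosome_count(total_bp, linker_bp):
    """Whole nucleosome repeats that fit into total_bp of DNA."""
    return total_bp // repeat_length(linker_bp)

# Shortest and longest repeats for the linker range given in the text
print(repeat_length(14), repeat_length(100))   # 154 240

# Hypothetical example: one million base pairs with a 60 bp linker
print(nucleosome_count(1_000_000, 60))         # 5000
```

Because the linker length varies, the number of nucleosomes per unit length of DNA is only approximate.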
The four kinds of histones contained in the nucleosome cores are amazingly similar throughout the animal and plant kingdoms. The histones H3 and H4 are especially constant. The similarity of histones in different organisms is evident not only in the physical properties of these histones but also in their amino acid residue composition.
In particular, it has been shown that the histone H4 in cow thymus and in pea seedlings differs only in two amino-acid residues – lysine and valine (in cow thymus) are substituted for arginine and isoleucine (in pea seedlings), respectively.
In both substitutions the change could have been achieved by alteration of only one base in the triplets coding for the two amino acids. The other two histones are somewhat more variable. There is a fifth type of histone, also more variable, the histone H1, which is not a part of the nucleosome core but is associated somehow with the linkers between nucleosomes. There is presumably one molecule of histone H1 in each linker.
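The single-base claim for the two H4 substitutions can be checked against the standard genetic code. The sketch below lists only the RNA codons for the four amino acids involved (taken from the standard code) and reports the smallest number of base changes needed to go from one amino acid to the other; the helper names are illustrative.

```python
from itertools import product

# Standard-code RNA codons for the four amino acids named in the text
CODONS = {
    "Lys": ["AAA", "AAG"],
    "Arg": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "Val": ["GUU", "GUC", "GUA", "GUG"],
    "Ile": ["AUU", "AUC", "AUA"],
}

def base_differences(codon_a, codon_b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))

def min_base_changes(aa_from, aa_to):
    """Smallest number of base changes converting any codon of one
    amino acid into any codon of the other."""
    return min(
        base_differences(a, b)
        for a, b in product(CODONS[aa_from], CODONS[aa_to])
    )

print(min_base_changes("Lys", "Arg"))  # 1 (for example AAA -> AGA)
print(min_base_changes("Val", "Ile"))  # 1 (for example GUU -> AUU)
```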
At one time it was believed that the presence of histones in the chromosomes of eukaryotes might be a mechanism for determining which genes are to be active and which should be inactive (repressed). In fact, it has been proved that the presence of histones reduces the ability of DNA sequences in the chromosomes to be transcribed into RNA.
In one study isolated calf thymus nuclei were treated with the proteolytic enzyme trypsin. The treatment removed most of the histone associated with the nucleic acid in the chromosomes. The trypsin-treated chromosomes were then used as templates in preparations containing RNA precursors.
It was found that the amount of RNA synthesized was increased roughly threefold when 70 per cent of the histones were removed. In an experiment on plant material, removing all the histone from chromosomal DNA resulted in a fivefold increase in the synthesis of RNA. The conclusion seems justified that histones associated with the DNA in the chromosomes repress their ability to serve as RNA templates.
However, the similarity of histones in different parts of the genome, as well as in different animals and plants, precludes their selective role as factors specifying the activity of particular genes. Rather, the histones should be considered as structural elements of eukaryotic chromatin.
Non-Histone Proteins:
Non-histone proteins also seem to be a permanent component of the chromatin structure. It has been conjectured that non-histone proteins may be, in part, instrumental in the maintenance of higher-order coiling of the chromonema; that is, coiling additional to the coiling around the cores of the nucleosomes.
The non-histone proteins are thought to provide a sort of “scaffolding” for the coiled chromonema. At meiosis the paired homologous chromosomes can be seen in the electron microscope joined to a “synaptonemal complex” lying between the two chromosomes.
Non-histone proteins associated with the chromosomes show an immensely greater variety than the histone proteins. Whereas there are only five kinds of histones, the number of different non-histone proteins in the chromatin of the same tissue may run into hundreds. It is therefore much more likely that non-histone proteins may play a significant part in gene action regulation.
The contractile proteins form a special component of the non-histone proteins in the chromatin. There are at least ten different non-histone proteins which seem to be similar if not identical to the contractile proteins, myosin and actin. These proteins comprise up to 50 per cent of the non-histone proteins in the chromatin. It is possible, but not actually proven, that these contractile proteins may play a part in the movement of the chromosomes during mitosis.
In addition to the histones and non-histone proteins, some RNA is also found to be associated with the DNA in the chromosomes of eukaryotes. Some of this may be only temporarily associated with the DNA, such as the RNA in the process of transcription, but there is a component consisting of short molecules of RNA 40 to 80 nucleotides long, which appears to be a permanent part of the chromatin. The function of this RNA and the way in which it is bound to the other components of the chromatin are not known.
2. Quantities of DNA Present in Eukaryotes:
Compared with prokaryotes, the eukaryotes have an enormously greater amount of DNA per cell. The amounts are generally in proportion to the complexity of the organisms concerned, although there is a great variation at each level of organization.
Nevertheless, it is striking that the smallest amounts of DNA are found in viruses, that a much greater amount is found in bacteria, and that there is a great increase in the DNA content per cell in even the lowest eukaryotes, with the highest quantities found in vertebrates.
In the virus ɸX 174 the genome consists of 5375 nucleotides, containing nine genes. There are millions of nucleotides in a bacterial chromosome and billions in a mammalian cell. In lower Metazoa the amount of DNA per cell is about 100 times that of Escherichia coli; in mammals it is about 8000 times greater than in E. coli. If all this DNA consisted of genes coding for proteins, a mammalian cell could produce two million different proteins.
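The order of magnitude of the two-million figure can be reproduced with simple arithmetic. The sketch below is only one set of round numbers that gives this result: the genome size of 3 × 10^9 base pairs and the average protein length of 500 amino acids are assumptions made for illustration and are not stated in the text.

```python
BASES_PER_CODON = 3

# Assumed round figures, not taken from the text
ASSUMED_GENOME_BP = 3_000_000_000   # base pairs available for coding
ASSUMED_PROTEIN_AA = 500            # amino acids in an average protein

def max_protein_genes(genome_bp, protein_aa):
    """Number of protein-coding genes that would fit if every base
    pair belonged to a gene of the given average size."""
    return genome_bp // (protein_aa * BASES_PER_CODON)

print(max_protein_genes(ASSUMED_GENOME_BP, ASSUMED_PROTEIN_AA))  # 2000000
```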
3. Active and Inactive Genes:
Many biologists presume that not all genes in eukaryotes are active at any one time. Active genes are those that are actually being transcribed to produce messenger RNA or any other kind of RNA. It is quite possible to prove that transcription is going on at a certain time in specific cells or tissues.
This is evident from the appearance of new quantities of RNA, and also from the subsequent synthesis of specific proteins. It is less easy to prove that a gene is completely inactive (that it is not being transcribed), as very small quantities of its products could escape detection.
Contrary to what was suggested at one stage, active genes are not stripped of their accompanying histones; however, the state of the DNA in active genes is altered in some manner, compared with DNA in inactive genes. One indication of this is that the DNA of active genes is much more easily broken down by DNase I, which suggests that it is “loosened up,” and thus is more easily accessible to the enzyme.
Electron microscope studies have shown that at least in some actively transcribed sectors of DNA there are no nucleosomes to be seen (in ribosomal genes and in active units of the lampbrush chromosome). Exactly how the histones are bound to the DNA in active genes has yet to be discovered.
The concept of a “gene” was originally a functional one: a gene was presumed to be a unit of hereditary transmission, and a proof of its existence was sought by means of breeding and crossing experiments. In more recent times it was found that a gene exercises its function by making it possible for an organism to synthesize a particular enzyme or other protein. The empirical rule was established: one gene, one enzyme (or polypeptide).
This rule has certain limitations. With further progress in research, eventually a gene may be characterized not only as a functional unit but as a specific physical entity, as a unit of matter, and as an immensely complex molecule or part of a molecule, the chemical composition of which can be ascertained in every detail.
4. Gene Cloning:
It is, of course, impossible to make a chemical or physical analysis of a single molecule, so the first step in analyzing the structure of a gene must be to find a way of obtaining very large quantities of the gene-molecule. An adequate means of attaining this end was discovered in the method of gene cloning.
Since the method is fairly complicated, only the general principle, and not the technical details, will be given here. The essence of the method is the insertion of a fragment of DNA from the chromosome of a eukaryotic organism (or of a prokaryote, for that matter) into a bacterial cell in such a way that the inserted foreign DNA becomes a part of the self-reproducing apparatus of the bacterial cell.
Since bacteria can be grown in culture in any desired quantities, the fragment of eukaryotic DNA becomes multiplied in the same proportion, without undergoing any changes (hence the term “cloning,” analogous to the vegetative reproduction of organisms, during which the genotype always remains the same). After a sufficient period of time the bacteria may be destroyed, and the greatly increased quantities of the eukaryotic DNA fragment recovered.
The central problem of this method is the introduction of the DNA fragment into the bacterial cell in such a way that it becomes a part of the self-reproducing unit of the cell (a chromosome is a self-reproducing entity, but a fragment of DNA taken at random is not).
One way of solving this problem is the attachment of the eukaryotic DNA fragment to the DNA of a virus (a bacteriophage), allowing the virus to carry the fragment into the bacterial cell. The virus reproduces itself in the bacterial cell, and with it the eukaryotic DNA is also reproduced.
Perhaps a more elegant way is to make use of a peculiar structure known as a plasmid. A plasmid is a comparatively short chain of DNA occurring in some bacteria, in addition to the main bacterial chromosome. Like the bacterial chromosome, the plasmid has a ring structure and, like the chromosome, it replicates itself when the cell divides. It also carries some hereditary properties—for example, a hereditary resistance to certain antibiotics.
There may be one or more plasmids per bacterial cell. By disrupting bacteria the plasmids may be isolated and then separated from other bacterial cell components by ultracentrifugation. A fragment of the foreign DNA can then be inserted into the plasmid by breaking the ring of plasmid DNA, joining the free ends of the plasmid molecule end-to-end with the fragment of foreign DNA, and then reclosing the ring.
The final step is the reintroduction of the complex chimaeric plasmid into another bacterial cell. This is done by treating E. coli with calcium salts, upon which the cell wall becomes permeable to plasmids added to the culture of bacteria.
Only approximately one bacterial cell in a million receives an inserted chimaeric plasmid, but if the plasmid carries resistance to an antibiotic (such as tetracycline) and the bacteria are sensitive to the same antibiotic, then by cultivating the bacteria in a medium with the antibiotic the offspring of the cell that has received the plasmid can be isolated, because all the cells without the inserted plasmid will perish.
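The selection step can be pictured with a few lines of code. This is a minimal sketch, assuming the one-in-a-million uptake figure from the text; the culture size, the cell records, and the field name `plasmid_resistance` are hypothetical and serve only to illustrate the logic.

```python
def expected_transformants(cells_treated, one_in=1_000_000):
    """Expected number of cells that take up a plasmid, given that
    about one cell in `one_in` does so (figure from the text)."""
    return cells_treated // one_in

def select_on_antibiotic(cells):
    """Return the cells that survive on antibiotic medium: only those
    carrying a plasmid that confers resistance."""
    return [cell for cell in cells if cell["plasmid_resistance"]]

# Hypothetical culture of two billion treated cells
print(expected_transformants(2_000_000_000))  # 2000

# Toy culture: only cell 2 received the chimaeric plasmid
culture = [
    {"id": 1, "plasmid_resistance": False},
    {"id": 2, "plasmid_resistance": True},
    {"id": 3, "plasmid_resistance": False},
]
print([cell["id"] for cell in select_on_antibiotic(culture)])  # [2]
```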
The crucial stage of breaking the DNA molecule of the plasmid as well as of the foreign DNA in such a way that they can reunite is performed by special enzymes, the restriction endonucleases, that occur in bacteria and may be extracted from them. The final restoration of the continuity of the molecules is effected by another enzyme, DNA ligase.
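The cutting and rejoining can be sketched with strings standing for one strand of the DNA. The example assumes the restriction endonuclease EcoRI, which recognizes GAATTC and cuts between G and AATTC; the text does not name a particular enzyme, and the plasmid and foreign sequences below are invented for illustration. Representing the double helix by a single strand is a deliberate simplification.

```python
SITE = "GAATTC"   # EcoRI recognition sequence (one strand, 5'->3')
CUT_OFFSET = 1    # EcoRI cuts between G and AATTC

def digest(linear_dna):
    """Cut a linear strand at every EcoRI site and return the fragments."""
    fragments, start = [], 0
    pos = linear_dna.find(SITE)
    while pos != -1:
        fragments.append(linear_dna[start:pos + CUT_OFFSET])
        start = pos + CUT_OFFSET
        pos = linear_dna.find(SITE, pos + 1)
    fragments.append(linear_dna[start:])
    return fragments

def open_plasmid(circular_dna):
    """Break a ring that carries exactly one EcoRI site.
    Assumes the site does not span the arbitrary start of the string."""
    if circular_dna.count(SITE) != 1:
        raise ValueError("this sketch expects exactly one EcoRI site")
    cut = circular_dna.find(SITE) + CUT_OFFSET
    return circular_dna[cut:] + circular_dna[:cut]

def ligate(opened_plasmid, insert):
    """Join the insert to the opened plasmid; the result is again a ring."""
    return opened_plasmid + insert

def count_sites_circular(circular_dna):
    """Count EcoRI sites in a ring, including one spanning the join."""
    return (circular_dna + circular_dna[:len(SITE) - 1]).count(SITE)

plasmid = "TTGACAGAATTCCGGATA"          # hypothetical 18-base ring
foreign = "CCCGAATTCATGCATGGAATTCTTT"   # hypothetical foreign DNA

insert = digest(foreign)[1]             # fragment with EcoRI ends on both sides
recombinant = ligate(open_plasmid(plasmid), insert)

print(insert)                                               # AATTCATGCATGG
print(len(recombinant), count_sites_circular(recombinant))  # 31 2
```

The chimaeric ring carries two EcoRI sites, one at each junction, which is why the same enzyme can later be used to release the inserted fragment again.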
This method does not allow selection of a specific portion of the genome of a eukaryote (or other organism) for reproduction in the bacterial cell. The eukaryotic DNA is first prepared for this experiment by mechanical subdivision (shearing); it is subsequently fragmented into smaller sections by restriction endonucleases.
The DNA sequence in the resulting fragments is not readily known and cannot be predicted beforehand, except in some special cases. However, once large quantities of the same segment of the DNA are produced by cloning, these can be tested to determine if they contain any meaningful “message,” and what that message is.
If the segment contains a gene, its presence can be ascertained in a variety of ways – by production of a specific RNA by transcription, by production of a protein by the subsequent translation of the message, or by the use of the method of “hybridization”.
A further step in the elaboration of the method of cloning is the use of metazoan cells instead of bacterial cells for the reproduction of a section of DNA, e.g. of a particular gene. To introduce the DNA into a cell, the DNA is first incorporated into the genome of an animal virus; the virus carrying the exogenous DNA is then allowed to infect the cells in a tissue culture.
This experiment was carried out with the aid of the monkey virus SV40, the genome of which is in the shape of a single, circular double-stranded DNA molecule. DNA of the rabbit β-globin gene was introduced into the virus genome by chemical methods similar to those used in introducing a section of DNA into the plasmid of Escherichia coli.
The cells into which the virus entered were monkey kidney cells in tissue culture. As a result, the β-globin gene not only reproduced in the monkey kidney cells but was also transcribed into β-globin mRNA, and the latter led to the synthesis of rabbit β-globin in the cultured monkey cells.
5. “Sequencing” DNA:
Once sufficient quantities of the same segment of a DNA molecule have been obtained, there are at present means available for determining the exact sequence of the nucleotides, even if the segment is thousands of nucleotides long.
One of the methods used for “sequencing” a segment of DNA is the “plus and minus” method developed by F. Sanger and A. R. Coulson (1975). This method is based not on analysis of the unknown DNA sequence, but rather on synthesis of a complementary replica under strictly specified conditions.
To apply the method it is first necessary to have millions of identical copies of the molecule that is being investigated. The molecule is prepared in single-stranded form and is then incubated with a supply of deoxy-nucleoside-triphosphates, and with DNA polymerase, which starts building complementary copies of the DNA that is being studied, beginning with the 5′ end of the newly synthesized molecule.
When part of the complementary molecule is already assembled, further synthesis is performed in one of two different ways. In the “minus” variation only three of the four nucleotides are supplied—either the adenosine, the thymidine, the guanosine, or the cytosine nucleotide is missing.
For this reason the elongation of the complementary molecule stops at the point where the missing nucleotide is normally added. For example, if in the sample lacking thymidine the next nucleotide in the model is adenosine, the synthesis stops short of adding a thymidine. In the “plus” variation only one of the four deoxy-nucleoside-triphosphates is provided.
The enzyme used in this variation actually shortens the terminal part of the complementary chain until it reaches a nucleotide identical to the one provided in the medium. In this way the experimenter can be certain which is the last nucleotide in the chains produced in the “plus” variation.
Because of the different lengths that the complementary chain has attained before the “plus” or “minus” condition is introduced, a great variety of lengths of the complementary chains are obtained, the end nucleotides of which are known to the experimenter.
The complementary chains are now separated from the original DNA molecules, and the length of each chain is assessed by running them through gel electrophoresis: under the influence of the electric current the shorter pieces move faster, and the longer pieces more slowly.
In this way it is possible to arrange all the fragments in a consecutive order of length, and eventually to determine the sequence of the nucleotides in the complementary chain. Finally, the sequence in the original chain is obtained by reading adenosine for thymidine, guanosine for cytosine and vice versa. Only the general principles of the “plus and minus” method are described here; for technical details, the reader is referred to the papers cited above.
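The logic of the “minus” variation and of the final read-back can be illustrated in a few lines. This is a simplified sketch, not the laboratory procedure: the template is written in the order in which the polymerase copies it, the example sequence is invented, and the function names are illustrative.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Base-by-base complement (A for T, G for C and vice versa)."""
    return "".join(PAIR[base] for base in strand)

def minus_extension(template, primed_length, missing):
    """Extend a partly built complementary chain when one nucleotide
    is withheld: synthesis stops just before the first position at
    which the missing nucleotide would have to be added."""
    full_copy = complement(template)   # product of unrestricted synthesis
    end = primed_length
    while end < len(full_copy) and full_copy[end] != missing:
        end += 1
    return full_copy[:end]

template = "TACGGATC"   # hypothetical template, in copying order

# Two nucleotides are already in place; thymidine is withheld
print(minus_extension(template, 2, "T"))   # ATGCC

# Reading the complementary chain back gives the original sequence
print(complement("ATGCCTAG"))              # TACGGATC
```

The chain stops at five nucleotides, which tells the experimenter that the sixth nucleotide of the complementary chain is thymidine.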
One of the first applications of this method was the complete sequencing of the whole genome of a small phage, ɸX 174. The chromosome of this phage is a circular, single-stranded DNA molecule consisting of 5375 nucleotides representing the nine genes of this virus.
An alternative method for sequencing DNA has been proposed by A. Maxam and W. Gilbert (1977). Their method is based on using chemicals which break the DNA molecule (single-stranded or double-stranded) at specific bases. Dimethyl sulfate breaks the DNA molecule at a purine base (guanine or adenine).
The breakage does not occur at every susceptible base (this would result in very short pieces, useless for the purpose of sequencing), but only at 1 out of 50 to 100 bases, so that fragments of different lengths are produced. Guanine breaks more easily than adenine, and this characteristic is used to distinguish between the two.
Hydrazine is used to break the DNA molecule at the pyrimidine bases (cytosine and thymine). In the presence of 2 M NaCl, hydrazine action is restricted to cytosine only. The fragments produced by each chemical are then subjected to gel electrophoresis to sort them according to length, as in the Sanger-Coulson “plus and minus” method.
As it is known at what bases the chains terminate after each chemical treatment, and as the order of these terminal bases is given by the length of each chain, the sequence of the bases in the DNA is read by combining the data from the different chemical treatments.
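Combining the data from the different treatments amounts to a simple lookup. The sketch below assumes that each treatment has already been reduced to a list of base positions inferred from fragment lengths; the lane names, the example data, and the separation of guanine from adenine into two clean lists are simplifications made for illustration.

```python
def read_sequence(lanes):
    """Reconstruct a sequence from base-specific cleavage positions.

    lanes maps a treatment name to the positions it cleaves:
      "G"   : dimethyl sulfate, strong breakage (guanine)
      "A"   : dimethyl sulfate, weak breakage (adenine)
      "C+T" : hydrazine (both pyrimidines)
      "C"   : hydrazine in 2 M NaCl (cytosine only)
    Thymine positions are those cleaved in "C+T" but not in "C".
    """
    thymine = set(lanes["C+T"]) - set(lanes["C"])
    calls = {}
    for base, positions in (
        ("G", lanes["G"]),
        ("A", lanes["A"]),
        ("C", lanes["C"]),
        ("T", thymine),
    ):
        for position in positions:
            calls[position] = base
    # Ordering by position corresponds to ordering fragments by length
    return "".join(calls[position] for position in sorted(calls))

# Hypothetical cleavage data for an eight-base stretch
lanes = {"G": [1, 8], "A": [2, 7], "C+T": [3, 4, 5, 6], "C": [4, 5]}
print(read_sequence(lanes))   # GATCCTAG
```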
The Maxam and Gilbert method was used to determine the complete nucleotide sequence in the DNA of the simian virus SV40. This DNA contains 5224 base pairs, coding for 5 virus proteins. At least 15.2 per cent of the genome in this case is not translated into polypeptides; on the other hand the codes for several proteins overlap, so that the same base sequences are used for coding more than one protein.
The method for obtaining large quantities of a gene by cloning has its counterpart in another method, in which a DNA copy is prepared from a known mRNA by the use of the enzyme reverse transcriptase. In the normal, or more usual, course of events, the RNA molecules are complementary copies of the DNA model produced by the action of the RNA polymerase in the presence of a supply of RNA nucleotides.
In some viruses the hereditary material is not DNA, but RNA. When these viruses infect the host cells, the first process that occurs is the transcription of their RNA sequence into a complementary DNA sequence, by the use of an enzyme, reverse transcriptase, produced by the virus. All further syntheses are then carried out starting from the DNA developed in this way.
Using extracted and purified reverse transcriptase and a supply of DNA nucleotides, it is possible to produce DNA copies complementary to RNA molecules, in particular to mRNA molecules. The resulting DNA sequences are denoted as cDNA (complementary DNA).
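The base-pairing rule behind cDNA synthesis can be written out directly. The sketch below assumes sequences written in the conventional 5'-to-3' direction; the short mRNA is invented for illustration, and the function names are not part of any library.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_transcribe(mrna):
    """cDNA strand complementary to an mRNA, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))

def second_strand(cdna):
    """DNA strand complementary to the cDNA, written 5'->3'."""
    return "".join(DNA_PAIR[base] for base in reversed(cdna))

mrna = "AUGGCCUAA"   # hypothetical message
cdna = reverse_transcribe(mrna)

print(cdna)                  # TTAGGCCAT
print(second_strand(cdna))   # ATGGCCTAA
```

The second strand has the same sequence as the mRNA, with thymine in place of uracil.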
In theory these could be identical with the natural genes, inasmuch as the RNA used for their production has been previously “copied” from the original genes. In practice, however, the cDNA need not be, and in fact is not, identical with the original gene.
Producing copies of genes (if not always accurate ones) starting from the corresponding mRNAs has an advantage over the direct isolation of DNA genes, because in functioning cells certain types of mRNA occur in many more copies than the corresponding genes.
In cells of the oviducts of laying hens the mRNA coding for ovalbumin (the main component of the egg white) may be present in more than 10,000 copies per cell, while the corresponding gene exists apparently in only two copies per cell (one copy per haploid genome). It is therefore much more practicable to extract and purify the mRNA by the use of purely chemical methods.