Microbiology Reader
Equipment to run microbiology work automatically

Growth Curves of any strain.
Microbiological calculations.


What Is Genetics?

Genetics is the science of genes, heredity, and the variation of organisms. Humans began applying knowledge of genetics in prehistory with the domestication and breeding of plants and animals. In modern research, genetics provides important tools in the investigation of the function of a particular gene, e.g. analysis of genetic interactions. Within organisms, genetic information generally is carried in chromosomes, where it is represented in the chemical structure of particular DNA molecules.

Genes encode the information necessary for synthesizing proteins, which in turn play a large role in influencing the final phenotype of the organism, although in many instances they do not completely determine it. The phrase "to code for" is often used to mean that a gene contains the instructions for building a particular protein, as in "the gene codes for the protein". Note that the "one gene, one protein" concept is now known to be simplistic. For example, a single gene may produce multiple products, depending on how its transcription is regulated.
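As a rough illustration of what "coding for" a protein means, the sketch below reads a short DNA coding sequence three bases at a time and looks each triplet up in a small table. The table holds only five entries from the standard genetic code, and the example sequence and the function name translate are made up for this illustration. It ignores everything a real cell does between gene and protein (transcription, splicing, regulation).

```python
# Partial codon table (standard genetic code); only the codons used below.
CODON_TABLE = {
    "ATG": "M",  # methionine, also the usual start codon
    "GCC": "A",  # alanine
    "AAA": "K",  # lysine
    "TGG": "W",  # tryptophan
    "TAA": "*",  # stop codon
}


def translate(coding_dna):
    """Translate a coding-strand DNA sequence codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        codon = coding_dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons not in the partial table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical five-codon "gene": ATG GCC AAA TGG TAA
example_gene = "ATGGCCAAATGGTAA"
print(translate(example_gene))
```

A full implementation would need all 64 codons; the partial table is kept short so the mapping from triplets to amino acids is easy to follow.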

Areas of genetics

Classical genetics

Classical genetics consists of the techniques and methodologies of genetics that predate the advent of molecular biology. After the discovery of the genetic code and of cloning tools such as restriction enzymes, the avenues of investigation open to geneticists were greatly broadened. Some classical genetic ideas have been supplanted by the mechanistic understanding brought by molecular discoveries, but many remain intact and in use, such as Mendel's laws.

Molecular genetics

Molecular genetics builds upon the foundation of classical genetics but focuses on the structure and function of genes at a molecular level. Molecular genetics employs the methods of both classical genetics (such as hybridization) and molecular biology. It is so called to differentiate it from other subfields of genetics such as ecological genetics and population genetics. An important area within molecular genetics is the use of molecular information to determine patterns of descent, and therefore the correct scientific classification of organisms: this is called molecular systematics. The study of inherited features not strictly associated with changes in the DNA sequence is called epigenetics.

Some take the view that life can be defined, in molecular terms, as the set of strategies which RNA polynucleotides have used and continue to use to perpetuate themselves. This definition grows out of work on the origin of life, specifically the RNA world hypothesis.

Population, quantitative and ecological genetics

Population, quantitative and ecological genetics are closely related subfields that also build upon classical genetics (supplemented with modern molecular genetics). They share a common theme of studying populations of organisms drawn from nature but differ somewhat in the aspect of the organism on which they focus. The foundational discipline is population genetics, which studies the distribution of and change in allele frequencies under the influence of the four evolutionary forces: natural selection, genetic drift, mutation and migration. It is the theory that attempts to explain such phenomena as adaptation and speciation.
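To make "change in allele frequencies" concrete, here is a minimal sketch of two of the four forces acting on a single gene with two alleles, A and B. It assumes a haploid population of constant size with non-overlapping generations (a Wright-Fisher-style model). Mutation and migration are left out, and the function names, fitness values and population size are illustrative choices, not taken from the text above.

```python
import random


def select(p, fitness_a, fitness_b):
    """Deterministic change in the frequency p of allele A under haploid selection."""
    mean_fitness = p * fitness_a + (1 - p) * fitness_b
    return p * fitness_a / mean_fitness


def drift(p, population_size, rng):
    """Genetic drift: draw the next generation's gene copies at random."""
    copies_of_a = sum(1 for _ in range(population_size) if rng.random() < p)
    return copies_of_a / population_size


def simulate(p0, generations, population_size, fitness_a=1.0, fitness_b=1.0, seed=0):
    """Return the frequency of allele A in each generation, starting from p0."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    p = p0
    trajectory = [p]
    for _ in range(generations):
        p = select(p, fitness_a, fitness_b)
        p = drift(p, population_size, rng)
        trajectory.append(p)
    return trajectory


# Allele A has an assumed 5% fitness advantage in a population of 200.
frequencies = simulate(p0=0.1, generations=50, population_size=200, fitness_a=1.05)
print(frequencies[-1])
```

With equal fitness values the trajectory wanders through drift alone; in small populations an allele can be lost or fixed purely by chance, which is why population size is an explicit parameter here.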

The related subfield of quantitative genetics, which builds on population genetics, aims to predict the response to selection given data on the phenotype and relationships of individuals. A more recent development of quantitative genetics is the analysis of quantitative trait loci. Traits that are under the influence of a large number of genes are known as quantitative traits, and their mapping to a location on the chromosome requires accurate phenotypic, pedigree and marker data from a large number of related individuals.
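One standard textbook way to predict a response to selection is the breeder's equation, R = h^2 * S, where S is the selection differential (mean of the selected parents minus the population mean) and h^2 is the narrow-sense heritability. The text above does not name this equation, so the sketch below is one common formulation rather than the method the text has in mind, and the numbers are invented for illustration.

```python
def selection_differential(population_mean, selected_parents_mean):
    """S: how far the chosen parents deviate from the population mean."""
    return selected_parents_mean - population_mean


def response_to_selection(heritability, differential):
    """Breeder's equation R = h^2 * S: expected shift of the offspring mean."""
    if not 0.0 <= heritability <= 1.0:
        raise ValueError("narrow-sense heritability must lie between 0 and 1")
    return heritability * differential


# Invented example: trait mean 100 units, selected parents average 110,
# and an assumed heritability of 0.4.
s = selection_differential(population_mean=100.0, selected_parents_mean=110.0)
r = response_to_selection(heritability=0.4, differential=s)
print("expected offspring mean:", 100.0 + r)
```

In practice the heritability itself has to be estimated from phenotype and pedigree data, which is where the relationships between individuals enter.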

Ecological genetics again builds upon the basic principles of population genetics but is more explicitly focused on ecological issues. While molecular genetics studies the structure and function of genes at a molecular level, ecological genetics focuses on wild populations of organisms, and attempts to collect data on the ecological aspects of individuals as well as molecular markers from those individuals.

Genomics

A more recent development is the rise of genomics, which attempts to study large-scale genetic patterns across the genome (in principle, all the DNA) of a given species.

Closely related fields

The science which grew out of the union of biochemistry and genetics is widely known as molecular biology. The term "genetics" is often conflated with the notion of genetic engineering, where the DNA of an organism is modified for some practical end, but most research in genetics is aimed at understanding and explaining the effect of genes on phenotypes and the role of genes in populations (see population genetics and ecological genetics), rather than at genetic engineering.

History

It was not until 1865 that Gregor Mendel first traced inheritance patterns of certain traits in pea plants and showed that they obeyed simple statistical rules. Although not all features show these patterns of Mendelian inheritance, his work served as proof that applying statistics to inheritance could be highly useful. Since that time many more complex forms of inheritance have been demonstrated.

From his statistical analysis Mendel defined a concept that he described as an allele, which was the fundamental unit of heredity. The term allele as Mendel used it is nearly synonymous with the term gene, whilst the term allele now means a specific variant of a particular gene.

The significance of Mendel's work was not understood until early in the twentieth century, after his death, when his research was re-discovered by other scientists working on similar problems.

Mendel was unaware of the physical nature of the gene. We now know that genetic information is normally carried on DNA. (Certain viruses store their genetic information in RNA.) Manipulation of DNA can in turn alter the inheritance and features of various organisms.

Timeline of notable discoveries:

1859 Charles Darwin publishes The Origin of Species

1865 Gregor Mendel's paper, Experiments on Plant Hybridization

1903 Chromosomes are discovered to be hereditary units

1905 British biologist William Bateson coins the term "genetics" in a letter to Adam Sedgwick

1910 Thomas Hunt Morgan shows that genes reside on chromosomes

1913 Gene maps show chromosomes containing linearly arranged genes

1918 Ronald Fisher publishes "On the correlation between relatives on the supposition of Mendelian inheritance" - the modern synthesis starts.

1927 Physical changes in genes are called mutations

1928 Frederick Griffith discovers a hereditary molecule that is transmissible between bacteria (see Griffith's experiment)

1931 Crossing over is the cause of recombination

1941 Edward Lawrie Tatum and George Wells Beadle show that genes code for proteins; see the original central dogma of genetics

1944 Oswald Theodore Avery, Colin MacLeod and Maclyn McCarty isolate DNA as the genetic material (at that time called the transforming principle)

1950 Erwin Chargaff shows that the four nucleotides are not present in nucleic acids in stable proportions, but that some general rules appear to hold (e.g., that the amount of adenine, A, tends to be equal to that of thymine, T).

1952 The Hershey-Chase experiment proves the genetic information of phages (and all other organisms) to be DNA

1953 DNA structure is resolved to be a double helix by James D. Watson and Francis Crick

1958 The Meselson-Stahl experiment demonstrates that DNA is semiconservatively replicated

1961 The genetic code is arranged in triplets

1977 DNA is sequenced

1995 First genome of a free-living organism (Haemophilus influenzae) sequenced

2001 First draft sequences of the human genome are released simultaneously by the Human Genome Project and Celera Genomics.

2003 (14 April) Successful completion of Human Genome Project with 99% of the genome sequenced to a 99.99% accuracy

J Bacteriol, 1998 Jul, 180(14), 3563 - 9
Proposed signal transduction role for conserved CheY residue Thr87, a member of the response regulator active-site quintet; Appleby JL et al.; CheY serves as a structural prototype for the response regulator proteins of two-component regulatory systems. Functional roles have previously been defined for four of the five highly conserved residues that form the response regulator active site, the exception being the hydroxy amino acid which corresponds to Thr87 in CheY. To investigate the contribution of Thr87 to signaling, we characterized, genetically and biochemically, several cheY mutants with amino acid substitutions at this position. The hydroxyl group appears to be necessary for effective chemotaxis, as a Thr-->Ser substitution was the only one of six tested which retained a Che+ swarm phenotype. Although nonchemotactic, cheY mutants with amino acid substitutions T87A and T87C could generate clockwise flagellar rotation either in the absence of CheZ, a protein that stimulates dephosphorylation of CheY, or when paired with a second site-activating mutation, Asp13-->Lys, demonstrating that a hydroxy amino acid at position 87 is not essential for activation of the flagellar switch. All purified mutant proteins examined phosphorylated efficiently from the CheA kinase in vitro but were impaired in autodephosphorylation. Thus, the mutant CheY proteins are phosphorylated to a greater degree than wild-type CheY yet support less clockwise flagellar rotation. The data imply that Thr87 is important for generating and/or stabilizing the phosphorylation-induced conformational change in CheY. Furthermore, the various position 87 substitutions differentially affected several properties of the mutant proteins. The chemotaxis and autodephosphorylation defects were tightly linked, suggesting common structural elements, whereas the effects on self-catalyzed and CheZ-mediated dephosphorylation of CheY were uncorrelated, suggesting different structural requirements for the two dephosphorylation reactions.

Biochem J, 1998 Jul 15, 333 ( Pt 2), 433 - 8
Identification of a domain in apolipoprotein B-100 that inhibits the procoagulant activity of tissue factor; Ettelaie C et al.; The ability of low-density lipoprotein (LDL) to inhibit the procoagulant activity of tissue factor is mediated by a direct protein-protein interaction involving apolipoprotein (apo) B-100. A lysine-rich sequence within apo B-100 (residues 3121-3217), which we have termed lysine-rich apo B-100-derived (KRAD)-98 peptide, may be responsible for its activity. Within this region, residues 3147-3160 (KRAD-14) contain an exceptionally high proportion of positive amino acids. Both recombinant KRAD-98 and KRAD-14 peptides inhibited the procoagulant activity of tissue factor by preventing the activation of factor VII. KRAD-14 also inhibited the prothrombinase components, factors Xa and V. In comparison with the parent protein (apo B-100), KRAD-14 peptide displayed a 20-fold enhancement in the rate of inhibition, whereas KRAD-98 peptide exhibited a rate closer to that of apo B-100. Mutational analysis of KRAD-14 peptide revealed three adjacent amino acids, alteration of which greatly reduced the inhibitory potential of this peptide. A peptide derived from tissue factor (residues 58-66) was found to act co-operatively with tissue factor itself, but also augmented the inhibition of tissue-factor activity by apo B-100. In conclusion, LDL may be a physiological regulator of haemostatic mechanisms through the interactions of lysine-rich domains of apo B-100 with tissue factor.

Biochem J, 1998 Jul 15, 333 ( Pt 2), 425 - 31
Cloning and thermostability of TaqI endonuclease isoschizomers from Thermus species SM32 and Thermus filiformis Tok6A1; Cao W et al.; Two TaqI endonuclease (hereafter referred to as TaqI) isoschizomer genes, tsp32IR from Thermus species SM32 of Azores and tfiTok6A1I from T. filiformis Tok6A1 of New Zealand, were cloned in Escherichia coli. The overexpressed enzymes were partly purified and their thermostability was determined. In the medium-salt buffer, Tsp32IR, TfiTok6A1I and one previously cloned TaqI isoschizomer (TthHB8I) were more thermostable than TaqI. Tsp32IR remained partly active up to 90 degrees C in the low-salt buffer. Six amino acid residues that are identical in the three high thermostability isoschizomers (Tsp32IR, TfiTok6A1I and TthHB8I) but differ in TaqI might provide added rigidity for thermostabilization. These include four proline residues located in or near loop regions, and one alanine and one arginine located at helix regions in the predicted TaqI endonuclease secondary structure. The possible role of these residues in thermostabilization was evaluated by mutagenizing the TaqI enzyme. Mutants generated at these six positions were less thermostable than wild-type TaqI. The results suggest that the surrounding sequence or structural context might be as important as the mutation itself.

Biochem J, 1998 Jul 15, 333 ( Pt 2), 367 - 72
Critical role of Arg433 in rat transketolase activity as probed by site-directed mutagenesis; Soh Y et al.; It has been shown that one arginine per monomer at an unknown position is essential for enzyme activity of the homodimeric transketolase (TK) {Kremer, Egan and Sable (1980) J. Biol. Chem. 255, 2405-2410}. To identify the critical arginine, four highly conserved arginine residues of rat TK (Arg102, Arg350, Arg433 and Arg506) were replaced with alanine by site-directed mutagenesis. Wild-type and mutant TK proteins were produced in Escherichia coli and characterized. The Arg102-->Ala mutant exhibited similar catalytic activity to the wild-type enzyme, whereas Arg350-->Ala, Arg506-->Ala and Arg433-->Ala mutants exhibited 36.7, 37.0 and 6.1% of the wild-type activity respectively. Three recombinant proteins (wild-type, Arg350-->Ala and Arg433-->Ala) were purified to apparent homogeneity using Ni2+-affinity chromatography and further characterized. All these proteins were able to form homodimers (148 kDa), as shown by immunoblot analysis subsequent to non-denaturing gel electrophoresis. The Arg433-->Ala mutant protein was less stable than the wild-type and Arg350-->Ala proteins at 55 degrees C. Kinetic analyses revealed that both Vmax and Km values were markedly affected in the Arg433-->Ala mutant. The Km values for two substrates xylulose 5-phosphate and ribose 5-phosphate were 11.5- and 24.3-fold higher respectively. The kcat/Km values of the Arg433-->Ala mutant for the two substrates were less than 1% of those of the wild-type protein. Molecular modelling of the rat TK revealed that Arg433 of one monomer has three potential hydrogen-bond interactions with the catalytically important highly conserved loop of the other monomer. Thus, our biochemical analyses and modelling data suggest the critical role of the previously uncharacterized Arg433 in TK activity.

Biochem J, 1998 Jul 15, 333 ( Pt 2), 317 - 25
Sequence, catalytic properties and expression of chicken glutathione-dependent prostaglandin D2 synthase, a novel class Sigma glutathione S-transferase; Thomson AM et al.; The Expressed Sequence Tag database has been screened for cDNA clones encoding prostaglandin D2 synthases (PGDSs) by using a BLAST search with the N-terminal amino acid sequence of rat GSH-dependent PGDS, a class Sigma glutathione S-transferase (GST). This resulted in the identification of a cDNA from chicken spleen containing an insert of approx. 950 bp that encodes a protein of 199 amino acid residues with a predicted molecular mass of 22732 Da. The deduced primary structure of the chicken protein was not only found to possess 70% sequence identity with rat PGDS but it also demonstrated more than 35% identity with class Sigma GSTs from a range of invertebrates. The open reading frame of the chicken cDNA was expressed in Escherichia coli and the purified protein was found to display high PGDS activity. It also catalysed the conjugation of glutathione with a wide range of aryl halides, organic isothiocyanates and alpha,beta-unsaturated carbonyls, and exhibited glutathione peroxidase activity towards cumene hydroperoxide. Like other GSTs, chicken PGDS was found to be inhibited by non-substrate ligands such as Cibacron Blue, haematin and organotin compounds. Western blotting experiments showed that among the organs studied, the expression of PGDS in the female chicken is highest in liver, kidney and intestine, with only small amounts of the enzyme being found in chicken spleen; in contrast, the rat has highest levels of PGDS in the spleen. Collectively, these results show that the structure and function, but not the expression, of the GSH-requiring PGDS is conserved between chicken and rat.

Biochem J, 1998 Jul 15, 333 ( Pt 2), 233 - 42
Chaperonins; Ranson NA et al.; The molecular chaperones are a diverse set of protein families required for the correct folding, transport and degradation of other proteins in vivo. There has been great progress in understanding the structure and mechanism of action of the chaperonin family, exemplified by Escherichia coli GroEL. The chaperonins are large, double-ring oligomeric proteins that act as containers for the folding of other protein subunits. Together with its co-protein GroES, GroEL binds non-native polypeptides and facilitates their refolding in an ATP-dependent manner. The action of the ATPase cycle causes the substrate-binding surface of GroEL to alternate in character between hydrophobic (binding/unfolding) and hydrophilic (release/folding). ATP binding initiates a series of dramatic conformational changes that bury the substrate-binding sites, lowering the affinity for non-native polypeptide. In the presence of ATP, GroES binds to GroEL, forming a large chamber that encapsulates substrate proteins for folding. For proteins whose folding is absolutely dependent on the full GroE system, ATP binding (but not hydrolysis) in the encapsulating ring is needed to initiate protein folding. Similarly, ATP binding, but not hydrolysis, in the opposite GroEL ring is needed to release GroES, thus opening the chamber. If the released substrate protein is still not correctly folded, it will go through another round of interaction with GroEL.

Virology, 1998 Jul 5, 246(2), 409 - 17
Recombinant dengue virus type 1 NS3 protein exhibits specific viral RNA binding and NTPase activity regulated by the NS5 protein; Cui T et al.; The full-length dengue virus NS3 protein has been successfully expressed as a 94-kDa GST fusion protein in Escherichia coli. Treatment of the purified fusion protein with thrombin released a 68-kDa protein which is the expected molecular mass for the DEN1 NS3 protein. The identity of this protein was confirmed by Western blotting using dengue virus antisera. Two related activities of the recombinant NS3 protein were characterized, which were the binding of the protein to the 3'-noncoding region of the dengue virus RNA genome and NTPase activity. We demonstrated using a band shift assay that the DEN1 NS3 protein could form a complex with the stem-loop structure in the 3'-noncoding region (3'-NCR), although sites outside the stem-loop may also participate in binding. Using various unlabeled homopolymeric and heteropolymeric RNAs as competitors for binding, it was further shown that the DEN1 NS3 protein exhibits preferential binding to a 94-nt RNA transcript from the 3'-NCR of the dengue virus. The NTPase activity of the recombinant DEN1 NS3 protein was characterized using a thin-layer chromatography assay. We found that the DEN1 NS3 protein possesses some aspects of NTPase activity, which are distinct from those found in other flaviviruses. Although the NS3 protein was able to utilize all four ribonucleoside triphosphates as its substrates, the NS3 protein showed a distinct preference for purine triphosphates (i.e., ATP and GTP). The addition of poly(U) did not stimulate NTPase activity in DEN1 NS3 protein, which contrasts with the reports for other flaviviral NS3 proteins. However, NTPase activity was specifically stimulated by the viral NS5 protein, which was manifested by a more than twofold increase in the rate of ATP hydrolysis and a 25% increase in the yield of ADP at the end of a 120-min reaction. These data suggest that the NTPase activity of the NS3 protein may be regulated by the viral NS5 protein during virus replication.

Anal Biochem, 1998 Jul 1, 260(2), 173 - 82
Enzyme-complemented activatorsorbent assay (ECASA): genetic engineering for enzyme-linked immunosorbent assay-type mercuric ion detection; Klein J et al.; The sensor component of bacterial mercury resistance systems is the metalloregulatory protein MerR, which has nanomolar sensitivity and high selectivity for Hg(II). A fusion protein of MerR and the alpha-peptide part of beta-galactosidase (LacZalpha) was constructed by fusing the relevant genes. The protein exhibited both MerR functions and alpha-complementing activity to the inactive LacZDeltaM15 (M15) protein. The bifunctional character of the appropriate MerR-LacZalpha-complemented M15 protein (MerR-LacZalpha:M15 protein complex) was used to develop a Hg(II)-specific enzyme-complemented activatorsorbent assay. Hg(II) was immobilized and presented on a matrix taking advantage of the high affinity of Hg(II) to SH residues. The immobilized Hg(II) could be specifically detected down to the parts-per-billion level by quantifying the beta-galactosidase activity of the bound fusion protein complex.

Carcinogenesis, 1998 May, 19(5), 951 - 3
Molecular characterization of ST1C1-related human sulfotransferase; Yoshinari K et al.; Carcinogenic arylamines such as N-hydroxy-2-acetylaminofluorene (N-OH-AAF) are metabolically activated by mammalian sulfotransferases to form N-hydroxyarylamine O-sulfates. We previously showed that rat ST1C1 efficiently mediates these activations. These reactions occur in liver cytosols of humans as well as rats. However, the enzyme responsible for N-OH-AAF activation has not been identified in humans. In the present study, a human cDNA (ST1C2) encoding a sulfotransferase showing a high similarity with ST1C1 has been isolated from a human fetal liver cDNA library and expressed using a bacterial expression system. A clear difference was observed in the pH optima for p-nitrophenol sulfation between ST1C2 and ST1C1 expressed in Escherichia coli. In addition, ST1C2 did not mediate 3'-phosphoadenosine-5'-phosphosulfate-dependent DNA binding of N-OH-AAF. These results suggest that human ST1C2, in spite of its structural similarity to rat ST1C1, has a clearly different substrate specificity.

Mutat Res, 1998 Mar 13, 399(1), 55 - 64
Transgenic nematodes as biomonitors of microwave-induced stress; Daniells C et al.; Transgenic nematodes (Caenorhabditis elegans strain PC72), carrying a stress-inducible reporter gene (Escherichia coli beta-galactosidase) under the control of a C. elegans hsp16 heat-shock promoter, have been used to monitor toxicant responses both in water and soil. Because these transgenic nematodes respond both to heat and toxic chemicals by synthesising an easily detectable reporter product, they afford a useful preliminary screen for stress responses (whether thermal or non-thermal) induced by microwave radiation or other electromagnetic fields. We have used a transverse electromagnetic (TEM) cell fed from one end by a source and terminated at the other end by a matched load. Most studies were conducted using a frequency of 750 MHz, at a nominal power setting of 27 dBm. The TEM cell was held in an incubator at 25 degrees C inside a shielded room; corresponding controls were shielded and placed in the same 25 degrees C incubator; additional baseline controls were held at 15 degrees C (worm growth temperature). Stress responses were measured in terms of beta-galactosidase (reporter) induction above control levels. The time-course of response to continuous microwave radiation showed significant differences from 25 degrees C controls both at 2 and 16 h, but not at 4 or 8 h. Using a 5 x 5 multiwell plate array exposed for 2 h, the 25 microwaved samples showed highly significant responses compared with a similar control array. The wells most strongly affected were those in the rows closest to the source, whereas the most distant row did not rise above control levels, suggesting a shadow effect. These differential responses are difficult to reconcile with general heating effects, although localised power absorption affords a possible explanation. Experiments in which the frequency and/or power settings were varied suggested a greater response at 21 than at 27 dBm, both at 750 and 300 MHz, although extremely variable responses were observed at 24 dBm and 750 MHz. Thus, lower power levels tended, if anything, to induce larger responses (with the above-mentioned exception), which is opposite to the trend anticipated for any simple heating effect. These results are reproducible and data acquisition is both rapid and simple. The evidence accrued to date suggests that microwave radiation causes measurable stress to transgenic nematodes, presumably reflecting increased levels of protein damage within cells (the common signal thought to trigger hsp gene induction). The response levels observed are comparable to those observed with moderate concentrations (ppm) of metal ions such as Zn2+ and Cu2+. We conclude that this approach deserves further and more detailed investigation, but that it has already demonstrated clear biological effects of microwave radiation in terms of the activation of cellular stress responses (hsp gene induction).

Blood, 1998 Jul 15, 92(2), 672 - 82
Cytosine deaminase adenoviral vector and 5-fluorocytosine selectively reduce breast cancer cells 1 million-fold when they contaminate hematopoietic cells: a potential purging method for autologous transplantation; Garcia-Sanchez F et al.; Ad.CMV-CD is a replication incompetent adenoviral vector carrying a cytomegalovirus (CMV)-driven transcription unit of the cytosine deaminase (CD) gene. The CD transcription unit in this vector catalyzes the deamination of the nontoxic pro-drug, 5-fluorocytosine (5-FC), thus converting it to the cytotoxic drug 5-fluorouracil (5-FU). This adenoviral vector prodrug activation system has been proposed for use in selectively sensitizing breast cancer cells, which may contaminate collections of autologous stem cell products from breast cancer patients, to the toxic effects of 5-FC, without damaging the reconstitutive capability of the normal hematopoietic cells. This system could conceivably kill even the nondividing breast cancer cells, because the levels of 5-FU generated by this system are 10 to 30 times that associated with systemic administration of 5-FU. The incorporation of 5-FU into mRNA at these high levels is sufficient to disrupt mRNA processing and protein synthesis so that even nondividing cells die of protein starvation. To test if the CD adenoviral vector sensitizes breast cancer cells to 5-FC, we exposed primary explants of normal human mammary epithelial cells (HMECs) and the established breast cancer cell (BCC) lines MCF-7 and MDA-MB-453 to the Ad.CMV-CD for 90 minutes. This produced a 100-fold sensitization of these epithelial cells to the effects of 48 hours of exposure to 5-FC. We next tested the selectivity of this system for BCC. When peripheral blood mononuclear cells (PBMCs), collected from cancer patients during the recovery phase from conventional dose chemotherapy-induced myelosuppression, were exposed to the Ad.CMV-CD for 90 minutes in serum-free conditions, little or no detectable conversion of 5-FC into 5-FU was seen even after 48 hours of exposure to high doses of 5-FC. In contrast, 70% of 5-FC was converted into the cytotoxic agent 5-FU when MCF-7 breast cancer cells (BCCs) were exposed to the same Ad.CMV-CD vector followed by 5-FC for 48 hours. All of the BCC lines tested were shown to be sensitive to infection by adenoviral vectors when exposed to a recombinant adenoviral vector containing the reporter gene beta-galactosidase (Ad.CMV-betagal). In contrast, less than 1% of the CD34-selected cells and their more immature subsets, such as the CD34+CD38- or CD34(+)CD33- subpopulations, were positive for infection by the Ad.CMV-betagal vector, as judged by fluorescence-activated cell sorting (FACS) analysis, when exposed to the adenoviral vector under conditions that did not commit the early hematopoietic precursor cells to maturation. When artificial mixtures of hematopoietic cells and BCCs were exposed for 90 minutes to the Ad.CMV-CD vector and to 5-FC for 10 days or more, a greater than 1 million fold reduction in the number of BCCs, as measured by colony-limiting dilution assays, was observed. To test if the conditions were damaging for the hematopoietic reconstituting cells, marrow cells collected from 5-FU-treated male donor mice were incubated with the cytosine deaminase adenoviral vector and then exposed to 5-FC either for 4 days in vitro before transplantation or for 14 days immediately after transplantation in vivo. There was no significant decrease in the reconstituting capability of the male marrow cells, as measured by their persistence in female irradiated recipients for up to 6 months after transplantation. These observations suggest that adenovirus-mediated gene transfer of the Escherichia coli cytosine deaminase gene followed by exposure to the nontoxic pro-drug 5-FC may be a potential strategy to selectively reduce the level of contaminating BCCs in collections of hematopoietic cells used for autografts in breast cancer patients.

Biochemistry, 1998 Jul 7, 37(27), 9836 - 42
The terminal adenosine of tRNA(Gln) mediates tRNA-dependent amino acid recognition by glutaminyl-tRNA synthetase; Liu J et al.; Sequence-specific interactions between Escherichia coli glutaminyl-tRNA synthetase and tRNA(Gln) have been shown to determine the apparent affinity of the enzyme for its cognate amino acid glutamine during aminoacylation. Specifically, structural and biochemical studies suggested that residues Asp66, Tyr211, and Phe233 in glutaminyl-tRNA synthetase could potentially facilitate cognate amino acid recognition through their specific interactions with both A76 of tRNA(Gln) and glutamine. These residues were randomly mutated and the resulting glutaminyl-tRNA synthetase variants were screened in vivo for changes in their ability to recognize noncognate tRNAs and retention of tRNA-glutaminylation activity. When the variants selected in this way were characterized in vitro, they all showed dramatic decreases in apparent affinity (KM) for glutamine but little or no change in cognate tRNA affinity. Conservative replacements such as Y211F, F233L, and D66E resulted in 60-, 19-, and 18-fold increases compared to wild-type in the KM for glutamine, respectively, but had little effect on the turnover number (kcat). Nonconservative replacements affected both KM for glutamine and kcat; Y211S, F233D, and D66F displayed 1700-, 3700-, and 1200-fold decreases in kcat/KM for glutamine compared to wild-type. Double mutant cycle analysis indicated that Tyr211 and Phe233 interact strongly to enhance glutamine binding. These data now show that Asp66, Tyr211 and Phe233 mediate tRNA-dependent cognate amino acid recognition via the invariant 3'-terminal adenosine of tRNA(Gln).

Biochemistry, 1998 Jul 7, 37(27), 9768 - 75
Incorrect folding of steroidogenic acute regulatory protein (StAR) in congenital lipoid adrenal hyperplasia; Bose HS et al.; Steroidogenic acute regulatory protein (StAR) rapidly stimulates the movement of cholesterol into adrenal and gonadal mitochondria to mediate the acute steroidogenic response; StAR mutations cause potentially lethal congenital lipoid adrenal hyperplasia (lipoid CAH). Bacterially expressed wild-type StAR and four amino acid replacement/deletion mutants that cause lipoid CAH were purified to apparent homogeneity. Sedimentation equilibrium ultracentrifugation showed that all five proteins were monomeric and fit a globular protein model of the correct molecular mass. Circular dichroism (CD) spectra of both the wild-type and mutants showed minima near 208 and 222 nm, confirming the presence of substantial alpha-helical structure. However, subtle differences in the CD signals of the wild-type and mutants in the far-UV and stronger differences in near-UV indicated differences in protein folding. The amide I and II bands in the 1400-1700 cm-1 region of Fourier transform infrared spectra showed that the proteins fell into two groups. The wild-type and a partially active conservative mutant were predominantly alpha-helical with some intramolecular beta-sheet. By contrast, three mutants that lost charged residues retained much of their alpha-helical structure, but also tended to form intermolecular beta-sheets. Urea at 2.0 or 4.0 M had less effect on the CD spectrum of the wild-type than of the mutants, particularly those having lost a charged residue; 50 mM guanidinium hydrochloride did not alter the CD spectrum of the wild-type, but elicited dramatic changes to the secondary structure in all four mutants. Despite this, thermal melting curves of the mutant proteins in 50 mM guanidinium hydrochloride showed surprising stability, even exceeding that of the wild-type protein. These data suggest that the StAR amino acid replacement mutants that cause lipoid CAH are inactive because of fairly gross errors in protein folding, probably due to the loss of salt bridges that stabilize the tertiary structure.

Biochemistry, 1998 Jul 7, 37(27), 9688 - 94
Temperature-controlled activity of DnaK-DnaJ-GrpE chaperones: protein-folding arrest and recovery during and after heat shock depends on the substrate protein and the GrpE concentration; Diamant S et al.; Heat-shock proteins DnaK, DnaJ, and GrpE (KJE) from Escherichia coli constitute a three-component chaperone system that prevents aggregation of denatured proteins and assists the refolding of proteins in an ATP-dependent manner. We found that the rate of KJE-mediated refolding of heat- and chemically denatured proteins is decreased at high temperatures. The efficiency and reversibility of protein-folding arrest during and after heat shock depended on the stability of the complex between KJE and the denatured proteins. Whereas a thermostable protein was released and partially refolded during heat shock, a thermolabile protein remained bound to the chaperone. The apparent affinity of GrpE and DnaJ for DnaK was decreased at high temperatures, thereby decreasing futile consumption of ATP during folding arrest. The coupling of ATP hydrolysis and protein folding was restored after the stress. This strongly indicates that KJE chaperones are heat-regulated heat-shock proteins which can specifically arrest the folding of aggregation-prone proteins during stress and preferentially resume refolding under conditions that allow individual proteins to reach and maintain a stable native conformation.

Poult Sci, 1998 Jul, 77(7), 956 - 62
Humoral immune response impairment following excess vitamin E nutrition in the chick and turkey; Friedman A et al.; The effect of high dietary intakes of vitamin E on antibody production was investigated in chicks and turkeys. Chicks were fed four diets with 0, 10, 30, and 150 mg/kg added vitamin E and turkeys were fed three diets with 0, 50, and 150 mg/kg added vitamin E. Antibodies produced in response to naturally occurring Escherichia coli and to Newcastle disease virus and turkey pox vaccines were determined. In chicks, antibody production in response to E. coli and Newcastle disease was affected by vitamin E nutrition: significantly higher responses were measured in chicks that received 0 and 10 mg/kg added vitamin E, whereas in chicks receiving 30 and 150 mg/kg, antibody production was significantly lower. In turkeys, concentrations of circulating antibodies to Newcastle disease virus and to turkey pox were also influenced by dietary vitamin E: antibody titers to Newcastle disease and turkey pox vaccines were highest in groups receiving 0 mg/kg added vitamin E, whereas titers in groups receiving 150 mg/kg were significantly lower. Responses of groups receiving 50 mg/kg added vitamin E were slightly lower than groups receiving 0 mg/kg, though not significantly so in most cases. These results indicate that humoral immune responses are directly affected by vitamin E, and that excessive vitamin E intake has a detrimental effect on antibody production in chickens and turkeys.

FEBS Lett, 1998 Jun 5, 429(1), 104 - 8
Recombinant expression, purification and characterization of Kch, a putative Escherichia coli potassium channel protein; Voges D et al.; The Escherichia coli gene kch, similar in primary structure to eukaryotic voltage-gated potassium channels, was cloned and overexpressed in E. coli. The protein was solubilized from the plasma membrane with dodecylmaltopyranoside, lauryldimethylamine oxide or N-laurylsarcosine and was purified in milligram amounts by imidazole elution from a nickel-chelate column. The molecular mass of the purified protein in a number of detergents with 12 carbon atom chains suggests that rKch forms primarily tetramers of the 50 kDa monomers. CD spectroscopy of the purified protein indicates a significant alpha-helical content that is preserved upon addition of SDS.

FEBS Lett, 1998 Jun 5, 429(1), 21 - 6
Structural and dynamic helix geometry alterations induced by mismatch base pairs in double-helical RNA; Vogtherr M et al.; A ribooligonucleotide microhelix derived from the acceptor stem of Escherichia coli tRNA(Ala) having a C3-A70 mismatch in place of the G3-U70 wobble pair in the wild-type tRNA(Ala), and a sequence variant with a regular U3-A70 base pair have been investigated by NMR. In vivo, suppressor tRNA(Ala) variants with C3-A70 (as well as several other) mismatch pairs are substrates for alanyl-tRNA synthetase (ARS), supporting the hypothesis of an 'indirect' recognition of the identity element 3-70 mismatch pair via structural modifications caused by the mispair in comparison to canonical A-RNA helices. It is demonstrated that the C-A mismatch likewise induces helix geometry alterations, in particular with respect to base stacking in the vicinity of the mismatch. However, with reference to the 'wild-type' G3-U70 microhelix, destacking in the C3-A70 acceptor stem duplex occurs in the opposite direction from the mismatch pair. Therefore it is concluded that the locally enhanced conformational flexibility or dynamics associated with the structural changes induced by the mismatch pairs could be an essential prerequisite for optimal adaptation of the tRNA(Ala) acceptor stem to the contact region of the ARS.

Mol Biochem Parasitol, 1998 May 1, 92(2), 325 - 38
Expression, selection, and organellar targeting of the green fluorescent protein in Toxoplasma gondii; Striepen B et al.; We have engineered a mutant version of the green fluorescent protein GFP (Cormack et al. Selected for bright fluorescence in E. coli. Gene 1996;173:33-38) for expression in the protozoan parasite Toxoplasma gondii. Although intact GFP was not expressed at any detectable level, GFP fusion proteins could be detected by fluorescence microscopy, flow cytometry (FACS), and immunoblotting. Both extracellular tachyzoites and T. gondii-infected host cells could readily be sorted by FACS, which should facilitate a variety of selection strategies. Several selectable markers were tested for their ability to produce stable green transgenic parasites. Fluorescence intensity was directly correlated with gene copy number and protein expression level. Weak selectable markers such as chloramphenicol acetyl transferase (CAT) driven by the SAG1 promoter, which yield multicopy insertions, are therefore most effective for selecting green fluorescent parasites, particularly when coupled to constructs which employ a strong promoter to drive GFP expression. Transformation vectors developed in the course of this work should be of general utility for the overexpression of heterologous transgenes in Toxoplasma. CAT-GFP fusion proteins were expressed in the parasite cytoplasm. GFP fusions to the P30 major surface antigen (linked on the same plasmid to a CAT selectable marker under control of various promoters) could be detected in dense granules within living cells, and were efficiently secreted into the parasitophorous vacuole. GFP fusions to the rhoptry protein ROP1 were targeted to rhoptries (specialized secretory organelles at the apical end of the parasite).

Hepatology, 1998 Jul, 28(1), 219 - 24
Human and murine antibody recognition is focused on the ATPase/helicase, but not the protease domain of the hepatitis C virus nonstructural 3 protein; Chen M et al.; The hepatitis C virus (HCV) nonstructural (NS) 3 protein has been shown to possess at least two enzymatic domains. The amino terminal third contains a serine-protease domain, whereas the carboxy terminal two thirds is comprised of an adenosine triphosphatase (ATPase)/helicase domain. These domains are essential for the maturation of the carboxy-terminal portion of the HCV polyprotein and catalyze the cap synthesis of the RNA genome. In this report, human and murine antibody responses induced by NS3 were characterized using a recombinant full-length NS3 (NS3-FL) protein, or the isolated protease or ATPase/helicase domains, expressed and purified from Escherichia coli. Sera from 40 patients with chronic HCV infection were assayed in enzyme-linked immunoassays (EIAs) for antibody binding to the panel of NS3 proteins. Virtually all patient sera contained antibodies specific for NS3-FL and the ATPase/helicase domain, whereas only 10% of sera reacted with the protease domain of NS3. Human antibodies reactive with NS3-FL were highly restricted to the immunoglobulin G1 (IgG1) isotype and were inhibited by soluble ATPase/helicase, but not by the protease domain. The anti-NS3 (ATPase/helicase) reactivity decreased on denaturation by sodium dodecyl sulfate (SDS) and beta-mercaptoethanol (2ME), suggesting the recognition of nonlinear or conformational B-cell determinants. Similar to infected humans, mice immunized with NS3-FL developed high-titered primary antibody responses to the NS3 ATPase/helicase domain, whereas an anti-NS3 protease response was not observed after primary or secondary immunizations. Thus, the human and murine humoral immune responses to the HCV NS3 protein are focused on the ATPase/helicase domain, are restricted to the IgG1 isotype in humans, and are conformationally dependent. Unexpectedly, in both species, the NS3 protease domain, present in the context of the full-length NS3, appears to possess low intrinsic immunogenicity in terms of antibody production.

Virology, 1998 Jun 20, 246(1), 104 - 12
Critical point mutations for hepatitis C virus NS3 proteinase; Yamada K et al.; The hepatitis C virus NS3 proteinase plays an essential role in processing of HCV nonstructural precursor polyprotein. To detect its processing activity, we developed a simple trans-cleavage assay. Two recombinant plasmids expressing the NS3 proteinase region and a chimeric substrate polyprotein containing the NS5A/5B cleavage site between maltose binding protein and protein A were co-introduced into Escherichia coli cells. The proteinase processed the substrate at the single site during their polyprotein expression. Deletion analysis indicated that the functionally minimal domain of the NS3 proteinase was composed of 146 amino acids, 1059 to 1204. We isolated several cDNA clones encoding the functional domain of the NS3 proteinase from the sera of patients chronically infected with HCV and determined their proteinase activity by this trans-cleavage assay. Both active and inactive clones existed in the same patients. Comparative sequence analyses of these clones suggested that certain point mutations seemed to be related to the loss of proteolytic activity. This was confirmed by back mutation experiments. Among the critical mutations, Pro-1168 to Thr and Arg-1135 to Gly were intriguing. These amino acids, which are situated near the oxyanion hole, seem to be essential for maintaining the conformation of the active center of the NS3 proteinase.

Mol Biol Evol, 1998 Jul, 15(7), 789 - 97
Repeated evolution of an acetate-crossfeeding polymorphism in long-term populations of Escherichia coli; Treves DS et al.; Six out of 12 independent replicate populations of Escherichia coli maintained in long-term glucose-limited continuous culture for up to approximately 1,750 generations evolve polymorphisms maintained by acetate crossfeeding. In all cases, the acetate-crossfeeding phenotype is associated with semiconstitutive overexpression of acetyl CoA synthetase, which allows for the enhanced uptake of low levels of exogenous acetate. Mutations in the 5' regulatory region of the acetyl CoA synthetase locus are responsible for all the acetate crossfeeding phenotypes found. These changes were either transposable-element insertions or a single T-->A nucleotide substitution at position -93 relative to the acs gene translation start site.

Vet Immunol Immunopathol, 1998 Jun 30, 64(1), 15 - 32
Characterization of two dog IgE-specific antibodies elicited by different recombinant fragments of the epsilon chain in hens; Griot-Wenk ME et al.; Two recombinant [His]6-tagged fragments of the canine immunoglobulin E (IgE) heavy chain (second domain: IgEf2 and third and fourth domains: IgEf3/4) were cloned, expressed in Escherichia coli (E. coli) as [His]6-tagged proteins, and affinity-purified over nickel-nitrilotriacetic acid columns. The recombinant proteins were used to immunize hens. The raised and affinity-purified chicken antibodies (Ab) isolated from egg yolk exhibited specific binding to the respective recombinant canine IgE fragment (IgEf) on immunoblots and displayed high titers against the IgEf in ELISA. Immunoblotting of canine serum separated by PAGE under native conditions with the IgEf2- and IgEf3/4-specific Ab resulted in staining of a protein of approximately 180 kilodaltons (kD). The IgEf3/4-specific Ab further recognized an 80 kD protein in IgEf3/4-specific Ab affinity-enriched dog serum separated under denaturing conditions. In an ELISA for the detection of antigen-specific IgE in dog serum, reduced binding of the IgEf-specific Ab was observed after heat treatment of the dog serum. The reactivity of both of the raised chicken Ab was only present in postimmune reagents and could only be inhibited by preincubation with the IgEf used for immunization and not with dog immunoglobulin G, E. coli extract, or with a nonrelevant recombinant [His]6-tagged protein. In immunohistochemistry, the IgEf3/4-specific Ab specifically recognized cells in paraffin-embedded tissue sections of lymph nodes. Furthermore, both of the IgEf-specific Ab elicited positive immediate type 1 skin reactions in dogs. Semiquantitative assessment of total serum IgE in dogs was developed using IgEf2-specific Ab as coating reagent and the biotinylated IgEf3/4-specific Ab as developing Ab in ELISA. In conclusion, both IgEf-specific Ab recognize native dog IgE with the advantages that they are directed against different and known constant domains of the IgE molecule, and that they can be used for immunohistochemistry on paraffin-embedded tissue. The two dog IgE-specific Ab could initiate clinical research on the involvement of immediate-type hypersensitivity reactions in dogs.

Biochim Biophys Acta, 1998 Jun 29, 1385(2), 401 - 19
Biosynthesis of 2-aceto-2-hydroxy acids: acetolactate synthases and acetohydroxyacid synthases; Chipman D et al.; Two groups of enzymes are classified as acetolactate synthase (EC 4.1.3.18). This review deals chiefly with the FAD-dependent, biosynthetic enzymes which readily catalyze the formation of acetohydroxybutyrate from pyruvate and 2-oxobutyrate, as well as of acetolactate from two molecules of pyruvate (the ALS/AHAS group). These enzymes are generally susceptible to inhibition by one or more of the branched-chain amino acids which are ultimate products of the acetohydroxyacids, as well as by several classes of herbicides (sulfonylureas, imidazolinones and others). Some ALS/AHASs also catalyze the (non-physiological) oxidative decarboxylation of pyruvate, leading to peracetic acid; the possible relationship of this process to oxygen toxicity is considered. The bacterial ALS/AHAS which have been well characterized consist of catalytic subunits (around 60 kDa) and smaller regulatory subunits in an alpha2beta2 structure. In the case of Escherichia coli isozyme III, assembly and dissociation of the holoenzyme has been studied. The quaternary structure of the eukaryotic enzymes is less clear and in plants and yeast only catalytic polypeptides (homologous to those of bacteria) have been clearly identified. The presence of regulatory polypeptides in these organisms cannot be ruled out, however, and genes which encode putative ALS/AHAS regulatory subunits have been identified in some cases. A consensus sequence can be constructed from the 21 sequences which have been shown experimentally to represent ALS/AHAS catalytic polypeptides. Many other sequences fit this consensus, but some genes identified as putative 'acetolactate synthase genes' are almost certainly not ALS/AHAS. The solution of the crystal structures of several thiamin diphosphate (ThDP)-dependent enzymes which are homologous to ALS/AHAS, together with the availability of many amino acid sequences for the latter enzymes, has made it possible for two laboratories to propose similar, reasonable models for a dimer of catalytic subunits of an ALS/AHAS. A number of characteristics of these enzymes can now be better understood on the basis of such models: the nature of the herbicide binding site, the structural role of FAD and the binding of ThDP-Mg2+. The models are also guides for experimental testing of ideas concerning structure-function relationships in these enzymes, e.g. the nature of the substrate recognition site. Among the important remaining questions is how the enzyme suppresses alternative reactions of the intrinsically reactive hydroxyethylThDP enamine formed by the decarboxylation of the first substrate molecule and specifically promotes its condensation with 2-oxobutyrate or pyruvate.

Biochim Biophys Acta, 1998 Jul 9, 1398(3), 353 - 8
Sequence analysis and expression of a novel mouse homolog of Escherichia coli recA gene; Kawabata M et al.; Escherichia coli recA and its yeast homologs RAD51 and DMC1 play crucial roles in mitotic and/or meiotic recombination and in repair of double-strand DNA breaks. We have identified a novel murine recA-like gene (MmTRAD). The predicted 329 amino acid protein showed significant homology to mouse Rec2, Rad51, Dmc1 (or Lim15) and E. coli RecA. Northern blot analysis revealed that MmTRAD was ubiquitously transcribed in various tissues.

Biochim Biophys Acta, 1998 Jun 29, 1385(2), 287 - 306
Regulation of thiamin diphosphate-dependent 2-oxo acid decarboxylases by substrate and thiamin diphosphate.Mg(II) - evidence for tertiary and quaternary interactions; Jordan F et al.; The regulatory mechanism of substrate activation in yeast pyruvate decarboxylase is triggered by the interaction of pyruvic acid with C221 located on the beta domain at >20 Å from the thiamin diphosphate (ThDP). To trace the putative information transfer pathway, substitutions were made at H92 on the alpha domain, across the domain divide from C221, at E91, next to H92 and hydrogen bonded to W412, the latter being intimately involved in the coenzyme binding locus. Additional substitutions were made at D28, E51, H114, H115, I415 and E477, all near the active center. The pH-dependent steady-state kinetic parameters, including the Hill coefficient, provide useful insight to this effort. In addition to C221, the residues H92, E91, E51 and H114 and H115 together appear to have a critical impact on the Hill coefficient, providing a pathway for information transfer. To study the activation by ThDP.Mg(II), variants at G231 (of the conserved GDG triplet) and at N258 and C259 (all three being part of the putative ThDP fold) of the E1 component of the Escherichia coli pyruvate dehydrogenase multienzyme complex were studied. Kinetic and spectroscopic evidence suggests that the Mg(II) ligands are very important to activation of the enzymes by cofactors.

Biochim Biophys Acta, 1998 Jul 9, 1398(3), 243 - 55
Identification and sequencing of a cytochrome P450 gene cluster from Bradyrhizobium japonicum; Tully RE et al.; Sequencing of a region from Bradyrhizobium japonicum previously shown to encode cytochromes P450 revealed a cluster of three complete P450 genes (CYP112, CYP114, and CYP117) plus a partial P450 gene fragment (CYP115P). Present also are five additional open reading frames. The close positioning of the genes suggests that they comprise an operon. Although the biochemical function of the gene products is uncertain, the similarities to other genes suggest an operon involved in terpenoid synthesis. ORF3 has similarity to a [3Fe-4S] ferredoxin from Streptomyces griseolus. ORF4 has strong similarity to members of the short chain alcohol dehydrogenase family, including sterol dehydrogenases from enteric bacteria and to some plant 3-oxoacyl-(acyl carrier protein) reductases. ORF6 has strong similarity to prenyl transferases, including dimethylallyltranstransferase from Escherichia coli. ORF7 bears some similarity to plant genes for ent-kaurene synthase (a precursor of gibberellins), and to bacterial squalene-hopene cyclases. ORF8 has some similarity to a Streptomyces gene for synthesis of the cyclic sesquiterpene pentalenene. The 5' end of the mRNA transcript is 38-39 nucleotides downstream from the center of a motif that bears sequence homology to bacterial fnr promoters. A gus operon fusion to the promoter was expressed anaerobically and symbiotically 6-10-fold greater than aerobically.

Genes are material entities that parents pass to offspring during reproduction. These entities encode information essential for the construction and regulation of polypeptides, proteins and other molecules that determine the growth and functioning of the organism.

The word "gene" is shared by many disciplines, including classical genetics, molecular genetics, evolutionary biology and population genetics. Because each discipline models the biology of life differently, the material entity that counts as a gene in one discipline is not necessarily the same as in another.

Following the discovery that DNA is the genetic material, and with the growth of biotechnology and the project to sequence the human genome, the common usage of the word "gene" has increasingly reflected its meaning in molecular biology. In the molecular-biological sense, genes are the segments of DNA which cells transcribe into RNA and translate, at least in part, into proteins.

In common speech, "gene" is often used to refer to the hereditary cause of a trait, disease or condition—as in "the gene for obesity." Speaking more precisely, a biologist might refer to an allele or a mutation that has been implicated in or is associated with obesity. This is because biologists know that many factors other than genes decide whether a person is obese or not: prenatal environment, upbringing, culture and the availability of food, for example.

Moreover, it is very unlikely that variations within a single gene (or single genetic locus) fully determine one's genetic predisposition for obesity. These aspects of inheritance (the interplay between genes and environment, and the influence of many genes) appear to be the norm for many and perhaps most traits, which are called "complex" or "multifactorial". The term phenotype refers to the characteristics that result from this interplay (see genotype-phenotype distinction).

Properties of genes

In molecular biology, the DNA of a gene encodes the chemical structure of a protein. The genetic code determines the sequence of the amino acids that make up a protein. The assignment of each three-nucleotide DNA sequence (a codon) to a specific amino acid is essentially the same for all known life, from bacteria to humans.
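As a rough illustration of how a three-nucleotide sequence maps to an amino acid, the sketch below translates a short DNA string using a small excerpt of the standard genetic code. The function name `translate`, the example sequence, and the choice of which codons to include are assumptions made for this example; a full table has 64 entries.

```python
# Excerpt of the standard genetic code, written for the DNA coding strand.
# Only a handful of the 64 codons are listed; this is not a complete table.
CODON_TABLE = {
    "ATG": "Met", "TTT": "Phe", "TTC": "Phe", "GGA": "Gly", "GGC": "Gly",
    "AAA": "Lys", "GAA": "Glu", "TGG": "Trp",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}

def translate(coding_dna):
    """Read a coding-strand DNA sequence three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        codon = coding_dna[i:i + 3].upper()
        residue = CODON_TABLE.get(codon, "???")  # "???" marks codons missing from the excerpt
        if residue == "Stop":
            break
        protein.append(residue)
    return protein

# Hypothetical 15-base sequence: ATG TTT GGA AAA TAA
print(translate("ATGTTTGGAAAATAA"))
```

Because the code is shared across nearly all organisms, the same lookup table would apply whether the sequence came from a bacterium or a human.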

Through the proteins they encode, genes govern the cells in which they reside. In multicellular organisms they control the development of the individual from the fertilized egg and the day-to-day functions of the cells that make up tissues and organs. The instrumental roles of their protein products range from mechanical support of the cell structure to the transportation and manufacture of other molecules and to the regulation of other proteins' activities.

The genes that exist today are those that have reproduced successfully in the past. Often, many individual organisms share a gene; thus, the death of an individual need not mean the extinction of the gene. Indeed, if the sacrifice of one individual enhances the survivability of other individuals with the same gene, the death of an individual may enhance the overall survival of the gene. This is the basis of the selfish gene view, popularized by Richard Dawkins. He points out in his book, The Selfish Gene, that to be successful genes need have no other "purpose" than to propagate themselves, even at the expense of their host organism's welfare. A human that behaved in such a way would be described as "selfish," although ironically a selfish gene may promote altruistic behaviors. According to Dawkins, the possibly disappointing answer to the question "what is the meaning of life?" may be "the survival and perpetuation of ribonucleic acids and their associated proteins".

Types of genes

Rare, spontaneous errors (e.g. in DNA replication) may give rise to mutations in the sequence of a gene. Once propagated to the next generation, such a mutation may lead to variation within a species' population. Variants of a single gene are known as alleles, and differences in alleles may give rise to differences in traits, for example eye color. A gene's most common allele is called the wild type allele, and rare alleles are called mutants.
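The idea that the wild type is simply the most common allele can be made concrete by counting alleles in a sample. The following is a minimal sketch; the sample genotypes, the allele labels "B" and "b", and the helper name `allele_frequencies` are all invented for illustration.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return the relative frequency of each allele in a list of diploid genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical sample of five diploid individuals at one locus.
sample = [("B", "B"), ("B", "b"), ("B", "B"), ("b", "b"), ("B", "B")]
freqs = allele_frequencies(sample)

# By the definition in the text, the most common allele is the wild type.
wild_type = max(freqs, key=freqs.get)
print(freqs, wild_type)
```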

Normally, RNA is an intermediate product in the expression of a molecular gene as a protein. However, for some gene sequences, RNA molecules are the functional end products. For example, RNAs known as ribozymes are capable of enzymatic function, and small interfering RNAs have a regulatory role. The DNA sequences from which such RNAs are transcribed are known as non-coding RNA genes, or RNA genes.

All living organisms carry their genes and transmit them to offspring as DNA, but some viruses carry only RNA. In some RNA viruses the genome can serve directly as messenger RNA, so the host cell can begin synthesizing viral proteins as soon as it is infected, without the delay of waiting for transcription. On the other hand, RNA retroviruses, such as HIV (the virus that causes AIDS), require the reverse transcription of their genome from RNA into DNA before their proteins can be synthesized.

In the early 1900s, Mendel's work received renewed attention from scientists. In 1910, Thomas Hunt Morgan showed that genes reside on specific chromosomes. He later showed that genes occupy specific locations on the chromosome. With this knowledge, Morgan and his students began the first chromosomal map of Drosophila. In 1928, Frederick Griffith showed that genes could be transferred. In what is now known as Griffith's experiment, a mouse was injected with a heat-killed deadly strain of a bacterium together with a live, harmless strain of the same species; genetic information passed from the dead cells to the living ones, which then killed the mouse.

In 1941, George Wells Beadle and Edward Lawrie Tatum showed that mutations in genes caused errors in certain steps in metabolic pathways. This showed that specific genes code for specific proteins, leading to the "one gene, one enzyme" hypothesis. Oswald Avery, Colin MacLeod, and Maclyn McCarty showed in 1944 that DNA holds the gene's information. In 1953, James D. Watson and Francis Crick demonstrated the molecular structure of DNA. Together, these discoveries established the central dogma of molecular biology, which states that proteins are translated from RNA, which is transcribed from DNA. This dogma has since been shown to have exceptions, such as reverse transcription in retroviruses.
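The direction of information flow in the central dogma, and its reversal in retroviruses, can be sketched with simple base-pairing rules. This is a deliberately simplified illustration: it pairs bases position by position and ignores strand orientation (5' and 3' ends), promoters, and RNA processing. The function names and the six-base example are assumptions made for the sketch.

```python
# Watson-Crick pairing rules for copying between DNA and RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}

def transcribe(template_dna):
    """Build the RNA complementary to a DNA template strand (DNA -> RNA)."""
    return "".join(DNA_TO_RNA[base] for base in template_dna)

def reverse_transcribe(rna):
    """Build the DNA complementary to an RNA strand (RNA -> DNA), as retroviruses do."""
    return "".join(RNA_TO_DNA[base] for base in rna)

template = "TACAAA"                     # hypothetical template strand
mrna = transcribe(template)             # usual direction of the central dogma
cdna = reverse_transcribe(mrna)         # the retroviral exception
print(mrna, cdna)
```

In this toy model, reverse transcription of the transcript recovers the original template, which is why the two mappings are written as mirror images of each other.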

Heredity (the adjective is hereditary) is the transfer of characters from parent to offspring, either through their genes or through the social institution called inheritance (for example, a title of nobility is passed from individual to individual according to relevant customs and/or laws).

Biology

In biology, heredity refers to the transference of biological characteristics from a parent organism to offspring, and is practically a synonym for genetics, as genes are now recognized as the carriers of biological information. In humans, determining which characteristics of a person are due to heredity and which are due to environmental influences is often a subject of controversy (the nature versus nurture debate), especially regarding intelligence and race.

Genetic interactions, in genetics, are interactions between two or more mutations that result in a new phenotype. Studying genetic interactions can reveal gene function, the nature of the mutations, functional redundancy, and protein interactions. Because protein complexes are responsible for most biological functions, genetic interactions are a powerful tool.

The phenotype of an individual organism is either its total physical appearance and constitution, or a specific manifestation of a trait, such as size or eye color, that varies between individuals. Phenotype is determined to some extent by genotype, or by the identity of the alleles that an individual carries at one or more positions on the chromosomes. Many phenotypes are determined by multiple genes and influenced by environmental factors. Thus, the identity of one or a few known alleles does not always enable prediction of the phenotype.

Nevertheless, because phenotypes are much easier to observe than genotypes (it doesn't take chemistry or sequencing to determine a person's eye color), classical genetics uses phenotypes to deduce the functions of genes. These inferences can then be checked by breeding experiments. In this way, early geneticists were able to trace inheritance patterns without any knowledge whatsoever of molecular biology.

The interaction between genotype and phenotype has often been described using a simple equation:

Phenotype = Genotype + Environment

That is, a phenotype is any detectable characteristic of an organism (structural, biochemical, physiological or behavioural) determined by an interaction between its genotype and environment (see genotype-phenotype distinction for a further elaboration of this distinction).

The idea of the phenotype as the product of the genotype has been generalised by Richard Dawkins in his book The Extended Phenotype.

The genotype is the specific genetic makeup (the specific genome) of an individual, usually in the form of DNA. It codes for the phenotype of that individual.

Typically, one refers to an individual's genotype with regard to a particular gene of interest and, in polyploid individuals, it refers to what combination of alleles the individual carries (see homozygous, heterozygous). Any given gene will usually cause an observable change in an organism, known as the phenotype. The terms genotype and phenotype are distinct for at least two reasons:

To distinguish the source of an observer's knowledge (one can know about genotype by observing DNA; one can know about phenotype by observing outward appearance of an organism).
Genotype and phenotype are not always directly correlated. Some genes only express a given phenotype in certain environmental conditions. Conversely, some phenotypes could be the result of multiple genotypes.

Inspired by the biological concept and usefulness of genotypes, computer science employs simulated genotypes in genetic programming and evolutionary algorithms. Such techniques can help evolve mathematical solutions to certain types of otherwise difficult problems.

The genotype-phenotype distinction refers to the fact that while genotype and phenotype of an organism are related, they do not necessarily coincide. The genotype of an organism represents its exact genetic makeup, that is, the particular set of genes it possesses. Two organisms whose genes differ at even one locus (position in their genome) are said to have different genotypes. The term "genotype" refers, then, to the full hereditary information of an organism. The phenotype of an organism, on the other hand, represents its actual physical properties, such as height, weight, hair color, and so on. The mapping of a set of genotypes to a set of phenotypes is sometimes referred to as the genotype-phenotype map.

An organism's genotype is the largest influencing factor in the development of its phenotype, but it is not the only one. Even two organisms with identical genotypes normally differ in their phenotypes. One experiences this in everyday life with monozygous (i.e. identical) twins. Identical twins share the same genotype, since their genomes are identical; but they never have the same phenotype, although their phenotypes may be very similar. This is apparent in the fact that their mothers and close friends can always tell them apart, even though others might not be able to see the subtle differences. Further, identical twins can be distinguished by their fingerprints, which are never completely identical.

The concept of phenotypic plasticity describes the degree to which an organism's phenotype is determined by its genotype. A high level of plasticity means that environmental factors have a strong influence on the particular phenotype that develops. If there is little plasticity, the phenotype of an organism can be reliably predicted from knowledge of the genotype, regardless of environmental peculiarities during development. An example of high plasticity can be observed in larval newts: when these larvae sense the presence of predators such as dragonflies, they develop larger heads and tails relative to their body size and display darker pigmentation. Larvae with these traits have a higher chance of survival when exposed to the predators, but grow more slowly than other phenotypes.

In contrast to phenotypic plasticity, the concept of genetic canalization addresses the extent to which an organism's phenotype allows conclusions about its genotype. A phenotype is said to be canalized if mutations (changes in the genome) do not noticeably affect the physical properties of the organism. This means that a canalized phenotype may form from a large variety of different genotypes, in which case it is not possible to exactly predict the genotype from knowledge of the phenotype. If canalization is not present, small changes in the genome have an immediate effect on the phenotype that develops.

Classical genetics consists of the techniques and methodologies of genetics that predate the advent of molecular biology. A key discovery of classical genetics in eukaryotes was genetic linkage. The observation that some genes do not segregate independently at meiosis broke the laws of Mendelian inheritance and provided science with a way to map characteristics to a location on the chromosomes. Linkage maps are still used today, especially in breeding for plant improvement.

After the discovery of the genetic code and such tools of cloning as restriction enzymes, the avenues of investigation open to geneticists were greatly broadened. Some classical genetic ideas have been supplanted with the mechanistic understanding brought by molecular discoveries, but many remain intact and in use. Classical genetics is often contrasted with reverse genetics, and aspects of molecular biology are sometimes referred to as molecular genetics.

Genetic linkage occurs when particular alleles are inherited together. Typically, an organism can pass on an allele without regard to which allele was passed on for a different gene. This is because chromosomes are sorted randomly during meiosis. However, alleles which are on the same chromosome are more likely to be inherited together and are said to be linked.

Because there is some crossing over of DNA when the chromosomes segregate, alleles on the same chromosome can be separated and go to different cells. There is much more chance of this happening if the alleles are far apart on the chromosome, as it is more likely that a cross-over will occur between them.

The relative distance between two genes can be estimated from the offspring of an organism showing two linked genetic traits, by finding the percentage of offspring in which the two traits do not occur together. The higher the percentage of offspring that do not show both traits, the further apart on the chromosome the genes are. A study of the linkages between many genes enables the creation of a linkage map.
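The percentage described above can be worked out in a few lines. The following is only a minimal sketch: the function name and the offspring counts are hypothetical, chosen for illustration, and real mapping work corrects for effects such as double cross-overs that this ignores.

```python
def recombination_frequency(parental: int, recombinant: int) -> float:
    """Percentage of offspring in which the two linked traits were separated.

    parental    -- offspring showing the traits in the parental combination
    recombinant -- offspring in which the two traits do not occur together
    """
    total = parental + recombinant
    if total == 0:
        raise ValueError("no offspring counted")
    return 100.0 * recombinant / total


# Hypothetical counts from a breeding experiment (illustrative numbers only).
rf = recombination_frequency(parental=830, recombinant=170)
print(f"{rf:.1f}% of offspring are recombinant")
```

A larger percentage suggests that the two genes lie further apart on the chromosome; comparing such percentages for many gene pairs is what allows a linkage map to be assembled.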

Two phenotypes (height and texture) occur randomly with respect to one another in a manner known as independent assortment. Today scientists understand that independent assortment occurs when the genes affecting the phenotypes are found on different chromosomes.

An exception to independent assortment develops when genes appear near one another on the same chromosome. When genes occur on the same chromosome, they are inherited as a single unit. Genes inherited in this way are said to be linked. For example, in fruit flies the genes affecting eye color and wing length are inherited together because they appear on the same chromosome.

Human gene nomenclature. For each known human gene the HUGO Gene Nomenclature Committee (HGNC) approves a gene name and symbol (short-form abbreviation). All approved symbols are stored in Genew, the Human Gene Nomenclature Database. Each symbol is unique and each gene is only given one approved gene symbol. It is necessary to provide a unique symbol for each gene so that people can talk about them. This also facilitates electronic data retrieval from publications. In preference each symbol maintains parallel construction in different members of a gene family and can also be used in other species, especially the mouse. Estimates of the number of genes in an organism are somewhat controversial, because it is only possible to discover a gene, and no techniques currently exist to prove that a DNA sequence contains no gene. Nonetheless, estimates are made based on current knowledge.

But in many cases, genes on the same chromosome that are inherited together produce offspring with unexpected allele combinations. This results from a process called crossing over. Sometimes at the beginning of meiosis, a chromosome pair (made up of a chromosome from the mother and a chromosome from the father) may intertwine and exchange sections of chromosome. The pair then breaks apart to form two chromosomes with a new combination of genes that differs from the combination supplied by the parents. Through this process of recombining genes, organisms can produce offspring with new combinations of maternal and paternal traits that may contribute to or enhance survival.

An allele is any one of a number of alternative forms of the same gene occupying a given locus (position) on a chromosome. An example is the gene for blossom color in many species of flower - a single gene controls the color of the petals, but there may be several different versions of the gene. One version might result in red petals, while another might result in white petals.

Some organisms are diploid - that is, they have paired homologous chromosomes in their somatic cells, and thus contain two copies of each gene. An organism in which both copies of the gene are identical - that is, have the same allele - is said to be homozygous for that gene. An organism which has two different alleles of the gene is said to be heterozygous. Often one allele is "dominant" and the other is "recessive" - the "dominant" allele will determine what trait is expressed. For example, in the case of blossom color, if the "red" allele is dominant to the "white" allele, in a heterozygous flower (with one red and one white allele), the petals will be red. The recessive allele will only be expressed in a recessive homozygote.
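The dominance rule above can be illustrated with a small sketch of a cross between two heterozygous flowers. The allele letters "R" (red, dominant) and "r" (white, recessive) and the function names are assumptions made for this example only.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes over every pairing of one allele from each parent."""
    return Counter("".join(sorted(pair)) for pair in product(parent1, parent2))


def phenotype(genotype: str, dominant: str = "R") -> str:
    """A single dominant allele is enough for the dominant trait to be expressed."""
    return "red" if dominant in genotype else "white"


offspring = cross("Rr", "Rr")  # both parents heterozygous
colors = Counter()
for genotype, count in offspring.items():
    colors[phenotype(genotype)] += count

print(dict(offspring))  # genotype counts: RR 1, Rr 2, rr 1
print(dict(colors))     # phenotype counts: red 3, white 1
```

Only the "rr" homozygote shows white petals, which matches the statement that the recessive allele is expressed only in a recessive homozygote.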

One exception is incomplete dominance. Another exception is "codominance", where both alleles are active and both traits are expressed; for example, both red and white petals. Codominance is also apparent in human blood types. A person carrying the codominant alleles "A" and "B" has a blood type of "AB". A third exception is "blending inheritance", present in flower blossoms as well. Codominant "blue" and "purple" alleles would result in color blending and hence, violet flower petals.

A wild type allele is an allele which is considered to be "normal" for the organism in question, as opposed to a mutant allele which is usually a relatively new modification.

A chromosome is, minimally, a very long, continuous piece of DNA, which contains many genes, regulatory elements and other intervening nucleotide sequences. In the chromosomes of eukaryotes, the uncondensed DNA exists in a quasi-ordered structure inside the nucleus, where it wraps around histones (structural proteins, Fig. 1), and where this composite material is called chromatin. During mitosis (cell division), the chromosomes are condensed and called metaphasic chromosomes. This is the only natural context in which individual chromosomes are visible with an optical microscope. Prokaryotes do not possess histones or nuclei. In its relaxed state, the DNA can be accessed for transcription, regulation, and replication. Chromosomes were first observed by Karl Wilhelm von Nägeli in 1842 and their behavior later described in detail by Walther Flemming in 1882. In 1910, Thomas Hunt Morgan proved that chromosomes are the carriers of genes.

Eukaryotes possess multiple linear chromosomes contained in the cell's nucleus. Each chromosome has one centromere, with one or two arms projecting from the centromere. The ends of the chromosomes are special structures called telomeres. DNA replication begins at many different locations on the chromosome.

Chromosomes in bacteria

Bacterial chromosomes are often circular but sometimes linear. Some bacteria have one chromosome, while others have a few. Bacterial DNA also exists as plasmids. The distinction between plasmids and chromosomes is poorly defined, though size and necessity are generally taken into account. Bacterial chromosomes initiate replication at one origin of replication.

Chromatin

Two types of chromatin can be distinguished:

Euchromatin, which consists of DNA that is active, e.g., expressed as protein.
Heterochromatin, which consists of mostly inactive DNA. It seems to serve structural purposes during the chromosomal stages.

Heterochromatin can be further distinguished into two types:

Constitutive heterochromatin, which is never expressed. It is located around the centromere and usually contains repetitive sequences.
Facultative heterochromatin, which is sometimes expressed.

In the early stages of mitosis, the chromatin strands become more and more condensed. They cease to function as accessible genetic material and become a compact transport form. Eventually, the two matching chromatids (condensed chromatin strands) become visible as a chromosome, linked at the centromere. Long microtubules are attached at the centromere and two opposite ends of the cell. During mitosis, the microtubules pull the chromatids apart, so that each daughter cell inherits one set of chromatids. Once the cells have divided, the chromatids are uncoiled and can function again as chromatin. In spite of their appearance, chromosomes are highly structured (Fig. 2). For example, genes with similar functions are often kept close together in the nucleus, even if they are far apart on the chromosome. The short arm of a chromosome can be extended by a satellite chromosome that contains codes for ribosomal RNA.

Normal members of a particular species all have the same number of chromosomes (Table 1). Asexually reproducing species have one set of chromosomes, which is the same in all body cells. Sexually reproducing species have somatic cells (body cells), which are diploid [2n] (they have two sets of chromosomes, one from the mother, one from the father) or polyploid [Xn] (more than two sets of chromosomes), and gametes (reproductive cells) which are haploid [n] (they have only one set of chromosomes). Gametes are produced by meiosis of a diploid germ line cell. During meiosis, the matching chromosomes of father and mother can exchange small parts of themselves (crossover), and thus create new chromosomes that are not inherited solely from either parent. When a male and a female gamete merge (fertilization), a new diploid organism is formed.

To determine the (diploid) number of chromosomes of an organism, cells can be locked in metaphase in vitro (in a reaction vial) with colchicine. These cells are then stained (the name chromosome was given because of their ability to be stained), photographed and arranged into a karyotype (an ordered set of chromosomes, Fig. 3), also called karyogram. Like many sexually reproducing species, humans have special gonosomes (sex chromosomes, in contrast to autosomes for body functions). These are XX in females and XY in males. In females, one of the two X chromosomes is inactive and can be seen under a microscope as Barr bodies.

Chemical structure of a gene

Four kinds of sequentially linked nucleotides compose a DNA molecule or strand (more at DNA). These four nucleotides constitute the genetic alphabet. A sequence of three consecutive nucleotides, called a codon, is the protein-coding vocabulary. The sequence of codons in a gene specifies the amino-acid sequence of the protein it encodes. In most eukaryotic species, very little of the DNA in the genome encodes proteins, and the genes may be separated by vast sequences of so-called junk DNA. Moreover, the genes are often fragmented internally by non-coding sequences called introns, which can be many times longer than the genes themselves. Introns are removed on the heels of transcription by splicing. In the primary molecular sense they represent parts of a gene, however.

Deoxyribonucleic acid (DNA) is a nucleic acid which carries genetic instructions for the biological development of all cellular forms of life and many viruses. DNA is sometimes referred to as the molecule of heredity as it is inherited and used to propagate traits. During reproduction, it is replicated and transmitted to offspring.

In bacteria and other simple cell organisms, DNA is distributed more or less throughout the cell. In the complex cells that make up plants, animals and in other multi-celled organisms, most of the DNA is found in the chromosomes, which are located in the cell nucleus. The energy-generating organelles known as chloroplasts and mitochondria also carry DNA, as do many viruses.

Although sometimes called "the molecule of heredity", pieces of DNA as people typically think of them are not single molecules. Rather, they are pairs of molecules, which entwine like vines to form a double helix (see the illustration at the right).

Each vine-like molecule is a strand of DNA: a chemically linked chain of nucleotides, each of which consists of a sugar, a phosphate and one of four kinds of nitrogen-containing aromatic "bases". Because DNA strands are composed of these nucleotide subunits, they are polymers.

The diversity of the bases means that there are four kinds of nucleotides, which are commonly referred to by the identity of their bases. These are adenine (A), thymine (T), cytosine (C), and guanine (G).

In a DNA double helix, two polynucleotide strands can associate through the hydrophobic effect. Specificity of which strands stay associated is determined by complementary pairing. Each base forms hydrogen bonds readily to only one other -- A to T and C to G -- so that the identity of the base on one strand dictates the strength of the association; the more complementary bases exist, the stronger and longer-lasting the association.
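The pairing rule (A with T, C with G) is simple enough to express directly. The sketch below uses hypothetical helper names and compares two strands position by position, deliberately ignoring strand orientation; it is meant only to illustrate how one strand determines its partner and how the number of complementary positions can be counted.

```python
# Pairing rule stated in the text: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the partner strand, base by base (orientation is ignored here)."""
    return "".join(PAIR[base] for base in strand.upper())


def paired_positions(strand_a: str, strand_b: str) -> int:
    """Count aligned positions at which the two strands can form a base pair."""
    return sum(PAIR[a] == b for a, b in zip(strand_a.upper(), strand_b.upper()))


print(complement("ATGCCG"))                  # TACGGC
print(paired_positions("ATGCCG", "TACGGC"))  # 6: fully complementary
print(paired_positions("ATGCCG", "TACGAA"))  # 4: two mismatched positions
```

The more positions that pair, the stronger and longer-lasting the association between the two strands is expected to be.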

The cell's machinery is capable of melting or disassociating a DNA double helix, and using each DNA strand as a template for synthesizing a new strand which is nearly identical to the template's previous partner strand. Errors that occur in the synthesis are known as mutations. The process known as PCR mimics this process in vitro in a nonliving system.

Because pairing causes the nucleotide bases to face the helical axis, the sugar and phosphate groups of the nucleotides run along the outside, and the two chains they form are sometimes called the "backbones" of the helix. In fact, it is chemical bonds between the phosphates and the sugars that link one nucleotide to the next in the DNA strand.

The role of the sequence

Within a gene, the sequence of nucleotides along a DNA strand defines a protein, which an organism is liable to manufacture or "express" at one or several points in its life using the information of the sequence. The relationship between the nucleotide sequence and the amino-acid sequence of the protein is determined by simple cellular rules of translation, known collectively as the genetic code. The genetic code is made up of three-letter 'words' (termed codons), each formed from a sequence of three nucleotides (e.g. ACT, CAG, TTT). These codons can then be translated with messenger RNA and then transfer RNA, with a codon corresponding to a particular amino acid. Since there are 64 possible codons but only 20 standard amino acids, most amino acids have more than one possible codon. There are also three 'stop' or 'nonsense' codons signifying the end of the coding region.
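As a rough illustration of how codons map to amino acids, the sketch below builds the standard genetic code as a lookup table and translates a short DNA sequence. The compact table layout, the function name `translate` and the example sequences are choices made for this example; codons are written with T, as in the text, rather than with the U used in RNA.

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
# One-letter amino-acid codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Read the sequence three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(len(CODON_TABLE))                        # 64 possible codons
print(list(CODON_TABLE.values()).count("*"))   # 3 stop codons
print(translate("ACTCAGTTT"))                  # TQF (the three codons named in the text)
print(translate("ATGGCTTGGTAA"))               # MAW, then a stop codon ends translation
```

Because 61 codons are shared among 20 amino acids, several different codons usually specify the same amino acid.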

In many species of organism, only a small fraction of the total sequence of the genome appears to encode protein. The function of the rest is a matter of speculation. It is known that certain nucleotide sequences specify affinity for DNA binding proteins, which play a wide variety of vital roles, in particular through control of replication and transcription. These sequences are frequently called regulatory sequences, and researchers assume that so far they have identified only a tiny fraction of the total that exist. "Junk DNA" represents sequences that do not yet appear to contain genes or to have a function.

Sequence also determines a DNA segment's susceptibility to cleavage by restriction enzymes, the quintessential tools of genetic engineering. The position of cleavage sites throughout an individual's genome determines one kind of an individual's "DNA fingerprint".
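To make the idea of sequence-dependent cleavage concrete, the following sketch cuts a sequence wherever a recognition site occurs and compares the resulting fragment lengths for two made-up sequences. The recognition site GAATTC, cut after the first G, is that of the enzyme EcoRI; the function name, the two short sequences and the single-strand simplification are assumptions for illustration only.

```python
from typing import List


def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Cut `sequence` at every occurrence of `site`, `cut_offset` bases into the site."""
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


# Two hypothetical individuals; the second carries a single-base change
# (GAATTC -> GAGTTC) that destroys one recognition site.
individual_1 = "AAGAATTCCCGGGAATTCTT"
individual_2 = "AAGAATTCCCGGGAGTTCTT"

print([len(f) for f in digest(individual_1)])  # [3, 10, 7]
print([len(f) for f in digest(individual_2)])  # [3, 17]
```

The differing sets of fragment lengths are the kind of pattern that distinguishes one individual's "DNA fingerprint" from another's.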

DNA replication

DNA replication or DNA synthesis is the process of copying the double-stranded DNA prior to cell division. Each of the two resulting double strands consists of one original and one newly synthesized strand; this is called semiconservative replication. The two copies are generally almost perfectly identical, but occasionally errors in replication result in a less than perfect copy (see mutation). The process of replication consists of three steps: initiation, elongation and termination.
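Semiconservative replication can be pictured by tracking where each strand came from over successive rounds of copying. The sketch below is a bookkeeping exercise only: the labels "original" and "new" and the function name are hypothetical, and no actual sequence is copied.

```python
from typing import List, Tuple

Duplex = Tuple[str, str]  # labels of the two strands in one double helix


def replicate(duplexes: List[Duplex]) -> List[Duplex]:
    """Each duplex separates; every old strand is paired with a newly made one."""
    next_generation = []
    for strand_a, strand_b in duplexes:
        next_generation.append((strand_a, "new"))
        next_generation.append((strand_b, "new"))
    return next_generation


generation_0 = [("original", "original")]
generation_1 = replicate(generation_0)
generation_2 = replicate(generation_1)

print(generation_1)  # two duplexes, each with one original and one new strand
print(generation_2)  # four duplexes: two still hold an original strand, two are all new
```

After one round every double strand is a hybrid of old and new, which is exactly the "one original and one newly synthesized strand" arrangement described above.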

Mechanical properties relevant to biology

Figure: Space-filling model of a section of DNA molecule

The hydrogen bonds between the strands of the double helix are weak enough that they can be easily separated by enzymes. Enzymes known as helicases unwind the strands to facilitate the advance of sequence-reading enzymes such as DNA polymerase. The unwinding builds up twisting strain, which is relieved by enzymes known as topoisomerases that transiently cleave the phosphate backbone of one of the strands so that it can swivel around the other. The strands can also be separated by gentle heating, as used in PCR, provided they have fewer than about 10,000 base pairs (10 kilobase pairs, or 10 kbp). The intertwining of the DNA strands makes long segments difficult to separate.

When the ends of a piece of double-helical DNA are joined so that it forms a circle, as in plasmid DNA, the strands are topologically knotted. This means they cannot be separated by gentle heating or by any process that does not involve breaking a strand. The task of unknotting topologically linked strands of DNA falls to enzymes known as topoisomerases. Some of these enzymes unknot circular DNA by cleaving two strands so that another double-stranded segment can pass through. Unknotting is required for the replication of circular DNA as well as for various types of recombination in linear DNA.

The DNA helix can assume one of three slightly different geometries, of which the "B" form described by James D. Watson and Francis Crick is believed to predominate in cells. It is 2 nanometres wide and extends 3.4 nanometres per 10 bp of sequence. This is also the approximate length of sequence in which the helix makes one complete turn about its axis. This frequency of twist (known as the helical pitch) depends largely on stacking forces that each base exerts on its neighbors in the chain.

The narrow breadth of the double helix makes it impossible to detect by conventional electron microscopy, except by heavy staining. At the same time, the DNA found in many cells can be macroscopic in length -- approximately 5 centimetres long for strands in a human chromosome. Consequently, cells must compact or "package" DNA to carry it within them. This is one of the functions of the chromosomes, which contain spool-like proteins known as histones, around which DNA winds.
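The figures quoted above can be cross-checked with simple arithmetic: 3.4 nanometres per 10 bp gives 0.34 nanometres per base pair. The sketch below assumes a round, hypothetical chromosome size of 150 million base pairs purely for illustration; the constant and function names are likewise invented for this example.

```python
RISE_NM_PER_BP = 3.4 / 10   # 3.4 nm per 10 bp, as stated in the text
BP_PER_TURN = 10            # approximate base pairs per helical turn
NM_PER_CM = 1e7             # 1 centimetre = 10,000,000 nanometres


def contour_length_cm(base_pairs: int) -> float:
    """Length of fully extended B-form DNA, in centimetres."""
    return base_pairs * RISE_NM_PER_BP / NM_PER_CM


def helical_turns(base_pairs: int) -> float:
    """Approximate number of complete turns of the helix."""
    return base_pairs / BP_PER_TURN


chromosome_bp = 150_000_000  # hypothetical round number for a human chromosome
print(f"{contour_length_cm(chromosome_bp):.1f} cm")   # about 5.1 cm
print(f"{helical_turns(chromosome_bp):,.0f} turns")   # 15,000,000 turns
```

A molecule a few centimetres long but only 2 nanometres wide has to fit inside a microscopic nucleus, which is why the packaging role of histones matters.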

The B form of the DNA helix twists 360° per 10.6 bp in the absence of strain. But many molecular biological processes can induce strain. A DNA segment with excess or insufficient helical twisting is referred to, respectively, as positively or negatively "supercoiled". DNA in vivo is typically negatively supercoiled, which facilitates the unwinding of the double-helix required for RNA transcription.

The two other known double-helical forms of DNA, called A and Z, differ modestly in their geometry and dimensions. The A form appears likely to occur only in dehydrated samples of DNA, such as those used in crystallography experiments, and possibly in hybrid pairings of DNA and RNA strands. Segments of DNA that cells have methylated for regulatory purposes may adopt the Z geometry, in which the strands turn about the helical axis like a mirror image of the B form.

DNA sequence reading

The asymmetric shape and linkage of nucleotides means that a DNA strand always has a discernible orientation or directionality. Because of this directionality, close inspection of a double helix reveals that nucleotides are heading one way along one strand (the "ascending strand"), and the other way along the other strand (the "descending strand"). This arrangement of the strands is called antiparallel.

All the genes and intervening DNA together make up the genome of an organism, which in many species is divided among several chromosomes and typically present in two or more copies. The location (or locus) of a gene and the chromosome on which it is situated is in a sense arbitrary. Genes that appear together on the chromosomes of one species, such as humans, may appear on separate chromosomes in another species, such as mice. Two genes positioned near one another on a chromosome may encode proteins that figure in the same cellular process or in completely unrelated processes. As an example of the former, many of the genes involved in spermatogenesis reside together on the Y chromosome.

For reasons of chemical nomenclature, people who work with DNA refer to the asymmetric termini of each strand as the 5' and 3' ends (pronounced "five prime" and "three prime"). DNA workers and enzymes alike always read nucleotide sequences in the "5' to 3' direction". In a vertically oriented double helix, the 3' strand is said to be ascending while the 5' strand is said to be descending.

As a result of their antiparallel arrangement and the sequence-reading preferences of enzymes, even if both strands carried identical instead of complementary sequences, cells could properly translate only one of them. The other strand a cell can only read backwards. Molecular biologists call a sequence "sense" if it is translated or translatable, and they call its complement "antisense". It follows then, somewhat paradoxically, that the template for transcription is the antisense strand. The resulting transcript is an RNA replica of the sense strand and is itself sense.
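The relationship between the sense strand, the antisense strand and the transcript can be traced with a short sketch. All sequences are written 5' to 3'; the function names and the six-base example are hypothetical, and the only RNA-specific detail assumed is that RNA uses uracil (U) where DNA has thymine (T).

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Partner strand, written 5' to 3' (hence reversed, since strands are antiparallel)."""
    return "".join(PAIR[base] for base in reversed(strand))


def transcribe(antisense: str) -> str:
    """RNA built on the antisense template: complementary to it, with U in place of T."""
    return reverse_complement(antisense).replace("T", "U")


sense = "ATGGCT"
antisense = reverse_complement(sense)

print(antisense)              # AGCCAT
print(transcribe(antisense))  # AUGGCU, an RNA replica of the sense strand
```

Transcribing the antisense template gives back the sense sequence in RNA form, which is the somewhat paradoxical point made above.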

Some viruses blur the distinction between sense and antisense, because certain sequences of their genomes do double duty, encoding one protein when read 5' to 3' along one strand, and a second protein when read in the opposite direction along the other strand. As a result, the genomes of these viruses are unusually compact for the number of genes they contain, which biologists view as an adaptation.

Topologists like to note that the juxtaposition of the 3' end of one DNA strand beside the 5' end of the other at both termini of a double-helical segment makes the arrangement a "crab canon".

Single-stranded DNA (ssDNA) and repair of mutations

In some viruses DNA appears in a non-helical, single-stranded form. Because many of the DNA repair mechanisms of cells work only on paired bases, viruses that carry single-stranded DNA genomes mutate more frequently than they would otherwise. As a result, such species may adapt more rapidly to avoid extinction. The result would not be so favorable in more complicated and more slowly replicating organisms, however, which may explain why only viruses carry single-stranded DNA. These viruses presumably also benefit from the lower cost of replicating one strand versus two.

The discovery of DNA and the double helix

Working in the 19th century, biochemists initially isolated DNA and RNA (mixed together) from cell nuclei. They were relatively quick to appreciate the polymeric nature of their "nucleic acid" isolates, but realized only later that nucleotides were of two types--one containing ribose and the other deoxyribose. It was this subsequent discovery that led to the identification and naming of DNA as a substance distinct from RNA.

Friedrich Miescher (1844-1895) discovered a substance he called "nuclein" in 1869. Somewhat later, he isolated a pure sample of the material now known as DNA from the sperm of salmon, and in 1889 his pupil, Richard Altmann, named it "nucleic acid". This substance was found to exist only in the chromosomes.

Max Delbrück, Nikolai V. Timofeeff-Ressovsky, and Karl G. Zimmer published results in 1935 suggesting that chromosomes are very large molecules the structure of which can be changed by treatment with X-rays, and that by so changing their structure it was possible to change the heritable characteristics governed by those chromosomes. (Delbrück and Salvador Luria were awarded the Nobel Prize in 1969 for their work on the genetic structure of viruses.) In 1943, Oswald Theodore Avery discovered that traits proper to the "smooth" form of the Pneumococcus could be transferred to the "rough" form of the same bacteria merely by making the killed "smooth" (S) form available to the live "rough" (R) form. Quite unexpectedly, the living R Pneumococcus bacteria were transformed into a new strain of the S form, and the transferred S characteristics turned out to be heritable.

In 1944, the renowned physicist, Erwin Schrödinger, published a brief book entitled What is Life?, where he maintained that chromosomes contained what he called the "hereditary code-script" of life. He added: "But the term code-script is, of course, too narrow. The chromosome structures are at the same time instrumental in bringing about the development they foreshadow. They are law-code and executive power -- or, to use another simile, they are architect's plan and builder's craft -- in one." He conceived of these dual functional elements as being woven into the molecular structure of chromosomes. By understanding the exact molecular structure of the chromosomes one could hope to understand both the "architect's plan" and also how that plan was carried out through the "builder's craft." Francis Crick, James D. Watson, Maurice Wilkins, Rosalind Franklin, Seymour Benzer, et al., took up the physicist's challenge to work out the structure of the chromosomes and the question of how the segments of the chromosomes that were conceived to relate to specific traits could possibly do their jobs.

Just how the presence of specific features in the molecular structure of chromosomes could produce traits and behaviors in living organisms was unimaginable at the time. Because chemical dissection of DNA samples always yielded the same four nucleotides, the chemical composition of DNA appeared simple, perhaps even uniform. Organisms, on the other hand, are fantastically complex individually and widely diverse collectively. Geneticists did not speak of genes as conveyors of "information" in such words, but if they had, they would not have hesitated to quantify the amount of information that genes need to convey as vast. The idea that information might reside in a chemical in the same way that it exists in text--as a finite alphabet of letters arranged in a sequence of unlimited length--had not yet been conceived. It would emerge upon the discovery of DNA's structure, but few researchers imagined that DNA's structure had much to say about genetics.

Many species carry more than one copy of their genome within each of their somatic cells. These organisms are called diploid if they have two copies, or polyploid if they have more than two copies. In such organisms, the copies are practically never identical. With respect to each gene, the copies that an individual possesses are liable to be distinct alleles, which may act synergistically or antagonistically to generate a trait or phenotype. The ways that gene copies interact are explained by chemical dominance relationships (more at genetics, allele).

Expression of molecular genes

For various reasons, the relationship between a DNA strand and a phenotype trait is not direct. The same DNA strand in two different individuals may result in different traits because of the effect of other DNA strands or the environment.

In the 1950s, only a few groups made it their goal to determine the structure of DNA. These included an American group led by Linus Pauling, and two groups in Britain. At the University of Cambridge, Crick and Watson were building physical models using metal rods and balls, in which they incorporated the known chemical structures of the nucleotides, as well as the known position of the linkages joining one nucleotide to the next along the polymer. At King's College, London, Maurice Wilkins and Rosalind Franklin were examining x-ray diffraction patterns of DNA fibers.

A key inspiration in the work of all of these teams was the discovery in 1948 by Pauling that many proteins included helical (see alpha helix) shapes. Pauling had deduced this structure from x-ray patterns. Even in the initial crude diffraction data from DNA, it was evident that the structure involved helices. But this insight was only a beginning. There remained the questions of how many strands came together, whether this number was the same for every helix, whether the bases pointed toward the helical axis or away, and ultimately what were the explicit angles and coordinates of all the bonds and atoms. Such questions motivated the modeling efforts of Watson and Crick.

In their modeling, Watson and Crick restricted themselves to what they saw as chemically and biologically reasonable. Still, the breadth of possibilities was very wide. A breakthrough occurred in 1952, when Erwin Chargaff visited Cambridge and inspired Crick with a description of experiments Chargaff had published in 1947. Chargaff had observed that the proportions of the four nucleotides vary between one DNA sample and the next, but that for particular pairs of nucleotides -- adenine and thymine, guanine and cytosine -- the two nucleotides are always present in equal proportions.

Watson and Crick had begun to contemplate double helical arrangements, and they saw that by reversing the directionality of one strand with respect to the other, they could provide an explanation for Chargaff's puzzling finding. This explanation was the complementary pairing of the bases, which also had the effect of ensuring that the distance between the phosphate chains did not vary along a sequence. Watson and Crick were able to discern that this distance was constant and to measure its exact value of 2 nanometres from an X-ray pattern obtained by Franklin. The same pattern also gave them the 3.4 nanometre-per-10 bp "pitch" of the helix. The pair quickly converged upon a model, which they announced before Franklin herself published any of her work.

The great assistance Watson and Crick derived from Franklin's data has become a subject of controversy, and it has angered people who believe Franklin has not received the credit due to her. The most controversial aspect is that Franklin's critical X-ray pattern was shown to Watson and Crick without Franklin's knowledge or permission. Wilkins showed it to them at his lab while Franklin was away.

Watson and Crick's model attracted great interest immediately upon its presentation. Arriving at their conclusion on February 21, 1953, Watson and Crick made their first announcement on February 28. Their paper 'A Structure for Deoxyribose Nucleic Acid' was published on April 25. In an influential presentation in 1957, Crick laid out the "Central Dogma", which foretold the relationship between DNA, RNA, and proteins, and articulated the "sequence hypothesis." A critical confirmation of the replication mechanism that was implied by the double-helical structure followed in 1958 in the form of the Meselson-Stahl experiment. Work by Crick and coworkers deciphered the genetic code not long afterward. These findings represent the birth of molecular biology.

Watson, Crick, and Wilkins were awarded the 1962 Nobel Prize in Physiology or Medicine for discovering the molecular structure of DNA, by which time Franklin had died. Nobel prizes are not awarded posthumously.

A DNA sequence (sometimes genetic sequence) is a succession of letters representing the primary structure of a real or hypothetical DNA molecule or strand. The possible letters are A, C, G, and T, representing the four nucleotide subunits of a DNA strand (adenine, cytosine, guanine, thymine), and typically these are printed abutting one another without gaps, as in the sequence AAAGTCTGAC. This coded sequence is sometimes referred to as genetic information. A succession of any number of nucleotides greater than four is liable to be called a sequence. With regard to its biological function, which may depend on context, a sequence may be sense or anti-sense (see DNA), and either coding or noncoding. DNA sequences can also contain "junk DNA".

When a sequence motif appears in the exon of a gene, it may encode the "structural motif" of a protein; that is, a stereotypical element of the overall structure of the protein. Nevertheless, motifs need not be associated with a distinctive secondary structure. "Noncoding" sequences are not translated into proteins, and nucleic acids with such motifs need not deviate from the typical shape (e.g. the "B-form" DNA double helix).

Outside of gene exons, there exist regulatory sequence motifs and motifs within the "junk," such as satellite DNA. Some of these are believed to affect the shape of nucleic acids (see for example RNA self-splicing), but this is only sometimes the case. For example, many DNA binding proteins that have affinities for specific motifs only bind DNA in its double-helical form. They are able to recognize motifs through contact with the double helix's major or minor groove.

Short coding motifs, which appear to lack secondary structure, include those that label proteins for delivery to particular parts of a cell, or mark them for phosphorylation.

Within a sequence or database of sequences, researchers search and find motifs using computer-based techniques of sequence analysis, such as BLAST. Such techniques belong to the discipline of bioinformatics.

See also: consensus sequence.

Motif bioinformatics Consider the N-glycosylation site motif mentioned above:

Asn, followed by anything but Pro, followed by either Ser or Thr, followed by anything but Pro This pattern may be written as N{P}[ST]{P} where N=Asn, P=Pro, S=Ser, T=Thr; {X} means any amino acid except X; and [XY] means either X or Y.

The notation [XY] does not give any indication of the probability of X or Y occurring in the pattern. Sometimes patterns are defined in terms of a probabilistic model such as a hidden Markov model.
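The pattern notation described above maps closely onto regular expressions, so a short sketch may make it concrete. The function names and the test peptide below are hypothetical, chosen only for illustration; the translation covers just the tokens explained in the text (a plain letter, [XY], {X}) plus x for "any amino acid".

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate the simple pattern notation into a Python regular expression.

    Supported tokens: a single letter, [XY] (either X or Y),
    {X} (anything but X), and x (any amino acid).
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "[":
            j = pattern.index("]", i)
            out.append("[" + pattern[i + 1:j] + "]")
            i = j + 1
        elif ch == "{":
            j = pattern.index("}", i)
            out.append("[^" + pattern[i + 1:j] + "]")
            i = j + 1
        elif ch == "x":
            out.append(".")
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def find_sites(pattern: str, protein: str) -> list:
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead lets finditer report overlapping occurrences.
    regex = re.compile("(?=" + prosite_to_regex(pattern) + ")")
    return [m.start() for m in regex.finditer(protein)]


# Hypothetical peptide: NGSA matches, NPSA does not (Pro follows Asn).
print(prosite_to_regex("N{P}[ST]{P}"))
print(find_sites("N{P}[ST]{P}", "AANGSAANPSA"))
```

Like the notation itself, this reports only whether a position matches; it says nothing about how likely each alternative is.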

Motifs and consensus sequences The notation [XYZ] means X or Y or Z, but does not indicate the likelihood of any particular match. For this reason, two or more patterns are often associated with a single motif: the defining pattern, and various typical patterns.

For example, the defining sequence for the IQ motif may be taken to be:

[FILV]Qxxx[RK]Gxxx[RK]xx[FILVWY] where x signifies any amino acid, and the square brackets indicate an alternative (see below for further details about notation).

Usually, however, the first letter is I, and both [RK] choices resolve to R. Since the last choice is so wide, the pattern IQxxxRGxxxR is sometimes equated with the IQ motif itself, but a more accurate description would be a consensus sequence for the IQ motif.
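The difference between the defining pattern and the consensus sequence can be sketched with two regular expressions. The peptides used here are invented test strings, not real protein sequences, and the function name is hypothetical.

```python
import re

# Defining pattern from the text: [FILV]Qxxx[RK]Gxxx[RK]xx[FILVWY]
IQ_DEFINING = re.compile(r"[FILV]Q.{3}[RK]G.{3}[RK].{2}[FILVWY]")
# Consensus sequence from the text: IQxxxRGxxxR
IQ_CONSENSUS = re.compile(r"IQ.{3}RG.{3}R")


def classify(peptide: str) -> str:
    """Classify the first IQ motif found in a peptide, if any."""
    hit = IQ_DEFINING.search(peptide)
    if hit is None:
        return "no IQ motif"
    # Test the consensus at the same position as the defining match.
    if IQ_CONSENSUS.match(peptide, hit.start()):
        return "IQ motif, matches consensus"
    return "IQ motif, departs from consensus"


for peptide in ("IQAAARGAAARAAL", "LQAAAKGAAAKAAW", "AAAAAA"):
    print(peptide, "->", classify(peptide))
```

Every sequence that fits the consensus at a motif site also fits the defining pattern's first eleven positions, but the reverse does not hold, which is why the two are kept as separate patterns.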

The DNA strand is expressed into a trait only if it is transcribed to RNA. Because transcription starts from a specific base-pair sequence (a promoter) and stops at another (a terminator), our DNA strand needs to be correctly placed between the two. If not, it is considered junk DNA, and is not expressed. Cells regulate the activity of genes in part by increasing or decreasing their rate of transcription. Over the short term, this regulation occurs through the binding or unbinding of proteins, known as transcription factors, to specific non-coding DNA sequences called regulatory elements. So, to be expressed, our DNA strand needs to be properly regulated by other DNA strands. The DNA strand may also be silenced through DNA methylation or by chemical changes to the protein components of chromosomes (see histone). This is a permanent form of regulation of transcription. The RNA is often edited before its translation into a protein. Eukaryotic cells splice the transcripts of a gene by keeping the exons and removing the introns. So, the DNA strand needs to be in an exon to be expressed. Because of the complexity of the splicing process, one transcribed RNA may be spliced in alternate ways to produce not one but a variety of proteins from one pre-mRNA (alternative splicing). Prokaryotes produce a similar effect by shifting reading frames during translation. The translation of RNA into a protein also starts and stops at specific start and stop sequences. Once produced, the protein interacts with the many other proteins in the cell, according to the cell metabolism. This interaction finally produces the trait.
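The splicing step described above (keep the exons, drop the introns, optionally skip an exon) can be illustrated with a toy sketch. The transcript, the exon coordinates, and the function name are all made up for illustration; real splicing is directed by sequence signals and cellular machinery, not by a coordinate list.

```python
from typing import Optional, Sequence, Tuple

# Toy pre-mRNA: exons in upper case, introns in lower case (hypothetical).
PRE_MRNA = "AUGGCC" + "guaag" + "UUUAAA" + "guccag" + "CCCUGA"
# Exon coordinates, 0-based and end-exclusive (assumed known in advance).
EXONS = [(0, 6), (11, 17), (23, 29)]


def splice(pre_mrna: str,
           exons: Sequence[Tuple[int, int]],
           keep: Optional[Sequence[int]] = None) -> str:
    """Join the selected exons in order and discard everything else.

    keep lists the indices of the exons to retain; None retains all of them.
    """
    chosen = range(len(exons)) if keep is None else keep
    return "".join(pre_mrna[exons[i][0]:exons[i][1]] for i in chosen)


print(splice(PRE_MRNA, EXONS))          # all three exons joined
print(splice(PRE_MRNA, EXONS, [0, 2]))  # alternative form skipping exon 2
```

The second call shows how a single pre-mRNA can yield more than one mature transcript, which is the point of alternative splicing.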

Software There are software programs which, given multiple input sequences, attempt to identify one or more candidate motifs. One example is MEME, which generates statistical information for each candidate.

Discovery through evolutionary conservation Motifs have been discovered by studying similar genes in different species. For example, by aligning the amino acid sequences specified by the GCM (glial cells missing) gene in man, mouse and D. melanogaster, Akiyama and others discovered a pattern which they called the GCM motif. It spans about 150 amino acid residues, and begins as follows:

WDIND*.*P..*...D.F.*W***.**.IYS**...A.*H*S*WAMRNTNNHN Here each . signifies a single amino acid or a gap, and each * indicates one member of a closely-related family of amino acids.

The authors were able to show that the motif has DNA binding activity.

In genetics, transcription is the process of copying DNA to RNA by an enzyme called RNA polymerase (RNAP).

Ribonucleic acid (RNA) is a nucleic acid consisting of a string of covalently-bound nucleotides. It is biochemically distinguished from DNA by the presence of an additional hydroxyl group, attached to each pentose ring; as well as by the use of uracil, instead of thymine. RNA transmits genetic information from DNA (via transcription) into proteins (by translation).

RNA has four different bases: adenine, guanine, cytosine, and uracil. The first three are the same as those found in DNA, but uracil replaces thymine as the base complementary to adenine. This may be because uracil is energetically less expensive to produce, although cytosine easily degrades into uracil (by deamination), which makes such damage hard to recognize. Thus, uracil is appropriate for RNA, where quantity is important but lifespan is not, whereas thymine is appropriate for DNA.

Comparison to DNA Structurally, RNA is indistinguishable from DNA except for the critical presence of a hydroxyl group attached to the pentose ring in the 2' position (DNA has a hydrogen atom rather than a hydroxyl group). This hydroxyl gives the molecule far greater catalytic versatility and allows it to perform reactions that DNA is incapable of performing; but at the same time, it makes RNA sensitive to alkaline hydrolysis, to which DNA is not.

The other major difference between RNA and DNA is that RNA is almost exclusively found in the single-stranded form (an exception being the genetic material of some kinds of viruses). RNA molecules often fold into more complex structures by making use of complementary internal sequences; that is, one part of a single RNA molecule is the nucleic acid complement of another part of the same molecule (for example, 5'-ACUCGA-3' and 5'-UCGAGU-3'), so that the two strands bind together. This allows the formation of hairpin loops, coils, etc., which then direct the formation of higher-order structures.
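The complementarity in the example above can be checked mechanically: two stretches written 5' to 3' can pair when one is the reverse complement of the other. The following is a minimal sketch with hypothetical function names; it considers only the standard A-U and G-C pairs and ignores other pairings that occur in real RNA folding.

```python
# Standard Watson-Crick complements for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def can_pair(a: str, b: str) -> bool:
    """True if b (5'->3') is the exact reverse complement of a (5'->3')."""
    return reverse_complement(a) == b


print(reverse_complement("ACUCGA"))
print(can_pair("ACUCGA", "UCGAGU"))
```

The reversal matters because paired strands run in opposite directions, so the 3' end of one stretch sits opposite the 5' end of the other.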

Synthesis RNA is made by an enzyme, RNA polymerase, using DNA as a template. The synthesis begins when the enzyme binds special promoter regions in the DNA. The DNA double helix is unwound by the helicase activity of the enzyme. RNA is then synthesised so that it is complementary to one of the strands in the DNA. A small stretch of DNA-RNA hybrid is present at the active site of the enzyme. The synthesis continues until a termination sequence is reached. The resulting RNA molecule is the primary transcript. Modulation of the rate of initiation of RNA synthesis is one of the most important ways in which gene expression is regulated.
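As a rough illustration of "complementary to one of the strands", the sketch below builds the RNA that pairs base by base with a DNA template. It assumes the template is written 3' to 5', so that the RNA comes out 5' to 3'; promoters, terminators, and the enzyme itself are not modelled, and the function name and example template are hypothetical.

```python
# Each DNA template base and the RNA base that pairs with it.
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') complementary to a DNA template given 3'->5'."""
    return template_3to5.translate(DNA_TO_RNA)


print(transcribe("TACGGT"))
```

Note that adenine in the template is paired with uracil rather than thymine, which is the only difference from copying a DNA strand.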

RNA world hypothesis The RNA world hypothesis proposes that the universal ancestor to all life relied on RNA both to carry genetic information like DNA and to catalyze biochemical reactions like an enzyme. In effect, RNA was, before the emergence of the first cell, the dominant, and probably the only, form of life. This hypothesis is inspired by the fact that retroviruses use RNA as their sole genetic material, while peptide bond formation in the ribosome is carried out by an RNA-derived ribozyme. From this perspective, retroviruses and ribozymes are remnants, or molecular fossils, left over from that RNA world. Assuming that DNA is better suited for storage of genetic information and proteins are better suited for the catalytic needs of cells, one would expect reduced use of RNA in cells, and greater use of DNA and proteins.

Messenger RNA (mRNA) Main article: Messenger RNA

Messenger RNA is RNA that carries information from DNA to the ribosome sites of protein synthesis in the cell. Once mRNA has been transcribed from DNA, it is exported from the nucleus into the cytoplasm (in eukaryotes mRNA is "processed" before being exported), where it is bound to ribosomes and translated into protein. After a certain amount of time the message degrades into its component nucleotides, usually with the assistance of RNases.

Non-coding RNA or "RNA genes" Main article: Non-coding RNA

RNA genes (sometimes referred to as non-coding RNA or small RNA) are genes that encode RNA that is not translated into a protein. The most prominent examples of RNA genes are transfer RNA (tRNA) and ribosomal RNA (rRNA), both of which are involved in the process of translation. However, since the late 1990s, many new RNA genes have been found, and thus RNA genes may play a much more significant role than previously thought.

Double-stranded RNA Double-stranded RNA (or dsRNA) is RNA with two complementary strands, similar to the DNA found in all "higher" cells. dsRNA forms the genetic material of some viruses. In eukaryotes, it may play a role in the process of RNA interference and in microRNAs.

In biology, mitosis is the process of chromosome segregation and nuclear division that follows replication of the genetic material in eukaryotic cells. This process assures that each daughter nucleus receives a complete copy of the organism's genome. In most eukaryotes mitosis is accompanied by cell division or cytokinesis, but there are many exceptions, for instance among the fungi. There is another process called meiosis, in which the daughter nuclei receive half the chromosomes of the parent, which is involved in gamete formation and other similar processes.

Mitosis is divided into several stages, with the remainder of the cell's growth cycle considered interphase. Properly speaking, a typical cell cycle involves a series of stages: G1, the first growth phase; S, where the genetic material is duplicated; G2, the second growth phase; and M, where the nucleus divides through mitosis. Mitosis is divided into prophase, prometaphase, metaphase, anaphase, and telophase.

Plasmids are (typically) circular double-stranded DNA molecules that are separate from the chromosomal DNA (Fig. 1). They usually occur in bacteria, sometimes in eukaryotic organisms (e.g., the 2-micrometre-ring in Saccharomyces cerevisiae). Their size varies from 1 to over 400 kilobase pairs (kbp). A single cell may hold anywhere from one copy, for large plasmids, to hundreds of copies of the same plasmid.

This complex process helps explain the different meanings of "gene": a nucleotide sequence in a DNA strand; or the transcribed RNA, prior to splicing; or the transcribed RNA after splicing, i.e. without the introns. The latter meaning of "gene" describes more of a "material entity" than the first one. Mutations and evolution Just as there are many factors influencing the expression of a particular DNA strand, there are many ways to have genetic mutations. For example, natural variations within regulatory sequences appear to underlie many of the heritable characteristics seen in organisms. The influence of such variations on the trajectory of evolution through natural selection may be as large as or larger than variation in sequences that encode proteins. Thus, though regulatory elements are often distinguished from genes in molecular biology, in effect they satisfy the shared and historical sense of the word. Indeed, a breeder or geneticist, in following the inheritance pattern of a trait, has no immediate way to know whether this pattern arises from coding sequences or regulatory sequences. Typically, he or she will simply attribute it to variations within a gene.

Plasmids often contain genes or gene cassettes that confer a selective advantage on the bacterium harboring them, e.g., antibiotic resistance. Every plasmid contains at least one DNA sequence that serves as an origin of replication or ori (a starting point for DNA replication), which enables the plasmid DNA to be duplicated independently from the chromosomal DNA.

Plasmids that exist only as a single copy in each bacterium are, upon cell division, in danger of being lost in one of the segregating bacteria. Such single-copy plasmids have systems which attempt to actively distribute a copy to both daughter cells.

Some plasmids include an addiction system. They produce both a long-lived poison and its short-lived antidote. The cell that keeps a copy of the plasmid will survive, while the cell without the plasmid will die because it soon runs out of antidote. This is an example of plasmids as selfish DNA.

Applications of plasmids Plasmids serve as important tools in genetics and biochemistry labs, where they are commonly used to multiply (make many copies of) or express particular genes. Many plasmids are commercially available for such uses. Initially, the gene to be replicated is inserted into a plasmid. These plasmids contain, in addition to the inserted gene, one or more genes capable of providing antibiotic resistance to the bacteria that harbor them. The plasmids are next inserted into bacteria by a process called transformation, and the bacteria are then grown on specific antibiotic(s). Bacteria which took up one or more copies of the plasmid then express (make protein from) the gene that confers antibiotic resistance. This is typically a protein which can break down the antibiotic that would otherwise kill the cell. As a result, only the bacteria with antibiotic resistance can survive, and these are the very same bacteria that contain the genes to be replicated. The antibiotic(s) will, however, kill those bacteria that did not receive a plasmid, because they have no antibiotic resistance genes. In this way the antibiotic(s) act as a filter, selecting only the modified bacteria. Now these bacteria can be grown in large amounts, harvested and lysed to isolate the plasmid of interest.

Another major use of plasmids is to make large amounts of proteins. In this case, bacteria containing a plasmid that harbors the gene of interest are grown. Just as the bacteria produce proteins to confer antibiotic resistance, they can also be induced to produce large amounts of protein from the inserted gene. This is a cheap and easy way of mass-producing a gene or the protein it codes for, for example insulin or even antibiotics.

A bacterial artificial chromosome (BAC) is a DNA construct, based on a fertility plasmid, used for transforming and cloning in bacteria, usually E. coli. Its usual insert size is 150 kbp, with a range from 100 to 300 kbp.

BACs are often used in sequencing the genomes of other organisms in genome projects, for example the Human Genome Project. A short piece of the organism's DNA is amplified as an insert in BACs, then sequenced. Finally, the sequenced parts are assembled in silico, resulting in the genome sequence of the organism.

In genetics terminology, sequencing is most often restricted to determining the nucleotides of a DNA or RNA strand. Currently, most such sequencing is performed using the chain termination method; however, this can only be used to identify fairly short sequences (around 300-1000 base pairs on an ABI machine), and must therefore be used as the basis for more complex techniques, such as chromosome walking and shotgun sequencing.

The genetic code is a set of rules that maps DNA sequences to proteins in the living cell and is employed in the process of protein synthesis. Nearly all living things use the same genetic code, called the standard genetic code, although a few organisms use minor variations of the standard code.

The genetic information carried by an organism - its genome - is inscribed in a DNA molecule. Each functional portion of this molecule is referred to as a gene. Each gene is transcribed into a short template molecule of the related polymer RNA, which is better suited for protein synthesis. This in turn is translated, by mediation of a machinery consisting of ribosomes and a set of transfer RNAs and associated enzymes, into an amino acid chain (polypeptide).

The gene sequence inscribed in DNA, and thus in RNA, is composed of units called codons, each coding for an amino acid, hence the phrase genetic code. The polypeptide is ultimately folded into a 3-dimensional protein structure, which will go on to perform some specific function in the cell such as an enzyme subunit or cell membrane component. This chain of events involving RNA transcription, and polypeptide translation is referred to as gene expression. Some genes encode other elements such as ribosomal RNAs and transfer RNAs, both of which are involved in protein synthesis.

Both DNA and RNA are composed of four nucleotide bases. In DNA these are adenine (A), guanine (G), cytosine (C) and thymine (T). RNA is identical except that thymine (T) is replaced by uracil (U). Codons are non-overlapping groups of three bases. There are 4^3 = 64 codons. For example, the RNA sequence UUUAAACCC contains the codons UUU, AAA and CCC, each of which specifies one amino acid. So, this RNA sequence represents a protein sequence three amino acids long. (DNA is also a sequence of nucleotide bases, but there thymine takes the place of uracil.)
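The codon arithmetic above is easy to verify with a short sketch. The function and variable names are hypothetical, and the small lookup table covers only the three codons from the example (their assignments in the standard genetic code), not the full code.

```python
from itertools import product

BASES = "ACGU"

# Every ordered triple of the four RNA bases: 4 * 4 * 4 possibilities.
ALL_CODONS = ["".join(triple) for triple in product(BASES, repeat=3)]

# Only the three codons used in the example, per the standard genetic code.
EXAMPLE_CODE = {"UUU": "Phe", "AAA": "Lys", "CCC": "Pro"}


def codons(rna: str) -> list:
    """Split an RNA sequence into non-overlapping codons, dropping any
    incomplete trailing group of one or two bases."""
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


print(len(ALL_CODONS))
print(codons("UUUAAACCC"))
print([EXAMPLE_CODE[c] for c in codons("UUUAAACCC")])
```

Reading always starts from the first base here; a real reading frame is set by where translation begins.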

In classical genetics, the stop codons were given names: UAG was amber, UGA was opal, and UAA was ocher. These names were originally the names of the specific genes in which mutation of each of these stop codons was first detected. Translation starts with a chain initiation codon (start codon). But unlike stop codons, these are not sufficient to begin the process; nearby initiation sequences are also required to induce transcription into mRNA and binding by ribosomes. The most notable start codon is AUG, whi