Journal of Bacteriology, May 2004, p. 3254-3258, Vol. 186, No. 10
Genome-Wide Analysis of Lipoprotein Expression in Escherichia coli MG1655
Stephen J. Brokx,1,2 Michael Ellison,1 Troy Locke,1 Drell Bottorff,1 Laura Frost,3 and Joel H. Weiner1,2*
Project CyberCell, Department of Biochemistry,1 CIHR Membrane Protein Research Group,2 Department of Biological Sciences, University of Alberta, Edmonton, Alberta T6G 2H7, Canada3
Received 24 July 2003/Accepted 3 February 2004
To gain insight into the cell envelope of Escherichia coli grown under aerobic and anaerobic conditions, lipoproteins were examined by using functional genomics. The mRNA expression levels of each lipoprotein gene under three growth conditions (aerobic, anaerobic, and anaerobic with nitrate) were examined by using both Affymetrix GeneChip E. coli antisense genome arrays and real-time PCR (RT-PCR). Many genes showed significant changes in expression level. The RT-PCR results were in very good agreement with the microarray data. The results of this study represent the first insights into the possible roles of unknown lipoprotein genes and broaden our understanding of the composition of the cell envelope under different environmental conditions. Additionally, these data serve as a test set for the refinement of high-throughput bioinformatic and global gene expression methods.
Bacterial lipoproteins comprise a unique set of proteins modified at their amino-terminal cysteines by the addition of N-acyl and S-diacyl glyceryl groups (30). In Escherichia coli, this lipid serves to anchor these proteins to the inner or outer membrane so that they can function at the lipid-aqueous interface. These proteins can be identified by the presence of a leader with a common consensus sequence (5). The leader is typically between 15 and 40 amino acid residues in length and has at least one arginine or lysine in the first seven residues. The leader is cleaved by signal peptidase II on the amino-terminal side of the cysteine residue, which is then enzymatically modified (30).
The E. coli genome has previously been searched for potential lipoproteins. Various algorithms have been used for genome sequence analysis to identify potential lipoproteins, and these lipoproteins have been tabulated in databases on the World Wide Web (http://www.mrc-lmb.cam.ac.uk/genomes/dolop/, http://www.expasy.org/prosite, and http://www.projectcybercell.com); from these databases, we compiled a list of 96 lipoproteins. Fifty-six of these genes (58%) have completely unknown functions, a much higher fraction than that for the E. coli genome as a whole, in which approximately 25 to 30% of the genes have no known function. Thus, examining the expression of the lipoprotein genes under different growth conditions is a first step toward understanding the function and importance of many of the unknown genes.
Other putative lipoproteins exist in E. coli but were not part of the gene expression study. First, the murein transglycosylase MltE (Blattner no. b1163) is not in any current lipoprotein database but has been experimentally shown to be a lipoprotein (17). Second, yifL (Blattner no. b3808.1) was originally not annotated in the E. coli genome sequencing project, but YifL now appears in the Prosite database (http://www.expasy.org/prosite) as a putative lipoprotein. Also, very small lipoproteins such as the entericidins (EcnA and EcnB) (3) were omitted from this study because they are below the Affymetrix cutoff for open reading frame (ORF) inclusion (150 bp).
In the present study, we used this set of lipoprotein genes to begin analyzing the global changes in gene expression during aerobic and anaerobic growth with a view to understanding the changes in the composition of the cell envelope. We monitored the expression of lipoprotein mRNAs in E. coli MG1655 grown in glucose defined medium (21) either aerobically with shaking in an Erlenmeyer flask or anaerobically in a sealed screw-cap tube; 40 mM KNO3 was added to one set of anaerobic cultures as an alternative electron acceptor. RNA was then isolated from the cells with a MasterPure RNA purification kit (Epicentre Technologies, Madison, Wis.), and cDNA synthesis and labeling were done as described in the Affymetrix GeneChip E. coli Antisense Genome Array Technical Manual (1). Affymetrix GeneChip antisense E. coli genome arrays were used to analyze the complete E. coli transcriptome. Each microarray contained 295,000 probes. Each identified ORF was covered by 15 probe pairs, each consisting of a perfect match probe and a 1-nucleotide mismatch probe. If the perfect match probe showed an intensity that was 200 U higher than that of the mismatch probe, the probe pair was considered to be present. An ORF was considered to be present with 95% confidence if neighboring probe pairs within an ORF were present.
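The present call described above can be illustrated with a minimal sketch. The 200-U margin comes from the text; the function names, the reading of "200 U higher" as "at least 200 U higher," and the rule that two adjacent present probe pairs suffice for an ORF call are assumptions made for illustration only and are not the Affymetrix algorithm.

```python
from typing import Sequence, Tuple

# Margin from the text: perfect match (PM) must exceed mismatch (MM) by 200 U.
PRESENT_MARGIN = 200.0


def probe_pair_present(pm: float, mm: float, margin: float = PRESENT_MARGIN) -> bool:
    """Call a probe pair present when PM is at least `margin` units above MM.

    Treating exactly 200 U as present is an assumption; the text does not say.
    """
    return pm - mm >= margin


def orf_present(pairs: Sequence[Tuple[float, float]], min_run: int = 2) -> bool:
    """Call an ORF present when `min_run` neighboring probe pairs are present.

    `min_run` = 2 is a hypothetical stand-in for the "neighboring probe pairs"
    criterion; the statistical basis of the 95% confidence is not modeled here.
    """
    run = 0
    for pm, mm in pairs:
        run = run + 1 if probe_pair_present(pm, mm) else 0
        if run >= min_run:
            return True
    return False
```

For example, `orf_present([(500, 100), (600, 100), (100, 100)])` is true because the first two pairs are adjacent and both clear the margin, whereas the same present pairs separated by an absent pair would not satisfy the rule.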
Using this cutoff, we were able to group the lipoproteins into four classes, as listed in Table 1. Twenty-one lipoprotein genes were not expressed (not present in the array analysis) under any of the selected conditions. Ten were present under one growth condition, 5 were present under two conditions, and 60 were present under all three conditions. Sixty-four of the lipoprotein genes were expressed at detectable levels during aerobic growth, the standard experimental growth condition for E. coli. Not surprisingly, lpp, the gene for the major structural outer membrane murein lipoprotein (25), has the highest expression level of all the genes. Other well-known lipoprotein genes highly expressed under aerobic growth conditions include pal, the gene for peptidoglycan-associated lipoprotein (19), and cyoA, which encodes a subunit of the cytochrome O terminal oxidase, the major terminal oxidase of the aerobic respiratory chain (7).
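The four-class grouping can be sketched as a count of the growth conditions under which a gene is called present. The condition labels, function names, and the toy gene identifiers below are hypothetical; only the class sizes in the final check (21, 10, 5, and 60, summing to the 96 genes in the list) are taken from the text.

```python
from collections import Counter
from typing import Dict, Mapping

# Labels for the three growth conditions (names chosen for this sketch).
CONDITIONS = ("aerobic", "anaerobic", "anaerobic_nitrate")


def presence_class(calls: Mapping[str, bool]) -> int:
    """Return the number of conditions (0 to 3) in which a gene is present."""
    return sum(bool(calls[c]) for c in CONDITIONS)


def class_sizes(genes: Mapping[str, Mapping[str, bool]]) -> Dict[int, int]:
    """Tally how many genes fall into each presence class."""
    return dict(Counter(presence_class(calls) for calls in genes.values()))


# Toy input with placeholder gene names; these are not values from Table 1.
toy = {
    "geneA": {"aerobic": True, "anaerobic": True, "anaerobic_nitrate": True},
    "geneB": {"aerobic": False, "anaerobic": True, "anaerobic_nitrate": False},
    "geneC": {"aerobic": False, "anaerobic": False, "anaerobic_nitrate": False},
}
sizes = class_sizes(toy)

# Consistency check on the class sizes reported in the text.
reported_total = 21 + 10 + 5 + 60
```

Here `sizes` is `{3: 1, 1: 1, 0: 1}` for the toy input, and `reported_total` equals 96, the size of the compiled lipoprotein list.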
TABLE 1. E. coli lipoprotein genes and their expression in microarray and RT-PCR analyses
We then used real-time PCR to help better quantify the expression levels for the lipoprotein genes. First, reverse transcription (RT) was carried out with the same total RNA samples used for the microarray analysis and random hexamer primer (Invitrogen, Burlington, Ontario, Canada). RT was performed with SuperScript II (Invitrogen) for reactions with RT (+RT reactions). Control reactions were also performed under the same conditions except that SuperScript II was omitted (-RT reactions). Both types of reactions were used in real-time PCRs.
Primers for real-time PCR were designed with Primer Express 2.0 software from Applied Biosystems (ABI) (Foster City, Calif.). Forward and reverse primer pairs were designed for the 5' and 3' regions of each gene and purchased from Sigma Genosys (Oakville, Ontario, Canada). Real-time PCRs were carried out for each primer set with both the +RT and -RT reactions for each growth condition. The reaction buffer contained, in part, 1x ROX (glycine conjugate of 5-carboxy-X-rhodamine, succinimidyl ester) as the inert passive reference dye and SYBR Green I. The reaction mixtures were aliquoted into 384-well ABI reaction plates. The plates were then placed in an ABI Prism 7900HT RT-PCR machine under the following conditions: stage 1 consisted of 95°C for 45 s; stage 2 consisted of 40 cycles of 95°C for 15 s, followed by 60°C for 1 min; stage 3 consisted of 95°C for 15 s; stage 4 consisted of 60°C for 15 s; and for stage 5, the temperature was ramped to 95°C for 5 s. The RT-PCR data were analyzed with SDS 2.0 software (ABI). Each +RT-versus-(-RT) reaction set was compared against a standard curve generated for each primer set by using E. coli linear DNA as a standard. A cycle threshold value was chosen that gave a linear regression value greater than 0.996 for each primer set standard curve. The calculated quantity values for each +RT or -RT reaction were standardized within each individual primer set-generated standard curve.
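Standard-curve quantification of this kind can be sketched as a least-squares fit of cycle threshold (Ct) against log10 template quantity, followed by interpolation of unknowns. This is a generic illustration, not the SDS 2.0 procedure: the function names are hypothetical, the "linear regression value" is assumed here to be the coefficient of determination, and subtracting the quantity measured without reverse transcriptase from the quantity measured with it is an assumed way of using the control reactions.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(
    quantities: Sequence[float], cts: Sequence[float]
) -> Tuple[float, float, float]:
    """Fit Ct = slope * log10(quantity) + intercept; return slope, intercept, R^2."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in cts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def quantity_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate template quantity from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def net_quantity(ct_plus: float, ct_minus: float, slope: float, intercept: float) -> float:
    """Quantity with reverse transcriptase minus the no-enzyme control (assumed rule)."""
    plus = quantity_from_ct(ct_plus, slope, intercept)
    minus = quantity_from_ct(ct_minus, slope, intercept)
    return max(plus - minus, 0.0)


# Synthetic standards following Ct = 40 - 3 * log10(quantity); not measured data.
slope, intercept, r2 = fit_standard_curve([1e2, 1e3, 1e4, 1e5], [34.0, 31.0, 28.0, 25.0])
curve_ok = r2 > 0.996  # acceptance threshold quoted in the text
```

With the synthetic standards, the fit recovers a slope of about -3 and an intercept of about 40, so a Ct of 31 maps back to roughly 1,000 template units.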
The RT-PCR data correlate well with the microarray data in that highly expressed genes found in the microarray study also give high RT-PCR signals. However, the RT-PCR results are much more accurate and sensitive and give a wider dynamic range of values. Signal intensity ratios for anaerobic-versus-aerobic and anaerobic-plus-nitrate-versus-aerobic data sets were calculated for both microarray ("present" values only) and RT-PCR data and are compared in a scatter plot in Fig. 1. The anaerobic/aerobic ratios had very good correlation between RT-PCR and microarray data, with an R² value of 0.888; the nitrate/aerobic ratios had a slightly lower correlation (R² = 0.757).
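The comparison summarized by these correlation values can be sketched as follows: log2 ratios are computed for each gene from both platforms, and the squared Pearson correlation between the two sets of ratios is reported. The function names and the numbers in the example are hypothetical and are not data from this study; the squared Pearson coefficient is assumed to be the statistic behind the reported values.

```python
import math
from typing import List, Sequence


def log2_ratios(test: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Per-gene log2(test / reference), e.g. anaerobic versus aerobic signal."""
    return [math.log2(t / r) for t, r in zip(test, reference)]


def r_squared(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Squared Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


# Made-up signal intensities for three genes (illustrative only).
array_ratios = log2_ratios([200.0, 200.0, 100.0], [100.0, 200.0, 400.0])
pcr_ratios = log2_ratios([40.0, 10.0, 0.625], [10.0, 10.0, 10.0])
agreement = r_squared(array_ratios, pcr_ratios)
```

In this toy case the RT-PCR ratios are exactly twice the microarray ratios on the log2 scale, so the two series are perfectly correlated even though RT-PCR spans the wider range.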
FIG. 1. Scatter plot comparative analysis of microarray and RT-PCR data. The signal intensities of 61 genes which were designated "present" under all three growth conditions in the microarray data set were compared to the signal intensities generated by RT-PCR. Each datum point represents the log2 ratio of the signal intensity determined for one growth condition to the signal intensity determined for another growth condition, as determined by both RT-PCR and microarray analyses. (a) Ratios for anaerobically (An) versus aerobically (Ae) grown cells; (b) ratios for anaerobic-plus-nitrate (NO3) versus aerobic cells.
A further examination of gene expression patterns based on the RT-PCR data was then undertaken in order to gain some insight into possible functions of unknown lipoproteins. Growth under anaerobic conditions results in significant changes in the expression of many of the lipoprotein genes relative to aerobic expression. The expression of key structural protein genes such as lpp and pal remained fairly constant under these conditions, which was expected given their "housekeeping" role. However, transcripts of 14 genes are induced twofold or more under anaerobic growth. The gene with the strongest anaerobic induction is slp (Table 1). This gene is known to be induced under conditions of starvation or in the stationary phase (2), so perhaps it is not surprising that it is also induced during slow anaerobic growth. Other genes with strong anaerobic induction include ybgE, which appears to be cotranscribed with the cydAB cytochrome D terminal oxidase (also strongly induced anaerobically [data not shown]), and osmB, a lipoprotein gene which is also induced by high osmotic strength and in the stationary phase (15, 16). Only five genes had twofold or greater reductions of signal intensity under anaerobic conditions relative to that under aerobic conditions. These genes included cusC/ibeB, which is induced by high concentrations of copper ions (20) and appears to be important for virulence and invasion across the blood-brain barrier for other E. coli strains (12, 13), and spr, which encodes a putative penicillin binding protein (11).
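The twofold criterion used here can be written as a simple threshold on the ratio of signals between two growth conditions. The sketch below is illustrative: the function name, the category labels, and the inclusion of exactly twofold changes are assumptions, and the example values are not measurements from this study.

```python
def classify_fold_change(test: float, reference: float, threshold: float = 2.0) -> str:
    """Label a gene by the ratio of its signal in a test versus a reference condition.

    A ratio of at least `threshold` is "induced", a ratio of at most
    1 / `threshold` is "repressed", and anything in between is "unchanged".
    """
    ratio = test / reference
    if ratio >= threshold:
        return "induced"
    if ratio <= 1.0 / threshold:
        return "repressed"
    return "unchanged"


# Illustrative signals (anaerobic, aerobic); not values from Table 1.
examples = [
    classify_fold_change(200.0, 100.0),
    classify_fold_change(50.0, 100.0),
    classify_fold_change(150.0, 100.0),
]
```

The same function applies to the comparison of anaerobic cultures with and without nitrate by passing the nitrate signal as `test` and the nitrate-free anaerobic signal as `reference`.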
The addition of nitrate to an anaerobic culture as an alternative electron acceptor also influences the expression of several of the lipoprotein genes. It is known that the addition of nitrate to anaerobic cultures regulates the expression of many genes, especially those involved in alternative electron transport pathways (26). When grown anaerobically with nitrate, the RT-PCR signals for 13 genes decreased twofold or more and those for 4 genes increased twofold or more relative to those under anaerobic conditions without nitrate. Anaerobically induced genes such as slp and yhiU are repressed with the addition of nitrate. Two of the four lipoproteins induced by the addition of nitrate to an anaerobic culture, albeit with low overall signals, are ymcA and ymcC, which form part of a putative ymcCBA operon. Another gene induced by nitrate is osmE, a putative lipoprotein gene which is induced by high osmotic strength (10).
Microarrays have fast become a commonly used tool to examine global expression profiles for many bacterial species (see reference 6 for a recent review). Many of these microarray studies have gone one step further by selecting a subset of genes whose expression is determined by RT-PCR and then compared to the microarray data (4, 8, 9, 18, 22-24, 27-29). However, in these cases, the genes studied by RT-PCR are chosen as a subset of genes of interest that were originally identified by the microarray assay. The approach presented here is different; the gene set to be studied by RT-PCR was not chosen on the basis of the microarray results; instead, it was chosen based on known or predicted functions of the gene products. Even with this unbiased approach to selecting genes for RT-PCR analysis, the correlation between the RT-PCR data and the microarray data is very good. The more accurate and quantitative RT-PCR data were not used just for comparative purposes, however. These data were then used to identify potentially significant unknown lipoprotein genes with either high gene expression levels or significant changes in gene expression depending on growth conditions. This study has produced the first expression data reported for these unknown genes and may lead to more effective investigation of these genes in the future. With the usefulness of this approach established, it is now time to study by other means the other unknown lipoprotein genes showing either strong or varied expression levels.
ADDENDUM
As this paper was under review, other potential lipoproteins in E. coli came to our attention, especially those predicted in a recent related paper (14). These genes, rcsF (Blattner no. b0196), borD (b0557), ybjR (b0867), ycaL (b0909), hslJ (b1379), ynfC (b1585), smpA (b2617), yggG (b2936), yidQ (b3688), yidX (b3696), and yjeI (b4144), are included in the microarray results, but RT-PCR data were not generated.
This work was supported by the Natural Sciences and Engineering Research Council and by Project CyberCell.
J.H.W. is a Canada Research Chair on Membrane Biochemistry. We thank Philip Winter for data analysis.
* Corresponding author. Mailing address: Department of Biochemistry, University of Alberta, Edmonton, Alberta T6G 2H7, Canada. Phone: (780) 492-2761. Fax: (780) 492-0886. E-mail: joel.weiner{at}ualberta.ca.
REFERENCES
1. Affymetrix, Inc. 2002, posting date. GeneChip® E. coli antisense genome array technical manual. [Online.] Affymetrix, Inc., Santa Clara, Calif. http://www.affymetrix.com/support/technical/manuals.affx.
2. Alexander, D. M., and A. C. St John. 1994. Characterization of the carbon starvation-inducible and stationary phase-inducible gene slp encoding an outer membrane lipoprotein in Escherichia coli. Mol. Microbiol. 11:1059-1071.
3. Bishop, R. E., B. K. Leskiw, R. S. Hodges, C. M. Kay, and J. H. Weiner. 1998. The entericidin locus of Escherichia coli and its implications for programmed bacterial cell death. J. Mol. Biol. 280:583-596.
4. Boyce, J. D., I. Wilkie, M. Harper, M. L. Paustian, V. Kapur, and B. Adler. 2002. Genomic scale analysis of Pasteurella multocida gene expression during growth within the natural chicken host. Infect. Immun. 70:6871-6879.
5. Braun, V., and H. C. Wu. 1993. Lipoproteins, structure, function, biosynthesis and model for protein export, p. 319-342. In J.-M. Ghuysen and R. Hakenbeck (ed.), Bacterial cell wall, vol. 27. Elsevier Science, Amsterdam, The Netherlands.
6. Conway, T., and G. K. Schoolnik. 2003. Microarray expression profiling: capturing a genome-wide portrait of the transcriptome. Mol. Microbiol. 47:879-889.
7. Gennis, R. B., and V. Stewart. 1996. Respiration, p. 217-261. In F. C. Neidhardt, R. Curtiss, J. L. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger (ed.), Escherichia coli and Salmonella: cellular and molecular biology, 2nd ed., vol. 1. ASM Press, Washington, D.C.
8. Graham, M. R., L. M. Smoot, C. A. Migliaccio, K. Virtaneva, D. E. Sturdevant, S. F. Porcella, M. J. Federle, G. J. Adams, J. R. Scott, and J. M. Musser. 2002. Virulence control in group A Streptococcus by a two-component gene regulatory system: global expression profiling and in vivo infection modeling. Proc. Natl. Acad. Sci. USA 99:13855-13860.
9. Guckenberger, M., S. Kurz, C. Aepinus, S. Theiss, S. Haller, T. Leimbach, U. Panzner, J. Weber, H. Paul, A. Unkmeir, M. Frosch, and G. Dietrich. 2002. Analysis of the heat shock response of Neisseria meningitidis with cDNA- and oligonucleotide-based DNA microarrays. J. Bacteriol. 184:2546-2551.
10. Gutierrez, C., S. Gordia, and S. Bonnassie. 1995. Characterization of the osmotically inducible gene osmE of Escherichia coli K-12. Mol. Microbiol. 16:553-563.
11. Hara, H., N. Abe, M. Nakakouji, Y. Nishimura, and K. Horiuchi. 1996. Overproduction of penicillin-binding protein 7 suppresses thermosensitive growth defect at low osmolarity due to an spr mutation of Escherichia coli. Microb. Drug Resist. 2:63-72.
12. Huang, S. H., Y. H. Chen, Q. Fu, M. Stins, Y. Wang, C. Wass, and K. S. Kim. 1999. Identification and characterization of an Escherichia coli invasion gene locus, ibeB, required for penetration of brain microvascular endothelial cells. Infect. Immun. 67:2103-2109.
13. Huang, S. H., and A. Y. Jong. 2001. Cellular mechanisms of microbial proteins contributing to invasion of the blood-brain barrier. Cell. Microbiol. 3:277-287.
14. Juncker, A. S., H. Willenbrock, G. Von Heijne, S. Brunak, H. Nielsen, and A. Krogh. 2003. Prediction of lipoprotein signal peptides in gram-negative bacteria. Protein Sci. 12:1652-1662.
15. Jung, J. U., C. Gutierrez, F. Martin, M. Ardourel, and M. Villarejo. 1990. Transcription of osmB, a gene encoding an Escherichia coli lipoprotein, is regulated by dual signals. Osmotic stress and stationary phase. J. Biol. Chem. 265:10574-10581.
16. Jung, J. U., C. Gutierrez, and M. R. Villarejo. 1989. Sequence of an osmotically inducible lipoprotein gene. J. Bacteriol. 171:511-520.
17. Kraft, A. R., M. F. Templin, and J. V. Holtje. 1998. Membrane-bound lytic endotransglycosylase in Escherichia coli. J. Bacteriol. 180:3441-3447.
18. Lee, J. M., S. Zhang, S. Saha, S. Santa Anna, C. Jiang, and J. Perkins. 2001. RNA expression analysis using an antisense Bacillus subtilis genome array. J. Bacteriol. 183:7371-7380.
19. Mizuno, T. 1981. A novel peptidoglycan-associated lipoprotein (PAL) found in the outer membrane of Proteus mirabilis and other gram-negative bacteria. J. Biochem. (Tokyo) 89:1039-1049.
20. Munson, G. P., D. L. Lam, F. W. Outten, and T. V. O'Halloran. 2000. Identification of a copper-responsive two-component system on the chromosome of Escherichia coli K-12. J. Bacteriol. 182:5864-5871.
21. Neidhardt, F. C., P. L. Bloch, and D. F. Smith. 1974. Culture medium for enterobacteria. J. Bacteriol. 119:736-747.
22. Paustian, M. L., B. J. May, and V. Kapur. 2002. Transcriptional response of Pasteurella multocida to nutrient limitation. J. Bacteriol. 184:3734-3739.
23. Schembri, M. A., K. Kjaergaard, and P. Klemm. 2003. Global gene expression in Escherichia coli biofilms. Mol. Microbiol. 48:253-267.
24. Stintzi, A. 2003. Gene expression profile of Campylobacter jejuni in response to growth temperature variation. J. Bacteriol. 185:2009-2016.
25. Suzuki, H., Y. Nishimura, S. Yasuda, A. Nishimura, M. Yamada, and Y. Hirota. 1978. Murein-lipoprotein of Escherichia coli: a protein involved in the stabilization of bacterial cell envelope. Mol. Gen. Genet. 167:1-9.
26. Unden, G., and J. Bongaerts. 1997. Alternative respiratory pathways of Escherichia coli: energetics and transcriptional regulation in response to electron acceptors. Biochim. Biophys. Acta 1320:217-234.
27. Voyich, J. M., D. E. Sturdevant, K. R. Braughton, S. D. Kobayashi, B. Lei, K. Virtaneva, D. W. Dorward, J. M. Musser, and F. R. DeLeo. 2003. Genome-wide protective response used by group A Streptococcus to evade destruction by human polymorphonuclear leukocytes. Proc. Natl. Acad. Sci. USA 100:1996-2001.
28. Wei, Y., J. M. Lee, D. R. Smulski, and R. A. LaRossa. 2001. Global impact of sdiA amplification revealed by comprehensive gene expression profiling of Escherichia coli. J. Bacteriol. 183:2265-2272.
29. Wilson, J. W., R. Ramamurthy, S. Porwollik, M. McClelland, T. Hammond, P. Allen, C. M. Ott, D. L. Pierson, and C. A. Nickerson. 2002. Microarray analysis identifies Salmonella genes belonging to the low-shear modeled microgravity regulon. Proc. Natl. Acad. Sci. USA 99:13807-13812.
30. Wu, H. C. 1996. Biosynthesis of lipoproteins, p. 1005-1014. In F. C. Neidhardt, R. Curtiss, J. L. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger (ed.), Escherichia coli and Salmonella: cellular and molecular biology, 2nd ed., vol. 1. ASM Press, Washington, D.C.