Genomic Analysis and Comparative Proteomics of Temperate Mycobacteriophages
By: Carmen A Aguirre
The College of St. Scholastica
Background
• Bacteriophages are viruses that infect and replicate within bacteria
• Mycobacteriophages specifically target mycobacteria (e.g. M. smegmatis, M. tuberculosis)
• There are two types of phages:
  - Lytic: obligate lysis of host
  - Temperate: lytic or lysogenic cycle
Applications:
• Alternatives to antibiotics for resistant infections
• Biocontrol agents in agriculture
• Valuable research tool
• Untapped reservoir of genetic diversity
Overview
• Introduction to SEA-PHAGES program
• Methods of Isolation
• General Overview of Annotated Mycobacteriophages
• Overview of comparative genomics and stoperator sequences
• General Overview of Mass Spec Techniques
• Mass Spectrometry Protein Identification
• Future Directions
SEA-PHAGES Program
Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science
• National experiment in bacteriophage genomics
• Students isolate, name, sequence, and analyze newly-discovered bacteriophages isolated on M. smegmatis and other hosts
• Research-based curricula
• Advance science education on a national scale
• Authentic scientific discovery
Methods
[Flowchart: phage isolation and characterization workflow]
• Collect soil sample
• Enrichment or direct plating
• Streak method (≥ 3 times)
• Serial dilution/titer
• Spot test
• MTL harvest and titer; web plate
• Empirical test
• 10-plate infection
• HTL harvest and titer
• HTL: archive, electron microscopy, isolate DNA
• Isolated DNA: restriction digest, quality control gel
• Sequencing center: approved genomic DNA
• In silico: genome sequence
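The serial dilution/titer step comes down to a short calculation. The sketch below assumes the usual plaque-assay convention that titer (PFU/mL) is the plaque count divided by the product of the dilution and the volume plated. The function name and the example numbers are illustrative and are not values from this project.

```python
def titer_pfu_per_ml(plaques: int, dilution: float, volume_plated_ml: float) -> float:
    """Estimate phage titer in PFU/mL from one countable plate.

    dilution is the total dilution of the plated sample relative to the
    undiluted lysate (e.g. 1e-6 for the sixth 10-fold dilution).
    """
    if dilution <= 0 or volume_plated_ml <= 0:
        raise ValueError("dilution and volume plated must be positive")
    return plaques / (dilution * volume_plated_ml)


# Illustrative numbers only: 50 plaques from 10 µL of the 10^-6 dilution.
example_titer = titer_pfu_per_ml(plaques=50, dilution=1e-6, volume_plated_ml=0.01)
print(f"{example_titer:.2e} PFU/mL")
```

The same function applies to both the MTL and HTL titers; only the dilution that gives a countable plate changes.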
CSS Phage Stats
Total phages isolated during 3 years = 34
• 5 sequenced and annotated
• 4 in GenBank
Clusters are groups of phages that have nucleotide sequence similarity
• 14 Cluster A
• 1 Cluster E
• 2 Cluster B
• 1 Cluster C
Families are groups of phages sharing similar electron microscopy morphology
• Myoviridae: double-stranded DNA genomes with contractile tails
• Siphoviridae: double-stranded DNA genomes and long, flexible, non-contractile tails
• Podoviridae: double-stranded DNA genomes and short, stubby, non-contractile tails
QuinnKiro
General Information
Cluster: A3
Family: Siphoviridae
Type: Temperate
Plaque Morphology: Large, turbid plaques
Genomics
Genome: 50,066 bp
Protein Coding Regions: 88
tRNA: 2
Stoperator: 18
Hetaeria
General Information
Cluster: A4
Family: Siphoviridae
Type: Lytic?
Plaque Morphology: Small, Clear plaques
Genomics
Genome: 51,374 bp
Protein Coding Regions: 87
Stoperator: 17
Stoperator Sequences
• DNA sequences that bind the repressor and thereby block transcription (RNA elongation)
• Often found before genes that promote lysis
[Figure: repressor bound to stoperator DNA]
Stoperator Sequences
• In many phages, multiple stoperator sequences are found in the genome, with most located in the second half of the genome
• Stoperator sequences have polarity (directionality) that correlates with direction of transcription of target gene
• Usually found in intergenic region near target gene
Figure: Stoperator polarity and location in mycobacteriophage L5 (Brown et al., The EMBO Journal, Vol. 16, No. 19, pp. 5914–5921)
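Because stoperators have polarity, a search for them has to look at both strands and report which strand each site is on. The sketch below is one simple way to do that in Python, not the procedure used for these slides: it slides a window along the genome and compares it with the motif and with its reverse complement, optionally tolerating a few mismatches to catch pseudo matches. The function names and the toy genome are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(genome: str, motif: str, max_mismatches: int = 0):
    """Find motif occurrences on both strands.

    Returns (start, strand, mismatches) tuples, where start is the 1-based
    coordinate of the window on the forward strand, "+" means the motif
    reads left to right and "-" means it lies on the reverse strand.
    """
    genome, motif = genome.upper(), motif.upper()
    rc_motif = reverse_complement(motif)
    k = len(motif)
    hits = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        for strand, pattern in (("+", motif), ("-", rc_motif)):
            d = hamming(window, pattern)
            if d <= max_mismatches:
                hits.append((i + 1, strand, d))
    return hits


# Toy genome (made up) with one forward and one reverse-strand site.
MOTIF = "GTGCGATGTCAAG"
toy_genome = "AAAA" + MOTIF + "CCCC" + reverse_complement(MOTIF) + "TTTT"
for start, strand, d in find_sites(toy_genome, MOTIF, max_mismatches=1):
    print(start, strand, d)
```

Comparing the strand of each hit with the direction of the nearest annotated gene is then enough to check whether polarity tracks transcription, as described for L5.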
Stoperators in our Phages
Hetaeria
• Consensus sequence generated using 3 matches and 25 pseudo matches to original search sequence
• 27/28 sequences found within relevant genes at the tail end
QuinnKiro
• 18 putative sequences
• Mostly at the end of genome
• Core TCAAG mutated in the three matches within Lysin genes
Unique QuinnKiro stoperator results
• NCBI: Sequence GTGCGATGTCAAG found 11 times
• DNA Master: Additional pseudo matches found around Lysin A and B genes
  - Multiple stoperator sites found within Lysin A and B
  - Core TCAAG sequence mutated in one case
  - Sequence polarity reversed in one match
[Figure panels: within Lysin A (4490 - 6052); within and after Lysin B (6052 - 7017)]
Comparative Genomics
• Looked for our consensus sequence in other clusters
• Recorded how many exact and pseudo matches were found
• Investigated the gene functions of the likely target genes
Phage (Cluster)   Sequence        Hits   Gene Function
QuinnKiro (A)     GTGCGATGTCAAG   18     Lysin A, B
Caelakin (A)      GTGCGATGTCAAG   17     Lysin B, DNA Pol.
Hetaeria (B)      GTGCGATGTCAAG   28     Helicase, Terminase
ZygoTaiga (C)     GTGCGATGCCGAG   14     Nucleotide Binding Protein, Hydrolase
Hawkeye (D)       GCGCGATGTCAAG   3      DNA Pol. III
Bruin (E)         GCGCGATGTGGAC   2      DNAB-like Helicase
Drago (F)         GTGCGATGCCAAC   2      NKF
Avrafan (G)       GTGCGAGGTCGAG   1      Lysin B
Damien (H)        GTGCGATGTCCCG   3      NKF
Babsiella (I)     GCGCGATGTCAAC   1      HiCa antitoxin
NKF: no known function
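One way to quantify how far each cluster's sequence in the table drifts from the GTGCGATGTCAAG sequence is a per-position mismatch count. The sketch below uses the sequences exactly as listed in the table and treats the TCAAG core as the last five positions of the 13-mer. The helper names are illustrative, and this is not necessarily how the comparison was originally scored.

```python
CONSENSUS = "GTGCGATGTCAAG"  # sequence shared by QuinnKiro, Caelakin, Hetaeria
CORE = "TCAAG"               # core named on the stoperator slides

# Sequences copied from the comparative genomics table.
TABLE = {
    "QuinnKiro": "GTGCGATGTCAAG",
    "Caelakin": "GTGCGATGTCAAG",
    "Hetaeria": "GTGCGATGTCAAG",
    "ZygoTaiga": "GTGCGATGCCGAG",
    "Hawkeye": "GCGCGATGTCAAG",
    "Bruin": "GCGCGATGTGGAC",
    "Drago": "GTGCGATGCCAAC",
    "Avrafan": "GTGCGAGGTCGAG",
    "Damien": "GTGCGATGTCCCG",
    "Babsiella": "GCGCGATGTCAAC",
}


def mismatches(seq: str, ref: str = CONSENSUS) -> list:
    """Return the 1-based positions where seq differs from ref."""
    if len(seq) != len(ref):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq, ref)) if a != b]


def core_intact(seq: str) -> bool:
    """True if the sequence still ends with the TCAAG core."""
    return seq.endswith(CORE)


for phage, seq in TABLE.items():
    positions = mismatches(seq)
    print(f"{phage:10s} {seq}  mismatches={len(positions)} at {positions}  "
          f"core intact={core_intact(seq)}")
```

With the table's sequences, Hawkeye (D) is the only phage outside the three exact matches that retains the TCAAG core; every other variant carries at least one change inside it.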
Stoperator Genomics Conclusions
• Stoperator sequence of GTGCGATGTCAAG found in QuinnKiro, Caelakin, and Hetaeria
• Sequence found in nearly all clusters of phages, including non-temperate phages
• Stoperator sequences found near Lysin A and B genes or areas that promote lysis/assembly
• Most sequences are in the second half of the genome, near genes with no known function
QuinnKiro
• Exhibited some unusual stoperator sequences associated with the lysin genes:
  - Multiple sequences embedded in coding region
  - Apparent reversal of polarity
Speculation:
• What is the purpose of stoperator sequences in lytic phages?
• An outcome of genetic mosaicism?
• Was the common mycobacteriophage ancestor temperate?
Phage protein identification through mass spectrometry
Goals:
- Detect phage proteins
- Verify annotation
- How does host (M. smegmatis) protein expression respond to phage infection/prophage presence?
Methods:
- First attempt: harvest raw lysate or PEG-precipitated phage sample
- Second attempt: infected cell pellet
- Analyze digested proteins by triple TOF mass spec
- Analyze the peptides with the Scaffold program
PEG Precipitation Protocol Summary
• Able to identify proteins that are abundant and unique
• Several phage structural proteins were identified
• Sample concentration may need to be increased to achieve better detection
• Many orphan peptides remain unidentified (likely M. smegmatis proteins)
Infected Cell Pellet Protocol
• Grow M. smegmatis cultures of varying concentrations (~12 hrs)
• Run samples and find one with OD = 0.4
• Infect culture with lysate at MOI = 10 (~3-3.5 hrs)
• Pellet cells
• Subject to triple TOF mass spec
• Sample peptides with >1000 counts/sec are fragmented
• Detected peptides analyzed with Scaffold viewer
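The MOI of 10 used for infection fixes how much lysate goes into the culture once the cell density and lysate titer are known, since MOI is the ratio of plaque-forming units added to cells present. The sketch below shows that arithmetic. The cell density at OD = 0.4 is not given on the slides, so it is left as an input that would need to be calibrated for the strain and spectrophotometer; the numbers in the example are placeholders, not measurements.

```python
def lysate_volume_ml(moi: float, cells_per_ml: float,
                     culture_volume_ml: float, titer_pfu_per_ml: float) -> float:
    """Volume of lysate (mL) that delivers `moi` phage per cell.

    MOI = (titer * lysate volume) / (cell density * culture volume),
    solved here for the lysate volume.
    """
    if min(moi, cells_per_ml, culture_volume_ml, titer_pfu_per_ml) <= 0:
        raise ValueError("all inputs must be positive")
    total_cells = cells_per_ml * culture_volume_ml
    return moi * total_cells / titer_pfu_per_ml


# Placeholder values: 1e8 cells/mL is an assumed density, not a calibrated
# figure for M. smegmatis at OD = 0.4; 1e10 PFU/mL is an assumed HTL titer.
volume = lysate_volume_ml(moi=10, cells_per_ml=1e8,
                          culture_volume_ml=10, titer_pfu_per_ml=1e10)
print(f"Add {volume:.2f} mL of lysate")
```

If the required volume is impractically large relative to the culture, the lysate needs concentrating or the culture volume needs to come down.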
Infected Cell Pellet Analysis: Hetaeria
• Able to identify proteins that are abundant and unique
• Proteins functioning in: assembly, structure, genome replication, and other
• Able to detect over half of predicted proteins
• Able to cover a significant percentage of sequence
• Able to identify amino acid modifications
Identified Peptides with Functions
Amino Acid Modifications
Peptide Fragments
Summary of Hetaeria Mass Spec Data
• Over half of in silico predicted proteins detected
• Over a quarter of start sites verified
• Proteins of varying functions identified
• Ability to locate amino acid modifications
• Ability to identify size and variety of proteins
• Increased sequence coverage
• Increased understanding of the lytic phage
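Sequence coverage here means the fraction of a predicted protein's residues that fall inside at least one identified peptide. The sketch below computes that from plain strings. It is a simplified illustration with a made-up protein and peptides, and it is not a description of how Scaffold calculates coverage internally.

```python
def sequence_coverage(protein: str, peptides: list) -> float:
    """Fraction of residues in `protein` covered by at least one peptide.

    Every occurrence of every peptide is marked, so overlapping peptides
    are not double counted.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


# Made-up 10-residue protein and two made-up peptides.
toy_protein = "MKTAYIAKQR"
toy_peptides = ["MKT", "AKQR"]
print(f"{sequence_coverage(toy_protein, toy_peptides):.0%} of residues covered")
```

A peptide that includes the predicted N-terminus of a protein is also the kind of evidence that supports a verified start site.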
Currently
• Waiting on mass spec data on infected cell pellets for QuinnKiro (A3) and Brusacoram (P) (temperate phages?)
• Interesting data on proteins involved in switching from lysogeny
• Compare identified proteins between lytic and temperate phages
• Preparing Caelakin clear plaque mutant with increased phage protein expression
• Examine how growth time and conditions affect phage and host protein expression
• Immunity repressors?
Acknowledgements
• Dr. Daniel Westholm, Professor
• The College of St. Scholastica SEA-PHAGES Research Students
• Sequencing
  - Virginia Commonwealth University (Hetaeria, Severus, ZygoTaiga)
  - North Carolina State University (QuinnKiro, Caelakin)
• SEA-PHAGES
  - Howard Hughes Medical Institute
  - University of Pittsburgh, Hatfull Lab
• Mass Spectrometry
  - Mayo Clinic Proteomic Core Lab
• Electron Microscopy
  - Mayo Clinic Microscopy Lab