Docking de Proteínas -...

BIOINFORMATICS AND COMPUTATIONAL BIOLOGY
Summer course of the Universidad Complutense (July 2008)

Protein Docking

Juan Fernández Recio
Barcelona Supercomputing Center (BSC)


• Introduction

• Computational protein-protein docking

• Geometric docking algorithms

• Docking by global energy optimization

• Comparison of docking methods

• Docking applications and future challenges

recognition (interactomics)

protein (proteomics)

gene (genomics)

Function (Drug discovery)

- two-hybrid test
- affinity column, gel assays...
- BIAcore
- mass-spectrometry
- electron microscopy
- cross-linking
- co-immunoprecipitation
- immunofluorescence
- knock-out
- phylogenetic profiles, gene fusion events...
- ...

Protein Interaction Detection

Structural Characterization

- NMR (chemical shifts)
- sequence conservation
- binding assays
- mutants & alanine-scanning

- X-ray
- NMR
- druggable pockets

[Figure: protein interaction network with nodes P1 to P10 and variants P1A, P1B, P2A, P4A, P9A, P9B]

Applications

- protein design
- inhibitor discovery: peptide mimicking, ligand docking, VLS

- association mechanism

Biophysical Analysis

Study of Protein-Protein Interactions

Structural Analysis at Atomic Resolution: NMR and X-ray

- Lo Conte, Chothia, Janin 1999 J. Mol. Biol. 285, 2177-2198
- Chakrabarti, Janin 2002 Proteins 47, 334-343

Databases of Protein-Protein Complexes

PQS: http://pqs.ebi.ac.uk

DOCKGROUND: http://dockground.bioinformatics.ku.edu

• Introduction

• Computational protein-protein docking

• Geometric docking algorithms

• Docking by global energy optimization

• Comparison of docking methods

• Docking applications and future challenges


Motivation …

- X-ray, NMR: Determination of complex structures remains difficult

- Low-resolution data on PPI available (cryo-EM, MS…)

- Understand energetics and mechanism of protein-protein association

- Protein design (diagnostic, environment) and drug discovery

Protein-Protein Docking

Generation of the structure of a protein-protein complex from the individual protein structures

[Figure: energy E as a function of c and ROTATION, with regions labelled PROTEINS FAR APART, NEARBY, SOME CONTACTS and CLOSE CONTACT]

ΔG = elect.

ΔG’ = elect + desolv

ΔG’’ = HB + vdW

• Introduction

• Computational protein-protein docking

• Geometric docking algorithms

• Docking by global energy optimization

• Comparison of docking methods

• Docking applications and future challenges

Protein-Protein Docking: Geometry Approach

MolFit

[Figure: molecules a and b and their correlation c = Corr(a,b)]

Fourier Transform

It re-expresses a function in terms of sinusoidal basis functions

Fourier transform equations:

F(k) = FT( f(x) )

f(x) = IFT( F(k) )

Correlation function:

$c(t) = \int_{-\infty}^{\infty} g(\tau + t)\, h(\tau)\, d\tau$

“Correlation Theorem”:

C(f) = G(f) H(-f)

If h(t) is real then H(-f) = [H(f)]*

C(f) = G(f) [H(f)]*

c(t) = IFT( FT( g(τ) ) [FT( h(τ) )]* )
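The correlation theorem can be checked numerically on a discrete, periodic signal. The sketch below is an illustration only: it assumes NumPy, uses random test signals, and the function names are hypothetical. It compares the direct sum c[t] = Σ g[τ + t] h[τ] with the FFT route c = IFT( FT(g) [FT(h)]* ).

```python
import numpy as np

def circular_correlation_direct(g, h):
    """Direct O(N^2) sum: c[t] = sum over tau of g[(tau + t) mod N] * h[tau]."""
    n = len(g)
    return np.array([sum(g[(tau + t) % n] * h[tau] for tau in range(n))
                     for t in range(n)])

def circular_correlation_fft(g, h):
    """Same correlation via c = IFT( FT(g) * conj(FT(h)) ); valid for real h."""
    return np.fft.ifft(np.fft.fft(g) * np.conj(np.fft.fft(h))).real

# Random real-valued test signals (illustrative data only).
rng = np.random.default_rng(0)
g = rng.normal(size=16)
h = rng.normal(size=16)

direct = circular_correlation_direct(g, h)
fast = circular_correlation_fft(g, h)
max_diff = float(np.max(np.abs(direct - fast)))
print(max_diff)  # difference is at the level of floating-point round-off
```

Both routes give the same values up to round-off; only the FFT route scales as N log N.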

FFT (Fast Fourier Transform)

Algorithms for efficient calculation of FT and IFT

- Most common: Cooley-Tukey FFT (divide and conquer)
- Other: Prime-factor, Bruun's, Rader's, Bluestein's

Fourier transform timing: N^2 (if N = 10^6, 1 MHz CPU time ~2 weeks)

Fast Fourier transform: N log2 N (if N = 10^6, 1 MHz CPU time ~30 sec)

http://www.fftw.org/
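The timing estimates can be reproduced with a few lines of arithmetic. This sketch assumes one elementary operation per clock cycle on a 1 MHz CPU, which is a simplification introduced here, so only the orders of magnitude are meaningful.

```python
import math

def runtime_seconds(operations, ops_per_second=1e6):
    """Run time assuming one operation per cycle on a 1 MHz CPU (simplification)."""
    return operations / ops_per_second

N = 10**6
direct_ops = N**2             # direct Fourier transform: N^2 operations
fft_ops = N * math.log2(N)    # fast Fourier transform: N log2 N operations

direct_days = runtime_seconds(direct_ops) / 86400
fft_secs = runtime_seconds(fft_ops)
print(f"direct: {direct_days:.1f} days, FFT: {fft_secs:.1f} s")
```

Under this assumption the direct transform takes on the order of two weeks and the FFT on the order of tens of seconds, in line with the figures on the slide.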

MolFit

[Diagram: grids a and b are Fourier transformed (FT) into F(a) and F(b); their scalar product gives F(c); the inverse transform (IFT) gives c = Corr(a,b)]

Protein Docking Using FFT

[Diagram: receptor (R) and ligand (L) are each discretized into grids with surface and interior values; the ligand is rotated; both grids are Fast Fourier transformed, the complex conjugate is taken, and the inverse FFT (IFFT) of the product gives the correlation function over the X and Y translations]

Comp. cost can decrease by >10^4 (from N^6 to N^3 ln N^3)
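The translational scan for one ligand orientation can be sketched with NumPy on toy grids. Everything specific here is an assumption made for illustration: the cubic "molecules", the grid size, the surface value of 1, the interior penalty of -15, and the function names. A real program repeats this scan for every rotation of the ligand.

```python
import numpy as np

def discretize(cells, grid_size, interior_value):
    """Grid with surface cells = 1 and interior cells = interior_value (0 elsewhere)."""
    occupied = np.zeros((grid_size,) * 3, dtype=bool)
    occupied[tuple(np.asarray(cells).T)] = True
    # A cell is interior when all six face neighbours are also occupied.
    interior = occupied.copy()
    for axis in range(3):
        for shift in (1, -1):
            interior &= np.roll(occupied, shift, axis=axis)
    grid = np.where(occupied, 1.0, 0.0)
    grid[interior] = interior_value
    return grid

def translation_scores(receptor_grid, ligand_grid):
    """scores[t] = sum over x of receptor[x] * ligand[x - t], for all shifts t at once."""
    product = np.fft.fftn(receptor_grid) * np.conj(np.fft.fftn(ligand_grid))
    return np.fft.ifftn(product).real

def cube(start, width):
    """Grid cells of a solid cube (toy stand-in for a protein)."""
    r = range(start, start + width)
    return [(x, y, z) for x in r for y in r for z in r]

GRID = 16
receptor = discretize(cube(6, 5), GRID, interior_value=-15.0)  # penalise core overlap
ligand = discretize(cube(0, 2), GRID, interior_value=1.0)      # small cube, all surface

scores = translation_scores(receptor, ligand)
best_shift = np.unravel_index(np.argmax(scores), scores.shape)
best_score = float(scores.max())
print(best_shift, round(best_score, 3))
```

Shifts that put the ligand against a receptor face score positively, while shifts that push it into the receptor interior are penalised, which is the behaviour the surface/interior grid values are meant to produce.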

FTDOCK, ZDOCK

ZDOCK performance: A Novel Shape Complementarity Function

[Plot: Success Rate (0% to 100%) versus Number of Predictions (1, 10, 100, 1000) for four scoring functions]

- Grid-based shape complementarity (GSC)
- GSC + Desolvation + Electrostatics
- Pairwise shape complementarity (PSC)
- PSC + Desolvation + Electrostatics

• Introduction

• Computational protein-protein docking

• Geometric docking algorithms

• Docking by global energy optimization

• Comparison of docking methods

• Docking applications and future challenges

Protein-Protein Docking Energy

$E = E_{vw} + E_{el} + E_{hb} + E_{hp}$

$E_{vw}$: ‘soft’ vdW (ECEPP/3)

$E_{el} = 332.0 \sum_{i,j} \frac{q_i q_j}{4 d_{ij}^2}$

Max = 20 kcal/mole, Min = -20 kcal/mole

$E_{hp}$ = 0.03 kcal/mole * ASA(apolar)

[Plot: docking energy E versus RMSD]

ICM Protein-Protein Docking

Fernández-Recio et al. 2002 Protein Sci. 11, 280-291

RECEPTOR: Calculate maps

LIGAND: Positioning (x 120)

RIGID BODY DOCKING

- Monte Carlo sampling (positional)
- Local minimization
- Metropolis criteria: YES, solution accepted; NO, solution rejected
- 20000 energy evaluations
- Conformational stack: low energy solutions, RMSD > 4 Å

SIDE-CHAIN REFINEMENT

- Monte Carlo sampling (ligand interface side-chains)
- Local minimization
- Metropolis criteria: YES, solution accepted; NO, solution rejected
- 1000 energy evaluations per flexible torsion angle
- Lowest energy solution (including solvation)

PREDICTED COMPLEX
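The sampling loop in the flowchart (random move, local minimization, Metropolis test) can be sketched on a toy problem. This is not the ICM implementation: the two-dimensional energy surface, the crude minimizer, the temperature and the step sizes are all hypothetical stand-ins, and the 120 starting positions and the conformational stack are left out.

```python
import math
import random

def toy_energy(x, y):
    """Hypothetical rugged energy surface standing in for the docking energy."""
    return (x - 1.0) ** 2 + (y + 2.0) ** 2 + math.sin(5 * x) * math.cos(5 * y)

def local_minimize(energy, pos, step=0.05, iterations=50):
    """Crude coordinate descent standing in for a real local minimizer."""
    x, y = pos
    best = energy(x, y)
    for _ in range(iterations):
        improved = False
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            trial = energy(x + dx, y + dy)
            if trial < best:
                x, y, best = x + dx, y + dy, trial
                improved = True
        if not improved:
            break
    return (x, y), best

def metropolis_accept(old_e, new_e, temperature, rng):
    """Always accept downhill moves; accept uphill ones with Boltzmann probability."""
    if new_e <= old_e:
        return True
    return rng.random() < math.exp(-(new_e - old_e) / temperature)

def monte_carlo(energy, start, n_steps=2000, move_size=1.0, temperature=1.0, seed=0):
    rng = random.Random(seed)
    pos, e = local_minimize(energy, start)
    best_pos, best_e = pos, e
    for _ in range(n_steps):
        # Random positional move followed by local minimization.
        trial = (pos[0] + rng.uniform(-move_size, move_size),
                 pos[1] + rng.uniform(-move_size, move_size))
        trial, trial_e = local_minimize(energy, trial)
        if metropolis_accept(e, trial_e, temperature, rng):
            pos, e = trial, trial_e          # solution accepted
            if e < best_e:
                best_pos, best_e = pos, e    # keep track of the lowest energy found
    return best_pos, best_e

best_pos, best_e = monte_carlo(toy_energy, start=(8.0, 8.0))
print(best_pos, best_e)
```

Minimizing before the Metropolis test means the walk moves between local minima rather than between raw random positions, which is the point of the "Local minimization" box in the flowchart.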

Rigid-Body Docking + Side-Chain Refinement

Energy = vdw + el + hb + desolv

[Figure: x-ray complex compared with the best rigid-body docking solution, and after refinement]

• 35% rank 1

Fernandez-Recio et al. 2002 Prot. Sci. 11, 280-291

ICM docking: www.molsoft.com

pyDock: scoring of rigid-body docking orientations by electrostatics + desolvation

$E = E_{el} + E_{solv}$

$E_{el} = 332.0 \sum_{i,j} \frac{q_i q_j}{4 d_{ij}^2}$

Max = +1 kcal/mole, Min = -1 kcal/mole

ASPs from octanol/water

ASPs for interface/water

ASPs are optimized for protein docking

[Plots: pyDock compared with FTDOCK and with random on FTDOCK's docking sets (80 unbound cases); pyDock compared with ZDOCK and with random on ZDOCK's docking sets (80 unbound cases)]

pyDock – Cheng, Blundell, Fernandez-Recio (2007) Proteins 68, 503-515
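A scoring function of the form E = Eel + Esolv can be sketched as follows. This is not the pyDock code. It assumes a Coulomb term with a distance-dependent dielectric (4 × distance) and the constant 332.0, assumes the ±1 kcal/mole limits are applied to each atom pair, and assumes the desolvation term is a sum of ASP × change in ASA. The charges, coordinates and ASP values in the example are made up.

```python
import numpy as np

COULOMB_CONSTANT = 332.0  # kcal*Å/(mol*e^2), assumed value of the Coulomb prefactor

def electrostatics(rec_xyz, rec_q, lig_xyz, lig_q, e_min=-1.0, e_max=1.0):
    """Sum of receptor-ligand pair terms 332*qi*qj/(4*d^2), each clipped to [e_min, e_max]."""
    rec_xyz, lig_xyz = np.asarray(rec_xyz, float), np.asarray(lig_xyz, float)
    rec_q, lig_q = np.asarray(rec_q, float), np.asarray(lig_q, float)
    d = np.linalg.norm(rec_xyz[:, None, :] - lig_xyz[None, :, :], axis=2)
    pair = COULOMB_CONSTANT * rec_q[:, None] * lig_q[None, :] / (4.0 * d ** 2)
    return float(np.clip(pair, e_min, e_max).sum())

def desolvation(asp, asa_unbound, asa_bound):
    """Sum over atoms of ASP_i * (ASA_bound_i - ASA_unbound_i) (assumed functional form)."""
    asp = np.asarray(asp, float)
    delta_asa = np.asarray(asa_bound, float) - np.asarray(asa_unbound, float)
    return float(np.sum(asp * delta_asa))

# Illustrative data: one +1 charge on the receptor, one -1 charge on the ligand.
e_far = electrostatics([[0, 0, 0]], [1.0], [[10, 0, 0]], [-1.0])   # 10 Å apart
e_near = electrostatics([[0, 0, 0]], [1.0], [[3, 0, 0]], [-1.0])   # 3 Å apart, clipped
e_solv = desolvation(asp=[0.012, -0.06], asa_unbound=[30.0, 20.0], asa_bound=[10.0, 20.0])
total = e_near + e_solv
print(e_far, e_near, e_solv, total)
```

The clipping keeps a single close charge pair from dominating the score, which matters when scoring rigid-body orientations that still contain small clashes.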

• Introduction

• Computational protein-protein docking

• Geometric docking algorithms

• Docking by global energy optimization

• Comparison of docking methods

• Docking applications and future challenges

Docking software (I)

Docking software (II)

Docking Servers

DOCKING VALIDATION

CAPRI: A Critical Assessment of PRedicted Interactions

http://www.ebi.ac.uk/msd-srv/capri/

1st CAPRI – Sep02 La Londe (France)
Special issue, in: PROTEINS: Structure, Function, and Genetics 52 (July 2003)

- T01: HPr (unbound), HPr kinase (unbound)
- T02: VP6 (unbound), Fab (bound)
- T03: Hemagglutinin (unbound), Fab (bound)
- T04, T05, T06: α-amylase (unbound), VHH (bound)
- T07: TCRβ (unbound), speA (unbound)

2nd CAPRI – Dec04 Gaeta (Italy)
Special issue, in: PROTEINS: Structure, Function, and Bioinformatics 60 (July 2005)

3rd CAPRI – Apr07 Toronto (Canada)
Special issue, in: PROTEINS: Structure, Function, and Bioinformatics 69 (December 2007)

1st CAPRI - Predictions

Fernandez-Recio et al. (2003) Proteins 52, 113-117

- 6 groups: 2 acceptable models
- 3 groups: 1 acceptable model
- 5 groups: no acceptable models

2nd CAPRI - Predictions

Fernandez-Recio et al. (2005) Proteins 60, 308-313

[Figure: predicted models for targets T08, T10, T11, T12, T13, T14, T18 and T19, with RMSD labels 7.6 Å, 8.5 Å, 6.0 Å, 0.7 Å, 11.1 Å, 0.6 Å, 3.0 Å and 4.1 Å]

3rd CAPRI - Scorers

1st CAPRI – Target 3

Fernandez-Recio et al. (2003) Proteins 52, 113-117

2nd CAPRI – Target 14

Fernandez-Recio et al. (2005) Proteins 60, 308-313

Template: protein phosphatase 1α

Modelled Ligand: protein phosphatase 1β

ID 92.9%

Bound receptor: MYPT1

Docking

Rank 1, Ligand RMSD: 0.6 Å
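A ligand RMSD like the one quoted for Target 14 can be computed as below. The sketch assumes the predicted and reference complexes have already been superimposed on the receptor and that ligand atoms are listed in the same order; the function name and the coordinates are illustrative only.

```python
import numpy as np

def ligand_rmsd(predicted_xyz, reference_xyz):
    """Root-mean-square deviation between matching ligand atoms, in the input units (Å)."""
    predicted_xyz = np.asarray(predicted_xyz, dtype=float)
    reference_xyz = np.asarray(reference_xyz, dtype=float)
    squared_dist = ((predicted_xyz - reference_xyz) ** 2).sum(axis=1)
    return float(np.sqrt(squared_dist.mean()))

# Illustrative data: the predicted ligand is the reference shifted by 0.6 Å along x.
reference = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]])
predicted = reference + np.array([0.6, 0.0, 0.0])
rmsd = ligand_rmsd(predicted, reference)
print(round(rmsd, 3))
```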

• Introduction

• Computational protein-protein docking

• Geometric docking algorithms

• Docking by global energy optimization

• Comparison of docking methods

• Docking applications and future challenges

Complex Structure Prediction by PyDock: Examples

- Federici et al. (2006) Trends Plant Sci 11, 65
- Fort et al. (2007) JBC 282, 31444
- Bonivento et al. (2007) Proteins 70, 294
- Bavro et al. Mol Cell (in press)
- Medina et al. (2008) Proteins

[Figure: FNR / ferredoxin and FNR / flavodoxin complexes]

[Diagram: photosynthetic electron transport chain with labels H2O, O2, H+, PSII, Q/QH2, cyt bf, PC, PSI, Fd, Fld, FNR, NADP+ and NADPH]

1mlc

2pcf

1ca0

Averaged buried surface (ABS)

$ABS_i = \frac{1}{100} \sum_{k=1}^{100} \left( \frac{ASA_i^{Unb} - ASA_i^{Bnd_k}}{ASA_i^{Unb}} \right)$

Normalized ABS or Interface Propensity for residue i:

$NIP_i = \frac{ABS_i - \langle ABS \rangle}{ABS_{MAX} - \langle ABS \rangle}$

Interface propensity map from docking landscape

Binding site prediction from rigid-body docking

Fernandez-Recio et al. (2004) JMB 335, 843-865
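The interface propensity can be computed per residue from the accessible surface areas (ASA) in the unbound protein and in a set of docking solutions. The sketch below follows the ABS and NIP definitions, taking the mean ABS over all residues as the average; the function names are hypothetical and the example uses two made-up docking solutions instead of 100 to keep it short. Residues with zero unbound ASA are assumed to have been excluded beforehand.

```python
import numpy as np

def averaged_buried_surface(asa_unbound, asa_bound):
    """ABS per residue: mean over docking solutions of the fraction of ASA buried.

    asa_unbound has shape (n_residues,); asa_bound has shape (n_solutions, n_residues).
    """
    asa_unbound = np.asarray(asa_unbound, dtype=float)
    asa_bound = np.asarray(asa_bound, dtype=float)
    fraction_buried = (asa_unbound - asa_bound) / asa_unbound
    return fraction_buried.mean(axis=0)

def normalized_interface_propensity(abs_values):
    """NIP per residue: (ABS_i - <ABS>) / (ABS_MAX - <ABS>)."""
    abs_values = np.asarray(abs_values, dtype=float)
    mean_abs = abs_values.mean()
    return (abs_values - mean_abs) / (abs_values.max() - mean_abs)

# Illustrative data: three residues, two docking solutions.
asa_unbound = [100.0, 50.0, 80.0]
asa_bound = [[50.0, 50.0, 80.0],    # solution 1 buries half of residue 1
             [100.0, 25.0, 80.0]]   # solution 2 buries half of residue 2
abs_values = averaged_buried_surface(asa_unbound, asa_bound)
nip = normalized_interface_propensity(abs_values)
print(abs_values, nip)
```

By construction the most frequently buried residue gets NIP = 1, a residue buried as often as the average gets 0, and residues that are rarely buried get negative values.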

pyDockRST: use of restraints to filter docking solutions

[Diagram: proteins A and B in docking solution i, each with restraint residues; the satisfied restraints are marked with the distance threshold < 6 Å]

Pseudo-Energy = -100*(satisfied restraints / total restraint residues)
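The pseudo-energy is simple to compute once a rule for a "satisfied" restraint is fixed. The sketch below assumes a restraint residue counts as satisfied when it lies within 6 Å of any atom of the partner protein, and it represents each restraint residue by a single point; both are simplifying assumptions, and the function name and coordinates are illustrative.

```python
import numpy as np

def restraint_pseudo_energy(restraint_xyz, partner_xyz, cutoff=6.0):
    """-100 * (satisfied restraints / total restraint residues) for one docking solution."""
    restraint_xyz = np.asarray(restraint_xyz, dtype=float)
    partner_xyz = np.asarray(partner_xyz, dtype=float)
    # Distance from every restraint residue to every partner atom.
    d = np.linalg.norm(restraint_xyz[:, None, :] - partner_xyz[None, :, :], axis=2)
    satisfied = int((d.min(axis=1) < cutoff).sum())
    return -100.0 * satisfied / len(restraint_xyz)

# Illustrative data: two restraint residues on protein A, two atoms of protein B.
restraints_a = [[0.0, 0.0, 0.0], [20.0, 0.0, 0.0]]
atoms_b = [[3.0, 0.0, 0.0], [40.0, 0.0, 0.0]]
pseudo_energy = restraint_pseudo_energy(restraints_a, atoms_b)
print(pseudo_energy)  # one of two restraints satisfied
```

Because the value ranges from -100 (all restraints satisfied) to 0 (none), it can be used directly to filter or re-rank docking solutions.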

Crescendo + pyDockRST

CRESCENDO (Chelliah, Blundell, Lovell)

pyDockRST

Introduction of evolutionary restraints dramatically improves the docking results

Chelliah et al. (2006) JMB 357, 1669-1682

Protein-Protein Docking Mechanism

Blundell & Fernandez-Recio (2006) Nature 444, 279

Multi-protein docking

- Docking 1:1
- Tethered docking
- Tethered docking + restraints
- Multi-protein docking ??

Mare Nostrum Supercomputer
www.bsc.es

- 94.21 Teraflops
- 10,240 CPUs
- 20 TB main memory
- 370 TB disk storage
- world 26th, Europe 8th (www.top500.org; Jun 2008)

References

Protein-Protein Interaction - General
- Protein-Protein Recognition, C. Kleanthous ed., Oxford University Press
- Lo Conte et al. (1999) J. Mol. Biol. 285, 2177-2198
- Estructura de Proteínas, C. Gómez-Moreno & J. Sancho coord., Editorial Ariel

Docking Simulations
- Katchalski-Katzir et al. (1992) PNAS 89, 2195-2199
- Halperin et al. (2002) Proteins 47, 409-443
- Smith & Sternberg (2002) COSB 12, 28-35
- Ritchie (2008) Curr. Protein Pept. Sci. 9, 1-15

CAPRI
- Proteins Special Issues (July 2003, July 2005, December 2007)