Bu Presentation Dsa

60
SRIHARITECHSOFT Pvt Ltd SRIHARITECHSOFT Pvt Ltd Contents Contents Programming Programming Data structures Data structures  Algorithms  Algorithms  Applications  Applications

Transcript of Bu Presentation Dsa

Page 1: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 1/60

Page 2: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 2/60

SRIHARITECHSOFT Pvt Ltd

INTRODUCTIONINTRODUCTION

A famous quote: Program = Algorithm + DataA famous quote: Program = Algorithm + DataStructure.Structure.

All of you have programmed; thus have already beenAll of you have programmed; thus have already been

exposed to algorithms and data structure.exposed to algorithms and data structure. Perhaps you didn't see them as separate entities;Perhaps you didn't see them as separate entities; Perhaps you saw data structures as simplePerhaps you saw data structures as simple

programming constructs (provided by STLprogramming constructs (provided by STL----standard template library).standard template library).

However, data structures are quite distinct fromHowever, data structures are quite distinct fromalgorithms, and very important in their own right.algorithms, and very important in their own right.

Page 3: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 3/60

SRIHARITECHSOFT Pvt Ltd

OBJECTIVESOBJECTIVES

The main focus of this course is to introduce you to aThe main focus of this course is to introduce you to asystematic study of algorithms and data structure.systematic study of algorithms and data structure.

The two guiding principles of the course are:The two guiding principles of the course are:

abstraction and formal analysis.abstraction and formal analysis. Abstraction: We focus on topics that are broadlyAbstraction: We focus on topics that are broadlyapplicable to a variety of problems.applicable to a variety of problems.

Analysis: We want a formal way to compare twoAnalysis: We want a formal way to compare twoobjects (data structures or algorithms).objects (data structures or algorithms).

In particular, we will worry about "always correct"In particular, we will worry about "always correct"--ness, and worstness, and worst--case bounds on time and memorycase bounds on time and memory(space).(space).

Page 4: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 4/60

SRIHARITECHSOFT Pvt Ltd

 ALGORITHMS ANALYSIS ALGORITHMS ANALYSIS

Foundations of Algorithm Analysis and DataFoundations of Algorithm Analysis and DataStructures.Structures.

Analysis:Analysis: How to predict an algorithm·s performanceHow to predict an algorithm·s performance

How well an algorithm scales upHow well an algorithm scales up

How to compare different algorithms for a problemHow to compare different algorithms for a problem

Data StructuresData Structures How to efficiently store, access, manage dataHow to efficiently store, access, manage data

Data structures effect algorithm·s performanceData structures effect algorithm·s performance

Page 5: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 5/60

Page 6: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 6/60

SRIHARITECHSOFT Pvt Ltd

Few Common AlgorithmsFew Common Algorithms

Constructions of EuclidConstructions of Euclid Newton's root findingNewton's root finding Fast Fourier TransformFast Fourier Transform

Compression (Huffman, LempelCompression (Huffman, Lempel--Ziv, GIF, MPEG)Ziv, GIF, MPEG) DES, RSA encryptionDES, RSA encryption Simplex algorithm for linear programmingSimplex algorithm for linear programming Shortest Path Algorithms (Dijkstra, BellmanShortest Path Algorithms (Dijkstra, Bellman--Ford)Ford)

Error correcting codes (CDs, DVDs)Error correcting codes (CDs, DVDs) T CP congestion control, IP routingT CP congestion control, IP routing

Pattern matching (Genomics)Pattern matching (Genomics) Search EnginesSearch Engines

Page 7: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 7/60

SRIHARITECHSOFT Pvt Ltd

Example of AlgorithmExample of Algorithm

Two algorithms for computing the FactorialTwo algorithms for computing the Factorial Which one is better?Which one is better?

int factorial (int n) {int factorial (int n) {if (n <= 1) return 1;if (n <= 1) return 1;

else return n * factorial(nelse return n * factorial(n--1);1);}}

int factorial (int n) {int factorial (int n) {if (n<=1) return 1;if (n<=1) return 1;else {else {fact = 1;fact = 1;

for (k=2; k<=n; k++)for (k=2; k<=n; k++)fact *= k;fact *= k;

return fact;return fact; } }

}}

Page 8: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 8/60

SRIHARITECHSOFT Pvt Ltd

Role of Algorithms in Modern WorldRole of Algorithms in Modern World

Enormous amount of dataEnormous amount of data EE--commerce (Amazon, EBay)commerce (Amazon, EBay)

Network traffic (telecom billing, monitoring)Network traffic (telecom billing, monitoring)

Database transactions (Sales, inventory)Database transactions (Sales, inventory) Scientific measurements (astrophysics, geology)Scientific measurements (astrophysics, geology)

Sensor networks. RFID tagsSensor networks. RFID tags

Bioinformatics (genome, protein bank)Bioinformatics (genome, protein bank)

Amazon hired firstAmazon hired first Chief Algorithms OfficerChief Algorithms Officer(Udi Manber)(Udi Manber)

Page 9: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 9/60

SRIHARITECHSOFT Pvt Ltd

A RealA Real--World ProblemWorld Problem

Communication in the InternetCommunication in the Internet Message (email, ftp) broken down into IP packets.Message (email, ftp) broken down into IP packets. Sender/receiver identified by IP address.Sender/receiver identified by IP address.

The packets are routed through the Internet by specialThe packets are routed through the Internet by specialcomputers called Routers.computers called Routers. Each packet is stamped with its destination address,Each packet is stamped with its destination address,

but not the route.but not the route. Because the Internet topology and network load isBecause the Internet topology and network load is

constantly changing, routers must discover routesconstantly changing, routers must discover routesdynamically.dynamically.

What should the Routing Table look like?What should the Routing Table look like?

Page 10: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 10/60

SRIHARITECHSOFT Pvt Ltd

IP Prefixes and RoutingIP Prefixes and Routing

Each router is really a switch: it receives packets atEach router is really a switch: it receives packets atseveral input ports, and appropriately sends them out toseveral input ports, and appropriately sends them out tooutput ports.output ports.

Thus, for each packet, the router needs to transfer theThus, for each packet, the router needs to transfer thepacket to that output port that gets it closer to itspacket to that output port that gets it closer to itsdestination.destination.

Should each router keep a table: IP address x OutputShould each router keep a table: IP address x OutputPort?Port?

How big is this table?How big is this table? When a link or router fails, how much information wouldWhen a link or router fails, how much information would

need to be modified?need to be modified? A router typically forwards several million packets/sec!A router typically forwards several million packets/sec!

Page 11: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 11/60

SRIHARITECHSOFT Pvt Ltd

Data StructuresData Structures

The IP packet forwarding is a Data Structure problem!The IP packet forwarding is a Data Structure problem! Efficiency, scalability is very important.Efficiency, scalability is very important.

Similarly, how does Google find the documents matchingSimilarly, how does Google find the documents matching your query so fast? your query so fast? Uses sophisticated algorithms to create indexUses sophisticated algorithms to create index

structures, which are just data structures.structures, which are just data structures. Algorithms and data structures are ubiquitous.Algorithms and data structures are ubiquitous. With the data glut created by the new technologies,With the data glut created by the new technologies,

the need to organize, search, and update MASSIVEthe need to organize, search, and update MASSIVEamounts of information FAST is more severe than everamounts of information FAST is more severe than everbefore.before.

Page 12: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 12/60

SRIHARITECHSOFT Pvt Ltd

Algorithms to Process these DataAlgorithms to Process these Data

Which are the top K sellers?Which are the top K sellers?Correlation between time spent at a web siteCorrelation between time spent at a web site

and purchase amount?and purchase amount?

Which flows at a router account for > 1%Which flows at a router account for > 1%traffic?traffic?

Did source S send a packet in last s seconds?Did source S send a packet in last s seconds? Send an alarm if any international arrivalSend an alarm if any international arrival

matches a profile in the databasematches a profile in the database Similarity matches against genome databasesSimilarity matches against genome databases

Etc.Etc.

Page 13: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 13/60

SRIHARITECHSOFT Pvt Ltd

Max Subsequence ProblemMax Subsequence Problem

Given a sequence of integers A1, A2, «, An, find theGiven a sequence of integers A1, A2, «, An, find themaximum possible value of amaximum possible value of a subsequencesubsequence Ai, «, Aj.Ai, «, Aj.

Numbers can be negative.Numbers can be negative. You want aYou want a contiguouscontiguous chunk with largest sum.chunk with largest sum.

Example:Example: --2, 11,2, 11, --4, 13,4, 13, --5,5, --22 The answer is 20 (subseq. A2 through A4).The answer is 20 (subseq. A2 through A4).

We will discussWe will discuss 4 different algorithms4 different algorithms, with time, with timecomplexities O(ncomplexities O(n33), O(n), O(n22), O(n log n), and O(n).), O(n log n), and O(n). With n = 10With n = 1066, algorithm 1 may take > 10 years; algorithm, algorithm 1 may take > 10 years; algorithm

4 will take a fraction of a second!4 will take a fraction of a second!

Page 14: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 14/60

SRIHARITECHSOFT Pvt Ltd

Algorithm 1 for Max Subsequence SumAlgorithm 1 for Max Subsequence Sum

GivenGiven AA11,«,A,«,Ann ,, find the maximum value offind the maximum value ofAAii+A+Ai+i+11++ +A+A j j0 if the max value is negative0 if the max value is negative

int maxSum = 0;

for( int i = 0; i < a.size( ); i++ )for( int j = i; j < a.size( ); j++ ){

int thisSum = 0;for( int k = i; k <= j; k++ )thisSum += a[ k ];if( thisSum > maxSum )maxSum = thisSum;

}

Time complexity: O(n3)

)1(O

)1(O

)1(O

)1(O

)( i jO ))((

1

§

!

n

i j

i j))((

1

0

1

§§

!

!

n

i

n

i j

i j

Page 15: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 15/60

SRIHARITECHSOFT Pvt Ltd

Algorithm 2Algorithm 2

Idea: Given sum from i to jIdea: Given sum from i to j--1, we can compute the sum1, we can compute the sumfrom i to j in constant time.from i to j in constant time.

This eliminates one nested loop, and reduces the runningThis eliminates one nested loop, and reduces the running

time to O(ntime to O(n22

).).into maxSum = 0;

for( int i = 0; i < a.size( ); i++ )int thisSum = 0;for( int j = i; j < a.size( ); j++ ){

thisSum += a[ j ];if( thisSum > maxSum )

maxSum = thisSum;}

return maxSum;

Page 16: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 16/60

SRIHARITECHSOFT Pvt Ltd

Algorithm 3Algorithm 3

This algorithm uses divideThis algorithm uses divide--andand--conquer paradigm.conquer paradigm. Suppose we split the input sequence at midpoint.Suppose we split the input sequence at midpoint. The max subsequence is entirely in theThe max subsequence is entirely in the left halfleft half,,

entirely in theentirely in the right halfright half, or it, or it straddles thestraddles themidpointmidpoint.. Example:Example:

left halfleft half || right halfright half

44 --3 53 5 --2 |2 | --1 2 61 2 6 --22 Max in left is 6 (A1 through A3); max in right isMax in left is 6 (A1 through A3); max in right is

8 (A6 through A7).8 (A6 through A7). But straddling max is 11 (A1But straddling max is 11 (A1thru A7).thru A7).

Page 17: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 17/60

SRIHARITECHSOFT Pvt Ltd

Algorithm 3 (cont.)Algorithm 3 (cont.)

Example:Example:left halfleft half || right halfright half44 --3 53 5 --2 |2 | --1 2 61 2 6 --22

Max subsequences in each half found by recursion.Max subsequences in each half found by recursion. How do we find the straddling max subsequence?How do we find the straddling max subsequence? Key ObservationKey Observation::

Left half of the straddling sequence is the maxLeft half of the straddling sequence is the maxsubsequence ending withsubsequence ending with --2.2.

Right half is the max subsequence beginning withRight half is the max subsequence beginning with --1.1.

A linear scan lets us compute these in O(n) time.A linear scan lets us compute these in O(n) time.

Page 18: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 18/60

SRIHARITECHSOFT Pvt Ltd

Algorithm 3: AnalysisAlgorithm 3: Analysis

The divide and conquer is best analyzedThe divide and conquer is best analyzedthrough recurrence:through recurrence:

T(1) = 1T(1) = 1

T(n) = 2T(n/2) + O(n)T(n) = 2T(n/2) + O(n)

This recurrence solves to T(n) = O(n logThis recurrence solves to T(n) = O(n logn).n).

Page 19: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 19/60

SRIHARITECHSOFT Pvt Ltd

Algorithm 4Algorithm 4

Time complexity clearlyTime complexity clearly O O((nn))

But why does it work? I.e. proof of correctnessBut why does it work? I.e. proof of correctness

2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2

int maxSum = 0, thisSum = 0;

for( int j = 0; j < a.size( ); j++ ){

thisSum += a[ j ];

if ( thisSum > maxSum )maxSum = thisSum;

else if ( thisSum < 0 )thisSum = 0;

}return maxSum;

}

Page 20: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 20/60

SRIHARITECHSOFT Pvt Ltd

Proof of CorrectnessProof of Correctness

Max subsequence cannotMax subsequence cannot start or endstart or end at aat anegative Ai.negative Ai.

More generally, the max subsequence cannotMore generally, the max subsequence cannothave a prefix with a negative sum.have a prefix with a negative sum.

Ex:Ex: --2 112 11 --4 134 13 --55 --22 Thus, if we ever find that Ai through Aj sumsThus, if we ever find that Ai through Aj sums

to < 0, then we can advance i to j+1to < 0, then we can advance i to j+1 Proof. Suppose j is the first index after i when theProof. Suppose j is the first index after i when the

sum becomes < 0sum becomes < 0 The max subsequence cannot start at any p between iThe max subsequence cannot start at any p between i

and j. Because Aand j. Because Aii through Athrough App--11 is positive, so startingis positive, so startingat i would have been even better.at i would have been even better.

Page 21: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 21/60

SRIHARITECHSOFT Pvt Ltd

Algorithm 4Algorithm 4

int maxSum = 0, thisSum = 0;int maxSum = 0, thisSum = 0;

for( int j = 0; j < a.size( ); j++ )for( int j = 0; j < a.size( ); j++ ){{

thisSum += a[ j ];thisSum += a[ j ];

if ( thisSum > maxSum )if ( thisSum > maxSum )maxSum = thisSum;maxSum = thisSum;

else if ( thisSum < 0 )else if ( thisSum < 0 )thisSum = 0;thisSum = 0;

}}return maxSumreturn maxSum

The algorithm resets whenever prefix is < 0. Otherwise,The algorithm resets whenever prefix is < 0. Otherwise,it forms new sums and updates maxSum in one passit forms new sums and updates maxSum in one pass..

Page 22: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 22/60

SRIHARITECHSOFT Pvt Ltd

Why Efficient Algorithms MatterWhy Efficient Algorithms Matter

Suppose N = 10Suppose N = 1066

A PC can read/process N records in 1 sec.A PC can read/process N records in 1 sec. But if some algorithm does N*N computation, then itBut if some algorithm does N*N computation, then it

takes 1M seconds = 11 days!!!takes 1M seconds = 11 days!!!

100 City100 City Traveling Salesman ProblemTraveling Salesman Problem.. A supercomputer checking 100 billion tours/sec still requiresA supercomputer checking 100 billion tours/sec still requires

1010100100  years! years!

FastFast factoringfactoring algorithms can break encryptionalgorithms can break encryptionschemes. Algorithms research determines what is safeschemes. Algorithms research determines what is safecode length. (> 100 digits)code length. (> 100 digits)

Page 23: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 23/60

SRIHARITECHSOFT Pvt Ltd

How to Measure Algorithm PerformanceHow to Measure Algorithm Performance

What metric should be used to judge algorithms?What metric should be used to judge algorithms?

Length of the program (lines of code)Length of the program (lines of code)

Ease of programming (bugs, maintenance)Ease of programming (bugs, maintenance)

Memory requiredMemory required Running timeRunning time

Running time is the dominant standard.Running time is the dominant standard. Quantifiable and easy to compareQuantifiable and easy to compare

Often the critical bottleneckOften the critical bottleneck

Page 24: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 24/60

SRIHARITECHSOFT Pvt Ltd

AbstractionAbstraction

An algorithm may run differently depending on:An algorithm may run differently depending on: the hardware platform (PC, Cray, Sun)the hardware platform (PC, Cray, Sun) the programming language (C, Java, C++)the programming language (C, Java, C++) the programmer (you, me, Bill Joy)the programmer (you, me, Bill Joy)

While different in detail, all hardware and prog modelsWhile different in detail, all hardware and prog modelsare equivalent in some sense:are equivalent in some sense: T uring machinesT uring machines..

It suffices to count basic operations.It suffices to count basic operations.

Crude but valuable measure of algorithm·s performanceCrude but valuable measure of algorithm·s performance asasa function of input sizea function of input size..

Page 25: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 25/60

SRIHARITECHSOFT Pvt Ltd

Average, Best, and WorstAverage, Best, and Worst--CaseCase

On which input instances should theOn which input instances should thealgorithm·s performance be judged?algorithm·s performance be judged?

Average case:Average case: Real world distributions difficult to predictReal world distributions difficult to predict

Best case:Best case: Seems unrealisticSeems unrealistic

Worst case:Worst case: Gives an absolute guaranteeGives an absolute guarantee We will use the worstWe will use the worst--case measure.case measure.

Page 26: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 26/60

SRIHARITECHSOFT Pvt Ltd

ExamplesExamples

Vector additionVector addition Z = A+BZ = A+Bfor (int i=0; i<n; i++)for (int i=0; i<n; i++)

Z[i] = A[i] + B[i];Z[i] = A[i] + B[i];

T(n)  = c n T(n)  = c n 

Vector (inner) multiplicationVector (inner) multiplication z =A*Bz =A*B

z = 0;z = 0;for (int i=0; i<n; i++)for (int i=0; i<n; i++)

z = z + A[i]*B[i];z = z + A[i]*B[i];

T(n)  = c· + cT(n)  = c· + c11 n n 

Page 27: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 27/60

SRIHARITECHSOFT Pvt Ltd

ExamplesExamples

Vector (outer) multiplicationVector (outer) multiplication Z = A*BZ = A*BT T 

for (int i=0; i<n; i++)for (int i=0; i<n; i++)

for (int j=0; j<n; j++)for (int j=0; j<n; j++)Z[i,j] = A[i] * B[j];Z[i,j] = A[i] * B[j];

T(n)  = cT(n)  = c22 n n 2 2 ; ; 

A program does all the aboveA program does all the aboveT(n)  = cT(n)  = c00 + c+ c11 n  + cn  + c2  2  n n 

2 2 

Page 28: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 28/60

SRIHARITECHSOFT Pvt Ltd

Simplifying the BoundSimplifying the Bound

T(n)  = cT(n)  = ckk n n k k  + c+ ck k --11 n n k k --11 + c+ ck k --2  2  n n k k --2 2  + « ++ « +

cc11 n  + cn  + coo

too complicatedtoo complicated too many termstoo many terms

Difficult to compare two expressions, eachDifficult to compare two expressions, eachwith 10 or 20 termswith 10 or 20 terms

Do we really need that many terms?Do we really need that many terms?

Page 29: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 29/60

Page 30: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 30/60

SRIHARITECHSOFT Pvt Ltd

SimplificationSimplification

Drop the constant coefficientDrop the constant coefficient Does not effect the relative orderDoes not effect the relative order

Page 31: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 31/60

SRIHARITECHSOFT Pvt Ltd

SimplificationSimplification

TheThe fasterfaster growing term (such as 2growing term (such as 2nn))eventuallyeventually will outgrow the slower growingwill outgrow the slower growingterms (e.g., 1000 n) no matter what theirterms (e.g., 1000 n) no matter what their

coefficients!coefficients!

Put another way, given a certain increasePut another way, given a certain increase

in allocated time, a higher orderin allocated time, a higher orderalgorithm will not reap the benefit byalgorithm will not reap the benefit bysolving much larger problemsolving much larger problem

Page 32: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 32/60

SRIHARITECHSOFT Pvt Ltd

Complexity and TractabilityComplexity and Tractability

T (n)

n n n log n n2 n3 n4 n10 2n

10 .01Qs .03Qs .1Qs 1Qs 10Qs 10s 1Qs

20 .02Qs .09Qs .4Qs 8Qs 160Qs 2.84h 1ms

30 .03Qs .15Qs .9Qs Qs 810Qs 6.83d 1s

40 .04Qs .21Qs 1.6Qs Qs 2.56ms 121d 18m50 .05Qs .28Qs Qs Qs 6.25ms 3.1y 13d

100 .1Qs .66Qs 10Qs 1ms 100ms 3171y 4v1013

y

103

1Qs 9.96Qs 1ms 1s 16.67m 3.17v1013

y 32v10283

y

104

Qs 130Qs 100ms 16.67m 115.7d 3.17v1023

y

105

Qs 1.66ms 10s 11.57d 3171y 3.17v1033

y

10

6

ms

19.92ms

16.67m 31.71y 3.17v10

7

y 3.17v10

43

y

 Assume the computer does 1 billion ops per sec

Page 33: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 33/60

SRIHARITECHSOFT Pvt Ltd

log n n n log n n2 n3 2n

0 1 0 1 1 2

1 2 2 4 8 4

2 4 8 16 64 163 8 24 64 512 256

4 16 64 256 4096 65,536

5 32 160 1,024 32,768 4,294,967,296

0

10000

20000

30000

40000

50000

60000

70000

n

1

10

100

1000

10000

100000

n

Page 34: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 34/60

SRIHARITECHSOFT Pvt Ltd

Another ViewAnother View

More resources (time and/or processing power) translate intoMore resources (time and/or processing power) translate intolarge problems solved if complexity is lowlarge problems solved if complexity is low

T(n)T(n) Problem sizeProblem sizesolved insolved in101033 secsec

ProblemProblemsize solvedsize solvedin 10in 1044 secsec

IncreaseIncreasein Problemin Problemsizesize

100n100n 1010 100100 1010

1000n1000n 11 1010 1010

5n5n22 1414 4545 3.23.2NN33 1010 2222 2.22.2

22nn 1010 1313 1.31.3

Page 35: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 35/60

SRIHARITECHSOFT Pvt Ltd

AsympoticsAsympotics

T(n) keep one drop coef 

3n2+4n+1 3 n

2  n

101 n2+102 101 n2  n2 

15 n2+6n  15 n

2  n

a n2+bn+c a n

2  n

They all have the same growth rate

Page 36: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 36/60

SRIHARITECHSOFT Pvt Ltd

CaveatsCaveats

Follow the spirit, not the letterFollow the spirit, not the letter a 100n algorithm is more expensive than na 100n algorithm is more expensive than n22

algorithm when n < 100algorithm when n < 100

Other considerations:Other considerations: a program used only a few timesa program used only a few times

a program run on small data setsa program run on small data sets

ease of coding, porting, maintenanceease of coding, porting, maintenance memory requirementsmemory requirements

Page 37: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 37/60

SRIHARITECHSOFT Pvt Ltd

Asymptotic NotationsAsymptotic Notations

BigBig--O,O, ´bounded above byµ:´bounded above byµ: T(n) = O(f(n))T(n) = O(f(n)) For some c and N, T(n)For some c and N, T(n) ee c f(n) whenever n > N.c f(n) whenever n > N.

BigBig--Omega,Omega, ´bounded below byµ:´bounded below byµ: T(n) =T(n) = ;;(f(n))(f(n)) For some c>0 and N, T(n)For some c>0 and N, T(n) uu c f(n) whenever n > N.c f(n) whenever n > N. Same as f(n) = O(T(n)).Same as f(n) = O(T(n)).

BigBig--Theta,Theta, ´bounded above and belowµ:´bounded above and belowµ: T(n) =T(n) = 55(f(n))(f(n)) T(n) = O(f(n)) and also T(n) =T(n) = O(f(n)) and also T(n) = ;;(f(n))(f(n))

LittleLittle--o,o, ´strictly bounded aboveµ:´strictly bounded aboveµ: T(n) = o(f(n))T(n) = o(f(n)) T(n)/f(n)T(n)/f(n) pp 0 as n0 as n pp gg

Page 38: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 38/60

SRIHARITECHSOFT Pvt Ltd

By PicturesBy Pictures

BigBig--OhOh (most(mostcommonly used)commonly used) bounded abovebounded above

BigBig--OmegaOmega bounded belowbounded below

BigBig--ThetaTheta

exactlyexactly SmallSmall--oo

not as expensive as ...not as expensive as ...

0 N 

0 N 

Page 39: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 39/60

SRIHARITECHSOFT Pvt Ltd

ExampleExample

33

25

10

0

(?)(?)

nn

nn

nn

O

g

;

23 2)( nnnT  !

Page 40: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 40/60

SRIHARITECHSOFT Pvt Ltd

ExamplesExamples

)(log/1

))/((!)(

)(

)()(

)(

)1(c

Asym ptomic)(

1

0

11

321

21

1

ni

ennnr r 

ni

nini

nnc

n f 

ni

n

nin

i

k k ni

ni

ni

k ii

k i

57

5

57

57

57

57

57

5

!

!

!

!

!

!

Page 41: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 41/60

SRIHARITECHSOFT Pvt Ltd

Summary (Why O(n)?)Summary (Why O(n)?)

T(n)  = cT(n)  = ck k  n n k k  + c+ ck k --11 n n k k --11 + c+ ck k --2  2  n n 

k k --2 2  + « ++ « +cc11 n  + cn  + coo

Too complicatedToo complicatedO (n O (n k k   )  ) 

a single term with constant coefficienta single term with constant coefficientdroppeddropped

Much simpler, extra terms andMuch simpler, extra terms andcoefficientscoefficients do not matterdo not matter asymptoticallyasymptoticallyOther criteria hard to quantifyOther criteria hard to quantify

Page 42: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 42/60

SRIHARITECHSOFT Pvt Ltd

Runtime AnalysisRuntime Analysis

Useful rulesUseful rules simple statements (read, write, assign)simple statements (read, write, assign)

O(1) (constant)O(1) (constant)

simple operations (+simple operations (+ -- * / == > >= < <=* / == > >= < <=O(1)O(1)

sequence of simple statements/operationssequence of simple statements/operations

rule of sumsrule of sums for, do, while loopsfor, do, while loops

rules of productsrules of products

Page 43: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 43/60

SRIHARITECHSOFT Pvt Ltd

Runtime Analysis (cont.)Runtime Analysis (cont.)

Two important rulesTwo important rules Rule of sumsRule of sums

if you do a number of operations in sequence, theif you do a number of operations in sequence, theruntime is dominated by the most expensiveruntime is dominated by the most expensiveoperationoperation

Rule of productsRule of productsif you repeat an operation a number of times, theif you repeat an operation a number of times, the

total runtime is the runtime of the operationtotal runtime is the runtime of the operationmultiplied by the iteration countmultiplied by the iteration count

Page 44: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 44/60

SRIHARITECHSOFT Pvt Ltd

Runtime Analysis (cont.)Runtime Analysis (cont.)

if (cond) thenif (cond) then O(1)O(1)bodybody11 T T 11(n)(n)

elseelsebodybody22 T T 22(n)(n)

endifendif

T(n) = O(max (T T(n) = O(max (T 11(n), T (n), T 22(n))(n))

Page 45: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 45/60

Page 46: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 46/60

SRIHARITECHSOFT Pvt Ltd

ExampleExample

for (i=1; i<n; i++)for (i=1; i<n; i++)

if A(i) > maxVal thenif A(i) > maxVal then

maxVal= A(i);maxVal= A(i);

maxPos= i;maxPos= i;

Asymptotic Complexity: O(n)Asymptotic Complexity: O(n)

Page 47: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 47/60

SRIHARITECHSOFT Pvt Ltd

ExampleExample

for (i=1; i<nfor (i=1; i<n--1; i++)1; i++)for (j=n; j>= i+1; jfor (j=n; j>= i+1; j----))

if (A(jif (A(j--1) > A(j)) then1) > A(j)) thentemp = A(jtemp = A(j--1);1);

A(jA(j--1) = A(j);1) = A(j);A(j) = tmp;A(j) = tmp;

endifendif

endforendfor

endforendfor

Asymptotic Complexity is O(nAsymptotic Complexity is O(n22))

Page 48: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 48/60

SRIHARITECHSOFT Pvt Ltd

Run Time for Recursive ProgramsRun Time for Recursive Programs

T(n) is defined recursively in terms ofT(n) is defined recursively in terms ofT(k), k<nT(k), k<n

TheThe recurrence relationsrecurrence relations allow T(n) to beallow T(n) to be´unwoundµ recursively into some base´unwoundµ recursively into some basecases (e.g., T(0) or T(1)).cases (e.g., T(0) or T(1)).

Examples:Examples: FactorialFactorial

Hanoi towersHanoi towers

Page 49: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 49/60

SRIHARITECHSOFT Pvt Ltd

An Analogy: Cooking RecipesAn Analogy: Cooking Recipes

Algorithms are detailed and preciseAlgorithms are detailed and preciseinstructions.instructions.

Example: bake a chocolate mousse cake.Example: bake a chocolate mousse cake. Convert raw ingredients into processed output.Convert raw ingredients into processed output. Hardware (PC, supercomputer vs. oven, stove)Hardware (PC, supercomputer vs. oven, stove) Pots, pans, pantry are data structures.Pots, pans, pantry are data structures.

Interplay of hardware and algorithmsInterplay of hardware and algorithms Different recipes for oven, stove, microwave etc.Different recipes for oven, stove, microwave etc.

New advances.New advances. New models: clusters, Internet, workstationsNew models: clusters, Internet, workstations Microwave cooking, 5Microwave cooking, 5--minute recipes, refrigerationminute recipes, refrigeration

Page 50: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 50/60

SRIHARITECHSOFT Pvt Ltd

Bloom FiltersBloom Filters

BFs are probabilistic hash codes to reduce theBFs are probabilistic hash codes to reduce theamount of space.amount of space.

They offer a tradeThey offer a trade--off between the size of the filteroff between the size of the filterand the probability of a false positive i.e. when BFand the probability of a false positive i.e. when BF

says, x not in the set, answer always correct. But says, x not in the set, answer always correct. But when it says, x in the set, it may be wrong withwhen it says, x in the set, it may be wrong withsome small probability .some small probability .

This is extremely useful when most x are NOT inThis is extremely useful when most x are NOT in

the set of interest.the set of interest. Applications Applications -- web proxy caching, distributedweb proxy caching, distributednaming services, query routing, networknaming services, query routing, networkmeasurements, dictionaries, bad password files.measurements, dictionaries, bad password files.

Page 51: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 51/60

SRIHARITECHSOFT Pvt Ltd

 APPLICATIONS APPLICATIONS

Unix Spell Checker: When memory wasUnix Spell Checker: When memory wassmall/expensive, Bloom suggested using BFs.small/expensive, Bloom suggested using BFs.

Password files: Safford suggests using BF of "badPassword files: Safford suggests using BF of "bad

password lists".password lists". Distributed Caching: If a local cache doesn't haveDistributed Caching: If a local cache doesn't have

a web page, instead of going to the internet, ora web page, instead of going to the internet, orsource, first check if another nearby cache has it.source, first check if another nearby cache has it.

Set reconciliation: A sends to B its BF, and B thenSet reconciliation: A sends to B its BF, and B thensends the different sends the different 

Page 52: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 52/60

Page 53: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 53/60

SRIHARITECHSOFT Pvt Ltd

Search TreesSearch Trees

Trees useful structure for hierarchical information,Trees useful structure for hierarchical information,searching, etc. With large data, repeated linear search insearching, etc. With large data, repeated linear search inlink list too slow. With appropriate binary search trees,link list too slow. With appropriate binary search trees,we can do searches, inserts, deletes, find successor, findwe can do searches, inserts, deletes, find successor, findpredecessor etc in O(log n).predecessor etc in O(log n).

Binary Trees: Each node has 0, 1, or 2 children.Binary Trees: Each node has 0, 1, or 2 children. Most common form of trees: heap; expression trees;Most common form of trees: heap; expression trees;

branchbranch--andand--bound algorithms.9.bound algorithms.9. Binary SEARCH Trees: Important application of binary treesBinary SEARCH Trees: Important application of binary trees

in searching andin searching and implementing a search ADT.implementing a search ADT. What makes a binary tree a SEARCH tree is the "ordering"What makes a binary tree a SEARCH tree is the "ordering"

property:property: for every node X, the keys in the left subtree <for every node X, the keys in the left subtree <key(X)key(X) the keys in the right subtree >the keys in the right subtree >key(X).key(X).

Page 54: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 54/60

SRIHARITECHSOFT Pvt Ltd

Splay TreesSplay Trees

Self Self--adjusting Binary Search Treesadjusting Binary Search Trees

Balanced search trees (AVL, weight balanced,Balanced search trees (AVL, weight balanced,BB--Trees etc) have O(log n) worst case search,Trees etc) have O(log n) worst case search,

but are either complicated to implement, orbut are either complicated to implement, orrequire extra space for balance, or both.require extra space for balance, or both.

Splay trees don't maintain any explicit balanceSplay trees don't maintain any explicit balancecondition; rather they apply a simplecondition; rather they apply a simplerestructuring heuristic, called splaying EVERY restructuring heuristic, called splaying EVERY time the tree is accessed.time the tree is accessed.

Page 55: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 55/60

SRIHARITECHSOFT Pvt Ltd

B TreesB Trees

Often, it takes longer to read one page of information than toOften, it takes longer to read one page of information than toexamine it (compute).examine it (compute).

Thus, when dealing with diskThus, when dealing with disk--bound data structures, we lookbound data structures, we lookat two factors separately:at two factors separately: number of disk accesses,number of disk accesses, the CPU time.the CPU time.

BB--Tree algorithms operate at the granularity of pages i.e., theTree algorithms operate at the granularity of pages i.e., theunit operations are to READ or WRITE a page.unit operations are to READ or WRITE a page.

The main memory can only accommodate only so many pages,The main memory can only accommodate only so many pages,so older pages will be flushed out as new ones are fetched.so older pages will be flushed out as new ones are fetched.

Since we want to optimize the number of page accesses, we willSince we want to optimize the number of page accesses, we willchoose the size of the Bchoose the size of the B--Tree to match the page size.Tree to match the page size. That is, each node will store keys for about 50That is, each node will store keys for about 50--2000 items, and2000 items, and

will have the similar branching factor.will have the similar branching factor.

Page 56: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 56/60

SRIHARITECHSOFT Pvt Ltd

GRAPHSGRAPHS

Graphs are useful models for reasoningGraphs are useful models for reasoningabout relations among objects andabout relations among objects andcombinatorial problems.combinatorial problems.

Many realMany real--life problems can be solved bylife problems can be solved byconverting them to graphs.converting them to graphs.

Proper application of graph theory ideasProper application of graph theory ideas

can drastically reduce the solution time forcan drastically reduce the solution time forsome important problemssome important problems

Page 57: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 57/60

SRIHARITECHSOFT Pvt Ltd

Examples of GraphsExamples of Graphs

airport system:airport system:

nodes = airports; edges = pairs of airports with nonnodes = airports; edges = pairs of airports with non--stopstopflights.flights. (weight/cost = airfare; distance; capacity)(weight/cost = airfare; distance; capacity)

Internet:Internet:nodes = routers; edges = links.nodes = routers; edges = links.

social graphs: (6 degrees of separation)social graphs: (6 degrees of separation)

nodes = people; edges = friends/acquaintance/conodes = people; edges = friends/acquaintance/co--authorsauthors

academic graphs:academic graphs:

nodes = courses; edges = prereqs;nodes = courses; edges = prereqs;

Page 58: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 58/60

SRIHARITECHSOFT Pvt Ltd

SortingSorting

O(n^2) Insertion sort is a classicO(n^2) Insertion sort is a classicExample, Shell sort Example, Shell sort 

Bucket Sort Bucket Sort -- Absolute Value Absolute Value HeapHeap

Build a heap on the n elementsBuild a heap on the n elements-- O(n) time.O(n) time. Do n DeleteMinsDo n DeleteMins-- O(n log n) total.O(n log n) total.

QuickQuick One of the fastest sorting algorithms in practice.One of the fastest sorting algorithms in practice. It's expected running time is O(n log n), though the worst It's expected running time is O(n log n), though the worst--case iscase is

O(n^2);O(n^2);

MergeMerge  A classic example of divide A classic example of divide--andand--conquer.conquer. T(n) = 2^log n + nlogn = nlogn + n.T(n) = 2^log n + nlogn = nlogn + n.

Page 59: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 59/60

SRIHARITECHSOFT Pvt Ltd

Possible ProblemsPossible Problems

Magic SquareMagic Square

Large FactorialLarge Factorial

Josephus CircleJosephus CircleRat in a MazeRat in a Maze

 Array Problems Array Problems

Sparse MatricesSparse Matrices

Page 60: Bu Presentation Dsa

8/8/2019 Bu Presentation Dsa

http://slidepdf.com/reader/full/bu-presentation-dsa 60/60

SRIHARITECHSOFT Pvt Ltd

Other termsOther terms

 Adjacency Matrix, Topological Sort  Adjacency Matrix, Topological Sort 

Shortest Path, Weighted Shortest Paths,Shortest Path, Weighted Shortest Paths,

Spanning TreesSpanning Trees -- Minimum, Spanning treeMinimum, Spanning tree   Prims , Kruskals Algorithm,Prims , Kruskals Algorithm,

Divide and Conquer, Back Tracking,Divide and Conquer, Back Tracking,

Branch and Bound,Branch and Bound,Dynamic Programming.Dynamic Programming.