Example Problems for Greedy Algorithms

 

  1. Change Making
  2. Minimum Spanning Trees (MST)
  3. Shortest Distances

 

 Change Making:

 

            Coins: 1, 5, 10, 25, and 50

 

            Note that for this set of coins, a Greedy algorithm works as each coin

            is either a multiple of the next smaller one or twice the next smaller one plus

            the one below the next smaller one.

 

 

MST:

 

Some definitions.

 

 

A Lemma for extending a Promising set of edges.

 

            Let B be a strict subset of nodes of a graph (undirected)

            Let T be a promising set of edges such that no edges in T leaves B

            Let v be the shortest edge that leaves B (or one of the shortest if ties exists).

            Then T+{v} is promising.

 

            Note the required relationship between the promising set T of edges and an

            arbitrary strict subset B of nodes.

Given T, B must contain either both vertices or neither vertex of each edge in

            T while it contains exactly one vertex of this edge v.

(Proof on page-192)

 

Prim’s algorithm.

 

            T = f (empty-set)

            B = {an arbitrary member of N} //N=set of nodes of an undirected graph

            While (B <> N)

                        Find a edge e={u,v} of minimum length

                     such that u is in B and v in N-B

                        B = B+{v}

                        T = T+{e} 

            Return T // set of edges forming an MST.

 

Kruskal’s algorithm.

 

            Sort A (set of edges) by increasing length.

            n = number of nodes in N

            T = f (to contain the solution set of edges)

            Initialize n sets, each containing a node in N.

            Repeat

                        e = {u,v} = shortest edge not considered

                        if ucomp <> vcomp

                                    merge (ucomp, vcomp)

                                    T = T+{e}

                        Until T contains n-1 edges

            return T

 

Note:

 

  1. For the above Kruskal’s algorithm to be efficient, we need a good representation of disjoint sets.
  2. For this, we can use the following schemes given in section 5.9 (page 175-179)

 

 

1

2

3

2

1

3

2

3

3

2

     

      Above array represents the following three sets recognized by the first number of

      each set:           {1,5}

                              {2,4,7,10}

                              {3,6,8,9}

 

      For this representation, the following two functions can be used:

 

                  find1(x) //page-175                   merge1(a,b)

                  {  return set[x] }                       {  min=smaller(a,b)

                   //TC=constant                             max=larger(a,b)

                                                                      for k=1,N // N=10 in this case

                                                                      {

                                                                       if set[k]=max set[k]=min

                                                                       } 

}

                                                                  // TC=linear in N

 

      Executing find1()  n (Comparable to N) and merge1() at most N-1 times will

      take  O(n2 )

 

·        Better representation: Rooted Tree representation

 

1

2

3

2

1

3

4

3

3

4

      

      This array represents the same three sets but as a rooted tree each.

      One of three sets rooted at #2 may look like:

 

2

 
                             

 

 

 

 

 

 

 

 

 


                             

 

      For this representation, the following functions can be used:

 

      find2(x)                                                merge2(a,b)

      {  r=x                                                   {  if a<b set[b]=a

          while (set[r]<>r)                                   else     set[a]=b}

                  r=set[r]                         // TC=constant

          return r  }

      // TC=linear in N

 

      Executing find2()  n (Comparable to N) and merge2() at most N-1 times will

      take  O(n2 )  just like find1() and merge1() above due to the worst case of the

      height of rooted trees (caused by extremely unbalanced trees)

 

      To reduce time required for find2(), we need to maintain relatively balanced trees.

      For this, the following function can be used:

 

                  merge3(a,b)

                  // we maintain the height of each tree separately.

                  {  if height[a]==height[b]

                              {

                              height[a]++

                              set[b]=a           // tree b becomes a subtree of tree a

                              }

                     else  if (height[a]<height[b])

                                 set[a]=b        //tree rooted at a now is gone as it becomes a subtree

                              else

                                 set[b]=a

                        }   

      Now executing find2()  n (Comparable to N) and merge3() at most N-1 times will

      take  O(N+n log N)  where logN now represents the maximum height of any tree

      and it is the time complexity of find2() now and the TC of merge3() is a constant.

 

  1. To reduce the height of trees, we can further modify find2() as follows:

find3(x)           

// This function makes each node on the path of the node x in search

an immediate child of the current root in addition to finding the tree which

the node x belongs to. So, trees will be of height at most 2.

{  r=x

    while (set[r]<>r)

            r=set[r]

    // now the root of the tree is found which is r.

    // time to compress the path from this root and node x.

    node=x

    while (node<>r)

            {  keep = set[node]

                set[node]=r

                node = keep

            }          // now node=r (the root)

                          return r

                        }

 

      Effect of above function find3(16)

                     

                    (Before)                                                         (After)

10

 

10

 
                                         

 

 

 


18

 

15

 
     

15

 

18

 

22

 
 

 


                 

22

 
                             

25

 
 

 

 

 

 

 

 

 


25

 
                             

 

 

Textbook says that Prim’s algorithm takes q (n2) and  and Kruskal’s

takes q(a*log n). So, if the graph has more nodes (compared to edges, in a way),

Kruskal’s algorithm may be preferable. And, if the graph has mode edges (relative to

nodes) then Prim’s algorithm is preferable.        

 

 

  

       1                           10            2

 

    

              12                   14                  16

                       15                  20

       3                      4                       5   

   

                       19           22          18       

 

                               6 

 

 

 

 

 

 

 

 
 


           

                                   

 

 

 


                       

       

     

   

   

 

 

 

   

            Example: Kruskal’s algorithm.

 

            For the above graph, initially, we have 6 sets:

S1={1}, S2={2}, S3={3}, S4={4}, S5={5}, and S6={6} for the six nodes of the

            graph.

 

1.      The shortest edge, {1,2,10} connects two separate sets, hence it is to be picked.

2.      T = {{1,2}} and S1={1,2} //result of merging two sets S1 and S2, S2 is gone now.

3.      Next, the shortest edge, {1,3,12} connects two separate sets S1 and S3 and so,

it is to be picked up.

T = {{1,2}, {1,3}} and S1 ={1,2,3}   

4.      Next, the shortest edge, {2,4,14} connects two sets, S1 and S4.

So, it is to be picked.

T = {{1,2}, {1,3}, {2,4}} and S1 ={1,2,3,4}     

5.      Next, {2,5,16} and finally,   {5,6,18}is to be picked.

6.      Final result:  T = {{1,2}, {1,3}, {2,4},{2,5},{5,6}}

 

 

Shortest Distance Problem: Connected Weighted Graph (directed or undirected)

 

Finding the shortest distance from a given source node to every other node of the graph.

 

Dijkstra’s Algorithm : Assumption: Each weight is non-negative

3

 

2

 

1

 
Example                                                                 25                  

                                             12                         9       

                                                           

                              14       15               22                       14 

 

4

 

6

 

5

 
                                         30                               11

 


                                16                10           9                  16

 

8

 

7

 
                                                       

                                                             28 

 

                        Matrix: L         

12

25

14

15

Inf

Inf

Inf

12

9

Inf

22

Inf

Inf

Inf

25

9

Inf

Inf

14

Inf

Inf

14

Inf

Inf

30

Inf

16

Inf

15

22

Inf

30

11

10

9

Inf

Inf

14

Inf

11

Inf

16

Inf

Inf

Inf

16

10

Inf

28

Inf

Inf

Inf

Inf

9

16

28

 

                        1. First choice: #2: Min distance = 12

                            This choice updates the matrix as follows:

                       

(12)

21

14

15

Inf

Inf

Inf

12

9

Inf

22

Inf

Inf

Inf

21

9

Inf

Inf

14

Inf

Inf

14

Inf

Inf

30

Inf

16

Inf

15

22

Inf

30

11

10

9

Inf

Inf

14

Inf

11

Inf

16

Inf

Inf

Inf

16

10

Inf

28

Inf

Inf

Inf

Inf

9

16

28

 

                        2. Next choice: #4: Min Distance = 14

                            This choice updates the matrix as follows:

                       

(12)

21

(14)

15

Inf

30

Inf

12

9

Inf

22

Inf

Inf

Inf

21

9

Inf

Inf

14

Inf

Inf

14

Inf

Inf

30

Inf

16

Inf

15

22

Inf

30

11

10

9

Inf

Inf

14

Inf

11

Inf

16

30

Inf

Inf

16

10

Inf

28

Inf

Inf

Inf

Inf

9

16

28

 

                        3. Next choice: #5: Min Distance = 15

                            This choice updates the matrix as follows:

                       

(12)

21

(14)

(15)

26

25

24

12

9

Inf

22

Inf

Inf

Inf

21

9

Inf

Inf

14

Inf

Inf

14

Inf

Inf

30

Inf

16

Inf

15

22

Inf

30

11

10

9

26

Inf

14

Inf

11

Inf

16

25

Inf

Inf

16

10

Inf

28

24

Inf

Inf

Inf

9

16

28

 

                        4. Next Choice: #3: Min Distance=21

                            This choice does not update the matrix.

 

(12)

(21)

(14)

(15)

26

25

24

12

9

Inf

22

Inf

Inf

Inf

21

9

Inf

Inf

14

Inf

Inf

14

Inf

Inf

30

Inf

16

Inf

15

22

Inf

30

11

10

9

26

Inf

14

Inf

11

Inf

16

25

Inf

Inf

16

10

Inf

28

24

Inf

Inf

Inf

9

16

28

 

                        5. Next Choice: #8: Min Distance=24

                            This choice does not update the matrix.

 

                        6. Next Choice: #7: Min Distance=25:

    No updating either.

 

                        7. Last Choice: #6:  Min Distance=26

 

(12)

(21)

(14)

(15)

(26)

(25)

(24)

12

9

Inf

22

Inf

Inf

Inf

21

9

Inf

Inf

14

Inf

Inf

14

Inf

Inf

30

Inf

16

Inf

15

22

Inf

30

11

10

9

26

Inf

14

Inf

11

Inf

16

25

Inf

Inf

16

10

Inf

28

24

Inf

Inf

Inf

9

16

28

 

 

Scheduling Problem

 

1.       Minimizing Time in the System

2.       Scheduling with Deadlines

 

Minimizing Time in the System.

 

  Problem description: Given n jobs with Sj  service time for j=1,2,…,n, we want to schedule

  these jobs for the single server such that the average time that a job spends in the system

  (and equivalently the total time spent in the system by all the jobs) is minimized.

  Example:        Suppose that we have 4 jobs with

                        S1 = 10

S2 = 15

S3 = 20 and

S4 = 12

 

                        The optimal schedule is:  J1, J4, J2, and J3 in that order.

                        This gives the average wait time (time spent in the system by each job) of:

                        (10 + 22 + 37 + 57)/4 = 31.5

 

  The simple greedy algorithm is:

    To schedule all the jobs (with known service times) in order of non-decreasing service 

    time.

 

 

Scheduling with Deadlines: Maximizing Total Profit

 

  Problem description: We have a set of n jobs to execute, each of which takes the unit

  time. Job k earns us a profit  pk  > 0 if it is executed no later than time dk .

  We want to schedule these jobs to maximize the total profit.

  (Not all the jobs may be executed because of conflicting deadlines)

 

  Example:        Suppose that we have 6 jobs with

 

                        k          1          2          3          4          5          6

                        pk         50        10        15        30        10        40

                        dk         2          1          2          1          3          2                       

 

                        In this case, the optimal schedule is :  J1, J6 and J5 (or  J6, J1, and J5) in that

order for the total profit of 100. Jobs 2, 3 and 4 are not to be executed.

 

The solution is equivalent to finding a feasible sequence of jobs (one that allows every job in the sequence to be executed no later than its deadline) with maximum total profit.

For example, {J1, J2, J3, J5} or {J1, J2, J3}is not a feasible sequence but {J2,J3,J5} or {J4,J3,J5}or {J1,J3,J5}is.

 

The function sequence() on page 210 finds an optimal sequence based up jobs sorted in order of decreasing profit. In this example, J1, J6, J4, J3, J2, and J5.

 

Tracing this function, we see changes occurring to the optimal sequence being constructed as follows:

 

Initially

1

 

  

 i=2 when J6 is considered

1

6

 

i=3 when J4 is considered: Not to be included due to its early deadline.

1

6

 

i=4 when J3 is considered. Not to be included due to its early deadline.

1

6

 

i=5 when J2 is considered. Not to be included due to its early deadline.

1

6

 

i=6 when J5 is considered.

1

6

5

 


 

int sequence (int jobs, int profit[], int deadline[], int result[])

// assume that jobs are ordered in ascending order of their profit and

// that the two parallel input arrays contain their respective profits and deadlines,

// and ‘jobs’ is the total number of jobs.

{          result[0] = 0; // first job selected into the output array result

            int where;

            int made=1; // number of jobs already selected.

            for (int current = 1; current<=jobs-1; current++)

                        {

                        where=made; // potential position in result of current job

                        while (where>0 && (deadline[current]<deadline[result[where-1]]))

                           where--;

                        // ‘where’ is the number of jobs that are included with deadline earlier than

                        //  or equal to that of the current job.

                        if (deadline[current]>where) // current job must be included.

                            {

     if (where<made)      // at least one scheduled job has a later deadline

                                 for (int right=made; right>=where+1; right--)

                                    result[right]=result[right-1];      

                             result[where] = current;

                             made++;

                             }     // end of current job inclusion

                        }

            return made;     // returning the number of jobs sequenced.

}                                  // end of function sequence()