Survey On Efficient Task Scheduling In A Cost Efficient Manner

DOI : 10.17577/IJERTV2IS120686


¹I. Prisilla Preethi, ²Mrs. M. Roshni Thanka

¹P.G. Student, Department of Computer Science and Engineering,

²Assistant Professor, Department of Computer Science and Engineering, Karunya University

Abstract

Cloud computing relies on the sharing of resources to achieve coherence and economies of scale, similar to a utility delivered over a network. The cloud also focuses on maximizing the effectiveness of the shared resources. Cloud computing resources are not only shared by multiple users but are also dynamically re-allocated according to user demand. Cloud computing provides on-demand self-service: it allows users to configure, obtain, and deploy cloud services themselves using cloud service catalogues, without requiring the assistance of information technology staff. Cloud computing thus offers a promising new platform for executing large programs in the cloud, where a large program can be divided into multiple sequences of tasks.

Keywords: Cloud Computing, Parallel task scheduling

  1. Introduction

    Cloud is a type of parallel and distributed system consisting of a collection of interconnected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources, based on service level agreements established through negotiation between service providers and consumers. Cloud computing involves delivering hosted services over the internet.

    By using the virtualization concept, cloud computing can also support heterogeneous resources and remain flexible. Another merit of cloud computing is its scalability. Cloud computing has come under a growing spotlight as a possible solution for providing a flexible, on-demand computing infrastructure for a number of applications. A virtual machine (VM) is a software implementation of a computing environment in which an operating system or program can be installed and run.

    Scheduling is the method by which tasks are given access to system resources. This is usually done to balance the load on a system effectively or to achieve a target quality of service. The need for a scheduling algorithm arises from the requirement of most modern systems to perform multitasking and multiplexing.

    The scheduler is concerned mainly with:

    Throughput is the total number of processes that complete their execution per unit of time. Turnaround time is the total time between the submission of a process and its completion. Response time is the amount of time from when a request is submitted until the first response is produced.
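
    These three metrics can be made concrete with a small sketch. The `ProcessRecord` type and the observation window below are illustrative assumptions, not part of any particular scheduler:

```python
from dataclasses import dataclass

@dataclass
class ProcessRecord:
    submitted: float       # time the process was submitted
    first_response: float  # time the first response was produced
    completed: float       # time the process finished

def scheduler_metrics(records, window):
    """Throughput, average turnaround time, and average response time
    for one observation window of the given length."""
    throughput = len(records) / window
    avg_turnaround = sum(r.completed - r.submitted for r in records) / len(records)
    avg_response = sum(r.first_response - r.submitted for r in records) / len(records)
    return throughput, avg_turnaround, avg_response
```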

    A task scheduling mechanism can meet end users' requirements and improve resource utilization, thereby improving the overall performance of the cloud computing environment. A task is an activity that uses a set of inputs to produce an optimal output. In static scheduling, the processes in a fixed set are assigned to processors at compile time. Such a task scheduling mechanism can not only meet end-user requirements but also achieve efficient resource utilization; however, it needs further improvement, as the whole approach depends on the accuracy of the predicted execution time of each task. The primary goal of task scheduling is to schedule tasks on processors so as to minimize the makespan of the schedule. The task scheduling problem can be modeled as a weighted Directed Acyclic Graph (DAG): a vertex represents a task, and its weight the size of the task's computation; an arc represents the communication between two tasks, and its weight the communication cost. The direction of an edge shows the dependency between the two tasks.
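
    The weighted-DAG model can be sketched in Python; the four-task graph and its weights below are invented for illustration. A quantity many schedulers in this survey rely on is the critical path, the heaviest chain of computation and communication weights through the graph:

```python
# Hypothetical weighted DAG: vertex weight = computation size, arc weight = communication cost.
tasks = {"A": 3, "B": 2, "C": 4, "D": 1}
edges = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 1, ("C", "D"): 3}

def critical_path_length(tasks, edges):
    """Length of the heaviest computation-plus-communication chain in the DAG."""
    succ = {}
    for (u, v), w in edges.items():
        succ.setdefault(u, []).append((v, w))
    memo = {}
    def longest_from(u):
        if u not in memo:
            memo[u] = tasks[u] + max(
                (w + longest_from(v) for v, w in succ.get(u, [])), default=0)
        return memo[u]
    return max(longest_from(u) for u in tasks)
```

For this graph the critical path is A → C → D with length 3 + 2 + 4 + 3 + 1 = 13.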

  2. Related Works

    Task scheduling algorithms are described below.

    1. Cost-Based Workflow Scheduling Algorithm

      R. Buyya proposed a cost-based workflow scheduling algorithm for minimizing both the execution time and the execution cost. It optimally solves the task scheduling problem for several sequential tasks with branches. By rescheduling unexecuted tasks, it can adapt to delays in service executions. Two typical QoS conditions are time and execution cost. For workflows, a pay-per-use model is applied: users get the execution done at the lowest possible cost within their required time frame [20]. The merits of this algorithm are reduced processing time and execution cost. The demerit is that sub-deadlines are assigned based on the task deadline level, which incurs unnecessary cost.
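
      The per-task decision at the heart of such an algorithm, choosing the cheapest service that still meets the task's sub-deadline, can be sketched as follows. The service list and its `time`/`cost` fields are hypothetical, not taken from the cited paper:

```python
def cheapest_within_deadline(services, sub_deadline):
    """Pick the lowest-cost service whose estimated runtime meets the sub-deadline.
    If none is fast enough, fall back to the fastest available service."""
    feasible = [s for s in services if s["time"] <= sub_deadline]
    if not feasible:
        return min(services, key=lambda s: s["time"])
    return min(feasible, key=lambda s: s["cost"])
```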

    2. A Framework For Mapping Complex Scientific Workflows Onto Distributed Systems

      This method is used to represent workflows at an abstract level without needing to worry about the final execution systems. Application performance is increased by the workflow scheduling algorithm. To keep the overhead of the scheduling system and of job submission low, tasks of low computational granularity should be clustered.

      The overall structure of the application and its maximum parallelism are exposed by the workflow [3]. The advantages of this framework for complex scientific workflows are improved application performance and a mapping horizon that adjusts to the volatility of the target execution system. The disadvantage is overloading of the head node with too many jobs.

    3. Nephele: Efficient Parallel Data Processing

      Nephele is a data processing framework that exploits dynamic resource allocation for task scheduling and job execution. Particular tasks of a job can be assigned to different types of virtual machines in the cloud, which are automatically created and terminated during job execution. Tasks are allocated based on virtual machine capacity; jobs with a high workload and high cost are allocated accordingly. In the course of a job's execution, virtual machines are allocated and deallocated, which improves overall performance and reduces processing cost. The demerit is that exploiting data locality requires moving large amounts of data from one place to another through the network.

    4. A Budget Constrained Based Scheduling

      R. Buyya proposed this method for reducing the execution time. For the workflow scheduling problem, a solution must meet certain conditions: a task can start only when all of its predecessors have completed, and each task appears in the schedule only once. A fitness function is developed to encourage the formation of correct solutions that achieve time minimization under the budget constraint. The algorithm supports run-time rescheduling, different service negotiation models, and duplication of critical tasks to meet end users' QoS requirements [20]. This method is used by both cloud providers and users. Its main disadvantage is that a child task is allowed to execute only after all of its parent tasks have completed. Because the problem is NP-hard, no known algorithm can generate the optimal solution within polynomial time.
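
      One common way to build such a fitness function is to minimize execution time while penalizing schedules that exceed the budget. The penalty weight below is an illustrative choice, not the exact formulation from the cited work:

```python
def fitness(schedule_time, schedule_cost, budget, penalty_weight=10.0):
    """Lower is better: execution time, plus a penalty for exceeding the budget.
    penalty_weight is an illustrative assumption."""
    over_budget = max(0.0, schedule_cost - budget)
    return schedule_time + penalty_weight * over_budget
```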

    5. Task Duplication Based Scheduling Algorithm For Network Heterogeneous Systems

      This algorithm uses a directed acyclic graph to provide exact results for applications, provided that a simple set of conditions on network communication time and task computation time is satisfied. The main aim of this algorithm is to minimize the schedule time and the scheduling length itself. It initially creates a set of clusters similar to linear clusters [2]. The main benefit of this method is that it provides optimal results when the system is homogeneous; it reduces scheduling complexity, schedule cost, and schedule processing time. The main problem is that the complexity increases when task scheduling is to be done in a heterogeneous environment.

    6. Genetic Algorithm

      Robust searching techniques are provided by the genetic algorithm, which allows a high-quality solution to be derived from a large search space in polynomial time. A genetic algorithm combines the exploitation of worthy solutions from past searches with the exploration of new regions of the solution space. It maintains a population of individuals that evolves over the generations of the search [11]. A typical genetic algorithm consists of the following steps: an initial population is created, consisting of randomly generated solutions; new offspring are generated by applying the genetic operators of selection, crossover, and mutation; and the fitness value of each individual in the population is evaluated. The advantages are that it solves the schedule optimization problem and yields better response times. The demerits are its longer execution time and high computational complexity.
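
      The steps above, a random initial population followed by repeated selection, crossover, and mutation, can be sketched for a toy version of the problem: mapping independent tasks (no dependencies, a simplifying assumption) to processors so as to minimize the makespan. All parameter values below are illustrative:

```python
import random

def makespan(mapping, task_times, n_procs):
    """Finish time of the busiest processor for independent tasks."""
    loads = [0.0] * n_procs
    for task, proc in enumerate(mapping):
        loads[proc] += task_times[task]
    return max(loads)

def genetic_schedule(task_times, n_procs, pop_size=20, generations=50, seed=0):
    rng = random.Random(seed)
    n = len(task_times)
    # Initial population: random task-to-processor mappings.
    pop = [[rng.randrange(n_procs) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: makespan(m, task_times, n_procs))  # fitness evaluation
        survivors = pop[: pop_size // 2]                          # selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]                             # one-point crossover
            if rng.random() < 0.2:                                # mutation
                child[rng.randrange(n)] = rng.randrange(n_procs)
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda m: makespan(m, task_times, n_procs))
```

Because the best individuals always survive to the next generation, the best makespan found never gets worse as the search proceeds.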

    7. Heterogeneous Earliest Finish Time Algorithm

      It is a very simple and computationally inexpensive algorithm that maps tasks to particular resources. It schedules a workflow by creating an ordered list of jobs out of the workflow. The heterogeneous algorithm consists of three phases: weighting assigns weights to the nodes and edges in the workflow; ranking creates an ordered list of jobs, organized in the order in which they should be executed; and mapping assigns the tasks to resources [22]. In heterogeneous environments, the weights must be approximated by considering different predictions of execution times on the different resources and of data transfer times on the different data links. The advantages are that it is effective, less time consuming, inexpensive, and well suited to unbalanced workflows. A disadvantage is that it does not always provide a good optimal solution.
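
      The ranking phase is usually implemented as an upward-rank recursion: a task's rank is its (averaged) computation weight plus the maximum, over its successors, of the (averaged) communication weight plus the successor's rank, and tasks are then scheduled in decreasing rank order. The three-task graph below is invented for illustration:

```python
def upward_ranks(comp, comm, succ):
    """Upward rank of each task: computation weight plus the maximum over
    successors of (communication weight + successor's rank)."""
    memo = {}
    def rank(t):
        if t not in memo:
            memo[t] = comp[t] + max(
                (comm.get((t, s), 0) + rank(s) for s in succ.get(t, [])), default=0)
        return memo[t]
    return {t: rank(t) for t in comp}

comp = {"A": 2, "B": 3, "C": 1}                       # averaged computation costs
comm = {("A", "B"): 1, ("A", "C"): 4, ("B", "C"): 2}  # averaged communication costs
succ = {"A": ["B", "C"], "B": ["C"]}
ranks = upward_ranks(comp, comm, succ)
order = sorted(comp, key=lambda t: -ranks[t])         # highest rank scheduled first
```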

    8. Dynamic Critical Path Scheduling

      This algorithm examines the critical path of the task graph. The partial schedule on each processor is rearranged dynamically, so the schedule length can be reduced incrementally even for many processors. It determines the critical path of the task graph, that is, the path taking the most time to execute, and selects the next node to be scheduled in a dynamic manner. It selects a suitable processor for a node by looking ahead at the potential start times of the remaining nodes on that processor [24]. It rearranges the schedule on each processor dynamically, in the sense that the positions of the nodes in the partial schedules are not fixed until all nodes have been considered. A good benefit of dynamic critical path scheduling is that the schedule length is reduced monotonically; it also scales to many processors and is suitable for different types of graph structure. It schedules relatively unimportant nodes to processors already in use, in order not to waste processors. The disadvantage is its time complexity.

    9. Dryad: Distributed Data-Parallel Programs From Sequential Building Blocks

      Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. Dryad runs an application by executing the vertices of a graph on a set of computers, using TCP pipes and shared memory as communication channels, which improves performance. A Dryad application combines computational vertices with communication channels to form a dataflow graph. The advantage is that a high-performance distributed execution engine can be built: Dryad allows graph vertices to use an arbitrary number of inputs and outputs [15], and it provides a flexible test bed. The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: recovering from communication failures and transporting data between the vertices. The main drawbacks are that it addresses a long-standing problem and has a highly complicated architecture.

    10. A Compile Time Scheduling Heuristic

      This technique, also known as dynamic level scheduling, matches tasks with processors using dynamically changing priorities. To eliminate shared-resource contention, both the temporal and the spatial dimensions are used [6]. The advantages are low complexity, highly efficient performance, usefulness for partitioning and clustering parallel programs, and the elimination of shared communication resource contention. The drawback is the inter-processor communication overhead.

  3. Conclusion

    Task scheduling is one of the important and challenging problems in cloud computing. Task scheduling has been explained, and the surveyed algorithms compute scheduling plans that produce makespans as good as those of the best known algorithms while significantly reducing monetary costs. Based on a cost-efficient task scheduling algorithm, programs can be executed in an efficient manner. Many algorithms are used to reduce both time and cost.

  4. References

[1] D. Bozdag, F. Ozguner, U. Catalyurek (2009), Compaction of schedules and a two-stage approach for duplication-based DAG scheduling, IEEE Transactions on Parallel and Distributed Systems.

[2] D. Bozdag, U. Catalyurek, F. Ozguner (2006), in: IPDPS 2006, IEEE.

[3] E. Deelman, G. Singh, M. Su, J. Blythe, Y. Gil, C. Kesselman, G. Mehta, K. Vahi, G. Berriman, J. Good, et al. (2005), Pegasus: a framework for mapping complex scientific workflows onto distributed systems.

[4] E. Hou, N. Ansari, H. Ren (1994), A genetic algorithm for multiprocessor scheduling, IEEE Transactions on Parallel and Distributed Systems.

[5] E. Ulungu, J. Teghem (1994), Multi-objective combinatorial optimization problems: a survey, Journal of Multi-Criteria Decision Analysis.

[6] G. Sih, E. Lee, A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures, IEEE Transactions on Parallel and Distributed Systems.

[7] G. Singh, C. Kesselman, E. Deelman (2007), A provisioning model and its comparison with best-effort for performance-cost optimization in grids, ACM.

[8] H. Topcuoglu, S. Hariri, M. Wu (2002), Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems.

[9] International Symposium on High Performance Distributed Computing.

[10] J. Ullman (1975), NP-complete scheduling problems, Journal of Computer and System Sciences.

[11] J. Yu, R. Buyya (2006), A budget constrained scheduling of workflow applications on utility grids using genetic algorithms, in: Workshop on Workflows in Support of Large-Scale Science (WORKS06), IEEE.

[12] J. Yu, R. Buyya, C. Tham (2005), Cost-based scheduling of scientific workflow applications on utility grids, in: First International Conference on e-Science and Grid Computing, IEEE.

[13] K. Kurowski, J. Nabrzyski, A. Oleksiak, J. Weglarz (2006), Scheduling jobs on the grid – multicriteria approach, Computational Methods in Science.

[14] K. Kurowski, J. Nabrzyski, A. Oleksiak, J. Weglarz (2006), Grid multicriteria job scheduling with resource reservation and prediction mechanisms, in: Perspectives in Modern Project Scheduling.

[15] M. Isard, M. Budiu, Y. Yu, A. Birrell, D. Fetterly (2007), Dryad: distributed data-parallel programs from sequential building blocks, ACM SIGOPS Operating Systems Review.

[16] M. Wieczorek, A. Hoheisel, R. Prodan (2009), Towards a general model of the multi-criteria workflow scheduling on the grid, Future Generation Computer Systems.

[17] M. Wieczorek, R. Prodan, T. Fahringer (2005), Scheduling of scientific workflows in the ASKALON grid environment, ACM SIGMOD Record.

[18] Q. Zhu, G. Agrawal (2010), Resource provisioning with budget constraints for adaptive applications in cloud environments, in: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing.

[19] R. Bajaj, D. Agrawal (2004), Improving scheduling of tasks in a heterogeneous environment, IEEE Transactions on Parallel and Distributed Systems.

[20] R. Buyya, C. Yeo, S. Venugopal, J. Broberg, I. Brandic (2009), Cloud computing and emerging IT platforms: vision, hype, and reality for delivering computing as the 5th utility, Future Generation Computer Systems.

[21] R. Sakellariou, H. Zhao, E. Tsiakkouri, M. Dikaiakos (2007), Scheduling workflows with budget constraints, Integrated Research in Grid Computing.

[22] S. Garg, R. Buyya, H. Siegel (2010), Time and cost trade-off management for scheduling parallel applications on utility grids, Future Generation Computer Systems.

[23] T. Yang, A. Gerasoulis (1994), DSC: scheduling parallel tasks on an unbounded number of processors, IEEE Transactions on Parallel and Distributed Systems.

[24] Y. Kwok, I. Ahmad (1996), Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors, IEEE Transactions on Parallel and Distributed Systems.

[25] www.wikipedia.com
