Survey Paper On Resource Discovery Model In Grid Computing

DOI : 10.17577/IJERTV2IS60468


Ramya N, Nagarathna N

Dept. of Computer Science, BMS College of Engineering, Bangalore, India

Abstract

Grids are geographically distributed platforms composed of heterogeneous resources that users can access via a single interface, providing common resource-access technology and operational services across widely distributed and dynamic virtual organizations, i.e., institutions or individuals that share resources. Resources are generally meant as reusable entities employable to execute applications, and comprise processing units, secondary storage, network links, software, data, special-purpose devices, etc.

Resource discovery in grid systems is a fundamental task: it searches for and locates the resources needed by a given process. Many different approaches to this problem exist in the literature.

In this paper we present a survey of widely used grid resource discovery techniques, which have resulted in many tools that have become de facto standards of today's grid resource management. We also give a comparative study of the different resource discovery techniques.

Keywords: Grid Computing; Distributed Computing; Resource Discovery; Centralized Grid; Peer-to-Peer (P2P) Grid; Hierarchical Grid; Agent-Based Grid; Large-scale Grid Systems; Reliability; Dynamicity

1. Introduction

The rapid growth of scientific applications has led to the development of a new generation of distributed systems such as Grid Computing [1]. Grid systems coordinate resources that are not subject to centralized control: they are distributed over networks whose resources are owned, managed, and used by several organizations, and they are dynamic in nature, i.e., resources and users can change frequently [2]. Such a system provides uniform access to a large number of services and heterogeneous resources, such as workstations, networks, storage, and computing power, that belong to several organizations and administrative domains [3].

Moreover, during the last decade, a new generation of grid systems has emerged, known as the World Wide Grid (WWG) [4]. Similar to the well-known WWW, the WWG aims at establishing a scientific and computational worldwide grid that anybody around the globe can access and use according to their requirements.

The characteristics of grid systems, such as the dynamicity, heterogeneity, and distribution of grid resources, make resource discovery complex, and this complexity has become a challenge for extending grid services to large-scale systems [5]. The grid system relies on resource discovery to find appropriate resources for a given job [6]. Therefore, in this paper we concentrate on discovering grid resources for the jobs that need them.

Criteria such as scalability, reliability, and dynamicity play an important role in designing a resource discovery mechanism [7].

Scalability: Since grid systems are large-scale by nature, the computational performance of a grid resource discovery technique is tied to its scalability: the resource discovery system should be able to scale with an increasing number of users, events, and resources. The performance of a static resource discovery technique decreases very quickly as the grid environment grows, which causes the Resource Discovery (RD) technique to work poorly in such an environment.

Reliability: Reliability plays an important role whenever the failure rate is high in a large-scale grid. Failures occur due to false-positive errors caused by the Time-To-Live (TTL) limitation, due to server (head node, manager node) failure as a result of a high volume of queries, or due to a single point of failure, i.e., a part of a system that, if it fails, stops the entire system from working.

The third and final requirement for a good resource discovery mechanism is dynamicity. It affects the performance of the grid system through its influence on the reliability of the system. The nodes in a grid environment may be very dynamic, leaving and joining the grid system frequently. The dynamicity of some nodes, such as a central server, affects the domain of the queries and affects the reliability of the system by turning the central server into a single point of failure.

Thus, these three criteria play an important role in choosing good resource discovery techniques.

  1. LITERATURE SURVEY

    Resource discovery techniques (RDTs) are classified into the following four categories and reviewed on that basis. Figure 1 gives a graphical representation of these categories.

    • Centralized Schemes.

    • Hierarchical Methods.

    • Peer-To-Peer Approaches.

    • Agent-based Methods.

      Figure 1. A general RDT classification

      1. Centralized Techniques:

        Grid systems such as UNICORE and MDS2 are examples of centralized RDTs [10]. In these systems, the information about grid resources is stored in a centralized database. Such systems are easy to implement and cost-effective, but they are prone to a single point of failure (i.e., a part of a system that, if it fails, stops the entire system from working). This could be mitigated by deploying back-up servers, which is again cost-constrained. A central server also becomes a bottleneck when a large number of users query it to discover resources and resource status is updated frequently. These issues markedly decrease the performance of grid systems and thereby discourage researchers from adopting centralized resource discovery mechanisms.

        Figure 2. The Centralized architecture.
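
The centralized scheme described above can be illustrated with a minimal sketch. The class `CentralIndex`, its methods, and the node names are hypothetical and are not taken from UNICORE or MDS2; the sketch only assumes a single in-memory index that every registration and every query must pass through, and an `alive` flag that simulates a crash of that server.

```python
from dataclasses import dataclass, field


@dataclass
class CentralIndex:
    """Hypothetical single index server holding every resource description."""
    resources: dict = field(default_factory=dict)  # node name -> attributes
    alive: bool = True                             # False simulates a server crash

    def register(self, node, **attributes):
        # Every status update from every node lands on this one server.
        self._check_alive()
        self.resources[node] = attributes

    def query(self, **required):
        # Every lookup from every user also lands here (the bottleneck).
        self._check_alive()
        return sorted(
            node for node, attrs in self.resources.items()
            if all(attrs.get(k) == v for k, v in required.items())
        )

    def _check_alive(self):
        if not self.alive:
            raise ConnectionError("central index is down")


index = CentralIndex()
index.register("node-a", os="linux", cpus=8)
index.register("node-b", os="linux", cpus=16)
index.register("node-c", os="windows", cpus=8)

linux_nodes = index.query(os="linux")

index.alive = False  # the single point of failure
try:
    index.query(os="linux")
    discovery_survived = True
except ConnectionError:
    discovery_survived = False
```

In this sketch the resources themselves are still running after the crash, yet none of them can be discovered, which is the single-point-of-failure behaviour the section describes.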

      2. Hierarchical Techniques:

        The second category of resource discovery techniques is based on a hierarchical structure.

        Instead of using a centralized technique, Ganglia, described in [9], organizes the information services in a hierarchical structure that consists of three parts:

        • Information providers (IP).

        • Grid Resource-Information-Services (GRIS).

        • Grid Information- Index- Services (GIIS).

          This structure is more scalable than centralized systems, but it still suffers from the single point of failure described for the previous technique, because at each level there is a central database server responsible for resource update requests. However, the bottleneck problem is reduced.

          Figure 3. The Hierarchical architecture.
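
A small sketch may help show why a hierarchy reduces, but does not remove, the single point of failure. The `IndexNode` class below is a hypothetical stand-in for an index server at any level of the hierarchy; it is not the API of any real GRIS or GIIS implementation. It assumes that a query entering at the top is forwarded recursively to the child indexes and that a failed index answers nothing for its subtree.

```python
class IndexNode:
    """Hypothetical index server: knows its own resources and its child indexes."""

    def __init__(self, name):
        self.name = name
        self.resources = {}   # resource name -> attributes registered at this level
        self.children = []    # lower-level index servers
        self.alive = True

    def add_child(self, child):
        self.children.append(child)
        return child

    def query(self, **required):
        # A failed index hides only its own subtree, not the whole grid.
        if not self.alive:
            return []
        found = [
            name for name, attrs in self.resources.items()
            if all(attrs.get(k) == v for k, v in required.items())
        ]
        for child in self.children:
            found.extend(child.query(**required))
        return sorted(found)


root = IndexNode("top-index")
site_a = root.add_child(IndexNode("site-a-index"))
site_b = root.add_child(IndexNode("site-b-index"))
site_a.resources["a1"] = {"os": "linux"}
site_a.resources["a2"] = {"os": "windows"}
site_b.resources["b1"] = {"os": "linux"}

all_linux = root.query(os="linux")

site_a.alive = False                 # one mid-level index fails
after_site_failure = root.query(os="linux")

root.alive = False                   # the top-level index fails
after_root_failure = root.query(os="linux")
```

Losing `site_a` hides only the resources below it, while losing the root still makes everything unreachable for queries that enter at the top.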

      3. P2P Techniques:

        MAAN, CAN, Chord, and SWORD are examples of P2P techniques mentioned in [7]. These are decentralized network approaches that overcome the limitations of the techniques above. However, they raise other serious issues in several domains of grid systems, such as resource allocation, resource discovery, and security. P2P introduces a model of self-organization and decentralization of highly independent peers. This model allows the system to scale to a very large number of participating nodes.

        The P2P model differs from the traditional client-server model because every component in a P2P system operates as a server and as a client at the same time. P2P systems can be categorized, depending on the organization of the peers and on the connection protocol, into three types:

        • Unstructured P2P systems

        • Super-peer systems

        • Structured P2P systems.

          According to [10], unstructured P2P resource discovery approaches handle the dynamicity of resources. However, the common routing mechanisms used in these approaches prevent the grid from scaling; moreover, the use of TTL causes false-positive errors in most unstructured systems, even when the searched resources exist and are accessible on the grid. On the other hand, increasing the TTL increases network traffic, lengthens the runtime of the algorithms, and uses bandwidth inefficiently.
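
The TTL trade-off described above can be made concrete with a minimal sketch of TTL-limited flooding. The function `flood`, the five-node topology, and the `"gpu"` resource are illustrative assumptions, not part of any cited system. The sketch counts one message per link traversed and stops expanding once the TTL reaches zero.

```python
def flood(graph, start, wanted, ttl):
    """Flood a query hop by hop; each hop consumes one unit of TTL.

    Returns (node holding the resource or None, number of messages sent).
    """
    visited = {start}
    frontier = [start]
    messages = 0
    while frontier:
        for node in frontier:
            if wanted in graph[node]["resources"]:
                return node, messages
        if ttl == 0:
            return None, messages  # query expires before reaching the resource
        ttl -= 1
        next_frontier = []
        for node in frontier:
            for neighbour in graph[node]["links"]:
                messages += 1      # one query message per link traversed
                if neighbour not in visited:
                    visited.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return None, messages


# Hypothetical overlay: the only "gpu" resource sits three hops away from "a".
graph = {
    "a": {"links": ["b", "c"], "resources": set()},
    "b": {"links": ["a", "d"], "resources": set()},
    "c": {"links": ["a", "d"], "resources": set()},
    "d": {"links": ["b", "c", "e"], "resources": set()},
    "e": {"links": ["d"], "resources": {"gpu"}},
}

short_ttl = flood(graph, "a", "gpu", ttl=2)
long_ttl = flood(graph, "a", "gpu", ttl=3)
```

With `ttl=2` the query reports that nothing was found although the resource exists on node `e`; with `ttl=3` it is found, at the cost of more messages.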

          According to [10], super-peer based methods provide an RD mechanism in which some selected super-peer nodes operate as directory services. The flooding mechanism used for communication by most super-peers enhances the scalability of the grid system. However, the reliability of a super-peer is affected by the bottleneck problem when the number of requests to the super-peer is very large, and by the loss of the resources under a super-peer's control when that super-peer fails (single point of failure).

          Most structured P2P methods, such as those in [10], reduce the bottleneck problem and ensure the scalability of the system by involving all resource nodes in query processing, which ensures that all nodes in the grid carry an equal load. However, most methods distribute queries in the network along a defined path, so the failure of a node that forwards the queries brings the single point of failure (reliability) problem to the grid system.

          Figure 4. P2P architecture.
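
The defined-path behaviour of structured P2P systems can be sketched with a simplified identifier ring. The `Ring` class below is a hypothetical illustration: it assigns each key to the first node whose identifier is greater than or equal to the key (wrapping around), and it forwards queries only through successors. Real Chord additionally uses finger tables to shorten the path; that optimisation is deliberately left out here.

```python
import bisect


class Ring:
    """Hypothetical identifier ring with successor-only routing."""

    def __init__(self, node_ids):
        self.ids = sorted(node_ids)
        self.down = set()          # identifiers of crashed nodes

    def responsible(self, key_id):
        # First node whose id is >= key_id, wrapping around the ring.
        i = bisect.bisect_left(self.ids, key_id)
        return self.ids[i % len(self.ids)]

    def route(self, start, key_id):
        """Forward clockwise from `start`; return the path, or None if the query is lost."""
        target = self.responsible(key_id)
        path = []
        i = self.ids.index(start)
        while True:
            node = self.ids[i]
            if node in self.down:
                return None        # a crashed forwarder drops the query
            path.append(node)
            if node == target:
                return path
            i = (i + 1) % len(self.ids)


ring = Ring([8, 16, 24, 32])
owner_of_20 = ring.responsible(20)
owner_of_40 = ring.responsible(40)   # wraps around to the smallest id
healthy_path = ring.route(8, 20)

ring.down.add(16)                    # an intermediate node crashes
broken_path = ring.route(8, 20)
```

Node 24, which owns the key, is still alive in the second call, yet the query never reaches it because the forwarding node 16 has failed.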

      4. Agent Based Techniques:

      According to Foster et al. [11], an agent is defined as an embedded computer system that is situated in some environment and is capable of flexible, autonomous action in that environment in order to meet its design objectives. Among the common features of agents are:

      1. Specific problem solving entities with clear frontiers;

      2. Embedded in a particular environment and act on that environment to produce some desired results;

      3. Autonomous in the sense that they have control on both internal state and their own behavior.

        Agent-based techniques are widely proposed as a method of solving the resource discovery problem in grid computing, mainly because of their autonomy property, whereby agents use their migration policies to determine new migration sites.

        Agent-based grid RD approaches are classified, depending on the network topology, into:

        • Structured

        • Unstructured approach.

      According to [10], the unstructured approaches do not suffer from the single point of failure problem or the bottleneck problem. However, the flooding approach used by the agents for requesting a resource affects the scalability of the grid. When mobile agents are used, smarter routing methods are applied to increase the scalability of the system. Queries routed this way follow a single path, so they may be lost in the network if any node on the query's routing path fails. Because the mechanism does not use central managers, it is free from single point of failure problems. Furthermore, false-positive errors are eliminated by removing the TTL value from the queries.
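
The single-path behaviour of a mobile agent can be sketched as follows. The function `run_agent`, the node table, and the migration policy (move to the first neighbour not yet visited) are assumptions made for illustration; the surveyed systems use their own policies. The agent carries no TTL, evaluates the requirement on each node it visits, and is lost if the next node on its path has crashed.

```python
def run_agent(nodes, start, min_free_cpus):
    """Move a query-carrying agent along a single path, with no TTL.

    Returns (matching node or None, visited path, status string).
    """
    visited = []
    current = start
    while current is not None:
        node = nodes[current]
        if not node["alive"]:
            return None, visited, "lost"          # agent dies with the crashed node
        visited.append(current)
        if node["free_cpus"] >= min_free_cpus:    # evaluated on the node itself
            return current, visited, "found"
        # Hypothetical migration policy: first neighbour not yet visited.
        candidates = [n for n in node["links"] if n not in visited]
        current = candidates[0] if candidates else None
    return None, visited, "exhausted"


nodes = {
    "n1": {"links": ["n2"], "free_cpus": 1, "alive": True},
    "n2": {"links": ["n1", "n3"], "free_cpus": 2, "alive": True},
    "n3": {"links": ["n2"], "free_cpus": 8, "alive": True},
}

found = run_agent(nodes, "n1", min_free_cpus=4)
none_big_enough = run_agent(nodes, "n1", min_free_cpus=100)

nodes["n2"]["alive"] = False                      # a node on the path crashes
lost = run_agent(nodes, "n1", min_free_cpus=4)
```

Without a TTL, the agent either finds a match or exhausts its path, but a single crashed node on that path is enough to lose the query.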

      According to [10], structured networks eliminate the bottleneck problem because the nodes carry the same load. Moreover, the structured nature of the grid lets the system use efficient routing algorithms for queries, which enhances the scalability of the grid system. However, the crash of a node in the grid causes loss of queries in the network. Moreover, the crash of a central node that manages a set of nodes prevents queries from reaching those nodes even though they exist.

  2. COMPARISON OF RESOURCE DISCOVERY TECHNIQUES:

    The previous sections discussed current studies related to the diverse methodologies used in resource discovery. Those studies are classified into four types. For each type, the main approach was described and analyzed with respect to the introduced criteria (scalability, reliability, and dynamicity), which provides an evaluation of how the methods behave in highly dynamic, large-scale environments. This section presents a comparative study of all resource discovery approaches (peer-to-peer, hierarchical, centralized, and agent-based systems), which makes it possible to state the advantages and drawbacks of each approach.

    Firstly, we compare resource discovery using centralized and hierarchical systems; the main difference between the two methods lies in scalability and reliability.

    Disadvantages of centralized systems:

    • The centralized systems suffer from the bottleneck problems in large scale.

    • There exists the single point of failure problem.

    • Some studies propose replicating the centralized index server, but this procedure might be very expensive in terms of messaging complexity at large scale.

      On the other hand, in hierarchical systems:

      Advantages:

    • Load is distributed across many locations instead of one central server, which increases the scalability of the system by spreading the load over the index servers.

    • They also reduce the effect of a single point of failure: if an index server fails, only a part of the system becomes unreachable instead of the whole.

    • Support for dynamic attribute queries requires that the query is processed within the resource nodes.

      Disadvantages:

      Since the idea in the examined solutions is indexing the resource information in central locations, they do not support dynamic attribute queries.

      Some of the examined algorithms propose solutions for this problem, but the solutions are independent of the proposed classification.

      From these considerations, we can conclude that centralized methods are not suitable for large-scale environments, but they can be used effectively in systems where the scale is small and the indexing server is reliable. Hierarchical methods, on the other hand, are more suitable for larger-scale environments, since the load is distributed across many locations. Even though the load is hierarchically distributed, those methods may still suffer from the bottleneck problem at large scale.

      Secondly, we compare resource discovery using agent and P2P systems. Both kinds of system have encouraged several researchers to develop diverse types of resource discovery approaches that can improve the reliability and scalability of grid systems.

      Agent-based techniques are widely proposed in resource discovery systems because they have several advantages:

    • They provide autonomy property.

    • The agents use their migration policies to determine new migration sites.

    • The grid systems can benefit from the agent property to accomplish efficient route selection for queries.

      As the queries travel across the network, the resource discovery methods process them inside the resource nodes, which provides up-to-date resource information.

      Disadvantages of agent-based techniques:

      Many proposed methods suffer from false-positive problems caused by the non-deterministic nature of approaches that use inefficient flooding techniques.

      On the other hand, in recent years many researchers have adopted P2P technology in grid systems to enhance their reliability and scalability. Most current P2P approaches increase query performance by using structured Distributed Hash Table (DHT) systems [12]. The use of a DHT enables P2P systems to be scalable and reliable because the resource discovery process involves all nodes in the system. However, DHTs are not miraculous and have some disadvantages in the RD domain: their use limits the RD algorithm's support for dynamic-attribute queries. Since the dynamic attributes of resources change over time, keeping these attributes in a DHT is not feasible.

      To solve this problem, many algorithms use the topological structure of the overlay to efficiently distribute the query directly between the resource nodes. Inheriting all properties of overlay systems, P2P-based grid RD methods are suitable for large-scale dynamic environments in which reliability of queries is important.
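
The distinction between static and dynamic attributes can be illustrated with a small sketch. It is not the method of any cited algorithm: it simply assumes that static attributes (here the operating system) are published once in a table that stands in for the DHT, while a dynamic attribute (here `free_cpus`) is evaluated on the resource node at query time. All names are hypothetical.

```python
class Resource:
    def __init__(self, name, os, free_cpus):
        self.name = name
        self.os = os                # static: safe to publish in the index
        self.free_cpus = free_cpus  # dynamic: changes while jobs run


# Plain dict standing in for the distributed hash table (an assumption):
# static key -> resources published under that key.
dht = {}


def publish(resource):
    dht.setdefault(("os", resource.os), []).append(resource)


def discover(os, min_free_cpus):
    # Step 1: static lookup through the stand-in DHT.
    candidates = dht.get(("os", os), [])
    # Step 2: the dynamic attribute is read from each candidate at query time,
    # so it never has to be kept up to date inside the table.
    return [r.name for r in candidates if r.free_cpus >= min_free_cpus]


r1 = Resource("r1", "linux", 4)
r2 = Resource("r2", "linux", 16)
r3 = Resource("r3", "windows", 32)
for r in (r1, r2, r3):
    publish(r)

before = discover("linux", min_free_cpus=8)

# Load changes on the nodes; nothing is republished.
r1.free_cpus = 12
r2.free_cpus = 2
after = discover("linux", min_free_cpus=8)
```

The answer changes from `r2` to `r1` without any update to the table, because only the static key was ever stored in it.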

  3. COMPARISON SUMMARY OF RDTS:

    RDTs are designed around scalability, dynamicity, and reliability. In this section we summarize the different RDTs based on the above-mentioned criteria.

    1. Centralized Systems:

      • Not scalable, due to the bottleneck problem.

      • Tolerant to node dynamicity, but not tolerant to the dynamicity of the indexing mechanism.

      • Reliable in terms of query correctness, but not reliable in terms of single point of failure.

    2. Hierarchical Systems:

      • More scalable than centralized systems because of the hierarchical distribution of load.

      • Tolerant to node dynamicity and to the dynamicity of the indexing mechanism.

      • Reliable in terms of query correctness, and more reliable than centralized systems in terms of single point of failure.

    3. Agent-Based systems:

      This system is classified into 2 types based on network topology:

      • Unstructured Agent network

        • It is not scalable due to time and message complexities.

        • It is tolerant to node dynamicity since queries are distributed in parallel, but not tolerant in some approaches in which the query migrates on a single path.

        • It is not reliable because of either false-positive errors or a single point of failure.

      • Structured Agent network

        • It is scalable because of the hierarchical distribution of load.

        • It is not tolerant, since dynamicity of nodes in the structure may result in disconnection of a large portion of the resources.

        • It is not reliable because of the single point of failure.

    4. P2P-Based systems:

    This system is classified into 3 types:

    • Unstructured P2P

      • It is not scalable due to time and message complexities.

      • It is tolerant to node dynamicity since queries are resolved within the nodes.

      • It is not reliable because of false-positive errors.

    • Super-P2P

      • It is not scalable due to bottlenecks.

      • It performs poorly when the grid is dynamic.

      • It is not reliable because of false-positive errors and a single point of failure.

    • Structured P2P

      • It is scalable since complexities are low and load is distributed.

      • It performs poorly when the grid is dynamic.

      • It is reliable since neither a single point of failure nor false-positive errors exist.

  4. CONCLUSION:

In this paper, different types of grid resource discovery techniques were discussed and analyzed.

The advantages and drawbacks of the discussed resource discovery techniques were highlighted. From these results, we can conclude that hierarchical and centralized resource discovery schemes are suitable for small-size grids, while P2P schemes are suitable for dynamic and large-scale grid systems in which reliability of queries is not important.

REFERENCES

[1] I. Foster, et al., "Cloud Computing and Grid Computing 360-Degree Compared," GCE 2008: Grid Computing Environments Workshop, pp. 60-69,112, 2008.

[2] R. Ranjan, et al., "Peer-to-peer-based resource discovery in global grids: a tutorial," IEEE Communications Surveys & Tutorials, vol. 10, pp. 6-33, 2008.

[3] I. Foster, et al., "The anatomy of the grid: Enabling scalable virtual organizations," International Journal of High Performance Computing Applications, vol. 15, p. 200, 2001.

[4] P. Kacsuk, et al., "Can we connect existing production grids into a world wide grid?," High Performance Computing for Computational Science - VECPAR 2008, pp. 109-122, 2008.

[6] A. Sharma and S. Bawa, "Comparative Analysis of Resource Discovery Approaches in Grid Computing," Journal of Computers, vol. 3, p. 60, 2008.

[7] A. Hameurlain, et al., "Resource discovery in grid systems: a survey," International Journal of Metadata, Semantics and Ontologies, vol. 5, pp. 251-263, 2010.

[8] D. Cokuslu, A. Hameurlain, and K. Erciyes, "Grid Resource Discovery Based on Centralized and Hierarchical Architectures," International Journal for Infonomics (IJI), vol. 3, no. 1, March 2010.

[9] M. L. Massie, et al., "The ganglia distributed monitoring system: design, implementation, and experience," Parallel Computing, vol. 30, pp. 817-840, 2004.

[10] A. Hameurlain, D. Cokuslu, and K. Erciyes, "Resource discovery in grid systems: a survey," Int. J. Metadata, Semantics and Ontologies, vol. 5, no. 3, pp. 251-263, 2010.

[11] I. Foster, et al., "Brain meets brawn: Why grid and agents need each other," 2004.

[12] H. Huang, et al., "PChord: a distributed hash table for P2P network," Frontiers of Electrical and Electronic Engineering in China, vol. 5, pp. 49-58, 2010.

[13] M. B. Bashir, M. S. B. Abd Latiff, A. A. Ahmed, Y. Coulibaly, A. H. Abdullah, and A. Yousif, "A Hybrid Resource Discovery Model For Grid Computing," International Journal of Grid Computing & Applications (IJGCA), vol. 2, no. 3, September 2011.

[14] I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, 2004.

[15] Y. Yin, et al., "The grid resource discovery method based on hierarchical model," Information Technology Journal, vol. 6, pp. 1090-1094, 2007.

[16] A. Iamnitchi, et al., "A peer-to-peer approach to resource location in grid environments," International Series in Operations Research and Management Science, pp. 413-430, 2003.

[17] I. Filali, et al., "A simple cache based mechanism for peer to peer resource discovery in grid environments," 2008, pp. 602-608.

[18] C. Mastroianni, et al., "A super-peer model for resource discovery services in large-scale grids," Future Generation Computer Systems, vol. 21, pp. 1235-1248, 2005.

[19] D. Puppin, et al., "A grid information service based on peer-to-peer," Euro-Par 2005 Parallel Processing, pp. 454-464, 2005.

[20] M. Marzolla, et al., "Resource discovery in a dynamic grid environment," 2005, pp. 356-360.

[21] D. Talia, et al., "Peer-to-peer models for resource discovery in large-scale grids: a scalable architecture," High Performance Computing for Computational Science - VECPAR 2006, pp. 66-78, 2007.

[22] S. Ding, et al., "A heuristic algorithm for agent-based grid resource discovery," 2005, pp. 222-225.

[23] X. Tang and L. Huang, "Grid resource management based on mobile agent," 2006, pp. 230-238.

[24] K. Jun, et al., "Agent-based resource discovery," 2000, p. 43.

[25] S. Manvi, et al., "An agent-based resource allocation model for computational grids," Multiagent and Grid Systems, vol. 1, pp. 17-27, 2005.

[26] G. Kakarontzas and I. K. Savvas, "Agent-based resource discovery and selection for dynamic grids," 2006, pp. 195-200.
