Fault Tolerant Objects for Application Performance in Grid

DOI : 10.17577/IJERTV2IS121024


Sunil Gavaskar P. (1), Subbarao Ch. D. V. (2)

(1) Research Scholar, Department of Computer Science and Engineering, S.V. University, Tirupathi, India

(2) Professor, Department of Computer Science and Engineering, S.V. University, Tirupathi, India

Abstract

In a Grid computing infrastructure, data are usually distributed among different nodes and thus shared among several users. Real-time applications with replicated data in the grid have received particular attention because replication provides fault tolerance and efficient access, helps achieve satisfactory performance, and improves parameters of interest such as the CPU and memory consumption of a grid job. In this paper, using a workflow monitoring capability within a grid job, we address the problems of data dependency, data independence, and monitoring the internal state of objects in an agent-oriented workflow monitoring environment for grid jobs. The workflow allows reference links to handle exceptions that occur during fault tolerance. Using the behavioural and structural dependency and independence metadata held in the metadata catalogue helps to recover jobs when a fault occurs. This increases the availability and efficiency of data access.

  1. Introduction.

    The Grid [2] is a basic infrastructure for national high-performance computing and information services. It integrates and interconnects many types of distributed, heterogeneous high-performance computers, data servers and large-scale storage systems, and it supports important application research problems that lack an effective research approach. Distributed hash tables (DHTs) and other structured overlay networks were developed to give efficient key-based addressing of nodes in volatile environments [1], i.e. environments in which distributed nodes may join, leave or crash at any time. Two challenges arise in such environments: (a) When a node crashes, all data stored on that node is lost; this can be addressed by data replication. (b) When a node is suspected to have crashed, lookup inconsistencies may occur, leading to wrong query results or lost update requests. This second issue can only be relieved, not overcome, as was shown for asynchronous networks [3].

    The rest of the paper is organized as follows. Section 2 presents some strategies related to replication in grid environments, Section 3 describes the proposed model, Section 4 presents the results, and Section 5 concludes.

  2. Related works.

    A data grid [4] provides services that support the discovery of resources and enables computing over heterogeneous storage resources by means of a storage resource agent. Data consistency cannot be achieved if responsibility consistency is violated [3], i.e. when a node is suspected to have crashed, which leads to lost update requests or wrong query results. As shown in [6], the probability of inconsistent data accesses can be reduced by increasing the replication degree and performing reads on a majority of replicas. To ensure data availability when nodes join, leave or crash at any time, data consistency is enforced by performing all data operations on a majority of replicas [7]. Replication is a practical and effective method of achieving efficient and fault-tolerant data access in grids [8-10], and the use of replicas in fault tolerance strategies is a challenging research topic in data grids. By placing multiple replicas at different locations, replica management [5] can reduce the network delay and bandwidth consumption of remote data access. In a Grid environment, monitoring and performance measurement is a basic activity at the resource and process level [11,12]. Grid monitoring can be categorized as external or internal: external monitoring observes a grid job at specified time intervals, whereas internal monitoring builds self-monitoring capabilities into the job to track its internal state and parameters such as the CPU time and memory it uses. Some workflows used in data warehouse applications monitor the dependency and independency of execution using data structures called behaviour and structure. Issues with replication are widely studied in the Grid. Our proposed agent replica model is intended as an optimized fault tolerance model.
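    The majority-based access described in [6] and [7] can be illustrated with a minimal sketch. It is not the protocol from those papers: the `Replica` class, the version counter and the rule of writing to every live replica are assumptions made only to show why a read on a majority always sees the latest majority write.

```python
def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


class Replica:
    """Hypothetical replica holding one versioned value."""

    def __init__(self):
        self.version = 0
        self.value = None
        self.alive = True


def quorum_write(replicas, value):
    """Write to all live replicas; succeed only if they form a majority."""
    live = [r for r in replicas if r.alive]
    if len(live) < majority(len(replicas)):
        return False
    # Any live majority overlaps the previous write majority,
    # so the highest version seen here is the latest one.
    new_version = max(r.version for r in live) + 1
    for r in live:
        r.version, r.value = new_version, value
    return True


def quorum_read(replicas):
    """Read from a live majority and return the newest value."""
    live = [r for r in replicas if r.alive]
    if len(live) < majority(len(replicas)):
        raise RuntimeError("no majority of replicas reachable")
    return max(live, key=lambda r: r.version).value
```

    With five replicas, a write that reaches three of them is still visible to a later read that reaches any three, even if two of those three are stale.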

  3. Fault tolerance Infrastructure.

    The fault tolerance system proposed in this section is intended to provide a reliable model for replication in grid environments: if an agent fails, an adjacent agent performs its fault recovery operations with the help of catalogue mapping.

    Each record is monitored by the application as an object, to show the nature of its dependency or independency. The behavioural aspect controls runtime attributes (object states, for example when more than one user tries to access the same replica). The second aspect, structural dependency/independency, deals with the structure of the agents and the format of the data. Dependency refers to a technique that permits objects with different states to be monitored through the metadata catalogue; changes in object states are reflected in the metadata catalogue.

    [Figure 1. Replicas, initial stage (labels: Magent, Sagents, Objects)]

    [Figure 2. Replicas when an Sagent fails (labels: Magent, Sagent, Objects)]

    Sagents: These agents store local replicas of the currently registered distributed objects.

    Magents: These agents also act as replicas and contain local copies; each Magent controls a subset of the Sagents. In the proposed model, the number of Magents equals the number of classes, and each Sagent has the length of a class, i.e. the number of objects it supports.

    The agents are considered replicas. Each agent contains objects that identify a particular replica; the same replica accessed by different client requests is treated as an object with different states. Determining whether a particular object has failed requires log B steps, where B represents the number of objects currently held by an agent.
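    The paper does not say how the log B bound is obtained. One way to reach it, shown here purely as an assumption, is to keep an agent's object identifiers sorted and locate an object by binary search; `find_object` and the integer identifiers are hypothetical.

```python
def find_object(sorted_ids, object_id):
    """Binary search over an agent's sorted object ids.

    Returns (found, steps), where steps counts the comparisons made.
    """
    lo, hi, steps = 0, len(sorted_ids), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_ids[mid] < object_id:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_ids) and sorted_ids[lo] == object_id
    return found, steps


if __name__ == "__main__":
    ids = list(range(0, 2048, 2))  # B = 1024 objects held by one agent
    print(find_object(ids, 1000))
```

    For B = 1024 the loop runs at most 11 times, in line with the logarithmic cost stated above.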

    The client initiates update or search operations. (1) If the Magent discovers that a particular Sagent is not active in the system (i.e. the agent has failed), the Magent, which controls its Sagents, reallocates the failed Sagent's objects to an adjacent Sagent using the inheritance technique. (2) If a Magent fails, one of the Sagents connected to it is converted into a Magent.
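    The two recovery cases can be sketched as follows. The class names, the dictionary used as a metadata catalogue and the choice of the next active Sagent in list order as the "adjacent" one are all assumptions for illustration; the paper does not specify what its inheritance technique involves, so a plain hand-over of the object list stands in for it.

```python
class Agent:
    """Hypothetical agent holding local replicas (objects)."""

    def __init__(self, name, objects=()):
        self.name = name
        self.objects = list(objects)
        self.active = True


class Magent(Agent):
    """Hypothetical Magent controlling a subset of Sagents."""

    def __init__(self, name, sagents, objects=()):
        super().__init__(name, objects)
        self.sagents = list(sagents)
        # Assumed metadata catalogue: object id -> name of the holding agent.
        self.catalogue = {obj: self.name for obj in self.objects}
        for s in self.sagents:
            for obj in s.objects:
                self.catalogue[obj] = s.name

    def recover_sagent(self, failed):
        """Case (1): move a failed Sagent's objects to an adjacent active Sagent."""
        i = self.sagents.index(failed)
        n = len(self.sagents)
        for step in range(1, n):
            neighbour = self.sagents[(i + step) % n]
            if neighbour.active:
                neighbour.objects.extend(failed.objects)
                for obj in failed.objects:
                    self.catalogue[obj] = neighbour.name
                failed.objects = []
                return neighbour
        return None  # no active Sagent left to take over


def promote_sagent(failed_magent):
    """Case (2): convert one connected, active Sagent into the new Magent."""
    for s in failed_magent.sagents:
        if s.active:
            others = [x for x in failed_magent.sagents if x is not s]
            return Magent(s.name, others, s.objects)
    return None
```

    Keeping the catalogue updated inside `recover_sagent` mirrors the idea that recovery is driven by catalogue mapping: after a hand-over, a lookup of any moved object resolves to the Sagent that now holds it.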

  4. RESULTS

    The experimental setup uses parameter values collected with Globus Toolkit version 4.0.5 and Sigar version 1.6.0. Sigar supports various operating systems across different platforms and allows monitoring parameters such as CPU and memory to be gathered; it also handles file usage (replicas). The data warehouse application monitors each record as an object.
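    As a rough illustration of gathering user, system and total CPU time for a piece of work, the sketch below uses Python's standard `os.times()` call. It does not use Sigar itself, and the `cpu_times` helper and the `busy` workload are hypothetical stand-ins for a grid job.

```python
import os


def cpu_times(work):
    """Run work() and return the user, system and total CPU seconds it consumed."""
    before = os.times()
    work()
    after = os.times()
    user = after.user - before.user
    system = after.system - before.system
    return {"user": user, "system": system, "total": user + system}


def busy():
    """Hypothetical CPU-bound workload standing in for a grid job."""
    sum(i * i for i in range(200_000))


if __name__ == "__main__":
    print(cpu_times(busy))
```

    The three values correspond to the User, System and Total columns used in the measurement tables.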

    TABLE I
    DEPENDENT OBJECT/REPLICAS MEASUREMENT

    Time (s)    Dependent (s)
                User      System    Total
    20          6.04      13.28     19.32
    40          12.08     26.57     38.65
    60          18.12     39.86     57.98
    80          24.16     53.15     77.31
    100         30.20     66.44     96.64
    120         36.24     79.72     115.96
    140         42.28     93.01     135.29

    TABLE II
    INDEPENDENT OBJECT/REPLICAS MEASUREMENT

    Time (s)    Independent (s)
                User      System    Total
    20          5.88      13.40     19.28
    40          11.77     26.81     38.58
    60          17.66     40.22     57.88
    80          23.55     53.63     77.18
    100         29.44     67.04     96.48
    120         35.32     80.44     115.76
    140         41.21     93.85     135.06

    TABLE III
    PERFORMANCE OF DEPENDENT AND INDEPENDENT TECHNIQUES

    Time (s)    Independent (m)    Dependent (n)    n/m         (n/m)-1
    20          19.28              19.32            1.002074    0.002074
    40          38.58              38.65            1.001814    0.001814
    60          57.88              57.98            1.001727    0.001727
    80          77.18              77.31            1.001684    0.001684
    100         96.48              96.64            1.001658    0.001658
    120         115.76             115.96           1.001727    0.001727
    140         135.06             135.29           1.001702    0.001702

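    The comparative analysis in Table III shows a marginal performance difference of roughly 0.001 to 0.002, i.e. the dependent technique takes about 0.17% to 0.21% longer than the independent one. This is due to the inclusion of the metadata catalogue and object features in replication.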
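    The last two columns of Table III can be recomputed from the Total columns of Tables I and II. The short sketch below does this, taking the ratio of dependent to independent total time; the variable and function names are illustrative only.

```python
times = [20, 40, 60, 80, 100, 120, 140]
independent = [19.28, 38.58, 57.88, 77.18, 96.48, 115.76, 135.06]  # m, Table II totals
dependent = [19.32, 38.65, 57.98, 77.31, 96.64, 115.96, 135.29]    # n, Table I totals


def relative_overhead(m, n):
    """Fraction by which the dependent total n exceeds the independent total m."""
    return n / m - 1


overheads = [relative_overhead(m, n) for m, n in zip(independent, dependent)]

if __name__ == "__main__":
    for t, m, n, o in zip(times, independent, dependent, overheads):
        print(f"{t:>4} {m:>8.2f} {n:>8.2f} {n / m:.6f} {o:.6f}")
```

    Every value of (n/m)-1 falls between about 0.0016 and 0.0021, which is the marginal difference discussed above.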

    Figure 3. Parallel job workflow monitoring

    On the other hand, the data warehouse workflow treats records as replicas. In Figure 3, Sequential_File_25 and Sequential_File_28 are used as the source and reference links. Sequential_File_27 and Sequential_File_26 are the reject and target links, which maintain the rejected and accepted replica records as objects. The Lookup stage monitors the data moved between the target and reject links under specific conditions. The following log report shows the number of records rejected and accepted when certain dependency and independency conditions are applied at the Lookup stage.

    DataStage Report – Summary Log for job: sun_lookup
    Produced on: 12/17/2013 12:07:50 AM
    Project: DS_project Host system: home Items: 87 – 132
    Sorted on: Type

    Occurred: 11:43:04 AM On date: 11/11/2013 Type: Info
    Event: Parallel job reports successful completion

    Occurred: 11:43:04 AM On date: 11/11/2013 Type: Info
    Event: main_program: Startup Time 0:01 Production Run Time 0:00

    Occurred: 11:43:04 AM On date: 11/11/2013 Type: Info
    Event: main_program: Step execution finished with status = OK.

    Occurred: 11:43:04 AM On date: 11/11/2013 Type: Info
    Event: Sequential_File_28,0: Export complete. 1 records exported successfully, 0 rejected.
    Event: Sequential_File_26,0: Export complete. 3 records exported successfully, 0 rejected.

    Occurred: 11:43:04 AM On date: 11/11/2013 Type: Info
    Event: Sequential_File_25,0: Import complete. 4 records imported successfully, 0 rejected.

    Occurred: 11:43:04 AM On date: 11/11/2013 Type: Info
    Event: Sequential_File_27,0: Import complete. 4 records imported successfully, 0 rejected.

    Occurred: 11:43:03 AM On date: 11/11/2013 Type: Info
    Event: main_program: APT configuration file: C:/Ascential/DataStage/Configurations/default.apt (…)

    Occurred: 11:43:03 AM On date: 11/11/2013 Type: Info
    Event: main_program: orchgeneral: loaded (…)

    Occurred: 11:43:03 AM On date: 11/11/2013 Type: Info
    Event: main_program: Ascential DataStage(tm) Enterprise Edition 7.5 (…)

    Occurred: 11:43:02 AM On date: 11/11/2013 Type: Info
    Event: Parallel job initiated

    Occurred: 11:43:01 AM On date: 11/11/2013 Type: Info
    Event: Environment variable settings: (…)

    Event: Starting Job sun_lookup.
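    The accepted and rejected record counts can be pulled out of such a log report automatically. The sketch below is an illustration, not a DataStage tool: the regular expression is written against the four Sequential_File event lines shown in the report above, and the `summarize` helper is hypothetical.

```python
import re

# The four import/export events copied from the log report above.
LOG = """\
Event: Sequential_File_28,0: Export complete. 1 records exported successfully, 0 rejected.
Event: Sequential_File_26,0: Export complete. 3 records exported successfully, 0 rejected.
Event: Sequential_File_25,0: Import complete. 4 records imported successfully, 0 rejected.
Event: Sequential_File_27,0: Import complete. 4 records imported successfully, 0 rejected.
"""

PATTERN = re.compile(
    r"Event: (?P<stage>\w+),\d+: (?P<kind>Import|Export) complete\. "
    r"(?P<ok>\d+) records (?:imported|exported) successfully, "
    r"(?P<rejected>\d+) rejected\."
)


def summarize(log_text):
    """Map each stage name to (kind, records accepted, records rejected)."""
    summary = {}
    for m in PATTERN.finditer(log_text):
        summary[m["stage"]] = (m["kind"], int(m["ok"]), int(m["rejected"]))
    return summary


if __name__ == "__main__":
    for stage, (kind, ok, rejected) in summarize(LOG).items():
        print(f"{stage}: {kind.lower()} ok={ok} rejected={rejected}")
```

    For this run the two export stages together account for four records, the same number imported by each of the two import stages.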

  5. CONCLUSION

This paper presents the results for dependent and independent behaviour obtained with the proposed agent-based fault tolerance model, which supports large-scale distributed monitoring systems. Our goal is to equip the system with metadata-level (repository) behaviour that can be used to restart a job from its failure state, using the dependent and independent behaviours of the agent-oriented architecture presented above. Through statistics, the work presented here demonstrates an early reliability analysis model within a grid job, and we have enhanced our model through this workflow analysis. We have shown an optimistic fault-tolerant replication model; the statistics support the data redundancy technique used between storage elements (inter-SE) and within a storage element (intra-SE).

ACKNOWLEDGMENT

We deeply thank Dr. Ch D.V.Subbarao for his valuable help.


REFERENCES

  1. L. Alima, S. El-Ansary, P. Brand and S. Haridi, DKS (N, k, f): A family of low-communication, scalable and fault-tolerant infrastructures for P2P applications, Workshop on Global and P2P Computing, CCGRID 2003, May 2003.

  2. I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure, San Francisco, USA: Morgan Kaufmann Publishers, 1999.

  3. A. Ghodsi, Distributed k-ary system: Algorithms for distributed hash tables, PhD Thesis, Royal Institute of Technology, 2006.

  4. Y. Wang, N. Xiao, R. Hao, et al., Research on key technology in data grid, Journal of Computer Research and Development, Vol. 39, No. 8, pp. 943-947, 2002.

  5. H. Lamehamedi, B. Szymanski, Z. Shentu and E. Deelman, Data replication strategies in grid environments, in Proceedings of the Fifth International Conference on Algorithms and Architectures for Parallel Processing, pp. 378-383, 2002.

  6. T. M. Shafaat, M. Moser, T. Schutt, A. Reinefeld, A. Ghodsi and S. Haridi, Key-based consistency and availability in structured overlay networks, Infoscale, June 2008.

  7. A. Ghodsi, L. Alima and S. Haridi, Symmetric replication for structured peer-to-peer systems, DBISP2P, Aug. 2005.

  8. J. M. Perez, F. Garcia-Carballeira, J. Carretero, A. Calderon and J. Fernandez, Branch replication scheme: A new model for data replication in large scale data grids, Future Generation Computer Systems, Vol. 26, No. 1, pp. 12-20, Jan. 2010.

  9. L. Gao, M. Dahlin, A. Nayate, J. Zheng and A. Iyengar, Improving availability and performance with application-specific data replication, IEEE Trans. Knowledge and Data Engineering, Vol. 17, No. 1, pp. 106-200, 2005.

  10. M. Tang, B. S. Lee, X. Tang and C. K. Yeo, The impact of data replication on job scheduling performance in the Data Grid, Future Generation Computer Systems, Elsevier, Vol. 22, pp. 254-268, 2006.

  11. I. Foster, C. Kesselman and S. Tuecke, The anatomy of the Grid: Enabling scalable virtual organizations, International Journal of High Performance Computing Applications, Vol. 15, pp. 200-222, 2001.

  12. S. Zanikolas and R. Sakellariou, A taxonomy of grid monitoring systems, Future Generation Computer Systems, Elsevier Science Publishers, Vol. 21, No. 1, pp. 163-188, 2005.

BIOGRAPHY

  1. P. Sunil Gavaskar is a PhD candidate at the Department of Computer Science and Engineering, Sri Venkateswara University, Tirupathi. His research interests include distributed systems and Grid computing.

  2. Dr. Ch. D. V. Subbarao is a Professor and Head of the Department of Computer Science and Engineering, Sri Venkateswara University, Tirupathi. His research interests include distributed systems and Grid computing.
