- Authors : Darshan Kapadia, Nakul Nagpal
- Paper ID : IJERTCONV2IS09010
- Volume & Issue : ICIAC – 2014 (Volume 2 – Issue 09)
- Published (First Online): 30-07-2018
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Implementation of Cooperative Caching Algorithms Using Remote Client Memory
Darshan Kapadia1, Nakul Nagpal2
1Department of Computer Science, B. Thomas Golisano College of Computing and Information Sciences, Rochester Institute of Technology
1darshanrkapadia@gmail.com, 2nkn5603@rit.edu
Abstract — As technology advances, processor speed is increasing much faster than disk access speed, so it becomes necessary to decrease the number of disk accesses made by the distributed file system to improve overall system performance. Because network speed has increased tremendously in recent times, accessing data from a remote client machine's memory is faster than accessing data from disk. The file caches of a number of client machines connected over a high-speed network can therefore be combined to form a global cache; this is called cooperative caching. In cooperative caching, when requested data is not present in a client's local cache, the request is satisfied by another client machine's cache if possible. In a typical client-server distributed file system there are three levels in the memory hierarchy: server disk, server memory and client memory; the cooperative cache can be seen as a fourth level of cache. The project will implement the following five read-only cooperative caching algorithms using remote client memory: Direct Client Cooperative Caching, Greedy Forwarding, Centralized Coordinated Cache, N-Chance Forwarding, and N-Chance Forwarding with Centrally Coordinated Cache. The project will then compare the performance of the algorithms on different evaluation metrics such as average read performance, number of disk accesses, and global and local hit ratios. These algorithms, though simple to implement and requiring no major architectural changes, can provide very good read performance by decreasing the number of disk accesses and hence the total access time.
Keywords — cooperative caching, remote client memory, processor speed, disk access.
I. INTRODUCTION

A. Cooperative Caching
In Distributed File Systems the amount of data stored in the system is very large, and the file system should fulfill the data requests made by applications as quickly as possible. The two main problems that affect performance are disk access and the network latency of transferring data from the server. Accessing data from a remote storage disk is an expensive operation (in terms of time and network usage) and should be avoided as much as possible. There are many algorithms, such as First In First Out, Random Replacement and Least Recently Used, for cache replacement on individual clients. Because these algorithms manage the local client cache greedily, they do not take advantage of the data present in other clients' caches [3]. Cooperative caching provides a solution to this problem: all the clients' cache memory is managed and used as a global resource. If the requested data is not present in the local client cache, it can be looked up in and transferred from another client's cache. This technique requires a data transfer from remote client cache memory but avoids a disk access, and it is faster than a disk access in terms of the time taken to look up and transfer the data. Local client caches can still be maintained using the Least Recently Used algorithm; a minimal sketch of such a cache appears below. There are many cooperative caching algorithms by which the global cache resource can be maintained. Cooperative caching adds an additional level to the memory hierarchy at which to look up data, and cooperative caching algorithms can provide high local and global hit rates because they leave the local client cache intact while still using other clients' caches as a global resource.
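Since every algorithm discussed below keeps each client's local cache in LRU order, a minimal sketch of such a cache is shown here. It relies on Java's access-ordered LinkedHashMap; the class name and generic parameters are illustrative, not taken from the project's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU block cache sketch; key and value types are left generic
// because the paper does not show the project's actual block classes.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;   // maximum number of cached blocks

    public LruCache(int capacity) {
        // accessOrder = true keeps entries ordered by most recent access,
        // which is exactly the LRU discipline described above.
        super(capacity, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the cache is full.
        return size() > capacity;
    }
}
```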
II. PROBLEM STATEMENT

A. Existing System
In existing Distributed File Systems there are three levels of memory hierarchy: client cache, server memory and disk. Attempts to reduce the number of disk accesses, and so improve overall performance, are made by implementing various replacement algorithms on the client cache memory, such as First In First Out, Random Replacement, Frequency Based Replacement and Least Recently Used. These algorithms do provide high local hit rates and reduce disk accesses by some amount. Still, clients are unaware of other clients' caches, so even if data is present in another client's cache they cannot request it, resulting in disk accesses. Also, due to the lack of coordination between clients, there may be many duplicate blocks in the system. The existing work consists of an implementation of four cooperative caching algorithms using remote client memory on the xFS file system [1]; xFS is a serverless file system with a number of clients and a manager, and that work compares the four algorithms in a serverless distributed file system environment. In contrast, this project implements the four cooperative caching algorithms using remote client memory to improve performance in a Master-Slave Architecture Distributed File System and then proposes a new cooperative caching algorithm.
B. Proposed Solution
With cooperative caching, the overall global cache hit ratio increases, since remote client memory is added to the system's memory hierarchy, and the overall performance of the system increases because there are fewer disk access operations. Cooperative caching is also cost effective, as it is an alternative to having a large server memory [1].
The project will implement the following five cooperative caching algorithms on a Master-Slave Architecture Distributed File System and compare them based on different evaluation criteria:
1. Direct Client Cooperative Caching Algorithm [1].
2. Greedy Forwarding Algorithm [1].
3. Centralized Coordinated Caching Algorithm [1].
4. N-Chance Forwarding Algorithm [1].
5. N-Chance Forwarding with Centrally Coordinated Cache Algorithm.

III. SYSTEM ARCHITECTURE

Fig 1: Master-Slave Architecture of Distributed File System.
The project will implement a modification of the xFS architecture [6][7]. It will contain a number of clients and a server connected over a high-speed Local Area Network (LAN). The server will have access to the storage disk, which contains all the files present in the system. The general steps taken by the system when data is requested by the application are as follows:
1. The application requests a block of data from the client.
2. The client looks for the desired block in its local cache memory. If found, it returns the data to the application.
3. If not found, the client forwards the request to the server.
4. The server looks for the desired block in the server memory. If found, it returns the data to the client.
5. If not found, the server looks for the data in the global cache. If it finds the data in the global cache, it forwards the request to the client holding the data, and the data transfer takes place directly between the clients.
6. If not found, the server performs a storage disk access and returns the data to the client. This is an expensive operation.
7. Once the client gets the block, it places it in its LRU-based local cache.
How the data is managed and looked up in the global cache, what portion of memory is dedicated to the local and global caches, and how data requests are forwarded differ between the algorithms. All five algorithms try to avoid disk accesses, since disk access degrades the performance of the entire file system.
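The lookup order in the steps above can be summarized in a short server-side sketch. All names here (ClientRef, the helper methods, the use of String block ids) are illustrative assumptions; the sketch only fixes the order of the lookup levels common to all five algorithms.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the server-side read path; the real project API
// is not shown in the paper.
class ReadPathSketch {
    interface ClientRef { byte[] fetchDirectly(String blockId); }

    Map<String, byte[]> serverCache = new HashMap<>();        // level 2: server memory

    ClientRef lookupGlobalCache(String blockId) { return null; } // level 3 lookup (stub)
    byte[] readFromDisk(String blockId) { return new byte[0]; }  // level 4 (stub)

    byte[] handleRead(String blockId) {
        byte[] b = serverCache.get(blockId);          // server-memory hit?
        if (b != null) return b;
        ClientRef holder = lookupGlobalCache(blockId); // cooperative (global) cache
        if (holder != null) return holder.fetchDirectly(blockId); // client-to-client copy
        b = readFromDisk(blockId);                    // expensive disk access
        serverCache.put(blockId, b);
        return b;
    }
}
```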
IV. COOPERATIVE CACHING ALGORITHMS

A. Direct Client Cooperation Algorithm
Direct Client Cooperation is the simplest cooperative caching algorithm, as it can be implemented with minor modifications to an existing distributed file system architecture. It allows a client to use other clients' caches as backup storage: when an active client's cache is full, the client looks for an idle client and starts forwarding evicted block entries into the idle client's cache, which it then uses in the same way as its own. The problem with this algorithm is that clients remain unaware of the contents of other clients' caches. Because of this, on a cache miss the client must send the request to the server to get the desired block, even if a copy of that block is present in another client's cache. This may also result in a disk access on the server side if the server does not have the desired block in its memory, which is a really expensive operation. As we will see, the performance improvement of this algorithm is much smaller than that of the other cooperative caching algorithms. A design issue must also be considered for the case where the idle client becomes active: the idle client may decide to flush the other client's entries or forward them to yet another idle client's cache. For simplicity of implementation and interpretation of results, the project does not model this interference; instead it simply doubles the amount of cache space. A minimal sketch of the eviction path follows.
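The sketch below shows the eviction path just described. findIdleClient() and storeRemote() are hypothetical helpers; the paper does not specify how an idle peer is located, so treat these as placeholders.

```java
// Sketch of Direct Client Cooperation eviction: an evicted block is moved
// into an idle peer's cache instead of being dropped.
class DirectCooperationEviction {
    interface ClientRef { void storeRemote(String blockId, byte[] data); }

    ClientRef findIdleClient() { return null; }   // stub: ask the server for an idle peer

    void onEvict(String blockId, byte[] data) {
        ClientRef idle = findIdleClient();
        if (idle != null) {
            idle.storeRemote(blockId, data);      // use the idle client's cache as backup
        }
        // otherwise the block is discarded, exactly as in plain LRU
    }
}
```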
B. Greedy Forwarding Algorithm
In the Greedy Forwarding algorithm all the clients' caches are used as an aggregate global shared resource, so a client's read request may be fulfilled by any client cache in the system. Still, this algorithm makes no attempt to coordinate the contents of the client caches: each client manages its cache greedily, without any knowledge of other clients' caches. When a read request is made, the client first looks in its own local cache and, if the block is not present, forwards the request to the server. The server looks for the desired block in its cache; if present, it supplies the data back to the client. If the desired block is not present in the server's cache, the server looks for it in the other clients' caches. If another client has a copy of the desired block, the server forwards the read request to that client, and the data transfer then takes place directly between the client holding the data and the requesting client. Since the data is not transferred back through the server, which only needs to forward the request, the load on the server is reduced. Once the client gets the block it puts it in its local cache, evicting the least recently used block if the cache is full. For this algorithm a data structure has to be maintained, such as a hash table mapping the clients and the contents of their caches, which the server can consult to find a client and forward the request; a sketch follows. Greedy Forwarding should increase the global hit ratio, since a request can now be fulfilled not only by a single client's cache or the server but by any client cache present in the system.
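A minimal sketch of that server-side directory is given below. The paper only requires "a hash table which maps the clients and the contents in their caches"; the class and method names, and the choice of returning an arbitrary holder, are assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Server-side directory for Greedy Forwarding: block id -> set of clients
// currently caching that block.
class CacheDirectory {
    private final Map<String, Set<String>> holders = new HashMap<>();

    // Clients report cache insertions and evictions so the table stays current.
    void onClientCached(String blockId, String clientId) {
        holders.computeIfAbsent(blockId, k -> new HashSet<>()).add(clientId);
    }

    void onClientEvicted(String blockId, String clientId) {
        Set<String> s = holders.get(blockId);
        if (s != null) {
            s.remove(clientId);
            if (s.isEmpty()) holders.remove(blockId);
        }
    }

    // Returns some client holding the block, or null if a disk access is needed.
    String findHolder(String blockId) {
        Set<String> s = holders.get(blockId);
        return (s == null || s.isEmpty()) ? null : s.iterator().next();
    }
}
```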
C. Centralized Coordinated Cache Algorithm
The Centralized Coordinated Cache algorithm adds coordination to Greedy Forwarding. Every client's cache is divided into two parts: one part (the local part) is managed greedily by the client, whereas the other part is coordinated globally by the server, which uses it as an extension of its central cache. Once a block is requested, the client first searches for it in the local part of its cache and, if it is not found, forwards the request to the server. The server looks for the desired block in the globally coordinated portions of the clients' caches; if found, it forwards the request to the client holding the data, and if not, it performs a disk access. Once the client gets the block it puts it in its local cache. The server manages the major portion of each client's cache, and the remainder is managed greedily by the client; the project will vary this ratio to find the optimum partition. The server uses a global replacement algorithm to manage the global part of the clients' caches: when the server removes a block from its local cache, it replaces the least recently used block in the globally shared cache with that block and renews the block's entry in the LRU list (a sketch of this rule follows). The main advantage of this algorithm is a high global hit rate, since there is a very large globally shared memory. The disadvantage is that, because the local cache size is reduced, the local hit ratio suffers; and since the server must forward the majority of requests, the load on the server also increases.
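A sketch of that global replacement rule, assuming an access-ordered map over the globally managed client-cache slots; ClientSlot and its callback are illustrative, and the project's real code is not shown in the paper.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Centrally Coordinated replacement sketch: a block evicted from the server
// cache displaces the globally least recently used block in some client's
// globally managed cache slot, and its LRU entry is renewed.
class GlobalCacheCoordinator {
    interface ClientSlot { void replaceBlock(String blockId, byte[] data); }

    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, ClientSlot> globalLru =
            new LinkedHashMap<>(16, 0.75f, true);

    void onServerEviction(String blockId, byte[] data) {
        if (globalLru.isEmpty()) return;                 // no globally managed slots yet
        Map.Entry<String, ClientSlot> eldest = globalLru.entrySet().iterator().next();
        ClientSlot slot = eldest.getValue();
        globalLru.remove(eldest.getKey());               // drop the global LRU victim
        slot.replaceBlock(blockId, data);                // push the new block to that client
        globalLru.put(blockId, slot);                    // renew the block's LRU entry
    }
}
```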
D. N-Chance Forwarding Algorithm
N-Chance Forwarding extends the Greedy Forwarding algorithm and adjusts the portion of each client's cache that is managed cooperatively based on the client's activity. It modifies Greedy Forwarding so that it preserves singlets: blocks present in exactly one client's cache in the entire distributed file system. In the previous three algorithms, if a block held in exactly one client's cache is removed from the system, the next request for that block results in a very expensive disk access; N-Chance Forwarding addresses this problem. When a client is about to discard a block from its cache, it checks whether that copy is the last copy of the block in the entire system. If it is, the client sets the block's recirculation count to n−1 and forwards the block to a random peer in the network. The receiving client puts the block in its LRU cache as if the block had just been hit. When that client is in turn about to evict the block, it decrements the recirculation count and follows the same procedure. When the recirculation count reaches 0, the block is simply discarded from the system; if the block is read at any time during this process, the count is reset to n. Under this algorithm, active clients tend to discard blocks from their caches faster than idle clients, while idle clients accumulate the singlets, which is good as it raises the global hit ratio. Since the algorithm does not reduce the local cache, it provides a good local hit ratio, and by protecting singlets it also provides a good overall global hit ratio.
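The eviction rule reads as follows in sketch form. isSinglet() and forwardToRandomPeer() are assumed hooks: the first would consult the server's directory for the last-copy check, and the second sends the block to a random peer, which inserts it as if it had just been hit.

```java
// Sketch of the N-Chance eviction rule described above.
class NChanceCache {
    static final int N = 2;          // recirculation count used in the tests below

    static class Block {
        final String id;
        final byte[] data;
        int recirc = N;              // reset to N whenever the block is read
        Block(String id, byte[] data) { this.id = id; this.data = data; }
    }

    void onEvict(Block b) {
        if (isSinglet(b.id)) {
            b.recirc--;              // the first eviction leaves n-1, as in the text
            if (b.recirc > 0) {
                forwardToRandomPeer(b);   // recirculate instead of discarding
                return;
            }
        }
        // Non-singlets, and singlets whose count reached 0, are simply dropped.
    }

    boolean isSinglet(String id) { return false; }  // stub: server directory lookup
    void forwardToRandomPeer(Block b) { }           // stub: RMI call to a random peer
}
```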
E. N-Chance Forwarding with Centrally Coordinated Cache Algorithm
The project proposes a new algorithm that combines the Centrally Coordinated Cache and N-Chance Forwarding algorithms. In the Centrally Coordinated Cache algorithm the client cache is divided into two parts, one managed greedily by the client and one managed globally by the server, with the global part occupying the major portion. This decreases the local hit ratio significantly (since the local cache has shrunk) but increases the global hit ratio. In N-Chance Forwarding, singlets are recirculated through the system before being discarded, increasing the chance of a global hit. Instead of the 80/20 partitioning of the client cache used in the Centralized Coordinated Cache algorithm, we partition the client cache in a 50/50 ratio and apply N-Chance Forwarding in the locally managed portion. Since only 50% of the client cache is used for the globally shared cache and 50% for the local cache, we do not expect the global hit ratio to suffer significantly; and because of N-Chance Forwarding in the locally managed 50%, the local hit ratio should also be good. Thus this algorithm tries to combine the best of both algorithms, as the configuration sketch below illustrates.
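A minimal configuration sketch of the hybrid split, reusing the LruCache class from the introduction. The field names are assumptions; the global half is handled by the server exactly as in the Centrally Coordinated rule sketched earlier, and evictions from the local half go through the N-Chance rule.

```java
// Sketch of the proposed hybrid split: half the client cache is an
// N-Chance-managed local cache, half is donated to the server's global
// coordinator.
class HybridClientCache {
    final LruCache<String, byte[]> localHalf;  // evictions here use the N-Chance rule
    final int globalSlots;                     // slots managed by the server's global cache

    HybridClientCache(int cacheSize) {
        this.localHalf = new LruCache<>(cacheSize / 2);  // 50% local
        this.globalSlots = cacheSize - cacheSize / 2;    // 50% global
    }
}
```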
V. SIMULATION METHODOLOGY

A. Test Environment
The testing environment is simulated using Java RMI and Prof. Alan Kaminsky's Computer Science Course Library [8]. The simulation environment consists of a Master-Slave model of the distributed file system. The server and clients are remote objects that communicate via remote method invocation; a minimal sketch of such an interface follows the parameter list below. All caches in the system, the server cache and the clients' caches, use the Least Recently Used algorithm for replacement of cache blocks. The server has access to the disk, which contains all the files. The simulation also includes an Application, again a remote object, which performs read operations on clients in the system and records the average time to read a block and the global and local hit ratios. The management of cache blocks is controlled differently by each of the five algorithms. The various simulation parameters used to implement and test the system are as follows:
- Client Cache Size: the size of the cache available at each client (the same for all clients).
- Server Cache Size: the size of the cache memory available at the server.
- Number of Read Requests: the total number of read requests made by the Application.
- Client Local Memory Access Time: the time taken by a client to perform a read from its local cache.
- Remote Client Memory Access Time: the time taken by a read request when a block is read from a remote client; it is the sum of the time to forward the request and the time to transfer the data from one client to the other.
- Server Cache Memory Access Time: the time taken to transfer a block from the server cache memory to the client.
- Disk Access Time: the time taken to perform a read from disk and transfer the block to the client.
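As noted above, a minimal sketch of the kind of remote interface the simulation's remote objects might expose. The interface and method names are assumptions; the paper only states that the server, clients and Application are RMI remote objects.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical RMI interfaces for the Master-Slave simulation; the real
// project interfaces are not shown in the paper.
interface ServerRemote extends Remote {
    // Called by a client on a local-cache miss; may answer from the server
    // cache, forward to a peer via the global cache, or read from disk.
    byte[] readBlock(String fileName, int blockNumber, String clientId)
            throws RemoteException;
}

interface ClientRemote extends Remote {
    // Called by a peer (at the server's direction) to fetch a cached block.
    byte[] fetchBlock(String fileName, int blockNumber) throws RemoteException;
}
```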
B. Evaluation Criteria

To compare the performance of the five cooperative caching algorithms described above, the following evaluation metrics are used (a small sketch of how they are computed follows the list):
- Average Block Read Time: the average time taken to perform a block read operation under the algorithm.
- Number of Disk Accesses: the total number of disk accesses required to serve the Application's read requests.
- Global Cache Hits: the total number of blocks read from the global cache.
- Local Cache Hits: the total number of blocks read by clients from their local caches.
- Number of Clients: the total number of clients in the system.
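These counters combine into the reported metrics as in the sketch below; the field and method names are illustrative, not the project's.

```java
// Sketch of how the simulation's counters produce the evaluation metrics.
class Metrics {
    long diskReads, serverCacheHits, globalCacheHits, localCacheHits;
    double totalTimeMsec;   // sum of the simulated per-read access times

    long totalReads() {
        return diskReads + serverCacheHits + globalCacheHits + localCacheHits;
    }
    double averageBlockReadTime() { return totalTimeMsec / totalReads(); }
    double localHitRatio()  { return (double) localCacheHits  / totalReads(); }
    double globalHitRatio() { return (double) globalCacheHits / totalReads(); }
}
```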
C. Test Cases
All five algorithms are tested under the following test cases. For each test case, the simulation parameters such as client cache size, server cache size, number of blocks read and the per-level access times are kept constant across algorithms. The same random seed is also used across algorithms, so that the same files are read in all simulations.
1) Number of blocks on disk is less than the total available cache beyond a certain number of clients.
The simulation parameters for this test case are as follows:
Parameter | Value
Number of Files | 1500
Blocks per File | 5
Total Blocks on Disk | 7500
Total File Reads | 4000
Client Cache Size | 750 blocks
Server Cache Size | 2000 blocks
Local Memory Access Time | 0.25 msec
Server Memory Access Time | 1.05 msec
Remote Client Memory Access Time | 1.25 msec
Disk Access Time | 15.85 msec

Table 1: Simulation parameters for test case 1.
With the simulation parameters in Table 1, once there are about 8 clients in the system the combined client and server caches (8 × 750 + 2000 = 8000 blocks) roughly equal the 7500 blocks in the system, so it becomes possible for every block on the disk to be cached in a client or server cache.
The performance of the five algorithms under this test case is as follows:
a) Direct Client Cooperation Algorithm
Number of Clients | Disk Access (blocks) | Local Cache Hit (blocks) | Average Access Time (msec)
4 | 13820 | 4140 | 11.1112
8 | 13280 | 4285 | 10.7058
12 | 13120 | 4090 | 10.5952
16 | 13002 | 4178 | 10.50436
20 | 13245 | 3865 | 10.6967

Table 2: Readings for Direct Client Cooperation Algorithm.
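As a consistency check (not stated in the paper, but reproducible from Tables 1 and 2), the average access time is the hit-weighted mean of the per-level access times. In the 4-client row, the 20000 − 13820 − 4140 = 2040 reads not served locally or from disk are server-cache hits, and:

```latex
t_{\mathrm{avg}} = \frac{4140 \times 0.25 \;+\; 2040 \times 1.05 \;+\; 13820 \times 15.85}{20000}
                 = \frac{222224}{20000} = 11.1112\ \mathrm{msec}
```

which matches the tabulated value exactly.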
b) Greedy Forwarding Algorithm
Number of Clients | Disk Access (blocks) | Local Cache Access (blocks) | Remote Client Cache Access (blocks) | Average Access Time (msec)
4 | 9995 | 2020 | 4030 | 8.4058
8 | 6694 | 1918 | 7288 | 5.99972
12 | 4470 | 2016 | 9014 | 4.3673
16 | 3350 | 1877 | 10097 | 3.55489
20 | 2328 | 1997 | 10831 | 2.80115

Table 3: Readings for Greedy Forwarding Algorithm.
c) Centrally Coordinated Cache Algorithm with an 80% global / 20% local cache partition
Number of Clients | Disk Access (blocks) | Global Cache Access (blocks) | Local Cache Access (blocks) | Average Access Time (msec)
4 | 8134 | 6532 | 554 | 7.11232
8 | 1940 | 12380 | 568 | 2.58668
12 | 0 | 14288 | 465 | 1.17428
16 | 0 | 14210 | 386 | 1.17666
20 | 0 | 14197 | 425 | 1.17497

Table 4: Readings for Centrally Coordinated Cache Algorithm.
d) N-Chance Forwarding Algorithm with recirculation count = 2
Number of Clients | Disk Access (blocks) | Local Cache Access (blocks) | Remote Client Cache Access (blocks) | Average Access Time (msec)
4 | 7995 | 2053 | 5921 | 6.94339
8 | 913 | 2059 | 12345 | 1.76671
12 | 217 | 2108 | 13148 | 1.25774
16 | 92 | 1911 | 13331 | 1.17495
20 | 32 | 1986 | 13205 | 1.12629

Table 5: Readings for N-Chance Forwarding Algorithm.
e) N-Chance Forwarding with Centrally Coordinated Cache Algorithm with a 50% global / 50% local cache partition
Number of Clients | Disk Access (blocks) | Global Cache Access (blocks) | Local Cache Access (blocks) | Remote Client Cache Access (blocks) | Average Access Time (msec)
4 | 7912 | 3148 | 1062 | 3878 | 6.93266
8 | 2286 | 4959 | 1100 | 6871 | 2.81594
12 | 405 | 6392 | 1131 | 7518 | 1.44356
16 | 166 | 6866 | 1024 | 7227 | 1.27281
20 | 107 | 7000 | 1011 | 7283 | 1.23157

Table 6: Readings for N-Chance Forwarding with Centrally Coordinated Cache Algorithm.
The table below shows the readings when the proportion of global cache to local cache is varied in the Centrally Coordinated Cache algorithm, with the number of clients held constant.
Global Cache (%) | Local Cache (%) | Disk Access (blocks) | Global Cache Access (blocks) | Local Cache Access (blocks) | Average Access Time (msec)
50 | 50 | 8586 | 5650 | 1180 | 7.41294
60 | 40 | 7394 | 6800 | 950 | 6.55156
70 | 30 | 6175 | 8210 | 730 | 5.6724
80 | 20 | 5044 | 9405 | 485 | 4.85721
90 | 10 | 3967 | 10530 | 300 | 4.07888

Table 7: Readings for Centrally Coordinated Cache when varying the local/global cache proportion.
The graph in figure 2 shows the number of blocks accessed from the disk as the number of clients increases from 4 to 20 (total number of blocks read = 20000).

Figure 2: Graph of Number of Clients vs. Number of blocks accessed from disk for 5 algorithms

As the graph shows, as the number of clients in the system increases, the number of blocks read from disk falls for all the algorithms except Direct Client Cooperation. The number of disk accesses by the Greedy Forwarding algorithm also decreases as clients are added, but not as much as for the Centrally Coordinated Cache, N-Chance Forwarding and N-Chance Forwarding with Centrally Coordinated Cache algorithms. Once the number of clients exceeds 8, disk access by these three algorithms is very low, because almost all the blocks in the system are cached in either a client cache or the server cache.
The graph in figure 3 shows the average access time to read a block for the five algorithms as the number of clients increases.

Fig 3: Graph of Number of Clients vs. Average Block Access time for 5 algorithms

It can be seen that increasing the number of clients has almost no effect on the average access time of Direct Client Cooperation, whose average access time is also much higher than that of the other four algorithms. The average access time of Greedy Forwarding decreases, but not as much as that of the other three algorithms. Once the number of clients exceeds 8, the average access times of Centrally Coordinated Cache, N-Chance Forwarding and N-Chance Forwarding with Centrally Coordinated Cache are almost the same and remain constant, because all the blocks in the system are cached in either a client cache or the server cache.
The graph in figure 4 shows the distribution of how the 20000 block reads are satisfied in the five algorithms: the number of blocks read from disk, server cache, global cache, client local cache and remote clients' caches.

Fig 4: Graph of distribution of location of blocks for the 5 algorithms

The graph in figure 5 shows the average access time to read a block for all five algorithms when the number of clients = 8.

Fig 5: Graph of Average block access time for 5 algorithms
2) Number of blocks on disk is greater than the total available cache size.

The simulation parameters for this test case are as follows:

Parameter | Value
Number of Files | 3000
Blocks per File | 5
Total Blocks on Disk | 15000
Total File Reads | 4000
Client Cache Size | 750 blocks
Server Cache Size | 2000 blocks
Local Memory Access Time | 0.25 msec
Server Memory Access Time | 1.05 msec
Remote Client Memory Access Time | 1.25 msec
Disk Access Time | 15.85 msec

Table 8: Simulation parameters for test case 2.

In this test case the number of blocks on the disk is increased, so that it is greater than the total available cache size even when the number of clients is 20. Disk accesses therefore occur in all the algorithms, even for a large number of clients. The results for the above simulation parameters for the five algorithms are as follows:

a) Direct Client Cooperation Algorithm

Number of Clients | Disk Access (blocks) | Local Cache Access (blocks) | Average Access Time (msec)
4 | 16823 | 1991 | 13.41938
8 | 16753 | 1988 | 13.3677
12 | 16345 | 2354 | 13.05114
16 | 16563 | 2088 | 13.2231
20 | 16501 | 2110 | 13.17634

Table 9: Readings for Direct Client Cooperation Algorithm.

b) Greedy Forwarding Algorithm

Number of Clients | Disk Access (blocks) | Local Cache Access (blocks) | Remote Client Cache Access (blocks) | Average Access Time (msec)
4 | 14880 | 1592 | 1739 | 12.01491
8 | 12037 | 1195 | 4788 | 9.95746
12 | 9953 | 1073 | 6786 | 8.44016
16 | 8678 | 1064 | 7858 | 7.50774
20 | 7337 | 1118 | 9299 | 6.52765

Table 10: Readings for Greedy Forwarding Algorithm.

c) Centrally Coordinated Cache Algorithm

Number of Clients | Disk Access (blocks) | Global Cache Access (blocks) | Local Cache Access (blocks) | Average Access Time (msec)
4 | 13994 | 3309 | 180 | 11.43145
8 | 10822 | 6368 | 270 | 9.11116
12 | 7408 | 9857 | 183 | 6.62317
16 | 4403 | 12840 | 205 | 4.42842
20 | 1184 | 16594 | 141 | 2.08646

Table 11: Readings for Centrally Coordinated Cache Algorithm.
d) N-Chance Forwarding Algorithm
Number of Clients | Disk Access (blocks) | Local Cache Access (blocks) | Remote Client Access (blocks) | Average Access Time (msec)
4 | 12235 | 988 | 4678 | 10.11116
8 | 6273 | 964 | 10490 | 5.75836
12 | 1982 | 1071 | 14645 | 2.62029
16 | 1140 | 1061 | 15441 | 2.00557
20 | 661 | 1040 | 15893 | 1.65647

Table 12: Readings for N-Chance Forwarding Algorithm.
e) N-Chance Forwarding with Centrally Coordinated Cache Algorithm
Number of Clients | Disk Access (blocks) | Global Cache Access (blocks) | Local Cache Access (blocks) | Remote Client Access (blocks) | Average Access Time (msec)
4 | 12197 | 2031 | 478 | 3030 | 10.10727
8 | 6675 | 3331 | 584 | 7039 | 6.06984
12 | 2994 | 4873 | 584 | 9136 | 3.38229
16 | 1351 | 6142 | 544 | 9511 | 2.18451
20 | 689 | 7248 | 500 | 9194 | 1.70428

Table 13: Readings for N-Chance Forwarding with Centrally Coordinated Cache Algorithm.
The graph in figure 6 shows the number of blocks accessed from the disk as the number of clients increases from 4 to 20 (total number of blocks read = 20000).

Fig 6: Graph of Number of Clients vs. Number of blocks accessed from disk for 5 algorithms

As figure 6 shows, when the number of blocks on disk is very large there is disk access in all the algorithms, even with 20 clients. Disk access in Direct Client Cooperation is the highest and remains roughly constant regardless of the number of clients. The number of disk accesses in the Greedy Forwarding algorithm decreases as clients are added, but not as much as in the other three algorithms. In contrast to the first test case, the number of disk accesses in the Centrally Coordinated Cache algorithm is greater than in the N-Chance Forwarding and N-Chance Forwarding with Centrally Coordinated Cache algorithms. This is because even when 80% of each client's cache is managed globally, the total cache size is less than the number of blocks on disk, so not all blocks are cached and disk accesses occur even with 20 clients. N-Chance Forwarding with Centrally Coordinated Cache and N-Chance Forwarding perform almost equally.

The graph in figure 7 shows the average access time to read a block for all algorithms as the number of clients increases.

Fig 7: Graph of Number of Clients vs. Average Block Access time for 5 algorithms

From figure 7 it can be seen that the average access time for Direct Client Cooperation is very high and remains roughly constant regardless of the number of clients. The average block access time for the Greedy Forwarding algorithm decreases as clients are added. The average block access time of Centrally Coordinated Cache also decreases with more clients, but it is higher than in test case 1 because not all the blocks on disk are cached, so disk accesses occur. The average block access times of the N-Chance Forwarding and N-Chance Forwarding with Centrally Coordinated Cache algorithms are nearly the same and much lower than those of the other three algorithms.

The graph in figure 8 shows the distribution of how the 20000 block reads are satisfied in the five algorithms: the number of blocks read from disk, server cache, global cache, client local cache and remote clients' caches.

Fig 8: Graph of distribution of location of blocks for the 5 algorithms

The graph in figure 9 shows the average block access time for the five algorithms when the number of clients = 8.

Fig 9: Graph of Average block access time for 5 algorithms

VI. ANALYSIS OF RESULTS
The following analysis can be made from the results of the test cases above:

- In both test cases the Direct Client Cooperation algorithm has the worst performance of all the algorithms. This is because the clients are unaware of other clients' cache contents, so a disk access occurs even if the block is present in another client's cache.

- The performance of the Greedy Forwarding algorithm is much better than Direct Client Cooperation in both test cases because of the coordination of clients' cache contents: a client requests a block from a remote client that has it, rather than having the server perform a disk access. Still, the performance of Greedy Forwarding is not as good as that of the following three algorithms, because multiple copies of the same block exist among clients.

- The Centrally Coordinated Cache algorithm outperforms all the other algorithms in test case 1. This is because when the number of clients in the system is greater than 8, the global cache formed by 80% of all the clients' caches holds every block present in the system, so there is no need for disk access: every block can be found in a local cache, the server cache or the global cache. But in test case 2, the Centrally Coordinated Cache algorithm does not perform as well as the N-Chance Forwarding or N-Chance Forwarding with Centrally Coordinated Cache algorithms. This happens because of the large number of blocks on disk: even with 20 clients in the system, not all the blocks on disk are cached by Centrally Coordinated Cache, so disk accesses occur.

- The N-Chance Forwarding algorithm performs best under both test cases. This is because N-Chance Forwarding prefers singlets, trying hardest to keep at least one copy of every block in the system. Also, in the N-Chance algorithm the number of local cache hits is high, as clients use their entire local cache. Because of the high global and local hit ratios, the average access time of the N-Chance algorithm is the lowest of all the algorithms under both test cases.

- The N-Chance Forwarding with Centrally Coordinated Cache algorithm performs about as well as N-Chance Forwarding and better than the Centrally Coordinated Cache algorithm. It performs better than Centrally Coordinated Cache because the latter uses only 20% of the client cache as local cache, giving far fewer local cache hits, whereas N-Chance Forwarding with Centrally Coordinated Cache uses 50% of the client cache locally. Also, because of the preference for singlets and the 50% of the client cache used as global cache, its global hit ratio is high.
VII. CONCLUSIONS

In Distributed File Systems, a high number of disk accesses can greatly affect the performance of the system. Based on the performance of the various cooperative caching algorithms under the test cases, it can be concluded that cooperative caching algorithms can improve read performance by reducing the number of disk accesses: they take advantage of remote clients' cache memory and thus avoid disk access. Though simple to implement and requiring few changes to the system architecture, cooperative caching algorithms can provide very good read performance improvements. The Greedy Forwarding algorithm is very simple to implement and provides some read performance improvement. The Centrally Coordinated Cache algorithm can provide very good read performance improvement when the number of files in the system is small, but it imposes load on the server. The N-Chance Forwarding algorithm and N-Chance Forwarding with Centrally Coordinated Cache can provide significant improvement in the overall performance of the system. This project's implementation shows that the Greedy Forwarding, Centrally Coordinated Cache, N-Chance Forwarding and N-Chance Forwarding with Centrally Coordinated Cache algorithms reduce the number of disk accesses, reducing the average block access time and increasing the overall performance of the system.
ACKNOWLEDGMENT

I would like to thank Prof. Hans-Peter Bischof for his guidance and help throughout the implementation of my Master's Capstone Project. I would also like to thank Prof. James Heliotis for his valuable inputs for improving my Master's Project Report. I am grateful to Prof. Alan Kaminsky for his Computer Science Course Library, which helped me in the implementation of my Master's Project.

REFERENCES

[1] M. Dahlin, R. Wang, T. E. Anderson, and D. A. Patterson. Cooperative Caching: Using Remote Client Memory to Improve File System Performance. OSDI '94, 1994.
[2] Michael D. Dahlin, Clifford J. Mather, Randolph Y. Wang, Thomas E. Anderson, and David A. Patterson. A Quantitative Analysis of Cache Policies for Scalable Network File Systems. Proc. 1994 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, May 1994.
[3] Benjamin Reed and Darrell D. E. Long. Analysis of Caching Algorithms for Distributed File Systems. SIGOPS Oper. Syst. Rev. 30, 3, July 1996.
[4] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The Google File System. 19th ACM Symposium on Operating Systems Principles, October 2003.
[5] Siddhartha Annapureddy, Michael J. Freedman, and David Mazières. Shark: Scaling File Servers via Cooperative Caching. In Proceedings of the 2nd Symposium on Networked Systems Design and Implementation (NSDI '05).
[6] Randolph Y. Wang and Thomas E. Anderson. xFS: A Wide Area Mass Storage File System. Whitepaper.
[7] http://www.tesisenxarxa.net/TESIS_UPC/AVAILABLE/TDX-0721109-153311//TTCR2de2.pdf
[8] Alan Kaminsky. Computer Science Course Library, Rochester Institute of Technology.