Effect of Performance in Heterogenous Traffic in Opportunistic Networks

DOI : 10.17577/IJERTCONV4IS22072

Download Full-Text PDF Cite this Publication

Text Only Version

Effect of Performance in Heterogenous Traffic in Opportunistic Networks

Gopi Krishna T. L

Assistant Professor, Dept of CSE, VTU, ADARSHA Institute of Technology

Bangalore, India

Harish G. N

Assistant Professor, Dept of CSE, VTU, ADARSHA Institute of Technology

Bangalore, India

Abstract:- In opportunistic networks, direct communication between mobile devices is used to extend the set of services accessible through cellular or Wi-Fi networks. Mobility patterns and their impact in such networks have been extensively studied. In contrast, this has not been the case with communication traffic patterns, where homogeneous traffic between all nodes is usually assumed. This assumption is generally not true, as node mobility and social characteristics can significantly affect the end-to-end traffic demand between them. The proposed system is to explore the joint effect of traffic patterns and node mobility on the performance of popular forwarding mechanisms, through simulations. Here in, heterogeneity renders the added value of using extra relays more/less useful. Furthermore, we confirm the intuition that an increasing amount of heterogeneity closes the performance gap between different forwarding policies, making end to- end routing more challenging in some cases, or less necessary in others.

  1. INTRODUCTION

    Opportunistic Networks are used when the future is expected to support communication during environments which are really challenging, when the intermediate nodes are absent (Example: critical situations after disaster).Opportunistic Networks require nodes which are mobile in nature. These nodes communicate only when they come into direct communication. In order to communicate with each other the nodes should be in the same transmission range.

    Most of the users make use of Bluetooth for transferring the data files. But the range of Bluetooth is limited compared to Wi-Fi range of transmission. Hence the communication within the nodes is not continuous and difficult to maintain end to end paths as well. As our main intension is to provide security for our data, the nodes dont have interest to depend on intermediate. Because the third party may modify the data or information. Hence the sender node expects the destination node to come near the source node then receive the information. So the message can only be transferred when both the nodes come into direct contact.

    Opportunistic networks also support modifying of existing network structure, by switching onto Wi-Fi networks from the user based cellular networks and also giving a chance for novel said applications. As the range of transmission is major factor for communicating, passing data to the destination is tough and it will not be continuous, managing becomes difficult and problematic. In order to provide secure data an original copy of the data

    is stored in source and carried over network and reaches the intermediate node which has no option to modify, further the packet is forwarded to the sink node when the intermediate node encounters the sink node.

    These Opportunistic networks also support modifying of existing network structure, by switching onto Wi-Fi networks from the user based cellular networks and also giving a chance for novel said applications. When user is using Wi-Fi networks they can communicate with many different nodes located in the same area. Here its easier for exchanging files, documents from source to destination. Because destination can travel nearer to source node. Suppose if the nodes are in different networks they can use only one intermediate node by keeping one copy with source node and then pass it on for the destination node through intermediate node. Keeping the safety of data as our main intension.

    Opportunistic networks consists of mobile nodes which are moving in character like mobile phones especially android, notebooks, laptops etc. That these devices exchanges nearer using Bluetooth. Where the transmission range is limited. As the range of transmission is the major factor for communicating, passing data to the destination node is tough and it will not be continuous and managing becomes difficult and problematic

    So, at this point of time the node has to depend on the neighbour node, but while a source has to send secure data it cannot trust the intermediate node. So it has an option where it can keep an instance of an information or data and then forward it to through destination, when the destination reaches that intermediate node.

    Since message exchange or information swapping or packets exchanging take place only when distance between nodes are less. Hence mobility parameter is the major factor in both performance and the protocol designing and their applications

    Its difficult to understand human mobility for the simulations of mobile devices in a wireless network, but the present mobility models do not have reflection of real user movement position. The purpose we use computation of mobiles and communication [1] is to allow people interact with other people while moving and exchange informations, files, datas, documents to nearby people. Anyone who are involved in designing a system or network to provide services for the people they should have idea about them the people how they move.

    Whenever there is no intermediate nodes to communicate with the destination node these DTNs come into picture. These networks are mostly used in the networking technology. It follows three procedure first it stores itself a copy of packet and then carry along a network and then reach intermediate node[2] and then forwards to the destination node. Hence in order to avoid this mechanism it depends on mobility factor. Where each of nodes will be mobile. So that it makes sure to increase performance of delay tolerant network routing and protocol applications are mostly dependent on the mobility parameter of each nodes and its behaviour.

    In this model herds of animals are used as message passers. We use a routing protocol that applies knowledge to predict the behaviour of each message. In order to select correct message for carrying data from [3] source to sink nodes by assuming that the cattle of sheep in the farmyard and flock of sheep in the garden area.

    In order to send heterogeneous data we use distributed learning algorithm. This algorithm first separates the data based on the type of data and allocates a channel. Then the nodes use a function that tests the [4] channel like which channel is good and which channel is poor. If a channel outcome is success then that node uses the channel to transfer the data else it uses the other channel. If a channel outcome is poor then the node uses other channel to transfer the data.

    The query is how do we create an opportunistic network [6]? By using epidemic routing protocol and PROPHET protocols we can create these networks. Here each of the nodes reserves two channels one is to store the messages from their neighbour nodes and the other its own messages. Whenever the intermediate node goes nearer to the destination node at that point of time this transfers the messages stored in them.

  2. NETWORK MODEL

In network model we divide into two parts:

  • Mobility

  • Communication Traffic

  1. Mobility

    In this part we consider a network w with w nodes which can move in a network covered in by an area which exceeds their transmission range. Between the two mobile nodes message exchanges when they come into direct face to face contact. In this point the message delivers to the receiver based on its twocriteria mobility and the direct face to face communication. This gives a chance to use Markova frameworks for making an analysis of the message delivery progress. Finally in our project there are no bandwidth concerns in this framework.

  2. Communication Traffic

    To analyse the traffic we to come across three questions: who wants to communicate with whom? And

    how long the communication goes on? And how much traffic is being exchanged between them? By looking back to our previous history of projects and when we analyse it technically and socially it proves that there exist a heterogeneous traffic whenever a data of heterogeneity is sent. Many dont know the importance of opportunistic networks and their applications. These networks have a depth applications as location based and social applications. These networks majorly depend on special and social networks. It checks the location its transmission range and the people who use mobile nodes like laptops, smart phones to communicate with each other and for transferring the audio, video and file documents.

    When it comes to the traffic two questions rises such that: whether traffic depends on the mobility of the nodes and whether there are any correlations between the traffic and mobility. But to be frank traffic is independent of mobility nature of nodes. It may have some effect on the delay but not totally.

    The same studies further suggest that this hetero- geneity depends on the spatial and social characteristics of these networks. Since location-based services and social networking are considered among the major applications supported by opportunistic networks, such traffic dependencies on social and/or spatial factors are very probable to appear. What is more, mobility characteristics have also been found to depend on spatial and social characteristics [10]. This clearly seems to argue for a non- homogeneous traffic model. Moreover, traffic and mobility in such networks are expected to exhibit some correlations [13], [14]. Before we proceed to choose a traffic model, one should consider the following questions: Would the mere heterogeneity of trafficsuffice to affect performance? Is it necessary to consider traffic and mobility correlations? As stated earlier, information dissemination is determined by the sequence of contact events. Hence, if traffic character-Fig. 1. Mean delivery delay of 4 routing protocols, namely Direct Transmission, Spray and Wait (SnW), 2-hop, and SimBet, on the (a) Gowalla and (b) Strathclyde datasets.

    Statistics are independent of node mobility; one might expect a limited impact on performance. Towards examining the validity of the above argument, we decided to compare the performance of some well-known opportunistic protocols (direct transmission [4], spray and wait [5], 2-hop routing [18], and SimBet [7]) through simulations on two real traces (we discuss the traces in more detail, later, in Section 4), for three traffic scenarios:

    1. Homogeneoustraffic: every pair of nodes has the same chance of being chosen as the source-destination pair for the next message;

    2. Heterogeneous trafc that is mobility independent: we assign randomly to each pair a different end-to-end trafc demand (with the normalized message generation rate for a pair drawn uniformly in [1,1000]);

    3. Heterogeneous trafc that is mobility dependent: end-to-end trafc between two nodes is proportional to

their contact rate. We generated an equal (sufficiently large) number of messages for all scenarios. Results for the mean message delivery delay are shown in Fig. 1. As is evident from these results, when trafc heterogeneity is independent of mobility (middle bar), the average delay is practically the same to the homogeneous case (left bar), for all protocols, and across all scenarios (including additional ones we have tried). In contrast, when trafc is heterogeneous and correlated with the contact rates (rightmost bar), Fig. 1 shows a clear difference in average delay for all scenarios and protocols. These results provide an initial answer to the above questions: It is not traffic heterogeneity itself that affects performance, but rather the joint effect of mobility and traffic (hetero- geneity). In other words, unless differences in traffic demand correspond also to differences in contact frequency (e.g. frequently meet- ing pairs tend to also consistently generate more/less traffic for each other), end-to-end performance will not be affected. This statement is also formally proven in Lemma 1 (Section 8.4). The above observation, together with the initial insight com- ing from real datasets, motivates us to propose the following simple, yet quite generic, model for end-to-end traffic.

Denition 2 (Heterogeneous Communication Traffic). The end-to-end traffic demand (per time unit) between a pair of nodes {i,j}, is a random variable ij, such that E[ij]

= (ij), where (·) is a continuous function from

Hence, trafc demand between node pairs can differ and is on average correlated with the nodes contact rate. However, ij itself is still random, allowing some node pairs to have little trafc demand even if they meet often (e.g. familiar strangers). Furthermore, through the function (·) one can introduce a number of different types and amounts of (positive or negative) correlations between trafc and mobility. While real mobility and trafc patterns are clearly expected to have a number of additional nuances and details, not captured by the models of Def. 1 and Def. 2, it turns out that these abstractions are still rich enough to allow us to draw useful conclusions.

III ANALYSIS

Consider now an opportunistic network with mobility and trafc according to the denitions of Section 2. To calculate a performance metric for this network, e.g. the expected delay, one would consider a large number of messages generated be- tween various source-destination pairs. Therefore, one would further need to know the contact rates between the sources and destinations of these messages. If a message was equally likely to be generated between any pairs of nodes, then the contact rate between the source and destination of this message should be distributed as f (Def. 1). However, if messages are more like to come from a frequently meeting pair rather than an average pair, then the source-destination contact rate (we refer to it as the effective contact rate) would be biased towards higher values. To this end, we derive the following basic proposition (whose proof is given in Section 8.1) for

the probability distribution of the effective contact rates between source destination node pairs.

Proposition 1. The probability density function f of the contact rate between the source and the destination

{s,d} of a random message, in a network following Denitions 1 and 2, converges as follows:

f(x)p 1 C ·(x)·f(x)

Where f(x)dx = P{sd [x,x + dx)},

p denotes conver- gence in probability, and C = E[()]

=R 0 (x)f(x)dx is a normalizing constant

As Proposition 1 shows, the source-destination contact rate distribution depends both on the contact rate distribution f() and the trafc

Fig. 1. Mean delivery delay of 4 routing protocols, namely Direct Transmission, Spray and Wait (SnW), 2-hop, and SimBet, on the (a) Gowalla and (b) Strathclyde datasets.

Patterns () (i.e. joint effect of mobility and trafc). Specically, the probability that the contact rate of a selected node pair takes a certain value, e.g. sd [x,x+dx), is proportional to the number of pairs that contact with rate ij [x,x+dx) (i.e.f(x)) and the average trafc demand between them (i.e. (x)).

  1. End-to-end Delivery Performance

    An opportunistic routing protocol tries to deliver the end-to- end trafc demand ij, and we would like to consider the effects of different contact patterns f and trafc patterns () on its performance. There exists a very large abundance of proposed schemes [8 and it would not be possible, nor would it provide any intuition, to analyze the effect of heterogeneity on each and every one. Instead, we focus here on some basic mechanisms to gain intuition. The approach with the minimum overhead and complexity is Direct Transmission (DT): nodes wishing to exchange data or information with each other, may do so, only when they are in direct contact, without involving any relays. For instance, DT is often assumed in content-centric applications, where a node interested in some content will query directly encountered nodes for content of interest, and retrieve it only if it is available there. Furthermore, it is the only feasible approach if nodes do not have incentives to relay trafc they are not personally interested in, e.g. due to privacy or resource- related concerns. Nevertheless, DT

    is known to suffer from long delays and low throughput. To improve the performance of direct transmission, replication or relay-assisted schemes can be used. Extra copies can be handed over to encountered nodes, and the destination can receive the message from either the source or any of the relays, reducing thus the expected delivery delay. Taken to the extreme, schemes like epidemic routing

    [15] forward the message at every possible encounter (deterministically, probabilistically, or based on some utility-function). Yet these do not usually scale well beyond networks with few tens of nodes, due to large resource usage. Instead, few relays are normally used, in an attempt to strike a good trade-off. In networks with homogeneous mobility and trafc, it is known that using just a few extra copies leads to signicant performance gains. For example, in a network of 1000 nodes, simply distributing 10 extra copies to the rst 10 nodes encountered provides an almost 10-fold improvement in delay compared to direct transmission [5]. Although this also comes with a 10-fold increase in the amount of (storage and bandwidth) resources needed, it presents a very useful trade-off to DTN protocol designers. However, when it comes to heterogeneous mobility and trafc, Proposition 1 suggests that, unlike the above example, the source is no longer equivalent with other random relays, in terms of their probability of contacting an intended destination soon. It is thus of particular interest to examine whether the above trade-off still holds, if one considers the joint effect of realistic mobility and communication trafc patterns. We thus consider, in the following, Relay-assisted routing, which is a simple abstraction of schemes that use extra randomly chosen relays2. To compare the performance of Relay-assisted routing and Direct Transmission, in terms of delivery delay and delivery probability (the two main metrics considered in related work), we rst dene the following metrics: (a) Delay Ratio, R: the ratio of the expected delivery delay of Relay- Assisted routing, E[TR], over the expected delivery delay of Direct Transmission routing, E[TDT], i.e.

    R =E[TR] /E[TDT]

    (b) Source Delivery Probability, P(src.): the probability that a message is delivered to the destination by the source node, rather than by any of the relays. Both metrics contain information about the performance gain of Relay-assisted routing compared to Direct Transmission. Specically, R shows how faster (on average) a message can be delivered under Relay-assisted routing, whereas P(src.) gives the probability that any of the relays will actually contribute in the delivery process. It is easy to see that (i) R and P(src.) always take values in the interval [0,1], and (ii) the higher their values are, the less the gain due to relay nodes is. For instance, when R = 0.1 Relay-assisted routing delivers (on average) a message 10 times faster than Direct Trans- mission, while a value R = 0.5 denotes that Relay-assisted routing is only 2 times faster. Respectively, when P(src.) =

    0.1 the probability that the source node s meets the destination d, before any other relay node meets d, is 10%, and P(src.) = 0.5 means that this probability is 50%. In the

    limiting cases, when R,P(src.) 1 the message is delivered to the destination by the source node itself, while when R,P(src.) 0 delivery takes place (entirely) due to the relays. In Result 1, we derive analytical expressions for these two metrics, R and P(src.). The proof is given in Section 8.2. Result 1. When Relay-assisted routing with L extra copies is considered, then

    where the expectations are taken over f and fR = f(L) is the L-fold convolution of f.

    In addition to the main metrics considered in this paper (Result 1), and for the readers ease of reference, in Table 1 we provide expressions for the absolute performance (message delivery delay and delivery probability) of Direct Transmission and Relay- Assisted routing. The expressions follow straight from the proof of Result 1 or through similar analysis, and, thus, we omit the detailed derivations.

  2. Insights for Real Opportunistic Networks

    The expressions we derived in Result 1 are generic and can be used under any mobility and trafc pattern (i.e. for any f and (·)). However, they do not give a good feel as to how exactly these metrics are affected by mobility and trafc heterogeneity. To obtain some further insights, in this section, we consider specic classes of mobility and trafc patterns that capture commonly observed characteristics of real networks. For these classes, we derive simple closed form expressions that bound the performance metrics R and P(src.).

    Mobility We will assume the contact rates to be gamma distributed, i.e. f(x) (x;,) = ()x1ex. Our choice is initially motivated by the ndings of Pascrell et al. [10], who have shown, through statistical analysis of pervasive social networks datasets, that the Gamma distribution matches well the observed contact rates. In addition, the analytical ndings of [10], further suggest that the choice of a Gamma distribution can be supported in real opportunistic networks and can explain many of the observed properties (e.g. distribution of aggregate inter-contact times). Finally, by selecting appropriately the parameters and of a Gamma distribution, we can assign any desired value to the mean value µ and the variance 2 of the contact rates3. This allows us to describe (or t up to the rst two moments) a large range of scenarios with different mobility heterogeneities captured by CV = µ .

    Trafc We further describe the trafc using a polynomial function of the form (x) = c·xk, c > 0. As in the case of mobility, the reasons for our choice are as following. Observations of real networks have shown that the nodes with high contact frequencies tend to exchange more trafc [13], [14], which is consistent with the above choice when k > 0. Second, the exact trafc patterns (i.e. (x)) in a real scenario are difficult (if not impossible) to determine, and, hence, it is more probable that simple methods will be used. For example, one might get some trafc samples and perform linear regression on the measured data. This would result in a linear (x) (i.e. k = 1). Our model extends this logic by going beyond linear tting and allowing as well sub- and super-linear tted

    traffic patterns. In general, the values of k capture the amount of trafc heterogeneity. Furthermore, by choosing 0 < k < 1 (or k > 1) one can emulate concave (or convex) functions and, thus, approximate different trafc patterns. Finally, one can also consider negative correlations, by choosing k < 0. Although less common, these could arise, for example, in applications where users want to communicate more when they do not meet frequently (e.g. messaging). Under the above assumptions, the following result for the relative performance of the information dissemination mechanisms we consider in this paper, holds. The proof of Result 2 is given in Section 8.3. The corresponding expressions for the absolute performance metrics are given in Table 1. Result 2. In a Heterogeneous ContactNetwork where f (,) with mean value µ and variance 2 (coefficient of variation CV = µ ) and (x) = c·xk, it holds: 1 R Rmin = 1 + (k1)·CV 2 1 + k·CV 2 + L (2) for k > kmin = 1 1 CV 2 , and

    1 P(src.) Pmin =1 + k·CV 2 1 + (k + 1)·CV 2 + L (3)for k > kmin = 1 CV 2 .

    The expressions of Result 2 depend only on 3 parameters (CV, k, L) and, thus, could be.

  3. IMPLICATIONS

It is evident from the above example that trafc heterogeneity can have a major impact on performance and thus protocol design. Table 2 formalizes this impact, by considering how Rmin and Pmin (Eq. (2) and Eq. (3)) behave: The middle column shows their monotonicity as mobility heterogeneity (CV), trafc heterogeneity (k), and amount of extra copies (L) increase. For instance, when k increases (%), Rmin and Pmin increase (%) too. The right column gives their values in the limit for large/small k or CV; e.g. collaborative or local. Such schemes have been proposed [32], [33] to collect contact related information for forwarding algorithms, but would now need to maintain also trafc-related information and correlate it with the information about the node contact rates, in an efficient manner.

Routing for Unicast Applications For high heterogeneity (trafc and mobility), our results imply that a unicast message is likely to arrive to its destination at the time the source and destination come in contact (i.e. Rmin, Pmin 1 as k,CV ). This raises questions about the usefulness of opportunistic networking for unicast applications in which end-to-end trafc is expected to be highly correlated with contact frequency (e.g. Facebook messaging) [13], [14]. On the other hand, our results suggest that potential unicast applications with an end-to- end trafc demand between nodes with non-frequent meetings, i.e. scenarios with small or negative k, (e.g. social peers residing in different communities) could benet a lot (more than normally assumed). Although these observations might appear somewhat self- evident at rst glance (note however the case described in the previous subsection), the question of how to tune protocols and choose the right number of replicas stills remains. To our

best knowledge, our results are the rst to provide closed form, quantitative insights into the tradeoffs involved in real scenarios with both mobility and trafc heterogeneity. Moreover, one could raise a point about their applicability for sophisticated protocols that choose relays intelligently (e.g. based on contact rates, social graphs). In this case, a source node could try to wait and select better relays than giving the copies to the rst randomly encountered peers, thus improving the impact per replica. Nevertheless, in a highly heterogeneous scenario, a source might need to wait a long time until it encounters such good relays (spray phase) and this could counter-balance the effect of better relays. In Section 5, we prove that the qualitative implications of our results hold also for such mobility- aware protocols, which exploit mobility heterogeneity in order to select better relays. A complementary explanation for this qualitative result is given in the end of this section (see Fig. 2 and the corresponding commentary).

In this section, we elaborate on some important implications that follow from Table 2.

Gain of Extra Copies A strong positive correlation (large k) between trafc and mobility reduces the added value of extra copies (i.e. Rmin, Pmin % as k %). This indicates that, as correlation (k) in- creases, one needs to distribute message copies to more relays nodes in order to achieve a certain performance improvement compared to the baseline, Direct Transmission. In contrast, a negative (or weak positive) correlation renders each extra copy more useful (i.e. Rmin, Pmin 0 as k kmin4). The fact that a weak positive correlation, e.g. k (0, 1 L), actually makes extra copies more useful might be a bit surprising. However, it is explained as following: Mobility heterogeneity (when trafc is homogeneous or uncorrelated with mobility) affects negatively the message delivery delay (of random protocols and Direct Transmission) [9], [11], whereas positively-correlated trafc has an opposite effect (i.e. decreases delay). The counterbalancing effects of these two factors determine a threshold (e.g. 1 + 1 L for Rmin or 1 L for Pmin) under which the negative effects of heterogeneity affect more the message delivery process. Our framework, not only reveals this inherent trade-off, but also provides the tools for quantifying such thresholds. From the above discussion it becomes evident that it is crucial to identify whether a trafc-mobility correlation exists in a given scenario, and what its nature is, as this could decide whether the overhead of using few or more extra copies is justied or would just waste a lot of valuable resources. In practice, this means that a relay- assisted protocol should be complemented with an online estimation algorithm, Content-Centric Communication

While our results are somewhat pessimistic when it comes to the usefulness of opportunistic networking for unicast applications, the opposite holds when it comes to modern, content- centric applications (e.g. le sharing, D2D-based offloading, service composition). In such applications nodes are looking, for example, for some content of interest [3] or service [2], which they can access directly from any encountered node that offers it. If the

interests of nodes are heterogeneous (which is known to be the case) and nodes with similar mobility patterns tend to have some similarity in their interests too (evidence for this does exist ), then our results suggest: (i) that there is a better chance to nd a content or service soon from a directly encountered node than one would expect in homogeneous scenarios, and (ii) coming up with complex, resource-costly mechanisms, e.g. multi-hop query- response, directories, etc., might not be necessary. We plan to look into such content-centric scenarios in more detail in future work.

Fig. 2. Message delay under Direct Transmission, Spray and Wait (L = 5), and Epidemic routing in scenarios with varying trafc heterogeneity; mobility parameters are µ = 1 and (a) CV = 1 and (b) CV = 2.

To put some extra evidence on our arguments and further demonstrate how and why trafc heterogeneity affects the relative performance, in Fig. 2 we compare the message delay of (i) Direct Transmission (i.e. the protocol with the highest delay), (ii) Relay-assisted routing (Spray and Wait, SnW, [5] with L = 5 copies) and (iii) Epidemic routing [15] (i.e. the protocol with the lowest delay), in two scenarios, for varying trafc heterogeneity (k). Two main observations, with respect to the previous implications, can be made in Fig. 2. At rst, an increasing amount of trafc heterogeneity/correlation closes the performance gap between the best (Epidemic) and the worst (Direct Transmission) forwarding policies. Hence, it becomes evident that the possible gain one could achieve by using any routing protocol and any number of extra copies, diminishes. As a result, routing schemes, whose design is crucial in homogeneous scenarios (since the improvement gap is large; see Fig. 2 for regions with low k), become less important in heterogeneous scenarios with highly correlated trafc (since the improvement cannot be large; see Fig. 2 for regions with high k) and/or less necessary (since comparable performance can be achieved with Direct Transmission; e.g. Fig. 2(b) for k = 4). Second, the delay of Direct Transmission decreases radically as trafc heterogeneity increases5. Although the delay of Relay- assisted routing decreases with trafc heterogeneity k too, the effect is less signicant. Specically, an observation of

the delay curves for Direct Transmission and Relay- assisted routing in Fig. 2(a), shows that the delay ratio R = E[TR] E[TDT ] increases as trafc becomes more heterogeneous. However, this increase is mainly due to the improvd performance of Direct Transmission rather than this of Relay-assisted routing.

  1. MODEL VALIDATION

    To validate our model and analysis, in this section we compare the theoretical results against Monte Carlo simulations on various synthetic scenarios, and on datasets of real networks.

    A. Synthetic Simulations

    We generate synthetic networks, conforming to the mobility and trafc models of Section 2, as following: (i) We assign to each pair {i,j} a contact rate ij, which we draw randomly from f, and create a sequence of contact events (Poisson process with rate ij). (ii) Since E[ij] = (ij) (from Def. 2), we draw the trafc rate for each pair

    {i,j} as ij Uniform[0,2·(ij)]. (iii) Then, we simulate a large number of message exchanges, choosing randomly for each message the source-destination pair according to the weights ij. We created different scenarios (N,L,f,(·)) to verify our analysis under various network parameters. Here, we present the simulation results for scenarios with N

    = 500 nodes6. As Relay-assisted routing, we used the Spray and Wait protocol [5] with L = 5 copies. To be consistent with the analysis of Section 3.2, we used the Gamma distribution as the contact rates distribution f and trafc functions of polynomial form, (x) = c·xk. In Fig. 3 and Fig. 4 we present simulation results for the ratios R and probabilities P(src.), along with the corresponding theoretical results (exact predictions of Result 1 and lower bounds of Result 2), in scenarios with varying mobility and trafc heterogeneity. Fig. 3 shows the delay ratio R: (a) in three scenarios with different trafc functions (x) (namely7: c·x, c·x2, and c · x4), under varying mobility heterogeneity; and (b) in three mobility scenarios with CV

    = {0.5,1,2}, under varying trafc heterogeneity. A rst observation is that the exact expressions of Result 1 (continuous lines) can accurately predict the metric R (simulation results are denoted with circles). Additionally, the lower bounds are always below the simulation curves (as expected), and in many scenarios are quite tight. Under the same mobility (CV) and trafc (k) simulation scenarios, similar observations can be made for the source delivery probability P(src.) in Fig. 4, where the exact expres- sions of Result 1 accurately match the simulation results and the bounds of Result 2 are tight in most scenarios. In general, for both the metrics R and P(src.), the theoretical lower bounds are less tight for scenarios where mobility is quite heterogeneous. Specically, in Fig. 3(a) and 4(a), the bounds are less close to the simulation curves in the regimes where CV becomes larger than 2. Also, in the scenarios with varying trafc heterogeneity (Fig. 3(b) and 4(b)), the bounds are tight for scenarios with small and moderate mobility heterogeneity, and become less tight only in the scenarios with CV = 2 (bottom plots of Fig. 3(b) and 4(b)). In every scenario, the simulation curves R

    and P(src.) have the monotonicity we predicted in Table 2 (middle column) for the theoretical bounds Rmin and Pmin. For instance, when trafc heterogeneity (k) increases, R and P(src.) always increase as well (Fig. 3(b) and 4(b)). Also, in the regimes that k kmin8 the simulation values of the considered metrics become almost zero, and for large k (especially in the bottom plots of Fig. 3(b) and 4(b), where mobility is also very heterogeneous) they get close to 1, thus validating the qualitative predictions of Table 2 (right column). The simulation results in Fig. 3(a) and 4(a), where we present scenarios with varying mobility heterogeneity (CV), validate our predictions for the monotonicity and limiting behavior as well. For example, in Fig. 3(a) for k = 0.5, where the trafc-mobility correlation is small (the same holds also for negative correlations), R and Rmin decrease as the mobility heterogeneity increases (as suggested in Table 2). In the rest of the plots, the bounds and the corresponding simulated values increase, demonstrating that the gain of the extra copies diminishes under such conditions, and, thus, conrming our qualitative results (Section 3.3). For example, in the bottom plot (k = 4) of Fig. 3(a), we can see that the improvement offered by the extra relays is at most 6×(since R = 1 1+L = 1 6) for homogeneous network (CV

    = 0), while for CV > 2 the extra gain is at most 1.25×(since R > 0.8); that is, even using 5 relays will only marginally improve the delay. Similarly, from Fig. 4(b) and for CV = 2, we can see that, while for almost homogeneous trafc (k < 0.5) the probability of the message being delivered through direct transmission, P(src.), gets less than 40%, when trafc becomes very heterogeneous (k 4), this probability is around 80%.

    Fig. 3. R in scenarios with varying (a) mobility and (b) trafc heterogeneity. Simulation results are denoted with circles; the theoretical predictions of Result 1 (exact predictions) with continuous lines; and the lower bounds Rmin (Result 2) with dashed lines.

    B.Real-World Networks

    To further investigate the applicability of our results in real-world networks, we conduct simulations on datasets collected from online social networks (Gowalla / Twitter dataset [13]) and a mobile phone usage experiment

    (Strathclyde dataset). In the following discussion we present the datasets, whose main features can be found also in Table 3.9

    Gowalla / Twitter dataset Gowalla was a location- based social network, where users were able to check-in at spots (bars, shops etc.) through their mobile phones. In addition, a user could connect her Gowalla account to her Twitter account. Hence, from this dataset, we could retrieve information related both to nodes mobility (Gowalla check-ins) and communication trafc (tweets). Mobility: In this dataset, we consider as a contact event the time when two users reside in the same spot simultaneously10. The contact rates ij can be computed from the number of the contact events and the inter-contact time intervals. Then, to incorporate this information in our model, we t the contact rates distribution f with a known probability distribution f. Specically, in the two cities, Austin (AU) and San Francisco (SF), for which we have the most user records (1004 and 479 nodes, respectively), the experimental CCDF (complementary cumulative distribution function) of the contact rates ij can be approximated by a straight line on a log-log plot. This implies that f could be tted with a Pareto distribution, instead of the Gamma distribution assumed in Section 3.2 and often observed in traces. Therefore, we use here the expressions of Result 1, instead of Result 2. Communication Trafc: As an indication for the communication trafc that two nodes would exchange in an opportunistic network, we use the number of tweets in which they are both involved. Hence, for each pair {i,j} we set its trafc rate ij equal to the number of tweets posted by i to j or by j to i, i.e. ij =

    #tweetsij. Then, we approximate the observed relation between trafc and contact rates (ij ij) with a function (x), in order to use it in our theoretical expressions. We also investigate more possible correlations between the opportunistic trafc (ij) and the Twitter trafc (#tweets), by creating two additional scenarios where we set ij

    =p#tweetsij and ij = (#tweetsij)2. The approximate functions (x) for each scenario are presented in Table 4, where we can see (x) being of type c·xk with k < 1. Strathclyde dataset The Strathclyde dataset was collected in an experiment, in which 24 high school students were selected and given modied smartphones, which recorded proximity events (through Bluetooth), calls and sms exchanged between the phone user and the other participants. Mobility: In this dataset the contact events were already recorded and, thus, we did not have to preprocess the data as in the Gowalla dataset. We followed the same methodology to calculate the contact rates ij and t their distribution with a Gamma distibution, denoted as

    f. Communication Trafc: We consider three scenarios, in each of which we use a different communication trafc metric: (i) total number of calls and sms, ij = #callsij

    +#smsij, (ii) total duration of calls, ij = callTimeij, and

    (iii) total length of sms (in characters), ij = smsLengthij. For each scenario, we t function (x) as before, through the relation ij ij. Simulations In both datasets and for each trafc scenario, we generate 10000 messages at random time points, choosing each time the source – destination pair according to the weights ij. We consider

    Direct Transmission and Spray and Wait routing [5] with L

    = 2,5,10,20 copies per message. In the analytical expressions we use the tted functions f(x) and (x). In Fig. 5 we present the simulation values for the ratio R and the probability P(src.) (green/left bars), and the corresponding theoretical predictions (yellow/right bars). We consider homo- geneous and heterogenous (denoted with ) trafc scenarios in the Gowalla/Twitter (AU and SF) and Strathclyde (St) datasets. The rst observation is that in all scenarios, for heterogeneous trafc (i.e. scenarios denoted with ), the values of the metrics R and P(src.) increase, compared to the corresponding homogeneous scenarios. This shows that the relative gains of relay- assisted schemes decrease with trafc heterogeneity, as our theoretical results predict. Moreover, larger performance differences predicted by our theory, are matched by larger per- formance differences in the respective simulation scenarios as well. For example, in the SF scenarios (middle bars in Fig. 5(a) and Fig. 5(b)), the theoretical predictions for heterogeneous trafc are slightly higher than for the homogeneous case; the same holds also for the simulation results, where it can be seen that R and P(src.) do not signicantly increase with trafc heterogeneity. On the other hand, in the St scenarios (right bars in Fig. 5(a) and Fig. 5(b)), our results predict a higher difference (between heterogeneous and homogeneous cases) than before, which is also conrmed by the simulation results where the performance effects are not negligible. To further demonstrate to what extent our results can capture the effect of trafc heterogeneity in real scenarios, in Table 5 we focus on the qualitative predictions of our theory, by comparing a number of scenarios with different amounts of heterogeneity to each other, for the Gowalla/Twitter dataset11. Specically, if the simulated performance improves from one scenario to another, and so is the theoretical prediction, the prediction is assumed to be correct and denoted with X. Incorrect predictions are denoted with ×. For example, in the scenarios AU-S1 and SF-S3 the simulation values for the ratios R are R(AUS1)

    = 0.89 and R(SFS3) = 0.94, i.e. R(AUS1) < R(SFS3).

    For the theoretical predictions it holds also that R(AUS1)

    = 0.64 < R(SFS3) = 0.68 and, thus, the prediction is assumed to be correct. The elements above the diagonal refer to the ratio R, whereas the lower triangular part refers to the probability P(src.) predictions. It is evident that in the majority of the cases we consider, the theoretical results can capture the relative changes in network performance, even between different environments (i.e. between AU and SF)12. The same conclusions can be reached by the analysis in the Starthclyde dataset, in which all the respective comparisons were found to be correct X.

  2. RELATED WORK

    Useful implications for opportunistic networking have arisen from the investigation of mobility/social ties and social ties/communication trafc correlations, which have been stud- ied extensively and under different disciplines, like anthro- pology, sociology, social media or pervasive social networks [10]. For example, shown that the amount

    of exchanged communication trafc between users of OSNs depends on their social relationships. On the other hand, the communication trafc / mobility correlation has not been given similar attention. There exist only a few works [13], [14] studying it in a framework relevant to opportunistic networking. In [13], Hossmann et al. collected and analysed two datasets from online social networks (Facebook and Gowalla / Twitter), and investigated the relations among three dimensions: mobility, social ties, communication trafc. With respect to our study, they found that there is strong dependence between mobility and trafc, and, specically, node pairs that contact during the experiments duration, communicate with higher probability than the other pairs. Correspondingly, authors in [14] analysed a massive dataset of Call Detail Records (CDRs) of 6 million users and shown a positive correlation between the mobility and communication trafc patterns. Not only they shown that the higher the contact rate (ij) of a node pair is, the higher the probability that the nodes communicate intensively, but also found that information inferred by the mobility patterns can work as a good predictor for future communication events. However, despite the fact that [13],

    1. show clearly that communication trafc is heterogeneous (and correlated to mobility), to our best knowledge, its effects on communication performance have not been studied previously. Finally, with respect to our results and the insights ob- tained from them, it has already been observed that realistic mobility patterns (e.g. locality, community) can hurt the performance of Relay-Assisted routing (especially simple, random protocols [5]). However, this is a performance degradation that is due to the relays being too similar to the source (e.g. all in the same community or with com- mon characteristics). Instead, the relative performance degradation here comes due to the source and relays being too different in terms of their encounter rates with the destination.

  3. CONCLUSIONS

    Motivated by (i) recent ndings indicating heterogeneous trafc patterns in mobile social networks and (ii) the lack of related studies, in this paper, we modelled trafc heterogeneity and studied how it affects the performance in opportunistic net- working. We found that the effects can be signicant, changing our understanding of common design principles, such as the added value of relays. Despite the fact that some of our qualitative conclusions seem to be rather intuitive, they have not attracted any focus in previous studies, where performance analysis of communication schemes is conducted assuming homogeneous trafc. This indicates a necessity for revisiting the evaluation of protocols in scenarios that entail diversity in the trafc exchanged between nodes. Moreover, our results have some interesting implications about the usefulness of opportunistic networking for various applications. We believe that our study provides an initial understand- ing on the effects of trafc heterogeneity. However, trafc patterns in real networks might have much more complex

    characteristics than what can be captured by our framework, e.g. time-dependent trafc/mobility correlations. Therefore, for a more complete characterisation of trafc demands in oppor- tunistic networking (either for end-to-end or content-centric applications [2], [3]), we believe that further experimental (e.g. measurements, recognition of trafc patterns in available datasets, etc.) and analytical research is needed.

  4. REFERENCES

    1. B. Han, P. Hui, V. Kumar, M. Marathe, J. Shao, and A. Srinivasan, Mobile data ofoading through opportunistic communications and social participation, IEEE Trans. on Mob. Comp.,, vol. 11, no. 5, 2012.

    2. T. Spyropoulos, K. Psounis, and C. S. Raghavendra, Efcient routing in intermittently connected mobile networks: the single- copy case, IEEE/ACM Trans. Netw., vol. 16, no. 1, pp. 6376, Feb. 2008.

    3. E. M. Daly and M. Haahr, Social network analysis for routing in disconnected delay-tolerant manets, in Proc. ACM MobiHoc, 2007.

    4. Mobility Models for Delay Tolerant Networks(DTN):A Survey-M Shahzamal, M F Pervez , M A U Zaman and M D Hossain-August 2014

    5. Designing a human mobility model based routing protocol for Delay Tolerant Network(DTN).-Ponsundari.S, Siva Ganesh. April 2013.

Leave a Reply