Critical Evaluation of P2P File Sharing Application
The popularity of peer-to-peer file sharing applications such as Gnutella and Napster has prompted a great deal of research. However, there is still disagreement on the exact definition of a peer-to-peer system, since such systems lack a common, centralized, and dedicated infrastructure.
A typical peer-to-peer file system relies on the collaboration of its active peers. The peers interact with one another to share resources and thus act as a virtual infrastructure. Because the participation of a peer is not fixed, it is very difficult to determine the exact architecture of such a system. This ad hoc and dynamic nature also makes it very difficult to find a fixed mechanism for providing services to users. One of the most common challenges is arranging the peers so that any peer can retrieve any of the data easily and efficiently.
Since each peer has different characteristics, it is important to understand these characteristics and take them into account when evaluating the performance of a given peer-to-peer system. Important characteristics include bandwidth, latency, and bottleneck network connections. A peer with low bandwidth and high latency should not be assigned a large or popular portion of the distributed index; otherwise, that portion of the index could become unavailable to other peers.
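The rule about not giving slow peers a popular portion of the index can be sketched in a few lines of Python. This is only an illustration: the `Peer` record, the function names, and both threshold values are assumptions made for the example, not part of Gnutella, Napster, or any measured system.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    bandwidth_kbps: float  # bottleneck bandwidth to the Internet
    latency_ms: float      # IP-level latency to the peer


def eligible_for_index(peer, min_bandwidth_kbps=1000.0, max_latency_ms=200.0):
    """Return True if the peer meets both (hypothetical) thresholds."""
    return (peer.bandwidth_kbps >= min_bandwidth_kbps
            and peer.latency_ms <= max_latency_ms)


def pick_index_hosts(peers, count):
    """Pick up to `count` eligible peers, highest bandwidth first.

    Ties in bandwidth are broken by lower latency.
    """
    candidates = [p for p in peers if eligible_for_index(p)]
    candidates.sort(key=lambda p: (-p.bandwidth_kbps, p.latency_ms))
    return candidates[:count]
```

With these assumed thresholds, a peer on a 56 kbps modem link would never be chosen, while peers on faster links would be ranked by bandwidth.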
In the same way, the length of time a peer remains connected to the Internet is equally important, because it determines the availability of data and index metadata. One should therefore check whether a peer is able to perform a given task before assigning the task to it. Many peer-to-peer file system architectures have been evaluated without such considerations because too little is known about the hosts that participate in these systems. Several studies have measured the average number of files a peer shares. Napster and Gnutella, however, are the two most popular peer-to-peer file sharing systems and are the ones considered for detailed measurement.
Such systems consist of various hosts, mostly end-user machines in homes or offices. Since these machines are usually connected to the Internet, they form an essential part of the system.
The measurement study characterizes the end-user hosts that take part in these two systems. The parameters on which the hosts are characterized include the following.
- Bottleneck bandwidths between the hosts and the Internet.
- IP-level latencies for sending packets to the hosts.
- Frequency of connection and disconnection between the host and the system.
- The number of files shared by the host.
- The number of files downloaded by the host.
- Correlations between these characteristics.
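As a rough sketch of how these parameters could be recorded and correlated, the Python below defines one record per host and a plain Pearson correlation. The `HostMeasurement` class and its field names are hypothetical and chosen only for this illustration; the source does not say which correlation measure the study used, so Pearson is an assumption as well.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class HostMeasurement:
    # Field names are illustrative, one per parameter in the list above.
    bottleneck_kbps: float
    latency_ms: float
    connects_per_day: float
    files_shared: int
    files_downloaded: int


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

Given a list `hosts` of such records, a call like `pearson([h.bottleneck_kbps for h in hosts], [h.files_shared for h in hosts])` would estimate whether faster hosts tend to share more files.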
The measurement results showed a great deal of heterogeneity among the peers in both Gnutella and Napster. Bandwidth, latency, degree of sharing, and availability each varied by three to five orders of magnitude across peers. This is sufficient to conclude that a peer-to-peer system must be very careful about how it delegates responsibilities across peers.
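To make the notion of variation in orders of magnitude concrete, the spread of a measured quantity can be expressed as the base-10 logarithm of the ratio between its largest and smallest values. The helper below is a minimal sketch; the function name is an assumption, and ignoring non-positive values is a choice made for this example only.

```python
import math


def orders_of_magnitude(values):
    """Return log10(max / min) over the positive values."""
    positive = [v for v in values if v > 0]
    if not positive:
        raise ValueError("need at least one positive value")
    return math.log10(max(positive) / min(positive))
```

For instance, bandwidths ranging from 10 kbps to 100,000 kbps span a ratio of 10,000, which is four orders of magnitude.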
Peers also sometimes deliberately report wrong information. Accurate information, however, is required in order to assign responsibilities to a peer. This implies that future peer-to-peer systems should have a built-in provision for peers to report correct information, and the system should be able to verify that information.
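One simple way to verify a reported value is to compare it with a value the system measures itself and to fall back on the measurement when the two disagree. The sketch below assumes a bandwidth report; the function names and the 50% tolerance are hypothetical and not taken from any existing system.

```python
def is_report_plausible(reported_kbps, measured_kbps, tolerance=0.5):
    """True if the report is within `tolerance` (a fraction) of the measurement."""
    if measured_kbps <= 0:
        return False  # nothing trustworthy to compare against
    return abs(reported_kbps - measured_kbps) <= tolerance * measured_kbps


def trusted_bandwidth(reported_kbps, measured_kbps, tolerance=0.5):
    """Use the peer's report only when the measurement supports it."""
    if is_report_plausible(reported_kbps, measured_kbps, tolerance):
        return reported_kbps
    return measured_kbps
```

A peer that claims 10,000 kbps while the system measures 56 kbps would be treated as a 56 kbps peer when responsibilities are assigned.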