SOUVIK RAY
Ph.D. Candidate in Computer Engineering
Iowa State University
Ames, IA 50010
Office: Durham 372
Phone: (515) 451-9075
E-mail: rsouvik@iastate.edu
Updates
04/07/08: I will be interning with NEC Research this summer

I am a Ph.D. student in the Department of Electrical and Computer Engineering at Iowa State University.
My adviser is Prof. Zhao Zhang.

Research Interests

Security and Privacy in Networked and Distributed Systems

Education

Ph.D., Computer Engineering Aug 2002 -- Department of Electrical and Computer Engineering
Iowa State University
Ames, IA
M.S., Computer Science May, 2002 Center for Advanced Computer Studies
University of Louisiana
Lafayette, Louisiana
B.S., Computer Science June, 1998 Department of Computer Science
Jadavpur University
Kolkata, India
Publications
ICPP'07 "Incentive-Driven P2P Anonymity System: A Game-Theoretic Approach", [Paper] [Talk]
Souvik Ray, Giora Slutzki and Zhao Zhang,
The 2007 International Conference on Parallel Processing,
Xi'an, China, September 2007.
P2P'07 "An Information-Theoretic Framework for Analyzing Leak of Privacy in Distributed Hash Tables", [Paper] [Talk]
Souvik Ray and Zhao Zhang,
Seventh IEEE International Conference on Peer-to-Peer Computing,
Galway, Ireland, September 2007.
ICCCN'07 "SCUBE: A DoS Resistant Distributed Search Protocol", [Paper] [Talk]
Souvik Ray and Zhao Zhang,
16th IEEE International Conference on Computer Communications and Networks,
Honolulu, Hawaii, August 2007.
Multiagent and Grid Systems'06 "A Light-Weight Market-based Protocol for Providing Complete Anonymity in Peer-to-Peer based Grids", [Journal Article]
Souvik Ray and Zhao Zhang,
Journal of Multiagent and Grid Systems,
Accepted January 2006.
ICDCN'04 "Heuristic-based Scheduling to Maximize Throughput of Data-intensive Grid Applications", [Paper] [Talk]
Souvik Ray and Zhao Zhang,
International Workshop on Distributed Computing (ICDCN),
Kolkata, India, December 2004.
GRID'04 "An Efficient Anonymity Protocol for Grid Computing", [Paper][Talk]
Souvik Ray and Zhao Zhang,
IEEE/ACM International Conference on Grid Computing,
Pittsburgh, PA, November 2004.
Other Articles
Research
Ph.D. Dissertation: Design and Analysis of Privacy Preserving Mechanisms for Emerging Distributed Applications
Traditionally, anonymity research has focused on hiding the identities of users involved in a communication channel (the concepts of initiator anonymity, responder anonymity and unlinkability). Another line of research has concentrated on database privacy, where randomization, data perturbation and other statistical methods are used to obfuscate user data in a database while still allowing important statistical information to be gleaned from it. Recently, the trend has shifted in the direction of wireless and sensor networks. However, there is still a broad range of networked and distributed applications that require a thorough understanding of the security and privacy issues specific to them. Emerging distributed platforms such as P2P, Grid and overlay networks are all characterized by data sharing, which raises questions such as: (a) how secure is the data?, (b) can an unauthorized entity access data or glean information about data usage?, (c) can existing privacy-preserving and/or cryptographic techniques be used, and how much overhead do they incur?, and (d) how does the design of these distributed platforms (e.g. the routing protocol) influence information leakage? My research has been motivated by these security and privacy requirements of distributed platforms and has tried to address these questions through analysis and design. At a high level, it is concerned with protecting information such as (a) which node is storing what data in a P2P system, or (b) which grid site is involved in what kind of job transactions. My research has resulted in the design of (1) scalable and robust privacy-preserving mechanisms based on cryptography and game-theoretic mechanism design, and (2) analytical tools for measuring privacy in such large-scale systems.

Data Privacy in P2P Overlays
Structured peer-to-peer overlays use distributed hash table (DHT) based routing to locate data in search and content distribution systems. Distributed hashing enables fast lookup (O(log N) steps) by storing sufficient routing information at each peer in the overlay. However, this routing information also makes it easy to map a key to its data on the overlay. In other words, fast lookup comes at the cost of leaking information about which node is storing what data. Existing studies have focused on specific DHT designs, and there is no comprehensive study of the effect of DHT design on content privacy. I have developed an analytical model using information theory to quantify the leak of such information and to compare and contrast different DHT designs (mainly the routing protocol used and the size of the routing table). The results show that ring-based routing should be used when privacy is of utmost importance. Data privacy in such DHT-based overlays also influences the reliability of data. I have tried to address data reliability for such distributed data structures through a cryptographic protocol called SCUBE. It uses secret-sharing based threshold cryptography and the concept of virtual addresses to secure the system. A key feature of this protocol is its use of secret sharing rather than traditional fault tolerance approaches such as replication, the motivation being the better security properties of secret sharing. A prototype of this system was implemented and tested on the PlanetLab testbed to determine its effectiveness under network dynamism and actual network traffic. Different adversarial attacks were artificially induced in the system. SCUBE incurs acceptable cryptographic, storage and bandwidth overhead.
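
To illustrate the secret-sharing idea mentioned above, the following is a minimal sketch of Shamir's (k, n) threshold scheme, in which any k of n shares reconstruct a secret while fewer than k reveal nothing about it. It is not the SCUBE protocol itself: the choice of Shamir's scheme, the field prime, and the names split_secret and recover_secret are assumptions made for illustration only.

```python
import secrets

# Assumed field size for illustration: the Mersenne prime 2^127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, k, n, prime=PRIME):
    """Split `secret` into n shares such that any k of them recover it."""
    if not 0 <= secret < prime:
        raise ValueError("secret must lie in [0, prime)")
    if not 1 <= k <= n < prime:
        raise ValueError("need 1 <= k <= n < prime")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover_secret(shares, prime=PRIME):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

For example, split_secret(s, 3, 5) yields five shares that could be stored on five different peers; any three of them passed to recover_secret return s, so up to two peers can fail or leave without data loss, and no two colluding peers learn the secret.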

Using Trust to Enhance Privacy
Grid computing platforms are closed in nature, and multiple bidirectional trust relationships can exist between sites (based on job transactions, etc.). Moreover, the data handled can be on the order of multiple gigabytes or terabytes (for example, a computing task in GriPhyN may request data files of multiple gigabytes from storage nodes). This makes it difficult to use the multi-hop forwarding techniques that have traditionally been used to achieve anonymity. I have tried to exploit this inherent trust in designing a 2-hop forwarding-based anonymity protocol for grid computing. While retaining the forwarding-based approach of traditional anonymity systems, this design achieves controlled anonymity while incurring minimal overhead. The information-theoretic metric of entropy was used to quantify the degree of anonymity that can be achieved under different situations, adversarial conditions and grid sizes. The efficacy of the design was tested through extensive simulations using a grid simulator and also by implementing it as a module in Globus.
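
As a rough illustration of how entropy can quantify anonymity, the sketch below computes the Shannon entropy of an adversary's probability distribution over candidate initiators and normalizes it by the maximum entropy log2(N). The normalization and the function names are assumptions made here; the exact formulation used in the published work may differ.

```python
import math


def anonymity_entropy(probs):
    """Shannon entropy (in bits) of the adversary's distribution over suspects."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing to the entropy.
    return -sum(p * math.log2(p) for p in probs if p > 0)


def degree_of_anonymity(probs):
    """Entropy normalized to [0, 1]: 1 = uniform suspicion, 0 = fully identified."""
    n = len(probs)
    if n <= 1:
        return 0.0
    return anonymity_entropy(probs) / math.log2(n)
```

With four equally likely sites the degree is 1; if the adversary can assign all probability to a single site it drops to 0.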

Game-theoretic perspective to Anonymity
I have also studied traditional anonymity issues for systems built on top of P2P infrastructures. Anonymous communication systems built on P2P infrastructures are affected by the churn problem, that is, frequent joins and departures of nodes (in other words, free-riding). Churn weakens anonymity by reducing the size of the anonymity set; moreover, because paths must be re-formed more frequently, the probability of intersection attacks increases. I have proposed an incentive-based forwarding mechanism to induce peers to participate in the anonymity substrate. The design of the incentive mechanism is based on game-theoretic principles. The crux of the mechanism is that the forwarding decisions made by the participating peers are aligned with the anonymity requirements of the system. The efficacy of the mechanism was studied using extensive simulations. A similar strategy, based on game-theoretic auction mechanisms, was also proposed in a joint work (Multiagent and Grid Systems).


Other Projects
Seamless Data Management for storage devices
Data management for home storage systems is especially challenging because different storage devices such as a PDA, laptop, PC, DVR, and PVR can come from different vendors and therefore have different communication stacks, operating systems, etc. As an example use case, consider an efficient way of managing digital pictures. Currently, we manually transfer files from a DVR to, say, a PC using a portable storage device, flash memory, etc. The real challenge is to make this entire process seamless, in other words, very user-friendly. I spent some time researching issues in seamless data management for such home storage networks while interning at Seagate Research. While traditional approaches have focused on client-server based distributed file systems, I proposed a P2P approach for managing data [2]. I developed a Universal Plug and Play (UPnP) based solution for device-to-device communication in which each device acts both as a control point and as a server. Moreover, UPnP is built on top of the TCP/IP stack, which more or less solves the device incompatibility problem for a network of heterogeneous devices.

Data management in Grids
Data grids are characterized by gigabytes and terabytes of data transfer between nodes, where a job requires both computational power and access to large amounts of data in the form of data files. Thus, an important scheduling question is whether to send a job to a remote site that has the data but low computing power, or to fetch the data from the remote site and carry out the job locally. I have tried to address such scheduling questions by proposing a scheduling heuristic (Maximum Residual Resource, or MRR) which performs better than the Min-min and Max-min heuristics [5]. The effectiveness of the heuristic has been tested on realistic grid workloads.
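
For context, the Min-min baseline mentioned above can be sketched as follows. This is only the standard textbook form of Min-min, not the MRR heuristic, and it ignores data-transfer costs; the function name and the `etc` matrix layout (expected time to compute task t on machine m) are assumptions for illustration.

```python
def min_min_schedule(etc):
    """Min-min scheduling.

    etc[t][m] is the expected execution time of task t on machine m.
    Returns (assignment, makespan), where assignment maps task -> machine.
    """
    if not etc:
        return {}, 0.0
    num_machines = len(etc[0])
    ready = [0.0] * num_machines          # time at which each machine is free
    unscheduled = set(range(len(etc)))
    assignment = {}
    while unscheduled:
        best = None                       # (completion_time, task, machine)
        # Picking the smallest completion time over all (task, machine) pairs
        # is equivalent to choosing the task whose minimum completion time
        # is smallest, which is the Min-min rule.
        for t in sorted(unscheduled):
            for m in range(num_machines):
                completion = ready[m] + etc[t][m]
                if best is None or completion < best[0]:
                    best = (completion, t, m)
        completion, t, m = best
        assignment[t] = m
        ready[m] = completion
        unscheduled.remove(t)
    return assignment, max(ready)
```

Max-min differs only in the selection step: among the per-task minimum completion times, it schedules the task whose minimum is largest, so long tasks are placed first.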