Fred Douglis
IBM T.J. Watson Research Center
Friday, February 13, 2004, 11:00am - 12:00pm
LC 102, Brooklyn Campus, Polytechnic University
Ongoing advancements in technology lead to ever-increasing storage capacities. In spite of this, optimizing storage usage can still provide rich dividends. Several techniques based on delta-encoding and duplicate block suppression have been shown to reduce storage overheads, with varying requirements for resources such as computation and memory. We propose a new scheme for storage reduction that reduces data sizes with an effectiveness comparable to the more expensive techniques, but at a cost comparable to the faster but less effective ones. The scheme, called Redundancy Elimination at the Block Level (REBL), leverages the benefits of compression, duplicate block suppression, and delta-encoding to eliminate a broad spectrum of redundant data in a scalable and efficient manner. REBL also uses super-fingerprints, a technique that reduces the data needed to identify similar blocks and therefore the computational requirements of this process. As a result, REBL encodes more compactly than compression and duplicate suppression while executing faster than generic delta-encoding. For the data sets analyzed, REBL improved on the space reduction of other techniques by factors of 4-23 in the best case.
This is joint work with Purushottam Kulkarni, Jason LaVoie, and John M. Tracey.
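To make two of the ideas in the abstract concrete, the following is a minimal Python sketch of duplicate block suppression and of super-fingerprints as a way to spot similar blocks. It is not the REBL implementation: the fixed 4 KB block size, the use of SHA-1 and CRC32, the window and group sizes, and all function names (`split_blocks`, `suppress_duplicates`, `features`, `super_fingerprints`) are assumptions chosen for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut the input into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def suppress_duplicates(blocks):
    """Store each distinct block once, keyed by its SHA-1 digest.

    Returns (store, recipe): `store` maps digest -> block contents and
    `recipe` lists the digests needed to rebuild the original data in order.
    """
    store = {}
    recipe = []
    for block in blocks:
        digest = hashlib.sha1(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy
        recipe.append(digest)
    return store, recipe


def features(block: bytes, window: int = 16, count: int = 8) -> list:
    """Compute `count` min-wise features of a block.

    Feature k is the minimum CRC32 (seeded with k) over all sliding windows,
    so blocks that share most of their content tend to share most features.
    """
    starts = range(max(1, len(block) - window + 1))
    return [
        min(zlib.crc32(block[i:i + window], seed) for i in starts)
        for seed in range(count)
    ]


def super_fingerprints(block: bytes, group_size: int = 4) -> list:
    """Hash groups of features into a few super-fingerprints.

    Two blocks that share a super-fingerprint agree on a whole group of
    features, which makes them likely candidates for delta-encoding.
    """
    feats = features(block)
    result = []
    for i in range(0, len(feats), group_size):
        packed = b"".join(f.to_bytes(4, "big") for f in feats[i:i + group_size])
        result.append(hashlib.sha1(packed).hexdigest())
    return result
```

In a scheme along these lines, exact duplicates are caught by the digest lookup, while the much smaller set of super-fingerprints is compared to find similar (not identical) blocks. Blocks that match neither way would simply be compressed on their own.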
Biography:
Dr. Fred Douglis is a Research Staff Member at IBM Research in
Hawthorne, NY. His research interests include data reduction techniques,
internet service performance, internet tools, load sharing, and file
systems. Before joining IBM, he was with AT&T Labs--Research for many
years, where he most recently headed the Distributed Systems Research
department. Fred Douglis was the founding chair of the IEEE Computer
Society's Technical Committee on the Internet (TCI), and is also a past
chair of the Computer Society's Technical Committee on Operating
Systems (TCOS). In addition to serving on numerous program committees,
he serves on the editorial board of IEEE Internet Computing and was the
program chair or vice-chair of many conferences including WWW in 2003,
2002 and 1999, USENIX Symposium on Internet Technologies and Systems
(1999), the 1998 USENIX Technical Conference, Symposium on Applications
and the Internet (SAINT) in 2001, and the Web Caching Workshop in 2003.
He holds a PhD from Berkeley.
For further information or to meet with the speaker, please contact Torsten Suel at suel@poly.edu.