File System Analysis

Published: November 21, 2012
Benchmarking File System Benchmarking: It *IS* Rocket Science
Appears in the proceedings of the 13th USENIX Workshop on Hot Topics in Operating Systems (HotOS XIII)

Vasily Tarasov, Saumitra Bhanage, and Erez Zadok
Stony Brook University

Margo Seltzer
Harvard University

Abstract
The quality of file system benchmarking has not improved in over a decade of intense research spanning hundreds of publications. Researchers repeatedly use a wide range of poorly designed benchmarks, and in most cases, develop their own ad-hoc benchmarks. Our community lacks a definition of what we want to benchmark in a file system. We propose several dimensions of file system benchmarking and review the wide range of tools and techniques in widespread use. We experimentally show that even the simplest of benchmarks can be fragile, producing performance results spanning orders of magnitude. It is our hope that this paper will spur serious debate in our community, leading to action that can improve how we evaluate our file and storage systems.

1 Introduction
Each year, the research community publishes dozens of papers proposing new or improved file and storage system solutions. Practically every such paper includes an evaluation demonstrating how good the proposed approach is on some set of benchmarks. In many cases, the benchmarks are fairly well-known and widely accepted; researchers present means, standard deviations, and other metrics to suggest some element of statistical rigor. It would seem then that the world of file system benchmarking is in good order, and we should all pat ourselves on the back and continue along with our current methodology.

We think not. We claim that file system benchmarking is actually a disaster area, full of incomplete and misleading results that make it virtually impossible to understand what system or approach to use in any particular scenario. In Section 3, we demonstrate the fragility that results when using a common file system benchmark (Filebench [10]) to answer a simple question, “How good is the random read performance of Linux file systems?” This seemingly trivial example highlights how hard it is to answer even simple questions and also how, as a community, we have come to rely on a set of common benchmarks, without really asking ourselves what we need to evaluate.

The fundamental problems are twofold. First, the accuracy of published results is questionable in other scientific areas [8], but may be even worse in ours [11, 12]. Second, we are asking an ill-defined question when we ask, “Which file system is better?” We limit our discussion here to the second point.
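To make the random-read question above concrete, here is a minimal sketch of what such a measurement can look like. It is not Filebench and not the authors' experimental setup; the function names (`random_read_ops_per_sec`, `summarize`), the 4 KiB I/O size, the read counts, and the file sizes are all assumptions chosen for illustration. The point it illustrates is that the reported number depends on parameters the question never mentions: a file that was just written usually sits in the page cache, so the same loop may be measuring memory rather than the file system's on-disk behavior.

```python
import os
import random
import statistics
import tempfile
import time


def random_read_ops_per_sec(path, io_size=4096, num_reads=2000, seed=0):
    """Time num_reads reads of io_size bytes at random aligned offsets."""
    num_blocks = os.path.getsize(path) // io_size
    if num_blocks == 0:
        raise ValueError("file is smaller than one I/O unit")
    rng = random.Random(seed)  # fixed seed: same offsets on every run
    # buffering=0 avoids Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        start = time.perf_counter()
        for _ in range(num_reads):
            f.seek(rng.randrange(num_blocks) * io_size)
            f.read(io_size)
        elapsed = time.perf_counter() - start
    return num_reads / max(elapsed, 1e-9)


def summarize(samples):
    """Report spread as well as the mean; a mean alone hides instability."""
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),
        "min": min(samples),
        "max": max(samples),
    }


def make_file(size_bytes):
    """Create a temporary file of the given size (hypothetical helper)."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size_bytes))
        return tmp.name


if __name__ == "__main__":
    # File sizes are illustrative. Both files were just written, so they are
    # very likely cached; a cold-cache run would need the cache dropped first.
    for size_mib in (1, 16):
        path = make_file(size_mib * 1024 * 1024)
        try:
            runs = [random_read_ops_per_sec(path, seed=i) for i in range(5)]
            print(size_mib, "MiB:", summarize(runs))
        finally:
            os.unlink(path)
```

Even in this toy form, the result is a function of file size, I/O size, cache state, and run count, so a single reported figure answers only the narrow configuration that produced it.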

What does it mean for one file system to be better than another? Many might immediately focus on performance, “I want the file system that is faster!” But faster under what conditions? One system might be faster for accessing many small files, while another is faster for accessing a single large file. One system might perform better than another when the data starts on disk (e.g., its on-disk layout is superior). One system might perform better on meta-data operations, while another handles data better. Given the multi-dimensional aspect of the question, we argue that the answer can never be a single number or the result of a single benchmark. Of course, we all know that, and that’s why every paper worth the time to read presents multiple benchmark results, but how many of those give the reader any help in interpreting the results to apply them to any question other than the narrow question being asked in that paper?

The benchmarks we choose should measure the aspect of the system on which the research in a paper focuses. That means that we need to understand precisely what information any given benchmark reveals. For example, many file system papers use a Linux kernel build as an evaluation metric [12]. However, on practically all modern systems, a kernel build is a CPU-bound process, so what does it mean to use it as a file system benchmark? The...