Charles University, Prague, Czech Republic
Faculty of Mathematics and Physics

Master Thesis

Petr Novák

Network Repository for Performance Evaluation Results

Department of Software Engineering
Supervisor: Doc. Ing. Petr Tůma, Dr.
Computer Science Program, Software Systems

I would like to thank my supervisor, Petr Tůma, for his encouraging guidance, support and patience. I would also like to thank my friends Kristina and Luboš for proofreading the thesis.

I declare that I have written this master thesis myself, using only the referenced sources. I give my consent to lending the thesis.

Prague, April 17, 2008
Petr Novák

Table of Contents

1 Introduction
  1.1 Content
2 Mission (How Should the Repository Look)
  2.1 Use Cases
    Home Brew Fiddling
    Serious Experiment
    Automated Framework
  2.2 Requirements
    2.2.1 Functional Requirements
    2.2.2 Non-functional Requirements
3 Related Work
  3.1 Benchmarks
4 The Solution
  Data Storage
  Data Model
  Parsing Individual Results
  Formats
  Metadata
  Data Aggregation
  Executing Queries
  Plots
  Report Generation
  Repository Users
  External Interface
5 Implementation Details
  Repository Core
  Data Plugins
  Query Processing
  Plots
  Custom Views
  Task Processor
  XML Import/Export
  Web Interface
  Web Service API
  Security
6 Evaluation
7 Conclusion
Bibliography
Appendix A: User Manual
Appendix B: CD-ROM Contents

Abstrakt

Title: Síťové úložiště pro výsledky výkonových testů (Network Repository for Performance Evaluation Results)
Author: Petr Novák
Department: Department of Software Engineering
Supervisor: Doc. Ing. Petr Tůma, Dr.
Supervisor's address:
Abstract: Performance tests of software systems produce large amounts of data that need to be stored and evaluated. The goal of this thesis is to design and implement a repository for these results that also provides functions for their processing and presentation. Part of the thesis is an implementation of the repository, realized as a web application and capable of storing and evaluating various kinds of results, including plain-text and XML formats. The presentation functions include creating plots and generating web pages that describe the measured values.
Keywords: web application, benchmark, result repository, plots, presentation

Abstract

Title: Network Repository for Performance Evaluation Results
Author: Petr Novák
Department: Department of Software Engineering
Supervisor: Doc. Ing. Petr Tůma, Dr.
Supervisor's address:
Abstract: Benchmarks of software systems produce large amounts of data that need to be stored and processed. The goal of the thesis is to design and implement a repository for these results, providing also functions for their presentation. The repository implementation, which is part of this thesis, works as a web application and supports parsing of various result formats, including plain text and XML. The presentation functions include generation of plots as well as HTML pages describing the extracted values.
Keywords: web application, benchmark, result repository, plots, presentation

1 Introduction

Many contemporary software applications rely on middleware components such as CORBA or SOAP libraries. There are many projects targeted at evaluating the performance of such middleware (an overview of the projects is available in [1]). However, these efforts are still mostly fragmented, especially when considering result presentation, as can be illustrated by the list of individual middleware benchmarking projects:

- RUBiS is a prototype of an auction site, used for performance benchmarking [2]. Benchmark results are available on the website. However, there is no way to submit new measurements. The pages appear to be created manually and no new results are being added.
- Xampler is a CORBA benchmarking suite [3], providing an occasionally updated website with results. Still, the update requires manual intervention and other people cannot submit their results.

- Sampler (a simplified CORBA benchmarking suite [4]) has a full result repository, including an option for submitting new results, coded specifically for this benchmark.

- TAO Performance Scoreboard [5] monitors the performance of the TAO CORBA implementation. Results from daily builds show up regularly using a framework coded specifically for this benchmark; no third-party submissions are possible.

- The SPEC result database [6] does not parse the results except for summary data; the rest of the information is in the original report generated by the benchmark application. Result submission is paid for.

- The TPC (Transaction Processing Performance Council [7]) database collects performance results of entire server configurations. As with the SPEC database, result submission is paid.

It appears that, with the exception of a few specifically designed repositories, there is no open platform for publishing (and eventually comparing or otherwise processing) experiment results. This leads to extra work for benchmark authors, who have to create their own tools for their specific results. The lack of a results repository is also pointed out by Brebner et al. [1], who note that an open database would allow extracting different information from one set of results for use in different research projects. Additionally, having a result database makes it possible to reference measured results, and the possibility of uploading new results can further confirm published conclusions.

For these reasons, we want a repository that is capable of storing a multitude of benchmarking data from experiments executed at various locations. The repository should also be capable of parsing and presenting common result formats. Still, on the parsing side we can assume that a certain level of cooperation from benchmark authors can be obtained; that is, we do not need to parse absolutely everything, but we do need to parse the output that the benchmark authors are easily able to produce.

1.1 Content

The following chapter describes detailed requirements on the results repository. In Chapter 3, existing related work is discussed. Chapter 4 contains an explanation of the problems presented by the requirements and the approaches I chose to solve them, while Chapter 5 describes the repository implementation in further detail. Chapter 6 analyses the implementation with respect to the requirements; the main text of the thesis is summarized in Chapter 7. Appendix A contains the User's Manual of the application, which is supplied on an enclosed CD-ROM, together with an off-line browsable snapshot of the web interface. The detailed content of the CD is listed in Appendix B.

2 Mission (How Should the Repository Look)

The purpose of this work is to design and implement a repository that would allow storing and presenting various benchmark results. The repository has to be accessible through a web interface; parts of the functionality that are important to third-party applications will be published through a web service API. In the scope of this work, a test means one type of benchmark with one format of results, and a test instance is a set of data produced by a third-party application and uploaded into the repository (i.e. an actual benchmark result).
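To make the test / test instance terminology concrete, the following is a minimal illustrative sketch in Python (the language used for the implementation); the class and field names are hypothetical and do not come from the repository's actual data model.

# Hypothetical sketch of the "test" / "test instance" terminology.
# Names and fields are illustrative only, not the repository's real data model.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Test:
    """One type of benchmark with one format of results."""
    name: str                      # e.g. "Xampler"
    result_format: str             # identifier of the parser for this format
    plot_definitions: List[str] = field(default_factory=list)


@dataclass
class TestInstance:
    """One set of uploaded data, i.e. an actual benchmark result."""
    test: Test
    original_files: List[str]      # paths of the uploaded result files
    metadata: Dict[str, str] = field(default_factory=dict)  # configuration, contact info, ...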
2.1 Use Cases

This section contains use cases explaining typical uses of the repository, which were provided by the thesis supervisor in the initial stages of design. I chose to include them in this work because I think they provide an overview of the application's purpose in a slightly more accessible way than the requirements I have drawn from them.

Home Brew Fiddling

I want to test a few things with a benchmark I have just devised. The benchmark produces a trivial textual output that I have just thought nice, sometimes intermingled with debugging messages, sometimes not. The output is in two files for each experiment. Each experiment also has a plain English description of whatever configuration it was run under. After I have described the format of the results to the repository, I take the results and upload them to the repository using a simple web browser form. After I do this, I can ask the repository to plot the results in various graphs, interactively via a web form again. I can plot results from multiple experiments against each other, and filter experiments by regular expressions matched against the configuration description. I might also want to extract the original files from the repository in a bundle, just in case I happen to lose the originals.

Functional Requirements

- Parse results from text files
- Store results that span across multiple files
- Upload results via a web-browser form
- Define plots via web forms
- Compare results from multiple experiments
- Filter experiments (by matching regular expressions on supplied metadata)
- Extract original result files

Serious Experiment

I get a larger benchmark experiment running. The results are again text files, but this time I have no control over the content, for I use somebody else's benchmark. The files are rather large, a couple of megabytes, and stored across a few directories. I want to upload the results in an archive rather than file by file, because that is easier. I want to use a description of the result format that came with the benchmark to make it interoperable with my repository, and just upload that rather than filling in forms. The description would also contain some useful graphs to create based on the data, but I can still devise additional ways to plot the data using the web form. I can provide pointers to the data to other people. The public would not be able to upload data under the same experiment header, but they would be able to browse my results pretty much in the same way as I do. I can also define a summary page that contains brief information of my choice, and an arbitrary number of detail pages that I can use to point out something in the data and send a link to the page to other people.

Functional Requirements

- Upload multi-file results in one archive
- Import test type descriptions (together with plot definitions)
- Create a public browsable presentation, define an arbitrary number of pages describing the results

Automated Framework

Pretty much the same as the Serious Experiment use case, except I want to whip up a few tools that would process the data. The tools would be in multiple programming languages and would use interfaces typical for those languages, such as RMI or CORBA, to access the repository results. The tools would be able to extend the results, for example by calculating some additional statistics, which would be stored back in the repository, so that the tool can reuse them if needed (a sketch of such a tool follows after the requirements below).

Functional Requirements

- Interface for third-party applications to access the data
- Space for additional data uploaded to the results by third-party applications
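As a rough illustration of the Automated Framework use case, the following sketch shows how a third-party Python tool might talk to the repository over a web service API. The endpoint URL, the method names and their parameters are purely hypothetical assumptions, not the API actually offered by the repository.

# Hypothetical third-party tool extending stored results with a computed statistic.
# The server URL and the method names are illustrative assumptions, not the real API.
import statistics
import xmlrpc.client

repository = xmlrpc.client.ServerProxy("http://repository.example.org/api")

# Fetch the measured samples of one test instance (hypothetical call).
samples = repository.get_values(42, "ThruputSolo")

# Compute an additional statistic locally ...
median = statistics.median(samples)

# ... and store it back, so that the tool (or others) can reuse it later.
repository.store_value(42, "ThruputMedian", median)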
2.2 Requirements

What follows is a more detailed summary of the requirements provided by the expected use cases.

2.2.1 Functional Requirements

FR1 Result Formats

The repository has to be able to parse multiple result formats, including plain-text files (with syntax similar to CSV, or formats with "measurement type = result" entries). In addition, XML-based result formats have to be supported. The resulting repository has to be able to parse results from the Sampler, Xampler, RUBiS, ECperf, TAO and OVM benchmarks (per thesis assignment). The application has to accept test results that consist of multiple files; this requirement is also forced by the result formats of the RUBiS, ECperf and OVM benchmarks. The set of supported result formats should not be closed, but instead remain flexible enough to accommodate new formats that may appear.

FR2 Configuration Information

The application has to provide means to collect and store metadata together with uploaded results. These data can either be extracted from the results (configuration information) or entered by the uploader (contact information, or the configuration in case it is not provided in the results).

FR3 Plots

The repository has to provide customizable support for multiple display formats, including box-and-whisker plots, density plots and history plots. At least the Sampler, Xampler, TAO and OVM display formats need to be supported.

FR4 Web Interface

Repository functions, including result upload and definition of plots, need to be accessible from the web interface. When uploading multi-file results, an option for uploading all the files in one archive (for example tar) has to be supported.

FR5 Presentation Functions

The web interface needs to be able to provide a browsable public presentation of results, including customizable summary pages and an arbitrary number of detail pages further describing the results. The presentation has to be able to provide comparisons of results from multiple experiments. The user has to be able to search the results, at least by filtering the experiments by regular expressions (matched on metadata belonging to the experiments); a brief filtering sketch is given at the end of this section.

FR6 Availability of Original Data

The repository has to provide options for downloading the original result data.

FR7 Import/Export of Test Descriptions

In addition to defining the test format and presentation (including public pages and plots) by web forms, the repository has to provide means to upload this description from an external file. Also, exporting an already defined test description into the same format should be supported.

FR8 Interface for Other Applications

The repository needs to provide an interface for other applications to access the data. In addition to uploading new results, third-party applications have to be able to store additional (computed) data to already existing results and retrieve it afterwards.

2.2.2 Non-functional Requirements

NFR1 Platform

The product will run on the Linux operating system and will not use a relational database for critical data (per assignment).

NFR2 Performance

The application's web interface has to be reasonably responsive (within seconds at most). Result parsing may be done in the background, but should not take more than a few minutes, and the rest of the interface still needs to be usable.

NFR3 Documentation

The product will be accompanied by both administrator and user documentation, describing the installation of the repository application and giving instructions for how to set up a repository for benchmark results.
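To illustrate the filtering demanded by FR5, here is a minimal sketch that assumes experiment metadata is available as simple key/value dictionaries; the function and field names are illustrative only, not the repository's actual code.

# Illustrative sketch of filtering experiments by regular expressions on their metadata (FR5).
# The experiment structure and field names are assumptions, not the repository's real model.
import re
from typing import Dict, List


def filter_experiments(experiments: List[Dict[str, str]],
                       pattern: str,
                       metadata_key: str = "configuration") -> List[Dict[str, str]]:
    """Return the experiments whose metadata value matches the given regular expression."""
    regex = re.compile(pattern)
    return [e for e in experiments if regex.search(e.get(metadata_key, ""))]


# Example: select experiments whose configuration description mentions two client threads.
experiments = [
    {"name": "run-1", "configuration": "2 threads, warm cache"},
    {"name": "run-2", "configuration": "8 threads, cold cache"},
]
print(filter_experiments(experiments, r"2 threads"))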
3 Related Work

Apart from the commercial databases mentioned in Chapter 1, there already exists an open result repository implementation in Java [8], providing a web service API. Unfortunately, when familiarizing myself with said implementation, I encountered issues with Enterprise Java portability which made the repository nontrivial to install. Also, I believe that relying on a dynamic scripting language such as Python opens new options for implementing some of the repository features which are difficult to achieve in Java, which is why I have decided to abandon the original implementation entirely. This has the obvious drawback of potentially adding more work, but, as I outline in the feature comparison in the concluding section, the result is at least as functional as the original implementation.

3.1 Benchmarks

Due to the nature of the work, which involves parsing benchmark results, middleware benchmark applications should also be considered related to the repository. Although the following list is based on the thesis assignment and is thus not exhaustive, I believe that it covers most of the result formats that the product will be required to handle.

Sampler

Sampler [4] produces one XML file with a description of the configuration, followed by sets of measurement results.

<?xml version="1.0"?>
<Results VersionMajor="1" VersionMinor="1">
  <Configuration Uniquifier=""></Configuration>
  <Client VersionMajor="1" VersionMinor="13">
    BenchmarkClient 1.13 [--/--/--]
    <Identity>
      <Processor Vendor="Unknown" Family="Sparc" Model="Sparcv9" Clock="360" Number="2">
        2 CPU Sparc (2 online)
        0: CPU sparcv9 FPU sparcv9 360 MHz online
        2: CPU sparcv9 FPU sparcv9 360 MHz online
      </Processor>
      <Memory PhysicalFree="" PhysicalTotal="" VirtualFree="" VirtualTotal=""></Memory>
      <Timer Scale="" Granularity="444"></Timer>
      <System Vendor="Sun" Family="Unix" Model="Solaris" VersionMajor="5" VersionMinor="7">
        SunOS 5.7 Generic_
      </System>
      <Compiler Vendor="Opensource" Family="C++" Model="GNU C++" VersionMajor="2" VersionMinor="95" Optimized="Yes">
        GNU C (release)
      </Compiler>
      (snipped)
  </Client>
  <Session>
    <Benchmark Type="Instances Parallel Sequence In">
      <Measurement Size="0" Count="1" Simul="1">
        <Sample LoadBefore="0" LoadAfter="2"></Sample>
        <Sample LoadBefore="0" LoadAfter="1"></Sample>
        <Sample LoadBefore="0" LoadAfter="1"></Sample>
        <Sample LoadBefore="0" LoadAfter="3"></Sample>
        <Sample LoadBefore="0" LoadAfter="2"></Sample>
      </Measurement>
    </Benchmark>
    (snipped)

Illustration 1: Example of Sampler output

Xampler

Xampler [3] outputs a set of text files, where one file contains the result of one measurement. The files contain a series of annotated values, which can be either a string or a set of integers.

[BENCHMARK]
[STR] Version Xampler post 1.10 post 04/02/02
[STR] Suite Invocation Static Client
[CONFIGURATION]
[1] Scale
[1] Granularity 1
[MEASUREMENT]
[1] Samples 1
[1] Threads 1
[SAMPLE]
[3] MemoryApplicationResident
[3] MemoryApplicationSwapped
[3] MemoryKernelUsed
[3] MemoryTotalPhysicalUsed
[3] MemoryTotalPhysicalFree
[3] MemoryTotalSwapUsed
[3] NetworkBytesSent
[3] NetworkBytesReceived
[3] NetworkPacketsSent
[3] NetworkPacketsReceived
[3] ProcessorApplicationThreads
[3] ProcessorTotalKernel
[3] ProcessorTotalWait
[3] ProcessorTotalUser
[3] ProcessorTotalIdle
[1] ThruputSolo 4153
[1] ThruputAverage 4153
[3] HalfwayAverage
[5000] HalfwaySolo (...)

Illustration 2: Example of Xampler output
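To indicate how such annotated values might be handled, here is a minimal parsing sketch; it is an illustration only, not the parser actually used by the repository, and it simply maps each annotated name to its string or integer values.

# Minimal sketch of parsing Xampler-style annotated values; illustrative only,
# not the repository's actual parser.
from typing import Dict, List, Union

Value = Union[str, List[int]]


def parse_xampler(text: str) -> Dict[str, Value]:
    """Parse lines such as '[STR] Version Xampler post 1.10' or '[1] ThruputSolo 4153'."""
    values: Dict[str, Value] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue
        annotation, _, rest = line.partition("]")
        annotation = annotation.lstrip("[").strip()
        fields = rest.split()
        if not fields:
            continue  # section markers such as [BENCHMARK] carry no value
        name, raw = fields[0], fields[1:]
        if annotation == "STR":
            values[name] = " ".join(raw)          # a string value
        else:
            values[name] = [int(v) for v in raw]  # a set of integers
    return values


print(parse_xampler("[1] ThruputSolo 4153\n[STR] Suite Invocation Static Client"))
# -> {'ThruputSolo': [4153], 'Suite': 'Invocation Static Client'}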
RUBiS

RUBiS [2] benchmarks produce human-readable HTML, including plots. The role of the repository in this case is limited to parsing summary information and generating overview pages, linking to the already-generated HTML.

<br><a NAME="down_stat"></A><br>
<h3>down ramp statistics</h3>
<p>
<TABLE BORDER=1>
<THEAD>
<TR><TH>State name<th>% of total<th>count<th>errors<th>minimum Time<TH>Maximum Time<TH>Average Time
<TBODY>
<TR><TD><div align=left>home</div><td><div align=right>0.57 %</div><td><div align=right>19</div>
  <td><div align=right>0</div><td><div align=right>4 ms</div><td><div align=right>1358 ms</div>
  <td><div align=right>10 ms</div>
<TR><TD><div align=left>browse</div><td><div align=right>1.11 %</div><td><div align=right>37</div>
  <td><div align=right>0</div><td><div align=right>3 ms</div><td><div align=right>1762 ms</div>
  <td><div align=right>17 ms</div>
(snipped)
<TR><TD><div align=left>end of Session</div><TD><div align=right>0.18 %</div><td><div align=right>6</div>
  <td><div align=right>0</div><td><div align=right>0 ms</div><td><div align=right>0 ms</div>
  <td><div align=right>0 ms</div>
<TR><TD><div align=left><b>total</b></div><td><div align=right><b>100 %</B></div><TD><div align=right><b>3347</b></div>
  <td><div align=right><b>0</b></div><td><div align=center>-</div><td><div align=center></div>
  <td><div align=right><b>1 ms</b></div>
<tr><td><div align=left><b>average throughput</b></div><td colspan=6><div align=center><b>54 req/s</b></div>
<TR><TD><div align=left>completed sessions</div><td colspan=6><div align=left>5</div>
<TR><TD><div align=left>total time</div><td colspan=6><div align=left>4457 seconds</div>
<TR><TD><div align=left><b>average session time</b></div><td colspan=6><div align=left><b>891.4 seconds</b></div>
</TABLE>
<p>

Illustration 3: Part of RUBiS Performance Report

TAO

TAO (The ACE ORB) [9] is an open-source CORBA implementation. Its included benchmarks do not save any files by default, but produce plain text output with the measured values, which can be piped into files and then possibly parsed.

~/ACE/TAO/performance-tests/Latency/Single_Threaded $ ./run_test.pl
================ Single-threaded Latency Test
server ( ): user is not superuser, test runs in time-shared class
client ( ): user is not superuser, test runs in time-shared class
test finished
High resolution timer calibration...done
Total latency : 91[45765]/108/34042[200634] (min/avg/max)
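As with the other text formats, the summary line of this TAO output could be extracted with a small amount of parsing. The following is a minimal sketch, assuming only the "Total latency" line format shown above; it is not the repository's actual TAO parser.

# Minimal sketch of extracting min/avg/max latency from the TAO summary line shown above;
# illustrative only, not the repository's actual TAO parser.
import re

LATENCY_LINE = re.compile(
    r"Total latency\s*:\s*(\d+)(?:\[\d+\])?/(\d+)/(\d+)(?:\[\d+\])?\s*\(min/avg/max\)"
)


def parse_tao_latency(output: str):
    """Return (min, avg, max) latency values, or None if the summary line is missing."""
    match = LATENCY_LINE.search(output)
    if match is None:
        return None
    return tuple(int(v) for v in match.groups())


print(parse_tao_latency("Total latency : 91[45765]/108/34042[200634] (min/avg/max)"))
# -> (91, 108, 34042)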