Performance Evaluation and Benchmarking for the Era of Cloud(s) by Raghunath Nambiar & Meikel Poess

Author: Raghunath Nambiar & Meikel Poess
Language: eng
Format: epub
ISBN: 9783030550240
Publisher: Springer International Publishing


6 Conclusions and Future Work

This paper presents our experiences and initial experiments using BigBench on a single-node configuration powered by Intel technologies and a relational database system, Microsoft* SQL Server*. Our initial results at the 1 TB and 3 TB data sizes demonstrate the advanced capabilities of Microsoft SQL Server 2019 (pre-release candidate) in handling the heterogeneity and volume aspects of big data, and show that even a single-node relational database configuration can scale up to big data workloads.

Given that this paper is an early study, there exist several avenues for future research. First, we are continuing to collect and analyze performance data at higher scale factors, which are even more representative of the data volume aspect of big data. Second, profiling the benchmark to assess the sensitivity of BigBench queries to the number of cores, core frequency, memory, and storage in a single-node environment is another promising direction. Similar studies have been done for cluster-based environments. Combined with those studies, these results can be used by practitioners to compare query resource requirements and processing methodology in single-node vs. multi-node configurations, and thus understand the impact of these different architectures on the performance of big data workloads. It would also be important to identify optimal platform configuration settings, since the current configuration may have been overconfigured for the scale factors considered in this study. Another interesting direction would be to expand the analysis to address multiple concurrent streams. Richins et al. [23] have done a comprehensive analysis using BigBench on a cluster-based configuration and identified thread-level parallelism as a major bottleneck. It would be worthwhile to investigate whether similar behaviour shows up on a single-node setup as well, and to drive further analysis based on the results.

Acknowledgements

We thank Harish Sukhwani, Mahmut Aktasoglu, and Hamesh Patel from Intel, and Jasraj Dange and Tri Tran from Microsoft Corporation for their constructive feedback that helped to improve the paper. We are immensely grateful to Nellie Gustafsson from Microsoft for her help in revising the machine learning queries to match the benchmark specification. We thank Arun Gurunathan, Sumit Kumar, Nellie Gustafsson, and Gary Ericson for their input on revising the section on the extensibility framework. The authors would also like to acknowledge Vaishali Paliwal and Charles Winstead from Intel Corporation for their overall support and project guidance, Sridharan Sakthivelu for the technical discussions, and Ketki Haridas for her early contributions to the work.


