Benchmark catalogue
-
BigBench V2
The BigBench V2 benchmark addresses some of the limitations of the BigBench (TPCx-BB) benchmark. BigBench V2 departs from TPC-DS by using a simple data model. The new data model retains the same variety of structured, semi-structured, and unstructured data as the original BigBench data model. The differe...
-
HiBench
A comprehensive benchmark suite consisting of multiple workloads including both synthetic micro-benchmarks and real-world applications. HiBench features several ready-to-use benchmarks from 4 categories: micro benchmarks, Web search, Machine Learning, and HDFS benchmarks. It is used for both stream...
-
owperf (CLASS)
This test tool benchmarks an OpenWhisk deployment for (warm) latency and throughput, with several new capabilities: measuring the performance of rules (trigger-to-action) in addition to actions; deeper profiling without instrumentation (e.g., Kamino) by leveraging the activation records in addition to...
-
Yahoo Streaming Benchmark (YSB)
It is an end-to-end pipeline that simulates a real-world advertisement analytics pipeline. Currently implemented in Kafka, Storm, Spark, Flink and Redis. Yahoo reported the following as background of why they developed YSB: "At Yahoo we have adopted Apache Storm as our stream processing p...
-
Yahoo! Cloud Serving Benchmark (YCSB)
A benchmark designed to compare emerging cloud serving systems like Cassandra, HBase, MongoDB, Riak and many more, which do not support ACID. It provides a core package of 6 pre-defined workloads A-F, which simulate a cloud OLTP application. Web references https://github.com/brianfrankcoope...
-
ABench
ABench is a big data architecture stack benchmark. It aims to evaluate big data systems across multiple layers of big data architecture, including cloud services, data storage, batch processing, interactive processing, streaming and machine learning. The benchmark supports the reuse of existing be...
-
AdBench
It combines Ad-Serving, Streaming Analytics on Ad-serving logs, streaming ingestion and updates of various data entities, batch-oriented analytics (e.g. for Billing), Ad-Hoc analytical queries, and Machine learning for Ad targeting. While this benchmark is specific to modern Web or Mobile advertisi...
-
AIBench
AIBench is an industry-standard Internet service AI benchmark suite, designed specifically for modern Internet services with microservice-based architecture. The benchmark spans sixteen AI problem domains from the three most widely used Internet service domains: search engine, social network, and e-com...
-
AIM Benchmark
The AIM benchmark simulates a challenging use case of how to store and analyze billing data of subscribers and make marketing campaigns immediately available. The task is to process single events like phone calls or messages and to do real-time analytics which are represented by seven analytical qu...
-
AIMatrix
AI Matrix is a benchmark suite for testing AI software frameworks and hardware platforms. Web references https://aimatrix.ai/en-us/index.html https://github.com/alibaba/ai-matrix https://aimatrix.ai/en-us/docs/contents.html Date of last description update 2019 Originating g...
-
AIoTBench
AIoTBench is a comprehensive benchmark suite to evaluate the AI ability of mobile and embedded devices. AIoTBench covers 1) different application domains, e.g. image recognition, speech recognition and natural language processing; 2) different platforms, including Android devices and Rasp...
-
ALOJA
The ALOJA research project is an initiative from the Barcelona Supercomputing Center (BSC) to produce a systematic study of Hadoop configuration and deployment options. The project provides an open source platform for executing Big Data frameworks in an integrated manner facilitating benchmark exec...
-
AMP Lab Big Data Benchmark
AMP Lab Big Data Benchmark evaluates 5 SQL-on-Hadoop engines (RedShift, Hive, Stinger/Tez, Shark and Impala). In order to provide an environment for comparing these systems, AMP Lab Big Data Benchmark draws workloads and queries from "A Comparison of Approaches to Large-Scale Data Analysis" b...
-
BDAICoE
The Big Data and Artificial Intelligence Centre of Excellence (BDAICoE) framework focuses on identifying best practices for benchmarking Centres of Excellence (CoEs). This involved establishing a common conceptual foundation of the benchmarking process, identifying a methodology for performing the be...
-
BenchIoT
BenchIoT is a benchmark suite and evaluation framework for evaluating the security mechanisms of IoT micro-controllers (IoT-µCs). The applications run on either bare-metal or a real-time embedded operating system, and are evaluated through security, performance, memory usage, and energy metrics. The Be...
-
Benchip
BENCHIP focuses on benchmarking intelligent processors and system optimization. It is utilized for comparison of optimization and bottleneck analysis of hardware platforms. Web references BENCHIP: Benchmarking Intelligence Processors Date of last description update 2017 Originating group...
-
Berlin SPARQL Benchmark (BSBM)
This benchmark aims to provide assistance in choosing a suitable architecture and storage system by comparing the SPARQL query performance of native RDF stores with the performance of SPARQL-to-SQL re-writers as well as the performance of relational database systems. The benchmark mimic...
-
BigBench
It is an end-to-end Big Data benchmark built around a data model that simulates the volume, velocity and variety characteristics of a Big Data system, together with a synthetic data generator for structured, semi-structured and unstructured data and a workload of 30 queries. Web references http://www...
-
BigDataBench
It is an open source Big Data benchmark suite consisting of 15 data sets (of different types) and more than 33 workloads. BigDataBench 5.0 provides 13 representative real-world data sets and 27 big data benchmarks. The benchmark suite includes micro benchmarks, each of which is a single data m...
-
BigFrame
BigFrame is a benchmark generator offering a benchmarking-as-a-service solution for Big Data analytics. Web references https://github.com/bigframeteam/BigFrame/wiki Date of last description update 2013 Originating group - Time – first version, last version 2013 Type/Domain Benchmark...
-
BigFUN
BigFUN is based on a social network use case with synthetic semi-structured data in JSON format. The benchmark focuses exclusively on the micro-operation level and consists of queries with various operations such as simple retrieves, range scans, aggregations, joins, as well as inserts and updates. We...
-
BlockBench
BlockBench is the first benchmarking framework for private blockchain systems. It serves as a fair means of comparison for different platforms and enables deeper understanding of different system design choices. It comes with both macro benchmark workloads for evaluating the overall performance and...
-
CALDA
The benchmark consists of five tasks defined as SQL queries among which is the original MR Grep task, which is a representative for most real user MapReduce programs. The benchmark was developed to specifically compare the capabilities of Hadoop with those of commercial parallel Relational Database...
-
CBench-Dynamo
Dynamo-based databases are designed to run in a cluster while offering high availability and eventual consistency to clients when subject to network partition events. CBench-Dynamo is a consistency benchmark for NoSQL database systems. The benchmark correlates properties, such as performance, consis...
-
CityBench
CityBench is a configurable benchmark for evaluation of RDF Stream Processing (RSP) engines, by comparing them in terms of their capability to fulfil application-specific requirements (for smart city applications with smart city datasets). Web references https://github.com/CityBench/Benchmark Da...
-
CloudRank-D
CloudRank-D is a benchmark suite for evaluating the performance of cloud computing systems running Big Data applications. The suite consists of 13 representative data analysis tools, which are designed to address a diverse set of workload data and computation characteristics (i.e. data semantics, d...
-
CloudSuite
CloudSuite is a benchmark suite consisting of both emerging scale-out workloads and traditional benchmarks. The goal of the benchmark suite is to analyze and identify key inefficiencies in the processor’s core micro-architecture and memory system organization when running today’s cloud ...
-
DAWNBench
DAWNBench is a benchmark suite for end-to-end deep learning training and inference. It provides a reference set of common deep learning workloads for quantifying training time, training cost, inference latency, and inference cost across different optimization strategies, model architectures, softwa...
-
Deep Learning Benchmarking Suite (DLBS)
Deep Learning Benchmarking Suite (DLBS) is a collection of command line tools for running deep learning benchmark experiments on various hardware/software platforms. Web references https://hewlettpackard.github.io/dlcookbook-dlbs/#/index?id=deep-learning-benchmarking-suite https://github.com/Hew...
-
DeepBench
DeepBench is an open source benchmarking tool that measures the performance of basic operations (dense matrix multiplies, convolutions and communication) involved in training deep neural networks. The benchmark includes operations and workloads for both training and inference. These operations are ...
-
DeepMark (Convnet)
DeepMark or convnet-benchmarks is an open-source framework for benchmarking a collection of Convolutional Neural Networks. Convolutional Neural Networks are a special kind of neural network specifically designed for processing data that has a known grid-like topology. For instance time...
-
DSPBench
DSPBench: A Suite of Benchmark Applications for Distributed Data Stream Processing Systems. This is a benchmark suite composed of 15 applications in different application domains such as Finance, Telecommunications, Sensor Networks, Social Networks, etc. The benchmark suite has been tested with...
-
Edge AI Bench
Edge AIBench is a benchmark suite for end-to-end edge computing spanning all three layers: client-side devices, edge computing layer, and cloud servers. The benchmark includes four typical application scenarios: ICU Patient Monitor, Surveillance Camera, Smart Home, and Autonomous Vehicle, which co...
-
Fathom
Fathom: a collection of eight archetypal deep learning workloads for study. Each of these models comes from a seminal work in the deep learning community ranging from the familiar deep convolutional neural network of Krizhevsky et al., to the more exotic memory networks from Facebook's AI research ...
-
Framework of Load & Integration for Cloud Pub/Sub (FLIC)
This load test framework for Cloud Pub/Sub, known as Flic (Framework of load & integration for Cloud Pub/Sub), is a tool targeted at developers and companies who wish to benchmark Cloud Pub/Sub and Kafka. The goal of this framework is to provide users with a tool that allows them to see how C...
-
GARDENIA
Gardenia is a domain-specific benchmark suite consisting of irregular graph workloads. These workloads mimic actual machine learning and Big Data applications running on modern datacenter accelerators using state-of-the-art optimization techniques. Web references https://github.com/chenxuhao/gard...
-
GDPRBench
The General Data Protection Regulation (GDPR), introduced in Europe, is a set of rules and regulations that offers new rights and protections to people concerning their personal data. GDPRBench is an open-source benchmark designed specifically to assess the GDPR compliance of database...
-
gMark
gMark is a domain- and query language-independent framework targeting highly tunable generation of both graph instances and graph query workloads based on user-defined schemas. It provides a query translator for SPARQL, openCypher, PostgreSQL and Datalog. Web references https://github.com/graphMa...
-
Graphalytics
It is an industrial-grade benchmark for graph analysis platforms such as Giraph. It consists of six core algorithms, standard datasets, synthetic dataset generators, and reference outputs, enabling the objective comparison of graph analysis platforms. Web references http://ldbcouncil.org/ld...
-
Gridmix
The benchmark suite emulates different users sharing the same cluster resources and submitting different types and numbers of jobs. This also includes the emulation of distributed cache loads, compression, decompression and job configuration in terms of resource usage. Web references https://hadoo...
-
Hadoop Workload Examples
Set of commonly used Hadoop applications like WordCount, Grep, Pi and Terasort. Set of micro-benchmarks applied to all technologies using MR and HDFS to process data. Web references https://cwiki.apache.org/confluence/display/HADOOP2/Grep http://hadoop.apache.org/docs/r3.2.0/api/org/apache/hadoo...
-
HERMIT
HERMIT (HEalthcaRe Monitoring for the Internet of Things) is a benchmark suite for IoT applications in the healthcare industry. The goal of this benchmark is to facilitate research into new microarchitectures and optimizations that will enable efficient execution of emerging Internet of Medical Things ...
-
Hobbit Benchmark
The HOBBIT evaluation platform is a distributed FAIR benchmarking platform for the Linked Data lifecycle. This means that the platform was designed to provide means to: (1) benchmark any step of the Linked Data lifecycle, including generation and acquisition, analytics and processing, storage and c...
-
HPC AI500
HPC AI500 is a benchmark suite for evaluating HPC systems that run specific Deep Learning workloads. HPC AI500 workloads are based on real scientific DL applications and cover representative scientific fields, namely climate analysis, cosmology, high energy physics, gravitational wave physics, and co...
-
IDEBench
IDEBench measures the performance of interactive data exploration systems over the course of entire user-centered workflows, where queries are built and refined incrementally and executed with delays (thinktime) between queries, rather than being processed back-to-back. Each workflow comprises a se...
-
IGUANA
Iguana is an integrated suite for benchmarking read/write performance of HTTP endpoints and CLI applications. Iguana is an easily extendable benchmark that resolves these issues by providing an environment which is highly configurable, a realistic scenario benchmark, working with any dataset, usin...
-
IoTAbench
IoTABench is a benchmark toolkit for IoT Big Data scenarios, facilitating apples-to-apples comparisons between different sensor data and analytics platforms. The benchmark can be extended to multiple IoT use-cases, including a user's specific needs, interests or datasets. Web references http...
-
IoTBench
IoTBench is a benchmark suite targeting IoT edge-device applications, focusing on architecture performance of IoT devices. Web references https://ieeexplore.ieee.org/document/8802949 Date of last description update 2019 Originating group -- Time – first version, last ver...
-
Linear Road
The Linear Road Benchmark compares relational database systems with stream data management systems and computes the performance characteristics of different stream data management systems relative to each other. It is an application benchmark simulating a toll system for the motor vehicle expressways o...
-
LinkBench
LinkBench is a benchmark, developed by Facebook, using a synthetic social graph to emulate a social graph workload on top of databases such as MySQL. Web references https://github.com/facebookarchive/linkbench Date of last description update 2015 Originating group Facebook Time – first ver...
-
LIQUID
The Liquid benchmarking platform is an online cloud-based platform for democratizing the performance evaluation and benchmarking process. The primary objective of the Liquid Benchmarking platform is to provide a cloud-based and social platform which can simplify and democratize the job of computer scie...
-
MapReduce Benchmark Suite (MRBS)
MapReduce Benchmark Suite (MRBS) is a comprehensive benchmark suite for evaluating the performance of MapReduce systems in 5 areas: recommendations, BI (TPC-H), Bioinformatics, Text Processing, and Data Mining. Web references MRBS: A Comprehensive MapReduce Benchmark Suite Date of last ...
-
MiDBench
MiDBench is a multi-modal industrial big data benchmark. It focuses on big data systems in crane assembly, wind turbine monitoring and simulation results management scenarios, which correspond to bills of materials (a.k.a. BoM), time series and unstructured data formats respectively. Web referenc...
-
MLBench
MLBench, also known as MLBench_Distributed or ML benchmark, is a benchmark suite for distributed machine learning algorithms, frameworks and systems. The focus is on standard-supervised ML, including standard deep learning tasks as well as classic linear ML models. Web references https://mlben...
-
MLBench Services
The MLBench services benchmark, inspired by Kaggle, consists of datasets with a best-effort baseline of both feature engineering and machine learning models. It uses a novel metric based on the notion of "quality tolerance" that measures the performance gap between a given machine learning system a...
-
MLPerf
The MLPerf effort aims to build a common set of benchmarks that enables the machine learning (ML) field to measure system performance for both training and inference from mobile devices to cloud services. Web references https://mlperf.org/ https://github.com/mlperf/training Date of last descrip...
-
MRBench
MRBench is a batch processing, decision-support benchmark implementing the TPC-H benchmark queries directly in map-reduce operations. Because the TPC-H queries are expressed directly as MapReduce jobs, it supports technologies built around MapReduce. Web references https://markobigdata.com/2016/07/13/hadoop-benchmark- Da...
-
NNBench-X
NNBench-X is a benchmark for understanding and evaluating Neural Network workloads for accelerator designs. The benchmark aims to facilitate hardware-software co-designs to achieve significant performance improvements and energy savings, by dividing the benchmarking process into three stages: (1) applic...
-
Open Anomaly Detection Benchmark (OADB)
The Open Anomaly Detection Benchmark (OADB) evaluates machine learning algorithms for anomaly detection on numerical data. Algorithms are evaluated based on accuracy and computational complexity metrics using a wide variety of datasets. OADB provides in-depth visualization of results...
-
OpenML Benchmark Suites
The suite offers (a) ease of use through standardized data formats, APIs, and existing client libraries; (b) machine-readable meta-information regarding the contents of the suite; and (c) online sharing of results, enabling large scale comparisons. The OpenML100 is a machine learning benchmark suit...
-
Penn Machine Learning Benchmark (PMLB)
It includes most of the real-world benchmark datasets commonly used in ML benchmarking studies such as UCI ML repository, Kaggle, KEEL and the meta-learning benchmark. Web references https://github.com/EpistasisLab/pmlb Date of last description update 2018 Originating group Institute for Biom...
-
PigMix
A set of 17 queries, written in Pig Latin, specifically created to test the latency and scalability performance of Pig systems with different operations like data loading, different types of joins, group by clauses, sort clauses, as well as aggregation operations. Web references https://cwi...
-
PolyBench
Polybench is the first benchmark for heterogeneous analytics systems, especially for polystores, providing a complete evaluation environment. Polybench is an application-level benchmark that simulates a banking business model. It focuses on banking, since it features heterogeneous analytics and dat...
-
PRIMEBALL
PRIMEBALL is a novel and unified benchmark specification for comparing the parallel processing frameworks in the context of Big Data applications hosted in the cloud. It is implementation- and technology-agnostic, using a fictional news hub called New Pork Times, based on a popular real-life news s...
-
PUMA Benchmark Suite
During the work at Purdue on MapReduce, we developed a benchmark suite which represents a broad range of MapReduce applications exhibiting application characteristics with high/low computation and high/low shuffle volumes. There are a total of 13 benchmarks, out of which Tera-Sort, Word-Count, and ...
-
RIoTBench
A Real-time IoT Benchmark suite, consisting of 27 IoT micro-benchmarks and 4 real-application benchmarks reusing the micro-benchmark components, along with performance metrics. The goal of the benchmark suite is to evaluate the efficacy and performance of Distributed Stream Processing Systems (DSP...
-
Sanzu
It is a data science benchmark for evaluating systems for data processing and analytics tasks such as Anaconda, PySpark, MADlib and R. The benchmark consists of micro (basic file I/O, data wrangling, descriptive statistics, distribution and inferential statistics, time series and machine lear...
-
Semantic Publishing Benchmark (SPB)
Semantic Publishing Benchmark (SPB) v2.0 is an LDBC benchmark for RDF database engines inspired by the Media/Publishing industry, particularly by the BBC’s Dynamic Semantic Publishing approach. The application scenario considers a media or a publishing organization that deals with large volum...
-
Senska
It is an enterprise streaming benchmark. It consists of three major components: data feeder, system under test and result validator. The chosen domain for Senska is industrial manufacturing since it represents a natural fit for an enterprise application requiring data stream processing capab...
-
SmartBench
SmartBench: A Benchmark For Data Management In Smart Spaces. SmartBench focuses on queries resulting from (near) real-time applications and longer-term analysis of IoT data. This benchmark has been derived from a deployed smart building monitoring system. It provides an extensible schema of an IoT s...
-
Social Network Benchmark (SNB)
The Social Network Benchmark consists in fact of three distinct benchmarks on a common dataset, since there are three different workloads: Interactive, Business Intelligence and Graph Analytics. Each workload produces a single metric for performance at the given scale and a price/performance metri...
-
SparkAIBench
SparkAIBench is a benchmark to generate AI workloads on Apache Spark, supporting a variety of algorithms, configurable data input size, as well as a parametric method for submission. Web references https://github.com/harryandlina/SparkAIBench https://www.benchcouncil.org/bench19/file/slides/...
-
SparkBench
SparkBench, developed by IBM, is a comprehensive Spark-specific benchmark suite for in-memory data analysis that provides insights into Spark system design, performance optimization, and cluster provisioning. The benchmark provides automatic generation of data sets with various scale...
-
Stream WatDiv
Stream-WatDiv is an open-source benchmark for streaming RDF data management systems, to evaluate streaming RDF processing engines. It is extended from Waterloo SPARQL Diversity Test Suite (WatDiv), and includes a streaming data generator, a query generator that can produce a diverse set of SPARQL q...
-
StreamBench
StreamBench covers 7 micro-benchmark programs that intend to address typical stream computing scenarios, implemented in Spark Streaming and Storm. Web references https://github.com/lsds/StreamBench Date of last description update 2014 Originating group -- Time – first version, last ver...
-
SWIM
It consists of a framework which is able to synthesize a representative workload from real MapReduce traces, taking into account the job submit time, input data size, shuffle/input and output/shuffle data ratio. The result is a synthetic workload which has the exact characteristics of the original wor...
-
TensorFlow Benchmarks
A selection of image classification models is tested across multiple platforms to create a point of reference for the TensorFlow community. The GitHub repository contains various TensorFlow benchmarks. So far, it consists of two projects: PerfZero: A benchmark framework for TensorFlow. scripts...
-
TERMinator Suite
TERMinator is a benchmark suite for evaluating and comparing encrypted computer architectures based on homomorphic operations, avoiding termination problems while maintaining data privacy. Web references https://eprint.iacr.org/2017/1218.pdf Date of last description update 2019 Originating gro...
-
Theodolite
Theodolite is a framework for benchmarking the horizontal and vertical scalability of stream processing engines. It consists of three modules: Theodolite Benchmarks: Theodolite contains 4 application benchmarks, which are based on typical use cases for stream processing within microservices. For eac...
-
TPC-DS
TPC-DS is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. Its main focus areas are: multiple snowflake schemas with shared dimensions, 24 tables with an average of 18 columns, 99 distinct SQL 99 que...
-
TPC-DS v2
TPC-DS v2 is version 2 of the TPC-DS benchmark, an industry standard for benchmarking SQL-based big data systems. TPC-DS is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. The benchmark prov...
-
TPC-H
TPC-H is the de facto benchmark standard for testing data warehouse capability of a system. Instead of representing the activity of any particular business segment, TPC-H models any industry that manages, sells, or distributes products worldwide...
-
TPCx-BB (BigBench)
TPCx-BB measures the performance of Hadoop-based Big Data systems. Based on BigBench, it measures the performance of both hardware and software components by executing 30 frequently performed analytical queries in the context of retailers with physical and online store presence. ...
-
TPCx-HS v2
It stresses both the hardware and software components including the Hadoop run-time stack, Hadoop File System and MapReduce layers. The benchmark is based on the TeraSort workload, which is part of the Apache Hadoop distribution. Introduced in 2016, TPCx-HS V2 is based on TPCx-HS V1 with support fo...
-
TPCx-IoT (YCSB++)
The TPC Benchmark IoT (TPCx-IoT) benchmark, also known as YCSB++, is designed based on Yahoo Cloud Serving Benchmark (YCSB) workloads. It is not comparable to YCSB due to significant changes. The TPCx-IoT workloads consist of data ingestion and concurrent queries simulating workloads on typical IoT...
-
TPCx-V
The TPC Express Benchmark V (TPCx-V) measures the performance of a server running virtualized databases. It simulates a mix of On Line Transaction Processing (OLTP) and Decision Support Systems (DSS) workloads in a cloud computing environment. It stresses CPU and memory hardware, storage, networking,...
-
Training Benchmark for DNNs (TBD)
TBD is an open source benchmark suite. It covers 6 application domains with 8 deep learning models. Web references http://tbd-suite.ai/ https://github.com/tbd-ai/tbd-suite Date of last description update 2019 Originating group University of Toronto, Microsoft Research Time – first ver...
-
VisualRoad
VisualRoad is a benchmark for evaluating video data management systems (VDBMSs). The benchmark comes with a data generator and a suite of queries over cameras positioned within a simulated metropolitan environment. Visual Road’s video data is automatically generated and annotated using a simu...
-
WatDiv
WatDiv measures how an RDF data management system performs across a wide spectrum of SPARQL queries with varying structural characteristics and selectivity classes. It consists of two components: the data generator and the query (and template) generator. Web references http://dsg.uwaterloo.ca/wat...