Ravello Performance on Oracle Cloud Infrastructure


Introduction

I’ve been a long-time network tester, whether as part of my day job or as a hobby. Whenever I got a computer upgrade I’d want to measure the performance gains over the previous machine. I suppose the same goes for cars, though that can be a little more dangerous, right? So I was excited when, earlier this year, a number of us vExperts were invited to beta test the new Ravello on Oracle Cloud Infrastructure (previously known as Ravello on Oracle Bare Metal Cloud). Among all the new features and capabilities, two in particular caught my eye…

  1. ESXi on top of Ravello (set the preferPhysicalHost flag to true; see the API sketch below)
  2. VMs with 32 CPUs and 200 GB of RAM

It’s still the same Ravello concept under the hood, but the underlying hardware is better now that Oracle controls the datacenters where the machines run. The Ravello hypervisor runs directly on Oracle hardware servers without having to go through a separate nested hypervisor abstraction layer.
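If you’re scripting your deployments, the preferPhysicalHost flag can also be set through the Ravello REST API. Here’s a minimal sketch; the exact endpoint path and JSON payload shape are my assumptions, so check the current API documentation before relying on them:

# Hypothetical sketch: request that a VM be placed directly on a physical host.
# Endpoint path and payload shape are assumptions; verify against the API docs.
$ curl -u user@example.com:password \
    -X PUT "https://cloud.ravellosystems.com/api/v1/applications/1234/vms/5678" \
    -H "Content-Type: application/json" \
    -d '{"preferPhysicalHost": true}'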

Repo

One of the features I love most about Ravello is the “Repo”. The Ravello Repository allows you to instantly reproduce an entire environment created by someone else. All the VMs I used in this blog post are available for you to try out in a publicly available blueprint… just click this link to make it available in your own Ravello account: https://www.ravellosystems.com/repo/blueprints/88541081

Publish these to the various zones and you can reproduce my tests below for yourself and even modify them with your own custom workloads.

There are some other great write-ups on what you can do with Ravello that are worth checking out as well.

Test Results

In this report I’ll show the results of various performance tests run on the legacy Ravello infrastructure versus the new “Bare Metal” option.

TL;DR – the new Ravello on Oracle Cloud Infrastructure (aka Bare Metal) offering delivers anywhere from two to ten times the performance of the legacy environments, depending on your workload.

Blueprint Name: ubuntu virtio x 2 test-bp

VM Image: https://www.ravellosystems.com/public-inf/ubuntuserver14-04-120150301-image.html

Changes made: SSH key configured, 4 vCPUs, 4 GB RAM.
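Each result line below summarizes three measurements per zone: iperf3 network throughput between the two nodes, sysbench memory transfer rate, and the total wall-clock time for a 1000-thread sysbench run.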

US East 2 – 9.2 Gbits/sec network – 286.33 MB/sec memory – 10.6 seconds for 1000 threads

  • West: ubuntuserver140412-ubuntuvirtiox2test-zsssbmzc.srv.ravcloud.com
  • East: ubuntuserver140412-ubuntuvirtiox2test-pamjktwc.srv.ravcloud.com

US Central 1 – 6.0 Gbits/sec network – 272.59 MB/sec memory – 11.2 seconds for 1000 threads

  • West: ubuntuserver140412-ubuntubaremetal-48fveqxh.srv.ravcloud.com
  • East: ubuntuserver140412-ubuntubaremetal-w7ig42rz.srv.ravcloud.com

US East 5 – 11.2 Gbits/sec network – 1308.36 MB/sec memory – 2.1 seconds for 1000 threads

  • West: ubuntuserver140412-ubuntuvirtiox2east-w3bsd56k.srv.ravcloud.com
  • East: ubuntuserver140412-ubuntuvirtiox2east-insladvj.srv.ravcloud.com

Steps followed

  1. From your admin PC…
  2. Using a web browser, create a 2-node Ubuntu application
  3. Log in to each node: ssh -i ssh-key.pem ubuntu@host-name
  4. sudo add-apt-repository "ppa:patrickdk/general-lucid"; sudo apt-get -y update; sudo apt-get -y install iperf3 bonnie++ sysbench
  5. Save the application as a blueprint
  6. Publish the blueprint to the various sites and then test as follows…
  7. ssh -i ssh-key.pem ubuntu@east-host-name
    1. ip a ← shows an IP address of 10.0.0.3
    2. iperf3 -s
  8. ssh -i ssh-key.pem ubuntu@west-host-name
    1. ip a
    2. iperf3 -c 10.0.0.3 ← starts the test from the client on west to the server on east
  9. Record the results for each site above
  10. Run the memory test – sysbench --test=memory run
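If you’d rather drive the whole sequence from a shell, here’s a minimal sketch of steps 3–10 as a script; the host names and key path are placeholders you’d substitute with your own:

# Minimal sketch of the test sequence above; host names and key path are placeholders.
# Install the tools on both nodes:
for host in east-host-name west-host-name; do
  ssh -i ssh-key.pem ubuntu@$host 'sudo add-apt-repository -y "ppa:patrickdk/general-lucid" && \
    sudo apt-get -y update && sudo apt-get -y install iperf3 bonnie++ sysbench'
done
# Start the iperf3 server on east (10.0.0.3) in the background:
ssh -i ssh-key.pem ubuntu@east-host-name 'nohup iperf3 -s > iperf3-server.log 2>&1 &'
# Drive the network test from west against the east server:
ssh -i ssh-key.pem ubuntu@west-host-name 'iperf3 -c 10.0.0.3'
# Memory test on each node:
ssh -i ssh-key.pem ubuntu@east-host-name 'sysbench --test=memory run'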

Here’s a screenshot showing the iperf3 output and the throughput difference between the same 2-node Ubuntu blueprint published to the US East 2 and US East 5 sites. From my Mac laptop I was able to quickly spin up the same 2-VM blueprint in two different sites (4 VMs total) within a few minutes and run these test workloads. By default the VMs run for 2 hours and then suspend automatically, but you can extend this as long as you need. There’s also a full RESTful API available for integrating Ravello into your continuous integration and software development lifecycle tooling.
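As a small taste of that API, the sketch below lists your applications. The base URL matches the v1 API, but treat the details as assumptions and consult the official API documentation:

# Sketch: list applications via the Ravello REST API (basic auth assumed).
$ curl -s -u user@example.com:password \
    -H "Accept: application/json" \
    https://cloud.ravellosystems.com/api/v1/applications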

Tools used

iperf3

The ultimate speed-test packet generator tool for TCP, UDP, and SCTP.

iPerf3 is a tool for active measurements of the maximum achievable bandwidth on IP networks. It supports tuning of various parameters related to timing, buffers, and protocols (TCP, UDP, and SCTP, with IPv4 and IPv6). For each test it reports the bandwidth, loss, and other parameters. It is a new implementation that shares no code with the original iPerf and is not backwards compatible. iPerf was originally developed by NLANR/DAST; iPerf3 is principally developed by ESnet / Lawrence Berkeley National Laboratory and released under a three-clause BSD license. Here are just some of the features:

  • TCP and SCTP
    • Measure bandwidth
    • Report MSS/MTU size and observed read sizes
    • Support for TCP window size via socket buffers
  • UDP
    • Client can create UDP streams of specified bandwidth
    • Measure packet loss
    • Measure delay jitter
    • Multicast capable
  • Cross-platform: Windows, Linux, Android, Mac OS X, BSD
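To make those concrete, here are a few typical invocations built from the options listed below (the server address and rate are just examples):

# On the server node:
$ iperf3 -s
# TCP test from the client: 4 parallel streams, 60 seconds, JSON output:
$ iperf3 -c 10.0.0.3 -P 4 -t 60 -J
# UDP test at a 1 Gbit/sec target rate, which also reports loss and jitter:
$ iperf3 -c 10.0.0.3 -u -b 1G
# Reverse mode: the server sends and the client receives:
$ iperf3 -c 10.0.0.3 -R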

iperf3 command options

$ iperf3 -v

iperf 3.0.7

Linux 3.16.0-31-generic #41~14.04.1-Ubuntu SMP Wed Feb 11 19:30:13 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Usage: iperf [-s|-c host] [options]

iperf [-h|--help] [-v|--version]

Server or Client:

-p, --port # server port to listen on/connect to

-f, --format [kmgKMG] format to report: Kbits, Mbits, KBytes, MBytes

-i, --interval # seconds between periodic bandwidth reports

-F, --file name xmit/recv the specified file

-A, --affinity n/n,m set CPU affinity

-B, --bind bind to a specific interface

-V, --verbose more detailed output

-J, --json output in JSON format

-d, --debug emit debugging output

-v, --version show version information and quit

-h, --help show this message and quit

Server specific:

-s, --server run in server mode

-D, --daemon run the server as a daemon

Client specific:

-c, --client run in client mode, connecting to <host>

-u, --udp use UDP rather than TCP

-b, --bandwidth #[KMG][/#] target bandwidth in bits/sec (0 for unlimited)

(default 1 Mbit/sec for UDP, unlimited for TCP)

(optional slash and packet count for burst mode)

-t, --time # time in seconds to transmit for (default 10 secs)

-n, --bytes #[KMG] number of bytes to transmit (instead of -t)

-k, --blockcount #[KMG] number of blocks (packets) to transmit (instead of -t or -n)

-l, --len #[KMG] length of buffer to read or write

(default 128 KB for TCP, 8 KB for UDP)

-P, --parallel # number of parallel client streams to run

-R, --reverse run in reverse mode (server sends, client receives)

-w, --window #[KMG] TCP window size (socket buffer size)

-C, --linux-congestion set TCP congestion control algorithm (Linux only)

-M, --set-mss # set TCP maximum segment size (MTU - 40 bytes)

-N, --nodelay set TCP no delay, disabling Nagle's Algorithm

-4, --version4 only use IPv4

-6, --version6 only use IPv6

-S, --tos N set the IP 'type of service'

-L, --flowlabel N set the IPv6 flow label (only supported on Linux)

-Z, --zerocopy use a 'zero copy' method of sending data

-O, --omit N omit the first n seconds

-T, --title str prefix every output line with this string

--get-server-output get results from server

[KMG] indicates options that support a K/M/G suffix for kilo-, mega-, or giga-

iperf3 homepage: http://software.es.net/iperf/

SysBench

SysBench is a modular, cross-platform and multi-threaded benchmark tool for evaluating OS parameters that are important for a system running a database under intensive load.

The idea of this benchmark suite is to quickly get an impression about system performance without setting up complex database benchmarks or even without installing a database at all.

Current features allow testing the following system parameters:

  • file I/O performance
  • scheduler performance
  • memory allocation and transfer speed
  • POSIX threads implementation performance
  • database server performance

The design is very simple: SysBench runs a specified number of threads, all executing requests in parallel. The actual workload produced depends on the specified test mode. You can limit either the total number of requests or the total run time of the benchmark, or both.

sysbench homepage: https://github.com/akopytov/sysbench
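For example, the thread count and the request/time limits map directly to command-line options in the 0.4.x versions used here (the values are illustrative):

# Memory test with 4 threads, capped at 60 seconds:
$ sysbench --test=memory --num-threads=4 --max-time=60 run
# CPU test capped by request count instead of time:
$ sysbench --test=cpu --max-requests=10000 run
# File I/O tests need a prepare step to create the test files first:
$ sysbench --test=fileio --file-test-mode=rndrw prepare
$ sysbench --test=fileio --file-test-mode=rndrw run
$ sysbench --test=fileio --file-test-mode=rndrw cleanup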

Debug logging output

iPerf 3 Test: protocol: TCP, 1 stream, 131072 byte blocks, 60 second test, TCP MSS: 1448

Internal Test command: $ iperf3 -c 10.0.0.3 --verbose --reverse --time 60

Internet Test command: $ iperf3 -c iperf.he.net --verbose

Command used for memory testing: $ sysbench --test=memory run

Nested E2 (US East 2) testing details

Legacy hosted public cloud “virtual” infrastructure – slower due to the overhead of binary translation.

From E2 to E2 Test Results:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 51.5 GBytes 7.38 Gbits/sec 0 sender
[ 4] 0.00-60.00 sec 51.5 GBytes 7.38 Gbits/sec receiver
CPU Utilization: local/receiver 99.1% (2.0%u/97.0%s), remote/sender 15.7% (0.3%u/15.4%s)
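Note that the receiver’s CPU is pegged at 99.1%, almost entirely system time, which suggests the E2 run is CPU-bound in the virtualized network path rather than limited by the link itself.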

From E2 to Internet Test Results:

[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 6.27 MBytes 5.26 Mbits/sec 0 sender
[ 4] 0.00-10.00 sec 5.85 MBytes 4.91 Mbits/sec receiver
CPU Utilization: local/sender 1.6% (0.6%u/1.0%s), remote/receiver 0.4% (0.0%u/0.4%s)

ubuntu@e2east:~$ sysbench --test=memory run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Doing memory operations speed test
Memory block size: 1K

Memory transfer size: 102400M

Memory operations type: write
Memory scope type: global
Threads started!
Done.

Operations performed: 104857600 (292531.35 ops/sec)

102400.00 MB transferred (285.68 MB/sec)

Test execution summary:
total time: 358.4491s
total number of events: 104857600
total time taken by event execution: 271.5140
per-request statistics:
min: 0.00ms
avg: 0.00ms
max: 1.49ms
approx. 95 percentile: 0.00ms

Threads fairness:
events (avg/stddev): 104857600.0000/0.00
execution time (avg/stddev): 271.5140/0.00
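As a sanity check, the throughput figure follows directly from the transfer size and wall-clock time: 102400 MB / 358.45 s ≈ 285.7 MB/sec, matching the reported 285.68 MB/sec.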

Bare Metal E5 (US East 5) testing details

New “bare metal” environment – faster thanks to hardware-assisted nested virtualization support.
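A quick way to check whether hardware virtualization extensions are exposed inside a guest is to look for the relevant CPU flags (vmx for Intel VT-x, svm for AMD-V):

# Count CPUs advertising hardware virtualization; a non-zero count means
# the extensions are visible to the guest.
$ grep -cE 'vmx|svm' /proc/cpuinfo
# Show which flag is present:
$ grep -oE 'vmx|svm' /proc/cpuinfo | sort -u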

From E5 to E5 Test Results:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-60.00 sec 78.0 GBytes 11.2 Gbits/sec 0 sender
[ 4] 0.00-60.00 sec 78.0 GBytes 11.2 Gbits/sec receiver
CPU Utilization: local/receiver 99.3% (0.6%u/98.8%s), remote/sender 8.0% (0.2%u/7.9%s)

From E5 to Internet Test Results:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 542 MBytes 455 Mbits/sec 0 sender
[ 4] 0.00-10.00 sec 542 MBytes 455 Mbits/sec receiver
CPU Utilization: local/sender 4.6% (0.3%u/4.3%s), remote/receiver 1.2% (0.0%u/1.2%s)

ubuntu@e5east:~$ sysbench --test=memory run
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Doing memory operations speed test
Memory block size: 1K

Memory transfer size: 102400M

Memory operations type: write
Memory scope type: global
Threads started!
WARNING: Operation time (0.000000) is less than minimal counted value, counting as 1.000000
WARNING: Percentile statistics will be inaccurate
Done.

Operations performed: 104857600 (1342818.26 ops/sec)

102400.00 MB transferred (1311.35 MB/sec)

Test execution summary:
total time: 78.0877s
total number of events: 104857600
total time taken by event execution: 60.6087
per-request statistics:
min: 0.00ms
avg: 0.00ms
max: 0.49ms
approx. 95 percentile: 0.00ms

Threads fairness:
events (avg/stddev): 104857600.0000/0.00
execution time (avg/stddev): 60.6087/0.00
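Comparing the two environments side by side: 1311.35 MB/sec on E5 versus 285.68 MB/sec on E2 is roughly a 4.6x improvement in memory write throughput, and the total run time drops from 358.4 seconds to 78.1 seconds for the same 102400 MB workload.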

ubuntu@e5east:~$ sysbench --test=memory run --max-time=600 --debug=on --validate=on

Operations performed: 104857600 (1126890.18 ops/sec)

102400.00 MB transferred (1100.48 MB/sec)

Test execution summary:
total time: 93.0504s
total number of events: 104857600
total time taken by event execution: 60.6825
per-request statistics:
min: 0.00ms
avg: 0.00ms
max: 0.11ms
approx. 95 percentile: 0.00ms

Threads fairness:
events (avg/stddev): 104857600.0000/0.00
execution time (avg/stddev): 60.6825/0.00

DEBUG: Verbose per-thread statistics:
DEBUG: thread # 0: min: 0.0000s avg: 0.0000s max: 0.0001s events: 104857600
DEBUG: total time taken by event execution: 60.6825s
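The validated run comes in a bit lower (about 1100 MB/sec versus 1311 MB/sec), presumably because --validate verifies each operation; the unvalidated figure is the better like-for-like comparison with the E2 run above.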

Next Steps – Pricing

I really wanted to include a detailed assessment and price comparison of using Ravello versus other hosted cloud providers. Now that Amazon has announced per-second billing, it might be worthwhile to do a cost comparison of running similar workloads across various providers. But any cost comparison should also factor in the value of ease of use and network feature capabilities.

https://cloud.oracle.com/ravello/pricing

The blueprint for this blog has 2 VMs, each with 4 GB RAM, 4 vCPUs, and 4 GB of disk space, with a total cost of less than $1 per hour of use.

Other Options

I have tried to build similar network test labs with other providers and have not found one that offers the same capabilities. Ravello provides the unique ability to pre-assign MAC and IP addresses, take snapshots, create blueprints, and easily share those with others using a built-in repo. Now, with the improved performance and support from Oracle, it’s not just a better cost value, it’s a better solution overall.



About Iben Rodriguez


 

As a consulting Cloud and Virtualization Architect, Iben Rodriguez works to shift network testing functions out of the test lab and closer to the developers and operators of enterprise data centers.

 

Trained in Agile, ITIL, SOX, PCI-DSS, and ISO 27000, he develops testing solutions for businesses that leverage cloud technology.

Iben started his security career maintaining the data communication systems used by military reconnaissance, followed by stints in the global pharmaceutical and fabless semiconductor industries. Over the past 10 years, Iben has assisted executive teams from both enterprise and government organizations in his roles as a cloud solution strategist and security architect at leading companies such as AT&T, Ericsson, Huawei, Juniper, Cisco, Brocade, CA, Spirent, VMware, and eBay.

 

Since 2014 Iben has been working with the OPNFV community to determine best practices for networking and security when designing an automated virtual testing laboratory based on OpenStack. The Spirent Virtual Cloud Test Lab is a prototype of this reference design. This lab was used to produce test reports like the Brocade Vyatta 5600 80 Gbps report done for Intel’s Network Builders.

As a volunteer for the Center for Internet Security, Iben initiated and maintains the Benchmark Standards for hardening hypervisors used for virtual workloads running on VMware ESXi and Amazon AWS. As part of the Virtualization Security Roundtable podcast series, he also works closely with vendors to stay ahead of the latest products and trends in cloud computing.

