ARM-based server performance comparison: OCI vs. AWS
OCI Ampere Altra A1
Oracle made the Arm-based Ampere A1 Compute shape available for virtual machines in May 2021.
OCI Ampere A1 Performance Measurements
Performance Measurement References
Recent Apache APISIX blog posts compared the performance of ARM-based cloud servers. Based on those posts, we configure an environment in OCI that is as similar as possible and measure performance the same way.
- #1. June 7, 2022: [Installation and Performance Testing of API Gateway Apache APISIX on AWS Graviton3](https://apisix.apache.org/blog/2022/06/07/installation-performance-test-of-apigateway-apisix-on-aws-graviton3/)
- #2. August 12, 2022: GCP, AWS, and Azure ARM-based server performance comparison
Create ARM Ubuntu VM
Environments tested in the referenced documentation
- AWS Graviton3: Based on ARM architecture
- AWS EC2 c7g.large (2vCPU, 4GiB Memory)
- Ubuntu 20.04
In OCI, configure the following environment:
- OCI Ampere A1: based on ARM architecture
- VM.Standard.A1.Flex: flexible configuration; the same 2 OCPUs and 4 GB of memory are chosen
- Ubuntu 22.04 Minimal aarch64: 22.04 is selected to minimize installation errors in the benchmark test modules
On the Create compute instance screen, choose Ampere as the Shape type.
Change the OS image to Ubuntu and select 22.04 Minimal aarch64.
Select the VM.Standard.A1.Flex shape. A1 stands for the Arm-based Ampere A1 processor, and Flex means the number of OCPUs and the amount of memory can be chosen freely. Choose 2 OCPUs and 4 GB of memory.
OS Image and Arm Server Shape selected.
- Network bandwidth increases with the number of OCPUs in a Flex shape.
- The network bandwidth of VM.Standard.A1.Flex is 1 Gbps per OCPU, up to 40 Gbps.
- Reference: [Oracle Cloud Infrastructure Documentation > Compute > Compute Shapes](https://docs.oracle.com/en-us/iaas/Content/Compute/References/computeshapes.htm#vm-standard)
- 2 OCPUs were selected, resulting in 2 Gbps as shown.
Create an instance by specifying the rest of the name, VCN, SSH Key, etc. as desired values.
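If you prefer the CLI to the console, a roughly equivalent launch command is sketched below. All OCIDs and names are placeholders that you must replace with values from your own tenancy:

```
# Sketch only: every <...> value is a placeholder from your tenancy
oci compute instance launch \
  --availability-domain <AD-NAME> \
  --compartment-id <COMPARTMENT-OCID> \
  --shape VM.Standard.A1.Flex \
  --shape-config '{"ocpus": 2, "memoryInGBs": 4}' \
  --image-id <UBUNTU-2204-MINIMAL-AARCH64-IMAGE-OCID> \
  --subnet-id <SUBNET-OCID> \
  --ssh-authorized-keys-file ~/.ssh/id_rsa.pub \
  --display-name oci-arm-ubuntu-c2m4
```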
Connect to the created instance with SSH.
```
ssh ubuntu@<PUBLIC-IP-OF-COMPUTE-INSTANCE>
```
Install APISIX on ARM Ubuntu VM
See also: [How to build APISIX in ARM Ubuntu](https://apisix.apache.org/blog/2022/01/11/building-apisix-in-ubuntu-for-arm/)
Install Requirements
Source code cloning
```
sudo apt-get update
sudo apt-get install git
git clone https://github.com/apache/apisix.git
cd apisix
git checkout release/2.15
```
Install OpenResty for Ubuntu 22.04 on ARM.
- Step 1: install the prerequisites needed for adding GPG public keys (these can be removed later):

```
sudo apt-get -y install --no-install-recommends wget gnupg ca-certificates
```

- Step 2: import the OpenResty GPG key:

```
# For Ubuntu 22
wget -O - https://openresty.org/package/pubkey.gpg | sudo gpg --dearmor -o /usr/share/keyrings/openresty.gpg
```

- Step 3: add the official OpenResty APT repository:

```
# for arm64 or aarch64 systems
# for Ubuntu 22 or above
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/openresty.gpg] http://openresty.org/package/arm64/ubuntu $(lsb_release -sc) main" | sudo tee /etc/apt/sources.list.d/openresty.list > /dev/null
```

- Step 4: update the APT index and install OpenResty:

```
sudo apt-get update
sudo apt-get -y install openresty
```
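Optionally, you can verify the OpenResty installation by printing its version (a quick sanity check, not part of the referenced guide):

```
openresty -v
```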
Install Dependencies
```
bash utils/install-dependencies.sh
sudo apt install wget sudo unzip
sudo apt-get install make gcc
curl https://raw.githubusercontent.com/apache/apisix/master/utils/linux-install-luarocks.sh -sL | bash -
LUAROCKS_SERVER=https://luarocks.cn make deps
```
Install APISIX
```
sudo make install
```
Install ETCD
Install Docker
```
sudo apt-get install docker.io
```
Start etcd
```
sudo docker run -d --name etcd -p 2379:2379 \
  -e ETCD_UNSUPPORTED_ARCH=arm64 \
  -e ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379 \
  -e ETCD_ADVERTISE_CLIENT_URLS=http://0.0.0.0:2379 \
  gcr.io/etcd-development/etcd:v3.5.1-arm64
```
Make sure the container STATUS is Up.

```
sudo docker ps -a
```
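As an additional sanity check (not part of the referenced posts), etcd's HTTP version endpoint can be queried directly:

```
# should return a JSON document with the etcd server version
curl http://127.0.0.1:2379/version
```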
Start APISIX
Go to the source code location
```
cd ~/apisix
```
Install Dependencies
```
make deps
sudo make install
```
Add the ulimit setting

```
echo "ulimit -n 4096" >> ~/.bashrc
source ~/.bashrc
```
Start APISIX
```
apisix init

# start APISIX
apisix start
```
Register the test route
curl "http://127.0.0.1:9080/apisix/admin/routes/1" \ -H "X-API-KEY: edd1c9f034335f136f87ad84b625c8f1" -X PUT -d ' { "uri": "/anything/*", "upstream": { "type": "roundrobin", "nodes": { "httpbin.org:80": 1 } } }'
Test
```
curl -i http://127.0.0.1:9080/anything/das
```
Test result:

```
HTTP/1.1 200 OK
.....
```
Performance measurement
For performance testing, the official Apache APISIX benchmark [script](https://github.com/apache/apisix/blob/master/benchmark/run.sh) mentioned in Reference #1 was used.
Prepare for Performance Measurements
Install the HTTP benchmarking tool wrk on the Ubuntu VM you created.
```
sudo apt-get install wrk
```
Measurement Scenario
We used the two scenarios below, which are included in the benchmark test script. Each scenario requires an Nginx node served at 127.0.0.1:1980 for APISIX to route to; starting that node is also handled by the test script.
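If you want to confirm that the backend is actually serving, you can query it directly, bypassing APISIX. This is an optional sanity check and assumes run.sh has already started the test Nginx on port 1980:

```
# only works while run.sh has the test Nginx on port 1980 running
curl -i http://127.0.0.1:1980/hello
```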
Scenario 1: Single upstream
- First scenario in Reference #1, without plugins, using a single upstream. It is intended for APISIX performance testing in pure proxy back-to-origin mode.
```
# apisix: 1 worker + 1 upstream + no plugin

# register route
curl http://127.0.0.1:9080/apisix/admin/routes/1 -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -d '
{
    "uri": "/hello",
    "plugins": {
    },
    "upstream": {
        "type": "roundrobin",
        "nodes": {
            "127.0.0.1:1980":1
        }
    }
}'
```
Scenario 2: Single upstream + Two plugins
- Second scenario in Reference #1, using a single upstream and two plugins. It is intended for APISIX performance testing with `limit-count` and `prometheus`, two core performance-consuming plugins.
```
# apisix: 1 worker + 1 upstream + 2 plugins (limit-count + prometheus)

# register route
curl http://127.0.0.1:9080/apisix/admin/routes/1 -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -d '
{
    "uri": "/hello",
    "plugins": {
        "limit-count": {
            "count": 2000000000000,
            "time_window": 60,
            "rejected_code": 503,
            "key": "remote_addr"
        },
        "prometheus": {}
    },
    "upstream": {
        "type": "roundrobin",
        "nodes": {
            "127.0.0.1:1980":1
        }
    }
}'
```
Run the benchmark test script
Go to the source code location
```
cd ~/apisix
```
Install editing tools
```
sudo apt-get install vim
```
Add a curl command at line 103 of the ./benchmark/run.sh script, before the existing wrk line, to check that the backend Nginx has started.

```
curl -i http://127.0.0.1:9080/hello
wrk -d 5 -c 16 http://127.0.0.1:9080/hello
```
In the ./benchmark/run.sh script, increase the load duration from the default 5 seconds to 60 seconds by replacing `-d 5` with `-d 60`:

```
wrk -d 60 -c 16 http://127.0.0.1:9080/hello
```
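If you prefer a non-interactive edit, a sed one-liner can make the same change, assuming the script invokes wrk with exactly `-d 5`:

```
# assumes every wrk invocation in run.sh uses the literal "-d 5"
sed -i 's/wrk -d 5 /wrk -d 60 /g' benchmark/run.sh
```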
Run the benchmark script.
```
./benchmark/run.sh
```
Execution result
```
ubuntu@oci-arm-ubuntu-c2m4:~/apisix$ ./benchmark/run.sh
+ '[' -n '' ']'
+ worker_cnt=1
+ '[' -n '' ']'
+ upstream_cnt=1
+ mkdir -p benchmark/server/logs
+ mkdir -p benchmark/fake-apisix/logs
+ make init
[ info ] init -> [ Start ]
...

apisix: 1 worker + 1 upstream + no plugin

+ curl http://127.0.0.1:9080/apisix/admin/routes/1 -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -d '
{
    "uri": "/hello",
    "plugins": {
    },
    "upstream": {
        "type": "roundrobin",
        "nodes": {
            "127.0.0.1:1980":1
        }
    }
}'
{"node":{"key":"\/apisix\/routes\/1","value":{"uri":"\/hello","update_time":1664362611,"plugins":{},"status":1,"create_time":1664361560,"priority":0,"upstream":{"scheme":"http","pass_host":"pass","nodes":{"127.0.0.1:1980":1},"hash_on":"vars","type":"roundrobin"},"id":"1"}},"action":"set"}
+ sleep 1
+ curl -i http://127.0.0.1:9080/hello
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Date: Wed, 28 Sep 2022 10:56:52 GMT
Server: APISIX/2.15.0

1234567890
+ wrk -d 60 -c 16 http://127.0.0.1:9080/hello
Running 1m test @ http://127.0.0.1:9080/hello
  2 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.02ms  188.43us  13.83ms   97.08%
    Req/Sec     7.86k   505.69     8.73k    89.58%
  938523 requests in 1.00m, 171.84MB read
Requests/sec:  15640.38
Transfer/sec:      2.86MB
+ sleep 1
+ wrk -d 60 -c 16 http://127.0.0.1:9080/hello
Running 1m test @ http://127.0.0.1:9080/hello
  2 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.03ms  112.62us  10.02ms   93.94%
    Req/Sec     7.82k   314.04    15.89k    88.01%
  934559 requests in 1.00m, 171.12MB read
Requests/sec:  15550.03
Transfer/sec:      2.85MB
...

apisix: 1 worker + 1 upstream + 2 plugins (limit-count + prometheus)

+ curl http://127.0.0.1:9080/apisix/admin/routes/1 -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -d '
{
    "uri": "/hello",
    "plugins": {
        "limit-count": {
            "count": 2000000000000,
            "time_window": 60,
            "rejected_code": 503,
            "key": "remote_addr"
        },
        "prometheus": {}
    },
    "upstream": {
        "type": "roundrobin",
        "nodes": {
            "127.0.0.1:1980":1
        }
    }
}'
{"node":{"key":"\/apisix\/routes\/1","value":{"uri":"\/hello","update_time":1664362734,"plugins":{"limit-count":{"time_window":60,"key":"remote_addr","rejected_code":503,"count":2000000000000,"key_type":"var","policy":"local","allow_degradation":false,"show_limit_quota_header":true},"prometheus":{"prefer_name":false}},"status":1,"create_time":1664361560,"priority":0,"upstream":{"scheme":"http","pass_host":"pass","nodes":{"127.0.0.1:1980":1},"hash_on":"vars","type":"roundrobin"},"id":"1"}},"action":"set"}
+ sleep 3
+ wrk -d 60 -c 16 http://127.0.0.1:9080/hello
Running 1m test @ http://127.0.0.1:9080/hello
  2 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.31ms  192.72us  17.15ms   95.50%
    Req/Sec     6.15k   294.98     6.71k    71.83%
  734891 requests in 1.00m, 185.02MB read
Requests/sec:  12248.00
Transfer/sec:      3.08MB
+ sleep 1
+ wrk -d 60 -c 16 http://127.0.0.1:9080/hello
Running 1m test @ http://127.0.0.1:9080/hello
  2 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.30ms  141.40us  12.13ms   93.56%
    Req/Sec     6.17k   254.42     6.87k    64.83%
  736422 requests in 1.00m, 185.41MB read
Requests/sec:  12271.99
Transfer/sec:      3.09MB
...
```
Performance measurement results
Scenario 1: Single upstream
- The "apisix: 1 worker + 1 upstream + no plugin" part of the output

Metric | 1st | 2nd | Average |
---|---|---|---|
Requests/sec | 15640.38 | 15550.03 | 15595.205 |
Avg Latency | 1.02ms | 1.03ms | 1.025ms |

Scenario 2: Single upstream + Two plugins
- The "apisix: 1 worker + 1 upstream + 2 plugins (limit-count + prometheus)" part of the output

Metric | 1st | 2nd | Average |
---|---|---|---|
Requests/sec | 12248.00 | 12271.99 | 12259.995 |
Avg Latency | 1.31ms | 1.30ms | 1.305ms |

CPU and Memory status during testing
- CPU: the backend Nginx worker process uses a lot of CPU under load
- Memory: serving Nginx's simple web page does not use much memory
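These observations can be reproduced by watching the processes during a benchmark run, for example with top (one option among several):

```
# sort processes by CPU usage while the benchmark runs
top -o %CPU
```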
Cost/Performance Comparison
OCI Arm Based Shape
- Ampere A1 is supported on both VM and Bare Metal; on VM it is provided as one of the Flex shapes
Shape | Maximum OCPUs | Minimum Memory | Maximum Memory |
---|---|---|---|
VM.Standard.A1.Flex | 80 | 1 GB or the number of OCPUs, whichever is greater | 64 GB per OCPU, up to 512 GB total |
Cost of OCI Arm-based Virtual Machines
- OCI Ampere A1 Compute
- https://www.oracle.com/cloud/compute/arm/
- On Arm CPU architecture (Ampere), 1 OCPU = 1 vCPU.
- Of the total monthly usage of Ampere A1 shape in Cloud Account, the first 3,000 OCPU-hours and the first 18,000 GB-hours are free.
- Because OCPU and memory are adjusted independently in a Flex shape, each has its own unit cost.
- OCI prices are the same in all regions.
Compute - Virtual Machine Instances | Comparison Price (/vCPU)* | Unit Price | Unit |
---|---|---|---|
Compute – Ampere A1 – OCPU | $0.01 | $0.01 | OCPU per hour |
Compute – Ampere A1 – Memory | - | $0.0015 | gigabyte per hour |
Comparison of Costs Related to Benchmark Testing
Comparison condition
- Based on ARM-based servers.
- AWS: hourly cost of the C7g family (US East, Ohio).
- OCI: Cost per hour based on VM.Standard.A1.Flex.
- OCI prices are the same in all regions.
- For comparison, OCI is priced at the same vCPU and memory sizes as the AWS C7g instances.
- The VM.Standard.A1.Flex price is the sum of the OCPU price and the memory price; see the worked example after this list.
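- For example, 2 OCPUs with 4 GB of memory cost 2 × $0.01 + 4 × $0.0015 = $0.020 + $0.006 = $0.026 per hour, which matches the "2 (4G)" column in the table below.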
Cost per hour
VM series / vCPU(Memory) | 1 (2G) | 2 (4G) | 4 (8G) | 8 (16G) | 16 (32G) | 32 (64G) | 64 (128G) |
---|---|---|---|---|---|---|---|
AWS C7g | $0.0361 | $0.0723 | $0.1445 | $0.289 | $0.5781 | $1.1562 | $1.7342 |
OCI Ampere A1 | $0.013 | $0.026 | $0.052 | $0.104 | $0.208 | $0.416 | $0.832 |
Annual Cost: Based on Test Scenario 1
- QPS (queries per second): Number of requests processed per second
- The QPS for AWS c7g.large is based on the results in Reference #1 and #2
- **Reference #1 and #2 were tested with as similar a configuration as possible, but the exact load requests of the referenced posts and the resource usage at that time are unknown, so the performance and cost-performance figures below are for reference only, not absolute.**
- Since the OCI Flex shape is freely configurable, it is also possible to configure 2 OCPUs with only 2 GB of memory.
VM | Hourly cost | Total hours in a year (24*365) | Annual Cost | QPS | Cost Performance (QPS/cost) |
---|---|---|---|---|---|
AWS c7g.large | $0.0723 | 8760 hours | $633.3 | 23000 | 36.3 |
OCI VM.Standard.A1.Flex 2 OCPU, 4 GB Memory | $0.026 | 8760 hours | $227.8 | 15595.205 | 68.5 |
OCI VM.Standard.A1.Flex 2 OCPU, 2 GB Memory | $0.023 | 8760 hours | $201.5 | 15595.205 | 77.4 |
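Cost performance here is QPS divided by annual cost: for example, 15595.205 / 227.8 ≈ 68.5 for the 2 OCPU, 4 GB configuration, versus 23000 / 633.3 ≈ 36.3 for c7g.large.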
Cost after deducting the free allowance, assuming only this test environment is used in the tenancy
- Of the total monthly usage of Ampere A1 shape in Cloud Account, the first 3,000 OCPU-hours and the first 18,000 GB-hours are free.
- Maximum usage time per month: 24 hours * 31 days = 744 hours
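- Running 2 OCPUs with 4 GB for a full month uses 2 × 744 = 1,488 OCPU-hours and 4 × 744 = 2,976 GB-hours, both within the free allowance, so the billable usage in the table below is zero.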
Resource | Maximum usage per month | Monthly usage after free deduction | Annual Usage | Unit Price | Annual Cost |
---|---|---|---|---|---|
OCI VM.Standard.A1.Flex 2 OCPU | 1,488 OCPU-Hours | 0 OCPU-Hours | 0 | $0.01 | $0 |
OCI VM.Standard.A1.Flex 4 GB Memory | 2,976 GB-Hours | 0 GB-Hours | 0 | $0.0015 | $0 |
This article was written in my personal time as an individual. It may contain errors, and the opinions expressed in it are my own.