Kunpeng BoostKit for Virtualization
User Guides

Issue 06
Date 2021-03-23

HUAWEI TECHNOLOGIES CO., LTD.

Copyright © Huawei Technologies Co., Ltd. 2021. All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
HUAWEI and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

Issue 06 (2021-03-23)

Copyright © Huawei Technologies Co., Ltd.

Contents

1 Large-Scale Container Networking User Guide
  1.1 Introduction
    1.1.1 Container Network Overview
    1.1.2 Concepts
    1.1.3 Application Scenarios
    1.1.4 Implementation Principle
    1.1.5 Access and Usage Modes
  1.2 Component Selection Guide
    1.2.1 Selection Procedure
    1.2.2 Component Comparison
  1.3 Quick Start
    1.3.1 Installation Guide
    1.3.2 Typical Network Topology
      1.3.2.1 Typical OVS Network Structure
      1.3.2.2 Typical Calico Network Structure
    1.3.3 Networking Operation Practice
      1.3.3.1 OVS Component Networking Operations
      1.3.3.2 Calico Component Networking Operations
  1.4 Open vSwitch Network Plane Management
  1.5 Calico Network Plane Management
    1.5.1 IP Address Pool Management
      1.5.1.1 Viewing IP Address Pool Information
      1.5.1.2 Adding an IP Address Pool
      1.5.1.3 Modifying an IP Address Pool
      1.5.1.4 Deleting an IP Address Pool
    1.5.2 Managing BGP Configuration
      1.5.2.1 Checking the BGP Configuration
      1.5.2.2 Adding BGP Configuration
      1.5.2.3 Modifying BGP Configuration
      1.5.2.4 Deleting BGP Configuration
  1.6 Troubleshooting
  1.7 References
    1.7.1 OVS Kubernetes Deployment YAML File
    1.7.2 Calico Deployment YAML File
    1.7.3 Calicoctl Deployment YAML File
2 Kube-OVN User Guide
  2.1 Introduction
    2.1.1 Kube-OVN Overview
    2.1.2 Concepts
    2.1.3 Application Scenarios
    2.1.4 Accessing and Using Kube-OVN
  2.2 Quick Start
    2.2.1 Basic Installation
      2.2.1.1 Environment Requirements
      2.2.1.2 One-Click Installation (Recommended)
      2.2.1.3 Manual Installation
    2.2.2 Advanced Installation
      2.2.2.1 Configuring VLAN Support
      2.2.2.2 Deploying HA
      2.2.2.3 Configuring the Built-in Subnets
      2.2.2.4 Selecting the Host NIC, MTU, and Traffic Mirrors
    2.2.3 Uninstallation
  2.3 Service Deployment
    2.3.1 Recommended Deployment Modes
    2.3.2 Static IP Addresses and MAC Addresses for Pods
    2.3.3 Static IP Addresses for Workloads
    2.3.4 Static IP Addresses for StatefulSets
  2.4 Subnet Management
    2.4.1 Default Subnet
    2.4.2 Node Subnet
    2.4.3 Customized Subnet
    2.4.4 Subnet Access Control
    2.4.5 Egress Gateway Configuration
    2.4.6 IP Addresses of Pods Exposed to the External Network
    2.4.7 IPv6 Subnet
  2.5 O&M Operations
    2.5.1 Deleting a Node
    2.5.2 Configuring QoS
    2.5.3 Mirroring Traffic
  2.6 Reference
    2.6.1 YAML Files for Manual Installation
3 XPF User Guide
  3.1 Introduction
  3.2 Environment Requirements
  3.3 BIOS Settings
  3.4 Configuring the Compilation Environment
  3.5 Compiling and Installing XPF
  3.6 Configuring Logs
  3.7 Running and Verifying XPF
  3.8 Troubleshooting
  3.9 OVS Command Description
  3.10 Change History
4 SR-IOV User Guide
  4.1 Introduction
  4.2 Environment Requirements
  4.3 Configuring the Environment
  4.4 Configuring SR-IOV
    4.4.1 Checking Mellanox NIC Information
    4.4.2 Configuring Kernel-Mode SR-IOV
    4.4.3 Configuring OVS Boot Parameters
    4.4.4 Configuring Network Data
    4.4.5 Creating a VM
    4.4.6 Verifying Communication Between VMs
  4.5 Verifying SR-IOV
    4.5.1 Restoring the Environment
    4.5.2 Bonding
    4.5.3 QoS
    4.5.4 Port Mirroring
    4.5.5 GRO
    4.5.6 Traffic Sampling
    4.5.7 Protocol Offloading
A Change History

1 Large-Scale Container Networking User Guide

1.1 Introduction
1.2 Component Selection Guide
1.3 Quick Start
1.4 Open vSwitch Network Plane Management
1.5 Calico Network Plane Management
1.6 Troubleshooting
1.7 References

1.1 Introduction

1.1.1 Container Network Overview
In container-based microservice scenarios, one of the keys to deploying containers on the cloud is managing the network between container clusters so that containers within a node or on different physical nodes can communicate with each other.
Currently, there are two standard proposals for configuring Linux container network interfaces in the industry: Container Network Model (CNM) and Container Network Interface (CNI). This document describes how to verify large-scale cluster deployment based on the container orchestration engine Kubernetes and the widely used network plane components Open vSwitch and Calico. You can determine the most suitable network plane component based on the actual scale, function requirements, and performance requirements of your services.
For details about related commands and parameters, see the following official documents:
Open vSwitch official documentation: https://docs.openvswitch.org/en/latest/
Calico official documentation: https://docs.projectcalico.org/about/about-calico
1.1.2 Concepts

Container Network Model (CNM)
CNM was proposed by Docker and adopted by the Libnetwork project as the network model standard. It is used by common open-source network components such as Kuryr, Open Virtual Networking, Calico, and Weave for container network interconnection.
Libnetwork, the CNM implementation, provides the interface between the Docker daemon and network drivers. The network controller matches drivers with networks, and each driver manages the networks it owns and provides services such as IP address management (IPAM) for them. CNM drivers can be native drivers (built into Libnetwork or network models supported by Docker) or third-party plugin drivers. The native drivers include None, Bridge, Overlay, and Macvlan. Third-party drivers provide more functions. In addition, the scope of a CNM driver can be either local (single-host mode) or global (multi-host mode).

Figure 1-1 CNM drivers

Containers are connected through a series of network endpoints, as shown in Figure 1-2. A typical network interface exists in the form of a veth pair. One end of the veth pair is in the network sandbox of the container, and the other end is in the specified network. A network endpoint can join only one network plane, but the network sandbox of a container can hold multiple network endpoints.

Figure 1-2 CNM interfaces
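
The relationship between sandboxes, endpoints, and networks described above can be sketched as a small data model. This is only an illustration of the concepts, not Libnetwork code: the class names, the join() helper, and the example network names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Network:
    """A network plane managed by one driver (illustrative model only)."""
    name: str
    driver: str
    scope: str = "local"  # "local" = single-host mode, "global" = multi-host mode


@dataclass
class Endpoint:
    """One end of a veth pair; it belongs to exactly one network."""
    name: str
    network: Network


@dataclass
class Sandbox:
    """The network sandbox of a container; it may hold several endpoints."""
    container: str
    endpoints: List[Endpoint] = field(default_factory=list)

    def join(self, network: Network) -> Endpoint:
        # Hypothetical helper: create one endpoint per joined network.
        endpoint = Endpoint(name=f"{self.container}-{network.name}", network=network)
        self.endpoints.append(endpoint)
        return endpoint

    def networks(self) -> List[str]:
        return [endpoint.network.name for endpoint in self.endpoints]


# Example names are assumptions chosen for illustration.
frontend = Network("frontend", driver="bridge")
backend = Network("backend", driver="overlay", scope="global")

web = Sandbox("web")
web.join(frontend)
web.join(backend)
print(web.networks())
```

Each Endpoint holds a single Network reference, which mirrors the rule that an endpoint joins only one network plane, while the Sandbox keeps a list so that one container can reach several planes.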

Container Network Interface (CNI)
CNI was proposed by CoreOS and has been adopted by Apache Mesos, Cloud Foundry, Kubernetes, Kurma, and rkt as the network model standard. CNI is used by common open-source network components such as Contiv Networking, Calico, and Weave for container network interconnection.
As shown in Figure 1-3, CNI is based on a minimal specification, which lets network developers implement the protocol between containers and network plugins in a simple way.
Multiple network plugins can be applied to a container so that the container can connect to different network planes driven by different plugins. The network is defined in a JSON configuration file and is instantiated as a new namespace when CNI plugins are called.
Figure 1-3 CNI drivers
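
To make the JSON network definition more concrete, the following sketch builds a minimal CNI network configuration and checks that its basic fields are present. The network name, bridge name, and subnet are assumptions chosen for illustration, and the validate() helper is hypothetical; consult the CNI specification for the authoritative field list.

```python
import json

# Hypothetical network definition; every value below is an illustrative assumption.
cni_conf = {
    "cniVersion": "0.4.0",
    "name": "example-net",
    "type": "bridge",        # name of the CNI plugin binary to invoke
    "bridge": "cni0",        # option of the reference bridge plugin
    "ipam": {
        "type": "host-local",
        "subnet": "10.22.0.0/16",
    },
}


def validate(conf: dict) -> list:
    """Return the basic top-level keys that are missing from a configuration."""
    required = ("cniVersion", "name", "type")
    return [key for key in required if key not in conf]


text = json.dumps(cni_conf, indent=2)
print(text)
print("missing keys:", validate(json.loads(text)))
```

In a typical deployment a file like this is placed in the CNI configuration directory of each node and read by the container runtime, which then calls the plugin named in the type field.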

Kubernetes
Kubernetes is an open-source container orchestration engine originally developed by Google. It supports automatic deployment, large-scale scalability, and containerized application management. Kubernetes classifies network scenarios into the following types: communication between containers in the same pod, communication between pods, communication between pods and services, and communication between systems outside the cluster and services.
Pod and service resource objects use their own dedicated networks. The pod network is implemented through Kubernetes network plugin configuration (CNI model). The service network is specified by the Kubernetes cluster. The overall network model is implemented using external plugins. Therefore, you need to plan the network model and network deployment before deploying the Kubernetes cluster.

Open vSwitch
Open vSwitch (OVS) is a multilayer software switch released under the Apache 2.0 open-source license. It aims to provide a production-quality switching platform that supports standard management interfaces, forwarding function interfaces, programmable plug-ins, and management control. Open Virtual Network (OVN) is a native virtualized network solution provided by OVS. It uses existing OVS functions to implement large-scale and high-quality cluster management.

Calico
Calico is an open-source networking and network security solution developed by Tigera and released under the Apache 2.0 open-source license. It can be used for containers, VMs, and native host machines. This component supports multiple platforms, including Kubernetes, OpenShift, Docker EE, OpenStack, and bare metal services. The Calico project aims to combine flexible network functions with security policy enforcement to provide solutions with native Linux kernel performance and cloud native scalability.
1.1.3 Application Scenarios
This document describes component selection and tuning for network plane establishment and deployment. It applies to service deployments that use a container orchestration engine, such as Kubernetes clusters. This document also describes the network establishment and simulation tests of common components on the user network plane when the cluster is large (more than 100 physical nodes or Pod objects).
1.1.4 Implementation Principle

Open vSwitch
The OVS software architecture consists of the kernel-mode datapath and the user-mode vswitchd and ovsdb, as shown in Figure 1-4.
datapath is a kernel module responsible for data exchange. It reads data from the network port, quickly matches flow entries in the FlowTable, and directly forwards the data that is successfully matched or sends the data that fails to be matched to the vswitchd process for processing. The hook function is registered during OVS initialization to make the kernel module take over the packet processing on the port.

vswitchd is a daemon process for OVS management and control. It saves configuration information to OVSDB through the Unix socket and interacts with the kernel module through Netlink.
ovsdb is an OVS database that stores OVS configuration information.
In addition, the OVS release package contains a series of management tools, such as ovs-vsctl, ovs-dpctl, ovs-ofctl, ovs-appctl, and ovs-docker, which simplify OVS configuration and use.

Figure 1-4 Open vSwitch software architecture

In Figure 1-5, the dotted lines show the path of data packets through the Linux network protocol stack before OVS takes over network data. After a physical NIC port receives data packets, the kernel protocol stack parses them layer by layer, and the data then leaves kernel mode and is delivered to user mode.
After OVS creates a bridge and binds the physical NIC to it, the data flow is received from the port of the physical NIC, enters OVS through the OVS vPort in kernel mode, and is matched against the flow table based on the key value of the data packets. If a match is found, the action of the matching flow entry is executed. If no match is found, an upcall is performed and the packets are processed by vswitchd.
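
The match-or-upcall behavior can be sketched as a toy simulation. This is not OVS code: the Datapath and Vswitchd classes, the two-field flow key, and the MAC table are simplified assumptions, and the step in which the slow path caches its decision in the flow table is added here for illustration.

```python
from typing import Dict, Tuple

FlowKey = Tuple[str, str]  # (in_port, dst_mac): a deliberately simplified match key


class Vswitchd:
    """User-mode slow path: decides what to do with unmatched packets."""

    def __init__(self, mac_table: Dict[str, str]):
        self.mac_table = mac_table  # destination MAC -> output port

    def handle_upcall(self, key: FlowKey) -> str:
        _, dst_mac = key
        port = self.mac_table.get(dst_mac)
        return f"output:{port}" if port else "flood"


class Datapath:
    """Kernel-mode fast path: looks up the flow table first."""

    def __init__(self, vswitchd: Vswitchd):
        self.flow_table: Dict[FlowKey, str] = {}
        self.vswitchd = vswitchd
        self.hits = 0
        self.upcalls = 0

    def receive(self, in_port: str, dst_mac: str) -> str:
        key = (in_port, dst_mac)
        action = self.flow_table.get(key)
        if action is None:
            # No matching flow entry: hand the packet to the slow path,
            # then cache the decision so later packets match directly.
            self.upcalls += 1
            action = self.vswitchd.handle_upcall(key)
            self.flow_table[key] = action
        else:
            self.hits += 1
        return action


# Example port and MAC values are assumptions.
dp = Datapath(Vswitchd({"aa:bb:cc:00:00:02": "vport2"}))
first = dp.receive("eth0", "aa:bb:cc:00:00:02")
second = dp.receive("eth0", "aa:bb:cc:00:00:02")
print(first, second, dp.upcalls, dp.hits)
```

Only the first packet of a flow pays the cost of the upcall in this model; subsequent packets with the same key are handled entirely by the flow table lookup.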

Figure 1-5 Principles of OVS

Calico

Calico consists of Felix, Confd, and BIRD, as shown in Figure 1-6.
Felix is a daemon that runs on every Calico node and acts as the agent for the endpoints of that node. It manages pod information on the current host, exchanges cluster pod information with the etcd service, and configures routing information and ACL policies.
Confd watches the Calico configuration information stored in etcd and provides it to the BIRD layer.
BIRD (BIRD Internet Routing Daemon) is a core component. BIRD in Calico refers to the BIRD client and the BIRD route reflector. BIRD proactively reads the routing information configured by Felix on the local host and distributes routes in the data center through the Border Gateway Protocol (BGP).
In addition, etcd is a dependency of Calico. You need to deploy the etcd service in the cluster in advance or reuse the etcd component of Kubernetes. Calico also provides the calicoctl management tool, which is used to check and configure the status of Calico nodes.

Figure 1-6 Calico software architecture

1.1.5 Access and Usage Modes
You can use either of the following methods to access the OVS and Calico components:
1. Using built-in management tools such as ovs-vsctl (for OVS) and calicoctl (for Calico).
2. Using Kubernetes management tools such as kubectl to indirectly access, manage, and configure network components.
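
As a rough illustration of both access methods, the following sketch wraps one read-only status command per tool. The choice of commands is an assumption for illustration, the run_status() helper is hypothetical, and the tools must already be installed and configured on the host for the calls to return useful output.

```python
import shutil
import subprocess

# Read-only status commands (illustrative selection; adjust to your environment).
STATUS_COMMANDS = {
    "ovs": ["ovs-vsctl", "show"],
    "calico": ["calicoctl", "node", "status"],
    "kubernetes": ["kubectl", "get", "pods", "--all-namespaces", "-o", "wide"],
}


def run_status(component: str) -> str:
    """Run the status command for a component and return its text output."""
    argv = STATUS_COMMANDS[component]
    if shutil.which(argv[0]) is None:
        return f"{argv[0]} is not installed on this host"
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, check=False, timeout=10
        )
    except subprocess.TimeoutExpired:
        return f"{argv[0]} did not respond within 10 seconds"
    return result.stdout if result.returncode == 0 else result.stderr


if __name__ == "__main__":
    for name in STATUS_COMMANDS:
        print(f"== {name} ==")
        print(run_status(name))
```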

1.2 Component Selection Guide

1.2.1 Selection Procedure
Step 1 Determine whether the system requires L2 or L3 networking.
● If L2 networking is required, use OVS.
● If L3 networking is required, use Calico.
● If neither L2 nor L3 networking is specifically required, go to Step 2.
Step 2 Determine whether multiple network planes are required for containers (for example, multiple network components are used at the same time).
● If yes, use Calico.
● If no, go to Step 3.
Step 3 Determine whether low computing resource usage of network components is required.
● If yes, use Calico.
● If no, go to Step 4.
Step 4 Determine whether multiple tenants use the same CIDR network.
● If yes, use OVS.
● If no, go to Step 5.
Step 5 Determine whether support for native VLAN configuration is required.
● If yes, use OVS.
● If no, go to Step 6.
Step 6 Determine whether support for advanced functions such as SDN integration, two-way rate limiting, and DPDK is required.
● If yes, use OVS.
● If no, go to Step 7.
Step 7 Determine whether hardware network isolation is required.
● If yes, use OVS.
● If no, go to Step 8.
Step 8 Determine whether the network bandwidth needs to be close to that of the host machine.
● If yes, use OVS.
● If no, go to Step 9.
Step 9 Determine whether the system is sensitive to latency in large-scale deployment.
● If yes, use Calico.
● If no, go to Step 10.
Step 10 Determine whether fixed pod IP addresses are required.
● If yes, use OVS.
● If no, no further action is required. You can select a component based on its usability and your familiarity with the component.
----End
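
The selection procedure above can be expressed as a short function that checks the requirements in the same order as Steps 1 to 10 and returns at the first match. This is a sketch of the procedure as written, not a tool shipped with the product: the Requirements fields and the select_component() name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Requirements:
    """Answers to the questions in Steps 1 to 10 (field names are illustrative)."""
    layer: Optional[str] = None               # Step 1: "L2", "L3", or None
    multiple_network_planes: bool = False     # Step 2
    low_resource_usage: bool = False          # Step 3
    shared_cidr_multi_tenant: bool = False    # Step 4
    native_vlan: bool = False                 # Step 5
    advanced_functions: bool = False          # Step 6: SDN, two-way rate limiting, DPDK
    hardware_isolation: bool = False          # Step 7
    near_host_bandwidth: bool = False         # Step 8
    latency_sensitive_at_scale: bool = False  # Step 9
    fixed_pod_ip: bool = False                # Step 10


def select_component(req: Requirements) -> str:
    """Walk the steps in order and return at the first decisive answer."""
    if req.layer == "L2":
        return "OVS"
    if req.layer == "L3":
        return "Calico"
    ordered_checks = [
        (req.multiple_network_planes, "Calico"),
        (req.low_resource_usage, "Calico"),
        (req.shared_cidr_multi_tenant, "OVS"),
        (req.native_vlan, "OVS"),
        (req.advanced_functions, "OVS"),
        (req.hardware_isolation, "OVS"),
        (req.near_host_bandwidth, "OVS"),
        (req.latency_sensitive_at_scale, "Calico"),
        (req.fixed_pod_ip, "OVS"),
    ]
    for required, component in ordered_checks:
        if required:
            return component
    # No decisive requirement: choose by usability and familiarity.
    return "either"


print(select_component(Requirements(native_vlan=True, latency_sensitive_at_scale=True)))
```

Because the checks run in step order, an earlier requirement takes precedence over a later one. In the example call, native VLAN support (Step 5) is evaluated before latency sensitivity (Step 9).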
1.2.2 Component Comparison
Both the OVS and Calico components support the ARM64 architecture, but they implement the network model differently. Table 1-1 describes the differences between the two components. You can select a network component based on the table.

Table 1-1 Component comparison

| Component | Open vSwitch | Calico |
|---|---|---|
| Basic Network Model | L2 (Underlay/Overlay) | L3 BGP (Overlay) |
| Network Configuration Support | GRE/VxLAN/VLAN | BGP/IPIP/VxLAN |
| Ecosystem Maturity | Highly mature. Additional plug-ins, such as k-vswitch and kube-ovn, are required for integration with Kubernetes. | Mature. Works well with container orchestration engines such as Kubernetes. |
| Usability | Medium | Simple |
| VLAN Support | Supported by the native system. | Supported and implemented based on L3 routing. |
| Network Performance | Excellent. There is a small amount of performance loss in GRE and VxLAN encapsulation and decapsulation, and in flow table matching. | Excellent. There is a small amount of performance loss in IPIP encapsulation and decapsulation. |
| Compute Resource | High resource usage when the network pressure is high. | Low. L3 routing direct connection, the kernel routing table, and iptables are used. The computing resource overhead is low. |
| Accessing the Cluster from Outside | Direct connection based on L2 routing. Cross-VLAN communication requires routing support. | Direct connection based on L3 routing |
| Accessing Systems Out of the Cluster | Direct connection based on L2 routing. Cross-VLAN communication requires routing support. | Direct connection based on L3 routing |
| Number of Nodes in a Cluster | Unlimited | Unlimited |
| Single-cluster IP Address Space | Unlimited | Unlimited. The default value is 65535. |
| Single-node IP Address Space | Unlimited (single-node multi-VLAN) | Unlimited |
| Assigned Pod IP Address | Supported | Supported |
| Fixed Pod IP Address | Supported | Not supported. Customization is required. |
| Support for Multiple Network Planes | No supported version is available in the community. | Supported by the open-source community. |
| Network Isolation | Two-level isolation based on VLANs and network policies. | iptables software isolation. |
| Advanced Function | Strong (SDN integration, rate limiting, and other features) | Medium |
| Ecosystem Support Requirements | None | If more than 100 nodes are deployed, you are advised to change the Full Mesh network mode to RouteReflector (RR) and configure one or two RR nodes for every 100 nodes. |
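
The Calico sizing advice in Table 1-1 (one or two RR nodes for every 100 nodes once the cluster exceeds 100 nodes) can be turned into a quick estimate. The Python sketch below is illustrative only: the function name is hypothetical, and rounding "every 100 nodes" up to the next group is an assumption that the guide does not state.

```python
import math
from typing import Tuple


def recommended_rr_nodes(node_count: int) -> Tuple[int, int]:
    """Return the (minimum, maximum) number of RR nodes suggested for a cluster.

    Assumption: "every 100 nodes" is rounded up, so 101 nodes count as two groups.
    Clusters with 100 nodes or fewer can stay in Full Mesh mode, so (0, 0) is returned.
    """
    if node_count < 0:
        raise ValueError("node_count must not be negative")
    if node_count <= 100:
        return (0, 0)
    groups = math.ceil(node_count / 100)
    # One RR node per group at minimum, two per group at maximum.
    return (groups, 2 * groups)


if __name__ == "__main__":
    for nodes in (50, 250, 1000):
        print(nodes, recommended_rr_nodes(nodes))
```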

1.3 Quick Start

1.3.1 Installation Guide
To manage and use large-scale container network planes, select suitable network plane components and configurations based on the planned physical cluster, the scale of the container services, and the actual service requirements for network functions, bandwidth, latency, computing resources, and management customization.
This chapter describes how to select and operate network components.
For details about the installation, see the following documents:
Open vSwitch Installation Guide (CentOS 7.6)
Calico Installation Guide (CentOS 7.6)
1.3.2 Typical Network Topology
This section describes the typical topology when the OVS and Calico components are used as network planes in a container cluster.

1.3.2.1 Typical OVS Network Structure
The OVS uses a virtual bridge to take over data of physical ports and performs flow table configuration and bridge configuration to implement end-to-end data switching, as shown in Figure 1-4 and Figure 1-5.
Figure 1-7 shows the principles of the OVS cluster network topology:
1. Physical nodes in a cluster can communicate with each other. The OVS uses the VxLAN or GRE overlay mode to implement network plane communication between nodes.
2. In a physical cluster node, the OVS uses a virtual bridge to manage the container port network and its VLAN information.

3. Between physical cluster nodes, different OVS components use cluster orchestration tools such as OVN to orchestrate networks and configure routing and communication.

Figure 1-7 Typical OVS network topology

1.3.2.2 Typical Calico Network Structure
Calico uses IP tunnels (IPIP) or VxLAN to encapsulate and decapsulate data, defines ACLs and network policies using iptables, and uses the kernel routing table to implement end-to-end data exchange.
Figure 1-8 shows the principles of the Calico cluster network topology:
1. Physical nodes in a cluster can communicate with each other. Calico uses the BMS network, IPIP, or VxLAN host network overlay mode to implement network plane communication between nodes.
2. In a physical cluster node, Calico maps the network to the container in veth pair mode and manages the container port network information through the routing table and iptables.
3. Between physical cluster nodes, Calico components exchange cluster topology information through the etcd cluster and obtain the routing information of container services in the cluster through BGP, implementing cross-node container network interconnection.

Figure 1-8 Typical Calico network topology

1.3.3 Networking Operation Practice
This section uses a Kubernetes cluster with the OVS and Calico components as an example to describe how to deploy and verify network planes.

1.3.3.1 OVS Component Networking Operations

Prerequisites
1. The Docker and Kubernetes components (kubeadm, kubectl, and kubelet) have been installed on the node to be deployed.
2. The OVS component has been installed on the node to be deployed.
3. The node to be deployed can properly pull Docker images.

Procedure
Step 1 Start the OVS service on all nodes to be deployed.
NOTE
This section uses the default installation path /usr as an example. If the OVS installation path has been changed, modify the commands accordingly. For details, see Open vSwitch Installation Guide (CentOS 7.6).
export PATH=$PATH:/usr/share/openvswitch/scripts
ovs-ctl start
Figure 1-9 shows the startup process. After the startup is complete, check the OVS virtual bridge version information.

Figure 1-9 OVS startup process


Step 2 Set the Kubernetes master node during initialization.
NOTE
In this section, the 10.244.0.0/16 network segment is used as the pod network CIDR and the default gateway is used as the network broadcast address. If you need to specify another network segment, modify the command accordingly.
kubeadm init --pod-network-cidr=10.244.0.0/16
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
After the initialization is complete, the information shown in Figure 1-10 and Figure 1-11 is displayed. Check that the pod information of the Kubernetes cluster is normal and that the node status is NotReady (the node remains NotReady until a network plane is deployed). Back up the kubeadm join command in the output for future use. Then, deploy the network plane.
Figure 1-10 Successful initialization of the active Kubernetes node

Figure 1-11 Initialization status of the active Kubernetes node

Step 3 Edit the deployment YAML file of k-vswitch (OVS Kubernetes component).
Download the k-vswitch.yaml file and edit the clusterCIDR, serviceCIDR, and overlayType fields in the file. The values of clusterCIDR and serviceCIDR must be the same as the IP address segments planned for the Kubernetes cluster. The value of overlayType can be set to vxlan or gre based on user requirements.

Figure 1-12 Modifying the k-vswitch configuration file
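
Before the file is applied, the three edited values can be sanity-checked offline. The Python sketch below uses only the standard ipaddress module. The function name is hypothetical, and the check that the cluster and service CIDRs do not overlap is an extra precaution assumed here, not a rule stated in this guide.

```python
import ipaddress

# Overlay types that the k-vswitch deployment file lists as supported.
SUPPORTED_OVERLAYS = {"vxlan", "gre"}


def check_kvswitch_settings(cluster_cidr, service_cidr, overlay_type):
    """Return a list of problems found; an empty list means the values look consistent."""
    try:
        cluster_net = ipaddress.ip_network(cluster_cidr)
        service_net = ipaddress.ip_network(service_cidr)
    except ValueError as err:
        # Raised for malformed CIDRs and for CIDRs with host bits set.
        return ["invalid CIDR: {}".format(err)]

    problems = []
    if cluster_net.version != service_net.version:
        problems.append("clusterCIDR and serviceCIDR use different IP versions")
    elif cluster_net.overlaps(service_net):
        problems.append("clusterCIDR and serviceCIDR overlap")
    if overlay_type not in SUPPORTED_OVERLAYS:
        problems.append("overlayType must be 'vxlan' or 'gre'")
    return problems


if __name__ == "__main__":
    # Example values taken from the comments in the k-vswitch deployment file.
    print(check_kvswitch_settings("100.96.0.0/11", "100.64.0.0/13", "vxlan"))
```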

Step 4 Deploy cluster network components. Use kubectl to deploy cluster network components.
kubectl apply -f k-vswitch.yaml
After the deployment is complete, the coredns service is in the Running state and the node is in the Ready state, as shown in Figure 1-13 and Figure 1-14.
Figure 1-13 Installing the k-vswitch component

Figure 1-14 k-vswitch component deployment status

Step 5 Add nodes to the cluster.
On other Kubernetes nodes to be deployed, run the kubeadm join command backed up in Step 2 to add the nodes to be deployed to the Kubernetes cluster.
kubeadm join <master-ip:port> --token <your-token> \
    --discovery-token-ca-cert-hash sha256:<your-sha256-ca>
After the cluster nodes are added, "This node has joined the cluster" is displayed, as shown in Figure 1-15. The OVS component networking procedure is complete.

Figure 1-15 Adding nodes on the OVS network

----End
Network Verification
Step 1 Create the Nginx deployment test YAML file (nginx.yaml in this example) with the following content to test the intra-node and cross-node communication capability.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-arm-deployment
spec:
  selector:
    matchLabels:
      app: arm64v8_nginx
  replicas: 5
  template:
    metadata:
      labels:
        app: arm64v8_nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
Step 2 Run the kubectl command to deploy the Nginx service. If information shown in Figure 1-16 is displayed, the Nginx service is running properly and the IP address has been allocated.
kubectl apply -f nginx.yaml
Figure 1-16 OVS networking service deployment test

Step 3 Check that the node routing information and OVS bridge status are normal, as shown in Figure 1-17. The OVS has configured the cluster-wide and node-local port information on the k-vswitch0 bridge. All service ports are mounted to the bridge in veth pair mode. Intra-node and inter-node communication can be implemented through routes and the OVS bridge.

Figure 1-17 Routing and bridge information of OVS networking nodes

----End
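
One more check that can be made during verification is that every pod IP address (for example, as listed by kubectl get pods -o wide) lies within the pod network CIDR passed to kubeadm init. The Python sketch below shows the comparison. The function name and the sample addresses are hypothetical, and the default CIDR is the 10.244.0.0/16 segment used as the example in this section.

```python
import ipaddress


def pods_outside_cidr(pod_ips, pod_cidr="10.244.0.0/16"):
    """Return the pod IP addresses that do not belong to the pod network CIDR."""
    network = ipaddress.ip_network(pod_cidr)
    return [ip for ip in pod_ips if ipaddress.ip_address(ip) not in network]


if __name__ == "__main__":
    # Hypothetical addresses; replace them with the addresses reported by the cluster.
    sample_ips = ["10.244.0.12", "10.244.1.7", "192.168.1.20"]
    print(pods_outside_cidr(sample_ips))  # ['192.168.1.20']
```

An empty result means that all listed pods received addresses from the planned segment.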

1.3.3.2 Calico Component Networking Operations

Prerequisites
1. The Docker and Kubernetes components (kubeadm, kubectl, and kubelet) have been installed on the node to be deployed.
2. The node to be deployed can properly pull Docker images.

Procedure
Step 1 Set the Kubernetes master node during initialization.
NOTE
In this section, the 10.244.0.0/16 network segment is used as the pod network CIDR and the default gateway is used as the network broadcast address. If you need to specify another network segment, modify the command accordingly.
kubeadm init --pod-network-cidr=10.244.0.0/16
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
After the initialization is complete, the information shown in Figure 1-18 and Figure 1-19 is displayed. Check that the pod information of the Kubernetes cluster is normal and that the node status is NotReady (the node remains NotReady until a network plane is deployed). Back up the kubeadm join command in the output for future use. Then, deploy the network plane.

Figure 1-18 Successful initialization of the active Kubernetes node

Figure 1-19 Initialization status of the active Kubernetes node
Step 2 Edit the Calico deployment YAML file.
Download the calico.yaml and calicoctl.yaml deployment files. The BGP IPIP mode recommended by Calico is deployed by default, so you do not need to modify the configuration file. If you need to deploy the Calico component in VxLAN Only mode, see 1.5 Calico Network Plane Management.
Step 3 Deploy cluster network components. Use kubectl to deploy cluster network components.
kubectl apply -f calico.yaml
kubectl apply -f calicoctl.yaml
alias calicoctl="kubectl exec -i -n kube-system calicoctl -- /calicoctl"
After the deployment is complete, the coredns service is in the Running state and the node is in the Ready state, as shown in Figure 1-20 and Figure 1-21.
Figure 1-20 Calico component installation

Figure 1-21 Calico component deployment status

Step 4 Add nodes to the cluster.
On other Kubernetes nodes to be deployed, run the kubeadm join command backed up in Step 1 to add them to the Kubernetes cluster.
kubeadm join <master-ip:port> --token <your-token> \
    --discovery-token-ca-cert-hash sha256:<your-sha256-ca>
After the cluster nodes are added, "This node has joined the cluster" is displayed, as shown in Figure 1-22. The Calico component networking procedure is complete.
Figure 1-22 Adding nodes on the Calico network

----End

Network Verification

Step 1 Create the Nginx deployment test YAML file (nginx.yaml in this example) with the following content to test the intra-node and cross-node communication capability.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-arm-deployment
spec:
  selector:
    matchLabels:
      app: arm64v8_nginx
  replicas: 5
  template:
    metadata:
      labels:
        app: arm64v8_nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

Step 2 Run the kubectl command to deploy the Nginx service. If the information shown in Figure 1-23 is displayed, the Nginx service is running properly and the IP address has been allocated.
kubectl apply -f nginx.yaml

Figure 1-23 Calico networking service deployment test

Step 3 Check that the node routing information and Calico workload status are normal, as shown in Figure 1-24. Calico has allocated addresses from the IP pool to the containers. All service ports are mapped to Calico components in veth pair mode. Intra-node and inter-node communication can be implemented through routes.
Figure 1-24 Routing and workload of Calico network nodes

----End

1.4 Open vSwitch Network Plane Management
For OVS, network plane management means managing the networking mode. The management tasks are as follows:
● Managing the networking mode
  - Configuring the VxLAN network
  - Configuring the GRE tunnel network
For details about the OVS networking mode, see 1.3.3.1 OVS Component Networking Operations. After changing the value of overlayType to gre or vxlan, run the kubectl apply command to apply the configuration file and complete the networking mode configuration.

1.5 Calico Network Plane Management
1.5.1 IP Address Pool Management
For Calico, network plane management covers IP address pool and BGP configuration. The IP address pool management tasks are as follows:
● Managing an IP address pool
  - Viewing IP address pool information
  - Adding an IP address pool
  - Modifying an IP address pool
    ● Changing the IPIP mode
    ● Changing the VxLAN mode
    ● Disabling or enabling an IP address pool
  - Deleting an IP address pool
1.5.1.1 Viewing IP Address Pool Information
calicoctl get ipPool -o wide

1.5.1.2 Adding an IP Address Pool
Step 1 Edit the YAML file of the new IP address pool.
Configure a new IP address pool. Ensure that the name and CIDR of the IP address pool do not conflict with those of existing IP address pools.
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: ippool-vxlan
spec:
  cidr: 192.144.0.0/16
  vxlanMode: Always
  natOutgoing: true
For details about parameter settings, see https://docs.projectcalico.org/reference/resources/ippool.
Step 2 Add an IP address pool.
calicoctl apply -f - < ipPool.yaml

----End
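
The requirement that the new pool's name and CIDR must not conflict with existing pools can be checked before the file is applied. The Python sketch below compares a candidate pool against a dictionary of existing pools, which could be filled in from the output of calicoctl get ipPool -o wide. The function name is hypothetical, and the CIDR shown for default-ipv4-ippool is an assumed example value.

```python
import ipaddress


def find_pool_conflicts(new_name, new_cidr, existing_pools):
    """Return a list of conflicts between a new pool and existing pools.

    existing_pools maps each pool name to its CIDR string.
    """
    conflicts = []
    new_net = ipaddress.ip_network(new_cidr)
    for name, cidr in existing_pools.items():
        net = ipaddress.ip_network(cidr)
        if name == new_name:
            conflicts.append("name '{}' already exists".format(name))
        # overlaps() only accepts networks of the same IP version.
        if net.version == new_net.version and net.overlaps(new_net):
            conflicts.append("CIDR {} overlaps pool '{}' ({})".format(new_cidr, name, cidr))
    return conflicts


if __name__ == "__main__":
    existing = {"default-ipv4-ippool": "10.244.0.0/16"}  # assumed example value
    print(find_pool_conflicts("ippool-vxlan", "192.144.0.0/16", existing))
```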

1.5.1.3 Modifying an IP Address Pool

Changing the IPIP Mode
This section uses changing the IPIP mode of the default-ipv4-ippool IP address pool to CrossSubnet as an example.
calicoctl patch ipPool default-ipv4-ippool -p '{"spec": {"ipipMode": "CrossSubnet"}}'
For details about the parameter settings, see https://docs.projectcalico.org/reference/resources/ippool.
Changing the VxLAN Mode
This section uses changing the VxLAN mode of the ippool-vxlan IP address pool to CrossSubnet as an example.
calicoctl patch ipPool ippool-vxlan -p '{"spec": {"vxlanMode": "CrossSubnet"}}'
For details about the parameter settings, see https://docs.projectcalico.org/reference/resources/ippool.
Disabling or Enabling an IP Address Pool
This section uses the ippool-vxlan IP address pool as an example. The first command disables the IP address pool and the second command enables it again.
calicoctl patch ipPool ippool-vxlan -p '{"spec": {"disabled": true}}'
calicoctl patch ipPool ippool-vxlan -p '{"spec": {"disabled": false}}'
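All the patch commands in this section follow the same pattern: a JSON document with a spec object is passed to calicoctl patch through the -p option. The Python sketch below builds such a command line from keyword arguments, which avoids quoting mistakes when the commands are scripted. The helper name is hypothetical. The sketch only prints the command, because calicoctl is defined as a shell alias in this guide.

```python
import json
import shlex


def ippool_patch_command(pool_name, **spec_fields):
    """Build the argument list for a calicoctl patch command on an IP address pool."""
    # json.dumps renders Python True/False as JSON true/false.
    patch = json.dumps({"spec": spec_fields})
    return ["calicoctl", "patch", "ipPool", pool_name, "-p", patch]


if __name__ == "__main__":
    for fields in ({"vxlanMode": "CrossSubnet"}, {"disabled": True}):
        command = ippool_patch_command("ippool-vxlan", **fields)
        # shlex.join quotes the JSON argument so that it can be pasted into a shell.
        print(shlex.join(command))
```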
1.5.1.4 Deleting an IP Address Pool
This section uses deleting the ippool-vxlan IP address pool as an example. You can use either of the following commands.
calicoctl delete ipPool ippool-vxlan
calicoctl delete -f - < ipPool_vxlan.yaml
1.5.2 Managing BGP Configuration
For Calico, network plane management covers IP address pool and BGP configuration. The BGP configuration management tasks are as follows:
● Managing BGP configuration
  - Adding BGP configuration
  - Modifying BGP configuration
    ● Changing the BGP mode to Full Mesh
    ● Changing the BGP mode to Route Reflector
  - Deleting BGP configuration

1.5.2.1 Checking the BGP Configuration
calicoctl get bgpConfiguration -o wide
By default, new clusters do not contain any BGP configuration and the default Calico BGP configuration is used. To change the BGP configuration, you need to add the BGP configuration of the default domain.

1.5.2.2 Adding BGP Configuration
Step 1 Edit the YAML file for adding BGP configuration.
Configure the new BGP configuration. Ensure that its name and AS number do not conflict with those of existing BGP configurations.
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: true
  asNumber: 63400
  serviceClusterIPs:
  - cidr: 10.244.0.0/16
For details about parameter settings, see https://docs.projectcalico.org/reference/resources/bgpconfig.
Step 2 Add the BGP configuration.
calicoctl apply -f - < bgp.yaml

----End

1.5.2.3 Modifying BGP Configuration

Changing the BGP Mode to Full Mesh
This section uses the default BGP configuration as an example.
calicoctl patch bgpconfiguration default -p '{"spec": {"nodeToNodeMeshEnabled": true}}'

Changing the BGP Mode to Route Reflector
This section uses the default BGP configuration as an example to describe how to configure a node as a route reflector.
Step 1 Disable the full mesh mode between nodes.
After this step is complete, cross-node pod communication is interrupted.
calicoctl patch bgpconfiguration default -p '{"spec": {"nodeToNodeMeshEnabled": false}}'
Step 2 Configure a route reflector node.
kubectl label node <node-name> route-reflector=true
Step 3 Interconnect the route reflector node and common nodes.
Edit the BGPPeer YAML file and use calicoctl to load the file.
kind: BGPPeer
apiVersion: projectcalico.org/v3
metadata:
  name: peer-with-route-reflectors
spec:
  nodeSelector: all()
  peerSelector: route-reflector == 'true'
calicoctl apply -f - < bgpPeer.yaml
After the configuration is complete, check the node status on each node. All nodes managed in the local domain can be viewed on the route reflector node, and only the route reflector node can be viewed on common nodes.

----End
1.5.2.4 Deleting BGP Configuration
This section uses deleting the default BGP configuration as an example. You can use either of the following commands.
calicoctl delete bgpConfiguration default
calicoctl delete -f - < bgp.yaml

1.6 Troubleshooting

Problem 1: "ovs-ctl command not found" During OVS Startup
Symptom
When the ovs-ctl start command is run, the message "ovs-ctl command not found" is displayed.
Possible Causes
The directory that contains the ovs-ctl tool has not been added to the PATH environment variable. For details about how to set the environment variable, see Start with the Default Configurations.
Procedure
1. Check the Open vSwitch installation path, for example, /usr/local/share.
2. Set the environment variable.
export PATH=$PATH:/usr/local/share/openvswitch/scripts
3. Start the OVS components again.
ovs-ctl start
Problem 2: Abnormal Inter-Node Network Communication After Kubernetes OVS Plane Deployment
Symptom
After the OVS plane is deployed on Kubernetes, the displayed network bandwidth is 0 when iperf/qperf services are running.

Possible Causes
By default, the MTU of the network interface in a container is 1500 bytes, the same as that of the host. Packets sent from the container are encapsulated in GRE or VxLAN overlay mode on the host OVS bridge. The encapsulation headers push the packet size above 1500 bytes, so the host fails to send the packets.
Procedure
You are advised to add the network interface MTU configuration to the Docker startup command and set the MTU of the network interface in the container to 1400.
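
The arithmetic behind this advice can be sketched in Python. The overhead values below are the ones quoted in the comments of the Calico deployment file in 1.7.2 (20 bytes for IPIP, 50 bytes for VxLAN). GRE overhead depends on the tunnel options in use, so it is left for the caller to supply. The function and constant names are hypothetical.

```python
# Encapsulation overhead in bytes, as quoted in the Calico deployment file comments.
ENCAP_OVERHEAD = {"ipip": 20, "vxlan": 50}


def max_container_mtu(host_mtu, overhead):
    """Largest container MTU whose encapsulated packets still fit in the host MTU."""
    return host_mtu - overhead


def fits_without_fragmentation(container_mtu, host_mtu, overhead):
    """Return True if a full-size container packet fits in the host MTU after encapsulation."""
    return container_mtu + overhead <= host_mtu


if __name__ == "__main__":
    host_mtu = 1500
    vxlan = ENCAP_OVERHEAD["vxlan"]
    print(max_container_mtu(host_mtu, vxlan))                 # 1450
    print(fits_without_fragmentation(1500, host_mtu, vxlan))  # False
    print(fits_without_fragmentation(1400, host_mtu, vxlan))  # True
```

With a 1500-byte host MTU, the recommended container MTU of 1400 leaves 100 bytes for encapsulation headers, which exceeds the VxLAN overhead listed above.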

Problem 3: Calico-node in the Running Instead of Ready State After Kubernetes Calico Deployment
Symptom
After Calico is deployed on Kubernetes, calico-node is in the Running state but does not become Ready.
Possible Causes
As shown in the following figure, the Docker logs show that dataplane updates are being performed and the resync connection cannot be set up. Querying node information with the calicoctl tool on the master node shows that some nodes have two network planes. By default, the Calico component uses automatic detection (AUTO_DETECTION) to discover the BGP network interface, and on these nodes it incorrectly selects the external network interface. As a result, the nodes cannot communicate with each other.

Procedure
You are advised to use nodeSelector + IP_AUTODETECTION_METHOD to distinguish these nodes and manually identify BGP network interfaces.

Problem 4: Service Node Network Interruption After Kubernetes Calico Deployment
Symptom
After Calico is deployed on Kubernetes, the network of the service node is interrupted.
Possible Causes
Check whether the Hi1822 NIC is used on the network plane.
The Calico component uses IPIP (IP tunnel mode) by default. However, the Hi1822 NIC firmware does not support checksum offload or the TSO function for IP tunnel packets. If checksum offload in the transmit direction and TSO are enabled on the NIC and the system sends IP tunnel packets, the NIC becomes abnormal: the NIC hardware stops fetching packets from the host and sending them to the network side, and the driver reports a transmit timeout.
Specifically, the service network is interrupted and disconnected from the gateway. The service network can be recovered only after the gateway is restarted.
Procedure
Check the NIC first. If the Hi1822 NIC is used, run the ethtool command to disable checksum offload in the transmit direction.
ethtool -K <eth-port> tx off

1.7 References

1.7.1 OVS Kubernetes Deployment YAML File
# Configmap 'k-vswitch' is the only resource in this file that requires
# updates based on your cluster configuration.
#
# clusterCIDR should be updated to the same CIDR configured on your
# Kubernetes components
# serviceCIDR should be updated to the same CIDR configured on your
# Kubernetes components
# overlayType should be updated based on the overlay type you want.
# Currently 'vxlan' and 'gre' are supported. 'gre' is recommended
# but some cloud providers may not allow gre traffic over your network.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: k-vswitch
  namespace: kube-system
data:
  clusterCIDR: "<clusterCIDR>" # change this depending on your cluster, e.g. "100.96.0.0/11"
  serviceCIDR: "<serviceCIDR>" # change this depending on your cluster, e.g. "100.64.0.0/13"
  overlayType: "<overlayType>" # change this depending on your cluster, can be "vxlan" or "gre"
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: vswitchconfigs.kvswitch.io
spec:
  group: kvswitch.io
  version: v1alpha1
  names:
    kind: VSwitchConfig
    plural: vswitchconfigs
  scope: Cluster
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: k-vswitch
  namespace: kube-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: k-vswitch
rules:
- apiGroups:
  - ""
  resources:
  - services
  - nodes
  - endpoints
  - pods
  - namespaces
  verbs:
  - list
  - get
  - watch
- apiGroups:
  - "networking.k8s.io"
  resources:
  - "networkpolicies"
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "kvswitch.io"
  resources:
  - vswitchconfigs
  verbs:
  - list
  - get
  - watch
  - create
  - update
  - patch
  - delete
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: k-vswitch
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: k-vswitch
subjects:
- kind: ServiceAccount
  name: k-vswitch
  namespace: kube-system
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: k-vswitch-controller
  namespace: kube-system
spec:
  replicas: 1
  updateStrategy:
    type: RollingUpdate
  serviceName: k-vswitch-controller
  selector:
    matchLabels:
      k8s-app: k-vswitch-controller
  template:
    metadata:
      labels:
        k8s-app: k-vswitch-controller
    spec:
      tolerations:
      - key: "node-role.kubernetes.io/master"
        effect: NoSchedule
      hostNetwork: true
      serviceAccountName: k-vswitch
      containers:
      - name: k-vswitch-controller
        image: kvswitch/k-vswitch:latest
        imagePullPolicy: IfNotPresent
        command:
        - "/bin/k-vswitch-controller"
        - "--cluster-cidr=$(K_VSWITCH_CLUSTER_CIDR)"
        - "--service-cidr=$(K_VSWITCH_SERVICE_CIDR)"
        - "--overlay-type=$(K_VSWITCH_OVERLAY_TYPE)"
        env:
        - name: K_VSWITCH_CLUSTER_CIDR
          valueFrom:
            configMapKeyRef:
              name: k-vswitch
              key: clusterCIDR
        - name: K_VSWITCH_SERVICE_CIDR
          valueFrom:
            configMapKeyRef:
              name: k-vswitch
              key: serviceCIDR
        - name: K_VSWITCH_OVERLAY_TYPE
          valueFrom:
            configMapKeyRef:
              name: k-vswitch
              key: overlayType
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: k-vswitchd
  namespace: kube-system
  labels:
    k8s-app: k-vswitchd
spec:
  selector:
    matchLabels:
      k8s-app: k-vswitchd
  template:
    metadata:
      labels:
        k8s-app: k-vswitchd
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: k-vswitch
      containers:
      - name: k-vswitchd
        image: kvswitch/k-vswitch:latest
        imagePullPolicy: IfNotPresent
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /etc/cni/net.d
          name: cni-conf
        - mountPath: /etc/openvswitch
          name: ovs-etc
        - mountPath: /var/run/openvswitch
          name: ovs-run
        - mountPath: /var/log/openvswitch
          name: ovs-log
        - mountPath: /lib/modules
          name: lib-modules
      initContainers:
      - name: install-cni
        image: kvswitch/k-vswitch:latest
        imagePullPolicy: IfNotPresent
        command:
        - /bin/sh
        - -c
        - |
          set -e -x;
          cp /bin/k-vswitch-cni /opt/cni/bin/
        volumeMounts:
        - mountPath: /opt/cni/bin
          name: cni-bin-dir
      hostNetwork: true
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoSchedule
        key: node.kubernetes.io/not-ready
        operator: Exists
      volumes:
      - name: cni-bin-dir
        hostPath:
          path: /opt/cni/bin
      - name: cni-conf
        hostPath:
          path: /etc/cni/net.d
      - name: ovs-run
        hostPath:
          path: /var/run/openvswitch
      - name: ovs-etc
        hostPath:
          path: /etc/openvswitch
      - name: ovs-log
        hostPath:
          path: /var/log/openvswitch
      - name: lib-modules
        hostPath:
          path: /lib/modules
1.7.2 Calico Deployment YAML File
--# Source: calico/templates/calico-config.yaml # This ConfigMap is used to configure a self-hosted Calico installation. kind: ConfigMap apiVersion: v1 metadata: name: calico-config namespace: kube-system data: # Typha is disabled. typha_service_name: "none" # Configure the backend to use. calico_backend: "bird" # Configure the MTU to use for workload interfaces and the # tunnels. For IPIP, set to your network MTU - 20; for VXLAN # set to your network MTU - 50. veth_mtu: "1440"
# The CNI network configuration to install on each node. The special # values in this config will be automatically populated. cni_network_config: |{ "name": "k8s-pod-network", "cniVersion": "0.3.1", "plugins": [ { "type": "calico", "log_level": "info", "datastore_type": "kubernetes", "nodename": "__KUBERNETES_NODE_NAME__", "mtu": __CNI_MTU__, "ipam": { "type": "calico-ipam" }, "policy": { "type": "k8s" }, "kubernetes": { "kubeconfig": "__KUBECONFIG_FILEPATH__" } }, { "type": "portmap", "snat": true, "capabilities": {"portMappings": true} }, { "type": "bandwidth", "capabilities": {"bandwidth": true} } ] }
--# Source: calico/templates/kdd-crds.yaml
apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: bgpconfigurations.crd.projectcalico.org spec: scope: Cluster

Issue 06 (2021-03-23)

Copyright © Huawei Technologies Co., Ltd.

29

Kunpeng BoostKit for Virtualization User Guides

1 Large-Scale Container Networking User Guide

group: crd.projectcalico.org version: v1 names: kind: BGPConfiguration plural: bgpconfigurations singular: bgpconfiguration
--apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: bgppeers.crd.projectcalico.org spec: scope: Cluster group: crd.projectcalico.org version: v1 names: kind: BGPPeer plural: bgppeers singular: bgppeer
--apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: blockaffinities.crd.projectcalico.org spec: scope: Cluster group: crd.projectcalico.org version: v1 names: kind: BlockAffinity plural: blockaffinities singular: blockaffinity
--apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: clusterinformations.crd.projectcalico.org spec: scope: Cluster group: crd.projectcalico.org version: v1 names: kind: ClusterInformation plural: clusterinformations singular: clusterinformation
--apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: felixconfigurations.crd.projectcalico.org spec: scope: Cluster group: crd.projectcalico.org version: v1 names: kind: FelixConfiguration plural: felixconfigurations singular: felixconfiguration
--apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: globalnetworkpolicies.crd.projectcalico.org spec:


  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: GlobalNetworkPolicy
    plural: globalnetworkpolicies
    singular: globalnetworkpolicy
    shortNames:
    - gnp
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: globalnetworksets.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: GlobalNetworkSet
    plural: globalnetworksets
    singular: globalnetworkset
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: hostendpoints.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: HostEndpoint
    plural: hostendpoints
    singular: hostendpoint
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ipamblocks.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: IPAMBlock
    plural: ipamblocks
    singular: ipamblock
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ipamconfigs.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: IPAMConfig
    plural: ipamconfigs
    singular: ipamconfig
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition


metadata:
  name: ipamhandles.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: IPAMHandle
    plural: ipamhandles
    singular: ipamhandle
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ippools.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: IPPool
    plural: ippools
    singular: ippool
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: kubecontrollersconfigurations.crd.projectcalico.org
spec:
  scope: Cluster
  group: crd.projectcalico.org
  version: v1
  names:
    kind: KubeControllersConfiguration
    plural: kubecontrollersconfigurations
    singular: kubecontrollersconfiguration
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: networkpolicies.crd.projectcalico.org
spec:
  scope: Namespaced
  group: crd.projectcalico.org
  version: v1
  names:
    kind: NetworkPolicy
    plural: networkpolicies
    singular: networkpolicy
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: networksets.crd.projectcalico.org
spec:
  scope: Namespaced
  group: crd.projectcalico.org
  version: v1
  names:
    kind: NetworkSet
    plural: networksets
    singular: networkset
---
---
# Source: calico/templates/rbac.yaml


# Include a clusterrole for the kube-controllers component,
# and bind it to the calico-kube-controllers serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-kube-controllers
rules:
  # Nodes are watched to monitor for deletions.
  - apiGroups: [""]
    resources:
      - nodes
    verbs:
      - watch
      - list
      - get
  # Pods are queried to check for existence.
  - apiGroups: [""]
    resources:
      - pods
    verbs:
      - get
  # IPAM resources are manipulated when nodes are deleted.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - ippools
    verbs:
      - list
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - blockaffinities
      - ipamblocks
      - ipamhandles
    verbs:
      - get
      - list
      - create
      - update
      - delete
  # kube-controllers manages hostendpoints.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - hostendpoints
    verbs:
      - get
      - list
      - create
      - update
      - delete
  # Needs access to update clusterinformations.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - clusterinformations
    verbs:
      - get
      - create
      - update
  # KubeControllersConfiguration is where it gets its config
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - kubecontrollersconfigurations
    verbs:
      # read its own config
      - get
      # create a default if none exists
      - create
      # update status
      - update
      # watch for changes


      - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-kube-controllers
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-kube-controllers
subjects:
- kind: ServiceAccount
  name: calico-kube-controllers
  namespace: kube-system
---
# Include a clusterrole for the calico-node DaemonSet,
# and bind it to the calico-node serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-node
rules:
  # The CNI plugin needs to get pods, nodes, and namespaces.
  - apiGroups: [""]
    resources:
      - pods
      - nodes
      - namespaces
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - endpoints
      - services
    verbs:
      # Used to discover service IPs for advertisement.
      - watch
      - list
      # Used to discover Typhas.
      - get
  # Pod CIDR auto-detection on kubeadm needs access to config maps.
  - apiGroups: [""]
    resources:
      - configmaps
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - nodes/status
    verbs:
      # Needed for clearing NodeNetworkUnavailable flag.
      - patch
      # Calico stores some configuration information in node annotations.
      - update
  # Watch for changes to Kubernetes NetworkPolicies.
  - apiGroups: ["networking.k8s.io"]
    resources:
      - networkpolicies
    verbs:
      - watch
      - list
  # Used by Calico for policy information.
  - apiGroups: [""]
    resources:
      - pods
      - namespaces
      - serviceaccounts
    verbs:
      - list


      - watch
  # The CNI plugin patches pods/status.
  - apiGroups: [""]
    resources:
      - pods/status
    verbs:
      - patch
  # Calico monitors various CRDs for config.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - globalfelixconfigs
      - felixconfigurations
      - bgppeers
      - globalbgpconfigs
      - bgpconfigurations
      - ippools
      - ipamblocks
      - globalnetworkpolicies
      - globalnetworksets
      - networkpolicies
      - networksets
      - clusterinformations
      - hostendpoints
      - blockaffinities
    verbs:
      - get
      - list
      - watch
  # Calico must create and update some CRDs on startup.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - ippools
      - felixconfigurations
      - clusterinformations
    verbs:
      - create
      - update
  # Calico stores some configuration information on the node.
  - apiGroups: [""]
    resources:
      - nodes
    verbs:
      - get
      - list
      - watch
  # These permissions are only required for upgrade from v2.6, and can
  # be removed after upgrade or on fresh installations.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - bgpconfigurations
      - bgppeers
    verbs:
      - create
      - update
  # These permissions are required for Calico CNI to perform IPAM allocations.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - blockaffinities
      - ipamblocks
      - ipamhandles
    verbs:
      - get
      - list
      - create
      - update
      - delete
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - ipamconfigs


    verbs:
      - get
  # Block affinities must also be watchable by confd for route aggregation.
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - blockaffinities
    verbs:
      - watch
  # The Calico IPAM migration needs to get daemonsets. These permissions can be
  # removed if not upgrading from an installation using host-local IPAM.
  - apiGroups: ["apps"]
    resources:
      - daemonsets
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: calico-node
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-node
subjects:
- kind: ServiceAccount
  name: calico-node
  namespace: kube-system
---
# Source: calico/templates/calico-node.yaml
# This manifest installs the calico-node container, as well
# as the CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: calico-node
  namespace: kube-system
  labels:
    k8s-app: calico-node
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: calico-node
      annotations:
        # This, along with the CriticalAddonsOnly toleration below,
        # marks the pod as a critical add-on, ensuring it gets
        # priority scheduling and that its resources are reserved
        # if it ever gets evicted.
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      hostNetwork: true
      tolerations:
        # Make sure calico-node gets scheduled on all nodes.
        - effect: NoSchedule
          operator: Exists
        # Mark the pod as a critical add-on for rescheduling.


        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoExecute
          operator: Exists
      serviceAccountName: calico-node
      # Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
      # deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
      terminationGracePeriodSeconds: 0
      priorityClassName: system-node-critical
      initContainers:
        # This container performs upgrade from host-local IPAM to calico-ipam.
        # It can be deleted if this is a fresh installation, or if you have already
        # upgraded to use calico-ipam.
        - name: upgrade-ipam
          image: calico/cni:v3.13.1
          command: ["/opt/cni/bin/calico-ipam", "-upgrade"]
          env:
            - name: KUBERNETES_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
          volumeMounts:
            - mountPath: /var/lib/cni/networks
              name: host-local-net-dir
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
          securityContext:
            privileged: true
        # This container installs the CNI binaries
        # and CNI network config file on each node.
        - name: install-cni
          image: calico/cni:v3.13.1
          command: ["/install-cni.sh"]
          env:
            # Name of the CNI config file to create.
            - name: CNI_CONF_NAME
              value: "10-calico.conflist"
            # The CNI network config to install on each node.
            - name: CNI_NETWORK_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: cni_network_config
            # Set the hostname based on the k8s node name.
            - name: KUBERNETES_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # CNI MTU Config variable
            - name: CNI_MTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # Prevents the container from sleeping forever.
            - name: SLEEP
              value: "false"
          volumeMounts:
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
            - mountPath: /host/etc/cni/net.d
              name: cni-net-dir
          securityContext:
            privileged: true


        # Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
        # to communicate with Felix over the Policy Sync API.
        - name: flexvol-driver
          image: calico/pod2daemon-flexvol:v3.13.1
          volumeMounts:
            - name: flexvol-driver-host
              mountPath: /host/driver
          securityContext:
            privileged: true
      containers:
        # Runs calico-node container on each Kubernetes node. This
        # container programs network policy and routes on each
        # host.
        - name: calico-node
          image: calico/node:v3.13.1
          env:
            # Use Kubernetes API as the backing datastore.
            - name: DATASTORE_TYPE
              value: "kubernetes"
            # Wait for the datastore.
            - name: WAIT_FOR_DATASTORE
              value: "true"
            # Set based on the k8s node name.
            - name: NODENAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # Choose the backend to use.
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Always"
            # Enable or Disable VXLAN on the default IP pool.
            - name: CALICO_IPV4POOL_VXLAN
              value: "Never"
            # Set MTU for tunnel device used if ipip is enabled
            - name: FELIX_IPINIPMTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # Set MTU for the VXLAN tunnel device.
            - name: FELIX_VXLANMTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # The default IPv4 pool to create on startup if none exists. Pod IPs will be
            # chosen from this range. Changing this value after installation will have
            # no effect. This should fall within `--cluster-cidr`.
            # - name: CALICO_IPV4POOL_CIDR
            #   value: "192.168.0.0/16"
            # Disable file logging so `kubectl logs` works.
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
            # Set Felix endpoint to host default action to ACCEPT.
            - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
              value: "ACCEPT"
            # Disable IPv6 on Kubernetes.


            - name: FELIX_IPV6SUPPORT
              value: "false"
            # Set Felix logging to "info"
            - name: FELIX_LOGSEVERITYSCREEN
              value: "info"
            - name: FELIX_HEALTHENABLED
              value: "true"
          securityContext:
            privileged: true
          resources:
            requests:
              cpu: 250m
          livenessProbe:
            exec:
              command:
                - /bin/calico-node
                - -felix-live
                - -bird-live
            periodSeconds: 10
            initialDelaySeconds: 10
            failureThreshold: 6
          readinessProbe:
            exec:
              command:
                - /bin/calico-node
                - -felix-ready
                - -bird-ready
            periodSeconds: 10
          volumeMounts:
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: true
            - mountPath: /run/xtables.lock
              name: xtables-lock
              readOnly: false
            - mountPath: /var/run/calico
              name: var-run-calico
              readOnly: false
            - mountPath: /var/lib/calico
              name: var-lib-calico
              readOnly: false
            - name: policysync
              mountPath: /var/run/nodeagent
      volumes:
        # Used by calico-node.
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: var-run-calico
          hostPath:
            path: /var/run/calico
        - name: var-lib-calico
          hostPath:
            path: /var/lib/calico
        - name: xtables-lock
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
        # Used to install CNI.
        - name: cni-bin-dir
          hostPath:
            path: /opt/cni/bin
        - name: cni-net-dir
          hostPath:
            path: /etc/cni/net.d
        # Mount in the directory for host-local IPAM allocations. This is
        # used when upgrading from host-local to calico-ipam, and can be removed
        # if not using the upgrade-ipam init container.
        - name: host-local-net-dir


          hostPath:
            path: /var/lib/cni/networks
        # Used to create per-pod Unix Domain Sockets
        - name: policysync
          hostPath:
            type: DirectoryOrCreate
            path: /var/run/nodeagent
        # Used to install Flex Volume Driver
        - name: flexvol-driver-host
          hostPath:
            type: DirectoryOrCreate
            path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-node
  namespace: kube-system
---
# Source: calico/templates/calico-kube-controllers.yaml
# See https://github.com/projectcalico/kube-controllers
apiVersion: apps/v1
kind: Deployment
metadata:
  name: calico-kube-controllers
  namespace: kube-system
  labels:
    k8s-app: calico-kube-controllers
spec:
  # The controllers can only have a single active instance.
  replicas: 1
  selector:
    matchLabels:
      k8s-app: calico-kube-controllers
  strategy:
    type: Recreate
  template:
    metadata:
      name: calico-kube-controllers
      namespace: kube-system
      labels:
        k8s-app: calico-kube-controllers
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      tolerations:
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      serviceAccountName: calico-kube-controllers
      priorityClassName: system-cluster-critical
      containers:
        - name: calico-kube-controllers
          image: calico/kube-controllers:v3.13.1
          env:
            # Choose which controllers to run.
            - name: ENABLED_CONTROLLERS
              value: node
            - name: DATASTORE_TYPE
              value: kubernetes
          readinessProbe:
            exec:
              command:


                - /usr/bin/check-status
                - -r
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-kube-controllers
  namespace: kube-system
---
# Source: calico/templates/calico-etcd-secrets.yaml
---
# Source: calico/templates/calico-typha.yaml
---
# Source: calico/templates/configure-canal.yaml
1.7.3 Calicoctl Deployment YAML File
# Calico Version v3.13.1
# https://docs.projectcalico.org/releases#v3.13.1
# This manifest includes the following component versions:
#   calico/ctl:v3.13.1
apiVersion: v1
kind: ServiceAccount
metadata:
  name: calicoctl
  namespace: kube-system
---
apiVersion: v1
kind: Pod
metadata:
  name: calicoctl
  namespace: kube-system
spec:
  nodeSelector:
    kubernetes.io/os: linux
  hostNetwork: true
  serviceAccountName: calicoctl
  containers:
    - name: calicoctl
      image: calico/ctl:v3.13.1
      command: ["/bin/sh", "-c", "while true; do sleep 3600; done"]
      env:
        - name: DATASTORE_TYPE
          value: kubernetes
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: calicoctl
rules:
  - apiGroups: [""]
    resources:
      - namespaces
      - nodes
    verbs:
      - get
      - list
      - update
  - apiGroups: [""]


    resources:
      - nodes/status
    verbs:
      - update
  - apiGroups: [""]
    resources:
      - pods
      - serviceaccounts
    verbs:
      - get
      - list
  - apiGroups: [""]
    resources:
      - pods/status
    verbs:
      - update
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - bgppeers
      - bgpconfigurations
      - clusterinformations
      - felixconfigurations
      - globalnetworkpolicies
      - globalnetworksets
      - ippools
      - kubecontrollersconfigurations
      - networkpolicies
      - networksets
      - hostendpoints
      - ipamblocks
      - blockaffinities
      - ipamhandles
      - ipamconfigs
    verbs:
      - create
      - get
      - list
      - update
      - delete
  - apiGroups: ["networking.k8s.io"]
    resources:
      - networkpolicies
    verbs:
      - get
      - list
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: calicoctl
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calicoctl
subjects:
- kind: ServiceAccount
  name: calicoctl
  namespace: kube-system


2 Kube-OVN User Guide

2.1 Introduction
2.2 Quick Start
2.3 Service Deployment
2.4 Subnet Management
2.5 O&M Operations
2.6 Reference

2.1 Introduction

2.1.1 Kube-OVN Overview
Kube-OVN integrates OVN-based network virtualization with Kubernetes (OVN is short for Open Virtual Network). It provides a feature-rich, easy-to-operate container network fabric for enterprises. The software is open source and licensed under Apache 2.0. Kube-OVN 1.2.0 officially supports the ARM64 architecture and can run natively on TaiShan servers.
Kube-OVN supports the following functions:
● Binding subnets to namespaces: Each namespace can have a unique subnet (backed by an independent logical switch). Pods within a namespace have IP addresses allocated from that subnet. Multiple namespaces can share a subnet.
● Subnet isolation: Kube-OVN allows you to configure a subnet to deny any traffic from source IP addresses that are not within the same subnet. You can also filter traffic based on a whitelist of IP addresses or IP address segments.
● Network policies: Kube-OVN implements Kubernetes network policies by using high-performance OVN access control lists (ACLs).
● Static IP addresses for workloads: Random or static IP addresses are allocated to workloads.
● Static IP addresses for pods: Static IP addresses and MAC addresses are allocated to pods.


● IP address reuse supported by StatefulSets: Within the lifecycle of a StatefulSet, pods reuse IP addresses by name. (The IP addresses can be randomly allocated the first time and remain fixed within the lifecycle.)
● Multi-NIC IP address management (IPAM): Kube-OVN provides cluster-wide IPAM for container network interface (CNI) plugins other than Kube-OVN, such as macvlan, vlan, and host-device, so that they can use the subnet and static IP address allocation functions of Kube-OVN.
● Dynamic and bidirectional QoS: Bidirectional bandwidth QoS management is supported, and the ingress/egress bandwidths of pods can be dynamically changed.
● Embedded load balancers: The high-performance distributed L2 load balancer embedded in OVN is used to replace kube-proxy.
● Distributed gateways: Each node can function as a gateway to provide external network connections.
● Namespaced gateways: Each namespace can have a dedicated gateway for outbound traffic to the external network.
● Direct connection to the external network: The IP addresses of pods are directly exposed to the external network. You can add a static route to an external router to divert the traffic of the container network segments to any host in the cluster.
● BGP support: The IP addresses of pods can be advertised to external networks through the Border Gateway Protocol (BGP).
● Traffic mirror: Traffic between containers can be duplicated to facilitate monitoring, auditing, and fault diagnosis.
● VLAN support: The underlay virtual local area network (VLAN) mode is supported to obtain better performance and throughput.
● DPDK support: Open vSwitch with the Data Plane Development Kit (OVS-DPDK) can run in pods.
● IPv6 support: Pods can be deployed in IPv6-only mode.
● Problem locating tool: Tools are provided to diagnose, monitor, and dump container network traffic when locating network problems.
2.1.2 Concepts

Kubernetes
Kubernetes is an open-source container orchestration engine originally developed by Google. It supports automatic deployment, large-scale scaling, and management of containerized applications. Kubernetes classifies network scenarios into the following types: communication between containers in the same pod, communication between pods, communication between pods and services, and communication between systems outside the cluster and services.
Pod and service resource objects use their own dedicated networks. The pod network is implemented through Kubernetes network plugin configuration (CNI model). The service network is specified by the Kubernetes cluster. The overall network model is implemented using external plugins. Therefore, you need to plan the network model and network deployment before deploying the Kubernetes cluster.


Container Network Model (CNM)
CNM was proposed by Docker and adopted by the Libnetwork project as the network model standard. It is used by common open-source network components such as Kuryr, OVN, Calico, and Weave for container network interconnection.
As shown in Figure 2-1, Libnetwork, the implementation of CNM, provides interfaces between Docker daemons and network drivers. The network controller matches drivers with networks. Each driver manages the networks it owns and provides services such as IPAM for them. CNM drivers can be native drivers (built into Libnetwork or natively supported by Docker) or third-party plugin drivers. The native drivers include None, Bridge, Overlay, and MACvlan. Third-party drivers provide more functions. In addition, the scope of a CNM driver can be defined as either local (single-host mode) or global (multi-host mode).

Figure 2-1 CNM drivers

Containers are connected through a series of network endpoints, as shown in Figure 2-2. A typical network interface exists in the form of a veth pair. One end of the veth pair is in the network sandbox of the container, and the other end is in the specified network. One network endpoint is added to only one network plane. Multiple network endpoints can exist in the network sandbox of a container.

Figure 2-2 CNM interfaces


Container Network Interface (CNI)
CNI was proposed by CoreOS and adopted by Apache Mesos, Cloud Foundry, Kubernetes, Kurma, and rkt as the network model standard. CNI is used by common open-source network components such as Contiv Networking, Calico, and Weave for container network interconnection.
As shown in Figure 2-3, CNI is based on a minimal specification, which enables network developers to implement the protocol between containers and network plugins in a simple way.
Multiple network plugins can be attached to a container so that the container can connect to different network planes driven by different plugins. The network is defined in a JSON configuration file and is instantiated as a new namespace when CNI plugins are called.
Figure 2-3 CNI drivers

Open vSwitch
Open vSwitch (OVS) is a multilayer software switch released under the open-source Apache 2.0 license. It aims to provide a production-quality switching platform that supports standard management interfaces and opens the forwarding functions to programmatic extension and control. OVN is a native network virtualization solution provided by OVS. It uses existing OVS functions to implement large-scale and high-quality cluster management.

Kube-OVN network topology
Figure 2-4 shows the switches, cluster routers, and firewalls of Kube-OVN. They are deployed on all nodes in the cluster in distributed mode, and there is no single point of failure in the network topology of the cluster.

Figure 2-4 Kube-OVN network topology

2.1.3 Application Scenarios
When using the Kubernetes container orchestration engine to deploy services, you need to use the open-source Kube-OVN component to plan and deploy network planes. This document provides a Quick Start and detailed guidance for component deployment.
2.1.4 Accessing and Using Kube-OVN
You can access and use Kube-OVN by using the following tools during deployment and O&M:
● Plugin management tool provided by Kube-OVN, that is, kubectl ko
● Kubernetes management tool, that is, kubectl, which enables you to indirectly access, manage, and configure network components

2.2 Quick Start

2.2.1 Basic Installation

2.2.1.1 Environment Requirements

OS and Software Versions
Kube-OVN is a CNI-compliant network component that runs in the Kubernetes environment. Table 2-1 lists the supported OS and software versions.

Table 2-1 Environment requirements

Item        Name              Version
OS          CentOS            7.5 or later
            Ubuntu            16.04 or later
Kernel      Kernel            3.10.0-898 or later
Docker      Docker Community  1.12.6 or later
            Docker CE         18.09.0 or later
Kubernetes  Kubernetes        1.11 or later

Network Plugins
Kube-OVN contains all the required CNI plugins. You do not need to install the CNI plugin separately.
To prevent the network from being affected by the rules and routes of other network plugins, you are advised to install Kube-OVN in a Kubernetes cluster that has no other network plugins installed.
If other network plugins have been installed in the cluster, delete them and the corresponding network configurations such as NICs, iptables rules, and routing rules. Ensure that no configuration files of other network plugins exist in the /etc/cni/net.d/ directory.
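The check for leftover configuration files can be scripted. The following is a minimal sketch, assuming a POSIX shell. The function name list_cni_configs is hypothetical and is not part of Kube-OVN. It prints every regular file found in the given directory (by default /etc/cni/net.d) and returns a non-zero status if at least one file exists.

```shell
#!/bin/sh
# list_cni_configs is a hypothetical helper, not a Kube-OVN tool.
# It prints the regular files in the CNI configuration directory and
# returns 1 if any file exists, or 0 if the directory is empty or missing.
list_cni_configs() {
    dir="${1:-/etc/cni/net.d}"
    found=0
    # A missing directory means that no plugin has written a configuration.
    [ -d "$dir" ] || return 0
    for f in "$dir"/*; do
        # Skips subdirectories and the unexpanded pattern of an empty directory.
        [ -f "$f" ] || continue
        echo "$f"
        found=1
    done
    return "$found"
}
```

For example, run list_cni_configs || echo "Remove the files listed above before installing Kube-OVN" on each node. The sketch only lists files. It does not remove NICs, iptables rules, or routes left by other plugins.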

2.2.1.2 One-Click Installation (Recommended)

Prerequisites
1. The Docker and Kubernetes components (kubeadm, kubectl, and kubelet) have been installed on the node to be deployed.
2. The node to be deployed can properly pull Docker images.

Procedure
Step 1 Initialize the Kubernetes master node.
NOTE
In this section, the 10.16.0.0/16 network segment is used as the pod network classless inter-domain routing (CIDR) block and the default gateway is used as the network broadcast address. If you need to specify another network segment, modify the commands accordingly.
kubeadm init --pod-network-cidr=10.16.0.0/16
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
After the initialization is complete, the information shown in Figure 2-5 and Figure 2-6 is displayed. Check whether the pod information of the Kubernetes cluster is normal. The node status is expected to be NotReady at this stage because the network plane has not been deployed yet. Back up the kubeadm join command in the output for future use. Then, deploy the network plane.

Figure 2-5 Successful initialization of the master Kubernetes nodes

Figure 2-6 Initialization status of the master Kubernetes nodes
Step 2 Obtain and edit the one-click installation script.
For details about how to obtain the script, see One-Click Installation Script.
Download the Kube-OVN one-click installation script install.sh to the Kunpeng server and edit it. To run Kube-OVN on the ARM64 platform, add the suffix -arm to the value of the VERSION field. Set the following variables in the file based on site requirements. In particular, set NETWORK_TYPE to geneve or vlan.


CAUTION
The Kube-OVN version and features are evolving rapidly. Ensure that the version of the one-click installation script matches the version of the Kube-OVN images. Otherwise, Kube-OVN may fail to be deployed or started in one-click mode.

REGISTRY="kubeovn"
NAMESPACE="kube-system"                 # The ns to deploy kube-ovn
POD_CIDR="10.16.0.0/16"                 # Do NOT overlap with NODE/SVC/JOIN CIDR
SVC_CIDR="10.96.0.0/12"                 # Do NOT overlap with NODE/POD/JOIN CIDR
JOIN_CIDR="100.64.0.0/16"               # Do NOT overlap with NODE/POD/SVC CIDR
LABEL="node-role.kubernetes.io/master"  # The node label to deploy OVN DB
IFACE=""                                # The nic to support container network, if empty will use the nic that the default route use
NETWORK_TYPE="geneve"                   # geneve or vlan
VERSION="v1.2.1-arm"
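The "do NOT overlap" constraints on the CIDR variables can be checked before the script is run. The following is a minimal sketch for IPv4 only, assuming a POSIX shell with 64-bit arithmetic. The helper names ip_to_int, cidr_range, and cidrs_overlap are hypothetical and are not part of install.sh. The node CIDR is not covered because it does not appear as a variable in the script.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of install.sh.

# ip_to_int A.B.C.D: prints the IPv4 address as an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# cidr_range A.B.C.D/N: prints the first and last address of the block as integers.
cidr_range() {
    addr=${1%/*}
    bits=${1#*/}
    base=$(ip_to_int "$addr")
    size=$(( 1 << (32 - bits) ))
    first=$(( base / size * size ))   # clear the host bits
    echo "$first $(( first + size - 1 ))"
}

# cidrs_overlap CIDR1 CIDR2: returns 0 if the two blocks share any address.
cidrs_overlap() {
    set -- $(cidr_range "$1") $(cidr_range "$2")
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Values copied from the variables above.
POD_CIDR="10.16.0.0/16"
SVC_CIDR="10.96.0.0/12"
JOIN_CIDR="100.64.0.0/16"
for pair in "$POD_CIDR $SVC_CIDR" "$POD_CIDR $JOIN_CIDR" "$SVC_CIDR $JOIN_CIDR"; do
    set -- $pair
    if cidrs_overlap "$1" "$2"; then
        echo "overlap: $1 and $2"
    fi
done
```

With the values shown above, the loop prints nothing because the three blocks are disjoint. If you change POD_CIDR, remember to pass the same value to kubeadm init --pod-network-cidr in Step 1.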

Step 3 Use the install.sh script to deploy cluster network components.
bash install.sh

After the deployment is complete, the coredns service is in the Running state and nodes are in the Ready state, as shown in Figure 2-7 and Figure 2-8.

Figure 2-7 Pods after Kube-OVN is deployed

Figure 2-8 Node status after Kube-OVN is deployed
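The node status check in this step can also be scripted. The following is a minimal sketch, assuming the default column layout of kubectl get nodes (NAME, STATUS, ROLES, AGE, VERSION). The function name not_ready_nodes is hypothetical.

```shell
#!/bin/sh
# not_ready_nodes is a hypothetical helper. It reads the output of
# `kubectl get nodes --no-headers` on standard input and prints the name of
# every node whose STATUS column is not exactly "Ready". A cordoned node
# ("Ready,SchedulingDisabled") is therefore reported as well.
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}
```

For example, run kubectl get nodes --no-headers | not_ready_nodes on the master node. Empty output means that all nodes report Ready.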
Step 4 Add nodes to the cluster.
On each of the other Kubernetes nodes to be deployed, run the kubeadm join command backed up in Step 1 to add the node to the Kubernetes cluster.
kubeadm join <master-ip:port> --token <your-token> \
    --discovery-token-ca-cert-hash sha256:<your-sha256-ca>
After the cluster nodes are added, "This node has joined the cluster" is displayed, as shown in Figure 2-9. The Kube-OVN networking procedure is complete.

Figure 2-9 Successful adding of cluster nodes


----End
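The comments in the installation script require that POD_CIDR, SVC_CIDR, JOIN_CIDR, and the node network do not overlap. The following is a minimal sketch of how such a check could be done before running install.sh, using Python's standard ipaddress module. The function name find_overlaps and the NODE_CIDR value are hypothetical and used only for illustration; replace NODE_CIDR with the actual host network.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs):
    """Return the pairs of names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]


# POD/SVC/JOIN values are the script defaults; NODE_CIDR is a hypothetical host network.
cidrs = {
    "POD_CIDR": "10.16.0.0/16",
    "SVC_CIDR": "10.96.0.0/12",
    "JOIN_CIDR": "100.64.0.0/16",
    "NODE_CIDR": "192.168.1.0/24",
}

print(find_overlaps(cidrs))  # an empty list means that no two blocks overlap
```

If the returned list is not empty, adjust the corresponding variables in the script before deployment.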
2.2.1.3 Manual Installation
Prerequisites
1. The Docker and Kubernetes components (kubeadm, kubectl, and kubelet) have been installed on the node to be deployed.
2. The node to be deployed can properly pull Docker images.
Procedure
Step 1 Initialize the master Kubernetes node.
NOTE
In this section, the 10.16.0.0/16 network segment is used as the pod network CIDR and the default gateway is used as the network broadcast address. If you need to specify another network segment, modify the commands accordingly.
kubeadm init --pod-network-cidr=10.16.0.0/16
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
After the initialization is complete, the information shown in Figure 2-10 and Figure 2-11 is displayed. Check that the pod information of the Kubernetes cluster is normal. The node status is NotReady at this stage because the network plane has not been deployed yet. Back up the kubeadm join command in the output for future use. Then, deploy the network plane.
Figure 2-10 Successful initialization of the master Kubernetes nodes


Figure 2-11 Initialization status of the master Kubernetes nodes

Step 2 Deploy custom resource definition (CRD) resources.
Kube-OVN creates three types of CRD resources, that is, subnets, IP addresses, and VLANs, to facilitate network management. Create the crd.yaml file by referring to YAML File for CRD Deployment. Then, run the kubectl command to create CRD resources.
kubectl apply -f crd.yaml
After the execution is complete, run the kubectl get crd command to view the result. The information shown in Figure 2-12 is displayed.
Figure 2-12 Successful deployment of CRD resources

Step 3 Label the node where the ovndb resides.
Specify the master OVN node on which Kube-OVN deploys the ovndb. The ovndb stores its data persistently on the hard drive of the host machine.
kubectl label node <node-name to deploy ovndb> kube-ovn/role=master
After specifying the master OVN node, you can run the kubectl get node --show-labels command to view the label.
Figure 2-13 Node labels

Step 4 Deploy the OVN.
The underlying network of Kube-OVN depends on the OVS and OVN provided by the Open vSwitch community. Create the ovn.yaml file by referring to YAML File for OVN Deployment. Then, run the kubectl command to deploy the OVS and OVN.
kubectl apply -f ovn.yaml
After the execution is complete, run the kubectl get pods -A command to view pod status. The information shown in Figure 2-14 is displayed.
Figure 2-14 Successful deployment of the OVN

Step 5 Deploy the Kube-OVN-Controller and CNI Server.


By default, Kube-OVN uses 10.16.0.0/16 as the default subnet and 100.64.0.0/16 as the subnet for communication between hosts and pods. The primary NIC of each Kubernetes node (the NIC of the default route) carries pod traffic, and the traffic mirroring mode is enabled. Before the deployment, obtain and edit the kube-ovn.yaml file by referring to YAML File for Kube-OVN Deployment. Then, run the kubectl command to deploy Kube-OVN.
kubectl apply -f kube-ovn.yaml
After the execution is complete, run the kubectl get pods -A command to view pod status, and run the kubectl get subnet command to view subnets automatically created. The information shown in Figure 2-15 is displayed.

Figure 2-15 Successful deployment of Kube-OVN

Step 6 Install the kubectl plugin.
Kube-OVN provides the kubectl plugin for better monitoring network quality and diagnosing faults. You are advised to install the plugin.
Save the code provided in kubectl Plugin as the kubectl-ko file. Then, copy the file to a directory (for example, /usr/local/bin) in $PATH, and grant the execute permission on the file.
cp kubectl-ko /usr/local/bin/kubectl-ko
chmod +x /usr/local/bin/kubectl-ko
After the execution is complete, run the kubectl command to check the plugin status.
kubectl plugin list
The command output is as follows:

Step 7 Add nodes to the cluster.
On each of the other Kubernetes nodes to be deployed, run the kubeadm join command backed up in Step 1 to add the node to the Kubernetes cluster.
kubeadm join <master-ip:port> --token <your-token> \
    --discovery-token-ca-cert-hash sha256:<your-sha256-ca>
After the cluster nodes are added, "This node has joined the cluster" is displayed, as shown in Figure 2-16. The Kube-OVN networking procedure is complete.

Figure 2-16 Successful adding of cluster nodes


----End
2.2.2 Advanced Installation

2.2.2.1 Configuring VLAN Support
By default, Kube-OVN uses Geneve to encapsulate cross-host traffic and abstracts a virtual overlay network from the infrastructure. In addition, Kube-OVN supports the VLAN function since version 1.2.0. In scenarios sensitive to performance and throughput, an underlay network in VLAN mode is supported. The container network can be directly connected to physical switches through the VLAN to achieve better performance and throughput.
To use the VLAN mode, a host must have an NIC dedicated for the container network. The NIC port on the switch must work in trunk mode to allow 802.1Q data packets to pass through. Currently, Geneve or VLAN is a global option, and all containers must work in the same mode.
NOTE
The VLAN mode requires that an independent NIC be provided for the Kube-OVN container network. The VLAN mode is not recommended for a single network plane.
The following is an example of configuring VLAN support:
Step 1 Modify the installation script.
In the script, set NETWORK_TYPE to vlan and VLAN_INTERFACE_NAME to the name of the host NIC dedicated to the container network. Then deploy the cluster by following the regular installation procedure.
Step 2 Create a VLAN.
kubectl create
Create the following VLAN:
apiVersion: kubeovn.io/v1
kind: Vlan
metadata:
  name: product
spec:
  vlanId: 10
Step 3 Create a namespace.
kubectl create
Create the following namespace:


apiVersion: v1
kind: Namespace
metadata:
  name: product
  labels:
    name: product
Step 4 Create subnets and bind them to the VLAN.
kubectl create
The command will create the following subnets. Multiple subnets can be bound to the same VLAN.
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: product
spec:
  cidrBlock: 10.100.0.0/16
  default: false
  gateway: 10.100.0.1
  gatewayType: distributed
  natOutgoing: true
  vlan: product
  namespaces:
  - product
Step 5 Create a pod.
Run the following command to deploy the nginx pod in the namespace created in Step 3 and allocate IP addresses to the pod in the corresponding VLAN:
kubectl run samplepod --image=nginx --namespace=product
----End

2.2.2.2 Deploying HA
The Kube-OVN HA involves two components: ovndb and Kube-OVN-Controller.
The two components use different HA modes. The ovndb uses the Raft protocol for distributed consistency and achieves cluster HA in active-active mode. The Kube-OVN-Controller processes the status and events in the cluster, and each event must be handled by only one working instance. Therefore, the Kube-OVN-Controller uses HA in leader-election mode.

Deploying HA for the ovndb
Step 1 Add nodes for deploying the ovndb. An odd number of nodes (1, 3, 5, and so on) is recommended.
kubectl label node <node-name to deploy ovndb> kube-ovn/role=master
Step 2 Change the value of replicas in the ovn-central deployment section in the ovn.yaml file to the number of nodes configured in the previous step.


Step 3 Deploy the modified ovn.yaml file.
kubectl apply -f ovn.yaml
After ovn-central pods enter the Ready state, HA deployment for the ovndb is complete.
----End
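The recommendation to use an odd number of ovndb nodes follows from how Raft majorities work: a cluster stays available only while more than half of its members are reachable. The sketch below illustrates this general Raft arithmetic. It is not taken from the Kube-OVN code, and the function name raft_fault_tolerance is hypothetical.

```python
def raft_fault_tolerance(nodes: int) -> dict:
    """Return the majority size and the number of node failures a Raft cluster survives."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    quorum = nodes // 2 + 1  # strict majority of the members
    return {"nodes": nodes, "quorum": quorum, "tolerated_failures": nodes - quorum}


for n in (1, 2, 3, 4, 5):
    print(raft_fault_tolerance(n))
```

Under this arithmetic, a four-node cluster tolerates the same number of failures as a three-node cluster, so an even node count adds cost without improving availability.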
Deploying HA for the Kube-OVN-Controller
The Kube-OVN-Controller implements the leader-election process. You only need to increase the value of replicas to achieve HA.
Change the value of replicas in the kube-ovn-controller deployment section in the kube-ovn.yaml file and run the kubectl apply command.
kubectl apply -f kube-ovn.yaml
After the Kube-OVN-Controller enters the Ready state, HA deployment for the Kube-OVN-Controller is complete.


2.2.2.3 Configuring the Built-in Subnets
Two built-in subnets are configured during Kube-OVN installation, as shown in Figure 2-17.
● Default subnet: default subnet used for allocating IP addresses to pods. The default CIDR is 10.16.0.0/16.
● Node subnet: special subnet for communication between nodes and pods. The default CIDR is 100.64.0.0/16.
Figure 2-17 Built-in subnets of Kube-OVN
During the installation, you can modify the Kube-OVN-Controller configuration in the kube-ovn.yaml file. Note that the two subnets cannot conflict with the CIDRs of the existing host network and service (SVC) network.


2.2.2.4 Selecting the Host NIC, MTU, and Traffic Mirrors
If the host machine has multiple NICs, Kube-OVN selects the NIC of the default route as the NIC for cross-node communication between containers by default and establishes the corresponding Geneve tunnel.
Geneve tunnel encapsulation on the overlay network adds extra header bytes to each packet. When creating a container NIC, Kube-OVN therefore adjusts the maximum transmission unit (MTU) of the container NIC based on the MTU of the selected host NIC. The default MTU is the MTU of the host NIC minus 100.
By default, Kube-OVN creates a mirror0 NIC on each node to copy the network traffic of all containers on the current node. You can use tcpdump and other tools to analyze the traffic.
If you need to customize the preceding functions, you can configure cni-server in the kube-ovn.yaml file.
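The default MTU rule described above (host NIC MTU minus 100) can be expressed as a short calculation. The following sketch only restates that rule for planning purposes. The function name container_mtu and the constant name are hypothetical and are not part of Kube-OVN.

```python
GENEVE_MTU_MARGIN = 100  # bytes subtracted by default, as stated in this section


def container_mtu(host_mtu: int, margin: int = GENEVE_MTU_MARGIN) -> int:
    """Return the container NIC MTU derived from the host NIC MTU."""
    if host_mtu <= margin:
        raise ValueError("host MTU must be larger than the encapsulation margin")
    return host_mtu - margin


print(container_mtu(1500))  # 1400
print(container_mtu(9000))  # 8900
```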


2.2.3 Uninstallation
If you need to delete Kube-OVN and replace it with another network plugin, perform the following steps to delete the Kube-OVN components and the OVS configuration so that they do not interfere with the other network plugin.
Step 1 Use the cleanup script to delete the resources created in Kubernetes.
Obtain the script described in Cleanup Script, grant the execute permission on the script, and run the following command on the master Kubernetes nodes:
bash cleanup.sh
Step 2 Delete the configuration of the ovsdb and Open vSwitch.
rm -rf /var/run/openvswitch
rm -rf /var/run/ovn
rm -rf /etc/origin/openvswitch/
rm -rf /etc/origin/ovn/
rm -rf /etc/cni/net.d/00-kube-ovn.conflist
rm -rf /etc/cni/net.d/01-kube-ovn.conflist
rm -rf /var/log/openvswitch
rm -rf /var/log/ovn
Step 3 Restart the nodes and ensure that the corresponding NIC information, iptables rules, and ipset rules are deleted.
----End

2.3 Service Deployment

2.3.1 Recommended Deployment Modes

Deployment Mode | Features and Application Scenarios | Kube-OVN Behavior
Pod | Minimum deployment unit. | Kube-OVN randomly allocates an IP address in a subnet. You can also specify a static IP address and a MAC address in the YAML configuration file.
Deployment | Manages pods and ReplicaSets. | Kube-OVN randomly allocates an IP address in a subnet. You can also specify a static IP address in the YAML configuration file.
StatefulSet | Manages workloads of stateful applications and provides stable IDs with uniqueness assurance. The IDs are not interchangeable and remain unchanged after rescheduling. StatefulSets are applicable to scenarios that require stable network IDs, persistent storage, orderly deployment, and automatic rolling update. | IP addresses are allocated in sequence in a subnet. You can specify static IP addresses in the YAML configuration file. The network information remains unchanged within the lifecycle of a StatefulSet. During pod update or deletion, the logical switch port of the OVN is not deleted, and the new pod reuses the old interface information. Therefore, the IP address, MAC address, and other network information can be reused.
DaemonSet | Ensures that a pod copy runs on all or certain nodes. When a node is added or deleted, the pod is automatically added or deleted. DaemonSets are applicable to storage, logs, and monitoring services. | Kube-OVN randomly allocates an IP address in a subnet. You can also specify a static IP address in the YAML configuration file.
Job | Creates one or more pods to run a task, and deletes all pods after the task is detected as completed. | Kube-OVN randomly allocates an IP address in a subnet. You can also specify a static IP address in the YAML configuration file.
CronJob | Time-based scheduling task. | Kube-OVN randomly allocates an IP address in a subnet. You can also specify a static IP address in the YAML configuration file.

2.3.2 Static IP Addresses and MAC Addresses for Pods
By default, Kube-OVN allocates an IP address and a MAC address to a pod based on the subnet of the namespace where the pod resides. To specify the IP address and MAC address yourself, set the ovn.kubernetes.io/ip_address and ovn.kubernetes.io/mac_address fields in the annotations section of the YAML configuration file when creating the pod.
Example:
apiVersion: v1
kind: Pod
metadata:
  name: static-ip
  namespace: ls1
  annotations:
    ovn.kubernetes.io/ip_address: 10.16.0.15
    ovn.kubernetes.io/mac_address: 00:00:00:53:6B:B6
spec:
  containers:
  - name: static-ip
    image: nginx:alpine
NOTE
When specifying the IP address and MAC address for a pod, note that:
● The specified IP address must be within the CIDR of the subnet of the namespace where the pod resides.
● The specified IP address and MAC address cannot conflict with IP addresses and MAC addresses that are already in use.
● You can specify only the IP address or only the MAC address.
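The first two constraints in the note can be checked before the pod is created. The following is a minimal sketch that uses Python's standard ipaddress module. The function name check_static_ip and the list of used addresses are hypothetical; in practice, the used addresses would have to be collected from the cluster.

```python
import ipaddress


def check_static_ip(ip, subnet_cidr, used_ips=()):
    """Return a list of problems found for a requested static pod IP address."""
    addr = ipaddress.ip_address(ip)
    net = ipaddress.ip_network(subnet_cidr)
    used = {ipaddress.ip_address(u) for u in used_ips}
    problems = []
    if addr not in net:
        problems.append(f"{ip} is outside the subnet CIDR {subnet_cidr}")
    if addr in used:
        problems.append(f"{ip} is already in use")
    return problems


# The used address list below is illustrative only.
print(check_static_ip("10.16.0.15", "10.16.0.0/16", used_ips=["10.16.0.1"]))
```

An empty result means that the requested address passes both checks.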
2.3.3 Static IP Addresses for Workloads
Kube-OVN allows you to specify static IP addresses for workloads (Deployment, StatefulSet, DaemonSet, Job, and CronJob) by setting the ovn.kubernetes.io/ip_pool field in the annotations section of the YAML configuration file. Each pod under the workload automatically takes one of the IP addresses specified by the ovn.kubernetes.io/ip_pool field. In addition, Kube-OVN ensures that these IP addresses do not conflict with other IP addresses.
Example:
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: ls1
  name: starter-backend
  labels:
    app: starter-backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: starter-backend
  template:
    metadata:
      labels:
        app: starter-backend
      annotations:
        ovn.kubernetes.io/ip_pool: 10.16.0.15,10.16.0.16,10.16.0.17
    spec:
      containers:
      - name: backend
        image: nginx:alpine
NOTE
When specifying IP addresses for a workload, note that:
● The specified IP addresses must be within the CIDR of the subnet of the namespace where the pod resides.
● The specified IP addresses cannot conflict with IP addresses that are already in use.
● If the number of specified IP addresses is less than the value of replicas, extra pods cannot be created. In this case, adjust the number of IP addresses specified in the ovn.kubernetes.io/ip_pool field based on the workload update policy and capacity expansion plan.
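The last point in the note can be turned into a simple sizing check. The sketch below counts the addresses in an ovn.kubernetes.io/ip_pool value and compares the count with the number of pods that may exist at the same time. The function name pool_shortfall is hypothetical, and treating a rolling update surge as extra required addresses is an assumption about how the update policy should be taken into account.

```python
def pool_shortfall(ip_pool: str, replicas: int, max_surge: int = 0) -> int:
    """Return how many IP addresses are missing from the pool (0 if the pool is large enough)."""
    pool = [ip.strip() for ip in ip_pool.split(",") if ip.strip()]
    # Assumption: pods created temporarily during a rolling update also need an address.
    required = replicas + max_surge
    return max(0, required - len(pool))


pool = "10.16.0.15,10.16.0.16,10.16.0.17"
print(pool_shortfall(pool, replicas=2))               # 0
print(pool_shortfall(pool, replicas=2, max_surge=2))  # 1
```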


2.3.4 Static IP Addresses for StatefulSets
As with other workloads, you can specify the IP addresses used by the pods of a StatefulSet by setting the ovn.kubernetes.io/ip_pool field. StatefulSets are mainly used for stateful services and have higher requirements on network stability. Therefore, Kube-OVN provides the following enhancements:
1. IP addresses specified in the ovn.kubernetes.io/ip_pool field are allocated to pods in sequence.
2. During the update or deletion of a StatefulSet pod, the Logical_switch_port stored in the OVN is not deleted. The newly generated pod directly reuses the old interface information. Therefore, this pod can reuse the IP address, MAC address, and other network information to achieve a state retention function similar to that of StatefulSet volumes.
3. If ovn.kubernetes.io/ip_pool is not specified for a StatefulSet, an IP address and a MAC address are randomly allocated to each pod when the pod is created for the first time. After that, the network information remains fixed throughout the lifecycle of the StatefulSet.
Example:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
  namespace: ls1
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
NOTE
When specifying IP addresses for a StatefulSet, note that:
● The specified IP addresses must be within the CIDR of the subnet of the namespace where the pod resides.
● The specified IP addresses cannot conflict with IP addresses that are already in use.
● If the number of specified IP addresses is less than the value of replicas, extra pods cannot be created. In this case, adjust the number of IP addresses specified in the ovn.kubernetes.io/ip_pool field based on the workload update policy and capacity expansion plan.
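The in-sequence allocation described in this section can be pictured as a mapping from pod ordinals to pool entries. The sketch below assumes that the pod with ordinal 0 receives the first address in ovn.kubernetes.io/ip_pool, the pod with ordinal 1 the second, and so on. This is one reading of "in sequence" rather than a statement about the Kube-OVN implementation, and the function name statefulset_ip_plan is hypothetical.

```python
def statefulset_ip_plan(name: str, replicas: int, ip_pool: str) -> dict:
    """Map StatefulSet pod names (<name>-<ordinal>) to pool addresses in order."""
    ips = [ip.strip() for ip in ip_pool.split(",") if ip.strip()]
    # Pods beyond the end of the pool get None, meaning that no address is available.
    return {f"{name}-{i}": (ips[i] if i < len(ips) else None) for i in range(replicas)}


print(statefulset_ip_plan("web", 2, "10.16.0.15,10.16.0.16"))
```

A None value in the result indicates that the pool is too small for the configured number of replicas.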

2.4 Subnet Management

2.4.1 Default Subnet
Kube-OVN organizes IP addresses by subnet. Each namespace can be bound to a specific subnet. Pods under a namespace automatically obtain IP addresses from the corresponding subnet and share the network configuration (such as the CIDR, gateway type, access control, and NAT control) of the subnet.
Kube-OVN has a built-in default subnet. Pods in all namespaces that are not explicitly bound to a subnet automatically obtain IP addresses from the default subnet and use its network configuration. The default subnet uses a distributed gateway and performs network address translation (NAT) on the outbound traffic. This behavior is the same as the default behavior of Flannel, so users can use most network functions without additional configuration.
You can change the configuration of the default subnet during the installation. For details, see 2.2.2.3 Configuring the Built-in Subnets.
Run the following command to view the default subnet:
kubectl get subnet ovn-default -o yaml
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  creationTimestamp: "2020-06-28T06:42:32Z"
  finalizers:
  - kube-ovn-controller
  generation: 1
  name: ovn-default
  resourceVersion: "750199"
  selfLink: /apis/kubeovn.io/v1/subnets/ovn-default
  uid: e3a9162e-ad11-4bd0-9962-db6d6d26b01e
spec:
  cidrBlock: 10.16.0.0/16
  default: true
  excludeIps:
  - 10.16.0.1
  gateway: 10.16.0.1
  gatewayNode: ""
  gatewayType: distributed
  natOutgoing: true
  private: false
  protocol: IPv4
  provider: ovn
  underlayGateway: false
status:
  activateGateway: ""
  availableIPs: 65517
  conditions:
  - lastTransitionTime: "2020-06-28T06:42:32Z"
    lastUpdateTime: "2020-06-28T06:42:33Z"
    reason: ResetLogicalSwitchAclSuccess
    status: "True"
    type: Validated
  - lastTransitionTime: "2020-06-28T06:42:33Z"
    lastUpdateTime: "2020-06-28T06:42:33Z"
    reason: ResetLogicalSwitchAclSuccess
    status: "True"
    type: Ready
  - lastTransitionTime: "2020-06-28T06:42:32Z"
    lastUpdateTime: "2020-06-28T06:42:32Z"
    message: Not Observed
    reason: Init
    status: Unknown
    type: Error
  usingIPs: 16


2.4.2 Node Subnet
According to the Kubernetes network specifications, a node can directly communicate with all pods. To achieve this goal, Kube-OVN creates a join subnet and creates a virtual NIC ovn0 on each node to connect to the join subnet. Hosts can communicate with pods through the subnet.
You can change the configuration of the node subnet during the installation. For details, see 2.2.2.3 Configuring the Built-in Subnets.
● Run the following command to view the node subnet:
kubectl get subnet join -o yaml
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  creationTimestamp: "2020-06-28T06:42:32Z"
  finalizers:
  - kube-ovn-controller
  generation: 2
  name: join
  resourceVersion: "749434"
  selfLink: /apis/kubeovn.io/v1/subnets/join
  uid: 3179fdc6-56a9-4211-a4c2-d922f4463adb
spec:
  cidrBlock: 100.64.0.0/16
  default: false
  excludeIps:
  - 100.64.0.1
  gateway: 100.64.0.1
  gatewayNode: ""
  gatewayType: distributed
  natOutgoing: false
  private: false
  protocol: IPv4
  provider: ovn
  underlayGateway: false
status:
  activateGateway: ""
  availableIPs: 65530
  conditions:
  - lastTransitionTime: "2020-06-28T06:42:33Z"
    lastUpdateTime: "2020-06-28T06:42:33Z"
    reason: ResetLogicalSwitchAclSuccess
    status: "True"
    type: Validated
  - lastTransitionTime: "2020-06-28T06:42:33Z"
    lastUpdateTime: "2020-06-28T06:42:33Z"
    reason: ResetLogicalSwitchAclSuccess
    status: "True"
    type: Ready
  - lastTransitionTime: "2020-06-28T06:42:33Z"
    lastUpdateTime: "2020-06-28T06:42:33Z"
    message: Not Observed
    reason: Init
    status: Unknown
    type: Error
  usingIPs: 3
● Run the following command on each node to view the ovn0 NIC:
ifconfig ovn0
ovn0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1400
        inet 100.64.0.2  netmask 255.255.0.0  broadcast 100.64.255.255
        inet6 fe80::200:ff:fec6:936d  prefixlen 64  scopeid 0x20<link>
        ether 00:00:00:c6:93:6d  txqueuelen 1000  (Ethernet)
        RX packets 5673864  bytes 530785977 (506.1 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5243874  bytes 8069050878 (7.5 GiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
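The availableIPs and usingIPs values in the sample outputs of the ovn-default subnet (2.4.1) and the join subnet are consistent with a simple accounting: the number of addresses in the CIDR, minus the network and broadcast addresses, minus the excluded addresses, minus the addresses in use. The sketch below reproduces this arithmetic. It is an inference from the sample outputs, not documented Kube-OVN behavior, and the function name expected_available_ips is hypothetical.

```python
import ipaddress


def expected_available_ips(cidr: str, excluded: int, using: int) -> int:
    """Estimate availableIPs from the CIDR size, excluded addresses, and addresses in use."""
    total = ipaddress.ip_network(cidr).num_addresses
    # Assumption: the network and broadcast addresses are never allocated.
    usable = total - 2 - excluded
    return usable - using


print(expected_available_ips("10.16.0.0/16", excluded=1, using=16))  # 65517
print(expected_available_ips("100.64.0.0/16", excluded=1, using=3))  # 65530
```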


2.4.3 Customized Subnet
In Kube-OVN, IP addresses are managed by subnet. One or more namespaces can be bound to a subnet. Pods under these namespaces automatically obtain IP addresses from the subnet and share the network configuration (such as the CIDR, gateway type, access control, and NAT control) of the subnet.
Run the following command to create a subnet:
cat <<EOF | kubectl create -f -
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: ls1
spec:
  protocol: IPv4
  cidrBlock: 10.66.0.0/16
  excludeIps:
  - 10.66.0.1..10.66.0.10
  gateway: 10.66.0.1
  namespaces:
  - ls1
EOF
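The excludeIps entry in the example uses the start..end notation to reserve a range of addresses. The following sketch shows how such entries could be expanded to see exactly which addresses are excluded from allocation. It assumes that both ends of the range are inclusive, and the function name expand_exclude_ips is hypothetical.

```python
import ipaddress


def expand_exclude_ips(entries):
    """Expand single addresses and 'start..end' ranges into a flat list of addresses."""
    result = []
    for entry in entries:
        if ".." in entry:
            start_s, end_s = entry.split("..", 1)
            start = ipaddress.ip_address(start_s.strip())
            end = ipaddress.ip_address(end_s.strip())
            if start > end:
                raise ValueError(f"range start is after range end: {entry}")
            # Assumption: both ends of the range are inclusive.
            result.extend(ipaddress.ip_address(i) for i in range(int(start), int(end) + 1))
        else:
            result.append(ipaddress.ip_address(entry.strip()))
    return result


excluded = expand_exclude_ips(["10.66.0.1..10.66.0.10"])
print(len(excluded), excluded[0], excluded[-1])
```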
2.4.4 Subnet Access Control
By default, subnets created by Kube-OVN can communicate with each other, and pods can access external networks through gateways.
To control access between subnets, set private to true in the subnet CRD. The subnet is then isolated from other subnets and external networks, and its pods can communicate only within the subnet. To allow access from specific network segments, configure a whitelist by setting the allowSubnets field in the YAML configuration of the subnet.
Access control can be further implemented using the network policy of Kubernetes. Kube-OVN implements network policy rules, which have higher priority than the access control settings in the subnet CRD.
The following is an example of configuring access control for a subnet.
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: subnet-acl
spec:
  protocol: IPv4
  default: false
  namespaces:
  - ns1
  - ns2
  cidrBlock: 10.69.0.0/16
  gateway: 10.69.0.1
  excludeIps:
  - 10.69.0.1
  private: true
  allowSubnets:
  - 10.16.0.0/16
  - 10.18.0.0/16
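The effect of private and allowSubnets can be summarized as a small decision function. The sketch below is a simplified model of the rules described in this section, with the function name is_allowed chosen for illustration. It ignores Kubernetes network policies, which take priority over these settings, and it does not reflect how Kube-OVN implements the rules in OVN.

```python
import ipaddress


def is_allowed(src_ip: str, subnet_cidr: str, private: bool, allow_subnets=()) -> bool:
    """Return True if traffic from src_ip may reach pods in the subnet (simplified model)."""
    src = ipaddress.ip_address(src_ip)
    if not private:
        return True  # by default, subnets can communicate with each other
    if src in ipaddress.ip_network(subnet_cidr):
        return True  # traffic inside the private subnet is always allowed
    return any(src in ipaddress.ip_network(cidr) for cidr in allow_subnets)


whitelist = ["10.16.0.0/16", "10.18.0.0/16"]
print(is_allowed("10.16.0.5", "10.69.0.0/16", True, whitelist))  # True
print(is_allowed("10.20.0.5", "10.69.0.0/16", True, whitelist))  # False
```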


2.4.5 Egress Gateway Configuration
Pods in the Kube-OVN network access the network outside the cluster through gateways. Currently, two types of gateways are supported. You can adjust the gateway type in the subnet.
● Distributed gateway
Distributed gateways are the default gateway type of subnets. Each node functions as the gateway for the pods on the current node to access the external network. Data packets are routed to the host network stack through the ovn0 NIC of the local host, and then to the external network based on the routing rules of the host. If natOutgoing is set to true, a pod uses the IP address of the current host machine to access the external network.
The following is an example of configuration on a subnet.
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: distributed
spec:
  cidrBlock: 10.166.0.0/16
  default: false
  excludeIps:
  - 10.166.0.1
  gateway: 10.166.0.1
  gatewayType: distributed
  natOutgoing: true
● Centralized gateway
If you want to use a static IP address to access the external network from a subnet for security operations such as auditing and whitelisting, you can configure a centralized gateway in the subnet. In centralized gateway mode, data packets of a pod are first routed to the ovn0 NIC of a specified node and then to the external network based on the routing rules of the host. If natOutgoing is set to true, a pod uses the IP address of a specified host machine to access the external network.
The following is an example of configuration on a subnet.
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: centralized
spec:
  cidrBlock: 10.166.0.0/16
  default: false
  excludeIps:
  - 10.166.0.1
  gateway: 10.166.0.1
  gatewayType: centralized
  gatewayNode: "node1,node2"
  natOutgoing: true
2.4.6 IP Addresses of Pods Exposed to the External Network
In Kube-OVN, the IP address of a pod can be directly exposed to the external network through a static route. In this case, the natOutgoing field of the subnet where the pod is located must be set to false to disable the NAT mapping for outbound traffic.


In addition, check whether there is a drop rule in the FORWARD chain of iptables on the host node. Make sure that the FORWARD chain allows forwarding for the ovn0 NIC and the default outbound NIC.
● Physical environment
To expose the IP address of a pod to the external network so that the external network can access containers using the IP address, set natOutgoing of the corresponding subnet to false. In addition, you need to add a static route on the external router and set the next hop of the data packets whose destination address is the subnet CIDR to any host in the cluster.
The following is an example of configuration on a subnet.
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: subnet-gateway
spec:
  protocol: IPv4
  default: false
  namespaces:
  - ns1
  - ns2
  cidrBlock: 100.64.0.0/16
  gateway: 100.64.0.1
  excludeIps:
  - 100.64.0.1
  private: false
  gatewayType: distributed
  natOutgoing: false
● Virtual network environment
In a virtual network environment, security groups or conntrack impose restrictions on traffic. If the next hop of the container network is set to an arbitrary host in the cluster, data packets may be dropped due to asymmetric routing.
In this case, you are advised to set gatewayType to centralized and set the corresponding gatewayNode. When an external system accesses a container, the next hop of the container CIDR must be set to the node specified by gatewayNode to avoid asymmetric routing restrictions.
2.4.7 IPv6 Subnet
Kube-OVN supports the coexistence of IPv4 and IPv6 subnets in a cluster. However, due to some restrictions of the Kubernetes control plane, if the pod network protocol is inconsistent with the protocol of the Kubernetes control plane, functions such as service discovery cannot work properly. Before using an IPv6 subnet, you are advised to configure the protocol of the Kubernetes control plane to IPv6 to prevent network problems.
The following is an example of creating an IPv6 subnet.
cat <<EOF | kubectl create -f -
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: ipv6
spec:
  cidrBlock: 2001:4860::/32
  excludeIps:
  - 2001:4860::1
  gateway: 2001:4860::1
  gatewayType: distributed
  namespaces: [ls2]
  natOutgoing: false
  private: false
  protocol: "IPv6"
EOF


2.5 O&M Operations

2.5.1 Deleting a Node
The OVN controller (ovn-controller) running on a node periodically reconnects to the ovn-sb. If the node is only deleted from Kubernetes, its chassis registers again. As a result, residual network configurations remain, resources are wasted, and rule conflicts may occur. Therefore, when deleting a node from Kubernetes, perform the following steps to ensure that the network information is properly deleted.
Step 1 Drain a node.
kubectl drain <node-name> --ignore-daemonsets --force
Step 2 Log in to the node and stop kubelet and Docker to stop the corresponding DaemonSet.
systemctl stop kubelet && systemctl stop docker
Step 3 Delete the node from the master nodes.
kubectl delete node <node-name>
Step 4 Check whether the node is deleted from the ovn-sb.
kubectl ko sbctl show
Step 5 If any chassis corresponding to the hostname still exists, manually delete it.
kubectl ko sbctl chassis-del <chassis-uuid>
----End
2.5.2 Configuring QoS
In Kube-OVN, the ingress and egress bandwidths (in Mbit/s) of a pod are specified by the ovn.kubernetes.io/ingress_rate and ovn.kubernetes.io/egress_rate fields, respectively, in the annotations section of the pod's YAML configuration file. You can configure the QoS when creating a pod, or dynamically adjust the QoS by modifying the two fields while the pod is running.
● Configuring the QoS when creating a pod
apiVersion: v1
kind: Pod
metadata:
  name: qos
  namespace: ls1
  annotations:
    ovn.kubernetes.io/ingress_rate: "3"
    ovn.kubernetes.io/egress_rate: "1"
spec:
  containers:
  - name: qos
    image: nginx:alpine
● Dynamically adjusting the QoS
kubectl annotate --overwrite pod nginx-74d5899f46-d7qkn ovn.kubernetes.io/ingress_rate=3
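Kubernetes annotation values must be strings, which is why the rates in the pod example are quoted. The following sketch builds the two QoS annotations from numeric rates in Mbit/s. The function name qos_annotations is hypothetical, and rejecting non-positive rates is an assumption made for this example.

```python
def qos_annotations(ingress_mbps=None, egress_mbps=None) -> dict:
    """Build the Kube-OVN QoS annotations; values are converted to strings."""
    annotations = {}
    fields = (
        ("ovn.kubernetes.io/ingress_rate", ingress_mbps),
        ("ovn.kubernetes.io/egress_rate", egress_mbps),
    )
    for key, rate in fields:
        if rate is None:
            continue  # leave this direction unlimited
        if rate <= 0:
            # Assumption: a rate limit is only meaningful if it is positive.
            raise ValueError(f"{key} must be a positive number of Mbit/s")
        annotations[key] = str(rate)
    return annotations


print(qos_annotations(ingress_mbps=3, egress_mbps=1))
```

The resulting dictionary can be placed under metadata.annotations of the pod definition.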


2.5.3 Mirroring Traffic
By default, Kube-OVN creates a mirror0 virtual network interface card (vNIC) on each node to mirror the traffic of all containers on the node. You can run the tcpdump -nni mirror0 command to view the traffic information or use other tools to export the traffic from the mirror0 vNIC for analysis.

2.6 Reference

2.6.1 YAML Files for Manual Installation

CAUTION
The Kube-OVN version and features are evolving rapidly. It is recommended that the Kube-OVN version obtained by the one-click installation script be the same as the image version. Otherwise, Kube-OVN may fail to be deployed and started in one-click mode. This section provides only the YAML deployment files based on Kube-OVN v1.2.1.

YAML File for CRD Deployment
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ips.kubeovn.io
spec:
  group: kubeovn.io
  version: v1
  scope: Cluster
  names:
    plural: ips
    singular: ip
    kind: IP
    shortNames:
      - ip
  additionalPrinterColumns:
    - name: IP
      type: string
      JSONPath: .spec.ipAddress
    - name: Mac
      type: string
      JSONPath: .spec.macAddress
    - name: Node
      type: string
      JSONPath: .spec.nodeName
    - name: Subnet
      type: string
      JSONPath: .spec.subnet
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: subnets.kubeovn.io
spec:
  group: kubeovn.io
  version: v1
  scope: Cluster
  names:
    plural: subnets
    singular: subnet
    kind: Subnet
    shortNames:
      - subnet
  subresources:
    status: {}
  additionalPrinterColumns:
    - name: Provider
      type: string
      JSONPath: .spec.provider
    - name: Protocol
      type: string
      JSONPath: .spec.protocol
    - name: CIDR
      type: string
      JSONPath: .spec.cidrBlock
    - name: Private
      type: boolean
      JSONPath: .spec.private
    - name: NAT
      type: boolean
      JSONPath: .spec.natOutgoing
    - name: Default
      type: boolean
      JSONPath: .spec.default
    - name: GatewayType
      type: string
      JSONPath: .spec.gatewayType
    - name: Used
      type: number
      JSONPath: .status.usingIPs
    - name: Available
      type: number
      JSONPath: .status.availableIPs
  validation:
    openAPIV3Schema:
      properties:
        spec:
          required: ["cidrBlock"]
          properties:
            cidrBlock:
              type: "string"
            gateway:
              type: "string"
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: vlans.kubeovn.io
spec:
  group: kubeovn.io
  version: v1
  scope: Cluster
  names:
    plural: vlans
    singular: vlan
    kind: Vlan
    shortNames:
      - vlan
  additionalPrinterColumns:
    - name: VlanID
      type: string
      JSONPath: .spec.vlanId
    - name: ProviderInterfaceName
      type: string
      JSONPath: .spec.providerInterfaceName
    - name: Subnet
      type: string
      JSONPath: .spec.subnet

YAML File for OVN Deployment
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ovn-config
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ovn
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.k8s.io/system-only: "true"
  name: system:ovn
rules:
  - apiGroups:
      - "kubeovn.io"
    resources:
      - subnets
      - subnets/status
      - ips
      - vlans
    verbs:
      - "*"
  - apiGroups:
      - ""
    resources:
      - pods
      - namespaces
      - nodes
      - configmaps
    verbs:
      - create
      - get
      - list
      - watch
      - patch
      - update
  - apiGroups:
      - ""
      - networking.k8s.io
      - apps
      - extensions
    resources:
      - networkpolicies
      - services
      - endpoints
      - statefulsets
      - daemonsets
      - deployments
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
      - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ovn
roleRef:
  name: system:ovn
  kind: ClusterRole
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: ovn
    namespace: kube-system
---
kind: Service
apiVersion: v1
metadata:
  name: ovn-nb
  namespace: kube-system
spec:
  ports:
    - name: ovn-nb
      protocol: TCP
      port: 6641
      targetPort: 6641
  type: ClusterIP
  selector:
    app: ovn-central
    ovn-nb-leader: "true"
  sessionAffinity: None
---
kind: Service
apiVersion: v1
metadata:
  name: ovn-sb
  namespace: kube-system
spec:
  ports:
    - name: ovn-sb
      protocol: TCP
      port: 6642
      targetPort: 6642
  type: ClusterIP
  selector:
    app: ovn-central
    ovn-sb-leader: "true"
  sessionAffinity: None
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: ovn-central
  namespace: kube-system
  annotations:
    kubernetes.io/description: |
      OVN components: northd, nb and sb.
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 0%
      maxUnavailable: 100%
    type: RollingUpdate
  selector:
    matchLabels:
      app: ovn-central
  template:
    metadata:
      labels:
        app: ovn-central
        component: network
        type: infra
    spec:
      tolerations:
        - operator: Exists
          effect: NoSchedule
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: ovn-central
              topologyKey: kubernetes.io/hostname
      priorityClassName: system-cluster-critical
      serviceAccountName: ovn
      hostNetwork: true
      containers:
        - name: ovn-central
          image: "kubeovn/kube-ovn:v1.2.1-arm"
          imagePullPolicy: IfNotPresent
          command: ["/kube-ovn/start-db.sh"]
          securityContext:
            capabilities:
              add: ["SYS_NICE"]
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          resources:
            requests:
              cpu: 500m
              memory: 300Mi
          volumeMounts:
            - mountPath: /var/run/openvswitch
              name: host-run-ovs
            - mountPath: /var/run/ovn
              name: host-run-ovn
            - mountPath: /sys
              name: host-sys
              readOnly: true
            - mountPath: /etc/openvswitch
              name: host-config-openvswitch
            - mountPath: /etc/ovn
              name: host-config-ovn
            - mountPath: /var/log/openvswitch
              name: host-log-ovs
            - mountPath: /var/log/ovn
              name: host-log-ovn
          readinessProbe:
            exec:
              command:
                - sh
                - /kube-ovn/ovn-is-leader.sh
            periodSeconds: 3
          livenessProbe:
            exec:
              command:
                - sh
                - /kube-ovn/ovn-healthcheck.sh
            initialDelaySeconds: 30
            periodSeconds: 7
            failureThreshold: 5
      nodeSelector:
        kubernetes.io/os: "linux"
        kube-ovn/role: "master"
      volumes:
        - name: host-run-ovs
          hostPath:
            path: /run/openvswitch
        - name: host-run-ovn
          hostPath:
            path: /run/ovn
        - name: host-sys
          hostPath:
            path: /sys
        - name: host-config-openvswitch
          hostPath:
            path: /etc/origin/openvswitch
        - name: host-config-ovn
          hostPath:
            path: /etc/origin/ovn
        - name: host-log-ovs
          hostPath:
            path: /var/log/openvswitch
        - name: host-log-ovn
          hostPath:
            path: /var/log/ovn
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: ovs-ovn
  namespace: kube-system
  annotations:
    kubernetes.io/description: |
      This daemon set launches the openvswitch daemon.
spec:
  selector:
    matchLabels:
      app: ovs
  updateStrategy:
    type: OnDelete
  template:
    metadata:
      labels:
        app: ovs
        component: network
        type: infra
    spec:
      tolerations:
        - operator: Exists
          effect: NoSchedule
      priorityClassName: system-cluster-critical
      serviceAccountName: ovn
      hostNetwork: true
      hostPID: true
      containers:
        - name: openvswitch
          image: "kubeovn/kube-ovn:v1.2.1-arm"
          imagePullPolicy: IfNotPresent
          command: ["/kube-ovn/start-ovs.sh"]
          securityContext:
            runAsUser: 0
            privileged: true
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /lib/modules
              name: host-modules
              readOnly: true
            - mountPath: /var/run/openvswitch
              name: host-run-ovs
            - mountPath: /var/run/ovn
              name: host-run-ovn
            - mountPath: /sys
              name: host-sys
              readOnly: true
            - mountPath: /etc/openvswitch
              name: host-config-openvswitch
            - mountPath: /etc/ovn
              name: host-config-ovn
            - mountPath: /var/log/openvswitch
              name: host-log-ovs
            - mountPath: /var/log/ovn
              name: host-log-ovn
          readinessProbe:
            exec:
              command:
                - sh
                - /kube-ovn/ovs-healthcheck.sh
            periodSeconds: 5
          livenessProbe:
            exec:
              command:
                - sh
                - /kube-ovn/ovs-healthcheck.sh
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 5
          resources:
            requests:
              cpu: 200m
              memory: 300Mi
            limits:
              cpu: 1000m
              memory: 800Mi
      nodeSelector:
        kubernetes.io/os: "linux"
      volumes:
        - name: host-modules
          hostPath:
            path: /lib/modules
        - name: host-run-ovs
          hostPath:
            path: /run/openvswitch
        - name: host-run-ovn
          hostPath:
            path: /run/ovn
        - name: host-sys
          hostPath:
            path: /sys
        - name: host-config-openvswitch
          hostPath:
            path: /etc/origin/openvswitch
        - name: host-config-ovn
          hostPath:
            path: /etc/origin/ovn
        - name: host-log-ovs
          hostPath:
            path: /var/log/openvswitch
        - name: host-log-ovn
          hostPath:
            path: /var/log/ovn

YAML File for Kube-OVN Deployment
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: kube-ovn-controller
  namespace: kube-system
  annotations:
    kubernetes.io/description: |
      kube-ovn controller
spec:
  replicas: 2
  selector:
    matchLabels:
      app: kube-ovn-controller
  strategy:
    rollingUpdate:
      maxSurge: 0%
      maxUnavailable: 100%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: kube-ovn-controller
        component: network
        type: infra
    spec:
      tolerations:
        - operator: Exists
          effect: NoSchedule
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: kube-ovn-controller
              topologyKey: kubernetes.io/hostname
      priorityClassName: system-cluster-critical
      serviceAccountName: ovn
      hostNetwork: true
      containers:
        - name: kube-ovn-controller
          image: "kubeovn/kube-ovn:v1.2.1-arm"
          imagePullPolicy: IfNotPresent
          command:
            - /kube-ovn/start-controller.sh
          args:
            - --default-cidr=10.16.0.0/16
            - --default-gateway=10.16.0.1
            - --node-switch-cidr=100.64.0.0/16
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: KUBE_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          readinessProbe:
            exec:
              command:
                - sh
                - /kube-ovn/kube-ovn-controller-healthcheck.sh
            periodSeconds: 3
          livenessProbe:
            exec:
              command:
                - sh
                - /kube-ovn/kube-ovn-controller-healthcheck.sh
            initialDelaySeconds: 300
            periodSeconds: 7
            failureThreshold: 5
      nodeSelector:
        kubernetes.io/os: "linux"
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: kube-ovn-cni
  namespace: kube-system
  annotations:
    kubernetes.io/description: |
      This daemon set launches the kube-ovn cni daemon.
spec:
  selector:
    matchLabels:
      app: kube-ovn-cni
  updateStrategy:
    type: OnDelete
  template:
    metadata:
      labels:
        app: kube-ovn-cni
        component: network
        type: infra
    spec:
      tolerations:
        - operator: Exists
          effect: NoSchedule
      priorityClassName: system-cluster-critical
      serviceAccountName: ovn
      hostNetwork: true
      hostPID: true
      initContainers:
        - name: install-cni
          image: "kubeovn/kube-ovn:v1.2.1-arm"
          imagePullPolicy: IfNotPresent
          command: ["/kube-ovn/install-cni.sh"]
          securityContext:
            runAsUser: 0
            privileged: true
          volumeMounts:
            - mountPath: /etc/cni/net.d
              name: cni-conf
            - mountPath: /opt/cni/bin
              name: cni-bin
      containers:
        - name: cni-server
          image: "kubeovn/kube-ovn:v1.2.1-arm"
          imagePullPolicy: IfNotPresent
          command:
            - sh
            - /kube-ovn/start-cniserver.sh
          args:
            - --enable-mirror=true
          securityContext:
            runAsUser: 0
            privileged: true
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - mountPath: /run/openvswitch
              name: host-run-ovs
            - mountPath: /run/ovn
              name: host-run-ovn
            - mountPath: /var/run/netns
              name: host-ns
              mountPropagation: HostToContainer
          readinessProbe:
            exec:
              command:
                - nc
                - -z
                - -w3
                - 127.0.0.1
                - "10665"
            periodSeconds: 3
          livenessProbe:
            exec:
              command:
                - nc
                - -z
                - -w3
                - 127.0.0.1
                - "10665"
            initialDelaySeconds: 30
            periodSeconds: 7
            failureThreshold: 5
      nodeSelector:
        kubernetes.io/os: "linux"
      volumes:
        - name: host-run-ovs
          hostPath:
            path: /run/openvswitch
        - name: host-run-ovn
          hostPath:
            path: /run/ovn
        - name: cni-conf
          hostPath:
            path: /etc/cni/net.d
        - name: cni-bin
          hostPath:
            path: /opt/cni/bin
        - name: host-ns
          hostPath:
            path: /var/run/netns
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: kube-ovn-pinger
  namespace: kube-system
  annotations:
    kubernetes.io/description: |
      This daemon set launches the openvswitch daemon.
spec:
  selector:
    matchLabels:
      app: kube-ovn-pinger
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: kube-ovn-pinger
        component: network
        type: infra
    spec:
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: ovn
      hostPID: true
      containers:
        - name: pinger
          image: "kubeovn/kube-ovn:v1.2.1-arm"
          command: ["/kube-ovn/kube-ovn-pinger", "--external-address=114.114.114.114"]
          imagePullPolicy: IfNotPresent
          securityContext:
            runAsUser: 0
            privileged: false
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - mountPath: /lib/modules
              name: host-modules
              readOnly: true
            - mountPath: /run/openvswitch
              name: host-run-ovs
            - mountPath: /var/run/openvswitch
              name: host-run-ovs
            - mountPath: /var/run/ovn
              name: host-run-ovn
            - mountPath: /sys
              name: host-sys
              readOnly: true
            - mountPath: /etc/openvswitch
              name: host-config-openvswitch
            - mountPath: /var/log/openvswitch
              name: host-log-ovs
            - mountPath: /var/log/ovn
              name: host-log-ovn
          resources:
            requests:
              cpu: 100m
              memory: 300Mi
            limits:
              cpu: 200m
              memory: 400Mi
      nodeSelector:
        kubernetes.io/os: "linux"
      volumes:
        - name: host-modules
          hostPath:
            path: /lib/modules
        - name: host-run-ovs
          hostPath:
            path: /run/openvswitch
        - name: host-run-ovn
          hostPath:
            path: /run/ovn
        - name: host-sys
          hostPath:
            path: /sys
        - name: host-config-openvswitch
          hostPath:
            path: /etc/origin/openvswitch
        - name: host-log-ovs
          hostPath:
            path: /var/log/openvswitch
        - name: host-log-ovn
          hostPath:
            path: /var/log/ovn
---
kind: Service
apiVersion: v1
metadata:
  name: kube-ovn-pinger
  namespace: kube-system
  labels:
    app: kube-ovn-pinger
spec:
  selector:
    app: kube-ovn-pinger
  ports:
    - port: 8080
      name: metrics
---
kind: Service
apiVersion: v1
metadata:
  name: kube-ovn-controller
  namespace: kube-system
  labels:
    app: kube-ovn-controller
spec:
  selector:
    app: kube-ovn-controller
  ports:
    - port: 10660
      name: metrics
---
kind: Service
apiVersion: v1
metadata:
  name: kube-ovn-cni
  namespace: kube-system
  labels:
    app: kube-ovn-cni
spec:
  selector:
    app: kube-ovn-cni
  ports:
    - port: 10665
      name: metrics


kubectl Plugin
#!/bin/bash
set -euo pipefail

KUBE_OVN_NS=kube-system
OVN_NB_POD=
OVN_SB_POD=

showHelp(){
  echo "kubectl ko {subcommand} [option...]"
  echo "Available Subcommands:"
  echo "  nbctl [ovn-nbctl options ...]    invoke ovn-nbctl"
  echo "  sbctl [ovn-sbctl options ...]    invoke ovn-sbctl"
  echo "  vsctl {nodeName} [ovs-vsctl options ...]    invoke ovs-vsctl on selected node"
  echo "  tcpdump {namespace/podname} [tcpdump options ...]    capture pod traffic"
  echo "  trace {namespace/podname} {target ip address} {icmp|tcp|udp} [target tcp or udp port]    trace ovn microflow of specific packet"
  echo "  diagnose {all|node} [nodename]    diagnose connectivity of all nodes or a specific node"
}

# Capture the traffic of a pod from the kube-ovn-cni pod on the same node.
tcpdump(){
  namespacedPod="$1"; shift
  namespace=$(echo "$namespacedPod" | cut -d "/" -f1)
  podName=$(echo "$namespacedPod" | cut -d "/" -f2)
  if [ "$podName" = "$namespacedPod" ]; then
    # No namespace was given: use the current namespace.
    nodeName=$(kubectl get pod "$podName" -o jsonpath={.spec.nodeName})
    mac=$(kubectl get pod "$podName" -o jsonpath={.metadata.annotations.ovn\\.kubernetes\\.io/mac_address})
    hostNetwork=$(kubectl get pod "$podName" -o jsonpath={.spec.hostNetwork})
  else
    nodeName=$(kubectl get pod "$podName" -n "$namespace" -o jsonpath={.spec.nodeName})
    mac=$(kubectl get pod "$podName" -n "$namespace" -o jsonpath={.metadata.annotations.ovn\\.kubernetes\\.io/mac_address})
    hostNetwork=$(kubectl get pod "$podName" -n "$namespace" -o jsonpath={.spec.hostNetwork})
  fi

  if [ -z "$nodeName" ]; then
    echo "Pod $namespacedPod not exists on any node"
    exit 1
  fi

  if [ -z "$mac" ] && [ "$hostNetwork" != "true" ]; then
    echo "pod mac address not ready"
    exit 1
  fi

  mac=$(echo "$mac" | tr '[:upper:]' '[:lower:]')

  ovnCni=$(kubectl get pod -n $KUBE_OVN_NS -o wide| grep kube-ovn-cni| grep " $nodeName " | awk '{print $1}')
  if [ -z "$ovnCni" ]; then
    echo "kube-ovn-cni not exist on node $nodeName"
    exit 1
  fi

  if [ "$hostNetwork" = "true" ]; then
    set -x
    kubectl exec -it "$ovnCni" -n $KUBE_OVN_NS -- tcpdump -nn "$@"
  else
    # Look up the OVS interface that carries the pod's MAC address.
    nicName=$(kubectl exec -it "$ovnCni" -n $KUBE_OVN_NS -- ovs-vsctl --data=bare --no-heading --columns=name find interface mac_in_use="${mac//:/\\:}" | tr -d '\r')
    if [ -z "$nicName" ]; then
      echo "nic doesn't exist on node $nodeName"
      exit 1
    fi
    set -x
    kubectl exec -it "$ovnCni" -n $KUBE_OVN_NS -- tcpdump -nn -i "$nicName" "$@"
  fi
}

# Trace the OVN microflow of a packet sent from a pod to a target address.
trace(){
  namespacedPod="$1"
  namespace=$(echo "$1" | cut -d "/" -f1)
  podName=$(echo "$1" | cut -d "/" -f2)
  if [ "$podName" = "$1" ]; then
    echo "namespace is required"
    exit 1
  fi

  podIP=$(kubectl get pod "$podName" -n "$namespace" -o jsonpath={.metadata.annotations.ovn\\.kubernetes\\.io/ip_address})
  mac=$(kubectl get pod "$podName" -n "$namespace" -o jsonpath={.metadata.annotations.ovn\\.kubernetes\\.io/mac_address})
  ls=$(kubectl get pod "$podName" -n "$namespace" -o jsonpath={.metadata.annotations.ovn\\.kubernetes\\.io/logical_switch})
  hostNetwork=$(kubectl get pod "$podName" -n "$namespace" -o jsonpath={.spec.hostNetwork})

  if [ "$hostNetwork" = "true" ]; then
    echo "Can not trace host network pod"
    exit 1
  fi

  if [ -z "$ls" ]; then
    echo "pod address not ready"
    exit 1
  fi

  gwMac=$(kubectl exec -it $OVN_NB_POD -n $KUBE_OVN_NS -- ovn-nbctl --data=bare --no-heading --columns=mac find logical_router_port name=ovn-cluster-"$ls" | tr -d '\r')
  if [ -z "$gwMac" ]; then
    echo "get gw mac failed"
    exit 1
  fi

  dst="$2"
  if [ -z "$dst" ]; then
    echo "need a target ip address"
    exit 1
  fi

  type="$3"
  case $type in
    icmp)
      set -x
      kubectl exec "$OVN_SB_POD" -n $KUBE_OVN_NS -- ovn-trace --ct=new "$ls" "inport == \"$podName.$namespace\" && ip.ttl == 64 && icmp && eth.src == $mac && ip4.src == $podIP && eth.dst == $gwMac && ip4.dst == $dst"
      ;;
    tcp|udp)
      set -x
      kubectl exec "$OVN_SB_POD" -n $KUBE_OVN_NS -- ovn-trace --ct=new "$ls" "inport == \"$podName.$namespace\" && ip.ttl == 64 && eth.src == $mac && ip4.src == $podIP && eth.dst == $gwMac && ip4.dst == $dst && $type.src == 10000 && $type.dst == $4"
      ;;
    *)
      echo "type $type not supported"
      echo "kubectl ko trace {namespace/podname} {target ip address} {icmp|tcp|udp} [target tcp or udp port]"
      ;;
  esac
}

# Run ovs-vsctl in the ovs-ovn pod of the selected node.
vsctl(){
  nodeName="$1"; shift
  kubectl get no "$nodeName" > /dev/null
  ovsPod=$(kubectl get pod -n $KUBE_OVN_NS -o wide | grep " $nodeName " | grep ovs-ovn | awk '{print $1}')
  if [ -z "$ovsPod" ]; then
    echo "ovs pod doesn't exist on node $nodeName"
    exit 1
  fi
  kubectl exec "$ovsPod" -n $KUBE_OVN_NS -- ovs-vsctl "$@"
}

# Check the core components, then run the pinger on all nodes or on one node.
diagnose(){
  kubectl get crd subnets.kubeovn.io
  kubectl get crd ips.kubeovn.io
  kubectl get svc kube-dns -n kube-system
  kubectl get svc kubernetes -n default

  checkDaemonSet kube-proxy
  checkDeployment ovn-central
  checkDeployment kube-ovn-controller
  checkDaemonSet kube-ovn-cni
  checkDaemonSet ovs-ovn
  checkDeployment coredns

  type="$1"
  case $type in
    all)
      echo "### kube-ovn-controller recent log"
      set +e
      kubectl logs -n $KUBE_OVN_NS -l app=kube-ovn-controller --tail=100 | grep E$(date +%m%d)
      set -e
      echo ""
      pingers=$(kubectl get pod -n $KUBE_OVN_NS | grep kube-ovn-pinger | awk '{print $1}')
      for pinger in $pingers
      do
        nodeName=$(kubectl get pod "$pinger" -n "$KUBE_OVN_NS" -o jsonpath={.spec.nodeName})
        echo "### start to diagnose node $nodeName"
        echo "#### ovn-controller log:"
        kubectl exec -n $KUBE_OVN_NS -it "$pinger" -- tail /var/log/ovn/ovn-controller.log
        echo ""
        kubectl exec -n $KUBE_OVN_NS -it "$pinger" -- /kube-ovn/kube-ovn-pinger --mode=job
        echo "### finish diagnose node $nodeName"
        echo ""
      done
      ;;
    node)
      nodeName="$2"
      kubectl get no "$nodeName" > /dev/null
      pinger=$(kubectl get pod -n $KUBE_OVN_NS -o wide | grep kube-ovn-pinger | grep " $nodeName " | awk '{print $1}')
      echo "### start to diagnose node $nodeName"
      echo "#### ovn-controller log:"
      kubectl exec -n $KUBE_OVN_NS -it "$pinger" -- tail /var/log/ovn/ovn-controller.log
      echo ""
      kubectl exec -n $KUBE_OVN_NS -it "$pinger" -- /kube-ovn/kube-ovn-pinger --mode=job
      echo "### finish diagnose node $nodeName"
      echo ""
      ;;
    *)
      echo "type $type not supported"
      echo "kubectl ko diagnose {all|node} [nodename]"
      ;;
  esac
}

# Locate the ovn-central pods that currently hold the NB and SB leader labels.
getOvnCentralPod(){
  NB_POD=$(kubectl get pod -n $KUBE_OVN_NS -l ovn-nb-leader=true | grep ovn-central | head -n 1 | awk '{print $1}')
  if [ -z "$NB_POD" ]; then
    echo "nb leader not exists"
    exit 1
  fi
  OVN_NB_POD=$NB_POD
  SB_POD=$(kubectl get pod -n $KUBE_OVN_NS -l ovn-sb-leader=true | grep ovn-central | head -n 1 | awk '{print $1}')
  if [ -z "$SB_POD" ]; then
    echo "sb leader not exists"
    exit 1
  fi
  OVN_SB_POD=$SB_POD
}

checkDaemonSet(){
  name="$1"
  currentScheduled=$(kubectl get ds -n $KUBE_OVN_NS "$name" -o jsonpath={.status.currentNumberScheduled})
  desiredScheduled=$(kubectl get ds -n $KUBE_OVN_NS "$name" -o jsonpath={.status.desiredNumberScheduled})
  available=$(kubectl get ds -n $KUBE_OVN_NS "$name" -o jsonpath={.status.numberAvailable})
  ready=$(kubectl get ds -n $KUBE_OVN_NS "$name" -o jsonpath={.status.numberReady})
  if [ "$currentScheduled" = "$desiredScheduled" ] && [ "$desiredScheduled" = "$available" ] && [ "$available" = "$ready" ]; then
    echo "ds $name ready"
  else
    echo "Error ds $name not ready"
    exit 1
  fi
}

checkDeployment(){
  name="$1"
  ready=$(kubectl get deployment -n $KUBE_OVN_NS "$name" -o jsonpath={.status.readyReplicas})
  updated=$(kubectl get deployment -n $KUBE_OVN_NS "$name" -o jsonpath={.status.updatedReplicas})
  desire=$(kubectl get deployment -n $KUBE_OVN_NS "$name" -o jsonpath={.status.replicas})
  available=$(kubectl get deployment -n $KUBE_OVN_NS "$name" -o jsonpath={.status.availableReplicas})
  if [ "$ready" = "$updated" ] && [ "$updated" = "$desire" ] && [ "$desire" = "$available" ]; then
    echo "deployment $name ready"
  else
    echo "Error deployment $name not ready"
    exit 1
  fi
}

if [ $# -lt 1 ]; then
  showHelp
  exit 0
else
  subcommand="$1"; shift
fi

getOvnCentralPod

case $subcommand in
  nbctl)
    kubectl exec "$OVN_NB_POD" -n $KUBE_OVN_NS -- ovn-nbctl "$@"
    ;;
  sbctl)
    kubectl exec "$OVN_SB_POD" -n $KUBE_OVN_NS -- ovn-sbctl "$@"
    ;;
  vsctl)
    vsctl "$@"
    ;;
  tcpdump)
    tcpdump "$@"
    ;;
  trace)
    trace "$@"
    ;;
  diagnose)
    diagnose "$@"
    ;;
  *)
    showHelp
    ;;
esac
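The tcpdump and trace subcommands both take a {namespace/podname} argument and split it with cut. The sketch below isolates that parsing step so its behavior can be checked without a cluster; the function name split_pod_ref is an illustrative assumption and does not exist in the plugin.

```shell
#!/bin/bash
# Mirrors the parsing used by tcpdump(): "ns/pod" yields both parts,
# while a bare "pod" (no slash) yields an empty namespace.
split_pod_ref() {
  local ref="$1" namespace pod
  namespace=$(echo "$ref" | cut -d "/" -f1)
  pod=$(echo "$ref" | cut -d "/" -f2)
  # cut prints the whole line when the delimiter is absent,
  # so pod == ref means no namespace was supplied.
  if [ "$pod" = "$ref" ]; then
    namespace=""
  fi
  echo "namespace=${namespace} pod=${pod}"
}
```

kubectl discovers plugins as executables named kubectl-<name> on the PATH, so the script above would typically be saved as kubectl-ko and made executable. After that, invocations such as kubectl ko nbctl show or kubectl ko trace ls1/qos 10.16.0.1 icmp follow the help text; the pod name and target address in these examples are placeholders.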


3 XPF User Guide

3.1 Introduction
3.2 Environment Requirements
3.3 BIOS Settings
3.4 Configuring the Compilation Environment
3.5 Compiling and Installing XPF
3.6 Configuring Logs
3.7 Running and Verifying XPF
3.8 Troubleshooting
3.9 OVS Command Description
3.10 Change History

3.1 Introduction

XPF Overview
The Data Plane Development Kit (DPDK) is an open-source, high-performance, and user-mode software project developed by Intel. Open vSwitch (OVS) is an open-source implementation of a distributed virtual multilayer switch and is widely used in cloud computing. OVS+DPDK provides flexible network management and high-performance forwarding capabilities. For more information about DPDK and OVS, visit the official websites https://www.dpdk.org/ and https://www.openvswitch.org/.
Based on OVS+DPDK, the Extensible Packet Framework (XPF) proposes flow table normalization to further accelerate data packet forwarding performance in cloud computing. The typical scenario is VXLAN+CT networking.
Programming language: C
Brief description: high-performance implementation of a virtual switch


Open-source protocols: BSD 3-Clause License, LGPL V2.1, GPL V2.0, Apache License V2.0

Recommended Version

Table 3-1 Software versions

Software    Version
DPDK        19.11
OVS         2.12.0
XPF         1.0.0

Security Hardening
Two Common Vulnerabilities and Exposures (CVE) entries affect open-source DPDK 19.11. You are advised to integrate the patches for the following CVEs into DPDK 19.11 before using it:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10726
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10725

3.2 Environment Requirements

Hardware Requirements
Table 3-2 lists the hardware requirements.

Table 3-2 Hardware requirements

Item    Description
CPU     Kunpeng processor
NIC     The NIC must support DPDK.

OS Requirements
Table 3-3 lists the OS requirements.

Table 3-3 OS requirements

Item      Version    How to Obtain
CentOS    7.6        https://mirrors.huaweicloud.com/centosvault/altarch/7.6.1810/isos/aarch64
Kernel    4.14.0     Included in the OS image.

3.3 BIOS Settings
Step 1 Log in to the baseboard management controller (BMC) system of the server, restart the server, and press DEL to enter the BIOS.
Step 2 Enable the input/output memory management unit (IOMMU).
1. Choose Advanced > MISC Config and press Enter.
2. Set Support Smmu to Enabled.
3. Press F10 to save the settings and exit.
----End
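After the server restarts, the setting can be checked from the OS. On Linux, a non-empty /sys/kernel/iommu_groups directory usually indicates that the IOMMU (SMMU on Arm servers) is active. The following is a minimal sketch of such a check; the function name count_iommu_groups is an illustrative assumption, and this check is not part of the XPF procedure.

```shell
#!/bin/bash
# Counts the IOMMU groups under the given sysfs directory
# (default: /sys/kernel/iommu_groups). Prints 0 if the directory is missing.
count_iommu_groups() {
  local dir="${1:-/sys/kernel/iommu_groups}"
  if [ ! -d "$dir" ]; then
    echo 0
    return
  fi
  find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}
```

A result of 0 from count_iommu_groups suggests that Support Smmu is still disabled or that the kernel did not initialize the IOMMU.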


3.4 Configuring the Compilation Environment
Step 1 Install compilation dependencies.
sudo yum install -y automake cmake patch numactl numactl-devel kernel-devel libevent glib2 glib2-devel libtool openssl-devel selinux-policy-devel autoconf python-sphinx unbound-devel logrotate
Step 2 Install VM dependencies.
sudo yum install centos-release-qemu-ev
sudo yum install -y libvirt AAVMF virt-install qemu-guest-agent qemu-kvm-common-ev qemu-img-ev qemu-kvm-tools-ev qemu-kvm-ev
Step 3 Upgrade the GCC.
sudo yum install -y centos-release-scl
sudo yum install -y devtoolset-7-gcc devtoolset-7-gcc-c++
scl enable devtoolset-7 bash
Step 4 (Optional) Configure GCC environment variables.
1. The GCC environment variables revert to their default values when the Bash session ends or you log in again. In this case, run the following command again to restore the GCC environment variables:
scl enable devtoolset-7 bash
2. Retain GCC environment variables of Bash. Open the ~/.bash_profile file.
vim ~/.bash_profile
Add the following content to the last line:
scl enable devtoolset-7 bash
----End

CAUTION
● Some of the preceding software packages need to be downloaded from the Internet. Ensure that the server is connected to the Internet and that the corresponding yum sources are configured.
● The QEMU software packages whose names end with the suffix ev must be installed. Otherwise, the dpdkvhostuser and dpdkvhostuserclient ports cannot be configured for VMs.
● By default, DPDK 19.11 cannot be compiled with GCC 4.8.5. You need to either upgrade GCC or modify the compilation parameters. Modifying the compilation parameters affects performance, so you are advised to upgrade GCC.
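Before compiling, it can help to confirm that the active GCC is newer than 4.8.5. The sketch below compares version strings with GNU sort -V. The function names are illustrative assumptions, and treating any version newer than 4.8.5 as acceptable is also an assumption, because this guide does not state a minimum version.

```shell
#!/bin/bash
# Returns 0 if version $1 is strictly newer than version $2 (GNU sort -V).
version_gt() {
  [ "$1" != "$2" ] && \
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Checks the given version, or the gcc on PATH if no argument is passed.
# Assumption: anything newer than 4.8.5 is treated as acceptable.
check_gcc_for_dpdk() {
  local current="${1:-$(gcc -dumpversion)}"
  if version_gt "$current" "4.8.5"; then
    echo "GCC $current: OK"
  else
    echo "GCC $current: upgrade required"
    return 1
  fi
}
```

Run check_gcc_for_dpdk without arguments inside the shell started by scl enable devtoolset-7 bash to check the compiler that the build will actually use.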

3.5 Compiling and Installing XPF

Obtaining the Source Code
Step 1 Create directories.


mkdir -p /home/source_code
mkdir -p /home/patch_code/ovs_patch
mkdir -p /home/rpm_packet
Step 2 Obtain the DPDK source code.
1. Download the source code.
- Method 1 (recommended): Download the DPDK 19.11 source code from the Kunpeng community.
  https://github.com/kunpengcompute/ovs/releases/download/v2.12.0/dpdk-19.11.tar.gz
- Method 2: Download the DPDK 19.11 source code from the official website.
  https://fast.dpdk.org/rel/dpdk-19.11.tar.xz
2. Copy the source code to the /home/source_code directory on the server.
Step 3 Obtain the OVS source code.
1. Download the source code.
- Method 1 (recommended): Download the source code of the ovs_xpf_v2.12.0 branch, to which the open-source xpf.patch has already been applied.
  https://github.com/kunpengcompute/ovs/archive/ovs_xpf_v2.12.0.zip
- Method 2: Download the v2.12.0 source code released by the open-source community.
  https://www.openvswitch.org/releases/openvswitch-2.12.0.tar.gz
2. Copy the source code to the /home/source_code directory on the server.
Step 4 Obtain the patch code.
1. Download the patch code.
  https://github.com/kunpengcompute/ovs/releases/download/v2.12.0/xpf.patch
2. Copy the patch code to the /home/patch_code/ovs_patch directory on the server.
Step 5 Obtain the XPF library.
1. Download the binary RPM packages and digital signature files.
- Method 1: Huawei Enterprise website
  xpf-1.0.0-1.aarch64.rpm
  xpf-devel-1.0.0-1.aarch64.rpm
- Method 2: Huawei Carrier website
  xpf-1.0.0-1.aarch64.rpm
  xpf-devel-1.0.0-1.aarch64.rpm
2. Obtain the software verification tool.
- Method 1: Huawei Enterprise website
  https://support.huawei.com/enterprise/en/tool/pgp-verify-TL1000000054
- Method 2: Huawei Carrier website
  https://support.huawei.com/carrier/digitalSignatureAction
3. Verify the software package integrity by following the procedure described in the OpenPGP Signature Verification Guide obtained in Step 5.2.
4. Copy the compressed RPM package to the /home/rpm_packet directory on the server.
5. Decompress the package.
tar -xzf /home/rpm_packet/OVSOE_ALL.tar.gz -C /home/rpm_packet && rm -rf /home/rpm_packet/OVSOE_ALL.tar.gz
The directory structure is as follows:

----End

Decompressing Source Code Packages
Step 1 Go to the /home/source_code directory.
cd /home/source_code
Step 2 Decompress the DPDK source code package.
tar -xf dpdk-19.11.tar.xz && rm -f dpdk-19.11.tar.xz
Step 3 Decompress the OVS source code package.
tar -xzf openvswitch-2.12.0.tar.gz && rm -f openvswitch-2.12.0.tar.gz
Step 4 Install the patch for the corresponding source code.
patch -d openvswitch-2.12.0 -p1 < ../patch_code/ovs_patch/xpf.patch
NOTE
Skip this step if the downloaded code is the source code of the ovs_xpf_v2.12.0 branch.
Step 5 Forcibly install the XPF binary library.
rpm -ivh --nodeps /home/rpm_packet/xpf-1.0.0-1.aarch64.rpm /home/rpm_packet/xpf-devel-1.0.0-1.aarch64.rpm
NOTE
Replace /home/rpm_packet/ with the actual path.
----End

Compiling and Installing DPDK
Step 1 Go to the /home/source_code/dpdk-19.11 directory.
cd /home/source_code/dpdk-19.11


Step 2 Compile and install DPDK.
make O=arm64-armv8a-linuxapp-gcc T=arm64-armv8a-linuxapp-gcc config
sed -ri 's,(RTE_APP_TEST=).*,\1n,' arm64-armv8a-linuxapp-gcc/.config
sed -ri 's,(RTE_BUILD_SHARED_LIB=).*,\1y,' arm64-armv8a-linuxapp-gcc/.config
make O=arm64-armv8a-linuxapp-gcc -j 96
make install O=arm64-armv8a-linuxapp-gcc prefix=/usr libdir=/lib64
If the /usr/lib64 directory contains the librtexxx_xxx.so dynamic library file, the DPDK compilation and installation are complete.
----End
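The installation check described above can be scripted. The sketch below looks for DPDK shared libraries in a directory, assuming that they follow the usual librte*.so naming; the function name has_dpdk_libs is an illustrative assumption.

```shell
#!/bin/bash
# Returns 0 if at least one librte*.so* file exists in the given
# directory (default: /usr/lib64), and 1 otherwise.
has_dpdk_libs() {
  local dir="${1:-/usr/lib64}"
  local f
  for f in "$dir"/librte*.so*; do
    # If the glob matches nothing, $f is the literal pattern and -e fails.
    [ -e "$f" ] && return 0
  done
  return 1
}
```

For example, has_dpdk_libs && echo "DPDK libraries found" reports success once make install has copied the libraries to /usr/lib64.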

Compiling and Installing OVS
Step 1 Go to the root directory of source code.
● Run the following command if the source code of the ovs_xpf_v2.12.0 branch is downloaded from the Kunpeng community:
cd /home/source_code/ovs-ovs_xpf_v2.12.0
● Run the following command if the code is downloaded from the official OVS website:
cd /home/source_code/openvswitch-2.12.0
Step 2 Compile and install OVS.
./boot.sh
./configure CFLAGS="-g -O2 -march=armv8-a+crc" --prefix=/usr --sysconfdir=/etc --localstatedir=/var --libdir=/lib64 --enable-ssl --enable-shared --with-dpdk=yes --enable-Werror
make -j 96 && make install
Step 3 Copy necessary header files.
cp config.h /usr/include/openvswitch
mkdir /usr/include/openvswitch/lib
cp lib/*.h /usr/include/openvswitch/lib/
Step 4 Recompile and reinstall OVS.
make clean
./configure CFLAGS="-g -O2 -march=armv8-a+crc -ftree-vectorize -I/usr/include/xpf-1.0.0/xpf_include" --prefix=/usr --sysconfdir=/etc --localstatedir=/var --libdir=/lib64 --enable-ssl --enable-shared --with-dpdk=yes --enable-Werror --enable-xpf
make -j 96 && make install
Step 5 (Optional) Configure OVS to start in service mode.
NOTE
You need to run multiple commands to manually start, stop, and restart the OVS service. Running OVS in service mode can greatly simplify related operations.
1. Switch to the rhel directory in the OVS source code directory, copy the etc_init.d_openvswitch file to the /etc/init.d directory, rename the file openvswitch, and set the file permission to 755.
cd rhel/
cp etc_init.d_openvswitch /etc/init.d/openvswitch
chmod 755 /etc/init.d/openvswitch

2. Run OVS in service mode.
After the OVS startup configuration is complete (see 3.7 Running and Verifying XPF), you can start, stop, and restart OVS in service mode.

- Start the OVS service.
service openvswitch start
- Stop the OVS service.
service openvswitch stop
- Restart the OVS service.
service openvswitch restart

----End
NOTE
● The XPF library depends on the open-source OVS, and the customized OVS depends on the XPF library. Therefore, OVS must be compiled twice: first compile the open-source OVS and copy the necessary header files (Steps 2 and 3), and then compile the customized OVS with XPF enabled (Step 4).
● The boot.sh script uses Libtool to generate the modified configure file, adds the corresponding macro definitions, and enables the XPF code.

3.6 Configuring Logs
DPDK logs are output to dmesg by default. To facilitate fault locating, modify configuration files to output DPDK logs to OVS log files and configure log control methods. Perform the following configuration based on the site requirements.
Step 1 Create the DPDK log directory.
mkdir -p /var/log/dpdk
Step 2 Configure the Rsyslog configuration file of DPDK.
1. Create the dpdk.conf configuration file in the /etc/rsyslog.d directory.
vim /etc/rsyslog.d/dpdk.conf


2. Add the following content to the file:
#DPDK_LOG_TAG "LibLogTag_DPDK"
template(name="template-dpdk" type="string" string="%TIMESTAMP:::date-rfc3339%|%syslogseverity-text%|%programname%[%PROCID%]|%$!msg_pre%%$!msg_after%\n")
$outchannel dpdk,/var/log/dpdk/dpdk.log,2097152,/opt/esyslog/esyslog_log_rsyslog_dump.sh /var/log/dpdk/dpdk.log dpdk
if ($msg contains "LibLogTag_DPDK" and $syslogseverity <= 7 ) then {
    set $!msg_pre = field($msg,"LibLogTag_DPDK|",1);
    set $!msg_after = field($msg,"LibLogTag_DPDK|",2);
    :omfile:$dpdk;template-dpdk
    stop
}

if ($msg contains "LibLogTag_DPDK" and $syslogseverity > 7 ) then {
    /dev/null
    stop
}
Step 3 Configure the DPDK log rotation file.
1. Create the /etc/logrotate.d/dpdk file.
vim /etc/logrotate.d/dpdk
2. Add the following content to the file:
/var/log/dpdk/dpdk.log {
    hourly
    compress
    missingok
    notifempty
    maxsize 2048k
    rotate 50
    copytruncate
}
Step 4 Configure the OVS log rotation file.
1. Create the /etc/logrotate.d/openvswitch file.
vim /etc/logrotate.d/openvswitch
2. Add the following content to the file:
# Copyright (C) 2009, 2010, 2011, 2012 Nicira, Inc.
#
# Copying and distribution of this file, with or without modification,
# are permitted in any medium without royalty provided the copyright
# notice and this notice are preserved. This file is offered as-is,
# without warranty of any kind.

/var/log/openvswitch/*.log {
    su root root
    hourly
    compress
    missingok
    maxsize 2048k
    rotate 50
    sharedscripts
    postrotate
    # Tell Open vSwitch daemons to reopen their log files
    if [ -d /var/run/openvswitch ]; then
        for ctl in /var/run/openvswitch/*.ctl; do
            ovs-appctl -t "$ctl" vlog/reopen 2>/dev/null || :
        done
    fi
    endscript
}
Step 5 Configure the scheduled log processing file.


1. Create the ovslogrotate file in /etc/cron.hourly.
vim /etc/cron.hourly/ovslogrotate
2. Add the following content to the file:
#!/bin/sh
/usr/sbin/logrotate -s /var/lib/logrotate/logrotate.status /etc/logrotate.d/dpdk
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
/usr/sbin/logrotate -s /var/lib/logrotate/logrotate.status /etc/logrotate.d/openvswitch
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
exit 0
3. Add the read and execute permissions to the file.
chmod 0644 /etc/cron.hourly/ovslogrotate
chmod +x /etc/cron.hourly/ovslogrotate
----End

3.7 Running and Verifying XPF
Step 1 Set the huge page memory and edit the startup items.
vim /etc/grub2-efi.cfg
Add default_hugepagesz=512M hugepagesz=512M hugepages=64 to the file.

Step 2 Enable IOMMU and CPU isolation.
1. Open the /etc/grub2-efi.cfg file.
vim /etc/grub2-efi.cfg
2. Add isolcpus=0-5 iommu.passthrough=1 to the file.


Step 3 Restart the host for the configuration to take effect.
NOTE
To enable IOMMU, you also need to configure the BIOS in addition to configuring the startup items. For details, see 3.3 BIOS Settings.
Step 4 Start OVS.
1. Create the OVS working directory.
mkdir -p /var/run/openvswitch
mkdir -p /var/log/openvswitch
2. Create the OVS database file.
ovsdb-tool create /etc/openvswitch/conf.db
3. Start the ovsdb-server program.
ovsdb-server --remote=punix:/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach --log-file
4. Set OVS startup parameters.
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true other_config:dpdk-socket-mem="4096" other_config:dpdk-lcore-mask="0x1F" other_config:pmd-cpu-mask="0x1E"
5. Start OVS.
ovs-vswitchd --pidfile --detach --log-file
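The two CPU masks set in Step 4 are bitmaps in which bit N selects CPU N. The following sketch shows how such a mask can be derived from a list of CPU IDs. The helper name cpu_mask is an assumption made for illustration and is not an OVS or DPDK command.

```shell
#!/bin/bash
# Build a hexadecimal CPU mask from a list of CPU IDs (bit N = CPU N).
# cpu_mask is a hypothetical helper, not an OVS or DPDK command.
cpu_mask() {
    local mask=0 cpu
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '0x%X\n' "$mask"
}

cpu_mask 0 1 2 3 4   # CPUs 0-4, prints 0x1F (the lcore mask used above)
cpu_mask 1 2 3 4     # CPUs 1-4, prints 0x1E (the PMD mask used above)
```

Both masks stay within the CPUs isolated by isolcpus=0-5. When choosing other values, keep the PMD CPUs inside the isolated range and away from the CPUs assigned to VMs.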


Step 5 Bind the NIC to the user-mode DPDK.
1. Load the igb_uio driver.
modprobe igb_uio
NOTE
If this command is executed for the first time, run the depmod command to enable the system to process driver dependencies. The driver is provided by DPDK and is installed in /lib/modules/4.14.0-115.el7a.0.1.aarch64/extra/dpdk/igb_uio.ko by default. If the system restarts, you need to load the driver again.
2. View network port information.
dpdk-devbind -s


Find the PCI address of the network port to be bound.
NOTE
The network port to be bound must be in the down state. Otherwise, the binding fails.
3. Bind the NIC to the user-mode DPDK.
dpdk-devbind --bind=igb_uio 0000:05:00.0
dpdk-devbind --bind=igb_uio 0000:06:00.0
NOTE
To roll back the operation, run the following command:
dpdk-devbind -u 0000:05:00.0
4. Check whether the binding is successful.


Step 6 Create a network. The network to be verified is a typical OVS network, as shown in Figure 3-1.
Figure 3-1 OVS networking


NOTE
The following lists the commands run on Host 1. The commands run on Host 2 are similar, except that the IP address of br-dpdk on Host 2 is different.
1. Add and configure the br-dpdk bridge.
ovs-vsctl add-br br-dpdk -- set bridge br-dpdk datapath_type=netdev
ovs-vsctl add-bond br-dpdk dpdk-bond p0 p1 -- set Interface p0 type=dpdk options:dpdk-devargs=0000:05:00.0 -- set Interface p1 type=dpdk options:dpdk-devargs=0000:06:00.0
ovs-vsctl set port dpdk-bond bond_mode=balance-tcp
ovs-vsctl set port dpdk-bond lacp=active
ifconfig br-dpdk 192.168.2.1/24 up
NOTE
The ifconfig br-dpdk 192.168.2.1/24 up command is used to configure a virtual extensible LAN (VXLAN) tunnel (192.168.2.1 indicates the IP address of the br-dpdk bridge). Run the ifconfig br-dpdk 192.168.2.2/24 up command on Host 2. The network segment of the tunnel is different from that of the VM.
2. Add and configure the br-int bridge.
ovs-vsctl add-br br-int -- set bridge br-int datapath_type=netdev
ovs-vsctl add-port br-int vxlan0 -- set Interface vxlan0 type=vxlan options:local_ip=192.168.2.1 options:remote_ip=192.168.2.2
NOTE
In this networking, br-int has a VXLAN port, and the VXLAN header is added to all outgoing traffic of the host. local_ip of the VXLAN port is set to the IP address of the local br-dpdk, and remote_ip is set to the IP address of the peer br-dpdk.
3. Add and configure the br-plyn bridge.
ovs-vsctl add-br br-ply1 -- set bridge br-ply1 datapath_type=netdev
ovs-vsctl add-port br-ply1 tap1 -- set Interface tap1 type=dpdkvhostuserclient options:vhost-server-path=/var/run/openvswitch/tap1
ovs-vsctl add-port br-ply1 p-tap1-int -- set Interface p-tap1-int type=patch options:peer=p-tap1
ovs-vsctl add-port br-int p-tap1 -- set Interface p-tap1 type=patch options:peer=p-tap1-int
NOTE
In this networking, a br-ply bridge is added each time a VM is added. The bridge has a dpdkvhostuserclient port for the VM and a patch port that connects it to the br-int bridge.
4. Verify the networking.
ovs-vsctl show


5. Check whether the br-dpdk bridge at the local end is connected to the br-dpdk bridge at the peer end.
ping 192.168.2.2

Step 7 Start the VM.
When configuring a VM, pay attention to the huge page memory and network port configuration. The following VM configuration file is for reference:
<domain type='kvm'>
  <name>VM1</name>
  <uuid>fb8eb9ff-21a7-42ad-b233-2a6e0470e0b5</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='524288' unit='KiB' nodeset='0'/>
    </hugepages>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='6'/>
    <vcpupin vcpu='1' cpuset='7'/>
    <vcpupin vcpu='2' cpuset='8'/>
    <vcpupin vcpu='3' cpuset='9'/>
    <emulatorpin cpuset='0-3'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <os>
    <type arch='aarch64' machine='virt-rhel7.6.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/AAVMF/AAVMF_CODE.fd</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/VM1_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <gic version='3'/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='1'/>
    <numa>
      <cell id='0' cpus='0-3' memory='2097152' unit='KiB' memAccess='shared'/>
    </numa>
  </cpu>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/home/kvm/images/1.img'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='sda' bus='scsi'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='qemu-xhci' ports='8'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0xa'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0xb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0xc'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
    </controller>
    <interface type='vhostuser'>
      <source type='unix' path='/var/run/openvswitch/tap1' mode='server'/>
      <target dev='tap1'/>
      <model type='virtio'/>
      <driver name='vhost' queues='4' rx_queue_size='1024' tx_queue_size='1024'/>
    </interface>
    <serial type='pty'>
      <target type='system-serial' port='0'>
        <model name='pl011'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
  </devices>
</domain>
NOTE
● The memoryBacking section specifies the huge page size to be requested. The host is configured with 512 MB huge pages, so specify 512 MB in this section.
● The numatune section specifies the NUMA node from which the memory is allocated. The NUMA node of the VM must be the same as that of the NIC.
● The numa subsection specifies the VM memory mode. In this example, a vhostuser virtual network port is configured, so the VM must use shared huge page memory.
● The interface section specifies the virtual network port of the VM.
- In the source section, path specifies the location of the socket file for communication between the host and VM, and mode specifies the type of the VM socket. OVS is configured with a dpdkvhostuserclient port, which works in client mode. Therefore, mode is set to server in this example.
- The target section specifies the name of the socket used by the VM.
- The driver section specifies the driver used by the VM, the number of queues, and the queue depth. In this example, the vhostuser driver is used.
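To make the huge page sizing in the notes above concrete, the following sketch works out how many 512 MB pages the example VM and the 4096 MB of DPDK socket memory set in Step 4 consume, compared with the 64 pages reserved in Step 1. The helper name pages_needed is an assumption made for illustration.

```shell
#!/bin/bash
# Number of huge pages needed to back a given amount of memory (rounded up).
# pages_needed is a hypothetical helper for illustration only.
pages_needed() {
    local mem_kib=$1 page_kib=$2
    echo $(( (mem_kib + page_kib - 1) / page_kib ))
}

PAGE_KIB=524288                                          # 512 MB page, as in <page size='524288' unit='KiB'>
vm_pages=$(pages_needed 2097152 "$PAGE_KIB")             # VM memory from the <memory> element
ovs_pages=$(pages_needed $(( 4096 * 1024 )) "$PAGE_KIB") # 4096 MB of DPDK socket memory
echo "VM pages: $vm_pages, OVS pages: $ovs_pages, reserved on host: 64"
```

With these values, one VM uses 4 pages and OVS uses 8, so the 64 reserved pages leave room for additional VMs of the same size on the same NUMA node.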
Step 8 Verify cross-host VM communication.
Check whether the VM on Host 1 can communicate with the VM on Host 2.
ping 192.168.1.21


NOTE
The VXLAN header increases the length of each packet, and the default maximum transmission unit (MTU) of the host is 1500 bytes. Reduce the MTU of the VM so that encapsulated packets still fit within the host MTU. An MTU of 1400 bytes is recommended for the VM network port.
----End
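The MTU recommendation above follows from the encapsulation overhead. The sketch below computes the largest VM MTU that fits into one host packet, assuming VXLAN over IPv4 without VLAN tags or IP options (assumptions not stated in this guide). The helper name max_vm_mtu and the interface name eth0 are hypothetical.

```shell
#!/bin/bash
# Largest inner (VM) MTU that fits in one underlay packet without fragmentation,
# assuming VXLAN over IPv4 with no VLAN tag and no IP options.
# max_vm_mtu is a hypothetical helper for illustration only.
max_vm_mtu() {
    local host_mtu=$1
    local outer_ip=20 outer_udp=8 vxlan=8 inner_eth=14
    echo $(( host_mtu - outer_ip - outer_udp - vxlan - inner_eth ))
}

max_vm_mtu 1500   # prints 1450

# Inside the VM (the interface name eth0 is an assumption):
#   ip link set dev eth0 mtu 1400
```

The recommended 1400 bytes is below the computed limit of 1450 bytes, which leaves headroom for additional headers.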
3.8 Troubleshooting
Problem 1: Insufficient Permission Upon VM Startup
Symptom
The VM fails to start, and the following information is displayed:
Possible Cause
● SELinux prevents the VM from creating the /var/run/openvswitch/tap1 file.
● The QEMU process does not have the permission to access the file.
Procedure
1. In the QEMU configuration file /etc/libvirt/qemu.conf, ensure that the QEMU process user has the read and execute permissions on the /var/run/openvswitch directory. The following shows the default line numbers for configuration:


Restart the libvirtd service.
systemctl restart libvirtd
2. Ensure that SELinux does not block processes.
a. Disable SELinux.
setenforce 0
b. Start the VM.
virsh start VM1
The VM is started successfully, and the fault is rectified.

3.9 OVS Command Description
Table 3-4 lists the OVS commands used for maintaining the status and functions of software offloading.

Table 3-4 OVS command description
Command | Function
ovs-appctl hwoff-flow-agent/add-protolist | Adds a protocol blacklist. Packets transmitted using the listed protocols are not offloaded.
ovs-appctl hwoff-flow-agent/add-rapid-protolist | Adds a protocol whitelist. Packets transmitted using the listed protocols are offloaded.
ovs-appctl hwoff-flow-agent/clear-error-stats | Clears the error statistics collected by hwoff-flow-agent.
ovs-appctl hwoff-flow-agent/disable | Disables the flow table offloading function (software offloading).
ovs-appctl hwoff-flow-agent/dump-ct | Queries the connection tracking (CT) information recorded by hwoff-flow-agent.
ovs-appctl hwoff-flow-agent/dump-policy | Queries parameters in the offloading policy of hwoff-flow-agent.
ovs-appctl hwoff-flow-agent/dump-protolist | Queries the protocol blacklist that has been added to hwoff-flow-agent.
ovs-appctl hwoff-flow-agent/dump-rapid-protolist | Queries the protocol whitelist that has been added to hwoff-flow-agent.
ovs-appctl hwoff-flow-agent/enable | Enables the flow table offloading function (software offloading). This function is enabled by default.
ovs-appctl hwoff-flow-agent/error-stats | Queries the error statistics collected by hwoff-flow-agent.
ovs-appctl hwoff-flow-agent/flush-flow | Clears information about hardware flow tables.
ovs-appctl hwoff-flow-agent/flush-protolist | Clears the protocol blacklist.
ovs-appctl hwoff-flow-agent/flush-rapid-protolist | Clears the protocol whitelist.
ovs-appctl hwoff-flow-agent/get-offloadrx-threshold | Obtains the threshold for automatically deleting cached flow tables.
ovs-appctl hwoff-flow-agent/live-time | Sets the aging time of flow tables offloaded to hwoff-flow-agent.
ovs-appctl hwoff-flow-agent/policy | Configures policy parameters for flow table offloading.
ovs-appctl hwoff-flow-agent/print-flow | Prints flow table details in OVS logs during flow table offloading.
ovs-appctl hwoff-flow-agent/set-protolist-mode | Specifies the protocol list type (blacklist or whitelist) recorded by hwoff-flow-agent.
ovs-appctl hwoff-flow-agent/stats | Lists non-error statistics collected by hwoff-flow-agent.
ovs-appctl hwoff-flow-agent/use-offloadrx-threshold | Sets the threshold for automatically deleting cached flow tables.
ovs-appctl hwoff/dump-hwoff-flows | Queries information about hardware flow tables.
ovs-appctl hwoff/shmap-dump-hw | Queries the unique flow IDs (UFIDs) of all hardware flow tables in the software-hardware flow table mapping maintained by the host; queries the UFID of the software flow table associated with the UFID of a specified hardware flow table.
ovs-appctl hwoff/shmap-dump-sw | Queries the UFIDs of all software flow tables associated with hardware flow tables in the software-hardware flow table mapping maintained by the host; queries the UFID of the hardware flow table associated with the UFID and type of a specified software flow table.
ovs-appctl hwoff/shmap-error-stats | Queries the error statistics of the software-hardware flow table mapping.
ovs-appctl hwoff/shmap-flush | Clears the software-hardware flow table mapping.

The following describes the commands in detail.

ovs-appctl hwoff-flow-agent/add-protolist
Syntax
ovs-appctl hwoff-flow-agent/add-protolist [PROTOLIST]
Function
Adds a protocol blacklist. Packets transmitted using the listed protocols are not offloaded.
Parameter Description
Parameter | Mandatory | Description
PROTOLIST | Yes | Protocol number list. Each protocol number ranges from 0 to 255 or from 1536 to 65535, for example, 23, 25, or 1560.

Example
ovs-appctl hwoff-flow-agent/add-protolist 23,25,1560
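Because PROTOLIST accepts only values in two ranges, it can help to validate a list before passing it to the command. The following sketch checks a comma-separated list against the documented ranges and prints the resulting command instead of running it. The helper name valid_protolist is an assumption made for illustration and is not part of OVS.

```shell
#!/bin/bash
# Check that every entry of a comma-separated protocol list is a number
# in 0-255 or 1536-65535, the ranges documented for PROTOLIST.
# valid_protolist is a hypothetical helper, not part of OVS.
valid_protolist() {
    local item
    local IFS=','
    [ -n "$1" ] || return 1
    for item in $1; do
        case "$item" in
            ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric entries
        esac
        if [ "$item" -le 255 ] || { [ "$item" -ge 1536 ] && [ "$item" -le 65535 ]; }; then
            continue
        fi
        return 1
    done
    return 0
}

list="23,25,1560"
if valid_protolist "$list"; then
    # Print the command; remove the echo to run it on a configured host.
    echo "ovs-appctl hwoff-flow-agent/add-protolist $list"
else
    echo "invalid protocol list: $list" >&2
fi
```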

ovs-appctl hwoff-flow-agent/add-rapid-protolist
Syntax
ovs-appctl hwoff-flow-agent/add-rapid-protolist [PROTOLIST]
Function
Adds a protocol whitelist. Packets transmitted using the listed protocols are offloaded.
Parameter Description
Parameter | Mandatory | Description
PROTOLIST | Yes | Protocol number list. The protocol number can be 0, 1, 2, or 3 (0 indicates UDP, 1 indicates TCP, 2 indicates ICMP, and 3 indicates ICMPv6).

Example
ovs-appctl hwoff-flow-agent/add-rapid-protolist 0,1,2,3
ovs-appctl hwoff-flow-agent/clear-error-stats
Syntax
ovs-appctl hwoff-flow-agent/clear-error-stats
Function
Clears the error statistics collected by hwoff-flow-agent.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/clear-error-stats
ovs-appctl hwoff-flow-agent/disable
Syntax
ovs-appctl hwoff-flow-agent/disable
Function
Disables the flow table offloading function (software offloading).
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/disable
ovs-appctl hwoff-flow-agent/dump-ct
Syntax
ovs-appctl hwoff-flow-agent/dump-ct [-m] [1-1000]
Function
Queries the connection tracking information recorded by hwoff-flow-agent.
Parameter Description
Parameter | Mandatory | Description
-m | No | Displays details of the connection tracking table.
[1-1000] | No | Specifies the number of connection tracking records to display.

Example
ovs-appctl hwoff-flow-agent/dump-ct -m 1

ovs-appctl hwoff-flow-agent/dump-policy
Syntax
ovs-appctl hwoff-flow-agent/dump-policy
Function
Queries parameters in the offloading policy of hwoff-flow-agent.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/dump-policy

ovs-appctl hwoff-flow-agent/dump-protolist
Syntax
ovs-appctl hwoff-flow-agent/dump-protolist
Function
Queries the protocol blacklist that has been added to hwoff-flow-agent.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/dump-protolist


ovs-appctl hwoff-flow-agent/dump-rapid-protolist
Syntax
ovs-appctl hwoff-flow-agent/dump-rapid-protolist
Function
Queries the protocol whitelist that has been added to hwoff-flow-agent.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/dump-rapid-protolist
ovs-appctl hwoff-flow-agent/enable
Syntax
ovs-appctl hwoff-flow-agent/enable
Function
Enables the flow table offloading function (software offloading). This function is enabled by default.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/enable
ovs-appctl hwoff-flow-agent/error-stats
Syntax
ovs-appctl hwoff-flow-agent/error-stats
Function
Queries the error statistics collected by hwoff-flow-agent.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/error-stats
ovs-appctl hwoff-flow-agent/flush-flow
Syntax
ovs-appctl hwoff-flow-agent/flush-flow
Function
Clears information about hardware flow tables.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/flush-flow
ovs-appctl hwoff-flow-agent/flush-protolist
Syntax
ovs-appctl hwoff-flow-agent/flush-protolist
Function
Clears the protocol blacklist.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/flush-protolist
ovs-appctl hwoff-flow-agent/flush-rapid-protolist
Syntax
ovs-appctl hwoff-flow-agent/flush-rapid-protolist
Function
Clears the protocol whitelist.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/flush-rapid-protolist
ovs-appctl hwoff-flow-agent/get-offloadrx-threshold
Syntax
ovs-appctl hwoff-flow-agent/get-offloadrx-threshold
Function
Obtains the threshold for automatically deleting cached flow tables.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/get-offloadrx-threshold


ovs-appctl hwoff-flow-agent/live-time
Syntax
ovs-appctl hwoff-flow-agent/live-time TIME
Function
Sets the aging time of flow tables offloaded to hwoff-flow-agent.
Parameter Description
Parameter | Mandatory | Description
TIME | Yes | Aging time, in ms. The value ranges from 1 to 86400000.

Example
ovs-appctl hwoff-flow-agent/live-time 200000
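Because TIME is given in milliseconds, it is easy to be off by a factor of 1000. The following sketch converts a value in seconds to milliseconds and rejects results outside the documented range. The helper name live_time_ms is an assumption made for illustration and is not part of OVS.

```shell
#!/bin/bash
# Convert an aging time in seconds to the millisecond value expected by
# live-time and reject values outside the documented 1-86400000 ms range.
# live_time_ms is a hypothetical helper, not part of OVS.
live_time_ms() {
    local seconds=$1
    local ms=$(( seconds * 1000 ))
    if [ "$ms" -lt 1 ] || [ "$ms" -gt 86400000 ]; then
        echo "aging time out of range: ${ms} ms" >&2
        return 1
    fi
    echo "$ms"
}

live_time_ms 200     # prints 200000, the value used in the example above
live_time_ms 86400   # prints 86400000, the documented maximum (24 hours)
```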

ovs-appctl hwoff-flow-agent/policy
Syntax
ovs-appctl hwoff-flow-agent/policy enable_permission <0|1> | user_default_permissions <0-10000000> | permission_update_interval <100-10000000 ms> | permission_credit <0-10000> | offload_packet_num <0-10000000> | offload_delay_mode <0|1> | duration_before_offload <0-10000000 ms> | offload_pps <0-10000000> | garbage_clean_interval <100-10000000 ms> | garbage_max_life <10-10000000 ms> | enable_clean_window <0|1> | clean_pps <0-10000000> | clean_window_update_interval <100-10000000 ms> | limit_flow_nums <2000000-8000000> | user_idle_time <1-10000000>
Function
Configures policy parameters for flow table offloading.
Parameter Description

Parameter | Mandatory | Description
enable_permission | No | Indicates whether to enable offloading. 0: disable; 1: enable.
user_default_permissions | No | Maximum number of offloaded flow tables. The value ranges from 0 to 10000000.
permission_update_interval | No | Interval (in ms) for updating offloaded flow tables. The value ranges from 100 to 10000000.
permission_credit | No | Maximum number of established connections. The value ranges from 0 to 10000. The value 0 indicates no limit.
offload_packet_num | No | Maximum number of packets before offloading. The value ranges from 0 to 10000000.
offload_delay_mode | No | Indicates whether to enable offloading delay. 0: disable; 1: enable.
duration_before_offload | No | Offloading delay time, in ms. The value ranges from 0 to 10000000.
offload_pps | No | Offloading delay PPS (packets per second). Offloading occurs when the traffic reaches this value. The value ranges from 0 to 10000000.
garbage_clean_interval | No | Interval (in ms) for deleting garbage flow tables. The value ranges from 100 to 10000000.
garbage_max_life | No | Maximum matching time (in ms) of flow tables. If matching fails after this time, the flow tables are deleted as garbage flow tables. The value ranges from 10 to 10000000.
enable_clean_window | No | Threshold for deleting connection tracking status from the cache. The value ranges from 0 to 10000000. The value 0 indicates that the function is disabled.
clean_pps | No | Flow table replacement PPS. The old flow tables are replaced by new flow tables when the traffic reaches this value. The value ranges from 0 to 10000000.
clean_window_update_interval | No | Interval (in ms) for deleting connection tracking status from the cache. The value ranges from 100 to 10000000.
limit_flow_nums | No | Maximum number of flow tables generated after offloading. The value ranges from 2000000 to 8000000.
user_idle_time | No | Connection tracking aging time (in seconds) for the hwoff-flow-agent offloading policy. The value ranges from 1 to 10000000.

Example
ovs-appctl hwoff-flow-agent/policy enable_permission 1
ovs-appctl hwoff-flow-agent/print-flow
Syntax
ovs-appctl hwoff-flow-agent/print-flow num
Function
Prints flow table details in OVS logs during flow table offloading.
Parameter Description
Parameter | Mandatory | Description
num | Yes | Prints the details of the Nth offloading. The value ranges from 0 to 15. The default value is 0, indicating that the details are not printed.

Example
ovs-appctl hwoff-flow-agent/print-flow 1

ovs-appctl hwoff-flow-agent/set-protolist-mode
Syntax
ovs-appctl hwoff-flow-agent/set-protolist-mode [black] | [white]
Function
Specifies the protocol list type (blacklist or whitelist) recorded by hwoff-flow-agent.
This command must be used together with ovs-appctl hwoff-flow-agent/add-protolist.
Parameter Description
Parameter | Mandatory | Description
black | No | Prohibits offloading of the packets transmitted using the listed protocols.
white | No | Allows offloading of the packets transmitted using the listed protocols.

Example
● Configuring the protocol list
ovs-appctl hwoff-flow-agent/add-protolist 23,25,1560
● Prohibiting offloading of the packets transmitted using the listed protocols
ovs-appctl hwoff-flow-agent/set-protolist-mode black
ovs-appctl hwoff-flow-agent/stats
Syntax
ovs-appctl hwoff-flow-agent/stats
Function
Lists non-error statistics collected by hwoff-flow-agent.
Parameter Description
N/A
Example
ovs-appctl hwoff-flow-agent/stats

ovs-appctl hwoff-flow-agent/use-offloadrx-threshold
Syntax
ovs-appctl hwoff-flow-agent/use-offloadrx-threshold NUM
Function
Sets the threshold for automatically deleting cached flow tables.
Parameter Description
Parameter | Mandatory | Description
NUM | Yes | Threshold for automatically deleting cached flow tables. The value ranges from 1 to 512.

Example
ovs-appctl hwoff-flow-agent/use-offloadrx-threshold 256

ovs-appctl hwoff/dump-hwoff-flows
Syntax
ovs-appctl hwoff/dump-hwoff-flows [-m]|[ufid hw_ufid]|[check]|[stop]|[-f file]|[-n]|[-h]
Function
Queries information about hardware flow tables.
Parameter Description
Parameter | Mandatory | Description
ufid hw_ufid | No | Queries information about hardware flow tables based on UFIDs. (The UFID is in hexadecimal format.)
-n | No | Displays the total number of hardware flow tables that have been offloaded.
-f file | No | (Recommended) Dumps flow tables to a specified file (the file path is an absolute path) when the number of flow tables exceeds 1000. Before using this parameter, use the -n parameter to determine the total number of current hardware flow tables.
check | No | Queries the progress of dumping hardware flow tables to a file.
stop | No | Forcibly stops the operation of dumping flow tables to a file.
-h | No | Displays help information.

Example
● Displaying the total number of hardware flow tables that have been offloaded
ovs-appctl hwoff/dump-hwoff-flows -n
● Dumping information about hardware flow tables in command line mode (It is recommended that the number of flow tables be less than 1000.)
ovs-appctl hwoff/dump-hwoff-flows
NOTE
● Before running this command, run the ovs-appctl hwoff/dump-hwoff-flows -n command to check the total number of flow tables.
● When the total number of flow tables exceeds 1000, run the ovs-appctl hwoff/dump-hwoff-flows -f Absolute path of a file command to dump the flow table information to a file. Otherwise, the command execution will be suspended for a long time. Each time the command is executed, flow table information is appended to the file, so the file size keeps increasing. In this case, you need to manually clear the file.
● The dumped hardware flow table information (the same as the information displayed by running the open-source dpctl/dump-flows command) contains the source IP address, destination IP address, source MAC address, destination MAC address, and quintuple information of the outbound interface.


● Saving information about hardware flow tables to a specified file
ovs-appctl hwoff/dump-hwoff-flows -f /root/test.log
● Querying information about hardware flow tables based on UFIDs
ovs-appctl hwoff/dump-hwoff-flows ufid 12345678-12345678
● Querying the progress of dumping hardware flow tables to a file
ovs-appctl hwoff/dump-hwoff-flows check
● Forcibly stopping the operation of dumping flow tables to a file
ovs-appctl hwoff/dump-hwoff-flows stop
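
The choice between the command line dump and the file dump can be sketched as a small helper. This is only an illustration: the function name dump_cmd is hypothetical, the flow table count is passed in by the caller (this guide does not show the output format of the -n parameter, so parsing it is left out), and the helper prints the command instead of running it.

```shell
# Sketch only: dump_cmd is a hypothetical helper, not part of OVS or XPF.
# dump_cmd COUNT FILE
#   COUNT: total number of hardware flow tables (as reported by the -n parameter)
#   FILE:  absolute path used when the dump goes to a file
# Prints the command to run; it does not execute anything.
dump_cmd() {
    local count=$1 file=$2
    if [ "$count" -gt 1000 ]; then
        # The -f parameter expects an absolute file path.
        case $file in
            /*) printf 'ovs-appctl hwoff/dump-hwoff-flows -f %s\n' "$file" ;;
            *)  echo "error: the file path must be absolute" >&2; return 1 ;;
        esac
    else
        printf 'ovs-appctl hwoff/dump-hwoff-flows\n'
    fi
}
```

For example, dump_cmd 5000 /root/test.log prints the -f form of the command. Because repeated dumps append to the file, clear the file manually before a new dump if only the latest data is needed.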

ovs-appctl hwoff/shmap-dump-hw
Syntax
ovs-appctl hwoff/shmap-dump-hw [all | hw_ufid]
Function
Queries the UFIDs of all hardware flow tables in the software-hardware flow table mapping maintained by the host; queries the UFID of the software flow table associated with the UFID of a specified hardware flow table.
Parameter Description

| Parameter | Mandatory | Description |
|---|---|---|
| all | No | Queries the UFIDs of all hardware flow tables in the software-hardware flow table mapping maintained by the host. |
| hw_ufid | No | Queries the UFID of the software flow table associated with the UFID of a specified hardware flow table. The UFID of a hardware flow table is in the hexadecimal format. |

Example
● Querying the UFIDs of all hardware flow tables in the software-hardware flow table mapping maintained by the host
ovs-appctl hwoff/shmap-dump-hw all
● Querying the UFID of the software flow table associated with the UFID of a specified hardware flow table
ovs-appctl hwoff/shmap-dump-hw 69e285a3-78f2-4029-9350-82d03e02564f

ovs-appctl hwoff/shmap-dump-sw
Syntax
ovs-appctl hwoff/shmap-dump-sw [all | sw_ufid type]


Function
Queries the UFIDs of all software flow tables associated with hardware flow tables in the software-hardware flow table mapping maintained by the host; queries the UFID of the hardware flow table associated with the UFID and type of a specified software flow table.
Parameter Description

| Parameter | Mandatory | Description |
|---|---|---|
| all | No | Queries the UFIDs of all software flow tables associated with hardware flow tables in the software-hardware flow table mapping maintained by the host. |
| sw_ufid, type | No | Queries the UFID of the hardware flow table associated with the UFID and type of a specified software flow table. The UFID of a software flow table is in the hexadecimal format. The options of type are as follows: 0: OpenFlow data plane; 1: source IP address-based transparent transmission |

Example
● Querying the UFIDs of all software flow tables associated with hardware flow tables in the software-hardware flow table mapping maintained by the host
ovs-appctl hwoff/shmap-dump-sw all
● Querying the UFID of the hardware flow table associated with the UFID and type of a specified software flow table
ovs-appctl hwoff/shmap-dump-sw e8288d3b-5daa-4815-8b23-2be16662977a 0
ovs-appctl hwoff/shmap-error-stats
Syntax
ovs-appctl hwoff/shmap-error-stats
Function
Queries the error statistics of the software-hardware flow table mapping.
Parameter Description

N/A
Example
ovs-appctl hwoff/shmap-error-stats
ovs-appctl hwoff/shmap-flush
Syntax
ovs-appctl hwoff/shmap-flush
Function
Clears the software-hardware flow table mapping.
Parameter Description
N/A
Example
ovs-appctl hwoff/shmap-flush

3.10 Change History

| Date | Description |
|---|---|
| 2021-01-20 | This issue is the first official release. |


4 SR-IOV User Guide

4.1 Introduction
4.2 Environment Requirements
4.3 Configuring the Environment
4.4 Configuring SR-IOV
4.5 Verifying SR-IOV

4.1 Introduction

Introduction to SR-IOV
Single-Root I/O Virtualization (SR-IOV) comprises physical functions (PFs) and virtual functions (VFs). It allows a physical device to provide multiple virtual functions, reducing the hardware cost of each additional function.
SR-IOV enables a single functional unit (for example, an Ethernet port) to serve as multiple independent physical devices. That is, a physical device that supports SR-IOV can be configured as multiple functional units. It eliminates the need for CPU and virtual machine (VM) management, improving system performance. However, its dependence on hardware limits its commonality, compatibility, and scalability.
SR-IOV virtualizes NICs and enables them to be used by VMs, aiming to improve network traffic processing capabilities and switching performance.

4.2 Environment Requirements

Networking
As shown in Figure 4-1, SR-IOV requires two servers (server 1 and server 2). The two servers communicate with each other through direct network port connections.

Figure 4-1 Networking diagram

| Server | IP Address | Mellanox CX5 Network Port |
|---|---|---|
| Server 1 | 90.90.48.60 | enp1s0f0 (PCI port: 0000:01:00.0); enp1s0f1 (PCI port: 0000:01:00.1) |
| Server 2 | 90.90.48.62 | enp1s0f0 (PCI port: 0000:01:00.0); enp1s0f1 (PCI port: 0000:01:00.1) |

Hardware

Table 4-1 describes the hardware configuration.

Table 4-1 Hardware configuration

| Item | Description |
|---|---|
| CPU | Kunpeng 920 processor |
| NIC | Mellanox CX5 public edition |
| Other | RAID controller cards must support pass-through. |

Operating System
Table 4-2 lists the operating system (OS) requirements.


Table 4-2 OS requirements

| Item | Version | Remarks |
|---|---|---|
| CentOS | CentOS Linux release 7.6.1810 (AltArch) | - |
| Kernel | 4.14.0-115.el7a.0.1.aarch64 | Contained in the OS image |
| NIC firmware | 16.28.2006 | Contained in the Mellanox driver package |
| NIC driver | OFED-5.1-2.3.7 | |
| QEMU | 2.12.0 | EV version, not contained in the OS image package |
| libvirt | 4.5.0 | Contained in the OS image |
| Python | 2.7.5 | Contained in the OS image |
| GCC | 4.8.5 | Contained in the OS image |
| VM OS | CentOS Linux release 7.6.1810 (AltArch) | - |

Software Packages

Table 4-3 Software packages

| Software Package | Description | How to Obtain |
|---|---|---|
| CentOS Linux release 7.6.1810 (AltArch) | CentOS 7.6 image file | https://mirrors.huaweicloud.com/centos-vault/altarch/7.6.1810/isos/aarch64/ |
| MLNX_OFED_LINUX-5.1-2.3.7.1-rhel7.6alternate-aarch64.tgz | Mellanox NIC driver and software packages | http://content.mellanox.com/ofed/MLNX_OFED-5.1-2.3.7.1/MLNX_OFED_LINUX-5.1-2.3.7.1-rhel7.6alternate-aarch64.tgz |
| qemu-kvm-tools-ev-2.12.0-33.1.el7.aarch64.rpm | qemu-2.12.0, centos-release-qemu-ev | http://mirrors.tools.huawei.com/centos-altarch/7/virt/aarch64/kvm-common/Packages/q/qemu-kvm-tools-ev-2.12.0-33.1.el7.aarch64.rpm |
| qemu-img-ev-2.12.0-33.1.el7.aarch64.rpm | qemu-2.12.0, centos-release-qemu-ev | http://mirrors.tools.huawei.com/centos-altarch/7/virt/aarch64/kvm-common/Packages/q/qemu-img-ev-2.12.0-33.1.el7.aarch64.rpm |
| qemu-kvm-common-ev-2.12.0-33.1.el7.aarch64.rpm | qemu-2.12.0, centos-release-qemu-ev | http://mirrors.tools.huawei.com/centos-altarch/7/virt/aarch64/kvm-common/Packages/q/qemu-kvm-common-ev-2.12.0-33.1.el7.aarch64.rpm |
| qemu-kvm-ev-2.12.0-33.1.el7.aarch64.rpm | qemu-2.12.0, centos-release-qemu-ev | http://mirrors.tools.huawei.com/centos-altarch/7/virt/aarch64/kvm-common/Packages/q/qemu-kvm-ev-2.12.0-33.1.el7.aarch64.rpm |
| libvirt | - | Install through Yum. |

4.3 Configuring the Environment
NOTE
● Unless otherwise specified, all operations described in this document must be performed on both server 1 and server 2.
● In the following examples, the Mellanox PF network ports are named enp1s0f0 and enp1s0f1, the VF ports are named enp1s0f0_$, and the PCI port numbers take the form 0000:01:00/1.$, where $ stands for a number. Replace them with the actual values in your configuration.

BIOS Settings
Step 1 Go to the BIOS and choose Advanced > MISC Config.


Step 2 Set Support Smmu to Enabled.

Step 3 Return to the upper-level directory and select PCIe Config.


Step 4 Set SRIOV to Enable.

----End


Configuring Memory Huge Pages
NOTE
You need to restart the server for this operation and subsequent operations including Enabling IOMMU and CPU Isolation, Disabling NetworkManager, and Disabling SELinux. However, you can restart the server once after all these operations are complete.
Step 1 Use an SSH remote login tool to log in to the server and switch to the root user account.
Step 2 Check whether memory huge pages have been configured.
cat /proc/meminfo | grep -i huge

● If HugePages_Total and Hugepagesize are both greater than 0 and the total huge page memory (HugePages_Total × Hugepagesize) is greater than 5 GB, no further action is required.
● If either HugePages_Total or Hugepagesize is 0, go to Step 3.
Step 3 Edit /boot/efi/EFI/centos/grub.cfg to modify the boot setting.
vim /boot/efi/EFI/centos/grub.cfg
1. Locate the boot item menuentry. Add the huge page option default_hugepagesz=512M hugepagesz=512M hugepages=128 at the end of line 100. For example:

2. Save the configuration and exit.
:wq
Step 4 Configure the huge pages to be mounted upon system boot.
1. Edit the /etc/fstab file.
vim /etc/fstab
Add the following text:
nodev /mnt/huge hugetlbfs defaults 0 0


2. Save the configuration and exit.
:wq
Step 5 Create a /mnt/huge directory.
mkdir -p /mnt/huge
Step 6 Restart the server for the configuration to take effect.
reboot
Step 7 Perform Step 2 again to check whether the configuration has taken effect.
----End
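
The check in Step 2 can also be scripted. The following is a minimal sketch that multiplies HugePages_Total by Hugepagesize as reported in /proc/meminfo. The function name hugepage_total_mb is hypothetical, and treating the 5 GB figure in Step 2 as a threshold on total huge page memory is an interpretation, not something this guide states explicitly.

```shell
# Sketch only: hugepage_total_mb is a hypothetical helper name.
# hugepage_total_mb [MEMINFO_FILE]
#   Prints HugePages_Total * Hugepagesize in MB.
#   MEMINFO_FILE defaults to /proc/meminfo; Hugepagesize is reported in kB.
hugepage_total_mb() {
    local meminfo=${1:-/proc/meminfo}
    awk '/^HugePages_Total:/ {n = $2}
         /^Hugepagesize:/    {kb = $2}
         END {printf "%d\n", n * kb / 1024}' "$meminfo"
}
```

With the boot options from Step 3 (hugepagesz=512M hugepages=128), the expected result after the restart is 65536 MB. A value of 0 means that huge pages are not configured.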
Enabling IOMMU and CPU Isolation
Step 1 Edit the /etc/grub2-efi.cfg file.
vim /etc/grub2-efi.cfg
Add the IOMMU setting next to the boot item (at the end of line 100).
isolcpus=0-5 iommu.passthrough=1

Step 2 Restart the server for the configuration to take effect.
reboot
----End
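
After the restart, you can confirm that the boot options were applied by inspecting /proc/cmdline. The sketch below is an optional check, not part of the original procedure; the function name has_kernel_arg is hypothetical.

```shell
# Sketch only: has_kernel_arg is a hypothetical helper name.
# has_kernel_arg ARG [CMDLINE_FILE]
#   Succeeds if ARG appears as a complete, space-separated option on the
#   kernel command line. CMDLINE_FILE defaults to /proc/cmdline.
has_kernel_arg() {
    local arg=$1 cmdline=${2:-/proc/cmdline}
    # Put each option on its own line, then match whole lines literally.
    tr ' ' '\n' < "$cmdline" | grep -Fxq -- "$arg"
}
```

For example, has_kernel_arg isolcpus=0-5 && has_kernel_arg iommu.passthrough=1 succeeds only when both options from Step 1 are present on the running kernel's command line.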

Disabling NetworkManager
Run the following commands to disable NetworkManager:
systemctl stop NetworkManager
systemctl disable NetworkManager

Disabling SELinux
Step 1 Disable the firewall.
systemctl stop firewalld.service
systemctl disable firewalld.service
Step 2 Edit the /etc/selinux/config file.
vim /etc/selinux/config
Set SELINUX to disabled.


NOTE
To disable SELinux temporarily, run the setenforce 0 command.
Step 3 Restart the server for the configuration to take effect.
reboot
----End

Configuring the Local Mirror Source
NOTE
Configure the local mirror source so that the kernel-devel package installed later matches the kernel version.
Step 1 Configure the local Yum source.
mkdir /mnt/repo
mount -o loop /home/iso/CentOS-7-aarch64-Everything-1810.iso /mnt/repo
cd /etc/yum.repos.d
mkdir backup
mv *.repo backup
Step 2 Edit the local.repo file.
vim local.repo
1. Add the following content to the file:
[local]
name=local
baseurl=file:///mnt/repo
enabled=1
gpgcheck=0
gpgkey=file:///mnt/repo/RPM-GPG-KEY-CentOS-7
2. Save the configuration and exit.
:wq
Step 3 Clear all the cached content.
yum clean all
Step 4 Build a cache.
yum makecache
----End
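
If the repository file needs to be created on several servers, Step 2 can be scripted instead of edited in vim. The following sketch writes the same [local] definition to a file; the function name write_local_repo is hypothetical. It uses enabled=1, which is the option name that Yum recognizes.

```shell
# Sketch only: write_local_repo is a hypothetical helper name.
# write_local_repo REPO_FILE MOUNT_POINT
#   Writes a [local] Yum repository definition that points to MOUNT_POINT.
write_local_repo() {
    local repo_file=$1 mount_point=$2
    cat > "$repo_file" <<EOF
[local]
name=local
baseurl=file://$mount_point
enabled=1
gpgcheck=0
gpgkey=file://$mount_point/RPM-GPG-KEY-CentOS-7
EOF
}
```

For example, write_local_repo /etc/yum.repos.d/local.repo /mnt/repo produces the file described in Step 2. Run yum clean all and yum makecache afterwards, as in Step 3 and Step 4.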

Installing the Mellanox NIC Driver
Step 1 Upload the Mellanox NIC driver package from the local PC to the server.


Step 2 Decompress the driver package and go to the decompressed folder.
tar -zxvf MLNX_OFED_LINUX-5.1-2.3.7.1-rhel7.6alternate-aarch64.tgz
cd MLNX_OFED_LINUX-5.1-2.3.7.1-rhel7.6alternate-aarch64
Step 3 Install the dependencies.
yum install unbound tcl gcc-gfortran fuse-libs tk createrepo kernel-devel python-devel redhat-rpm-config rpm-build gcc gcc-c++
Step 4 Install the driver.
./mlnxofedinstall --ovs-dpdk --upstream-libs --add-kernel-support
Step 5 Update initramfs.
dracut -f
Step 6 Load the driver.
/etc/init.d/openibd restart

NOTICE
If a FAILED error message is displayed, run the rmmod hns_roce_hw_v2 command and then try again.

NOTE
If a 1822 NIC exists in the environment, server performance deteriorates because the 1822 NIC generates a large number of software interrupts. For higher server performance, disable all existing 1822 NICs by running the following command:
rmmod hinic
Perform this operation each time the server is restarted.
----End
4.4 Configuring SR-IOV
4.4.1 Checking Mellanox NIC Information
Step 1 Check for Mellanox NICs.
lspci -nn | grep Mellanox

Step 2 Query the port information about the Mellanox NIC.
ls -l /sys/class/net/ | grep 01:00


Step 3 Check the driver and firmware versions. The installed firmware must match the version requirements specified in the Release Notes.
ethtool -i enp1s0f1 | head -5

Step 4 Check the maximum number of VFs supported by the Mellanox network ports.
cat /sys/class/net/enp1s0f1/device/sriov_totalvfs
----End
4.4.2 Configuring Kernel-Mode SR-IOV
Step 1 Add VFs to the PF network port.
1. Run the following command:
echo 8 > /sys/class/net/enp1s0f1/device/sriov_numvfs
2. Check whether the adding operation is successful.
cat /sys/class/net/enp1s0f1/device/sriov_numvfs
Step 2 Configure the MAC address of the VF port.
1. Run the following commands:
ip link set enp1s0f1 vf 0 mac e4:11:22:33:44:50
ip link set enp1s0f1 vf 1 mac e4:11:22:33:44:51
ip link set enp1s0f1 vf 2 mac e4:11:22:33:44:52
ip link set enp1s0f1 vf 3 mac e4:11:22:33:44:53
ip link set enp1s0f1 vf 4 mac e4:11:22:33:44:54
ip link set enp1s0f1 vf 5 mac e4:11:22:33:44:55
ip link set enp1s0f1 vf 6 mac e4:11:22:33:44:56
ip link set enp1s0f1 vf 7 mac e4:11:22:33:44:57
2. Verify the configuration.
ip link show dev enp1s0f1


NOTICE
Each MAC address must be unique on the local server, the peer server, and the switch.
3. Check the PCI port numbers of the eight virtual ports.
ls -l /sys/class/net/

Step 3 Change the network port mode.
1. Unbind the VFs.
echo 0000:01:01.2 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:01.3 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:01.4 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:01.5 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:01.6 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:01.7 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:02.0 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:02.1 > /sys/bus/pci/drivers/mlx5_core/unbind
2. On the PF, change the eSwitch mode from Legacy to SwitchDev.
devlink dev eswitch set pci/0000:01:00.1 mode switchdev
echo switchdev > /sys/class/net/enp1s0f1/compat/devlink/mode
cat /sys/class/net/enp1s0f1/compat/devlink/mode
3. Check whether the device name of the Representor has been changed.
ls -l /sys/class/net/


The device name of the VFs has been changed from enp1s0f$ to enp1s0f1_$.
Step 4 Bind the VFs.
echo 0000:01:01.2 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:01.3 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:01.4 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:01.5 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:01.6 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:01.7 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:02.0 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:02.1 > /sys/bus/pci/drivers/mlx5_core/bind
----End
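
The repeated per-VF commands in this procedure can be generated with loops instead of being typed one by one. The following is a sketch under stated assumptions: the function names vf_mac_cmds and vf_pci_addrs are hypothetical, the MAC addresses follow the sequential pattern used in Step 2, and the VF PCI addresses are read from the virtfnN symbolic links that the Linux kernel creates under the PF device in sysfs.

```shell
# Sketch only: both function names are hypothetical helpers.

# vf_mac_cmds PF COUNT MAC_PREFIX START
#   Prints one "ip link set" command per VF. MAC_PREFIX holds the first five
#   octets; the last octet is START plus the VF index, printed in hexadecimal.
vf_mac_cmds() {
    local pf=$1 count=$2 prefix=$3 start=$4 i
    for ((i = 0; i < count; i++)); do
        printf 'ip link set %s vf %d mac %s:%02x\n' "$pf" "$i" "$prefix" $((start + i))
    done
}

# vf_pci_addrs PF [SYSFS_NET_DIR]
#   Prints the PCI address of each VF of PF, taken from the virtfnN links.
#   SYSFS_NET_DIR defaults to /sys/class/net.
vf_pci_addrs() {
    local pf=$1 net_dir=${2:-/sys/class/net} link
    for link in "$net_dir/$pf"/device/virtfn*; do
        [ -L "$link" ] || continue
        basename "$(readlink "$link")"
    done
}
```

For example, vf_mac_cmds enp1s0f1 8 e4:11:22:33:44 0x50 prints the eight commands shown in Step 2, which can be reviewed and then piped to sh. The addresses printed by vf_pci_addrs enp1s0f1 can be written one at a time to /sys/bus/pci/drivers/mlx5_core/unbind (Step 3) or /sys/bus/pci/drivers/mlx5_core/bind (Step 4).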
4.4.3 Configuring OVS Boot Parameters
Step 1 Start OVS.
systemctl start openvswitch
Step 2 Enable offloading.
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
ovs-vsctl set Open_vSwitch . other_config:tc-policy=verbose
Step 3 Restart the OVS.
systemctl restart openvswitch
Step 4 View the OVS information.
ovs-vsctl list open_vswitch

----End

4.4.4 Configuring Network Data
Step 1 Set up the normal network.
ovs-vsctl add-br ovs-sriov
ovs-vsctl add-port ovs-sriov enp1s0f1
ovs-vsctl add-port ovs-sriov enp1s0f1_0
ovs-vsctl add-port ovs-sriov enp1s0f1_1
ovs-vsctl add-port ovs-sriov enp1s0f1_2
ovs-vsctl add-port ovs-sriov enp1s0f1_3
ovs-vsctl add-port ovs-sriov enp1s0f1_4
ovs-vsctl add-port ovs-sriov enp1s0f1_5
ovs-vsctl add-port ovs-sriov enp1s0f1_6
ovs-vsctl add-port ovs-sriov enp1s0f1_7
ip link set dev enp1s0f1 up
ip link set dev enp1s0f1_0 up
ip link set dev enp1s0f1_1 up
ip link set dev enp1s0f1_2 up
ip link set dev enp1s0f1_3 up
ip link set dev enp1s0f1_4 up
ip link set dev enp1s0f1_5 up
ip link set dev enp1s0f1_6 up
ip link set dev enp1s0f1_7 up

Step 2 View the OVS bridge information.
ovs-vsctl show


----End
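
The command list in Step 1 follows a fixed pattern: create the bridge, add the PF and each VF representor as ports, and bring every port up. A sketch that generates the list for any number of VFs is shown below. The function name ovs_port_cmds is hypothetical, and the representor names are assumed to follow the PF_index pattern used in this guide (for example, enp1s0f1_0).

```shell
# Sketch only: ovs_port_cmds is a hypothetical helper name.
# ovs_port_cmds BRIDGE PF VF_COUNT
#   Prints the bridge setup commands; it does not execute them.
ovs_port_cmds() {
    local bridge=$1 pf=$2 count=$3 i p
    local ports=("$pf")
    # Representor names are assumed to be PF_0, PF_1, ...
    for ((i = 0; i < count; i++)); do
        ports+=("${pf}_${i}")
    done
    printf 'ovs-vsctl add-br %s\n' "$bridge"
    for p in "${ports[@]}"; do
        printf 'ovs-vsctl add-port %s %s\n' "$bridge" "$p"
    done
    for p in "${ports[@]}"; do
        printf 'ip link set dev %s up\n' "$p"
    done
}
```

For example, ovs_port_cmds ovs-sriov enp1s0f1 8 prints the 19 commands of Step 1. Review the output before piping it to sh.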
4.4.5 Creating a VM
Installing the VM Software Packages
Step 1 Install VM dependencies.
yum install centos-release-qemu-ev
yum install -y libvirt AAVMF virt-install qemu-guest-agent
Step 2 Upload the qemu-2.12.0 software package obtained in Software Packages to the server and install it.
yum localinstall -y *.rpm
Software packages:

Step 3 Modify the QEMU configuration file.
1. Open the qemu.conf file.
vim /etc/libvirt/qemu.conf
2. Set both user and group to root.
user = "root"
# The group for QEMU processes run by the system instance. It can be
# specified in a similar way to user.
group = "root"
# Whether libvirt should dynamically change file ownership
----End

Creating a VM
Step 1 Start the libvirtd service and set it to automatically start upon system boot.
systemctl start libvirtd
systemctl enable libvirtd
Step 2 Create a storage pool.
1. Create a storage pool directory and configure the directory permissions.
mkdir -p /home/kvm/images
chown root:root /home/kvm/images
chmod 755 /home/kvm/images
2. Define a storage pool and bind it to the storage pool directory. Create a folder-based storage pool, activate it, and set it to start upon system boot.
virsh pool-define-as StoragePool --type dir --target /home/kvm/images
virsh pool-build StoragePool
virsh pool-start StoragePool
virsh pool-autostart StoragePool
3. View the storage pool information.
virsh pool-info StoragePool
virsh pool-list
Step 3 Create a drive space for the VM.
1. Create a volume.
For example, the volume is named 1.img, the storage pool is StoragePool, the volume capacity is 50 GB, the initially allocated capacity is 1 GB, and the file format is qcow2.
virsh vol-create-as --pool StoragePool --name 1.img --capacity 50G --allocation 1G --format qcow2
2. View the volume information.
virsh vol-info /home/kvm/images/1.img

Step 4 Create a VM.
1. Create a VM vm1. Allocate four CPUs and 8 GB memory to it, and use 1.img as the drive space. Copy the .iso file to /home/iso/ and install CentOS 7.6.
virt-install --name=vm1 --vcpus=4 --ram=8192 \
--disk path=/home/kvm/images/1.img,format=qcow2,size=50,bus=virtio \
--cdrom /home/iso/CentOS-7-aarch64-Everything-1810.iso
2. Install the VM OS.


3. Configure all items that contain an exclamation mark (!). Enter the serial number corresponding to the option, configure the parameter as prompted, and press b to start the installation.

4. After the installation is complete, the login prompt is displayed.

----End
Configuring the VM
Step 1 Modify the configuration of vm1.
virsh edit vm1
Delete <interface type='network'>xxx</interface>. Add the following content before </devices>:
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x01' slot='0x01' function='0x2'/>
  </source>
</hostdev>


NOTE
The values of domain, bus, slot, and function in <address domain='0x0000' bus='0x01' slot='0x01' function='0x2'/> correspond to the PCI port number 0000:01:01.2 of the VF.
The following figure shows the VM settings after modification:

Step 2 Modify the same configuration for other VMs.
NOTE
When configuring the PCI port numbers for the other VMs, bind the PCI port numbers to VF ports that are not in use.
----End
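
Because each VM needs a hostdev element that matches the PCI port number of its own VF, the mapping from a PCI port number such as 0000:01:01.2 to the domain, bus, slot, and function attributes can be sketched as a helper. The function name hostdev_xml is hypothetical; the element layout is the one shown in Step 1.

```shell
# Sketch only: hostdev_xml is a hypothetical helper name.
# hostdev_xml PCI_ADDR
#   PCI_ADDR has the form DDDD:BB:SS.F (domain:bus:slot.function).
#   Prints the matching <hostdev> element for "virsh edit".
hostdev_xml() {
    local addr=$1
    local re='^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$'
    if [[ ! $addr =~ $re ]]; then
        echo "error: expected DDDD:BB:SS.F, got '$addr'" >&2
        return 1
    fi
    local domain=${BASH_REMATCH[1]} bus=${BASH_REMATCH[2]}
    local slot=${BASH_REMATCH[3]} func=${BASH_REMATCH[4]}
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$func'/>
  </source>
</hostdev>
EOF
}
```

For example, hostdev_xml 0000:01:01.3 prints the element for the next VF, which can be pasted before </devices> when editing another VM.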
Cloning the VM
NOTE
Subsequent verification operations require up to eight VMs for a single server. Clone a sufficient number of VMs.
Step 1 Stop vm1, and then run the virt-clone command (provided by the virt-install software package) to clone it as vm10.
virt-clone -o vm1 -n vm10 -f /home/kvm/images/10.img
In the command, -o indicates the source VM, -n indicates the new VM, and -f indicates that the newly created VM uses the file on the host machine as the image file.
NOTE
After the command is executed, vm10 is created with the same CPU, memory, drive, and network resources as vm1. Configure the CPU and network resources of vm10 separately.
Step 2 Check the status of the created VM.
virsh list --all


Step 3 Change the host name of the VM.
The cloned VM has the same host name and IP address as vm1. Log in to vm10 and run the following command to change the host name:
hostnamectl --static set-hostname vm10
Step 4 Modify the VM IP address.
vim /etc/sysconfig/network-scripts/ifcfg-eth0
mv /etc/sysconfig/network-scripts/ifcfg-eth0 /etc/sysconfig/network-scripts/ifcfg-enp1s0
NOTE
When pass-through is enabled, the default VF driver is installed in the OS. By default the configuration defines the virtual NIC name. The NIC name varies according to the OS. This document uses enp1s0 as an example. Replace it with the actual name in your OS.
----End
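
When several clones are needed, the virt-clone command from Step 1 can be generated in a loop. The sketch below assumes the naming used in this guide (VM vmN with image file N.img); the function name clone_cmds is hypothetical, and it only prints the commands.

```shell
# Sketch only: clone_cmds is a hypothetical helper name.
# clone_cmds SOURCE_VM IMAGE_DIR N [N ...]
#   Prints one virt-clone command per number N, creating vmN backed by
#   IMAGE_DIR/N.img. The source VM must be stopped before cloning.
clone_cmds() {
    local src=$1 dir=$2 n
    shift 2
    for n in "$@"; do
        printf 'virt-clone -o %s -n vm%s -f %s/%s.img\n' "$src" "$n" "$dir" "$n"
    done
}
```

For example, clone_cmds vm1 /home/kvm/images 10 prints the command shown in Step 1. The host name, IP address, and hostdev settings of each clone still need to be changed as described in this section.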
4.4.6 Verifying Communication Between VMs
Step 1 Start the VMs.
virsh start vm1
virsh start vm2
Step 2 Log in to the VM.
Run the following command on server 1:
virsh console vm1
Step 3 Verify the pass-through.
Run the following command on VM 1 of server 1:
ping <Host1vm2_ip>
ping <Host2vm1_ip>


VMs on the same physical machine and VMs on different physical machines can ping each other.
----End

4.5 Verifying SR-IOV

4.5.1 Restoring the Environment
Run the following commands to restore the environment:
virsh shutdown vm$
ovs-vsctl del-br br-ovs
echo 0 > /sys/class/net/enp1s0f0/device/sriov_numvfs
echo 0 > /sys/class/net/enp1s0f1/device/sriov_numvfs
NOTE
The verification operations that follow are performed for each different function of SR-IOV. Before verifying each function, restore the environment.
4.5.2 Bonding
Step 1 Create VFs.
echo 4 > /sys/class/net/enp1s0f0/device/sriov_numvfs
ip link set enp1s0f0 vf 0 mac e4:11:22:33:61:11
ip link set enp1s0f0 vf 1 mac e4:11:22:33:61:22
ip link set enp1s0f0 vf 2 mac e4:11:22:33:61:33
ip link set enp1s0f0 vf 3 mac e4:11:22:33:61:44
echo 0000:01:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:00.4 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:00.5 > /sys/bus/pci/drivers/mlx5_core/unbind
devlink dev eswitch set pci/0000:01:00.0 mode switchdev
echo 4 > /sys/class/net/enp1s0f1/device/sriov_numvfs
ip link set enp1s0f1 vf 0 mac e4:11:22:33:62:11
ip link set enp1s0f1 vf 1 mac e4:11:22:33:62:22
ip link set enp1s0f1 vf 2 mac e4:11:22:33:62:33
ip link set enp1s0f1 vf 3 mac e4:11:22:33:62:44
echo 0000:01:01.2 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:01.3 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:01.4 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:01:01.5 > /sys/bus/pci/drivers/mlx5_core/unbind
devlink dev eswitch set pci/0000:01:00.1 mode switchdev

NOTICE
● For a Linux bond, after setting the PCI mode to switchdev, strictly follow the operation procedure. Do not rebind the VFs immediately. Otherwise, an error is reported.
● Although a bond can be created immediately after the system is started, this is not advised. You can set ONBOOT to no in the ifcfg-xxxxx file.
● Each MAC address must be unique on the local server, the peer server, and the switch.

Step 2 Create a Linux bond.


1. Modify the configuration files of the physical NICs enp1s0f0 and enp1s0f1.
vim /etc/sysconfig/network-scripts/ifcfg-enp1s0f0
vim /etc/sysconfig/network-scripts/ifcfg-enp1s0f1
Add the following two lines to each file:
MASTER=bond0
SLAVE=yes

2. Add a configuration file for the port bond.
vim /etc/sysconfig/network-scripts/ifcfg-bond0
Add the following content:
DEVICE=bond0
NAME='bond0'
TYPE=Ethernet
NM_CONTROLLED=no
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS='mode=4 miimon=100 xmit_hash_policy=layer3+4'
IPV6INIT=no
NOTE
● In the BONDING_OPTS configuration:
  – mode indicates the bond mode. The value 4 indicates LACP.
  – miimon=100 indicates that link monitoring is performed every 100 ms.
  – xmit_hash_policy=layer3+4 indicates the LACP balancing policy, which balances traffic based on Layer 3 and Layer 4 information (IP addresses and ports).
● The active/standby mode parameter is mode=1 miimon=100.
3. Load the bond kernel module.
modprobe bonding mode=4 miimon=100
NOTE
If needed, set mode to 1 for the active/standby mode, and to 4 for the LACP mode.
4. Start the bond.
ifup bond0

NOTICE
Do not restart the network.

5. Check the bond configuration.
cat /proc/net/bonding/bond0


Step 3 Bind the VFs.
echo 0000:01:00.2 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:00.3 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:00.4 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:00.5 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:01.2 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:01.3 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:01.4 > /sys/bus/pci/drivers/mlx5_core/bind
echo 0000:01:01.5 > /sys/bus/pci/drivers/mlx5_core/bind
Step 4 Start OVS and configure the network.
systemctl start openvswitch
ovs-vsctl add-br br-ovs
ovs-vsctl add-port br-ovs bond0
ovs-vsctl add-port br-ovs enp1s0f0_0
ovs-vsctl add-port br-ovs enp1s0f0_1
ovs-vsctl add-port br-ovs enp1s0f0_2
ovs-vsctl add-port br-ovs enp1s0f0_3
ovs-vsctl add-port br-ovs enp1s0f1_3
ovs-vsctl add-port br-ovs enp1s0f1_2
ovs-vsctl add-port br-ovs enp1s0f1_1
ovs-vsctl add-port br-ovs enp1s0f1_0
ifconfig bond0 up
ip link set dev enp1s0f0_0 up
ip link set dev enp1s0f0_1 up
ip link set dev enp1s0f0_2 up
ip link set dev enp1s0f0_3 up
ip link set dev enp1s0f1_0 up
ip link set dev enp1s0f1_1 up
ip link set dev enp1s0f1_2 up
ip link set dev enp1s0f1_3 up
Step 5 Start the VM and log in to it.
virsh start vm1
virsh console vm1
Step 6 Use the two VMs to send traffic.
● Run the following command on VM 1 of server 2:
iperf3 -s
● Run the following command on VM 1 of server 1:
iperf3 -c <Host2vm1_ip> -t 0


Step 7 Press Ctrl+] to exit from the VMs. View the flow table offloading on a physical machine.
watch -n 1 -d ovs-appctl dpctl/dump-flows type=offloaded

Verify the bidirectional flow table offloading. In LACP mode, verify that the total bandwidth of the eight VMs is close to the total bandwidth of the two network ports. In active/standby mode, verify the network connectivity when one network port is disconnected.
Step 8 Delete the bond.
1. Stop the VM.
virsh shutdown vm1
2. Delete the OVS bridge.
ovs-vsctl del-br br-ovs
3. Unbind all related VFs (for details, see Step 1) or clear the VFs.
echo 0 > /sys/class/net/enp1s0f0/device/sriov_numvfs
echo 0 > /sys/class/net/enp1s0f1/device/sriov_numvfs
4. Delete the bond.
ip link delete bond0
rmmod bonding
----End
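
The bond check in Step 2 (cat /proc/net/bonding/bond0) can be reduced to a per-port summary, which is convenient when verifying the active/standby behavior. The sketch below relies on the "Slave Interface:" and "MII Status:" lines that the Linux bonding driver writes to that file; the function name bond_slaves is hypothetical.

```shell
# Sketch only: bond_slaves is a hypothetical helper name.
# bond_slaves [STATUS_FILE]
#   Prints "<member interface> <MII status>" for each member port of the bond.
#   STATUS_FILE defaults to /proc/net/bonding/bond0.
bond_slaves() {
    local status_file=${1:-/proc/net/bonding/bond0}
    # The first "MII Status" line belongs to the bond itself and is skipped,
    # because no "Slave Interface" line has been seen at that point.
    awk -F': ' '/^Slave Interface:/ {slave = $2}
                /^MII Status:/ && slave != "" {print slave, $2; slave = ""}' "$status_file"
}
```

Each output line names one member port followed by up or down. In active/standby mode, disconnecting one network port should change only that port's status while the VMs remain reachable.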
4.5.3 QoS
Step 1 Configure VFs.
For details, see 4.4.2 Configuring Kernel-Mode SR-IOV.
Step 2 Start Open vSwitch (OVS) and configure the network.
systemctl start openvswitch
ovs-vsctl add-br br-ovs
ovs-vsctl add-port br-ovs enp1s0f0_0
ovs-vsctl add-port br-ovs enp1s0f0
ip link set dev enp1s0f0 up
ip link set dev enp1s0f0_0 up
Step 3 Start the VM.
virsh start vm1
Step 4 Configure QoS rate limiting in the ingress direction.
ovs-vsctl set Interface enp1s0f0_0 ingress_policing_rate=100000
Step 5 Log in to the VM.
virsh console vm1
Step 6 Generate traffic to verify the rate limiting.
● Run the following command on VM 1 of server 2:
iperf3 -s
● Run the following command on VM 1 of server 1:
iperf3 -c <Host2vm1_ip> -t 0
When the two VMs send traffic to each other, rate limiting does not work.


NOTE
The kernel version must be 5.7 or later to support inbound port rate limiting.
Step 7 Press Ctrl+] to exit from the VM. Configure QoS rate limiting in the outbound direction on the physical machine.
ovs-vsctl set port enp1s0f0_0 qos=@newqos -- --id=@newqos create qos type=linux-htb other-config:max-rate=200000000 \
queues=123=@q1 -- --id=@q1 create queue other-config:max-rate=200000000
ovs-ofctl add-flow br-ovs "in_port=2,actions=set_queue:123,normal"
Step 8 Log in to the VM again and generate traffic to verify outbound rate limiting.


Step 9 Press Ctrl+] to exit from the VM. View the flow table offloading on a physical machine.
watch -n 1 -d ovs-appctl dpctl/dump-flows
Flow table offloading:

----End

Verification Result
Rate limiting works in the inbound direction but not in the outbound direction. The rate limiting flow table is not offloaded.
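
When comparing the iperf3 results with the configured limits, note that the two settings use different units in the Open vSwitch database: ingress_policing_rate is expressed in kbit/s, whereas the max-rate of a QoS or queue record is expressed in bit/s. The following sketch converts both to Mbit/s; the function names are hypothetical.

```shell
# Sketch only: both function names are hypothetical helpers.
# kbps_to_mbps KBPS  -> converts an ingress_policing_rate value (kbit/s) to Mbit/s
# bps_to_mbps  BPS   -> converts a max-rate value (bit/s) to Mbit/s
kbps_to_mbps() { echo $(( $1 / 1000 )); }
bps_to_mbps()  { echo $(( $1 / 1000000 )); }
```

With the values used in this section, kbps_to_mbps 100000 prints 100 and bps_to_mbps 200000000 prints 200, so the ingress setting corresponds to 100 Mbit/s and the linux-htb setting to 200 Mbit/s.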
4.5.4 Port Mirroring
Step 1 Configure VFs.
For details, see 4.4.2 Configuring Kernel-Mode SR-IOV.
Step 2 Start OVS and configure the network.
systemctl start openvswitch
ovs-vsctl add-br br-ovs
ovs-vsctl add-port br-ovs enp1s0f0_0
ovs-vsctl add-port br-ovs enp1s0f0_1
ovs-vsctl add-port br-ovs enp1s0f0_2
ovs-vsctl add-port br-ovs enp1s0f0_3
ovs-vsctl add-port br-ovs enp1s0f0_4
ovs-vsctl add-port br-ovs enp1s0f0_5
ovs-vsctl add-port br-ovs enp1s0f0_6
ovs-vsctl add-port br-ovs enp1s0f0_7
ovs-vsctl add-port br-ovs enp1s0f0
ip link set dev enp1s0f0 up
ip link set dev enp1s0f0_0 up
ip link set dev enp1s0f0_1 up
ip link set dev enp1s0f0_2 up
ip link set dev enp1s0f0_3 up
ip link set dev enp1s0f0_4 up
ip link set dev enp1s0f0_5 up
ip link set dev enp1s0f0_6 up
ip link set dev enp1s0f0_7 up
Step 3 Start the VMs.
● Run the following command on server 1:
virsh start vm1
● Run the following commands on server 2:
virsh start vm1
virsh start vm4
Step 4 Configure the OVS SPAN port mirroring.
ovs-vsctl -- --id=@p get port enp1s0f0_3 -- --id=@q get port enp1s0f0_0 -- --id=@m create mirror name=m0 select_src_port=@q select_dst_port=@q output-port=@p -- set bridge br-ovs mirrors=@m

NOTE
This command configures port mirroring on br-ovs so that the incoming and outgoing traffic of port enp1s0f0_0 is mirrored to port enp1s0f0_3.
● -- --id=@p get port enp1s0f0_3: Create an alias for port enp1s0f0_3.
● -- --id=@q get port enp1s0f0_0: Create an alias for port enp1s0f0_0.
● -- --id=@m create mirror name=m0: Create a port mirror.
● select_src_port=@q select_dst_port=@q output-port=@p: Set the port mirroring rule. select_src_port mirrors the traffic entering the specified port, select_dst_port mirrors the traffic leaving the specified port, and output-port specifies the port to which the mirrored traffic is sent.
● -- set bridge br-ovs mirrors=@m: Apply the port mirroring rule to the bridge.
Step 5 Log in to the VM.
Run the following command on server 2:
virsh console vm4
Step 6 Capture packets on the mirrored port. Send packets from server 1 to VM 1 (enp1s0f0_0) of server 2, and capture packets on VM 4 (enp1s0f0_3) of server 2.
Run the following command on VM 4 of server 2:
tcpdump -i enp1s0
Step 7 Generate traffic on the VMs and check the captured packets.
● Run the following command on VM 1 of server 2:
iperf3 -s
● Run the following command on VM 1 of server 1:
iperf3 -c <Host2vm1_ip> -t 0
Captured packets:

Step 8 Press Ctrl+] to exit from the VMs and view the flow table offloading.
watch -n 1 -d ovs-appctl dpctl/dump-flows type=offloaded
Offloading status:

Step 9 Clear port mirrors.

ovs-vsctl clear bridge br-ovs mirrors
Step 10 Configure the OVS RSPAN port mirroring.
ovs-vsctl set bridge br-ovs flood_vlans=111
ovs-vsctl -- --id=@q get port enp1s0f0_0 -- --id=@m create mirror name=m0 select_src_port=@q select_dst_port=@q output_vlan=111 -- set bridge br-ovs mirrors=@m
Step 11 Repeat Step 5 to Step 8 to check the packet capturing and offloading.
NOTE
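When capturing packets on VM 4 of server 2, add the -ne options so that tcpdump prints the link-layer header and does not resolve names: tcpdump -i enp1s0 -ne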
Captured packets:

Offloading status:

----End

Verification Result
Port mirroring in SPAN mode supports packet capturing and flow table offloading. Port mirroring in other modes supports only packet capturing.
4.5.5 GRO
Step 1 Check the generic receive offload (GRO) and large receive offload (LRO) status on the VM.
ethtool -k <dev> | grep generic-receive-offload
ethtool -k <dev> | grep large-receive-offload
GRO and LRO are enabled by default.
Step 2 Disable GRO and LRO on the receiver.
Run the following command on VM 1 of server 2:
ethtool -K <dev> lro off
ethtool -K <dev> gro off

Step 3 Receive packets on the receiver.
Run the following command on VM 1 of server 2:
iperf3 -s &
tcpdump tcp -i <dev>
Step 4 Send packets from the sender.
Run the following command on VM 1 of server 1:
iperf3 -c <dst_ip>

With GRO and LRO disabled, each packet captured on the receiver has a length of 1448 bytes.

Step 5 Enable LRO at the receiver.
ethtool -K <dev> lro on
11:52:01.964338 IP 192.168.1.21.40472 > VM1.targus-getdata1: Flags [.], seq 4106154718:4106156166, ack 1, win 229, options [nop,nop,TS val 3504848929 ecr 2864936518], length 1448
11:52:01.966923 IP 192.168.1.21.40472 > VM1.targus-getdata1: Flags [.], seq 4112365190:4112430350, ack 1, win 229, options [nop,nop,TS val 3504848931 ecr 2864936521], length 65160
11:52:01.969460 IP 192.168.1.21.40472 > VM1.targus-getdata1: Flags [.], seq 4118834854:4118836302, ack 1, win 229, options [nop,nop,TS val 3504848934 ecr 2864936523], length 1448
11:52:01.972144 IP 192.168.1.21.40472 > VM1.targus-getdata1: Flags [.], seq 4125595566:4125660726, ack 1, win 229, options [nop,nop,TS val 3504848937 ecr 2864936526], length 65160
11:52:01.974685 IP 192.168.1.21.40472 > VM1.targus-getdata1: Flags [.], seq 4131612006:4131677166, ack 1, win 229, options [nop,nop,TS val 3504848939 ecr 2864936528], length 65160
11:52:01.977213 IP 192.168.1.21.40472 > VM1.targus-getdata1: Flags [.], seq 4137750078:4137751526, ack 1, win 229, options [nop,nop,TS val 3504848942 ecr 2864936531], length 1448
11:52:01.979802 IP 192.168.1.21.40472 > VM1.targus-getdata1: Flags [.], seq 4144198022:4144199470, ack 1, win 229, options [nop,nop,TS val 3504848944 ecr 2864936533], length 1448
Step 6 Enable GRO.
ethtool -K <dev> gro on
11:49:03.845248 IP 192.168.1.21.40468 > VM1.targus-getdata1: Flags [.], seq 1786770278:1786835438, ack 1, win 229, options [nop,nop,TS val 3504670810 ecr 2864758399], length 65160
11:49:03.849930 IP 192.168.1.21.40468 > VM1.targus-getdata1: Flags [.], seq 1800345278:1800410438, ack 1, win 229, options [nop,nop,TS val 3504670815 ecr 2864758404], length 65160
11:49:03.852684 IP 192.168.1.21.40468 > VM1.targus-getdata1: Flags [.], seq 1808393262:1808458422, ack 1, win 229, options [nop,nop,TS val 3504670817 ecr 2864758406], length 65160
11:49:03.855304 IP 192.168.1.21.40468 > VM1.targus-getdata1: Flags [.], seq 1815960510:1816025670, ack 1, win 229, options [nop,nop,TS val 3504670820 ecr 2864758409], length 65160
11:49:03.857815 IP 192.168.1.21.40468 > VM1.targus-getdata1: Flags [.], seq 1823200510:1823265670, ack 1, win 229, options [nop,nop,TS val 3504670823 ecr 2864758411], length 65160
11:49:03.860765 IP 192.168.1.21.40468 > VM1.targus-getdata1: Flags [.], seq 1830801062:1830866222, ack 1, win 229, options [nop,nop,TS val 3504670826 ecr 2864758414], length 65160
11:49:03.863560 IP 192.168.1.21.40468 > VM1.targus-getdata1: Flags [.], seq 1836160110:1836225270, ack 1, win 229, options [nop,nop,TS val 3504670828 ecr 2864758417], length 65160
11:49:03.866532 IP 192.168.1.21.40468 > VM1.targus-getdata1: Flags [.], seq 1840640222:1840705382, ack 1, win 229, options [nop,nop,TS val 3504670831 ecr 2864758420], length 65160
11:49:03.869366 IP 192.168.1.21.40468 > VM1.targus-getdata1: Flags [.], seq 1844448462:1844489006, ack 1, win 229, options [nop,nop,TS val 3504670834 ecr 2864758423], length 40544
----End
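In the captures of this procedure, a length of 65160 equals 45 × 1448, which is consistent with the receiver merging 45 full-size segments into one packet. The following is a minimal sketch for summarizing such a capture. summarize_lengths is a hypothetical helper name, and the 1448-byte segment size is assumed from the capture taken with GRO and LRO disabled.

```shell
#!/bin/sh
# Summarize TCP payload lengths from tcpdump text output read on stdin.
# The segment size (mss) is assumed to be 1448 bytes, the per-packet
# length observed when GRO and LRO are disabled.
summarize_lengths() {
    awk -v mss=1448 '
        # tcpdump ends each TCP line with "length N".
        NF >= 2 && $(NF-1) == "length" {
            n = $NF + 0
            total++
            if (n > mss) merged++      # packet larger than one segment
            if (n > max) max = n
        }
        END {
            printf "packets=%d merged=%d max_len=%d max_segments=%d\n",
                   total, merged, max, int((max + mss - 1) / mss)
        }'
}

# Two lines taken from the LRO capture in this procedure.
summarize_lengths <<'EOF'
11:52:01.964338 IP 192.168.1.21.40472 > VM1.targus-getdata1: Flags [.], seq 4106154718:4106156166, ack 1, win 229, options [nop,nop,TS val 3504848929 ecr 2864936518], length 1448
11:52:01.966923 IP 192.168.1.21.40472 > VM1.targus-getdata1: Flags [.], seq 4112365190:4112430350, ack 1, win 229, options [nop,nop,TS val 3504848931 ecr 2864936521], length 65160
EOF
```

On a live system, the function could read from tcpdump directly, for example tcpdump tcp -i <dev> -c 100 | summarize_lengths.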
4.5.6 Traffic Sampling
Step 1 Configure sFlow on the sender.
Run the following command on server 1:
ovs-vsctl -- --id=@sflow create sflow agent=enp3s0 target=\"<host2_ip>:6343\" header=128 sampling=64 polling=10 -- set bridge br-ovs sflow=@sflow
NOTE
Set target to the IP address and port of the collector, and set agent to the network interface through which the sFlow datagrams are sent. Specify the OVS bridge name after bridge.

Step 2 Install sflowtool on the collector.
git config --global http.sslVerify false
git clone https://github.com/sflow/sflowtool
cd sflowtool
./boot.sh
./configure
make
make install
Step 3 Start sflowtool on the collector to receive the sFlow datagrams.
Run the following command on server 2:
./src/sflowtool -p 6343
Step 4 Start two VMs on the sender to send packets. On the physical machine used to sample traffic, check the sampling result.
The following information is displayed:
headerBytes E4-11-22-33-44-51-E4-11-22-33-44-50-08-00-45-00-00-54-D4-4E-00-00-40-01-22-DF-C0-A8-01-15-C0-A8-01-16-00-00-B9-20-2B-93-00-1A-D6-87-9A-5F-00-00-00-00-E7-77-04-00-00-00-00-00-10-11-12-13-14-15-16-17-18-19-1A-1B-1C-1D-1E-1F-20-21-22-23-24-25-26-27-28-29-2A-2B-2C-2D-2E-2F-30-31-32-33-34-35-36-37
dstMAC e41122334451
srcMAC e41122334450
IPSize 84
ip.tot_len 84
srcIP 192.168.1.21
dstIP 192.168.1.22
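The srcIP and dstIP fields reported by sflowtool can be cross-checked against the raw headerBytes. The following is a minimal sketch, assuming an untagged Ethernet II frame that carries IPv4 without IP options, so that the source and destination addresses are at byte offsets 26 to 29 and 30 to 33. decode_ips is a hypothetical helper name.

```shell
#!/bin/sh
# Print the IPv4 source and destination addresses contained in an
# sflowtool headerBytes string (hyphen-separated hex bytes).
# Assumes untagged Ethernet II + IPv4 without IP options.
decode_ips() {
    echo "$1" | awk -F- '
        # Convert a hex string to a number using only POSIX awk features.
        function hex(s,    i, c, v) {
            v = 0
            s = toupper(s)
            for (i = 1; i <= length(s); i++) {
                c = index("0123456789ABCDEF", substr(s, i, 1)) - 1
                v = v * 16 + c
            }
            return v
        }
        {
            # awk fields are 1-based, so byte offset 26 is field 27.
            printf "src=%d.%d.%d.%d dst=%d.%d.%d.%d\n",
                   hex($27), hex($28), hex($29), hex($30),
                   hex($31), hex($32), hex($33), hex($34)
        }'
}

# First 34 bytes of the sampled header shown in this step.
decode_ips "E4-11-22-33-44-51-E4-11-22-33-44-50-08-00-45-00-00-54-D4-4E-00-00-40-01-22-DF-C0-A8-01-15-C0-A8-01-16"
```

For the sample in this step, the decoded addresses match the srcIP and dstIP values reported by sflowtool.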
Step 5 Clear the network configuration.
ovs-vsctl del-br br-ovs

Step 6 Configure NetFlow on the sender.
Run the following command on server 1:
ovs-vsctl -- set Bridge br-ovs netflow=@nf -- --id=@nf create NetFlow target=\"<host2_ip>:2055\" active-timeout=60
Step 7 Start sflowtool on the collector to receive the NetFlow datagrams.
Run the following command on server 2:
./src/sflowtool -p 2055
Step 8 Repeat Step 4 and check the sampling result.
The following information is displayed:
startDatagram =================================
datagramSourceIP 90.90.48.60
datagramSize 120
unixSecondsUTC 1607393677
localtime 2020-12-08T10:14:37+0800
datagramVersion 327682
unexpected datagram version number (source IP = 90.90.48.60)
00-05-00-02-<*>-00-00-d7-15-5f-ce-e1-ac-2a-e4-37-46
00-00-00-01-5b-5b-00-00-c0-a8-63-0b-c0-a8-63-15
00-00-00-00-00-09-00-01-00-00-00-08-00-00-02-a6
00-00-5f-a7-00-00-60-1a-c9-26-14-51-00-1a-06-00
00-00-00-00-00-00-00-00-c0-a8-63-15-c0-a8-63-0b
00-00-00-00-00-01-00-09-00-00-00-07-00-00-01-da
00-00-5f-a9-00-00-60-1a-14-51-c9-26-00-1a-06-00
00-00-00-00-00-00-00-00
caught exception: 2
endDatagram =================================
----End
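sflowtool expects sFlow datagrams, which would explain the "unexpected datagram version number" message when it receives NetFlow. The reported datagramVersion 327682 is 0x00050002, which matches the first four bytes of the dump (00-05-00-02). Read as the start of a NetFlow v5 header, these bytes would mean version 5 with two flow records. This reading is an assumption and is not stated in the original procedure. The following minimal sketch performs the split. decode_netflow_word is a hypothetical helper name.

```shell
#!/bin/sh
# Split the 32-bit value that sflowtool reports as datagramVersion into
# the two 16-bit fields that start a NetFlow v5 header: version and count.
decode_netflow_word() {
    word=$1
    version=$(( word >> 16 ))
    count=$(( word & 65535 ))
    # A NetFlow v5 datagram has a 24-byte header and 48-byte flow records.
    size=$(( 24 + 48 * count ))
    echo "version=$version count=$count expected_size=$size"
}

decode_netflow_word 327682
```

The computed size of 120 bytes agrees with the datagramSize value in the output, which supports the assumption that the collector received a NetFlow v5 datagram.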
4.5.7 Protocol Offloading

VLAN Offloading

Step 1 Configure VFs. For details, see 4.4.2 Configuring Kernel-Mode SR-IOV.

Step 2 Configure the network.
systemctl start openvswitch
ovs-vsctl add-br br-ovs
ovs-vsctl add-port br-ovs enp1s0f0_0
ovs-vsctl add-port br-ovs enp1s0f0_1
ovs-vsctl add-port br-ovs enp1s0f0_2
ovs-vsctl add-port br-ovs enp1s0f0_3
ovs-vsctl add-port br-ovs enp1s0f0
ip link set dev enp1s0f0 up
ip link set dev enp1s0f0_0 up
ip link set dev enp1s0f0_1 up
ip link set dev enp1s0f0_2 up
ip link set dev enp1s0f0_3 up
ovs-vsctl set Port enp1s0f0_0 tag=100

Step 3 Start the VM and log in to it.
virsh start vm1
virsh console vm1

Step 4 Verify the network connectivity.

Run the following command on VM 1 of server 1:

ping <Host2 vm1_IP>

Step 5 Verify flow table offloading and VLAN tagging. Use the two VMs to send traffic and check whether the flow table is offloaded.
● Run the following command on VM 1 of server 2:
iperf3 -s
● Run the following command on VM 1 of server 1:
iperf3 -c <Host2vm1_ip> -t 0
● Run the following command on any of the physical machines:
ovs-appctl dpctl/dump-flows type=offloaded
Step 6 Verify the connection tracking (CT) configuration.
1. Configure the CT flow table on the physical machine.
ovs-ofctl del-flows br-ovs
ovs-ofctl add-flow br-ovs "arp, actions=normal"
ovs-ofctl add-flow br-ovs "table=0, ip,ct_state=-trk, actions=ct(table=1)"
ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+new, actions=ct(commit),normal"
ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+est, actions=normal"
2. Verify the flow table offloading.
ovs-ofctl dump-flows br-ovs

NOTE
Each of these commands displays its output only once. To refresh the output continuously, run watch -n 1 -d 'command', replacing command with the actual command.
ovs-appctl dpctl/dump-flows

ovs-appctl dpctl/dump-flows type=offloaded

NOTE
CT stateful offloading requires the support of the complete TC module of kernel 5.7. Kernel 4.14 of CentOS 7.6 does not fully support this function.
----End
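The same CT flow table is configured in the VLAN, VXLAN, and Geneve procedures of this section. The following is a minimal sketch of how the rules could be generated for any bridge. print_ct_flows is a hypothetical helper name, and the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print the connection tracking flow rules used in
# this section for a given bridge. Nothing is executed.
print_ct_flows() {
    br=$1
    # Remove the existing flows, then let ARP through unchanged.
    echo "ovs-ofctl del-flows $br"
    echo "ovs-ofctl add-flow $br \"arp, actions=normal\""
    # Untracked IP packets are sent through conntrack and resume in table 1.
    echo "ovs-ofctl add-flow $br \"table=0, ip,ct_state=-trk, actions=ct(table=1)\""
    # New connections are committed and forwarded.
    echo "ovs-ofctl add-flow $br \"table=1, ip,ct_state=+trk+new, actions=ct(commit),normal\""
    # Established connections are forwarded.
    echo "ovs-ofctl add-flow $br \"table=1, ip,ct_state=+trk+est, actions=normal\""
}

print_ct_flows br-ovs
```

After the printed commands are reviewed, they could be applied by piping the output to sh on the physical machine.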

VXLAN Offloading

Step 1 Configure VFs.

For details, see 4.4.2 Configuring Kernel-Mode SR-IOV.

Step 2 Configure the network.
systemctl start openvswitch
ovs-vsctl add-br br-ovs
ovs-vsctl add-port br-ovs enp1s0f0_0
ovs-vsctl add-port br-ovs enp1s0f0_1
ovs-vsctl add-port br-ovs enp1s0f0_2
ovs-vsctl add-port br-ovs enp1s0f0_3
ip link set dev enp1s0f0 up
ip link set dev enp1s0f0_0 up
ip link set dev enp1s0f0_1 up
ip link set dev enp1s0f0_2 up
ip link set dev enp1s0f0_3 up
ovs-vsctl add-port br-ovs vxlan0 -- set Interface vxlan0 type=vxlan options:local_ip=192.168.1.12 options:remote_ip=192.168.1.11 options:key=98
ifconfig enp1s0f0 192.168.1.12/24 up

NOTE
When configuring the VXLAN port on the peer server, swap the values of local_ip and remote_ip.
Step 3 Start the VM and log in to it.
virsh start vm1
virsh console vm1
Step 4 Set the maximum transmission unit (MTU) of the two VMs to a value not greater than 1450.
ifconfig <dev> mtu 1450
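The value 1450 follows from the VXLAN encapsulation overhead. The following is a minimal sketch of the arithmetic, assuming an underlay MTU of 1500, an IPv4 outer header without options, and no VLAN tag on the inner frame. vxlan_inner_mtu is a hypothetical helper name.

```shell
#!/bin/sh
# Largest VM MTU that fits in one underlay packet after VXLAN
# encapsulation over IPv4:
# inner Ethernet (14) + VXLAN (8) + UDP (8) + outer IPv4 (20) = 50 bytes.
vxlan_inner_mtu() {
    underlay_mtu=$1
    overhead=$(( 14 + 8 + 8 + 20 ))
    echo $(( underlay_mtu - overhead ))
}

vxlan_inner_mtu 1500
```

If the underlay network uses a larger MTU, for example jumbo frames, the VM MTU can be raised by the same difference.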
Step 5 Use the two VMs to send traffic.
● Run the following command on VM 1 of server 2:
iperf3 -s
● Run the following command on VM 1 of server 1:
iperf3 -c <Host2vm1_ip> -t 0
Step 6 Verify the flow table offloading on the physical machine.
ovs-appctl dpctl/dump-flows

Step 7 Verify the CT configuration.
1. Configure the CT flow table on the physical machine.
ovs-ofctl del-flows br-ovs
ovs-ofctl add-flow br-ovs "arp, actions=normal"
ovs-ofctl add-flow br-ovs "table=0, ip,ct_state=-trk, actions=ct(table=1)"
ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+new, actions=ct(commit),normal"
ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+est, actions=normal"

2. Verify the flow table offloading.
ovs-appctl dpctl/dump-flows type=offloaded

NOTE
CT stateful offloading requires the support of the complete TC module of kernel 5.7. Kernel 4.14 of CentOS 7.6 does not fully support this function.
----End

Geneve Offloading
Step 1 Configure VFs.
For details, see 4.4.2 Configuring Kernel-Mode SR-IOV.
Step 2 Configure the network.
systemctl start openvswitch
ovs-vsctl add-br br-ovs
ovs-vsctl add-port br-ovs enp1s0f0_0
ovs-vsctl add-port br-ovs enp1s0f0_1
ovs-vsctl add-port br-ovs enp1s0f0_2
ovs-vsctl add-port br-ovs enp1s0f0_3
ip link set dev enp1s0f0 up
ip link set dev enp1s0f0_0 up
ip link set dev enp1s0f0_1 up
ip link set dev enp1s0f0_2 up
ip link set dev enp1s0f0_3 up
ovs-vsctl add-port br-ovs tun0 -- set Interface tun0 type=geneve options:local_ip=192.168.1.12 options:remote_ip=192.168.1.11
ifconfig enp1s0f0 192.168.1.12/24 up
NOTE
When configuring the Geneve port on the peer server, swap the values of local_ip and remote_ip.
Step 3 Start the VM and log in to it.
virsh start vm1
virsh console vm1
Step 4 Set the MTU of the two VMs to a value not greater than 1450.
ifconfig <dev> mtu 1450

Step 5 Use the two VMs to send traffic.
● Run the following command on VM 1 of server 2:
iperf3 -s
● Run the following command on VM 1 of server 1:
iperf3 -c <Host2vm1_ip> -t 0
Step 6 Verify the flow table offloading on the physical machine.
ovs-appctl dpctl/dump-flows

NOTE
Geneve protocol offloading requires that the kernel version be 5.3 or later.
Step 7 Verify the CT configuration.
1. Configure the CT flow table on the physical machine.
ovs-ofctl del-flows br-ovs
ovs-ofctl add-flow br-ovs "arp, actions=normal"
ovs-ofctl add-flow br-ovs "table=0, ip,ct_state=-trk, actions=ct(table=1)"
ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+new, actions=ct(commit),normal"
ovs-ofctl add-flow br-ovs "table=1, ip,ct_state=+trk+est, actions=normal"

2. Verify the flow table offloading.
ovs-appctl dpctl/dump-flows type=offloaded

NOTE
CT stateful offloading requires the support of the complete TC module of kernel 5.7. Kernel 4.14 of CentOS 7.6 does not fully support this function. After the flow table is offloaded to the hardware, the network connection becomes unidirectional.
----End
IPv6 Offloading
Step 1 Configure VFs.
For details, see 4.4.2 Configuring Kernel-Mode SR-IOV.
Step 2 Configure the network.
systemctl start openvswitch
ovs-vsctl add-br br-ovs
ovs-vsctl add-port br-ovs enp1s0f0_0
ovs-vsctl add-port br-ovs enp1s0f0_1
ovs-vsctl add-port br-ovs enp1s0f0_2
ovs-vsctl add-port br-ovs enp1s0f0_3
ovs-vsctl add-port br-ovs enp1s0f0
ip link set dev enp1s0f0 up
ip link set dev enp1s0f0_0 up
ip link set dev enp1s0f0_1 up
ip link set dev enp1s0f0_2 up
ip link set dev enp1s0f0_3 up
Step 3 Start the VM and log in to it.
virsh start vm1
virsh console vm1
Step 4 Add IPv6 addresses to the VMs.
● Run the following command on VM 1 of server 1:
ifconfig enp1s0 add 3000:1::11/64
● Run the following command on VM 1 of server 2:
ifconfig enp1s0 add 3000:1::12/64
Step 5 Use the two VMs to send traffic.
● Run the following command on VM 1 of server 2 (receiver):
iperf3 -6 -s
● Run the following command on VM 1 of server 1 (sender):
iperf3 -6 -c 3000:1::12 -u -l 512 -t 999
Step 6 Check the flow table offloading on the physical machine.
ovs-appctl dpctl/dump-flows type=offloaded

----End

A Change History

Date          Description
2021-03-23    This issue is the sixth official release. Changed the solution name from "Kunpeng virtualization solution" to "Kunpeng BoostKit for Virtualization".
2020-12-30    This issue is the fifth official release. Added the 4 SR-IOV User Guide.
2020-11-30    This issue is the fourth official release. Added the 3 XPF User Guide.
2020-09-28    This issue is the third official release. Added the 2 Kube-OVN User Guide.
2020-09-21    This issue is the second official release. Changed the solution name from "Kunpeng cloud platform solution" to "Kunpeng virtualization solution".
2020-06-24    This issue is the first official release.
