The file system target parameter with the path sub-parameter determines the location on the host physical machine file system where volumes created with this storage pool are attached, for example sdb1, sdb2, sdb3. Using /dev/, as in the example below, means volumes created from this storage pool can be accessed as /dev/sdb1, /dev/sdb2, and /dev/sdb3.
<format type='gpt'/>
The format parameter specifies the partition table type. This example uses gpt to match the GPT disk label type created in the previous step.
Create the XML file for the storage pool device with a text editor.
Example 14.2. Disk-based storage device storage pool

<pool type='disk'>
  <name>guest_images_disk</name>
  <source>
    <device path='/dev/sdb'/>
    <format type='gpt'/>
  </source>
  <target>
    <path>/dev</path>
  </target>
</pool>
3. Attach the device
Add the storage pool definition using the virsh pool-define command with the XML configuration file created in the previous step.
# virsh pool-define ~/guest_images_disk.xml
Pool guest_images_disk defined from /root/guest_images_disk.xml
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images_disk    inactive   no
4. Start the storage pool
Start the storage pool with the virsh pool-start command. Verify the pool is started with the virsh pool-list --all command.
# virsh pool-start guest_images_disk
Pool guest_images_disk started
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images_disk    active     no
5. Turn on autostart
Turn on autostart for the storage pool. Autostart configures the libvirtd service to start the storage pool when the service starts.
# virsh pool-autostart guest_images_disk
Pool guest_images_disk marked as autostarted
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images_disk    active     yes
6. Verify the storage pool configuration
Verify the storage pool was created correctly, the sizes are reported correctly, and the state is reported as running.
# virsh pool-info guest_images_disk
Name:           guest_images_disk
UUID:           551a67c8-5f2a-012c-3844-df29b167431c
State:          running
Capacity:       465.76 GB
Allocation:     0.00
Available:      465.76 GB
# ls -la /dev/sdb
brw-rw----. 1 root disk 8, 16 May 30 14:08 /dev/sdb
# virsh vol-list guest_images_disk
Name                 Path
-----------------------------------------

7. Optional: Remove the temporary configuration file
Remove the temporary storage pool XML configuration file if it is not needed anymore.
# rm ~/guest_images_disk.xml
A disk-based storage pool is now available.
14.1.2. Deleting a storage pool using virsh
The following demonstrates how to delete a storage pool using virsh:
1. To avoid any issues with other guest virtual machines using the same pool, it is best to stop
the storage pool and release any resources in use by it.
# virsh pool-destroy guest_images_disk
2. Remove the storage pool's definition
# virsh pool-undefine guest_images_disk
14.2. Partition-based storage pools
This section covers using a pre-formatted block device, a partition, as a storage pool.
For the following examples, a host physical machine has a 500 GB hard drive (/dev/sdc) partitioned into one 500 GB, ext4-formatted partition (/dev/sdc1). The procedure below sets up a storage pool for it.
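If such a partition does not exist yet, it can be prepared in advance. The following is only a minimal sketch, assuming the disk is /dev/sdc, that one partition spanning the whole disk is acceptable, and that any existing data on the disk may be destroyed; adjust the device name, label type, and sizes for your system.

# parted /dev/sdc mklabel gpt
# parted /dev/sdc mkpart primary ext4 1MiB 100%
# mkfs.ext4 /dev/sdc1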
14.2.1. Creating a partition-based storage pool using virt-manager
This procedure creates a new storage pool using a partition of a storage device.
Procedure 14.1. Creating a partition-based storage pool with virt-manager
1. Open the storage pool settings
a. In the virt-manager graphical interface, select the host physical machine from the main window.
Open the Edit menu and select Connection Details.
Figure 14.1. Connection Details
b. Click on the Storage tab of the Connection Details window.
Figure 14.2. Storage tab
2. Create the new storage pool
a. Add a new pool (part 1)
Press the + button (the add pool button). The Add a New Storage Pool wizard appears.
Choose a Name for the storage pool. This example uses the name guest_images_fs.
Change the Type to fs: Pre-Formatted Block Device.
Figure 14.3. Storage pool name and type
Press the Forward button to continue.
b. Add a new pool (part 2)
Change the Target Path, Format, and Source Path fields.
Figure 14.4. Storage pool path and format
Target Path
Enter the location to mount the source device for the storage pool in the Target Path field. If the location does not already exist, virt-manager will create the directory.
Format
Select a format from the Format list. The device is formatted with the selected format.
This example uses the ext4 file system, the default Red Hat Enterprise Linux file system.
Source Path
Enter the device in the Source Path field.
This example uses the /dev/sdc1 device.
Verify the details and press the Finish button to create the storage pool.
3. Verify the new storage pool
The new storage pool appears in the storage list on the left after a few seconds. Verify the size is reported as expected, 458.20 GB Free in this example. Verify the State field reports the new storage pool as Active.
Select the storage pool. In the Autostart field, click the On Boot checkbox. This will make sure the storage device starts whenever the libvirtd service starts.
Figure 14.5. Storage list confirmation
The storage pool is now created. Close the Connection Details window.
14.2.2. Deleting a storage pool using virt-manager
This procedure demonstrates how to delete a storage pool.
1. To avoid any issues with other guest virtual machines using the same pool, it is best to stop the storage pool and release any resources in use by it. To do this, select the storage pool you want to stop and click the red X icon at the bottom of the Storage window.
Figure 14.6. Stop Icon
2. Delete the storage pool by clicking the Trash can icon. This icon is only enabled if you stop the storage pool first.
14.2.3. Creating a partition-based storage pool using virsh
This section covers creating a partition-based storage pool with the virsh command.
Warning
Do not use this procedure to assign an entire disk as a storage pool (for example, /dev/sdb). Guests should not be given write access to whole disks or block devices. Only use this method to assign partitions (for example, /dev/sdb1) to storage pools.
Procedure 14.2. Creating pre-formatted block device storage pools using virsh
1. Create the storage pool definition
Use the virsh pool-define-as command to create a new storage pool definition. There are three options that must be provided to define a pre-formatted disk as a storage pool:
Partition name
The name parameter determines the name of the storage pool. This example uses the name guest_images_fs.
device
The device parameter with the path attribute specifies the device path of the storage device. This example uses the partition /dev/sdc1.
mountpoint
The mountpoint on the local file system where the formatted device will be mounted. If the mount point directory does not exist, the virsh command can create the directory.
The directory /guest_images is used in this example.
# virsh pool-define-as guest_images_fs fs - - /dev/sdc1 "/guest_images"
Pool guest_images_fs defined
The new pool is now created.
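For orientation, the definition produced by this command corresponds to a file system pool XML along the following lines. This is a sketch based on the libvirt fs pool format using the values from the command above, not output captured from this host; the actual definition can be inspected with virsh pool-dumpxml guest_images_fs.

<pool type='fs'>
  <name>guest_images_fs</name>
  <source>
    <device path='/dev/sdc1'/>
  </source>
  <target>
    <path>/guest_images</path>
  </target>
</pool>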
2. Verify the new pool
List the present storage pools.
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images_fs      inactive   no
3. Create the mount point
Use the virsh pool-build command to create a mount point for a pre-formatted file system storage pool.
# virsh pool-build guest_images_fs
Pool guest_images_fs built
# ls -la /guest_images
total 8
drwx------. 2 root root 4096 May 31 19:38 .
dr-xr-xr-x. 25 root root 4096 May 31 19:38 ..
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images_fs      inactive   no
4. Start the storage pool
Use the virsh pool-start command to mount the file system onto the mount point and make the pool available for use.
# virsh pool-start guest_images_fs
Pool guest_images_fs started
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images_fs      active     no
5. Turn on autostart
By default, a storage pool defined with virsh is not set to start automatically each time libvirtd starts. Turn on automatic start with the virsh pool-autostart command. The storage pool is then automatically started each time libvirtd starts.
# virsh pool-autostart guest_images_fs
Pool guest_images_fs marked as autostarted
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images_fs      active     yes
6. Verify the storage pool
Verify the storage pool was created correctly, the sizes reported are as expected, and the state is reported as running. Verify there is a "lost+found" directory in the mount point on the file system, indicating the device is mounted.
# virsh pool-info guest_images_fs
Name:           guest_images_fs
UUID:           c7466869-e82a-a66c-2187-dc9d6f0877d0
State:          running
Persistent:     yes
Autostart:      yes
Capacity:       458.39 GB
Allocation:     197.91 MB
Available:      458.20 GB
# mount | grep /guest_images
/dev/sdc1 on /guest_images type ext4 (rw)
# ls -la /guest_images
total 24
drwxr-xr-x. 3 root root 4096 May 31 19:47 .
dr-xr-xr-x. 25 root root 4096 May 31 19:38 ..
drwx------. 2 root root 16384 May 31 14:18 lost+found
14.2.4. Deleting a storage pool using virsh
1. To avoid any issues with other guest virtual machines using the same pool, it is best to stop
the storage pool and release any resources in use by it.
# virsh pool-destroy guest_images_fs
2. Optionally, if you want to remove the directory where the storage pool resides, use the following command:
# virsh pool-delete guest_images_fs
3. Remove the storage pool's definition
# virsh pool-undefine guest_images_fs
14.3. Directory-based storage pools
This section covers storing guest virtual machines in a directory on the host physical machine.
Directory-based storage pools can be created with virt-manager or the virsh command line tools.
14.3.1. Creating a directory-based storage pool with virt-manager
1. Create the local directory
a. Optional: Create a new directory for the storage pool
Create the directory on the host physical machine for the storage pool. This example uses a directory named /guest_images.
# mkdir /guest_images
b. Set directory ownership
Change the user and group ownership of the directory. The directory must be owned by the root user.
# chown root:root /guest_images
c. Set directory permissions
Change the file permissions of the directory.
# chmod 700 /guest_images
d. Verify the changes
Verify the permissions were modified. The output shows a correctly configured empty directory.
# ls -la /guest_images
total 8
drwx------. 2 root root 4096 May 28 13:57 .
dr-xr-xr-x. 26 root root 4096 May 28 13:57 ..
2. Configure SELinux file contexts
Configure the correct SELinux context for the new directory. Note that the name of the pool and the directory do not have to match. However, when you shut down the guest virtual machine, libvirt has to set the context back to a default value. The context of the directory determines what this default value is. It is worth explicitly labeling the directory virt_image_t, so that when the guest virtual machine is shut down, the images get labeled 'virt_image_t' and are thus isolated from other processes running on the host physical machine.
# semanage fcontext -a -t virt_image_t '/guest_images(/.*)?'
# restorecon -R /guest_images
3. Open the storage pool settings
a. In the virt-manager graphical interface, select the host physical machine from the main window.
Open the Edit menu and select Connection Details.
Figure 14.7. Connection details window
b. Click on the Storage tab of the Connection Details window.
Figure 14.8. Storage tab
4. Create the new storage pool
a. Add a new pool (part 1)
Press the + button (the add pool button). The Add a New Storage Pool wizard appears.
Choose a Name for the storage pool. This example uses the name guest_images.
Change the Type to dir: Filesystem Directory.
Figure 14.9. Name the storage pool
Press the Forward button to continue.
b. Add a new pool (part 2)
Change the Target Path field. For example, /guest_images.
Verify the details and press the Finish button to create the storage pool.
5. Verify the new storage pool
The new storage pool appears in the storage list on the left after a few seconds. Verify the size is reported as expected, 36.41 GB Free in this example. Verify the State field reports the new storage pool as Active.
Select the storage pool. In the Autostart field, confirm that the On Boot checkbox is checked. This will make sure the storage pool starts whenever the libvirtd service starts.
Figure 14.10. Verify the storage pool information
The storage pool is now created. Close the Connection Details window.
14.3.2. Deleting a storage pool using virt-manager
This procedure demonstrates how to delete a storage pool.
1. To avoid any issues with other guest virtual machines using the same pool, it is best to stop the storage pool and release any resources in use by it. To do this, select the storage pool you want to stop and click the red X icon at the bottom of the Storage window.
Figure 14.11. Stop Icon
2. Delete the storage pool by clicking the Trash can icon. This icon is only enabled if you stop the storage pool first.
14.3.3. Creating a directory-based storage pool with virsh
1. Create the storage pool definition
Use the virsh pool-define-as command to define a new storage pool. There are two options required for creating directory-based storage pools:
The name of the storage pool.
This example uses the name guest_images. All further virsh commands used in this example use this name.
The path to a file system directory for storing guest image files. If this directory does not exist, virsh will create it.
This example uses the /guest_images directory.
# virsh pool-define-as guest_images dir - - - - "/guest_images"
Pool guest_images defined
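For reference, this command produces a definition equivalent to a directory pool XML along these lines; the snippet is a sketch of the libvirt dir pool format shown for orientation, not a dump taken from the host:

<pool type='dir'>
  <name>guest_images</name>
  <target>
    <path>/guest_images</path>
  </target>
</pool>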
2. Verify the storage pool is listed
Verify the storage pool object is created correctly and the state reports it as inactive.
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images         inactive   no
3. Create the local directory
Use the virsh pool-build command to build the directory-based storage pool for the directory guest_images (for example), as shown:
# virsh pool-build guest_images
Pool guest_images built
# ls -la /guest_images
total 8
drwx------. 2 root root 4096 May 30 02:44 .
dr-xr-xr-x. 26 root root 4096 May 30 02:44 ..
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images         inactive   no
4. Start the storage pool
Use the virsh pool-start command to enable a directory storage pool, allowing volumes of the pool to be used as guest disk images.
# virsh pool-start guest_images
Pool guest_images started
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images         active     no
5. Turn on autostart
Turn on autostart for the storage pool. Autostart configures the libvirtd service to start the storage pool when the service starts.
# virsh pool-autostart guest_images
Pool guest_images marked as autostarted
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images         active     yes
6. Verify the storage pool configuration
Verify the storage pool was created correctly, the size is reported correctly, and the state is reported as running. If you want the pool to be accessible even if the guest virtual machine is not running, make sure that Persistent is reported as yes. If you want the pool to start automatically when the service starts, make sure that Autostart is reported as yes.
# virsh pool-info guest_images
Name:           guest_images
UUID:           779081bf-7a82-107b-2874-a19a9c51d24c
State:          running
Persistent:     yes
Autostart:      yes
Capacity:       49.22 GB
Allocation:     12.80 GB
Available:      36.41 GB
# ls -la /guest_images
total 8
drwx------. 2 root root 4096 May 30 02:44 .
dr-xr-xr-x. 26 root root 4096 May 30 02:44 ..
#
A directory-based storage pool is now available.
14.3.4. Deleting a storage pool using virsh
The following demonstrates how to delete a storage pool using virsh:
1. To avoid any issues with other guest virtual machines using the same pool, it is best to stop
the storage pool and release any resources in use by it.
# virsh pool-destroy guest_images
2. Optionally, if you want to remove the directory where the storage pool resides, use the following command:
# virsh pool-delete guest_images
3. Remove the storage pool's definition
# virsh pool-undefine guest_images
14.4. LVM-based storage pools
This chapter covers using LVM volume groups as storage pools.
LVM-based storage pools provide the full flexibility of LVM.
Note
Thin provisioning is currently not possible with LVM-based storage pools.
Note
Refer to the Red Hat Enterprise Linux Storage Administration Guide for more details on LVM.
Warning
LVM-based storage pools require a full disk partition. If activating a new partition or device with these procedures, the partition will be formatted and all data will be erased. If using the host's existing volume group (VG), nothing will be erased. It is recommended to back up the storage device before commencing the following procedure.
14.4.1. Creating an LVM-based storage pool with virt-manager
LVM-based storage pools can use existing LVM volume groups or create new LVM volume groups on a blank partition.
1. Optional: Create new partition for LVM volumes
These steps describe how to create a new partition and LVM volume group on a new hard disk drive.
Warning
This procedure will remove all data from the selected storage device.
a. Create a new partition
Use the fdisk command to create a new disk partition from the command line. The following example creates a new partition that uses the entire disk on the storage device /dev/sdb.
# fdisk /dev/sdb
Command (m for help):
Press n for a new partition.
b. Press p for a primary partition.
Command action
   e   extended
   p   primary partition (1-4)
c. Choose an available partition number. In this example the first partition is chosen by entering 1.
Partition number (1-4): 1
d. Enter the default first cylinder by pressing Enter.
First cylinder (1-400, default 1):
e. Select the size of the partition. In this example the entire disk is allocated by pressing Enter.
Last cylinder or +size or +sizeM or +sizeK (2-400, default 400):
f. Set the type of partition by pressing t.
Command (m for help): t
g. Choose the partition you created in the previous steps. In this example, the partition
number is 1.
Partition number (1-4): 1
h. Enter 8e for a Linux LVM partition.
Hex code (type L to list codes): 8e
i. Write changes to disk and quit.
Command (m for help): w
Command (m for help): q
j. Create a new LVM volume group
Create a new LVM volume group with the vgcreate command. This example creates a volume group named guest_images_lvm.
# vgcreate guest_images_lvm /dev/sdb1
Physical volume "/dev/sdb1" successfully created
Volume group "guest_images_lvm" successfully created
The new LVM volume group, guest_images_lvm, can now be used for an LVM-based storage pool.
2. Open the storage pool settings
a. In the virt-manager graphical interface, select the host from the main window.
Open the Edit menu and select Connection Details.
Figure 14.12. Connection details
b. Click on the Storage tab.
Figure 14.13. Storage tab
3. Create the new storage pool
a. Start the Wizard
Press the + button (the add pool button). The Add a New Storage Pool wizard appears.
Choose a Name for the storage pool. We use guest_images_lvm for this example. Then change the Type to logical: LVM Volume Group.
Figure 14.14. Add LVM storage pool
Press the Forward button to continue.
b. Add a new pool (part 2)
Fill in the Target Path and Source Path fields, then tick the Build Pool check box.
Use the Target Path field to either select an existing LVM volume group or as the name for a new volume group. The default format is /dev/storage_pool_name. This example uses a new volume group named /dev/guest_images_lvm.
The Source Path field is optional if an existing LVM volume group is used in the Target Path.
For new LVM volume groups, input the location of a storage device in the Source Path field. This example uses a blank partition /dev/sdc.
The Build Pool checkbox instructs virt-manager to create a new LVM volume group. If you are using an existing volume group you should not select the Build Pool checkbox.
This example is using a blank partition to create a new volume group, so the Build Pool checkbox must be selected.
Figure 14.15. Add target and source
Verify the details and press the Finish button to format the LVM volume group and create the storage pool.
c. Confirm the device to be formatted
A warning message appears.
Figure 14.16. Warning message
Press the Yes button to proceed to erase all data on the storage device and create the storage pool.
4. Verify the new storage pool
The new storage pool will appear in the list on the left after a few seconds. Verify the details are what you expect, 465.76 GB Free in our example. Also verify the State field reports the new storage pool as Active.
It is generally a good idea to have the Autostart check box enabled, to ensure the storage pool starts automatically with libvirtd.
Figure 14.17. Confirm LVM storage pool details
Close the Host Details dialog, as the task is now complete.
14.4.2. Deleting a storage pool using virt-manager
This procedure demonstrates how to delete a storage pool.
1. To avoid any issues with other guest virtual machines using the same pool, it is best to stop the storage pool and release any resources in use by it. To do this, select the storage pool you want to stop and click the red X icon at the bottom of the Storage window.
Figure 14.18. Stop Icon
2. Delete the storage pool by clicking the Trash can icon. This icon is only enabled if you stop the storage pool first.
14.4.3. Creating an LVM-based storage pool with virsh
This section outlines the steps required to create an LVM-based storage pool with the virsh command. It uses the example of a pool named guest_images_lvm from a single drive (/dev/sdc). This is only an example and your settings should be substituted as appropriate.
Procedure 14.3. Creating an LVM-based storage pool with virsh
1. Define the pool name guest_images_lvm.
# virsh pool-define-as guest_images_lvm logical - - /dev/sdc \
  libvirt_lvm /dev/libvirt_lvm
Pool guest_images_lvm defined
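As a reference point, this definition corresponds roughly to the following logical pool XML. The snippet is a sketch of the libvirt logical pool format built from the values in the command above, not a dump taken from the host:

<pool type='logical'>
  <name>guest_images_lvm</name>
  <source>
    <device path='/dev/sdc'/>
    <name>libvirt_lvm</name>
    <format type='lvm2'/>
  </source>
  <target>
    <path>/dev/libvirt_lvm</path>
  </target>
</pool>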
2. Build the pool according to the specified name. If you are using an already existing volume
group, skip this step.
# virsh pool-build guest_images_lvm
Pool guest_images_lvm built
3. Initialize the new pool.
# virsh pool-start guest_images_lvm
Pool guest_images_lvm started
4. Show the volume group information with the vgs command.
# vgs
  VG           #PV #LV #SN Attr   VSize   VFree
  libvirt_lvm    1   0   0 wz--n- 465.76g 465.76g
5. Set the pool to start automatically.
# virsh pool-autostart guest_images_lvm
Pool guest_images_lvm marked as autostarted
6. List the available pools with the virsh command.
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
guest_images_lvm     active     yes
7. The following commands demonstrate the creation of three volumes (volume1, volume2 and
volume3) within this pool.
# virsh vol-create-as guest_images_lvm volume1 8G
Vol volume1 created
# virsh vol-create-as guest_images_lvm volume2 8G
Vol volume2 created
# virsh vol-create-as guest_images_lvm volume3 8G
Vol volume3 created
8. List the available volumes in this pool with the virsh command.
# virsh vol-list guest_images_lvm
Name       Path
-----------------------------------------
volume1    /dev/libvirt_lvm/volume1
volume2    /dev/libvirt_lvm/volume2
volume3    /dev/libvirt_lvm/volume3
9. The following two commands (lvscan and lvs) display further information about the newly created volumes.
# lvscan
  ACTIVE            '/dev/libvirt_lvm/volume1' [8.00 GiB] inherit
  ACTIVE            '/dev/libvirt_lvm/volume2' [8.00 GiB] inherit
  ACTIVE            '/dev/libvirt_lvm/volume3' [8.00 GiB] inherit
# lvs
  LV      VG          Attr   LSize Pool Origin Data%  Move Log Copy%  Convert
  volume1 libvirt_lvm -wi-a- 8.00g
  volume2 libvirt_lvm -wi-a- 8.00g
  volume3 libvirt_lvm -wi-a- 8.00g
14.4.4. Deleting a storage pool using virsh
The following demonstrates how to delete a storage pool using virsh:
1. To avoid any issues with other guests using the same pool, it is best to stop the storage pool and release any resources in use by it.
# virsh pool-destroy guest_images_lvm
2. Optionally, if you want to remove the directory where the storage pool resides, use the following command:
# virsh pool-delete guest_images_lvm
3. Remove the storage pool's definition
# virsh pool-undefine guest_images_lvm
14.5. iSCSI-based storage pools
This section covers using iSCSI-based devices to store guest virtual machines. This allows for more flexible storage options, such as using iSCSI as a block storage device. The iSCSI devices use an LIO target, which is a multi-protocol SCSI target for Linux. In addition to iSCSI, LIO also supports Fibre Channel and Fibre Channel over Ethernet (FCoE).
iSCSI (Internet Small Computer System Interface) is a network protocol for sharing storage devices. iSCSI connects initiators (storage clients) to targets (storage servers) using SCSI instructions over the IP layer.
14.5.1. Configuring a software iSCSI target
Introduced in Red Hat Enterprise Linux 7, iSCSI targets are created with the targetcli package, which provides a command set for creating software-backed iSCSI targets.
Procedure 14.4. Creating an iSCSI target
1. Install the required package
Install the targetcli package and all dependencies:
# yum install targetcli
2. Launch targetcli
Launch the targetcli command set:
# targetcli
3. Create storage objects
Create three storage objects as follows, using the device created in Section 14.4, "LVM-based storage pools":
a. Create a block storage object, by changing into the /backstores/block directory and running the following command:
# create [block-name] [filepath]
For example:
# create block1 dev=/dev/vdb1
b. Create a fileio object, by changing into the fileio directory and running the following command:
# create [fileioname] [imagename] [image-size]
For example:
# create fileio1 /foo.img 50M
c. Create a ramdisk object by changing into the ramdisk directory, and running the following command:
# create [ramdiskname] [size]
For example:
# create ramdisk1 1M
d. Remember the names of the disks you created in this step, as you will need them later.
4. Navigate to the /iscsi directory
Change into the iscsi directory:
# cd /iscsi
5. Create iSCSI target
Create an iSCSI target in one of two ways:
a. create with no additional parameters automatically generates the IQN.
b. create iqn.2010-05.com.example.server1:iscsirhel7guest creates a specific IQN on a specific server.
6. Define the target portal group (TPG)
Each iSCSI target needs to have a target portal group (TPG) defined. In this example, the default tpg1 will be used, but you can add additional TPGs as well. As this is the most common configuration, the example will configure tpg1. To do this, make sure you are still in the /iscsi directory and change to the /tpg1 directory.
# /iscsi>iqn.2010-05.com.example.server1:iscsirhel7guest/tpg1
7. Define the portal IP address
In order to export the block storage over iSCSI, the portals, LUNs, and ACLs must all be configured first.
The portal includes the IP address and TCP port that the target will listen on, and the initiators will connect to. iSCSI uses port 3260, which is the port that will be configured by default. To connect to this port, run the following command from the /tpg directory:
# portals/ create
This command has all available IP addresses listening on this port. To specify that only one specific IP address will listen on the port, run portals/ create [ipaddress], and the specified IP address will be configured to listen to port 3260.
8. Configure the LUNs and assign the storage objects to the fabric
This step uses the storage devices created in Step 3. Make sure you change into the luns directory for the TPG you created in Step 6, or iscsi>iqn.2010-05.com.example.server1:iscsirhel7guest, for example.
a. Assign the first LUN to the ramdisk as follows:
# create /backstores/ramdisk/ramdisk1
b. Assign the second LUN to the block disk as follows:
# create /backstores/block/block1
c. Assign the third LUN to the fileio disk as follows:
# create /backstores/fileio/file1
d. Listing the resulting LUNs should resemble this screen output:
/iscsi/iqn.20...csirhel7guest/tpg1 ls
o- tpg1 ....................................................... [enabled, auth]
  o- acls ............................................................ [0 ACL]
  o- luns ........................................................... [3 LUNs]
  | o- lun0 ............................................... [ramdisk/ramdisk1]
  | o- lun1 ........................................ [block/block1 (dev/vdb1)]
  | o- lun2 ......................................... [fileio/file1 (foo.img)]
  o- portals ...................................................... [1 Portal]
    o- IPADDRESS:3260 ................................................... [OK]
9. Creating ACLs for each initiator
This step allows for the creation of authentication when the initiator connects, and it also allows for restriction of specified LUNs to specified initiators. Both targets and initiators have unique names. iSCSI initiators use an IQN.
a. To find the IQN of the iSCSI initiator, run the following command, replacing the name of the initiator:
# cat /etc/iscsi/initiatorname.iscsi
Use this IQN to create the ACLs.
b. Change to the acls directory.
c. Run the command create [iqn], or to create specific ACLs, refer to the following example:
# create iqn.2010-05.com.example.foo:888
Alternatively, to configure the kernel target to use a single user ID and password for all initiators, and allow all initiators to log in with that user ID and password, use the following commands (replacing userid and password):
# set auth userid=redhat
# set auth password=password123
# set attribute authentication=1
# set attribute generate_node_acls=1
10. Make the configuration persistent with the saveconfig command. This will overwrite the previous boot settings. Alternatively, running exit from targetcli saves the target configuration by default.
11. Enable the service with systemctl enable target.service to apply the saved settings on next boot.
Procedure 14.5. Optional steps
1. Create LVM volumes
LVM volumes are useful for iSCSI backing images. LVM snapshots and resizing can be beneficial for guest virtual machines. This example creates an LVM image named virtimage1 on a new volume group named virtstore on a RAID 5 array for hosting guest virtual machines with iSCSI.
a. Create the RAID array
Creating software RAID 5 arrays is covered by the Red Hat Enterprise Linux Deployment Guide.
b. Create the LVM volume group
Create a logical volume group named virtstore with the vgcreate command.
# vgcreate virtstore /dev/md1
c. Create an LVM logical volume
Create a logical volume named virtimage1 on the virtstore volume group with a size of 20 GB using the lvcreate command.
# lvcreate --size 20G -n virtimage1 virtstore
The new logical volume, virtimage1, is ready to use for iSCSI.
Important
Using LVM volumes for kernel target backstores can cause issues if the initiator also partitions the exported volume with LVM. This can be solved by adding global_filter = ["r|^/dev/vg0|"] to /etc/lvm/lvm.conf.
2. Optional: Test discovery
Test whether the new iSCSI device is discoverable.
# iscsiadm --mode discovery --type sendtargets --portal server1.example.com
127.0.0.1:3260,1 iqn.2010-05.com.example.server1:iscsirhel7guest
3. Optional: Test attaching the device
Attach the new device (iqn.2010-05.com.example.server1:iscsirhel7guest) to determine whether the device can be attached.
# iscsiadm -d2 -m node --login
scsiadm: Max file limits 1024 1024
Logging in to [iface: default, target: iqn.2010-05.com.example.server1:iscsirhel7guest, portal: 10.0.0.1,3260]
Login to [iface: default, target: iqn.2010-05.com.example.server1:iscsirhel7guest, portal: 10.0.0.1,3260] successful.
4. Detach the device.
# iscsiadm -d2 -m node --logout
scsiadm: Max file limits 1024 1024
Logging out of session [sid: 2, target: iqn.2010-05.com.example.server1:iscsirhel7guest, portal: 10.0.0.1,3260]
Logout of [sid: 2, target: iqn.2010-05.com.example.server1:iscsirhel7guest, portal: 10.0.0.1,3260] successful.
An iSCSI device is now ready to use for virtualization.
14.5.2. Creating an iSCSI storage pool in virt-manager
This procedure covers creating a storage pool with an iSCSI target in virt-manager.
Procedure 14.6. Adding an iSCSI device to virt-manager
1. Open the host machine's storage details
a. In virt-manager, click the Edit menu and select Connection Details from the dropdown menu.
Figure 14.19. Connection details
b. Click on the Storage tab.
Figure 14.20. Storage menu
2. Add a new pool (Step 1 of 2)
Press the + button (the add pool button). The Add a New Storage Pool wizard appears.
Figure 14.21. Add an iSCSI storage pool name and type
Choose a name for the storage pool, change the Type to iSCSI, and press Forward to continue.
3. Add a new pool (Step 2 of 2)
You will need the information you used in Section 14.5, "iSCSI-based storage pools" and Step 6 to complete the fields in this menu.
a. Enter the iSCSI source and target. The Format option is not available, as formatting is handled by the guest virtual machines. It is not advised to edit the Target Path. The default target path value, /dev/disk/by-path/, adds the drive path to that directory. The target path should be the same on all host physical machines for migration.
b. Enter the hostname or IP address of the iSCSI target. This example uses host1.example.com.
c. In the Source Path field, enter the iSCSI target IQN. If you look at Step 6 in Section 14.5, "iSCSI-based storage pools", this is the information you added in the /etc/target/targets.conf file. This example uses iqn.2010-05.com.example.server1:iscsirhel7guest.
d. Check the IQN checkbox to enter the IQN for the initiator. This example uses iqn.2010-05.com.example.host1:iscsirhel7.
e. Click Finish to create the new storage pool.
Figure 14.22. Create an iSCSI storage pool
14.5.3. Deleting a storage pool using virt-manager
This procedure demonstrates how to delete a storage pool.
1. To avoid any issues with other guest virtual machines using the same pool, it is best to stop the storage pool and release any resources in use by it. To do this, select the storage pool you want to stop and click the red X icon at the bottom of the Storage window.
Figure 14.23. Stop Icon
2. Delete the storage pool by clicking the Trash can icon. This icon is only enabled if you stop the storage pool first.
14.5.4. Creating an iSCSI-based storage pool with virsh
1. Optional: Secure the storage pool
If desired, set up authentication with the steps in Section 14.5.5, "Securing an iSCSI storage pool".
2. Define the storage pool
Storage pool definitions can be created with the virsh command line tool. Creating storage pools with virsh is useful for system administrators using scripts to create multiple storage pools.
The virsh pool-define-as command has several parameters which are accepted in the following format:
virsh pool-define-as name type source-host source-path source-dev source-name target
The parameters are explained as follows:
type
defines this pool as a particular type, iSCSI for example
name
sets the name for the storage pool; must be unique
source-host and source-path
the hostname and iSCSI IQN respectively
source-dev and source-name
these parameters are not required for iSCSI-based pools; use a - character to leave the field blank.
target
defines the location for mounting the iSCSI device on the host machine
The example below creates an iSCSI-based storage pool with the virsh pool-define-as command:
# virsh pool-define-as --name iscsirhel7pool --type iscsi \
--source-host server1.example.com \
--source-dev iqn.2010-05.com.example.server1:iscsirhel7guest \
--target /dev/disk/by-path
Pool iscsirhel7pool defined
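For orientation, the pool defined above corresponds roughly to the following iSCSI pool XML. The snippet is a sketch of the libvirt format built from the values in the command, not a dump taken from the host:

<pool type='iscsi'>
  <name>iscsirhel7pool</name>
  <source>
    <host name='server1.example.com'/>
    <device path='iqn.2010-05.com.example.server1:iscsirhel7guest'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>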
3. Verify the storage pool is listed
Verify the storage pool object is created correctly and the state is inactive.
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
iscsirhel7pool       inactive   no
4. Optional: Establish a direct connection to the iSCSI storage pool
This step is optional, but it allows you to establish a direct connection to the iSCSI storage pool. By default this is enabled, but if the connection is to the host machine (and not direct to the network) you can change it back by editing the domain XML for the virtual machine to reflect this example:
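A minimal sketch of such a disk element, assuming the iscsirhel7pool pool defined above; the volume name, target device, and bus shown here are illustrative placeholders, and the mode attribute can be set to 'host' instead of 'direct' to route the connection through the host machine:

<disk type='volume' device='disk'>
  <driver name='qemu' type='raw'/>
  <source pool='iscsirhel7pool' volume='unit:0:0:1' mode='direct'/>
  <target dev='vda' bus='virtio'/>
</disk>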
Figure 14.24. Disk type element XML example
Note
The same iSCSI storage pool can be used for a LUN or a disk, by specifying the disk device as either a disk or lun. See Section 15.5.3, "Adding SCSI LUN-based storage to a guest" for XML configuration examples for adding SCSI LUN-based storage to a guest.
Additionally, the source mode can be specified as mode='host' for a connection to the host machine.
If you have configured authentication on the iSCSI server as detailed in Step 9, then the following XML, used as an <auth> sub-element, will provide the authentication credentials for the disk. Section 14.5.5, "Securing an iSCSI storage pool" describes how to configure the libvirt secret.
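A minimal sketch of such an <auth> sub-element, assuming the CHAP username redhat and the secret usage name iscsirhel7secret configured in Section 14.5.5:

<auth username='redhat'>
  <secret type='iscsi' usage='iscsirhel7secret'/>
</auth>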
5. Start the storage pool
Use the virsh pool-start command to enable the storage pool. This allows the storage pool to be used for volumes and guest virtual machines.
# virsh pool-start iscsirhel7pool
Pool iscsirhel7pool started
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
iscsirhel7pool       active     no
6. Turn on autostart
Turn on autostart for the storage pool. Autostart configures the libvirtd service to start the storage pool when the service starts.
# virsh pool-autostart iscsirhel7pool
Pool iscsirhel7pool marked as autostarted
Verify that the iscsirhel7pool pool has autostart enabled:
# virsh pool-list --all
Name                 State      Autostart
-----------------------------------------
default              active     yes
iscsirhel7pool       active     yes
7. Verify the storage pool configuration
Verify the storage pool was created correctly, the sizes are reported correctly, and the state is reported as running.
# virsh pool-info iscsirhel7pool
Name:           iscsirhel7pool
UUID:           afcc5367-6770-e151-bcb3-847bc36c5e28
State:          running
Persistent:     unknown
Autostart:      yes
Capacity:       100.31 GB
Allocation:     0.00
Available:      100.31 GB
An iSCSI-based storage pool called iscsirhel7pool is now available.
14.5.5. Securing an iSCSI storage pool
Username and password parameters can be configured with virsh to secure an iSCSI storage pool. This can be configured before or after the pool is defined, but the pool must be started for the authentication settings to take effect.
Procedure 14.7. Configuring authentication for a storage pool with virsh
1. Create a libvirt secret file
Create a libvirt secret XML file called secret.xml, using the following example:
# cat secret.xml
<secret ephemeral='no' private='yes'>
  <description>Passphrase for the iSCSI example.com server</description>
  <usage type='iscsi'>
    <target>iscsirhel7secret</target>
  </usage>
</secret>
2. Define the secret file
Define the secret.xml file with virsh:
# virsh secret-define secret.xml
3. Verify the secret file's UUID
Verify the UUID in secret.xml:
# virsh secret-list
UUID                                  Usage
--------------------------------------------------------------
2d7891af-20be-4e5e-af83-190e8a922360  iscsi iscsirhel7secret
4. Assign a secret to the UUID
Assign a secret to that UUID, using the following command syntax as an example:
# MYSECRET=`printf %s "password123" | base64`
# virsh secret-set-value 2d7891af-20be-4e5e-af83-190e8a922360 $MYSECRET
This ensures the CHAP username and password are set in a libvirt-controlled secret list.
5. Add an authentication entry to the storage pool
Modify the <source> entry in the storage pool's XML file using virsh edit, and add an <auth> element, specifying the authentication type, username, and secret usage.
The following shows an example of a storage pool XML definition with authentication configured:
# cat iscsirhel7pool.xml
<pool type='iscsi'>
  <name>iscsirhel7pool</name>
  <source>
    <host name='server1.example.com'/>
    <device path='iqn.2010-05.com.example.server1:iscsirhel7guest'/>
    <auth type='chap' username='redhat'>
      <secret type='iscsi' usage='iscsirhel7secret'/>
    </auth>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>
Note
The <auth> sub-element exists in different locations within the guest XML's <pool> and <disk> elements. For a <pool>, <auth> is specified within the <source> element, as this describes where to find the pool sources, since authentication is a property of some pool sources (iSCSI and RBD). For a <disk>, which is a sub-element of a domain, the authentication to the iSCSI or RBD disk is a property of the disk. See Section 14.5.4, "Creating an iSCSI-based storage pool with virsh" for an example of <disk> configured in the guest XML.
6. Activate the changes in the storage pool
The storage pool must be started to activate these changes.
If the storage pool has not yet been started, follow the steps in Section 14.5.4, “ Creating an
iSCSI-based storage pool with virsh” to define and start the storage pool.
If the pool has already been started, run the following commands to stop and restart the
storage pool:
# virsh pool-destroy iscsirhel7pool
# virsh pool-start iscsirhel7pool
14.5.6. Deleting a storage pool using virsh
The following demonstrates how to delete a storage pool using virsh:
1. To avoid any issues with other guest virtual machines using the same pool, it is best to stop
the storage pool and release any resources in use by it.
# virsh pool-destroy iscsirhel7pool
2. Remove the storage pool's definition
# virsh pool-undefine iscsirhel7pool
14.6. NFS-based storage pools
This procedure covers creating a storage pool with an NFS mount point in virt-manager.
14.6.1. Creating an NFS-based storage pool with virt-manager
1. Open the host physical machine's storage tab
Open the Storage tab in the Host Details window.
a. Open virt-manager.
b. Select a host physical machine from the main virt-manager window. Click the Edit menu and select Connection Details.
Figure 14.25. Connection details
c. Click on the Storage tab.
Figure 14.26. Storage tab
2. Create a new pool (part 1)
Press the + button (the add pool button). The Add a New Storage Pool wizard appears.
Figure 14.27. Add an NFS name and type
Choose a name for the storage pool and press Forward to continue.
3. Create a new pool (part 2)
Enter the target path for the device, the hostname and the NFS share path. Set the Format option to NFS or auto (to detect the type). The target path must be identical on all host physical machines for migration.
Enter the hostname or IP address of the NFS server. This example uses server1.example.com.
Enter the NFS path. This example uses /nfstrial.
Figure 14.28. Create an NFS storage pool
Press Finish to create the new storage pool.
14.6.2. Deleting a storage pool using virt-manager
This procedure demonstrates how to delete a storage pool.
1. To avoid any issues with other guests using the same pool, it is best to stop the storage pool and release any resources in use by it. To do this, select the storage pool you want to stop and click the red X icon at the bottom of the Storage window.
Figure 14.29. Stop Icon
2. Delete the storage pool by clicking the Trash can icon. This icon is only enabled if you stop the storage pool first.
14.7. Using an NPIV virtual adapter (vHBA) with SCSI devices
NPIV (N_Port ID Virtualization) is a software technology that allows sharing of a single physical Fibre Channel host bus adapter (HBA).
This allows multiple guests to see the same storage from multiple physical hosts, and thus allows for easier migration paths for the storage. As a result, there is no need for the migration to create or copy storage, as long as the correct storage path is specified.
In virtualization, the virtual host bus adapter, or vHBA, controls the LUNs for virtual machines. For a host to share one Fibre Channel device path between multiple KVM guests, a vHBA must be created for each virtual machine. A single vHBA must not be used by multiple KVM guests.
Each vHBA is identified by its own WWNN (World Wide Node Name) and WWPN (World Wide Port Name). The path to the storage is determined by the WWNN and WWPN values.
This section provides instructions for configuring a vHBA persistently on a virtual machine.
Note
Before creating a vHBA, it is recommended to configure storage array (SAN)-side zoning in the host LUN to provide isolation between guests and prevent the possibility of data corruption.
14.7.1. Creating a vHBA
Procedure 14.8. Creating a vHBA
1. Locate HBAs on the host system
To locate the HBAs on your host system, use the virsh nodedev-list --cap vports command.
For example, the following output shows a host that has two HBAs that support vHBA:
# virsh nodedev-list --cap vports
scsi_host3
scsi_host4
2. Check the HBA's details
Use the virsh nodedev-dumpxml HBA_device command to see the HBA's details.
The XML output from the virsh nodedev-dumpxml command will list the fields <name>, <wwnn>, and <wwpn>, which are used to create a vHBA. The <max_vports> value shows the maximum number of supported vHBAs.
# virsh nodedev-dumpxml scsi_host3
<device>
  <name>scsi_host3</name>
  <path>/sys/devices/pci0000:00/0000:00:04.0/0000:10:00.0/host3</path>
  <parent>pci_0000_10_00_0</parent>
  <capability type='scsi_host'>
    <host>3</host>
    <unique_id>0</unique_id>
    <capability type='fc_host'>
      <wwnn>20000000c9848140</wwnn>
      <wwpn>10000000c9848140</wwpn>
      <fabric_wwn>2002000573de9a81</fabric_wwn>
    </capability>
    <capability type='vport_ops'>
      <max_vports>127</max_vports>
      <vports>0</vports>
    </capability>
  </capability>
</device>
In this example, the <max_vports> value shows there are a total of 127 virtual ports available for use in the HBA configuration. The <vports> value shows the number of virtual ports currently being used. These values update after creating a vHBA.
3. Create a vHBA host device
Create an XML file similar to the following (in this example, named vhba_host3.xml) for the
vHBA host.
# cat vhba_host3.xml
<device>
  <parent>scsi_host3</parent>
  <capability type='scsi_host'>
    <capability type='fc_host'>
    </capability>
  </capability>
</device>
The <parent> field specifies the HBA device to associate with this vHBA device. The details in the <device> tag are used in the next step to create a new vHBA device for the host. See http://libvirt.org/formatnode.html for more information on the nodedev XML format.
4. Create a new vHBA on the vHBA host device
To create a vHBA on vhba_host3, use the virsh nodedev-create command:
# virsh nodedev-create vhba_host3.xml
Node device scsi_host5 created from vhba_host3.xml
5. Verify the vHBA
Verify the new vHBA's details (scsi_host5) with the virsh nodedev-dumpxml command:
# virsh nodedev-dumpxml scsi_host5
<device>
  <name>scsi_host5</name>
  <path>/sys/devices/pci0000:00/0000:00:04.0/0000:10:00.0/host3/vport-3:0-0/host5</path>
  <parent>scsi_host3</parent>
  <capability type='scsi_host'>
    <host>5</host>
    <capability type='fc_host'>
      <wwnn>5001a4a93526d0a1</wwnn>
      <wwpn>5001a4ace3ee047d</wwpn>
      <fabric_wwn>2002000573de9a81</fabric_wwn>
    </capability>
  </capability>
</device>
14.7.2. Creating a storage pool using the vHBA
It is recommended to define a libvirt storage pool based on the vHBA in order to preserve the vHBA
configuration.
Using a storage pool has two primary advantages:
the libvirt code can easily find the LUN's path via virsh command output, and
virtual machine migration requires only defining and starting a storage pool with the same vHBA
name on the target machine. To do this, the vHBA LUN, libvirt storage pool and volume name must
be specified in the virtual machine's XML configuration. Refer to Section 14.7.3, “ Configuring the
virtual machine to use a vHBA LUN” for an example.
1. Create a SCSI storage pool
To create a persistent vHBA configuration, first create a libvirt 'scsi' storage pool XML file using the format below. It is recommended to use a stable location for the <path> value, such as one of the /dev/disk/by-{path|id|uuid|label} locations on your system. More information on <pool> and the elements within can be found at http://libvirt.org/formatstorage.html.
In this example, the 'scsi' storage pool is named vhbapool_host3.xml:
<pool type='scsi'>
  <name>vhbapool_host3</name>
  <source>
    <adapter type='fc_host' wwnn='5001a4a93526d0a1' wwpn='5001a4ace3ee047d'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0700</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</pool>
Important
The pool must be type='scsi' and the source adapter type must be 'fc_host'. For a persistent configuration across host reboots, the wwnn and wwpn attributes must be the values assigned to the vHBA (scsi_host5 in this example) by libvirt.
Optionally, the 'parent' attribute can be used in the <adapter> field to identify the parent scsi_host device as the vHBA. Note, the value is not the scsi_host of the vHBA created by virsh nodedev-create, but it is the parent of that vHBA.
Providing the 'parent' attribute is also useful for duplicate pool definition checks. This is more important in environments where both the 'fc_host' and 'scsi_host' source adapter pools are being used, to ensure a new definition does not duplicate using the same scsi_host of another existing storage pool.
The following example shows the optional 'parent' attribute used in the <adapter> field in a storage pool configuration:
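A sketch of such an <adapter> line, reusing the parent HBA and the vHBA WWNs from this example; the values shown are the ones used earlier in this section and should be replaced with your own:

<adapter type='fc_host' parent='scsi_host3' wwnn='5001a4a93526d0a1' wwpn='5001a4ace3ee047d'/>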
2. Define the pool
To define the storage pool (named vhbapool_host3 in this example) persistently, use the virsh pool-define command:
# virsh pool-define vhbapool_host3.xml
Pool vhbapool_host3 defined from vhbapool_host3.xml
3. Start the pool
Start the storage pool with the following command:
# virsh pool-start vhbapool_host3
Pool vhbapool_host3 started
Note
When starting the pool, libvirt will check if a vHBA with the same wwpn:wwnn already exists. If it does not yet exist, a new vHBA with the provided wwpn:wwnn will be created and the configuration will not be persistent. Correspondingly, when destroying the pool, libvirt will destroy the vHBA using the same wwpn:wwnn values as well.
4. Enable autostart
Finally, to ensure that subsequent host reboots will automatically define vHBAs for use in virtual machines, set the storage pool autostart feature (in this example, for a pool named vhbapool_host3):
# virsh pool-autostart vhbapool_host3
14.7.3. Configuring the virtual machine to use a vHBA LUN
After a storage pool is created for a vHBA, add the vHBA LUN to the virtual machine configuration.
1. Find available LUNs
First, use the virsh vol-list command in order to generate a list of available LUNs on the vHBA. For example:
# virsh vol-list vhbapool_host3
Name          Path
------------------------------------------------------------------------------
unit:0:4:0    /dev/disk/by-path/pci-0000:10:00.0-fc-0x5006016844602198-lun-0
unit:0:5:0    /dev/disk/by-path/pci-0000:10:00.0-fc-0x5006016044602198-lun-0
The list of LUN names displayed will be available for use as disk volumes in virtual machine configurations.
2. Add the vHBA LUN to the virtual machine
Add the vHBA LUN to the virtual machine by creating a disk volume on the virtual machine in the virtual machine's XML. Specify the storage pool and the volume in the <source> parameter, using the following as an example:
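A minimal sketch of such a disk element, assuming the vhbapool_host3 pool and the first LUN listed above; the target device and bus are illustrative placeholders:

<disk type='volume' device='disk'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host3' volume='unit:0:4:0'/>
  <target dev='hda' bus='ide'/>
</disk>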
To specify a lun device instead of a disk, refer to the following example:
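A corresponding sketch for a pass-through LUN, again assuming the same pool and volume; the sgio attribute and target values are illustrative:

<disk type='volume' device='lun' sgio='unfiltered'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host3' volume='unit:0:4:0'/>
  <target dev='sda' bus='scsi'/>
</disk>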
See Section 15.5.3, "Adding SCSI LUN-based storage to a guest" for XML configuration examples for adding SCSI LUN-based storage to a guest.
14.7.4. Destroying the vHBA storage pool
A vHBA created by the storage pool can be destroyed by the virsh pool-destroy command:
# virsh pool-destroy vhbapool_host3
Note that executing the virsh pool-destroy command will also remove the vHBA that was created in Section 14.7.1, "Creating a vHBA".
To verify the pool and vHBA have been destroyed, run:
# virsh nodedev-list --cap scsi_host
scsi_host5 will no longer appear in the list of results.
Chapter 15. Storage Volumes
15.1. Introduction
Storage pools are divided into storage volumes. Storage volumes are an abstraction of physical partitions, LVM logical volumes, file-based disk images and other storage types handled by libvirt. Storage volumes are presented to guest virtual machines as local storage devices regardless of the underlying hardware. Note that the sections below do not contain all of the possible commands and arguments that virsh allows; for more information, refer to Section 23.15, "Storage Volume Commands".
15.1.1. Referencing volumes
For additional parameters and arguments, refer to Section 23.15.4, "Listing volume information".
To reference a specific volume, three approaches are possible:
The name of the volume and the storage pool
A volume may be referred to by name, along with an identifier for the storage pool it belongs in. On the virsh command line, this takes the form --pool storage_pool volume_name.
For example, a volume named firstimage in the guest_images pool.
# virsh vol-info --pool guest_images firstimage
Name:           firstimage
Type:           block
Capacity:       20.00 GB
Allocation:     20.00 GB
The full path to the storage on the host physical machine system
A volume may also be referred to by its full path on the file system. When using this
approach, a pool identifier does not need to be included.
For example, a volume named secondimage.img, visible to the host physical machine system
as /images/secondimage.img. The image can be referred to as /images/secondimage.img.
# virsh vol-info /images/secondimage.img
Name:           secondimage.img
Type:           file
Capacity:       20.00 GB
Allocation:     136.00 kB
The unique volume key
When a volume is first created in the virtualization system, a unique identifier is generated
and assigned to it. The unique identifier is termed the volume key. The format of this volume
key varies depending on the storage used.
When used with block based storage such as LVM, the volume key may follow this format:
c3pKz4-qPVc-Xf7M-7WNM-WJc8-qSiz-mtvpGn
When used with file based storage, the volume key may instead be a copy of the full path to
the volume storage.
/images/secondimage.img
For example, a volume with the volume key of Wlvnf7-a4a3-Tlje-lJDa-9eak-PZBv-LoZuUr:
# virsh vol-info Wlvnf7-a4a3-Tlje-lJDa-9eak-PZBv-LoZuUr
Name:           firstimage
Type:           block
Capacity:       20.00 GB
Allocation:     20.00 GB
virsh provides commands for converting between a volume name, volume path, or volume key:
vol-name
Returns the volume name when provided with a volume path or volume key.
# virsh vol-name /dev/guest_images/firstimage
firstimage
# virsh vol-name Wlvnf7-a4a3-Tlje-lJDa-9eak-PZBv-LoZuUr
firstimage
vol-path
Returns the volume path when provided with a volume key, or a storage pool identifier and
volume name.
# virsh vol-path Wlvnf7-a4a3-Tlje-lJDa-9eak-PZBv-LoZuUr
/dev/guest_images/firstimage
# virsh vol-path --pool guest_images firstimage
/dev/guest_images/firstimage
vol-key
Returns the volume key when provided with a volume path, or a storage pool identifier and
volume name.
# virsh vol-key /dev/guest_images/firstimage
Wlvnf7-a4a3-Tlje-lJDa-9eak-PZBv-LoZuUr
# virsh vol-key --pool guest_images firstimage
Wlvnf7-a4a3-Tlje-lJDa-9eak-PZBv-LoZuUr
For more information refer to Section 23.15.4, “ Listing volume information” .
15.2. Creating volumes
This section shows how to create disk volumes inside a block based storage pool. In the example below, the virsh vol-create-as command will create a storage volume with a specific size in GB within the guest_images_disk storage pool. The command is repeated once per volume needed, so three volumes are created, as shown in the example. For additional parameters and arguments, refer to Section 23.15.1, “Creating storage volumes”.
# virsh vol-create-as guest_images_disk volume1 8G
Vol volume1 created
# virsh vol-create-as guest_images_disk volume2 8G
Vol volume2 created
# virsh vol-create-as guest_images_disk volume3 8G
Vol volume3 created
# virsh vol-list guest_images_disk
 Name                 Path
-----------------------------------------
 volume1              /dev/sdb1
 volume2              /dev/sdb2
 volume3              /dev/sdb3
# parted -s /dev/sdb print
Model: ATA ST3500418AS (scsi)
Disk /dev/sdb: 500GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 2      17.4kB  8590MB  8590MB               primary
 3      8590MB  17.2GB  8590MB               primary
 1      21.5GB  30.1GB  8590MB               primary
15.3. Cloning volumes
The new volume will be allocated from storage in the same storage pool as the volume being cloned. The virsh vol-clone command must include the --pool argument, which specifies the name of the storage pool that contains the volume to be cloned. The rest of the command names the volume to be cloned (volume3) and the name of the new volume (clone1). The virsh vol-list command lists the volumes that are present in the storage pool (guest_images_disk). For additional commands and arguments, refer to Section 23.15.1.2, “Cloning a storage volume”.
# virsh vol-clone --pool guest_images_disk volume3 clone1
Vol clone1 cloned from volume3
# virsh vol-list guest_images_disk
 Name                 Path
-----------------------------------------
 volume1              /dev/sdb1
 volume2              /dev/sdb2
 volume3              /dev/sdb3
 clone1               /dev/sdb4
# parted -s /dev/sdb print
Model: ATA ST3500418AS (scsi)
Disk /dev/sdb: 500GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    File system  Name     Flags
 1      4211MB  12.8GB  8595MB               primary
 2      12.8GB  21.4GB  8595MB               primary
 3      21.4GB  30.0GB  8595MB               primary
 4      30.0GB  38.6GB  8595MB               primary
15.4. Deleting and removing volumes
For the virsh commands needed to delete and remove volumes, refer to Section 23.15.2, “Deleting storage volumes”.
15.5. Adding storage devices to guests
This section covers adding storage devices to a guest. Additional storage can be added as needed. The following types of storage are discussed in this section:
File based storage. Refer to Section 15.5.1, “Adding file based storage to a guest”.
Block devices - including CD-ROM, DVD and floppy devices. Refer to Section 15.5.2, “Adding hard drives and other block devices to a guest”.
SCSI controllers and devices. If your host physical machine can accommodate it, up to 100 SCSI controllers can be added to any guest virtual machine. Refer to Section 15.5.4, “Managing storage controllers in a guest virtual machine”.
15.5.1. Adding file based storage to a guest
File-based storage is a collection of files that are stored on the host physical machine's file system and that act as virtualized hard drives for guests. To add file-based storage, perform the following steps:
Procedure 15.1. Adding file-based storage
1. Create a storage file or use an existing file (such as an IMG file). Note that both of the
following commands create a 4GB file which can be used as additional storage for a guest:
Pre-allocated files are recommended for file-based storage images. Create a pre-allocated file using the following dd command as shown:
# dd if=/dev/zero of=/var/lib/libvirt/images/FileName.img bs=1G count=4
Alternatively, create a sparse file instead of a pre-allocated file. Sparse files are created
much faster and can be used for testing, but are not recommended for production
environments due to data integrity and performance issues.
# dd if=/dev/zero of=/var/lib/libvirt/images/FileName.img bs=1G seek=4096 count=4
2. Create the additional storage by writing a <disk> element in a new file. In this example, this file will be known as NewStorage.xml.
A <disk> element describes the source of the disk, and a device name for the virtual block device. The device name should be unique across all devices in the guest, and identifies the bus on which the guest will find the virtual block device. The following example defines a virtio block device whose source is a file-based storage container named FileName.img:
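A minimal sketch of such a definition is shown below; the driver settings are illustrative assumptions, and the target name vdb matches the device name used later in this procedure:
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/var/lib/libvirt/images/FileName.img'/>
  <target dev='vdb' bus='virtio'/>
</disk>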
Device names can also start with "hd" or "sd", identifying respectively an IDE and a SCSI disk. The configuration file can also contain an <address> sub-element that specifies the position on the bus for the new device. In the case of virtio block devices, this should be a PCI address. Omitting the <address> sub-element lets libvirt locate and assign the next available PCI slot.
3. Attach the CD-ROM as follows:
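A sketch of a CD-ROM entry is shown below; the ISO image path and the target name hdc are illustrative assumptions:
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/boot.iso'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
</disk>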
4. Add the device defined in NewStorage.xml to your guest (Guest1):
# virsh attach-device --config Guest1 ~/NewStorage.xml
Note
This change will only apply after the guest has been destroyed and restarted. In addition, persistent devices can only be added to a persistent domain, that is, a domain whose configuration has been saved with the virsh define command.
If the guest is running, and you want the new device to be added temporarily until the guest is
destroyed, omit the --config option:
# virsh attach-device Guest1 ~/NewStorage.xml
Note
The virsh command allows for an attach-disk command that can set a limited number of parameters with a simpler syntax and without the need to create an XML file. The attach-disk command is used in a similar manner to the attach-device command mentioned previously, as shown:
# virsh attach-disk Guest1 /var/lib/libvirt/images/FileName.img vdb --cache none
Note that the virsh attach-disk command also accepts the --config option.
5. Start the guest machine (if it is currently not running):
# virsh start Guest1
Note
The following steps are Linux guest specific. Other operating systems handle new
storage devices in different ways. For other systems, refer to that operating system's
documentation.
6. Partitioning the disk drive
The guest now has a hard disk device called /dev/vdb. If required, partition this disk drive
and format the partitions. If you do not see the device that you added, then it indicates that
there is an issue with the disk hotplug in your guest's operating system.
a. Start fdisk for the new device:
# fdisk /dev/vdb
Command (m for help):
b. Type n for a new partition.
c. The following appears:
Command action
   e   extended
   p   primary partition (1-4)
Type p for a primary partition.
d. Choose an available partition number. In this example, the first partition is chosen by
entering 1.
Partition number (1-4): 1
e. Enter the default first cylinder by pressing Enter.
First cylinder (1-400, default 1):
f. Select the size of the partition. In this example the entire disk is allocated by pressing
Enter.
Last cylinder or +size or +sizeM or +sizeK (2-400, default 400):
g. Enter t to configure the partition type.
Command (m for help): t
h. Select the partition you created in the previous steps. In this example, the partition
number is 1 as there was only one partition created and fdisk automatically selected
partition 1.
Partition number (1-4): 1
i. Enter 83 for a Linux partition.
Hex code (type L to list codes): 83
j. Enter w to write changes and quit.
Command (m for help): w
k. Format the new partition with the ext3 file system.
# mke2fs -j /dev/vdb1
7. Create a mount directory, and mount the disk on the guest. In this example, the directory is
located in /myfiles.
# mkdir /myfiles
# mount /dev/vdb1 /myfiles
The guest now has an additional virtualized file-based storage device. Note however, that
this storage will not mount persistently across reboot unless defined in the guest's
/etc/fstab file:
/dev/vdb1    /myfiles    ext3    defaults    0 0
15.5.2. Adding hard drives and other block devices to a guest
System administrators have the option to use additional hard drives to provide increased storage
space for a guest, or to separate system data from user data.
Procedure 15.2. Adding physical block devices to guests
1. This procedure describes how to add a hard drive on the host physical machine to a guest. It
applies to all physical block devices, including CD-ROM, DVD and floppy devices.
Physically attach the hard disk device to the host physical machine. Configure the host
physical machine if the drive is not accessible by default.
2. Do one of the following:
a. Create the additional storage by writing a <disk> element in a new file. In this example, this file will be known as NewStorage.xml. The following example is a configuration file section which contains an additional device-based storage container for the host physical machine partition /dev/sr0:
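A minimal sketch of such a configuration file section is shown below; the target name vdc matches the device name referred to later in this procedure, while the driver settings are illustrative assumptions:
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source dev='/dev/sr0'/>
  <target dev='vdc' bus='virtio'/>
</disk>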
b. Follow the instructions in the previous section to attach the device to the guest virtual
machine. Alternatively, you can use the virsh attach-disk command, as shown:
# virsh attach-disk Guest1 /dev/sr0 vdc
Note that the following options are available:
The virsh attach-disk command also accepts the --config, --type, and --mode options, as shown:
# virsh attach-disk Guest1 /dev/sr0 vdc --config --type cdrom --mode readonly
Additionally, --type also accepts disk in cases where the device is a hard drive.
3. The guest virtual machine now has a new hard disk device called /dev/vdc on Linux (or something similar, depending on what the guest virtual machine OS chooses). You can now
initialize the disk from the guest virtual machine, following the standard procedures for the
guest virtual machine's operating system. Refer to Procedure 15.1, “ Adding file-based
storage” and Step 6 for an example.
Warning
The host physical machine should not use filesystem labels to identify file systems in the fstab file, the initrd file, or on the kernel command line. Doing so presents a security risk if guest virtual machines have write access to whole partitions or LVM volumes, because a guest virtual machine could potentially write a filesystem label belonging to the host physical machine to its own block device storage. Upon reboot of the host physical machine, the host physical machine could then mistakenly use the guest virtual machine's disk as a system disk, which would compromise the host physical machine system.
It is preferable to use the UUID of a device to identify it in the fstab file, the initrd file, or on the kernel command line. While using UUIDs is still not completely secure on certain file systems, a similar compromise with UUIDs is significantly less feasible.
Warning
Guest virtual machines should not be given write access to whole disks or block devices (for example, /dev/sdb). Guest virtual machines with access to whole block devices may be able to modify volume labels, which can be used to compromise the host physical machine system. Use partitions (for example, /dev/sdb1) or LVM volumes to prevent this issue. Refer to https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Logical_Volume_Manager_Administration/LVM_CLI.html, or https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Logical_Volume_Manager_Administration/LVM_examples.html for information on LVM administration and configuration examples. If you are using raw access to partitions, for example, /dev/sdb1, or raw disks such as /dev/sdb, you should configure LVM to only scan disks that are safe, using the global_filter setting. Refer to https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Logical_Volume_Manager_Administration/lvmconf_file.html for an example of an LVM configuration script using the global_filter setting.
15.5.3. Adding SCSI LUN-based storage to a guest
A host SCSI LUN device can be exposed entirely to the guest using three mechanisms, depending on
your host configuration. Exposing the SCSI LUN device in this way allows for SCSI commands to be
executed directly to the LUN on the guest. This is useful as a means to share a LUN between guests,
as well as to share Fibre Channel storage between hosts.
Important
The optional sgio attribute controls whether unprivileged SCSI Generic I/O (SG_IO) commands are filtered for a device='lun' disk. The sgio attribute can be specified as 'filtered' or 'unfiltered', but must be set to 'unfiltered' to allow SG_IO ioctl commands to be passed through on the guest in a persistent reservation.
In addition to setting sgio='unfiltered', the <shareable> element must be set to share a LUN between guests. The sgio attribute defaults to 'filtered' if not specified.
The XML attribute device='lun' is valid for the following guest disk configurations:
type='block', as in the following example:
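A sketch of such a block-type LUN passthrough disk is shown below; the escaped device path is an illustrative assumption, included to show why the backslashes mentioned in the following note are needed:
<disk type='block' device='lun' sgio='unfiltered'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/disk/by-path/pci-0000\:04\:00.1-fc-0x203400a0b85ad1d7-lun-0'/>
  <target dev='sda' bus='scsi'/>
  <shareable/>
</disk>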
Note
The backslashes prior to the colons in the device name are required.
type='network', as in the following example:
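A sketch of a network-type LUN disk that points at an iSCSI target directly is shown below; the IQN and host address are illustrative assumptions:
<disk type='network' device='lun' sgio='unfiltered'>
  <driver name='qemu' type='raw'/>
  <source protocol='iscsi' name='iqn.2013-07.com.example:iscsi-pool/1'>
    <host name='example.com' port='3260'/>
  </source>
  <target dev='sda' bus='scsi'/>
  <shareable/>
</disk>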
type='volume' when using an iSCSI or a NPIV/vHBA source pool as the SCSI source pool.
The following example XML shows a guest using an iSCSI source pool (named iscsi-net-pool) as
the SCSI source pool:
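A sketch of such a volume-type disk is shown below; the volume name unit:0:0:1 is taken from the pool listing further down, and mode='host' is included because the note that follows discusses it:
<disk type='volume' device='lun' sgio='unfiltered'>
  <driver name='qemu' type='raw'/>
  <source pool='iscsi-net-pool' volume='unit:0:0:1' mode='host'/>
  <target dev='sda' bus='scsi'/>
  <shareable/>
</disk>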
Note
The mode= option within the <source> tag is optional, but if used, it must be set to 'host' and not 'direct'. When set to 'host', libvirt will find the path to the device on the local host. When set to 'direct', libvirt will generate the path to the device using the source pool's source host data.
The iSCSI pool (iscsi-net-pool) in the example above will have a similar configuration to the
following:
# virsh pool-dumpxml iscsi-net-pool
<pool type='iscsi'>
  <name>iscsi-net-pool</name>
  <capacity unit='bytes'>11274289152</capacity>
  <allocation unit='bytes'>11274289152</allocation>
  <available unit='bytes'>0</available>
  <source>
    ...
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0755</mode>
    </permissions>
  </target>
</pool>
To verify the details of the available LUNs in the iSCSI source pool, run the following command:
# virsh vol-list iscsi-net-pool
 Name                 Path
------------------------------------------------------------------------------
 unit:0:0:1           /dev/disk/by-path/ip-192.168.122.1:3260-iscsi-iqn.2013-12.com.example:iscsi-chap-netpool-lun-1
 unit:0:0:2           /dev/disk/by-path/ip-192.168.122.1:3260-iscsi-iqn.2013-12.com.example:iscsi-chap-netpool-lun-2
type='volume' when using a NPIV/vHBA source pool as the SCSI source pool.
The following example XML shows a guest using a NPIV/vHBA source pool (named
vhbapool_host3) as the SCSI source pool:
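A sketch of the corresponding volume-type disk is shown below; the volume name unit:0:0:0 is taken from the pool listing further down:
<disk type='volume' device='lun' sgio='unfiltered'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host3' volume='unit:0:0:0'/>
  <target dev='sda' bus='scsi'/>
  <shareable/>
</disk>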
The NPIV/vHBA pool (vhbapool_host3) in the example above will have a similar configuration to:
# virsh pool-dumpxml vhbapool_host3
<pool type='scsi'>
  <name>vhbapool_host3</name>
  <capacity unit='bytes'>0</capacity>
  <allocation unit='bytes'>0</allocation>
  <available unit='bytes'>0</available>
  <source>
    ...
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0700</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</pool>
To verify the details of the available LUNs on the vHBA, run the following command:
# virsh vol-list vhbapool_host3
 Name                 Path
------------------------------------------------------------------------------
 unit:0:0:0           /dev/disk/by-path/pci-0000:10:00.0-fc-0x5006016044602198-lun-0
 unit:0:1:0           /dev/disk/by-path/pci-0000:10:00.0-fc-0x5006016844602198-lun-0
For more information on using a NPIV vHBA with SCSI devices, see Section 14.7.3, “ Configuring
the virtual machine to use a vHBA LUN” .
The following procedure shows an example of adding a SCSI LUN-based storage device to a guest.
Any of the above guest disk configurations can be attached with this
method. Substitute configurations according to your environment.
Procedure 15.3. Attaching SCSI LUN-based storage to a guest
1. Create the device file by writing a <disk> element in a new file, and save this file with an XML extension (in this example, sda.xml):
# cat sda.xml
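Its contents might resemble the following sketch, here reusing the vhbapool_host3 volume from the earlier examples; the pool and volume names are assumptions to be replaced with values from your environment:
<disk type='volume' device='lun' sgio='unfiltered'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host3' volume='unit:0:1:0'/>
  <target dev='sda' bus='scsi'/>
  <shareable/>
</disk>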
2. Associate the device created in sda.xml with your guest virtual machine (Guest1, for example):
# virsh attach-device --config Guest1 ~/sda.xml
Note
Running the virsh attach-device command with the --config option requires a guest reboot to add the device permanently to the guest. Alternatively, the --persistent option can be used instead of --config, which can also be used to hotplug the device to a guest.
Alternatively, the SCSI LUN-based storage can be attached or configured on the guest using virt-manager. To configure this using virt-manager, click the Add Hardware button and add a virtual disk with the desired parameters, or change the settings of an existing SCSI LUN device from this window. In Red Hat Enterprise Linux 7.2, the SGIO value can also be configured in virt-manager:
Figure 15.1. Configuring SCSI LUN storage with virt-manager
15.5.4. Managing storage controllers in a guest virtual machine
Unlike virtio disks, SCSI devices require the presence of a controller in the guest virtual machine.
This section details the necessary steps to create a virtual SCSI controller (also known as "Host Bus Adapter", or HBA), and to add SCSI storage to the guest virtual machine.
Procedure 15.4. Creating a virtual SCSI controller
1. Display the configuration of the guest virtual machine (Guest1) and look for a pre-existing
SCSI controller:
# virsh dumpxml Guest1 | grep controller.*scsi
If a device controller is present, the command will output one or more lines similar to the
following:
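For example, output resembling the following sketch would indicate an existing virtio-scsi controller (the index value shown is an assumption):
<controller type='scsi' index='0' model='virtio-scsi'>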
2. If the previous step did not show a device controller, create the description for one in a new file and add it to the virtual machine, using the following steps:
a. Create the device controller by writing a <controller> element in a new file and save this file with an XML extension. virtio-scsi-controller.xml, for example.
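A minimal sketch of such a controller description:
<controller type='scsi' model='virtio-scsi'/>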
b. Associate the device controller you just created in virtio-scsi-controller.xml with your guest virtual machine (Guest1, for example):
# virsh attach-device --config Guest1 ~/virtio-scsi-controller.xml
In this example the --config option behaves the same as it does for disks. Refer to
Procedure 15.2, “ Adding physical block devices to guests” for more information.
3. Add a new SCSI disk or CD-ROM. The new disk can be added using the methods in sections
Section 15.5.1, “ Adding file based storage to a guest” and Section 15.5.2, “ Adding hard
drives and other block devices to a guest” . In order to create a SCSI disk, specify a target
device name that starts with sd. The supported limit for each controller is 1024 virtio-scsi
disks, but it is possible that other available resources in the host (such as file descriptors) are
exhausted with fewer disks.
For more information, refer to the following Red Hat Enterprise Linux 6 whitepaper: The next-generation storage interface for the Red Hat Enterprise Linux Kernel Virtual Machine: virtio-scsi.
# virsh attach-disk Guest1 /var/lib/libvirt/images/FileName.img sdb --cache none
Depending on the version of the driver in the guest virtual machine, the new disk may not be
detected immediately by a running guest virtual machine. Follow the steps in the Red Hat
Enterprise Linux Storage Administration Guide.
Chapter 16. Using qemu-img
The qemu-img command line tool is used for formatting, modifying, and verifying the various disk image formats used by KVM. qemu-img options and usages are highlighted in the sections that follow.
16.1. Checking the disk image
To perform a consistency check on a disk image with the file name imgname, use the following command:
# qemu-img check [-f format] imgname
Note
Only the qcow2, qcow2 version 3, and vdi formats support consistency checks.
16.2. Committing changes to an image
Commit any changes recorded in the specified image file (imgname) to the file's base image with the
q emu-i mg co mmi t command. Optionally, specify the file's format type (fmt).
# qemu-img commit [-f qcow2] [-t cache] imgname
16.3. Converting an existing image to another format
The convert option is used to convert one recognized image format to another image format. Refer
to Section 16.9, “ Supported qemu-img formats” for a list of accepted formats.
# qemu-img convert [-c] [-p] [-f fmt] [-t cache] [-O output_fmt] [-o
options] [-S sparse_size] filename output_filename
The -p parameter shows the progress of the command (optional and not available for every command), and the -S flag allows for the creation of a sparse file within the disk image. Sparse files function, for all purposes, like standard files, except that physical blocks containing only zeros are not actually allocated. When the operating system sees such a file, it treats it as if it exists and takes up actual disk space, even though in reality it does not take any. This is particularly helpful when creating a disk for a guest virtual machine, as it gives the appearance that the disk has taken much more disk space than it has. For example, if you set -S to 50Gb on a disk image that is 10Gb, then your 10Gb of disk space will appear to be 60Gb in size even though only 10Gb is actually being used.
Convert the disk image filename to disk image output_filename using format output_format.
The disk image can be optionally compressed with the -c option, or encrypted with the -o option by
setting -o encryption. Note that the options available with the -o parameter differ with the selected
format.
Only the qcow and qcow2 formats support encryption or compression. qcow2 encryption uses the AES format with secure 128-bit keys. qcow2 compression is read-only, so if a compressed sector is converted from qcow2 format, it is written to the new format as uncompressed data.
Image conversion is also useful to get a smaller image when using a format which can grow, such as
qcow or cow. The empty sectors are detected and suppressed from the destination image.
16.4. Creating and formatting new images or devices
Create the new disk image filename of size size and format format.
# qemu-img create [-f format] [-o options] filename [size]
If a base image is specified with -o backing_file=filename, the image will only record differences between itself and the base image. The backing file will not be modified unless you use the commit command. No size needs to be specified in this case.
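For example, a base image and an overlay that records only the differences from it could be created as follows; the file names are illustrative:
# qemu-img create -f qcow2 /var/lib/libvirt/images/base.qcow2 10G
# qemu-img create -f qcow2 -o backing_file=/var/lib/libvirt/images/base.qcow2 /var/lib/libvirt/images/overlay.qcow2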
16.5. Displaying image information
The info parameter displays information about a disk image filename. The format for the info option is as follows:
# qemu-img info [-f format] filename
This command is often used to discover the size reserved on disk, which can be different from the displayed size. If snapshots are stored in the disk image, they are also displayed. This command will show, for example, how much space is being taken by a qcow2 image on a block device. This is done by running qemu-img info. You can check that the image in use is the one that matches the output of the qemu-img info command with the qemu-img check command.
# qemu-img info /dev/vg-90.100-sluo/lv-90-100-sluo
image: /dev/vg-90.100-sluo/lv-90-100-sluo
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 0
cluster_size: 65536
16.6. Re-basing a backing file of an image
The qemu-img rebase command changes the backing file of an image.
# qemu-img rebase [-f fmt] [-t cache] [-p] [-u] -b backing_file [-F
backing_fmt] filename
The backing file is changed to backing_file and (if the format of filename supports the feature), the
backing file format is changed to backing_format.
Note
Only the qcow2 format supports changing the backing file (rebase).
There are two different modes in which rebase can operate: safe and unsafe.
safe mode is used by default and performs a real rebase operation. The new backing file may differ
from the old one and the qemu-img rebase command will take care of keeping the guest virtual
machine-visible content of filename unchanged. In order to achieve this, any clusters that differ
between backing_file and old backing file of filename are merged into filename before making any
changes to the backing file.
Note that safe mode is an expensive operation, comparable to converting an image. The old
backing file is required for it to complete successfully.
unsafe mode is used if the -u option is passed to qemu-img rebase. In this mode, only the
backing file name and format of filename is changed, without any checks taking place on the file
contents. Make sure the new backing file is specified correctly or the guest-visible content of the
image will be corrupted.
This mode is useful for renaming or moving the backing file. It can be used without an accessible old
backing file. For instance, it can be used to fix an image whose backing file has already been moved
or renamed.
16.7. Re-sizing the disk image
Change the disk image filename as if it had been created with size size. Only images in raw format can
be re-sized in both directions, whereas qcow2 version 2 or qcow2 version 3 images can be grown
but cannot be shrunk.
Use the following to set the size of the disk image filename to size bytes:
# qemu-img resize filename size
You can also re-size relative to the current size of the disk image. To give a size relative to the current
size, prefix the number of bytes with + to grow, or - to reduce the size of the disk image by that
number of bytes. Adding a unit suffix allows you to set the image size in kilobytes (K), megabytes (M),
gigabytes (G) or terabytes (T).
# qemu-img resize filename [+|-]size[K|M|G|T]
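For example, the following grows an image by 10 gigabytes; the file name is illustrative:
# qemu-img resize /var/lib/libvirt/images/guest1.qcow2 +10G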
Warning
Before using this command to shrink a disk image, you must use file system and partitioning
tools inside the VM itself to reduce allocated file systems and partition sizes accordingly.
Failure to do so will result in data loss.
After using this command to grow a disk image, you must use file system and partitioning tools
inside the VM to actually begin using the new space on the device.
16.8. Listing, creating, applying, and deleting a snapshot
Using different parameters of the qemu-img snapshot command, you can list, apply, create, or delete an existing snapshot (snapshot) of a specified image (filename).
# qemu-img snapshot [ -l | -a snapshot | -c snapshot | -d snapshot ]
filename
The accepted arguments are as follows:
-l lists all snapshots associated with the specified disk image.
The apply option, -a, reverts the disk image (filename) to the state of a previously saved snapshot.
-c creates a snapshot (snapshot) of an image (filename).
-d deletes the specified snapshot.
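For example, a snapshot could be created and the snapshot list then displayed as follows; the snapshot and file names are illustrative:
# qemu-img snapshot -c clean-install /var/lib/libvirt/images/guest1.qcow2
# qemu-img snapshot -l /var/lib/libvirt/images/guest1.qcow2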
16.9. Supported qemu-img formats
When a format is specified in any of the qemu-img commands, the following format types may be used:
raw - Raw disk image format (default). This can be the fastest file-based format. If your file system supports holes (for example, in ext2 or ext3), then only the written sectors will reserve space. Use qemu-img info to obtain the real size used by the image, or ls -ls on Unix/Linux. Although raw images give optimal performance, only very basic features are available with a raw image (no snapshots, and so on).
qcow2 - QEMU image format, the most versatile format with the best feature set. Use it to have optional AES encryption, zlib-based compression, support of multiple VM snapshots, and smaller images, which are useful on file systems that do not support holes. Note that this expansive feature set comes at the cost of performance.
Although only the formats above can be used to run on a guest virtual machine or host physical machine, qemu-img also recognizes and supports the following formats in order to convert from them into either raw or qcow2 format. The format of an image is usually detected automatically. In addition to converting these formats into raw or qcow2, they can be converted back from raw or qcow2 to the original format. Note that the qcow2 version supplied with Red Hat Enterprise Linux 7 is 1.1. The format that is supplied with previous versions of Red Hat Enterprise Linux is 0.10. You can revert image files to previous versions of qcow2. To know which version you are using, run the qemu-img info [imagefilename.img] command. To change the qcow version, refer to Section 26.20.2, “Setting target elements”.
bochs - Bochs disk image format.
cloop - Linux Compressed Loop image, useful only to reuse directly compressed CD-ROM images present, for example, in the Knoppix CD-ROMs.
cow - User Mode Linux Copy On Write image format. The cow format is included only for compatibility with previous versions.
dmg - Mac disk image format.
nbd - Network block device.
parallels - Parallels virtualization disk image format.
qcow - Old QEMU image format. Only included for compatibility with older versions.
vdi - Oracle VM VirtualBox hard disk image format.
vmdk - VMware 3 and 4 compatible image format.
vvfat - Virtual VFAT disk image format.
Chapter 17. KVM live migration
This chapter covers migrating guest virtual machines running on one host physical machine to
another. In both instances, the host physical machines are running the KVM hypervisor.
Migration describes the process of moving a guest virtual machine from one host physical machine
to another. This is possible because guest virtual machines are running in a virtualized environment
instead of directly on the hardware. Migration is useful for:
Load balancing - guest virtual machines can be moved to host physical machines with lower
usage when their host physical machine becomes overloaded, or another host physical machine
is under-utilized.
Hardware independence - when we need to upgrade, add, or remove hardware devices on the
host physical machine, we can safely relocate guest virtual machines to other host physical
machines. This means that guest virtual machines do not experience any downtime for hardware
improvements.
Energy saving - guest virtual machines can be redistributed to other host physical machines, and the unloaded host physical machines can thus be powered off to save energy and cut costs in low usage periods.
Geographic migration - guest virtual machines can be moved to another location for lower latency
or in serious circumstances.
Migration works by sending the state of the guest virtual machine's memory and any virtualized
devices to a destination host physical machine. It is recommended to use shared, networked storage
to store the guest virtual machine's images to be migrated. It is also recommended to use libvirt-managed storage pools for shared storage when migrating virtual machines.
Migrations can be performed live or not.
In a live migration, the guest virtual machine continues to run on the source host physical machine,
while its memory pages are transferred to the destination host physical machine. D uring migration,
KVM monitors the source for any changes in pages it has already transferred, and begins to transfer
these changes when all of the initial pages have been transferred. KVM also estimates transfer speed
during migration, so when the remaining amount of data to transfer will take a certain configurable
period of time (10ms by default), KVM suspends the original guest virtual machine, transfers the
remaining data, and resumes the same guest virtual machine on the destination host physical
machine.
In contrast, a non-live migration (offline migration) suspends the guest virtual machine and then
copies the guest virtual machine's memory to the destination host physical machine. The guest
virtual machine is then resumed on the destination host physical machine and the memory the guest
virtual machine used on the source host physical machine is freed. The time it takes to complete such
a migration only depends on network bandwidth and latency. If the network is experiencing heavy
use or low bandwidth, the migration will take much longer. It should be noted that if the original guest
virtual machine modifies pages faster than KVM can transfer them to the destination host physical
machine, offline migration must be used, as live migration would never complete.
Note
If you are migrating a guest virtual machine that has virtio devices on it, please adhere to the
warning explained in Important
17.1. Live migration requirements
Migrating guest virtual machines requires the following:
Migration requirements
A guest virtual machine installed on shared storage using one of the following protocols:
Fibre Channel-based LUNs
iSCSI
FCoE
NFS
GFS2
SCSI RDMA protocols (SCSI RCP): the block export protocol used in Infiniband and 10GbE
iWARP adapters
Make sure that the libvirtd service is enabled:
# systemctl enable libvirtd
Make sure that the libvirtd service is running:
# systemctl restart libvirtd
It is also important to note that the ability to migrate effectively is dependent on the parameter settings in the /etc/libvirt/libvirtd.conf configuration file.
The migration platforms and versions should be checked against Table 17.1, “Live Migration Compatibility”.
Both systems must have the appropriate TCP/IP ports open. In cases where a firewall is used refer
to the Red Hat Enterprise Linux Virtualization Security Guide for detailed port information.
A separate system exporting the shared storage medium. Storage should not reside on either of
the two host physical machines being used for migration.
Shared storage must mount at the same location on source and destination systems. The
mounted directory names must be identical. Although it is possible to keep the images using
different paths, it is not recommended. Note that, if you are intending to use virt-manager to
perform the migration, the path names must be identical. If however you intend to use virsh to
perform the migration, different network configurations and mount directories can be used with the
help of the --xml option or pre-hooks when doing migrations (refer to Live Migration Limitations). For more information on pre-hooks, refer to libvirt.org, and for more information on the XML option,
refer to Chapter 26, Manipulating the domain XML.
When migration is attempted on an existing guest virtual machine in a public bridge+tap network,
the source and destination host physical machines must be located in the same network.
Otherwise, the guest virtual machine network will not operate after migration.
Note
Guest virtual machine migration has the following limitations when used on Red Hat Enterprise
Linux with virtualization technology based on KVM:
Point to point migration – must be done manually to designate destination hypervisor from
originating hypervisor
No validation or roll-back is available
Determination of target may only be done manually
Storage migration cannot be performed live on Red Hat Enterprise Linux 7, but you can
migrate storage while the guest virtual machine is powered down. Live storage migration is
available on Red Hat Enterprise Virtualization. Call your service representative for details.
Procedure 17.1. Configuring libvirtd.conf
1. Opening the libvirtd.conf file requires running the following command as root:
# vim /etc/libvirt/libvirtd.conf
2. Change the parameters as needed and save the file.
3. Restart the libvirtd service:
# systemctl restart libvirtd
17.2. Live migration and Red Hat Enterprise Linux version compatibility
Live Migration is supported as shown in Table 17.1, “Live Migration Compatibility”:
Table 17.1. Live Migration Compatibility

Migration Method   Release Type    Example                  Live Migration Support   Notes
Forward            Major release   6.5+ → 7.x               Fully supported          Any issues should be reported
Backward           Major release   7.x → 6.y                Not supported
Forward            Minor release   7.x → 7.y (7.0 → 7.1)    Fully supported          Any issues should be reported
Backward           Minor release   7.y → 7.x (7.1 → 7.0)    Fully supported          Any issues should be reported
Troubleshooting problems with migration
Issues with the migration protocol: If backward migration ends with "unknown section error", repeating the migration process can repair the issue, as it may be a transient error. If not, please report the problem.
Configuring network storage
Configure shared storage and install a guest virtual machine on the shared storage.
Alternatively, use the NFS example in Section 17.3, “Shared storage example: NFS for a simple migration”.
17.3. Shared storage example: NFS for a simple migration
Important
This example uses NFS to share guest virtual machine images with other KVM host physical
machines. Although not practical for large installations, it is presented to demonstrate
migration techniques only. Do not use this example for migrating or running more than a few guest virtual machines. In addition, it is required that the sync parameter is enabled. This is required for proper export of the NFS storage.
iSCSI storage is a better choice for large deployments. Refer to Section 14.5, “ iSCSI-based
storage pools” for configuration details.
Also note that the instructions provided herein are not meant to replace the detailed instructions found in the Red Hat Enterprise Linux Storage Administration Guide. Refer to this guide for information on configuring NFS, opening IP tables, and configuring the firewall.
Make sure that NFS file locking is not used as it is not supported in KVM.
1. Export your libvirt image directory
Migration requires storage to reside on a system that is separate from the migration target
systems. On this separate system, export the storage by adding the default image directory to
the /etc/expo rts file:
/var/lib/libvirt/images *.example.com(rw,no_root_squash,sync)
Change the hostname parameter as required for your environment.
2. Start NFS
a. Install the NFS packages if they are not yet installed:
# yum install nfs-utils
b. Make sure that the ports for NFS in iptables (2049, for example) are opened and add NFS to the /etc/hosts.allow file.
c. Start the NFS service:
# systemctl restart nfs-server
3. Mount the shared storage on the destination
On the migration destination system, mount the /var/lib/libvirt/images directory:
# mount storage_host:/var/lib/libvirt/images /var/lib/libvirt/images
Warning
Whichever directory is chosen for the source host physical machine must be exactly
the same as that on the destination host physical machine. This applies to all types of
shared storage. The directory must be the same or the migration with virt-manager will
fail.
17.4. Live KVM migration with virsh
A guest virtual machine can be migrated to another host physical machine with the virsh command. The migrate command accepts parameters in the following format:
# virsh migrate --live GuestName DestinationURL
Note that the --live option may be omitted when live migration is not desired. Additional options are
listed in Section 17.4.2, “ Additional options for the virsh migrate command” .
The GuestName parameter represents the name of the guest virtual machine which you want to
migrate.
The DestinationURL parameter is the connection URL of the destination host physical machine.
The destination system must run the same version of Red Hat Enterprise Linux, be using the same
hypervisor and have libvirt running.
Note
The DestinationURL parameter for normal migration and peer2peer migration has different
semantics:
normal migration: the DestinationURL is the URL of the target host physical machine as
seen from the source guest virtual machine.
peer2peer migration: DestinationURL is the URL of the target host physical machine as
seen from the source host physical machine.
Once the command is entered, you will be prompted for the root password of the destination system.
Important
Name resolution must be working on both sides (source and destination) in order for
migration to succeed. Each side must be able to find the other. Make sure that you can ping
one side to the other to check that the name resolution is working.
Example: live migration with virsh
This example migrates from host1.example.com to host2.example.com. Change the host physical machine names for your environment. This example migrates a virtual machine named guest1-rhel7-64.
This example assumes you have fully configured shared storage and meet all the prerequisites (listed here: Migration requirements).
1. Verify the guest virtual machine is running
From the source system, host1.example.com, verify guest1-rhel7-64 is running:
[root@host1 ~]# virsh list
 Id    Name                 State
----------------------------------
 10    guest1-rhel7-64      running
2. Migrate the guest virtual machine
Execute the following command to live migrate the guest virtual machine to the destination, host2.example.com. Append /system to the end of the destination URL to tell libvirt that you need full access.
# virsh migrate --live guest1-rhel7-64 qemu+ssh://host2.example.com/system
Once the command is entered you will be prompted for the root password of the destination
system.
3. Wait
The migration may take some time depending on load and the size of the guest virtual machine. virsh only reports errors. The guest virtual machine continues to run on the source host physical machine until fully migrated.
4. Verify the guest virtual machine has arrived at the destination host
From the destination system, host2.example.com, verify guest1-rhel7-64 is running:
[root@host2 ~]# virsh list
 Id    Name                 State
----------------------------------
 10    guest1-rhel7-64      running
The live migration is now complete.
Note
libvirt supports a variety of networking methods including TLS/SSL, UNIX sockets, SSH, and
unencrypted TCP. Refer to Chapter 21, Remote management of guests for more information on
using other methods.
Note
Non-running guest virtual machines cannot be migrated with the virsh migrate command. To migrate a non-running guest virtual machine, the following script should be used:
virsh -c qemu+ssh:// migrate --offline --persistent
17.4.1. Additional tips for migration with virsh
It is possible to perform multiple, concurrent live migrations where each migration runs in a separate
command shell. However, this should be done with caution and should involve careful calculations
as each migration instance uses one MAX_CLIENT from each side (source and target). As the default
setting is 20, there is enough to run 10 instances without changing the settings. Should you need to
change the settings, refer to Procedure 17.1, “Configuring libvirtd.conf”.
1. Open the libvirtd.conf file as described in Procedure 17.1, “ Configuring libvirtd.conf” .
2. Look for the Processing controls section.
#################################################################
#
# Processing controls
#
# The maximum number of concurrent client connections to allow
# over all sockets combined.
#max_clients = 20
# The minimum limit sets the number of workers to start up
# initially. If the number of active clients exceeds this,
# then more threads are spawned, upto max_workers limit.
# Typically you'd want max_workers to equal maximum number
# of clients allowed
#min_workers = 5
#max_workers = 20
# The number of priority workers. If all workers from above
# pool will stuck, some calls marked as high priority
# (notably domainDestroy) can be executed in this pool.
#prio_workers = 5
# Total global limit on concurrent RPC calls. Should be
# at least as large as max_workers. Beyond this, RPC requests
# will be read into memory and queued. This directly impact
# memory usage, currently each request requires 256 KB of
# memory. So by default upto 5 MB of memory is used
#
# XXX this isn't actually enforced yet, only the per-client
# limit is used so far
#max_requests = 20
# Limit on concurrent requests from a single client
# connection. To avoid one client monopolizing the server
# this should be a small fraction of the global max_requests
# and max_workers parameter
#max_client_requests = 5
#################################################################
3. Change the max_clients and max_workers parameters settings. It is recommended that
the number be the same in both parameters. The max_clients will use 2 clients per
migration (one per side) and max_workers will use 1 worker on the source and 0 workers on
the destination during the perform phase and 1 worker on the destination during the finish
phase.
Important
The max_clients and max_workers parameters settings are affected by all guest virtual machine connections to the libvirtd service. This means that any user that is using the same guest virtual machine and is performing a migration at the same time will also be beholden to the limits set in the max_clients and max_workers parameters settings. This is why the maximum value needs to be considered carefully before performing a concurrent live migration.
Important
The max_clients parameter controls how many clients are allowed to connect to
libvirt. When a large number of containers are started at once, this limit can be easily
reached and exceeded. The value of the max_clients parameter could be increased
to avoid this, but doing so can leave the system more vulnerable to denial of service
(DoS) attacks against instances. To alleviate this problem, a new
max_anonymous_clients setting has been introduced in Red Hat Enterprise Linux
7.0 that specifies a limit of connections which are accepted but not yet authenticated.
You can implement a combination of max_clients and max_anonymous_clients
to suit your workload.
4. Save the file and restart the service.
Note
There may be cases where a migration connection drops because there are too many ssh sessions that have been started, but not yet authenticated. By default, sshd allows only 10 sessions to be in a "pre-authenticated state" at any time. This setting is controlled by the MaxStartups parameter in the sshd configuration file (located here: /etc/ssh/sshd_config), which may require some adjustment. Adjusting this parameter should be done with caution as the limitation is put in place to prevent DoS attacks (and over-use of resources in general). Setting this value too high will negate its purpose. To change this parameter, edit the file /etc/ssh/sshd_config, remove the # from the beginning of the MaxStartups line, and change the 10 (default value) to a higher number. Remember to save the file and restart the sshd service. For more information, refer to the sshd_config man page.
17.4.2. Additional options for the virsh migrate command
In addition to --live, virsh migrate accepts the following options:
--direct - used for direct migration
--p2p - used for peer-to-peer migration
--tunnelled - used for tunneled migration
--offline - migrates the domain definition without starting the domain on the destination and without stopping it on the source host. Offline migration may be used with inactive domains and it must be used with the --persistent option.
--persistent - leaves the domain persistent on the destination host physical machine
--undefinesource - undefines the domain on the source host physical machine
--suspend - leaves the domain paused on the destination host physical machine
--change-protection - enforces that no incompatible configuration changes will be made to the domain while the migration is underway; this flag is implicitly enabled when supported by the hypervisor, but can be explicitly used to reject the migration if the hypervisor lacks change protection support.
--unsafe - forces the migration to occur, ignoring all safety procedures.
--verbose - displays the progress of migration as it is occurring
--compressed - activates compression of memory pages that have to be transferred repeatedly during live migration.
--abort-on-error - cancels the migration if a soft error (for example, an I/O error) happens during the migration.
--domain name - sets the domain name, id or uuid.
--desturi uri - connection URI of the destination host as seen from the client (normal migration) or source (p2p migration).
--migrateuri uri - the migration URI, which can usually be omitted.
--graphicsuri uri - graphics URI to be used for seamless graphics migration.
--listen-address address - sets the listen address that the hypervisor on the destination side should bind to for incoming migration.
--timeout seconds - forces a guest virtual machine to suspend when the live migration counter exceeds N seconds. It can only be used with a live migration. Once the timeout is initiated, the migration continues on the suspended guest virtual machine.
--dname newname - is used for renaming the domain during migration, which also usually can be omitted
--xml filename - the filename indicated can be used to supply an alternative XML file for use on the destination to supply a larger set of changes to any host-specific portions of the domain XML, such as accounting for naming differences between source and destination in accessing underlying storage. This option is usually omitted.
In addition, the following commands may help as well:
virsh migrate-setmaxdowntime domain downtime - will set a maximum tolerable downtime for a domain which is being live-migrated to another host. The specified downtime is in milliseconds. The domain specified must be the same domain that is being migrated.
virsh migrate-compcache domain --size - will set and/or get the size of the cache in bytes which is used for compressing repeatedly transferred memory pages during a live migration. When --size is not used, the command displays the current size of the compression cache. When --size is used, and specified in bytes, the hypervisor is asked to change compression to match the indicated size, following which the current size is displayed. The --size argument is supposed to be used while the domain is being live migrated as a reaction to the migration progress and an increasing number of compression cache misses obtained from domjobinfo.
virsh migrate-setspeed domain bandwidth - sets the migration bandwidth in MiB/sec for the specified domain which is being migrated to another host.
virsh migrate-getspeed domain - gets the maximum migration bandwidth that is available in MiB/sec for the specified domain.
Refer to Live Migration Limitations or the virsh man page for more information.
17.5. Migrating with virt-manager
This section covers migrating a KVM guest virtual machine with virt-manager from one host physical machine to another.
1. Open virt-manager
Open virt-manager. Choose Applications → System Tools → Virtual Machine Manager from the main menu bar to launch virt-manager.
Figure 17.1. Virt-Manager main menu
2. Connect to the target host physical machine
Connect to the target host physical machine by clicking on the File menu, then click Add Connection.
Figure 17.2. Open Add Connection window
3. Add connection
The Add Connection window appears.
Figure 17.3. Adding a connection to the target host physical machine
Enter the following details:
Hypervisor: Select QEMU/KVM.
Method: Select the connection method.
Username: Enter the username for the remote host physical machine.
Hostname: Enter the hostname for the remote host physical machine.
Click the Connect button. An SSH connection is used in this example, so the specified user's password must be entered in the next step.
Figure 17.4. Enter password
4. Migrate guest virtual machines
Open the list of guests inside the source host physical machine (click the small triangle on the left of the host name), right-click on the guest that is to be migrated (guest1-rhel6-64 in this example), and click Migrate.
Figure 17.5. Choosing the guest to be migrated
In the New Host field, use the drop-down list to select the host physical machine you wish to migrate the guest virtual machine to and click Migrate.
Figure 17.6. Choosing the destination host physical machine and starting the migration process
A progress window will appear.
Figure 17.7. Progress window
virt-manager now displays the newly migrated guest virtual machine running in the destination host. The guest virtual machine that was running in the source host physical machine is now listed in the Shutoff state.
Figure 17.8. Migrated guest virtual machine running in the destination host physical machine
5. Optional - View the storage details for the host physical machine
In the Edit menu, click Connection Details, and the Connection Details window appears.
Click the Storage tab. The iSCSI target details for the destination host physical machine are shown. Note that the migrated guest virtual machine is listed as using the storage.
Figure 17.9. Storage details
This host was defined by the following XML configuration:
<pool type='iscsi'>
  <name>iscsirhel6guest</name>
  <source>
    ...
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>
Figure 17.10. XML configuration for the destination host physical machine
Chapter 18. Guest virtual machine device configuration
Red Hat Enterprise Linux 7 supports three classes of devices for guest virtual machines:
Emulated devices are purely virtual devices that mimic real hardware, allowing unmodified guest
operating systems to work with them using their standard in-box drivers. Red Hat Enterprise Linux
7 supports up to 216 virtio devices.
Virtio devices are purely virtual devices designed to work optimally in a virtual machine. Virtio
devices are similar to emulated devices; however, non-Linux virtual machines do not include the
drivers they require by default. Virtualization management software like the Virtual Machine
Manager (virt-manager) and the Red Hat Enterprise Virtualization Hypervisor install these
drivers automatically for supported non-Linux guest operating systems. Red Hat Enterprise Linux
7 supports up to 700 SCSI disks.
Assigned devices are physical devices that are exposed to the virtual machine. This method is also
known as 'passthrough'. Device assignment allows virtual machines exclusive access to PCI
devices for a range of tasks, and allows PCI devices to appear and behave as if they were
physically attached to the guest operating system. Red Hat Enterprise Linux 7 supports up to 32
assigned devices per virtual machine.
Device assignment is supported on PCIe devices, including select graphics devices. NVIDIA K-Series
Quadro, GRID, and Tesla graphics card GPU functions are now supported with device
assignment in Red Hat Enterprise Linux 7. Parallel PCI devices may be supported as assigned
devices, but they have severe limitations due to security and system configuration conflicts. Refer
to the sections within this chapter for more details regarding specific series and versions that are
supported.
Red Hat Enterprise Linux 7 supports PCI hotplug of devices exposed as single function slots to the
virtual machine. Single function host devices and individual functions of multi-function host devices
may be configured to enable this. Configurations exposing devices as multi-function PCI slots to the
virtual machine are recommended only for non-hotplug applications.
For more information on specific devices and for limitations refer to Section 26.18, “Devices”.
Note
Platform support for interrupt remapping is required to fully isolate a guest with assigned
devices from the host. Without such support, the host may be vulnerable to interrupt injection
attacks from a malicious guest. In an environment where guests are trusted, the admin may
opt in to still allow PCI device assignment using the allow_unsafe_interrupts option to
the vfio_iommu_type1 module. This may either be done persistently by adding a .conf file
(e.g. local.conf) to /etc/modprobe.d containing the following:
options vfio_iommu_type1 allow_unsafe_interrupts=1
or dynamically using the sysfs entry to do the same:
# echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts
18.1. PCI devices
PCI device assignment is only available on hardware platforms supporting either Intel VT-d or AMD
IOMMU. These Intel VT-d or AMD IOMMU specifications must be enabled in BIOS for PCI device
assignment to function.
Procedure 18.1. Preparing an Intel system for PCI device assignment
1. Enable the Intel VT-d specifications
The Intel VT-d specifications provide hardware support for directly assigning a physical
device to a virtual machine. These specifications are required to use PCI device assignment
with Red Hat Enterprise Linux.
The Intel VT-d specifications must be enabled in the BIOS. Some system manufacturers
disable these specifications by default. The terms used to refer to these specifications can
differ between manufacturers; consult your system manufacturer's documentation for the
appropriate terms.
2. Activate Intel VT-d in the kernel
Activate Intel VT-d in the kernel by adding the intel_iommu=pt parameter to the end of the
GRUB_CMDLINE_LINUX line, within the quotes, in the /etc/sysconfig/grub file.
Note
Instead of using the *_iommu=pt parameter for device assignment, which puts IOMMU
into passthrough mode, it is also possible to use *_iommu=on. However, iommu=on
should be used with caution, as it enables IOMMU for all devices, including those not
used for device assignment by KVM, which may have a negative impact on guest
performance.
The example below is a modified grub file with Intel VT-d activated.
GRUB_CMDLINE_LINUX="rd.lvm.lv=vg_VolGroup00/LogVol01
vconsole.font=latarcyrheb-sun16 rd.lvm.lv=vg_VolGroup_1/root
vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ] &&
/usr/sbin/rhcrashkernel-param || :) rhgb quiet intel_iommu=pt"
3. Regenerate config file
Regenerate /boot/grub2/grub.cfg by running:
grub2-mkconfig -o /boot/grub2/grub.cfg
4. Ready to use
Reboot the system to enable the changes. Your system is now capable of PCI device
assignment.
Procedure 18.2. Preparing an AMD system for PCI device assignment
1. Enable the AMD IOMMU specifications
The AMD IOMMU specifications are required to use PCI device assignment in Red Hat
Enterprise Linux. These specifications must be enabled in the BIOS. Some system
manufacturers disable these specifications by default.
2. Enable IOMMU kernel support
Append amd_iommu=pt to the end of the GRUB_CMDLINE_LINUX line, within the quotes, in
/etc/sysconfig/grub so that the AMD IOMMU specifications are enabled at boot.
3. Regenerate config file
Regenerate /boot/grub2/grub.cfg by running:
grub2-mkconfig -o /boot/grub2/grub.cfg
4. Ready to use
Reboot the system to enable the changes. Your system is now capable of PCI device
assignment.
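After the reboot, one way to confirm that the IOMMU was initialized is to search the kernel log for DMAR or IOMMU messages; the exact output varies by hardware and kernel version:

# dmesg | grep -i -e DMAR -e IOMMU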
Note
For further information on IOMMU, see Appendix D, Working with IOMMU Groups.
18.1.1. Assigning a PCI device wit h virsh
These steps cover assigning a PCI device to a virtual machine on a KVM hypervisor.
This example uses a PCIe network controller with the PCI identifier code pci_0000_01_00_0 and
a fully virtualized guest machine named guest1-rhel7-64.
Procedure 18.3. Assigning a PCI device to a guest virtual machine with virsh
1. Identify the device
First, identify the PCI device designated for device assignment to the virtual machine. Use the
lspci command to list the available PCI devices. You can refine the output of lspci with
grep.
This example uses the Ethernet controller highlighted in the following output:
# lspci | grep Ethernet
00:19.0 Ethernet controller: Intel Corporation 82567LM-2 Gigabit
Network Connection
01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit
Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit
Network Connection (rev 01)
This Ethernet controller is shown with the short identifier 00:19.0. We need to find out the
full identifier used by virsh in order to assign this PCI device to a virtual machine.
To do so, use the virsh nodedev-list command to list all devices of a particular type
(pci) that are attached to the host machine. Then look at the output for the string that maps
to the short identifier of the device you wish to use.
This example shows the string that maps to the Ethernet controller with the short identifier
00:19.0. Note that the : and . characters are replaced with underscores in the full
identifier.
# virsh nodedev-list --cap pci
pci_0000_00_00_0
pci_0000_00_01_0
pci_0000_00_03_0
pci_0000_00_07_0
pci_0000_00_10_0
pci_0000_00_10_1
pci_0000_00_14_0
pci_0000_00_14_1
pci_0000_00_14_2
pci_0000_00_14_3
pci_0000_00_19_0
pci_0000_00_1a_0
pci_0000_00_1a_1
pci_0000_00_1a_2
pci_0000_00_1a_7
pci_0000_00_1b_0
pci_0000_00_1c_0
pci_0000_00_1c_1
pci_0000_00_1c_4
pci_0000_00_1d_0
pci_0000_00_1d_1
pci_0000_00_1d_2
pci_0000_00_1d_7
pci_0000_00_1e_0
pci_0000_00_1f_0
pci_0000_00_1f_2
pci_0000_00_1f_3
pci_0000_01_00_0
pci_0000_01_00_1
pci_0000_02_00_0
pci_0000_02_00_1
pci_0000_06_00_0
pci_0000_07_02_0
pci_0000_07_03_0
Record the PCI device number that maps to the device you want to use; this is required in
other steps.
2. Review device information
Information on the domain, bus, and function are available from output of the virsh
nodedev-dumpxml command:
# virsh nodedev-dumpxml pci_0000_00_19_0
<device>
  <name>pci_0000_00_19_0</name>
  <parent>computer</parent>
  <driver>
    <name>e1000e</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>0</bus>
    <slot>25</slot>
    <function>0</function>
    <product>82579LM Gigabit Network Connection</product>
    <vendor>Intel Corporation</vendor>
  </capability>
</device>

Figure 18.1. Dump contents
Note
An IOMMU group is determined based on the visibility and isolation of devices from the
perspective of the IOMMU. Each IOMMU group may contain one or more devices. When
multiple devices are present, all endpoints within the IOMMU group must be claimed for
any device within the group to be assigned to a guest. This can be accomplished
either by also assigning the extra endpoints to the guest or by detaching them from the
host driver using virsh nodedev-detach. Devices contained within a single group
may not be split between multiple guests or split between host and guest. Non-endpoint
devices such as PCIe root ports, switch ports, and bridges should not be
detached from the host drivers and will not interfere with assignment of endpoints.
Devices within an IOMMU group can be determined using the iommuGroup section of
the virsh nodedev-dumpxml output. Each member of the group is provided via a
separate "address" field. This information may also be found in sysfs using the
following:
$ ls /sys/bus/pci/devices/0000:01:00.0/iommu_group/devices/
An example of the output from this would be:
0000:01:00.0
0000:01:00.1
To assign only 0000:01:00.0 to the guest, the unused endpoint should be detached
from the host before starting the guest:
$ virsh nodedev-detach pci_0000_01_00_1
3. Determine required configuration details
Refer to the output from the virsh nodedev-dumpxml pci_0000_00_19_0 command
for the values required for the configuration file.
The example device has the following values: bus = 0, slot = 25 and function = 0. The decimal
configuration uses those three values:
bus='0'
slot='25'
function='0'
4. Add configuration details
Run virsh edit, specifying the virtual machine name, and add a device entry in the
<devices> section to assign the PCI device to the guest virtual machine.
# virsh edit guest1-rhel7-64
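A minimal <hostdev> entry for this device, built from the decimal bus, slot, and function values determined in the previous step, might look like the following sketch; managed='yes' lets libvirt detach the device from its host driver automatically:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0' bus='0' slot='25' function='0'/>
  </source>
</hostdev>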
Figure 18.2. Add PCI device
Alternately, run virsh attach-device, specifying the virtual machine name and the
guest's XML file:
virsh attach-device guest1-rhel7-64 file.xml
5. Start the virtual machine
# virsh start guest1-rhel7-64
The PCI device should now be successfully assigned to the virtual machine, and accessible to the
guest operating system.
18.1.2. Assigning a PCI device with virt-manager
PCI devices can be added to guest virtual machines using the graphical virt-manager tool. The
following procedure adds a Gigabit Ethernet controller to a guest virtual machine.
Procedure 18.4. Assigning a PCI device to a guest virtual machine using virt-manager
1. Open the hardware settings
Open the guest virtual machine and click the Add Hardware button to add a new device to
the virtual machine.
Figure 18.3. The virtual machine hardware information window
2. Select a PCI device
Select PCI Host Device from the Hardware list on the left.
Select an unused PCI device. Note that selecting PCI devices presently in use by another
guest causes errors. In this example, a spare 82576 network device is used. Click Finish to
complete setup.
Figure 18.4. The Add new virtual hardware wizard
3. Add the new device
The setup is complete and the guest virtual machine now has direct access to the PCI device.
Figure 18.5. The virtual machine hardware information window
Note
If device assignment fails, there may be other endpoints in the same IOMMU group that are still
attached to the host. There is no way to retrieve group information using virt-manager, but
virsh commands can be used to analyze the bounds of the IOMMU group and if necessary
sequester devices.
Refer to the Note in Section 18.1.1, “Assigning a PCI device with virsh” for more information on
IOMMU groups and how to detach endpoint devices using virsh.
18.1.3. PCI device assignment with virt-install
To use virt-install to assign a PCI device, use the --host-device parameter.
Procedure 18.5. Assigning a PCI device to a virtual machine with virt-install
1. Identify the device
Identify the PCI device designated for device assignment to the guest virtual machine.
# lspci | grep Ethernet
00:19.0 Ethernet controller: Intel Corporation 82567LM-2 Gigabit
Network Connection
01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit
Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit
Network Connection (rev 01)
The virsh nodedev-list command lists all devices attached to the system, and identifies
each PCI device with a string. To limit output to only PCI devices, run the following command:
# virsh nodedev-list --cap pci
pci_0000_00_00_0
pci_0000_00_01_0
pci_0000_00_03_0
pci_0000_00_07_0
pci_0000_00_10_0
pci_0000_00_10_1
pci_0000_00_14_0
pci_0000_00_14_1
pci_0000_00_14_2
pci_0000_00_14_3
pci_0000_00_19_0
pci_0000_00_1a_0
pci_0000_00_1a_1
pci_0000_00_1a_2
pci_0000_00_1a_7
pci_0000_00_1b_0
pci_0000_00_1c_0
pci_0000_00_1c_1
pci_0000_00_1c_4
pci_0000_00_1d_0
pci_0000_00_1d_1
pci_0000_00_1d_2
pci_0000_00_1d_7
pci_0000_00_1e_0
pci_0000_00_1f_0
pci_0000_00_1f_2
pci_0000_00_1f_3
pci_0000_01_00_0
pci_0000_01_00_1
pci_0000_02_00_0
pci_0000_02_00_1
pci_0000_06_00_0
pci_0000_07_02_0
pci_0000_07_03_0
Record the PCI device number; the number is needed in other steps.
Information on the domain, bus and function are available from output of the virsh
nodedev-dumpxml command:
# virsh nodedev-dumpxml pci_0000_01_00_0
<device>
  <name>pci_0000_01_00_0</name>
  <parent>pci_0000_00_01_0</parent>
  <driver>
    <name>igb</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>1</bus>
    <slot>0</slot>
    <function>0</function>
    <product>82576 Gigabit Network Connection</product>
    <vendor>Intel Corporation</vendor>
  </capability>
</device>

Figure 18.6. PCI device file contents
Note
If there are multiple endpoints in the IOMMU group and not all of them are assigned to
the guest, you will need to manually detach the other endpoint(s) from the host by
running the following command before you start the guest:
$ virsh nodedev-detach pci_0000_00_19_1
Refer to the Note in Section 18.1.1, “ Assigning a PCI device with virsh” for more
information on IOMMU groups.
2. Add the device
Use the PCI identifier output from the virsh nodedev command as the value for the
--host-device parameter.
virt-install \
--name=guest1-rhel7-64 \
--disk path=/var/lib/libvirt/images/guest1-rhel7-64.img,size=8 \
--nonsparse --graphics spice \
--vcpus=2 --ram=2048 \
--location=http://example1.com/installation_tree/RHEL7.0-Server-x86_64/os \
--nonetworks \
--os-type=linux \
--os-variant=rhel7 \
--host-device=pci_0000_01_00_0
3. Complete the installation
Complete the guest installation. The PCI device should be attached to the guest.
18.1.4. Detaching an assigned PCI device
When a host PCI device has been assigned to a guest machine, the host can no longer use the
device. Read this section to learn how to detach the device from the guest with virsh or
virt-manager so it is available for host use.
Procedure 18.6. Detaching a PCI device from a guest with virsh
1. Detach the device
Use the following command to detach the PCI device from the guest by removing it in the
guest's XML file:
# virsh detach-device name_of_guest file.xml
2. Re-attach the device to the host (optional)
If the device is in managed mode, skip this step. The device will be returned to the host
automatically.
If the device is not using managed mode, use the following command to re-attach the PCI
device to the host machine:
# virsh nodedev-reattach device
For example, to re-attach the pci_0000_01_00_0 device to the host:
virsh nodedev-reattach pci_0000_01_00_0
The device is now available for host use.
Procedure 18.7. Detaching a PCI device from a guest with virt-manager
1. Open the virtual hardware details screen
In virt-manager, double-click on the virtual machine that contains the device. Select the
Show virtual hardware details button to display a list of virtual hardware.
Figure 18.7. The virtual hardware details button
2. Select and remove the device
Select the PCI device to be detached from the list of virtual devices in the left panel.
Figure 18.8. Selecting the PCI device to be detached
Click the Remove button to confirm. The device is now available for host use.
18.1.5. Creating PCI bridges
Peripheral Component Interconnects (PCI) bridges are used to attach to devices such as network
cards, modems and sound cards. Just like their physical counterparts, virtual devices can also be
attached to a PCI Bridge. In the past, only 31 PCI devices could be added to any guest virtual
machine. Now, when a 31st PCI device is added, a PCI bridge is automatically placed in the 31st slot
moving the additional PCI device to the PCI bridge. Each PCI bridge has 31 slots for 31 additional
devices, all of which can be bridges. In this manner, over 900 devices can be available for guest
virtual machines. Note that this action cannot be performed when the guest virtual machine is
running. You must add the PCI device on a guest virtual machine that is shut down.
18.1.5.1. PCI Bridge hotplug/unhotplug support
PCI Bridge hotplug/unhotplug is supported on the following device types:
virtio-net-pci
virtio-scsi-pci
e1000
rtl8139
virtio-serial-pci
virtio-balloon-pci
18.1.6. PCI passthrough
A PCI network device (specified by the <source> element) is directly assigned to the guest using
generic device passthrough, after first optionally setting the device's MAC address to the configured
value, and associating the device with an 802.1Qbh capable switch using an optionally specified
<virtualport> element (see the examples of virtualport given above for type='direct' network
devices). Note that - due to limitations in standard single-port PCI ethernet card driver design - only
SR-IOV (Single Root I/O Virtualization) virtual function (VF) devices can be assigned in this manner;
to assign a standard single-port PCI or PCIe Ethernet card to a guest, use the traditional <hostdev>
device definition.
To use VFIO device assignment rather than traditional/legacy KVM device assignment (VFIO is a new
method of device assignment that is compatible with UEFI Secure Boot), a <interface type='hostdev'>
interface can have an optional driver sub-element with a name attribute set to "vfio". To use legacy
KVM device assignment you can set name to "kvm" (or simply omit the <driver> element, since
<driver name='kvm'> is currently the default).
Note that this "intelligent passthrough" of network devices is very similar to the functionality of a
standard <hostdev> device, the difference being that this method allows specifying a MAC address
and <virtualport> for the passed-through device. If these capabilities are not required, if you
have a standard single-port PCI, PCIe, or USB network card that does not support SR-IOV (and
hence would anyway lose the configured MAC address during reset after being assigned to the guest
domain), or if you are using a version of libvirt older than 0.9.11, you should use standard <hostdev>
to assign the device to the guest instead of <interface type='hostdev'/>.
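A sketch of such an <interface type='hostdev'> definition follows; the PCI source address, MAC address, and virtual port profile shown are placeholder values and would be replaced with the details of the actual VF and switch configuration:

<devices>
  <interface type='hostdev' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </source>
    <mac address='52:54:00:6d:90:02'/>
    <virtualport type='802.1Qbh'>
      <parameters profileid='finance'/>
    </virtualport>
  </interface>
</devices>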
Figure 18.9. XML example for PCI device assignment
18.1.7. Configuring PCI assignment (passthrough) with SR-IOV devices
This section is for SR-IOV devices only. SR-IOV network cards provide multiple Virtual Functions (VFs)
that can each be individually assigned to a guest virtual machine using PCI device assignment.
Once assigned, each will behave as a full physical network device. This permits many guest virtual
machines to gain the performance advantage of direct PCI device assignment, while only using a
single slot on the host physical machine.
These VFs can be assigned to guest virtual machines in the traditional manner using the
<hostdev> element, but as SR-IOV VF network devices do not have permanent unique MAC addresses, it
causes issues where the guest virtual machine's network settings would have to be re-configured
each time the host physical machine is rebooted. To remedy this, you would need to set the MAC
address prior to assigning the VF to the host physical machine and you would need to set this each
and every time the guest virtual machine boots. In order to assign this MAC address as well as other
options, refer to the procedure described in Procedure 18.8, “Configuring MAC addresses, vLAN,
and virtual ports for assigning PCI devices on SR-IOV”.
Procedure 18.8. Configuring MAC addresses, vLAN, and virtual ports for assigning PCI
devices on SR-IOV
It is important to note that the <hostdev> element cannot be used for function-specific items like
MAC address assignment, vLAN tag ID assignment, or virtual port assignment because the <mac>,
<vlan>, and <virtualport> elements are not valid children for <hostdev>. As they are valid for
<interface>, support for a new interface type was added (<interface type='hostdev'>).
This new interface device type behaves as a hybrid of an <interface> and <hostdev>. Thus,
before assigning the PCI device to the guest virtual machine, libvirt initializes the network-specific
hardware/switch that is indicated (such as setting the MAC address, setting a vLAN tag, and/or
associating with an 802.1Qbh switch) in the guest virtual machine's XML configuration file. For
information on setting the vLAN tag, refer to Section 20.16, “Setting vLAN tags”.
1. Shut down the guest virtual machine
Using the virsh shutdown command (refer to Section 23.10.2, “Shutting down Red Hat
Enterprise Linux 6 guests on a Red Hat Enterprise Linux 7 host”), shut down the guest virtual
machine named guestVM.
# virsh shutdown guestVM
2. Gather information
In order to use <interface type='hostdev'>, you must have an SR-IOV-capable
network card, host physical machine hardware that supports either the Intel VT-d or AMD
IOMMU extensions, and you must know the PCI address of the VF that you wish to assign.
3. Open the XML file for editing
Run the # virsh save-image-edit command to open the XML file for editing (refer to
Section 23.9.11, “Editing the guest virtual machine configuration files” for more information).
As you would want to restore the guest virtual machine to its former running state, the
--running option would be used in this case. The name of the configuration file in this example is
guestVM.xml, as the name of the guest virtual machine is guestVM.
# virsh save-image-edit guestVM.xml --running
The guestVM.xml file opens in your default editor.
4. Edit the XML file
Update the configuration file (guestVM.xml) to have an <interface type='hostdev'> entry similar to the
following:
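The sketch below uses placeholder values for the PCI source address, MAC address, vLAN tag, and virtual port profile; substitute the details of your own VF and switch configuration:

<devices>
  ...
  <interface type='hostdev' managed='yes'>
    <source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </source>
    <mac address='52:54:00:6d:90:02'/>
    <vlan>
      <tag id='42'/>
    </vlan>
    <virtualport type='802.1Qbh'>
      <parameters profileid='finance'/>
    </virtualport>
  </interface>
  ...
</devices>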
Figure 18.10. Sample domain XML for hostdev interface type
Note that if you do not provide a MAC address, one will be automatically generated, just as
with any other type of interface device. Also, the <virtualport> element is only used if you
are connecting to an 802.1Qbh hardware switch (802.1Qbg (a.k.a. "VEPA") switches are
currently not supported).
5. Restart the guest virtual machine
Run the virsh start command to restart the guest virtual machine you shut down in the first
step (the example uses guestVM as the guest virtual machine's domain name). Refer to
Section 23.9.1, “Starting a virtual machine” for more information.
# virsh start guestVM
When the guest virtual machine starts, it sees the network device provided to it by the physical
host machine's adapter, with the configured MAC address. This MAC address will remain
unchanged across guest virtual machine and host physical machine reboots.
18.1.8. Setting PCI device assignment from a pool of SR-IOV virtual functions
Hard-coding the PCI addresses of particular Virtual Functions (VFs) into a guest's configuration has
two serious limitations:
The specified VF must be available any time the guest virtual machine is started, implying that the
administrator must permanently assign each VF to a single guest virtual machine (or modify the
configuration file for every guest virtual machine to specify a currently unused VF's PCI address
each time every guest virtual machine is started).
If the guest virtual machine is moved to another host physical machine, that host physical machine
must have exactly the same hardware in the same location on the PCI bus (or, again, the guest
virtual machine configuration must be modified prior to start).
It is possible to avoid both of these problems by creating a libvirt network with a device pool
containing all the VFs of an SR-IOV device. Once that is done you would configure the guest virtual
machine to reference this network. Each time the guest is started, a single VF will be allocated from
the pool and assigned to the guest virtual machine. When the guest virtual machine is stopped, the
VF will be returned to the pool for use by another guest virtual machine.
Procedure 18.9. Creating a device pool
1. Shut down the guest virtual machine
Using the virsh shutdown command (refer to Section 23.10.2, “Shutting down Red Hat
Enterprise Linux 6 guests on a Red Hat Enterprise Linux 7 host”), shut down the guest virtual
machine named guestVM.
# virsh shutdown guestVM
2. Create a configuration file
Using your editor of choice, create an XML file (named passthrough.xml, for example) in the
/tmp directory. Make sure to replace pf dev='eth3' with the netdev name of your own SR-IOV
device's PF.
The following is an example network definition that will make available a pool of all VFs for
the SR-IOV adapter with its physical function (PF) at 'eth3' on the host physical machine:
<network>
  <name>passthrough</name>
  <forward mode='hostdev' managed='yes'>
    <pf dev='eth3'/>
  </forward>
</network>

Figure 18.11. Sample network definition domain XML
3. Load the new XML file
Run the following command, replacing /tmp/passthrough.xml with the name and location of
the XML file you created in the previous step:
# virsh net-define /tmp/passthrough.xml
4. Start the network and set it to autostart
Run the following commands, replacing passthrough with the name of the network you defined in the
previous step:
# virsh net-autostart passthrough
# virsh net-start passthrough
5. Restart the guest virtual machine
Run the virsh start command to restart the guest virtual machine you shut down in the first
step (the example uses guestVM as the guest virtual machine's domain name). Refer to
Section 23.9.1, “Starting a virtual machine” for more information.
# virsh start guestVM
6. Initiating passthrough for devices
Although only a single device is shown, libvirt will automatically derive the list of all VFs
associated with that PF the first time a guest virtual machine is started with an interface
definition in its domain XML like the following:
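A minimal sketch of such an interface definition, referencing the passthrough network created above, is:

<interface type='network'>
  <source network='passthrough'/>
</interface>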
Figure 18.12. Sample domain XML for interface network definition
7. Verification
You can verify this by running the virsh net-dumpxml passthrough command after starting
the first guest that uses the network; you will get output similar to the following:
<network>
  <name>passthrough</name>
  <uuid>a6b49429-d353-d7ad-3185-4451cc786437</uuid>
  <forward mode='hostdev' managed='yes'>
    <pf dev='eth3'/>
    ...
  </forward>
</network>
Figure 18.13. XML dump file passthrough contents
18.2. USB devices
This section gives the commands required for handling USB devices.
18.2.1. Assigning USB devices to guest virtual machines
Most devices such as web cameras, card readers, disk drives, keyboards, and mice are connected to
a computer using a USB port and cable. There are two ways to pass such devices to a guest virtual
machine:
Using USB passthrough - this requires the device to be physically connected to the host physical
machine that is hosting the guest virtual machine. SPICE is not needed in this case. USB devices
on the host can be passed to the guest via the command line or virt-manager. Refer to
Section 22.3.2, “Attaching USB devices to a guest virtual machine” for virt-manager directions.
Note that the virt-manager directions are not suitable for hot plugging or hot unplugging
devices. If you want to hot plug or hot unplug a USB device, refer to Procedure 23.5,
“Hotplugging USB devices for use by the guest virtual machine”.
Using USB re-direction - USB re-direction is best used in cases where there is a host physical
machine that is running in a data center. The user connects to his/her guest virtual machine from
a local machine or thin client. On this local machine there is a SPICE client. The user can attach
any USB device to the thin client and the SPICE client will redirect the device to the host physical
machine on the data center so it can be used by the guest virtual machine that is running on the
thin client. For instructions via virt-manager refer to Section 22.3.3, “USB redirection”.
18.2.2. Setting a limit on USB device redirection
To filter out certain devices from redirection, pass the filter property to -device usb-redir. The
filter property takes a string consisting of filter rules; the format for a rule is:
<class>:<vendor>:<product>:<version>:<allow>
Use the value -1 to designate it to accept any value for a particular field. You may use multiple rules
on the same command line using | as a separator. Note that if a device matches none of the passed-in
rules, redirecting it will not be allowed!
Example 18.1. An example of limiting redirection with a Windows guest virtual
machine
1. Prepare a Windows 7 guest virtual machine.
2. Add the following code excerpt to the guest virtual machine's domain XML file:
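The following is a sketch of such an excerpt; the port and filter values mirror the qemu command line shown in the verification step below, while the class, vendor, and product IDs are illustrative:

<redirdev bus='usb' type='spicevmc'>
  <address type='usb' bus='0' port='3'/>
</redirdev>
<redirfilter>
  <usbdev class='0x08' vendor='0x1234' product='0xBEEF' version='2.00' allow='yes'/>
  <usbdev allow='no'/>
</redirfilter>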
3. Start the guest virtual machine and confirm the setting changes by running the following:
# ps -ef | grep $guest_name
-device usb-redir,chardev=charredir0,id=redir0,filter=0x08:0x1234:0xBEEF:0x0200:1|-1:-1:-1:-1:0,bus=usb.0,port=3
4. Plug a USB device into the host physical machine, and use virt-manager to connect to the
guest virtual machine.
5. Click USB device selection in the menu, which will produce the following message:
"Some USB devices are blocked by host policy". Click OK to confirm and continue.
The filter takes effect.
6. To make sure that the filter captures properly, check the USB device vendor and product,
then make the following changes in the host physical machine's domain XML to allow for
USB redirection.
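For example, if the plugged-in device reports vendor ID 0x0781 and product ID 0x5567 (placeholder values used here for illustration only), the filter could be changed to allow that device ahead of the deny-all rule:

<redirfilter>
  <usbdev class='-1' vendor='0x0781' product='0x5567' version='-1' allow='yes'/>
  <usbdev allow='no'/>
</redirfilter>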
7. Restart the guest virtual machine, then use virt-viewer to connect to the guest virtual
machine. The USB device will now redirect traffic to the guest virtual machine.
18.3. Configuring device controllers
D epending on the guest virtual machine architecture, some device buses can appear more than
once, with a group of virtual devices tied to a virtual controller. Normally, libvirt can automatically
infer such controllers without requiring explicit XML markup, but in some cases it is better to explicitly
set a virtual controller element.
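A sketch of explicit controller definitions is shown below; the index, ports, vectors, and PCI address values are illustrative:

<domain>
  ...
  <devices>
    <controller type='ide' index='0'/>
    <controller type='virtio-serial' index='0' ports='16' vectors='4'/>
    <controller type='virtio-serial' index='1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </controller>
  </devices>
  ...
</domain>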
Figure 18.14. Domain XML example for virtual controllers
Each controller has a mandatory attribute type, which must be one of:
ide
fdc
scsi
sata
usb
ccid
virtio-serial
pci
The <controller> element has a mandatory attribute index, which is the decimal
integer describing in which order the bus controller is encountered (for use in controller attributes of
<address> elements). When <controller type='virtio-serial'>, there are two additional
optional attributes (named ports and vectors), which control how many devices can be connected
through the controller.
When <controller type='scsi'>, there is an optional attribute model, which can have
the following values:
auto
buslogic
ibmvscsi
lsilogic
lsisas1068
lsisas1078
virtio-scsi
vmpvscsi
When <controller type='usb'>, there is an optional attribute model, which can have
the following values:
piix3-uhci
piix4-uhci
ehci
ich9-ehci1
ich9-uhci1
ich9-uhci2
ich9-uhci3
vt82c686b-uhci
pci-ohci
nec-xhci
Note that if the USB bus needs to be explicitly disabled for the guest virtual machine,
model='none' may be used.
For controllers that are themselves devices on a PCI or USB bus, an optional <address>
sub-element can specify the exact relationship of the controller to its master bus, with semantics as
shown in Section 18.4, “Setting addresses for devices”.
An optional <driver> sub-element can specify driver-specific options. Currently it only supports the
attribute queues, which specifies the number of queues for the controller. For best performance, it is
recommended to specify a value matching the number of vCPUs.
USB companion controllers have an optional <master> sub-element to specify the exact
relationship of the companion to its master controller. A companion controller is on the same bus as
its master, so the companion index value should be equal.
An example XML which can be used is as follows:
<domain>
  ...
  <devices>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0' bus='0' slot='4' function='7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0' bus='0' slot='4' function='0' multifunction='on'/>
    </controller>
  </devices>
  ...
</domain>

Figure 18.15. Domain XML example for USB controllers
PCI controllers have an optional model attribute with the following possible values:
pci-root
pcie-root
pci-bridge
dmi-to-pci-bridge
For machine types which provide an implicit PCI bus, the pci-root controller with index='0' is
auto-added and required to use PCI devices. pci-root has no address. PCI bridges are auto-added if
there are too many devices to fit on the one bus provided by model='pci-root', or a PCI bus
number greater than zero was specified. PCI bridges can also be specified manually, but their
addresses should only refer to PCI buses provided by already specified PCI controllers. Leaving
gaps in the PCI controller indexes might lead to an invalid configuration. The following XML example
can be added to the <devices> section:
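A minimal sketch of an explicit pci-root plus pci-bridge pair follows; the slot value in the bridge's address is a placeholder:

<devices>
  <controller type='pci' index='0' model='pci-root'/>
  <controller type='pci' index='1' model='pci-bridge'>
    <address type='pci' domain='0' bus='0' slot='5' function='0'/>
  </controller>
</devices>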
Figure 18.16. Domain XML example for PCI bridge
For machine types which provide an implicit PCI Express (PCIe) bus (for example, the machine types
based on the Q35 chipset), the pcie-root controller with index='0' is auto-added to the domain's
configuration. pcie-root also has no address, but provides 31 slots (numbered 1-31) and can only be
used to attach PCIe devices. In order to connect standard PCI devices on a system which has a
pcie-root controller, a pci controller with model='dmi-to-pci-bridge' is automatically added. A
dmi-to-pci-bridge controller plugs into a PCIe slot (as provided by pcie-root), and itself provides 31
standard PCI slots (which are not hot-pluggable). In order to have hot-pluggable PCI slots in the
guest system, a pci-bridge controller will also be automatically created and connected to one of the
slots of the auto-created dmi-to-pci-bridge controller; all guest devices with PCI addresses that are
auto-determined by libvirt will be placed on this pci-bridge device.
<devices>
  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='dmi-to-pci-bridge'>
    <address type='pci' domain='0' bus='0' slot='0xe' function='0'/>
  </controller>
  <controller type='pci' index='2' model='pci-bridge'>
    <address type='pci' domain='0' bus='1' slot='1' function='0'/>
  </controller>
</devices>

Figure 18.17. Domain XML example for PCIe (PCI express)
The following XML configuration is used for USB 3.0 / xHCI emulation:
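A sketch of such a controller entry follows; the index and PCI address values are placeholders:

<devices>
  <controller type='usb' index='3' model='nec-xhci'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x0f' function='0x0'/>
  </controller>
</devices>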
Figure 18.18. Domain XML example for USB 3/xHCI devices
18.4. Setting addresses for devices
Many devices have an optional <address> sub-element which is used to describe where the device
is placed on the virtual bus presented to the guest virtual machine. If an address (or any optional
attribute within an address) is omitted on input, libvirt will generate an appropriate address; but an
explicit address is required if more control over layout is required. See Figure 18.9, “XML example for
PCI device assignment” for domain XML device examples including an <address> element.
Every address has a mandatory attribute type that describes which bus the device is on. The choice
of which address to use for a given device is constrained in part by the device and the architecture of
the guest virtual machine. For example, a disk device uses type='drive', while a PCI device such as a
network interface would use type='pci' on i686 or x86_64 guest virtual machine architectures. Each
address type has further optional attributes that control where on the bus the device will be placed as
described in the table:
Table 18.1. Supported device address types

type='pci'
PCI addresses have the following additional attributes:
domain (a 2-byte hex integer, not currently used by qemu)
bus (a hex value between 0 and 0xff, inclusive)
slot (a hex value between 0x0 and 0x1f, inclusive)
function (a value between 0 and 7, inclusive)
multifunction controls turning on the multifunction bit for a particular slot/function
in the PCI control register. By default it is set to 'off', but should be set to 'on' for function 0
of a slot that will have multiple functions used.

type='drive'
Drive addresses have the following additional attributes:
controller (a 2-digit controller number)
bus (a 2-digit bus number)
target (a 2-digit target number)
unit (a 2-digit unit number on the bus)

type='virtio-serial'
Each virtio-serial address has the following additional attributes:
controller (a 2-digit controller number)
bus (a 2-digit bus number)
slot (a 2-digit slot within the bus)

type='ccid'
A CCID address, for smart-cards, has the following additional attributes:
bus (a 2-digit bus number)
slot (a 2-digit slot within the bus)

type='usb'
USB addresses have the following additional attributes:
bus (a hex value between 0 and 0xfff, inclusive)
port (a dotted notation of up to four octets, such as 1.2 or 2.1.3.1)

type='isa'
ISA addresses have the following additional attributes:
iobase
irq
18.5. Random number generator device
virtio-rng is a virtual hardware random number generator device that can provide the guest with fresh
entropy upon request. The driver feeds the data back to the guest virtual machine's OS.
On the host physical machine, the hardware RNG interface creates a chardev at /dev/hwrng, which
can be opened and then read to fetch entropy from the host physical machine. Coupled with the rngd
daemon, the entropy from the host physical machine can be routed to the guest virtual machine's
/dev/random, which is the primary source of randomness.
Using a random number generator is particularly useful when devices such as a keyboard, mouse,
and other inputs are not enough to generate sufficient entropy on the guest virtual machine. The
virtual random number generator device allows the host physical machine to pass through entropy
to guest virtual machine operating systems. This procedure can be done either via the command line
or via virt-manager. For virt-manager instructions refer to Procedure 18.10, “Implementing virtio-rng via
Virtualization Manager” and for command line instructions, refer to Procedure 18.11, “Implementing
virtio-rng via command line tools”.
Procedure 18.10. Implementing virtio-rng via Virtualization Manager
1. Shut down the guest virtual machine.
2. Select the guest virtual machine and from the Edit menu, select Virtual Machine Details, to
open the Details window for the specified guest virtual machine.
3. Click the Add Hardware button.
4. In the Add New Virtual Hardware window, select RNG to open the Random Number
Generator window.
Figure 18.19. Random Number Generator window
Enter the desired parameters and click Finish when done. The parameters are explained in
virtio-rng elements.
Procedure 18.11. Implementing virtio-rng via command line tools
1. Shut down the guest virtual machine.
2. Using the virsh edit domain-name command, open the XML file for the desired guest virtual
machine.
3. Edit the <devices> element to include the following:
<devices>
  ...
  <rng model='virtio'>
    <backend model='random'>/dev/random</backend>
  </rng>
  ...
</devices>

Figure 18.20. Random number generator device
The random number generator device allows the following attributes/elements:
virtio-rng elements
model - The required model attribute specifies what type of RNG device is provided
('virtio').
<backend> - The <backend> element specifies the source of entropy to be used for the
domain. The source model is configured using the model attribute. Supported source
models include 'random', which uses /dev/random (default setting) or a similar device as source,
and 'egd', which sets an EGD protocol backend.
<backend model='random'> - This type expects a non-blocking character
device as input. Examples of such devices are /dev/random and /dev/urandom. The
file name is specified as contents of the <backend> element. When no file name is
specified the hypervisor default is used.
<backend model='egd'> - This backend connects to a source using the EGD protocol.
The source is specified as a character device. Refer to character device host physical
machine interface for more information.
18.6. Assigning GPU devices
Red Hat Enterprise Linux 7 supports PCI device assignment of NVIDIA K-Series Quadro (model 2000
series or higher), GRID, and Tesla as non-VGA graphics devices. Currently up to two GPUs may be
attached to the virtual machine in addition to one of the standard, emulated VGA interfaces. The
emulated VGA is used for pre-boot and installation and the NVIDIA GPU takes over when the NVIDIA
graphics drivers are loaded. Note that the NVIDIA Quadro 2000 is not supported, nor is the Quadro
K420 card.
This procedure will, in short, identify the device from lspci, detach it from the host physical machine, and
then attach it to the guest virtual machine.
1. Enable IOMMU support in the host physical machine kernel
For an Intel VT-d system this is done by adding the intel_iommu=pt parameter to the kernel
command line. For an AMD-Vi system, the option is amd_iommu=pt. To enable this option
you will need to edit or add the GRUB_CMDLINE_LINUX line in the /etc/sysconfig/grub
configuration file as follows:
GRUB_CMDLINE_LINUX="rd.lvm.lv=vg_VolGroup00/LogVol01
vconsole.font=latarcyrheb-sun16 rd.lvm.lv=vg_VolGroup_1/root
vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ] &&
/usr/sbin/rhcrashkernel-param || :) rhgb quiet intel_iommu=pt"
Note
For further information on IOMMU, see Appendix D, Working with IOMMU Groups.
2. Regenerate the bootloader configuration
Regenerate the bootloader configuration using grub2-mkconfig to include this option, by
running the following command:
# grub2-mkconfig -o /etc/grub2.cfg
Note that if you are using a UEFI-based host, the target file will be /etc/grub2-efi.cfg.
3. Reboot the host physical machine
In order for this option to take effect, reboot the host physical machine with the following
command:
# reboot
Procedure 18.12. Excluding the GPU device from binding to the host physical machine
driver
For GPU assignment it is recommended to exclude the device from binding to host drivers, as these
drivers often do not support dynamic unbinding of the device.
1. Identify the PCI bus address
To identify the PCI bus address and IDs of the device, run the following lspci command. In
this example, a VGA controller such as a Quadro or GRID card is used as follows:
# lspci -Dnn | grep VGA
0000:02:00.0 VGA compatible controller [0300]: NVIDIA Corporation
GK106GL [Quadro K4000] [10de:11fa] (rev a1)
The resulting search reveals that the PCI bus address of this device is 0000:02:00.0 and the
PCI IDs for the device are 10de:11fa.
2. Prevent the native host physical machine driver from using the GPU device
To prevent the native host physical machine driver from using the GPU device you can use a
PCI ID with the pci-stub driver. To do this, append the following additional option to the
GRUB_CMDLINE_LINUX line in the /etc/sysconfig/grub configuration file as follows:
pci-stub.ids=10de:11fa
To add additional PCI IDs for pci-stub, separate them with a comma.
3. Regenerate the bootloader configuration
Regenerate the bootloader configuration using grub2-mkconfig to include this option, by
running the following command:
# grub2-mkconfig -o /etc/grub2.cfg
Note that if you are using a UEFI-based host, the target file will be /etc/grub2-efi.cfg.
4. Reboot the host physical machine
In order for this option to take effect, reboot the host physical machine with the following
command:
# reboot
The virsh commands can be used to further evaluate the device; however, in order to use virsh with
the devices you need to convert the PCI bus address to libvirt-compatible format by prepending pci_
and converting delimiters to underscores. In this example the libvirt address of PCI device
0000:02:00.0 becomes pci_0000_02_00_0. The nodedev-dumpxml option provides additional
information for the device as shown:
# virsh nodedev-dumpxml pci_0000_02_00_0
<device>
  <name>pci_0000_02_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0</path>
  <parent>pci_0000_00_03_0</parent>
  <driver>
    <name>pci-stub</name>
  </driver>
  <capability type='pci'>
    <domain>0</domain>
    <bus>2</bus>
    <slot>0</slot>
    <function>0</function>
    <product>GK106GL [Quadro K4000]</product>
    <vendor>NVIDIA Corporation</vendor>
    <iommuGroup>
      <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
      <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
    </iommuGroup>
  </capability>
</device>
Figure 18.21. XML file adaptation for GPU - Example
Particularly important in this output is the <iommuGroup> element. The iommuGroup indicates the
set of devices which are considered isolated from other devices due to IOMMU capabilities and PCI
bus topologies. All of the endpoint devices within the iommuGroup (i.e. devices that are not PCIe root
ports, bridges, or switch ports) need to be unbound from the native host drivers in order to be
assigned to a guest. In the example above, the group is composed of the GPU device (0000:02:00.0)
as well as the companion audio device (0000:02:00.1). For more information, refer to Appendix D,
Working with IOMMU Groups.
Note
Assignment of NVIDIA audio functions is not supported due to hardware issues with legacy
interrupt support. In order to assign the GPU to a guest, the audio function must first be
detached from native host drivers. This can either be done by using lspci to find the PCI IDs
for the device and appending them to the pci-stub.ids option, or dynamically using the
nodedev-detach option of virsh. For example:
# virsh nodedev-detach pci_0000_02_00_1
Device pci_0000_02_00_1 detached
The GPU audio function is generally not useful without the GPU itself, so it is generally recommended
to use the pci-stub.ids option instead.
The GPU can be attached to the VM using virt-manager or using virsh, either by directly editing the
VM XML (virsh edit [domain]) or attaching the GPU to the domain with virsh attach-device.
If you are using the virsh attach-device command, an XML fragment first needs to be
created for the device, such as the following:
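A sketch of such a fragment, using the PCI bus address 0000:02:00.0 identified earlier in this procedure, could look like:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
  </source>
</hostdev>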
Figure 18.22. XML file for attaching GPU - Example
Save this to a file and run virsh attach-device [domain] [file] --persistent to
include the XML in the VM configuration. Note that the assigned GPU is added in addition to the
existing emulated graphics device in the guest virtual machine. The assigned GPU is handled as a
secondary graphics device in the VM. Assignment as a primary graphics device is not supported and
emulated graphics devices in the VM's XML should not be removed.
Note
When using an assigned NVIDIA GPU in the guest, only the NVIDIA drivers are supported.
Other drivers may not work and may generate errors. For a Red Hat Enterprise Linux 7 guest,
the nouveau driver can be blacklisted using the option modprobe.blacklist=nouveau on
the kernel command line during install. For information on other guest virtual machines refer to
the operating system's specific documentation.
When configuring Xorg for use with an assigned GPU in a KVM guest, the BusID option must be
added to xorg.conf to specify the guest address of the GPU. For example, within the guest determine
the PCI bus address of the GPU (this will be different than the host address):
# lspci | grep VGA
00:02.0 VGA compatible controller: Device 1234:1111
00:09.0 VGA compatible controller: NVIDIA Corporation GK106GL [Quadro
K4000] (rev a1)
In this example the address is 00:09.0. The file /etc/X11/xorg.conf is then modified to add the
BusID entry shown below.
Section "Device"
Identifier
Driver
VendorName
BusID
EndSection
"Device0"
"nvidia"
"NVIDIA Corporation"
"PCI:0:9:0"
Depending on the guest operating system, with the NVIDIA drivers loaded, the guest may support
using both the emulated graphics and assigned graphics simultaneously or may disable the
emulated graphics. Note that access to the assigned graphics framebuffer is not provided by tools
such as virt-manager. If the assigned GPU is not connected to a physical display, guest-based
remoting solutions may be necessary to access the GPU desktop. As with all PCI device assignment,
migration of a guest with an assigned GPU is not supported and each GPU is owned exclusively by
a single guest. Depending on the guest operating system, hotplug support of GPUs may be
available.
Chapter 19. SR-IOV
Developed by the PCI-SIG (PCI Special Interest Group), the Single Root I/O Virtualization (SR-IOV)
specification is a standard for a type of PCI device assignment that can share a single device with
multiple virtual machines. SR-IOV improves device performance for virtual machines.
Note
Virtual machines that use the Xeon E3-1200 series chipset do not support SR-IOV. More
information can be found on Intel's website or in this article.
Figure 19.1. How SR-IOV works
SR-IOV enables a Single Root Function (for example, a single Ethernet port), to appear as multiple,
separate, physical devices. A physical device with SR-IOV capabilities can be configured to appear
in the PCI configuration space as multiple functions. Each device has its own configuration space
complete with Base Address Registers (BARs).
SR-IOV uses two PCI functions:
Physical Functions (PFs) are full PCIe devices that include the SR-IOV capabilities. Physical
Functions are discovered, managed, and configured as normal PCI devices. Physical Functions
configure and manage the SR-IOV functionality by assigning Virtual Functions.
Virtual Functions (VFs) are simple PCIe functions that only process I/O. Each Virtual Function is
derived from a Physical Function. The number of Virtual Functions a device may have is limited
by the device hardware. A single Ethernet port, the Physical Device, may map to many Virtual
Functions that can be shared to virtual machines.
The hypervisor can map one or more Virtual Functions to a virtual machine. The Virtual Function's
configuration space is then mapped to the configuration space presented to the guest.
Each Virtual Function can only be mapped to a single guest at a time, as Virtual Functions require
real hardware resources. A virtual machine can have multiple Virtual Functions. A Virtual Function
appears as a network card in the same way as a normal network card would appear to an operating
system.
The SR-IOV drivers are implemented in the kernel. The core implementation is contained in the PCI
subsystem, but there must also be driver support for both the Physical Function (PF) and Virtual
Function (VF) devices. An SR-IOV capable device can allocate VFs from a PF. The VFs appear as
PCI devices which are backed on the physical PCI device by resources such as queues and register
sets.
19.1. Advantages of SR-IOV
SR-IOV devices can share a single physical port with multiple virtual machines.
Virtual Functions have near-native performance and provide better performance than paravirtualized drivers and emulated access. Virtual Functions provide data protection between virtual
machines on the same physical server as the data is managed and controlled by the hardware.
These features allow for increased virtual machine density on hosts within a data center.
SR-IOV is better able to utilize the bandwidth of devices with multiple guests.
19.2. Using SR-IOV
This section covers the use of PCI passthrough to assign a Virtual Function of an SR-IOV capable
multiport network card to a virtual machine as a network device.
SR-IOV Virtual Functions (VFs) can be assigned to virtual machines by adding a device entry in
<hostdev> with the virsh edit or virsh attach-device command. However, this can be
problematic because unlike a regular network device, an SR-IOV VF network device does not have a
permanent unique MAC address, and is assigned a new MAC address each time the host is rebooted.
Because of this, even if the guest is assigned the same VF after a reboot, when the host is rebooted
the guest determines its network adapter to have a new MAC address. As a result, the guest believes
there is new hardware connected each time, and will usually require re-configuration of the guest's
network settings.
libvirt-0.9.10 and newer contains the <interface type='hostdev'> interface device. Using this
interface device, libvirt will first perform any network-specific hardware/switch initialization indicated
(such as setting the MAC address, VLAN tag, or 802.1Qbh virtualport parameters), then perform the
PCI device assignment to the guest.
Using the <interface type='hostdev'> interface device requires:
an SR-IOV-capable network card,
host hardware that supports either the Intel VT-d or the AMD IOMMU extensions, and
the PCI address of the VF to be assigned.
Important
Assignment of an SR-IOV device to a virtual machine requires that the host hardware supports
the Intel VT-d or the AMD IOMMU specification.
To attach an SR-IOV network device on an Intel or an AMD system, follow this procedure:
Procedure 19.1. Attach an SR-IOV network device on an Intel or AMD system
1. Enable Intel VT-d or the AMD IOMMU specifications in the BIOS and kernel
On an Intel system, enable Intel VT-d in the BIOS if it is not enabled already. Refer to
Procedure 18.1, “Preparing an Intel system for PCI device assignment” for procedural help on
enabling Intel VT-d in the BIOS and kernel.
Skip this step if Intel VT-d is already enabled and working.
On an AMD system, enable the AMD IOMMU specifications in the BIOS if they are not enabled
already. Refer to Procedure 18.2, “Preparing an AMD system for PCI device assignment” for
procedural help on enabling IOMMU in the BIOS.
2. Verify support
Verify if the PCI device with SR-IOV capabilities is detected. This example lists an Intel 82576
network interface card which supports SR-IOV. Use the lspci command to verify whether the
device was detected.
# lspci
03:00.0 Ethernet controller: Intel Corporation 82576 Gigabit
Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82576 Gigabit
Network Connection (rev 01)
Note that the output has been modified to remove all other devices.
3. Start the SR-IOV kernel modules
If the device is supported the driver kernel module should be loaded automatically by the
kernel. Optional parameters can be passed to the module using the modprobe command.
The Intel 82576 network interface card uses the igb driver kernel module.
# modprobe igb [