nova improvement of maximum attach volumes more than 26 vols

Registered by Tsuyoshi Nagata on 2018-05-10

In the current implementation, a nova instance can handle only 26 volumes (vda-vdz).
This blueprint raises that limit.

# create an instance with a single volume, named sles15rc
# openstack server add volume sles15rc vol2
# openstack server add volume sles15rc vol3
     :
# openstack server add volume sles15rc vol26
# openstack server add volume sles15rc vol27
Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'NovaException_Remote'> (HTTP 500) (Request-ID: req-d95fea94-31fe-4063-9262-a84088cbaf29)
#

There are 2 problems in nova's implementation.
CAUSE:

 1) The device limit is hard-coded to 26.
      get_dev_count_for_disk_bus() always returns 26.
      The magic number 26 is the size of the alphabet range ('a'-'z').

 2) nova:find_disk_dev_for_disk_bus() can handle only device names with a
     single-character suffix ('a'-'z'), like
          vda .. vdz
          sda .. sdz

TODO:

 1) Increase the number of volumes that can be attached per instance beyond 26.
         26 -> ???

 2) Fix nova so that it can handle device names with longer suffixes,
      not only the first 26 volumes, like
          vda(index=0), sdb(index=1),
         vdaa(index=26), sdabc(index=730), vdfxshrxx(index=INT_MAX)
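
The index-to-name mapping in TODO 2 can be sketched as follows. This is only an illustration, not nova's code: disk_dev_name() is a hypothetical helper, and it assumes the suffix follows the usual spreadsheet-column scheme (a..z, aa..az, ba..), which matches the examples listed above.

```python
import string


def disk_dev_name(prefix: str, index: int) -> str:
    """Hypothetical helper: map a 0-based disk index to a device name.

    index 0 -> '<prefix>a', index 25 -> '<prefix>z', index 26 -> '<prefix>aa'.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = string.ascii_lowercase
    suffix = ""
    n = index + 1  # work 1-based: the suffix is a bijective base-26 number
    while n > 0:
        n, rem = divmod(n - 1, 26)
        suffix = letters[rem] + suffix
    return prefix + suffix


if __name__ == "__main__":
    for prefix, index in [("vd", 0), ("sd", 1), ("vd", 26), ("sd", 730)]:
        print(index, disk_dev_name(prefix, index))
```

With this scheme, INT_MAX (2**31 - 1) maps to the 7-letter suffix shown in the last example above.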

ML:
The openstack-dev mailing list has started to discuss this topic.
 http://lists.openstack.org/pipermail/openstack-dev/2018-June/131289.html

Blueprint information

Status:
Started
Approver:
melanie witt
Priority:
Low
Drafter:
Tsuyoshi Nagata
Direction:
Approved
Assignee:
melanie witt
Definition:
Approved
Series goal:
Accepted for stein
Implementation:
Started
Milestone target:
None
Started by
Tsuyoshi Nagata on 2018-05-17

Whiteboard

Gerrit topic: https://review.openstack.org/#/q/topic:bp/nova-improvement-of-maximum-attach-volumes-more-than-26-vols

Addressed by: https://review.openstack.org/567472
    [nova] increasing the number of allowed volumes attached per instance > 26

Gerrit topic: https://review.openstack.org/#q,topic:bug/1770527,n,z

Addressed by: https://review.openstack.org/573066
    Fix nova can handle device name length more widely for universally.

Other information & Limitation:
(jichenjc)
I googled and found this:

https://elixir.bootlin.com/linux/v4.4/source/drivers/scsi/sd.c#L547

https://www.reddit.com/r/linuxquestions/comments/4cifq3/max_number_of_devices_in_linux/

(Tsuyoshi Nagata)
I will explain this in terms of the KVM hypervisor implementation.

vd is the virtio_blk device, which is assigned on the PCI bus.
  PCI addressing is as follows:

(complex:32):(bus-no:256):(id:32)

Each PCI bus has 32 devices.
The maximum number of PCI buses is 256.
A complex corresponds to a CPU socket.
Currently, Intel's PCI complexes are connected via QPI.
The QPI CPU connection limit is currently 32.
So the current maximum number of virtio_blk devices is

  32 x 256 x 32 = 262144

get_dev_count_for_disk_bus=2147483647 -> 262144
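
The arithmetic above can be written out as a small sketch. The three constants are taken from this comment's addressing model (complex:32, bus-no:256, id:32); the function name is made up for illustration, and the result is a theoretical address-space bound, not a tested limit.

```python
# Values from the addressing model described above (not measured limits).
PCI_COMPLEXES = 32      # "complex" field
BUSES_PER_COMPLEX = 256  # "bus-no" field
DEVICES_PER_BUS = 32     # "id" field


def max_virtio_blk_devices(complexes: int = PCI_COMPLEXES,
                           buses: int = BUSES_PER_COMPLEX,
                           devices: int = DEVICES_PER_BUS) -> int:
    """Hypothetical helper: size of the PCI address space in this model."""
    return complexes * buses * devices


if __name__ == "__main__":
    print(max_virtio_blk_devices())
```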

*NOTE*
    I have already tested that a live kvm instance can handle only 32 virtio devices,
  but a shelved (stopped) instance can handle more than 32 virtio devices.
  (Running '$ openstack server stop sles15rc' and then
   '$ openstack server add volume sles15rc vol32' hot-adds a new pci bus and the new volume (/dev/vdag).)

(melanie witt)
So it seems we can follow that?
I was thinking no larger than 1024, for example.
[1] https://rwmj.wordpress.com/2017/04/25/how-many-disks-can-you-add-to-a-virtual-linux-machine/

get_dev_count_for_disk_bus=262144 -> 1024

(Tsuyoshi Nagata)
I tested attaching volumes to my instances. With more than 256 volumes, provisioning an instance takes too long to be useful.
I decided to make this number smaller.

get_dev_count_for_disk_bus=1024 -> 128

(melanie witt)
Note bug https://bugs.launchpad.net/nova/+bug/1621138 where powervm was trying to boot from volume and wanted to do 128 volumes but failed at ~80 because of the build_requests.block_device_mappings table column size. So would a MEDIUMTEXT column be able to store up to 1024 serialized bdm objects?

What is the practical application of this? We shouldn't make changes just because we can.

(Tsuyoshi Nagata)
My application is automated testing of ceph on openstack. A ceph-osd node has many volumes.
If each instance can handle many volumes, I can provision SDS on openstack
(without buying real disks).

N = 64
   It seems OK.
N = 100
  My kvm (SOC7) environment seems OK.
N = 200
  My kvm (SOC7) environment seems OK.
  Boot time gets longer (over 6 min).
N = 256
  My kvm (SOC7) environment seems OK.
  Boot time gets even longer (over 13 min).
N = 512
  My kvm (SOC7) environment seems NG.
  VNC shows a black display. An instance with 512 volumes never boots up (waited over 1 day).

(sahid)
virtio-blk uses one PCI slot for each disk. Since there is a limit of 32 slots per machine (for Q35 it's different) and some of them are already used in our default guest configuration for networking devices, the memory balloon device, the USB controller... my thinking is that we should probably keep the limit of 26.
For virtio-scsi it's different: we have one controller that uses one PCI slot. Nova currently does not support more than one virtio-scsi controller. One controller can have 256 targets, and each target supports 1 to 2^14 logical units (LUNs). So it's about 4194304, which does not really make sense. Limiting it to 128 or perhaps the number of targets seems good.
For the native QEMU scsi implementation I think a target can only handle one LUN, meaning 256 devices.
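
The per-bus figures in this comment can be checked with a short sketch. The helper names are made up for illustration, and the number of PCI slots already taken by default guest devices (6 below) is an assumed value, not something stated here.

```python
def addressable_disks(controllers: int, targets: int, luns_per_target: int) -> int:
    """Hypothetical helper: disks addressable through SCSI controllers."""
    return controllers * targets * luns_per_target


def free_virtio_blk_slots(total_slots: int = 32, reserved_slots: int = 0) -> int:
    """Hypothetical helper: PCI slots left for virtio-blk disks (one slot each)."""
    return max(total_slots - reserved_slots, 0)


# virtio-scsi: one controller, 256 targets, up to 2^14 LUNs per target.
virtio_scsi = addressable_disks(1, 256, 2**14)
# Native QEMU scsi: one LUN per target.
qemu_scsi = addressable_disks(1, 256, 1)
# virtio-blk: 32 slots minus an ASSUMED 6 slots used by NICs, balloon, USB, etc.
virtio_blk = free_virtio_blk_slots(32, reserved_slots=6)

if __name__ == "__main__":
    print(virtio_scsi, qemu_scsi, virtio_blk)
```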

(Tsuyoshi Nagata)
I think the bus limit should be a limit based on the specification, not a convenient number.

I'll fix this with a "specification-based limit" in the next patch set!

Gerrit topic: https://review.openstack.org/#q,topic:bp/nova-improvement-of-maximum-attach-volumes-more-than-26-vols,n,z

virtio=4194304 (256 x 2^14)
(Zhenyu Zheng) this seems really unrealistic to me...
(Chen) Agree unless a source indicating this limit is provided.

virtio=1000
(Stephen Finucane)+2
(melanie witt)-1
(zhaixiaojun)+1

Addressed by: https://review.openstack.org/597306
    Propose configurable maximum number of volumes to attach

The spec for this was merged on 20180919, so this is approved for Stein. -- melwitt 20180924

I updated the name of this blueprint to conf-max-attach-volumes to match the name of the spec file because tooling requires it (moving implemented specs tooling, for example). -- melwitt 20181002

Gerrit topic: https://review.openstack.org/#q,topic:bp/conf-max-attach-volumes,n,z

Addressed by: https://review.openstack.org/616777
    WIP Add configuration of maximum disk devices to attach

Addressed by: https://review.openstack.org/624832
    Propagate exception message from _prep_block_device
