Scaling Network Performance for Large Clouds

Registered by dongfeng

Proposed by Ji Xiaofeng, Tina TSOU, Dong Feng Huawei

Description
This session focuses on how to improve networking performance in large-scale deployments.
For example:
- many VMs, thousands to tens of thousands, in a single data center
- very heavy traffic between VMs on different physical servers
- a large number of OpenFlow flow tables, causing slow forwarding on OVS and high CPU usage on the hypervisor
- VMs belonging to various tenants, which requires traffic isolation and security, and a lot of configuration on OVS, mainly overlay encapsulation and OpenFlow tables
- the neutron server taking too long to process requests

We are introducing a solution designed for the scenarios above.
The main idea is to deploy a new monitor agent on the hypervisor. The agent periodically checks CPU usage and the network load on the NIC, and reports them to the SDN controller through a plugin/API extension. If the OVS load becomes very high, the SDN controller can reactively offload traffic from OVS to the TOR switch with minimal interruption. In other words, overlay encapsulation may initially be done on OVS, but some feature-rich TORs also provide this function, which lets the TOR take over whenever necessary. The same strategy will be applied to the OpenFlow flow tables. Once traffic is offloaded, OVS has nothing to do other than send it to the TOR; the time-consuming work is taken over by the TOR dynamically. This more advanced strategy does require the TOR to be feature-rich, so it might increase TCO.
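
The monitoring step described above could be sketched roughly as follows. This is only an illustration, not part of the proposal: the `LoadSample` structure, the threshold values, the "three consecutive samples" rule, and the report fields are all assumptions, since the blueprint does not define how load is measured or what is sent to the controller.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class LoadSample:
    """One periodic reading taken on a hypervisor (hypothetical structure)."""
    host: str
    ovs_cpu_percent: float   # CPU used by the OVS processes, 0-100
    nic_utilization: float   # fraction of NIC line rate in use, 0.0-1.0


# Illustrative thresholds only; the blueprint does not specify any values.
CPU_HIGH = 80.0
NIC_HIGH = 0.9
CONSECUTIVE_SAMPLES = 3


def should_offload(samples: Iterable[LoadSample]) -> bool:
    """Return True when the most recent CONSECUTIVE_SAMPLES readings are all
    above either threshold, so that a single spike does not trigger a move."""
    recent: List[LoadSample] = list(samples)[-CONSECUTIVE_SAMPLES:]
    if len(recent) < CONSECUTIVE_SAMPLES:
        return False
    return all(
        s.ovs_cpu_percent >= CPU_HIGH or s.nic_utilization >= NIC_HIGH
        for s in recent
    )


def build_report(samples: List[LoadSample]) -> dict:
    """Build the message a monitor agent might send to the SDN controller.

    The field names are hypothetical; a real agent would use whatever
    plugin/API extension the spec ends up defining.
    """
    latest = samples[-1]
    return {
        "host": latest.host,
        "ovs_cpu_percent": latest.ovs_cpu_percent,
        "nic_utilization": latest.nic_utilization,
        "offload_to_tor": should_offload(samples),
    }
```

Requiring several consecutive high readings is one simple way to avoid moving traffic back and forth on short bursts; a real design would also need a rule for returning traffic to OVS once the load drops.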

We believe this is worth doing for large scale deployment.

Blueprint information

Status: Complete
Approver: None
Priority: Undefined
Drafter: dongfeng
Direction: Needs approval
Assignee: dongfeng
Definition: Obsolete
Series goal: None
Implementation: Unknown
Milestone target: None
Completed by: Armando Migliaccio

Whiteboard

Nov-13-2015(armax): If someone is interested in pursuing it, this must be re-submitted according to guidelines defined in [1].

[1] http://docs.openstack.org/developer/neutron/policies/blueprints.html

-----------------

Note from Salvatore Orlando (2014-05-06):

Hi, please submit your spec to the neutron-specs repository.
Also, at first glance it looks like you're proposing a different solution for managing both the control and data planes. In this case your spec should also mention the following:
- HW/SW requirements (for instance which transport technologies will be supported, and whether there are requirements on TOR switches)
- Positioning within current agent framework (extension, replacement)
- Development model (ml2 driver, new plugin)
- Required API extensions and RPC interfaces, if any.

