Support large datasets being passed

Registered by Moshe Elisha

In many cases, the current limit on the amount of data passed in and out of tasks is insufficient.
This limit stems mainly from the DB schema, which defines column types that are too small for some columns.

For example, listing and publishing Heat stack resources for even a small stack can result in "Data too long for column 'published'".
Trying to update a large Heat stack using heat.stacks_update can easily result in "Data too long for column 'input'".

The new size limitation will be configurable via the mistral.conf file and will be enforced by the application.
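As a minimal sketch of what application-side enforcement could look like, assuming a hypothetical option name and exception class (the actual names used in Mistral may differ):

```python
import json

KB = 1024

# Hypothetical option; the real setting read from mistral.conf may differ.
MAX_FIELD_SIZE_KB = 1024


class SizeLimitExceededException(Exception):
    """Raised when a task field exceeds the configured size limit."""


def validate_field_size(field_name, value, limit_kb=MAX_FIELD_SIZE_KB):
    """Reject a value whose JSON-serialized form exceeds the limit."""
    size_kb = len(json.dumps(value)) // KB
    if size_kb > limit_kb:
        raise SizeLimitExceededException(
            "Size of '%s' is %d KB which exceeds the limit of %d KB"
            % (field_name, size_kb, limit_kb))
```

A check like this would run before writing a field such as 'input' or 'published' to the DB, so the operator-configured limit is enforced regardless of the column type.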

In order to support a configurable limit, we will change the type of the following columns from TEXT (up to 64 KB) to LONGTEXT (up to 4 GB):

runtime_context
input
params
context
action_spec
published
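A rough sketch of the MySQL statements such a migration would emit, assuming all six columns live in one task table (the table-to-column mapping below is illustrative, not the actual Mistral schema):

```python
# Generate the ALTER statements a MySQL migration would need to widen
# the columns from TEXT to LONGTEXT. The table name is illustrative.
COLUMNS = {
    "tasks_v2": ["runtime_context", "input", "params",
                 "context", "action_spec", "published"],
}


def longtext_migration_sql(columns=COLUMNS):
    """Return one ALTER TABLE ... MODIFY ... LONGTEXT statement per column."""
    return [
        "ALTER TABLE %s MODIFY %s LONGTEXT;" % (table, col)
        for table, cols in columns.items()
        for col in cols
    ]
```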

Simply changing the DB column sizes is not enough - it causes Mistral to throw "Segmentation fault (core dumped)" on start.

Blueprint information

Status:
Complete
Approver:
Renat Akhmerov
Priority:
High
Drafter:
Moshe Elisha
Direction:
Approved
Assignee:
Moshe Elisha
Definition:
Approved
Series goal:
Accepted for liberty
Implementation:
Implemented
Milestone target:
1.0.0
Started by
Nikolay Makhotkin
Completed by
Renat Akhmerov

Whiteboard

1. After increasing the DB column sizes, the next obstacle will be the RPC response timeout:

Failure caused by error in task 'generate_data': Timed out waiting for a reply to message ID d158e22a1e0a4e58ae20a7640b0f8215

Edit mistral.conf and increase the value of "[DEFAULT]/rpc_response_timeout".
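For example (the value is illustrative; tune it to your payload sizes):

```ini
[DEFAULT]
# Default is 60 seconds; large payloads may need considerably more.
rpc_response_timeout = 600
```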

2. Another obstacle is the socket timeout of the SQL UPDATE command against the DB:

(pymysql.err.OperationalError) (2006, "MySQL server has gone away (error(32, 'Broken pipe'))") [SQL: u'UPDATE executions_v2 SET updated_at=%s, state=%s, accepted=%s, output=%s WHERE executions_v2.id = %s'] [parameters: (datetime.datetime(2015, 7, 21, 14, 24, 57, 880505), 'SUCCESS', 1, '{"result": "...

MySQL client libraries usually offer a socket timeout option on the client side, but this option does not seem to be exposed by SQLAlchemy.
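A possible server-side workaround is to raise MySQL's own limits, since "MySQL server has gone away" with a broken pipe on a large UPDATE is often the server dropping the connection when a statement exceeds max_allowed_packet, or when the connection idles past wait_timeout. A my.cnf fragment with illustrative values:

```ini
[mysqld]
# Allow larger single statements; defaults are often in the 4M-64M range.
max_allowed_packet = 256M
# Keep idle connections alive longer.
wait_timeout = 3600
```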

Gerrit topic: https://review.openstack.org/#q,topic:bp/support-large-datasets,n,z

Addressed by: https://review.openstack.org/205190
    Support large datasets for execution objects

