Slave I/O thread won't attempt to automatically reconnect to the master / error-code 1159

Bug #1268729 reported by Agustín G
This bug affects 2 people
Affects                    Status        Importance  Assigned to  Milestone
MySQL Server               Unknown       Unknown
Percona Server (moved to https://jira.percona.com/projects/PS)
                           Fix Released  Medium      Vlad Lesin
  5.1                      Won't Fix     Undecided   Vlad Lesin
  5.5                      Fix Released  Medium      Vlad Lesin
  5.6                      Fix Released  Medium      Vlad Lesin

Bug Description

140111 17:06:04 [Note] Slave I/O thread: connected to master 'name@host:port',replication started in log 'mysql-bin.015318' at position 887067847
140111 18:57:56 [ERROR] Slave I/O: The slave I/O thread stops because a fatal error is encountered when it try to get the value of SERVER_ID variable from master. Error: , Error_code: 1159

The error in question seems to be a network error:

$ perror 1159
MySQL error code 1159 (ER_NET_READ_INTERRUPTED): Got timeout reading communication packets

Percona Server 5.5.29-29.4 is running on the affected slave. Looking at the source code for that release, the problematic code path appears to be:

1364   if (check_io_slave_killed(mi->io_thd, mi, NULL))
1365     goto slave_killed_err;
1366   else if (is_network_error(mysql_errno(mysql)))
1367   {
1368     mi->report(WARNING_LEVEL, mysql_errno(mysql),
1369                "Get master SERVER_ID failed with error: %s", mysql_error(mysql));
1370     goto network_err;
1371   }
1372   /* Fatal error */
1373   errmsg= "The slave I/O thread stops because a fatal error is encountered \
1374 when it try to get the value of SERVER_ID variable from master.";
1375   err_code= mysql_errno(mysql);
1376   sprintf(err_buff, "%s Error: %s", errmsg, mysql_error(mysql));
1377   goto err;

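The call to is_network_error on line 1366 determines whether the slave I/O thread reconnects automatically (goto network_err) or stops (goto err). However, is_network_error does not return true for ER_NET_READ_INTERRUPTED, so error 1159 falls through to the fatal-error path and the thread stops.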

The following patch would fix the issue:

--- a/Percona-Server/sql/slave.cc
+++ b/Percona-Server/sql/slave.cc
@@ -1176,7 +1176,8 @@ bool is_network_error(uint errorno)
       errorno == CR_SERVER_GONE_ERROR ||
       errorno == CR_SERVER_LOST ||
       errorno == ER_CON_COUNT_ERROR ||
-      errorno == ER_SERVER_SHUTDOWN)
+      errorno == ER_SERVER_SHUTDOWN ||
+      errorno == ER_NET_READ_INTERRUPTED)
     return TRUE;
 
   return FALSE;

Agustín G (guriandoro)
summary: Slave I/O thread won't attempt to automatically reconnect to the master
+ / error-code 1159
tags: added: upstream
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-1470
