Comment 2 for bug 1365024

Matthew B (utdrmac) wrote :

This is a simple example. Extremely simple. I have a master on which I ran innobackupex to create a new slave, this one. pt-heartbeat was already running on the master. After starting the slave and starting replication, I wanted to check the delay. pt-heartbeat should have seen that there was only 1 row in the heartbeat table, compared it to Master_Server_Id from SHOW SLAVE STATUS, and said "bingo! There's my master. No need to ask the user for information I already know."

[mboehm@Master-DB ~]$ pt-heartbeat --monitor --database percona
The --master-server-id option must be specified because the heartbeat table `percona`.`heartbeat` uses the server_id column for --update or --check but the server's master could not be automatically determined.
Please read the DESCRIPTION section of the pt-heartbeat POD.

[mboehm@Master-DB ~]$ mysql -e "SELECT * FROM percona.heartbeat"
+----------------------------+-----------+---------------------+-----------+-----------------------+---------------------+
| ts                         | server_id | file                | position  | relay_master_log_file | exec_master_log_pos |
+----------------------------+-----------+---------------------+-----------+-----------------------+---------------------+
| 2014-09-16T17:13:30.000640 |      1316 | mysql-binlog.001260 | 111425973 | NULL                  |                NULL |
+----------------------------+-----------+---------------------+-----------+-----------------------+---------------------+

I just don't see how the "server's master could not be automatically determined" when SHOW SLAVE STATUS clearly reports the master's ID in Master_Server_Id. The program should abort with an error only when the heartbeat table has multiple rows, or when its single row does not match the Master_Server_Id from SHOW SLAVE STATUS.
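The rule described above could be sketched roughly as follows. This is only an illustration in Python, not pt-heartbeat's actual (Perl) code: the function name, the exception class, and the idea of passing in already-fetched values are all assumptions made for the example.

```python
from typing import Optional, Sequence


class MasterNotDeterminedError(Exception):
    """Hypothetical error for when the master cannot be picked automatically."""


def resolve_master_server_id(
    heartbeat_server_ids: Sequence[int],
    slave_status_master_id: Optional[int],
) -> int:
    """Pick the master's server_id without asking the user, when it is unambiguous.

    heartbeat_server_ids: the server_id values found in the heartbeat table.
    slave_status_master_id: Master_Server_Id from SHOW SLAVE STATUS,
        or None if the server is not replicating from anything.
    """
    if slave_status_master_id is None:
        raise MasterNotDeterminedError("server has no master in SHOW SLAVE STATUS")

    # Exactly one heartbeat row, and it belongs to our direct master:
    # nothing left to ask the user.
    if (
        len(heartbeat_server_ids) == 1
        and heartbeat_server_ids[0] == slave_status_master_id
    ):
        return slave_status_master_id

    # Multiple rows, or a single row from some other server: abort.
    raise MasterNotDeterminedError(
        "heartbeat rows %s do not uniquely match Master_Server_Id %s"
        % (list(heartbeat_server_ids), slave_status_master_id)
    )
```

With the table shown above (one row, server_id 1316) and a Master_Server_Id of 1316, this sketch returns 1316; any other combination raises the error.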

In your example, why would you run pt-heartbeat in update mode on a slave? Slaves are supposed to be read-only, and running pt-heartbeat with --update changes data on the slave. Running in --check or --monitor mode should not write anything to the table. So I don't understand how your heartbeat table got a row with the slave's server_id.

Yes, if you have master->slaveA->slaveB, you might want to run pt-heartbeat in update mode on both master and slaveA. But the behavior of pt-heartbeat should still be pretty automatic. If you run --check/--monitor on slaveA, that's the simple case: get the master's ID from SHOW SLAVE STATUS and compare it against the table. If you run --check/--monitor on slaveB, it shouldn't say "cannot determine master id"; it should say something more appropriate, like "your direct master is slaveA, but slaveA also has a master. Please indicate which one you would like to monitor against."

Or heck, just have it output a column for each master if run on slaveB. *shrug*
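As a rough illustration of the "one column per master" idea, the lag could be computed for every row in the heartbeat table rather than for a single server_id. This is a sketch in Python under stated assumptions, not how pt-heartbeat works internally: the function name is made up, the rows are passed in as plain tuples, the ts format is taken from the table output above, and the clocks on all servers are assumed to be in sync.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Timestamp format as it appears in the heartbeat table output above.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def lag_per_master(
    heartbeat_rows: Iterable[Tuple[int, str]],
    now: datetime,
) -> Dict[int, float]:
    """Return {server_id: lag in seconds} for every heartbeat row.

    heartbeat_rows: (server_id, ts) pairs read from the heartbeat table.
    now: current time on the server running the check (assumed to be
        synchronized with the servers that write the heartbeats).
    """
    lags = {}
    for server_id, ts in heartbeat_rows:
        last_beat = datetime.strptime(ts, TS_FORMAT)
        lags[server_id] = (now - last_beat).total_seconds()
    return lags


if __name__ == "__main__":
    # 1316 is the master from the output above; 1317 is a hypothetical
    # slaveA that also runs pt-heartbeat in update mode.
    rows = [
        (1316, "2014-09-16T17:13:30.000640"),
        (1317, "2014-09-16T17:13:31.000640"),
    ]
    now = datetime(2014, 9, 16, 17, 13, 32, 640)
    for server_id, lag in sorted(lag_per_master(rows, now).items()):
        print("server_id %d: %.2fs behind" % (server_id, lag))
```

Run on slaveB, a report like this would show the delay relative to both the top-level master and the direct master without requiring --master-server-id.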