Forum: mysql.general
Replication still stopping...

22/10/2007, 18:55   #1
Jesse
Replication still stopping...

I tried posting this on the Replication list, and got no response. Maybe
someone here can ...

OK. Still battling this issue after weeks of working with it. I'm racking
my brains. I re-set the slave again on Saturday, and got replication
started again. It was working fine until this afternoon some time. Before
starting things up, I cleaned the error log out completely, so it would be
clean before I started. Here is my error log in total:


071020 14:43:51 InnoDB: Started; log sequence number 0 142497221
071020 14:43:51 [Note] C:\Program Files\MySQL\MySQL Server 5.0\bin\mysqld-nt: ready for connections.
Version: '5.0.45-community-nt' socket: '' port: 3306 MySQL Community Edition (GPL)
071020 14:43:51 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000006' at position 98, relay log 'C:\Program Files\MySQL\MySQL Server 5.0\Data\dlgsrv-relay-bin.000002' position: 235
071020 14:43:52 [Note] Slave I/O thread: connected to master 'Replication@webserver:3306', replication started in log 'mysql-bin.000006' at position 98
071020 15:43:32 [Note] Slave: received end packet from server, apparent master shutdown:
071020 15:43:32 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000006' position 98
071020 15:43:33 [ERROR] Slave I/O thread: error reconnecting to master 'Replication@webserver:3306': Error: 'Can't connect to MySQL server on 'webserver' (10061)' errno: 2003 retry-time: 60 retries: 86400
071020 15:45:56 [Note] Slave: connected to master 'Replication@webserver:3306',replication resumed in log 'mysql-bin.000006' at position 98
071021 15:02:21 [Note] Slave SQL thread exiting, replication stopped in log 'mysql-bin.000007' at position 195

I checked periodically on the server, and everything seemed to be working.
The last time I checked was this morning sometime around 8:00 or so. Still
running. As you can see, however, it just stopped processing at 15:02:21 this
afternoon.

The master server was not down. I was in and out of web sites that use the
MySQL database on the master several times, and it always worked just fine,
and never gave me an error. It almost appears as though the slave cannot
communicate with the master. It looks like it tried 86,400 times, which I
guess took almost a day to do, and just gave up. Why would it be able to
connect initially to the server, then suddenly not be able to connect any
more?
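
One way to sanity-check this reading of the log: the "retry-time: 60" and
"retries: 86400" fields in the [ERROR] line look like the configured retry
interval and retry limit rather than a count of attempts already made (that
interpretation is an assumption based on the wording of the message), and the
very next entry shows the slave reconnecting shortly afterwards. A minimal
Python sketch of the arithmetic, using only numbers and timestamps copied from
the log above; the helper name parse_log_time is hypothetical:

```python
from datetime import datetime, timedelta

LOG_TIME_FORMAT = "%y%m%d %H:%M:%S"  # e.g. "071020 15:43:33"

def parse_log_time(stamp: str) -> datetime:
    """Parse the YYMMDD HH:MM:SS timestamp used in the MySQL error log."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)

# Values copied from the [ERROR] line in the log above.
retry_time_seconds = 60
retries = 86400

# Assumption: these are configured limits. If every retry were used,
# this is how long the slave would keep trying before giving up.
retry_budget = timedelta(seconds=retry_time_seconds * retries)

# How long was the master actually unreachable according to the log?
lost = parse_log_time("071020 15:43:33")      # "error reconnecting to master"
resumed = parse_log_time("071020 15:45:56")   # "connected to master"
outage = resumed - lost

print(f"retry budget: {retry_budget.days} days")
print(f"outage: {int(outage.total_seconds())} seconds")
```

Under that assumption the slave would keep retrying for about 60 days, and the
logged outage lasted a little over two minutes. The final entry also reports
the SQL thread exiting, not the I/O thread failing to connect, so the stop at
15:02:21 may not be a connection problem at all.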

Any help or suggestions anyone can offer are greatly appreciated!

Jesse


22/10/2007, 19:06   #2
Baron Schwartz
Re: Replication still stopping...

Hi Jesse,

A couple of thoughts. Do you have slaves with duplicated server IDs?
That seems most likely to me.

If that's not it, is the max_packet_size mismatched on the master and
slave? Can you connect to the master and view the binary log event at
the position it's trying to read, with SHOW BINLOG EVENTS? Can you use
the mysqlbinlog tool to verify that the binary log isn't corrupted on
the master?
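
For the SHOW BINLOG EVENTS check, the statement accepts an optional log file
name, a starting position and a row limit. A small Python sketch that builds
the statement text from the file and position named in the slave's error log;
the helper name binlog_events_sql is hypothetical, and the statement still has
to be run on the master with whatever client is at hand:

```python
def binlog_events_sql(log_name: str, position: int, limit: int = 10) -> str:
    """Build a SHOW BINLOG EVENTS statement for one binary log file.

    Hypothetical helper: it only formats the statement text and does not
    connect to any server.
    """
    if position < 0 or limit <= 0:
        raise ValueError("position must be >= 0 and limit must be > 0")
    # Binary log names are plain file names; reject quotes and backslashes
    # instead of trying to escape them.
    if "'" in log_name or "\\" in log_name:
        raise ValueError("unexpected character in log name")
    return f"SHOW BINLOG EVENTS IN '{log_name}' FROM {position} LIMIT {limit}"

# File and position taken from the last line of the slave's error log.
print(binlog_events_sql("mysql-bin.000007", 195))
```

The same file name can be handed to mysqlbinlog on the master to dump the log
as text.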

Baron

22/10/2007, 19:14   #3
Ralf Hüsing
[OT] Memory Usage on Windows? Re: Replication still stopping...

Hi Jesse,

> 071020 14:43:51 InnoDB: Started; log sequence number 0 142497221
> 071020 14:43:51 [Note] C:\Program Files\MySQL\MySQL Server
> 5.0\bin\mysqld-nt: ready for connections.


As I can see, you are running MySQL on Windows.

When I start my DB server (5.0.45/InnoDB/Win2k), the server uses about 80K
handles (as seen in Task Manager) and memory usage increases to around 1 GB.
Task Manager shows some swapping (the box has only 1 GB of RAM).

The DB itself is small (~50 MB or so).

My question is: do you see the same thing on your box?
Do you have performance issues resulting from the memory usage?

Thanks
Ralf

24/10/2007, 16:40   #4
Jesse
Re: Replication still stopping...

> A couple of thoughts. Do you have slaves with duplicated server IDs? That
> seems most likely to me.


Nope. I've got one master, and one slave. The server ID is set to 1 on the
master, and it's set to 2 on the slave.

> If that's not it, is the max_packet_size mismatched on the master and
> slave?


I don't find max_packet_size in the my.ini file on either server, and when I
run SHOW VARIABLES on both, max_packet_size is not listed on either of
them.

> Can you connect to the master and view the binary log event at the position
> it's trying to read, with SHOW BINLOG EVENTS?


That's where things get squirrelly. The position it reports always seems to
be incorrect. For instance, when this was happening previously, I know that
it had made it to a later position in the log. However, when replication
stopped, it reported a position earlier in the file. This one, for instance,
reports position 195. The nearest event I have starts at position 98 and ends
at position 1032. This is an update statement. If my logic is not flawed,
I'm thinking that I should follow starting at 98 out until I get to position
195. When I do that, I come to: RegOpenDate = '2007-11-05 00:00:00', which
is part of the update statement. This appears normal to me. I've checked,
and it is a DateTime field, and it is exactly the same on both the master
and slave.
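
Rather than counting bytes by hand, the event boundaries can be read from the
"# at N" and "end_log_pos M" markers in the mysqlbinlog text dump. A rough
Python sketch follows; the function names are hypothetical, and in the sample
only the first header line is real: the second header is a placeholder built
from the 98 and 1032 positions mentioned above.

```python
import re

AT_RE = re.compile(r"^# at (\d+)\s*$")
END_RE = re.compile(r"end_log_pos (\d+)")

def event_ranges(dump_text):
    """Return (start, end) positions for each event in mysqlbinlog text output."""
    ranges = []
    start = None
    for line in dump_text.splitlines():
        at = AT_RE.match(line)
        if at:
            start = int(at.group(1))
            continue
        end = END_RE.search(line)
        # The header line that follows "# at N" carries the event's end position.
        if end and start is not None and line.startswith("#"):
            ranges.append((start, int(end.group(1))))
            start = None
    return ranges

def locate(position, ranges):
    """Say whether a position is an event start, inside an event, or absent."""
    for start, end in ranges:
        if position == start:
            return f"start of event {start}-{end}"
        if start < position < end:
            return f"inside event {start}-{end}"
    if ranges and position == ranges[-1][1]:
        return "end of last event"
    return "not in this dump"

# First header copied from the dump; second header is a hypothetical placeholder.
SAMPLE_DUMP = """\
# at 4
#071020 15:45:34 server id 1 end_log_pos 98 Start: binlog v 4, server v 5.0.17-nt-log created 071020 15:45:34 at startup
# at 98
#<timestamp elided> server id 1 end_log_pos 1032 Query
"""

ranges = event_ranges(SAMPLE_DUMP)
print(ranges)
print(locate(195, ranges))
```

With those two ranges, 195 lands inside the event that spans 98 to 1032.
Replication positions normally sit on event boundaries, so it may be worth
confirming that the dumped file is the one the slave names (mysql-bin.000007).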

> Can you use the mysqlbinlog tool to verify that the binary log isn't
> corrupted on the master?


I've dumped the log to a text file. What, exactly, should I look for? The
only suspicious thing I see is the first entry:
# at 4
#071020 15:45:34 server id 1 end_log_pos 98 Start: binlog v 4, server v
5.0.17-nt-log created 071020 15:45:34 at startup
# Warning: this binlog was not closed properly. Most probably mysqld crashed
writing it.
ROLLBACK;

Don't know why it would do this. However, I set the master_log_pos to 98
before re-starting the slave after re-setting it last time.

Thanks,
Jesse

24/10/2007, 16:43   #5
Jesse
Re: [OT] Memory Usage on Windows? Re: Replication still stopping...

> As I can see, you are running MySQL on Windows.
>
> When I start my DB server (5.0.45/InnoDB/Win2k), the server uses about 80K
> handles (as seen in Task Manager) and memory usage increases to around 1 GB.
> Task Manager shows some swapping (the box has only 1 GB of RAM).
>
> The DB itself is small (~50 MB or so).
>
> My question is: do you see the same thing on your box?
> Do you have performance issues resulting from the memory usage?


I can't even keep it running for longer than 24 hours, and I don't know why.
I haven't even started looking into memory issues or performance. When it
is running, as a test, I change a record on the master, and I notice that
almost immediately, the same change is made on the slave. Works perfectly
for a few hours, then it just stops working. It almost appears to be a
network-related issue, but I can't seem to track it down.

Jesse

24/10/2007, 17:12   #6
Baron Schwartz
Re: Replication still stopping...

Jesse wrote:
>> A couple of thoughts. Do you have slaves with duplicated server IDs?
>> That seems most likely to me.

>
> Nope. I've got one master, and one slave. The server ID is set to 1 on
> the master, and it's set to 2 on the slave.
>
>> If that's not it, is the max_packet_size mismatched on the master and
>> slave?

>
> I don't find max_packet_size in the My.ini file on either server, and
> when I do a show variables on both, max_packet_size is not listed on
> either of them.


Whoops, I got the name wrong:

mysql> show variables like '%packet%';
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| max_allowed_packet | 16776192 |
+--------------------+----------+
1 row in set (0.00 sec)
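
To compare the setting on both servers, the same query can be run on each and
the two results checked against each other. A rough Python sketch that parses
the table printed by the mysql client; the function names are hypothetical,
the first sample reuses the value shown above, and the second sample's value
is made up purely for illustration:

```python
def parse_variables(table_text):
    """Parse the table the mysql client prints for SHOW VARIABLES into a dict."""
    values = {}
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip borders, prompts and the row-count line
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            values[cells[0]] = cells[1]
    return values

def packet_mismatch(master_text, slave_text):
    """Return (differs, master_value, slave_value) for max_allowed_packet."""
    master = int(parse_variables(master_text)["max_allowed_packet"])
    slave = int(parse_variables(slave_text)["max_allowed_packet"])
    return master != slave, master, slave

# Sample using the value from the output above.
MASTER_OUTPUT = """\
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| max_allowed_packet | 16776192 |
+--------------------+----------+
"""

# Hypothetical second server, value invented for illustration only.
SLAVE_OUTPUT = """\
+--------------------+---------+
| Variable_name      | Value   |
+--------------------+---------+
| max_allowed_packet | 1048576 |
+--------------------+---------+
"""

print(packet_mismatch(MASTER_OUTPUT, SLAVE_OUTPUT))
```

If the two values differ, the smaller one is the one to look at first.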

>
>> Can you connect to the master and view the binary log event at the
>> position it's trying to read, with SHOW BINLOG EVENTS?

>
> That's where things get squirrelly. The position it reports always seems
> to be incorrect. For instance, when this was happening previously, I
> know that it had made it to a later position in the log. However, when
> replication stopped, it reported a position earlier in the file. This
> one, for instance, reports position 195. The nearest event I have starts
> at position 98 and ends at position 1032. This is an update statement.
> If my logic is not flawed, I'm thinking that I should follow starting at
> 98 out until I get to position 195. When I do that, I come to:
> RegOpenDate = '2007-11-05 00:00:00', which is part of the update
> statement. This appears normal to me. I've checked, and it is a
> DateTime field, and it is exactly the same on both the master and slave.


That's strange. I'm not sure I understand what's happening there.
Check the packet size and let's come back to this if that's not the problem.

>
>> Can you use the mysqlbinlog tool to verify that the binary log isn't
>> corrupted on the master?

>
> I've dumped the log to a text file. What, exactly, should I look for?
> The only suspicious thing I see is the first entry:
> # at 4
> #071020 15:45:34 server id 1 end_log_pos 98 Start: binlog v 4,
> server v 5.0.17-nt-log created 071020 15:45:34 at startup
> # Warning: this binlog was not closed properly. Most probably mysqld
> crashed writing it.
> ROLLBACK;


That's fine; it just means the log is still open. (It is still open,
right?) If you run this on a log other than the newest one, you
shouldn't see that.

If there was corruption, the mysqlbinlog tool would have crashed.

Baron