|
|
|
#1 (permalink) |
|
Messages: n/a
Hébergeur: |
I have a client that desires a new application for use at several stores in
different cities. He wants "real time" access to all store data and the ability to continue to run locally even if the internet connection goes down. It occurs to me that a local MySQL instance should definitely be in each store to continue work if the net is down for any reason. But, what is the best way to synchronize/share data among all of the stores? From what I have read on the MySQL site, you need MySQL Cluster to do synchronous replication, but it won't work over a network connection less than 100 mb. Is there any hope of synchronous replication using the free version of MySQL. And, just how stable if the free edition (with it's bleeding-edge code and such)? Any that you give will be greatly appreciated. Thanks! |
|
|
|
#2 (permalink) |
|
Messages: n/a
Hébergeur: |
>I have a client that desires a new application for use at several stores in
>different cities. He wants "real time" access to all store data and the >ability to continue to run locally even if the internet connection goes >down. > >It occurs to me that a local MySQL instance should definitely be >in each store to continue work if the net is down for any reason. But, what >is the best way to synchronize/share data among all of the stores? > >From what I have read on the MySQL site, you need MySQL Cluster to do >synchronous replication, but it won't work over a network connection less >than 100 mb. That's not the only form of replication. Plain old MySQL can do replication although it doesn't have all the features of the cluster. You might have to manually switch masters if the net goes down. Depending on the query volume and how up to date the slaves have to be, a 1200bps modem connection might be sufficient. My employer tried a mail server setup with MySQL Cluster, using MySQL to maintain info on where the mailbox was and user options, across several machines. It didn't work very well. We'd get inconsistent data and storage node crashes that would essentially take out the whole setup. Eventually we used conventional replication, with changes going to the master and each machine using its local copy for reads. Perhaps MySQL Cluster has improved since then. What is your required data setup? Each store is master of its own data and the head office also wants a real-time (read-only?) copy of it? Or does the head office need to make changes, and stores need to share each other's data? I don't know how to do conventional replication with, for example, the head office machine slaving one database off of each store master. Gordon L. Burditt |
|
|
|
#3 (permalink) |
|
Messages: n/a
Hébergeur: |
"Gordon Burditt" <gordonb.0jlgv@burditt.org> wrote in message news:12490m9bvsplifc@corp.supernews.com... > >I have a client that desires a new application for use at several stores > >in >>different cities. He wants "real time" access to all store data and the >>ability to continue to run locally even if the internet connection goes >>down. >> >>It occurs to me that a local MySQL instance should definitely be >>in each store to continue work if the net is down for any reason. But, >>what >>is the best way to synchronize/share data among all of the stores? >> >>From what I have read on the MySQL site, you need MySQL Cluster to do >>synchronous replication, but it won't work over a network connection less >>than 100 mb. > > That's not the only form of replication. Plain old MySQL can do > replication although it doesn't have all the features of the cluster. > You might have to manually switch masters if the net goes down. > Depending on the query volume and how up to date the slaves have to > be, a 1200bps modem connection might be sufficient. > > My employer tried a mail server setup with MySQL Cluster, using > MySQL to maintain info on where the mailbox was and user options, > across several machines. It didn't work very well. We'd get > inconsistent data and storage node crashes that would essentially > take out the whole setup. Eventually we used conventional replication, > with changes going to the master and each machine using its local > copy for reads. Perhaps MySQL Cluster has improved since then. > > What is your required data setup? Each store is master of its own > data and the head office also wants a real-time (read-only?) copy > of it? Or does the head office need to make changes, and stores > need to share each other's data? I don't know how to do conventional > replication with, for example, the head office machine slaving one > database off of each store master. The concept that they came up with was that they wanted real-time data access to do store-to-store inventory lookups and transfers and to track the store activities from the central office. They also want to share customer data (in case a customer visits another store or in the event that a customer has not finished paying for merchandise and is behind on payments). They also want all store data backup up in case of a catastrophic event at a store (like theft of IT resources or fire). They also want a local store of (at least) that store's data - or (if possible) all store data until the connection to the central database was lost. And, let's not forget a way to post back to a central database (if one is used) any local transactions done at a disconnected store once the connection to the group or central database is restored. Simple, huh? JH |
|
|
|
#4 (permalink) |
|
Messages: n/a
Hébergeur: |
Jim Hubbard wrote:
> They also want a local store of (at least) that store's data - or (if > possible) all store data until the connection to the central database was > lost. > > And, let's not forget a way to post back to a central database (if one is > used) any local transactions done at a disconnected store once the > connection to the group or central database is restored. > > Simple, huh? Here's how I would design this system: 1. They perform write operations to a database connection to the master database, which is at the home office. 2. The local sites keep a copy of the master database, which is updated periodically with standard replication. 3. Read operations are performed on this local database. 4. If the connection to the master database goes down, the apps need to detect this error and instead write their changes to a local database. I'd recommend this _not_ be their local copy of the master database, but a separate instance. I'll call this the failover write database. 5. Thus when they do queries against their local read-only copy of the master, they should combine the query results with any changes stored in their local failover write database. Hopefully this will be minimal, but the application code has to account for it. 7. When the connection is restored, the contents of the failover write database are sent to the master database. Then the local system deletes the contents of the failover write database. Wait a few moments for the master to process the new data, update keys, etc. Then re-synchronize the read-only copy of the master. Thus the failover changes have moved to the master database, and back again via replication. But now they have had their auto-generated primary keys assigned, and other data cleanup tasks. You might want to make the local app deny the user from making any further changes during these moments. Also note that other sites may have queued up changes, so the initial replication sync after an outage may contain more volume of data than usual. This still leaves the problem of what happens if two separate sites create records for the same real-world entity (like a customer who calls two separate sites). Resolving these cases is something which is not 100% automatable, in any RDBMS system. You must plan for some living person to review new data and performs cleanup at the home office. Some data cleanup is automatable, but it is simply not possible to handle all cases without a human. Regards, Bill K. |
|
|
|
#5 (permalink) |
|
Messages: n/a
Hébergeur: |
"Bill Karwin" <bill@karwin.com> wrote in message
news:e23bi30gk7@enews2.newsguy.com... > Jim Hubbard wrote: >> They also want a local store of (at least) that store's data - or (if >> possible) all store data until the connection to the central database was >> lost. >> >> And, let's not forget a way to post back to a central database (if one is >> used) any local transactions done at a disconnected store once the >> connection to the group or central database is restored. >> >> Simple, huh? > > Here's how I would design this system: > > 1. They perform write operations to a database connection to the master > database, which is at the home office. > > 2. The local sites keep a copy of the master database, which is updated > periodically with standard replication. This is where I hit my first "bump in the road". The client wants to use the FREE version of MSDE or MySQL. Neither version offers 2-way replication. Same things I quit working in corporate America 3 years ago over. They want it FREE and they want it NOW. I'm about to tell him to get someone else. This is starting the same way all of the pains in my ass did in corporate America. I won't do it for any amount if it is going to be one of THOSE applications/experiences. > > 3. Read operations are performed on this local database. Only in case of connection failure, right? They need to do immediate saves and reads of that data....so writing to a central store that would not update the local store for say 5 minutes even would not be acceptable. > > 4. If the connection to the master database goes down, the apps need to > detect this error and instead write their changes to a local database. I'd > recommend this _not_ be their local copy of the master database, but a > separate instance. I'll call this the failover write database. > > 5. Thus when they do queries against their local read-only copy of the > master, they should combine the query results with any changes stored in > their local failover write database. Hopefully this will be minimal, but > the application code has to account for it. > > 7. When the connection is restored, the contents of the failover write > database are sent to the master database. Then the local system deletes > the contents of the failover write database. Wait a few moments for the > master to process the new data, update keys, etc. Then re-synchronize the > read-only copy of the master. Thus the failover changes have moved to the > master database, and back again via replication. But now they have had > their auto-generated primary keys assigned, and other data cleanup tasks. > You might want to make the local app deny the user from making any further > changes during these moments. Also note that other sites may have queued > up changes, so the initial replication sync after an outage may contain > more volume of data than usual. > > This still leaves the problem of what happens if two separate sites create > records for the same real-world entity (like a customer who calls two > separate sites). Resolving these cases is something which is not 100% > automatable, in any RDBMS system. > > You must plan for some living person to review new data and performs > cleanup at the home office. Some data cleanup is automatable, but it is > simply not possible to handle all cases without a human. Actually, I was thinking of a more simple scenario and I'd be greatful for your input on it..... What if the local app always writes and reads from the local db. Every record change has an associated timestamp. A seperate process handles replicating the data to a central store (used to restore local databases in case data is lost due to a catastrophic event at the store) by updating records changed in either the central db or local db by selecting records with timestamps greater than the last time/date that the records were synchronized. This could be real-time, as long as a connection is available. As it will be virtually unheard of that any 2 changes will have the exact same timestamp, the newer change always overrides the older changes, but the older chnages are recorded anyway (for historical data - in this case they may be customer notes, etc.). The central office can attach to any database (including the central db) to get their data - thus the same app used by the clerks can be used by central office personnel with the proper security clearances for different functions. IF any store is offline (or the central db is not online) the UI should indicate that the data is valid as of the last synchronization. Does my simple solution miss anything major? Thanks! Jim |
|
|
|
#6 (permalink) |
|
Messages: n/a
Hébergeur: |
Jim Hubbard wrote:
> "Bill Karwin" <bill@karwin.com> wrote in message >>2. The local sites keep a copy of the master database, which is updated >>periodically with standard replication. > > This is where I hit my first "bump in the road". The client wants to use > the FREE version of MSDE or MySQL. Neither version offers 2-way > replication. I'm not aware that there is any two-way replication solution for MySQL, regardless of free vs. commercial. Can you cite a reference? > Same things I quit working in corporate America 3 years ago over. They want > it FREE and they want it NOW. > > I'm about to tell him to get someone else. This is starting the same way > all of the pains in my ass did in corporate America. I won't do it for any > amount if it is going to be one of THOSE applications/experiences. Indeed. What I would advise is to put together a proposal of what it will cost in time and money, call attention to the parts of their requirements that are the biggest cost drivers, and offer alternatives. Example anecdote: I went to a pre-sales meeting for a company that made custom trophy parts. They wanted to have an online storefront, and one of the managers insisted that he wanted a guarantee of "five nines" of availability (I assume he read it in a magazine). In other words, 99.999% uptime, not best-case, not average-case, but *guaranteed*. But he didn't want to spend more than, say, $3000 to $5000. My eyebrows raised, and I offered the comment as respectfully and calmly as I could that this level of service doesn't cost a few thousands, it costs tens of millions. Multiple co-lo facilities in different parts of the US. Full set of redundant servers at each site. Continuous, synchronous data mirroring between sites. Dynamic DNS failover. Round-the-clock staff. Not even banks and stock exchanges need this much availability. I further offered that if his business really was the #1 vendor of custom trophy parts, that if a high school coach wants to order online at 11pm on Sunday night, and the website says "performing maintenance, come back in 30 minutes," the customer is very likely to come back, instead of going to the competitor's website, which is likely to be non-existant anyway. Anyway, the point is that as a consultant, you should be very agreeable to their requirements at first, and make an accurate and well-supported estimate for what it would take to implement as they have described. Then offer a few alternatives for the features that were the primary cost drivers, saving them tons of money while preserving the value to their business. That's the way to be thought of as a very valuable consultant. On the other hand, if you already hate this client, you may be better off finding some open-source package on sourceforge.net that does most of what they need, pitch it to them, and bill them for your time. Then let them find a cheaper consultant to do a bit of customization and deployment. See also: http://sourceforge.net/search/?words...of_search=soft http://www.daffodildb.com/replicator/index.html >>3. Read operations are performed on this local database. > > > Only in case of connection failure, right? They need to do immediate saves > and reads of that data....so writing to a central store that would not > update the local store for say 5 minutes even would not be acceptable. Hmm, I had in mind all reads go to the local database, for speed and to avoid overburdening the central database. The main problem is with things like auto-generated primary key values. The main site and the local site aren't synchronized with respect to the next auto-gen key value per table. So if you insert values to the local database during an outage, and then you try to replicate to the central database when the connection is restored, how to you prevent your records from conflicting with the records entered simultaneously at other sites? One answer is never to use auto-generated "pseudokeys"; always use some combination of real attributes in the table as the key. Like firstname, lastname, date-of-birth. Or SSN. Another answer is to enter new data with an interim primary key value, let the central database override this value to make sure it's unique, and then make sure the local system knows how to reconcile the data when the same record comes back from the central system with a different primary key. Yet another solution is to make the primary key "pseudokey" be a two-column key, one using an auto-generated value per site, and another a constant "site id". The problem with any of these problems is that you can get two people at different sites entering data for the same entity. Then you need some way to merge these records and dependent data too. This can be tricky, since people can make spelling mistakes, or enter "Jim Brown" instead of "James R. Brown" for example. How can you tell that these refer to the same real-world entity? In general, you can automate some of this, but there are always going to be more types of ambiguity than those for which you can predict and code a resolution. > Actually, I was thinking of a more simple scenario and I'd be greatful for > your input on it..... > > What if the local app always writes and reads from the local db. Every > record change has an associated timestamp. A seperate process handles > replicating the data to a central store (used to restore local databases in > case data is lost due to a catastrophic event at the store) by updating > records changed in either the central db or local db by selecting records > with timestamps greater than the last time/date that the records were > synchronized. This could be real-time, as long as a connection is > available. Do you use auto-generated primary keys? How do you tell the client site to delete the entry they made, in favor of the newer entry? How do you resolve multiple client sites generating the same value in the primary key? > As it will be virtually unheard of that any 2 changes will have the exact > same timestamp, the newer change always overrides the older changes, but the > older chnages are recorded anyway (for historical data - in this case they > may be customer notes, etc.). Right, or in case the earlier entry actually did have the correct information. > The central office can attach to any database (including the central db) to > get their data - thus the same app used by the clerks can be used by central > office personnel with the proper security clearances for different > functions. > > IF any store is offline (or the central db is not online) the UI should > indicate that the data is valid as of the last synchronization. > > Does my simple solution miss anything major? This seems like as good a solution as any, if you can resolve the problems that are common in bi-directional replication. Bi-directional replication will always have some potential for ambiguously conflicting entries, or duplicate entries. A human being must review these and decide on the correct resolution. Regards, Bill K. |
|
|
|
#7 (permalink) |
|
Messages: n/a
Hébergeur: |
Bill Karwin wrote:
> I'm not aware that there is any two-way replication solution for MySQL, > regardless of free vs. commercial. Can you cite a reference? Obviously from the links I included, there are such solutions. Just not offered by MySQL AB. Regards, Bill K. |
|
|
|
#8 (permalink) |
|
Messages: n/a
Hébergeur: |
>>>2. The local sites keep a copy of the master database, which is updated
>>>periodically with standard replication. >> >> This is where I hit my first "bump in the road". The client wants to use >> the FREE version of MSDE or MySQL. Neither version offers 2-way >> replication. > >I'm not aware that there is any two-way replication solution for MySQL, >regardless of free vs. commercial. Can you cite a reference? According to the manual for MySQL 5.0.19 (NOT cluster) on the dev.mysql.com site, a given server can have one master and multiple slaves. You are allowed to connect them in a circular setup. There is little provision for resolving conflicts in near-simulataneous updates. This is probably not acceptable for what the original poster wants. But it is arguably two-way replication, particularly with the setup on two servers. The server variables auto_increment_increment and auto_increment_offset are provided to deal with the assignment of auto_increment values. From the descriptions I see, it looks like they work with multiple masters, but I don't know how you have multiple masters. The idea is that each server assigns its own set of numbers, e.g. with 10 servers, one server assigns 11, 21, 31, 41, ... and another server assigns 13, 23, 33, 43, ... . Gordon L. Burditt |
|
![]() |
| Outils de la discussion | |
|
|