#1
Dear all,

Currently I am working on a project that has to do with logging visitors traffic. Let's say, every time a visitor has visited the website, one row will be inserted into a table.

Thing is, just like OneStat, Nedstat or whatever, this project retrieves it's input from a large number of websites. This may result in, let's say, 1000 rows per minute from the start, but probably a hell of a lot more.

First question is: how many rows can be processed per minute, or per second?

Now my second question is more difficult. I will also need to compare the results in a guide page that will compare all the data collected and, for example, show which site has had the most visitors. For the last day, week, year... whatever.

My problem is that if I store all data in one table, this table will very soon be very, very large. I don't know the maximum number of rows that are allowed, but this will surely influence the speed. The more rows, the longer it will take, of course, to compare and show results in the guide page. And finally I will hit the maximum number of rows anyhow.

What I am thinking of is to automatically generate a table for every month, but I am not sure if this is wise. Will I be able to compare fast enough after a year? Let's say I have 12 tables and I want to compare all rows, and let's say there are one billion rows in each table...

What I would actually like to know, I think, is how the database of NedStat is more or less structured. They probably have over a million rows to store every minute... Can anyone tell me how they manage to store and compare all this data?

Kind regards,

Pim Zeekoers
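The table-per-month idea from the post above can be tried out on a small scale before committing to it. The following is only a rough sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than whatever database the poster has in mind, and the table names (`visits_YYYY_MM`), the columns and the helper functions are all hypothetical. Comparing sites across months then means combining the monthly tables with UNION ALL.

```python
import sqlite3

def month_table(month):
    """Return the (hypothetical) table name for a month given as 'YYYY-MM'."""
    year, mon = month.split("-")
    if not (year.isdigit() and mon.isdigit()):
        raise ValueError(f"bad month: {month!r}")
    return f"visits_{year}_{mon}"

def log_visit(conn, site_id, visited_at):
    """Insert one row into the table for the month of 'YYYY-MM-DD HH:MM:SS'."""
    table = month_table(visited_at[:7])
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} (site_id INTEGER, visited_at TEXT)"
    )
    conn.execute(
        f"INSERT INTO {table} (site_id, visited_at) VALUES (?, ?)",
        (site_id, visited_at),
    )

def top_sites(conn, months):
    """Count visits per site across several monthly tables, busiest first."""
    union = " UNION ALL ".join(
        f"SELECT site_id FROM {month_table(m)}" for m in months
    )
    sql = (
        f"SELECT site_id, COUNT(*) AS visits FROM ({union}) "
        "GROUP BY site_id ORDER BY visits DESC, site_id"
    )
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
sample_visits = [
    (1, "2007-09-30 23:59:00"),
    (1, "2007-10-01 00:01:00"),
    (2, "2007-10-30 15:43:00"),
    (1, "2007-10-30 17:52:00"),
]
for site_id, ts in sample_visits:
    log_visit(conn, site_id, ts)

result = top_sites(conn, ["2007-09", "2007-10"])
print(result)
```

Every month covered by a comparison adds another table to the UNION ALL, so a query over a full year has to read all twelve tables.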
#2
On 30 Oct, 15:43, p...@impulzief.nl wrote:
> Dear all,
>
> Currently I am working on a project that has to do with logging
> visitors traffic.

Surely either "visitor traffic" or "visitor's traffic"

> Let's say, every time a visitor has visited the website, one row will
> be inserted into a table.
>
> Thing is, just like OneStat, Nedstat or whatever, this project
> retrieves it's

Surely "its" (see http://groups.google.co.uk/group/alt....no.apostrophe)

> First question is: How many rows can be processed per minute, or
> second?

This depends on many things, for example the processor speed, the disk speed, how many disks, ...

> Now my second question is more difficult.

The first one was impossible to answer, so how can this be more difficult?
#3
pim@impulzief.nl wrote in news:1193759019.881961.6310@y42g2000hsy.googlegroups.com:
> What I actually like to know I think, is how the database of NedStat
> is more or less structured. They probably must have over a million rows
> to store every minute... Can anyone tell me how they manage to store
> and compare all this data?

As far as I know, databases are not used at all for "web analytics" type of things... they tend to use server logs for information, and process them into useful information.
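To illustrate what "processing server logs into useful information" could look like, here is a minimal sketch in Python. The log format is an assumption made up for this example (one line per request: ISO timestamp, site name, path); real web server logs look different and would need their own parser. The function name is hypothetical as well.

```python
from collections import Counter
from datetime import datetime

def visits_per_site_per_day(lines):
    """Count log lines per (site, date).

    Assumed line format: '<ISO timestamp> <site> <path>'.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        day = datetime.fromisoformat(parts[0]).date()
        counts[(parts[1], day.isoformat())] += 1
    return counts

sample = [
    "2007-10-30T15:43:00 example.org /index.html",
    "2007-10-30T17:52:00 example.org /about.html",
    "2007-10-30T18:16:00 example.net /",
    "",
]
counts = visits_per_site_per_day(sample)
for (site, day), n in sorted(counts.items()):
    print(site, day, n)
```

Only the aggregated counts need to be kept for reporting; the raw log files can be archived or deleted once they have been processed.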
#4
On 30 Oct, 17:52, Captain Paralytic <paul_laut...@yahoo.com> wrote:
> On 30 Oct, 15:43, p...@impulzief.nl wrote:
> > Dear all,
> >
> > Currently I am working on a project that has to do with logging
> > visitors traffic.
>
> Surely either "visitor traffic" or "visitor's traffic"
>
> > Let's say, every time a visitor has visited the website, one row will
> > be inserted into a table.
> >
> > Thing is, just like OneStat, Nedstat or whatever, this project
> > retrieves it's
>
> Surely "its" (see http://groups.google.co.uk/group/alt.possessive.its.has.no.apostrophe)
>
> > First question is: How many rows can be processed per minute, or
> > second?
>
> This depends on many things, for example the processor speed, the disk
> speed, how many disks, ...
>
> > Now my second question is more difficult.
>
> The first one was impossible to answer, so how can this be more
> difficult?

Dear Paul,

I am sorry my English does not meet your standards. Although you had no answer, you seem to have been able to understand my questions, so I guess it wasn't that bad.

Of course, hardware will limit the number of rows that can be processed every minute. I am not looking for a detailed answer; I would just like to know what could normally be achieved under regular circumstances, such as on a regular dedicated server. What can be seen as a hard maximum?

But the second question was a lot more important, of course. Do you, or someone else, have any tips on how to structure my database so that I can insert an unlimited number of rows and still be able to compare the results?

Regards,

Pim
#5
On 30 Oct, 18:16, Good Man <he...@letsgo.com> wrote:
> p...@impulzief.nl wrote in news:1193759019.881961.6310@y42g2000hsy.googlegroups.com:
>
> > What I actually like to know I think, is how the database of NedStat
> > is more or less structured. They probably must have over a million rows
> > to store every minute... Can anyone tell me how they manage to store
> > and compare all this data?
>
> As far as I know, databases are not used at all for "web analytics" type of
> things... they tend to use server logs for information, and process them
> into useful information.

Dear Good Man,

Thank you for your reply. They don't use databases? Mmm... there will probably be reasons for that. But how would I be able to store unlimited rows and still be able to compare the results?

A friend of mine just suggested using tables as pairs, or couples, that will automatically store certain standard comparisons. Is that an option?

Pim
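One possible reading of the friend's suggestion (tables that "automatically store certain standard comparisons") is a summary table that keeps one counter per site per day instead of one row per visit. That reading is an assumption, as are the table name `daily_visits`, its columns and the helper functions below. The sketch uses Python's built-in sqlite3 module purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE daily_visits (
        site_id INTEGER NOT NULL,
        day     TEXT    NOT NULL,   -- 'YYYY-MM-DD'
        visits  INTEGER NOT NULL,
        PRIMARY KEY (site_id, day)
    )
    """
)

def record_visit(conn, site_id, day):
    """Increment the counter for (site_id, day), creating the row if needed."""
    cur = conn.execute(
        "UPDATE daily_visits SET visits = visits + 1 "
        "WHERE site_id = ? AND day = ?",
        (site_id, day),
    )
    if cur.rowcount == 0:  # no counter row yet for this site and day
        conn.execute(
            "INSERT INTO daily_visits (site_id, day, visits) VALUES (?, ?, 1)",
            (site_id, day),
        )

def top_sites(conn, first_day, last_day):
    """Total visits per site between two days (inclusive), busiest first."""
    return conn.execute(
        "SELECT site_id, SUM(visits) AS total FROM daily_visits "
        "WHERE day BETWEEN ? AND ? "
        "GROUP BY site_id ORDER BY total DESC, site_id",
        (first_day, last_day),
    ).fetchall()

for site_id, day in [(1, "2007-10-29"), (1, "2007-10-30"),
                     (1, "2007-10-30"), (2, "2007-10-30")]:
    record_visit(conn, site_id, day)

last_day = top_sites(conn, "2007-10-30", "2007-10-30")
whole_month = top_sites(conn, "2007-10-01", "2007-10-31")
row_count = conn.execute("SELECT COUNT(*) FROM daily_visits").fetchone()[0]
print(last_day, whole_month, row_count)
```

With this layout the table grows by at most one row per site per day regardless of traffic, at the price of losing the individual visit records.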
#6
pim@impulzief.nl wrote:
> Dear all,
>
> Currently I am working on a project that has to do with logging
> visitors traffic.
> Let's say, every time a visitor has visited the website, one row will
> be inserted into a table.
>
> Thing is, just like OneStat, Nedstat or whatever, this project
> retrieves it's input from a large number of websites. This may result
> in let's say 1000 rows per minute from the start but probably a hell
> of a lot more.
>
> First question is: How many rows can be processed per minute, or
> second?

Too many factors to determine. It depends on everything from the data being inserted and indexes being used to the operating system, memory, disk speed, etc.

But 1K rows/min. should be easily doable with good hardware, as long as it isn't huge amounts of data on each insert.

> Now my second question is more difficult. I will also need to compare
> the results in a guide page, that will compare all the data collected
> and for example show, which site has had the most visitors. For the
> last day, week, year... whatever.
>
> My problem is that if I would store all data in one table, this table
> will very soon be very very large. I don't know the maximum number of
> rows that are allowed, but this will sure influence the speed. The
> more rows, the longer it will take of course to compare and show
> results in the Guide Page. And finally I will hit the maximum number of
> rows anyhow.

In most cases you'll run into OS limitations first, i.e. the maximum size of a file allowed. A 64-bit platform will give you more room. And yes, it will take a while to search such large tables. Perhaps you would be better off implementing replication, inserting to the master and searching on the slave.

> What I am thinking of is to automatically generate a table for every
> month but I am not sure if this is wise. Will I be able to compare
> fast enough after a year? Let's say I have 12 tables and I want to
> compare all rows, let's say there are one billion rows in each
> table...

It will be slower to join tables. It means multiple indexes need to be searched, etc.

> What I actually like to know I think, is how the database of NedStat
> is more or less structured. They probably must have over a million rows
> to store every minute... Can anyone tell me how they manage to store
> and compare all this data?

Not here.

> Kind regards,
>
> Pim Zeekoers

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
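Since the achievable insert rate depends on so many factors, measuring it on the target machine is usually more informative than any general figure. Below is a small timing sketch, assuming Python's built-in sqlite3 with an in-memory database as a stand-in for the real server; the function name and the table layout are hypothetical, and the numbers it prints say nothing about what another database or disk-backed storage would achieve.

```python
import sqlite3
import time

def measure_insert_rate(n_rows=10_000):
    """Insert n_rows in one transaction; return (rows stored, rows per second)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (site_id INTEGER, visited_at TEXT)")
    rows = [(i % 100, "2007-10-30 15:43:00") for i in range(n_rows)]

    start = time.perf_counter()
    with conn:  # commits once, after the whole batch
        conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    elapsed = time.perf_counter() - start

    stored = conn.execute("SELECT COUNT(*) FROM visits").fetchone()[0]
    rate = (stored / elapsed) if elapsed > 0 else float("inf")
    return stored, rate

stored, rate = measure_insert_rate(10_000)
print(f"{stored} rows inserted, about {rate:.0f} rows per second")
```

Repeating the measurement with the real schema, indexes and row sizes gives a figure that actually reflects the factors listed above.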
#7
On 30 Oct, 19:33, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> p...@impulzief.nl wrote:
> > Dear all,
> >
> > Currently I am working on a project that has to do with logging
> > visitors traffic.
> > Let's say, every time a visitor has visited the website, one row will
> > be inserted into a table.
> >
> > Thing is, just like OneStat, Nedstat or whatever, this project
> > retrieves it's input from a large number of websites. This may result
> > in let's say 1000 rows per minute from the start but probably a hell
> > of a lot more.
> >
> > First question is: How many rows can be processed per minute, or
> > second?
>
> Too many factors to determine. It depends on everything from the data
> being inserted and indexes being used to the operating system, memory,
> disk speed, etc.
>
> But 1K rows/min. should be easily doable with good hardware, as long as
> it isn't huge amounts of data on each insert.
>
> > Now my second question is more difficult. I will also need to compare
> > the results in a guide page, that will compare all the data collected
> > and for example show, which site has had the most visitors. For the
> > last day, week, year... whatever.
> >
> > My problem is that if I would store all data in one table, this table
> > will very soon be very very large. I don't know the maximum number of
> > rows that are allowed, but this will sure influence the speed. The
> > more rows, the longer it will take of course to compare and show
> > results in the Guide Page. And finally I will hit the maximum number of
> > rows anyhow.
>
> In most cases you'll run into OS limitations first, i.e. the maximum
> size of a file allowed. A 64-bit platform will give you more room. And
> yes, it will take a while to search such large tables. Perhaps you
> would be better off implementing replication, inserting to the master
> and searching on the slave.
>
> > What I am thinking of is to automatically generate a table for every
> > month but I am not sure if this is wise. Will I be able to compare
> > fast enough after a year? Let's say I have 12 tables and I want to
> > compare all rows, let's say there are one billion rows in each
> > table...
>
> It will be slower to join tables. It means multiple indexes need to be
> searched, etc.
>
> > What I actually like to know I think, is how the database of NedStat
> > is more or less structured. They probably must have over a million rows
> > to store every minute... Can anyone tell me how they manage to store
> > and compare all this data?
>
> Not here.
>
> > Kind regards,
> >
> > Pim Zeekoers
>
> --
> ==================
> Remove the "x" from my email address
> Jerry Stuckle
> JDS Computer Training Corp.
> jstuck...@attglobal.net
> ==================

Dear Jerry,

Thank you for your answer. It sure helps me to find the right answer. The 'real' big table will only have 5 tables with IDs, so 1k will never be reached. I will see what I will do; first I have to test your master/slave solution.

Kind regards,

Pim