So recently I’ve noticed that the site is struggling quite badly to process the data – the database has grown to just shy of 2.5GB and well it’s not optimised is any way at all.
So there’s 95173 teams from my last data collection and of those 2158 have a score of 0 and 164 have no WU. IE these are unused teams. So in total over the last 5 months, that’s generated an extra ~300k of history records that are not needed at all. Then there are the teams who are not active, also having many records that I don’t really need to worry about creating.
To put things into context – this is a pruning exercise. The DB is too big, so the useless data needs to go. Given that in the 5 months my team has risen to the top 3000 teams with less than 2 machines running for that time, I don’t think this will affect the data validity or usefulness of the site long term.
So to keep the ship from sinking I think it’s time we removed the teams with no WU or Score from the history (after creating a backup). When I say no WU or score I actually mean all teams with less than 10 WU and less than 200 score.
So “SELECT * FROM `teamhistory` where score < 200″ yields 974,625 records and “SELECT * FROM `teamhistory` where wu < 10″ yields 4,459,439 records. Given there’s around 14,000,000 records, removing 30% of them should see a massive speed up in future queries.
So after deleting some records, the table seems to have shrunk, but I think I will have to review how I keep team history in the future and the size of the fields in the table as it has grown way too fast from January till now.