Cleaning Up Statistics At Year End


The rcs files for the WebStatistics topics get out of hand, with well over 4000 revisions by the end of the year.

Assumptions
  • The revision history doesn't really serve much purpose, and slows everything down.
  • We will archive each year into a per-year statistics topic, for example WebStatistics2012.
  • The WebStatistics topic header has been modified to search for prior year statistics topics in that web.

Manual cleanup

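Here are the steps I've gone through to fix this up. The first time around it was done at midnight on Dec. 31st, but that's not all that practical. For the 2012 -> 2013 transition, the following steps were run with their output redirected to a file, which was then run as a shell script. Most of the commands echo the real command instead of executing it, so the generated file can be reviewed before anything is changed.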
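
The generate-then-run workflow described above might look like the following sketch. The function name generate_cleanup, the output file name cleanup.sh, and the choice of the rcs rename as the sample command are illustrative assumptions, not the exact commands that were run.

```shell
#!/bin/sh
# Sketch only: generate_cleanup and cleanup.sh are hypothetical names.
# Writes the commands to a script file for review instead of running them.
generate_cleanup() {
    root=$1   # directory tree to scan, e.g. /home/
    out=$2    # script file to write

    {
        echo '#!/bin/sh'
        # Each match is echoed as a command line rather than executed.
        # The embedded {}.bak relies on GNU find substituting {} inside
        # an argument, as the commands on this page already do.
        find "$root" -name 'WebStatistics.txt,v' \
            -exec echo sudo -u www mv {} {}.bak \;
    } > "$out"
}

# Usage (assumed paths):
#   generate_cleanup /home/ cleanup.sh
#   less cleanup.sh      # review every generated command
#   sh cleanup.sh        # run it only once it looks right
```

Keeping generation and execution separate means a bad find pattern shows up as a wrong line in the script, not as a changed file.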

Back up all the existing statistics files
 find /home/ -iname 'WebStatistics.*' -exec tar -rf webstats3.tar {} +
Copy all stats files into their 2012 versions
 for i in $(find /home/ -name 'WebStatistics.*') ; do echo sudo -u www cp $i $(echo $i | sed 's/\.txt/2012.txt/') ; done
Delete the 2013 entry from the 2012 files
 for i in `find /home/ -name 'WebStatistics2012.txt'`; do echo sudo -u www sed -i_bak -e "/Jan\ 2013/d" $i ; done
Delete any 2012 statistics from the 2013 files
 for i in `find /home/ -name 'WebStatistics.txt'`; do echo sudo -u www sed -i_bak -e "/[a-z]\ 2012/d" $i; done 
Move any rcs files out of the way (they are renamed with a .bak suffix, not deleted yet)
 find /home/ -iname 'WebStatistics.txt,v' -exec echo sudo -u www mv {} {}.bak \;
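
Before deleting anything, it may be worth checking that the split worked. The following is a minimal sketch, assuming statistics rows carry a month and year in the form "Jan 2013", as the sed patterns above imply. The function name check_stats_split is hypothetical.

```shell
#!/bin/sh
# Sketch only: check_stats_split is a hypothetical helper.
# Prints any file that still contains rows the sed steps should have
# removed, and returns 1 if at least one such file was found.
check_stats_split() {
    root=$1
    failed=0

    # Archived 2012 files should hold no January 2013 row.
    if find "$root" -name 'WebStatistics2012.txt' \
            -exec grep -l 'Jan 2013' {} + | grep . ; then
        failed=1
    fi

    # Current files should hold no rows for any 2012 month.
    if find "$root" -name 'WebStatistics.txt' \
            -exec grep -l '[a-z] 2012' {} + | grep . ; then
        failed=1
    fi

    return "$failed"
}

# Usage (assumed path):
#   check_stats_split /home/ && echo "split looks clean"
```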

Once everything is complete, the backup files need to be removed: the _bak files left by sed -i_bak and the .bak copies of the rcs files.
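
The leftover files can be found by the suffixes the steps above create: _bak from sed -i_bak and .bak from the mv of the rcs files. This is a minimal sketch, assuming those are the only backups to remove and that the tar archive is kept. The function name list_stats_backups is hypothetical.

```shell
#!/bin/sh
# Sketch only: list_stats_backups is a hypothetical helper.
# Prints the backup files left behind by the sed and mv steps.
list_stats_backups() {
    root=$1
    find "$root" -type f \
        \( -name 'WebStatistics*.txt_bak' -o -name 'WebStatistics.txt,v.bak' \) \
        -print
}

# Usage (assumed path): review the list first, then emit rm commands
# in the same echo-to-script style as the other steps.
#   list_stats_backups /home/
#   list_stats_backups /home/ | while IFS= read -r f; do
#       echo sudo -u www rm "$f"
#   done
```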

Topic revision: r1 - 08 Jan 2013, GeorgeClark
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere.