November 12, 2002, 12:07 PM ET
Content-to-code ratio statistics
Thanks to everybody for the positive feedback on GetContentSize, which I presented Friday.
I logged the tool's results to a database, so I'm able to present a few interesting observations/statistics. (I've been away from my computer for a few days, so I apologize for not having this earlier.)
Total Web pages examined: 4,296
(Pages ranged from news sites to blogs to, yes, porn sites.)
Average percent text content: 21.57%
Highest percent text content: This page (86.57%)
Lowest percent text content: This page (0.02%)
Average percent text content for URLs ending in ".com" or ".com/": 17.71%
(I figured this might be a decent way to narrow down the results to commercial home pages.)
Average page size: 27,910 bytes
If there's another statistic you'd like to see, post a comment here and I'll query the database for it, as long as it can be computed in MySQL. The logged fields are: URL, page size (in bytes), percent content, and the date/time.
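For anyone curious how numbers like the ones above fall out of those four fields, here is a rough sketch in Python. It is not the code or the MySQL queries actually used; the names LogRow, is_dot_com_home and summarize are made up for illustration, and it assumes the log has at least one row.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class LogRow:
    """One logged result. Field names are hypothetical stand-ins for the four logged fields."""
    url: str
    page_size: int           # bytes
    percent_content: float   # e.g. 21.57 means 21.57%
    logged_at: datetime


def is_dot_com_home(url: str) -> bool:
    # Same rough filter as in the post: the URL ends in ".com" or ".com/".
    return url.endswith((".com", ".com/"))


def summarize(rows):
    """Compute the statistics listed in the post from a non-empty list of rows."""
    dot_com = [r for r in rows if is_dot_com_home(r.url)]
    return {
        "total_pages": len(rows),
        "avg_percent_content": mean(r.percent_content for r in rows),
        "highest_url": max(rows, key=lambda r: r.percent_content).url,
        "lowest_url": min(rows, key=lambda r: r.percent_content).url,
        # None when no URL matches the ".com" filter.
        "avg_percent_content_dot_com": (
            mean(r.percent_content for r in dot_com) if dot_com else None
        ),
        "avg_page_size": mean(r.page_size for r in rows),
    }
```

Calling summarize on a list of LogRow objects returns a dictionary with one entry per statistic. The same filter could presumably be written in SQL as a pattern match on the URL column, but the sketch keeps everything in one language.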
