{"id":463,"date":"2009-05-02T17:00:32","date_gmt":"2009-05-02T14:00:32","guid":{"rendered":"http:\/\/www.void.gr\/kargig\/blog\/?p=463"},"modified":"2009-05-02T17:00:32","modified_gmt":"2009-05-02T14:00:32","slug":"some-statistics-on-linux-greek-users-mailing-list-and-forumhelluggr","status":"publish","type":"post","link":"https:\/\/www.void.gr\/kargig\/blog\/2009\/05\/02\/some-statistics-on-linux-greek-users-mailing-list-and-forumhelluggr\/","title":{"rendered":"Some statistics on linux-greek-users mailing list and forum.hellug.gr"},"content":{"rendered":"<p>Out of boredom I decided to parse the Linux-Greek-Users (LGU) archives and create some graphs. Then I wrote a few more oneliners to derive some numbers from the archives. These numbers may or may not mean anything to someone; it&#8217;s entirely up to the reader. Since the archives contain some amount of spam (not too much though), one must take that into consideration as well while reading the numbers I extracted below&#8230;<\/p>\n<p>The first thing I did was to download the index file containing the links to the monthly archives since February 1997:<br \/>\n<code>wget http:\/\/lists.hellug.gr\/pipermail\/linux-greek-users\/<\/code><\/p>\n<p>Then I downloaded each month&#8217;s archive:<br \/>\n<code>for i in `grep date index.html | cut -d\"\\\"\" -f2`; do foo=`echo $i|cut -d\"\/\" -f1`; wget http:\/\/lists.hellug.gr\/pipermail\/linux-greek-users\/$i -O $foo-date.html ; done<\/code><br \/>\n<!--more--><br \/>\nAfter a while I had the full archives on my disk&#8230;<br \/>\nThen it was time for the first metric.<br \/>\n<strong>How many posts does each month have?<\/strong><br \/>\nThis was quite easy because mailman contains this information inside each month&#8217;s archive:<br \/>\n<code>for i in *-date.html; do count=`grep -i \"Messages:\" $i | cut -d\">\" -f 3|cut -d\"&lt;\" -f1`; echo \"$i $count\" >> count.txt;done<\/code><br \/>\nThat command gave me an output such as:<\/p>\n<blockquote><p>1999-May-date.html 
985<br \/>\n1999-November-date.html 1441<br \/>\n1999-October-date.html 1148<br \/>\n1999-September-date.html 1369<br \/>\n2000-April-date.html 690<br \/>\n2000-August-date.html 354<br \/>\n2000-December-date.html 444<br \/>\n2000-February-date.html 833<br \/>\n&#8230;<\/p><\/blockquote>\n<p>It didn&#8217;t take me a long time before I had this data entered inside OOcalc. Time for the first graph:<br \/>\nLGU posts per month:<br \/>\n<a href=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-posts_per_month.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-posts_per_month-300x77.png\" alt=\"lgu-posts_per_month\" title=\"lgu-posts_per_month\" width=\"300\" height=\"77\" class=\"alignnone size-medium wp-image-464\" srcset=\"https:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-posts_per_month-300x77.png 300w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-posts_per_month-1024x263.png 1024w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-posts_per_month.png 1203w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>And the average monthly posts per year:<br \/>\n<a href=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-monthly_average_posts_per_year.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-monthly_average_posts_per_year-299x165.png\" alt=\"lgu-monthly_average_posts_per_year\" title=\"lgu-monthly_average_posts_per_year\" width=\"299\" height=\"165\" class=\"alignnone size-medium wp-image-466\" srcset=\"https:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-monthly_average_posts_per_year-299x165.png 299w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-monthly_average_posts_per_year.png 562w\" sizes=\"auto, (max-width: 299px) 100vw, 299px\" \/><\/a><\/p>\n<p>One can certainly argue there&#8217;s a tendency here. 
2008 had fewer posts than any other year&#8230;<\/p>\n<p>What&#8217;s more interesting is <strong>who is actually writing on this list<\/strong>. To extract that information I wrote the following oneliners:<br \/>\nFirst of all I converted all html files to utf8 because the archives are kept in iso-8859-7 encoding and I am using a unicode terminal:<br \/>\n<code>for i in *-date.html; do iconv -f iso-8859-7 -t utf-8 $i -o $i.utf; done<\/code><br \/>\nThen I could find all the authors for each monthly archive:<br \/>\n<code>for i in *-date.html.utf; do grep \"&lt;i&gt;\" $i | cut -d\">\" -f 2 >$i-authors.txt; done<\/code><br \/>\nThen count each author&#8217;s posts:<br \/>\n<code>for i in *-date.html.utf-authors.txt; do sort $i | uniq -c | sort -n >$i-sorted-count; done<\/code><br \/>\nThe easiest thing to do is to <strong>count how many people post per month<\/strong>. Is the number of users of the mailing list increasing or decreasing?<br \/>\n<code>for i in *-sorted-count; do wc -l $i >> posters-per-month; done<\/code><br \/>\nThat gave a nice output like this:<\/p>\n<blockquote><p>55 1997-April-date.html.utf-authors.txt-sorted-count<br \/>\n49 1997-August-date.html.utf-authors.txt-sorted-count<br \/>\n88 1997-December-date.html.utf-authors.txt-sorted-count<br \/>\n62 1997-February-date.html.utf-authors.txt-sorted-count<br \/>\n41 1997-July-date.html.utf-authors.txt-sorted-count<br \/>\n60 1997-June-date.html.utf-authors.txt-sorted-count<br \/>\n62 1997-March-date.html.utf-authors.txt-sorted-count<br \/>\n69 1997-May-date.html.utf-authors.txt-sorted-count<br \/>\n&#8230;\n<\/p><\/blockquote>\n<p>I inserted that into OOcalc again and here is the output:<br \/>\n<a href=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-posters_per_month1.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-posters_per_month1-300x179.png\" alt=\"lgu-posters_per_month1\" title=\"lgu-posters_per_month1\" width=\"300\" height=\"179\" 
class=\"alignnone size-medium wp-image-469\" srcset=\"https:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-posters_per_month1-300x179.png 300w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/lgu-posters_per_month1.png 672w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>The decline in the number of posters is clearly visible.<\/p>\n<p>Another interesting statistic would be the months with the most and fewest posters respectively:<br \/>\n<strong>Months with most posters<\/strong>:<br \/>\n<code>sort -n posters-per-month | tail -n10<\/code><\/p>\n<blockquote><p>166 1999-May-date.html.utf-authors.txt-sorted-count<br \/>\n167 2001-October-date.html.utf-authors.txt-sorted-count<br \/>\n168 1999-December-date.html.utf-authors.txt-sorted-count<br \/>\n173 1999-October-date.html.utf-authors.txt-sorted-count<br \/>\n173 2001-June-date.html.utf-authors.txt-sorted-count<br \/>\n174 2001-May-date.html.utf-authors.txt-sorted-count<br \/>\n176 1999-September-date.html.utf-authors.txt-sorted-count<br \/>\n184 1999-November-date.html.utf-authors.txt-sorted-count<br \/>\n192 2000-March-date.html.utf-authors.txt-sorted-count<br \/>\n222 2004-November-date.html.utf-authors.txt-sorted-count<\/p><\/blockquote>\n<p><strong>Months with fewest posters<\/strong>:<br \/>\n<code>sort -n posters-per-month | head -n10<\/code><\/p>\n<blockquote><p>16 2008-August-date.html.utf-authors.txt-sorted-count<br \/>\n32 2008-December-date.html.utf-authors.txt-sorted-count<br \/>\n36 2008-July-date.html.utf-authors.txt-sorted-count<br \/>\n41 1997-July-date.html.utf-authors.txt-sorted-count<br \/>\n44 2008-May-date.html.utf-authors.txt-sorted-count<br \/>\n49 1997-August-date.html.utf-authors.txt-sorted-count<br \/>\n51 2008-September-date.html.utf-authors.txt-sorted-count<br \/>\n51 2009-March-date.html.utf-authors.txt-sorted-count<br \/>\n53 2003-August-date.html.utf-authors.txt-sorted-count<br \/>\n53 2009-February-date.html.utf-authors.txt-sorted-count<\/p><\/blockquote>\n<p>Then I 
got interested in finding out <strong>who the top posters are throughout the archives<\/strong>. I didn&#8217;t want to write anything complex in order to sum every user&#8217;s posts, so I decided to find the top5 posters of each month and then see whose names are repeated over and over in the top5.<\/p>\n<p>Using the following oneliner I was able to find the top5 posters of each month:<br \/>\n<code>for i in *-count; do tail -n5 $i >> top5-of-each-month.txt; done<\/code><\/p>\n<p>And then using a bit of perl I was able to extract the posters appearing most often; here&#8217;s the top 20 of them:<\/p>\n<blockquote><p>            8  Alexios Chouchoulas<br \/>\n      9  fs<br \/>\n      9  Spiros Bolis<br \/>\n     10  Alexandros Papadopoulos<br \/>\n     10  Giannis Stoilis<br \/>\n     10  Harris Kosmidhs<br \/>\n     10  Vasilis Vasaitis<br \/>\n     11  Giannis Papadopoulos<br \/>\n     11  Michael Iatrou<br \/>\n     11  Panos Katsaloulis<br \/>\n     11  \u0386\u03b3\u03b3\u03b5\u03bb\u03bf\u03c2 \u039f\u03b9\u03ba\u03bf\u03bd\u03bf\u03bc\u03cc\u03c0\u03bf\u03c5\u03bb\u03bf\u03c2<br \/>\n     16  George Notaras<br \/>\n     17  Nick Demou<br \/>\n     24  George Daflidis-Kotsis<br \/>\n     24  Michalis Kabrianis<br \/>\n     33  I.Ioannou<br \/>\n     41  DJ Art<br \/>\n     52  V13<br \/>\n     74  Christos Ricudis<br \/>\n     83  Giorgos Keramidas<\/p><\/blockquote>\n<p>The number appearing before the name is the number of months in which the poster was among that month&#8217;s top5 posters. That means that Giorgos Keramidas is in the top5 posters for 83 months out of the 147 months of archives that I parsed. Pretty impressive!<\/p>\n<p>Then I wanted to create some graphs about <a href=\"http:\/\/forum.hellug.gr\">http:\/\/forum.hellug.gr<\/a> as well. HELLUG&#8217;s forum has been around for less than 2 years, so the graphs cannot really be compared to the ones from LGU. 
I just put them here for completeness.<\/p>\n<p><strong>Forum.hellug.gr &#8211; Posts per month<\/strong>:<br \/>\n<a href=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/forumhelluggr-posts_per_month.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/forumhelluggr-posts_per_month-300x165.png\" alt=\"forumhelluggr-posts_per_month\" title=\"forumhelluggr-posts_per_month\" width=\"300\" height=\"165\" class=\"alignnone size-medium wp-image-476\" srcset=\"https:\/\/www.void.gr\/kargig\/blog\/wp-content\/forumhelluggr-posts_per_month-300x165.png 300w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/forumhelluggr-posts_per_month.png 556w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><strong>Forum.hellug.gr &#8211; New members per month<\/strong>:<br \/>\n<a href=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/forumhelluggr-new_members_per_month.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.void.gr\/kargig\/blog\/wp-content\/forumhelluggr-new_members_per_month-300x215.png\" alt=\"forumhelluggr-new_members_per_month\" title=\"forumhelluggr-new_members_per_month\" width=\"300\" height=\"215\" class=\"alignnone size-medium wp-image-477\" srcset=\"https:\/\/www.void.gr\/kargig\/blog\/wp-content\/forumhelluggr-new_members_per_month-300x215.png 300w, https:\/\/www.void.gr\/kargig\/blog\/wp-content\/forumhelluggr-new_members_per_month.png 527w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>I won&#8217;t make any further comments&#8230; I&#8217;d be glad to see yours though \ud83d\ude42<\/p>\n<p>P.S. I know that I could have used some form of database to store and process the results of those commands, but I wanted to keep it as simple as possible. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Out of boredom I decided to parse the Linux-Greek-Users (LGU) archives and create some graphs. 
Then I wrote a few more oneliners to derive some numbers from the archives. These numbers may or may not mean anything to someone; it&#8217;s entirely up to the reader. Since the archives contain some amount of spam (not [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ep_exclude_from_search":false,"footnotes":""},"categories":[3],"tags":[65,68,63,66,595,67,47,64],"class_list":["post-463","post","type-post","status-publish","format-standard","hentry","category-linux","tag-forum","tag-forumhelluggr","tag-hellug","tag-lgu","tag-linux","tag-linux-greek-users","tag-oneliner","tag-statistics"],"aioseo_notices":[],"views":187637,"_links":{"self":[{"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/posts\/463","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/comments?post=463"}],"version-history":[{"count":15,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/posts\/463\/revisions"}],"predecessor-version":[{"id":484,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/posts\/463\/revisions\/484"}],"wp:attachment":[{"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/media?parent=463"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/categories?post=463"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.void.gr\/kargig\/blog\/wp-json\/wp\/v2\/tags?post=463"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}