08/07/2013
Greek PGP Web of Trust 2012 edition
I’m very glad to host this guest post. Dorothea put some real effort into it. So…enjoy!
—————————————————————————————————————–
In 2008 Patroklos Argyroudis created the first visualization of the Greek PGP web of trust, based on information supplied mostly by people who attended a keysigning party in Thessaloniki. You can read his related posts at sysc.tl/tag/web-of-trust/ [0]
In 2012, during the second cryptoparty [1] at hackerspace.gr [2], George Kargiotakis asked whether someone wanted to update the network. I decided to undertake the task and you can see some of the visualizations below.
Visualizations:
1. Venn of persons that have signed others and of persons that have been signed by others
2. Greek PGP network for 2012
3. Trust in the 2012 Greek PGP network
4. Highlighting the persons who have signed more people
5. Do people trust more persons than they are trusted by?
6. Geolocation of individuals (globally)
7. Geolocation of individuals (in Greece)
8. Gender percentages
9. Educational and research institutes in the PGP network
10. Animation: Formation through time of associations that were active in 2012
11. Communities and the ten most important positions in the 2012 Greek PGP network according to eigenvector centrality
Additional sections:
12. Outline of methodology
13. Keyserver & keys used
14. Notes on methodology
15. Software and visualization notes
16. Problems encountered and how you can help
17. Future plans
18. Web references
19. Synopsis
20. Communication
21. Thanks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 ~ VENN of persons that have signed others and of persons that have been signed by others
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 ~ Greek PGP web of trust in 2012
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Lines represent associations between two persons.
The line color is related to key signing reciprocity.
Green: both persons connected with the line have signed each other
Red: one person has signed the other, but the other has not signed back
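A minimal sketch of how the green/red classification above could be computed from a list of (signer, signee) pairs. The function name and the input format are illustrative assumptions, not the procedure actually used for the figure.

```python
def edge_colours(edges):
    """Map each pair of persons to 'green' (mutual) or 'red' (one-way)."""
    # Self-signatures are dropped: they say nothing about reciprocity.
    directed = {(signer, signee) for signer, signee in edges if signer != signee}
    colours = {}
    for signer, signee in directed:
        pair = frozenset((signer, signee))
        # Green only if the signature also exists in the opposite direction.
        colours[pair] = "green" if (signee, signer) in directed else "red"
    return colours


colours = edge_colours([("A", "B"), ("B", "A"), ("A", "C")])
```

Here A and B have signed each other, so their line would be green, while the line between A and C would be red.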
Each circle represents a person from Greece, who had at least one key ‘active’ (not expired or revoked) on the 31st of December 2012 and who had at least one PGP association with another person from Greece. A person was defined as Greek based on the name/alias associated with his/her keys or based on his/her e-mail address associated with the key. Please see the section ‘Key filtering based on names and e-mail addresses -inclusion and exclusion of keys’ for a list of the criteria used. In some cases a circle represents a certification authority.
The circle diameter is related to trust (how many people have signed that person’s key(s) ).
The circle color is related to how many people that person has signed.
grey: has signed 0 persons (but has been signed by at least one other Greek)
light blue: has signed 1 to 5 person(s)
purple: has signed 6 to 10 persons
pink: has signed 11 to 20 persons
orange: has signed 21 to 30 persons
red: has signed 31 to 40 persons
~~A note on data not being displayed on this network view~~
In addition to the main network, there were many small groups of people, unconnected to it. I decided to display only a selection of them. Specifically, in addition to the visible groups there were:
52 cases of associations between only 2 persons
13 cases of associations between 3 persons
8 cases of associations between 4 persons
1 case of associations between 5 persons
1 case of associations between 6 persons
1 case of associations between 7 persons and
1 case of associations between 9 persons
The data from these associations are taken into account for all the other visualizations.
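Counts like the ones listed above can be obtained by finding the connected groups in the network, ignoring the direction of the signatures, and tallying them by size. The following is only a sketch under that assumption; the function name and the sample data are made up.

```python
from collections import Counter, defaultdict


def group_sizes(edges):
    """Return a Counter mapping group size -> number of groups of that size."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    sizes = Counter()
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first walk over everyone reachable from this person.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            person = stack.pop()
            size += 1
            for other in neighbours[person]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        sizes[size] += 1
    return sizes


sizes = group_sizes([("A", "B"), ("B", "C"), ("D", "E"), ("F", "G")])
```

In this toy input there is one group of 3 persons and two groups of 2 persons.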
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 ~ Trust in the 2012 Greek PGP network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The size of a surname is related to how many persons have signed that person’s key(s).
The coloring is purely for aesthetic purposes.
Note: The entries ‘HY’, ‘uoc’, ‘noc’ and ‘cert’ come from key owner names separated by underscores (e.g. uoc_cert). I decided not to erase them in order to keep the ‘key’ shape intact.
In 2012, the ten persons who had received signatures from the most people were:
1. Manifavas H
2. Maistrelis K
3+4. Glynos D & Zavras A
5+6. Bolis S & Liambotis F
7+8. Kargiotakis G & HY457_csd_uoc
9+10. Argyroudis P & Margaritis K
Names that appear together have been signed by an equal number of people.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4 ~ Highlighting the persons who have signed more people
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The size of a surname is related to how many people that person has signed (2012 Greek WoT data).
The coloring is purely for aesthetic purposes.
In 2012, the ten persons who had signed the most people up to then were:
1. Roussos N
2. Maistrelis K
3. Glynos D
4. Margaritis K
5+6. Kargiotakis G & Liambotis F
7+8. Balaskas E & Daskalopoulos K
9. Mavrogiannopoulos N
10. Kokkalis N
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 ~ Do people trust more persons than they are trusted by?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I was interested in whether persons in the PGP network were more trusting or more cautious, if something like that can be inferred from the number of people that one has signed or has been signed by.
There were cases of persons who had signed no one (individuals on the Y axis) or who had been signed by no one (individuals on the X axis).
Persons on the diagonal line in the middle have signed exactly the same number of persons as they have been signed by (e.g. Argyroudis P.).
Everyone above that line (x<y) has signed fewer people than s/he has been signed by, while everyone below that line (x>y) has signed more people than s/he has been signed by.
Please note that in some cases, a circle denotes more than one person, if the individuals have exactly the same XY co-ordinates.
It was found that 200 persons were a bit more trusting, while 198 persons were more cautious, and 78 persons had signed exactly the same number of persons as they had been signed by*.
(* however, that does not necessarily mean that they have signed exactly the same persons that they have been signed by)
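The comparison in this section can be sketched as follows, assuming that x is the number of distinct persons someone has signed and y the number of distinct persons who have signed him/her, and reading "more trusting" as x>y. The function name, the labels and the sample edges are illustrative only.

```python
from collections import defaultdict


def trust_balance(edges):
    """Count persons with x>y, x<y and x==y from (signer, signee) pairs."""
    signed = defaultdict(set)      # person -> persons whose keys s/he signed
    signed_by = defaultdict(set)   # person -> persons who signed his/her keys
    for signer, signee in edges:
        if signer != signee:       # ignore self-signatures
            signed[signer].add(signee)
            signed_by[signee].add(signer)
    counts = {"more_trusting": 0, "more_cautious": 0, "balanced": 0}
    for person in set(signed) | set(signed_by):
        x = len(signed[person])
        y = len(signed_by[person])
        if x > y:
            counts["more_trusting"] += 1
        elif x < y:
            counts["more_cautious"] += 1
        else:
            counts["balanced"] += 1
    return counts


counts = trust_balance(
    [("A", "B"), ("A", "C"), ("B", "A"), ("C", "D"), ("D", "C")]
)
```

In the sample, A has signed two persons and been signed by one, C has signed one and been signed by two, and B and D are balanced.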
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
6 & 7 ~ Geolocation of individuals, globally and in Greece
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each sphere represents a person from Greece, who
– had at least one key ‘active’ (not expired or revoked) on the 31st of December 2012
– had at least one association with another person from Greece and
– who had at least one e-mail address from which a location could be deduced
The central sphere represents all individuals whose location could not be determined from the e-mail address associated with the PGP key.
The coloring is related to the location.
Please note that this network representation includes the small groups not displayed in the main network visualization and that locations may have changed, as they were deduced from the e-mail addresses associated at the time that the keys were created.
Some PGP keys were created as part of a higher education curriculum. This is reflected, for example, at the bottom of the network by the group (of students) around Manifavas H., who is an assistant professor at the TEI of Crete. A similar case could be the circular group in the top right part of the graph, which is unconnected to the main network and mostly includes persons from the University of Crete (visualization 11).
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
8 ~ Gender
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Pie chart of the gender deduced for the key owners in the 2012 Greek PGP web of trust.
The gender was deduced from the name or nickname associated with the PGP key.
For some persons that used neutral nicknames, the gender could not be deduced.
There is a clear imbalance, with females representing only 9%.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
9 ~ Educational and research institutes in the PGP network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The sizes of the boxes and the numbers represent the number of people from Greece with active PGP keys (31st of December 2012) who have an e-mail address from the respective educational or research institute associated with at least one of their keys.
In 2012, the five educational/research institutes with the most persons who had used the institutional e-mail address in one of their keys were:
1. University of Crete
2. National Technical University of Athens
3. University of Patras
4. Aristotle University of Thessaloniki
5. Technological Educational Institute of Crete
As you can see from the graph, some Greeks in the PGP network are associated with foreign universities, such as MIT.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10 ~ Animation: Formation through time of associations that were active in 2012
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each circle represents a person from Greece.
Green lines: signature reciprocity
Red lines: one person has signed the other, but the other has not signed back.
The circle diameter is related to trust
The circle color is related to how many people that person has signed.
Please note that the animation does not show the evolution of the Greek PGP network, but rather how the relationships that were present in 2012 formed through time. Specifically, the animation does not display relationships that were lost, but only the ones that were created and still existed at the end of 2012. Creating an animation of the Greek PGP network evolution is on the to-do list.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
11 ~ Communities and the ten most important positions in the 2012 Greek PGP network according to eigenvector centrality
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The size of each sphere is related to the number of persons that have signed the keys of that particular individual (trust). The color of the sphere depends on the community to which it was found to belong. Please note that the number of the detected communities depends on the value set for the modularity parameter during the analysis. The color of the directed edges is the same as the color of the source sphere. The small spheres on the top (back) are the small groups of persons which were not depicted on the main network (visualization 2).
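For readers unfamiliar with the centrality measure named in the title of this section (usually called eigenvector centrality), here is a small power-iteration sketch. It is not the implementation behind the figure: it treats signatures as undirected links, which is a simplifying assumption, and the function name and sample data are made up.

```python
from collections import defaultdict


def eigenvector_centrality(edges, iterations=100):
    """Power iteration on the undirected version of the signature graph."""
    neighbours = defaultdict(set)
    for signer, signee in edges:
        if signer != signee:
            neighbours[signer].add(signee)
            neighbours[signee].add(signer)
    if not neighbours:
        return {}
    scores = {person: 1.0 for person in neighbours}
    for _ in range(iterations):
        # A person's new score is the old score plus the scores of all
        # neighbours; keeping the old score stops the values oscillating.
        updated = {
            person: scores[person] + sum(scores[n] for n in neighbours[person])
            for person in neighbours
        }
        largest = max(updated.values())
        scores = {person: value / largest for person, value in updated.items()}
    return scores


scores = eigenvector_centrality([("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")])
```

In this star-shaped example the person in the middle (c) gets the highest score, and the three persons connected only to c get equal, lower scores.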
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12 ~ Outline of methodology
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
– Creation of list of persons from the 2008 Greek PGP web of trust, which was made by Patroklos Argyroudis (check out his posts at http://sysc.tl/tag/web-of-trust/ ) to be used for the network expansion
– Creation of list of common Greek first names to be used for network enrichment
– Creation of list of common Greek mail domains (ISPs, Universities, research institutes, see below) to be used for network enrichment
– Retrieval of HTML pages with PGP key and associated information from public keyserver
– Consolidation of data in a text file
– Data transformation
– Filtering out expired and revoked keys in spreadsheet program
– Filtering out keys from foreign people and from Greek people who had no associations with at least one other person from Greece
– Name standardization – necessary step for network creation
– Data formatting
– Network construction using Cytoscape
– Creation of other visualizations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
13 ~ Keyserver & keys used
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Keyserver used:
http://keyserver.layer42.net:11371/ (doesn’t seem to work at the moment -June 2013)
Keys used for network construction:
Public PGP keys active (not expired or revoked) on the 31st of December 2012, belonging to people who had at least one PGP association with another person from Greece.
Please note that many of the PGP keys that were included in the analysis did not have an expiration date.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
14 ~ Notes on methodology
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~Initial file retrieval~~
Initially, key-associated information was retrieved for the people in the Greek PGP web of trust constructed in 2008 by Patroklos Argyroudis ( http://sysc.tl/tag/web-of-trust/ ). Then, the network was expanded by adding the keys of people who had signed the retrieved keys from 2008 onwards, until no more keys of Greek people could be added that way.
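This expansion step was done by hand, but it could be scripted roughly as below. The sketch assumes the signer information is already available in a dictionary (`signers_of`) and that a function (`is_greek`) decides whether a person is kept; both are hypothetical placeholders.

```python
from collections import deque


def expand_network(seed_people, signers_of, is_greek):
    """Add signers of known people until no new Greek person turns up."""
    known = set(seed_people)
    queue = deque(seed_people)
    while queue:
        person = queue.popleft()
        # signers_of maps a person to the people who signed his/her keys.
        for signer in signers_of.get(person, ()):
            if signer not in known and is_greek(signer):
                known.add(signer)
                queue.append(signer)
    return known


signers_of = {"A": ["B", "X"], "B": ["C"], "C": ["A"], "D": ["E"]}
found = expand_network(["A"], signers_of, is_greek=lambda p: p != "X")
```

Starting from A, the sample run reaches B and C, skips X (not kept by `is_greek`) and never sees D or E.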
~~Limitations of initial approach~~
This procedure misses relationships in two types of cases:
-persons that have been signed by someone from the network core but have not signed anyone in the core
-persons that form groups totally unconnected to this network
Thus..
~~Data enrichment follow-up: querying keyserver for common names & mail domains~~
..in order to find people that might not be connected to those whose keys were already retrieved, two options were considered
(1) Retrieval of keys for common Greek first names.
This was tried with a couple of names. However, it was soon abandoned because:
~ some people were writing their names using Greek characters and some using Latin ones
~ different people were writing the same Greek name with many variations (consider Giannis, Yiannis, John)
~ some first names are also common in other languages (e.g. querying the server for ‘Μαρία’ is fine, but ‘Maria’ also returns women from Spain and other countries).
(2) Retrieval of keys for common Greek e-mail domains
I created a list of e-mail domains from:
~ The already retrieved keys
~ Greek ISPs
~ Greek universities, Technical Educational Institutes, Colleges, Academies and Schools (http://en.wikipedia.org/wiki/List_of_universities_in_Greece)
~~Limitations of querying the public keyserver with e-mail domains~~
It should be noted that, while this approach worked well for most e-mail domains, querying the keyserver for otenet.gr and a few others returned errors due to the multitude of hits. Therefore, I used a couple of more specific domains (e.g. hq.otenet.gr) in order to retrieve at least some of the associated keys.
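For anyone who would like to script such queries, this sketch builds lookup URLs. The `/pks/lookup` path and the `op=vindex` (verbose index) parameter follow the usual HKP keyserver convention; whether these exact queries were the ones used for this post is an assumption, and the host shown may no longer respond. The function name is made up.

```python
from urllib.parse import urlencode


def lookup_url(search_term, host="keyserver.layer42.net", port=11371):
    """Build an HKP-style lookup URL for a name or an e-mail domain."""
    # urlencode percent-encodes non-ASCII search terms such as Greek names.
    query = urlencode({"op": "vindex", "search": search_term})
    return f"http://{host}:{port}/pks/lookup?{query}"


domain_url = lookup_url("hq.otenet.gr")
name_url = lookup_url("Μαρία")
```

The function only builds the URL; fetching the page and handling the "too many hits" errors would have to be added separately.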
~~Key filtering based on names and e-mail addresses – inclusion and exclusion of keys~~
Filtering out keys of non-Greek people
Keys included were from:
– persons with Greek surnames, even if one or more of their associated e-mail addresses were not ending in .gr
– people with Greek first names, even if one or more e-mail addresses associated with the specific keys were not ending in .gr
People excluded were:
– those with foreign first name, last name and e-mail addresses
– those with foreign first name and last name but at least one Greek e-mail address. These persons were few. I considered them as foreign people living/working in Greece but there is of course a chance that they might have Greek parents/origins.
Pseudonyms:
– If the pseudonym was a Greek word, the associated key was included
– If the pseudonym was foreign, but the associated e-mail address had a Greek domain, the key was included
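The inclusion rules above can be summarized in a few lines of code. Judging whether a name or pseudonym is Greek was done manually, so in this sketch those judgements are simply inputs; the class, its fields and the function name are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class KeyOwner:
    is_pseudonym: bool  # True when the name field holds an alias
    greek_name: bool    # real name: Greek first name or surname; alias: a Greek word
    greek_email: bool   # at least one associated address has a Greek domain


def include_owner(owner):
    """Return True if the owner's keys are kept in the network."""
    if owner.is_pseudonym:
        # A foreign pseudonym with no Greek e-mail address is treated as
        # excluded here; the rules above imply this but do not state it.
        return owner.greek_name or owner.greek_email
    # Real names: a Greek e-mail address alone is not enough.
    return owner.greek_name


kept = include_owner(KeyOwner(is_pseudonym=True, greek_name=False, greek_email=True))
dropped = include_owner(KeyOwner(is_pseudonym=False, greek_name=False, greek_email=True))
```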
~~Other key filtering rules~~
– exclusion of keys expired or revoked as of the 31st of December 2012 and
– exclusion of keys not signed by or having signed at least one Greek person (other than the key owner)
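These two rules were applied in a spreadsheet, but they could be sketched in code as follows. The function names and the input format are assumptions, as is the choice to count a key that expires exactly on the cutoff day as expired.

```python
from datetime import date

CUTOFF = date(2012, 12, 31)


def is_active(expires_on, revoked_on, cutoff=CUTOFF):
    """Both arguments are dates, or None when the key has no such date."""
    if revoked_on is not None and revoked_on <= cutoff:
        return False
    # Keys without an expiration date never expire.
    return expires_on is None or expires_on > cutoff


def linked_owners(signatures):
    """Owners with a signature link to somebody other than themselves."""
    linked = set()
    for signer, signee in signatures:
        if signer != signee:
            linked.update((signer, signee))
    return linked


owners = linked_owners([("A", "A"), ("A", "B"), ("C", "C")])
```

In the sample, C only has a self-signature and is therefore dropped.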
~~Name standardization~~
Some people had several active keys, but their name was not written in exactly the same way in all of them.
The following types of cases were encountered:
– transposition of the first and last name (Jane Doe and Doe Jane)
– variations in writing the first name or surname (Giannis, Yiannis)
– using the full first name in one key and a shortened one in another key (Panagiotis, Akis)
– adding a comment in the name field in one/some of the keys
Name standardization was necessary, as the network created displays the relationships between people and not between individual keys.
Therefore, names were standardized as follows:
– a Last name First name format was used,
– comments were removed from the name field and
– if a person had used several ways to write his/her name, one of them was chosen for all his/her keys
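A rough sketch of how this standardization could be automated. Deciding which spellings belong to the same person still needs a hand-made lookup table; the table below, with its placeholder names, and the function name are purely hypothetical.

```python
import re

# Hypothetical, hand-made table: lower-cased variant -> chosen spelling.
CANONICAL = {
    "jane doe": "Doe Jane",
    "doe jane": "Doe Jane",
}


def standardize(raw_name, canonical=CANONICAL):
    """Strip comments and extra spaces, then apply the lookup table."""
    # Comments in a PGP name field are conventionally written in parentheses.
    without_comment = re.sub(r"\s*\([^)]*\)", "", raw_name)
    cleaned = " ".join(without_comment.split())
    return canonical.get(cleaned.lower(), cleaned)


name = standardize("Jane Doe (work key)")
```

Names that are not in the table are returned with only the comment and extra spaces removed.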
~~Data formatting prior to network construction~~
A tab delimited file was constructed, with a source and a target column, for import into Cytoscape. Every line denoted a directed relationship between two persons (signer and signee). This was the minimal information required for network construction.
For some of the visualizations, the following information was added:
-Year of signing (the oldest entry if a person has signed someone else multiple times)
-Gender (male, female or undefined for some pseudonyms)
-Research or educational institute
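The file described above could be produced along these lines. The column names, the function name and the sample rows are assumptions; the sketch keeps the oldest year when the same person has signed someone more than once and skips self-signatures.

```python
import csv
import io


def build_edge_table(signatures):
    """signatures: iterable of (signer, signee, year) tuples."""
    earliest = {}
    for signer, signee, year in signatures:
        if signer == signee:
            continue  # self-signatures are not relationships
        pair = (signer, signee)
        if pair not in earliest or year < earliest[pair]:
            earliest[pair] = year
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "target", "year"])
    for (signer, signee), year in sorted(earliest.items()):
        writer.writerow([signer, signee, year])
    return buffer.getvalue()


table = build_edge_table([("A", "B", 2010), ("A", "B", 2008), ("B", "A", 2011)])
```

The resulting text can be saved to a file and imported as a network table, with the first column as source and the second as target.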
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
15 ~ Software and visualization notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~Software used for network construction~~
Cytoscape, an open source bioinformatics software platform for complex network analysis and visualization, was used.
~~Tag cloud creation~~
Tag clouds were created online using Wordle (by Jonathan Feinberg) or Tagxedo.
~~Geolocated Greek PGP 2012 network visualizations~~
The network visualization was created using Gephi, a free and open-source platform for interactive network visualizations.
The globe image from the countries visualization was extracted from the Marble software, an open source virtual globe and world atlas.
~~Representation of Educational & research institutes in the network~~
The initial visualization was created using Many Eyes and was subsequently modified.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
16 ~ Problems encountered and how you can help
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The main ones were:
– e-mail domain queries returning errors when there were too many hits
– name variations in keys of the same person
– absence of automation i) in the process of obtaining information from the keyserver and ii) in data transformation and filtering.
If you would like to help in scripting the file retrieval from the keyserver or other parts of the methodology, feel free to contact me.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17 ~ Future plans
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are various network analysis metrics which I have not included here. I am not sure if they interest people, so maybe I will update this post at some later point.
Besides automating some of the steps of the analysis, the to-do list includes an animation of the network evolution, including the relationships that disappear.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
18 ~ Web references
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[0] sysc.tl/tag/web-of-trust/ – 2008 Greek PGP web of trust by Patroklos Argyroudis
[1] https://www.hackerspace.gr/wiki/CryptoParty
[2] https://www.hackerspace.gr/
[3] keyserver.layer42.net:11371
[4] www.cytoscape.org – open source software for network analysis
[5] www.wordle.net – tag cloud creation
[6] www.tagxedo.com – tag cloud creation
[7] gephi.org – open source software for network analysis
[8] marble.kde.org – open source virtual globe and world atlas software
[9] www-958.ibm.com – data visualisation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
19 ~ Synopsis
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In 2012 there were 478 persons from Greece who had an active PGP key (not revoked or expired by 31.12.2012) and had at least one association with another Greek with an active PGP key (please see the methodology section for limitations). The majority of the keys did not have an associated expiry date. A number of those keys were created as part of a higher education curriculum, as reflected by the TEI of Crete community and probably the University of Crete community (image 12). Of the 478 persons, 9% were females (image 8).
Geolocation was obtained for 205 persons. The majority (175) lived in Greece, while the rest either lived or had lived in a foreign country (image 6), or were associated with two countries (Greece and a foreign one). For those for whom a city could be identified, it is visible in image 7. Please note that location does not necessarily reflect a place of living in 2012, but rather the location of a person at the time of key creation, which may or may not have changed.
Looking into educational and research institutes as reflected by the key-associated e-mail addresses, the University of Crete and the National Technical University of Athens are in the first and second position, with 30 and 29 persons, respectively. However, most of the persons from the University of Crete are likely to be students who created their keys for a class. It should be added that some Greeks created their PGP keys while at foreign universities, such as MIT (image 9).
Looking into the communities in the PGP network graph (image 11), there were many associations between just two to nine persons, unconnected with the main network (top of the graph). Focusing on individuals, Manifavas H. appears to be the person who has been signed by the most people, Roussos N. has signed more persons than anyone else, and Maistrelis K. appears to have the highest number of total connections.
Finally, an animation of the formation of connections active in 2012 is visible in image 10.
If you notice any mistake, my apologies; feel free to contact me so that I can correct it. This analysis might be updated in the future with a new version of the Greek PGP web of trust, overcoming the keyserver limitations and hence with better data coverage.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
20 ~ Communication
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Any comments, criticisms, questions and suggestions are welcome.
Apologies if there is any mistake in names or in other places. If you notice any, please let me know.
If you want to message me directly,
-for people who would like to test their visual acuity, you can try to read this
-for the rest, it is dorothea.kazazi (skip the next 3 lines)
Cupcake ipsum dolor sit amet gingerbread. Danish chocolate bar marzipan
gummies sesame snaps. Candy canes brownie toffee bonbon. Jelly tart gummi
bears fruitcake fruitcake muffin jelly beans lollipop.
@gmail.com
( if you e-mail me, adding ‘pgp network’ in the subject line will ensure that the message will go to my inbox and won’t be lost in the spam folder )
Dorothea
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
21 ~ Thanks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Guest post kindly hosted at void.gr, thanks.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
22 ~ Extra Files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
List of People – PGP KeyIDs: PGPkeys_of_sources
—————————————————————————————————————–
I would only like to add here that Dorothea told me she didn’t have a lot of programming experience before she took up this huge challenge. I think we all need to say a very big “thank you” to her. Her work is just awesome! Can’t wait for the next version!
A great post to celebrate the 9 years of this blog 😀
Filed by kargig at 20:18 under Encryption,Greek,Internet,Linux,Privacy
Tags: gpg, keys, pgp, web of trust, WOT
3 Comments
Really enjoyable work. You should now talk her into working on WOT “walking” algorithms 😉
I think there is a bug on the way you exclude the revoked keys. It seems that I have signed all these people, that haven’t signed my key back, but these signatures are made with my old (revoked) key. Actually all these connections are reciprocal (with my revoked key), but you have excluded only the one part 🙂
thanks argp 🙂
and thank you Niko for pointing that out 🙂 I will check it and make the respective updates