Kliki Politechniki (Cliques of the University of Technology)

- author: tsissput

Introduction:

The goal is to visualize the scientists of the PUT (Poznań University of Technology) Computer Science Institute through their publications: who they publish with, when they publish, and so on. We used only publicly available data. The document is structured as follows:

  1. The Process section describes how the experiment was conducted.
  2. The Statistical data section gives some insight into quantitative measures of PUT scientists and their publications.
  3. Finally, the Visualization section presents the resulting graphs of scientists.

Process:

We decided to obtain the data from Google Scholar for obvious reasons. Unfortunately, it has no real API, which would have made our task much easier. We ended up parsing HTML with an open-source Python library. Like quite a few libraries we had tested before, it did not work properly, and we had to make a few improvements to it.
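To give a feeling for what such HTML parsing involves, here is a minimal sketch that uses only Python's standard `html.parser` module instead of the library we actually used. The markup, the class names `pub-title` and `pub-cites`, and the sample titles are invented for illustration and do not reflect the real Google Scholar pages.

```python
from html.parser import HTMLParser

# Invented sample markup; the real pages look different.
SAMPLE_HTML = """
<table>
<tr><td class="pub-title">Example paper A</td><td class="pub-cites">12</td></tr>
<tr><td class="pub-title">Example paper B</td><td class="pub-cites"></td></tr>
</table>
"""


class PublicationParser(HTMLParser):
    """Collect title/citation records from the hypothetical markup above."""

    def __init__(self):
        super().__init__()
        self.publications = []  # finished records
        self._field = None      # class name of the cell being read, if any
        self._current = {}      # text collected for the current row

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in ("pub-title", "pub-cites"):
            self._field = css_class

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if self._field is None:
            return
        finished, self._field = self._field, None
        if finished == "pub-cites":
            # The citation cell closes a record; an empty cell counts as 0.
            cites = self._current.get("pub-cites", "").strip()
            self.publications.append({
                "title": self._current.get("pub-title", "").strip(),
                "citations": int(cites) if cites.isdigit() else 0,
            })
            self._current = {}


parser = PublicationParser()
parser.feed(SAMPLE_HTML)
publications = parser.publications
```

Treating an empty citation cell as zero is a choice made for this sketch; a real scraper would also have to cope with paging and changing markup.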

In the program we create a graph in which each author corresponds to a node and each edge represents co-authorship of an article. The weight of an edge is the article's number of citations, and its additional attributes are the title and publication date. Apart from the name, each node stores the author's total number of publications and citations, as well as their scientific degree and the division within the Faculty of Computer Science they work for. The data format is suitable for import into Gephi and can easily be adjusted to other formats.
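The graph described above could be built roughly as follows. This is a simplified sketch, not our actual program: the record layout, the function name `build_graph` and the sample articles are made up, and the degree and division attributes of nodes are left out for brevity.

```python
from itertools import combinations

# Made-up sample records; real ones would come from the scraped profiles.
articles = [
    {"title": "Example paper A", "year": 2010, "citations": 12,
     "authors": ["Author One", "Author Two", "Author Three"]},
    {"title": "Example paper B", "year": 2012, "citations": 3,
     "authors": ["Author One", "Author Two"]},
    {"title": "Example paper C", "year": 2013, "citations": 7,
     "authors": ["Author Three"]},
]


def build_graph(articles):
    """Return (nodes, edges) for a co-authorship graph."""
    nodes = {}  # author name -> totals
    edges = []  # one edge per article and pair of co-authors
    for article in articles:
        authors = sorted(set(article["authors"]))
        for name in authors:
            node = nodes.setdefault(name, {"publications": 0, "citations": 0})
            node["publications"] += 1
            node["citations"] += article["citations"]
        for source, target in combinations(authors, 2):
            edges.append({
                "source": source,
                "target": target,
                "weight": article["citations"],  # edge weight = citations
                "title": article["title"],
                "year": article["year"],
            })
    return nodes, edges


nodes, edges = build_graph(articles)
```

Two authors who wrote several articles together get several parallel edges here, one per article, which matches the description of each edge carrying a single title and date.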

Statistical data:

In total we analyzed about 80 scientists out of 120; the rest were absent from Google Scholar. Here are some interesting charts:

Pie charts of scientific degrees and number of department employees

Total number of publications and citations per division

Total citations of each article

Total number of citations per author

Total number of publications per author

Visualization:

Here is the visualization of the data. We used CSV files generated with Python and imported into Gephi. Note that authors (nodes) carry their full number of citations and publications, but edges (co-authorship of an article) are created only from articles written by at least two Poznań University of Technology employees. This still gives meaningful results, but the reader has to be aware of it.
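The edge restriction and the CSV export can be sketched like this. It is an illustration under assumptions, not our original script: the staff set, the helper names and the sample articles are hypothetical, and the column names `Source`, `Target`, `Type` and `Weight` follow the headers Gephi's spreadsheet import recognizes.

```python
import csv
import io
from itertools import combinations

# Hypothetical staff list and sample articles, for illustration only.
PUT_STAFF = {"Author One", "Author Two"}
articles = [
    {"title": "Example paper A", "year": 2010, "citations": 12,
     "authors": ["Author One", "Author Two", "External Author"]},
    {"title": "Example paper B", "year": 2012, "citations": 3,
     "authors": ["Author One", "External Author"]},
]


def edge_rows(articles, staff):
    """Keep only edges between staff members who share an article."""
    rows = []
    for article in articles:
        internal = sorted(set(article["authors"]) & staff)
        # Fewer than two staff authors yields no pairs, hence no edges.
        for source, target in combinations(internal, 2):
            rows.append([source, target, "Undirected",
                         article["citations"], article["title"], article["year"]])
    return rows


def write_edges_csv(rows, handle):
    """Write the edge table with a header row."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Type", "Weight", "Title", "Year"])
    writer.writerows(rows)


rows = edge_rows(articles, PUT_STAFF)
buffer = io.StringIO()  # stands in for a real file opened with newline=""
write_edges_csv(rows, buffer)
lines = buffer.getvalue().splitlines()
```

The second sample article has only one staff author, so it produces no edge, even though it would still count towards that author's publication and citation totals on the node.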

  1. Who is the most cited scientist from PUT? See: MostCited – labels
  2. Who has the largest number of publications? See: Full – publications
  3. The graph divided into clusters and colored by Modularity Class pretty much reproduces the real, existing CS subdivisions. See: Modularity Class – labels
  4. How far are you from Jan Węglarz? A heat map based on common publications. See: JW Heat
  5. You may also want to check the new heat maps for Mikołaj Morzy (MM Heat) and Agnieszka Ławrynowicz (AL Heat).
  6. Who has the highest betweenness? See: Betweenness
  7. Last but not least: who has the largest number of PUT co-authors ("publication friends")? See: Degree labels
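The measures in the list were calculated in Gephi, but two of them are easy to reproduce by hand. The sketch below assumes that the heat maps are based on the number of co-authorship hops from the chosen scientist and that degree means the number of distinct co-authors. The node names and function names are placeholders.

```python
from collections import deque

# Placeholder edges; a repeated pair stands for two joint articles.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "B")]


def adjacency(edges):
    """Build an undirected neighbor map, collapsing parallel edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distances_from(graph, start):
    """Breadth-first search: number of hops from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def degrees(graph):
    """Number of distinct co-authors per node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


graph = adjacency(edges)
hops = distances_from(graph, "A")
friends = degrees(graph)
```

Scientists who cannot be reached from the starting node simply do not appear in `hops`, which would correspond to uncolored nodes on a heat map.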

Thanks for reading!

Authors: 100376, 98436, 98758
