
Community structure of web-graphs of academic institutions

Alexey Medvedev

Sobolev Institute of Mathematics (ИМ СО РАН), Novosibirsk, Russia

Introduction

The structure of real-life networks is far from homogeneous. This structural heterogeneity can be understood through the community structure of the network. Community detection is now one of the hot topics in network science [For09]. The concept of a community in a network is usually derived from the common understanding of a community in social networks: a set of nodes that are more densely connected to each other than to the rest of the network.

In this research we study the community structure of the webgraphs of two academic organisations. The webgraphs are represented as directed weighted graphs: nodes are the websites of the organisations' units, and edge weights are the numbers of hyperlinks between them.

Data sets

We study two datasets representing the webgraphs of the Siberian Branch of the Russian Academy of Sciences (SB RAS), Russia, denoted G, and of the Fraunhofer-Gesellschaft (FG), Germany, denoted R. These datasets are naturally represented as directed weighted graphs $\Gamma = (V, E)$. Deleting particular nodes from these graphs gives two reduced graphs, called reduced G and reduced R respectively. The basic properties of all four graphs are presented in the table below.

Quantity            SB RAS (G)   reduced SB RAS   FG (R)   reduced FG
# of nodes (|V|)    95           79               72       71
# of edges (|E|)    949          297              321      180
diameter d          4            8                2        9

Table: Graph-theoretic properties
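The diameter row of the table can be reproduced with breadth-first search. The following is a minimal sketch, assuming that the diameter is the longest shortest directed path measured in hops (weights ignored) and that ordered pairs with no directed path are skipped; the poster does not state either convention. The function names and the toy graph are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every node reachable along directed edges."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    """Longest finite shortest-path length over all ordered node pairs.

    Assumption: pairs with no directed path between them are skipped.
    """
    nodes = set(adj) | {v for targets in adj.values() for v in targets}
    return max(max(bfs_distances(adj, s).values()) for s in nodes)

# Hypothetical toy graph: a -> b -> c -> d, plus a shortcut a -> c.
toy = {"a": ["b", "c"], "b": ["c"], "c": ["d"], "d": []}
n_nodes = len(toy)
n_edges = sum(len(targets) for targets in toy.values())
print(n_nodes, n_edges, diameter(toy))
```

One BFS per node is enough for graphs of this size (fewer than 100 nodes).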

Graph characteristics and measures

• BETWEENNESS CENTRALITY

Consider a directed graph $G = (V, E)$ and a node $v \in V(G)$. By the betweenness centrality $\mathrm{betw}(v)$ of a node $v$ we mean the sum

$$\mathrm{betw}(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}},$$

where $\sigma_{st}$ is the total number of directed shortest paths from node $s$ to node $t$ and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.

Remark. betw(v) shows the importance of node v in terms of routing and connectivity.
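The sum in the definition of betw(v) can be evaluated for all nodes at once with Brandes' algorithm. The sketch below is one possible implementation, not necessarily the procedure used for the poster; it assumes shortest paths are counted in hops (edge weights ignored) and that the graph has no parallel edges. The function name and the toy graph are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """betw(v) = sum over ordered pairs s != v != t of sigma_st(v) / sigma_st."""
    nodes = set(adj) | {v for targets in adj.values() for v in targets}
    betw = {v: 0.0 for v in nodes}
    for s in nodes:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in nodes}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Accumulate pair dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in nodes}
        for v in reversed(order):
            for u in preds[v]:
                delta[u] += sigma[u] / sigma[v] * (1.0 + delta[v])
            if v != s:
                betw[v] += delta[v]
    return betw

# Hypothetical toy graph: two parallel routes a -> {b, c} -> d, then d -> e.
toy = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
scores = betweenness(toy)
print(sorted(scores.items()))
```

In the toy graph node d lies on every shortest path that ends in e, so it receives the largest score, while b and c each carry half of the paths that start in a.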

• MODULARITY

Denote the node set as $V = \{1, 2, \dots, n\}$, let $i$ and $j$ be nodes, $w_{ij}$ the weight of the edge $(i, j)$, and $w_i = \sum_j w_{ij}$. Then the modularity can be defined as

$$Q = \frac{1}{w} \sum_i \sum_j \left( w_{ij} - \frac{w_i w_j}{w} \right) \delta(C_i, C_j) \in [-1, 1],$$

where $C_i$ is the community assigned to node $i$ and $\delta(C_i, C_j)$ is the Kronecker delta function, equal to 1 if nodes $i$ and $j$ are in the same community and 0 otherwise.

The unweighted version of modularity, $Q_{\mathrm{unweighted}}$, is obtained from $Q$ by forgetting the weight of every edge, i.e. every edge $(i, j)$ is assigned the new weight

$$w'_{ij} = \begin{cases} 1, & \text{if } w_{ij} \neq 0, \\ 0, & \text{else.} \end{cases}$$

Remark. Q shows the quality of the graph partition into communities.
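The two modularity scores can be evaluated directly from the formulas above. The sketch below follows the definition term by term; it assumes that w is the total edge weight (the sum of all w_i), which the poster does not state explicitly. The function names, the dictionary-based graph representation and the toy data are hypothetical.

```python
def modularity(weights, community):
    """Q = (1/w) * sum_i sum_j (w_ij - w_i * w_j / w) * delta(C_i, C_j).

    weights:   dict mapping a directed edge (i, j) to its weight w_ij
    community: dict mapping every node to its community label
    Assumption: w is the total edge weight, w = sum_i w_i.
    """
    strength = {i: 0.0 for i in community}      # w_i = sum_j w_ij
    for (i, _), w_ij in weights.items():
        strength[i] += w_ij
    total = sum(strength.values())              # w
    q = 0.0
    for i in community:
        for j in community:
            if community[i] == community[j]:    # delta(C_i, C_j) = 1
                q += weights.get((i, j), 0.0) - strength[i] * strength[j] / total
    return q / total

def unweighted(weights):
    """Replace every non-zero weight by 1 (w'_ij in the text)."""
    return {edge: 1.0 for edge, w_ij in weights.items() if w_ij != 0}

# Hypothetical toy graph: two heavily linked pairs joined by one weak link.
toy = {(1, 2): 5, (2, 1): 5, (3, 4): 5, (4, 3): 5, (2, 3): 1}
parts = {1: "A", 2: "A", 3: "B", 4: "B"}
q_weighted = modularity(toy, parts)
q_unweighted = modularity(unweighted(toy), parts)
print(round(q_weighted, 4), round(q_unweighted, 4))
```

For this toy graph the weighted score is higher than the unweighted one because most of the weight sits inside the two pairs; the same comparison is what the poster uses to judge how strongly a partition depends on the weights.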

Method

• The search for the best partition into communities was performed by modularity maximization using a combination of heuristic algorithms, mainly the tabu search algorithm [AFG08].

• The observed heterogeneity in degrees and betweenness centrality scores makes it difficult to reveal communities in the network. We introduce the reduced graphs (reduced G and reduced R) by deleting central nodes with high degree and betw(v) scores. The deleted nodes represent administrative organisations.

• We give both weighted and unweighted modularity scores for the best partitions obtained in the table below.

Graph        |V|   Q          Q unweighted
G            95    0.151552   0.150959
reduced G    79    0.671547   0.382059
R            72    0.130562   0.258062
reduced R    71    0.302629   0.562928

Table: Modularity scores for initial and reduced graphs
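To illustrate the idea of modularity maximization by tabu search named in the Method section, here is a deliberately simplified sketch. It is not the combination of heuristics from [AFG08] and not the code used for the poster: it only moves one node at a time, always takes the best admissible move even when Q decreases, and forbids moving the same node again for a few iterations. The modularity function assumes w is the total edge weight; all names, parameters and the toy graph are hypothetical.

```python
import random

def modularity(weights, community):
    """Q from the definition above; w is assumed to be the total edge weight."""
    strength = {i: 0.0 for i in community}
    for (i, _), w_ij in weights.items():
        strength[i] += w_ij
    total = sum(strength.values())
    q = 0.0
    for i in community:
        for j in community:
            if community[i] == community[j]:
                q += weights.get((i, j), 0.0) - strength[i] * strength[j] / total
    return q / total

def tabu_search(weights, nodes, iterations=50, tenure=3, seed=0):
    """Simplified tabu search over single-node moves (illustrative only)."""
    rng = random.Random(seed)
    current = {v: k for k, v in enumerate(nodes)}    # singleton communities
    best, best_q = dict(current), modularity(weights, current)
    tabu_until = {v: 0 for v in nodes}
    for step in range(1, iterations + 1):
        fresh = max(current.values()) + 1            # label of a new community
        moves = []
        for v in nodes:
            if tabu_until[v] >= step:
                continue                             # v was moved recently
            for label in set(current.values()) | {fresh}:
                if label != current[v]:
                    trial = dict(current)
                    trial[v] = label
                    moves.append((modularity(weights, trial), rng.random(), v, label))
        if not moves:
            continue
        # Best admissible move (random tie-break), accepted even if Q drops.
        q, _, v, label = max(moves, key=lambda m: (m[0], m[1]))
        current[v] = label
        tabu_until[v] = step + tenure
        if q > best_q:
            best, best_q = dict(current), q
    return best, best_q

# Hypothetical toy graph: two bidirected triangles joined by one link 2 -> 3.
toy = {}
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    toy[(a, b)] = 1.0
    toy[(b, a)] = 1.0
toy[(2, 3)] = 1.0
partition, score = tabu_search(toy, list(range(6)))
print(partition, round(score, 4))
```

Recomputing Q from scratch for every candidate move is wasteful but keeps the sketch short; a practical implementation would update Q incrementally.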

References

[AFG08] A. Arenas, A. Fernández, and S. Gómez. Analysis of the structure of complex networks at different resolution levels. New Journal of Physics, 10(5):053039, 2008.

[For09] Santo Fortunato. Community detection in graphs. CoRR, abs/0906.0612, 2009.

Acknowledgments

This work has been supported by the Interdisciplinary Integration Project of SB RAS N.21 under the title «Investigation of regularities and trends of self-organizing systems on the examples of the Web space and biological communities», Grant 12-01-00448 of the Russian Foundation of Basic Research and Grant NSh-1939.2014.1 of President of Russia for Leading Scientific Schools.

Statistics for SB RAS webgraphs

[Figure: histograms for the SB RAS webgraphs. Axes: score and number of nodes. Three pairs of panels: (a) degree distribution in G, (b) degree distribution in reduced G; (a) weighted degree distribution in G, (b) weighted degree distribution in reduced G; (a) betweenness centrality distribution in G, (b) betweenness centrality distribution in reduced G.]

Statistics for Fraunhofer-Gesellschaft webgraphs

[Figure: histograms for the Fraunhofer-Gesellschaft webgraphs. Axes: score and number of nodes. Three pairs of panels: (a) degree distribution in R, (b) degree distribution in reduced R; (a) weighted degree distribution in R, (b) weighted degree distribution in reduced R; (a) betweenness centrality distribution in R, (b) betweenness centrality distribution in reduced R.]

MM-HPC-2014. Text-mining and intelligent analysis of knowledge in databases e-mail: [email protected]


Siberian Branch of Russian Academy of Sciences, graph G

[Figure: graph G with nodes grouped by subject area: Physics; Math & computer science; Chemistry; Scientific centers; Geosciences; Humanities; Energy, mechanics, etc.; Biology; Nanotechnology & informatics. Node labels are the abbreviations expanded in the node list.]

Description. The original graph G is shown, with nodes grouped by subject area (based on the information on the SB RAS website). Edge thickness represents edge weight. The central administrative organisations are tightly connected to each other compared to the rest of the network. The modularity score is small, showing that nodes inside the partitions are loosely connected compared to the rest of the network.

Siberian Branch of Russian Academy of Sciences, reduced graph G

[Figure: reduced graph G partitioned into five communities, labelled Community 1 to Community 5. Node labels are the abbreviations expanded in the node list.]

Description. The reduced graph G is obtained by deleting the central administrative nodes together with the nodes labelled ИВТ СО РАН and ОУС СО РАН по НИТ, which have high degree and show no distinguishable community membership. The high modularity score (Q = 0.671547) supports the validity of the partition. The unweighted modularity score is much smaller (0.382059), so the partition depends strongly on the edge weights. We conclude that the community structure of the reduced graph G is distinguishable and that communities contain institutes from different subject areas, which indicates cross-institutional collaboration.
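The reduction step described above amounts to deleting a set of nodes together with all their incident edges. A minimal sketch follows, assuming that "degree" counts incoming plus outgoing edges and that the most connected node is the one to delete; the poster selects nodes by degree, betweenness and their administrative role, so this selection rule, the function names and the toy data are all hypothetical.

```python
def degrees(weights):
    """Number of incident edges (incoming plus outgoing) for every node."""
    deg = {}
    for i, j in weights:
        deg[i] = deg.get(i, 0) + 1
        deg[j] = deg.get(j, 0) + 1
    return deg

def remove_nodes(weights, to_delete):
    """Edge weights of the graph that remains after deleting the given nodes."""
    to_delete = set(to_delete)
    return {(i, j): w for (i, j), w in weights.items()
            if i not in to_delete and j not in to_delete}

# Hypothetical toy graph: "hub" links to every unit, units link among themselves.
toy = {("hub", "u1"): 40, ("hub", "u2"): 35, ("hub", "u3"): 30, ("hub", "u4"): 25,
       ("u1", "hub"): 10, ("u1", "u2"): 3, ("u2", "u1"): 2, ("u3", "u4"): 4}
deg = degrees(toy)
central = [v for v in deg if deg[v] == max(deg.values())]
reduced = remove_nodes(toy, central)
print(central, sorted(reduced))
```

After the hub is removed only the links between units remain, which is the situation in which the community structure of the reduced graphs becomes visible.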

Node list

АФ ЦСБС СО РАН  Botanical Garden SB RAS, Altai Branch
БИП СО РАН  Baikal Institute of Nature Management SB RAS
БНЦ СО РАН  Buryat Scientific Center SB RAS
ГИН СО РАН  Geological Institute SB RAS
ГПНТБ СО РАН  State Public Scientific-Technical Library SB RAS
ГС СО РАН  Geophysical Survey SB RAS
ЗСФ ИЛ СО РАН  ИЛ СО РАН, West Siberian Branch
ИАиЭ СО РАН  Institute of Automation and Electrometry SB RAS
ИАЭТ СО РАН  Institute of Archaeology and Ethnography SB RAS
ИБПК СО РАН  Institute for Biological Problems of the Cryolithozone SB RAS
ИБФ СО РАН  Institute of Biophysics SB RAS
ИВМ СО РАН  Institute of Computational Modelling SB RAS
ИВМиМГ СО РАН  Institute of Computational Mathematics and Mathematical Geophysics SB RAS
ИВТ СО РАН  Institute of Computational Technologies SB RAS
ИВЭП СО РАН  Institute for Water and Environmental Problems SB RAS
ИГ СО РАН  V.B. Sochava Institute of Geography SB RAS
ИГАБМ СО РАН  Institute of Geology of Diamond and Precious Metals SB RAS
ИГД СО РАН  N.A. Chinakal Institute of Mining SB RAS
ИГДС СО РАН  N.V. Chersky Institute of Mining of the North SB RAS
ИГиЛ СО РАН  M.A. Lavrentyev Institute of Hydrodynamics SB RAS
ИГМ СО РАН  V.S. Sobolev Institute of Geology and Mineralogy SB RAS
ИГХ СО РАН  A.P. Vinogradov Institute of Geochemistry SB RAS
ИДСТУ СО РАН  Institute for System Dynamics and Control Theory SB RAS
ИЗК СО РАН  Institute of the Earth's Crust SB RAS
ИИ СО РАН  Institute of History SB RAS
ИК СО РАН  G.K. Boreskov Institute of Catalysis SB RAS
ИКФИА СО РАН  Institute of Cosmophysical Research and Aeronomy SB RAS
ИЛ СО РАН  V.N. Sukachev Institute of Forest SB RAS
ИЛФ СО РАН  Institute of Laser Physics SB RAS
ИМ СО РАН  S.L. Sobolev Institute of Mathematics SB RAS
ИМБТ СО РАН  Institute for Mongolian, Buddhist and Tibetan Studies SB RAS
ИМЗ СО РАН  P.I. Melnikov Permafrost Institute SB RAS
ИМКБ СО РАН  Institute of Molecular and Cellular Biology SB RAS
ИМКЭС СО РАН  Institute of Monitoring of Climatic and Ecological Systems SB RAS
ИНГГ СО РАН  Institute of Petroleum Geology and Geophysics SB RAS
ИНХ СО РАН  A.V. Nikolaev Institute of Inorganic Chemistry SB RAS
ИНЦ СО РАН  Irkutsk Scientific Center SB RAS
ИОА СО РАН  V.E. Zuev Institute of Atmospheric Optics SB RAS
ИОЭБ СО РАН  Institute of General and Experimental Biology SB RAS
ИПА СО РАН  Institute of Soil Science and Agrochemistry SB RAS
ИПНГ СО РАН  Institute of Oil and Gas Problems SB RAS
ИППУ СО РАН  Institute of Hydrocarbon Processing Problems SB RAS
ИПРЭК СО РАН  Institute of Natural Resources, Ecology and Cryology SB RAS
ИПХЭТ СО РАН  Institute for Problems of Chemical and Energetic Technologies SB RAS
ИрИХ СО РАН  A.E. Favorsky Irkutsk Institute of Chemistry SB RAS
ИСИ СО РАН  A.P. Ershov Institute of Informatics Systems SB RAS
ИСиЭЖ СО РАН  Institute of Systematics and Ecology of Animals SB RAS
ИСЭ СО РАН  Institute of High Current Electronics SB RAS
ИСЭМ СО РАН  L.A. Melentiev Institute of Energy Systems SB RAS
ИТ СО РАН  S.S. Kutateladze Institute of Thermophysics SB RAS
ИТПМ СО РАН  Institute of Theoretical and Applied Mechanics SB RAS
ИУ СО РАН  Institute of Coal, Siberian Branch of RAS
ИУХМ СО РАН  Institute of Coal Chemistry and Chemical Materials Science
ИФ СО РАН  L.V. Kirensky Institute of Physics SB RAS
ИФЛ СО РАН  Institute of Philology SB RAS
ИФМ СО РАН  Institute of Physical Materials Science SB RAS
ИФП СО РАН  A.V. Rzhanov Institute of Semiconductor Physics SB RAS
ИФПМ СО РАН  Institute of Strength Physics and Materials Science SB RAS
ИФПР СО РАН  Institute of Philosophy and Law SB RAS
ИФТПС СО РАН  Institute of Physical-Technical Problems of the North SB RAS
ИХБФМ СО РАН  Institute of Chemical Biology and Fundamental Medicine SB RAS
ИХКГ СО РАН  Institute of Chemical Kinetics and Combustion SB RAS
ИХН СО РАН  Institute of Petroleum Chemistry SB RAS
ИХТТМ СО РАН  Institute of Solid State Chemistry and Mechanochemistry SB RAS
ИХХТ СО РАН  Institute of Chemistry and Chemical Technology SB RAS
ИЦиГ СО РАН  Institute of Cytology and Genetics SB RAS
ИЭОПП СО РАН  Institute of Economics and Organization of Industrial Production SB RAS
ИЯФ СО РАН  G.I. Budker Institute of Nuclear Physics SB RAS
КемНЦ СО РАН  Kemerovo Scientific Center SB RAS
КНЦ СО РАН  Krasnoyarsk Scientific Center SB RAS
КТИ ВТ СО РАН  Design and Technology Institute ВТ SB RAS
КТИ НП СО РАН  Design and Technology Institute НП SB RAS
КТФ ИГиЛ СО РАН  Design and Technology Branch of ИГиЛ СО РАН
ЛИН СО РАН  Limnological Institute SB RAS
МТЦ СО РАН  Institute "International Tomography Center" SB RAS
НИОХ СО РАН  Novosibirsk Institute of Organic Chemistry SB RAS
НФ ИВЭП СО РАН  Novosibirsk Branch of ИВЭП СО РАН
ОНЦ СО РАН  Omsk Scientific Center SB RAS
Отделение ГПНТБ СО РАН  Branch of ГПНТБ СО РАН in Akademgorodok
ОУС БИО СО РАН  Joint Scientific Council of SB RAS for Biological Sciences
ОУС СО РАН по НИТ  Joint Scientific Council of SB RAS for Nanotechnology and Information Technology
ОФ ИМ СО РАН  Omsk Branch of ИМ СО РАН
Портал СО РАН  Portal of the Siberian Branch of RAS
Портал СОРАН.ИНФО  Portal СОРАН.ИНФО
Президиум СО РАН  Presidium of SB RAS
СИФИБР СО РАН  Siberian Institute of Plant Physiology and Biochemistry SB RAS
СКТБ Наука КНЦ СО РАН  Special Design and Technology Bureau "Nauka" of КНЦ СО РАН
ТНЦ СО РАН  Tomsk Scientific Center SB RAS
ТФ ИТПМ СО РАН  Tyumen Branch of ИТПМ СО РАН
ТюмНЦ СО РАН  Tyumen Scientific Center SB RAS
ЦСБС СО РАН  Central Siberian Botanical Garden SB RAS
ЯНЦ СО РАН  Yakut Scientific Center SB RAS


Fraunhofer-Gesellschaft, graph R

[Figure: graph R with nodes grouped by research group: Light & Surfaces; Information & Communication Technology; Materials & Components; Life Sciences; Defense & Security; Production; Microelectronics. Node labels are the institute abbreviations expanded in the node list.]

Description. The original graph R is shown, with nodes grouped by research group (based on the information on the Fraunhofer website). Edge thickness represents edge weight. There is one central administrative unit connected to every other node; however, the connection to other central nodes is loose. The modularity score is small, showing that nodes inside the partitions are loosely connected compared to the rest of the network.

Fraunhofer-Gesellschaft, reduced graph R

[Figure: reduced graph R, showing the detected communities and a separate group labelled "Unassigned vertices". Node labels are the institute abbreviations expanded in the node list.]

Description. The reduced graph R is obtained by deleting the central administrative node ZV. The partition obtained has a decent modularity score (Q = 0.302629) but leaves some nodes unassigned to any community. Notably, institutes are highly connected to their regional departments. The unweighted modularity score is higher (0.562928), so the partition is distorted by the weights. We conclude that the community structure of the reduced graph R is less transparent.

Node list

AISEC Applied and Integrated Security

EMB Marine Biotechnology

EMFT Modular Solid State Technologies

EMI High-Speed Dynamics, Ernst-Mach-Institut

ENAS Electronic Nano Systems

ESK Embedded Systems and Communication Technologies

FEP Electron Beam and Plasma Technology

FHR High Frequency Physics and Radar Techniques

FIT Applied Information Technology

FKIE Communication, Information Processing and Ergonomics

FOKUS Open Communication Systems

FG Fraunhofer-Gesellschaft

HHI Telecommunications, Heinrich-Hertz-Institut

IAF Applied Solid State Physics

IAIS Intelligent Analysis and Information Systems

IAO Industrial Engineering

IAP Applied Polymer Research

IBMT Biomedical Engineering

IBP Building Physics

ICT Chemical Technology

IDMT Digital Media Technology

IESE Experimental Software Engineering

IFAM Manufacturing Technology and Advanced Materials

IFAM-DD Manufacturing Technology and Advanced Materials, Dresden

IFF Factory Operation and Automation

IGB Interfacial Engineering and Biotechnology

IGD Computer Graphics Research

IIS Integrated Circuits

IIS-EAS Integrated Circuits – Design Automation Division EAS

IISB Integrated Systems and Device Technology

IKTS Ceramic Technologies and Systems

ILT Laser Technology

IME Molecular Biology and Applied Ecology

IML Material Flow and Logistics

IMS Microelectronic Circuits and Systems

INT Technological Trend Analysis

IOF Applied Optics and Precision Engineering

IOSB Optronics, System Technologies and Image Exploitation

IPA Manufacturing Engineering and Automation

IPK Production Systems and Design Technology

IPM Physical Measurement Techniques

IPMS Photonic Microsystems

IPT Production Technology

IRB Information Center for Planning and Building

ISC Silicate Research

ISE Solar Energy Systems

ISI Systems and Innovation Research

ISIT Silicon Technology

ISST Software and Systems Engineering

IST Surface Engineering and Thin Films

ITEM Toxicology and Experimental Medicine

ITWM Industrial Mathematics

IVI Transportation and Infrastructure Systems

IVV Process Engineering and Packaging

IWES Wind Energy and Energy System Technology

IWM Mechanics of Materials

IWS Material and Beam Technology

IWU Machine Tools and Forming Technology

IZFP Non-Destructive Testing

IZFP-D Non-Destructive Testing, Dresden dept.

IZI Cell Therapy and Immunology

IZM Reliability and Microintegration

LBF Structural Durability and System Reliability

MEVIS Medical Image Computing

MOEZ Central and Eastern Europe

PYCO Polymeric Materials and Composites

SCAI Algorithms and Scientific Computing

SIT Secure Information Technology

UMSICHT Environmental, Safety and Energy Technology

UMSICHT-ATZ UMSICHT, Sulzbach-Rosenberg dept.

WKI Wood Research, Wilhelm-Klauditz-Institut

ZV Fraunhofer Gesellschaft headquarters
