Topics as Clusters of Citation Links to Highly Cited Sources: The Case of Research on International Relations

Following Henry Small in his approach to co-citation analysis, highly cited sources are seen as concept symbols of research fronts. But instead of co-cited sources I cluster citation links, which are the thematically least heterogeneous elements in bibliometric studies. To obtain clusters representing topics characterised by concepts, I restrict link clustering to citation links to highly cited sources. Clusters of citation links between papers in a political-science subfield (International Relations) and 300 of their sources most cited in the period 2006-2015 are constructed by a local memetic algorithm. It finds local minima in a cost landscape corresponding to clusters, which can overlap each other pervasively. The clusters obtained are well separated from the rest of the network but can have suboptimal cohesion. Cohesive cores of topics are found by applying an algorithm that constructs core-periphery structures in link sets. In this methodological paper I only discuss some first clustering results for the second half of the 10-year period.


Introduction
If a topic is defined as a focus on scientific knowledge shared by a number of researchers then topics should manifest themselves in clusters of co-cited sources, because cited sources represent theoretical, methodological or empirical knowledge used or at least discussed by citing authors.
Topics can overlap in papers and even more in books if they deal with more than one topic.
Another kind of overlap can occur on the level of content of topics: shared knowledge itself can be in the foci of researchers working on different topics. We therefore need a clustering algorithm that delivers overlapping clusters.
When topics are represented as disjoint clusters of co-cited sources then they overlap in papers that cite sources in different clusters. But a cited source can also correspond to more than one topic. We therefore have to allow for overlapping clusters of cited sources, which to the best of my knowledge have not been produced in any co-citation analysis so far.
Co-citation analysis was independently proposed by Irina Marshakova (1973) and by Henry Small (1973). Small (1978) also introduced the notion of concept symbols represented by highly cited sources, for which co-citation clusters are constructed. By adding the papers that cite concept symbols in a co-citation cluster we augment the picture of the corresponding research front (Garfield 1985). Co-citation analysis is the usual approach to clustering concept symbols in citation networks but not the only possible one. I propose, instead, to cluster citation links from papers to concept symbols. Link clustering in the bipartite network of citing papers and cited sources avoids the projection onto the co-citation graph of sources and any need for normalising and thresholding co-citation strength. From clusters of citation links between papers and sources, overlapping clusters of citing research-front papers and of cited concept symbols can be deduced. Thus, we obtain overlapping clusters of highly cited sources that are connected through papers that co-cite them. 1
1 Note that here link clustering is not applied to co-citation links between concept symbols but to citation links in the bipartite network of citing papers and cited sources.
Among several clustering methods that allow for overlapping clusters, link clustering has an important advantage when applied to citation networks: citation links are the thematically least heterogeneous elements in bibliometric studies. In nearly all cases, a paper cites a source due to only one knowledge claim. Even when a paper refers to two or more knowledge claims in a cited source they often belong to one topic, especially if we search for larger and more general topics as is done here by restricting link clustering to citation links between papers and highly cited sources.
The topic definition and the link clustering approach applied here have recently been discussed by Havemann, Gläser, and Heinz (2017). In that paper, a new evaluation function for link clusters, Ψ, and a local memetic algorithm for link clustering based on this function, PsiMinL, were proposed and tested for two kinds of citation networks: a network of direct citations in a set of astronomy papers published within eight years, and a bipartite network of one volume of these papers and all their cited sources. I here also apply PsiMinL to a bipartite network of papers and sources but restrict the set of sources to highly cited ones.
Clustering links in networks instead of nodes had been introduced by Evans and Lambiotte (2009) and by Ahn, Bagrow, and Lehmann (2010). In both approaches graphs are partitioned into disjoint clusters of links. From them overlapping clusters of nodes are deduced. In contrast to these global methods, PsiMinL evaluates each link cluster in a local manner independently from other clusters. It therefore can produce clusters that overlap each other pervasively, i.e., not only in their boundaries but also in inner links and nodes. A local evaluation of clusters also matches the local character of topics (Havemann et al. 2017).
Clusters or communities in networks are considered as highly cohesive subgraphs that are well separated from the rest of the network (Fortunato 2010). There are cases where these two features of communities cannot be maximised at the same time. Methods can be classified with regard to producing well separated or well connected communities (Rosvall, Delvenne, Schaub, and Lambiotte 2019). Like several other algorithms, PsiMinL delivers clusters that can have low cohesion, i.e., they can easily be split into two or more sub-clusters. This bias of the algorithm stems from the evaluation function Ψ(L) for a cluster given as a link set L: it measures separation and is much less sensitive to changes in cohesion (Havemann, Gläser, and Heinz 2019).
The evaluation function Ψ(L) allows for lowly cohesive clusters but that does not hinder its use for an evaluation of topic clusters. Not all knowledge in a shared focus has to be cited in all papers that contribute to the corresponding topic. Only those sources have to be cited that are used for the production of new knowledge. Although authors often cite other sources, too, we cannot expect that all sources in a cluster are cited in all papers contributing to the topic.
Clusters of highly cited sources that represent topics have to be well separated but can have low internal cohesion.
A second argument for favouring well separated clusters is the hierarchical structure of sets of topics. A topic can have sub-topics, i.e., the splitting of its cluster should not be too difficult. Two topics can also overlap in one subtopic. Then we have no strict hierarchy but a poly-hierarchy (Havemann et al. 2017).
Nonetheless, we are interested in cohesive cores of topics corresponding to dense subgraphs of citation networks that are not necessarily well separated from the rest of the network. To extract such dense cores from a well separated link cluster an algorithm was proposed recently by Havemann et al. (2019). The CPLC-algorithm finds core-periphery structures of link clusters.
The analysis reported here was made within the Global Pathways project. The aim of this project is to identify topic-based, language-based and regional or national substructures in research on international relations (IR).
I must leave all conclusions regarding the content structure of IR research to a forthcoming paper enriched with the project team's IR competence (Risse, Wemheuer-Vogelaar, and Havemann 2020). I here present results of a test of the proposed approach. The focus of the paper is on methodological challenges.

Data
For the analysis of IR literature within the Global Pathways project, we wanted to obtain a set of papers in Web of Science (WoS) that prioritises recall over precision. The time span for all downloads was 2006-2015. We started from 115 journals indexed in the WoS category International Relations and added four journals from Political Science. In the following, these journals are referred to as IR journals. We also searched for book chapters in the Book Citation Index of WoS that are categorised as International Relations.
All documents of those types that are usually published to communicate new research results, namely articles, letters, and proceedings papers (original papers), were downloaded, in addition also reviews and book reviews. WoS also offers access to SciELO (Scientific Electronic Library Online, a database mainly covering publications from Latin American countries). From SciELO, records categorised as International Relations were downloaded, too. The list of journals (Table 7 on p. 22) and further details of data can be found in Appendix (p. 16).
After identifying references automatically, as described in the Appendix, the 300 most highly cited sources were selected. I searched manually for further references in the data set that could be identified with them. Here references to different pages and editions of books were identified. The list of top-300 sources can be found in the Appendix (Table 9 on p. 23). 203 of them have been classified as dealing with IR themes (Table 3 on p. 17 in the Appendix). Experiments with clustering smaller numbers of concept symbols revealed that, when approaching 300 highly cited sources, only peripheral topics were added while the central topic clusters had become stable.

Methods
In the following I will discuss some essential elements of the two algorithms applied. Reading these sections is useful for understanding the design of the experiments and their results. Readers who are not interested in methodological details can skip the next sections and proceed with the results in section 4.1. Further details can be found in the two papers mentioned above.

Link clustering: PsiMinL algorithm
In one sentence, PsiMinL is an evolutionary algorithm that searches in a cost landscape for local minima that correspond to well separated link clusters. Because genetic operators (mutation, crossover, and selection) are combined with deterministic local searches in the cost landscape, PsiMinL can be called a memetic algorithm (Neri, Cotta, and Moscato 2012). A PsiMinL glossary is in the Appendix (p. 17).
Each possible link set L corresponds to a place in the cost landscape; the height of place L is given by the cost function, the normalised node-cut Ψ(L) of link set L. A lower value of Ψ(L) signals a better separated link set L. Normalised node-cut can be defined as
$$\Psi(L) = \frac{\sigma(L)}{k_{\mathrm{in}}(L)} + \frac{\sigma(L)}{2m - k_{\mathrm{in}}(L)} \qquad (1)$$
with
$$\sigma(L) = \sum_{i=1}^{n} \frac{k_i^{\mathrm{in}}(L)\,\bigl(k_i - k_i^{\mathrm{in}}(L)\bigr)}{k_i},$$
where $k_i$ is the degree of node $i$, $k_i^{\mathrm{in}}(L)$ its internal degree with respect to link set $L$, and $k_{\mathrm{in}}(L) = \sum_{i=1}^{n} k_i^{\mathrm{in}}(L) = 2|L|$. Index $i$ runs through all $n$ nodes but $k_i^{\mathrm{in}}(L) = 0$ for all nodes that are not attached to a link in $L$. Set $E$ includes all $m$ edges. Note that $\sigma(L) = \sigma(E - L)$.
Deriving their link-clustering approach, Evans and Lambiotte (2009) introduced a random link-node-link walker. The first summand on the right-hand side of Equation 1 is the probability of such a walker sitting on a link in L to escape from L and the second summand is the escape probability for the complement of L (Havemann et al. 2019). Further motivations for using the Ψ-function were given by Havemann et al. (2017).
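As a minimal sketch (not the PsiMinL R package), the cost of a link set can be computed as the sum of the two escape probabilities described above. The function name `node_cut_cost`, the edge-list representation, and the toy graph are assumptions made for illustration. The walker is assumed to pick one of the two end nodes of its link with equal probability and then any link attached to that node, including the one it came from.

```python
from collections import Counter


def node_cut_cost(links, cluster):
    """Sum of the escape probabilities of a link-node-link walker for the
    link set `cluster` and for its complement within the edge list `links`."""
    links = [tuple(e) for e in links]
    cluster = {tuple(e) for e in cluster}
    if not cluster or not cluster < set(links):
        raise ValueError("cluster must be a non-empty proper subset of links")

    degree = Counter()    # number of links attached to each node
    internal = Counter()  # number of those links that belong to the cluster
    for e in links:
        for node in e:
            degree[node] += 1
            if e in cluster:
                internal[node] += 1

    m = len(links)
    k_in = 2 * len(cluster)  # every link in the cluster has two end nodes
    # Per node: internal degree times external degree, divided by degree.
    sigma = sum(internal[i] * (degree[i] - internal[i]) / degree[i]
                for i in internal)
    return sigma / k_in + sigma / (2 * m - k_in)


# Toy graph (illustrative only): two triangles joined by the link c-d.
toy_links = [("a", "b"), ("b", "c"), ("c", "a"),
             ("d", "e"), ("e", "f"), ("f", "d"), ("c", "d")]
triangle = {("a", "b"), ("b", "c"), ("c", "a")}
print(node_cut_cost(toy_links, triangle))
```

In this toy graph only node c has links on both sides of the boundary, so the cost stays low. Calling the function on the complement of `triangle` gives the same value, which illustrates that the cost does not distinguish a link set from its complement.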
A connected link set L that corresponds to a local minimum in the cost landscape is called a link cluster or a link community. The cost landscape is very rough, i.e., there are many local minima that differ only in a few links. We are interested in well separated link sets that differ from any better separated set in more than only some links. Therefore we need a resolution parameter r. It is used to decide whether we can consider a link set L as a valid community. If there is a link set L′ with Ψ(L′) < Ψ(L) and the two link sets differ in fewer than r|L| links then L′ makes L invalid. In other words, we search for local minima with no lower place in the landscape within a radius r|L|.
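The validity criterion can be written down in a few lines. This is a sketch under assumptions: the name `invalidates` is hypothetical, the difference between two link sets is measured as the size of their symmetric difference, and the cost function is passed in by the caller.

```python
def invalidates(candidate, cluster, cost, r):
    """Return True if `candidate` is better separated than `cluster`
    and lies within the resolution radius r * |cluster|."""
    candidate, cluster = frozenset(candidate), frozenset(cluster)
    distance = len(candidate ^ cluster)  # links in exactly one of the two sets
    return cost(candidate) < cost(cluster) and distance < r * len(cluster)


# Toy example (illustrative only): a cost that simply falls with set size.
toy_cost = lambda link_set: 1 / len(link_set)
cluster = set(range(12))
candidate = set(range(14))  # differs from `cluster` in two links

print(invalidates(candidate, cluster, toy_cost, r=1 / 3))   # radius 4
print(invalidates(candidate, cluster, toy_cost, r=1 / 20))  # radius 0.6
```

With the coarse resolution the nearby better set invalidates the cluster; with the fine resolution the two sets count as distinct places in the landscape.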
A local search in PsiMinL is done by greedily including neighbouring links to a connected link set L or by excluding links from L that are attached to boundary nodes. Here I have implemented a procedure that tries to lower cost in an alternating sequence of link exclusion and inclusion until no further improvement is possible.
A simple local search, done by going downhill in the cost landscape, is soon trapped in the next local minimum. We allow the greedy algorithm in local searches to proceed even when the costs are rising. It stops and goes back to the place $L_{\min}$ of the last cost minimum in the search if it does not find a place with lower cost after $r|L_{\min}|$ steps, i.e., if it does not find a link set which makes $L_{\min}$ invalid. In other words, the local search can tunnel through barriers in the cost landscape if the end of the tunnel is not too far. Then the link set at the end of the tunnel invalidates the cluster at the tunnel entry.
In memetic algorithms deterministic local searches are combined with evolutionary genetic operators, i.e., with mutation, crossover, and selection. We need randomness because even tunnelling does not avoid trapping of local searches in local minima corresponding to invalid communities. A population is initialised from a seed subgraph by a local search followed by mutations and again local searches until the desired number of different individuals is reached. Mutation and crossover are used to explore the cost landscape around a preliminarily valid cluster at a local minimum that corresponds to the current best individual of a population.
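The tunnelling rule for local searches can be illustrated with a generic greedy search. This is a simplified sketch, not the PsiMinL implementation: the function `tunnelling_search`, the caller-supplied `neighbours` function, the `visited` set that prevents walking in circles, and the toy landscape are all assumptions made for illustration.

```python
def tunnelling_search(start, neighbours, cost, r):
    """Greedy search that may climb for up to r * |L_min| steps
    before it gives up and returns the last cost minimum L_min."""
    current = frozenset(start)
    best, best_cost = current, cost(current)
    steps_since_best = 0
    visited = {current}
    while steps_since_best < r * len(best):
        candidates = [n for n in neighbours(current) if n not in visited]
        if not candidates:
            break
        current = min(candidates, key=cost)  # greedy step, even if cost rises
        visited.add(current)
        steps_since_best += 1
        if cost(current) < best_cost:        # end of a tunnel: new minimum
            best, best_cost = current, cost(current)
            steps_since_best = 0
    return best


# Toy landscape (illustrative only): the cost depends on the size of the set,
# with a barrier at size 4 between the minima at sizes 3 and 5.
UNIVERSE = range(10)
COST_BY_SIZE = {1: 0.95, 2: 0.9, 3: 0.5, 4: 0.6, 5: 0.4,
                6: 0.7, 7: 0.8, 8: 0.85, 9: 0.9, 10: 0.99}


def toy_cost(link_set):
    return COST_BY_SIZE[len(link_set)]


def toy_neighbours(link_set):
    grown = [link_set | {x} for x in UNIVERSE if x not in link_set]
    shrunk = [link_set - {x} for x in link_set if len(link_set) > 1]
    return grown + shrunk


start = frozenset({0, 1, 2})
print(len(tunnelling_search(start, toy_neighbours, toy_cost, r=0.3)))  # 3
print(len(tunnelling_search(start, toy_neighbours, toy_cost, r=0.5)))  # 5
```

With r = 0.3 the radius allows only one uphill step, so the search returns to its start. With r = 0.5 it takes two steps, passes the barrier, and ends in the lower minimum.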
If two clusters have well separating boundaries their intersection and their union could also have such a boundary. Therefore, offspring is made from intersection and union of parents. As one parent the current best individual is chosen, the other one is selected among those individuals that have large genetic distance (measured as set difference) from the best individual. After mutations and crossovers (both followed by local searches) the best individuals are selected for the next generation.
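A crossover step of this kind can be sketched as follows. The names are hypothetical, genetic distance is taken here as the size of the symmetric difference, and the partner is simply the most distant individual; the text only says that it is selected among distant individuals. Connectivity of the offspring is not checked in this sketch.

```python
def genetic_distance(a, b):
    """Number of links that belong to exactly one of the two link sets."""
    return len(frozenset(a) ^ frozenset(b))


def make_offspring(population, cost):
    """Cross the best individual with the most distant one and return
    their intersection and union (empty intersections are dropped)."""
    best = min(population, key=cost)
    others = [p for p in population if p != best]
    partner = max(others, key=lambda p: genetic_distance(best, p))
    children = [best & partner, best | partner]
    return [child for child in children if child]


# Toy population (illustrative only): link sets given as sets of link ids.
population = [frozenset({1, 2, 3, 4}), frozenset({3, 4, 5}), frozenset({1, 2, 3})]
toy_cost = lambda link_set: abs(len(link_set) - 4)
print(make_offspring(population, toy_cost))
```

In a full memetic step each child would then be the starting point of a local search, and the best resulting individuals would be selected for the next generation.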
The memetic algorithm PsiMinL was implemented as an R-package with parallel procedures for all members of a population that undergoes an evolution. Because each cluster is evaluated independently from all other ones, several evolutions starting from different seed subgraphs can run in parallel, too. As seed subgraphs one can use clusters obtained from any fast clustering algorithm. The set of all valid clusters is totally independent from the set of seeds used to find them but there is no guarantee for finding all valid clusters with a given seed set.
Different runs of PsiMinL starting from the same seed can end in different local minima of the cost landscape. Tests of PsiMinL on the cost landscape of a large citation network of eight years of astronomy papers (Havemann et al. 2017) show two typical cases of path bifurcation. The algorithm can run into different hollows, or it ends at different places in the same hollow. In the second case, distances between different minima were found to be small, often much smaller than the resolution radius r|L|. That means, we can assume that further runs of PsiMinL improve and change a result only slightly.
To keep track of the many experiments necessary for finding as many valid clusters as possible in a network, it is convenient to ensure that a local search that starts from a mutant or from an offspring of the current best cluster $L_0$ and ends in a better one invalidates $L_0$. Consequently, if the first place on the path downhill with a cost Ψ < Ψ($L_0$) is not within a radius $r|L_0|$, then the local search is stopped and the individual link set is excluded from further evolution.
PsiMinL has many parameters (population size, mutation variances and rates, number of crossovers, etc.) but only resolution r influences the results. All other parameters only have influence on the time needed to obtain them.
Recently, Gabardo, Berretta, and Moscato (2020) have proposed a new memetic algorithm for global link clustering resulting in overlapping communities of nodes. They evaluate whole disjoint link partitions with the density metric proposed by Ahn et al. (2010). Chalupa, Hawick, and Walker (2018) have tested different crossover operators combined with deterministic and randomised variants of local search for finding bottlenecks in networks that correspond to minima of conductance Φ, an evaluation function that favours well separated subgraphs in the world of node clustering as normalised node-cut Ψ does for link clustering. They found "sparse imbalanced cuts into a community and the rest of the network, as well as relatively balanced partitions" (p. 28 in preprint version). Like that of Lu, Hao, and Wu (2020) but in contrast to PsiMinL, their algorithm randomly selects genes of parents for offspring clusters and applies mutation only for population initialisation. Further papers related to algorithm PsiMinL are referred to by Havemann et al. (2017, p. 1095). Evolutionary algorithms used for detecting communities in networks have been reviewed by Clara Pizzuti (2017).
Like conductance Φ, normalised node-cut Ψ neglects the direction of links. Thus, applying it to a bipartite network of papers and their cited sources means that papers and sources are treated symmetrically.

Cores and peripheries of link clusters: CPLC algorithm
CPLC constructs core-periphery structures (named towns, for short) in a given link set as nested subgraphs with decreasing cohesion. Large star subgraphs have a high local density of links. This density notion is the translation of usual graph density into the world of link clustering (Havemann et al. 2019, p. 5). For a recent review of algorithms for core-periphery construction see the paper by Tang, Zhao, Liu, Liu, and Yan (2019). In our case the largest stars are highly cited sources with their incoming citation links. A town is defined as a size ordered cluster of stars where two stars are never indirectly connected via smaller stars only. To illustrate this definition, we can imagine size of stars as height of hills. Then all smaller stars of a town can be reached from the largest one on a path that is never going uphill.
A star is connected to a town if it shares a minimum number of outer nodes with the set of town stars of equal or larger size; otherwise it becomes the centre of an independent town. The minimum number of outer nodes is determined by a resolution parameter q with 0 ≤ q < 1, which is used as a minimum threshold of relative overlap for a star to be attached to a town.
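The attachment rule can be illustrated with a deliberately simplified greedy reading; it is not the CPLC algorithm of Havemann et al. (2019), whose details differ. Assumptions in this sketch: stars are given as a mapping from centre to a non-empty set of outer nodes, relative overlap is the share of a star's outer nodes that it shares with the outer nodes of a town, a star joins the town with the largest overlap if that overlap exceeds q, and the name `build_towns` is hypothetical.

```python
def build_towns(stars, q):
    """Group stars into towns. `stars` maps a centre to its set of outer
    nodes; returns a list of towns, each a list of centres by size."""
    towns = []  # each town: {"centres": [...], "outer": set of outer nodes}
    # Visit stars from largest to smallest, so that every town only
    # contains stars of equal or larger size when a new star is tested.
    for centre in sorted(stars, key=lambda c: (-len(stars[c]), str(c))):
        outer = stars[centre]
        best_town, best_overlap = None, 0.0
        for town in towns:
            overlap = len(outer & town["outer"]) / len(outer)
            if overlap > best_overlap:
                best_town, best_overlap = town, overlap
        if best_town is not None and best_overlap > q:
            best_town["centres"].append(centre)
            best_town["outer"] |= outer
        else:
            # Too little overlap: the star becomes the centre of a new town.
            towns.append({"centres": [centre], "outer": set(outer)})
    return [town["centres"] for town in towns]


# Toy stars (illustrative only): centres A, B, C with citing papers 1..9.
stars = {"A": {1, 2, 3, 4, 5, 6}, "B": {5, 6, 7, 8, 9}, "C": {1, 2, 3}}
print(build_towns(stars, q=0))    # [['A', 'B', 'C']]
print(build_towns(stars, q=0.5))  # [['A', 'C'], ['B']]
```

Raising q from 0 to 0.5 splits off star B, which shares only two of its five outer nodes with the town around A.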
Instead of arbitrarily setting parameter q, its whole range is explored by starting with minimal resolution q = 0 and increasing it recursively to a value at which it is possible to obtain at least one more town in the given link set. To choose a resolution level at which useful core-periphery structures are constructed, different criteria can be applied. One can, e.g., consider towns at a level where the two largest stars in the link set are centres of different towns.
Towns of clusters can also be used to construct appropriate small seed subgraphs for PsiMinL.

Link clustering
I divided the period 2006-2015 into two 5-year periods for two reasons. First, five years are enough to diminish the influence of random fluctuations of citation data. Second, a comparison of the two 5-year periods can then be made. 6
Any paper that cites only one of the top-300 sources can be neglected when clusters of them are constructed. For clustering citation links to these sources, PsiMinL only needs papers that cite at least two of them. For 2006-2010 there are 4,778 such papers, and for the last five years there are 6,494. Only papers in IR journals and books were included.
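The selection of citation links for clustering can be sketched as a simple filter. The function name and the representation of links as (paper, source) pairs are assumptions made for illustration.

```python
from collections import defaultdict


def links_for_clustering(citation_links, top_sources):
    """Keep citation links to top sources, but only from papers
    that cite at least two different top sources."""
    top_sources = set(top_sources)
    cited = defaultdict(set)  # paper -> top sources it cites
    for paper, source in citation_links:
        if source in top_sources:
            cited[paper].add(source)
    return sorted((paper, source)
                  for paper, sources in cited.items() if len(sources) >= 2
                  for source in sources)


# Toy data (illustrative only).
toy_links = [("p1", "A"), ("p1", "B"), ("p2", "A"),
             ("p3", "B"), ("p3", "C"), ("p3", "X")]
print(links_for_clustering(toy_links, top_sources={"A", "B", "C"}))
```

Paper p2 is dropped because it cites only one top source, and the link from p3 to X is dropped because X is not among the selected sources.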
Seed subgraphs were made from disjoint clusters of cited sources that have been obtained by applying Ward clustering to the co-citation network of top-300 sources. Distances were calculated from the similarity of views (Gläser, Heinz, and Havemann 2015). 7 Usually, an optimal cut through the whole dendrogram of a hierarchical clustering is chosen to get a partition of a network. I have tested this approach to seed construction but starting from 15 middle-sized seeds, most evolutions had a long path to go through the cost landscape: resulting clusters have sizes very different from their seeds (cf. Appendix, p. 20). Clusters of one cut through the dendrogram are not well suited as seed subgraphs for an algorithm that results in a poly-hierarchy of clusters. Therefore I have applied an alternative method: for different numbers of clustered top-300 sources, Ward clusters with longest branches in the dendrogram were selected for constructing seed subgraphs for link clustering. A Ward cluster has a long branch if it has relatively low variance and if the next
6 In addition, IR experts can better compare clusters obtained for this period with the results obtained by Kristensen (2018) who analysed author co-citation in IR papers published in 2011-2015.
7 Ward clustering of views was made by Michael Heinz. Its results can be downloaded as R-object cv.RObj from https://zenodo.org/record/4181930 (Havemann 2020).
For any selected Ward cluster of co-cited sources the set of citation links to all its sources was used as a seed subgraph for link clustering. PsiMinL first makes a deterministic local search starting from a seed and then an evolutionary search. For a second run of memetic search I made additional seed subgraphs from intersections and unions of valid clusters. I also used selected core-periphery structures constructed by applying CPLC on valid clusters as seed subgraphs.
8 Branch length measures cluster quality (Havemann, Gläser, Heinz, and Struck 2012, p. 8).
9 In addition, the set of seeds has been extended by including further 23 Ward clusters with shorter branches in the dendrogram. Results are in Appendix on p. 17.
In all previous experiments we had fixed the resolution parameter on one level: r = 1/3. Here I allowed for several levels of resolution. First, resolution parameter r = 1/20 was chosen, which separates all clusters that differ in at least 1/20 of their links. For each seed, 16 independent evolutions were started with populations of eight individual connected subgraphs given as link sets. An evolution was stopped when during 100 generations the best individual could not be improved. In the next phase, the eight best of 16 resulting individuals formed a new population. This was repeated until most of the 16 evolutions gave the same result. Then, the whole procedure was repeated but now with a larger resolution parameter r and using the results of the first run as seeds. I made such iterations on resolution levels with r = 1/10, 1/5, 1/4, 1/3. In each step of iteration a stronger condition for validity was applied than in the step before. All valid link clusters for, e.g., r = 1/4 are also valid for r = 1/5 but not the other way round.
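The nesting of validity across resolution levels follows directly from the definition of the resolution radius. A short worked argument is given below; the notation $\mathrm{valid}_r$ and the use of the symmetric difference $\triangle$ to measure how much two link sets differ are introduced here only for illustration.

```latex
\[
\mathrm{valid}_r(L) \iff \neg\exists\, L' :\;
  \Psi(L') < \Psi(L) \;\wedge\; |L \,\triangle\, L'| < r\,|L| .
\]
\[
r_1 < r_2 \;\Rightarrow\;
  \{L' : |L \,\triangle\, L'| < r_1 |L|\} \subseteq \{L' : |L \,\triangle\, L'| < r_2 |L|\}
  \;\Rightarrow\;
  \bigl(\mathrm{valid}_{r_2}(L) \Rightarrow \mathrm{valid}_{r_1}(L)\bigr).
\]
```

With $r_1 = 1/5$ and $r_2 = 1/4$ this gives the statement above. The converse fails because a better separated link set may lie at a distance of at least $|L|/5$ but less than $|L|/4$ from $L$.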
The workflow of the whole procedure including pre- and post-processing is visualised in Fig.
To give an impression of memetic evolution, the search path starting from a large seed is described and visualised in Appendix (p. 18). Figure 2 shows costs Ψ and sizes of all 27 selected Ward seed-subgraphs, of results of initial local searches and of memetic searches on intermediary resolution levels, and of 11 resulting clusters on the final resolution level (r = 1/3). Each seed is connected by a line with its intermediary results and its final cluster. The colours of lines are equal for all evolutions with the same final cluster. Cluster L (the largest one), e.g., is reached by starting memetic evolution from two large seeds with identifiers 297 and 298 (cf. Figure 8 in Appendix, p. 18). Seed 298 is the largest seed (175 of top-300 sources) and includes seed 297 (103 sources, see Figure 7 in Appendix, p. 18). There are 171 sources with more than 95 % of their citation links in L; 160 of them are also in seed 298 (91 % of 175).
Clusters TL and TR are not valid on the final resolution level but are valid for r = 1/10 and r = 1/4, respectively. On the next levels, PsiMinL found a path through the cost landscape that ends in clusters TLC and R, respectively.
The first part of Table 1 lists data of all 13 clusters that have been reached from any of the selected 27 seeds. For clusters reached from more than one seed, the first column gives the id number of the seed that is nearest in size to the final cluster. In some cases, different evolutions ended up in slightly different variants of a cluster. The best one invalidates the other variants.
According to the definition in Equation 1 the cost function is equal for a link set and for its complement. Therefore, each complement of a cluster is also a cluster if it is a connected subgraph. Indeed, the largest valid cluster (with more than half of all links) is the complement of the second largest one: L = E − R. Complements of small subgraphs are nearly as large as the whole network and therefore not really interpretable as topics. We therefore only consider the complement of cluster B (on size rank 3, with about one third of all m = |E| = 30,835 citation links). E − B is connected but we have to test whether it survives a local and a memetic search. That means, we have to use it as a seed subgraph for PsiMinL. E − B remained unchanged and therefore valid till resolution level r = 1/5. On level r = 1/4, PsiMinL invalidated E − B: it found a never rising path (with tunnels) through the cost landscape ending in R.
The bipartite network of papers and sources is very large. Therefore, clusters are visualised on a projection of the bipartite network onto the co-citation graph of top-300 sources (Figure 3). This has the wanted side-effect that a visual comparison of the two approaches can be made (see also footnote 1). We expect that link-cluster boundaries prefer regions of sparse co-citation relations. Following Marshakova (1973), edges between the 300 selected sources were weighted with their co-citation numbers diminished by expectation values derived from a null model of independent citations. Only edges that are significant on a 95 % level have been used as input for the force directed placement of nodes.
The red line in the graphs of Figure 3 marks the boundary between R on the right side and its complement L on the left side of the graph. It connects 22 bridging sources that are cited by papers on both sides, beginning with Kant and von Clausewitz on the top and ending with Vachudova.
More specifically, each of the bridging sources has not less than 5 % of its citation links in each of the two complementary link clusters. All other sources have more than 95 % in L or in R, respectively.
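The 5 % rule for bridging sources can be sketched as follows. The function name, the labels, and the representation of citation links as (paper, source) pairs are assumptions made for illustration; exact fractions are used so that shares of exactly 5 % are classified reliably.

```python
from collections import Counter
from fractions import Fraction


def classify_sources(links, cluster, threshold=Fraction(1, 20)):
    """Label each source as 'bridging', 'inside' or 'outside' according to
    the share of its citation links that lies in the link set `cluster`."""
    cluster = set(cluster)
    total = Counter(source for _, source in links)
    inside = Counter(source for paper, source in links
                     if (paper, source) in cluster)
    labels = {}
    for source, n in total.items():
        share = Fraction(inside[source], n)
        if share >= threshold and 1 - share >= threshold:
            labels[source] = "bridging"  # at least 5 % on each side
        elif share > 1 - threshold:
            labels[source] = "inside"    # more than 95 % in the cluster
        else:
            labels[source] = "outside"   # more than 95 % in the complement
    return labels


# Toy data (illustrative only): S1 has 1 of 20 links in the cluster,
# S2 has all 10 of its links in it, S3 has 1 of 40.
toy_links = ([(f"p{i}", "S1") for i in range(20)]
             + [(f"q{i}", "S2") for i in range(10)]
             + [(f"r{i}", "S3") for i in range(40)])
toy_cluster = {("p0", "S1"), ("r0", "S3")} | {(f"q{i}", "S2") for i in range(10)}
print(classify_sources(toy_links, toy_cluster))
```

Source S1 sits exactly on the 5 % threshold and is therefore counted as bridging, in line with the "not less than 5 %" wording.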

Labels of sources are displayed for centres of 31 core-periphery structures obtained by running CPLC on the whole bipartite network and using results from the resolution level where the two most cited sources (Waltz 1979, Wendt 1999) become independent from each other. I have added labels of three sources at the ends of the red line (mentioned above) and of three sources that are centres in clusters (Arellano 1991, Evans 1995, Przeworski 2000). Labels are highlighted in bold for cited sources classified as belonging to the IR specialty. A cluster is marked by colouring sources that have more than 95 % of their citation links inside its link set. A co-citation edge is coloured if more than half of all its co-citing papers have citation links (to the two sources) that belong to the cluster's link set. The colour used in Figure 2 for a cluster is the same as in the graphs. 12
Bold cluster names are derived from the position in the graphs of Figure 3: In the upper graph in Figure 3, the red links and nodes represent cluster BRC (bottom right corner).
Pink elements correspond to cluster BCL (bottom centre left). Cluster BCL is also a subgraph of cluster BL (bottom left, pink and dark red). All these small clusters are subgraphs of BR, which therefore is visualised not only by green nodes and co-citation links but includes all coloured elements in the bottom right of this graph.
There are two small clusters in the first part of Table 1 that are named after their most cited source, both with relatively high Ψ-values: Cluster "Tarrow 1994" includes Sidney G. Tarrow's book Power in Movement: Social Movements and Contentious Politics and five other sources with related themes, all outside IR and inside cluster TR.
The cluster "Ostrom 1990" contains two sources with all their citation links: Elinor Ostrom's famous book Governing the Commons (90 citations) is co-cited in 21 papers with The Tragedy of the Commons, the paper by Garrett Hardin published in Science in 1968 (37 citations). 22 other sources have citation links within this cluster but get fewer than five citations from 106 papers belonging to it. The node with label "Ostrom 1990" can be found in the upper graph of Figure 3 near cluster BL (bottom left, pink and dark red).
12 Citation numbers of the set of 300 selected sources restricted to citation links in valid clusters can be downloaded as R-object ccs-v7.RObj from https://zenodo.org/record/4181930 (Havemann 2020). File read-me.R contains R-code for listing core sources of clusters. The dataset on Zenodo also includes lists of sources in clusters and on their boundaries (file Havemann2020topics.pdf) and lists of journals with numbers of papers citing sources in clusters (file citing.journals.of.clusters.pdf).
The second part of Table 1 lists data of new clusters reached by starting PsiMinL from seeds that are unions of valid clusters in the first part. Unconnected unions cannot be seeds.
The three smallest clusters and BRC do not overlap each other in citation links, but one methodological book (Wooldridge 2002) is cited in all four clusters (by 30 papers in BCR, by one paper in each of the other three clusters). Thus, any union of them is a connected subgraph and can be used as a seed.
Seeds made from unions of cluster "Ostrom 1990" with each of the other three small clusters did not bring any new result. In all three cases, cluster "Ostrom 1990" was excluded already on the first resolution level (r = 1/20) and the other cluster was reached again by memetic search.
The union of BCL and BCR has 497 citation links and Ψ(BCL ∪ BCR) ≈ 0.18079. PsiMinL found the slightly better cluster BC (bottom centre) on a short path through the cost landscape and already on resolution level r = 1/20. All these statements hold analogously for cluster BRB (bottom right bottom) which is not far from the union of BCR and BRC. Starting PsiMinL from BCL ∪ BRC ended up in cluster BRC itself already at the first resolution level.
Both new clusters, BC and BRB, do not differ much from their seeds, which are (connected) unions of disjoint link sets. Thus, we can assume that they can easily be split into well separated parts. Indeed, running CPLC on, e.g., BC results in two towns very similar to BCL and BCR, respectively, already on resolution level q = 0. Therefore, we can expect that clusters BC and BRB are thematically not very homogeneous. This can also be said about a cluster obtained from the union of cluster "Tarrow 1994" with cluster BCL (678 links, Ψ ≈ 0.25973).
Clusters TLC and TR overlap in only 24 links. Their union used as seed resulted in a new cluster with 8,027 links (T, for top), which is valid on all levels. 96 % of all links in TLC and 89 % of all links in TR are also in T (Figure 4). I conclude that seeds that are connected unions of disjoint (or nearly disjoint) link sets are not useful for identifying homogeneous topics.
Other (nontrivial) unions of overlapping clusters did not result in any new valid cluster. The same holds for intersections of valid clusters. Starting PsiMinL from intersection BR∩L, e.g., ended up with BC. I did not consider intersections of valid clusters that contain only a few links or more than 70 % of the links of the smaller cluster because one can then expect that PsiMinL only finds this smaller cluster again.
The left-hand side of Figure 4 visualises the poly-hierarchy of clusters. A blue line is drawn if the smaller cluster has less than 5 % of its links outside the larger cluster.
The tiny cluster BCL has 215 of its 231 links (93.1 %) in BC and is totally included in BL. Total inclusion is the exception. This is due to the normalisation in Equation 1. With some additional links the cost of the smaller cluster is lower, but not the cost of the larger cluster, because including links gives a smaller link set a larger relative increase of the denominator k_in(L) than a larger set.
On the right-hand side of Figure 4, overlaps between four clusters are displayed that are not (nearly totally) included in a larger cluster. L and R have zero overlap by definition. 823 of all 858 links in L∩BR are also in B. The remaining 35 citation links are visualised by the direct edge between L and BR. The edge between B and BR is missing because all 2,345 links in B ∩ BR are either in L or in its complement R (see Table 2).
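The inclusion criterion behind the poly-hierarchy (a line is drawn if the smaller cluster has less than 5 % of its links outside the larger one) can be illustrated with a minimal Python sketch. It is not the code used in this study. The function names and the link identifiers are hypothetical; only the counts for BCL (231 links, 215 of them in BC) are taken from the text, and the number of further links in the toy version of BC is arbitrary.

```python
def outside_share(smaller: set, larger: set) -> float:
    """Share of the smaller cluster's links that lie outside the larger cluster."""
    if not smaller:
        raise ValueError("smaller cluster must contain at least one link")
    return len(smaller - larger) / len(smaller)


def nearly_included(smaller: set, larger: set, threshold: float = 0.05) -> bool:
    """True if less than `threshold` of the smaller cluster lies outside the larger one."""
    return outside_share(smaller, larger) < threshold


# Toy link sets reproducing the reported counts: 231 links in BCL, 215 of them in BC.
# Link identifiers are made up; the 300 extra links in BC are an arbitrary choice.
bcl = set(range(231))
bc = set(range(215)) | set(range(1000, 1300))

print(f"{1 - outside_share(bcl, bc):.1%} of BCL lies in BC")
print("line drawn between BCL and BC:", nearly_included(bcl, bc))
```

With these counts, 16 of 231 links (about 6.9 %) lie outside BC, so the 5 % criterion is not met for this pair.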

Core-periphery structures
Constructing core-periphery structures of a cluster can reveal its highly cohesive cores if it has one or more of such cores. Clusters in the second part of Table 1 decay into two well separated subclusters. We can therefore neglect them when we look for cohesive cores.
For all other 11 valid clusters found on resolution level r = 1/3, core-periphery structures (towns) were constructed by running CPLC for a sequence of values of resolution parameter q ∈ [0, 1/2]. Figure 5 shows the four towns in TLC obtained by CPLC on resolution level q = 0.183. The pale blue town around Foucault (1975) has a larger periphery than the three other towns. I here only present this example, which at least gives cursory evidence that CPLC indeed reveals core-periphery structures in clusters. I have to leave a detailed examination of results to further work.
Towns of clusters were also used as seed subgraphs for finding further clusters. One example is a town of L with Wendt (1995) as the centre. Starting from this seed, PsiMinL rediscovered cluster TL. I selected those towns as seeds which promised to lead to new clusters from inspecting the co-citation graph (Figure 3). Further successful cases are the three clusters in the third part of Table 1, which are named after the centres of their seed towns.
The paper by Robert Cox (1981)  The book by Douglass North (1990, on the red line in Figure 3) is significantly often co-cited with the book by Oliver Williamson (1985), both dealing with economic institutions. They have all their citation links in this cluster. The next relevant source is Ostrom's book (1990), which is cited by ten cluster papers but gets 90 citations in the whole set.
Mancur Olson's book about The Logic of Collective Action (1965, on the right side of the red line in graphs of Figure 3) is the only full-member source in its cluster. In contrast to the other two clusters in the third part of Table 1, this cluster remains valid only till r = 1/5. For r = 1/4, PsiMinL invalidated it by reaching BCR.

Clustering method
Methods for the clustering of networks can use global evaluation functions that evaluate whole partitions, like modularity, or local functions that evaluate each cluster independently from others, like conductance or normalised cut for node clustering (Fortunato 2010) and normalised node-cut Ψ for link clustering. Topics are locally defined. This favours the use of local evaluation functions for topic reconstruction. Citation links are the thematically least heterogeneous bibliometric elements. This suggests applying link clustering algorithms to citation networks. Topics can overlap and form a poly-hierarchy, which in turn means that topic clusters should not be too hard to split into sub-clusters. Thus cohesion cannot be the main criterion for evaluating a cluster. Until now PsiMinL is the only algorithm that is in line with all these demands. The price paid for this is long running times, the need for many CPUs, and a high complexity of the whole analysis (see also the discussion of computer running times of PsiMinL in the Appendix on p. 20).
Next to these abstract and technical considerations, the crucial test relates to domain knowledge: Can experts interpret not only single clusters but also the poly-hierarchy they form and their overlaps? 13 I have to leave this for further work.
13 Otherwise, all the effort becomes problematic. A further interesting question is whether one finds top sources in overlaps that are cited for different reasons in different overlapping clusters, which was one of our arguments for clustering citation links.
This paper makes several novel contributions. For the first time, I apply PsiMinL to a bipartite network of highly cited sources and papers citing at least two of them. I argue that this restriction is possible because top-cited sources serve as symbols for shared knowledge of a scientific community in a field and shared knowledge is what a topic defines. This restriction reduces the network size (by a factor of ten) and therefore also the computational effort. Also for the first time, I overcome the somewhat arbitrary choice of a fixed resolution by going through a sequence of resolution levels and using the resulting clusters on one level as seeds for the next one. A further novelty is that I construct initial seed subgraphs from clusters corresponding to long branches in the dendrogram obtained by Ward co-citation clustering. This is also the first PsiMinL analysis of a specialty belonging to the social sciences.

Clustering results
Three different data models were used here, namely:
1. the bipartite network of top-300 sources and all papers in IR journals and books citing at least two of them (used by link clustering algorithm PsiMinL, leading to a poly-hierarchy of clusters),
2. the projection of the bipartite network onto the co-citation graph of top-300 sources (on which clusters are displayed after selecting significant links), and
3. a distance matrix between top-300 sources made from the co-citation projection weighted with Salton's cosine (used for constructing seed subgraphs from Ward clusters of views).
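How the three data models relate to each other can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in this study: the toy citation data and all names are hypothetical, and turning Salton's cosine into a distance as 1 minus the cosine is an assumption, since the exact transformation is not specified here.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each paper with the set of top sources it cites.
citations = {
    "p1": {"s1", "s2"},
    "p2": {"s1", "s2", "s3"},
    "p3": {"s3"},            # cites only one top source and is dropped below
    "p4": {"s1", "s3"},
}

# Data model 1: bipartite network of papers citing at least two top sources.
bipartite = {paper: srcs for paper, srcs in citations.items() if len(srcs) >= 2}

# Data model 2: projection onto the co-citation graph of the sources.
cited_by = Counter()   # number of citing papers per source
cocited = Counter()    # number of papers citing both sources of a pair
for srcs in bipartite.values():
    cited_by.update(srcs)
    cocited.update(frozenset(pair) for pair in combinations(sorted(srcs), 2))


def salton_cosine(a: str, b: str) -> float:
    """Co-citation count normalised by the geometric mean of citation counts."""
    return cocited[frozenset((a, b))] / sqrt(cited_by[a] * cited_by[b])


# Data model 3: distance between sources (assumed here as 1 - cosine).
def distance(a: str, b: str) -> float:
    return 0.0 if a == b else 1.0 - salton_cosine(a, b)


for a, b in combinations(sorted(cited_by), 2):
    print(a, b, round(salton_cosine(a, b), 3), round(distance(a, b), 3))
```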
In spite of data differences, each link cluster concentrates in a certain region of the co-citation graph. Most clusters have boundaries running through sparse regions of the graph. This is a first hint that PsiMinL applied to a bipartite network of papers and top-cited sources leads to reasonable clusters. I have to leave any further evaluation of contents of PsiMinL clusters and of their core-periphery structures obtained here to IR experts (Risse et al. 2020).
I can, however, compare these clusters quantitatively with all clusters of views on all levels of hierarchical Ward clustering. How many top-300 sources of a Ward cluster are core members of any link cluster? The results are presented in Appendix (p. 21).
Three link clusters are never a best match of a seed, namely those made from unions of two clusters: BC, BRB, and T (second part of Table 1, p. 7). This corresponds to their probable thematic inhomogeneity discussed above.
There are five exact matches between clusters, which all have less than seven cited sources (Table 6, p. 21). The worst match is with cluster TL (Salton's cosine s ≈ 0.76). The division between the two largest clusters L and R is matched with values of s > 0.9.
All but one of the matched link clusters in the first part of Table 1 are matched best by their (nearest) seed. Only TLC is best matched by a Ward cluster that is not in the set of 27 long-branch seeds but among the 23 seeds with shorter branches (cf. footnote 9). PsiMinL reaches TLC from this seed too.
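The matching of link clusters with Ward clusters can be illustrated by a minimal Python sketch. It assumes that both kinds of clusters are compared as sets of top sources and that Salton's cosine of two sets is their overlap divided by the geometric mean of their sizes; the function names and the toy sets are hypothetical and do not reproduce the data of Table 6.

```python
from math import sqrt


def set_cosine(a: set, b: set) -> float:
    """Salton's cosine of two sets: overlap over the geometric mean of sizes."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def best_match(link_sources: set, ward_clusters: dict) -> tuple:
    """Return (name, cosine) of the Ward cluster most similar to the link cluster."""
    scored = ((name, set_cosine(link_sources, members))
              for name, members in ward_clusters.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical toy clusters of sources, not taken from the study.
ward = {
    "w1": {"a", "b", "c"},
    "w2": {"a", "b", "c", "d"},
    "w3": {"e"},
}
print(best_match({"a", "b", "c"}, ward))   # an exact match has cosine 1
print(best_match({"a", "b", "d", "f"}, ward))
```

An exact match corresponds to a cosine of 1; values below 1 measure how far the best-matching Ward cluster deviates from the sources of the link cluster.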
How can we interpret these good matches between link clusters and some Ward clusters of views that correspond to long branches in the dendrogram?
First, the two approaches are compatible and therefore support one another.
Second, the use of long-branch clusters as seed subgraphs for PsiMinL is confirmed as an efficient method. Starting from seeds from a global cut through the dendrogram needs longer paths in the cost landscape and resulted only in a subset of the valid link clusters obtained with long-branch seeds. That is, starting from long-branch seeds we rediscover all clusters that were found with global-cut seeds. In other words, similarity of seeds and resulting clusters is not the reason for finding this set of clusters.
Experiments with seeds corresponding to 23 branches with sub-maximal length in their size classes showed that we can find more small valid link clusters when starting from small seeds with shorter branches too (cf. Appendix, p. 17). Some of these small clusters are not as well separated as the best clusters in Table 1 (p. 7). Their Ψ values exceed 1/4 (cf. also Table 4, p. 17).
Evaluation function Ψ is always larger than the escape probability of the random link-node-link walker (Evans and Lambiotte 2009), for small clusters only slightly larger, because the denominator of the second term in the definition of Ψ (Equation 1) is very large. That means, for Ψ < 1/2 the random walker's probability to remain within the cluster is always larger than to escape from it in the next step (P_esc < 1/2).
An ordinary random walker hopping from node to node escapes from a weak node community as defined by Radicchi et al. (2004) also with a probability P_esc < 1/2. Translating the definition of weak communities into the language of link clustering (Havemann et al. 2019), we can deduce that all clusters obtained here are link communities in the weak sense.
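The relation between weak node communities and the escape probability of an ordinary node-to-node random walker can be checked on a toy graph. The following Python sketch is an illustration only and covers the node case, not the link-node-link walker or Ψ itself. It assumes an undirected, unweighted graph and a walker in its stationary state, so that the probability of being at a node is proportional to its degree; the function names and the example graph are hypothetical.

```python
def escape_probability(edges, cluster):
    """Probability that a stationary random walker inside `cluster` leaves it next step.

    Equals the sum of external degrees divided by the sum of all degrees
    of the cluster's nodes (undirected, unweighted graph assumed).
    """
    internal = 0  # sum of internal degrees: every internal edge counts twice
    external = 0  # sum of external degrees: every boundary edge counts once
    for u, v in edges:
        ends_inside = (u in cluster) + (v in cluster)
        if ends_inside == 2:
            internal += 2
        elif ends_inside == 1:
            external += 1
    if internal + external == 0:
        raise ValueError("cluster has no edges")
    return external / (internal + external)


def is_weak_community(edges, cluster):
    """Weak community: internal degree sum exceeds external degree sum."""
    return escape_probability(edges, cluster) < 0.5


# Toy graph: two triangles joined by the single edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
print(escape_probability(edges, {1, 2, 3}), is_weak_community(edges, {1, 2, 3}))
print(escape_probability(edges, {3, 4}), is_weak_community(edges, {3, 4}))
```

The triangle is a weak community because its internal degree sum (6) exceeds its external degree sum (1), which is the same condition as P_esc < 1/2. The bridge {3, 4} fails both conditions.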
Recently Kristensen (2018) determined disjoint co-citation clusters of 332 authors highly cited as first authors in 106 IR journals in the period 2011-2015. His aim was to visualise the "communicative-sociological structures" of the discipline. He admits that neglecting co-authors of highly cited first authors can cause biases towards some authors, especially towards authors of theorising works. He found some authors with a "fairly stable position in the network" but others "whose work is used for positioning by several camps may shift camps depending on the specific threshold values" (p. 247).
In my approach each highly cited work can appear in more than one cluster because I produce overlapping clusters of cited sources. Topics overlap in authors even more than in papers or books but at first glance both networks show at least some similar structures. The contents of Kristensen's camps of authors and of link clusters obtained here cannot be compared without knowledge of the field.

Conclusions
Can PsiMinL be recommended for finding a poly-hierarchy of overlapping research topics of a specialty? The experiments made in this study suggest that we indeed obtain reasonable results by applying PsiMinL to a bipartite network of selected concept symbols and all papers citing at least two of them. 14 IR experts were able to interpret them (Risse et al. 2020). All resulting clusters were only slightly changed after adding missing links to the network (see Appendix, p. 21). Several link clusters have a good match with Ward clusters of views (Table 6, p. 21). A comparison with results of further clustering algorithms applied to the same data would be useful for evaluating the new approach to clustering concept symbols. A first trial with classic co-citation analysis (single linkage of cosine weighted links) as done by Small and Sweeney (1985) was made. Also here, results suffer from chaining, the well-known disadvantage of single linkage. Differences between clusters obtained by PsiMinL and by other algorithms could be evaluated by experts of the specialty. I have to leave such comparisons to further work.
14 One caveat has to be made: Researchers in International Relations as in other specialties in social sciences often refer to books as concept symbols. 175 of the top-300 sources are books (see Table 3, p. 17). Thus, a success of the approach for specialties of natural science can be expected but not guaranteed.
Generally, any partition of a network into disjoint clusters cannot be compared as a whole with a poly-hierarchy of overlapping clusters. A good matching of all clusters is only possible if the clusters used for a quantitative comparison form a hierarchy that has many levels (like the Ward clusters of views discussed above).
Similar results of different clustering methods can be seen as a mutual support but different results do not falsify any of the methods. They can be interpreted as reconstructing legitimate alternative perspectives on the structure of a specialty's literature (Gläser, Glänzel, and Scharnhorst 2017). At most, one method could be judged as more accurate than the other when we compare both with regard to the purpose of clustering (Waltman, Boyack, Colavizza, and van Eck 2020). A poly-hierarchy of independently evaluated clusters, as delivered by PsiMinL, could represent already different perspectives on the analysed literature.
Evaluation function Ψ can be justified within the model of a random walker who should leave a cluster with low probability (Havemann et al. 2019). For finding node clusters, each step of a random walker starts and ends on a node. Link clusters can be constructed by starting and ending on links (Evans and Lambiotte 2009). 15 Random walks last long in well separated clusters. When a cluster contains sub-clusters which are only weakly connected with one another the chance to leave it can nonetheless be as low as to leave any of the two sub-clusters. In this sense random walkers are insensitive to inner cohesion of clusters. I argue that we need cohesion insensitivity when we want to obtain hierarchically organised sets of clusters. Only the smallest clusters can be expected not to decay into sub-clusters.
15 Recently, a random link-node-link walker's escape probability was used by Enders et al. (2020) to cluster 39 standard hypotheses about biological invasions for mapping this specialty.
Seeing a research topic as a shared focus on scientific knowledge suggests that not separation but cohesion of views on knowledge should be the defining property of topics. We have tried to weaken this argument by pointing to core-periphery structures and by proposing the simple CPLC algorithm that constructs such structures inside well separated link clusters (Havemann et al. 2019). This approach still rests on the assumption that topics can be represented by well separated clusters. The experiments with PsiMinL show that there are such topics but they do not prove that all research topics can be separated from the rest of a citation network. In dense cores of the network, separation could fail, as the occurrence of a terra incognita (a huge central cluster without substructures) in the analysis of astronomy and astrophysics seems to suggest (Havemann et al. 2017, p. 1105).
Technically, PsiMinL is an evolutionary algorithm that searches for local minima in the cost landscape with evaluation function Ψ. PsiMinL starts memetic evolutions from seed subgraphs but the same valid cluster can be reached from different seeds (cf. Figure 2, p. 7). In this sense, the cluster solution is independent from seeds. The construction of seeds influences only the time needed for a solution and its completeness.
Likewise, the technical parameters of PsiMinL do not affect the results but only the time needed to obtain them. The only numerical parameter that influences the shape of clusters is resolution r. In this study, I have tested a procedure that makes the results less dependent on r. I started with low r and then iteratively used the clusters as seed subgraphs for running PsiMinL for higher levels of r. Because lower r means faster search, this strategy could also be advantageous when results on only one resolution level are needed.
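The iteration over resolution levels can be sketched as a small driver loop in Python. This is a schematic illustration only: `toy_search` is a hypothetical stand-in and not the memetic search of PsiMinL, the toy link sets are made up, and the list of levels contains only those values of r that are named in the text (the full sequence used in the study may differ).

```python
from fractions import Fraction


def refine_over_resolutions(seeds, resolutions, search):
    """Run `search` on rising resolution levels, reusing results as next seeds."""
    current = [frozenset(seed) for seed in seeds]
    history = {}
    for r in sorted(resolutions):
        # Different seeds may end in the same cluster, so keep distinct results only.
        found = {frozenset(search(seed, r)) for seed in current}
        current = sorted(found, key=sorted)
        history[r] = current
    return history


def toy_search(seed, r):
    """Hypothetical stand-in for one search run; it is NOT the memetic algorithm."""
    larger = frozenset({1, 2, 3, 4})
    if r >= Fraction(1, 4) and seed <= larger:
        return larger  # from r = 1/4 on, small clusters inside `larger` give way to it
    return seed


levels = [Fraction(1, 20), Fraction(1, 5), Fraction(1, 4), Fraction(1, 3)]
history = refine_over_resolutions([{1, 2}, {3, 4}, {7, 8}], levels, toy_search)
for r, clusters in history.items():
    print(r, [sorted(c) for c in clusters])
```

The design choice shown here is that each level starts from the distinct clusters of the previous level, so seeds that have already merged are searched only once on the next level.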
Evolutionary algorithms on large networks need much computing time. PsiMinL, like other algorithms, shifts the time problem at least partly to one of computing power by applying highly parallel procedures. Genetic operators can be applied in parallel to all individuals of a population. Because clusters are evaluated independently, we can start PsiMinL in parallel from different seeds. Further optimisation of PsiMinL could be reached by finding optimal sets of technical parameters like population size, mutation rate etc. Another technique for reducing computing time could be to start with only two years and then use resulting clusters as seeds for larger periods, similar to reducing a large graph by random sampling (Azaouzi, Rhouma, and Ben Romdhane 2019, p. 23).
Finding a minimum in a large and rough cost landscape by applying an evolutionary strategy never comes to an end because we cannot prove that there is no lower place than the one found. PsiMinL searches for local minima and accepts a link cluster L as a valid solution if it is not made invalid by a lower place inside a radius of r|L| in the landscape. That means that we cannot exclude that there are better variants of clusters, and we also cannot maintain that we have found all valid clusters. Sometimes PsiMinL invalidates a cluster only after several trials. That means we cannot be sure that a found cluster is really valid, but we can at least assume a weak validity when PsiMinL is not able to find a path to a better cluster after several trials.
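The validity criterion can be written down as a simple check against all clusters known so far. The following Python sketch is an illustration under one assumption: the distance between two link sets is taken to be the size of their symmetric difference, which is my reading of "differs in at least r|L| links". The function name, the toy link sets and the cost values are hypothetical.

```python
from fractions import Fraction


def is_valid(cluster, cost, known_clusters, r):
    """Check a cluster against known clusters with lower cost.

    `known_clusters` is an iterable of (link_set, cost) pairs. The cluster is
    invalidated by any cheaper cluster that differs from it in fewer than
    r * |cluster| links (difference assumed to be the symmetric difference).
    """
    radius = r * len(cluster)
    for other, other_cost in known_clusters:
        if other_cost < cost and len(cluster ^ other) < radius:
            return False
    return True


# Toy example: a cluster of 20 links and a slightly cheaper variant
# that drops one link and adds two others (symmetric difference of 3 links).
cluster = frozenset(range(20))
variant = frozenset(range(1, 20)) | {100, 101}
known = [(variant, 0.19)]

print(is_valid(cluster, 0.20, known, Fraction(1, 20)))  # radius 1: variant is too far away
print(is_valid(cluster, 0.20, known, Fraction(1, 5)))   # radius 4: variant invalidates it
```

The example also shows why a cluster that is valid on a low resolution level can become invalid on a higher one: the radius r|L| grows with r, so more of the cheaper neighbours fall inside it.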
Applying PsiMinL for finding link clusters in citation networks needs preprocessing (data cleaning, construction of seeds) 16 and postprocessing (selection of valid solutions, finding cohesive cores). Running PsiMinL many times for many seeds requires not only computing time and power but also a clear organisation of all procedures, selections, and validations. PsiMinL cannot be recommended for a user only interested in results before the whole procedure is transformed into a routine of automatic actions. More experience is needed for optimising the exploration of cost landscapes with PsiMinL. Then, hopefully, we can make a step further in codifying the procedure.
16 But note that tedious cleaning of citation data can be reduced to highly cited sources when citation networks of concept symbols are clustered.

Acknowledgements
This work is part of the Global Pathways project sponsored by DFG (grant RI 798/11-1). 17 As a member of the project team, Felix Mattes made all downloads and developed the algorithm for reference identification. Lixue Lin-Siedler helped in classifying references as scholarly ones. The team members and experts in international relations, Thomas Risse, Wiebke Wemheuer-Vogelaar, and Mathis Lohaus, commented on results and classified the 300 highly cited sources. Special thanks to Jochen Gläser who gave valuable advice in the whole process of data collection and processing. The memetic algorithm was implemented as an R-package by Andreas Prescher in a project funded by the German Research Ministry (BMBF grant 01UZ0905). I thank Michael Heinz for many discussions and for applying an alternative clustering method. He, Jochen Gläser, Alexander Struck, Mathis Lohaus, and Martin Enders also commented on drafts of the paper. The comments of two anonymous reviewers were also very helpful for improving the paper, many thanks! Finally I thank the developers of LaTeX and of R. 18

Data availability
The raw data used in this paper were obtained from the Web of Science database produced by Clarivate Analytics. Due to license restrictions, the data cannot be made openly available. To obtain Web of Science data, please contact Clarivate Analytics. 19 Results of cleaning and clustering can be found on Zenodo (Havemann 2020).

Competing interests
The author declares that he has no competing interests.

Appendix Data
In Table 7 (p. 22) all downloaded journals are listed together with numbers of papers with references in the whole 10-year period and in both 5-year periods, respectively. 20 We identified references to highly cited sources in the set of downloaded papers. Among these, our IR experts selected 137 IR books and 102 IR papers. The paper set was expanded by downloading 14,389 papers citing these sources but published in 3,135 non-IR journals and serials and in 2,405 non-IR books. 21 All records without references were excluded because they would not be part of a citation-based network. This results in a total number of 71,210 records of the following types: 53,889 original papers, 15,524 book reviews, and 1,797 review articles. Because not all variants of references to one and the same source are identified in WoS we need some further identification, but we have excluded references to non-scholarly historical sources from this tedious task. Thus, among the reference strings of publications, a subset was categorised as probably referring to scholarly sources when they included an author name and a year and did not refer to a newspaper. To these references an algorithm for reference identification was applied, which is described below.
20 The full names of journals can be read in the CSV file journals.csv, which can be downloaded from https://zenodo.org/record/4181930 (Havemann 2020).
21 Data expansion to papers in non-IR journals and books was neglected for the clustering exercise because it would bias the results towards the views of authors outside IR. These papers have been downloaded to get a comprehensive bibliography of IR papers.
All in all, 1,143,317 different reference strings to scholarly sources were identified. They represent about two million citations of 992,582 identified sources. In many papers the authors refer to different pages of a source, whether it be a book or a paper. Because such references were identified here (but generally not in WoS) there are some duplicated links between citing papers and cited sources. 24,308 citing papers are also cited sources.
References were identified by an algorithm that Felix Mattes implemented as a script in R (R Core Team 2015). It calculates distances between elements of different reference strings (names of authors and of journals, book titles, volumes, pages, and DOIs). String distance is based on OSA distance (Optimal String Alignment, a restricted Damerau-Levenshtein distance) in R-package stringdist (Van der Loo 2014). Distance is defined as zero when all characters of the shorter string occur in the same order in the longer one. Beginning pages are allowed to have a maximum difference of ten. Publication years must be equal. After calculating distances for each element of references, distances between whole references are determined as a weighted sum of element distances. We experimented with different weights and thresholds to obtain a criterion for identification which avoids false positives but also false negatives. The results of reference identification can be downloaded from https://zenodo.org/record/4181930 (file ref.sou.csv). There you also find the R-script used for reference identification (Havemann 2020).
Table 9 on pp. 23-35 gives the top-300 sources highly cited in IR papers 2006-2015 in alphabetical order of the first author or editor (co-authors and co-editors can be found in the last column).
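The rules of reference identification described in this appendix (OSA distance, zero distance for in-order subsequences, equal years, beginning pages at most ten apart, weighted sum of element distances) can be made concrete with a minimal Python sketch. It is not the R script of the project, which relies on the stringdist package. The field names, the lower-casing of strings, the weights and the example strings are hypothetical choices for illustration only.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal String Alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent transposition
    return d[-1][-1]


def is_subsequence(short: str, long_: str) -> bool:
    """True if all characters of `short` occur in the same order in `long_`."""
    remaining = iter(long_)
    return all(ch in remaining for ch in short)


def element_distance(a: str, b: str) -> int:
    """Zero for in-order subsequences, OSA distance otherwise (case ignored here)."""
    short, long_ = sorted((a.lower(), b.lower()), key=len)
    return 0 if is_subsequence(short, long_) else osa_distance(short, long_)


def compatible(ref_a: dict, ref_b: dict, max_page_gap: int = 10) -> bool:
    """Years must be equal; beginning pages may differ by at most ten."""
    return (ref_a["year"] == ref_b["year"]
            and abs(ref_a["page"] - ref_b["page"]) <= max_page_gap)


WEIGHTS = {"author": 0.6, "title": 0.4}  # hypothetical weights, not those of the study


def reference_distance(ref_a: dict, ref_b: dict) -> float:
    """Weighted sum of element distances between two reference records."""
    return sum(w * element_distance(ref_a[k], ref_b[k]) for k, w in WEIGHTS.items())


# Made-up reference strings for illustration.
a = {"author": "Waltz K", "title": "Theory Int Politics", "year": 1979, "page": 1}
b = {"author": "Waltz Kenneth N", "title": "Theory of International Politics",
     "year": 1979, "page": 5}
print(compatible(a, b), reference_distance(a, b))
```

In the abbreviated example both elements of the first record are in-order subsequences of the second, so their element distances are zero. Whether two records are finally identified would depend on a threshold for the weighted sum, which is not specified here.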
The selection is based on citation numbers in the expanded data set. Therefore it is biased towards the 239 IR sources used for data expansion (because highly cited non-IR sources could have less gain in citation numbers from data expansion). This bias has to be considered when results are interpreted. 22 The column cit shows the numbers of papers citing the sources in 2006-2015. The ranks in the first column are determined according to these citation numbers. Ties are broken by year of first publication, beginning with older sources. Ranks in bold numbers belong to sources classified as IR sources by experts. Table 3 contains numbers of top-300 sources with regard to publication type and content.
The oldest top-300 source is Kant's Perpetual Peace (1795). There are two further sources that were first published before 1900 (von Clausewitz and Marx), and three in the first half of the twentieth century (Weber, Polanyi, and Morgenthau).
In Figure 6 the distribution of publication years of all other 294 highly cited sources is visualised.
22 The citation numbers in column cit of Table 9 on pp. 23-35 equal the numbers of citing papers in IR journals only, i.e., without data expansion. There is a large overlap between the 203 IR sources among the top-300 and the 239 IR sources used for data expansion.
[Figure 6: publication years of highly cited sources in five-year classes from 1956-1960 to 2006-2010]
valid cluster: a cluster L that differs in at least r|L| links from any cluster with lower cost Ψ (0 < r < 1 measures resolution)

Selecting seeds
The dendrogram of Ward clustering of views is in Figure 7. In Figure 8 branch lengths in the Ward dendrogram are displayed for all cluster sizes. Red numbers in both figures are identifiers of 27 selected Ward clusters with longest branches in their size classes. They are used for constructing seed subgraphs for memetic evolutions. I have checked whether seeds made from a further 23 Ward clusters with shorter branches in the dendrogram result in new valid clusters. Starting from 16 of them, PsiMinL rediscovered six clusters (five times TLC, three times BR and BCR, two times L and R, and one time BRC). The other seven memetic searches discovered new clusters (Table 4). Six of them are small (maximum of 432 links) and not very well separated (Ψ > 0.27). Seed 196 (six sources) leads to a larger cluster, which has the same six top-300 sources as full members. They all deal with terrorism and form the top right corner of TR (violet in lower graph of Figure 3, p. 8).

Search starting from seed 296
To give an impression of how PsiMinL finds lower places in the cost landscape, the height profile of the local search path that starts from seed 296 is displayed in Figure 9. Greedy exclusion of links starts from the violet point and tunnels through several barriers (red and grey curves). Then the algorithm tries to exclude further links but the barrier is too thick to tunnel through for resolution r = 1/20 (grey curve on the left side). The different lines beginning at the green point correspond to 16 independent searches. Each line connects the current best clusters in a memetic evolution. The best result (red point) was reached by one search (red line). In a second phase, 16 evolutions started from the red point (using eight different best clusters as individuals for all 16 populations). One of them (blue line) reached R (blue point). In a third phase, PsiMinL tried to improve this cluster in a further 16 searches but without success. Here the populations were initialised by mutating R until eight different individuals were generated.
For seed 296, the memetic evolutions including searches for all resolution levels needed 8.75 h computer time with the R-package PsiMinL implemented under Ubuntu on a Dell machine with 56 CPUs (for further technical parameters, see next section). The initial local search was made with an R-script (without using package PsiMinL) and needed 80 s.
Figure 11 is a diagram of computer running times for different seeds. It shows that running times depend only partly on size of the final clusters. 23 The straight line corresponds to a linear increase of time with size of one hour per 1000 links. For all but one of the small final clusters, PsiMinL needs more time per link. But for larger clusters, roughly speaking, the minimal time increases linearly with size and the maximal times are not extremely far away from minimal times (with the exception of seed 286, from which first the large cluster TL is reached before evolution ends up in a smaller cluster).
23 Note that some final clusters of the same colour differ slightly in size. As mentioned in section 4.1 (p. 5), in some cases, evolutions end up in slightly different variants of a cluster. The best one invalidates the other variants.
Maximally, each PsiMinL process with eight individuals in a population needs eight CPUs. Thus, eight parallel processes running on 56 CPUs do not impede each other and, for a network with about 30,000 links, the whole memetic search can be completed within a few days.

Results with seeds from a disjoint partition
In a first trial, I constructed seed subgraphs from a cut through the Ward dendrogram in Figure 7 (p. 18) that delivers 15 clusters. On the last resolution level r = 1/3 the independent procedures starting from these seeds converged toward seven valid clusters, which are also valid for r < 1/3 by definition (the five largest clusters, cluster BRC, and cluster BC in Table 5). Figure 12 shows costs Ψ and sizes of all 15 Ward seed-subgraphs, of results of initial local searches and memetic searches on intermediary resolution levels, and of their resulting clusters on final resolution level. Seed size does not correspond to cluster size and different seeds end in the same cluster. The diagram visualises cost-size data of seeds and clusters obtained with an incomplete network with 30,614 citation links. After adding 221 links (mainly citations of Kant, 1795), I used all clusters as seeds for new memetic evolutions, which ended in very similar clusters. They are equal to some of the clusters in Table 1 (p. 7). The data of corresponding clusters in the incomplete network are shown in Table 5.
The link cluster obtained from Olson (1965) has only this paper in S(L), which is trivially matched with itself. Results are in Table 6. All best matching seeds but seed 276 can be found in Figure 7. Seed 276 and seed 272 are direct sub-clusters of seed 287, which can be found in the lower part of the dendrogram.
parameter | value | description
population size | 8 | number of individuals that produce offspring by crossovers and mutation
mutation variance | 1 | percentage of genes (links) that are randomly altered in mutation of the best individual
renewal variance | 6 | percentage of genes (links) that are randomly altered in a renewal mutation
mutation rate | 4 | number of mutants in each generation
number of crossovers | 4 | number of gene combinations (unions and intersections) in each generation of the current best individual with other individuals
renewal period | 10 | number of generations after which a renewal mutation is made if the best individual remains the same
stopping period | 100 | number of generations after which evolution is stopped if the best individual remains the same
minimal period | 200 | minimal total number of generations of a population
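How the three period parameters could interact is sketched below in Python. This is my reading of the parameter descriptions and not code from the R-package: the class, the function names and the exact way the conditions are combined (stop only when both the stopping period and the minimal period are reached, renew after every full renewal period without improvement) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvolutionParameters:
    """Technical parameters with the values listed above."""
    population_size: int = 8
    mutation_variance: float = 1.0   # percent of links altered in a mutation
    renewal_variance: float = 6.0    # percent of links altered in a renewal mutation
    mutation_rate: int = 4           # mutants per generation
    crossovers: int = 4              # unions and intersections per generation
    renewal_period: int = 10
    stopping_period: int = 100
    minimal_period: int = 200


def needs_renewal(unchanged: int, params: EvolutionParameters) -> bool:
    """Assumed rule: renew after every `renewal_period` generations without change."""
    return unchanged > 0 and unchanged % params.renewal_period == 0


def should_stop(generation: int, unchanged: int, params: EvolutionParameters) -> bool:
    """Assumed rule: stop once the best individual has been unchanged long enough
    and the population has lived for at least the minimal number of generations."""
    return (generation >= params.minimal_period
            and unchanged >= params.stopping_period)


params = EvolutionParameters()
print(should_stop(150, 120, params))  # too early: minimal period not yet reached
print(should_stop(200, 100, params))  # both conditions met
print(needs_renewal(10, params), needs_renewal(15, params))
```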