An application of neighbourhoods in digraphs to the classification of binary dynamics

Abstract A binary state on a graph means an assignment of binary values to its vertices. A time-dependent sequence of binary states is referred to as binary dynamics. We describe a method for the classification of binary dynamics of digraphs, using particular choices of closed neighbourhoods. Our motivation and application comes from neuroscience, where a directed graph is an abstraction of neurons and their connections, and where the simplification of large amounts of data is key to any computation. We present a topological/graph theoretic method for extracting information out of binary dynamics on a graph, based on a selection of a relatively small number of vertices and their neighbourhoods. We consider existing and introduce new real-valued functions on closed neighbourhoods, comparing them by their ability to accurately classify different binary dynamics. We describe a classification algorithm that uses two parameters and sets up a machine learning pipeline. We demonstrate the effectiveness of the method on simulated activity on a digital reconstruction of cortical tissue of a rat, and on a nonbiological random graph with similar density.


Introduction
A binary state on a graph means an assignment of binary values to its vertices.A motivating example in this article appears in the context of neuroscience.If one encodes the connectivity of a neuronal network as a directed graph, then the spikes produced by the neurons at an instant of time is a binary state on the encoding graph.Allowing time to vary and recording the spiking patterns of the neurons in the network produces an example of a binary dynamics on the encoding graph, namely a one-parameter family of binary states on its vertices.A network of neurons that receives external signals and responds to those signals thus generates a binary dynamics.Binary dynamics appear in other contexts as well [1,2], but in this paper we use networks of spiking neurons as a primary example.
The signal classification problem, i.e., the task of correctly pairing a signal injected into a neuronal network with the response of the network, or in other words, identifying the incoming signal from the response, is generally very challenging.This paper proposes a methodology by which this task can be approached and provides scenarios in which this methodology is successful.
Considering raw binary states on a large graph is generally quite problematic for a number of reasons.First, the sheer number of theoretically possible states makes analysing a collection of them a daunting task [3,4].Moreover, natural systems such as neuronal networks tend to be very noisy, in the sense that the emerging dynamics from the same stimulus may take a rather large variety of forms [5,6].Finally, it is a general working hypothesis in studying network dynamics that the network structure affects its function [7,8,9,10].This paradigm in neuroscience is often encapsulated by the slogan "neurons that fire together tend to wire together".Hence, when studying dynamics on a neuronal network, it makes sense to examine assemblies of vertices, or subgraphs, and the way in which they behave as dynamical sub-units, instead of considering individual vertices in the network [11,12,13].
In previous studies we considered cliques in a directed graph, with various orientations of the connections between nodes, as basic units from which one could extract information about binary dynamics [14,15].However, the results in these papers fell short of suggesting an efficient classifier of binary dynamics [14,.Indeed, when we applied the methods of [14,15] to the main dataset used in this paper, we obtained unsatisfactory classification accuracy.This suggests that in a graph that models a natural system cliques may be too small to carry the amount of information required for classification of a noisy signal.This motivates us to build our classification strategy on neuron assemblies, where the richer structure serves a dual purpose of amalgamating dynamical information and regulating the noise inherent in single neurons or cliques.
The guiding hypothesis of this paper is that a collection of vertex assemblies, forming a subgraph of the ambient connectivity graph encoding a network, can be used in classification of binary dynamics on the network.A network of spiking neurons is our primary example.Taking this hypothesis as a guideline, we introduce a very flexible feature generation methodology that takes as input binary dynamics on a digraph G induced on a preselected collection of subgraphs of G, and turns it into a feature vector, which can then be used in machine learning classification.The neighbourhood of a vertex v in the graph G, namely the subgraph of G that is induced by v and all its neighbours in G, suggests itself naturally as a type of subgraph to be considered in this procedure, and is a central object of study in this paper.Vertex neighbourhoods have been studied extensively in graph theory and its applications [16].An outline is given below and a full description in Methods.
The way we apply the method can be summarised as follows.Given a directed graph G we use a variety of real valued vertex functions that we refer to as selection parameters and are derived from the neighbourhood of each vertex, to create a sorted list of the vertices.With respect to each such parameter, we pick the "top performing" vertices and select their neighbourhoods.To that collection of subgraphs we apply our feature generation method, which is based again on applying the same parameters to the selected neighbourhoods, now in the role of feature parameters.All the parameters we use are invariant under isomorphism of directed graphs, i.e. graph properties that remain unchanged when the vertices are permuted while leaving their connectivity intact.Therefore we occasionally refer to certain parameters as "graph invariants".
The choice of parameters is related to measures of network connectivity and architecture.For instance, the parameters fcc and tcc (see Table 1) are examples of measures of functional segregation [10].The parameters we refer to as spectral parameters arise in spectral graph theory [17] and are prevalent in many applications, including in neuroscience.For instance, the paper [18] studies the Laplacian spectrum of the macroscopic anatomical neural networks of macaques and cats, and the microscopic network of the C-elegans.The topological parameters, such as the Euler characteristic ec and Betti numbers are classical topological invariants.In [15] these were used in various ways to extract information on structure and function and their interaction in the Blue Brain Project reconstruction on the neocortical column.The parameter size is a natural parameter associated to any graph and is closely related to firing rate in neuroscience.However, most of the parameters we tested were never examined in a neuroscientific context.Our aim was to investigate which parameters may prove useful in classification of binary dynamics without making any assumptions about their relevance.It is exactly this approach that allowed us to discover that certain spectral parameters perform strongly as selection parameters, while others do not.At the same time a newly introduced topological parameter, "normalised Betti coefficient" nbc shows strong performance as a feature parameter when tested on neighbourhoods with low selection parameter values, but not on high selection values.
The primary test of our methods in this paper is done on data generated by the Blue Brain Project that was also used in [19] for signal classification by established neuroscience methodology.The data consists of eight families of neuronal stimuli that are injected in a random sequence to the digital reconstruction of the neocortical column of a young rat.This reconstructed microcircuit consists of approximately 31,000 neurons and 8,000,000 synaptic connections, and is capable of receiving neuronal signals and responding to them in a biologically accurate manner [20].We used 60% of the data to train a support vector machine, and the remaining 40% for classification.With our methods we are able to achieve classification accuracy of up to 88%.
In this paper we did not attempt to explain the relevance of any of the mathematical concepts we use to neuroscience, as our main aim was to discover and investigate the utility of various concepts.However, in [19] the same dataset is studied by standard techniques of computational neuroscience combined with the ideas presented in this paper.In particular, it is shown that an informed choice of neighbourhood improves classification accuracy when compared to traditional methods.Interestingly, selection of neighbourhoods that improved performance with the technique presented in [19] show reduced performance with the techniques presented in this article, and vice versa.In both projects a classification accuracy of nearly 90% was achievable, but with different selection parameters (see Results).This suggests that considering vertex neighbourhoods as computational units can be beneficial in more than one way.
To further test our methods in different settings, we used the NEST -Neural Simulation Tool [21] to generate neuronal networks.This software package simulates network models of spiking neurons using simplified neuron models to allow more flexibility and faster processing speed.We created a collection of eight families of stimuli, but on random graphs with varying densities, and applied our machinery to that dataset.Here again we obtained classification accuracy of up to 81%.
Important work on (open) vertex neighbourhoods was reported recently in [16].Our approach is independent of this work and is different from it in a number of ways.Most significantly, we do not study the structure of the entire graph and its dynamical properties by means of its full neighbourhood structure.Instead, we aim to infer dynamical properties of the graph from a relatively small collection of vertices, selected by certain graph theoretic and topological properties, and their neighbourhoods.
High resolution figures and supplementary material is available at the Aberdeen Neurotopology Group webpage.In particular, we included a comprehensive visualization of Figure 1.A neighbourhood in a digraph, marked in red, with its centre marked solid colour.spectral graph invariants of the Blue Brain Project graph, as well as other types of stochastically generated graphs, animations of some of the background work for this project, and a list of links to software implementing the methodology described in this paper.

Results
We start with a brief description of the mathematical formalism used in this article and our approach to classification tasks.This is intended to make the section accessible to readers without a strong mathematical background.We then proceed by describing our main data source and the setup and implementation of our experiments.Following this preparation we present our results, validation experiments, and an application of the same techniques in a different setup.

2.1.
A brief introduction to the mathematical formalism.In this article a digraph will always mean a finite collection of vertices (nodes) V and a finite collection of oriented edges (arcs) E. Reciprocal edges between a pair of vertices are allowed, but multiple edges in the same orientation between a fixed pair of vertices and self-loops are not allowed.
The fundamental mathematical concept essential for our discussion is that of the neighbourhood of a vertex in a digraph; Figure 1.Let G be a digraph, and let v 0 be any vertex in G.The neighbours of v 0 in G are all vertices that are "one step away" from v 0 , in either direction.The neighbourhood of v 0 in G is the subgraph of G induced by v 0 and all its neighbours, which we denote by N G (v 0 ).The vertex v 0 is referred to as the centre of its neighbourhood.
Numerical invariants of digraphs can be found in pure and applied graph theory literature, many of those found their uses in theoretical neuroscience (see [10] for a good survey).Some such invariants are used in this article, and a few are introduced here for the first time (e.g.transitive clustering coefficient).Other parameters we used are defined by using topological constructions that arise from digraphs.Such constructions are typically invariant under digraph isomorphism.Standard tools of algebraic topology can then be used to extract numerical invariants of graphs in ways that take emerging higher dimensional structure into account.
There are many ways in which one can associate a topological space with a digraph.In this article we use the directed flag complex .It is a topological space made out of gluing together simplices in different dimensions, starting at 0-simplices (points), 1-simplices (edges), 2simplices (triangles), 3-simplices (tetrahedra) etc.The n-simplices in a directed flag complex associated to a digraph are its directed (n+1)-cliques, namely the ordered subsets of vertices {v 0 , v 1 , ..., v n }, such that there is an edge from v i to v j for all i < j. Figure 2 shows the directed flag complex associated to a small digraph.The directed flag complex was introduced and used for topologically analysing structural and functional properties of the Blue Brain Project reconstruction of the neocortical columns of a rat [15].The interested reader may find a comprehensive survey of directed flag complexes and other topological concepts in the Materials and Methods section of that paper.If v 0 is a vertex in G, we denote by Tr G (v 0 ) the directed flag complex of N G (v 0 ).A digraph (left), the associated directed flag complex as a topological space (centre), and its maximal cliques of (right).
2.2.The classification method.We now describe briefly our approach to classification of binary dynamics.For a precise mathematical definition of what we mean by binary dynamics see Methods.The task at hand can be described as follows.We are given a large set of instantiations of binary dynamics on a fixed digraph G, each of which is labelled by a symbol from some relatively small set.The label of each binary dynamic is unique and known.The aim is to produce a machine learning compatible topological summary for each binary dynamics, so that when the summaries are introduced in a random order, one can train on part of the data with known labels and predict the unknown labels of the remaining part.The first step is selection of neighbourhoods.For each vertex v in the digraph G we consider its neighbourhood N G (v) and the associated directed flag complex Tr G (v).We then compute a variety of numerical graph parameters of N G (v) and topological parameters of Tr G (v).These parameters are used to create a ranked list of vertices in G.We then select for each parameter 50 vertices that obtained the top (or bottom) values with respect to that parameter.We now have a set of 50 neighbourhoods corresponding to each parameter.A parameter that is used in this step is referred to as a selection parameter, and we denote it by P .A short summary of the main parameters we used with their abbreviations is in Table 1.A detailed description of the parameters is given in Methods.

Abbreviation
In the second step we introduce binary dynamics in G.Each instantiation of the dynamics consists of several consecutive time bins (in our experiments we used two, but there is no limitation).For each time bin we consider the neurons that were active and the subgraph that they induce in each of the neighbourhoods we preselected.This gives us, for each selection parameter and each time bin, a set of 50 subgraphs that correspond to a particular instantiation of binary dynamics on G.
The third step is vectorising the data, i.e., a computation of the same graph parameters and topological parameters for each of the subgraphs resulting from the second step.When we use our parameters in the vectorisation process they are referred to as feature parameters, and are denoted by Q.This now gives a vector corresponding to each instantiation of the dynamics, and the pair (P, Q) of selection and feature parameters.
The fourth and final step is injecting the data into a support vector machine.In this project we used 60% of the data for training and the remaining for testing.See Figure 3 for a schematic summary of the process.We note that the method described here is an example of a much more general methodology that is described in detail in the Methods section of this article.In particular, the graph and topological parameters that we chose to work with are selected from within the abundance of mathematical concepts that arise in graph theory, combinatorics and topology.We do not attempt in this article to associate a neuro-scientific meaning to these parameters.
2.3.The data.Our main source of data is a simulation that was run on a Blue Brain Project reconstruction of the microcircuitry of the somatosensory cortex in the brain of a rat [20].From this model we extract the connectivity of the microcircuit in the form of a digraph whose vertices correspond to neurons, and with an edge from v to u if there is a synaptic connection from the neuron corresponding to v to the one corresponding to u.We denote the Blue Brain Project digraph by G.The digraph consists of 31,346 vertices and 7,803,528 edges.The connectivity matrix of this specific circuit, as well as 41 other instantiations of the reconstruction, is accessible on the Digital Reconstruction of Neocortical Microcircuitry.
The binary dynamics we experimented with consists of eight stimuli families labelled 0-7.For each stimulus a random subset (10%) of afferent neurons is activated.The stimuli differ with respect to which subset of afferent neurons is activated, where afferents can be shared between stimuli.The probability of a given afferent being associated with two given stimuli is 1%.In each stimulation time window one and only one stimulus is presented.The stimuli were injected into the circuit in a random sequence of 200 milliseconds per stimulus, and 557 repeats for each stimulus label.The dataset thus consists of 4456 binary dynamics functions.The task is to determine the label of that stimulus, i.e. the expected output is an integer from 0 to 7. Thus, the chance level performance would be 12.5%.More detail on the source of data, biological analysis and an alternative approach to classification of the same data is in [19].
2.4.Setup.We computed all the graph parameters listed in Table 1, as well as additional parameters listed in Supplementary Material, for all neighbourhoods in the digraph (see Supplementary Material -Data and Code, for a brief description of computational methods and links to software).We fixed a positive integer M , and for each selection parameter P we selected the vertices v 1 , v 2 , . . ., v M , whose neighbourhoods N G (v 1 ), . . ., N G (v M ) obtained the top (or bottom) M values of the parameter P (see Step II) in Methods).We experimented with M = 20, 50, 100 and 200.Here we report on the results we obtained for M = 50, which provided the highest classification accuracy.For M = 20 performance was strong as well, but for M = 100 and 200 the improvement compared to M = 50 was relatively minor, and not worth the additional time and computation needed.
2.5.Vector summaries.Each binary dynamics in our dataset has time parameter t between 0 and 200 milliseconds.The subinterval [0, 60] is where almost all the spiking activity is concentrated across the interval.Furthermore, the bulk of the stimulus is injected in the first 10ms.Since we aimed to classify the response to the stimulus rather than the stimulus itself, we chose ∆ = [10,60] and divided that interval into two 25ms subintervals, as experimentation showed that these choices provide the highest classification accuracy (see Step I) in Methods).
We denote each instantiation of binary dynamics on G by B n , for n = 1, . . ., 4456.Each instantiation consists of two binary states B n 1 , B n 2 , corresponding to the neurons that fired in each of the 25ms subintervals.For each selection parameter P , and each of the corresponding neighbourhoods N G (v m ), m = 1, . . ., 50, we computed the subgraphs N m,k of N G (v m ) induced by the binary state B n k , that is, the subgraph induced by the neurons that fired in the given interval.This gave us, for each binary dynamics B n and each graph parameter P , a 2 × 50 matrix U P n of subgraphs of G, whose (m, k) entry is N n m,k .(see Step II) in Methods).Finally, for each graph parameter Q (from the same list of parameters) we applied Q to the entries of the matrix U P n to obtain a numerical feature matrix U P,Q n corresponding to the binary dynamics function B n , the selection parameter P , and the feature parameter Q.The matrix U P,Q n is a vector summary of the binary dynamics B n .(see Step III) in Methods).
2.6.Classification.For each pair of graph parameters (P, Q) the vector summaries {U P,Q n } were fed into a support vector machine (SVM) algorithm.Our classification pipeline was implemented in Python using the scikit-learn package and the SVC implementation therein.The SVC was initialised with default settings and we used a 60/40 train/test split.The kernel used was Radial Basis function.We used one-versus-one approach for multiclass classification.For cross-validation we used standard 5-fold cross-validation in scikit-learn.The results are presented in Figure 4.For each of the selection parameters we tested, we considered both the neighbourhoods that obtained the top 50 values and those that obtained the bottom 50 values.In all the experiments, four parameters gave markedly better performance when used as feature parameters than all other parameters: Euler characteristic (ec), normalised Betti coefficient (nbc), size and Bauer Laplacian spectral radius (blsr).All four perform significantly better than other feature parameters when the neighbourhoods were selected by bottom value parameters.With respect to top value selection parameters, ec and size, performed well, while nbc and blsr were significantly weaker as feature parameters, except when coupled with Chung Laplacian spectral gap (clsg).The neighbourhoods selected by top values of selection parameters gave best results when the selection parameter was one of the spectral graph invariants, while selecting by bottom value of selection parameters, the two types of clustering coefficients (cc and tcc) and Euler characteristic (ec) performed best.
Interestingly, the two best performing feature parameters, Euler characteristic and size, gave good results across all selection parameters, and performed almost equally well, regardless of whether the neighbourhoods were selected by top or bottom selection parameter value.This suggests that, at least in this particular network, the choice of feature parameter plays a much more important role in classification accuracy than any specific selection parameter.On the other hand, examining the rows of the best performing feature parameters, in Figure 4, we see a difference of up to 27% (top ec), 40% (top nbc) and 18% (top size) in classification accuracy, depending on which selection parameter is used, suggesting that, within a fixed choice of a feature parameter, the selection parameter may play an important role in the capability of the respective neighbourhoods to encode binary dynamics.Note that randomly classifying the 8 stimuli gives an accuracy of 12.5%.

2.7.
Validation.In order to validate our methods, we created five experiments, the results of which we then compared to a subset of the original tests.In each case we retrained the SVM algorithm and then retested.
A motivating idea in neuroscience in general, and in this work in particular, is that structure is strongly related to function.Our approach, using neighbourhoods sorted by graph parameters and using the same graph parameters as feature parameters is proposed in this article as a useful way of discovering combinations of parameters that achieve good classification results of binary dynamics.To test the validity of this proposal, we challenged our assumptions in five different ways, as described below.2.7.1.Random selection.In this simple control experiment we test the significance of the selection parameter by comparing the results to a random choice of 50 vertices and performing the same vector summary procedure on their neighbourhoods.Twenty iterations of this experiment were performed, and the results for each feature parameter were compared to the outcome for the same feature parameter and the selection parameter with respect to which this feature parameter performed best.The results are described in Figure 5.We observe that in almost all cases reported here a choice of neighbourhoods determined by a selection parameter outperforms a random choice (in some cases marginally).We also note that in all those cases the performance of a choice informed by one of these selection 9 parameters exhibits a more consistent behaviour in terms of classification accuracy.This can be seen from the considerably larger error bars in the case neighbourhoods are selected at random.On the other hand, for some feature parameters a random choice does not seem to be a disadvantage, even compared to the best selection parameter with respect to this feature parameter (Supplementary Figure 3).This suggests that while selection and generation of vector summary by objective parameters are advantageous, experimentation is generally necessary in order to decide which parameters best fit the classification task.
2.7.2. Neighbourhood vs. centre.A working hypothesis in this paper is that neighbourhoods carry more information about a binary dynamics than individual vertices.We examined for each selection of 50 neighbourhoods by a graph parameter, as described above, the classification capability of the centres of these neighbourhoods.Specifically, this experiment is identical to the original classification experiment, except for each selection parameter P the two rows of the corresponding feature matrix have binary values, where the j-th entry in row i is set to be 1 if the j-th neuron in the sorted list fired in the i-th time bin at least once and 0 otherwise.These feature vectors were then used in the classification task using the same train and test methodology.For each of the selection parameters we tested, we considered both the top 50 and the bottom 50 neurons in the corresponding sorted list.The results of this experiment were compared with the original experiments, and are shown in Figure 6.We note that in all cases a very significant drop in performance occurs.Interestingly, some vertices in the top 50 of a sorted list show classification accuracy that is far better than random, while the bottom 50 give performance comparable to random (for example, fcc).In some cases however, the bottom 50 vertices give better performance than the top 50.This suggests that the selection parameters play a role in classification accuracy even before considering the activity in the neighbourhood of a vertex.
We also note that for almost all top valued selection parameters recorded in Figure 6 and some of the bottom valued ones, the classification performance using the centre alone is significantly better than random.This observation reinforces the idea that selection parameters inform on the capability of neurons to inform on activity.For each selection parameter we considered the degrees of the 50 selected centres.For a centre v i of degree d i we then selected at random d i vertices in the ambient graph and considered the subgraph induced by those vertices and the centre v i .We used these 50 subgraphs in place of the original neighbourhoods.In this way we create for each centre a new subgraph with the same vertex count as the original neighbourhoods that is unrelated to the centres in any other controllable way.We extracted feature vectors using these subgraphs for each of the selection parameters and repeated the classification experiment.The results were compared to the original results with respect to the strongest performing feature parameter.Notice that these are always either ec or size, both of which can be applied to an arbitrary digraph, not necessarily a neighbourhood.The results of this experiment were compared with the original experiments, and are shown in Figure 7.There is a clear drop in performance for all selection parameters except fcc (Fagiolo's clustering coefficient; See Methods).Furthermore, classification using these subgraphs shows considerably larger error bars.This suggests that using neighbourhoods with our methodology is advantageous.One explanation for this may be the tighter correlation of activity among neurons in a neighbourhood, compared to an arbitrary subgraph of the same size in the network, but we did not attempt to verify this hypothesis.2.7.4. Fake neighbourhoods.In this experiment we considered for each centre its degree and selected at random the corresponding number of vertices from the ambient graph.We then modified the adjacency matrix of the ambient graph so that the centre is connected to each of the vertices selected in the appropriate direction, so as to preserve the centre's in-and outdegree.Computationally, this amounts to applying a random permutation to the row and the column of each of the centres.The result is a new ambient graph, where the old centres are now centres of new neighbourhoods.We extracted feature vectors using these "fake neighbourhoods" and repeated the classification experiment.The results were compared with the original classification.The outcome is illustrated in Figure 8.
We note that with respect to almost all selection parameters there is a significant drop in performance resulting from this modification.The one exception is fcc, where ec as a feature parameter actually sometimes gives slightly better results, but with a large error bar.It is interesting that the results are similar for some of the parameters to those observed in previous experiment (Figure 7), but quite different for others.However, the drop in performance is similar in both cases.We make no hypothesis attempting to explain these observations.2.7.5. Shuffled activity.In this experiment we applied a random permutation σ of the neuron indices in the Blue Brain Project microcircuit, so that neuron σ(i) now receives the spike train (sequence of spikes) of neuron i for each stimulus.That is, we precompose the binary dynamics with σ to get a new binary dynamics, which still appears in eight varieties, since the operation of permuting the neuron indices is bijective.In other words, we can reconstruct the original activity from the shuffled activity by applying the inverse permutation σ −1 .The same selection and feature parameters were used and the resulting data was used for training and testing.The results are shown in Figure 9.We observe again that there is a significant drop in performance resulting from this shuffling.This is quite surprising since the shuffled activity spike train should give eight families of stimuli that carry some sort of internal resemblance, and since we retrained and tested with these stimuli, one could expect that the classification results will be comparable to those of the original experiments.That not being the case suggests that structure and function in the Blue Brain Project reconstruction are indeed tightly related.

2.8.
Testing the method on an artificial neuronal network.To test our methods in a non-biological binary dynamics setting, we conducted a set of experiments with the NEST simulator [21].The NEST software simulates spiking neuronal network models and offers a vast simplification of neuronal networks that are based on the exact morphology of neurons (such as the Blue Brain Project reconstructions).It also provides great flexibility in the sense that it allows any connectivity graph to be implemented in it and any initial stimulation to be injected into the system with the response modulated by various flexible parameters.
To move as far as possible from a strict biological setup, we generated a number of Erdős-Rényi random digraphs on 1000 vertices, which we implemented on NEST.We then created 8 distinct stimuli, each enervating a random selection of 100 vertices of the graph.A random sequence of stimuli was then created, with each stimulus type repeated 500 times.Our experiment consisted of injecting the sequence of stimuli into the simulator, for a duration of 5ms, one every 200 ms, to reduce the influence of one stimulus on the next.To introduce some randomness, the start time of each stimulus is randomly selected from the first 10ms , the strength of each stimulus is multiplied by a random number between 1 and 2, and background noise is included (using NEST's noise generator device with strength 3).For each 200ms interval, the first 10ms were not included in the classification.As a result some of the input may be included in the classified data, but never more than 4 ms, and for approximately 60% of the 4000 stimuli the input is completely excluded from classification.The code used to create these experiments is available online, and the experiments are presented visually in Figure 10.
The spikes from this simulation were then extracted and were run through the same pipeline as the Blue Brain Project data.We experimented with graph densities of 0.08, 0.01 and 0.005, and with selections of 10, 20, and 50 neighbourhoods.Figure 11 shows the performance by the selection parameters from Table 1.Size was used in all cases as a feature parameter.The best performance was obtained with 50 neighbourhoods, with graph density of 0.01 in almost all selection parameters.The results of experiments with all parameters can be seen in Supplementary Figure 5.
Interestingly, the middle graph density of 0.01 consistently performed equally as well or better than both the denser 0.08 and less dense 0.005 across all feature parameters, except neighbourhood size (size) and adjacency spectral gap (asg).Another interesting observation is that the strongest selection parameter in this experiment turns out to be normalised Betti coefficient (nbc), or transitive clustering coefficient (tcc), depending on if "strongest" is taken to mean with the highest individual accuracy or with the highest average accuracy from cross-validation, respectively.Both of these selection parameters in the Blue Brain Project experiments exhibited rather mediocre performance (see Figure 4,left).This suggests that different networks and binary dynamics on them may require experimentation with a collection of selection (and feature) parameters, in order to optimise the classification accuracy.

Discussion
In this paper we examined the concept of a closed neighbourhood in relation to the classification of binary dynamics on a digraph.Regardless of what the source of the binary dynamics is, but with the assumption that it is given in a time series of labelled instantiations, we ask how can the dynamics be read off and classified.In the context of neuroscience, which is our primary motivation for this study, this is a question on the boundary between computational neuroscience and machine learning.Our methods provide a method of addressing this question.We proposed a methodology that will take as input binary dynamics on a digraph and produce a vector summary of the dynamics by means of combinatorial and/or topological parameters of a relatively small number of neighbourhoods.Using this methodology we experimented with a dataset implemented on the Blue Brain Project reconstruction of the neocortical column of a rat, and on an artificial neural network with random underlying graph implemented on the NEST simulator.In both cases the vector summaries were then run through a support vector machine algorithm that was able to achieve a classification accuracy of up to 88% for the Blue Brain Project data and up to 81% for the NEST data.
We used the same parameters both for selecting neighbourhoods and for the creation of feature vectors.We saw that certain spectral graph parameters used as selection parameters perform significantly better than more classical parameters such as degree and clustering coefficients.We also observed that the parameters that performed best as feature parameters were the simplest ones, namely size and Euler characteristic.Comparison to randomly selected neighbourhoods showed that the methodology works reasonably well even without selecting the neighbourhoods in an informed way, but that neighbourhoods selected in a way informed by graph parameters gives in general a better performance with a much smaller error range.
Our aim was to demonstrate that certain selections of subgraphs, informed by objective structural parameters, carry enough information to allow classification of noisy signals in a network of spiking neurons.In this paper the subgraphs selected are closed neighbourhoods, and the selection criteria are our chosen selection parameters.We did not however show, or attempted to demonstrate, that the use of neighbourhoods as a concept, or graph parameters as a selection mechanism are the best methodology.The same techniques could be applied to other subgraph selections and other vectorisation methods, which can be analysed by our pipeline with relatively small modifications.
Another aspect of our ideas that was not exploited at all in this project is the use of more than a single graph parameter in the selection procedure.We did show that different parameters are distributed differently in the Blue Brain Project graph, and hence one may hypothesise that optimising neighbourhood selection by two or more parameters may give improved classification accuracy.
As our aim was not to obtain the best classification, but rather to provide a good methodology for ingesting binary dynamics on a digraph and producing machine learning digestible data stream, we did not experiment with other more sophisticated machine learning algorithms.It is conceivable that doing so may produce even better classification accuracy than what is achieved here.
Finally, our approach is closely related to graph neural networks where convolution is performed by aggregating information from neighbourhoods, i.e. for every vertex, features are learned from all the adjacent vertices.The pipeline presented in this paper also takes as input sequences of neural firings and sequences of neuron assemblies which turn the firing patterns into feature values.The interaction of our work and the modelling perspectives from graph neural networks and sequence-to-sequence learning might thus pose an interesting future research question.

4.1.
Mathematical Concepts and Definitions.We introduce the basic concepts and notation that are used throughout this article.By a digraph we mean a finite, directed simple graph, that is, where reciprocal edges between a pair of vertices are allowed, but multiple edges in the same orientation between a fixed pair of vertices and self-loops are not allowed.
Topology is the study of topological spaces -a vast generalisation of geometric objects.In this paper we only consider spaces that are built out of simplices.Simplices occur in any dimension n ≥ 0, where a 0-simplex is a point, a 1-simplex is a line segment, a 2-simplex is a triangle, a 3-simplex a tetrahedron and so forth in higher dimensions.Simplices can be glued together to form a topological space.A good survey for this material intended primarily for readers with a neuroscience background can be found in the Materials and Methods section of [15].
We now describe a general setup that associates a family of topological objects with a digraph.A particular case of this setup is the main object of study in this paper.Definition 4.1.A topological operator on digraphs is an algorithm that associates with a digraph G a topological space Γ(G), such that if H ⊆ G is a subgraph then Γ(H) ⊆ Γ(G) as a closed subspace.
That is, a topological operator on digraphs is a functor from the category of digraphs and digraph inclusions to the category of topological spaces and inclusions.The flag complex of G (ignoring orientation), the directed flag complex [22], and the flag tournaplex [14] are examples of such operators.Definition 4.2.Let G = (V, E) be a digraph, and let v 0 ∈ V be any vertex.
• The neighbours of v 0 in G are all vertices v 0 = v ∈ V that are incident to v 0 .
• The open neighbourhood of v 0 is the subgraph of G induced by the neighbours of v 0 in G.The closed neighbourhood of v 0 in G is the subgraph induced by the neighbours of v 0 and v 0 itself.We denote the open and closed neighbourhoods of v 0 in G by N • G (v 0 ) and N G (v 0 ) respectively.More generally: • Let S ⊆ V be a subset of vertices.Then N • G (S) denotes the union of open neighbourhoods of all v ∈ S. Similarly N G (S) is the union of all closed neighbourhoods of vertices v ∈ S.
Notice that if S = {v 0 , v 1 }, and v 0 and v 1 are incident in G, then N • G (S) = N G (S).In this paper we will mostly consider closed neighbourhoods.Neighbourhoods are also used in the paper [19], which is closely related to this article.Terminology 4.3.Let G be a digraph and let S be a subset of vertices in G. Unless explicitly stated otherwise, we shall from now on refer to the closed neighbourhood of S in G simply as the neighbourhood of S in G.In the case where S contains a single vertex v 0 , we will refer to v 0 as the centre of N G (v 0 ).
The topological operator we consider in this article is the directed flag complex of a digraph which we recall next.See Figure 2 for an example.Definition 4.4.A directed n-clique is a digraph, whose underlying undirected graph is an n-clique, and such that the orientation of its edges determines a linear order on its vertices.An ordered simplicial complex is a collection X of finite ordered sets that is closed under subsets.The n-simplices of an ordered simplicial complex X are the sets of cardinality n + 1.If G is a digraph, then the directed flag complex associated to G is the ordered simplicial complex whose n-simplices are the directed (n + 1)-cliques in G.We denote the directed flag complex of a digraph G by |G|.

4.2.
Encoding binary dynamics on neighbourhoods.We now describe our approach to classification of binary dynamics on a graph in general terms.Definition 4.5.Let G = (V, E) be a graph (directed or undirected).A binary state on G is a function β : V → {0, 1}.Equivalently, a binary state on G is a partition of V into two disjoint subsets that correspond to β −1 (0) and β −1 (1), or alternatively as a choice of an element of the power set P(V ) of V .A binary dynamics on G is a function B : R ≥0 → P(V ) that satisfies the following condition: • There is a partition of R ≥0 into finitely many half open intervals {[a i , b i )} P i=1 for some P ≥ 1, such that B is constant on [a i , b i ), for all i = 1, . . ., P .
Activity in a network of neurons, both natural and artificial, is a canonical example of a binary dynamics on a directed network.

4.2.1.
Setup.The task we address in this section is a general classification methodology for binary dynamics functions.Namely, suppose one is given • a set of binary dynamics functions In addition, we operate under the assumption that functions labeled by the same label are variants of the same event (without specifying what the event is, or in what way its variants are similar).The aim is to produce a topological summary for each B i in a way that will make the outcome applicable to standard machine learning algorithms.We next describe our proposed mechanism.4.2.2.Creation of vector summary.Fix a graph G and a real-valued graph parameter Q, that is, a real-valued function taking digraphs as input and whose values are invariant under graph isomorphisms.Suppose that a set of labeled binary dynamics functions {B n } N n=1 on G is given.Select an M -tuple (H 1 , H 2 , . . ., H M ) of subgraphs of G, for some fixed positive integer M .
Fix a time interval and divide it into time bins.In each bin, record the vertex set that showed the value 1, that is, was active at some point during that time bin.For each 1 ≤ m ≤ M , restrict that set to H m and record the subgraph induced by the active vertices.Apply Q to obtain a numerical M -tuple, and concatenate the vectors into a long vector, which encodes all time bins corresponding to the given dynamics.
We now describe the procedure more accurately in three steps.I) Interval partition uniformising.Fix an interval I = [a, b] ⊂ R ≥0 and a positive integer II) Subgraph extraction.For 1 ≤ n ≤ N and each 1 ≤ m ≤ M , let β n m,k denote the binary state on H m defined by Let H n m,k ⊆ H m be the subgraph induced by all vertices in the set β n m,k .We refer to H n m,k as the active subgraph of H m with respect to the binary dynamics function B n .III) Numerical featurisation.For each 1 ≤ n ≤ N , let q n m,k denote the value of Q applied to H n m,k .Let F n denote the M × K matrix corresponding to the binary dynamics function B n , that is (F n ) m,k = q n m,k .For use in standard machine learning technology such as support vector machines, we turn the output of the procedure into a single vector by column concatenation.The output of this procedure is what we refer to as a vector summary of the collection {B n } N n=1 (Figure 3).It allows great flexibility as its outcome depends on a number of important choices: • the ambient graph G, • the selection procedure of subgraphs, • the interval I and the binning factor K, and • the graph parameter Q.
All these choices may be critical to the task of classifying binary dynamics functions, as our use case shows, and have to be determined by experimentation with the data.

4.3.
Selection and feature parameters.In this section we describe the graph parameters used in this article.Some of these parameters are well known in the literature.All of them are invariant under digraph isomorphism.The parameters presented in this section are the primary parameters used for both selection and generation of vector summaries.We chose these particular parameters either because of their prevalence in the literature, or for their strong performance as either selection or feature parameters in classification tasks.Other parameters we examined are mentioned in Supplementary Materials.Throughout this section, we let G = (V, E) denote a locally finite digraph (that is, such that every vertex is of finite degree).For k ≥ 1 and v 0 ∈ V , we let S k (v 0 ) denote the number of directed (k + 1)-cliques that contain v 0 .In particular S 1 (v 0 ) = deg(v 0 ).4.3.1.Clustering coefficients.In [23] Watts and Strogatz introduced an invariant for undirected graphs they called clustering coefficient.For each vertex v 0 in the graph G, one considers the quotient of the number t v 0 of triangles in G that contain v 0 as a vertex by the number deg(v 0 ) 2 of triangles in the complete graph on v 0 and its neighbourhood in G.The clustering coefficient of G is then defined as the average across all v 0 ∈ G of that number.Clustering coefficients are used in applied graph theory as measures of segregation [10].

4.3.2.
Clustering coefficient for digraphs.The Watts-Strogatz clustering coefficient was generalised by Fagiolo [24] to the case of directed graphs.Fagiolo considers for a vertex v 0 every possible 3-clique that contains v 0 , and then identifies pairs of them according to the role played by v 0 , as a source, a sink, or an intermediate vertex (see Figure 13, (A), (B) and (C)).Fagiolo also considers cyclical triangles at v 0 and identifies the two possible cases of such triangles (see Figure 13, (D)).The Fagiolo clustering coefficient at v 0 is thus the quotient of the number of equivalence classes of directed triangles at v 0 , denoted by t v 0 , by the number of such classes in the complete graph on v 0 and all its neighbours in G. Thus, if v 0 is the i-th vertex in G with respect to some fixed ordering on the vertices, and A = (a i,j ) is the adjacency matrix for G, then and the clustering coefficient at v 0 is defined by

Transitive clustering coefficient.
A directed 3-clique is also known in the literature as a transitive 3-tournament.Our variation on the clustering coefficient, the transitive clustering coefficient of a vertex v 0 in a digraph G, is the quotient of the number of directed 3-cliques in G that contain v 0 as a vertex by the number of theoretically possible such 3-cliques.Let ind(v 0 ) and oud(v 0 ) denote the in-degree and out-degree of v 0 .Let I v 0 , O v 0 and R v 0 denote the number of in-neighbours (that are not out-neighbours), out-neighbours (that are not in-neighbours) and reciprocal neighbours of v 0 , respectively.Notice that (1) ind We introduce our variation on Fagiolo's clustering coefficient.
Definition 4.6.Define the transitive clustering coefficient at v 0 by .
A justification for the denominator in the definition is needed and is the content of the Lemma 1 in Supplementary Materials.
Let A = (a i,j ) denote the adjacency matrix for G with respect to some fixed ordering on its vertices.Then for each vertex v 0 ∈ G that is the i-th vertex in this ordering, S 2 (v 0 ) can 20 be computed by the formula 4.3.4.Euler characteristic and normalised Betti coefficient.The Betti numbers of the various topological constructions one can associate to a digraph have been shown in many works to give information about structure and function in a graph.A particular example, using Blue Brain Project data is [15].4.3.5.Euler characteristic.The Euler characteristic of a complex is possibly the oldest and most useful topological parameter, and has been proven to be useful to theory and applications.In the setup of a directed flag complex (or any finite semi-simplicial set) the Euler characteristic is given as the alternating sum of simplex counts across all dimensions: where |X n | is the number of n-simplices in X.Alternatively, the Euler characteristic can be defined using the homology of X by where F is any field of coefficients.The Euler characteristic is a homotopy invariant, and can take positive or negative values according to the dominance of odd-or even-dimensional cells in the complex in question.
4.3.6.Normalised Betti coefficient.The normalised Betti coefficient is based on a similar idea to the Euler characteristic.It is invariant under graph isomorphism, but is not a homotopy invariant.Also, unlike the Euler characteristic, it is not independent of the chosen field of coefficients.We view the normalised Betti coefficient as a measure of how "efficient" a digraph is in generating homology, without reference to any particular dimension, but with giving increasing weight to higher dimensional Betti numbers.
Let G be a digraph, and for each k ≥ 0, let s k (G) denote the number of k-simplices in the directed flag complex |G|.Fix some field F. By the Betti number β i of G we mean the dimension of the homology vector space H i (|G|, F).
Normalised Betti coefficients can be defined by any linear combination of Betti numbers, and also in a much more general context (simplicial posets), which we did not explore.Both the Euler characteristic and the normalised Betti coefficients are invariants of digraphs, and to use them as vertex functions we consider their value on the neighbourhood of a vertex.

Size (vertex count).
The size of a digraph can be interpreted in a number of ways.One standard way to do so is for a fixed simplicial object associated to a digraph, one counts the number of simplices in each dimension.This will typically produce a vector of positive integers, the (euclidean) size of which one can consider as the size of the digraph.Alternatively, the simplex count in any dimension can also be considered as a measure of size.In this article we interpret size as the number of vertices in the digraph.Thus by size of a vertex v 0 ∈ G we mean the vertex count in N G (v 0 ).When working with binary states on a digraph, neighbourhood size means the number of vertices that obtain the value 1 in N G (v 0 ).4.3.8.Spectral invariants.The spectrum of a (real valued) square matrix or a linear operator A is the collection of its eigenvalues.Spectral graph theory is the study of spectra of matrices associated to graphs.It is a well developed part of combinatorial graph theory and one that finds many applications in network theory, computer science, chemistry and many other subjects (See a collection of web links on Applications of Spectral Graph Theory).The various versions of the Laplacian matrix associated to a graph plays a particularly important role.An interesting work relating neuroscience and the Laplacian spectrum is [18].
The spectral gap is generally defined as the difference between the two largest moduli of eigenvalues of A. In some situations, for instance in the case of the Laplacian matrix, the spectral gap is defined to be the smallest modulus of nonzero eigenvalues.Given a matrix and its spectrum, either number can be computed.As a standard in this article spectral gaps are considered as the first type described above, except for the Chung Laplacian spectrum, where the spectral gap is defined to be the value of the minimal nonzero eigenvalue.However, in several cases we considered both options.To emphasise which option is taken we decorated the parameter codes from Table 1 with a subscript "high" (referring to the difference between the two largest moduli) or "low" (referring to the smallest modulus of a nonzero eigenvalue).For example, Figures 7, 8, 9 have bls low as a parameter, indicating the lowest nonzero value in the Bauer Laplacian spectrum (that is, the minimal nonzero eigenvalue of the Bauer Laplacian matrix).Another variant of the standard concepts of spectra is what we call the reversed spectral gap (Definitions 4.8 and 4.10).
Yet another common invariant we considered is the spectral radius which is the largest eigenvalue modulus of the matrix in question.We consider here four matrices associated to digraphs: the adjacency matrix, the transition probability matrix, the Chung Laplacian and the Bauer Laplacian, with details to follow.4.3.9.The adjacency and transition probability matrices.Let G = (V, E) be a weighted directed graph with weights w u,v on the edge (u, v) in G, where w u,v = 0 if (u, v) is not an edge in G. Let W G = (w u,v ) denote the weighted adjacency matrix of G. Let oud(u) denote the out-degree of a vertex u.The transition probability matrix for G is defined, up to an ordering of the vertex set V , to be the matrix P G , with where D −1 out (G) is the diagonal matrix with the reciprocal out-degree 1/out(u) as the (u, u) entry, if out(u) = 0, else the (u, u) entry is 0. Definition 4.8.Let G be a digraph with adjacency matrix A G and transition probability matrix P G .The adjacency spectral gap and the transition probability spectral gap of G are defined in each case to be the difference between the two largest moduli of its eigenvalues.
If we replace in the definition of P G the matrix D out (G) by D in (G) of in-degrees, we obtain a variant of the transition probability matrix, which we denote by P rev G , and its spectral gap is referred to as the reversed transition probability spectral gap.
For our specific application we considered the ordinary (as opposed to weighted) adjacency matrix, namely where all weights w u,v are binary.We considered as parameters the spectral radius of the adjacency and transition probability matrices.4.3.10.The Chung Laplacian.Chung defined the directed Laplacian for a weighted directed graph in [17].The Perron-Frobenius theorem [25] states that any real valued irreducible square matrix M with non-negative entries admits a unique eigenvector, all of whose entries are positive.The eigenvalue for this eigenvector is routinely denoted by ρ, and it is an upper bound for any other eigenvalue of M .
If G is strongly connected (that is, when there is a directed path between any two vertices in G), then its transition probability matrix is irreducible, and hence satisfies the conditions of the Perron-Frobenius theorem.Thus P G has an eigenvector, all of whose entries are positive.The Perron vector is such an eigenvector φ that is normalised in the sense that v∈V φ(v) = 1.Let Φ denote the diagonal matrix with the v-th diagonal entry given by φ(v), and let P denote the transition probability matrix P G .Definition 4.9.Let G be a strongly connected digraph.The Chung Laplacian matrix for G is defined by

,
where P * denotes the Hermitian transpose of a matrix P .The Chung Laplacian spectral gap λ for a digraph G is defined to be the smallest nonzero eigenvalue of the Laplacian matrix.23 The Chung Laplacian spectral gap λ of a strongly connected digraph G is related to the spectrum of its transition probability matrix P by [17,Theorem 4.3], which states that the inequalities (5) min hold, where the minima are taken over all eigenvalues of P .The theory in [17] applies for strongly connected graphs and we therefore defined the Laplacian spectral gap of a neighbourhood to be that of its largest strongly connected component.We use the spectral gap of the Chung Laplacian for the largest strongly connected component of a neighbourhood as a selection parameter.When used as a feature parameter we consider the spectral gap of the largest strongly connected component of the active subgraph of the neighbourhood.We also use the spectral radius of the Chung Laplacian, both as selection and feature parameter.4.3.11.The Bauer Laplacian.The requirement that G is strongly connected is a nontrivial restriction, but it is required in order to guarantee that the eigenvalues are real.An alternative definition of a Laplacian matrix for directed graphs that does not require strong connectivity was introduced in [26].Let C(V ) denote the vector space of complex valued functions on V .The Bauer Laplacian for G is the transformation ∆ If ind(v) = 0 for all v ∈ V , then ∆ G corresponds to the matrix in (G) is defined analogously to D −1 out (G) in Definition 4.8, and W G is the weighted adjacency matrix.In our case W is again taken to be the ordinary binary adjacency matrix.
Definition 4.10.The Bauer Laplacian spectral gap is the difference between the two largest moduli of eigenvalues in the spectrum.

Supplementary Material
Lemma 5.1.Let G be a digraph and let v 0 ∈ G be a vertex.Then the number of possible directed 3-cliques containing v 0 is given by (7) deg Proof.The set of in-neighbours of v 0 give rise to 2 Iv 0 2 = I v 0 (I v 0 − 1) directed 3-cliques containing v 0 .Similarly the out-neighbours of v 0 give rise to O v 0 (O v 0 − 1) directed 3-cliques containing v 0 .A choice of each gives an extra I v 0 O v 0 directed 3-cliques.Next, each reciprocal neighbour together with either an in-neighbour or an out-neighbour gives rise to three directed 3-cliques at v 0 .The total number of those is 3R v 0 (I v 0 + O v 0 ).Finally, pairs of reciprocal neighbours give rise to six directed 3-cliques at v 0 , and the total number of those is 6 Rv 0 2 = 3R v 0 (R v 0 − 1).Let P(v 0 ) denote the total number of transitive 3-tournaments that can be formed by v 0 and its neighbours.Summing up we have 5.1.Size, distribution and structure of neighbourhoods in a sample digraph.We compare neighbourhoods in a sample digraph sorted by the parameters listed in Table 1 in terms of some structural features.The digraph G we use is the connectivity graph of the Blue Brain Project reconstruction of the cortical microcircuitry in a young rat brain [20].The data we used is available at [27].Our classification experiments are done on the same microcircuit.We also applied the same measurements to other collections of digraphs and obtained different results.Since our aim is primarily to examine possible relationship between structure and function, we do not report those results here.These extended results are presented at Aberdeen Neurotopology Group webpage.
We considered the top 50 vertices in the graph sorted by the parameters listed in Table 1.For each parameter we computed the size in terms of number of vertices in each neighbourhood and the pairwise intersections, again in terms of the number of vertices in each intersection.In Table 14 we report the minimum, maximum and average of these numbers among the 50 neighbourhoods with highest value for each parameter.We also computed the first six Betti numbers of each neighbourhood and report the average of these numbers for each parameter.Finally, we considered the union of neighbourhoods in decreasing order, sorted by each parameter, and computed the number of centres required for their neighbourhoods to cover 90% of the neurons in entire microcircuit (that is, 28,310 neurons).
We notice that the top 50 centres with respect to the last six parameters listed in Table 14 tend to generate neighbourhoods of size close or below the average, with relatively very small intersection.This correlates well with their capacity as selection parameters in our experiments (see Figure 4).However, the two types of clustering coefficients, fcc and tcc, also generate small top neighbourhoods with small intersection, but are not exceptional as selection parameters.
We also examined the distribution of values for each parameter across the entire graph.The outcome is given in Figure 15, which visually justifies considering neighbourhoods with both highest and lowest parameter values.We did not find a correlation between the distribution of parameter values and their performance as selection or feature parameters.
We are therefore led to the conclusion that the performance of graph parameters as selection and/or feature parameters cannot be explained by the structural features we examined.This compares well with the conclusion drawn in [19], in which similar experiments using the same dataset but with a different methodology yield results that cannot be explained by structural features such as size and mutual intersection.The coverage capability of neighbourhoods sorted by various graph and topological parameters is related to another graph theoretic concept.Let G be a digraph.If S is the entire vertex set of G, then N G (S) = G, but the converse is not true, as S may be much smaller than the full vertex set and still satisfy this condition.Subsets of vertices whose neighbourhoods are the entire graph are well studied in graph theory [28,Section 12.4].Definition 5.2.Let G be a finite digraph with vertex set V .A subset S ⊆ V is a dominating set if N G (S) = G.The minimum cardinality of a dominating set for G is called the domination number and is denoted by γ(G).A dominating set of cardinality γ(G) is said to be a minimum dominating set.Computing a minimal dominating set is known to be an NP hard problem [29], though there exist good approximation algorithms.A good summary of the problem and common approaches appears in [30].In Table 14 we present, among other computations, the size of neighbourhoods and the number of neighbourhood from a sorted list that it takes to cover 90% of the Blue Brain Project microcircuit.Depending on the selection parameter used, the results are quite different.This suggests that a choice of neighbourhoods informed by certain vertex parameters may give ways of producing more efficient approximation algorithms for the domination number of graphs.5.2.Further Graph parameters.We describe here further graph and topological parameters we examined.5.2.1.Degrees.For each vertex v in a graph G, its (total) degree deg(v) is the number of vertices in the open neighbourhood of v.The in-and out-degree of v, denoted ind(v) and oud(v) respectively, mean the number of in-and out-neighbours of v respectively.These 29 invariants were examined as graph parameters in our classification algorithm and were found inefficient, except in the case of size, which is very closely related to degree and turns out to be the strongest feature parameter we found.
The factor k/(k+1)(n−k) normalises the invariant, so that D k (v 0 ) = 1 for every 1 < k < n if v 0 is a vertex in G that is a complete digraph on n vertices.This is explained in the next lemma.Proof.We prove the statement by a double counting argument closely following the one given in [31,Section 10.4].Let U be the set of all pairs (τ, σ) where σ is a directed (k + 1)-clique containing v 0 and τ ⊆ σ is a directed k-clique containing v 0 .Then one can count the number of elements of U in two ways.First, the number of k-sub-cliques τ of a fixed (k + 1)-clique σ containing v 0 is exactly k, therefore On the other hand, a fixed k-clique τ is a subclique of at most (n − k)(k + 1) distinct (k + 1)cliques σ, because there are (n − k) different choices for a vertex that together with τ will form a k + 1 clique, and once a vertex was chosen there are (k + 1) distinct orientations on the extra k edges, so that the outcome is a directed (k + 1)-clique.Therefore, Topological operators on digraphs respect inclusions, by definition, and therefore transform a digraph that is filtered by subgraphs into a space that is filtered by closed subspaces.).The subspace F ω n (Γ(G)) will be referred to as the n-th ω-filtration layer of Γ(G).From a data analysis point of view filtering Γ(G), as proposed in Definition 5.6, can be applied in several ways.In particular, persistent homology [32] can be used to extract information from the topology in a way that is sensitive to the ordering chosen.As the orderings can be induced from various vertex functions, the filtrations enable probing into the effect these vertex functions have on the subspace topology.In other words, such filtrations give ways of building Γ(G) as an increasing union of subspaces, and different choices of orderings may result in totally different sequences of subspaces.In this article we used graph and topological parameters to determine the ordering on vertices.We also considered only the top (or bottom) of the ordered lists of vertices, and hence studied only the bottom layers of the resulting filtrations.5.4.Data and code.The data used is available at https://doi.org/10.5281/zenodo.4290212.The entire analysis code can be obtained from https://github.com/JasonPSmith/TriDy.The code for the NEST experiments is available at https://github.com/jlazovskis/neurotop-nest/.The computations for this paper were done using the Maxwell HPC cluster at the University of Aberdeen.To ensure the calculations were computed in a reasonable time frame we used a combination of parallelisation and publicly available packages with efficient algorithms.In particular, the structural parameters of each neighbourhood can be computed independently, so were done simultaneously across multiple nodes and cores.To compute many of the parameters standard python packages were sufficient, such as numpy, scipy and networkx.However, for the more computationally intensive topological parameters we used variations of the Flagser software [22].
Figure2.A digraph (left), the associated directed flag complex as a topological space (centre), and its maximal cliques of (right).

Figure 3 .
Figure 3.A schematic description of the vector summary and classification pipeline.

Figure 4 .
Figure 4. Results of 8 stimuli classification experiments.Range of cross-validated accuracy is indicated by four smaller squares in each square.Left: Classification accuracy selecting the 50 neighbourhoods with highest parameter value.Right: Classification accuracy selecting the 50 neighbourhoods with lowest parameter value.Compare with Supplementary Figure 3.

Figure 5 .
Figure 5.The classification performance based on the neighbourhoods of 50 randomly selected vertices (blue), compared to the performance of neighbourhoods selected by graph parameters with respect to a selection of feature parameters (red).Errors bars indicate range over 20 iterations.Labels on the red error bars indicate the selection parameter that performed best with respect to the indicated feature parameter.Compare with Supplementary Figure 2.

Figure 6 .
Figure 6.Classification results by binary vectors using only the centres of each of the top and bottom 50 neighbourhoods for each parameter.For comparison, the performance for each selection parameter classified by the highest performing feature parameter is included.

Figure 7 .
Figure 7. Classification by subgraphs of the same vertex count as the neighbourhoods selected by the specified selection parameters.The results of classification by the highest performing feature parameters are above each of the columns.

Figure 8 .
Figure 8. Classification by "fake neighbourhoods": Original classification with respect to best performing feature parameter is given for comparison.

Figure 9 .
Figure 9. Classification of shuffled binary dynamics functions and comparison to the top results for the original dynamics.

Figure 10 .
Figure 10.Eight types of input stimuli for Erdős-Rényi random digraphs, executed as a single 800 second experiment.Top row: Sequence of stimuli types, 500 of each, and relative strength of input for each stimulus.Second row: Spiking neurons on a 1000 ms interval from the experiment.Bottom left: Spiking neurons and length of external input on a 18 ms interval.Third row right: Random selections of 100 vertices from 1000 vertices, acting as receptors of external input.Bottom row right: Distribution of randomly selected relative strength and input stimulus time offset over the whole experiment.

Figure 11 .
Figure 11.Classification of eight random signals on an Erdős-Rényi random digraph on 1000 vertices and connection probabilities of 8%, 1% and 0.5% and selection of 10, 20, and 50 neighbourhoods, modelled on a NEST simulator.Selection parameters are the same as in the main example and feature parameter is always size.Graph G means the BBP graph and its performance with respect to size as feature parameter is given for comparison.Compare with Supplementary Figure 5.

Figure 12 .
Figure 12.An open neighbourhood (left) and a closed neighbourhood (right) in a digraph, marked in red, with its central vertex marked solid colour.

Figure 13 .
Figure 13.Eight possible directed triangles on the same three vertices.The pairs correspond to the identifications made by Fagiolo, with changes denoted by dotted edges.In the definition of the transitive clustering coefficient, the triangles in (A), (B) and (C) are counted individually, and those in (D) are ignored.

Definition 4 . 7 .
Let G be a locally finite digraph.Define the normalised Betti coefficient of G to be

Figure 15 .
Figure 15.Distribution of parameter values across the entire Blue Brain Project microcircuit.The numbers on the right are minimum to maximum values.The values on the x-axis are the relative parameter values, rescaled from 0 to 1. Compare with Figure 18

Lemma 5 . 4 .
For each pair of natural numbers 0 < k < n, any digraph G on n vertices, and any vertex v 0 in it, S k (v 0 ) S k−1 (v 0 ) ≤ (k + 1)(n − k) k with equality obtained if and only if G is a complete digraph on n vertices.
|U | ≤ (n − k)(k + 1)S k−1 (v 0 ).Comparing the two expressions, we have:kS k (v 0 ) ≤ (n − k)(k + 1)S k−1 (v 0 ),which, upon reordering gives the claimed upper bound.Computing the ratio for a complete digraph on n vertices shows that this upper bound is sharp.We remark that, while we use the density coefficients as vertex parameters, one can define a global density coefficient on a digraph G with vertex set V byD k (G) for any 2 ≤ k ≤ |V | − 1, D k (G) = 1 if and only if G is a complete digraph on V .Since any digraph on V is a subgraph of the complete digraph on V , D k (G) provides a set of numerical invariants for digraphs, parameterised by dimension (size of clique), which measure a notion of size of the digraph in comparison to the complete digraph on the same vertex set.In our specific application, density coefficients did not prove efficient as selection or feature parameters.5.3.Digraph filtrations.Definition 5.5.Let G = (V, E) be a digraph, and let Γ be a topological operator on digraphs.For a vertex v ∈ V , let Γ G (v) denote Γ(N G (v)).If S ⊆ V is any subset, let Γ G (S) def = Γ(N G (S)) = v∈S Γ G (v).

Figure 16 .
Figure 16.Classification accuracy of signals on the Blue Brain Project microcircuit using 50 randomly selected neighbourhoods, compared with accuracy of neighbourhoods selected by parameters with respect to the same feature parameter.Compare with Figure 5.

Figure 17 .
Figure 17.Results of classification experiments with respect to all parameters.Left: Classification accuracy with respect to 50 top value vertices by selection parameter.Right: Classification accuracy with respect to 50 bottom value vertices by selection parameter.Compare with Figure 4.

Figure 18 .
Figure 18.Distribution of all parameter values across the entire Blue Brain Project microcircuit.The numbers on the right are minimum to maximum values.The values on the x-axis are the relative parameter values, rescaled from 0 to 1. Compare with Figure 15.

Figure 19 .
Figure 19.Classification of eight random signals on an Erdős-Rényi random digraph on 1000 vertices and connection probabilities of 8%, 1% and 0.5% and selection of 10, 20, and 50 neighbourhoods, modelled on a NEST simulator.Selection parameters from Figure 11 are included, along with additional parameters.Feature parameter is always size.Graph G means the Blue Brain Project graph and its performance with respect to size as feature parameter is given for comparison.

Table 1 .
A partial list of the selection and feature parameters examined in this project.See Supplementary Material for additional parameters.
Figure 14.Size, pairwise intersections, average Betti numbers for the top 50 neighbourhoods of each parameter, and 90% coverage of the graph by neighbourhoods of highest valued centres, by each parameter.The last row is the same among all vertices, with the last entry on the right giving the average number required for 90% coverage over 50 random permutations.
5.2.2.Reciprocal degree.By the reciprocal degree of a vertex v we mean the number of neighbours that are both in-neighbours and out-neighbours.We used reciprocal degree in this work in two ways.The sum of all reciprocal degrees in a neighbourhood (abbreviated rc), and the reciprocal degree of the centre (rc-centre).5.2.3.Density coefficients.Every (k + 1)-clique contains k + 1 k-cliques.But no number of k-cliques in a graph is guaranteed to form any (k + 1)-cliques.The density coefficient is a ratio of the number of (k + 1)-cliques by that of k-cliques, normalised in its ambient graph.Definition 5.3.Let G be a digraph with n vertices.For k ≥ 2 define the k-th density coefficient of G at v 0 by the formula