Abstract
We present Monte Carlo Physarum Machine (MCPM): a computational model suitable for reconstructing continuous transport networks from sparse 2D and 3D data. MCPM is a probabilistic generalization of Jones’s (2010) agent-based model for simulating the growth of Physarum polycephalum (slime mold). We compare MCPM to Jones’s work on theoretical grounds, and describe a task-specific variant designed for reconstructing the large-scale distribution of gas and dark matter in the Universe known as the cosmic web. To analyze the new model, we first explore MCPM’s self-patterning behavior, showing a wide range of continuous network-like morphologies—called polyphorms—that the model produces from geometrically intuitive parameters. Applying MCPM to both simulated and observational cosmological data sets, we then evaluate its ability to produce consistent 3D density maps of the cosmic web. Finally, we examine other possible tasks where MCPM could be useful, along with several examples of fitting to domain-specific data as proofs of concept.
1 Introduction
To understand a process, one must become it: The intelligence that evolved in nature therefore adapted itself to design pressures presented by various natural processes. To take an example that is central to this article, the protist Physarum polycephalum (slime mold) has been successfully used as an unconventional analog computer (Adamatzky, 2010, 2016). As a “thinking machine” and a creative pattern finder (Adamatzky et al., 2013), Physarum can solve hard spatial problems via its physical growth—most prominently, finding optimal paths and transport networks connecting sets of data points where an underlying network structure is expected. The disadvantages, however, are significant: slow growth of the actual organism, difficulty of feeding it large inputs (e.g., more than a few dozen data points), and limited dimensionality. For these reasons, it is useful to instead use a virtual, simulated counterpart—even if not as complex as its biological template—as for many problems a simulation that qualitatively resembles Physarum’s spatial behavior can be expressive enough (Jones, 2010, 2015b).
The universe of problems involving interconnected network-like structures ever expands in the Information Age—mostly due to communication and transportation networks (including road and utility networks, printed circuits, socioeconomic networks, and the Internet). The importance of epidemiological network modeling has never been as apparent as in 2020/2021. And, in parallel to human-made networks, we continue to discover new natural phenomena with significant interconnected patterns: from molecular structures, through neuronal networks, circulatory/capillary systems, fungal and rhizomatic networks, all the way up to the network of dark matter filaments and intergalactic gas known as the cosmic web. All these phenomena share an important feature: In one way or another they are built on the notion of optimal transport. Our initial concern in this line of work has been with the last case: reconstructing the filamentary structures of the cosmic matter distribution in a consistent and verifiable way.
To this end, we present Monte Carlo Physarum Machine (MCPM): a simulated Physarum machine that algorithmically mimics the organism’s growth. Relying on Physarum’s ability to closely approximate optimal transport networks, we feed sparse point data to its virtual counterpart as 3D attractors. MCPM’s agents navigate these data to find the most probable paths between them; the superimposed aggregate of their trajectories then forms the candidate network structure. We capture the totality of these trajectories as a density field in 3D space, giving rise to continuous network structures which we will refer to as polyphorms. To demonstrate the practicality of the model, we apply MCPM to galaxy and dark matter halo data sets, interpreting the resulting continuous networks as proxies for the knots and filaments of the cosmic web.
The first contribution of this article is the full description of the MCPM model (section 4). MCPM is a generalization of the seminal method described by Jones (2010), from which we draw much inspiration (section 5; Figure 1). We expand Jones’s work in several ways:
All decisions in MCPM are stochastic. This makes the model significantly more customizable, as prior domain-specific knowledge can be encoded in probability density functions sampled by the model’s agents.
We extend Jones’s model from 2D to 3D, building on the stochastic formulation. We achieve this by redesigning the sampling stencil to work with only a binary sensing decision, rather than performing dense directional sampling as in Jones’s proposed 3D extension (Jones, 2015b). Consequently, this choice also opens up the future possibility for higher-dimensional application of the method.
We add a new data modality decoupled from the algorithm’s signaling mechanism, to reconstruct the agent distribution. This modality, called a trace, records the collective spatial history of agents’ positions. After the model reaches equilibrium, the trace provides more accurate networks in conditions where the agent density by itself would be too low for getting robust reconstructions.
The second contribution is the case study applying MCPM to data sets whose geometric complexity is higher than that of optimal transport networks but still resembles them to a large degree, especially with regard to their topology. Here we study the fitting process and reconstruction results using a task-specific variant of MCPM (section 6), which we have previously applied in the context of cosmological modeling and visualization (Burchett et al., 2020; Elek et al., 2021; Simha et al., 2020). We demonstrate how the probabilistic form of MCPM benefits the task of reconstructing the cosmic web, the structural features of which are best described by a density field, rather than a network in the strict sense (section 7). We posit that many real-world data sets share this property and offer several trials to support this (section 8).
2 Background
2.1 Physarum Machines: Real and Virtual
Nature offers a wealth of inspirations for solving difficult computational problems, often referred to as metaheuristics. Natural selection itself can be framed as a constrained optimization process with selective pressures acting as design criteria. This is a fact that the tradition of evolutionary computation builds on (Goldberg, 1989; Hingston et al., 2008), as well as more directly biology-inspired optimization methods (Mirjalili et al., 2020).
Among the sources of inspiration, one organism clearly shines: Physarum polycephalum, commonly known as slime mold, has captured the attention of computational researchers over the past two decades (and half a century longer in biology; Bonner, 2009). The growth patterns of the actual organism have been used to address many spatial problems: maze solving (Nakagaki et al., 2000), shortest path finding (Nakagaki et al., 2001), transportation design (Tero et al., 2010), the traveling salesman problem (Jones & Adamatzky, 2014a), and Voronoi diagram estimation (Jones & Adamatzky, 2015), among others. A survey of this active area of research is given by Sun (2017). Standing out is the extensive work of Andrew Adamatzky summarized in his books (Adamatzky, 2010, 2016). His team’s recent work goes as far as proposing Physarum-based hardware for solving suitable problems (Whiting et al., 2016).
The approach of using Physarum as an analog computer—leading to the moniker Physarum Machines—has been popular because of the organism’s propensity to systematically explore its environment for food and shape itself into intricate networks to interconnect with it. Food sources thus become straightforward proxies for input data, while different chemical and physical stimuli are available to further steer Physarum’s growth (Adamatzky, 2010).
There are downsides too, however. Physarum grows slowly and is only suitable for small inputs (approximately up to dozens of distinct points). It is also surface-borne: while heightfields are perfectly feasible (Evangelidis et al., 2017), truly 3D volumetric data sets are not. On the other hand, simulated slime mold—a virtual Physarum machine (VPM)—can be designed to overcome these obstacles. VPMs usually simulate the organism bottom-up, obtaining Physarum’s approximations from an iterated composition of simple, local rules.
In this work, we build on the simulation model of Jones (2010). While other computationally efficient VPM models have been proposed—for instance, those based on cellular automata (Dourvas et al., 2019; Kalogeiton et al., 1986) and formal logic rules (Schumann & Pancerz, 2016)— we adopt this agent-based model due to its ability to construct precise trajectories. Jones’s model is related to agent-based modeling of crowds and flocks (Olfati-Saber, 2006; Reynolds, 1987), but uses a continuous representation of the agents’ density for navigation (Jones, 2010, 2015b). On top of the successful application to classic problems such as road network optimization (Adamatzky et al., 2017), this model is suitable for abstract and interactive tasks: B-spline fitting (Jones & Adamatzky, 2014b), convex and concave hull approximation, and robot navigation (Jones, 2015a). In our case, it is the model’s ability to approximate optimal transport networks over a set of points that made it an ideal base for us.
2.2 The Search for the Cosmic Web
The problem that originally motivated this work is the detection and reconstruction of the intergalactic medium (IGM), which forms the large-scale structure of the Universe (also known as the cosmic web). This section briefly summarizes the existing work in this domain. For more details, please refer to the comprehensive survey by Libeskind et al. (2018), as well as our prior work (Burchett et al., 2020; Simha et al., 2020).
Composed mainly of hot gas and plasma, the IGM is purported to trace the underlying dark matter scaffolding of the cosmic web and to contain a significant yet unobserved portion of the Universe’s conventional matter: at least 40% of the total, according to estimates. This is the widely known missing baryon problem (Fukugita et al., 1998). The IGM has long been thought to contain the unaccounted-for baryonic matter based on findings from large-scale cosmological simulations, such as the Millennium (Springel et al., 2005), EAGLE (Schaye et al., 2015), and Bolshoi-Planck (Klypin et al., 2016) simulations. Furthermore, the IGM hydrogen absorption lines frequently detected in spectra of distant quasars (Lanzetta et al., 1995) have long been assumed to trace the cosmic web structure (Cen et al., 1994). Recently, astronomers have used the dispersion measure of fast radio bursts to find the missing baryons, verifying that they reside in the IGM (Macquart et al., 2020); however, important astrophysical problems remain, as the precise spatial distribution of the IGM is still heavily underconstrained.
The large-scale simulations also unanimously agree on the expected distribution of the dark matter and thus the IGM: After billions of years, the combined effect of gravitational attraction and dark-energetic repulsion has shaped the Universe into a vast network (Figure 2). This quasi-fractal structure (Scrimgeour et al., 2012), widely known as the cosmic web, consists of knots (galaxies and their clusters) interconnected by a complex network of filaments and separated by vast cells of significantly under-dense space called voids. In addition, the entire structure is supported by a scaffolding of dark matter, according to theoretical predictions. These simulations themselves have been constructed to confirm the earlier findings of the Zeldovich framework (Doroshkevich, 1970; Icke, 1973; Zeldovich, 1970), which on theoretical grounds predicts the emergence of these very structures under the influence of the dominant force dynamics known in the Universe.
The pertinent question is: How do we transfer the indirect knowledge about the cosmic web to actually reconstructing its structure? By definition, the IGM filaments would consist of diffuse gas that generally does not emit light at levels detectable by modern telescope instrumentation, making them extremely challenging to observe directly. On top of that, even the galaxies, which form within filaments and nodes and serve as the only luminous tracers of the cosmic web, are observable only to a degree. For instance, with increasing redshift (i.e., distance from Earth), galaxies (particularly fainter ones) are more difficult to detect and may not be captured by astronomical surveys. We are therefore dealing with inherently incomplete, heterogeneous data.
The problem can be approached from multiple different angles. The most computationally amenable representations of network structures are based on graphs—the field of combinatorial optimization (Cook et al., 1997; Papadimitriou & Steiglitz, 1998) operates within this paradigm. Graphs can be regarded as topological structures (Porter & Gleeson, 2016; Woodhouse et al., 2016) or even as a proxy for 3D geometry (Govyadinov et al., 2019). Similarly, in the astronomical context, both topological (Aragón-Calvo et al., 2010) and geometric (Chen et al., 2016; Tempel et al., 2014) approaches use graphs or partially connected graph-like structures as a core representation. Ultimately, most of the existing approaches have focused on cataloguing and classifying the cosmic large-scale structure (Libeskind et al., 2018). To our knowledge, no available method is able to recover a complete 3D density map that is inherently driven by the underlying filamentary structure of possibly incomplete (especially observational) data. Given that the cosmic web is a heterogeneous multiscale structure without an intrinsic topology, having an accurate estimate for its density distribution is bound to benefit many use cases, including our prior work: better understanding the transition from the galactic to intergalactic environments (Burchett et al., 2020), characterizing the intergalactic medium toward singular astronomical phenomena (Simha et al., 2020), and potentially shedding new light on the missing baryon problem itself.
We approached this challenge from the perspective of finding a sound structural interpolator for the galaxy data, which on the scales where the cosmic web exhibits distinct features (about 1 to 100 Mpc) can be represented as a weighted set of points in 3D space. Optimally, the resulting model has to enable the following:
detection of distinct anisotropic structures at least several megaparsecs away from the galaxies, but ideally providing consistent density estimates in the entire region of interest;
detection of geometric features across several orders of magnitude in terms of both spatial scale and density; and
structural transfer, allowing a calibration of the model on dense simulated data before deploying it on sparse observed data.
Per our findings, the presented biologically inspired swarm-based approach is able to cover all the above criteria. In this article, we focus on the simulation and modeling perspective; more details toward the required astronomical and visualization tasks are provided in Elek et al. (2021).
3 Overview of Methodology
At its core, MCPM is a dynamic computational model defined by a set of priors: probability density functions and their parameters obtained from domain-specific knowledge and/or fitting to training data. Once configured, the model can be fitted to input data: a weighted set of points in 2D or 3D (Figure 3, left). The result of this fitting is a continuous geometric structure (Figure 3, middle) which we interpret as a transport network over the input data. Rather than a graph, this geometry is represented as a density field stored as a densely sampled regular lattice (Figure 3, right).
Section 4 defines the core components of MCPM on an abstract level, i.e., without specifying the prior probability density functions. In section 5, we outline the differences between MCPM and the method of Jones (2010) and show that MCPM is the natural generalization of Jones’s method. Following that, in section 6, we detail the specific version of MCPM that we designed for the task of finding a meaningful cosmic web map from observational data provided by astronomers: either galaxies or dark matter halos.
4 Monte Carlo Physarum Machine
Monte Carlo Physarum Machine is a hybrid model—it has a discrete and a continuous component. The discrete component is an ensemble of particle-like agents that can freely navigate the simulation domain; these agents serve as a fragmented representation of the virtual organism. The continuous component is a 3D scalar lattice that represents the concentration of a marker that facilitates information exchange between the agents and the data. The model’s behavior is based on a feedback loop between these two components, executed in two alternating steps: propagation and relaxation (refer to Algorithms 1 and 2 and the diagrams in Figure 4).
(1) Propagation (Algorithm 1): The propagation step is executed in parallel for each of the agents, which are the model’s device for exploring the simulation domain. Each agent’s state is represented by a position and a movement direction, which are stochastically updated (Figure 4(a)–(c)) to navigate through the deposit field (referred to as a “trail” in Jones, 2010; see Figure 4(d); gray cells). The deposit field is stored as a 3D lattice of scalar values, representing the marker (in biological contexts usually referred to as chemoattractant) emitted by both the input data points and the agents. The deposit effectively guides the agents toward the other agents as well as the data: The agents move with higher likelihood to the regions where deposit values are higher. In addition to the deposit, we also maintain a scalar trace field (Figure 4(d); green cells), which records the agents’ equilibrium spatial density but does not participate in guiding them. (A detailed explanation of a trace is provided in section 5.)
Successive propagation steps build up the agents’ trajectories: random walks that follow the structure of the deposit field and are recorded in the trace field. While each individual agent draws a seemingly chaotic path (Figure 5, left), a superposition of many agents averaged over a sufficient time window results in a smooth converged structure (Figure 5, right).
It is worth noting that the data points are also represented by agents, but of a special type: one that does not move, but merely emits deposit according to its weight (“mass”). We found this to be the most consistent way of handling the input data: The relative structural impact of the agents and the data now simply become a question of tuning the amount of deposit they each emit.
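To make the propagation step concrete, the following is a minimal Python sketch of one possible per-agent update consistent with the description above. It is illustrative only: the function and parameter names are ours, not those of Algorithm 1 or the Polyphorm implementation, and a periodic grid is assumed for simplicity. The callables sample_pdir, sample_pdist, and p_mut stand for the distributions defined in section 4.1.

```python
import numpy as np

def propagate_agent(pos, direction, deposit, trace, sample_pdir, sample_pdist,
                    p_mut, rng, agent_deposit=1.0, agent_trace=1.0):
    """One propagation step for a single agent (illustrative sketch only)."""
    def read(field, p):
        # Nearest-cell lookup with periodic boundary handling (a simplification).
        idx = np.floor(p).astype(int) % np.array(field.shape)
        return field[tuple(idx)]

    # Sense the deposit straight ahead and in one stochastically mutated direction.
    sense_dist = sample_pdist(rng)
    candidate = sample_pdir(direction, rng)
    d0 = read(deposit, pos + direction * sense_dist)
    d1 = read(deposit, pos + candidate * sense_dist)

    # Binary mutation decision: keep the current course or adopt the candidate.
    if rng.random() < p_mut(d0, d1):
        direction = candidate

    # Move the agent, then mark its new position in both fields.
    pos = pos + direction * sample_pdist(rng)
    idx = tuple(np.floor(pos).astype(int) % np.array(deposit.shape))
    deposit[idx] += agent_deposit   # guides the other agents (and is later diffused)
    trace[idx] += agent_trace       # records agent density only; never guides
    return pos, direction
```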
(2) Relaxation (Algorithm 2): The relaxation step ensures that the simulation eventually reaches an equilibrium. To this end, the deposit field is spatially diffused by a small isotropic kernel and attenuated—that is, each cell is multiplied by a value <1. The trace field is also attenuated but not diffused, in order to preserve the geometric features in the agents’ distribution, which the trace represents. (Please refer to section 5 for further discussion on the trace field.)
The simulation reaches its equilibrium when the amount of deposit and trace injected into the respective fields in the propagation step equals the amount removed by the attenuation in the relaxation step. With the parameters used in our experiments (section 7), this usually takes hundreds of iterations.
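The relaxation step can likewise be sketched in a few lines. The decay constants below are placeholders, and the uniform filter merely stands in for a small isotropic diffusion kernel; this is not the actual Algorithm 2.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def relax(deposit, trace, deposit_decay=0.9, trace_decay=0.99):
    """One relaxation step (illustrative sketch with placeholder decay values)."""
    deposit = uniform_filter(deposit, size=3) * deposit_decay  # diffuse + attenuate
    trace = trace * trace_decay                                # attenuate only
    return deposit, trace

# The two steps alternate until the injected and attenuated amounts balance out:
#   for _ in range(n_iterations):
#       for each agent: pos, dir = propagate_agent(...)    # cf. Algorithm 1
#       deposit, trace = relax(deposit, trace)              # cf. Algorithm 2
```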
4.1 Probabilistic Sampling
The core of MCPM is the agent propagation defined in terms of three probability distributions, the configuration of which can be tuned at runtime. These are:
Pdir for sampling the agents’ directional decisions during the sensing and movement phases; defined on the unit sphere relative to the current direction of propagation.
Pdist for sampling the agents’ distance decisions during the sensing and movement phases; defined on the positive real numbers along the current direction of propagation.
Pmut for making the binary mutation decision whether the agent should remain on its current course or take the newly sampled direction according to Pdir.
Now, it is good to take a step back and understand why such a model produces spatial networks and why these align with the input data. Figure 6 supports this discussion. The key is that the agents are navigated toward large pools of deposit—this is ensured by defining Pmut such that it preferentially selects those directions where higher deposit values have been sensed. Agents are therefore attracted toward data in their vicinity (thus interconnecting them), as well as other nearby agents (thus reinforcing existing pathways). The shape characteristics of the network (curvature of the paths, size, and acuity of the features, as well as connectivity patterns) are further dependent on the combined influence of Pdir and Pdist. In particular, Pdir needs to be strongly forward-oriented, which ensures momentum preservation and prevents the agents from getting stuck in local pools of built-up deposit.
In summary, MCPM solves tasks by acting as a structural interpolator for input data. When adequately configured and parametrized, it can find natural transport networks that connect neighboring points as a human observer would intuitively do—even in the topologically richer 3D space (section 7). These networks are represented by the converged trace, i.e., by a continuous density field rather than an explicit graph.
5 Relation to Max-PM
Our main motivation for extending Jones’s (2010) Max-PM model was the question of parametrization. While Max-PM is configurable enough to reproduce the morphology of Physarum polycephalum and beyond, the agents always respond to different deposit concentrations in the same way: follow the direction of maximum deposit concentration. Hence the alias Max-PM (with PM standing for Physarum machine).
This behavior—especially apparent in our initial experiments in 3D—led to overly condensed pathways and only moderately connected networks (more in section 7; see Figure 7). While captivating and resembling a 3D Physarum, the fits that Max-PM produced did not match our target cosmological data well. Another observation was the necessary increase in sampling effort: Max-PM in 2D uses 1 + 2 samples for directional sensing; our extension to 3D needs at minimum 1 + 8 directions (determined empirically). This increases the sampling effort considerably, as well as the number of access operations to the deposit lattice.
These considerations led us to target the agents’ directional sampling, specifically modifying it to use stochastic Monte Carlo sampling. In the remainder of this section, we discuss our specific extensions of Max-PM in detail.
5.1 Stochastic Sampling of Agent Trajectories
The way in which Max-PM treats agent navigation is suitable for the original context: the Physarum agents use chemotaxis to move around, following the chemoattractant concentration gradient (represented by the deposit field). In our and arguably many other scenarios—where the main concern is recovering a network structure that fits target data—we need a more complex behavior that can be configured based on available knowledge about the data distribution. The main concern is that in any given geometric configuration of point data, a multitude of feasible pathways is available to meaningfully interconnect them.
As an example, consider the elementary configuration in Figure 8(a): We want the resulting network to branch out when connecting the data point A with B and C, which means that the agent traveling from A has to choose between steering toward B or C with roughly the same probability. This ensures that the aggregate distribution of all agents passing through this location is going to have a branched shape. In Figure 8(b)–(c) we see an actual such configuration: agents arriving from the bottom-left branch split into two streams and smoothly transition to the more complex region on the right.
This behavior is represented by the mutation decision (section 4) encoded by the discrete probability Pmut—that is, whether the agent should remain on its current course, or branch out in the direction generated in the sensing step.
To provide additional flexibility in representing paths with different scales and curvatures, we modified the agents’ spatial decisions to behave probabilistically as well, rather than using constant steps and angles for the agents’ movement (Jones, 2010). This behavior is defined by the continuous probability density functions Pdir (2D distribution on a sphere) and Pdist (1D distribution on a half-line).
We can now demonstrate how Max-PM can be framed as a special case of MCPM.
Define the mutation probability Pmut as lim_{s → ∞} d1^s / (d0^s + d1^s), where d0 is the deposit value ahead of the agent and d1 is the deposit in the mutated direction. Such a definition effectively means that the larger of the two deposits will always be selected for the agent to follow.
Define Pdir and Pdist using Dirac delta distributions with the angular and distance parameters as offsets for the peaks.
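For clarity, the limiting behavior invoked in the first definition can be written out explicitly:

```latex
\lim_{s \to \infty} \frac{d_1^{\,s}}{d_0^{\,s} + d_1^{\,s}}
  = \begin{cases}
      1   & \text{if } d_1 > d_0,\\
      1/2 & \text{if } d_1 = d_0,\\
      0   & \text{if } d_1 < d_0,
    \end{cases}
```

so the agent deterministically follows whichever of the two sensed directions carries more deposit, recovering the Max-PM behavior.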
In section 6 we discuss the Pdir, Pdist, and Pmut that define the particular variant of MCPM we employed for reconstructing the cosmic web structure.
5.2 Extension from 2D to 3D
Enabled by the notion of the mutation probability Pmut, we can now simplify the sampling stencil—that is, how many directions need to be sampled during an agent’s sensing stage. We observed that in Max-PM the network complexity (connectivity) depends on the number of directional samples, and that this becomes even more pronounced in the topologically richer 3D space.
Given that the directional navigation in MCPM is controlled by Pdir and Pmut, we can reduce the necessary number of samples to two: one in the forward direction, and one in the mutation direction. Any direction where Pdir is nonzero can potentially be sampled, and even directions with low deposit have a chance to be selected (subject to the specific definition of Pmut). This has two advantages: the savings in directional sampling can be reinvested in increasing the resolution of the agent’s trajectory, and extensions to higher than three dimensions are now possible without increasing the sampling effort.
The fact that high-dimensional sampling and integration problems can be effectively solved with binary sampling stencils is well established in the Monte Carlo community. Both direct (Kajiya, 1986; Kalos & Whitlock, 2008; Veach, 1997) and Markov-chain Monte Carlo methods (Hastings, 1970; Metropolis et al., 1953; Veach & Guibas, 1997) avoid the curse of dimensionality this way when constructing random walks. Moreover, this strategy can even be the most efficient one, as documented, for instance, by the ubiquity of path tracing methods for simulating light transport in the visual effects industry (Christensen & Jarosz, 2016; Fong et al., 2017). Indeed, our method draws significant inspiration from these works.
5.3 Aggregation of Agent Trajectories
Rather than using the deposit to recover the reconstructed network (as in Jones’s method; 2010, 2015b), we use the trace for this purpose. This has multiple advantages:
(1) Resolution: In contrast to the deposit, the trace field is not subject to spatial diffusion—structural details are therefore preserved (Figure 9).
(2) Specificity: Recovering the network from the trace becomes much easier than from the deposit, as in the latter, the data term has by design a much higher impact than the agent term.
(3) Interpretability: While the exact mathematical properties of the trace in relation to Pmut, Pdir, and Pdist are as yet unknown, they are nevertheless in a direct relationship. One way to interpret the trace is as the total probability of the agents’ distribution marginalized over all possible geometric configurations that can occur in a given data set.
Another way of interpreting the trace is that it provides a continuous estimate of the geometric and topological structure outlined by the input data. This is related to the use of such a structure by Jones and Saeed (2007), referred to as a trail in their work. There, the trail substitutes for standard image-processing kernels, such as those used for denoising or contrast enhancement. Unlike their trail, the trace lives in a different frame of reference than the input data: The data are sparse and unordered points, while the trace is a continuous function densely sampled in space. Other than that, these data structures are conceptually similar.
To succinctly describe the trace in text, we use the term polyphorm to refer both to the particular continuous geometries represented by the trace field and, more generally, to the concept of a continuous transport network, i.e., a network implicitly defined by the trace density field (as opposed to a network explicitly represented by a graph or other means).
5.4 Fully Continuous Representation
Our last design choice is to fully decouple the representation of agents and the underlying discrete data structures. In Max-PM, only one agent can be present in each cell of the deposit field at any given time. This is suitable for simulating the actual organism, as each agent represents a small fragment of the body and therefore occupies a certain volume of space.
In our case, this behavior is no longer desired: Our agents represent an abstract spatial distribution with possibly several orders of magnitude in dynamic range. For instance, cosmic overdensities between 10⁻³ and 10² are important in the astrophysical inquiry. We therefore allow agents to move freely without enforcing any limit on their number in each grid cell. This has additional benefits: It decreases the amount of necessary bookkeeping, and decouples the reconstructed network structure from the data structures’ resolution (as demonstrated in section 7).
6 MCPM for Cosmic Web Mapping
In sections 4 and 5 we described the components of MCPM and provided the rationale behind our design. Now we specify the probabilities Pdir, Pdist, and Pmut, which together define the variant of MCPM used in our previous work (Burchett et al., 2020; Elek et al., 2021; Simha et al., 2020) for the cosmic web reconstruction task. We likewise use this variant in the experiments throughout section 7 and in most of the remainder of the article.
Directional distribution Pdir
The main criterion for Pdir is that the agents should maintain a certain momentum—otherwise, if allowed to turn too rapidly, they would invariably get stuck in the local maxima of the deposit field. The simplest choice here is a uniform distribution over the forward-facing cone of directions. This distribution is parametrized by a single scalar parameter: the opening angle of the cone. Routines for efficiently drawing samples from this distribution can be found, for instance, in Watson (1982).
Other viable choices of Pdir include the spherical Gaussian, the von Mises–Fisher distribution (Fisher, 1953), the Henyey–Greenstein distribution (Henyey & Greenstein, 1941), and their truncated versions.
When configuring Pdir, the rule of thumb we used in our fitting confirms the findings of Jones (2010): The sensing angle should be larger than the mutation (i.e., turning) angle. We typically use twice the value for sensing than for turning. The variation of these and other parameters is further explored in section 7.
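As an illustration, one standard way to draw such samples (not necessarily the routine of Watson, 1982, used in our implementation) is to sample the spherical cap directly and rotate it into the agent’s frame:

```python
import numpy as np

def sample_cone_uniform(forward, half_angle, rng):
    """Draw a direction uniformly from the cone of half-angle `half_angle`
    (radians) around the unit vector `forward`; one standard construction,
    not necessarily the one used in Polyphorm."""
    # Uniform over the spherical cap: cos(theta) is uniform on [cos(half_angle), 1].
    cos_t = rng.uniform(np.cos(half_angle), 1.0)
    sin_t = np.sqrt(1.0 - cos_t * cos_t)
    phi = rng.uniform(0.0, 2.0 * np.pi)
    local = np.array([sin_t * np.cos(phi), sin_t * np.sin(phi), cos_t])

    # Build an orthonormal basis whose third axis is `forward`, then rotate into it.
    up = np.array([1.0, 0.0, 0.0]) if abs(forward[2]) > 0.9 else np.array([0.0, 0.0, 1.0])
    t1 = np.cross(up, forward)
    t1 /= np.linalg.norm(t1)
    t2 = np.cross(forward, t1)
    return local[0] * t1 + local[1] * t2 + local[2] * forward

rng = np.random.default_rng(0)
new_direction = sample_cone_uniform(np.array([0.0, 0.0, 1.0]), np.radians(30.0), rng)
```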
Distance distribution Pdist
Since the cosmic web is built up from filaments consisting mostly of hot gas, we use the Maxwell–Boltzmann distribution for sampling the agents’ sensing and movement distances. This statistical distribution describes the velocity dispersion of particles in idealized gas and plasma (Mandl, 1988), which makes it suitable for our use case and potentially for modeling other physical systems. Efficient sampling from this distribution is described, for example, by Hernandez (2017).
Other good candidates for Pdist include the exponential distribution (e.g., to model bacterial motion dynamics (Li et al., 2008)), log-normal (for particles under Brownian motion), Poisson, Gamma, and other distributions defined in R+. Convex combinations of multiple modes are also possible, if different distinct scales of features are expected in the modeled system. The weight of the tail in Pdist determines the range of feature scales MCPM will be able to reconstruct: In the simplest case of constant sensing and movement distances, the model will be sensitive to only one characteristic feature size.
When generating the sensing and movement distances for an agent, we use the same seed to sample Pdist—this avoids cases where the movement distance would exceed the sensing distance. Even more so than in the directional sampling, here we find that the sensing distance should be significantly larger than the movement distance: A good starting point is a difference of about an order of magnitude.
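A minimal sketch of this sampling scheme follows. It uses the fact that a Maxwell–Boltzmann-distributed speed is the norm of an isotropic 3D Gaussian, and it interprets the shared-seed coupling as reusing a single underlying sample for both distances; the scale parameters are placeholders, not calibrated values.

```python
import numpy as np

def sample_maxwell_boltzmann(scale, rng):
    """Distance sample from a Maxwell-Boltzmann distribution with scale `scale`
    (equivalently, the norm of an isotropic 3D Gaussian with std `scale`)."""
    return np.linalg.norm(rng.normal(0.0, scale, size=3))

def sample_sense_and_move(sense_scale, move_scale, rng):
    """Couple the sensing and movement distances through one shared sample,
    so the movement distance cannot exceed the sensing distance whenever
    move_scale < sense_scale (our reading of the 'same seed' strategy)."""
    u = np.linalg.norm(rng.normal(0.0, 1.0, size=3))  # unit-scale Maxwell-Boltzmann draw
    return u * sense_scale, u * move_scale
```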
Mutation probability Pmut
We define the mutation probability from the deposit values sensed ahead of the agent (d0) and in the candidate direction (d1), raised to a tunable sampling exponent s: Pmut = d1^s / (d0^s + d1^s). Larger values of s make the agents favor the higher-deposit direction more aggressively; this is the sampling exponent SE explored in section 7.7. An alternative definition of Pmut could assume that the agents move in a participating medium with density proportional to the deposit, in which case the agent dynamics would be subject to the eikonal equation and Pmut could be derived from the Beer–Lambert law. This is not our case, however, as collisions between the intergalactic medium particles do not have an appreciable impact on the cosmic web geometry.
It is worth noting that by increasing the value of the sampling exponent we approach the limit case discussed in section 5, i.e., MCPM becomes identical to Max-PM for s → ∞.
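A direct transcription of this mutation probability (with illustrative names; the sampling exponent corresponds to the SE parameter explored in section 7.7):

```python
import numpy as np

def p_mut(d0, d1, sampling_exponent):
    """Probability of adopting the mutated direction, given the deposit sensed
    ahead (d0) and in the candidate direction (d1).  As sampling_exponent grows,
    this degenerates to always picking the larger deposit, i.e., Max-PM."""
    a = d0 ** sampling_exponent
    b = d1 ** sampling_exponent
    return b / (a + b) if (a + b) > 0.0 else 0.5

rng = np.random.default_rng(0)
mutate = rng.random() < p_mut(d0=1.0, d1=2.0, sampling_exponent=2.5)  # p ~ 0.85
```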
7 Experiments
Our open-source implementation of MCPM, called Polyphorm (Elek & Forbes, 2021), is written in C++ and uses DirectX with GPU compute shaders for executing the parallel propagation and relaxation steps, as well as for facilitating interactive visualization.
Our GPU implementation was developed on an NVIDIA TitanX, and is capable of fitting 10 million agents over 1,024³ deposit/trace grids at roughly 150–200 ms per model iteration, with an additional 30–100 ms needed for the visualization. Including the agents and rendering buffers, no more than 6 GB GPU memory is consumed at that resolution, using float16 precision to store both the deposit and trace grids. In all our experiments the model converges in fewer than 700 iterations (about 1–3 min of real time). It is thanks to the massive parallelism of modern GPUs, combined with the high locality of memory accesses in MCPM, that we are able to simulate 10M or more agents at interactive rates. This is essential for obtaining noise-free simulation results. Further performance data are available in Elek et al. (2021).
The performance scales nearly linearly with the number of voxels, and sub-linearly with the number of agents: 100M agents run at 300 ms per iteration ceteris paribus, that is, 10× more agents at only a 2× slowdown (compared to 10M agents). Using more agents has minimal impact on the model’s spatial details, but it does increase effective resolution by proportionally reducing the Monte Carlo noise levels.
All of the model’s parameters (specific values are provided in the following subsections, with reference to the pseudocode in Algorithms 1 and 2) can be modified at any point during the simulation, which allows for easy tuning of the resulting 3D polyphorm structures. Edits typically take several seconds to manifest, both visually and in the value of the loss function (in section 7.3 we discuss the details of our employed loss function).
We rely on cosmological data sets to conduct the practical part of the evaluation, in which we focus on fitting the model. These include the Bolshoi-Planck large-scale cosmological simulation (Klypin et al., 2016) at zero redshift (i.e., corresponding to the present-day Universe) with 840K dark matter halos extracted by the Rockstar code (Behroozi et al., 2012). These are simulated data with a priori known ground-truth cosmic web structure, which is why we use them to calibrate the MCPM parameters. Our target data set for reconstructing the cosmic web structure comprises 37.6K galaxies (their locations and masses) from the Sloan Digital Sky Survey (SDSS) catalog (Alam et al., 2015), spanning redshifts between 0.0138 and 0.0318. For more details about the astronomical evaluation of our results, please refer to Burchett et al. (2020) and Simha et al. (2020). Here we will be concerned with the behavior of MCPM and the geometry generated by it.
For the visualizations, we use direct volume rendering (Beyer et al., 2015) (for field data) and particle rendering (for the agents). Sometimes, in addition to projections of the full 3D data, we isolate slices and cross sections where better insight is necessary. Full exposition of our visualization methodology is provided in Elek et al. (2021).
All the following experiments use the MCPM variant defined in section 6, unless explicitly mentioned.
7.1 Self-Patterning
Even though we primarily care about the data-driven mode of operation, we start by looking at polyphorms produced by the model without any prior stimuli. In the following examples we work with a 540³ simulation grid uniformly filled with 2M agents at the start of the simulation.
In line with Jones’s (2010) findings we observe that the free patterns created by MCPM are stable in the qualitative sense. That means that for any given parametrization the generated features will be comparable, but rarely static: The generated polyphorms are in constant flux. In the experiment in Figure 10 we explore a number of configurations that led to distinctive results. This experiment was conducted in a single session, changing one or multiple MCPM parameters at a time and waiting for the simulation to sufficiently converge after resetting the agents to random initial positions.
In addition to these meta-stable patterns, a plethora of transient ones is accessible. These occur when the model’s configuration changes significantly, as a sort of dynamic phase change. Figure 11 shows a few illustrative examples of serendipitous trips through the parameter space. Given that these forms are dependent on the momentary state of the model before the phase change, and are unique to both the original and modified parametrization, there is currently no way to enumerate them systematically.
Even though all the parameters impact the polyphorm, the most significant determinants of shape are the sensing angle, the sensing distance, and the respective movement parameters. These were the parameters we adjusted the most frequently during fitting. In the experiment illustrated in Figure 12 we therefore focused on exploring these fitting dimensions. Curiously, we observed a correlation between the stability of the fit and how interconnected the network structure is: The polyphorms in the upper right part of the figure were much more persistent in time than those on the left.
Another important design dimension is to consider different behaviors of the agents. In particular, many natural systems contain repulsion as another important shaping force. Encoding this type of behavior in MCPM simply means reversing the sampling decision given by Pmut (Algorithms 1 and 2). Consequently, the agents will navigate toward lower deposit concentrations, i.e., locations not occupied by other agents or data points. More complex repulsion behaviors could also be encoded through specialized redesigns of Pdir and Pmut.
Figure 13 documents three MCPM runs in 1,024³ simulation grids with 10M agents, comparing the original attractive agent behavior, its repulsive inversion, and a 50/50 mix of the two behaviors. In line with the previous experiments, the attractive behavior leads to a network-like polyphorm that, after forming, remains relatively stable and slowly condenses to a thicker and sparser structure. On the other hand, the repulsive behavior forms cellular patterns around the sites where the agents were initialized. These patterns are only transient, however: As the agents continue repelling each other, the pattern breaks down and dissipates. Finally, the mixed behavior—where each agent randomly decides to be attracted or repelled by the deposit—leads to a structure that is a blend of both: sites form cells, neighboring sites merge together, and the resulting interconnected structure slowly diffuses. Figure 13 maps the temporal progression of each regime, and additionally shows a 3D view of the entire structure at a point when its features are most visible.
7.2 Uniform Stimuli
We now start introducing data in homogeneous configurations. We show that MCPM produces meaningful networks over points with different uniform distributions in 3D: in a regular lattice (Figure 14), distributed randomly (Figure 15), and following a low-discrepancy blue noise distribution (Ahmed et al., 2016; Figure 16).
In a regular lattice (Figure 14) MCPM robustly finds straight connections between points and, due to the multiscale movement of the agents (cf. section 6), also the diagonals of the sub-boxes. The model can be tweaked to prefer the diagonals simply by increasing the sensing distance (SD) to about 70–80 voxels.
The fitting takes about 300 iterations of the model to converge (see the discontinuity in the energy plot in Figure 14, bottom, where we restarted the fitting process). Remarkably, all the data points receive a nearly identical density of agents, as shown in the histogram.
Randomly distributed points (Figure 15) have a tendency to cluster, leading to a spurious reinforcement of some of the fitted network connections. As a result, fitting from uniformly initialized agents takes somewhat longer, about 500 iterations. The density histogram at data points is close to Gaussian, with a slight positive skew, and variance proportional to the density fluctuation of the data.
In Figure 16, using a low-discrepancy Poisson-like distribution of the data (Ahmed et al., 2016), we obtain perhaps the most organic of homogeneous patterns. Since the low-discrepancy points emulate the distribution of cells in biological tissues (and similar such phenomena), the resulting fit resembles a natural scaffolding, not unlike bone marrow (also cf. section 8.3). The structure also bears resemblance to rhizomatic systems (such as tree roots and mycelial networks).
Fitting takes about 500 iterations and results in a density distribution not unlike the random case, but with a slightly stronger positive skew. Tweaking the value of the sampling exponent SE (section 7.7) and then trying different distance sampling distributions would impact the acuity of the fitted network, and would expand or narrow the resulting distribution histogram.
7.3 Fit Energy (Loss Function)
MCPM solves the problem of reconstructing a volumetric transport network over a set of discrete points. The network should include all input points, except perhaps for outliers. We also want the weight of the input points to influence the strength of the network connections, i.e., to attract more agents.
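As a purely illustrative stand-in (this is not the exact definition of the fit energy I used in this article), the two requirements above could be encoded, for example, as the coverage of the input points together with a mass-weighted average of the trace sampled at their locations:

```python
import numpy as np

def fit_energy(trace, point_indices, masses, eps=1e-6):
    """Hypothetical fit energy, NOT the exact I used in the article.
    Rewards (a) covering every input point with nonzero trace and
    (b) concentrating more agents around more massive points."""
    sampled = trace[point_indices[:, 0], point_indices[:, 1], point_indices[:, 2]]
    coverage = np.mean(sampled > eps)                             # fraction of covered points
    strength = np.average(np.log(sampled + eps), weights=masses)  # mass-weighted log-trace
    return coverage, strength
```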
7.4 Bolshoi-Planck Data (Simulated)
Our first evaluation of an actual reconstruction scenario is based on the Bolshoi-Planck (B-P) cosmological simulation at redshift zero (Klypin et al., 2016) with dark matter halos extracted by the Rockstar method (Behroozi et al., 2012). Halos are considered to be significant clusters of dark matter and are the formation sites of galaxies; the most massive halos are typically located at the intersections (knots) of filaments. Importantly, at the scales of our interest, the halos are sufficiently represented by their 3D position and mass. The full data set contains 16M halos, most of which have extremely low mass and are not significant in determining the overall filamentary structures. Therefore, we remove all halos with mass Mhalo < 5 × 10¹⁰ Msun, leaving approximately 840K of the significant ones. These halos span a cubic volume approximately 170 Mpc on a side.
Because dark matter is gravitationally coupled to conventional, baryonic matter (on large scales), it follows that their distributions will mostly have the same structure. Also, since the B-P data set is simulated, it does not suffer from key limitations of observed data, namely, data points missing due to occlusions or other observational issues. This makes the B-P data an ideal fitting prior, and as such we use them to calibrate the MCPM model’s parameters for further use with the observed SDSS data (section 7.5).
Figure 18 shows the MCPM fit to B-P data. We obtain a detailed polyphorm with features on multiple scales: In line with theoretical predictions, we observe characteristic loops and filaments on scales from a single Mpc to about 100 Mpc. The quasi-fractal nature of the data is also revealed—we observe self-similar knots and voids (lacunae) at different scales.
Fitting to the B-P data takes about 900 iterations and (similarly to the uniformly distributed point sets) results in a near-Gaussian density distribution at the halo locations, as seen in the histogram in Figure 18. The fits match the underlying data extremely well: More than 99.9% of the halos are covered by the reconstructed network, and as detailed in Burchett et al. (2020) we find a strong and reliable correlation between the MCPM trace and the ground-truth particle density of the B-P data set. This was crucial to establish the mapping between the MCPM trace density and the cosmic overdensity, to yield interpretable astronomical results. Further numerical evaluation of this correlation is provided in section 7.10. The progression of the fitting and the formation of the obtained geometric features is documented in Figure 19.
7.5 Sloan Digital Sky Survey Data (Observed)
Having verified in section 7.4 that MCPM is capable of fitting simulated structures arising from the current cosmological theory, we now move to the actual observed data. The target here is to reconstruct the cosmic web spanning 37.6K spectroscopically observed galaxies extracted from the SDSS catalog (Alam et al., 2015). The selected galaxies are within the redshift range [0.0138, 0.0318], equivalent to distances between 187 and 437 million light years from Earth, which on cosmic scales is relatively close to home. Just as with the Bolshoi-Planck data, the galaxies at these scales are sufficiently represented as points in 3D space (derived from the celestial coordinates of each galaxy and its corresponding redshift-derived distance) with weights proportional to the galactic masses. After the spatial transformation of galaxy coordinates, the galaxies are contained within a volume measuring 280 Mpc in its largest dimension.
The main challenge faced here is the incompleteness of the data: Some galaxies elude observation because of either their low apparent brightness or their being too closely projected on the sky to other bright galaxies. Whatever the case, we need MCPM to be robust under these conditions. The most important precaution is to prevent the model from detecting filaments where there are not sufficient observations to support them. To this end, we disable the agent-emitted deposit to make sure that all detected structures are due to actual data.
Figure 20 shows the MCPM fit to the SDSS data. To obtain this result, we use the same model parameters as for fitting the Bolshoi-Planck data, thus facilitating a prior-informed feature transfer. In line with that, the fitted polyphorm ends up with structurally similar features—in spite of fitting to an order of magnitude fewer points, we observe filaments, knots, and voids on the expected scales. The density distribution of the network likewise exhibits a clearly near-Gaussian profile, as well as similar convergence characteristics (Figure 21).
The MCPM fit also reveals an artifact in the data: the so-called fingers of god, elongated structures pointing toward Earth (visible in the vertical 2D slices in Figure 20, right). These result from biased conversions of redshift to line-of-sight distance in the observations; compensating for them is therefore subject to corrections applied to the data itself—for more details please refer to the discussion in Burchett et al. (2020). Currently we do not apply any such corrections.
A key validation of this fit is based on correlating the reconstructed filament network with spectroscopic sightlines from distant quasi-stellar objects (quasars). This is the main result in Burchett et al. (2020), in which we found a strong positive correlation between neutral hydrogen absorption signatures and the MCPM-derived density in low-to-medium density regimes, and a significant anti-correlation in the high-density regime. The latter can be attributed to the lack of neutral hydrogen in the close proximity of galaxies, as hydrodynamical shocks and other processes associated with galaxy formation ionize the hydrogen gas and thus suppress its ability to absorb the light from quasars.
Crucially, what enabled the validation of the MCPM-generated map is the continuous nature of the fit. Rather than obtaining a binary polyphorm detecting the presence of agents, the probabilistic nature of MCPM yields a scalar trace field expressing the time-averaged density of the agents. We interpret this trace simultaneously as a density and a probability that the detected features are significant; as described above, we establish a monotonic mapping between the MCPM trace and the cosmic overdensity, which is a directly interpretable quantity in the astronomical context.
At this point, in reference to the requirements formulated in section 2, we can conclude that the model qualifies for the designated task of reconstructing the cosmic web map from observed data:
MCPM naturally synthesizes anisotropic filamentary features and, when configured correctly, finds a cohesive transport network over the input data.
MCPM outputs a continuous density map with features across multiple levels of scale and intensity.
MCPM can be easily pre-trained on prior data and transferred to target data through the optimized values of its (hyper-)parameters.
7.6 Distribution Analysis
In analyzing the polyphorm patterns recovered by MCPM, it is important to ask whether these are arbitrary: just random, haphazard connections between neighboring points.
A detailed look into the input data and the resulting fits shows that this is not the case: As analyzed in Figure 22, many individual data points come together to give rise to any single filament. This matches the prior domain knowledge: The average distance between neighboring galaxies is on the order of 100 kpc, while the cosmic web structures occur roughly between 1 and 100 Mpc. Crucially, it is the galaxies (or dark matter halos) whose distribution is aligned with the nodes of the cosmic web, not the other way around. Their locations therefore sample the much smoother high-level structure of the cosmic web, and this ultimately is what MCPM traces.
The above is of course subject to the configuration of the model. Our fits to both B-P and SDSS data use the sensing distance of 2.5 Mpc (as calibrated by fitting to the former). Structures significantly smaller than this would not be captured in the fit: They are essentially invisible to the agents. This is what makes the domain-specific knowledge valuable—it constrains the fitting process and prevents us from obtaining nonexistent structures.
Going one step further, in Figure 23 we zoom in on individual knots (intersections) of the reconstructed filaments. Here the visualization segments the trace field into three bins: low (blue), medium (green), and high (red) density values. The agents’ distribution here agrees with the astronomical intuition: The cores of the filaments and the knots receive the highest trace densities, which then fall off toward the outskirts of the observed structures.
7.7 Branching Probability
The agents’ decision whether to branch out, and with what probability, is one of the key determinants of the shape and complexity of the generated polyphorms. Here we explore this aspect of the model.
The variant of MCPM defined in section 6 uses a probabilistic formulation for the mutation decision, controlled by a single scalar parameter: the sampling exponent (SE). Increasing the value of SE leads to sharper and more concentrated filamentary structures (Figure 24), which also increases the energy of the fit. On the downside, the number of data points not covered by the fitted polyphorm increases, and for the limit case of infinite SE we miss more than 3% of the data. The fitting process therefore involves finding a tradeoff between the acuity of the structure and its coverage. As a result, our fitting to the astronomical data sets uses moderate values of SE, usually between 2.2 and 4.
In the right panel of Figure 24 we compare MCPM to Max-PM (Jones, 2010). Here we used 8 directional samples to compute the maximum of the deposit for the agents to follow in every step. As discussed in section 5, this increases the structural acuity even further, optimizing the network by erasing smaller filaments, rounding off loops, etc. While sensible for Physarum, this behavior causes a dramatic reduction of network connectivity. As a result, 32% of input data points are missed by Max-PM, making this a poor fit (in spite of the seemingly high fitting energy I, cf. section 7.3). Section 7.10 offers a more detailed quantitative comparison between MCPM and Max-PM.
7.8 Trace and Deposit Field Resolution
Here we examine the scaling of MCPM with respect to the resolution of the deposit and trace field representations.
In Figure 25 we compare three different field grid resolutions (with proportionally scaled numbers of agents), keeping the remaining model parameters fixed at the nominal values. We observe that while the amount of reproduced detail decreases with resolution—and the amount of MC noise increases as the number of agents goes down—the overall structure of the polyphorm remains largely intact. This is a desirable outcome, as we want to avoid a tight coupling between the precision of the representation and the geometry of the reconstructed network.
7.9 Data Set Decimation
The ability of the model to retain structure is important when it is trained on dense data (such as the Bolshoi-Planck halos) and then transferred to incomplete data from observations.
In Figure 26 we look at the structure retention when reducing the number of dark matter halos in the Bolshoi-Planck data set by thresholding the halo mass. By keeping only the heavier subset of halos we emulate the actual observations: The smaller, dimmer galaxies are less likely to be captured by SDSS and other such surveys. We see that even when decimating the input data by nearly an order of magnitude (from 870K halos down to about 100K), MCPM finds a good portion of the filaments—all of the larger ones and some smaller ones. This gives us a clue about how much structure might be missing when fitting to the SDSS data. In Burchett et al. (2020) we explain the process of matching the density of these two distinct data sets in more detail.
7.10 Quantitative Comparison Between MCPM and Max-PM
We conclude this section with a detailed numerical study of MCPM in comparison to Max-PM (Jones, 2010), complementing the preceding qualitative results. In this experiment we fit both models to the Bolshoi-Planck simulation data (section 7.4). Focusing on the accuracy of the fit, we do not apply any thresholding and retain all 12M dark matter halos, with the ultimate aim of matching the reference density field produced by the simulation. In contrast to the halo catalog, the reference field represents the actual density of the simulated matter tracers, binned in a 1,024³ grid.
To measure the extent to which MCPM and Max-PM match the reference data, we repeat the same procedure used to obtain the cosmological overdensity mapping in Burchett et al. (2020). We first fit to the full 12M-halo catalog using both MCPM and Max-PM in a 1,024³ simulation grid (which allows us to establish a one-to-one mapping to the reference grid without any interpolation). We then standardize both fits to make sure their means and variances match, thus obtaining comparable distributions. Finally, we compute 2D histograms binned over the fitted and the target density ranges, recording the mutual correlations between the corresponding voxels in 3D space.
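A condensed sketch of this comparison pipeline is shown below; the use of log-densities and the specific normalization are assumptions on our part rather than the exact procedure of Burchett et al. (2020).

```python
import numpy as np

def density_correlation_histogram(fit, reference, bins=256, eps=1e-8):
    """Standardize two density fields and bin corresponding voxels into a 2D
    histogram (illustrative sketch; log-densities and eps are assumptions)."""
    def standardize(field):
        x = np.log10(field + eps)            # densities span many orders of magnitude
        return (x - x.mean()) / x.std()      # match means and variances of the two fields

    a = standardize(fit).ravel()
    b = standardize(reference).ravel()
    hist, fit_edges, ref_edges = np.histogram2d(a, b, bins=bins)
    return hist, fit_edges, ref_edges
```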
The results are summarized in Figure 27. We observe two key differences between MCPM and Max-PM. First, MCPM covers a significantly wider range of the reference density field, by roughly 5 orders of magnitude at the low end of the distribution. In practice, this means that MCPM can capture much fainter features of the target structure and characterize smaller substructure within the cosmic web. This mirrors our findings from section 7.7, where we see a significant portion of the halos being missed by the Max-PM fit. Second, the MCPM fit has considerably better precision compared to Max-PM. This is apparent from the spread of the histograms, which we visualize by plotting the percentile bounds of the per-row conditional distributions.
From these results, we conclude that MCPM is able to reconstruct an accurate proxy of the reference density field from very sparse data, i.e., using 12M weighted halo points compared to the roughly 1B voxels in the reference density grid. Compared to Max-PM, MCPM extends the dynamic range of the cosmic web density reconstruction by several orders of magnitude, which in practical terms means extending the reconstruction from only the dense structures (knots and major filaments) all the way to the edges of cosmic voids.
8 Discussion
The presented MCPM model is suitable for revealing geometric structures in incomplete data, as demonstrated by mapping the cosmic web filaments in SDSS galaxy data. It is especially effective if a training data set is available to determine the prior parameters (as we did by fitting to the Bolshoi-Planck dark matter halos) and if expert knowledge is available to choose appropriate probability distributions (as we did by using the Maxwell–Boltzmann distribution for agent distance sampling).
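As an illustration of the distance-sampling idea, the sketch below draws Maxwell–Boltzmann distributed step distances as the norm of three independent Gaussians; the scale parametrization and the nominal distance are assumptions for demonstration, not the exact form used in our implementation.

    import numpy as np

    def sample_maxwell_boltzmann(scale, size, rng):
        # The norm of three i.i.d. zero-mean Gaussians with standard deviation
        # 'scale' follows a Maxwell-Boltzmann distribution with that scale.
        return np.linalg.norm(rng.normal(0.0, scale, size=(size, 3)), axis=1)

    rng = np.random.default_rng(1)
    mean_distance = 2.5                                    # hypothetical nominal distance (grid units)
    scale = mean_distance / (2.0 * np.sqrt(2.0 / np.pi))   # MB mean equals 2 * scale * sqrt(2/pi)
    distances = sample_maxwell_boltzmann(scale, size=10_000, rng=rng)
    print(distances.mean())   # approximately mean_distance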
Fitting MCPM to data is a robust, reliable process. For any given parametrization the model eventually converges to the same outcome (up to MC variance), regardless of the initial distribution of the agents. To our surprise, we have not observed any tendency of the model to get stuck in local optima.
We now conclude the article with an open-ended discussion about the current state of the model and what research directions we plan to explore in the future.
8.1 Mapping the Cosmic Web
In hindsight, using a virtual Physarum machine to reconstruct the cosmic web seems like a very intuitive thing to do: MCPM has already provided a viable solution for several astronomical use cases (Burchett et al., 2020; Simha et al., 2020). We continue to inquire into the implications of this possibility. This includes:
further ways to validate the model’s performance in the presented tasks, as well as in new contexts;
continuing the design pathway toward a more specific (accurate) variant of MCPM tailored for astronomy;
developing deeper mathematical foundations for these models; and
topological analysis for graph extraction (directly applicable to cataloguing the cosmic web filaments, but also to other tasks; see below).
This also applies to using MCPM to examine and quantify other astronomical data sets. One example is the EAGLE data set (Schaye et al., 2015), which includes billions of simulated macro-particles (tracers) of the intergalactic medium, but over a smaller, higher-resolution volume, and which includes gas together with the resulting complex hydrodynamical astrophysics. For a demonstration, in Figure 28 we fit to a single page (a sub-volume of a single redshift snapshot) of these data: a cubic region roughly 4 Mpc across (i.e., two orders of magnitude smaller than the Bolshoi-Planck data). This is the smallest scale at which the cosmic web starts becoming apparent, at the edge of the circumgalactic halos, which are well resolved in these data. The structures that MCPM finds here are intriguing, and further investigation involving observational data will show whether they are real and significant.
8.2 Temporally Varying Data
In this article, we have focused on static input data, and sought an equilibrium solution that best represents it. Of course, Physarum itself does not behave this way—it never truly halts its growth and remains in a constant state of flux during its plasmodium stage. This behavior happens in response to environmental changes around the organism.
For MCPM, the above translates to handling dynamic, time-varying data. Cosmological inquiry routinely uses time-varying data, since the evolution of the Universe can be approximately traced by matter halos (section 7.4) that vary in position, mass, and number. Beyond cosmology, such data are produced by models of population dynamics, traffic, epidemiology, neuronal networks, and so forth.
As an illustrative example, we consider the problem of 3D edge bundling in complex network visualization (Bach et al., 2017; Ferreira et al., 2018; Holten & Van Wijk, 2009). We used MCPM to fit and visualize a human brain connectome (Xu et al., 2019), a set of 105 high-level nodes representing different subregions of the brain (so-called modes). To simplify this experiment (Figure 29), we disregard the connectome edges and simply weight each node by its overall connectivity strength. The result is a fuzzy activity map. In the future, adding the edges as a directional guide for the agents could yield a more intricate spatial map, with hierarchical grouping of connections and color differentiation between the modes.
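A minimal sketch of this data preparation, assuming a hypothetical symmetric connectivity matrix and synthetic node coordinates (the names and values are illustrative, not the actual connectome data):

    import numpy as np

    rng = np.random.default_rng(2)
    n_nodes = 105
    connectivity = rng.random((n_nodes, n_nodes))
    connectivity = 0.5 * (connectivity + connectivity.T)   # symmetric edge weights
    np.fill_diagonal(connectivity, 0.0)
    node_pos = rng.uniform(-1.0, 1.0, size=(n_nodes, 3))   # 3D node coordinates

    # Disregard the individual edges and weigh each node by its overall
    # connectivity strength (the sum of its incident edge weights); the
    # resulting weighted point cloud then serves as the input data for the fit.
    node_weight = connectivity.sum(axis=1)
    weighted_points = np.column_stack([node_pos, node_weight])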
Currently, MCPM can be used for time-varying data if the temporal evolution of the data set is matched with the model dynamics. For dynamic data with stored snapshots—such as the cosmological simulations—this can be achieved by fitting to a weighted combination of two adjacent snapshots, similar to keyframing in computer animation. In the more general case of dynamically evolving data (e.g., produced in real-time by a simulation), the sensing decisions of the agents would need to incorporate additional knowledge about the data, for instance, by extrapolating the current state of the data set forward in time.
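A minimal sketch of such keyframe-style blending, assuming each snapshot is stored as a weighted point cloud (the function and argument names are placeholders):

    import numpy as np

    def blended_snapshot(points_a, weights_a, points_b, weights_b, t):
        # Weighted combination of two adjacent snapshots: at time t in [0, 1]
        # both snapshots contribute their data points, with weights cross-faded
        # so that the fit gradually drifts from snapshot A to snapshot B.
        points = np.concatenate([points_a, points_b], axis=0)
        weights = np.concatenate([(1.0 - t) * weights_a, t * weights_b], axis=0)
        return points, weights

Feeding the blended point cloud to the fit at successive values of t then lets the polyphorm track the transition between the two snapshots.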
8.3 Biomimetic Modeling
Whether the polyphorm is a natural 3D generalization of the real-world Physarum’s plasmodium is debatable. But undeniably the patterns generated by MCPM do have an organic quality to them.
For instance, networks generated on low-discrepancy sampled point sets (even with spatially varying point density) resemble organic scaffoldings found in diatoms, corals, and bone marrow (Figure 30). This could have applications in tissue engineering (Derby, 2012; Thrivikraman et al., 2019), either via 3D printing or other fabrication methods.
3D printing in general would benefit from organically inspired internal object scaffoldings, since the predominantly used periodic lattices do not lead to even stress distribution (Safonov & Jones, 2017). MCPM could be applied in this context to generate aperiodic quasi-crystalline structures that mimic scaffoldings produced by topological optimization methods. Since MCPM is designed with the intent of being calibrated on reference data, we could find structurally sound parametrizations by fitting to topologically optimized scaffoldings, and then apply them efficiently during the preprocessing of each 3D print, with sensitivity to its particular geometry.
Besides scaffoldings, it would be very interesting to see whether other natural networks, such as mycelia or even neuronal networks, could be modeled by specialized variants of MCPM. And, as in the real organism, adding a repulsive gradient could lead to further, more expressive types of polyphorm geometries.
8.4 Artificial Networks
Many kinds of constructed (abstract) networks are conveniently represented as graphs (e.g., social, logistic, dependency, decision, automaton). Some network optimization problems are also amenable to Physarum-based solutions (Sun, 2017). Problems with substantial inherent uncertainty, such as traffic routing and disease spreading, could especially benefit from MCPM thanks to its customizable probabilistic form.
In Zhou et al. (2020) we investigate the application of MCPM to language embedding data. Language embeddings (Mikolov et al., 2013; Vaswani et al., 2017) are compressed high-dimensional representations produced by neural language models. Their usefulness and predictive power are unmatched in natural language processing, as demonstrated by the recent GPT-3 model (Brown et al., 2020). Our work (Zhou et al., 2020) presents early evidence that embeddings could have an internal structure described by an optimal transport network; see the example in Figure 31.
Given that MCPM can produce robust continuous networks in up to three dimensions, we see great potential in generalizing it to higher dimensions, which becomes feasible thanks to the stochastic formulation and the binary sampling stencil (see the sketch below). Visualizing and automatically navigating high-dimensional latent spaces, such as language (and other kinds of) embeddings, would be very beneficial, as insights into these increasingly popular compressed representations are still lacking.
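To indicate why a binary (two-probe) sampling stencil scales gracefully with dimension, the following hedged sketch performs a single directional decision in d dimensions using only two deposit samples; the acceptance rule, mutation magnitude, and parameter names are simplifying assumptions rather than the exact MCPM update.

    import numpy as np

    def random_unit_vector(d, rng):
        v = rng.normal(size=d)
        return v / np.linalg.norm(v)

    def propagation_step(pos, heading, deposit_at, step, rng, sharpness=1.0):
        # Probe the deposit at only two candidate points: one along the current
        # heading and one along a randomly mutated heading. The cost of the
        # decision is therefore independent of the dimension d.
        d = pos.size
        mutated = heading + 0.5 * random_unit_vector(d, rng)
        mutated /= np.linalg.norm(mutated)

        a = deposit_at(pos + step * heading)
        b = deposit_at(pos + step * mutated)

        # Switch to the mutated heading with probability increasing in its deposit.
        p_switch = b**sharpness / (a**sharpness + b**sharpness + 1e-12)
        if rng.random() < p_switch:
            heading = mutated
        return pos + step * heading, heading

The cost per decision stays constant in d, whereas a full directional stencil would require a number of probes that grows with the dimension.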
8.5 Structure-Sensitive Learning
MCPM bears close resemblance to other self-organizing methods for unsupervised learning. The self-organizing map (Kohonen, 1982) and neural gas (Fritzke, 1994; Martinetz & Schulten, 1991) are good examples of such methods.
The self-organizing map is a topology-preserving dimensionality reduction algorithm: Similar to UMAP, it projects high-dimensional training data to a low-dimensional space (typically 2D) while preserving neighborhood relationships. During the training process the map reorganizes itself by locally “gravitating” toward clusters of data in the original high-dimensional space. The neural gas network operates similarly; however, its topology is not predetermined—instead, neural gas assumes the presence of a high-dimensional topological structure which it tries to adapt to.
Compared to these methods, MCPM does not maintain explicit connectivity between the agents; instead, they flow freely in the domain and all information exchange is done through the deposit field. As a result, the topology of the fit is implicit in the continuous trace density estimate. Recovering the topology explicitly (e.g., as a graph) would be valuable both in the context of the cosmic web reconstruction (to provide a classification mechanism for the filaments and knots) and beyond it; adapting the neural gas algorithm to work on density fields seems like a viable option here.
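As a rough indication of how such an adaptation could look, the sketch below runs a standard rank-based neural gas update on samples drawn in proportion to the voxel densities of a 3D field; all schedules and parameters are illustrative assumptions, not a validated graph-extraction method.

    import numpy as np

    def neural_gas_on_density(density, n_nodes=64, n_iters=20_000, seed=0):
        # 'density' is assumed to be a non-negative 3D array (e.g., a trace field).
        rng = np.random.default_rng(seed)
        probs = density.ravel() / density.sum()
        nodes = rng.uniform(0.0, 1.0, size=(n_nodes, 3)) * np.array(density.shape)

        for it in range(n_iters):
            # Draw a voxel with probability proportional to its density,
            # jittered to a continuous position inside that voxel.
            idx = rng.choice(probs.size, p=probs)
            x = np.array(np.unravel_index(idx, density.shape)) + rng.random(3)

            # Classic rank-based neural gas update: pull all nodes toward the
            # sample with strength decaying exponentially in their distance rank.
            dist = np.linalg.norm(nodes - x, axis=1)
            ranks = np.argsort(np.argsort(dist))
            eps = 0.5 * (0.005 / 0.5) ** (it / n_iters)                      # learning rate schedule
            lam = (n_nodes / 2) * (0.01 / (n_nodes / 2)) ** (it / n_iters)   # neighborhood range
            nodes += (eps * np.exp(-ranks / lam))[:, None] * (x - nodes)
        return nodes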
The advantage that MCPM offers over these methods is the continuous nature of the reconstructed network, permitting its interpretation as a density field (Burchett et al., 2020; Simha et al., 2020) or a throughput field (Zhou et al., 2020). This is enabled by the core design of MCPM: The solution is not a fixed point in the search space, but rather, an average of the possible dynamic agent trajectories.
8.6 MCPM as a Complex Dynamic System
A noteworthy aspect of the model is that we often find the best-fitting polyphorms near tipping points of its dynamic behavior. This happens especially when changing scale-related parameters such as the sensing distance (SD). In a typical scenario we would start fitting with high values of SD, then gradually decrease it. That in turn decreases the size of the reconstructed features, up to the point where the system reaches a supercritical state and "dissolves" in Monte Carlo noise. That tipping point would usually be a local energy minimum. It would be interesting to examine this aspect of the model more formally.
Based on this, we speculate that MCPM could detect its tipping points, for example, through a redesign of the loss function (section 7.3). This could lead to a specialized Physarum-based AI. Such an enhanced model would be able to locally modify its parametrization in order to fit to heterogeneous data. One possible path toward that would be to implement a spatial voting scheme where the agents would locally gather information from their recent exploration of the domain and contribute their individual parameter estimates.
9 Conclusion
We have presented Monte Carlo Physarum Machine (MCPM), a stochastic computational model inspired by the growth and foraging behavior of Physarum polycephalum (slime mold). MCPM takes as its input a weighted point cloud in 2D or 3D, and produces as its output a continuous transport network spanning the point data. The output is represented by a scalar density field expressing the equilibrated spatial agent concentration, and we refer to this entity as a polyphorm to capture both its geometric quality and its origin in a single term.
The main contribution of this work with regard to its predecessor (Jones, 2010) is the probabilistic approach for modeling the agent behavior. This makes the model configurable for a range of applications. The specific version of MCPM analyzed in this article has been designed for application in astronomy, aiming to make sense of large cosmological data sets and reconstruct the filamentary structure of the cosmic web. Thanks to MCPM, we have made new discoveries relating to the structure of the intergalactic medium, which composes the cosmic web (Burchett et al., 2020; Simha et al., 2020). We will continue this line of inquiry and seek out other applications where MCPM can be of use, such as scientific visualization (Elek et al., 2021) and natural language processing (Zhou et al., 2020).
Our open-source implementation of MCPM is called Polyphorm (Elek & Forbes, 2021). Polyphorm runs primarily on the GPU, taking advantage of the inherent parallelizability of MCPM, and provides an interactive reconstruction and visualization experience. We invite the reader to use the software, import their own domain-specific data, and enrich our understanding of where such a model could be meaningfully applied.
There seems to be a degree of universality to MCPM—our experience so far has been that people with very different backgrounds have been able to identify with the generated polyphorm geometries in their specific ways. This is not surprising, since the agents approximately follow paths of least resistance. This energy-minimization property will likely be part of many natural or abstract systems.
We believe that this work reinforces the possibility of Nature as an information processing entity—a computable Universe. The piece of evidence we contribute is that the small set of mathematically coherent rules presented here might apply to such different phenomena at vastly divergent scales.
Acknowledgments
The authors wish to thank Jan Ivanecký for developing an early prototype of the Polyphorm software and for valuable technical assistance during the development, Daisuke Nagai for insightful advice, and the anonymous reviewers whose feedback strengthened the article. We gratefully acknowledge the NVIDIA Hardware Grant for the donation of the Titan X development GPU.