Abstract

We show that self-replication of a chemical system encapsulated within a membrane growing from within is possible without any explicit feature such as autocatalysis or metabolic closure, and without the need for their emergence through complexity. We use a protocell model relying upon random conservative chemical reaction networks with arbitrary stoichiometry, and we investigate the protocell's capability for self-replication, for various numbers of reactions in the network. We elucidate the underlying mechanisms in terms of simple minimal conditions pertaining only to the topology of the embedded chemical reaction network. A necessary condition is that each moiety must be fed, and a sufficient condition is that each siphon is fed. Although these minimal conditions are purely topological, by further endowing conservative chemical reaction networks with thermodynamically consistent kinetics, we show that the growth rate tends to increase on increasing the Gibbs energy per unit molecular weight of the nutrient and on decreasing that of the membrane precursor.

1 Introduction

At the basis of life, chemical self-replication is believed to require specific features such as autocatalysis or metabolic closure. In most previous theoretical work, such features were either made explicit in the chemical reaction network or were emergent as the result of complexity. Examples of the former are Ganti's chemoton, where molecules and chemical pathways are assigned specific roles and functions [9], and Cornish-Bowden's instantiation of Rosen metabolism-repair (M,R) systems, where explicit pathways account for the synthesis of all enzymes [20, 19]. Examples of the latter are Kauffman's autocatalytic protein sets, in which autocatalysis emerges as the result of complexity in random Boolean networks [12], and Kaneko's protocell toy model in which some molecules are assigned specific roles, but all other cell constituents interact and are produced through a large random catalytic chemical reaction network [11, 13]. Our own previous work is also an example of the latter: Building upon Kaneko's work, two molecules (nutrient, membrane precursor) were assigned specific roles, and all other constituents were produced by a large random chemical reaction network [2].

Yet, experimental attempts towards artificial life and accompanying modeling work rely upon much simpler chemistries, typically consisting of only a couple of constituents and reactions [14, 6, 15]. This raises the question of what the minimal conditions for a self-replicating chemical system truly are.

In this work we show that self-replication of a chemical system encapsulated within a membrane growing from within is possible without any explicit feature such as autocatalysis or metabolic closure, and without the need for their emergence through complexity. We use a protocell model relying upon random conservative chemical reaction networks (CRNs) with arbitrary stoichiometry, and we investigate the protocell's behavior, varying the number of reactions. We elucidate the underlying mechanisms in terms of simple minimal conditions pertaining only to the topology of the embedded CRN. A necessary condition for protocell stationary growth is that each moiety must be fed, and a sufficient condition is that each siphon is fed. These conditions are consistent with previously reported mathematical proofs [3].

Further endowing the CRN with thermodynamically consistent kinetics, we show that the growth rate increases on increasing the Gibbs energy per molecular weight of the nutrient and on decreasing that of the membrane precursor. We propose an approximate thermodynamical model to account for this behavior. This approximate model requires knowledge of the CRN topology, the Gibbs energies of chemical constituents, and the nutrient diffusion and membrane parameters, but does not require knowledge of the CRN kinetics.

This article is organized as follows. Section 2 gives definitions and methods. Section 3 gives results. Section 4 is devoted to a discussion, and Section 5 is the conclusion.

2 Definitions and Methods

In this text, a vector v is written in boldface. Its ith coordinate is denoted vi. The ith coordinate of a vector vk, carrying its own index k is denoted vk,i.

2.1 Definitions

Moieties Following standard practice in CRN theory [8], a stoichiometry matrix S may be associated to any CRN. Only conservative CRNs are considered in this work, with all material fluxes (nutrient, membrane precursor incorporation or dilution) made explicit in the protocell model. There thus exists a strictly positive mass vector m such that mTS = 0 (mi is the molecular mass of species Ai). There may exist more than one solution to mass conservation. The set of mass vectors compatible with mass conservation is {m | mi > 0, i = 0,…, N − 1, and mTS = 0}. It is a pointed convex cone having p linearly independent generating vectors {bk}k=0,…, p−1, which constitute a basis of Ker(ST) [21]. Moieties are the elements of such a set of generating vectors and correspond to positive linear combinations of concentrations that are left invariant by the chemical reactions. The support of a moiety b is the subset of chemical species Ai along which b has nonzero components.

Siphons The concept of siphon was first applied to CRNs to study persistence [1]. A siphon Z is a subset of species such that for each species Ai in Z and every reaction where Ai appears as product, at least one of the reactant species also belongs to Z. A siphon is minimal if it does not contain strictly any other siphon. Siphons are essentially the sets of chemical species whose absence cannot be compensated by the chemical reactions.

2.2 Random Network Generation

Topology Any reaction with arbitrary stoichiometry may be decomposed into elementary monomolecular or bimolecular steps, at the expense of increasing the number of chemical intermediate species (even if short-lived). These elementary reactions typically represent elementary physico-chemical events, most of which only involve one or two reactants. This is the reason why we only consider reactions of the kind
formula
where {Ai}i=0,…, N−1 is the set of all N reacting chemical species intervening in the network.

By convention, whenever we refer to a total number of reactions given by R, we choose to count forward and reverse reactions as a single reaction (all chemical reactions are assumed reversible for consistency with thermodynamics [7]). The pool of elementary reactions contains transformation reactions plus association–dissociation reactions (allowing 2AiAk, and forbidding the non-conservative Ai + AkAi), which makes a total of elementary reactions. To generate a random CRN, an initial reaction is first randomly picked in this pool, followed by additional random choices. Compatibility with mass conservation is checked upon each addition by running a previously published standard algorithm [21], which relies upon computing the set of generating vectors for the pointed convex cone of admissible mass vectors (moieties). If a pulled reaction is not compatible with the existing set, it is rejected from this and any subsequent choice. As the number of reactions in the network under construction increases, the number of moieties is reduced, and eventually reaches unity, at which point the rank of the network is N − 1. We denote by Rth1 the number of reactions for which this first threshold is reached, and call the corresponding CRN the threshold-1 network. By construction, Rth1N − 1, because the rank of an N × R stoichiometry matrix S is at most R. Beyond that threshold, it suffices to check that each additional reaction satisfies mass conservation for the single admissible mass vector (single, that is, up to a multiplying factor). At some point, a maximum size is reached when none of the remaining reactions is compatible with mass conservation. We denote by Rmax the corresponding number of reactions, and call the corresponding CRN the maximum-size network.

Minimal siphons are also monitored during this construction process. Increasing the number of reactions increases the number of species participating in minimal siphons, and eventually the only siphon is the full set of species. We denote by Rth2 the number of reactions for which this second threshold is reached, and call the corresponding CRN the threshold-2 network. Consistent with the proof given in  Appendix 1, we observe that Rth2Rth1.

By this random generation process we find that statistical distributions are (i) narrow for Rth1, (ii) broader for Rth2, and (iii) even broader for Rmax. Generating a sample of 1000 networks with N = 10 chemical species, and pulling any elementary reaction with equal probability, we found 9 ≤ Rth1 = 10 ± 1.6 ≤ 21; 9 ≤ Rth2 = 13 ± 4 ≤ 43; and 14 ≤ Rmax = 66 ± 22 ≤ 108. The boundary numbers are the minimum and maximum values in the sample, and the number after ± designates the standard deviation. These numbers are to be compared with a total of 495 elementary reactions in the entire pool of elementary reactions. Figure 1 shows these statistical distributions. Although Rth2 = Rth1 for 393 out of the 1000 pulled networks, there were 12 networks for which the single siphon threshold could not be reached even for R = Rmax and for which no Rth2 could be defined.

Figure 1. 

Histogram distribution of Rth1 (left, bin size = 1), Rth2 (center, bin size = 2), and Rmax (right, bin size = 5) for a sample of 1000 random CRNs.

Figure 1. 

Histogram distribution of Rth1 (left, bin size = 1), Rth2 (center, bin size = 2), and Rmax (right, bin size = 5) for a sample of 1000 random CRNs.

Kinetics Once the network topology has been defined, mass-action kinetics are then assigned. A first step consists in assigning random standard chemical potentials {μ0, i}i=0,…, N−1 (also known as standard Gibbs free energies of formation) to the species {Ai}i=0,…, N−1. The second step consists in using these to determine the forward versus reverse directions (under standard conditions) for each reaction by comparing the sum of reference chemical potentials for products minus that for reactants, Δμ. The third and final step consists in assigning random kinetic coefficients for forward reactions, and computing kinetic coefficients for reverse reactions, consistently with thermodynamics, using
formula
where c is the standard concentration for which standard chemical potentials are defined, δ is the difference in molecularity between forward and reverse, R is the ideal gas constant, and T is the absolute temperature. It is numerically verified that such random CRNs endowed with thermodynamically consistent kinetics verify detailed balance and minimize their free energy, for a closed system at equilibrium.
Numerical parameters were chosen as follows, with U(x, y) designating a random real number with equal probability between x and y. Taking the reference concentration c = 1 M, random reference chemical potentials were chosen as μ0, i/RT = U(−7.5, 7.5) so that the maximum variation in chemical potential approaches that for ATP hydrolysis. Random forward kinetic coefficients were taken spanning two orders of magnitude through k = kavg × 10U(−1,1) with kavg = 1 × 102 s−1 for monomolecular forward reactions and kavg = 1 × 104 M−1·s−1 for bimolecular forward reactions, consistent with typical kcat and kcat/Km for enzymatic processes. The choice of typical kcat and kcat/Km as average kinetic coefficients is justified as follows. Actual metabolic networks are made of enzymatic reactions, which may be decomposed as an association elementary step followed by a dissociation elementary step:
formula
Michaelis-Menten kinetics are derived by approximating the net balance across these two reactions (each considered with mass-action kinetics) under the following assumptions: The second step should be strongly forward biased with E : SE + P (dissociation) having a significantly large kinetic coefficient kcat; and the sum of kinetic coefficients for E : SE + S and E : SE + P (dissociations) divided by the kinetic coefficient for E + SE : S (association) should be a sufficiently low concentration Km. This corresponds to monomolecular (to bimolecular) forward elementary reactions having kcat (kcat/Km) as typical kinetic coefficient.

When generating multiple CRNs, a new random set of standard chemical potentials and a new random set of forward kinetic coefficients are generated for each CRN. Figure 2 gives an example random CRN.

Figure 2. 

Example of a random conservative CRN, with N = 10. Here rank(S) = N − 1, and the single siphon thresholds were both reached for Rth1 = Rth2 = 9 reactions (wide blue lines), while maximum size was reached for Rmax = 71 reactions (wide blue and black lines). Each reaction is bidirectional, although for clarity only arrows indicating forward reactions are shown. Single mass vector components are {mi}i=0,…,9 = {2, 2, 4, 2, 1, 4, 3, 1, 2, 2}.

Figure 2. 

Example of a random conservative CRN, with N = 10. Here rank(S) = N − 1, and the single siphon thresholds were both reached for Rth1 = Rth2 = 9 reactions (wide blue lines), while maximum size was reached for Rmax = 71 reactions (wide blue and black lines). Each reaction is bidirectional, although for clarity only arrows indicating forward reactions are shown. Single mass vector components are {mi}i=0,…,9 = {2, 2, 4, 2, 1, 4, 3, 1, 2, 2}.

2.3 Protocell Model

In this section, we present a protocell model that embeds the previously considered CRN within a membrane synthesized from within. Two species participating in the CRN play special roles: the nutrient Anu flowing by diffusion across the membrane, and the membrane precursor Ame. Figure 3 gives a schematic representation of the protocell model.

Figure 3. 

Schematic of the protocell model. The nutrient Anu flows bidirectionally across the membrane by diffusion. The membrane results from the self-assembly of the membrane precursor Ame. The membrane surface area grows as the result of unidirectional Ame incorporation in the self-assembled membrane.

Figure 3. 

Schematic of the protocell model. The nutrient Anu flows bidirectionally across the membrane by diffusion. The membrane results from the self-assembly of the membrane precursor Ame. The membrane surface area grows as the result of unidirectional Ame incorporation in the self-assembled membrane.

Assumptions

  • 1. 

    Constant surface-area-to-volume ratio : An example of cellular shape compatible with this assumption is that of a thin filament. This is a simplifying assumption, as in the most general case this ratio should only be constrained to be periodic, ρ(t + T) = ρ(t), where T is the cell cycle period, with the additional constraint so that at the end of a cell cycle, one ends up with exactly the material required for two new identical cells.

  • 2. 

    Self-assembly of a precursor Ame in a structured membrane: Incorporation of Ame from inside the protocell into the growing membrane is kinetically controlled, with kinetic coefficient Kme per unit area, or kme = ρ0Kme per unit volume, owing to assumption 1. Denoting by Nme the number of molecules per unit area in the self-assembled membrane, Cme = ρ0Nme is the effective membrane concentration as if all self-assembled molecules were dissolved in the cell volume.

  • 3. 

    Semipermeability of self-assembled membrane to a nutrient Anu flowing by passive diffusion: The resulting nutrient influx is Finput,nu per unit area, or finput,nu = ρ0Finput,nu per unit volume, owing to assumption 1.

  • 4. 

    Homogeneous concentrations: All chemical species are assumed to be homogeneously distributed, and all intracellular diffusion effects are neglected. This is a simplifying assumption, as one should expect the concentration of the nutrient Anu (respectively, the membrane precursor Ame) to be highest (lowest) near the membrane that acts as an effective source (sink) for such chemical species.

  • 5. 

    Constant outside nutrient concentration, even in the presence of a growing protocell: This would typically be the case with an outside growth medium whose volume is large compared to that of the protocell.

Cell division is not taken into account in this work. We shall consider the protocell as compatible with self-replication if the concentration vector trajectory converges asymptotically either towards constant concentrations (protocell stationary growth) or towards periodic concentrations with periodicity T. In the latter case, an additional requirement is that the cell surface area (and volume ) be doubled every T, , where is the instantaneous growth rate.

Ordinary differential equation (ODE) system With the above assumptions, the ODE system governing the time evolution of concentrations inside the protocell is given by
formula
where c is the concentration vector inside the protocell, S is the stoichiometry matrix, f = f(c) is the reaction flux vector giving the reaction rates as a function of the concentration vector c, finput is the nutrient flux vector with sole nonzero component finput,nu along Anu, foutput is the membrane precursor incorporation flux vector with sole nonzero component foutput,me along Ame, and λc represents the dilution factor with growth rate λ given by
formula
where is the protocell volume (surface area). The latter equality in the above equation results from assumption 1.
From Assumption 3 we have
formula
where is as an effective diffusion constant and cout,nu is the constant outside nutrient concentration.
From assumption 2, where kme was defined as an effective kinetic coefficient, we have
formula
λ can be related to foutput,me and Cme as follows. As Ame gets incorporated into the growing membrane with rate (per unit area) Fme = Kmecme, the membrane area grows as
formula
Equations 26 thus give the ODE system governing the time evolution of concentrations for the growing protocell. The CRN is characterized by its stoichiometry matrix S and by the reaction kinetics represented by the dependence of the reaction flux vector on the concentration vector, f = f(c). The membrane is characterized by kme and Cme accounting for precursor incorporation, as well as by accounting for nutrient diffusion. And the environment is characterized by the outside nutrient concentration cout,nu.

Minimum conditions for stationary growth We have previously proved necessary and sufficient conditions for protocell stationary growth [3]:

  • 1. 

    A necessary condition is that each moiety b must be fed. Otherwise, all species in the support of b go extinct by dilution, as there is no material influx into this support.

  • 2. 

    A sufficient condition for the existence of a fixed point is that each siphon contains the support of a moiety (that is fed). Extending the results in [3], we have recently proved that a weaker sufficient condition for the existence of a fixed point is simply that each siphon is fed [4].

These conditions hold for any conservative CRN endowed with any kinetics and for any membrane parameters. Although this is not considered in the present numerical analyses, the proofs in [3] hold even if some species leak out through the membrane.

2.4 Numerical Methods

In order to determine the protocell system trajectory for various combinations of membrane (, kme, Cme) and environmental (cout,nu) parameters, a set of nominal parameters is first chosen, and the parameter space is then explored by varying nominal parameters over several orders of magnitude.

Nominal parameters Nominal (, cout,nu, kme, Cme) values were selected that we believed to make biological sense for E. coli, chosen as an arbitrary reference organism. These values are , cout,nu = 1 × 10−3 M, kme = 1 × 10−3 s−1, and Cme = 5 × 10−2 M. The chosen cout,nu is a typical glucose concentration used in E. coli growth medium. The rationale for the chosen Cme value is the following: The total dry weight in the exponential growth phase is estimated to be 500 fg, out of which 8% corresponds to the membrane [17]. This gives a total dry weight of 40 fg for the membrane, corresponding to a density of 40 g·l−1 for an estimated 1-μm3 total cell volume. The membrane is primarily constituted of phosphatidylethanolamine with molecular weight 744 g·mol−1 [10]. This gives . As there is no equivalent direct way of estimating a biologically sensible kme, we proceeded indirectly. The total metabolite concentration in E. coli is 300 mM [17]. We arbitrarily assumed that membrane precursors constitute 10% of this pool, that is, cme ≈ 30 mM. The cell division time is typically 40 min, corresponding to a growth rate of 4 × 10−4 s−1. This is given for our model by Equation 6, from which we extract kme = λCme/cme ≈ 1 × 10−3 s−1. Last, was chosen so that the asymptotic value of cme approximately matches the above expected cme.

Approach for exploring the parameter space

  • 1. 

    The protocell trajectory was first integrated for the N × N = 100 possible (Anu, Ame) combinations using the above nominal parameters (, cout,nu, kme, Cme).

  • 2. 

    Anu (Ame) was then fixed as the species that maximizes (minimizes) μ0, i /mi. As explained in  Appendix 3, this (Anu, Ame) combination tends to maximize the growth rate. Then the parameters (, cout,nu, kme, Cme) were varied around their nominal values the following way:

    • (a) 

      First, (kme, Cme) were kept constant at their nominal values, and the ODE system was integrated for various (x, y) = (, cout,nu) values.

    • (b) 

      Second, (, cout,nu) were kept constant at their nominal values, and the ODE system was integrated for various (x, y) = (kme, Cme) values.

      In each case, three values of x (generated in three-order-of-magnitude variation around the nominal value), for each of which nine values of y (generated in one-order-of-magnitude variation around the nominal value) were tested.

Each of the 154 trajectories (100 trajectories for the (Anu, Ame) span + 27 trajectories for the (, cout,nu) span + 27 trajectories for the (kme, Cme) span) was integrated using three different choices of initial conditions: all ci equal to 1 × 10−9 M; all ci equal to 1 M; and all ci equal to their equilibrium concentrations cequ,i (cequ as defined in  Appendix 2). These various initial conditions are referred to as low, high, and medium, respectively.

This characterization process was repeated for the maximum-size, the threshold-2, and the threshold-1 (if different from threshold-2) networks. Additional characterizations were also carried out after removing the last-added reaction from the threshold-1 network.

ODEs were handled in vectorial form using the odeint native ODE solver from the scientific Python software package (Python version 2.7.3) with dynamical adjustment of the time step.

Investigated CRNs In addition to the reference CRN example shown in Figure 2, a consecutive series of ten CRNs out of the sample of 1000 networks with N = 10 chemical species was also fully characterized. This series was chosen so as to include one of the peculiar CRNs exhibiting shorter siphons even at maximum size. A table summarizing qualitative results for all 1 + 10 tested CRNs is given in the following section. Figures showing quantitative results are only given for the reference example. The corresponding figures for the other 10 CRNs are given in Supplementary Material.

3 Results

3.1 Qualitative

Qualitative findings for the 1 + 10 tested CRNs are given in Table 1. CRNs are indexed by their original numbering when generating the sample of 1000 random CRNs, but have been grouped by similarity in features and behavior.

Table 1. 

Summary of qualitative protocell results for the 1 + 10 tested CRNs. The top line gives the CRN index. CRN index 1 is the reference example shown in Figure 2. CRN index 231 has a shorter siphon even at maximum size. CRNs are grouped by similarity in features and behavior. Failed combinations are for medium or high initial conditions. The notation (p, qr) for failed (x, y) = (, cout,nu) or (kme, Cme) combinations refers to failed y-indices from q to r for the x-index p, out of the 3 × 9 tested (x, y) combinations. “n.” stands for none.

CRN index1236239230231233234235238232237
Rth1 13 13 11 11 10 11 
Rth2 13 13 12 n. 13 11 11 15 15 15 
Rmax 71 85 87 75 25 53 67 61 56 108 75 
Ai outside minimal siphon n. n. n. A9 A4 A4A2 A5A3 A0 A3
     A7  A6   A5
          A6
          A7 
Failed (nu,me) for R = Rth1 n. n. n. (9,1), (4,0), (4,1), (2,1), (5,5), (3,3), (0,1), (3,1), 
   (9,6), (4,1), (4,2), (2,2) (5,6), (3,8) (0,3), (3,6), 
   (9,9) (4,4), (4,3),  (5,7),  (0,6) (3,7), 
    (4,6), (4,4),  (5,8),   (3,8), 
    (4,8), (4,5),  (5,9),   (5,1), 
    (4,9) (4,6),  (6,3),   (5,5), 
     (4,7),  (6,5),   (5,8), 
     (4,8),  (6,6),   (6,1), 
     (4,9)  (6,7),   (6,6), 
       (6,8)   (6,7), 
          (6,8), 
          (7,1), 
          (7,6), 
          (7,7), 
          (7,8) 
Anu for max.  A0 A3 A8 A3 A8 A9 A6 A3 A5 A0 A3 
Failed (, cout,nun. n. n. n. n. n. n. n. n. (1,1–5), All 
         (2,1–5),  
         (3,1–5)  
Failed (kme,Cmen. n. n. n. n. n. n. n. n. (2,1–9), (2,1–9), 
         (3,1–9) (3,1–9) 
CRN index1236239230231233234235238232237
Rth1 13 13 11 11 10 11 
Rth2 13 13 12 n. 13 11 11 15 15 15 
Rmax 71 85 87 75 25 53 67 61 56 108 75 
Ai outside minimal siphon n. n. n. A9 A4 A4A2 A5A3 A0 A3
     A7  A6   A5
          A6
          A7 
Failed (nu,me) for R = Rth1 n. n. n. (9,1), (4,0), (4,1), (2,1), (5,5), (3,3), (0,1), (3,1), 
   (9,6), (4,1), (4,2), (2,2) (5,6), (3,8) (0,3), (3,6), 
   (9,9) (4,4), (4,3),  (5,7),  (0,6) (3,7), 
    (4,6), (4,4),  (5,8),   (3,8), 
    (4,8), (4,5),  (5,9),   (5,1), 
    (4,9) (4,6),  (6,3),   (5,5), 
     (4,7),  (6,5),   (5,8), 
     (4,8),  (6,6),   (6,1), 
     (4,9)  (6,7),   (6,6), 
       (6,8)   (6,7), 
          (6,8), 
          (7,1), 
          (7,6), 
          (7,7), 
          (7,8) 
Anu for max.  A0 A3 A8 A3 A8 A9 A6 A3 A5 A0 A3 
Failed (, cout,nun. n. n. n. n. n. n. n. n. (1,1–5), All 
         (2,1–5),  
         (3,1–5)  
Failed (kme,Cmen. n. n. n. n. n. n. n. n. (2,1–9), (2,1–9), 
         (3,1–9) (3,1–9) 

CRNs in the first vertical block (indices 1, 236, and 239) are such that Rth1 = Rth2, while all others had a single shorter minimal siphon at R = Rth1 (having a single minimal siphon at R = Rth1 is not a general property: It is possible to build counterexamples having several minimal siphons and a single moiety). CRNs in the second vertical block (indices 230, 231, 233, 234, 235, and 238) are such that the Ai that maximizes μ0, i/mi belongs to the shorter minimal siphon at R = Rth1. CRNs in the last vertical block (indices 232 and 237) are such that the Ai that maximizes μ0, i/mi is outside the shorter minimal siphon at R = Rth1, which affects the protocell stationary growth capability when varying (, cout,nu) or (kme, Cme).

Their features have been grouped by horizontal blocks. The first block gives topological features. The second block gives protocell characteristics when varying (Anu, Ame) with fixed nominal (, cout,nu, kme, Cme) parameters. The last block gives protocell characteristics when varying (, cout,nu) or (kme, Cme) with fixed (Anu, Ame) such that Anu (Ame) maximizes (minimizes) μ0, i/mi.

Findings are summarized below:

  • 1. 

    For maximum-size or threshold-2 networks (i.e., for CRNs having the full set of species as single siphon), protocell trajectories asymptotically converge towards constant strictly positive concentrations (stationary growth regime) for any (Anu, Ame) and (, cout,nu, kme, Cme) combination. The same stationary concentrations are reached for any of the three choices of initial conditions.

  • 2. 

    For threshold-1 networks such that Rth1 < Rth2:

    • (a) 

      If Anu belongs to the minimal siphon, protocell trajectories converge towards constant strictly positive concentrations (stationary growth regime) for any (Anu, Ame) and (, cout,nu, kme, Cme) combination. The same stationary concentrations are reached for any of the three choices of initial conditions.

    • (b) 

      If Anu lies outside the minimal siphon, stationary growth may be reached for some Ame and (, cout,nu, kme, Cme) choices, but not for others. The same behavior (stationary growth or not) and the same stationary concentrations (in case of stationary growth) are reached for the medium and high initial conditions. In some cases, stationary growth is reached from low but not from medium or high initial conditions, or stationary growth is reached from all initial conditions but different concentrations are reached for the low initial conditions. Multistationarity is thus in principle possible, although in these cases stationary concentrations reached starting from low initial conditions are so low for species in the unfed siphon that they are not physically sensible and are not considered further in this work.

  • 3. 

    Removing any reaction from threshold-1 networks (in which case CRNs have two moieties) typically results in no stationary protocell growth for any (Anu, Ame) and (, cout,nu, kme, Cme) combination or for any choice of initial conditions. At least one of the chemical species asymptotically disappears. However, exceptions are encountered with possible stationary growth when the supports of the two moieties intersect for some species and when Anu is chosen in this intersection.

Finding 3 is consistent with the necessary condition given in Section 2.3 (each moiety must be fed). At threshold 1, rank(S) = N − 1 and the only moiety is the mass vector m having as support the full set of species. The necessary condition is thus automatically satisfied for any (Anu, Ame) choice. But removing any reaction below threshold 1 typically results in rank(S) = N − 2, and there are two moieties, the support of which is most often disjoint. This explains why in general such a CRN cannot grow with a single nutrient flux (finding 3 above), with possible exceptions when the supports of the two moieties intersect for some species and Anu is chosen in this intersection (in the most general case where the supports do not intersect, we have verified numerically that feeding the network with two nutrient fluxes such that each moiety is fed also typically results in stationary growth).

Finding 1 is consistent with the sufficient condition given in Section 2.3 (each siphon contains the support of a moiety that is fed). If RRth2 (threshold-2 or maximum-size networks), then the only moiety is the mass vector m having as support the full set of species that is also the only siphon. The sufficient condition is thus satisfied for any (Anu, Ame) choice. The proof in [3] is only partial, because it only ensures existence, not uniqueness or stability. Finding 1 suggests that this fixed point might be stable and unique, that is, a global attractor. Further work is required either to confirm this hypothesis with an extended proof, or to invalidate it with a counterexample.

Finding 2a is consistent with the weaker sufficient condition mentioned in Section 2.3: Having each siphon fed (and not necessarily having each siphon containing the support of a moiety that is itself fed) suffices to ensure stationary growth, also with the corresponding fixed point apparently being a global attractor.

Besides the stationary regimes observed in this work, an oscillatory behavior such that c(t + T) = c(t) (T being the cell cycle period) would also be compatible with cell replication provided the cell surface area (or volume ) is doubled every . Although we did not observe such behavior in our random CRNs, we note that the proved necessary and sufficient conditions in [3] also hold for both stationary and oscillatory cases.

3.2 Quantitative

Protocell stationary growth characteristics are given below for the reference example shown in Figure 2 (maximum-size and threshold networks; threshold 1 and threshold 2 coincide for this example). Characteristics for the 10 other random CRNs (threshold-1, threshold-2, and maximum-size networks) are given in the Supplementary Material.

 Appendix 2 presents an approximate thermodynamic model that gives an estimate of stationary concentrations assuming the out-of-equilibrium protocell is close to thermodynamic equilibrium. This approximate model will prove useful when analyzing the dependence of the stationary out-of-equilibrium state on protocell parameters.

Growth rate as function of (Anu, Ame)Figure 4 gives the growth rate λ and the ratio λ/λequ (where λequ is the estimated growth rate using the approximate thermodynamic equilibrium model) as a function of the (Anu, Ame) combination for both threshold and maximum-size networks. Chemical species indices are ranked by decreasing reference chemical potential per unit molecular weight, μ0,i/mi. Carrying out a two-dimensional linear regression, this is the ranking that tends to give the highest correlation coefficient (averaging this correlation coefficient over all tested 1 + 10 CRNs) compared to a ranking by decreasing μ0, i or to the unranked indices: λ tends to increase with increasing μ0, nu/mnu and with decreasing μ0, me/mme.  Appendix 3 explains why such a trend is also to be expected from the approximate thermodynamic equilibrium model. Although all (Anu, Ame) combinations theoretically result in stationary growth, only those combinations that tend to have a high μ0, nu/mnu (or a low μ0, me/mme) result in a significant growth rate.

Figure 4. 

Growth rate λ (left) and ratio λ/λequ (right; λequ is the estimated growth rate from the approximate thermodynamic equilibrium model) as a function of the (Anu, Ame) combination for the threshold (top) and maximum-size (bottom) example random CRN shown in Figure 2. Species indices were ranked by decreasing reference chemical potential per unit molecular weight, μ0, i/mi.

Figure 4. 

Growth rate λ (left) and ratio λ/λequ (right; λequ is the estimated growth rate from the approximate thermodynamic equilibrium model) as a function of the (Anu, Ame) combination for the threshold (top) and maximum-size (bottom) example random CRN shown in Figure 2. Species indices were ranked by decreasing reference chemical potential per unit molecular weight, μ0, i/mi.

Both threshold and maximum-size networks exhibit similar λ patterns (left plots), whereas λ/λequ (right plots) exhibit significant differences. This is because λ spans about 20 orders of magnitude whereas λ/λequ only spans about 3 orders of magnitude. λequ is thus a fair predictor of the λ-versus-(Anu, Ame) pattern. This also shows that features in these patterns originate mostly from the characteristics of chemical species {(mi, μ0, i)}i=0,…,N−1 as well as from the protocell parameters (, cout,nu, kme, Cme), but not from the CRN kinetics, because the approximate model is independent of these (and is the same for both threshold and maximum-size networks). The maximum-size network is closer to equilibrium than the threshold network, which is consistent with the larger number of chemical pathways through which the system can relax.

Growth rate as function of (, cout,nu) and of (kme, Cme) We here keep (Anu, Ame) = (A0, A7), which maximizes μ0, nu/mnu and minimizes μ0, me/mme. Figure 5 [Figure 6] gives the growth rate λ and the ratio λ/λequ for varying (, cout,nu) [for varying (kme, Cme)] for both threshold and maximum-size networks. As intuitively expected, the growth rate increases with increasing effective diffusion constant , nutrient concentration, cout,nu in the outside growth medium or kinetic coefficient kme for membrane precursor incorporation; and it decreases with increasing effective membrane concentration Cme.  Appendix 3 explains why the trends in Figures 5 and 6 (left) for the dependence of λ on (, cout,nu) or (kme, Cme) are to be expected from the approximate thermodynamic equilibrium model.

Figure 5. 

Growth rate λ (left) and ratio λ/λequ (right; λequ is the estimated growth rate from the approximate thermodynamic equilibrium model) as a function of cout,nu for three values of , for the threshold (top) and maximum-size (bottom) example random CRN shown in Figure 2. Top and bottom plots share the same x-axis. (kme, Cme) are kept constant at their nominal values, and (Anu, Ame) = (A0, A7).

Figure 5. 

Growth rate λ (left) and ratio λ/λequ (right; λequ is the estimated growth rate from the approximate thermodynamic equilibrium model) as a function of cout,nu for three values of , for the threshold (top) and maximum-size (bottom) example random CRN shown in Figure 2. Top and bottom plots share the same x-axis. (kme, Cme) are kept constant at their nominal values, and (Anu, Ame) = (A0, A7).

Figure 6. 

Growth rate λ (left) and ratio λ/λequ (right; λequ is the estimated growth rate from the approximate thermodynamic equilibrium model) as a function of Cme for three values of kme, for the threshold (top) and maximum-size (bottom) example random CRN shown in Figure 2. Top and bottom plots share the same x-axis. (, cout,nu) are kept constant at their nominal values, and (Anu, Ame) = (A0, A7).

Figure 6. 

Growth rate λ (left) and ratio λ/λequ (right; λequ is the estimated growth rate from the approximate thermodynamic equilibrium model) as a function of Cme for three values of kme, for the threshold (top) and maximum-size (bottom) example random CRN shown in Figure 2. Top and bottom plots share the same x-axis. (, cout,nu) are kept constant at their nominal values, and (Anu, Ame) = (A0, A7).

Figures 5 and 6 (right) show that λ remains within a factor of two from λequ, except at the highest value for the threshold network. The membrane precursor stationary concentration cme (proportional to λ through Equation 6) is thus fairly close to its equilibrium value. But this is not true of all chemical species. This is illustrated in Figure 7, giving the stationary concentrations {ci}i=0,…, N−1 and the ratios {ci/cequ,i}i=0,…, N−1 (where cequ is the concentration vector estimated using the approximate thermodynamic equilibrium model given in  Appendix 2) as a function of the outside nutrient concentration cout,nu with all other parameters (, kme, Cme) fixed to their nominal values and (Anu, Ame) = (A0, A7). Although cme = c7 remains fairly close to equilibrium, other concentrations may be several orders of magnitude away from equilibrium. Here again, the maximum-size network is significantly closer to equilibrium than the threshold network, which is consistent with the larger number of chemical pathways through which the system can relax.

Figure 7. 

Stationary concentrations {ci}i=0,…,N−1 (left) and ratios {ci/cequ,i}i=0,…,N−1 (right; cequ is the concentration vector estimated using the approximate thermodynamic equilibrium model) as a function of cout,nu for the threshold (top) and maximum-size (bottom) example random CRN shown in Figure 2. Top and bottom plots share the same x-axis. Species in the legend box were ranked by decreasing reference chemical potential per unit molecular weight, μ0, i/mi. The parameters (, kme, Cme) are kept constant at their nominal values, and (Anu, Ame) = (A0, A7).

Figure 7. 

Stationary concentrations {ci}i=0,…,N−1 (left) and ratios {ci/cequ,i}i=0,…,N−1 (right; cequ is the concentration vector estimated using the approximate thermodynamic equilibrium model) as a function of cout,nu for the threshold (top) and maximum-size (bottom) example random CRN shown in Figure 2. Top and bottom plots share the same x-axis. Species in the legend box were ranked by decreasing reference chemical potential per unit molecular weight, μ0, i/mi. The parameters (, kme, Cme) are kept constant at their nominal values, and (Anu, Ame) = (A0, A7).

4 Discussion

4.1 Chemical Interpretation of Necessary and Sufficient Conditions

The necessary condition is consistent with common intuition: Either species in the support of an unfed moiety go extinct by dilution, or the growth rate tends towards zero. Models of specific metabolic pathways often make directly apparent the existence of certain moieties. Taking glycolysis as an example, all reactions that involve ATP (involve ADP) result in it being downconverted to ADP (upconverted to ATP). As a result, if such a metabolic pathway were to be considered in a closed environment, the sum of concentrations for ATP and ADP would remain constant. And if glycolysis were to be considered in a cellular model with a finite growth rate (and thus dilution), the concentrations of ATP and ADP would progressively be reduced towards zero by dilution. Full cell models must therefore also explicitly take into account additional metabolic pathways that synthesize ATP (or ADP). In essence, this is what the necessary condition says.

The chemical interpretation of the sufficient condition is more subtle. Siphons are the sets of species that can go extinct independently of other species. Dilution accompanying growth tends to drain siphons unless they are directly fed (sufficient condition), and species in minimal siphons tend to go extinct. The reason why this sufficient condition is not also necessary (when there exists some siphon that is not directly fed, the protocell may or may not reach stationary growth) is that siphons are connected to other species through pass reactions of the kind
formula
where Aj and Ak belong to the siphon and Ai does not. Whether such a pass reaction is capable of transferring mass into the siphon at a sufficient rate to compensate dilution depends not only on the way the concentrations cj and ck may tend towards zero relative to cme (which drives the growth rate and thus the dilution), but also on the value of ci (Ai outside the siphon belongs to the support of a fed moiety) as well as on the kinetic coefficients of the pass reaction. This is consistent with finding 2b in Section 3.1: Stationary growth may or may not be reached, depending on the chosen parameters (Anu, Ame) and (, cout,nu, kme, Cme).

4.2 Growth Drives the System Away from Thermodynamic Equilibrium

With nutrient flowing across the membrane by bidirectional diffusion, one could expect the system to relax towards thermodynamic equilibrium, with the same chemical state inside and outside the protocell. That would be the case if the only contributions to the ODE system given by Equation 2 were the contributions from chemical reactions given by Sf and from the nutrient flux given by Equation 4. But there are two additional contributions that drive the system away from thermodynamic equilibrium: the membrane precursor incorporation rate −foutput (with foutput,me given by Equation 5) and the dilution factor −λc (with λ given by Equation 6). These additional contributions are generally only negligible if the membrane precursor concentration cme is very low, which results in a negligible growth rate. This is typically the case when the Gibbs energy per unit molecular weight is high for Ame or low for Anu. In such cases, not only is the nutrient concentration inside the protocell very similar to that in the outside growth medium, but so are also all other concentrations (unless there is a continuous flow of Anu through the outside growth medium as hypothesized in [22]).

If on the contrary the Gibbs energy per unit molecular weight is high for the nutrient Anu and low for the membrane precursor Ame, then the growth rate tends to be significant. With such an energetic configuration, the rate of transformation of Anu into Ame (as well as into all other cell constituents) tends to be large, the nutrient concentration inside the protocell is significantly below that in the outside growth medium, and the membrane precursor concentration tends to be high. This results not only in a large growth rate, but also in a protocell that is significantly driven away from equilibrium.

4.3 Osmotic Pressure and Cell Geometry

Our assumption of a constant surface-area-to-volume ratio ρ could typically correspond to a situation where the cell membrane is be so rigid that it is constrained to take a filament shape regardless of any difference in osmotic pressure across the membrane.

Conversely, if the cell membrane were free to take any shape without any mechanical constraint, the cell shape (and corresponding ratio ρ) would adjust to balance the osmotic pressure across the membrane. ρ would then be an additional variable besides the concentration vector c in the ODE system. Assuming the dynamics of such a balancing to be very fast, this would correspond to a constraint of the kind uTc = uTcout, where u is the osmolar vector (such that ui is the number of particles obtained by dissolving one molecule of Ai in water) and cout is the constant outside concentration vector. Taking the derivative with given by Equation 2 would relate λ = to (c, ρ) through
formula
where Finput (Foutput) is the nutrient (membrane precursor incorporation) flux vector per unit area. The missing additional differential equation for ρ could be simply obtained from
formula
This would make the set of ODEs fully self-consistent, with the trajectories of both the concentration vector c and the surface-area-to-volume ratio ρ being solutions of this extended set of ODEs, similarly to the approach taken in [16]. It should be noted that the proofs in [3] do not hold in such a case. Under which minimal conditions would such a scheme result in protocell stationary growth or in a periodic regime compatible with self-replication remains an open question.

A more realistic situation should lie somewhere between these two extremes, taking into account specific mechanical properties of the membrane. This is typically the approach followed in [5] or [23].

4.4 Comparison with Previously Reported Necessary and Sufficient Conditions

Within the framework of the proposed protocell model, a necessary condition for stationary growth is that each moiety must be fed, and a sufficient condition is that each siphon is fed. In this section, we compare these conditions with previously reported minimal conditions.

An osmotic synchronization condition for stationary reproduction of a protocell has previously been reported in [15]. This condition was derived by combining two conditions:

  • 1. 

    From purely geometric considerations, the growth control coefficient, defined as the relative rate of increase of the cell membrane volume divided by the relative rate of increase of the cell membrane area must be equal to unity for stationary cell replication to occur.

  • 2. 

    Water flows instantaneously across the membrane to balance any difference in osmotic pressure that may result from a difference in total concentration of chemical species between the inside of the cell and the outside growth medium.

Combining these two conditions results in the osmotic synchronization condition, which is not obviously met by any CRN, even those satisfying our sufficient condition. Condition 1 is equivalent to our assumption 2 of having a constant ratio. We believe that, in the most general sense, this ratio need not necessarily be constant, but should only be periodic, , with T being the cell cycle period (just as the concentration vector c need not necessarily be constant but should be periodic, c(t + T) = c(t)). Whether such a relaxation of condition 1 would lead to an extension of the class of CRNs satisfying the osmotic synchronization condition remains an open question (see the above discussion in Section 4.3). In any case, the simple necessary and sufficient conditions introduced in the present article correspond to a simplified model that focuses on the CRN properties, and that does not take into account other physical constraints such as the balance of osmotic pressure or mechanical properties of the membrane. Future work is required to build a generic model including such additional features, along the lines presented in Section 4.3.

Different sufficient conditions for emergent synchronization in protocells have also been proved in [6], synchronization referring in this work to the property of having the rate of synthesis for the membrane material matching that for the inner cell constituents. Such an emergent synchronization is quite distinct from earlier proposals [9] where synchronization was achieved by some ad hoc mechanism. Our results could also be qualified as emergent synchronization, because such a synchronization is automatically achieved once our sufficient condition is met. However, it is difficult to compare the earlier work with ours, because of the significant differences in model assumptions: In [6] self-replication of protocell constituents occurs without explicit nutrient flux or mass conservation, and the membrane precursor concentration is assumed constant (corresponding to a membrane growing from the outside). On the one hand, the protocell models for which the sufficient conditions in [6] were proved are a closer match to experimental approaches towards artificial life than ours. On the other hand, our approach may prove more general with respect to the metabolic network represented by the CRN.

A different line of work [13] has established different growth phases for a generic protocell model embedding a CRN of arbitrary complexity. In addition to various growth phases characterized by high or low growth rate and high or low nutrient influx, a death phase was also identified, corresponding to a negative growth rate. This death phase occurs if the nutrient concentration in the outside growth medium, cout,nu, is too low. We see such behavior only for CRNs that do not comply with the sufficient condition. But for those CRNs that are compliant with that condition, stationary growth is guaranteed for any cout,nu value. Here again, this may arise from different model assumptions: In [13] the moiety constituted by the energy currency molecules (A and B, using the notation of [13]) is kept at constant total concentration, and the membrane precursor concentration is also kept constant (which could correspond to a membrane growing from the outside, as in [6]). In our model, every moiety must be explicitly fed, and the membrane precursor concentration is a variable, which corresponds to the case of a membrane growing from within. Regarding the random CRN topology, all reactions in [13] are of the autocatalytic type. Excluding the specific moiety or siphon constituted by energy currency molecules, this choice is likely to result in a single moiety with all species having the same mass, and the large number of reactions considered (R/N ≥ 8) is likely to result in a single siphon. This should make the CRNs considered in [13] a particular case of the more general class of CRNs considered in the present article, and as such cannot account for the difference in behavior.

4.5 Relevance of Random CRNs

We do not make any claim regarding the chemical relevance of random CRNs, nor of the particular choice of N = 10 used in numerical analyses. Our goal in using such an approach is to understand the properties of any conservative chemical reaction networks (including real ones), and to unveil the underlying mechanisms behind such properties. Derived laws should then apply to any reaction network, including real ones. The necessary and sufficient conditions proved in [3] are a first step in this direction. More work remains to be done towards a finer delineation of the class of CRNs compatible with self-reproduction. Likewise, the approximate model approach is applicable to any thermodynamically consistent conservative CRN, and not just to the random ones generated in this work.

4.6 Robustness

This discussion addresses two aspects: (i) robustness against by-products, side reactions, or leakage through the membrane; and (ii) robustness against removing chemical reactions or changing kinetics.

Regarding (i), any conservative CRN also encompasses situations where elementary building blocks may be degraded or produced from simpler constituents inside the protocell. The proved necessary and sufficient conditions in [3] even hold in the case where some species may leak out of the membrane by passive diffusion. Considering all species that may diffuse through the membrane as a single set including nutrients and leaking species, the sufficient condition grants the existence of a fixed point, but does not predict which species concentrations settle at a higher or lower value inside versus outside concentration. In such a stationary growth regime, we can simply observe that concentrations settle to values such that as a whole there is a net positive material influx to ensure a positive growth rate, that is, the rate of mass nutrient influx exceeds the rate of mass leakage.

Regarding (ii), we have already mentioned that removing any reaction from a threshold-1 network typically results in no stationary growth with a single nutrient, unless the supports of the two moieties intersect for some species and the nutrient belongs to that intersection. Removing any reaction from a threshold-2 network typically results in this CRN having a shorter siphon. If Anu is chosen within this siphon, then protocell stationary growth is still obtained. Else, protocell stationary growth may or may not be possible, depending on the particular CRN, its kinetics, or protocell membrane parameters. Removing any reaction from a maximum-size network typically results in this CRN being still above threshold 2, which automatically assures protocell stationary growth regardless of the chosen (Anu, Ame) and of the protocell parameters. The above statements are consistent with common intuition: A larger number of reactions tends to increase the robustness with respect to reaction removal (robustness in terms of qualitative ability to grow). From a quantitative standpoint, the out-of-equilibrium stationary growth state gets closer to the approximate thermodynamic equilibrium model on increasing the number of reactions. This approximate model only depends on the individual characteristics of chemical species (masses and reference chemical potentials) and is independent of the CRN topology or reaction kinetics. This shows that increasing the number of reactions increases the quantitative phenotype stability upon reaction removal or upon changes in reaction kinetics.

4.7 Minimal Requirements for Self-Replication

Within the framework of our protocell model, there is no requirement for any specific features nor for their emergence through complexity. Any CRN such that the only siphon is the full set of species results in stationary protocell growth for any (Anu, Ame) choice and for any (, cout,nu, kme, Cme) parameter set. With our random network generation process, Figure 1 shows that this is reached on average for a number of reactions of R = Rth2 = 13 corresponding to a fairly low complexity ratio R/N = 1.3. Even smaller networks such that Rth1R < Rth2 (corresponding to a complexity ratio 1 ≤ R/N < 1.3) result in protocell stationary growth for any Ame choice and for any (, cout,nu, kme, Cme) parameter set, provided Anu belongs to some minimal siphon. And if there exists some siphon that does not contain Anu, our numerical results show that there exists some (, cout,nu, kme, Cme) parameter range for which stationary growth can also be achieved. It thus appears that the class of CRNs compatible with protocell stationary growth is very broad.

5 Conclusion

Using a generic protocell model, we have shown that self-replication does not require any specific features such as autocatalysis or metabolic closure. Nor does it require the emergence of such features through complexity. Most random networks lead to protocells capable of stationary growth over a very broad parameter range, even at very low complexity with a number of reactions that is comparable to the number of chemical species. We have illustrated simple necessary and sufficient conditions for protocell stationary growth. A necessary condition is that each moiety must be fed. And a sufficient condition is that each siphon is fed. These conditions are consistent with previously reported mathematical proofs and hold for any conservative CRN, kinetics, membrane parameters, or growth medium nutrient concentration.

Although the theoretical capability for stationary growth only depends on the CRN topology, we have found that the nutrient (respectively, the membrane precursor) should have a high (a low) Gibbs energy per unit molecular weight for the growth rate to be significant. This was explained using an approximate thermodynamic equilibrium model. This model only requires knowledge of the chemical species characteristics (reference chemical potential and molecular weight), of the kinetics for nutrient influx and membrane precursor incorporation, and of membrane structural properties (number of molecules per unit membrane area). But it does not require knowledge of the reaction network topology or of the CRN kinetics.

Acknowledgments

The authors would like to thank Pierre Legrain, Loïc Paulevé, and Laurent Schwartz for stimulating discussions, as well as anonymous reviewers for their useful comments.

References

1
Angeli
,
D.
,
De Leenheer
,
P.
, &
Sontag
,
E. D.
(
2007
).
A Petri net approach to the study of persistence in chemical reaction networks
.
Mathematical Biosciences
,
210
(
2
),
598
618
.
2
Bigan
,
E.
,
Steyaert
,
J.-M.
, &
Douady
,
S.
(
2013
).
Properties of random complex chemical reaction networks and their relevance to biological toy models
. In
Proceedings of Journées Ouvertes Biologie Informatique Mathématiques (JOBIM 2013)
. .
3
Bigan
,
E.
,
Steyaert
,
J.-M.
, &
Douady
,
S.
(
2014
).
On necessary and sufficient conditions for proto-cell stationary growth
. In
Proceedings of the Fifth International Workshop on Static Analysis and Systems Biology (SASB 2014). Electronic Notes in Theoretical Computer Science
(in press). Available at http://arxiv.org/abs/1411.6772
.
4
Bigan
,
E.
,
Steyaert
,
J.-M.
, &
Douady
,
S.
(
2015
).
Necessary and sufficient conditions for protocell growth
.
Journal of Mathematical Biology
(
submitted
).
5
Božič
,
B.
, &
Svetina
,
S.
(
2004
).
A relationship between membrane properties forms the basis of a selectivity mechanism for vesicle self-reproduction
.
European Biophysics Journal
,
33
(
7
),
565
571
.
6
Carletti
,
T.
,
Serra
,
R.
,
Poli
,
I.
,
Villani
,
M.
, &
Filisetti
,
A.
(
2008
).
Sufficient conditions for emergent synchronization in protocell models
.
Journal of Theoretical Biology
,
254
(
4
),
741
751
.
7
Ederer
,
M.
, &
Gilles
,
E. D.
(
2007
).
Thermodynamically feasible kinetic models of reaction networks
.
Biophysical Journal
,
92
(
6
),
1846
1857
.
8
Érdi
,
P.
, &
Tóth
,
J.
(
1989
).
Mathematical models of chemical reactions: Theory and applications of deterministic and stochastic models
.
Manchester
:
Manchester University Press
.
9
Gánti
,
T.
(
1997
).
Biogenesis itself
.
Journal of Theoretical Biology
,
187
(
4
),
583
593
.
10
Huijbregts
,
R. P. H.
,
de Kroon
,
A. I. P. M.
, &
de Kruijff
,
B.
(
2000
).
Topology and transport of membrane lipids in bacteria
.
Biochimica et Biophysica Acta (BBA)—Reviews on Biomembranes
,
1469
(
1
),
43
61
.
11
Kaneko
,
K.
(
2006
).
Life: An introduction to complex systems biology
.
Berlin
:
Springer-Verlag
.
12
Kauffman
,
S. A.
(
1993
).
The origins of order: Self-organization and selection in evolution
.
New York
:
Oxford University Press
.
13
Kondo
,
Y.
, &
Kaneko
,
K.
(
2011
).
Growth states of catalytic reaction networks exhibiting energy metabolism
.
Physical Review E
,
84
(
1
),
011927
.
14
Luisi
,
P. L.
, &
Varela
,
F. J.
(
1989
).
Self-replicating micelles—a chemical version of a minimal autopoietic system
.
Origins of Life and Evolution of the Biosphere
,
19
(
6
),
633
643
.
15
Mavelli
,
F.
, &
Ruiz-Mirazo
,
K.
(
2013
).
Theoretical conditions for the stationary reproduction of model protocells
.
Integrative Biology
,
5
(
2
),
324
341
.
16
Morgan
,
J. J.
,
Surovtsev
,
I. V.
, &
Lindahl
,
P. A.
(
2004
).
A framework for whole-cell mathematical modeling
.
Journal of Theoretical Biology
,
231
(
4
),
581
596
.
17
Neidhardt
,
F. C.
(
1996
).
Escherichia coli and Salmonella: Cellular and molecular biology
.
Washington, DC
:
ASM
.
18
Onsager
,
L.
(
1931
).
Reciprocal relations in irreversible processes. I
.
Physical Review
,
37
(
4
),
405
.
19
Piedrafita
,
G.
,
Montero
,
F.
,
Morán
,
F.
,
Cárdenas
,
M. L.
, &
Cornish-Bowden
,
A.
(
2010
).
A simple self-maintaining metabolic system: Robustness, autocatalysis, bistability
.
PLoS Computational Biology
,
6
(
8
),
e1000872
.
20
Rosen
,
R.
(
1991
).
Life itself: A comprehensive inquiry into the nature, origin, and fabrication of life
.
New York
:
Columbia University Press
.
21
Schlosser
,
P. M.
, &
Feinberg
,
M.
(
1994
).
A theory of multiple steady states in isothermal homogeneous CFSTRs with many reactions
.
Chemical Engineering Science
,
49
(
11
),
1749
1767
.
22
Serra
,
R.
, &
Villani
,
M.
(
2013
).
Mechanism for the formation of density gradients through semipermeable membranes
.
Physical Review E
,
87
(
4
),
062814
.
23
Surovtsev
,
I. V.
,
Zhang
,
Z.
,
Lindahl
,
P. A.
, &
Morgan
,
J. J.
(
2009
).
Mathematical modeling of a minimal protocell with coordinated growth and division
.
Journal of Theoretical Biology
,
260
(
3
),
422
429
.

Appendix 1 Proof for Rth2Rth1

The proof proceeds in two steps. First, we shall prove that the support of any moiety is a siphon. Second, we shall prove that if there is more than one moiety (i.e., if R < Rth1), then at least one of them has its support shorter than the full set of species. This support is thus a siphon that is shorter than the full set of species (i.e., R < Rth2). This proves that Rth2Rth1.

The proof of the two steps is given below.

  • 1. 

    First step: The following definition will prove useful for this first step. For any moiety b and any reaction in a CRN, we define the product b-weight (the reactant b-weight) as the sum of the components of b along product species (reactant species) weighted by the corresponding stoichiometric coefficients. Conservation ensures that the b-weight is conserved on either side of the reaction, that is, the product b-weight equals the reactant b-weight.

    The support supp(b) of any moiety b is a siphon. This is because if any species Ai in supp(b) appears on the product side of a reaction, then the corresponding product b-weight is strictly positive, and so is the corresponding reactant b-weight. A strictly positive reactant b-weight implies that at least one of the reactant species belongs to supp(b), which matches the definition of a siphon.

  • 2. 

    Second step: If a CRN has more than one moiety (i.e., rank(S) = Np with p > 1), then at least one of the moieties has its support shorter than the full set of species. If that were not the case, one could find two moieties b1 and b2 having as support the full set of species. Defining i0 as the species index for which b1, i/b2, i is minimized, and x = b1, i0/b2, i0, one can define a new moiety b = b1xb2. This b is a moiety because (i) it is a linear combination of two elements of Ker(ST) and as such also belongs to Ker(ST), and (ii) all its components are positive or null. In particular, bi0 = 0, which results in supp(b) being shorter than the full set of species.

Appendix 2 Approximate Thermodynamic Model

Assuming the protocell out-of-equilibrium stationary state is close to equilibrium, this state can be determined from the mass balance relation. Multiplying both sides of Equation 2 by mT and denoting the density D = mTc gives
formula
We have in the stationary state. Feeding Equations 4 and 5 into Equation 9 and rearranging terms gives
formula
Defining Dme = mmeCme, Equation 10 can be rewritten as
formula

The equilibrium chemical potential vector μequ and the mass vector m both belong to Ker(ST). Considering only CRNs at or above threshold 1, the mass vector is the only moiety, and dim(Ker(ST)) = 1, which forces μequ and m to be collinear, μequ = αm.

The equilibrium concentrations cequ,i are uniquely determined by α through
formula
Feeding Equation 12 into Equation 10 gives a polynomial equation in where the left-hand side (LHS) is a constant and the right-hand side (RHS) is a polynomial expression containing only positive terms, which uniquely defines α and the equilibrium state cequ.

cequ only depends on {mi}i=0,…,N−1 and {μ0,i}i=0,…, N−1 and not on kinetics, nor on topology. In particular, it is the same for maximum-size, threshold-1 or threshold-2 CRNs, as well as for any other subset of the maximum-size network having rank N − 1. Further, cequ is an approximation of the stationary concentration vector denoted c, assuming the latter to be close to thermodynamic equilibrium. Although c may be far away from thermodynamic equilibrium, we shall see that many of the features in the dependence of c on (, cout,nu, kme, Cme) and on the (Anu, Ame) combination are actually also exhibited by cequ. This approximate model holds for any conservative and thermodynamically consistent CRN and does not presume specific kinetics (mass-action or other).

By a thermodynamically consistent CRN [7], we refer to any CRN such that every reaction is bidirectional (quasi-irreversible reactions are simply reversible with one direction having very low rate) and such that it is endowed with kinetics ensuring detailed balance at equilibrium for a closed system, that is, for every reaction the forward and reverse rates are exactly balanced [18]. Detailed balance ensures that the free energy is minimized at equilibrium.

This approximate model can be extended to the general case where there are several moieties {bk}k=0,…,p−1 that constitute a basis of Ker(ST). Decomposing μequ on this basis (μequ = ∑k=0,…,p−1 αkbk) and multiplying both sides of Equation 2 by , gives a set of p coupled polynomial equations in {exp(αk/RT)}k=0,…, p−1, where the p LHSs are constant and strictly positive (because each moiety must be fed according to the necessary condition) and the RHSs are polynomial expressions with only positive terms. This uniquely defines {αk}k=0,…,p−1 and the equilibrium state cequ.

Appendix 3 Trend Analysis for the Approximate Thermodynamic Equilibrium Model

This appendix analyzes the way the approximate thermodynamic equilibrium model (as defined in  Appendix 2) depends on various model parameters, and how this is in agreement with numerical observations.

Equation 11 (restated below) is a polynomial equation in that defines the equilibrium concentration vector μequ = αm:
formula
with
formula
formula
formula
We wish to determine how this equilibrium state varies with varying the (Anu, Ame) combination and with varying the (, cout,nu) and (kme, Cme) parameters. We shall follow the approach presented in the main text for the exploration of the parameter space. First, we will keep (, cout,nu) and (kme, Cme) constant and analyze how the growth rate λ depends on the choice of the (Anu, Ame) combination ( Appendix A3.1). Second, we will then fix the choice of the (Anu, Ame) combination, keep (kme, Cme) constant, and analyze how the concentrations {ci}i=0,…,N−1 and the growth rate λ vary with varying (, cout,nu) ( Appendix A3.2). Third, we will also fix the choice of the (Anu, Ame) combination, keep (, cout,nu) constant, and analyze how the concentrations {ci}i=0,…,N−1 and the growth rate λ vary with varying (kme, Cme) ( Appendix A3.3).

A3.1 Vary the (Anu, Ame) Combination; Keep (, cout,nu) and (kme, Cme) Constant

We shall proceed with a crude analysis of the solutions to Equation 11, assuming that its right-hand side is dominated by either one or the other contribution, or λ(Dme + D).

A3.1.1 Assume , or Equivalently, cnucout,nu
The unknown α is approximately determined from
formula
α being thus determined, one can verify the conditions under which the main assumption is actually verified. Equation 11 may be rewritten as
formula
with
formula
and
formula
The main assumption is equivalent to having both RHS1 ≪ 1 and RHS2 ≪ 1. α having been determined in Equation 16 above, one can calculate RHS1 and RHS2:
formula
Noting that
formula
gives
formula
Similarly,
formula
Noting that
formula
gives
formula
Noting that
formula
shows that for RHS1 ≪ 1 and RHS2 ≪ 1 to hold, one should try to choose Anu with a low μ0,nu/mnu (or Ame with a high μ0,me/mme).
A3.1.2 Assume , or Equivalently, cnucout,nu

We shall further assume that either Dme or D dominates the other.

  • 1. 
    Starting with the further assumption that DDme, we then have
    formula
    and the unknown α is approximately given by
    formula
    α being thus determined, one can verify the conditions under which the main assumption
    formula
    is actually verified. Noting that
    formula
    we have
    formula
    And for the main assumption cnucout,nu to be verified, one should try to choose Anu with a high μ0,nu/mnu (or Ame with a low μ0,me/mme).
  • 2. 
    Continuing with the further alternative assumption that DmeD, we then have
    formula
    where Ai0 is the most abundant species, the contribution of which dominates D. The unknown α is then approximately given by
    formula
    α being thus determined, one can verify the conditions under which the main assumption as in step 1 is actually verified. Noting that
    formula
    we have
    formula
    And for the main assumption cnucout,nu to be verified, one should try to choose Anu with a high μ0,nu/mnu (or Ame with a low μ0,me/mme).

The above is is just a trend analysis and does not guarantee that a suitable choice of (Anu, Ame) suffices to make one or the other contribution on the right-hand side of Equation 11 dominant, because the actual values of these contributions also depend on the particular numerical values used in the model. It illustrates the rationale for ordering the different species according to their reference chemical potential per unit molecular weight, μ0, i/mi, when plotting the protocell growth rate λ as a function of the (Anu, Ame) combination in the main text. And it explains why one should try to choose nutrient (respectively, membrane precursor) with a high (low) reference chemical potential per unit molecular weight for the growth rate to be significant, as seen in Figure 4.

A3.2 Vary (, cout,nu); Keep the (Anu, Ame) Combination and (kme, Cme) Constant

We are now interested in determining the way solutions to Equation 11 vary with (, cout,nu). Equation 11 may be rewritten as
formula
where f and g are monotonically increasing functions of α defined by
formula
formula

We will consider both variables and cout,nu in turn.

A3.2.1 Vary ; Keep cout,nu Constant
All other parameters being kept constant, Equation 35 uniquely defines α as a function of . Differentiating this equation with respect to the variable gives
formula
where f′ and g′ designate the derivatives of the functions f(·) and g(·). We thus obtain :
formula

The equilibrium state solution of Equation 11 is such that cout,nu > cnu = f(α) (i.e., cout,nuf(α) > 0), and f′(α) + g′(α) > 0 because f and g are monotonically increasing functions of α. This shows that . As α increases with increasing , all concentrations ci as well as the growth rate λ (which is an increasing function of cme) also increase. This explains the pattern observed in Figure 5.

A3.2.2 Vary cout,nu; Keep Constant
All other parameters being kept constant, Equation 35 uniquely defines α as a function of cout,nu. Differentiating this equation with respect to the variable cout,nu gives
formula
We thus obtain dα/dcout,nu:
formula

This shows that dα/dcout,nu > 0. As α increases with increasing cout,nu, all concentrations ci as well as the growth rate λ (which is an increasing function of cme) also increase. This is consistent with the pattern observed in Figures 5 and 7.

A3.3 Vary (kme, Cme); Keep the (Anu, Ame) Combination and (, cout,nu) Constant

We are now interested in determining the way solutions to Equation 11 vary with (kme, Cme). Equation 11 may be rewritten as
formula
where F is a monotonically increasing function of α defined by
formula
We will consider both variables kme and Cme in turn.
A3.3.1 Vary kme; Keep Cme Constant
All other parameters being kept constant, Equation 42 uniquely defines α as a function of kme. Differentiating this equation with respect to the variable kme gives
formula
where F′ designates the derivative of the function F(.).
Further noting that
formula
results in
formula
F, cme, and D being monotonically increasing functions of α, the denominator in the above expression is negative and dα/dkme < 0.

Feeding this information into Equation 45 also shows that dλ/dkme > 0, because F and D are monotonically increasing functions of α.

As a result, increasing kme results in a decrease in all concentrations but in an increase in the growth rate λ. This is consistent with the pattern observed in Figure 6.

A3.3.2 Vary Cme; Keep kme Constant
All other parameters being kept constant, Equation 42 uniquely defines α as a function of Cme. Differentiating this equation with respect to the variable Cme gives
formula
Further noting that
formula
and that
formula
results in
formula
This shows that , because F, cme, and D are monotonically increasing functions of α.
Feeding Equation 50 back into Equation 48 and rearranging terms leads to
formula
This shows that dλ/dCme < 0, because F, cme, and D are monotonically increasing functions of α.

As a result, increasing Cme results in an increase in all concentrations and in a decrease in the growth rate λ. This is consistent with the pattern observed in Figure 6.

Author notes

Contact author.

∗∗

Laboratoire d'Informatique (LIX), École Polytechnique, 91128 Palaiseau Cedex, France. E-mail: bigan.erwan@orange.fr (E.B.); steyaert@lix.polytechnique.fr (J.-M.S.)

Laboratoire Matière et Systèmes Complexes, UMR7057 CNRS, Université Paris Diderot, 75205 Paris Cedex 13, France. E-mail: stephane.douady@univ-paris-diderot.fr (S.D.)

Supplementary data