Inspired by gamma-band oscillations and other neurobiological discoveries, neural network research has shifted its emphasis toward temporal coding, which uses the explicit times at which spikes occur as an essential dimension of neural representations. We present a feature-linking model (FLM) that uses the timing of spikes to encode information. The first spiking time of FLM is applied to image enhancement, and the processing mechanisms are consistent with the human visual system. The enhancement algorithm boosts details while preserving the information of the input image. Experiments demonstrate the effectiveness of the proposed method.
The description of neural processing has piqued interest in temporal correlation, which involves the precise timing of spikes (Hopfield, 1995; Singer & Gray, 1995; Gray, 1999; Victor, 2000; Reinagel & Reid, 2002; Wang, 2005; VanRullen, Guyonneau, & Thorpe, 2005; Izhikevich, 2006; von der Malsburg, Phillips, & Singer, 2010; Gütig, Gollisch, Sompolinsky, & Meister, 2013; Nikolić, Fries, & Singer, 2013). Neurons communicate by spikes that carry information about their time of arrival, and stimulus information can be encoded in the timing of individual spikes (Victor, 2000). Neurons encode information about the spatial image content in the timing of the first spike (Gütig et al., 2013; Gollisch & Meister, 2008). Neural networks can represent information through the explicit time at which a spike occurs, and Hopfield (1995) pointed out that gamma-band oscillations play a very important role in representing information. Gamma-band oscillations are a fundamental process in cortical computation. They have been discovered in the primary visual cortex (Eckhorn et al., 1988; Gray, König, Engel, & Singer, 1989), and numerous studies have discussed the processes underlying them (Fries, 2009; Buzsáki & Wang, 2012). After the discovery of gamma-band oscillations (Eckhorn et al., 1988; Gray et al., 1989), Eckhorn, Reitboeck, Arndt, and Dicke (1990) proposed the linking field network inspired by gamma-band oscillations, and it was applied to scene segmentation using temporal correlation (Stoecker, Reitboeck, & Eckhorn, 1996). Temporal correlation provides an elegant method of scene analysis and may encode feature binding among neurons (Milner, 1974; von der Malsburg, 1994; Gray, 1999). These findings bear on the binding problem, which was addressed in a special issue of Neuron in September 1999.
A study of the dynamics of coupled neural oscillators that interact with spikes via a fast threshold modulation (Somers & Kopell, 1993, 1995) produces a temporal correlation approach for solving the problem of scene analysis (Wang & Terman, 1997; Wang, 2005). Béroule (2004) surveyed how temporal correlation had been implicated in perception, learning, and memory. Synaptic efficacy among neurons was modified under the influence of spike-timing-dependent plasticity (Izhikevich, 2006; Lubenov & Siapas, 2008), and Izhikevich (2004, 2006, 2007) studied various models of cortical neurons and analyzed their features and computational efficiencies. Haken (2005, 2008) devoted work to the relationship between the synchronization of spikes and pattern recognition.
Johnson et al. studied synchronous spike dynamics and developed pulse-coupled neural networks (PCNN) (Johnson & Ritter, 1993; Johnson, 1994; Johnson & Padgett, 1999), which have been widely applied to image processing (Ranganath, Kuntimad, & Johnson, 1995; Johnson & Padgett, 1999; Ranganath & Kuntimad, 1999; Kuntimad & Ranganath, 1999; Ma, Zhan, & Wang, 2011; Lindblad & Kinser, 2013). The evidence of precise spike-timing dynamics can be seen from a special issue of IEEE Transactions on Neural Networks on PCNN in May 1999 and on temporal coding in July 2004. The time series of PCNN, the summation of spikes in a time course, has been applied to invariant image feature extraction (Johnson, 1994; Zhang, Zhan, & Ma, 2007; Zhan, Zhang, & Ma, 2009). The time matrix of PCNN is defined to record the time when each neuron fires the first spike (Johnson & Padgett, 1999). It was used to mark different regions for image segmentation (Stewart, Fermin, & Opper, 2002). Zhan et al. (2009) found that the time matrix had a high sensitivity for low intensities but a low sensitivity for high intensities.
In this letter, we propose a feature linking model (FLM). We find that the time matrix of FLM has a logarithmic relationship with a stimulus when the threshold decays exponentially, and records the timing of spikes simultaneously. FLM has two inputs: feeding inputs and linking inputs. It has a similar input structure to PCNN, but there are two leaky integrators in FLM rather than the three in PCNN. FLM is more effective for obtaining synchronization in a region and desynchronization among different regions. If the feeding synaptic weight of FLM is set to 0, the membrane potential is the same as the spiking cortical model (SCM) (Zhan et al., 2009). If the linking synaptic weight of FLM is set to 0, the membrane potential is the same as the intersecting cortical model (ICM) (Ekblad & Kinser, 2004). Besides the mechanism of desynchronization, we use the time matrix as a fundamental equation of FLM to enhance image contrast.
Image enhancement improves visual appearance in some predefined sense. In most cases, the enhancement effects are evaluated by human visual perception, and image enhancement methods improve the detail for the human visual system (HVS). Most image enhancement methods are based on histograms in the image domain (Stark, 2000; Arici, Dikbas, & Altunbasak, 2009; Gonzalez, Woods, & Eddins, 2009; Xu, Zhai, Wu, & Yang, 2014), and histogram equalization is one of the best-known methods. The histogram is also combined with other techniques for image enhancement (Gonzalez et al., 2009; Tizhoosh, 2000; Cheng & Xu, 2000). Histogram-based methods are useful for images with a poor intensity distribution. Details can be enhanced by filters in the image domain or in some transform domains to increase high-frequency coefficients (Starck, Murtagh, Candes, & Donoho, 2003; Tang, Peli, & Acton, 2003; Tang, Kim, & Peli, 2004). For these methods, it is difficult to select the parameters for the high-frequency components. The time matrix has been applied to image enhancement with SCM (Zhan et al., 2009; Ma, Teng, Zhan, & Zhang, 2012), but SCM overenhances its results and renders parts of dark regions white.
FLM is applied to image enhancement based on the timing of the first spike, which reveals much of the image information. There is a corresponding neuron for each pixel in the image, and the stimulus corresponds to the grayscale intensity of the image. A natural image is input to the network, and the output is obtained from the time matrix. The time matrix is recorded in the single-pass working form of FLM. The time matrix has an approximately logarithmic relation with the stimulus matrix, which is consistent with the Weber-Fechner law. We set parameters carefully under the qualitative analysis of FLM and simulate the Mach band effect in the image enhancement algorithm. The way FLM enhances images is consistent with HVS.
This paper makes the following contributions:
A neural network FLM is designed. Besides the mechanism of linking modulation and dynamic threshold inspired by the gamma band oscillations, the time matrix of FLM, which has neurophysiological support, is emphasized in this paper.
We use the single-pass working form to obtain the time matrix. This form makes it easy to understand the time structure of FLM. FLM produces spikes synchronously via the modulation of two types of synaptic inputs, and we analyze two types of waves that are related to these synaptic inputs.
An image enhancement method is proposed. The method simulates the Mach band effect well, and the processing mechanism is consistent with the Weber-Fechner law. Thus, the processed results of FLM are consistent with HVS.
The rest of the letter is organized as follows. In section 2, we present FLM. In section 3, we describe the time matrix and the single-pass working form of FLM. In section 4, we introduce image enhancement in detail. Section 5 presents numerical experiments and comparison results. Section 6 concludes with some discussion.
2 Feature Linking Model
The feature linking model (FLM) has three components: the membrane potential, the threshold, and the action potential. A dendrite receives postsynaptic action potentials through synapses from receptive fields. The produced action potential is transferred to the neighboring neurons by means of localized synapses located on the dendrites. Electrical charges at the synapses produce the membrane potential. If the potential is large enough to exceed a threshold, the neuron generates an action potential, or spike.
In FLM, the membrane potential and the threshold are represented by leaky integrators.
2.1 Leaky Integrator
2.2 Membrane Potential
2.3 Dynamic Threshold
The postsynaptic action potential increases the threshold by an amount h so that a secondary action potential cannot be generated during a certain period, and the increased threshold decays with the time constant g.
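The threshold behavior described above can be illustrated with a discrete leaky integrator. The following Python sketch assumes an update of the form theta(n) = g * theta(n-1) + h * y(n), which is common in PCNN-style models and may differ from the letter's own equation. The function name `threshold_trace` and the sample values of g, h, and the initial threshold are illustrative assumptions.

```python
def threshold_trace(spikes, g=0.9, h=10.0, theta0=1.0):
    """Return the threshold after each step of an assumed leaky integrator.

    Assumed update (not quoted from the letter):
        theta(n) = g * theta(n-1) + h * y(n)
    where y(n) is 1 when the neuron spikes at step n and 0 otherwise.
    """
    theta = theta0
    trace = []
    for y in spikes:
        theta = g * theta + h * y  # decay by the factor g, jump by h on a spike
        trace.append(theta)
    return trace


if __name__ == "__main__":
    # One spike at step 2: the threshold jumps, then decays geometrically.
    for n, value in enumerate(threshold_trace([0, 0, 1, 0, 0, 0])):
        print(n, round(value, 4))
```

In this sketch the jump by h keeps the threshold above any plausible membrane potential for several steps, which plays the role of the refractory period.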
2.4 Action Potential
2.5 Feature Linking Model
Different from the integrate-and-fire model, FLM has a secondary synapse and the dynamic threshold. The secondary synapse is the linking synapse inspired by the gamma-band synchronization (Eckhorn et al., 1990) and the dynamic threshold is designed to simulate the refractory period of a neuron (French & Stein, 1970; Eckhorn et al., 1990), so FLM has properties close to a biological neural structure. FLM has fewer parameters and variables than PCNN, and FLM is described by using two leaky integrators but PCNN has three. FLM simplifies the membrane potential to a single equation rather than using three in PCNN, because the membrane potential of most biological neural networks is represented by a leaky integrator (Koch & Segev, 2000). In PCNN, the pulse period is emphasized (Johnson & Ritter, 1993; Johnson, 1994; Johnson & Padgett, 1999), but we use the time matrix of FLM to process a signal that has neurophysiological support. The time matrix records the firing order of neurons and reflects the synchronization. With the global inhibition term in FLM, it is effective to obtain synchronization of neurons in each single region and desynchronization among different regions (Stewart et al., 2002).
Both PCNN and FLM have two types of synapses. SCM has only linking synapses and ICM has only feeding synapses. If the feeding synaptic weight is set to 0, FLM is the same as SCM (Zhan et al., 2009). If the linking synaptic weight is set to 0, FLM is the same as ICM (Ekblad & Kinser, 2004). Therefore, FLM is flexible with regard to the choice of parameters.
2.6 Feature Linking via Synchronization
There are two types of synaptic inputs in FLM. The feeding synaptic inputs and the linking synaptic inputs modulate the membrane potential of neurons in the neighborhood to be coupled to each other when postsynaptic action potentials go through the synapses, and the coupled neurons in the neighborhood produce spikes synchronously.
As shown in Figure 3, the changes in the matrix Y look like traveling waves. The example shows that once a neuron fires, its efficacy persists through the iterations and spreads as waves. There are two distinct types of synchronization in FLM: stimulus forced and stimulus induced (Eckhorn et al., 1990). The stimulus-forced synchronization is related to the feeding waves, and the stimulus-induced synchronization is related to the linking waves. The two forms of waves spread from a central neuron with radii that increase step by step. All the neurons in the neighborhood are coupled to each other, and a fired neuron captures some of its neighboring neurons, which then fire synchronously. Once several neighboring neurons have fired, they can in turn capture their own neighbors. The efficacy of the feeding waves propagates to all neurons once a neuron has fired, whereas the linking waves select only neurons whose stimuli are similar to that of the central neuron.
As the overall structural features are the primary data of human perception (Arnheim, 1954), FLM processes images in the light of HVS. The synchronization reflects the fact that two pixels with similar intensity in a neighborhood usually cannot be distinguished by humans. FLM has the property that neurons with similar stimuli in a region are modulated by the waves so that the corresponding neurons in the region fire synchronously.
3 Time Matrix
3.1 Time Matrix
Most of the activity of neurons occurs when the action potentials are produced for the first time. A time matrix T is defined to record the first firing time of each neuron (Johnson & Padgett, 1999; Stewart et al., 2002; Zhan et al., 2009).
3.2 Single Pass
The single pass, a working form of FLM, is completed when all neurons have generated action potentials (Johnson & Padgett, 1999). The single-pass form is used to obtain the time matrix.
The single-pass working form of FLM serves as the network's stopping condition: the network stops automatically once all neurons have fired.
The single pass can be realized when the threshold amplification factor h is set to a large enough value that neurons generate spikes only once (Kuntimad & Ranganath, 1999).
In implementation, the single-pass working form can be given by algorithm 1, which tracks the total number of neurons and counts the neurons that have fired.
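The single-pass loop can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a transcription of algorithm 1: the membrane update uses only a linking term over the four nearest neighbors, the threshold is an assumed leaky integrator, and the names `single_pass_time_matrix`, `total`, `fired`, `beta`, and `theta0` as well as all parameter values are hypothetical.

```python
def single_pass_time_matrix(stimulus, f=0.2, g=0.9, h=1e6, beta=0.1,
                            theta0=1.0, max_iter=10000):
    """Iterate an assumed FLM-like network until every neuron has fired once.

    Returns the time matrix T holding each neuron's first firing iteration.
    All stimuli must be positive, otherwise some neurons never fire.
    """
    rows, cols = len(stimulus), len(stimulus[0])
    total = rows * cols                               # total number of neurons
    fired = 0                                         # neurons fired so far
    u = [[0.0] * cols for _ in range(rows)]           # membrane potential
    theta = [[theta0] * cols for _ in range(rows)]    # dynamic threshold
    y = [[0] * cols for _ in range(rows)]             # spikes of current step
    t = [[0] * cols for _ in range(rows)]             # time matrix

    n = 0
    while fired < total and n < max_iter:
        n += 1
        y_prev = y
        y = [[0] * cols for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                # Linking input: 4-connected neighbors that fired last step.
                link = 0
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    a, b = i + di, j + dj
                    if 0 <= a < rows and 0 <= b < cols:
                        link += y_prev[a][b]
                u[i][j] = f * u[i][j] + stimulus[i][j] * (1.0 + beta * link)
                # A large h keeps a fired neuron silent for the rest of this short run.
                theta[i][j] = g * theta[i][j] + h * y_prev[i][j]
                if u[i][j] > theta[i][j]:
                    y[i][j] = 1
                    if t[i][j] == 0:                  # record only the first spike
                        t[i][j] = n
                        fired += 1
    return t


if __name__ == "__main__":
    for row in single_pass_time_matrix([[0.9, 0.9], [0.1, 0.1]]):
        print(row)
```

The loop stops as soon as the counter of fired neurons reaches the total number of neurons; `max_iter` is only a safeguard for the sketch.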
4 Image Enhancement
The intensity of the input image corresponds to the stimulus of the network; each neuron is located at a pixel (i, j). A two-dimensional image matrix is represented by one neuron per pixel.
Elements in the matrix S are assigned values larger than 0; otherwise, these neurons cannot be captured and never fire when their thresholds are positive values. Values in the stimulus matrix have the smallest grayscale, . Under these considerations, is set to .
4.1 Relationship between Tij and Sij
Based on equations 2.9 and 2.13, we sketch three curves in Figure 4. In the figure, the neuron with a higher stimulus produces the first action potential at the iterative time 3; its value in time matrix is 3. The curve of is decided by the parameter f, Sij, and . The curve of is decided by g and .
The neuron with the higher stimulus produces an action potential naturally at time . The action potential modulates the membrane potential of its neighboring neuron with a low-stimulus , and the lower stimulus can be captured and fired at or in advance. If there is no feature-linking modulation, the lower stimulus is fired at naturally.
Equation 4.5 indicates that the relationship between T and S is consistent with the Weber-Fechner law when f and are set to constants (Zhan et al., 2009). The Weber-Fechner law reveals that the objective intensity and the human subjective response are related logarithmically.
Equation 4.4 is an implicit function obtained under the assumption that there is no synaptic modulation, and equation 4.5 is obtained under the further assumption that the discrete time tends to a large value, so the time matrix cannot be computed from equations 4.4 and 4.5. This section therefore serves only as a qualitative analysis. In implementation, we obtain the time matrix by equation 3.1 and algorithm 1.
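The logarithmic relation can be checked numerically on a single uncoupled neuron. The sketch below assumes a membrane potential U(n) = f * U(n-1) + S and an exponentially decaying threshold theta(n) = g * theta(n-1); these forms, the function names, and the parameter values are assumptions for illustration. Under them, the steady-state potential is S / (1 - f), so the first firing time is close to log(S / ((1 - f) * theta0)) / log(g).

```python
import math


def first_fire_time(s, f=0.2, g=0.9, theta0=1.0, max_iter=10000):
    """Simulate one uncoupled neuron and return its first firing iteration."""
    u, theta = 0.0, theta0
    for n in range(1, max_iter + 1):
        u = f * u + s          # leaky integration of a constant stimulus
        theta = g * theta      # exponential threshold decay, no spike feedback yet
        if u > theta:
            return n
    raise RuntimeError("neuron did not fire within max_iter iterations")


def log_estimate(s, f=0.2, g=0.9, theta0=1.0):
    """Closed-form estimate assuming U has reached its steady state s / (1 - f)."""
    return math.log(s / ((1.0 - f) * theta0)) / math.log(g)


if __name__ == "__main__":
    # Each fourfold increase of the stimulus shifts the firing time by a similar amount.
    for s in (0.01, 0.04, 0.16):
        print(s, first_fire_time(s), round(log_estimate(s), 2))
```

Equal ratios of stimulus map to nearly equal differences in firing time, which is the Weber-Fechner behavior the analysis describes.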
4.2 Parameter f
We substitute equations 4.6 and 4.7 into 4.4 to obtain an implicit function between Tij and Sij, and draw the curve of the implicit function. It can be seen from equations 2.12 and 4.3 that pixels with high intensity are usually fired earlier than low-intensity pixels. The effect of equation 4.6 delays the firing time for the high-intensity pixels to make their values Tij higher because a high-intensity pixel Sij with a lower fij affects the stable value of to be lower, as shown in Figure 4. Therefore, c0, c1, and are adjusted, respectively, to 0.75, 0.05, and 0.4 by using the implicit function, equation 4.4.
The parameter matrix f is smoothed by a Gaussian filter with a standard deviation of 1 so that it becomes more homogeneous within each region of similar intensity.
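A Gaussian smoothing step of this kind can be sketched in plain Python as a separable filter. The standard deviation of 1 comes from the text; the kernel radius of 3, the replicated borders, and the function names are assumptions made only for this illustration.

```python
import math


def gaussian_kernel(sigma=1.0, radius=3):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(matrix, kernel):
    """Convolve every row with the kernel, replicating the border values."""
    radius = len(kernel) // 2
    out = []
    for row in matrix:
        n = len(row)
        new_row = []
        for j in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                idx = min(max(j + k - radius, 0), n - 1)  # clamp to the border
                acc += w * row[idx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def gaussian_smooth(matrix, sigma=1.0, radius=3):
    """Smooth a 2-D matrix with a separable Gaussian (rows, then columns)."""
    kernel = gaussian_kernel(sigma, radius)
    horizontal = smooth_rows(matrix, kernel)
    return transpose(smooth_rows(transpose(horizontal), kernel))


if __name__ == "__main__":
    impulse = [[0.0] * 9 for _ in range(9)]
    impulse[4][4] = 1.0
    print(round(gaussian_smooth(impulse)[4][4], 4))
```

A constant matrix is left unchanged, while isolated deviations are spread over their neighborhood.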
4.3 Initial Threshold
Figure 5 is drawn by using equations 4.7 and 4.9. After filtering by the Laplacian operator, the pixels on the low-intensity side of an edge obtain a positive value, while the pixels on the high-intensity side obtain a negative value. When the filtering result of the Laplacian operator is subtracted from the stimulus signal, the low-intensity side of the edge becomes lower and the high-intensity side becomes higher, so the edge becomes sharper and edge enhancement is achieved.
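The sign behavior at an edge can be reproduced with a small Python sketch. It assumes the common 4-neighbor discrete Laplacian with replicated borders and a hypothetical `strength` factor; the letter's own equations for this step are not reproduced here, so the kernel and the scaling are illustrative choices.

```python
def laplacian(image):
    """Apply a 4-neighbor discrete Laplacian with replicated borders."""
    rows, cols = len(image), len(image[0])

    def px(i, j):
        # Clamp indices so border pixels reuse their nearest neighbor.
        return image[min(max(i, 0), rows - 1)][min(max(j, 0), cols - 1)]

    return [[px(i - 1, j) + px(i + 1, j) + px(i, j - 1) + px(i, j + 1)
             - 4.0 * image[i][j]
             for j in range(cols)] for i in range(rows)]


def sharpen(image, strength=0.2):
    """Subtract the scaled Laplacian from the image to exaggerate edges."""
    lap = laplacian(image)
    return [[image[i][j] - strength * lap[i][j]
             for j in range(len(image[0]))] for i in range(len(image))]


if __name__ == "__main__":
    # Vertical step edge: dark columns on the left, bright columns on the right.
    step = [[0.2, 0.2, 0.2, 0.8, 0.8, 0.8] for _ in range(4)]
    print([round(v, 2) for v in laplacian(step)[1]])
    print([round(v, 2) for v in sharpen(step)[1]])
```

On the step edge, the dark pixel next to the edge is pushed darker and the bright pixel next to it is pushed brighter, which is the undershoot and overshoot pattern associated with the Mach band effect.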
4.4 Output-Enhanced Image
4.5 Optimize the Output
As some neurons may fire far earlier or later than most others, we optimize the intensity values in grayscale image to new values in J such that 2% of the data is saturated at low and high intensities of . This increases the contrast of the output image J.
The simplest way to solve equation 4.13 is to loop the lower limit m and the upper limit over the whole intensity range and compute the difference of every candidate pair of limits that satisfies the constraint condition. In fact, it is not necessary to loop over the whole intensity range, because m varies only from 0 to a certain intensity determined by the constraint on the cumulative distribution probability (CDP). In this way, we narrow the search range of m. The search range of the upper limit is narrowed in the same way. After the search ranges are determined, the next step is to compute the difference of each candidate pair and find the pair with the minimum difference.
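One possible reading of this search is sketched below in Python. It assumes the goal is the narrowest intensity window that keeps at least 98% of the pixels, so that at most 2% are saturated in total; this formulation, the integer intensity levels, and the names `stretch_limits`, `stretch`, `low`, and `high` are assumptions rather than the letter's equation 4.13. The lower limit is scanned only while the fraction of pixels below it stays within the saturation budget, which mirrors the narrowed search range described above.

```python
import math


def stretch_limits(pixels, levels=256, saturation=0.02):
    """Find the narrowest window [low, high] that keeps 1 - saturation of the pixels."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    allowed = math.floor(saturation * total + 1e-9)  # pixels that may saturate
    need = total - allowed                           # pixels that must stay inside
    cum = [0] * (levels + 1)                         # cum[k] = pixels with value < k
    for k in range(levels):
        cum[k + 1] = cum[k] + hist[k]

    best = (0, levels - 1)
    high = 0
    for low in range(levels):
        if cum[low] > allowed:       # too many pixels already below the lower limit
            break
        high = max(high, low)
        while high < levels - 1 and cum[high + 1] - cum[low] < need:
            high += 1                # the upper limit only ever moves right
        if cum[high + 1] - cum[low] >= need and high - low < best[1] - best[0]:
            best = (low, high)
    return best


def stretch(pixels, levels=256, saturation=0.02):
    """Clip to the window found above and rescale linearly to the full range."""
    low, high = stretch_limits(pixels, levels, saturation)
    span = max(high - low, 1)
    return [round((min(max(p, low), high) - low) * (levels - 1) / span)
            for p in pixels]


if __name__ == "__main__":
    sample = [0] + [100] * 49 + [150] * 49 + [255]
    print(stretch_limits(sample))
```

Because the upper limit never needs to move left as the lower limit increases, the search is linear in the number of intensity levels instead of quadratic.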
5 Experimental Results
We compare our approach with state-of-the-art methods qualitatively and quantitatively. A large number of experiments have been conducted to demonstrate the effectiveness of the proposed method.
5.1 Experiment Setup
Before the algorithm is implemented, we initialize all elements in the matrices U, Y, and T to 0. is initialized by equation 4.7.
After a number of tests, the scalar parameters of the proposed algorithm are given in Table 1. As discussed in section 3, h is set by using equation 7 of Kuntimad and Ranganath (1999), by which each neuron is guaranteed to produce a spike only once during a cycle. d is set by considering Stewart et al.’s (2002) analysis. Considering equation 2.9, g is set to a value that is close to 1, which makes the iteration number large. A relatively large iteration number implies that the values in the time matrix have a large range and the output image has a high grayscale resolution.
Table 1: Scalar parameters of the proposed algorithm. Columns: h, d, g, and two further parameters.
The proposed FLM method is compared with six algorithms: histogram equalization (HEQ) (Gonzalez et al., 2009), the fuzzy set method (FSM) (Cheng & Xu, 2000; Gonzalez et al., 2009), the discrete cosine transform domain method (DDM) (Tang et al., 2003), the SCM method (Zhan et al., 2009; Ma et al., 2012), the generalized equalization model (GEM) (Xu et al., 2014), and gradient distribution specification (GDS) (Gong & Sbalzarini, 2014).
The parameter settings of the other methods are as follows. HEQ has no parameter. The parameter of DDM is set to 1.2. The default parameters given by respective authors are adopted for the methods based on FSM (Cheng & Xu, 2000; Gonzalez et al., 2009), SCM (Zhan et al., 2009; Ma et al., 2012), GEM (Xu et al., 2014), and GDS (Gong & Sbalzarini, 2014).
5.2 Quantitative Evaluation
5.3 Visual and Quantitative Comparison
The experiments are conducted to demonstrate the effectiveness of the proposed image-enhancement algorithm. We select two grayscale images and three color images. The two grayscale images (cameraman.tif and tire.tif) are provided in Matlab image toolbox. A color image (sunset) is downloaded from Koren (2004). Two other color images (flower and lunaria) are from Farbman, Fattal, Lischinski, and Szeliski (2008). The input images and the enhanced images obtained by different methods are shown in Figures 6 to 10.
HEQ, DDM, and FSM enhance contrast globally, so they are not suitable for every image. HEQ obtains good results for the images tire, flower, and lunaria, as shown in Figures 7b, 9b, and 10b, respectively. However, the HEQ result for the tire is still less clear than that of FLM (see Figure 7h). The green plants in the background are not enhanced well in Figure 9b. For the dark region and the leaves of the lunaria, the FLM algorithm obtains a clearer result than HEQ, as shown in Figure 10.
In the DDM algorithm, the contrast is defined in DCT domain, and the details of the image are globally enhanced with a unified parameter. The results of DDM algorithm improve visibility in only a limited way, as shown in Figures 6c, 7c, 8c, 9c, and 10c.
The FSM algorithm does not enhance the local contrast, and its results are not even clearer than the input images, as shown in Figures 6d, 7d, and 10d. Globally, the FSM algorithm makes the dark pixels darker and the bright ones brighter. This algorithm renders the lawn in Figure 6d well.
Some dark pixels in the SCM results change to bright ones, as shown in Figures 7e, 9e, and 10e. These dark pixels never fire, so their values in the time matrix remain at the initial value 0, and the inverse operation turns them white (Zhan et al., 2009). Some results of the SCM algorithm suffer from contrast degradation in bright regions, as shown in Figures 6e, 8e, 9e, and 10e, because many bright pixels in SCM fire too early and their values in the time matrix are close together. The FLM algorithm solves both problems. In FLM, the single-pass working form and the positive stimulus matrix S guarantee that even the neurons with the lowest stimuli fire. The attenuation coefficients f are set as a matrix related to the input image rather than as a single small scalar value, which allows the proposed algorithm to locate the dark zones correctly. The algorithm enhances the dark zones while preserving the contrast within bright regions as much as possible.
The GEM algorithm easily degenerates to HEQ. The GEM result for the tire is slightly worse than that of HEQ, and the other four results are similar to those of HEQ. GEM enhances image contrast much as HEQ does, and it is mainly used to correct the tone of images (Xu et al., 2014).
The GDS-based method improves an image’s quality by remapping the image so that the distribution of the image’s gradients matches a specified distribution (Gong & Sbalzarini, 2014). In practice, it yields an inconspicuous enhancement effect, as seen in Figures 6g, 7g, 8g, 9g, and 10g.
The FLM image enhancement algorithm enhances the contrast globally because of the Mach band effect. Moreover, the Mach band effect makes the edges clear, for example, the coat in Figure 6h, the small cannon in Figure 8h, and the red flower in Figure 9h. FLM makes it easy to see low-intensity pixels that were invisible in the input, such as the coat of the cameraman (see Figure 6a), the dark regions of the flower (see Figure 9a), and the dark regions of the lunaria (see Figure 10a). Because of the limited dynamic range of grayscale, the global intensity increases when invisible regions become visible. When the global intensity increases, the details are still clear in the FLM results, but the images are slightly overexposed. FLM enhances the contrast at low intensities because of the logarithmic relationship between T and S, and pixels with high intensity keep their contrast because of the parameter f. This visual behavior is seen in all FLM results. Because information preservation is as important as enhancing the contrast, the processed results of FLM are consistent with HVS.
The quantitative evaluation metrics are summarized in Table 2. The grayscale intensity or the value channel of the HSV model is used to evaluate the performance. The objective quantitative evaluation is consistent with the subjective visual effect of the enhanced images.
Table 2: Quantitative evaluation of the enhanced images. Columns: Image, HEQ, DDM, FSM, SCM, GEM, GDS, FLM.
Note: The best results are highlighted in bold.
5.4 Further Experiments
Further experiments are conducted to demonstrate the effectiveness of FLM with different databases: Berkeley segmentation database (BSD) (Arbelaez, Maire, Fowlkes, & Malik, 2011), digitally retouched image quality database (DRIQ) (Vu, Phan, Banga, & Chandler, 2012), LIVE image quality database (Wang, Bovik, Sheikh, & Simoncelli, 2004; Sheikh, Sabir, & Bovik, 2006), and real blur image database (RBID) (Laboratório de Processamento de Sinais). We use the 500 images in BSD, the 26 reference images in DRIQ, the 29 reference images in LIVE, and the 590 images in RBID, for a total of 1145 images to evaluate the FLM method against other methods.
We calculate the average performance of different methods in each database and the best performance rate of FLM for each database. The results are in Tables 3 and 4. As shown in Table 3, FLM outperforms the other methods in terms of contrast, spatial frequency, and gradient. Table 4 indicates that FLM enhances most images in the four databases.
Table 3: Average performance of each method on each database. Columns: Database, HEQ, DDM, FSM, SCM, GEM, GDS, FLM.
Note: The best results are highlighted in bold.
Table 4: Best performance rate of FLM on each database. Columns: Database, Number, Contrast, Spatial Frequency, Gradient.
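For readers who want to compute comparable measures, the Python sketch below implements two of the quantities named in this section, spatial frequency and mean gradient, using definitions that are widely used in the image fusion and enhancement literature. The exact definitions used in this letter are not reproduced here, so these formulas and the function names are assumptions and the values they produce need not match the tables.

```python
import math


def spatial_frequency(image):
    """Spatial frequency: sqrt(RF^2 + CF^2) from row and column differences."""
    rows, cols = len(image), len(image[0])
    row_sq = sum((image[i][j] - image[i][j - 1]) ** 2
                 for i in range(rows) for j in range(1, cols))
    col_sq = sum((image[i][j] - image[i - 1][j]) ** 2
                 for i in range(1, rows) for j in range(cols))
    n = rows * cols
    return math.sqrt(row_sq / n + col_sq / n)


def mean_gradient(image):
    """Average of sqrt((dx^2 + dy^2) / 2) over forward differences."""
    rows, cols = len(image), len(image[0])
    total = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            dx = image[i][j + 1] - image[i][j]
            dy = image[i + 1][j] - image[i][j]
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((rows - 1) * (cols - 1))


if __name__ == "__main__":
    low = [[(i + j) % 2 for j in range(4)] for i in range(4)]      # faint checkerboard
    high = [[10 * v for v in row] for row in low]                  # same pattern, 10x contrast
    print(spatial_frequency(low), spatial_frequency(high))
    print(mean_gradient(low), mean_gradient(high))
```

Both measures are zero for a flat image and grow with local intensity variation, which is why larger values are usually read as stronger detail.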
6 Conclusion
We propose FLM, a model inspired by gamma-band oscillations. We use the single-pass working form of FLM to obtain the time matrix, and the single pass makes it easy to understand the time structure of the network. FLM has two types of waves: feeding and linking. We study the FLM time matrix and propose an effective method for image enhancement. The comparisons with HEQ, DDM, FSM, SCM, GEM, and GDS illustrate the validity of the proposed algorithm. The FLM-based image enhancement method is a general technique that can be applied to low-contrast images with satisfactory results. The processing mechanisms of the algorithm are consistent with HVS.
This work was supported by the National Science Foundation of China under grant no. 61201422 and the Specialized Research Fund for the Doctoral Program of Higher Education under grant 20120211120013. We are extremely grateful to Yide Ma, Hongjuan Zhang, and Fei Teng for giving us many useful suggestions. We thank Dani Lischinski and Norman Koren for letting us use their images.