Abstract
Beamforming on the icosahedral loudspeaker (IKO), a compact, spherical loudspeaker array, was recently established and investigated as an instrument to produce auditory sculptures (i.e., 3-D sonic imagery) in electroacoustic music. Sound beams in the horizontal plane most effectively and expressively produce auditory objects via lateral reflections on sufficiently close walls and baffles. Can there be 3-D-printable arrays at drastically reduced cost and transducer count, but with similarly strong directivity in the horizontal plane? To find out, we adopt mixed-order Ambisonics schemes to control fewer, and predominantly horizontal, beam patterns, and we propose the 3|9|3 array as a suitable design, with beamforming crossing over to Ambisonics panning at high frequencies. Analytic models and measurements on hardware prototypes permit a comparison between the new design and the IKO regarding beamforming capacity. Moreover, we evaluate our 15-channel 3|9|3 prototype in listening experiments to find out whether the sculptural qualities and auditory object trajectories it produces are comparable to those of the 20-channel IKO.
Early work on compact, spherical loudspeaker arrays with controllable directivity was described by Warusfel, Derogis, and Causse (1997) and by Pollow and Behler (2009). Platonic solids (regular convex polyhedra, such as dodecahedra or icosahedra) offer practical housings because of their symmetries and their small number of faces, each of which can contain a loudspeaker pointing outward in a unique direction. Conventional spherical beamforming on the 12 transducers of a dodecahedron uses spherical harmonics up to the second order, while on the 20 transducers of the icosahedron, it is limited to third order. To overcome the limitation, array-specific acoustic radiation modes have been proposed by Pasqual et al. (2010), but those modes would require a frequency-dependent beam encoding. Alternatively, the number of transducers per surface has been increased beyond one, e.g., to six per each of the 20 icosahedral facets by Avizienis et al. (2006), which, however, is only practical with high-frequency tweeters because of their small size.
Recently, Zotter et al. (2017) presented the icosahedral loudspeaker (IKO) as an instrument for electroacoustic music in an article in this journal that outlines the theoretical principles of spherical beamforming and exemplary practical tools required for its use (ambix and mcfx VST plugins). Wendt et al. (2017b) and Sharma, Frank, and Zotter (2019) investigated auditory sculptures and their attributes that emerge for exemplary static and time-varying beam compositions, thereby providing a descriptive framework for the artistic practice. In these beam compositions, sound is projected onto walls and baffles to produce auditory objects via acoustic reflections, primarily by means of horizontal beams, which are most effective. This article investigates an alternative, 3-D-printable, compact spherical loudspeaker array design customized for producing horizontal beams.
The article begins with a presentation of the proposed mixed-order schemes to increase the horizontal resolution of, for example, dodecahedral arrays from second to third order and icosahedral arrays from third to fourth order. Its main focus is on new three-ring layouts and their control scheme, which effectively reduce the number of transducers. The Array Simulation section numerically simulates the mixed-order layouts and compares their beamforming capacity based on 2-D and 3-D metrics of effective beamforming order. Modal beamforming on common-enclosure loudspeaker arrays requires decoupling of the transducer movements and radial filtering, which is not productive at high frequencies. The Control Filter Design section introduces a measurement-based, low-latency, two-band process with regularization at low frequencies to minimize filter lengths and All-Round Ambisonics Decoding (AllRAD) panning at high frequencies for minimal grating lobes. The Directivity Measurements section verifies the gain in beamforming capacity of the new processing scheme and the 3|9|3 loudspeaker, as depicted in Figure 1, based on openly accessible measurement data. The final Listening Experiments section assesses auditory sculpture attributes and auditory trajectories obtained with the 3|9|3 loudspeaker, comparing them to those of the IKO.
Normalized Mixed-Order Directivity Patterns
Mixed-Order Transducer Layouts
The mixed-order schemes in Figure 3 and the associated spherical harmonic subsets can be controlled using either Platonic layouts or the new three-ring layouts consisting of an upper, a horizontal, and a lower ring. The nomenclature states the number of transducers in the upper, horizontal, and lower ring; for example, the 3|9|3 layout has nine transducers in the horizontal ring and three transducers in each of the two other rings. The Platonic arrays can also be seen as three-ring layouts, with the middle ring being a zigzag ring of loudspeakers oriented at positive and negative elevation angles in alternation. That is, the dodecahedron can be seen as a 3|6|3 layout and the icosahedron as a 5|10|5 layout, which yields extended mixed-order control schemes for those Platonic arrays. The coordinates of the new three-ring layouts are given in Table 1.
Ring | 5\|10\|5 | 4\|8\|4 | 3\|9\|3 | 3\|7\|3 |
---|---|---|---|---|
Horizontal | 0:36:324 | 0:45:315 | 20:40:340 | 0:51.4:308.6 |
Upper | 18:72:306 | 0:90:270 | 0:120:240 | 20:120:260 |
Lower | 54:72:342 | 45:90:315 | 60:120:300 | 80:120:320 |
Coordinates are denoted as [start:step:stop] in degrees of azimuth for the horizontal, upper, and lower ring of each layout. Zenith coordinates are for horizontal, upper, and lower ring respectively. The 1|7|1 layout (not shown) is an exception: its nonhorizontal positions are the poles.
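The [start:step:stop] notation of Table 1 can be expanded into explicit azimuth angles. The following minimal sketch does so for the 3|9|3 layout; the helper name `expand` and the dictionary layout are illustrative choices and are not taken from the accompanying repository.

```python
# Azimuth ranges of the 3|9|3 layout from Table 1, as (start, step, stop) in degrees.
# The order of the rings follows the table note: horizontal, upper, lower.
LAYOUT_3_9_3 = {
    "horizontal": (20, 40, 340),
    "upper": (0, 120, 240),
    "lower": (60, 120, 300),
}


def expand(start, step, stop):
    """Return the azimuths start, start + step, ... up to and including stop."""
    count = round((stop - start) / step) + 1
    return [start + i * step for i in range(count)]


# Explicit azimuth list per ring, and the resulting transducer count.
rings = {name: expand(*rng) for name, rng in LAYOUT_3_9_3.items()}
total = sum(len(azimuths) for azimuths in rings.values())

for name, azimuths in rings.items():
    print(name, azimuths)
print("transducers:", total)
```

The expansion yields nine horizontal and two times three nonhorizontal directions, which matches the 15 transducers listed for the 3|9|3 layout.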
Table 2 shows that all matrices (for the subsets, see Figure 3) are sufficiently well-conditioned, as their condition number is finite and close to unity.
Layout | Transducers | Condition number |
---|---|---|
Dodecahedron | 12 | 1.6 |
Icosahedron | 20 | 2.4 |
1\|7\|1 | 9 | 1.6 |
3\|7\|3 | 13 | 2.0 |
3\|9\|3 | 15 | 1.9 |
4\|8\|4 | 16 | 1.7 |
5\|10\|5 | 20 | 1.8 |
Number of transducers and condition number of the matrix used for each loudspeaker configuration.
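Why a ring of nine transducers can support fourth-order horizontal control may be illustrated on a reduced case. The sketch below is not the mixed-order spherical harmonic matrix evaluated in Table 2; as a simplifying assumption, it considers only the horizontal ring of the 3|9|3 layout and real circular harmonics up to order four (nine functions). The function name `circular_harmonics` and the normalization are illustrative choices. If the Gram matrix of the sampled harmonics is a scaled identity, all singular values are equal and the condition number of this reduced matrix is exactly one.

```python
import math


def circular_harmonics(azimuth_deg, order):
    """Real circular harmonics [1, sqrt(2) cos(m phi), sqrt(2) sin(m phi)], m = 1..order."""
    phi = math.radians(azimuth_deg)
    row = [1.0]
    for m in range(1, order + 1):
        row.append(math.sqrt(2.0) * math.cos(m * phi))
        row.append(math.sqrt(2.0) * math.sin(m * phi))
    return row


# Horizontal ring of the 3|9|3 layout (Table 1): 20:40:340 degrees.
azimuths = [20 + 40 * i for i in range(9)]
Y = [circular_harmonics(a, 4) for a in azimuths]  # 9 directions x 9 harmonics

# Gram matrix G = Y^T Y; for a perfectly conditioned Y it equals 9 times the identity.
n = len(Y[0])
G = [[sum(Y[k][i] * Y[k][j] for k in range(len(Y))) for j in range(n)]
     for i in range(n)]

max_deviation = max(abs(G[i][j] - (9.0 if i == j else 0.0))
                    for i in range(n) for j in range(n))
print("largest deviation from 9*I:", max_deviation)
```

The values between 1.6 and 2.4 in Table 2 refer to the full matrices with all three rings and are therefore not reproduced by this reduced example.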
Array Simulation
In the following, we numerically simulate the mixed-order layouts by means of the spherical cap model and compare their beamforming capacity based on 2-D and 3-D metrics of effective beamforming order.
Spherical Cap Model for Sound Radiation
Simulation Results
An OpenSCAD model of the 3|9|3 array was created to 3-D-print the necessary spherical housing (open access at https://git.iem.at/s1330219/cmj_mocsla.git). The housing has been printed with a radius of 0.12 m and is fitted with fifteen 2.5-in. wide-band transducers from SB Acoustics. The odd number of transducers and their low-frequency roll-off at about 100 Hz suggest adding a subwoofer, yielding a 15.1-channel layout (beamformer plus subwoofer) that proved effective in listening sessions.
Control Filter Design
This section discusses the design of control filters and their practical implementation as multiple-input, multiple-output (MIMO) finite impulse response (FIR) filter matrices.
Overview
Linkwitz-Riley Band Splitting
A Linkwitz-Riley crossover is composed of two cascaded low-pass Butterworth filters for the low band and two cascaded high-pass Butterworth filters for the high band (cf. D'Appolito 1987). The correspondingly squared Butterworth high- and low-pass frequency responses exhibit −6 dB at the crossover frequency, and their phase is either strictly opposite or strictly matching at every frequency. Summing the bands with the suitable sign ensures a flat response when gains are equal, or a well-behaved interference when gains differ. We utilized cascaded third-order Butterworth filters for a sixth-order crossover between the two bands.
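The properties stated above can be checked numerically. The following minimal sketch evaluates the continuous-time prototype of a sixth-order Linkwitz-Riley crossover, built from squared third-order Butterworth responses; it is not the digital filter bank of the actual implementation, and the crossover frequency `FC` is a hypothetical value chosen only for illustration. For the odd Butterworth order used here, the two bands are in opposite phase, so the suitable sign for summation is a minus.

```python
import math


def butter3(f, fc):
    """Third-order Butterworth low- and high-pass responses at frequency f (analog prototype)."""
    s = 1j * f / fc                     # normalized complex frequency
    den = (s + 1) * (s * s + s + 1)     # third-order Butterworth polynomial
    return 1 / den, s ** 3 / den


def lr6(f, fc):
    """Sixth-order Linkwitz-Riley bands: each Butterworth response cascaded with itself."""
    lp, hp = butter3(f, fc)
    return lp * lp, hp * hp


FC = 2000.0  # hypothetical crossover frequency in Hz, not taken from the article

# Magnitude of the sign-corrected band sum over ten octaves around FC.
freqs = [FC * 2 ** (k / 4) for k in range(-20, 21)]
flat = [abs(lo - hi) for lo, hi in (lr6(f, FC) for f in freqs)]

# Band levels at the crossover frequency.
lo_fc, hi_fc = lr6(FC, FC)
level_db = 20 * math.log10(abs(lo_fc))

print("max deviation from unity:", max(abs(x - 1) for x in flat))
print("low-band level at FC in dB:", round(level_db, 2))
print("magnitude of the plain sum at FC:", abs(lo_fc + hi_fc))
```

Summing the two bands without the sign inversion would cancel completely at the crossover frequency, which is why the sign matters for odd Butterworth orders.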
MIMO Crosstalk Canceler
The reduced stiffness of the air enclosed when mounting the loudspeakers in a common enclosure can support beamforming by reducing the acoustic load on the loudspeakers, in particular at low frequencies. But it also introduces acoustic crosstalk that needs to be dealt with for beamforming (Zotter et al. 2017). If one transducer is moved by a signal, the others will start to move passively, but beamforming requires independent control of the transducers.
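The principle of restoring independent control can be sketched for a single frequency and only two transducers. The transfer matrix `T` below contains hypothetical values, and plain matrix inversion is an assumption made for illustration; the measurement-based design with regularization used for the actual arrays is not reproduced here.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical transfer matrix at one frequency: T[i][j] is the velocity of
# transducer i per unit voltage applied to transducer j. The off-diagonal
# entries model the passive movement caused by the shared enclosure.
T = [[1.0 + 0.0j, -0.3 + 0.1j],
     [-0.3 + 0.1j, 1.0 + 0.0j]]

# Canceler: pre-filtering the voltages with C makes T @ C the identity,
# so each input drives exactly one transducer.
C = inverse_2x2(T)
decoupled = matmul_2x2(T, C)

for row in decoupled:
    print([complex(round(v.real, 12), round(v.imag, 12)) for v in row])
```

In a complete design, such an inversion would have to be carried out for every frequency bin and for all transducers of the array.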
Low-Frequency Beamforming below Aliasing
The cutoff frequencies of the filter bank were chosen to ensure a limited loudspeaker excursion across the frequency bands, and their array-specific values are found in Table 3.
Array | Cutoff 1 | Cutoff 2 | Cutoff 3 | Cutoff 4 | Cutoff 5 |
---|---|---|---|---|---|
3\|9\|3 | 82 | 146 | 250 | 318 | 450 |
ico-o4 | 38 | 77 | 141 | 209 | 253 |
Frequencies in Hz.
High-Frequency AllRAD Panning
Band Summation and On-Axis Equalization
The time-domain response of the MIMO FIR filter matrix is obtained by equidistant sampling in the frequency domain with 16,384 points, followed by an inverse FFT to the time domain. Windowing the impulse responses to 1,024 samples is possible due to the low-latency designs, enabling real-time and live-performance applications. The real-time FIR matrix convolution can use the jconvolver or mcfx_convolver plug-ins, for instance.
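The chain of frequency sampling, inverse transform, and windowing can be sketched for a single entry of the filter matrix. The sizes below are scaled down (256 frequency points and 32 taps instead of 16,384 and 1,024) so that a plain inverse DFT from the standard library suffices. The one-pole low-pass response and the half-cosine fade-out are placeholders chosen for illustration and do not represent the actual control filters.

```python
import cmath
import math


def idft_real(spectrum):
    """Inverse DFT of a conjugate-symmetric spectrum, returning the real impulse response."""
    n_fft = len(spectrum)
    return [
        sum(x * cmath.exp(2j * math.pi * k * n / n_fft)
            for k, x in enumerate(spectrum)).real / n_fft
        for n in range(n_fft)
    ]


def fade_out_window(length, fade):
    """Ones, followed by a half-cosine fade-out over the last `fade` samples."""
    w = [1.0] * length
    for i in range(fade):
        w[length - fade + i] = 0.5 * (1 + math.cos(math.pi * (i + 1) / (fade + 1)))
    return w


N_FFT, N_TAPS = 256, 32   # scaled-down stand-ins for 16,384 points and 1,024 samples
A = 0.5                   # pole of the placeholder low-pass response

# Equidistant sampling of the placeholder response on the full unit circle.
spectrum = [(1 - A) / (1 - A * cmath.exp(-2j * math.pi * k / N_FFT))
            for k in range(N_FFT)]

h_long = idft_real(spectrum)                       # long impulse response
window = fade_out_window(N_TAPS, 8)
h_short = [h * w for h, w in zip(h_long[:N_TAPS], window)]

discarded_energy = sum(h * h for h in h_long[N_TAPS:])
print("taps:", len(h_short), "discarded energy:", discarded_energy)
```

Because the energy of this response is concentrated at the beginning, truncation discards a negligible part of it; the same reasoning motivates the short window for low-latency designs.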
Directivity Measurements
As a verification method, acoustic MIMO measurements with a surrounding semicircular microphone array were taken, similar to the measurements taken by Schultz et al. (2018). By placing the loudspeaker array on a remotely controllable turntable, a sampling grid with a resolution of is achieved.
Horizontal and Vertical Cross Sections of Beam Patterns
Effective Orders of Directivity across Frequency
The effect of the various subsystems in the control filter design is analyzed in Figure 11 for the 3|9|3 array. A frequency-independent spherical harmonics decoder alone hardly accomplishes beamforming of a first-order directivity below 1.6 kHz (light gray curve). Applying the limited radial filters boosts the effective beamforming order (gray curve, “radfilt”) most distinctively and reaches horizontal orders of three and global 3-D orders of two. Finally, the directivity increases by up to half an order below 1.6 kHz by applying the crosstalk canceler, and above 2.9 kHz the fifth-order AllRAD Ambisonics panning provides a boost by up to one order in the 3-D map of the highest frequencies (dark curve “allrad_ctc_radfilt”). As before, the curves do not quite reach the theoretical predictions (dashed curve, “model”) in the modal beamforming range.
Figure 12 analyzes variation induced by beamforming direction. The IKO array—built with high-quality and, hence, more costly parts—maintains a similarly effective beamforming order for different beamforming directions, whereas the 3-D-printed 3|9|3 prototype varies with a peak in directivity for beams in the direction of one of its loudspeakers ( azimuth). The measurement data is available in the Spatially Oriented Format for Acoustics, AES69-2015, and can be downloaded from https://phaidra.kug.ac.at/o:91326 and https://phaidra.kug.ac.at/o:67609.
Listening Experiments
Above, the 3|9|3 prototype was shown to have beamforming performance similar to the more powerful and larger 20-channel IKO. Naturally, the frequency range for beamforming is higher because of its smaller size. Although the vertical beamforming capacity is weaker compared with the IKO, the fourth-order horizontal beamforming design effectively exceeds the conventional third-order beamforming of the IKO, as used in previous tests and concerts. Because the analysis above is limited to technical beamforming measurements and metrics, this section addresses in greater detail the question of whether the auditory impressions achievable with the 3|9|3 are comparable to those of the IKO. We adopt some of the perceptual analysis methods established in previous studies on the IKO to clarify the potential of the 3|9|3 prototype to be used as an affordable, personal electroacoustic musical instrument.
Work by Wendt et al. (2017a) and by Laitinen et al. (2015) discusses the option of pointing beams towards or away from the listener as means of positioning auditory objects in terms of distance. Moreover, Wendt et al. (2017b) and Zotter et al. (2017) show that time-varying beamforming is capable of moving auditory objects through the interior of the playback environment. Sharma, Frank, and Zotter (2019) establish and evaluate three auditory sculptural attributes produced by a small set of signals laid out in static and time-varying beam compositions.
Listening Experiment 1: Auditory Sculpture Attributes
The listening experiment was based on a comparative characterization of miniature electroacoustic compositions using a limited number of well-described sounds and their beamforming trajectories, as defined by Sharma, Frank, and Zotter (2019, Experiment 3). The goal of the comparative rating is to evaluate the perceptual discernibility of three sculptural qualities, namely directionality, contour, and plasticity:
Directionality describes the potential of auditory objects in the auditory sculpture to dynamically guide the listener's attention through a room;
Contour describes the degree of dependency of the auditory sculpture's outline (silhouette) on the listening position, taken and imagined from temporal evolution; and
Plasticity describes the degree of depth grading of the spatially layered auditory objects of the auditory sculpture in the room.
Figure 13 compares the results obtained for the 3|9|3 prototype with the results obtained in the 2019 experiment for the IKO array (whose statistics used 29 data points per condition). The rating of the conditions in the sculptural quality space is quite similar, and the mostly contoured, unidirectional condition can be considered identical between both experiments. Condition , which used a horizontally circular beam trajectory of pink noise, was rated less directional for the 3|9|3 prototype than for the IKO. Informal reports by the listeners suggest that the contour of the auditory object is not compact and smooth in space but rather jumps and occasionally exhibits two separate high- and low-frequency auditory objects. Our hypothesis is that the increase of the directivity and higher operational beamforming frequency range of the 3|9|3 prototype might isolate the wall reflections better, but this also causes an inconsistent auditory object trajectory, with low frequencies dispersed. Moreover, the horizontal loudspeakers of the IKO aim, in alternation, at the elevations and so might never excite the wall reflections as targeted at high frequencies. A similar consideration could be used to argue that the conditions and have been rated less directional and as having a higher plasticity.
Listening Experiment 2: Auditory Object Trajectories
The second listening experiment follows the test design and conditions used by Wendt et al. (2017b) and Zotter et al. (2017) with the IKO, but here it is conducted with the 3|9|3 loudspeaker, set up at the same position and in the same environment. Six conditions were used that represented three different trajectories, each presented with two different sound stimuli (continuous pink noise and a grain sequence). The three investigated trajectories are:
a beam towards the listener, fading the Ambisonics order from five to zero (omnidirectional) and back (using the size knob in the VST plugin ambix_encoder),
a circular rotation starting left and moving its horizontal beam clockwise, and
a cross-fade from a sound beam toward the left wall to one pointing to the right wall.
As in the prior experiments with the IKO, the experimental task used a GUI implemented in Pure Data to position ten markers, each of which represented the auditory event location at one half-second step within the looped playback (each condition was five seconds long).
There were 13 participants, and it took them on average 24 minutes to complete the task. Each participant was tested with the six stimuli in a random permutation, each test performed twice to permit checking for consistency of ratings. Data from the first, ninth, and tenth participants were discarded because their standard deviation for repeated ratings exceeded 2 m.
Although auditory front-to-back trajectories of the grain signals (Figure 14b) yield a slightly larger spatial span for the 3|9|3 array (ellipses in light gray) than for the IKO (dark gray dots), we see an opposite tendency for the front-to-back movement of the noise signal in Figure 14a, in which the IKO condition spans a larger range. In any case, the monotonic mapping is qualitatively matching. The full rotation of noise in Figure 14c shows that beamforming on the 3|9|3 array appears to be superior, or at least equally capable, in projecting stationary broadband sound to lateral walls. The 3|9|3 auditory trajectories in Figures 14c and 14d cover a greater area and, although they are similar to those of the IKO, their details differ and the trajectory in the latter is offset. A comparable, if not superior, control seems to be confirmed by the dedicated left-to-right movement of noise in Figure 14e. In contrast, the transient grain stimuli in Figures 14d and 14f are not fully lateralized to the right wall as with the IKO. Perhaps as in Experiment 1, the difference in the details can be explained by the loudspeaker directions of the IKO, which imply a deflection of high-frequency content from the horizontal plane.
Despite noticeable differences in the precise shapes of the ratings, we assume that the results match sufficiently well for practical applications.
Conclusion
We have presented a mixed-order control theory that extends beamforming technology with compact spherical loudspeaker arrays. To evaluate the design goal of an improved horizontal beam control, we used a radiation model and introduced the effective horizontal (2-D) and global (3-D) order measures, first to prove the concept on Platonic-solid loudspeaker arrays. Mixed-order control increases the effective horizontal beamforming order from second to third order for the dodecahedral loudspeaker array, and from third to fourth order for the icosahedral array, with negligible impact on the effective 3-D order.
New mixed-order layouts were introduced that are composed of three loudspeaker rings. The dedicated mixed-order layouts save transducers while achieving equal or higher beam orders in the horizontal plane. They are especially suited for the proposed high-frequency AllRAD panning as many on-axis loudspeaker directions are aligned with the horizontal plane to support horizontal amplitude panning directions for a better directivity focus of high frequencies.
Based on directivity measurements of the IKO and the proposed 3-D-printable and inexpensive prototype of the 3|9|3 loudspeaker, we could prove the practical feasibility and effectiveness of the proposed control-filter design based on beamforming with radial filters and crosstalk cancellation at low frequencies, and AllRAD panning at high frequencies.
Two listening experiments that were introduced and tested with the IKO loudspeaker in previous publications were repeated with the new 3|9|3 prototype. They confirm the practical applicability of the new loudspeaker as it achieves results in terms of auditory-sculpture qualities and auditory-object trajectories that are similar to those of the IKO, which is more powerful but more expensive. This makes the 3|9|3 loudspeaker an alternative, potentially a personal, electroacoustic musical instrument.
We point readers to a repository at https://git.iem.at/s1330219/cmj_mocsla.git, which contains open-source code for filter design and directivity plots as well as CAD files for 3-D-printing. We also refer the reader to the open measurement data at https://phaidra.kug.ac.at/o:91326 and https://phaidra.kug.ac.at/o:67609.
Acknowledgments
We thank Gerriet K. Sharma for setting up the conditions of the listening experiments with the 3|9|3 loudspeaker, Sharma and Valerian Drack for conducting the listening experiments, the voluntary participants of these experiments, and the Austrian Knowledge Transfer Centre South (WTZ-Süd, PI at KUG/IEM: Robert Höldrich) for enabling a substantial part of our work.
This article is a revised and extended version of the paper “Design and Control of Mixed-Order Spherical Loudspeaker Arrays” (Riedel, Zotter, and Höldrich 2019), presented at the International Computer Music Conference.