Abstract
The fast-developing intelligent infrastructure landscape catalyzes transformative new relationships among humans, technology, and the environment and requires new socio-technical configurations of information practice and knowledge work. With a focus on data as the source of intelligence, this paper explores the shifting scenarios and indicative features of data science solutions for intelligent system applications and identifies the evolving knowledge spaces and integrative learning practices in the “smart” landscape. It projects and discusses the democratization of data science platforms, the distribution of data intelligence on the edge, and the transition from vertical to horizontal data solutions in solving intelligent system problems. Through mapping the changing data research landscape, this work further reveals essential new roles of knowledge architects and social engineers in enabling dynamic data linking, interaction, and exploration for transdisciplinary data convergence.
1. INTRODUCTION
The landscape of intelligent infrastructure is changing at unprecedented speed, enabled by artificial intelligence (AI) and robotics, deeper understanding of societal and environmental changes, advances in learning sciences, and new conceptions of work and workplaces, as well as innovations in pervasive, intelligent, and autonomous systems [1]. Along with these changes, there are immediate challenges such as public health emergency response and risk management, demand for skills not met by current educational pathways, algorithmic biases and security threats, as well as undesirable impact on the built environment and natural ecosystem. The co-evolution of cyber-physical-and-human environments presents tremendous opportunities for higher education to reform academic enterprise and transform future discourse.
In this evolving landscape, we need to develop convergent research that systematically addresses the technological, environmental, biomedical, and societal dimensions of future work. This requires the integration of domain knowledge across biotechnology, engineering, environmental sciences, learning sciences, data and information sciences, as well as social, behavioral, political, and economic sciences. Concomitantly, we need to grow integrative learning to instill a convergent perspective and system view that is essential to ensure that intelligent work technologies will strengthen the social fabric, improve public health, and sustain environmental prosperity. The ongoing global health emergency and COVID-19 pandemic will only accelerate this academic enterprise transformation.
Uniquely anchored in the evolving landscape, this paper proposes a coordinated set of community engagement activities across the data spectrum to elicit integrated solutions. It also presents an educational pathway designed to amplify the intrinsically transdisciplinary nature of data science work in building smart resilient communities. With a focus on data as the source of intelligence, this paper explores the following questions:
What is the evolving nature of integrative learning and convergent research in the emerging knowledge enterprise?
What are the shifting scenarios and indicative features of data science solutions for intelligent system applications?
What are the emerging problem spaces and evolving knowledge practices in the “smart” landscape?
To these ends, this work contributes conceptual understanding and empirical knowledge to infrastructure planning and community building for a transdisciplinary discourse. By emphasizing the importance of re-designing and re-configuring data organization and information modeling for convergent solutions in the “smart” framework, this paper identifies novel problem spaces and evolving knowledge practices in the new landscape.
2. AN INTELLIGENT INFRASTRUCTURE VIEW
The realization of smart and connected communities requires advanced technological developments such as the Internet of Things (IoT), network sensing, mobile and pervasive computing, as well as social and cognitive computing. These advances can be applied to enable smart and energy-efficient homes and buildings, support automated highways and skyways, and provide in-body networks for monitoring, analyzing, and treating medical conditions, among others [2]. Such applications can help solve a wide variety of societal challenges related to healthcare, energy, transportation, climate, finance, and disaster management. For instance, Deep Reinforcement Learning (DRL) can be applied to control traffic signal systems in real time and enhance transportation efficiency in urban environments. This is possible because DRL combines deep learning and reinforcement learning to sense and learn continuously, capturing complex patterns through trial and error [3].
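To make the trial-and-error idea concrete, the following is a minimal sketch of tabular Q-learning on a toy two-approach intersection. It is a deliberately simplified stand-in for DRL, not the method of [3]: there is no neural network, the environment is deterministic with no new arrivals, and all names and constants (MAX_QUEUE, SERVED, the reward definition, the training parameters) are assumptions introduced for illustration.

```python
import random
from itertools import product

MAX_QUEUE = 5      # hypothetical cap on cars waiting per approach
SERVED = 2         # cars discharged per step on the green approach
ACTIONS = (0, 1)   # 0 = green for north-south, 1 = green for east-west


def step(state, action):
    """Toy dynamics: the green approach discharges cars; no new arrivals."""
    ns, ew = state
    if action == 0:
        ns = max(ns - SERVED, 0)
    else:
        ew = max(ew - SERVED, 0)
    reward = -(ns + ew)  # penalize the number of cars still waiting
    return (ns, ew), reward


def greedy(q, state):
    """Pick the action with the highest learned value (ties go to action 0)."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(passes=20, horizon=10, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning, starting one episode from every backlog."""
    rng = random.Random(seed)
    states = list(product(range(MAX_QUEUE + 1), repeat=2))
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    for _ in range(passes):
        for start in states:
            state = start
            for _ in range(horizon):
                if rng.random() < epsilon:
                    action = rng.choice(ACTIONS)   # explore
                else:
                    action = greedy(q, state)      # exploit
                nxt, reward = step(state, action)
                best_next = max(q[(nxt, a)] for a in ACTIONS)
                # A learning rate of 1.0 is safe only because this toy
                # environment is deterministic.
                q[(state, action)] = reward + gamma * best_next
                state = nxt
    return q


q_table = train()
policy_ns_heavy = greedy(q_table, (5, 0))  # backlog on north-south only
policy_ew_heavy = greedy(q_table, (0, 5))  # backlog on east-west only
```

In this toy setting, the learned policy should give the green light to whichever approach holds the backlog. A realistic controller would replace the lookup table with a deep network and the toy dynamics with sensed, stochastic traffic.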
An intelligent infrastructure features increasingly autonomous systems “that can sense and learn the human's cognitive and physical states while having the ability to sense, learn, and adapt in their environments” to deal with changing conditions [4]. In such increasingly complex adaptive systems, intelligent machines “can shape human behaviors and societal outcomes in both intended and unintended ways,” and conversely, human agents “also create, inform, and mold the behaviors of intelligent machines” [5]. The mechanisms at play are cyber-physical-and-human systems and “how they interact, combine and change” [6]. With the growing ubiquity and complexity of hybrid human-machine interactions in natural settings, the study of human-machine ecologies has emerged to understand “the emergent properties that arise from many humans and machines coexisting and collaborating together” [7]. “At the core of all such systems and applications, critical issues are security, privacy, reliability, resiliency, and robustness” [8].
Today we widely acknowledge the high stakes and share a sense of urgency in transforming human-environment relationships. In shaping the future, we must deliberately build resilient infrastructure, promote sustainable industrialization, and foster responsible innovation [9]. As such, global scientific knowledge production and intervention are urgently needed, particularly in visions of health crisis prevention, climate change remediation, green energy transition, and bioeconomic reconfiguration for an ailing and over-burdened planet. New forms of knowledge and expertise are called upon and challenged to determine “how technological innovation and economic reform should together re-order human-environment relationships in the name of security, social-political-and-economic stability, and human wellbeing” [10].
Machine intelligence has the potential to augment human performance and alter our collective behaviors in fundamental ways – from political accountability to civic participation and from group-wide coordination to society-wide effects. These will require us to explore socio-technical reconfigurations of information practice and knowledge work, design for system automation and augmented performance, and position for community impact and environmental safety. From agri-tech implementation and crop optimization to clean transportation and renewable energy, the new infrastructure will transform our lives through artful integration and sensible solutions.
With advancing infrastructure, the world will become “more connected, networked, and traceable” with data changing “from static, complete, and centralized to dynamic, incomplete, and distributed,” while knowledge discovery will increasingly focus on “methodologies for identifying valid, novel, and potentially useful and meaningful patterns from such data” [11]. In this fast-developing knowledge space, we need integrated data and informatics solutions with sound information modeling and data structuring to provide foundations for effective human-machine communication and human-algorithm collaboration.
3. EVOLVING RESEARCH AND EDUCATION IN THE NEW LANDSCAPE
“Intelligent infrastructure systems will interact with social systems, public policy, and the natural environment” and will leverage “the ever-increasing ubiquity of data and mobility” to support better decisions and create adaptable communities [12]. These areas are inextricably linked, and when considered together, will highlight new enrichment opportunities for students and stimulate novel engagement pathways for scholars.
3.1 Collaborative Data Engagement
The growing diffusion of intelligent infrastructure will accelerate data science applications in practical settings and industrial operations. Massive and heterogeneous data collected from different sensor networks and interconnected systems will need to be adequately combined, synthesized, and presented for effective use. Almost every enterprise – from construction monitoring and production improvement to urban computing and business competition – calls upon data science techniques and solutions to manage and elaborate data, enhance decisions, and add intelligence.
Consequently, a wide range of modern infrastructures and community contexts will benefit immensely from new advances covering all facets of the data discovery process. They present a rich confluence of opportunities around the delivery of real-life impact. These opportunities recast the pursuit of knowledge from disconnected workflows to deep synchronicity of skills. The converging path relies on both intentional and serendipitous intellectual encounters as well as purposeful synergistic efforts across data science algorithm development and infrastructure-specific application. Its success, though, is contingent upon deliberate funding mechanisms at the edge of cyber-human versus cyber-physical systems and is situated in specific contexts and real scenarios.
3.2 Fusion of Data, Knowledge, and Expertise toward Addressing Societal Challenges
In knowledge work, “coding, programming and algorithmically interpreting data is informed by the concepts, ideas, values and affordances of the technologies and people that engage with data” [13]. To break silos of data analysis and interpretation when addressing complex infrastructure problems thus requires broadening of vision and fusion of knowledge. Particularly, significant societal questions as to how to manage complex living systems and maintain ecosystem resiliency in the face of changing conditions require effective governance and policy intervention. These measures should balance our values with respect to the human condition and environmental vitality.
Different modes of “fusion” could also spark alternative solutions and shape different outcomes. It is known that future prospects often drive our decision processes. Prospective imaginaries may help foster early-stage exploration and broaden idea generation for solving grand infrastructure challenges and data fusion problems. As such, we could challenge conventions and raise questions on whether alternative trajectories or contrasting mechanisms can be identified apart from mainstream discourses. We could discuss whether counter-narratives or competing storylines can be presented concerning smart infrastructure and community development.
With appropriate risk-taking, we shall re-imagine and re-formulate prospective scenarios and governance structures for networked technologies and data configurations. These will open up new possibilities and allow for different tracks of development. More so, we may probe what could happen if future scenarios are contested and digital promises become contradictory. By creatively and analytically constructing and contesting future prospects, we could challenge traditional roadmaps and invent new pathways.
3.3 An Integrative Learning Model
Intelligent infrastructure research revolves around advancing the human condition, enhancing social equity, and improving environmental sustainability. In such diverse areas of knowledge intersection, there is an increasing demand for a holistic educational approach to hardwire data science skills into all-around capability building and to infuse social responsibilities and environmental sensitivities into learning practice. This new learning mechanism is a telling shift from our traditional education system and a turning point away from our discipline-bound teaching mechanism. This shifting paradigm should be reflected in our teaching approaches and learning methods by fundamentally breaking down the walls and changing the educational treatment.
Real-world solutions to complex societal problems demand cross-walking and inter-linking of data competencies and quantitative skills with other disciplinary knowledge. By this measure, we should promote data fluency across the curriculum and teach students to incorporate statistical thinking and computational reasoning into their daily scholarship. Moreover, we should augment student learning by embedding various literacy topics such as strategies for finding data and solutions around how to clean, vet, and process data for analytical purposes [14]. These skills are especially needed today in facing the rising wave of misinformation, disinformation and mal-information. The learning takes place while students are solving actual problems through applied research, policy analyses, or engagement projects. They work with real data sets, actual scenarios, and case studies, so that everything that the students learn will be continuously re-contextualized by connecting to real-world complex problems.
In the wake of the current global health crisis caused by COVID-19, we have realized more than ever “the interconnected nature of modern societies” and the convoluted “interacting feedback loops” among our health, financial, ecological, political, and social systems. The pandemic response not only emphasizes biotechnology research and medical treatments but also exposes the need to understand “the societal context from which these inventions arise or into which they must be placed” [15]. Confronted with race, inequality, and public trust issues, our future workforce needs to grapple with information complexity within various contexts and be able to integrate social values, community interests, and user perspectives into technological and scientific advances. This calls for an integrative learning model to instill systems thinking and holistic understanding among students for them to gain a humanistic perspective, technological literacy, and transdisciplinary competency.
3.4 A Nested Teaching Scenario
To foster integrative learning around infrastructure problem solving, “faculty and students from all participating disciplines as well as experts from industries and non-profit partners” should “come together for hands-on explorations of real-world problems” [16]. In teaching, it is important to bring together multidisciplinary educators and domain experts whose substantive knowledge, theoretical insights, and methodological expertise can be assembled in meaningful ways around complex projects. This transdisciplinary discourse can take on different modes and fluid configurations. At its heart is the expansion of learning through re-socialization of knowledge engagement, exchange, and production.
As the key to building relationships, the shared presence and social bonds among participating educators, practitioners, and students from various domains and sectors can serve to disseminate tacit knowledge, exchange practical experiences, and build empirical insights collectively and intuitively. Human interaction is an essential need. Bringing different stakeholders together in the same class and forging social cues and personal ties among them help align their goals, formalize a common language, and foster mutual understanding and knowledge assimilation. With this physical assembly of teaching “squads,” all participants actively interweave their knowledge and perspectives with others, seek new understanding, and build meaningful connections along the way. Situated in the same classroom, they can achieve iterative inquiry dynamics, closely woven design, and harmonized knowledge building throughout the shared journey, and can course-correct as needed once an educational program is underway [17].
3.5 Adaptive Global Education Engagement
More broadly, in the advancing globalization of educational pathways, it will become increasingly important to expand learning across geo-economic-and-political boundaries and democratize data for broader access and experimentation. Open and equitable access for information use and knowledge discovery is much desired in today's higher education.
Underlying the above view is the vision of a synchronized data and workflow system as well as a diversity-aware technology environment to incubate educational connections and catalyze social relationships across geographical, economic, and cultural contexts [18]. More profoundly, it necessitates exploration into how forms of differentiation and locality can be leveraged to address complex social, economic, health, and environmental problems in global contexts. With many possibilities in academic programming, we can situate student learning and community engagement within a broader transnational context in efforts to build networked infrastructures and connected communities.
In promoting adaptive global education engagement, we can further inculcate generous thinking and a humanistic perspective among our students for the sake of sustainability, solidarity, and the common good. Likewise, we can instill cultural intelligence and global adaptability among young people in the interest of future wellbeing, environmental vitality, and global prosperity. These collective efforts will converge into a holistic education system.
4. FEATURES OF DATA SCIENCE SOLUTIONS FOR INTELLIGENT SYSTEM APPLICATIONS
With the digital transformation, it is important to understand the shifting scenarios and indicative features of data science solutions for intelligent system applications. New data research scenarios emerge in the “smart” ecosystem context and have inextricable linkage with process discovery and digital library solutions [6]. These are synthesized and displayed in Figure 1.
Figure 1. Data research scenarios and solutions in the smart ecosystem.
4.1 Transition from Vertical to Horizontal Data Solutions
In recent years, data scientists have raced to build powerful algorithms in nearly every vertical from law, medicine, and economy to transportation, agriculture, and energy. Now comes the era of intelligent infrastructure that is about integrated solutions where collections of algorithms start to collaborate on complex tasks and in real situations. There need to be horizontal solutions integrating vertical ones. To achieve that, we need to reconcile diverse domain perspectives, intersect different knowledge spaces, and build data models and semantic frameworks to bridge different “verticals” of the infrastructure framework, from health and energy to nature and transportation, and so on.
These overarching models and intricate frameworks should connect the dots across the continuum of data applications, from generic algorithm development and computational programming to production-ready specific applications and pervasive computing at the edge of community networks. By scoping and orchestrating an effective data network, we can weave together an intricate web of interactions that connects individual practices and supports convergent solutions. Such foundational work and knowledge engineering will further advance AI capabilities to transform Big Data into Smart Data with dynamic signals, flexible classifications, and optimized outputs.
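As one hedged illustration of what a horizontal data model could look like, the sketch below maps records from two verticals onto a shared observation schema through a small crosswalk table. Every field name, vertical label, unit, and sample record here is hypothetical and chosen only to show the bridging idea; a production semantic framework would draw on community ontologies and richer metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Shared (horizontal) schema that every vertical is mapped onto."""
    vertical: str
    site: str
    quantity: str
    value: float
    unit: str


# Hypothetical crosswalk: which source fields carry the site and the value,
# and which quantity and unit the vertical reports.
CROSSWALKS = {
    "energy": {
        "site_field": "meter_location",
        "value_field": "kwh",
        "quantity": "electricity_use",
        "unit": "kWh",
    },
    "transportation": {
        "site_field": "intersection",
        "value_field": "vehicles_per_hour",
        "quantity": "traffic_flow",
        "unit": "vehicles/h",
    },
}


def to_observation(vertical, record):
    """Translate a vertical-specific record into the shared schema."""
    rule = CROSSWALKS[vertical]
    return Observation(
        vertical=vertical,
        site=record[rule["site_field"]],
        quantity=rule["quantity"],
        value=float(record[rule["value_field"]]),
        unit=rule["unit"],
    )


def group_by_site(observations):
    """Link observations from different verticals that share a site."""
    grouped = {}
    for obs in observations:
        grouped.setdefault(obs.site, []).append(obs)
    return grouped


# Illustrative records only; the values are made up.
observations = [
    to_observation("energy", {"meter_location": "Main St & 1st Ave", "kwh": 12.5}),
    to_observation("transportation", {"intersection": "Main St & 1st Ave", "vehicles_per_hour": 840}),
    to_observation("energy", {"meter_location": "Depot Rd", "kwh": 3.0}),
]
linked = group_by_site(observations)
```

Once records share a schema, a question that spans verticals (for example, energy use and traffic flow at the same site) becomes a simple lookup rather than a custom integration effort.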
4.2 Distributed Data Intelligence on the Edge
Going forward, we expect to see distributed data intelligence on the edge to provide on-premises solutions and offer practical impact. When solving real-world problems in distributed and autonomous environments, “intelligence will decentralize and be embedded closer to the devices carrying out the inspections” [19]. This way, we can make data collection, processing, and analysis more efficient and safer, and at the same time, we can address privacy, security, and latency concerns more effectively [20]. This is especially meaningful given the recent advances and future trends of AI-driven edge intelligence for connected living.
This anticipated shift has great implications and highlights a highly immersive experience for research and learning. Particularly in this new environment, we shall take data science practice to the edge of intelligent system networks where students and researchers will be able to investigate, experiment, and implement solutions right on site. They can do so more efficiently when directly interacting with the sources of data challenges, participating in actual decision making and policy monitoring, and making real-time adjustments while witnessing impact generation.
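A minimal sketch of this pattern follows, assuming each edge device reduces its raw readings to a few summary statistics and shares only those. The class and function names and the sample readings are illustrative assumptions, not drawn from [19] or [20]; the point is that a central service can recover the overall mean and spread without ever receiving the raw data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeSummary:
    """Compact statistics that leave the device instead of raw readings."""
    count: int
    total: float
    total_sq: float


def summarize(readings):
    """Runs on the edge device: reduce raw readings to three numbers."""
    return EdgeSummary(
        count=len(readings),
        total=sum(readings),
        total_sq=sum(r * r for r in readings),
    )


def merge(summaries):
    """Runs centrally: combine summaries without seeing any raw data."""
    return EdgeSummary(
        count=sum(s.count for s in summaries),
        total=sum(s.total for s in summaries),
        total_sq=sum(s.total_sq for s in summaries),
    )


def mean_and_std(summary):
    """Recover the mean and population standard deviation from a summary."""
    mean = summary.total / summary.count
    variance = max(summary.total_sq / summary.count - mean * mean, 0.0)
    return mean, math.sqrt(variance)


# Two hypothetical devices with made-up sensor readings.
node_a = summarize([1.0, 2.0, 3.0])
node_b = summarize([4.0, 5.0])
combined = merge([node_a, node_b])
overall_mean, overall_std = mean_and_std(combined)
```

Sending three numbers per device instead of every reading reduces network traffic and limits exposure of raw data, which is one simple way the privacy and latency benefits described above can arise.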
4.3 Smart Data Work and Greater Process Discovery
As organizations increase the use of AI to extract greater value from their digital assets, foundational frameworks such as community data models, metadata tagging, and domain ontologies, as well as solutions for automating the slicing and dicing of data, will become even more critical elements of organizational data infrastructures. “Smart” data work will require greater process discovery. Here, “process discovery is like a sensor embedded in the application that learns all of the user journeys, using AI to predict the optimal path for interacting with a system” [19].
To this effect, we can expect more personalized and adaptive discovery experience, where researchers will be able to talk to computer programs to adaptively redesign and optimize analytical frameworks or investigative pathways in real time. In this sense, we can expect data processing to change from reactive in nature to prescriptive in essence. Traditional search functions will give way to cognitive search that allows for real-time path selection and instant knowledge delivery.
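The process discovery idea can be illustrated with a very small sketch that counts transitions in logged user journeys and suggests the most frequent next step. This simple frequency model is a stand-in for the AI-based prediction described in [19], and the journey logs and step names below are invented for illustration.

```python
from collections import Counter, defaultdict


def learn_transitions(journeys):
    """Count how often each step is followed by each other step."""
    transitions = defaultdict(Counter)
    for journey in journeys:
        for current, following in zip(journey, journey[1:]):
            transitions[current][following] += 1
    return transitions


def most_likely_next(transitions, step):
    """Suggest the most frequently observed next step, or None if unseen."""
    if step not in transitions:
        return None
    return transitions[step].most_common(1)[0][0]


# Hypothetical logged journeys through a data discovery system.
journeys = [
    ["search", "filter", "download"],
    ["search", "filter", "visualize"],
    ["search", "filter", "download"],
    ["browse", "download"],
]
model = learn_transitions(journeys)
suggestion = most_likely_next(model, "filter")
```

A deployed system would likely condition on more context than the current step alone, but even this first-order model shows how observed journeys can be turned into path suggestions.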
4.4 Democratization of Data Science Platforms
Modern crises such as global health emergencies, existential risks of food and water insecurity, systemic economic instability, and human-induced climate change all call into question the capacity of democracy to meet the challenges of averting crises [21]. Such grand challenges demand scientific research and technology development to “be set free to produce solutions” [10]. This requires the democratization of data science platforms to facilitate convergent research and integrative learning in efforts to systematically address societal grand challenges.
With democratized approaches, we can collect, merge, and synthesize data from different contexts and at global scales to make breakthroughs such as the genome sequencing of the coronavirus in combating COVID-19 and the first global mapping of the “Wood Wide Web.” The former allows us to understand how the virus is evolving, mutating, and spreading in different geographic areas. Also, as contact tracing, mobility tracking, and symptom checking become prevalent in preventing the spread of the virus, rapid data gathering and sharing in a responsible and ethical manner is more essential than ever. The latter is a mapping of the complex underground web of roots, fungi, and bacteria beneath every forest that provides trees with nutrients and reacts to climate impact [22]. This breakthrough has the potential to advance our progress in fighting climate change and addressing carbon cycling. In either case and in times of crises, scientific and reliable data access, sharing, and dissemination along with collective intelligence, collaboration, and innovation are critically needed.
Furthermore, different data that come from different contexts can enhance neural network training and machine learning accuracy to further improve our understanding of causality. Specifically, “with multiple context-specific data sets” that “are selected smartly from a full spectrum of contexts”, the final correlations should approximate “the invariant properties of the ground truth” [23]. This will make sure machine learning, deep learning, and AI tools and models can better monitor the pandemic outbreak or environmental risk, optimize surveillance, and engender rapid responses.
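One simple way to picture this is a stability check: compute the correlation between each candidate feature and the outcome separately in every context, and retain only the features whose correlation keeps the same sign and a minimum strength everywhere. The sketch below is an assumption-laden illustration of that idea, not the procedure of [23]; the feature names, contexts, threshold, and data values are all made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def stable_features(contexts, outcome, threshold=0.5):
    """Keep features whose correlation with the outcome is consistently
    strong and of the same sign in every context."""
    features = [name for name in next(iter(contexts.values())) if name != outcome]
    kept = []
    for name in features:
        corrs = [pearson(data[name], data[outcome]) for data in contexts.values()]
        if all(c >= threshold for c in corrs) or all(c <= -threshold for c in corrs):
            kept.append(name)
    return kept


# Two hypothetical contexts with made-up values. "exposure" tracks the
# outcome in both; "region_marker" flips sign between contexts.
contexts = {
    "context_a": {
        "exposure": [1, 2, 3, 4, 5],
        "region_marker": [1, 2, 3, 4, 5],
        "cases": [2.1, 3.9, 6.2, 7.8, 10.1],
    },
    "context_b": {
        "exposure": [1, 2, 3, 4, 5],
        "region_marker": [5, 4, 3, 2, 1],
        "cases": [1.8, 4.2, 5.9, 8.1, 9.9],
    },
}
invariant = stable_features(contexts, outcome="cases")
```

A correlation that survives across contexts is still not proof of causation, but discarding relationships that flip between contexts removes one common source of spurious patterns.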
5. EMERGING PROBLEM SPACES AND CHANGING KNOWLEDGE PRACTICES
As the pervasiveness, complexity, and magnitude of intelligent systems grow, the urgency to create scalable solutions for managing, processing, and synthesizing big infrastructure and mobility data will only increase. This transdisciplinary domain of practice will require novel conceptual modeling and thematic mapping of its data collections for effective search, discovery and exploration. In these regards, various challenges arise dealing with data representation, modeling, and visualization. Without proper documentation of the meta-conditions of data, researchers will only struggle to link data, information, and insights back into the fast-moving development lifecycle.
To resolve organizational problems, we need to identify key analytical parameters, define metadata frameworks, and extract essential features for search and discovery in this evolving domain of practice. All point to re-designing and re-configuring data organization and information modeling for intelligent system applications and convergent work solutions. Ultimately, we need to build overarching data models to scale into different “verticals” of the intelligent infrastructure framework, from energy and transportation to building and climate, and across the continuum of data applications, from novel analytics to practical solutions. This essentially requires the role of knowledge architects to reconcile and intersect different knowledge domains, infusing human intelligence into machine intelligence for dynamic data linking and integration.
To address operational workflows, data fusion techniques, representation methodologies, and pipeline optimization still need to be devised. The solution, on the one hand, lies in technical breakthroughs. “Smart” data work will need AI assistance and autonomous solutions, especially toward deep semantics and dialog with intelligent agents. On the other hand, the solution rests on the human aspect of fostering a feedback loop across a range of data agents and mobilizing a “community learning cycle” across the whole data spectrum [24]. In pursuit of integrated solutions, we must enable data workflows across human networks. “The feedback processes introduced may fundamentally alter the accumulation of knowledge,” directly affecting human and machine culture [5]. These socio-technical reconfigurations of data practice and information work will catalyze knowledge assimilation across the intelligent systems.
Just as big data are used to extract insights and improve experiences, more granular “small data” will be used to drive human actions and improve individuals' wellbeing “based on their real-world behaviors, capabilities, and needs” [19]. To curate more satisfying user experiences, we should connect the small data that can optimize an individual's discovery experience with the big data that can uncover solutions on a broader scale. This way, we can sequence the user journey together and gain insights into users' behavioral patterns and cognitive processes. Intelligent infrastructure development will certainly propel the collection of granular small data and behavioral metadata that can then be mined to shed new light on users' activities. With that, we can advance user behavioral modeling by focusing on more granular small data generated by individuals while interacting with big infrastructure data. This will improve our understanding of emergent human information behaviors and sense-making processes in the “smart” environment, which could further help automate simple tasks and augment users' abilities to execute more complex actions through AI tools and deep learning techniques.
Overall, in the new environment, we need to proactively forge new linkages and create novel connections through various means, such as developing and implementing community models for data and metadata, engaging data exchanges and synthesis efforts, and designing for integration. We also need to stimulate demand for data from the grass roots by creating a cross-cutting data framework and operational workflow for the knowledge domain and by democratizing data work techniques and analytical capabilities across the community. This way, people can act on their ideas, create value, and deliver solutions. All of these will require system design and social engineering to intentionally build connections and align interests toward solving bigger infrastructure problems [25]. From a social informatics perspective, the “usage and effects” of new technologies “are tied to the socio-technical practices of design and use, situated in specific contexts” [26]. In this sense, we should purposefully reconfigure the ways that knowledge is created across disciplines and systems in order to address the full complexities of broad infrastructure problems. These efforts will ensure we stay competitive and “on the edge” in a fast-moving technology landscape.