Challenges Posed by Big Data on the Foundations of Scientific Thinking
- Date: November 5, 2021
- Location: OWSpace, Beijing
Science is rooted in the human pursuit of certainty. Its practitioners seek to render worldly phenomena understandable, explicable, and manipulable. This pursuit of certainty manifests mainly as an inquiry into causal relationships underpinning the occurrence and development of phenomena. Scientific thinking is originally built upon the notion of determinism.
However, the advent of big data has created huge challenges for determinism. Interconnectivity and mutual influence have created an open and borderless Internet of Things. As such, the essential relationship between things is no longer causal, but correlative. Causality is merely a scientific illusion of the “small data era”. Big data’s rise will necessitate a shift toward “iteration” of knowledge rather than “a complete certainty of the truth”, an old-fashioned idea from the classical science era.
Agenda:
19:00 – 19:05|Opening Remarks
19:05 – 19:50|Keynote Speech
19:50 – 20:30|Q & A
Key Discussion Topics:
• How can we seek a balance between openness and completeness in the realm of scientific inquiry?
• Can methods of experience and experiment yield outcomes that reliably determine cause and effect?
• How should we define “small data”? What distinguishes it from “big data”?
Speaker:
WU Jiarui
• Researcher, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences
After receiving his Ph.D. from ETH Zürich in 1994, Dr. Wu Jiarui served as a postdoctoral researcher at the State University of New York Health Science Center until 1997. Currently, he is the executive dean of the School of Life and Health Sciences at the Institute for Advanced Study, University of Chinese Academy of Sciences (UCAS) Hangzhou; director of the UCAS Hangzhou Advanced Research Institute; director of the Zhejiang Province System Health Science Primary Laboratory; editor-in-chief of the Journal of Molecular Cell Biology; associate editor of BMC Systems Biology; associate editor of Chemistry of Life; and associate editor of Medicine & Philosophy. Dr. Wu chairs the professional committee at the Chinese Society of Biochemistry and Molecular Biology. He is also a member of the Eighth Expert Advisory Committee in the chemical sciences division of the National Natural Science Foundation of China. In 1998, he received a grant from the National Science Fund for Distinguished Young Scholars and was selected for the Hundred Talents Program at CAS. In 2009, he was named as one of Shanghai’s leading talents.
Dr. Wu’s laboratory uses systems biology methods to study the molecular mechanisms behind the appearance and evolution of chronic diseases such as diabetes and tumors. He has published over 100 research papers in international academic journals.
Moderator:
BAI Shunong
• Professor, School of Life Sciences at Peking University
• 2020-2021 Berggruen Fellow
Bai Shunong is a professor at the School of Life Sciences, Peking University. Since his postgraduate training in 1983, Professor Bai’s experimental research on plant development has led him to form original perspectives on plant cultivation. He believes it is necessary to revive the view held by the founders of modern botany that plants are not individual organisms like animals but an “aggregate” of many “individuals”. The “Plant Morphology 123” theory he proposed regarding plant development integrates recent novel concepts such as “plant development unit”, “sexual reproduction cycle”, and “plant development program” into a fixed system. While trying to understand the internal dynamics of plant development, Bai Shunong has also developed a keen interest in exploring the nature of life. He believes that the question “What does ‘living’ mean?” is fundamental to understanding life. In cooperation with two mathematicians, he proposed that the essence of “living” is, in fact, the “structure for energy cycle”.
REPORT
The 16th Berggruen seminar, Challenges Posed by Big Data on the Foundations of Scientific Thinking, was held at OWSpace in Beijing on September 28, 2021. Around 60 participants attended the event in person, while the number of viewers watching the livestream via Bilibili peaked at 14,000.
The event was moderated by Bai Shunong, 2020-2021 Berggruen Fellow and Professor at the School of Life Sciences at Peking University. The keynote presentation was made by Dr. Wu Jiarui, researcher at the Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences. Dr. Wu is also the current Executive Dean of the School of Life and Health Sciences at the Institute for Advanced Study, University of Chinese Academy of Sciences (UCAS) Hangzhou and the Director of the Key Laboratory of Systems Biology at the Chinese Academy of Sciences. Throughout his decades-long academic career, he has focused on how to combine the fruits of frontier research with philosophical inquiry. His research attempts to understand life and the world amidst the mutual inspiration and interrogation engendered by collisions between science and philosophy.
In his view, science is founded on humankind’s pursuit of certainty. Its goal is to understand, explain, and control the physical world. This pursuit of certainty is primarily expressed as the exploration of causal relationships behind the occurrence and development of events through research. Its foundational thinking is built upon determinism. The arrival of the big data era, however, has posed huge challenges for determinism: all things are interconnected and mutually interacting, forming an open and borderless “Internet of Things.” Relationships among things, therefore, are correlative rather than causal. Causality is but a scientific illusion of the “small data era.” What big data brings is not the comprehensive and certain truth of the classical scientific era, but rather incomplete knowledge that can be repeated, or “iterated.”
Determinism: Foundations of Scientific Thinking
Pierre-Simon Laplace, the great 18th century French mathematician, once said: “We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at any given moment knew all of the forces that animate nature and the mutual positions of the beings that compose it, if this intellect were vast enough to submit the data to analysis…the future, just like the past, would be present before its eyes.” Dr. Wu believes that this determinism constitutes the foundational thinking underlying conventional scientific research.
At the ontological level, this form of determinism is reflected as a form of reductionism. As Erwin Schrödinger pointed out in his 1944 treatise What is Life?, “the events that happen within [an organism] must obey strict physical laws.” This inspired a huge number of physicists and chemists who realized that life is not mysterious; its foundations were merely the molecules of genes and proteins. Physics and chemistry could similarly enter the science of life. As a result, the field of molecular biology emerged in the 1950s and quickly became a highly productive and prominent frontier approach to biological research. Protein structures and models were explained through chemistry — the mechanisms whereby viruses infected cells were subjected to biochemical analysis. Discovering these reasons and mechanisms for physiological and pathological workings became the most important research direction in the life sciences, based on a reductionist, deterministic understanding of life.
At the epistemological level, the ultimate goal of modern biological research driven by determinism, as Francis Crick put it, was to use physics and chemistry to explain all biological phenomena. This turned biology, traditionally a descriptive science, into an experimental science driven by hypotheses. A closely related outcome was that reductionism assumed a dominant position: complex life systems could be understood by deconstructing them into parts that were then individually analyzed. Put simply, throughout most of the twentieth century life was viewed as a big machine. Once we knew how each part worked, life would no longer be mysterious and unknowable. The central task of the biologist was to propose hypotheses that would be tested through well-designed physical and chemical experimentation.
However, traditional molecular biology’s conceptions of life were challenged in the late 20th century by the emergence of life sciences research based on big data.
Big Data: Open Science
Karl Popper held that all scientific propositions must be falsifiable; unfalsifiable theories cannot become scientific theories. Imre Lakatos later amended this theory and introduced the concept of “research programmes.” Molecular biology adhered to this principle. Guided by its research programme, it first proposed a set of “hard core” principles (for example, the central principles behind the transcription of DNA into RNA and the translation of RNA into proteins) that were treated as beyond empirical refutation, together with surrounding hypotheses that could be tested, modified, and updated. This is a characteristic of life sciences methodology that is based on reductionism.
However, the Human Genome Project, which began in the 1990s, used another method. The researchers hoped to use big data to thoroughly understand the human genome, i.e. the genetic components of 46 chromosomes, and to decipher the mysteries of 3 billion base pairs. In 2001, the project published a draft of the human genome that was roughly 95% complete, a milestone in life sciences research. Even today, the project has yet to be fully completed. Limited by our understanding of chromosomes, the research may continue indefinitely, iterating and improving so that knowledge keeps advancing. This big-data-based “iterative” methodology differs from traditional forms of research. It does not propose a core framework or principle, but instead relies on data discoveries to constantly update core perspectives, grasp the “continuity” of knowledge, and look towards open and unpredictable outcomes.
When the life sciences shifted from reductionism to “iterative” methodologies, they completed what Thomas Kuhn called a “paradigm shift,” in which hypothesis-driven research paradigms give way to data-driven methodologies. The “Big Data+” research model is revamping our understanding of various disease relationships and pathologies, and driving significant advances in the field.
Big Data: A Correlative World
Conventional biological research founded upon determinism is often geared towards seeking the molecular causes of pathological activity. However, in many circumstances, we can at best discover a sufficient condition for a physiological or pathological event. The pursuit of causal relationships that are certain, sufficient, and necessary is extremely challenging.
Meanwhile, contemporary life sciences research based on big data no longer solely seeks specific causal relationships, but rather looks for more precise and broader descriptive correlations. One example is how diet affects the occurrence and development of disease: compared to those who do not, those who consume fruit daily have a 40% lower chance of dying from cardiovascular disease and a 34% lower chance of dying from coronary heart disease.
Views of life based on big data subscribe to a form of gridded causal inference. Each of our 20,000-odd genes may more or less contribute to the formation of a disease. Every part of an organism participates in life processes holistically. Under this interpretation, correlation can be significantly extended, but causality is immensely difficult to pin down. Each of us is like a living network where all things are interconnected, while life is a complex, layered system constituted by genes, proteins, cells, organs, and, finally, the organism.
Dr. Wu argues that the biggest challenge faced by the life sciences comes from the conflict between researchers’ deterministic thinking and the daily occurrences of life. Humankind continues to hope for definite answers, causal relationships, and the interpretability of the world. But in reality, life is an open and complex system. There is no creator that mapped out each step in the evolution of life on earth four billion years in advance. Evolutionary processes are, for the most part, uncertain. Put differently, happenstance is the true driver of life. This view happens to coincide with the ideas that Professor Bai Shunong has espoused and advocated in the column Baihua.
After the end of the keynote presentation, Professor Bai expanded upon Dr. Wu’s views and suggested that researchers should perhaps closely consider the origins of our preference for “certainty,” free themselves from a narrow view of life, and truly attempt to understand the physical and intellectual world from the perspective of randomness.
In the future, the Berggruen Research Center at Peking University will continue to explore scientific and philosophical conceptions of “life” in an era of change. We hope to leave behind one-sided interpretations and instead seek “uncertain” answers within a broader natural world to achieve the iteration of conceptualizations and renewal of ideas in the era of big data.
Original article in Chinese by Li Zhilin
Translated by: Intern, Berggruen Research Center at Peking University
Additional editing: Christopher Eldred and Sarah Gilman