The Architectural Relevance of Reinforcement Learning

October 01, 2020

It’s been 50 years since Gordon Pask wrote his seminal piece on the Architectural Relevance of Cybernetics. As I will try to argue in this post, Cybernetics and Reinforcement Learning are two sides of the same coin, and Reinforcement Learning has become a mature enough field to be used both for breakthroughs and in day-to-day life. One field it could fundamentally change is Architecture and Urban Planning. But let’s begin with a short intro to Cybernetics.

Cybernetics is the multidisciplinary field concerned with feedback loops, control and communication. A feedback loop is a fundamental property of any system that is influenced by sources internal or external to it. Control and communication are the mechanisms a system uses to steer its state towards “desired” ones (assuming it cares to remain in one). Feedback-loop thinking can be applied to many fields, such as organization, cognition, sociology, epidemiology, and any other field concerned with input/output systems and complexity.
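To make the idea of a feedback loop concrete, here is a minimal sketch of a thermostat-like system in Python. Everything in it is an illustrative assumption rather than something taken from the cybernetics literature: the function name `simulate_thermostat`, the `gain` and `leak` parameters, and the simple linear heat-loss model. It only shows the measure, compare, act cycle described above.

```python
def simulate_thermostat(setpoint, initial_temp, outside_temp,
                        gain=0.5, leak=0.1, steps=50):
    """Toy proportional feedback loop (hypothetical model, for illustration only)."""
    temp = initial_temp
    history = [temp]
    for _ in range(steps):
        # Feedback: compare the current state with the desired state.
        error = setpoint - temp
        # Control: act in proportion to the error.
        heating = gain * error
        # Environment: heat leaks toward the outside temperature.
        loss = leak * (temp - outside_temp)
        temp = temp + heating - loss
        history.append(temp)
    return history


history = simulate_thermostat(setpoint=21.0, initial_temp=15.0, outside_temp=5.0)
print(f"start: {history[0]:.2f}, end: {history[-1]:.2f}")
```

With these made-up numbers the room warms up and settles, but somewhat below the setpoint: a purely proportional controller balances against the constant heat loss instead of cancelling it. Even this toy system shows that steering towards a “desired” state depends on how the controller and the environment interact.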

It’s not difficult to imagine how cybernetics can be relevant to architecture, as the fundamental purpose of architecture is to host intelligent and interactive agents while maintaining certain qualities of interest, such as stability, aesthetics, robustness over time, etc. Historically, such constructions have been seen as static material systems. Over the last several decades, the language of architecture has changed to include the lifecycle of a building, adaptation, and dynamic systems, where the building responds to a complex set of goals.

$$
\begin{aligned}
D(q(z)\,||\,p(z|x)) &= \mathbb{E}_q[\log q(z)] - \mathbb{E}_q[\log p(z|x)] \\
&= \mathbb{E}_q[\log q(z)] - \mathbb{E}_q[\log p(z, x)] + \mathbb{E}_q[\log p(x)] \\
&= \mathbb{E}_q[\log q(z)] - \mathbb{E}_q[\log p(z, x)] + \log p(x)
\end{aligned}
$$
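The chain of equalities above can be checked numerically on a small discrete example. The sketch below assumes a latent variable z with three possible values and a single fixed observation x. The numbers in `p_joint` and `q` are made up for illustration and do not come from any model in this post.

```python
import math

# Hypothetical values of p(z, x) for one fixed x and three values of z.
p_joint = [0.10, 0.25, 0.05]
# Hypothetical approximating distribution q(z); sums to 1.
q = [0.2, 0.5, 0.3]

# Evidence p(x) = sum over z of p(z, x), and the posterior p(z | x).
p_x = sum(p_joint)
posterior = [pj / p_x for pj in p_joint]

# Left-hand side: D(q(z) || p(z|x)) computed directly.
kl_direct = sum(qi * math.log(qi / pi) for qi, pi in zip(q, posterior))

# Right-hand side: E_q[log q(z)] - E_q[log p(z, x)] + log p(x).
kl_expanded = (
    sum(qi * math.log(qi) for qi in q)
    - sum(qi * math.log(pj) for qi, pj in zip(q, p_joint))
    + math.log(p_x)
)

print(kl_direct, kl_expanded)
```

The two printed values agree up to floating-point error, because log p(x) does not depend on z and so can be pulled out of the expectation under q.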

Perceiving architecture as a dynamic entity brings certain challenges. We need new materials and tools to make an architectural system robust and adaptable to different, non-static goals. First, we need to think about what the interacting parts and the feedback loops at play are. Habitats are the main focus of the hierarchy: they are the causa finalis. Other feedback loops connect architecture with the environment, other structures, the city, the laws, even nature itself.