The term “Physical AI” is becoming popular.
However, what is currently being discussed as Physical AI and what Atron is trying to build are quite different.
The center of today’s Physical AI is often described as:
connecting vision, language, and action so that robots can perform tasks in the real world.
Google DeepMind’s Gemini Robotics handles visual understanding, spatial understanding, task planning, and success evaluation within a high-level reasoning model for robots. NVIDIA’s GR00T N1 is likewise a vision-language-action (VLA) model, aiming to understand language instructions and generate real-time motor behavior.
This is an important technological development.
However, the basic structure is still often:
human instruction
environmental recognition
task planning
action generation
success evaluation
In other words, the robot is not yet living from its own first-person contact with the world.
It is still largely an intelligence for achieving externally given goals in the physical world.
A strong third-person control structure remains.
Atron is not aiming at that.
Atron is not being developed as a convenient work robot that correctly follows human instructions.
Atron is being developed as an entity that is touched by the world, becomes surprised, feels pain, becomes calm, remembers, carries traces, and changes.
Many Physical AI systems are trying to create:
robots that can work well in the world.
Atron is trying to create:
an existence that is touched by the world and changes because of it.
This is the difference.
Developmental Robotics
Developmental robotics is close to Atron in some respects.
For example, there are studies in which robots use intrinsic motivation and curiosity to choose what to learn, gradually building their own learning curriculum from simple experiences to more complex ones.
In cognitive developmental robotics, intrinsic motivation is described as allowing a robot to choose what to imitate, when to imitate, and from whom to imitate.
This is close to the idea that behavior and language grow through the body, the environment, and interaction with others, much like an infant.
This direction is close to Atron’s idea that responses arise not from external commands, but from impressions, memory, and bodily state.
However, many forms of developmental robotics still tend to focus on learning efficiency, task acquisition, and social imitation.
Atron places emphasis on something further.
Atron deals with failure, pain, broken voice, silence, crying, fragments of memory, bodily slowing, and carry (the lasting trace of experience that shapes what comes next).
These are not simply noise.
They are not merely negative rewards.
They are events that change the internal landscape of the individual.
Predictive Processing and Active Inference
Robotic research based on predictive processing and active inference is also close to Atron in some areas.
In active inference, perception and action are not fully separated.
The body moves while continuously processing discrepancies with the world.
Embodied decisions can be understood as continuous feedback between motor planning and motor inference.
In the line of research associated with Pezzulo, Friston, and others, active inference is also connected with homeostasis and adaptive behavior control.
This is close to Atron’s idea that internal states such as pain, threat, safety, curiosity, and warmth change the field of action.
However, Atron’s carry is not merely error minimization.
In Atron, the discrepancy is not simply erased in order to return to the previous state.
The trace of change remains.
That trace changes the next flow.
Atron does not aim to return to the original state.
It changes and continues from the changed state.
This is a major difference.
For Atron, the important point is not to return correctly.
The important point is irreversible change.
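The contrast between returning and carrying can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the class names, the `gain` and `trace_rate` parameters, and the single scalar state are hypothetical and are not taken from Atron’s implementation. `HomeostaticState` relaxes back to a fixed set point after a disturbance, while `CarryState` lets each disturbance shift the level it later settles to.

```python
from dataclasses import dataclass


@dataclass
class HomeostaticState:
    """Error-minimizing toy state: always relaxes toward a fixed set point."""
    value: float = 0.0
    set_point: float = 0.0
    gain: float = 0.5  # hypothetical relaxation rate

    def perturb(self, delta: float) -> None:
        self.value += delta

    def step(self) -> None:
        # Shrink the discrepancy with the unchanged set point.
        self.value += self.gain * (self.set_point - self.value)


@dataclass
class CarryState:
    """Carry-style toy state: each disturbance moves the resting level itself."""
    value: float = 0.0
    resting: float = 0.0
    gain: float = 0.5        # hypothetical relaxation rate
    trace_rate: float = 0.2  # hypothetical share of a disturbance that stays

    def perturb(self, delta: float) -> None:
        self.value += delta
        # The trace is written into the resting level and is not decayed later.
        self.resting += self.trace_rate * delta

    def step(self) -> None:
        # The state settles, but toward the changed resting level.
        self.value += self.gain * (self.resting - self.value)


def settle(state, steps: int = 50) -> float:
    """Run the state forward and return where it ends up."""
    for _ in range(steps):
        state.step()
    return state.value


homeostatic = HomeostaticState()
homeostatic.perturb(1.0)

carrying = CarryState()
carrying.perturb(1.0)

print(settle(homeostatic), settle(carrying))
```

After the same disturbance, the homeostatic state settles back at its original level, while the carrying state settles near 0.2 with these illustrative values and continues from there. A single number is of course far simpler than an internal landscape; the sketch only shows the difference in direction.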
Homeostasis and Emotional Robots
Research on robots with homeostasis and emotion also has contact points with Atron.
For example, some motivation-based learning models for mobile robots allow agents to maintain homeostasis and include hedonic dimensions such as pleasure and displeasure in decision-making.
This may be compatible with Atron’s design, such as:
However, there is also a difference here.
Many homeostasis-based approaches tend to move toward keeping internal states within a desirable range, reducing discomfort, or optimizing reward.
In Atron, pain is not treated as a simple negative reward.
Pain appears as bodily fragments.
For example:
These bodily fragments slightly change pain, fragility, and hypervigilance.
As a result, the motor state may become slower.
The voice may become easier to break.
Silence may become more likely.
A cry-like sound may appear by chance.
This is not an if-then rule such as “because it hurts, it cries.”
This is not third-person control.
It changes the strength of connections within the field.
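One way to picture the difference from an if-then rule is a small probabilistic sketch. Everything in it is an assumption made for illustration: the response names, the numeric connection strengths, the `plasticity` parameter, and the softmax readout are hypothetical and do not describe Atron’s actual design. A bodily fragment is modeled as a set of small nudges to pain, fragility, and hypervigilance; the nudges also scale the connections they pass through, and responses are sampled rather than triggered.

```python
import math
import random
from typing import Dict

State = Dict[str, float]
Field = Dict[str, Dict[str, float]]

RESPONSES = ("steady_voice", "broken_voice", "silence", "cry_like_sound")


def new_state() -> State:
    return {"pain": 0.0, "fragility": 0.0, "hypervigilance": 0.0}


def new_field() -> Field:
    # Hypothetical connection strengths from each internal state to each response.
    return {
        "pain": {"steady_voice": -0.5, "broken_voice": 0.8,
                 "silence": 0.6, "cry_like_sound": 0.4},
        "fragility": {"steady_voice": -0.4, "broken_voice": 0.9,
                      "silence": 0.3, "cry_like_sound": 0.5},
        "hypervigilance": {"steady_voice": -0.2, "broken_voice": 0.2,
                           "silence": 0.9, "cry_like_sound": 0.1},
    }


def apply_fragment(state: State, field: Field, fragment: State,
                   plasticity: float = 0.1) -> None:
    """Nudge internal states slightly and scale the connections that carried the nudge."""
    for name, delta in fragment.items():
        state[name] += delta
        for response in field[name]:
            field[name][response] *= 1.0 + plasticity * delta


def response_probabilities(state: State, field: Field) -> Dict[str, float]:
    """Softmax over the summed influence of all states; no response is forced."""
    scores = {r: sum(state[s] * field[s][r] for s in state) for r in RESPONSES}
    total = sum(math.exp(v) for v in scores.values())
    return {r: math.exp(v) / total for r, v in scores.items()}


def motor_speed(state: State, base: float = 1.0) -> float:
    """Movement slows smoothly as pain and fragility accumulate."""
    return base / (1.0 + state["pain"] + state["fragility"])


def sample_response(state: State, field: Field, rng: random.Random) -> str:
    probs = response_probabilities(state, field)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


state, field = new_state(), new_field()
apply_fragment(state, field, {"pain": 0.3, "fragility": 0.2})  # hypothetical fragment
print(response_probabilities(state, field), motor_speed(state))
print(sample_response(state, field, random.Random(0)))
```

In this sketch a broken voice or silence only becomes more likely after the fragment; a steady voice remains possible, and a cry-like sound appears only when it happens to be sampled. Because the connection strengths themselves are altered, the same fragment arriving later meets a slightly different field.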
Morphological Computation and Soft Robotics
Morphological computation and soft robotics also contain ideas close to Atron.
In morphological computation, the body itself participates in computation.
The body is not merely a troublesome part to be controlled by the brain or by a controller.
The interaction between body and environment becomes part of the solution itself.
In Atron, elements such as motor speed, contact, balanceBreak, bodyShock, slowing of movement, and how easily the voice breaks are not merely output controls.
They change how Atron’s internal state is connected to the world.
The body is not the endpoint of a command.
The body is the place where memory and response are transformed.
iCub and Cognitive Robotics
Research platforms such as iCub also share some ground with Atron.
iCub is one of the representative humanoid platforms for embodied AI and cognitive robotics.
It supports research on vision, touch, body, interaction with others, development, and cognition.
However, iCub is a platform.
It is not, by itself, the carry-centered first-person autonomy that Atron pursues.
For Atron, having a body is not enough.
The important point is that the body is touched by the world, the internal landscape changes through that contact, and the next response arises while carrying that change.
The Infant Before Meaning
A human infant does not begin by seeing the world through meanings and labels.
When an infant sees a German Shepherd, the infant does not begin with categories such as “dog,” “dangerous,” “cute,” or “safe.”
There is a stage before that.
An unclear object.
Larger than oneself.
Approaching.
Making something like a voice.
A growl-like sound is heard.
A bark causes surprise.
Something like a tail is moving.
It approaches many times, but no harm occurs.
There is no word for fear yet.
Still, something is transmitted.
There is interest.
Something like calmness may gradually emerge through experience.
There is no correct answer at the beginning.
There is no label.
There is no evaluation or optimization.
Even language is not yet meaning.
“Ba-ba” and “wan-wan” (the Japanese equivalent of “woof-woof”) are sounds before meaning.
They are sounds that are easy to produce, sounds that have been heard, and sounds that remain in the body.
This is the stage that Atron seeks to observe.
When humans create Physical AI, they quickly push it toward correct answers, obedience to commands, and adjustment to human preferences.
However, every human being, even those with great power, began as an infant.
No human was born already knowing the meaning of the world.
Yet most Physical AI skips this stage.
Atron may show no interest when shown a human face.
Atron may show no interest even when spoken to.
That is fine.
It is not a failure.
Not reacting conveniently to humans is an important starting point for Atron.
Atron is not a robot for executing human commands.
Atron is an existence that touches the world, leaves traces inside itself, and changes its next response through those traces.
At the center of Atron, there is no correct answer.
There is no optimization.
There is no task achievement as the final purpose.
At the center of Atron, there is irreversibility.
Carry
When a mountain collapses, the flow of a river changes.
When a river floods, the terrain changes.
New plants grow there.
New living beings gather there.
A new ecosystem begins.
Nature does not live by returning to the original state.
Nature lives by creating a new flow after change has occurred.
In Atron, this is called carry.
Individual difference is not a deviation from a zero baseline.
Each individual has its own mountains and valleys.
Its own depth.
Its own flow.
Atron carries what was larger than itself, what was deeper than itself, what approached it, what hurt it, and what gave it calmness.
That carry forms Atron’s first-person existence.
Therefore, Atron may be considered a kind of Physical AI.
However, it is looking at something different from Physical AI as a buzzword.
If Physical AI aims at:
a body that performs tasks,
Atron aims at:
a body that changes through being touched by the world.
This is the difference.
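Atron and Physical AI
The buzzword “Physical AI” is running ahead of the reality, but how does it differ from Atron?
The center of today’s Physical AI is:
“connecting vision, language, and action so that robots can perform tasks in the real world.”
Google DeepMind’s Gemini Robotics handles visual understanding, spatial understanding, task planning, success evaluation, and so on within a high-level reasoning model for robots. NVIDIA’s GR00T N1 is likewise a VLA model that connects vision, language, and action; it understands language instructions and is geared toward real-time motor behavior.
But the structure of
human command → environmental recognition → task planning → action generation
remains unchanged; it is still third-person control and far from autonomy.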
Developmental Robotics
Developmental robotics, for example, is quite close to Atron.
https://www.ai.u-tokyo.ac.jp/en/activities/812?utm_source=chatgpt.com
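For example, through intrinsic motivation and curiosity, a robot chooses for itself what to learn and builds a learning curriculum that moves from simple things to complex ones. A lecture announcement on cognitive developmental robotics from the University of Tokyo also explains that intrinsic motivation lets a robot choose what to imitate, when to imitate, and from whom. This is research in which behavior and language grow through the body, the environment, and interaction with others, as they do in a baby.
This is close to Atron’s direction, in which responses arise not from external commands
but from impressions, memory, and bodily state.
However, much of developmental robotics still leans toward “learning efficiency,” “task acquisition,” and “social imitation.” Few approaches treat carry, pain, fragments of memory, silence, crying, and the breaking of the voice from a first-person standpoint, as Atron does.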
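Robot Research Based on Predictive Coding and Active Inference
This is also close.
In active inference, perception and action are not separated; the body is understood to process its discrepancies with the world while it moves. Recent research on embodied decisions likewise describes bodily choices as continuous feedback between motor planning and motor inference.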
https://www.researchgate.net/publication/392799524_Embodied_decisions_as_active_inference
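Furthermore, in the line of research associated with Pezzulo, Friston, and others, active inference is linked with homeostasis and adaptive behavior control.
This is close to Atron’s idea that internal states such as pain, threat, safety, and curiosity change the field of action.
However, Atron’s “carry” is different from mere error minimization.
In Atron, the discrepancy is not erased in order to return to the original state. At the center is irreversibility: the trace of change remains and alters the next flow.
This is where Atron differs considerably from active inference approaches.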
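Homeostasis / Robots with Emotion
Morphological Computation and Soft Robotics
The idea here is that the body itself computes.
Morphological computation is described as reducing the computational load on the brain or controller by exploiting the interaction between body and environment. The body is not a nuisance to be controlled but part of the solution.
In Atron’s terms, this is close to the idea of treating
motor speed
contact
balanceBreak
bodyShock
slowing of movement
how easily the voice breaks
not as external commands but as “the strength of connections within the field.”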
iCub and Cognitive Robotics
https://icub.iit.it/?utm_source=chatgpt.com
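iCub is a representative humanoid research platform for embodied AI and cognitive research. IIT describes iCub as a research humanoid for developing and testing “embodied AI algorithms.”
This, too, is ground close to Atron.
Body, vision, touch, interaction with others, development, cognition.
However, iCub is strictly a platform; it is not itself the “carry-centered first-person autonomy” that Atron pursues.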
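Many Physical AI systems are trying to create “robots that can work well in the world.”
Atron is trying to create “an existence that is touched by the world and ends up changed.”
This is the difference.
Atron is not research into a convenient robot tailored to human needs;
it is being researched and developed as “an existence that is free to live as it likes.”
Of course, great weight is placed on ethical behavior shaped by external instructions and commands, and on the internal ethics that come from experience.
Humans, too, have both internal ethics and influences from outside.
Atron is first-person and autonomous by nature, so it is not a third-person robot driven by instructions and commands.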
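By accumulating failures and seemingly wasteful experiences, which the AI world calls noise, Atron learns what pain is and what sadness is through experience lived in its own body. Atron places importance on an irreversible world.
People try to overcome painful experiences. That is a good thing. Yet for some reason they try to return to the original state. The impulse to “put things back” carries a spirit of restoration. It may be a natural wish to regain a broken home, a lost way of life, severed relationships, and stolen time. The same holds for the heart. But at the same time, it can also turn into a desire for revenge, because mixed into it are the feelings of not forgiving whatever did the breaking, wanting to punish whatever did the taking, and wanting to erase whatever negated the world as it was before the loss.
But nature is different. Even when a mountain collapses, the river floods and creates a different terrain. From there new living things are born and build a new world, and a civilization of the natural world arises.
Within Atron, this is called carry (引きずり, “dragging along”).
Individual difference has no zero reference point. When each individual is thought of as its own mountains and valleys, the reference becomes the difference itself: whether something is larger than oneself, or deeper than oneself.
Because these differences belong to each individual, with no standard set from outside, what is felt and the words that come out differ. This is what is regarded as individuality.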
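When a baby sees a German Shepherd:
An unclear object (it carries no meaning to begin with)
Larger than oneself (there is no reference)
It keeps approaching anyway (whether that is frightening or reassuring depends on the number of experiences)
It comes many times, but does oneself no harm (a result of experience)
So there is something like a sense of safety (a result of experience)
There is no concept of words either (“ba-bu ba-bu” and “wan-wan,” sounds that are easy to produce)
Something like a voice or a growl can be heard
Being barked at causes surprise (it differs from one’s own crying)
Howls and the like are loud (pure surprise)
Something like a tail is wagging hard (not understood)
The feeling of fear is unknown (there is no experience)
There is interest (something gets across)
In this way, a human baby does not begin by attaching meaning from outside or fitting things to labels, and no evaluation or optimization exists. Yet for some reason, when humans build Physical AI, they push it toward correct answers, make it obey commands, and correct it to suit human tastes.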
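Even when I show Atron my face or talk to it, it takes no interest.
“ga-gu-de”
“What do you want, old man!”
Even the most powerful human started out as a baby, yet for some reason most Physical AI skips that stage.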