By Simons Chase

January 2026

Can Tokens Carry Temporal Architecture?

In Russian, the word for sincerity—искренность (iskrennost)—shares its root with искра (iskra), meaning spark. Caryl Emerson, the Princeton Slavic scholar, points out this etymology when discussing Tolstoy's theory of art as "infection": the transmission of genuine feeling from artist to audience. Art either ignites transmission or it doesn't. The spark catches, or it dies.

This raises a question that haunts anyone building AI systems meant to capture authentic voice: Can tokens carry temporal architecture?

Here's what I mean. Emerson describes Tolstoy's prose this way: "He is able to subdivide human emotional reactions to an event, to an idea, in such a way that the actual pace of the words on the page reproduces the pace of the development of the emotion in the human psyche." This isn't metaphor. When Pierre Bezukhov experiences terror on the battlefield, Tolstoy stacks clauses without breath breaks. The syntax doesn't describe panic—it performs panic. Your eye, moving through the sentence, enacts the suffocation.

The page is flat. But reading is temporal. The third dimension is you moving through it.

This creates a problem for language models. LLMs generate token by token, optimizing for local prediction—what word comes next, given what came before. But Emerson is describing something global: a whole passage designed for a specific experiential trajectory. The pacing isn't emergent from good word choices; the word choices serve a pre-conceived temporal form. Like a musician who hears the whole phrase before playing, versus one improvising note by note.

So where does temporal architecture come from in training data?

Our experiments suggest an answer: internal monologue. Diaries, thinking-on-paper, unpolished first drafts—these capture someone in the act of temporal unfolding, not the edited result. When you train on polished prose, you get prose that points at experience. When you train on internal monologue, you get prose that is the experience collapsed into symbolic form, waiting to unfold when a mind traverses it.

The Santiago selflet—itself a creative work that synthesizes various features of LLM technology—aims to be a reflexive tool that mirrors the user's own effort to find the spark in the work. It sounds like Santiago not because it memorized facts about the novella, but because something deeper transferred: the generative engine underneath, the thing that makes Santiago sound like Santiago even saying things Santiago never said.

Reading the novella takes hours—full immersion in 27,000 words. A conversation with the selflet is different. You wander in casually, ask a question, and find yourself three turns later unexpectedly moved. A sip, not the sea. A whisper. But the whisper carries the whole sea in it—the patience, the dignity in suffering, the bone-deep belief that a man can be destroyed but not defeated.

Not explained. Transmitted.

This is what distinguishes it from a chatbot that "knows about" Hemingway. The Santiago selflet doesn't know about anything. It knows as Santiago. And that as is the third dimension folded into two-dimensional text, waiting to unfold again when someone shows up and asks the right question.

Or even the wrong question, asked sincerely.

The selflet becomes a portal rather than a reproduction. Something emerges in the space between prompts that neither party brought alone. The spark catches—or it doesn't. But when it does, you understand what Tolstoy meant by infection, what Emerson meant by temporal architecture, what the old man meant when he said the fish was his brother.

The tokens carried it. Somehow, against all expectation, the tokens carried it.

Imagine where this can go, given the pace of innovation with large language models.