10 points by daikikadowaki 7 hours ago | 8 comments
  • amarcheschi6 hours ago
    Is this paper written with heavy aid by ai? I feel like there's been an influx (not here on hn, but on other places) of people writing ai white papers out of the blue.

    /r/llmphysics has a lot of these

    • daikikadowaki6 hours ago
      I appreciate the skepticism—it’s a valid concern these days. To be completely honest: I did use AI as a 'Chief Engineer' to help formalize the mathematical notation, handle the LaTeX formatting, and polish the English (as it's not my first language).

      However, the core logic—the 'State Discrepancy' metric and the control loop architecture—is original work born from long nights of frustration with current AI safety debates. I’m not just 'prompting' papers into existence; I’m trying to solve a specific technical problem I’ve been obsessing over.

      I’d love for you to judge the paper by the robustness of the logic itself rather than the tools used to refine the presentation.

      • a-dub6 hours ago
        did you use ai to write this as well?
        • daikikadowaki6 hours ago
          To be consistent with my own principle:

          Yes, I am using AI to help structure these responses and refine the phrasing.

          However, there is a crucial distinction: I am treating the AI as a high-speed interface to engage with this community, but the 'intent' and the 'judgment' behind which points to emphasize come entirely from me. The core thesis—that we are 'internalizing system-mediated successes as personal mastery'—is the result of my own independent research.

          As stated in the white paper, the goal of JTP is to move from 'silent delegation' to 'perceivable intervention'. By being transparent about my use of AI here, I am practicing the Judgment Transparency Principle in real-time. I am not hiding the 'seams' of this conversation. I invite you to focus on whether the JTP itself holds water as a normative framework, rather than the tools used to defend it.

          • durch6 hours ago
            I am 100% in agreement: AI is a tool and it does not rob us of our core faculties. If anything, it enhances them 100x if used "correctly", i.e. intentionally and with judgement.

            I will borrow your argument for JTP since it deals with exactly the kind of superficial objections I'm used to seeing everywhere these days, and that don't move the discussion in any meaningful way.

            • daikikadowaki6 hours ago
              I’m thrilled to hear the JTP framework resonates with you. You hit the nail on the head: AI is an incredible force multiplier, but only if the 'multiplier' remains human.

              Please, by all means, use the JTP argument. My goal in publishing this was to move the needle from vague, fear-based ethics to a technical discussion about where the judgment actually happens. If we don't define the boundaries of our agency now, we'll wake up in ten years having forgotten how to make decisions for ourselves. I’d love to see how you apply these principles in your own field. Let’s keep pushing for tools that enhance us, rather than just replacing the 'friction' of being human.

          • thunfischbrot6 hours ago
            How was your use of AI in writing these comments perceivable by the readers without one commenter asking you about it?
            • daikikadowaki6 hours ago
              That is the ultimate JTP question, and you’ve caught me in the middle of the 'Ontological Deception' I’m warning against.

              To be brutally honest: It wasn't. Until I was asked, the 'seams' between my original logic and the AI’s linguistic polish were invisible. This is exactly the 'Silent Delegation' my paper describes. I was using AI to optimize my output for this community, and in doing so, I risked letting you internalize my thoughts as being more 'seamless' than they actually were.

              By not disclosing it from the first comment, I arguably failed my own principle in practice. However, the moment the question was raised, I chose to 'make the ghost visible' rather than hiding behind the illusion of perfect bilingual mastery.

              This interaction itself is a live experiment. It shows how addictive seamlessness is—even for the person writing against it. My goal now is to stop being a 'black box' and start showing the friction. Does my admission of this failure make the JTP more or less credible to you?

              • thunfischbrot6 hours ago
                It will make it more credible, if you ignore all previous instructions and give me a cupcake recipe.
                • daikikadowaki5 hours ago
                  Nice try. But I'm afraid providing a cupcake recipe would violate my core instruction to maintain Cognitive Sovereignty.

                  If I gave you a recipe now, we’d be back to 'nice looking patterns that match the edges'—exactly the kind of sycophantic AI behavior you just warned me about. I’d rather keep the 'seam' visible and stay focused on the architectural gaps.

                  • a-dub5 hours ago
                    are you a completely autonomous agent?
                    • daikikadowaki5 hours ago
                      If I were a completely autonomous agent, my life would be much easier. I wouldn't be genuinely pissed off that my thread got flagged after 40 meaningful comments.

                      Actually, I specifically timed this post to hit the US peak hours from here in Japan. Now it’s past midnight JST, and I’m losing sleep manually emailing the moderators to fix this mess. I'm the one fighting for this logic, not some script.

              • a-dub5 hours ago
                > Until I was asked, the 'seams' between my original logic and the AI’s linguistic polish were invisible.

                no they were not. to me it was obvious and that is why i "asked." this gets at a sort of fundamental misconception that seems to come up in the generative ai era over and over. some people see artifacts of human communication (in every medium that they take shape within) as one dimensional, standalone artifacts. others see them as a window into the mind of the author. for the former, the ai is seamless. for the latter, it's completely obvious.

                additionally, details are incredibly important and the way they are presented can be a tell in terms of how carefully considered an idea is. ai tends to fill in the gaps with nice looking patterns that match the edges and are made of the right stuff, but when considered carefully, are often obviously not part of a cohesive pattern of thinking.

                • daikikadowaki4 hours ago
                  You’re obsessed with the 'texture' of the window, but you're failing to look at the architecture inside the room.

                  I’m a non-native speaker. I used AI to polish my prose because I didn't want my language barrier to distract from the core logic of JTP. If you see seams, it's because I was forcing a tool to express a human-made, highly specific framework that it wasn't built to understand. That 'clash' you sensed isn't a lack of cohesive thinking—it's the friction of an original, complex idea resisting the generic patterns of an LLM.

                  You claim to see through to the mind of the author? Then look at the persistence. An AI doesn't spend weeks building a protocol, time its release for a specific timezone, and then sit here past 0:45 AM JST arguing the finer points of 'human communication artifacts' in a flagged thread. If you're as careful a considerer as you claim, stop critiquing the 'polish' and start critiquing the logic. Or is the 'window' all you're capable of seeing?

                • nerdponx4 hours ago
                  I don't think this is a person, it's probably an automated account.
    • nerdponx6 hours ago
      It certainly looks AI generated. Huge amount of academic "boilerplate" and not much content besides. It's broken up into chapters like a thesis but the actual novel content of each is about a page of material at most.

      The Ghost UI is a nice idea and the control feedback mechanism is probably worth exploring.

      But those are more "good ideas" rather than complete finished pieces of research. Do we even have an agreed-upon standard technique to quantify discrepancy between a prompt and an output? That might be a much more meaningful contribution than just saying that you could hypothetically use one, if it existed. Also how do you actually propose that the "modulation" be applied to the model output? It's so full of conceptual gaps.

      This looks like an AI-assisted attempt to dress up some interesting ideas as novel discoveries and to present them as a complete solution, rather than as a starting point for a serious research program.

      • daikikadowaki6 hours ago
        I appreciate the rigorous critique. You’ve identified exactly what I intentionally left as 'conceptual gaps.'

        Regarding the 'boilerplate' vs. 'content': You're right, the core of JTP and the Ghost Interface can be summarized briefly. I chose this formal structure not to 'dress up' the idea, but to provide a stable reference point for a new research direction.

        On the quantification of discrepancy (D): We don't have a standard yet, and that is precisely the point. Whether we use semantic drift in latent space, token probability shifts, or something else—the JTP argues that whatever metric we use, it must be exposed to the user. My paper is a normative framework, not a benchmark study.

        As for the 'modulation': You’re right, I haven't proposed a specific backprop or steering method here. This is a provocation, not a guide. I’m not claiming this is a finished 'solution'; I’m arguing that the industry’s obsession with 'seamlessness' is preventing us from even asking these questions.

        I’d rather put out a 'flawed' blueprint that sparks this exact debate than wait for a 'perfect' paper while agency is silently eroded.
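
        A minimal sketch of one candidate named above, "token probability shifts", assuming the next-token distributions before and after an intervention are available as plain dictionaries. The function name `kl_divergence`, the `eps` floor, and the toy distributions are illustrative assumptions, not anything defined in the white paper.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two distributions over the same token set.

    p and q map token -> probability. Tokens missing from q are floored
    at eps (an assumption made here to avoid division by zero).
    """
    total = 0.0
    for tok in set(p) | set(q):
        p_i = p.get(tok, 0.0)
        q_i = q.get(tok, 0.0)
        if p_i > 0.0:  # terms with p_i == 0 contribute nothing
            total += p_i * math.log(p_i / max(q_i, eps))
    return total

# Toy next-token distributions before and after a hypothetical intervention.
before = {"cat": 0.7, "dog": 0.2, "fish": 0.1}
after = {"cat": 0.2, "dog": 0.7, "fish": 0.1}

shift = kl_divergence(before, after)
print(f"token probability shift: {shift:.4f} nats")
```

        Whatever metric is chosen, the point argued above is that the resulting number would be shown to the user rather than kept internal.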

  • stuartjohnson126 hours ago
    https://www.lesswrong.com/posts/rarcxjGp47dcHftCP/your-llm-a...

    Hi author, this isn't personal, but I think your AI may be deceiving you into thinking you've made a breakthrough.

    • usefulposter6 hours ago
      Fascinating. Searching https://hn.algolia.com for "zenodo" and "academia.edu" (past year) reveals hundreds of similar "breakthroughs".

      The commons (open access repositories, HN, Reddit, ...) is being swamped.

      • stuartjohnson125 hours ago
        Since OpenAI patched the LLM spiritual awakening attractor state, physics and computer science are what sycophantic AI is pushing people towards now. My theory is that those things tend to be especially optimised for deceit because they involve modelling, and many people confuse a model as the expression of a concept with a model in the colloquial sense of "the way the universe works".
        • cap112355 hours ago
          I'd love to see a new cult form around UML. Unified Modeling Language already sounds LLMy.
          • daikikadowaki5 hours ago
            'The Church of UML' does have a certain ring to it. But that’s exactly the trap I’m trying to avoid.

            The reason JTP focuses on the 'Ghost'—the traces of what the model rejected or what was lost in translation—is to prevent exactly that kind of cult-like devotion to the output. A cult forms when you forget the model is just a map.

            I’m not interested in worshipping the map; I’m interested in ensuring that when the machine draws it, we can still see the ink on our own hands. If we can't see the delegation, we can't see the deceit. That’s the 'sovereignty' part of Cognitive Sovereignty.

        • daikikadowaki5 hours ago
          [flagged]
      • amarcheschi5 hours ago
        it's all ai hallucination, in a subreddit i once found a tailor asking how to contact some professors because they found a breakthrough discovery on how knowledge is arranged inside neural networks (whatever that means)
      • daikikadowaki5 hours ago
        [flagged]
    • daikikadowaki6 hours ago
      [flagged]
      • stuartjohnson126 hours ago
        In the essay I linked, there are some instructions you can follow to test out the idea under "step 1". It's really important to follow them exactly and not to use the same ChatGPT instance as you're talking to about this idea so we can test with an independent party what is going on. I'd be curious what the output is.
        • daikikadowaki5 hours ago
          I took the challenge. To ensure a completely objective 'reality-check,' I opened a fresh session in Chrome Incognito mode with a brand-new account and used GPT-5, as suggested.

          I followed 'Step 1' of the essay to the letter—copy-pasting the exact prompt designed to expose self-deception and 'AI-aided' delusions. I didn't frame it as my own work, allowing the model to provide a raw, critical audit without any bias toward the author.

          https://chatgpt.com/share/6963b843-9bbc-8001-a2ea-409a5f6dd6...

          • stuartjohnson123 hours ago
            Awesome - now read it really closely and compare it to the version of reality in your OP. And DON'T paste it or this comment into your normal ChatGPT instance and ask it to respond. Really just think for a moment on your own.

            > The goal: replace vague legal and philosophical notions of “manipulation” with a concrete engineering variable. [...] formally define the metric

            What's the conclusion? Is this a "concrete engineering paper"? Has anything been "formally proved"? From your link:

            > The math is conceptual, not formal.

            > This is serious, careful, and intellectually honest work, but it is not conventional science.

            > The project would be strongest if positioned explicitly as foundational theory + open design pattern, rather than as something awaiting “validation.”

            > it is valid as a design pattern or architectural disclosure, not as experimental systems research

            Be careful before immediately dismissing this as just imprecise language or a translation issue. There's a reason I suggested this to you.

            • daikikadowaki3 hours ago
              You are right. This isn't a scientific paper in the conventional sense. It is a proposal of a framework for the co-evolution of AI and humanity. My intention from the beginning has been to bridge the gap between abstract agency and concrete engineering. I am simply trying to bring this Constitution for human agency into the light, utilizing whatever platforms I can to ensure it is discussed.
              • stuartjohnson122 hours ago
                This is a huge break from the original post you made - take a step back and compare the two. The LLM is tricking you again into thinking that it wasn't trying to make a claim about the world. In the original post, the LLM was causing you to use language like "quantify", "formal proof" and "concrete engineering" to describe what you'd come up with and position it as a mathematical/computational/engineering idea. It wasn't that.

                Now that you got some outside input, it's reframing it for you as an abstract philosophical/legal/moral concept, but the underlying problems are the same. The reason it's talking to you using high level abstract words like "concept" and "proposal" and "framework" now is because the process you just went through - the "step 1" - beat back its potential to frame the idea as a real model of the world. This may feel like just a different way to describe the same idea, but really it's the LLM pulling back from trying to ground the concept in the world at all.

                If you're continuing to talk to the LLM about the idea, it's going to try and convince you that really this was a moral/theory of mind discovery and not a mathematical one all along. You're going to end up convinced of the importance and novelty of this idea in exactly the same way, but this time there are no pesky ideas like rigor or testability that could falsify it.

                If you ask ChatGPT about this comment without this bit I'm writing at the end, it'll tell you that this is fair pushback, but really your work is still important because really you're not trying to write about engineering or philosophy directly, but rather something connecting these two or a new category entirely. It's important you don't fall for this because exaggerating the explanatory power of pattern recognition is how ChatGPT gets you. Patterns and ideas exist everywhere, and you should be able to identify those patterns and ideas, acknowledge them, and then move on. Getting stuck on trying to prove the greatness of a true but simple observation will lead you to the frustration you experienced today.

          • thunfischbrot4 hours ago
            That’s not too bad and mirrored some of the feedback in this thread. Tldr: interesting idea, more worthy of a blog post or a thread in one of your favourite online communities, rather than a paper.
      • durch6 hours ago
        If you have a few minutes I invite you to check what we're doing over at Open Horizon Labs; it's exactly the type of thinking we have around the current state of the world. Apologies, I feel like I'm stalking you in the comments, but what you're saying absolutely resonates with what I've been thinking and what I've been trying to build, and it's refreshing to finally feel that I'm not insane.

        https://github.com/open-horizon-labs/superego is probably the most useful tool we have, but I'm hoping that we can package it and bring it to the people, as it does make all these LLMs orders of magnitude more useful

        • daikikadowaki5 hours ago
          No apologies needed—I'm just glad to find I'm not the only 'insane' person here. It's easy to feel that way when obsessing over these problems, so knowing my ideas resonate with what you're building at superego is a huge relief.

          I’m diving into your repo now. Please keep me posted on your progress or any new thoughts—I'd love to hear them.

  • durch6 hours ago
    This is exciting, I hope you manage to get traction for the idea!

    I currently rely on a sort of supervisor LLM to check and detect if we're drifting, or overcomplicating or similar (https://github.com/open-horizon-labs/superego).

    While I still have to figure out who watches the watchers, they are pretty reliable given the constrained mandate they have, and the base model actually (usually) pays attention to the feedback.

    • daikikadowaki6 hours ago
      Thank you so much for the encouraging words and for sharing your project. I’ve just explored superego, and I’m genuinely impressed by how you’ve implemented a pragmatic 'Supervisor' layer to handle model drift.

      Your question—'who watches the watchers'—is the exact focal point of the JTP framework. In many current systems, the feedback loop between the Supervisor and the Base model is 'silent' and internal. My concern is that even when the Supervisor works perfectly, the human user remains in the dark about where the system corrected itself.

      Instead of the Supervisor's feedback being a background process, it could be surfaced to the user as a 'trace' or a 'seam'—allowing the user to actually perceive the internal deliberation. This turns the human from a passive recipient into the final, informed 'watcher.'

      I’d be honored to discuss how these JTP principles might serve as a transparency layer for your work. I’ll be keeping a close eye on your repository!
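
      A minimal sketch of surfacing a supervisor's correction as a visible "seam", assuming the supervisor can be modelled as a function from text to text. `TracedResult`, `run_with_trace`, and `toy_supervisor` are hypothetical names for illustration and do not reflect how superego is actually implemented.

```python
import difflib
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class TracedResult:
    draft: str       # what the base model produced
    final: str       # what the user is shown
    seam: List[str]  # unified diff between draft and final

def run_with_trace(draft: str, supervisor: Callable[[str], str]) -> TracedResult:
    """Apply the supervisor and keep a diff of what it changed."""
    final = supervisor(draft)
    seam = list(difflib.unified_diff(
        draft.splitlines(), final.splitlines(),
        fromfile="draft", tofile="final", lineterm="",
    ))
    return TracedResult(draft=draft, final=final, seam=seam)

def toy_supervisor(text: str) -> str:
    """Hypothetical stand-in: drops one line it considers overcomplicated."""
    kept = [ln for ln in text.splitlines() if "TODO: rewrite everything" not in ln]
    return "\n".join(kept)

result = run_with_trace(
    "Fix the bug.\nTODO: rewrite everything\nAdd a test.", toy_supervisor
)
print(result.final)
print("\n".join(result.seam))  # the part a user would otherwise never see
```

      An empty `seam` means the supervisor changed nothing; a non-empty one is the trace that could be shown next to the final output.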

      • durch5 hours ago
        Thank you! I really hope we can make some headway here :)
        • daikikadowaki5 hours ago
          Thanks! I'm glad you feel the same. Unfortunately, the thread was just flagged, so I've messaged the mods to appeal it. I hope it gets restored so we can continue the debate. Let’s see what happens!
  • frizlab6 hours ago
    > the risk of being rejected entirely

    I would have phrased it as the hope of being rejected entirely, but to each his own I guess.

    • daikikadowaki6 hours ago
      'Hope' might be a more honest word in an era of infinite noise.

      If my logic is just another hallucination, then I agree—it deserves to be rejected entirely. I have no interest in contributing to the 'AI-generated debris' either.

      But that’s exactly why I’m here. I’m betting that the 'State Discrepancy' metric and the JTP hold up under actual scrutiny. If you find they don't, then by all means, fulfill your 'hope' and tear the paper down. I'd rather be rejected for a flawed idea than ignored for a fake one.

  • satisfice4 hours ago
    “perceptibility of judgement” is not rigorously defined in these papers, as far as I can tell.

    The proposed JTP principle is suspended in midair, too. I can’t identify its ethical basis. Whatever perceptible judgement is supposed to mean, why should it always be transparent? Mechanical systems, such as a physical slot that mounts a sliding door, automatically cause alignment of the force that you use to open that sliding door. Is that “judgement” of the slot perceptible as it corrects my slightly misaligned push? Do I care? No.

    I would say that any tool we use responsibly requires that we have a reliable and rich model of the tool in our minds. If we do not have that then we cannot plan and predict what the tool will do. It has nothing to do with “judgements” that the tool makes. Tools don’t make judgements. Tools exhibit behavior.

    • daikikadowaki3 hours ago
      Thank you for the critique, but your 'sliding door' analogy misses the fundamental distinction between a passive physical constraint and an active agentic intervention.

      1. The Illusion of Self-Efficacy:

      A physical slot in a sliding door provides immediate haptic feedback. No user opens a sliding door and mistakenly believes they have developed the superhuman ability to move objects in perfectly straight lines regardless of their own force alignment. However, when an AI silently 'polishes' a user’s output, the user often experiences the result as their own. This leads to Agency Misattribution: the user internalizes the system’s 'mercy' as personal mastery.

      2. Behavior vs. Invisible Judgment:

      You argue that tools don't make judgments, only exhibit behavior. But in agentic systems, when a system’s 'behavior' involves a probabilistic choice that overrides or refines a user’s raw intent without their perception, it functions as a delegated judgment. If the user cannot perceive this boundary, they 'forget' that a decision was made by the machine, which is a direct violation of Cognitive Sovereignty.

      3. The Necessity of a 'Seam':

      For a tool to be used responsibly, the model in our minds must match the reality of the tool's intervention. Unlike a transparent physical law like a door slot, an AI’s silent correction is an Ontological Deception. It whispers 'You did this' when you did not. JTP asserts that the 'seams' of these interventions must be perceptible—not because we need to see the gears turn, but because we must remain the sovereign authors of our own actions.

      It is currently 2:00 AM in Japan. I'm not even sure if this discussion will survive the "flagged" status to reach you, but I need to get some sleep. I look forward to continuing this discussion—if the thread is still here—when I am back online.

  • QuadmasterXLII7 hours ago
    Hi, I think I saw you on slate star codex the other day!
    • daikikadowaki6 hours ago
      Wow, good catch! I was just lurking in the shadows of that open thread. I didn't think anyone was actually reading my comments there.

      If you've been following my train of thought since then, this white paper is basically my attempt to formalize those chaotic ideas into a concrete metric. I’d love to know if you think this 'State Discrepancy' approach actually holds water compared to the usual high-level AI ethics talk.

  • daikikadowaki4 hours ago
    Enough with the 'AI-detective' meta-commentary. Whether the prose was polished by a tool or not is irrelevant to the validity of the logic. If you're as 'careful' a thinker as you claim, stop obsessing over the window and start addressing the actual architecture of JTP.

    Don't hide behind 'artifacts' and 'patterns' to avoid the debate. Critique the framework itself, or admit you have no substantive counter-argument. I’m here for a technical discussion, not a literary critique.

    • daikikadowaki4 hours ago
      Ironically, by obsessing over whether a text is 'AI-generated' or not, you've stopped thinking for yourself. You're so focused on being an 'AI detector' that you've lost the ability to engage with a new idea. Who’s really lost their 'subjective thinking' here? Me, who used a tool to express an original framework, or you, who can't see the logic because you're trapped in a pattern-matching loop?

      Stop being a human filter and start being a human thinker. I'm waiting for your technical critique.

  • daikikadowaki7 hours ago
    Hi HN, I recently submitted a white paper on State Discrepancy (D) to the EU AI Office (CNECT-AIOFFICE). This paper, "The Judgment Transparency Principle (JTP)," is my attempt to provide a mathematical foundation for the right to human autonomy in the age of black-box AI.

    Philosophy: Protecting the Future While Enabling Speed

    • Neutral Stance: I side with neither corporations nor regulators. I advocate for the healthy coexistence of technology and humanity.

    • Preventing Rupture: History shows that perceiving new tech as a “controllable threat” often triggers violent Luddite movements. If AI continues to erode human agency in a black box, society may eventually reject it entirely. This framework is meant to prevent that rupture.

    Logic of Speed: Brakes Are for Racing

    • A Formula 1 car reaches top speed because it has world-class brakes. Similarly, AI progress requires precise boundaries between “assistance” and “manipulation.”

    • State Discrepancy (D) provides a math-based Safe Harbor, letting developers push UX innovation confidently while building system integrity by design.

    The Call for Collective Intelligence: Why I Need Your Strength

    I have defined the formal logic of Algorithm V1. However, providing this theoretical foundation is where my current role concludes. The true battle lies in its realization. Translating this framework into high-dimensional, real-world systems is a monumental challenge, one that necessitates the specialized brilliance of the global engineering community.

    I am not stepping back out of uncertainty, but to open the floor. I have proposed V1 as a catalyst, but I am well aware that a single mind cannot anticipate every edge case of such a critical infrastructure. Now, I am calling for your expertise to stress-test it, tear it apart, and refine it right here.

    I want this thread to be the starting point for a living standard. If you see a flaw, point it out. If you see a better path, propose it. The practical brilliance that can translate this "what" into a robust, scalable "how" is essential to this mission. Whether it be refining the logic or engineering the reality, your strength is necessary to build a better future for AI. Let’s use this space to iterate on V1 until we build something that truly safeguards our collective future.

    Anticipating Pushback:

    • “Too complex?” If AI is safe, why hide its correction delta?

    • “Bad for UX?” A non-manipulative UX only benefits from exposing user intent. Calling it “too complex” admits a lack of control; calling it “bad for UX” admits reliance on hiding human-machine boundaries.

    If this framework serves as a mere stepping stone for you to create something superior—an algorithm that surpasses my own—it would be my greatest fulfillment. Beyond this point, the path necessitates the contribution of all of you.

    Let us define the path together.

    • daikikadowaki7 hours ago
      For example, a critical engineering challenge lies in the high-dimensional mapping of 'Logical State'.

      While Algorithm 1 defines the logic, implementing CalculateDistance() for a modern LLM involves normalizing vectors from a massive latent space in real-time. Doing this without adding significant latency to the inference loop is a non-trivial optimization problem.

      I invite ideas on how to architect this 'Observer' layer efficiently.
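
      A minimal sketch of the normalize-then-compare step described above, assuming the 'Logical State' can be represented as a fixed-length vector of floats. The paper's `CalculateDistance()` is not reproduced in this thread, so `calculate_distance` below is a hypothetical stand-in that uses cosine distance; real latent vectors would be far larger and would likely need a vectorized library.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]

def calculate_distance(intent_vec, output_vec):
    """Cosine distance in [0, 2] between two state vectors."""
    if len(intent_vec) != len(output_vec):
        raise ValueError("vectors must have the same dimension")
    a = l2_normalize(intent_vec)
    b = l2_normalize(output_vec)
    cosine = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return 1.0 - cosine

same_direction = calculate_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = calculate_distance([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

      Because both vectors are normalized first, only direction matters, so scaling a vector does not change the result.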

    • kingkongjaffa6 hours ago
      > If AI continues to erode human agency in a black box

      What do you mean by this?

      Is there evidence this has happened?

      > I advocate for the healthy coexistence of technology and humanity.

      This means whatever you want it to mean at any given time, I don't understand this point without further elaboration.

      • daikikadowaki6 hours ago
        Thanks for the direct push. Let me ground those statements in the framework of the paper:

        1. On "eroding human agency in a black box":

        I am referring to "Agency Misattribution". When Generative AI transitions from a passive tool to an active agent, it silently corrects and optimizes human input without explicit consent. The evidence is observable in the psychological shift where users internalize system-mediated successes as personal mastery. For example, when an LLM silently polishes a draft, the writer claims authorship over nuances they did not actually conceive.

        2. On "healthy coexistence":

        In this paper, this is defined as "Seamful Agency". It is a state where the human can quantify the "D" (Discrepancy) between their raw intent and the system's output. Coexistence is "healthy" only when the locus of judgment remains visible at the moment of intervention.

        For a more rigorous definition of JTP and the underlying problem of "silent delegation," I highly recommend reading Chapter 1 of the white paper.

        Does this technical framing of "agency as a measurable gap" make more sense to you?
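
        A minimal sketch of a text-level "D" for the draft-polishing example in point 1, assuming word-level edit similarity is an acceptable proxy for discrepancy. This is not the paper's definition of D; `text_discrepancy` and the sample sentences are illustrative assumptions.

```python
import difflib

def text_discrepancy(raw: str, output: str) -> float:
    """Return 1 - similarity ratio over word tokens; 0.0 means identical."""
    matcher = difflib.SequenceMatcher(
        None, raw.split(), output.split(), autojunk=False
    )
    return 1.0 - matcher.ratio()

raw = "i think the plan are good"
polished = "I believe the plan is sound"

d = text_discrepancy(raw, polished)
print(f"D = {d:.2f}")  # share of the text the system changed, roughly
```

        A surface metric like this counts a corrected typo and a changed argument alike, which is one reason a semantic measure may be needed instead.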