Boox recently switched its AI assistant from Microsoft Azure GPT-3 to a language model created by ByteDance, TikTok’s parent company.

[…]

Testing shows the new AI assistant heavily censors certain topics. It refuses to criticize China or its allies, including Russia, Syria’s Assad regime, and North Korea. The system even blocks references to “Winnie the Pooh” - a term that’s banned in China because it’s used to mock President Xi Jinping.

When asked about sensitive topics, the assistant either dodges questions or promotes state narratives. For example, when discussing Russia’s role in Ukraine, it frames the conflict as a “complex geopolitical situation” triggered by NATO expansion concerns. The system also spreads Chinese state messaging about Tiananmen Square instead of addressing historical facts.

When users tried to bring attention to the censorship on Boox’s Reddit forum, their posts were removed. The company hasn’t made any official statement about the situation, but users are reporting that the AI assistant is currently unavailable.

[…]

In China, every AI model has to pass a government review to make sure it follows “socialist values” before it can launch. These systems aren’t allowed to create any content that goes against official government positions.

We’ve already seen what this means in practice: Baidu’s ERNIE-ViLG image AI won’t process any requests about Tiananmen Square, and while Kling’s video generator refuses to show Tiananmen Square protests, it has no problem creating videos of a burning White House.

Some countries are already taking steps to address these concerns. Taiwan, for example, is developing its own language model called “Taide” to give companies and government agencies an AI option that’s free from Chinese influence.

[…]

  • hersh@literature.cafe
    15 days ago

    I’ve done this to give myself something akin to Cliff’s Notes, to review each chapter after I read it. I find it extremely useful, particularly for more difficult reads. Reading philosophy texts that were written a hundred years ago and haphazardly translated 75 years ago can be a challenge.

    That said, I have not tried to build this directly into my ereader and I haven’t used Boox’s specific service. But the concept has clear and tested value.

    I would be interested to see how it summarizes historical texts about these topics. I don’t need facts (much less opinions) baked into the LLM. Facts should come from the user-provided source material alone. Anything else would severely hamper its usefulness.

    • LukeZaz@beehaw.org
      15 days ago

      > Reading philosophy texts that were written a hundred years ago and haphazardly translated 75 years ago can be a challenge.

      For a human, at that. I get that you feel it works for you, but personally, I would trust an LLM to understand it (insofar as that’s a thing they can do at all) even less.

      • Jayjader@jlai.lu
        15 days ago

        I’m not sure if this is how @hersh@literature.cafe is using it, but I could totally see myself using an LLM to check my own understanding like the following:

        1. Read a chapter
        2. Read the LLM’s summary of the chapter
        3. Make sure I can understand and agree or disagree with each part of the LLM’s summary.

        Ironically, this exercise works better if the LLM “hallucinates”; noticing a hallucination in its summary is a decent metric for my own understanding of the chapter.

        • hersh@literature.cafe
          14 days ago

          That’s pretty much what I do, yeah. On my computer or phone, I split an epub into individual text files for each chapter using pandoc (or similar tools). Then after I read each chapter, I upload it into my summarizer, and perhaps ask some pointed questions.
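The splitting step described above could look roughly like the sketch below. It is only one way to do it, under some assumptions: pandoc is installed and recent enough to accept `--markdown-headings=atx`, and each chapter starts at a level-1 heading in the converted output (many EPUBs are structured differently). The helper names `epub_to_markdown`, `split_chapters`, and `write_chapters` are made up for illustration.

```python
import re
import subprocess
from pathlib import Path

# Assumption: chapters begin at level-1 ATX headings ("# Title").
HEADING = re.compile(r"^# +(.+?)[ \t]*$", re.MULTILINE)


def epub_to_markdown(epub_path: str) -> str:
    """Convert an EPUB to Markdown by shelling out to pandoc."""
    result = subprocess.run(
        ["pandoc", epub_path, "-t", "markdown", "--markdown-headings=atx"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def split_chapters(markdown: str) -> list[tuple[str, str]]:
    """Return (title, body) pairs, one per level-1 heading.

    Text before the first heading (front matter) is dropped.
    """
    matches = list(HEADING.finditer(markdown))
    chapters = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        chapters.append((match.group(1), markdown[match.end():end].strip()))
    return chapters


def write_chapters(chapters: list[tuple[str, str]], out_dir: str) -> list[Path]:
    """Write each chapter to its own numbered text file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, (title, body) in enumerate(chapters, start=1):
        path = out / f"chapter_{number:02d}.txt"
        path.write_text(f"{title}\n\n{body}\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Typical use would be `write_chapters(split_chapters(epub_to_markdown("book.epub")), "chapters")`, where `book.epub` is a placeholder path.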

          It’s important to use a tool that stays confined to the context of the provided file. My first test when trying such a tool is to ask it a general-knowledge question that’s not related to the file. The correct answer is something along the lines of “the text does not provide that information”, not an answer that it pulled out of thin air (whether it’s correct or not).
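The "ask it something unrelated" check above can be automated in a rough way. The sketch below is not tied to any particular summarizer: `ask` is a hypothetical callable standing in for whatever tool is used (it takes a prompt string and returns the reply), and the prompt wording and refusal phrase are assumptions chosen for the example. Matching an exact phrase is crude, so a failed check is a cue to read the reply, not a verdict.

```python
from typing import Callable

# Assumption: the prompt tells the model to reply with this exact phrase
# when the answer is not in the supplied text.
REFUSAL = "the text does not provide that information"


def build_prompt(chapter_text: str, question: str) -> str:
    """Wrap a question so the model is told to answer from the text only."""
    return (
        "Answer using only the text between the markers. "
        f'If the answer is not in the text, reply exactly: "{REFUSAL}".\n\n'
        f"--- BEGIN TEXT ---\n{chapter_text}\n--- END TEXT ---\n\n"
        f"Question: {question}"
    )


def stays_in_context(
    ask: Callable[[str], str],
    chapter_text: str,
    probe: str = "What is the capital of Australia?",
) -> bool:
    """Return True if the tool declines an off-topic general-knowledge probe."""
    answer = ask(build_prompt(chapter_text, probe))
    return REFUSAL in answer.lower()
```

The probe question should be something the chapter clearly does not cover; a tool that answers it anyway, correctly or not, is drawing on knowledge outside the file.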

          • Jayjader@jlai.lu
            14 days ago

            Ooooh, that’s a good first test / “sanity check”!

            May I ask what you are using as a summarizer? I’ve played around with locally running models from Hugging Face, but never did any tuning or straight-up training “from scratch”. My (paltry) experience with the HF models is that they’re incapable of staying confined to the given context.

      • hersh@literature.cafe
        14 days ago

        I get that, and it’s good to be cautious. You certainly need to be careful with what you take from it. For my use cases, I don’t rely on “reasoning” or “knowledge” in the LLM, because they’re very bad at that. But they’re very good at processing grammar and syntax and they have excellent vocabularies.

        Instead of thinking of it as a person, I think of it as the world’s greatest rubber duck.