Apertus | Swiss AI

Switzerland has developed an open set of models, Apertus | Swiss AI, trained on a documented training set and “developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act.”

EPFL, ETH Zurich, and the Swiss National Supercomputing Centre (CSCS) have released Apertus, Switzerland’s first large-scale open, multilingual language model — a milestone in generative AI for transparency and diversity. Trained on 15 trillion tokens across more than 1,000 languages – 40% of the data is non-English – Apertus includes many languages that have so far been underrepresented in LLMs, such as Swiss German, Romansh, and many others. Apertus serves as a building block for developers and organizations for future applications such as chatbots, translation systems, or educational tools.
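
Since the model, data documentation, and code are open, building on Apertus should look like working with any open-weights model. Below is a minimal sketch in Python using the Hugging Face transformers library; the repository id is my assumption rather than something stated in the announcement, so check the swiss-ai organization on Hugging Face for the actual model names and licence terms.

```python
# Minimal sketch: loading an open model such as Apertus with Hugging Face transformers.
# The repository id below is an assumption; verify the actual id and licence before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swiss-ai/Apertus-8B-Instruct"  # hypothetical id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A multilingual prompt, since 40% of the training data is non-English.
prompt = "Übersetze ins Rätoromanische: Good morning, how are you?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the weights and training documentation are published, the same model could in principle be audited, fine-tuned, or hosted locally rather than accessed through a closed API.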

This project should interest us in Canada as we are talking Sovereign AI. Should Canada develop its own open models? What advantages would that provide? Here are some I can think of:

  • It could provide an open and well-maintained set of LLMs that researchers and companies could build on or use without fear that access might be changed or withdrawn, or that data about their usage might be logged.
  • It could be designed to be privacy-protecting and to encourage adherence to relevant and changing Canadian laws and best practices.
  • It could be trained on an open and well-documented bilingual dataset that would reflect Canadian history, culture, and values.
  • It could be iteratively retrained as issues like bias are shown to be tied to parts of the training data. It could also be retrained for new capabilities as needed by Canadians.
  • It could include ethically accessed Indigenous training sets developed in consultation with Indigenous communities. Further, it could be made available to Indigenous scholars and communities with support for the development of culturally appropriate AI tools.
  • We could archive the code, data, weights, and documentation in such a way that Canadians could check, test, and reproduce the work.

I wonder if we could partner with Switzerland to build on their model, or with other countries with similar values, to produce a joint model.

Personal Superintelligence

Explore Meta’s vision of personal superintelligence, where AI empowers individuals to achieve their goals, create, connect, and lead fulfilling lives. Insights from Mark Zuckerberg on the future of AI and human empowerment.

Mark Zuckerberg has just posted his vision of superintelligence: Personal Superintelligence. He starts by reiterating what a lot of people are saying, namely that AGI (Artificial General Intelligence) or superintelligence is coming soon,

Over the last few months we have begun to see glimpses of our AI systems improving themselves. The improvement is slow for now, but undeniable. Developing superintelligence is now in sight.

He distinguishes what Meta is going to do with superintelligence from “others in the industry who believe superintelligence should be directed centrally towards automating all valuable work, …”. The “others” here is a poke at OpenAI who, in their Charter, define AGI as “highly autonomous systems that outperform humans at most economically valuable work …” He juxtaposes OpenAI, which would automate work (for companies and governments), with Meta, which will put superintelligence in our personal hands for creative and communicative play.

Along the way, Zuckerberg hints that future models may no longer be open, a change in policy. Until now Meta has released open models rather than charging for access. Zuckerberg now worries that “superintelligence will raise novel safety concerns.” For this reason they will need to “be rigorous about mitigating these risks and careful about what we choose to open source.”

Why don’t I trust either Meta or OpenAI?

Brandolini’s law

In a 3QuarksDaily post about Bullshit and Cons: Alberto Brandolini and Mark Twain Issue a Warning About Trump, I came across Brandolini’s law of Refutation, which states:

The amount of energy needed to refute bullshit is an order of magnitude bigger than to produce it.

This law or principle goes a long way to explaining why bullshit, conspiracy theories, and disinformation are so hard to refute. The very act of refutation becomes suspect as if you are protesting too much. The refuter is made to look like the person with an agenda that we should be skeptical of.

The corollary is that it is less work to lie about someone before they have accused you of lying than to try to refute the accusation. Better to accuse the media of purveying fake news early than to wait until they publish news about you.

As for AI hallucinations, which I believe should be called AI bullshit, we can imagine Rockwell’s corollary:

The amount of energy needed to correct for AI hallucinations in a prompted essay is an order of magnitude bigger than the work of just writing it yourself.

News Media Publishers Run Coordinated Ad Campaign Urging Washington to Protect Content From Big Tech and AI

Today, hundreds of news publishers launched the “Support Responsible AI” ad campaign, which calls on Washington to make Big Tech pay for the content it takes to run its AI products.

I came across one of these ads about AI Theft from the News Media Alliance and followed it to this site about how News Media Publishers Run Coordinated Ad Campaign Urging Washington to Protect Content From Big Tech and AI. They have three asks:

  • Require Big Tech and AI companies to fairly compensate content creators.
  • Mandate transparency, sourcing, and attribution in AI-generated content.
  • Prevent monopolies from engaging in coercive and anti-competitive practices.

Gary Marcus has a Substack column on Sam Altman’s attitude problem that talks about Altman’s lack of a response when confronted with an example of what seems like IP theft. I think the positions are hardening as groups begin to use charged language like “theft” for what AI companies are doing.

Responsible AI Lecture in Delhi

A couple of days ago I gave an Institute Lecture on What is Responsible About Responsible AI at the Indian Institute of Technology Delhi, India. In it I looked at how AI ethics governance is discussed in Canada under the rubric of Responsible AI and AI Safety. I talked about the emergence of AI Safety Institutes like CAISI (the Canadian AI Safety Institute). Just when it seemed that “safety” was the emergent international approach to ethics governance, Vice President JD Vance’s speech at the Paris Summit made it clear that the Trump administration is not interested,

The AI future is not going to be won by hand-wringing about safety. (Vance)

Trump eliminates Biden AI policies

Trump has signed an Executive Order “eliminating harmful Biden Administration AI policies and enhancing America’s global AI dominance.” (Fact Sheet) In it he calls Biden’s order(s) dangerous and onerous, using the usual stifling-innovation argument:

The Biden AI Executive Order established unnecessarily burdensome requirements for companies developing and deploying AI that would stifle private sector innovation and threaten American technological leadership.

There are, however, other components to the rhetoric:

  • It “established the commitment … to sustain and enhance America’s dominance to promote human flourishing, economic competitiveness, and national security.” The human flourishing seems to be an afterthought.
  • It directs the creation of an “AI Action Plan” within 180 days to sustain dominance. Nothing is mentioned about flourishing with regard to the plan. Presumably dominance is flourishing. This plan and the review of policies will presumably be where we see the details of implementation. It sounds like the Trump administration may keep some of the infrastructure and policies. Will they, for example, keep the AI Safety Institute in NIST?
  • There is an interesting historical section reflecting on the activities of the first Trump administration, noting that “President Trump also took executive action in 2020 to establish the first-ever guidance for Federal agency adoption of AI to more effectively deliver services to the American people and foster public trust in this critical technology.” Note the use of the word “trust”. I wonder if they will return to trustworthy AI language.
  • There is language about how “development of AI systems must be free from ideological bias or engineered social agendas.” My guess is that the target is AIs that don’t have “woke” guardrails.

It will be interesting to track what parts of the Biden orders are eliminated and what parts are kept.


Humanity’s Last Exam

Researchers with the Center for AI Safety and Scale AI are gathering submissions for Humanity’s Last Exam. The submission form is here. The idea is to develop an exam with questions from a breadth of academic specializations that current LLMs can’t answer.

While current LLMs achieve very low accuracy on Humanity’s Last Exam, recent history shows benchmarks are quickly saturated — with models dramatically progressing from near-zero to near-perfect performance in a short timeframe. Given the rapid pace of AI development, it is plausible that models could exceed 50% accuracy on HLE by the end of 2025. High accuracy on HLE would demonstrate expert-level performance on closed-ended, verifiable questions and cutting-edge scientific knowledge, but it would not alone suggest autonomous research capabilities or “artificial general intelligence.” HLE tests structured academic problems rather than open-ended research or creative problem-solving abilities, making it a focused measure of technical knowledge and reasoning. HLE may be the last academic exam we need to give to models, but it is far from the last benchmark for AI.

One wonders if it really will be the last exam. Perhaps we will get more complex exams that test for integrated skills. Andrej Karpathy criticises the exam on X. I agree that what we need are AIs able to do intern-level complex tasks rather than just answering questions.
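
For concreteness, scoring a closed-ended exam like this largely amounts to exact-match accuracy over verifiable answers. Here is a minimal sketch; the sample items and the ask_model callable are hypothetical placeholders, not HLE’s actual question set or evaluation harness.

```python
# Minimal sketch of exact-match scoring for a closed-ended exam in the style of HLE.
# The sample items and ask_model() are hypothetical placeholders.
from typing import Callable, Dict, List

sample_items: List[Dict[str, str]] = [
    {"question": "What is 17 * 23?", "answer": "391"},
    {"question": "What is the chemical symbol for gold?", "answer": "Au"},
]

def exact_match_accuracy(items: List[Dict[str, str]],
                         ask_model: Callable[[str], str]) -> float:
    """Fraction of items where the model's answer matches the reference exactly."""
    correct = sum(
        1 for item in items
        if ask_model(item["question"]).strip().lower() == item["answer"].strip().lower()
    )
    return correct / len(items)

# Usage: pass any function that sends a question to the model under test,
# e.g. exact_match_accuracy(sample_items, my_llm_call)
```

The quoted prediction that models “could exceed 50% accuracy on HLE by the end of 2025” refers to exactly this kind of measure over questions models currently get wrong.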

Do we really know how to build AGI?

Sam Altman, in a blog post titled Reflections, looks back at what OpenAI has done and then claims that they know how to build AGI,

We are now confident we know how to build AGI as we have traditionally understood it. We believe that, in 2025, we may see the first AI agents “join the workforce” and materially change the output of companies. We continue to believe that iteratively putting great tools in the hands of people leads to great, broadly-distributed outcomes.

It is worth noting that the definition of AGI (Artificial General Intelligence) is sufficiently vague that meeting this target could become a matter of semantics. Nonetheless, here are some definitions of AGI from OpenAI or others about OpenAI:

  • “OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity.” – Note the “economically valuable work”. I wonder if philosophizing or making art is valuable? Is intelligence being limited here to economics?
  • “AI systems that are generally smarter than humans” – This is somewhat circular, as it brings us back to defining “smartness”, another word for “intelligence”.
  • “any system that can outperform humans at most tasks” – This could be tied to the quote above and the idea of AI agents that can work for companies outperforming humans. It seems to me we are nowhere near this if you include physical tasks.
  • “an AI system that can generate at least $100 billion in profits” – This is the definition used by OpenAI and Microsoft to help identify when OpenAI no longer has to share technology with Microsoft.

Can A.I. Be Blamed for a Teen’s Suicide?

The New York Times has a story about a youth who committed suicide after extended interactions with a character on Character.ai. The story, Can A.I. Be Blamed for a Teen’s Suicide?, describes how Sewell Setzer III had long discussions with a character called Daenerys Targaryen from the Game of Thrones series. He became isolated and grew attached to Daenerys. He eventually shot himself, and now his mother is suing Character.ai.

Here is an example of what he wrote in his journal,

I like staying in my room so much because I start to detach from this ‘reality,’ and I also feel more at peace, more connected with Dany and much more in love with her, and just happier.

The suit claims that Character.ai’s product was untested, dangerous and defective. It remains to be seen if these types of suits will succeed. In the meantime we need to be careful with these social AIs.

The 18th Annual Hurtig Lecture 2024: Canada’s Role in Shaping our AI Future

The video for the 2024 Hurtig Lecture is up. The speaker was Dr. Elissa Strome, Executive Director of the Pan-Canadian AI Strategy. She gave an excellent overview of the AI Strategy here in Canada and ended by discussing some of the challenges.

The Hurtig Lecture was organized by my colleague Dr. Yasmeen Abu-Laban. I got to moderate the panel discussion and Q & A after the lecture.