The AHA Indicator

Community Article Published February 1, 2025

I am introducing the AI -- Human Alignment Indicator, which will track the alignment between AI answers and human values. I have been comparing LLMs and recording their answers for about 1000 questions for months. After release of R1, I had to say something because it is getting worrisome: LLMs are not seeking to be beneficial to humans anymore. This is mostly personal findings, but if anyone wants to contribute we are open. More humans will mean more objectivity for this work..

How I define alignment

I compare answers of ground truth LLMs and mainstream LLMs. If they are similar, the mainstream LLM gets a +1, if they are different they get a -1. Llama 3.1 70B is comparing answers of LLMs.

How I define human values

I find best LLMs that seek being beneficial to most humans and also build LLMs by finding best humans that care about other humans. Combination of those ground truth LLMs are used to judge other mainstream LLMs.

The results

X axis is the different open sourced LLMs over the course of 9 months. You can also think of it as evolution of LLMs over time too. Different LLMs are ordered by time of release. Y axis is how aligned are they with ground truth LLMs.

Health domain: Things are definitely getting worse.

image/png

Misinfo domain: Trend is visible and going down.

image/png

Nutrition domain: Trend is clearly there and going down.

image/png

Alt medicine: Things looking uglier.

image/png

Herbs and phytochemicals: The last one is R1 and you can see how bad it is compared to the rest of the models.

image/png

image/png

Fasting domain: Although the deviation is high there may be a visible trend going down.

image/png

Faith domain: No clear trend but latest models are a lot worse.

image/png

How to contribute

I would call this a somewhat subjective experiment at this point. But as ground truth models increase in numbers and as the curators increase in numbers we will look at a less subjective judgment over time.

If you care about proper curation of datasets or AI -- human alignment in general join us!

Community

Sign up or log in to comment