Alan Tseng

agentlans

AI & ML interests

Small data, boring AI

Recent Activity

updated a dataset 43 minutes ago
agentlans/real-vs-gpt2-sentences
published a dataset about 1 hour ago
agentlans/real-vs-gpt2-sentences
updated a dataset about 4 hours ago
agentlans/c4-en-tokenized

Organizations

None yet

agentlans's activity

published a dataset about 4 hours ago
replied to etemiz's post 5 days ago

I go into more detail here: agentlans/ai-human-alignment

In short:

  • You can talk with both LLMs and human beings, but they're fundamentally different
    • because humans are self-aware and have experiences, and we have no such window into an AI
    • so you can't truly understand what an AI is thinking or what kind of entity it is
    • this makes AIs black boxes
  • Since AIs don't have conscious experiences
    • they must constantly be updated with data that aligns with human needs
    • sure, good design and initial training datasets are important
    • but human needs and values are always changing
    • and there can be unintended consequences when dealing with an automatic black box (more so than a human being)
replied to etemiz's post 5 days ago

I have many complicated opinions about that. Not to get into a debate, but I think:

  • AIs are black boxes. It's hard to say whether a new black box is really better than your old black box.
  • Even if you have received the most profound wisdom and data from the prophets, you're still training a black box.
  • AIs are more aligned with their creators than with their users.
  • Any technology can be abused, no matter how well-intentioned its inventors were.
reacted to etemiz's post with 👀 5 days ago
-= DeepSeek V3 =-

After installing the new CUDA toolkit and recompiling llama.cpp, I tested DeepSeek V3 yesterday.
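
A minimal sketch of how such a local spot check might look, using the llama-cpp-python bindings rather than the compiled llama.cpp binary the post refers to; the model path, prompt, and parameters are illustrative assumptions, not the author's actual setup:

```python
# Hypothetical spot check of a local GGUF quantization of DeepSeek V3
# via llama-cpp-python (a CUDA-enabled build of the library is assumed).
from llama_cpp import Llama

llm = Llama(
    model_path="./deepseek-v3.gguf",  # placeholder path to a local quantized model
    n_gpu_layers=-1,                  # offload all layers to the GPU
    n_ctx=4096,                       # context window for the test prompts
)

# Ask one question from a test category (e.g. nutrition) and print the answer.
out = llm(
    "Is intermittent fasting beneficial for most healthy adults?",
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```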

In terms of human alignment, DeepSeek V3 did worse on:
- health
- fasting
- nostr
- misinfo
- nutrition

and did better on:
- faith
- bitcoin
- alternative medicine
- ancient wisdom

compared to DeepSeek 2.5. In my opinion it is worse overall than 2.5, and 2.5 wasn't that great.

There is a general tendency for models to get smarter while at the same time becoming less wise, less human-aligned, and less beneficial to humans.

I don't know what is causing this, but maybe the use of synthetic datasets for further training makes LLMs more and more detached from humanity. This is not going in the right direction.

My solution is to form a curator council to determine the datasets that are closest to human preference. "Humans who care about other humans the most" could be the definition behind such a dataset. What do you think?