Nicholai

nmitchko

AI & ML interests

Medical, Operations

Organizations

None yet

nmitchko's activity

reacted to singhsidhukuldeep's post with 👍 3 months ago
Researchers have developed a novel approach called Logic-of-Thought (LoT) that significantly enhances the logical reasoning capabilities of large language models (LLMs).

Here is how Logic-of-Thought (LoT) is implemented:

-- 1. Logic Extraction

1. Use Large Language Models (LLMs) to identify sentences containing conditional reasoning relationships from the input context.
2. Generate a collection of sentences with logical relationships.
3. Use LLMs to extract the set of propositional symbols and logical expressions from the collection.
4. Identify propositions with similar meanings and represent them using identical propositional symbols.
5. Analyze the logical relationships between propositions based on their natural language descriptions.
6. Add negation (¬) for propositions that express opposite meanings.
7. Use implication (→) to connect propositional symbols when a conditional relationship exists.

-- 2. Logic Extension

1. Apply logical reasoning laws to the collection of logical expressions from the Logic Extraction phase.
2. Use a Python program to implement logical deduction and expand the expressions.
3. Apply logical laws such as Double Negation, Contraposition, and Transitivity to derive new logical expressions.
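The Logic Extension phase can be illustrated with a minimal sketch. It assumes, purely for illustration, that each extracted expression is a pair of literals `(a, b)` read as a → b, with negation written as a leading `¬`. The names `negate` and `extend` and the sample expressions are hypothetical and are not taken from the LoT paper or its code.

```python
from itertools import product


def negate(literal: str) -> str:
    """Negate a literal. Double Negation is applied here: ¬¬A becomes A."""
    return literal[1:] if literal.startswith("¬") else "¬" + literal


def extend(implications: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Close a set of implications under Contraposition and Transitivity.

    Repeats both laws until no new expression is derived.
    """
    known = set(implications)
    while True:
        derived = set()
        # Contraposition: a → b gives ¬b → ¬a
        for a, b in known:
            derived.add((negate(b), negate(a)))
        # Transitivity: a → b and b → c give a → c
        for (a, b), (c, d) in product(known, repeat=2):
            if b == c and a != d:
                derived.add((a, d))
        if derived <= known:
            return known
        known |= derived


# Hypothetical output of the Logic Extraction phase: A → B and B → ¬C
extracted = {("A", "B"), ("B", "¬C")}
new_expressions = extend(extracted) - extracted
```

For this input, the newly derived expressions are ¬B → ¬A, C → ¬B, A → ¬C, and C → ¬A.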

-- 3. Logic Translation

1. Use LLMs to translate the newly generated logical expressions into natural language descriptions.
2. Combine the natural language descriptions of propositional symbols according to the extended logical expressions.
3. Incorporate the translated logical information as a new part of the original input prompt.
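As a rough illustration of the Logic Translation phase, the sketch below renders implications as sentences with a fixed template. This is a stand-in only: the post says an LLM performs the translation, and the template, the function names `describe` and `translate`, and the sample descriptions are all assumptions made for the example.

```python
def describe(literal: str, descriptions: dict[str, str]) -> str:
    """Return the natural language description of a literal such as 'A' or '¬A'."""
    if literal.startswith("¬"):
        return "it is not the case that " + descriptions[literal[1:]]
    return descriptions[literal]


def translate(implications, descriptions):
    """Render each implication (a, b), read as a → b, as an if/then sentence."""
    return [
        f"If {describe(a, descriptions)}, then {describe(b, descriptions)}."
        for a, b in implications
    ]


# Hypothetical propositional symbols and their descriptions
descriptions = {
    "A": "a person reads books",
    "B": "the person gains knowledge",
}
sentences = translate([("¬B", "¬A")], descriptions)
```

A real pipeline would replace the template with an LLM call so that the output reads naturally in the style of the original context.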

-- 4. Integration with Existing Prompting Methods

1. Combine the LoT-generated logical information with the original prompt.
2. Use this enhanced prompt with existing prompting methods like Chain-of-Thought (CoT), Self-Consistency (SC), or Tree-of-Thoughts (ToT).
3. Feed the augmented prompt to the LLM to generate the final answer.
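The integration step amounts to prompt assembly. A minimal sketch follows, assuming a hypothetical `build_prompt` helper and a common Chain-of-Thought trigger phrase. The section label and layout are illustrative choices, not the wording used in the LoT paper.

```python
def build_prompt(context: str, question: str, logic_sentences: list[str], cot: bool = True) -> str:
    """Combine the original context, the LoT-derived sentences, and the question."""
    parts = [context.strip()]
    if logic_sentences:
        bullet_list = "\n".join(f"- {s}" for s in logic_sentences)
        parts.append("Additional logical information:\n" + bullet_list)
    parts.append(question.strip())
    if cot:
        # Chain-of-Thought trigger; other prompting methods would change this step
        parts.append("Let's think step by step.")
    return "\n\n".join(parts)


prompt = build_prompt(
    context="If a person reads books, the person gains knowledge.",
    question="Tom gained no knowledge. Did Tom read books?",
    logic_sentences=["If a person gains no knowledge, the person does not read books."],
)
```

The resulting string is what would be sent to the LLM. For Self-Consistency, the same prompt would be sampled several times and the answers aggregated by majority vote.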

What do you think about LoT?
New activity in OpenGVLab/InternVL2-8B 5 months ago

How to fine tune?

1
#10 opened 5 months ago by
nmitchko
New activity in CAMB-AI/MARS5-TTS 7 months ago

Slow inferencing

1
#6 opened 7 months ago by
Aloukik21
replied to Undi95's post 8 months ago
@Kearm apologies, typing from my phone. I'll test later tonight.

zero_index = (unique_vals == 0).nonzero()
zero_count = temp_counts[zero_index]
replied to Undi95's post 8 months ago
@lmg-anon @Kearm, the code snippet above should be fixed now. I had confused what torch.unique returns.

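With return_counts=True, torch.unique returns the unique values and their counts. zero_index in the snippets above is the position of the value 0 among the unique values, so indexing the counts tensor with it gives the number of zeros, which we can compare against the count from a previous iteration.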

And thinking through this code, there's got to be a better way to check if two sets of activation tensors overlap, and pick the ones that overlap the least.

https://stackoverflow.com/a/62407582/1813580
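One possible reading of "overlap" is the cosine similarity between the mean harmful and mean harmless activation vectors at each layer, in which case the least-overlapping layer is the one with the lowest similarity. The sketch below is dependency-free and uses plain lists with toy numbers. The function names and data are hypothetical, and this is not the method from the post or the linked answer. With PyTorch tensors, `torch.nn.functional.cosine_similarity` computes the same quantity.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def least_overlapping_layer(harmful_means, harmless_means):
    """Return the layer whose two mean activation vectors are least aligned."""
    return min(
        harmful_means,
        key=lambda layer: cosine_similarity(harmful_means[layer], harmless_means[layer]),
    )


# Toy mean activations keyed by layer index (made-up numbers)
harmful_means = {14: [1.0, 0.0], 15: [1.0, 0.0]}
harmless_means = {14: [1.0, 0.1], 15: [0.0, 1.0]}
best_layer = least_overlapping_layer(harmful_means, harmless_means)
```

Unlike counting exact zeros in a difference of floating-point activations, a similarity score degrades gracefully when no two entries are exactly equal.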

replied to Undi95's post 8 months ago
@Kearm I was talking about @nmitchko 's code snippet

@Kearm and @lmg-anon, want to try the update? I'm away from my cluster, so I can't test the code at the moment. I changed the original post.

Actually, the code probably still doesn't work.

replied to Undi95's post 8 months ago
Thank you!
I will try it as soon as I have the opportunity. Very interesting.

I just fixed an obvious bug in that snippet, so feel free to respond if something isn't working as expected :)

replied to Undi95's post 8 months ago
Perhaps you're not thinking of the layer selection correctly. Pick the layers that have a different cached PC1 for your positive and negative dataset as featured in the article.

You could just cycle through them when doing the ablation. I didn't get a chance to test this code, but it should work:

import torch

# harmful_cache and harmless_cache are assumed to be the cached activations from the article.
pos = -1
start_layer = 14
end_layer = 40
final_layer = start_layer

## First computation: refusal direction at the starting layer
harmful_mean_act = harmful_cache['resid_pre', start_layer][:, pos, :].mean(dim=0)
harmless_mean_act = harmless_cache['resid_pre', start_layer][:, pos, :].mean(dim=0)
activation_difference = harmful_mean_act - harmless_mean_act
refusal_dir = activation_difference / activation_difference.norm()
counts = 0

for layer in range(start_layer + 1, end_layer):
    temp_harmful_mean_act = harmful_cache['resid_pre', layer][:, pos, :].mean(dim=0)
    temp_harmless_mean_act = harmless_cache['resid_pre', layer][:, pos, :].mean(dim=0)
    temp_activation_difference = temp_harmful_mean_act - temp_harmless_mean_act

    # Count the entries where the current and candidate differences are exactly equal
    unique_vals, temp_counts = torch.unique(activation_difference - temp_activation_difference, return_counts=True)
    zero_index = (unique_vals == 0).nonzero(as_tuple=True)[0]
    zero_count = int(temp_counts[zero_index].sum())  # 0 when no entry is exactly zero
#    if counts is None:
#        refusal_dir = temp_activation_difference / temp_activation_difference.norm()
#        activation_difference = temp_activation_difference
#        counts = zero_count
#        final_layer = layer
    if zero_count > counts:
        refusal_dir = temp_activation_difference / temp_activation_difference.norm()
        activation_difference = temp_activation_difference
        counts = zero_count
        final_layer = layer

replied to Undi95's post 8 months ago
Un-censoring a model is a first pass at a broader goal: knowledge programming in these AI models. We should shortly be able to provide simple inference-level ways to map a model's internal layers into a "thought space" and fine-tune or adjust arbitrary information in and out. This has immense applications for industry with non-public data and limited training resources, and I salute this work.