Nicolay Rusnachenko

nicolay-r

AI & ML interests

Information RetrievalใƒปMedical Multimodal NLP (๐Ÿ–ผ+๐Ÿ“) Research Fellow @BU_Researchใƒปsoftware developer http://arekit.ioใƒปPhD in NLP

Organizations

None yet

nicolay-r's activity

posted an update 4 days ago
📢 For those interested in extracting information about ✍️ authors from texts, happy to share a personal 📹 on Reading Between the Lines: adapting ChatGPT-related systems 🤖 for Implicit Information Retrieval National

Youtube: https://youtu.be/nXClX7EDYbE

🔑 In this talk, Implicit Information Retrieval (IIR) refers to retrieving information that is expressed indirectly by an ✍️ author / 👨 character / patient / any other entity.

📊 I cover the 1️⃣ pre-processing and 2️⃣ reasoning techniques aimed at enhancing gen AI capabilities in IIR. To showcase the effectiveness of the proposed techniques, we experiment with IIR tasks such as Sentiment Analysis and Emotion Extraction / Cause Prediction.

In the pictures below, I share quick takeaways on the pipeline construction and experiment results 🧪

Related paper cards:
📜 emotion-extraction: https://nicolay-r.github.io/#semeval2024-nicolay
📜 sentiment-analysis: https://nicolay-r.github.io/#ljom2024

Models:
nicolay-r/flan-t5-tsa-thor-base
nicolay-r/flan-t5-emotion-cause-thor-base


📓 PS: I got a hobby of advertising HPMoR ✨
posted an update 11 days ago
📢 Have you ever wondered how Transformers handle long input contexts?
I got a chance to tackle this through the long-document summarization problem, and I'm delighted to share the related survey and a diagram for quick skimming below:

Preprint 📝 https://nicolay-r.github.io/website/data/preprint-AINL_2023_longt5_summarization.pdf
Springer 📝 https://link.springer.com/article/10.1007/s10958-024-07435-z

🎯 The aim of the survey was to develop a long-document summarizer for mass-media news in Vietnamese. 🇻🇳

For quick skimming, I'm sharing a performance overview of various LM-based solutions across several datasets, covering domain-oriented advances for Vietnamese (see attached screenshots).

As a solution, we consider:
☑️ 1. Adapting the existing google/pegasus-cnn_dailymail to summarize a large dataset in order to prepare training data
☑️ 2. Tuning google/long-t5-tglobal-large for generative summarization.

Implementation details:
🌟 https://github.com/nicolay-r/ViLongT5
(Simpler to go with Hugging Face rather than flaxformer, which has by now become a legacy engine)
reacted to m-ric's post with 🔥 18 days ago
> Oasis: First Real-Time Video Game Without a Game Engine! 🎮

DecartAI & Etched just released Oasis - a fully AI-generated video game running at 20 FPS (frames per second). The model takes keyboard inputs and generates everything - physics, rules, graphics - on the fly, without any game engine.

⚡️ What makes this special? Current text-to-video models (Mochi-1, Sora, Kling) generate about 1 frame every 10-20 seconds (that's the kind of device I had to play LoL back in the day, thus my low rankings). Oasis is 200 times faster, making it the first playable AI-generated game.

⚙️ Under the hood, it uses a vision transformer to encode space and a diffusion model to generate frames. The secret sauce is "dynamic noising" - a technique that keeps the video stable between frames.

Key insights:
⚡️ Generates 20 FPS, vs 0.2 FPS for other DIT-based video models
‣ The specialized hardware Sohu developed by Etched can handle 10x more players than an H100

🎮 Features real game mechanics
‣ Movement, jumping, item management
‣ Physics and lighting
‣ Procedurally generated worlds

⚠️ Current limitations
‣ Blurry graphics at a distance
‣ Objects sometimes change appearance
‣ Memory issues in long sessions

Try it yourself, the playable demo is impressive! 👉 https://oasis.decart.ai/welcome
Code 👉 https://github.com/etched-ai/open-oasis
Read it in full 👉 https://oasis-model.github.io/
posted an update 18 days ago
📢 If you aim to process complex dependencies in spreadsheet data with the LLM Chain-of-Thought technique, this update might be valuable for you 💎

The updated 📦 bulk-chain-0.24.1, aimed at iterative processing of CSV/JSONL data with no-string dependencies on third-party LLM frameworks, is out 🎉

📦: https://pypi.org/project/bulk-chain/0.24.1/
🌟: https://github.com/nicolay-r/bulk-chain
📘: https://github.com/nicolay-r/bulk-chain/issues/26

The key feature of bulk-chain is SQLite caching, which saves your time ⏰ and money 💵 by guaranteeing no data loss. This is important when using remote LLM providers such as OpenAI, ReplicateIO, OpenRouter, etc.
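
To illustrate the caching idea in general terms, here is a minimal sketch. It is not bulk-chain's actual code or API: the `infer_with_cache` helper, the `cache` table layout, and the prompt template are all hypothetical, and `llm` stands for any callable that maps a prompt string to a response string. Each response is committed to SQLite as soon as it arrives, so a rerun after an interruption skips rows that were already answered.

```python
import sqlite3
from typing import Callable, Dict, Iterable, Iterator, Tuple


def infer_with_cache(
    rows: Iterable[Dict[str, str]],
    llm: Callable[[str], str],
    conn: sqlite3.Connection,
    prompt_template: str = "Summarize: {text}",  # hypothetical schema
    id_column: str = "id",
) -> Iterator[Tuple[str, str]]:
    """Yield (row_id, response) pairs, calling `llm` only for uncached rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache (row_id TEXT PRIMARY KEY, response TEXT)"
    )
    for row in rows:
        row_id = str(row[id_column])
        hit = conn.execute(
            "SELECT response FROM cache WHERE row_id = ?", (row_id,)
        ).fetchone()
        if hit is not None:
            # Already answered in a previous run: no remote call, no extra cost.
            yield row_id, hit[0]
            continue
        response = llm(prompt_template.format(**row))
        conn.execute(
            "INSERT INTO cache (row_id, response) VALUES (?, ?)", (row_id, response)
        )
        conn.commit()  # persist immediately so an interruption loses nothing
        yield row_id, response
```

With a file-backed connection such as `sqlite3.connect("cache.sqlite")` (the file name is an assumption), restarting the same job would only pay for the rows that were not finished.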

🔧 This release has the following updates:
✅ Improved stability for various header conditions and the related support from SQLite
✅ Manual setup for ID column / assigning the ID
✅ CSV-related setups are now dynamic and refer to the related Python 📦 csv package.

Quick start on Google Colab:
📙: https://colab.research.google.com/github/nicolay-r/bulk-chain/blob/master/bulk_chain_tutorial.ipynb

Below is an example of the three simple steps in pictures:
1. ⬇️ Package installation
2. ✍️ Declaring schema
3. 🚀 Launching inference for your data with Replicate and 🤖 meta-llama/Llama-3.1-405B
posted an update about 1 month ago
📢 Excited to share that our study 📄 "Large Language Models in Targeted Sentiment Analysis for Russian" has recently been published in the 📘 Springer Lobachevskii Journal of Mathematics 🥳✨ ...

📘 https://link.springer.com/article/10.1134/S1995080224603758

In this study we take a broad look at, and run experiments over, various 🤖 LLMs 🤖 scaled from 7B in two different modes: ❄️ zero-shot and 🔥 fine-tuned (Flan-T5 only) using the Three-Hop reasoning technique.
We showcase the importance of:
💚 translating texts into English
💚 applying Chain-of-Thought for Implicit Sentiment Analysis
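
For readers unfamiliar with the Three-Hop idea, here is a rough sketch of how such a prompt chain can be wired: aspect first, then opinion, then polarity, with each hop seeing the previous answer. The function name `three_hop_sentiment` and the prompt wording are my own paraphrased assumptions, not the prompts from the paper or the THOR repository, and `llm` is any callable that maps a prompt to a text answer.

```python
from typing import Callable, List


def three_hop_sentiment(text: str, target: str, llm: Callable[[str], str]) -> List[str]:
    """Run three chained prompts; each hop is conditioned on the previous answer."""
    context = f'Given the sentence "{text}", '
    # Hop 1: which aspect of the target is being discussed?
    aspect = llm(context + f"which specific aspect of {target} is possibly mentioned?")
    # Hop 2: what is the underlying opinion towards that aspect?
    opinion = llm(
        context
        + f"the mentioned aspect of {target} is: {aspect}. "
        + "What is the implicit opinion towards this aspect, and why?"
    )
    # Hop 3: map the inferred opinion to a polarity label.
    polarity = llm(
        context
        + f"the opinion towards {target} is: {opinion}. "
        + f"What is the sentiment polarity towards {target}: positive, negative or neutral?"
    )
    return [aspect, opinion, polarity]
```

The same function works in zero-shot mode with a remote LLM or with a locally fine-tuned model, as long as it is wrapped into the `llm` callable.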

More:
📄 Arxiv: https://arxiv.org/abs/2404.12342
🧑‍💻 Code: https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework
🤗 Models: Large Language Models in Targeted Sentiment Analysis (2404.12342)
🎥 Video @NLPSummit : https://www.youtube.com/watch?v=qawLJsRHzB4

THOR: https://github.com/scofield7419/THOR-ISA
posted an update about 1 month ago
📢 We are giving two extra weeks before switching to the final stage of RuOpinionNE-2024.
⏰ The final stage starts on the 1st of November 2024.
We already have the first baseline submission by 👨‍💻 RefalMachine, which showcases F1 = 0.17 based on the Qwen2 model series.

For those who wish to participate:
📊 Codalab: https://codalab.lisn.upsaclay.fr/competitions/20244
🗒 Task: https://codalab.lisn.upsaclay.fr/competitions/20244#learn_the_details-overview
🔔 Updates: https://t.me/RuOpinionNE2024

🙋 Questions: https://nicolay-r.github.io/
🧪 Past experiments: https://github.com/nicolay-r/RuSentNE-LLM-Benchmark
reacted to fdaudens's post with 🧠🔥 about 1 month ago
The Nobel Prize background for Hopfield and Hinton's work on neural networks is pure gold. It's a masterclass in explaining AI basics.

Key takeaways from the conclusion:
- ML applications are expanding rapidly. We're still figuring out which will stick.
- Ethical discussions are crucial as the tech develops.
- Physics 🤝 AI: A two-way street of innovation.

Some mind-blowing AI applications in physics:
- Discovering the Higgs particle
- Cleaning up gravitational wave data
- Hunting exoplanets
- Predicting molecular structures
- Designing better solar cells

We're just scratching the surface. The interplay between AI and physics is reshaping both fields.

Bonus: The illustrations accompanying the background document are really neat. (Credit: Johan Jarnestad/The Royal Swedish Academy of Sciences)

#AI #MachineLearning #Physics #Ethics #Innovation
reacted to Jaward's post with 👀 about 1 month ago
reacted to fantaxy's post with 😎 about 1 month ago
NSFW Erotic Novel AI Generation
-NSFW Text (Data) Generator for Detecting 'NSFW' Text: Multilingual Experience

The multilingual NSFW text (data) auto-generator is a tool designed to automatically generate and analyze adult content in various languages. This service uses AI-based text generation to produce various types of NSFW content, which can then be used as training data to build effective filtering models. It supports multiple languages, including English, and allows users to input the desired language through the system prompt in the on-screen options to generate content in the specified language. Users can create datasets from the generated data, train machine learning models, and improve the accuracy of text analysis systems. Furthermore, content generation can be customized according to user specifications, allowing for the creation of tailored data. This maximizes the performance of NSFW text detection models.


Web: https://fantaxy-erotica.hf.space
API: https://replicate.com/aitechtree/nsfw-novel-generation

Usage Warnings and Notices: This tool is intended for research and development purposes only, and the generated NSFW content must adhere to appropriate legal and ethical guidelines. Proper monitoring is required to prevent the misuse of inappropriate content, and legal responsibility lies with the user. Users must comply with local laws and regulations when using the data, and the service provider is not liable for any issues arising from the misuse of the data.
reacted to reach-vb's post with 👍 about 1 month ago
On-device AI framework ecosystem is blooming these days:

1. llama.cpp - All things Whisper, LLMs & VLMs - run across Metal, CUDA and other backends (AMD/ NPU etc)
https://github.com/ggerganov/llama.cpp

2. MLC - Deploy LLMs across platforms especially WebGPU (fastest WebGPU LLM implementation out there)
https://github.com/mlc-ai/web-llm

3. MLX - Arguably the fastest general purpose framework (Mac only) - Supports all major Image Generation (Flux, SDXL, etc), Transcription (Whisper), LLMs
https://github.com/ml-explore/mlx-examples

4. Candle - Cross-platform general purpose framework written in Rust - wide coverage across model categories
https://github.com/huggingface/candle

Honorable mentions:

1. Transformers.js - Javascript (WebGPU) implementation built on top of ONNXruntimeweb
https://github.com/xenova/transformers.js

2. Mistral rs - Rust implementation for LLMs & VLMs, built on top of Candle
https://github.com/EricLBuehler/mistral.rs

3. Ratchet - Cross platform, rust based WebGPU framework built for battle-tested deployments
https://github.com/huggingface/ratchet

4. Zml - Cross platform, Zig based ML framework
https://github.com/zml/zml

Looking forward to how the ecosystem would look 1 year from now - Quite bullish on the top 4 atm - but open source ecosystem changes quite a bit! 🤗

Also, which frameworks did I miss?
posted an update about 1 month ago
📢 Two weeks ago I got a chance to share the most recent reasoning 🧠 capabilities of Large Language Models in Sentiment Analysis at NLPSummit-2024.

For those who missed it and still wish to find out about the advances of GenAI in that field, the recording is now available:
https://www.youtube.com/watch?v=qawLJsRHzB4

You will learn:
☑️ how well LLM reasoning can be used in sentiment analysis in a zero-shot learning setting,
☑️ how to improve reasoning by applying step-by-step chains (Chain-of-Thought),
☑️ how to prepare the most advanced model in sentiment analysis using Chain-of-Thought.

Links:
📜 Paper: Large Language Models in Targeted Sentiment Analysis (2404.12342)
⭐ Code: https://github.com/nicolay-r/Reasoning-for-Sentiment-Analysis-Framework
reacted to qnguyen3's post with 🔥 about 1 month ago
reacted to mmhamdy's post with 👀 about 1 month ago
🔗 Evaluating Long Context #1: Long Range Arena (LRA)

Accurately evaluating how well language models handle long contexts is crucial, but it's also quite challenging to do well. In this series of posts, we're going to examine the various benchmarks that were proposed to assess long context understanding, starting with Long Range Arena (LRA)

Introduced in 2020, Long Range Arena (LRA) is one of the earliest benchmarks designed to tackle the challenge of long context evaluation.

📌 Key Features of LRA

1️⃣ Diverse Tasks: The LRA benchmark consists of a suite of tasks designed to evaluate model performance on long sequences ranging from 1,000 to 16,000 tokens. These tasks encompass different data types and modalities: Text, Natural and Synthetic Images, and Mathematical Expressions.

2️⃣ Synthetic and Real-world Tasks: LRA is comprised of both synthetic probing tasks and real-world tasks.

3️⃣ Open-Source and Extensible: Implemented in Python using Jax and Flax, the LRA benchmark code is publicly available, making it easy to extend.

📌 Tasks

1️⃣ Long ListOps

2️⃣ Byte-level Text Classification and Document Retrieval

3️⃣ Image Classification

4️⃣ Pathfinder and Pathfinder-X (Long-range spatial dependency)

👨‍💻 Long Range Arena (LRA) Github Repository: https://github.com/google-research/long-range-arena

📄 Long Range Arena (LRA) paper: Long Range Arena: A Benchmark for Efficient Transformers (2011.04006)
reacted to merve's post with 🔥 about 1 month ago
Meta AI vision has been cooking @facebook
They shipped multiple models and demos for their papers at @ECCV 🤗

Here's a compilation of my top picks:
- Sapiens is a family of foundation models for human-centric depth estimation, segmentation and more, all models have open weights and demos

All models have their demos and even torchscript checkpoints!
A collection of models and demos: facebook/sapiens-66d22047daa6402d565cb2fc
- VFusion3D is a state-of-the-art consistent 3D generation model from images

Model: facebook/vfusion3d
Demo: facebook/VFusion3D

- CoTracker is the state-of-the-art point (pixel) tracking model

Demo: facebook/cotracker
Model: facebook/cotracker
posted an update about 1 month ago
📢 This year I ran a decent amount of experiments on LLM reasoning capabilities in author opinion extraction.
However, they did not go further with:
↗️ annotation of other sources of opinion causes: entities, out-of-context object (None).
📝 evaluation of factual statements that support the extracted sentiment.

To address these limitations, we have launched the 🚀 RuOpinionNE-2024 competition on the Codalab platform:
📊 https://codalab.lisn.upsaclay.fr/competitions/20244

The competition is aimed at extracting opinion tuples (see attached images) from texts written in Russian.
It builds on the findings of the past RuSentNE-2023 Codalab competition:
🔎 Last year's competition: https://www.dialog-21.ru/media/5896/golubevaplusetal118.pdf
🔎 LLM reasoning 🧠: https://arxiv.org/abs/2404.12342

For those interested in adopting Generative AI, the complete information about the competition is below:
📊 RuOpinionNE-2024: https://codalab.lisn.upsaclay.fr/competitions/20244
🗒 Task description: https://codalab.lisn.upsaclay.fr/competitions/20244#learn_the_details-overview
🔔 To follow updates: https://t.me/RuOpinionNE2024
⏰ Stage deadlines (might be extended)
📦 Submission details (bottom of the competition page)

🙋 For questions you can contact @nicolay-r : https://nicolay-r.github.io/
🧪 Most recent findings on LLM application: https://github.com/nicolay-r/RuSentNE-LLM-Benchmark
reacted to Tonic's post with 👀 about 1 month ago
reacted to nroggendorff's post with 👀 about 1 month ago
pretty much all of the values in the llama training post are placeholders, so if you don't get a desirable result, tweak and tweak and tweak. it took months to get smallama to do anything
reacted to clem's post with 👀 about 1 month ago
posted an update about 2 months ago
📢 The fast application of a named entity recognition (NER) model to a vast amount of texts usually faces two major pitfalls:
🔴 Limitation of the input window size
🔴 Drastic slowdown of the downstream pipeline of the whole application

⭐ https://github.com/nicolay-r/bulk-ner

To address these problems, bulk-ner represents a no-string framework that provides a handy wrapper over any dynamically linked NER ML model, offering:
☑️ Native long-input contexts handling.
☑️ Native support of batching (assuming that the ML-model engine has the related support too)
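
As a generic illustration of these two points (not the bulk-ner implementation itself), a long token sequence can be cut into windows that fit the model input size, and the windows can then be grouped into batches. The helper names `split_into_windows` and `batched`, as well as the `overlap` parameter, are hypothetical and chosen only for this sketch.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_windows(tokens: List[str], window: int, overlap: int = 0) -> Iterator[List[str]]:
    """Cut a long token list into windows of at most `window` tokens."""
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("expected window > 0 and 0 <= overlap < window")
    step = window - overlap
    for start in range(0, len(tokens), step):
        yield tokens[start:start + window]
        if start + window >= len(tokens):
            break  # the last window already reached the end of the text


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Group items into consecutive batches of at most `batch_size` elements."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```

A non-zero overlap is one simple way to reduce the chance that an entity is cut in half at a window boundary. Predictions from overlapping regions then need to be merged, which this sketch leaves out.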

For a quick start, I'm sharing a wrapper over DeepPavlov NER models.
With these models you can play around and bulk-process your data here:
📙 https://colab.research.google.com/github/nicolay-r/ner-service/blob/main/NER_annotation_service.ipynb
(You have to have your data in CSV / JSONL format)

Lastly, it is powered by AREkit pipelines, and therefore can be part of relation extraction and complex information retrieval systems:
💻 https://github.com/nicolay-r/AREkit
📄 https://openreview.net/forum?id=nRybAsJMUt