---
license: mit
language:
- en
tags:
- audio2face
- Transformers
- seq2seq
- UnrealEngine
- LiveLink
- FeatureMapping
- PyTorch
- AudioToFace
---
# NeuroSync Open Source Audio2Face Blendshape Transformer Model
## Info Sheet
- [Download the info sheet here](https://drive.google.com/file/d/1U9pvs_FY1L-cnSkWvnkbSPe0VVa8PZ8b/view?usp=drive_link)
---
## Latest Updates
<!-- Newest items appear at the top -->
### **27/02/2025 – 64 Frame Finetune Added**
An ever so slightly less accurate 64 frame finetune has been added for those who want to set up streaming.
Make sure you set the frame size in the config if you use this model.pth.
Further research is under way to reduce the frame size further while keeping accuracy and smoothness.
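As a rough illustration of the frame size requirement, here is a minimal sketch; the config key names and `check_window` helper are hypothetical, so check the local API's actual config and loader:

```python
import torch

# Hypothetical config -- key names in the real local API config may differ.
CONFIG = {
    "model_path": "model.pth",  # the 64 frame finetune checkpoint
    "frame_size": 64,           # must match the window the checkpoint was finetuned on
}

def check_window(audio_features: torch.Tensor, config: dict) -> None:
    """Fail fast if the audio feature window doesn't match the checkpoint."""
    frames = audio_features.shape[1]  # expected shape: (batch, frames, feature_dim)
    if frames != config["frame_size"]:
        raise ValueError(
            f"Got {frames} frames but this checkpoint expects "
            f"{config['frame_size']}; set the frame size in the config to match."
        )
```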
### **24/02/2025 – Model Update**
75 more epochs of training on the open source dataset have been added - this is about as far as a 90 minute dataset captured on an iPhone can usefully be trained.
To increase fidelity, increase the variance of actors, sex and pitch in the data (the open source dataset of 1 man and 1 woman is a good starting point).
Other options include creating a higher dimension face dataset and amending the trainer to suit; the model itself stays the same - we use ARKit and LiveLink for their usability.
*Wink*
### **23/02/2025 – Model Update**
Half-precision inference has been added to the local API; you can disable it in the config to run at full precision.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64ad37822a530cbdee7ce10b/ZxO-InQZXsmAxRzDvft0z.png)
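A minimal sketch of what that toggle amounts to in PyTorch; the `USE_HALF_PRECISION` flag name is hypothetical, so check the local API's config for the real one:

```python
import torch

USE_HALF_PRECISION = True  # hypothetical flag; set to False for full precision

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def infer(model: torch.nn.Module, audio_features: torch.Tensor) -> torch.Tensor:
    """Run blendshape inference, optionally under float16 autocast."""
    model = model.to(device).eval()
    with torch.no_grad():
        if USE_HALF_PRECISION and device.type == "cuda":
            with torch.autocast(device_type="cuda", dtype=torch.float16):
                return model(audio_features.to(device))
        return model(audio_features.to(device))
```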
### **21/02/2025 – Scaling UP! | New 228M Parameter Model + Config Added**
A milestone has been hit: previous research has got us to a point where scaling the model up is now possible, with much faster training and better quality overall.
Going from 4 layers and 4 heads to 8 layers and 16 heads means updating your code as well as the model, so please make sure you have the latest versions of the API and Player; the new model requires some architectural changes.
Enjoy!
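For orientation, the shape of that change expressed as hypothetical config dicts (the released config may use different key names):

```python
# Hypothetical key names -- compare with the released config before relying on them.
OLD_CONFIG = {"num_layers": 4, "num_heads": 4}   # previous model
NEW_CONFIG = {"num_layers": 8, "num_heads": 16}  # new 228M parameter model
```

Because the layer and head counts are baked into the checkpoint's state dict, a model.pth from one shape will not load into the other, which is why the API, Player, and model have to be updated together.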
### **19/02/2025 Trainer updates**
- **Trainer**: Use [NeuroSync Trainer Lite](https://github.com/AnimaVR/NeuroSync_Trainer_Lite) for training and fine-tuning.
- **Simplified loss**: Removed the second order smoothness loss (the code is left in if you want to research the differences; mostly it just squeezes the end result, producing choppy animation without smoothing).
- **Mixed precision**: Less memory usage and faster training.
- **Data augmentation**: Interpolate a slow set and a fast set from your data to help with fine detail reproduction. This uses a lot of memory, so take care - generally adding just the fast set is best, as adding the slow set oversaturates the training data with slow, noisy samples (more work to do here, obviously!). A sketch of the idea follows this list.
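A minimal sketch of the interpolation idea, assuming sequences are stored as (frames, coefficients) arrays; the `time_stretch` name is hypothetical, not the Trainer Lite API, and in practice the paired audio features and face frames would be stretched together to stay in sync:

```python
import numpy as np

def time_stretch(frames: np.ndarray, factor: float) -> np.ndarray:
    """Resample a (T, C) sequence to roughly T*factor frames by linear interpolation.

    factor > 1 gives a slowed-down ("slow") copy, factor < 1 a sped-up ("fast") copy.
    """
    t_old = np.linspace(0.0, 1.0, num=frames.shape[0])
    t_new = np.linspace(0.0, 1.0, num=max(2, round(frames.shape[0] * factor)))
    return np.stack(
        [np.interp(t_new, t_old, frames[:, c]) for c in range(frames.shape[1])],
        axis=1,
    )

# Per the note above, augmenting with just the fast copy tends to work best:
# fast_copy = time_stretch(sequence, 0.5)
```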
### **17/02/2025 – Player LLM Streaming + Chunking Updates**
- **Talk to a language model and get audio plus face animation in response.**
- **Player**: [Download the NeuroSync Player](https://github.com/AnimaVR/NeuroSync_Player).
---
### **16/02/2025 – Demo Unreal Project Build**
- **Demo Build**: [Download the demo build](https://drive.google.com/drive/folders/1q-CYauPqyWfvs8NamW4QuC1H1r02RYMQ?usp=sharing) to test NeuroSync with an Unreal Project.
- **Player**: [Download the NeuroSync Player](https://github.com/AnimaVR/NeuroSync_Player).
---
### **15/02/2025 – RoPE & Global/Local Positional Encoding Update**
- **Update**: Improved model performance.
- **Action**: Update your code and model accordingly. Bools are available for research (easily turn either encoding on or off when training to see the differences; just make sure the settings match those in the local API's model.py that you run the trained model with). A sketch follows below.
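A minimal sketch of what such research bools might look like; the flag names are hypothetical, so match them to whatever model.py actually defines:

```python
# Hypothetical flag names -- whatever you train with must match model.py in the local API.
USE_ROPE = True               # rotary positional embeddings applied to attention queries/keys
USE_GLOBAL_POSITIONAL = True  # absolute positional encoding over the whole sequence
USE_LOCAL_POSITIONAL = True   # positional encoding relative to the current frame window
```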
---
### **11/02/2025 – Open Source Dataset Released**
- **Dataset**: [Download the dataset](https://huggingface.co/datasets/AnimaVR/Neurosync_Audio2Face_Dataset) to train your own model.
- **Trainer**: Use [NeuroSync Trainer Lite](https://github.com/AnimaVR/NeuroSync_Trainer_Lite) for training and fine-tuning.
---
### **08/02/2025 – Player and License Updates v0.02**
- **Player Update**: Now includes blink animations from the default animation, better thread management, and no playback stutter.
- **Action**: Update your Python files and `model.pth` to the new v0.02 versions.
---
### **25/11/2024 – CSV and Emotion Dimensions Update**
- **Update**: Correct timecode format added and an option to remove emotion dimensions in the CSV generator of [NeuroSync Player (Unreal Engine LiveLink)](https://github.com/AnimaVR/NeuroSync_Player).
- **Note**: Set emotion dimensions to false in `utils/csv` to include only the first 61 dimensions for LiveLink.
---
## Model Overview
The **NeuroSync audio-to-face blendshape transformer seq2seq model** converts sequences of audio features into corresponding facial blendshape coefficients, enabling real-time character animation. It integrates seamlessly with Unreal Engine via LiveLink.
![Model Overview](https://cdn-uploads.huggingface.co/production/uploads/64ad37822a530cbdee7ce10b/35IP0CtVzzllXxOwf2f51.jpeg)
---
## Features
- **Audio-to-Face Transformation**: Converts raw audio features into facial blendshape coefficients.
- **Transformer Seq2Seq Architecture**: Utilizes encoder-decoder layers to capture complex dependencies between audio and facial expressions.
- **Unreal Engine Integration (LiveLink)**: Stream facial blendshapes in real time with the [NeuroSync Player](https://github.com/AnimaVR/NeuroSync_Player).
---
## Usage
### Local API
Set up your local API using the [NeuroSync Local API repository](https://github.com/AnimaVR/NeuroSync_Local_API) to process audio files and stream generated blendshapes.
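A minimal sketch of a client for a local blendshape API of this kind, assuming an HTTP endpoint that accepts raw audio bytes and returns per-frame coefficients; the URL and response fields are hypothetical, so check the Local API repository for the real interface:

```python
import requests

# Hypothetical endpoint and response shape -- see NeuroSync_Local_API for the real interface.
API_URL = "http://127.0.0.1:5000/audio_to_blendshapes"

with open("speech.wav", "rb") as f:
    response = requests.post(API_URL, data=f.read())
response.raise_for_status()

frames = response.json()["blendshapes"]  # assumed: a list of per-frame coefficient lists
print(f"received {len(frames)} frames of {len(frames[0])} coefficients each")
```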
### Non-Local API (Alpha Access)
If you prefer not to host the model locally, apply for the **NeuroSync Alpha API** at [neurosync.info](https://neurosync.info) for direct integration with the [NeuroSync Player](https://github.com/AnimaVR/NeuroSync_Player).
---
## Model Architecture
- **Encoder**: Processes audio features with a transformer encoder using positional encodings.
- **Decoder**: Uses cross-attention in a transformer decoder to generate blendshape coefficients.
- **Output**: Produces 61 blendshape coefficients (with some exclusions for LiveLink).
![Model Architecture](https://cdn-uploads.huggingface.co/production/uploads/64ad37822a530cbdee7ce10b/rptYcl8W7i3XnCCDPUVVL.jpeg)
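To make the shape of the architecture concrete, here is a heavily reduced PyTorch sketch of an audio-to-blendshape transformer of this kind; the layer sizes, names, and decoder input are illustrative, not the released 228M configuration:

```python
import torch
import torch.nn as nn

class AudioToBlendshapes(nn.Module):
    """Reduced illustration: transformer encoder over audio features,
    decoder with cross-attention emitting 61 blendshape coefficients."""

    def __init__(self, audio_dim=128, d_model=256, heads=4, layers=2, out_dim=61):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=heads,
            num_encoder_layers=layers, num_decoder_layers=layers,
            batch_first=True,
        )
        self.out_proj = nn.Linear(d_model, out_dim)

    def forward(self, audio_feats: torch.Tensor) -> torch.Tensor:
        # audio_feats: (batch, frames, audio_dim)
        x = self.audio_proj(audio_feats)
        # Simplification: the encoded sequence is reused as the decoder input;
        # the real model's decoder conditioning may differ.
        y = self.transformer(src=x, tgt=x)
        return self.out_proj(y)  # (batch, frames, 61)
```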
### Blendshape Coefficients
- **Included**: Eye movements (e.g., *EyeBlinkLeft*, *EyeSquintRight*), jaw movements (e.g., *JawOpen*, *JawRight*), mouth movements (e.g., *MouthSmileLeft*, *MouthPucker*), brow movements (e.g., *BrowInnerUp*, *BrowDownLeft*), and cheek/nose movements (e.g., *CheekPuff*, *NoseSneerRight*).
- **Note**: Coefficients 62–68 (related to emotional states) should be ignored or used for additive sliders, since they are not streamed into LiveLink; a trimming sketch follows below.
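A minimal sketch of trimming a full output frame down to the LiveLink set, assuming the model emits the emotion dimensions last in each frame, as described above:

```python
def to_livelink(frame: list[float]) -> list[float]:
    """Keep only the first 61 coefficients; dimensions 62-68 carry
    emotional states and are not streamed into LiveLink."""
    return frame[:61]
```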
---
## Community & Resources
### Live Demo
- **Twitch**: [Talk to a NeuroSync prototype live on Twitch](https://www.twitch.tv/mai_anima_ai)
![Twitch Demo](https://cdn-uploads.huggingface.co/production/uploads/64ad37822a530cbdee7ce10b/iCkHiPFc8RxmqegmkbHh_.png)
### YouTube Channel
For tutorials, updates, and more, visit our [YouTube channel](https://www.youtube.com/@animaai_mai).
![YouTube Channel](https://cdn-uploads.huggingface.co/production/uploads/64ad37822a530cbdee7ce10b/f2EBvDJEmtsCwPJvyDcxl.jpeg)
---
## NeuroSync License
This software uses a **dual-license model**:
### 1. Free License (MIT License)
For individuals and businesses earning **under $1M per year**:
MIT License
Copyright (c) 2025 NeuroSync
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
### 2. Commercial License (For Businesses Earning $1M+ Per Year)
Businesses or organizations with **annual revenue of $1,000,000 or more** must obtain a **commercial license** to use this software.
- To acquire a commercial license, please contact us.
### Compliance
By using this software, you agree to these licensing terms. If your business exceeds the revenue threshold, you must transition to a commercial license or **cease using the software**.
&copy; 2025 NeuroSync
## References
- [NeuroSync Local API](https://github.com/AnimaVR/NeuroSync_Local_API)
- [NeuroSync Player (Unreal Engine LiveLink)](https://github.com/AnimaVR/NeuroSync_Player)
- [Apply for Alpha API Access](https://neurosync.info)
For any questions or further support, please feel free to contribute to the repository or raise an issue.