Model Card for HLLM

This repository hosts the HLLM checkpoints.

For more details and tutorials, see https://github.com/bytedance/HLLM.

The Hierarchical Large Language Model (HLLM) architecture is designed to enhance sequential recommendation systems:

- HLLM significantly outperforms classical ID-based models on large-scale academic datasets and has been validated to yield tangible benefits in real-world industrial settings. Additionally, this method demonstrates excellent training and serving efficiency.
- HLLM effectively transfers the world knowledge encoded during the LLM pre-training stage into the recommendation model, encompassing both item feature extraction and user interest modeling. Nevertheless, task-specific fine-tuning with recommendation objectives is essential.
- HLLM exhibits excellent scalability, with performance continuously improving as the data volume and model parameters increase. This scalability highlights the potential of the proposed approach when applied to even larger datasets and model sizes.

Comparison with state-of-the-art methods

Method	Dataset	Negatives	R@10	R@50	R@200	N@10	N@50	N@200
HSTU	Pixel8M	5632	4.83	10.30	18.28	2.75	3.94	5.13
SASRec	Pixel8M	5632	5.08	10.62	18.64	2.92	4.12	5.32
HLLM-1B	Pixel8M	5632	6.13	12.48	21.18	3.54	4.92	6.22
HSTU-large	Books	512	5.00	11.29	20.13	2.78	4.14	5.47
SASRec	Books	512	5.35	11.91	21.02	2.98	4.40	5.76
HLLM-1B	Books	512	6.97	14.61	24.78	3.98	5.64	7.16
HSTU-large	Books	28672	6.50	12.22	19.93	4.04	5.28	6.44
HLLM-1B	Books	28672	9.28	17.34	27.22	5.65	7.41	8.89
HLLM-7B	Books	28672	9.39	17.65	27.59	5.69	7.50	8.99
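
The column names in the table are not spelled out here. The sketch below assumes that R@K and N@K denote Recall@K and NDCG@K for next-item prediction with a single ground-truth item per sequence, and that the values are reported as percentages. It is an illustration of how such metrics are commonly computed, not the evaluation code from the HLLM repository. All function names (`rank_of_target`, `recall_at_k`, `ndcg_at_k`, `evaluate`) are hypothetical.

```python
import math


def rank_of_target(scores, target):
    """Return the 1-based rank of `target` among items sorted by descending score.

    `scores` maps item id -> model score (hypothetical input format).
    Ties are resolved in favor of the target.
    """
    target_score = scores[target]
    return 1 + sum(1 for s in scores.values() if s > target_score)


def recall_at_k(rank, k):
    """With one relevant item, Recall@K is 1 if it appears in the top K, else 0."""
    return 1.0 if rank <= k else 0.0


def ndcg_at_k(rank, k):
    """With one relevant item the ideal DCG is 1, so NDCG@K is 1 / log2(rank + 1)."""
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0


def evaluate(ranks, ks=(10, 50, 200)):
    """Average the per-sequence metrics and report them as percentages."""
    n = len(ranks)
    results = {}
    for k in ks:
        results[f"R@{k}"] = 100.0 * sum(recall_at_k(r, k) for r in ranks) / n
        results[f"N@{k}"] = 100.0 * sum(ndcg_at_k(r, k) for r in ranks) / n
    return results


if __name__ == "__main__":
    # Toy example: ranks of the ground-truth item for four test sequences.
    print(evaluate([1, 3, 60, 500]))
```

For the toy ranks `[1, 3, 60, 500]`, two of the four targets fall in the top 10, so R@10 is 50.0, and N@10 is 100 * (1 + 0.5) / 4 = 37.5.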

Cite our work

@article{HLLM,
title={HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling},
author={Junyi Chen and Lu Chi and Bingyue Peng and Zehuan Yuan},
journal={arXiv preprint arXiv:2409.12740},
year={2024}
}
