ACE-Step Songs Demo Dataset

Details

Hub link: Yi3852/ACEStep-Songs
Description: A dataset of about 21k songs synthesized by ACE-Step itself, with LLM-generated lyrics scored by GPT-4o. It is useful for bootstrapping or testing the training pipeline.
score_lyrics: GPT-4o score (1-10) for lyric quality; -1 indicates an instrumental track.
Caption format(s): Hugging Face dataset features prompt and lyrics. The dataset also contains audio_duration and score_lyrics, which are unused.
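
Because score_lyrics is not used when the dataset is loaded, any quality filtering has to happen beforehand. The following is a minimal sketch of such a filter, assuming rows are plain dictionaries with the column names listed above. The function name `select_rows`, the threshold of 7, and the sample rows are hypothetical and only illustrate the 1-10 scale and the -1 instrumental sentinel.

```python
INSTRUMENTAL = -1  # sentinel from the dataset description: no lyrics were scored


def select_rows(rows, min_score=7, keep_instrumental=True):
    """Keep rows whose lyrics score is at least min_score.

    Instrumental rows (score_lyrics == -1) carry no lyrics score, so they
    are kept or dropped according to keep_instrumental instead.
    """
    selected = []
    for row in rows:
        score = row["score_lyrics"]
        if score == INSTRUMENTAL:
            if keep_instrumental:
                selected.append(row)
        elif score >= min_score:
            selected.append(row)
    return selected


# Hypothetical rows, made up for illustration (not taken from the dataset).
sample_rows = [
    {"prompt": "lo-fi, piano", "lyrics": "", "score_lyrics": -1},
    {"prompt": "pop, female vocal", "lyrics": "city lights tonight", "score_lyrics": 8},
    {"prompt": "rock, guitar", "lyrics": "la la la", "score_lyrics": 4},
]

kept = select_rows(sample_rows)
print([row["prompt"] for row in kept])
```

The same predicate could be applied to the real dataset before training, for example with a filter step in whatever loading code you use; that wiring is left out here because it depends on your setup.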

Dataloader configuration example

Use this configuration in your multidatabackend.json to load the dataset directly from Hugging Face.

[
  {
    "id": "acestep-demo-data",
    "type": "huggingface",
    "dataset_type": "audio",
    "dataset_name": "Yi3852/ACEStep-Songs",
    "metadata_backend": "huggingface",
    "caption_strategy": "huggingface",
    "cache_dir_vae": "cache/vae/{model_family}/acestep-demo-data"
  },
  {
    "id": "alt-embed-cache",
    "dataset_type": "text_embeds",
    "default": true,
    "type": "local",
    "cache_dir": "cache/text/{model_family}"
  }
]
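
Before starting a run, it can help to confirm that the file parses and that the two entries are consistent with each other. The sketch below is one way to do that with the standard library only. The checks in `check_backends` are assumptions chosen for illustration (unique ids, at least one text_embeds entry, a dataset_name on every huggingface entry); they are not the trainer's own validation rules.

```python
import json

# The configuration from above, inlined so the sketch is self-contained.
CONFIG_TEXT = """
[
  {
    "id": "acestep-demo-data",
    "type": "huggingface",
    "dataset_type": "audio",
    "dataset_name": "Yi3852/ACEStep-Songs",
    "metadata_backend": "huggingface",
    "caption_strategy": "huggingface",
    "cache_dir_vae": "cache/vae/{model_family}/acestep-demo-data"
  },
  {
    "id": "alt-embed-cache",
    "dataset_type": "text_embeds",
    "default": true,
    "type": "local",
    "cache_dir": "cache/text/{model_family}"
  }
]
"""


def check_backends(entries):
    """Return a list of human-readable problems (empty if none were found).

    These checks are illustrative assumptions, not the trainer's real rules.
    """
    problems = []
    ids = [entry.get("id") for entry in entries]
    if None in ids:
        problems.append("an entry is missing 'id'")
    if len(ids) != len(set(ids)):
        problems.append("duplicate 'id' values")
    if not any(e.get("dataset_type") == "text_embeds" for e in entries):
        problems.append("no text_embeds entry")
    for entry in entries:
        if entry.get("type") == "huggingface" and not entry.get("dataset_name"):
            problems.append(f"{entry.get('id')}: huggingface entry without dataset_name")
    return problems


entries = json.loads(CONFIG_TEXT)  # raises json.JSONDecodeError on invalid JSON
problems = check_backends(entries)
print(problems if problems else "config looks consistent")
```

To check your own file instead of the inlined text, read it with `json.load(open("multidatabackend.json"))` and pass the result to `check_backends`.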

Citation

@misc{jiang2025advancingfoundationmodelmusic,
      title={Advancing the Foundation Model for Music Understanding},
      author={Yi Jiang and Wei Wang and Xianwen Guo and Huiyun Liu and Hanrui Wang and Youri Xu and Haoqi Gu and Zhongqian Xie and Chuanjiang Luo},
      year={2025},
      eprint={2508.01178},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2508.01178},
}