Million Song Dataset

The core of the dataset is the feature analysis and metadata for one million songs, provided by The Echo Nest. The dataset does not include any audio, only the derived features.
Keywords: social science, music
Size: 199.5GB


  • ark:/31807/osdc-c1c763e4
Last Updated: 2013-12-11 13:47:45 UTC

Access Instructions

All public data sets are available on both commodity internet connections and high speed StarLight/Internet2 connections. We recommend using rsync or UDR to download the data.

Downloading with UDR (UDT enabled rsync)

UDR is a wrapper around rsync that lets rsync use the high-performance UDT network protocol, which can greatly improve download speeds, especially over high-speed networks. Once UDR is installed, the only change is to place the udr command in front of the rsync command you would normally use to download the data. UDR is open source and under active development; the most recent version is available on GitHub.

At the moment, UDR is not enabled on the transfer node. The functionality should return shortly. Use rsync in the meantime.

List the contents of Million Song Dataset:

  • Using rsync: rsync
  • Using udr: udr rsync

Download/synchronize Million Song Dataset:

  • Using rsync: rsync -avzuP /path/to/local_copy
  • Using udr: udr rsync -avzuP /path/to/local_copy

Download an individual file from Million Song Dataset:

  • Using rsync: rsync -avzuP /path/to/local_copy
  • Using udr: udr rsync -avzuP /path/to/local_copy
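
The commands above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the rsync source address for the dataset is not given in the text above, so `SRC` below is a placeholder that must be replaced, and `build_transfer_cmd` is a hypothetical helper name, not part of rsync or UDR. The helper only prints the command it would run, so nothing is transferred until you copy the line into a terminal.

```shell
#!/bin/sh
# Placeholder (assumption): the dataset's rsync source is not stated in the
# text above. Replace this value with the real source before running anything.
SRC="REPLACE_WITH_DATASET_RSYNC_SOURCE"
DEST="/path/to/local_copy"

# Hypothetical helper: print the transfer command for a given tool.
# rsync flags used in the listing above:
#   -a  archive mode (recursive, preserves permissions and timestamps)
#   -v  verbose output
#   -z  compress data during transfer
#   -u  skip files that are newer on the receiving side
#   -P  keep partially transferred files and show progress
build_transfer_cmd() {
  tool="$1"; src="$2"; dest="$3"
  case "$tool" in
    rsync) printf 'rsync -avzuP %s %s\n' "$src" "$dest" ;;
    udr)   printf 'udr rsync -avzuP %s %s\n' "$src" "$dest" ;;
    *)     echo "unknown tool: $tool" >&2; return 1 ;;
  esac
}

# Print both variants; the UDR form is the same rsync command with udr in front.
build_transfer_cmd rsync "$SRC" "$DEST"
build_transfer_cmd udr "$SRC" "$DEST"
```

Because of the `-u` and `-P` flags, rerunning the same command after an interrupted download should pick up where it left off rather than starting over.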



Kalyan Banga


