To view the dataset in more detail, please use the CASTLE Viewer. More features will be added to the viewer in the future.

What is CASTLE?

The CASTLE dataset is a large-scale, multimodal dataset designed to advance research in lifelogging, human activity recognition, and multimodal retrieval. It provides a rich collection of time-aligned sensor and video data for analysis and benchmarking. See the paper (or its arXiv preprint) for more details.

Characteristics

Captured over four days in a controlled environment
10 participants engaged in natural activities
15 video streams (10 egocentric, 5 static perspectives)
Over 600 hours of UHD 50 fps video with audio
Includes 6DoF IMU, GPS, and biometric data
8.22 TB total size
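For planning storage and bandwidth, the figures above can be turned into rough per-stream numbers. The sketch below is only a back-of-the-envelope estimate under assumptions that the page does not state: that "TB" means decimal terabytes, that the 600 hours are spread evenly across the 15 streams, and that the whole 8.22 TB is video (it also contains sensor data, so the bitrate is an upper bound).

```python
# Back-of-the-envelope figures derived from the characteristics listed above.
# Assumptions (not stated by the dataset authors): decimal terabytes,
# equal recording time per stream, and all bytes counted as video.
TOTAL_BYTES = 8.22e12   # 8.22 TB
VIDEO_HOURS = 600       # "over 600 hours", so treated as a lower bound
STREAMS = 15            # 10 egocentric + 5 static
DAYS = 4


def average_bitrate_mbps(total_bytes: float, hours: float) -> float:
    """Average bitrate in Mbit/s if every byte belonged to the video."""
    return total_bytes * 8 / (hours * 3600) / 1e6


hours_per_stream = VIDEO_HOURS / STREAMS
hours_per_stream_per_day = hours_per_stream / DAYS
bitrate = average_bitrate_mbps(TOTAL_BYTES, VIDEO_HOURS)

print(f"{hours_per_stream:.0f} h per stream")
print(f"{hours_per_stream_per_day:.0f} h per stream per day")
print(f"at most about {bitrate:.1f} Mbit/s on average")
```

Under these assumptions each stream covers roughly 40 hours, or about 10 hours per day, at an average of no more than about 30 Mbit/s.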

House layout with camera positions (not to scale)

Download

License

The CASTLE dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Terms of Use

By downloading the dataset, you agree to the following terms:

The dataset is provided for research purposes only.
You will not use the dataset for any commercial purposes.
You will not distribute the dataset or any derivative works to others.
You will provide appropriate credit to the dataset authors in your publications.

Important

The dataset is available for download from Hugging Face.
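Because the full dataset is 8.22 TB, it may be practical to fetch only part of it. The following is a minimal sketch using `snapshot_download` from the third-party `huggingface_hub` package. The repository id and the file pattern are hypothetical placeholders, since this page does not state them; take the real values from the dataset's Hugging Face page.

```python
from pathlib import Path
from typing import Optional, Sequence

# Hypothetical placeholder: replace with the repository id shown on the
# dataset's Hugging Face page. It is not given on this page.
REPO_ID = "<organisation>/<dataset-name>"


def build_download_kwargs(
    repo_id: str,
    target_dir: str,
    patterns: Optional[Sequence[str]] = None,
) -> dict:
    """Assemble keyword arguments for huggingface_hub.snapshot_download."""
    kwargs = {
        "repo_id": repo_id,
        "repo_type": "dataset",          # the repository is a dataset, not a model
        "local_dir": str(Path(target_dir)),
    }
    if patterns:
        # Restrict the download to files matching these glob patterns.
        kwargs["allow_patterns"] = list(patterns)
    return kwargs


def download(
    repo_id: str,
    target_dir: str,
    patterns: Optional[Sequence[str]] = None,
) -> str:
    """Download the selected files and return the local directory path."""
    # Imported here so the helper above works without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    return snapshot_download(**build_download_kwargs(repo_id, target_dir, patterns))


if __name__ == "__main__":
    # "*.csv" is a hypothetical pattern; the actual file layout may differ.
    print(build_download_kwargs(REPO_ID, "castle", ["*.csv"]))
```

Calling `download(REPO_ID, "castle", ["*.csv"])` with a real repository id would start the transfer. If the repository requires accepting its terms or logging in, authentication with a Hugging Face token may be needed first.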

If you are using the dataset in your research, please consider citing the following paper:

@inproceedings{10.1145/3746027.3758199,
    author = {Rossetto, Luca and Bailer, Werner and Dang-Nguyen, Duc-Tien and Healy, Graham and J\'{o}nsson, Bj\"{o}rn \TH{}\'{o}r and Kongmeesub, Onanong and Le, Hoang-Bao and Rudinac, Stevan and Sch\"{o}ffmann, Klaus and Spiess, Florian and Tran, Allie and Tran, Minh-Triet and Tran, Quang-Linh and Gurrin, Cathal},
    title = {The CASTLE 2024 Dataset: Advancing the Art of Multimodal Understanding},
    year = {2025},
    isbn = {9798400720352},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3746027.3758199},
    doi = {10.1145/3746027.3758199},
    booktitle = {Proceedings of the 33rd ACM International Conference on Multimedia},
    pages = {12629--12635},
    numpages = {7},
    keywords = {dataset, egocentric vision, lifelogging, multi-perspective video, multimodal understanding},
    location = {Dublin, Ireland},
    series = {MM '25}
}

Challenges

We are organizing a series of challenges to encourage the research community to explore and utilize the CASTLE dataset.

To see the list of challenges and their details, please visit the Challenges page.