About
What is CASTLE?
The CASTLE dataset is a large-scale, multimodal dataset designed for advancing research in lifelogging, human activity recognition, and multimodal retrieval. It provides a rich collection of time-aligned sensor and video data for analysis and benchmarking. See the arXiv pre-print for more details.
Characteristics
- Captured over four days in a controlled environment
- 10 participants engaged in natural activities
- 15 video streams (10 egocentric, 5 static perspectives)
- Over 600 hours of UHD 50 fps video with audio
- Includes 6DoF IMU, GPS, and biometric data
- 8.22 TB total size
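
For storage and bandwidth planning, the figures in the list above can be turned into rough averages. The following is a back-of-the-envelope sketch, not an official specification. It assumes that 1 TB means 10**12 bytes, which the page does not state. It also treats the whole 8.22 TB as if it were video, although the total includes sensor data as well.

```python
# Figures quoted in the characteristics list.
TOTAL_SIZE_TB = 8.22
VIDEO_HOURS = 600  # the page says "over 600 hours", so the averages are upper bounds

# Assumption: decimal terabytes (10**12 bytes); the source does not specify.
BYTES_PER_TB = 10**12

total_bytes = TOTAL_SIZE_TB * BYTES_PER_TB

# Average storage per hour of video, in gigabytes (10**9 bytes).
gb_per_hour = total_bytes / VIDEO_HOURS / 10**9

# Average data rate per stream, in megabits per second.
mbit_per_s = total_bytes * 8 / (VIDEO_HOURS * 3600) / 10**6

print(f"about {gb_per_hour:.1f} GB per hour of video")
print(f"about {mbit_per_s:.1f} Mbit/s on average")
```

Under these assumptions, one hour of a single stream averages at most roughly 13.7 GB, or about 30 Mbit/s.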

Download
The dataset is currently being prepared for release. Please check back soon for updates.
License
The CASTLE dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Terms of Use
By downloading the dataset, you agree to the following terms:
- The dataset is provided for research purposes only.
- You will not use the dataset for any commercial purposes.
- You will not distribute the dataset or any derivative works to others.
- You will provide appropriate credit to the dataset authors in your publications.
Download Links
The dataset is currently in pre-release. Please fill in the access request form below to receive download instructions.
Contact
If you have any questions about the dataset or licensing, please contact the team here.
Challenges
The first CASTLE multimodal analytics challenge will be held at ACM Multimedia 2025 in Dublin, Ireland.
To express your interest in participating in the challenge, please fill in this form.
Timeline
- 09 March 2025: Challenge Announcement & Website Launch
- 09 March 2025: Registration Opens
- 24 March 2025: Dataset & Query Release
- 30 June 2025: Fully-Automated Submission Deadline
- 24 July 2025: Notification to Authors
- 26 August 2025: Camera-Ready Deadline
Guidelines for Participants
Participants will be required to register and agree to the dataset usage policy. Details regarding dataset licensing and submission guidelines will be provided upon release.
Tasks
The inaugural edition of the CASTLE Challenge features a diverse set of tasks covering event detection, retrieval, and question answering. Future editions will expand the scope; this edition includes the following tasks:
🔍 Event Instance Search
Given a textual description (in English), participants must identify all timeframes where a specific event occurs. Events should be reported with both a time range and a video ID.
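
As an illustration of reporting an event with a time range and a video ID, the sketch below models one result as a small record and merges overlapping detections within the same video. The official submission format is not described here, so the class name `EventHit`, its fields, the use of seconds, and the video IDs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventHit:
    """One reported occurrence: a video ID plus a time range (hypothetical layout)."""
    video_id: str   # made-up identifier; the real naming scheme may differ
    start_s: float  # start of the range, in seconds (assumed unit)
    end_s: float    # end of the range, in seconds (assumed unit)


def merge_hits(hits, gap_s=0.0):
    """Merge hits in the same video that overlap or lie within gap_s seconds."""
    merged = []
    for hit in sorted(hits, key=lambda h: (h.video_id, h.start_s)):
        last = merged[-1] if merged else None
        if (
            last is not None
            and last.video_id == hit.video_id
            and hit.start_s <= last.end_s + gap_s
        ):
            # Extend the previous range instead of reporting a duplicate.
            merged[-1] = EventHit(last.video_id, last.start_s, max(last.end_s, hit.end_s))
        else:
            merged.append(hit)
    return merged


hits = [
    EventHit("ego_01", 120.0, 135.0),
    EventHit("ego_01", 130.0, 150.0),
    EventHit("static_03", 125.0, 140.0),
]
merged_hits = merge_hits(hits)
for h in merged_hits:
    print(f"{h.video_id}: {h.start_s:.1f}s to {h.end_s:.1f}s")
```

Merging is only done within a single video here. Whether the same event seen from several streams should be reported once or once per stream is a question for the official guidelines.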
📦 Object Instance Search
Given a textual (in English) or visual (i.e., using an image) example of a physical object, participants must find all occurrences of that object across any of the video streams.
💬 Question Answering
Given a question in natural language (in English), participants must provide an answer. The response should be formulated in natural language and include references to relevant sensor streams and time intervals as supporting evidence.
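
One possible way to structure such a response is a natural-language answer paired with a list of evidence references, each naming a stream and a time interval. This is a minimal sketch under stated assumptions: the field names, the JSON serialization, the stream ID, and the example question are invented for illustration and do not reflect the official submission format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Evidence:
    """A reference to a stream and a time interval (hypothetical layout)."""
    stream_id: str  # video or sensor stream; the naming is made up
    start_s: float  # interval start, in seconds (assumed unit)
    end_s: float    # interval end, in seconds (assumed unit)


@dataclass
class Answer:
    """A natural-language answer with supporting evidence."""
    question: str
    answer: str
    evidence: list = field(default_factory=list)

    def validate(self):
        """Check that the answer is non-empty and backed by well-formed evidence."""
        if not self.answer.strip():
            raise ValueError("answer text is empty")
        if not self.evidence:
            raise ValueError("at least one evidence reference is expected")
        for ev in self.evidence:
            if ev.end_s < ev.start_s:
                raise ValueError(f"interval ends before it starts: {ev}")


# Invented example question and answer, for illustration only.
result = Answer(
    question="Who made coffee on the first morning?",
    answer="The participant wearing camera ego_04 made coffee.",
    evidence=[Evidence("ego_04", 3600.0, 3720.0)],
)
result.validate()
answer_json = json.dumps(asdict(result), indent=2)
print(answer_json)
```

Validating before serializing catches an answer that lacks supporting evidence, which the task description asks for.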
Evaluation
The challenge will operate across two tracks: fully-automatic and interactive.
⚙️ Fully-Automatic Track
Participants receive queries in advance and generate results using any method they choose. These results are then submitted to the challenge organizers for evaluation. The queries for the Fully-Automatic Track are available here.
🎮 Interactive Track
This track will be evaluated live during the conference. Participants must solve tasks synchronously and interactively within a limited timeframe. This format follows established competitions such as the Video Browser Showdown and the Lifelog Search Challenge.
Meet Our Team
The CASTLE dataset is a collaborative effort between researchers from multiple institutions. Here are the participants who contributed to the generation of the first edition of the dataset.