GENERAL INFORMATION
...........................

1. Dataset title: UMATBrush Traces

2. Authors: Francisco Javier González-Cañete, Eduardo Casilari

3. Author contact information: Francisco González-Cañete. Email: fgc@uma.es

METHODOLOGICAL INFORMATION
.................................

1. Description of the methods for collection/generation of data:
The UMATBrush dataset provides accelerometer measurements captured with three different commercial smartwatches from four participants during the execution of toothbrushing activities and other daily life activities in real-life conditions. The dataset can be used to characterize the dynamics of toothbrushing movements and to train, validate and test AI-based HAR (Human Activity Recognition) systems.

Reference: González-Cañete, Francisco; Casilari, Eduardo (2025). UMATBrush Traces. figshare. Dataset. https://doi.org/10.6084/m9.figshare.28955756.v2

2. Data processing methods:
Four experimental users were monitored with a smartwatch worn on the dominant hand during a series of toothbrushing sessions. In addition, the participants were monitored ‘in the wild’, under real-world conditions, during their daily life routines.

3. Software or instruments needed to interpret the data:
Data are stored in CSV (Comma Separated Values) plain-text files, which can be loaded and processed directly from any programming language.

4. Standards and calibration information, if appropriate:
The smartwatches are assumed to have been calibrated by their respective vendors.

5. Environmental or experimental conditions:
The samples were captured in the domestic environments of the four experimental users, located in Málaga (Spain) or Rincón de la Victoria (Spain). A Wear OS application was developed and installed on the smartwatches in order to capture the subject’s movements.
The application incorporates a button that starts the measurement by capturing the values returned by the built-in accelerometer and storing them in text files in the smartwatch internal memory. To label the traces as toothbrushing or non-toothbrushing, a second on/off button enables the user, once the capturing process has been initiated, to mark the initial and final moments of each toothbrushing session. After the monitoring was complete, the raw files stored in the smartwatches were transferred to a personal computer and post-processed in the MATLAB programming environment to produce the final data format employed in the dataset.

FILE OVERVIEW
----------------------

The database (UMATBrush Dataset) has a main folder and two subfolders named Traces and Scripts. The first one contains the collected accelerometer data, while the second one includes two scripts that automatically download, unzip and process the dataset.

The Traces subfolder in turn consists of four subfolders, each corresponding to one of the experimental subjects. These are named with the word Subject, followed by an underscore (_) and the subject’s identifier (a number from 1 to 4).

DATA-SPECIFIC INFORMATION:
-------------------------------------------

The data files with the acceleration measurements are named according to the layout Subject_X_Watch_WWW_Date_YYYYMMDD_Time_HHMMSS.csv, where:
• X is the subject identification number (1 to 4).
• WWW is the model name of the smartwatch used to capture the acceleration signals. It can take one of the following values: LEO-DLXX, TicWatch-Pro or TicWatch-Pro-3-GPS, as indicated in the next section.
• YYYYMMDD is the year (YYYY), month (MM) and day (DD) when the trace was collected.
• HHMMSS is the hour (HH), minute (MM) and second (SS) of the instant when the gathering of the samples was initiated.
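The naming layout above can be unpacked programmatically. The following is a minimal Python sketch and is not part of the Load_traces scripts shipped with the dataset; the function name parse_trace_filename and the example filename are hypothetical, made up only to follow the layout.

```python
import re
from datetime import datetime

# Pattern mirroring Subject_X_Watch_WWW_Date_YYYYMMDD_Time_HHMMSS.csv.
# The watch field is matched lazily because model names contain hyphens.
_TRACE_NAME = re.compile(
    r"^Subject_(?P<subject>\d+)_Watch_(?P<watch>.+?)"
    r"_Date_(?P<date>\d{8})_Time_(?P<time>\d{6})\.csv$"
)


def parse_trace_filename(name):
    """Split a trace filename into subject, watch model and start time."""
    match = _TRACE_NAME.match(name)
    if match is None:
        raise ValueError(f"Unexpected trace filename: {name!r}")
    started = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return {
        "subject": int(match["subject"]),
        "watch": match["watch"],
        "started": started,
    }


# Hypothetical filename (an assumption, not taken from the dataset).
info = parse_trace_filename(
    "Subject_2_Watch_TicWatch-Pro-3-GPS_Date_20240115_Time_073012.csv"
)
print(info["subject"], info["watch"], info["started"].isoformat())
```

Raising ValueError on a non-matching name is a design choice of this sketch, so that unrelated files in a folder are noticed rather than silently skipped.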
Data files are plain-text files in CSV (Comma Separated Values) format without headers. Every line in each data file contains a single measurement of the triaxial accelerometer of the IMU embedded in the smartwatch. The values of each line are arranged as follows:
• Timestamp, Ax, Ay, Az, Class Label
where:
Timestamp is the time (in μs) elapsed from the start of the recording to the instant at which the sample was captured. Accordingly, the first sample has a timestamp of zero and all other samples have a timestamp relative to this first sample.
Ax, Ay, Az are the measurements of the three axes of the triaxial accelerometer (expressed in g units).
Class Label is a binary value that indicates whether the sample corresponds to a toothbrushing activity (1) or not (0).

Finally, the Scripts subfolder includes two programs, developed in Python and MATLAB respectively. Both scripts, which are named Load_traces, have the same purpose: to automate the process of downloading and handling the dataset. Specifically, they perform the following operations:
1. Retrieve the dataset from the public repository as a single compressed ZIP file.
2. Extract the contents of this file and set up the dataset’s subfolder structure within a designated main directory named UMATBrush_Dataset.
3. Read all the CSV files and store their contents in a data structure referred to as datasetTraces: a list of dictionaries in Python or a matrix of structures in MATLAB. Each element in this list or matrix includes two fields: the filename (which identifies the user, the smartwatch, and the date and time of the experiment) and a numerical array with five columns containing the timestamp, the three measurements captured by the triaxial sensor and, finally, the binary label of the measurement, as described above.

MORE INFORMATION
-------------------

Refer to the following article for further information about the dataset and the employed smartwatches:
F.J. González-Cañete, E.
Casilari, UMATBrush: A dataset of inertial signals of toothbrushing activities, Data in Brief, Volume 62, 2025, 111980, ISSN 2352-3409, https://doi.org/10.1016/j.dib.2025.111980.
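
As a complement to the Load_traces scripts, the five-column layout described under DATA-SPECIFIC INFORMATION (Timestamp, Ax, Ay, Az, Class Label) can also be read with the Python standard library alone. The sketch below is only an illustration: the function names read_trace and brushing_intervals are hypothetical, and the sample rows are invented for the example rather than taken from the dataset.

```python
import csv
import io


def read_trace(lines):
    """Parse headerless rows: Timestamp (us), Ax, Ay, Az (g), Class Label."""
    samples = []
    for row in csv.reader(lines):
        if not row:
            continue  # tolerate blank lines
        t_us, ax, ay, az, label = (float(value) for value in row)
        samples.append((t_us, ax, ay, az, int(label)))
    return samples


def brushing_intervals(samples):
    """Return (start_us, end_us) for each consecutive run of label-1 samples."""
    intervals, start, prev_t = [], None, None
    for t_us, _, _, _, label in samples:
        if label == 1 and start is None:
            start = t_us  # a toothbrushing run begins here
        elif label == 0 and start is not None:
            intervals.append((start, prev_t))  # run ended at the previous sample
            start = None
        prev_t = t_us
    if start is not None:
        intervals.append((start, prev_t))  # trace ended while still brushing
    return intervals


# Invented rows used only to exercise the functions (not real measurements).
demo = io.StringIO(
    "0,0.01,-0.98,0.12,0\n"
    "20000,0.03,-0.95,0.10,1\n"
    "40000,0.25,-0.80,0.31,1\n"
    "60000,0.02,-0.97,0.11,0\n"
)
trace = read_trace(demo)
print(brushing_intervals(trace))
```

For a real trace, the same function can be given an open file instead of the in-memory buffer, for example with open(path, newline="") as f: trace = read_trace(f), where path points to one of the CSV files in the Traces subfolder.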