APPLIED SCIENCES

Validity of Consumer-Based Physical Activity Monitors for Specific Activity Types

NELSON, M. BENJAMIN; KAMINSKY, LEONARD A.; DICKIN, D. CLARK; MONTOYE, ALEXANDER H. K.

Medicine & Science in Sports & Exercise 48(8):p 1619-1628, August 2016. | DOI: 10.1249/MSS.0000000000000933

Abstract

Purpose 

Consumer-based physical activity (PA) monitors are popular for individual tracking of PA variables. However, current research has not examined how these monitors track energy expenditure (EE) and steps in distinct activities. This study examined the accuracy of the Fitbits One, Zip, and Flex and Jawbone UP24 for estimating EE and steps for specific activities and activity categories.

Methods 

Thirty subjects completed a structured protocol consisting of three sedentary, four household, and four ambulatory/exercise activities. All subjects began by lying on a bed for 10 min; 10 other activities were performed for 5 min each. Indirect calorimetry (COSMED) and researcher-counted steps were criterion measures for EE and step counts, respectively. The Omron HJ-720IT pedometer was used as a comparison of step count accuracy. EE and steps were compared with criterion measures using the Friedman repeated-measures nonparametric test and mean absolute percent error (MAPE).

Results 

All PA monitors predicted EE within 8% of COSMED for sedentary activity but overestimated EE by 16%–40% during ambulatory activity. All monitors except the Fitbit Flex (within 8% of criterion) underestimated EE by 27%–34% during household activity. EE predictions were accompanied with MAPE >10%. For household activity, the Fitbit Flex estimated steps within 10% of researcher-counted steps; all other monitors underestimated steps by 35%–64%. All monitors estimated steps within 4% of researcher-counted steps and displayed MAPE <10% during ambulatory activity. The Omron underestimated household steps by 74% but was within 1% for ambulatory steps. All monitors severely underestimated EE and steps during cycling.

Conclusion 

Consumer-based PA monitors should be used cautiously for estimating EE, although they provide accurate measures of steps for structured ambulatory activity, similar to validated pedometers.

Physical activity (PA) has been defined as any bodily movement that results in an increased energy expenditure (EE) from resting levels (4). Leading a regularly physically active lifestyle leads to many health benefits, including decreased risk of chronic disease morbidity and mortality (2,14,32). Accelerometers and pedometers are two types of PA monitors that are frequently used to objectively quantify PA (5). With the increasing awareness of the benefits of PA and the technical advances making electronics less expensive and smaller, commercial entities have created a market of wearable technology products that track and analyze PA.

Consumer-based PA monitors have many commonalities with pedometers and accelerometers but also offer their own unique features, making them new and potentially useful devices for PA assessment. These monitors use accelerometer technology and, through proprietary analysis methods, derive estimates of several aspects of PA such as steps and EE. Using on-device screens, computer screens, or smartphone connections, these PA monitors are able to provide real-time feedback about PA in the form of estimated EE (kcal), steps taken, and activity intensity in an easily understood format to the user. Similar to commonly used pedometers, these feedback features may allow for consumer-based PA monitors to promote behavior change and PA maintenance (16,19). Consumer-based PA monitors could help users track PA parameters in reference to current PA guidelines and public health statements, which have been designed to be understood and used by the public and in public health settings (i.e., 150 min·wk−1 of moderate-vigorous PA, 10,000 steps per day) (28,29).

Measurement accuracy is important when tracking PA variables to provide meaningful measures of the quantity and quality of PA and to assess adherence to PA guidelines. Validation studies conducted using accelerometers for EE prediction predominantly use indirect calorimetry as a criterion; for step counting validation in pedometers and other PA monitors, manually counted steps and reference pedometers have been used as criterion measures. Despite the potential uses of consumer-based monitors for tracking PA, there is limited research with equivocal findings pertaining to these monitors’ accuracy for assessing PA parameters in reference to these criterion measures. Studies by Lee et al. (15) and Dannecker et al. (8) have established that consumer-based PA monitors generally underestimate total EE during protocols incorporating activities of daily living, exercise activities, and sedentary behaviors. Lee et al. (15) showed that the Fitbit Zip (FZ) was accurate for estimating total EE within a margin of 10% error using equivalence testing analyses; the mean absolute percent error (MAPE) of monitors ranged from 9% to 24%. However, these studies were limited by analyzing only total EE over the course of a protocol rather than looking at individual activities, limiting the understanding of how monitors perform during distinct activities. Sasaki et al. (20) examined the accuracy of consumer-based PA monitors for EE prediction on an individual activity basis, finding that EE estimation accuracy was variable, depending on the activity; however, this study was conducted with only the Fitbit Classic, which is no longer on the market. Stackpool et al. (24) found that EE estimates of consumer-based PA monitors had high ranges of error (ranging from 13% to 60%) for walking, jogging, and recreation activities but did not examine other activity types. In summary, there is considerable variability of device accuracy for predicting EE, depending on the type of activity and monitor selection.

Although consumer-based PA monitors have largely been studied with regard to their capability to estimate EE, little is known about the accuracy of these monitors for counting steps. Stackpool et al. (24) found that step counts for ambulatory activity were accurate within 4% of manually counted steps; error increased to 18% during sport and recreation activities. Case et al. (3) recently found that hip-worn monitors were highly accurate during treadmill walking, with estimations within 1% of manually counted steps. These studies suggest that nonambulatory activities may be more difficult to measure accurately, although more study is needed to develop a better understanding of the accuracy of consumer-based PA monitors for different types of nonambulatory activities.

Steps and EE are important PA variables to assess accurately because they are practical indicators of PA and are widely used to quantify PA. Monitor accuracy specific to individual activities for EE and step counting needs further research to evaluate the usefulness of these monitors for PA assessment, especially because research-grade accelerometers (accelerometers meant for research use that are not commonly used by individuals) are known to display variable ranges of error when tracking movement for some types of activity (i.e., nonambulatory) (31). In addition, because consumer-based PA monitors share similar traits with pedometers (provide feedback about steps, are simple to use, and are commonly used by the public), they could prove useful in PA assessment and interventions. However, the accuracy of consumer-based PA monitors has not been studied in comparison with validated pedometers.

The purpose of this study was to examine the criterion-based validity of four commonly used consumer-based PA monitors, the Fitbit One (FO), FZ, Fitbit Flex (FF), and Jawbone UP24 (JU), for the estimation of EE and steps taken specific to several activities and activity categories. In addition, we assessed the capability of these monitors to estimate steps in comparison with the Omron HJ-720IT (OM) pedometer to compare the accuracy of consumer-based monitors to that of a previously validated PA assessment tool.

METHODS

Subjects

The subjects for this study were 30 healthy adults recruited from Ball State University and the Adult Physical Fitness Program at Ball State University. Ten subjects were recruited in each of three age-groups: 18–39 yr, 40–59 yr, and 60–80 yr. Eligibility for the study included male and female adults without any gait abnormalities (i.e., limp, use of assist devices) and who were able to participate in a variety of activities for at least 5 min each. Subjects were excluded on the basis of acute illness, unstable chronic conditions, and pregnancy. Before participation, subjects provided written informed consent approved by the Ball State University Institutional Review Board. Subjects also completed the Edinburgh Handedness Inventory for the determination of hand dominance.

Subjects had a mean age of 48.9 ± 19.4 yr (ranging from 18 to 79 yr), with a mean body mass index (BMI) of 26.3 ± 5.2 kg·m−2 (ranging from 16.3 to 38.2 kg·m−2). The average waist circumference was 94.5 ± 8.1 cm for the 15 male subjects and 81.2 ± 11.3 cm for the 15 female subjects. Two subjects were left-hand dominant, and the other 28 were right-hand dominant.

Equipment

During the activity protocol, subjects wore five PA monitors and a portable metabolic analyzer. A description of the equipment used follows. For the consumer-based PA monitors, only the EE and step functions were used for this study.

Fitbit One

The FO (Fitbit Inc., San Francisco, CA) is a hip-worn, accelerometer-based PA monitor that estimates EE, steps, distance moved, flights of stairs climbed, and sleep (quantity and quality). The FO uses a rechargeable battery as a power source and also uses an Internet connection or Bluetooth to transmit data to a computer or a smartphone device, and it can upload data to the Fitbit Connect Application. It also has a display screen, which provides real-time tracking information for the variables assessed.

Fitbit Zip

The FZ (Fitbit Inc.) is a hip-worn, accelerometer-based PA monitor that estimates EE, steps, distance moved, and flights of stairs climbed. The FZ uses a watch battery as a power source and also uses an Internet connection or Bluetooth to transmit data to a computer or smartphone device, and it can upload data to the Fitbit Connect Application. This monitor has a display screen, which provides real-time tracking information.

Fitbit Flex

The FF (Fitbit Inc.) is a wrist-worn, accelerometer-based PA monitor that estimates EE, steps, active time, distance moved, and sleep. The monitor has a rechargeable battery and syncs to a smartphone via an Internet connection or Bluetooth, and it can be downloaded via USB linking to a computer. Data can be uploaded to the Fitbit Connect Application, which allows for viewing and analysis of both PA and sleep data.

Jawbone UP24

The JU (AliphCom dba Jawbone, San Francisco, CA) is a wrist-worn, accelerometer-based PA monitor that tracks EE, steps, active time, distance moved, and sleep. The monitor has a rechargeable battery and syncs to a smartphone via an Internet connection or Bluetooth and uploads data to the Jawbone Application, which allows for viewing and analysis of PA and sleep data.

Omron HJ-720IT

The OM (Omron Healthcare, Inc., Lake Forest, IL) is a hip- or pocket-worn, accelerometer-based device that is used primarily for its pedometer (step counting) function. Powered by a watch battery, the OM tracks and displays steps and distance on a display screen and estimates EE based on subject parameters. The OM has no wireless syncing capabilities but can be downloaded to a computer for data analysis. The OM has been shown to be a valid step counter at variable walking speeds in structured settings but may underestimate steps during free-living settings in comparison with an ankle-based pedometer (12,22). Despite this limitation (a common issue with pedometers in general), the OM remains a highly regarded tool for the assessment of steps taken. Therefore, the OM was included in this study as a comparison with the step counting capabilities of the four consumer-based PA monitors.

COSMED K4b2

The COSMED (COSMED Srl, Rome, Italy) is a portable metabolic analyzer that measures oxygen consumption (V˙O2) and carbon dioxide production (V˙CO2). The COSMED has been shown to provide valid measures of V˙O2 and V˙CO2 during both steady-state exercise and varying intensity exercise (17,21). V˙O2 data from COSMED were converted to kilocalories and served as the criterion measure of EE in this study.

Protocol

Data for this study were collected during one laboratory visit. Subjects were instructed not to consume food or caffeine or exercise 2–3 h before the study. Subject height (to the nearest 0.1 cm, using wall stadiometer), weight (to the nearest 0.1 kg, using digital scale), and waist circumference (to the nearest 0.1 cm, using spring-loaded measuring tape) were measured using standardized procedures. Height, weight, sex, and date of birth were used to initialize the consumer-based PA monitors for each subject. Subjects were then outfitted with the five PA monitors and COSMED. The FO and the FZ were placed on an elastic waistband over the left hip near the anterior axillary line and were counterbalanced in anterior-to-posterior placement order among subjects. The OM was placed over the right hip at the anterior axillary line, attached to the same waistband as the FO and FZ monitors. The placement of the FF and JU was standardized to the nondominant wrist, per manufacturer recommendation, and the two monitors were counterbalanced for distal and proximal placement on the wrist among subjects.

Subjects took part in a structured activity protocol consisting of 11 activities (three sedentary, four household, and four ambulatory/exercise) chosen by researchers from a list of 21 activities described in Table 1. Activities were counterbalanced so that sex and age categories had approximately equal participation in the activities. All subjects began by lying quietly on a bed for 10 min. All other activities were performed for 5 min each, in order of generally increasing intensity. Activities were chosen to consist of everyday tasks and were broadly grouped into sedentary, household, and ambulatory/exercise activities. All activities were performed at a self-selected intensity by the subject. Subjects chosen to perform the jogging activity had the option of participating in a brisk walk if unable to jog for 5 min.

TABLE 1:
Description of activities and average METs during individual activities.

Baseline step counts and EE were recorded before starting the protocol from the FO, FZ, FF, and JU monitors as well as baseline steps for the OM monitor. At the end of each activity, steps and EE estimates were recorded for each monitor (only steps for the OM). Trained research technicians counted and recorded steps using a hand tally counter (Fisher Scientific, Hampton, NH) during the activity protocol, which served as the criterion measure for steps taken. Data collected between two step counters were averaged. Interrater reliability between any two step counters was greater than κ = 0.97; step counters averaged an absolute difference of 1% (26 steps), ranging from 0% to 4% (0–119 steps) for any total session. A step was defined as a picking up of the heel and toe of a foot and replacing it on the ground. During cycling, steps were counted for each pedal stroke, or two steps per revolution. COSMED data were analyzed using 30-s averaging. V˙O2 in liters per minute was converted to EE (kilocalorie expenditure) using a conversion factor of 5 kcal·L−1 V˙O2 (11). EE computed from measures of V˙O2 by the COSMED served as the criterion for EE.
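The conversion of V˙O2 to criterion EE can be illustrated with a minimal sketch. It assumes 30-s averaged V˙O2 values in L·min−1 and the stated factor of 5 kcal·L−1; the function name `vo2_to_kcal` and the example values are hypothetical and are not taken from the study's analysis software.

```python
KCAL_PER_LITER_O2 = 5.0  # conversion factor stated in the text (kcal per L of O2)


def vo2_to_kcal(vo2_l_per_min, epoch_min=0.5):
    """Total EE (kcal) from consecutive epoch-averaged VO2 values (L/min).

    epoch_min is the epoch length in minutes (0.5 min = 30-s averaging).
    """
    # Each epoch contributes (L/min) * (min) * (kcal/L) kilocalories.
    return sum(v * epoch_min * KCAL_PER_LITER_O2 for v in vo2_l_per_min)


# Hypothetical example: a 5-min activity (ten 30-s epochs) at a steady 1.0 L/min.
example_epochs = [1.0] * 10
print(vo2_to_kcal(example_epochs))  # 25.0
```

Summing epoch by epoch, rather than multiplying a single mean by the duration, keeps the sketch valid when V˙O2 changes during an activity.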

Sedentary activities (lying down, watching television, writing, reading, playing cards, and computer use), household activities (standing, dusting, sweeping, vacuuming, folding laundry, making bed, picking up items from floor, and gardening), and ambulatory activities (slow overground walk, brisk overground walk, treadmill walk, overground jog, treadmill jog, and stair climbing) were analyzed as one category each. Cycling was analyzed separately from the ambulatory activity category. In a secondary analysis, ambulatory activities were analyzed separately as combined walking (slow overground walk, brisk overground walk, and treadmill walk; two activities per subject), combined jogging (overground jogging and treadmill jogging; one activity per subject), and stair climbing activities. Combined walking speed averaged 78 ± 20 m·min−1 (2.9 ± 0.7 mph) and combined jogging speed averaged 121 ± 33 m·min−1 (4.5 ± 1.2 mph).
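As a quick check on the paired units reported above, the following sketch converts speeds from m·min−1 to mph. The helper name `m_per_min_to_mph` is hypothetical; the only constant used is the standard definition of the international mile (1609.344 m).

```python
METERS_PER_MILE = 1609.344  # international mile, in meters


def m_per_min_to_mph(speed_m_per_min):
    """Convert a speed from meters per minute to miles per hour."""
    return speed_m_per_min * 60 / METERS_PER_MILE


# Mean speeds reported in the text for combined walking and combined jogging.
print(round(m_per_min_to_mph(78), 1))   # 2.9
print(round(m_per_min_to_mph(121), 1))  # 4.5
```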

Data Analyses

Activity category and separate activity analyses used the total gross EE and total steps for each separate activity category and for the combined walking, combined jogging, cycling, and stairs activities. Syncing issues between monitors and software resulted in loss of data for either EE or steps in one activity for seven subjects and one activity category in four subjects. Data from these activities were removed from related analyses; therefore, sample sizes were dependent on the available data, which was not always the full sample. Three subjects did not complete a jogging task and completed a walking task instead because of functional limitations. One subject completed the cycling activity in substitution of stairs activity because of a knee condition. COSMED data were lost on one subject for a 3.5-min segment during the jogging portion of the activity protocol; this subject’s data were removed from the activity-specific analysis, but data from a second self-paced jogging activity were used to estimate EE during the jogging activity for use in the ambulatory category.

Mean estimates of gross EE and total steps for each activity category and combined walking, combined jogging, stairs, and cycling activities were analyzed using the Friedman repeated-measures nonparametric test (due to nonnormal distributions of data), with Dunn’s test for multiple comparisons, to compare each PA monitor to the respective criterion measures. This analysis was also conducted for steps in each activity category and combined walking, combined jogging, cycling, and stairs activities with the OM as a comparative measure. Dunn’s test reports a P value adjusted for multiple comparisons; P values reported are considered significant at an alpha level of P < 0.05. In addition, mean absolute error (MAE), MAPE, and root mean square error (RMSE) were calculated to analyze individual predictive error of the PA monitors (25). Although MAE and RMSE are indicators of error in each measure’s respective units (steps and kilocalories), MAPE presents the error as a percentage of the overall mean, making it a useful indicator of the degree of the error. No standardized thresholds exist for high or low MAPE, but we considered a MAPE ≥10% as an indicator of inaccuracy. These error measures were calculated for the three activity categories and combined walking, combined jogging, stairs, and cycling activities.
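The three error measures can be sketched as follows. The exact formulas used in the study are not given in the text, so this sketch assumes the common per-subject definitions, with MAPE computed as the mean of each subject's absolute error divided by that subject's criterion value. The function name `error_metrics` and the example numbers are hypothetical.

```python
import math


def error_metrics(estimates, criterion):
    """Return MAE, RMSE, and MAPE (%) for paired monitor and criterion values.

    Assumes one value per subject and nonzero criterion values; MAPE is
    undefined when a criterion value is 0 (e.g., 0 counted steps).
    """
    if len(estimates) != len(criterion) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(c == 0 for c in criterion):
        raise ValueError("MAPE is undefined for a criterion value of 0")
    errors = [e - c for e, c in zip(estimates, criterion)]
    n = len(errors)
    mae = sum(abs(d) for d in errors) / n
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mape = 100 * sum(abs(d) / c for d, c in zip(errors, criterion)) / n
    return {"mae": mae, "rmse": rmse, "mape": mape}


# Hypothetical example: two subjects, monitor EE versus criterion EE (kcal).
print(error_metrics([110, 180], [100, 200]))
```

In this example one subject is overestimated by 10 kcal and the other underestimated by 20 kcal, so MAE is 15 kcal and MAPE is about 10%, even though the two errors partly cancel in the group mean.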

RESULTS

All 30 subjects completed the activity protocol, with a total mean measured EE of 292 ± 62 kcal and a mean of 3234 ± 389 steps taken during the activity protocol. Each of the 21 activities performed is shown in Table 1, with the average MET values measured for subjects who completed each activity. All activities had an average MET value within 1.1 METs of those predicted for these activities in the Compendium of Physical Activities (1).

EE Results

Activity categories

Mean EE with 95% confidence intervals for the COSMED and PA monitors for sedentary, household, and ambulatory activity categories are displayed in Table 2A. For the sedentary category, only the JU significantly underestimated EE by 8% (P = 0.013), although this difference was within 3 kcal of the measured value. All monitors significantly underestimated EE for the household activity category by 27%–34% (all P < 0.001), except the FF, which was not significantly different from the COSMED (P > 0.99). All PA monitors significantly overestimated EE for the ambulatory category (excluding cycling), by 16%–40% (all P = 0.013–<0.001).
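The percent over- or underestimation values reported here compare group means, which can be sketched as a signed percent difference. This is an assumed formulation (mean monitor value minus mean criterion value, divided by the mean criterion value); the function name `mean_percent_difference` and the data are hypothetical.

```python
def mean_percent_difference(estimates, criterion):
    """Signed difference between group means, as a percent of the criterion mean.

    Negative values indicate underestimation; positive values, overestimation.
    """
    mean_est = sum(estimates) / len(estimates)
    mean_crit = sum(criterion) / len(criterion)
    return 100 * (mean_est - mean_crit) / mean_crit


# Hypothetical data: a uniform 8% underestimation in both subjects.
print(mean_percent_difference([92, 92], [100, 100]))   # -8.0

# Hypothetical data: individual errors of -20% and +20% cancel in the mean.
print(mean_percent_difference([80, 120], [100, 100]))  # 0.0
```

The second case shows why a small mean difference does not imply small individual errors, which is the reason MAPE is reported alongside the mean comparisons.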

TABLE 2:
EE and step counts for activity categories.

Walking, jogging, stairs, and cycling

The ambulatory category was next split into individual activities for analysis. All monitors except the JU (P = 0.38) significantly overestimated EE for walking by 26%–61% (P = 0.005–<0.001; Fig. 1A). For jogging, all monitors except the FO (P = 0.24) overestimated EE by 25%–39% (all P < 0.001; Fig. 1B). The measured EE for the walking and jogging activities are similar because EE data for walking were combined from two walking activities, therefore resulting in 10 min of data, versus a 5-min period for jogging. All monitors significantly underestimated EE for cycling by 37%–59% (P = 0.025–<0.001; Fig. 1C). The FZ and the FF were not statistically different from the COSMED for the stairs activity, with a mean EE within 11% of the criterion measure (P > 0.99 and P = 0.13, respectively); the FO and JU significantly underestimated EE for the stairs activity by 13% and 30%, respectively (P = 0.006 and P < 0.001, respectively; Fig. 1D).

FIGURE 1:
Means and 95% confidence intervals of total EE (total kilocalories) for the COSMED, FO, FZ, FF, and JU for walking (A), jogging (B), cycling (C), and stairs (D). *Significantly different from COSMED.

Error measures for EE

For all activity categories (sedentary, household, and ambulatory), all PA monitors displayed high MAPE (defined hereafter as >10% of measured EE) for EE prediction, ranging from 13% to 35% (Table 3). All PA monitors had the lowest MAPE for the sedentary category (13%–17%). MAPE for individual activities was also high. The FO, FZ, and FF all had the lowest MAPE for the stairs activity, ranging from 11% to 14%. The JU displayed the lowest MAPE for the walking activities at 24%. All monitors displayed the highest MAPE for cycling, ranging from 43% to 57%. RMSE and MAE values were similar in magnitude for all monitors for EE predictions for all activity categories and individual activities.

TABLE 3:
RMSE, MAE, and MAPE of PA monitors for EE.

Steps Results

Activity categories

Average step counts with 95% confidence intervals for the sedentary, household, and ambulatory activity categories are displayed in Table 2B. The OM, FO, FZ, and JU monitors significantly underestimated steps by 35%–74% for the household category (all P = 0.006–<0.001). All PA monitors except the FZ were not statistically different from researcher-counted steps for the ambulatory activity category, with mean steps within 4% of researcher-counted steps. The FZ significantly overestimated steps in the ambulatory category by 2% (P < 0.001). For the sedentary activity category, hip monitors recorded 0 steps for all participants, except the FO, which recorded one step taken during the writing task in one subject. The FF correctly recorded 0 steps during sedentary activities in all but five participants, recording as many as 14 steps during a lying down activity in one subject. However, the FF averaged <2.0 steps across all subjects. The JU correctly recorded 0 steps in all but two subjects, recording as many as 11 steps during the card playing activity in one subject; the average JU steps recorded during sedentary activities was <1.0. No monitors were statistically different from researcher-counted steps for the sedentary category (all P > 0.99).

In comparison with the OM, only the FO was not statistically different for counting household activity steps (P = 0.11). The FZ, FF, and JU counted significantly more steps than the OM, ranging from 43% to 254% higher step counts (P = 0.007–<0.001); however, the OM counted the least steps in comparison with researcher-counted steps during household activity. There were no statistical differences between the OM and the FO, FF, or the JU (P = 0.10–>0.99) for the ambulatory category. The FZ counted significantly more steps in comparison with the OM for ambulatory activities, but the difference was only ~2%. There were no significant differences in steps counted by the PA monitors for the sedentary activity category (P > 0.99).

Walking, jogging, stairs, and cycling

For walking, only the FF significantly underestimated steps by 7% (P = 0.034), and the FZ overestimated steps by 1% (P < 0.001; Fig. 2A). For jogging, only the FZ significantly overestimated steps by 2% (P = 0.004); all other monitors ranged within −1% to 2% difference from researcher-counted steps (Fig. 2B). For cycling, all monitors underestimated steps by 68%–94% (P = 0.013–<0.001; Fig. 2C). For the stairs activity, no monitors were statistically different from researcher-counted steps, with mean step count differences ranging from −11% to 3% (P = 0.16–>0.99; Fig. 2D). The rate of stair climbing and descending varied considerably among subjects, contributing to the wide 95% confidence interval for the researcher-counted steps.

FIGURE 2:
Means and 95% confidence intervals of total step counts for researcher-counted steps, OM, FO, FZ, FF, and JU for walking (A), jogging (B), cycling (C), and stairs (D). *Significantly different from researcher-counted steps.

For walking, jogging, and stairs activities, all consumer-based PA monitors displayed no statistical difference from the OM except the FF during walking (P < 0.001). For cycling, only the JU was statistically different from the OM (P = 0.005), although all monitors significantly underestimated steps during cycling.

Error measures for step counts

For the household activity category, MAPE ranged from 54% to 79% (Table 4). By contrast, for the ambulatory category, all monitors displayed MAPE ≤10%, ranging from 3% to 6%. For walking, hip-worn monitors (i.e., OM, FO, and FZ) displayed MAPE of 2%–3%, and wrist-worn monitors (i.e., FF and JU) displayed larger MAPE at 8% and 11%, respectively. A similar finding was seen for jogging, with MAPE of the OM, FO, and FZ of 3% each, but the FF and JU having MAPE of 8% each. MAPE was large for the cycling activity for all monitors, ranging from 70% to 93%. For stairs, all monitors displayed large MAPE, ranging from 10% to 41%. RMSE and MAE were similar in magnitude for all monitors for each activity category and individual walking, jogging, cycling, and stairs activities.

TABLE 4:
RMSE, MAE, and MAPE of PA monitors for step counts.

DISCUSSION

A unique aspect of the present study is the assessment of consumer-based PA monitors’ accuracy for estimating EE in three activity categories (sedentary, household, and ambulatory) and four distinct ambulatory/exercise activities (walking, jogging, stairs, and cycling). Our findings indicate that consumer-based PA monitors typically were inaccurate for estimating EE in the three activity categories, although prediction error was smallest for the sedentary activity category in all four consumer-based PA monitors. The FO and the FZ were not statistically different for measured EE for the sedentary category. Only the JU statistically underestimated sedentary EE by approximately 8%. This equated to 2–3 kcal for a 20-min span of time spent sedentary in this study but could contribute to an approximate 150–200 kcal underestimation if extrapolated to a full day. Mean EE comparison does not account for individual monitor prediction errors; large MAPE (i.e., >10%) was displayed by all monitors for predicting sedentary EE, albeit the lowest MAPE for any activity category. Because movement during sedentary activities is minimal, the accuracy of the monitors for predicting EE is a reflection of manufacturers’ proprietary equations for predicting resting EE, likely a combination of height, weight, age, and sex variables input by the user of the monitor.
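The full-day extrapolation above can be reproduced with simple proportional scaling. This sketch assumes the 2–3 kcal error per 20 min of sedentary time stays constant over an entire 24-h day, which is an upper-bound simplification because few people are sedentary for a full day. The function name `extrapolate_to_day` is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def extrapolate_to_day(error_kcal, window_min, day_min=MINUTES_PER_DAY):
    """Scale an EE error observed over window_min minutes to day_min minutes."""
    return error_kcal * day_min / window_min


# Error range reported in the text: 2-3 kcal over 20 min of sedentary time.
print(extrapolate_to_day(2, 20))  # 144.0
print(extrapolate_to_day(3, 20))  # 216.0
```

The resulting range of 144–216 kcal brackets the approximate 150–200 kcal stated in the text.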

For the household activity category, all consumer-based PA monitors except the FF underestimated EE by 27%–34% and displayed the highest MAPE of any activity category. EE estimates by the FF for household activity were within 10% of the criterion measure. The FF also displayed lower MAPE than other monitors, although still high (21%). This finding suggests that wrist-worn consumer-based PA monitors may be better suited for capturing household activity than hip-worn monitors. These findings are in agreement with Hendelman et al. (13), who found that hip-worn, research-grade accelerometers do not accurately capture nonambulatory activity and upper body movement, leading to inaccuracies of EE estimation. Also, our findings are in agreement with a study by Ellis et al. (9), who found that wrist-worn, research-grade accelerometers can provide better estimates of EE during activity with significant upper body movement (e.g., household activity) than hip-mounted devices. However, the JU (also a wrist-worn monitor) was not accurate for predicting EE for household activity, with estimates similar to the FO and FZ. These findings suggest that wrist placement may decrease the error of EE estimation during household activities by consumer-based monitors, but is variable between devices.

In regard to the ambulatory activity category (excluding cycling), all consumer-based PA monitors significantly overestimated EE. EE estimation by consumer-based PA monitors for ambulatory activities was not improved when analyzed as individual activities of walking, jogging, and stairs for most monitors. Mean EE estimates for walking and jogging were also accompanied with high MAPE. These findings were unexpected, as common regression models for analyzing research-grade accelerometers have generally performed well for EE estimation during ambulatory activities (versus other activity types) because algorithms for these types of activities are often derived in studies during structured ambulatory activities (10). It is important that PA monitors provide accurate estimations of EE, especially for structured PA, as interventions targeting an increase in daily EE would target habits relating to PA-related EE, the most variable and modifiable component of daily EE (27). In summary, individuals using consumer-based PA monitors should be cautious when relying on EE estimations, especially if weight loss is a goal and eating habits are adjusted based on EE estimations of monitors.

In addition to assessing the accuracy of EE estimations, this study examined consumer-based PA monitors’ accuracy for counting steps. Overall, consumer-based PA monitors displayed smaller errors for assessing steps than EE during the activity protocol, performing best during ambulatory activities. For the ambulatory category, only the FZ monitor was statistically different from the criterion of researcher-counted steps; however, the difference was small, being only 2% over the criterion measure. Together, all monitors were within 4% of researcher-counted steps. In addition, means of all monitors were accompanied by low MAPE, ranging from 2% to 6%. These findings suggest that consumer-based PA monitors are accurate for tracking steps during periods of varying ambulatory activities. When analyzed as individual activities, most monitors retained their high predictive accuracy, but subtle differences were noted. For walking, all monitors were closer to the criterion (within 2% of researcher-counted steps) except the FF, which significantly underestimated steps by 7%, and the FZ, which significantly overestimated steps by 2%. All PA monitors except the FZ (overestimated by 2%) were accurate for counting steps for jogging activities, all being within 2% of researcher-counted steps. For the stairs activity, a higher degree of variability of criterion steps was seen, but none of the monitors were found to be statistically different. For walking and jogging, hip-worn monitors generally displayed lower MAPE (<5%) than wrist-worn monitors (8%–11%). For stairs, all monitors had MAPE ≥10%. These findings are consistent with previous literature concerning consumer-based PA monitors’ step-counting accuracy in ambulatory activity. Stackpool et al. (24) found that the Jawbone UP and Fitbit Ultra were accurate within 4% of manually counted steps for walking and jogging activities. Case et al. (3) showed that the FO and FZ displayed similar accuracy for counting steps during highly structured walking trials, being within 1% of manually counted steps. Together, these findings suggest that consumer-based PA monitors, and especially hip-based monitors, are reasonably accurate for counting steps taken during ambulatory activity.
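The distinction between a group mean that sits close to the criterion and the MAPE reported alongside it can be illustrated with a minimal sketch. It assumes the conventional definitions of signed mean percent error and mean absolute percent error; the function names and step counts below are hypothetical and are not data from this study.

```python
def mean_percent_error(estimates, criteria):
    """Signed percent error averaged across subjects (over- and underestimates cancel)."""
    total = sum((e - c) / c for e, c in zip(estimates, criteria))
    return 100 * total / len(criteria)


def mape(estimates, criteria):
    """Mean absolute percent error (each subject's error counts regardless of sign)."""
    total = sum(abs(e - c) / c for e, c in zip(estimates, criteria))
    return 100 * total / len(criteria)


# Hypothetical researcher-counted steps and monitor steps for four subjects.
criterion_steps = [500, 500, 500, 500]
monitor_steps = [520, 480, 510, 490]

print(round(mean_percent_error(monitor_steps, criterion_steps), 1))
print(round(mape(monitor_steps, criterion_steps), 1))
```

In this made-up example the signed errors cancel, so the group mean matches the criterion even though individual subjects are off by 2% to 4%; MAPE captures that individual-level error, which is why the two statistics are read together.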

Conversely, for household activities, hip-worn PA monitors underestimated step counts compared with researcher-counted steps by more than 60%. Poor performance of the FO and FZ for step counting is likely due to activities in the household category (such as sweeping, vacuuming, and making a bed) being undertaken at slow ambulation speeds, as well as “shuffling” steps, which do not involve a defined heel strike. This finding was also exhibited by the OM (underestimating steps by 74%), which has been previously shown to underestimate steps in free-living settings (22). That the FO, FZ, and OM displayed poor step measures during slow and unstructured ambulatory activity is not surprising, as hip-worn pedometers’ accuracy has been shown to be speed dependent, being inaccurate for step counting at slow structured ambulation speeds <54 m·min−1 (2.0 mph) (7) and during intermittent-stepping activities (22). This suggests that hip-worn, consumer-based PA monitors have similar traits to validated pedometers in that they have difficulty capturing steps during slow, less-structured, intermittent-stepping activities. Similar to EE prediction, wrist-worn monitors had significantly better estimates for step counting during the household category; however, the JU still significantly underestimated steps by 35%, whereas the FF estimated steps within 8% of researcher-counted steps. Overall, consumer-based PA monitors were generally inaccurate in tracking steps during less-structured household activity, with a large difference in functionality seen between hip- and wrist-worn devices.
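As a quick unit check on the 2.0 mph speed threshold cited above, the conversion to meters per minute can be sketched as follows. The function name is an illustrative assumption and does not come from the cited pedometer studies.

```python
METERS_PER_MILE = 1609.344  # international mile, exact by definition


def mph_to_m_per_min(mph):
    """Convert a speed in miles per hour to meters per minute."""
    return mph * METERS_PER_MILE / 60


print(round(mph_to_m_per_min(2.0), 1))  # 53.6
```

A walking speed of 2.0 mph therefore corresponds to roughly 54 m·min−1, or about 0.9 m·s−1.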

Wrist-worn PA monitors counted very few steps during sedentary activities. During sedentary activities performed in this study, the FF and JU estimated on average 1.8 and 0.5 steps, respectively; this error is very small, even if extrapolated to a full day of wear. Hip-based PA monitors estimated virtually no steps during sedentary activities. This is an encouraging finding, as monitors misclassifying steps during sedentary behavior would contribute falsely to total daily step counts; our results suggest that consumer-based PA monitors, when worn on the hip or nondominant wrist, count minimal steps during several common types of sedentary (or at least stationary) behaviors.

Because PA monitors were not worn in body locations that track lower-body motion well, it is not surprising that they had poor estimates of EE and steps during cycling. This is similar to research-grade accelerometers, which when worn on the hips or wrist do not track cycling well (31). Although steps are not, in reality, taken while cycling, we felt it was appropriate to count steps during cycling because cycling is a form of PA that researchers and individuals would like to track, either in terms of EE or in step counts, which could correspond to a similar volume of ambulatory activity. Our study did not utilize the alternative placement of monitors, in part because virtually all consumer-based monitors are designed for hip or wrist wear. Future studies should assess alternative placement (such as the ankle or thigh) of consumer-based PA monitors for counting steps and estimating EE during cycling, as ankle-placed, research-grade accelerometers have demonstrated accuracy for tracking cycling activity (23).

MAE and RMSE were similar for each monitor within each activity category and within the individual ambulatory and cycling activities. Mathematically, RMSE is always greater than or equal to MAE; RMSE becomes increasingly larger than MAE as error magnitudes become more variable (33). Therefore, the similarity of MAE and RMSE indicates that errors for any given activity or category were generally consistent among subjects; despite the large errors seen in some activities, the variability in individual errors within monitors was low. This finding suggests that a given monitor can be expected to be accurate or inaccurate to a similar degree across different users.
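The relationship between MAE and RMSE described above can be illustrated with a short sketch. The per-subject errors are hypothetical values chosen so that both sets share the same MAE; they are not data from this study, and the function names are illustrative.

```python
import math


def mae(errors):
    """Mean absolute error of a list of signed errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean square error of a list of signed errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical per-subject step errors (monitor minus criterion).
consistent_errors = [-50, -52, -48, -50]  # every subject is off by a similar amount
variable_errors = [-5, -150, 0, -45]      # same MAE, but errors differ widely

for label, errors in (("consistent", consistent_errors), ("variable", variable_errors)):
    print(label, round(mae(errors), 1), round(rmse(errors), 1))
```

Both sets have an MAE of 50 steps, but RMSE stays near 50 when the errors are consistent and rises to about 78 when they vary widely. RMSE close to MAE, as observed here, is therefore the signature of errors that are similar in size from subject to subject.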

A secondary aim of this study was to use the OM pedometer as a measure of comparative accuracy to the four consumer-based monitors assessed. Importantly, all four consumer-based monitors compared favorably with the OM pedometer for step assessment across sedentary, household, and ambulatory activities. Although the OM has been shown to have limitations during nonstructured ambulatory and intermittent-stepping activity, it has been shown to be superior to many other brands of pedometers as a valid measure of steps taken during ambulatory activities in a variety of populations (6,26). It is encouraging that the consumer-based PA monitors performed with similar accuracy to the OM, as it is a highly regarded tool for measuring steps, a meaningful indicator of PA. Given that the majority of PA that adults perform is in the form of walking (30), the accuracy of the consumer-based monitors for counting ambulatory steps provides support for their use in tracking PA levels in adults. As pedometers have been widely used for PA assessment and PA interventions, consumer-based PA monitors would also be useful as both assessment and intervention tools.

This study is not without limitations. Most notably, this study was not conducted in a free-living environment. Our study was designed to allow for steady-state examination of several activities so that we were able to identify specific activity categories and activities for which predictions from the activity monitors were accurate or inaccurate. These findings may not translate directly into less-structured settings, where activity intensity and time vary considerably more than that in the laboratory. Given the inconsistencies in the literature regarding the accuracy of consumer-based PA monitors across a period of time or an entire activity protocol, our choice of the laboratory setting was to understand the types of activities that the monitors can detect well by examining activities in structured, steady-state environments. Still, further study in free-living settings is needed to establish the overall validity of consumer-based PA monitors. Our sample was primarily active, being recruited from an exercise facility and college campus; this may limit our findings to more active adults and not be generalizable to less active adults or to children/adolescents. In addition, it may be that differences in the accuracy exist between age and sex groups, as it is likely that age and sex are important factors in the EE prediction equations used in these monitors; scientific equations for predicting EE typically include multiple factors, such as age and sex (18). Future studies should be carried out in activities with higher intensities (i.e., ≥10 METs), as well as sport and recreation activities. As consumer-based PA monitors increase in popularity, ever newer models and designs come into production, and these should be studied as well. Finally, this study did not account for the reliability of consumer-based PA monitors. Despite issues with monitor accuracy, a reliable monitor would allow individuals and researchers to assess changes in PA over time.

The findings of this study indicate that consumer-based PA monitors’ accuracy for tracking EE and steps is dependent on the type of activity being performed. With the possible exception of sedentary behaviors, consumer-based PA monitors do not provide accurate estimates of EE and should not be used for estimating EE. Consumer-based PA monitors provided reasonably accurate measures of steps during structured ambulatory activity, were not accurate for measuring household steps, and correctly counted virtually no steps during sedentary behavior. These traits were similar to those of a validated pedometer. Given their comparable attributes of step measurement accuracy with a validated pedometer, the FO and the FZ may be useful substitutes for pedometers in PA assessments and interventions. In conclusion, insofar as step counts are a respected indicator of PA, consumer-based PA monitors can be useful tools for PA assessment.

This work was supported by a Ball State University ASPiRE grant and the Ball State University College of Applied Sciences and Technology. The authors would like to thank Joshua Bock, Mary Tuttle, Ezra Tinkle, Alexis Sutter, Danielle Bozymski, Kelly Buss, and Meredith Patty for their assistance with data collection. Funding for publication fees and page charges was provided by the Ball State University Leroy “Bud” Getchell Graduate Student Professional Development Fund.

The authors declare no conflicts of interest. Results of the present study do not constitute endorsement by the American College of Sports Medicine.

REFERENCES

1. Ainsworth BE, Haskell WL, Herrmann SD, et al. 2011 Compendium of physical activities: a second update of codes and MET values. Med Sci Sports Exerc. 2011;43(8):1575–81.
2. Blair SN, Horton E, Leon AS, et al. Physical activity, nutrition, and chronic disease. Med Sci Sports Exerc. 1996;28(3):335–49.
3. Case MA, Burwick HA, Volpp KG, Patel MS. Accuracy of smartphone applications and wearable devices for tracking physical activity data. JAMA. 2015;313(6):625–6.
4. Caspersen CJ, Powell KE, Christenson GM. Physical activity, exercise, and physical fitness: definitions and distinctions for health-related research. Public Health Rep. 1985;100(2):126–31.
5. Chen KY, Bassett DR. The technology of accelerometry-based activity monitors: current and future. Med Sci Sports Exerc. 2005;37(11 Suppl):S490–500.
6. Connolly CP, Coe DP, Kendrick JM, Bassett DR Jr, Thompson DL. Accuracy of physical activity monitors in pregnant women. Med Sci Sports Exerc. 2011;43(6):1100–5.
7. Crouter SE, Schneider PL, Karabulut M, Bassett DR Jr. Validity of 10 electronic pedometers for measuring steps, distance, and energy cost. Med Sci Sports Exerc. 2003;35(8):1455–60.
8. Dannecker KL, Sazonova NA, Melanson EL, Sazonov ES, Browning RC. A comparison of energy expenditure estimation of several physical activity monitors. Med Sci Sports Exerc. 2013;45(11):2105–12.
9. Ellis K, Kerr J, Godbole S, Lanckriet G, Wing D, Marshall S. A random forest classifier for the prediction of energy expenditure and type of physical activity from wrist and hip accelerometers. Physiol Meas. 2014;35(11):2191–203.
10. Freedson PS, Melanson E, Sirard J. Calibration of the Computer Science and Applications, Inc accelerometer. Med Sci Sports Exerc. 1998;30(5):777–81.
11. Garber CE, Blissmer B, Deschenes MR, et al. American College of Sports Medicine Position Stand: quantity and quality of exercise for developing and maintaining cardiorespiratory, musculoskeletal, and neuromotor fitness in apparently healthy adults: guidance for prescribing exercise. Med Sci Sports Exerc. 2011;43(7):1334–59.
12. Giannakidou DM, Kambas A, Ageloussis N, et al. The validity of two Omron pedometers during treadmill walking is speed dependent. Eur J Appl Physiol. 2012;112(1):49–57.
13. Hendelman D, Miller K, Baggett C, Debold E, Freedson P. Validity of accelerometry for the assessment of moderate intensity physical activity in the field. Med Sci Sports Exerc. 2000;32(9 Suppl):S442–9.
14. LaFontaine T, Dabney S, Brownson R, Smith C. The effect of physical activity on all cause mortality compared to cardiovascular mortality: a review of research and recommendations. Mo Med. 1994;91(4):188–94.
15. Lee JM, Kim Y, Welk GJ. Validity of consumer-based physical activity monitors. Med Sci Sports Exerc. 2014;46(9):1840–8.
16. Lyons EJ, Lewis ZH, Mayrsohn BG, Rowland JL. Behavior change techniques implemented in electronic lifestyle activity monitors: a systematic content analysis. J Med Internet Res. 2014;16(8):e192.
17. McLaughlin JE, King GA, Howley ET, Bassett DR Jr, Ainsworth BE. Validation of the COSMED K4 b2 portable metabolic system. Int J Sports Med. 2001;22(4):280–4.
18. Mifflin MD, St Jeor ST, Hill LA, Scott BJ, Daugherty SA, Koh YO. A new predictive equation for resting energy expenditure in healthy individuals. Am J Clin Nutr. 1990;51(2):241–7.
19. Rowe-Roberts D, Cercos R, Mueller F. Preliminary results from a study of the impact of digital activity trackers on health risk status. Stud Health Technol Inform. 2014;204:143–8.
20. Sasaki JE, Hickey A, Mavilia M, et al. Validation of the Fitbit wireless activity tracker for prediction of energy expenditure. J Phys Act Health. 2015;12(2):149–54.
21. Schrack JA, Simonsick EM, Ferrucci L. Comparison of the Cosmed K4b(2) portable metabolic system in measuring steady-state walking energy expenditure. PLoS One. 2010;5(2):e9292.
22. Silcott NA, Bassett DR Jr, Thompson DL, Fitzhugh EC, Steeves JA. Evaluation of the Omron HJ-720ITC pedometer under free-living conditions. Med Sci Sports Exerc. 2011;43(9):1791–7.
23. Skotte J, Korshøj M, Kristiansen J, Hanisch C, Holtermann A. Detection of physical activity types using triaxial accelerometers. J Phys Act Health. 2014;11(1):76–84.
24. Stackpool C, Porcari J, Mikat R, Gillette C, Foster C. The accuracy of various activity trackers in estimating steps taken and energy expenditure. J Fitness Res. 2014;3(3):32–48.
25. Staudenmayer J, Zhu W, Catellier DJ. Statistical considerations in the analysis of accelerometry-based activity monitor data. Med Sci Sports Exerc. 2012;44(1 Suppl 1):S61–7.
26. Steeves JA, Tyo BM, Connolly CP, Gregory DA, Stark NA, Bassett DR. Validity and reliability of the Omron HJ-303 tri-axial accelerometer-based pedometer. J Phys Act Health. 2011;8(7):1014–20.
27. Strath SJ, Kaminsky LA, Ainsworth BE, et al. Guide to the assessment of physical activity: clinical and research applications: a scientific statement from the American Heart Association. Circulation. 2013;128(20):2259–79.
28. Tudor-Locke C, Bassett DR Jr. How many steps/day are enough? Preliminary pedometer indices for public health. Sports Med. 2004;34(1):1–8.
29. U.S. Department of Health and Human Services. 2008 Physical Activity Guidelines for Americans: Be Active, Healthy, and Happy! [Internet] Washington (DC): U.S. Department of Health and Human Services; 2008 [cited 2015 July 11]. Available from: http://health.gov/paguidelines/pdf/paguide.pdf.
30. Watson KB, Frederick GM, Harris CD, Carlson SA, Fulton JE. U.S. adults’ participation in specific activities: behavioral risk factor surveillance system—2011. J Phys Act Health. 2015;12(1 Suppl):S3–S10.
31. Welk GJ. Physical Activity Assessment for Health-Related Research. Champaign (IL): Human Kinetics Publishers, Inc.; 2002. pp. 125–142.
32. Whaley MH, Blair SN. Epidemiology of physical activity, physical fitness and coronary heart disease. J Cardiovasc Risk. 1995;2(4):289–95.
33. Willmott CJ, Matsuura K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res. 2005;30:79–82.
Keywords:

ACCELEROMETER; ENERGY EXPENDITURE; ACTIVITY TRACKER; PEDOMETER; FITNESS TRACKER

© 2016 American College of Sports Medicine