This study presents a protocol for designing and manufacturing a glasses-type wearable device that detects the patterns of food intake and other featured physical activities using load cells inserted in both hinges of the glasses.
This study presents a series of protocols for designing and manufacturing a glasses-type wearable device that detects the patterns of temporalis muscle activity during food intake and other physical activities. We fabricated a 3D-printed frame for the glasses and a load cell-integrated printed circuit board (PCB) module that is inserted in both hinges of the frame. The module acquires the force signals and transmits them wirelessly. These procedures give the system higher mobility, which can be evaluated under practical wearing conditions such as walking and waggling. The classification performance is also evaluated by distinguishing the patterns of food intake from those of the other physical activities. A series of algorithms was used to preprocess the signals, generate feature vectors, and recognize the patterns of several featured activities (chewing and winking) and other physical activities (sedentary rest, talking, and walking). The results showed that the average F1 score of the classification among the featured activities was 91.4%. We believe this approach can be useful for automatic and objective monitoring of ingestive behaviors with higher accuracy, as a practical means of treating ingestive problems.
Continuous and objective monitoring of food intake is essential for maintaining energy balance in the human body, as excessive energy accumulation may cause overweightness and obesity1, which could result in various medical complications2. The main factors in the energy imbalance are known to be both excessive food intake and insufficient physical activity3. Various studies on the monitoring of daily energy expenditure have been introduced with automatic and objective measurement of physical activity patterns through wearable devices4,5,6, even at the end-consumer level and medical stage7. Research on the monitoring of food intake, however, is still in the laboratory setting, since it is difficult to detect the food intake activity in a direct and objective manner. Here, we aim to present a device design and its evaluation for monitoring the food intake and physical activity patterns at a practical level in daily life.
There have been various indirect approaches to monitoring food intake, based on chewing and swallowing sounds8,9,10, movement of the wrist11,12,13, image analysis14, and electromyogram (EMG)15. However, these approaches were difficult to apply in daily life because of their inherent limitations: the methods using sound were vulnerable to environmental noise; the methods using wrist movement had difficulty distinguishing eating from other physical activities performed when not consuming food; and the methods using images and EMG signals were restricted by limits on movement and by the environment. These studies showed that food intake can be detected automatically with sensors, but their practical applicability to everyday life beyond laboratory settings remained limited.
In this study, we used the patterns of temporalis muscle activity for automatic and objective monitoring of food intake. In general, the temporalis muscle, as one of the masticatory muscles, repeatedly contracts and relaxes during food intake16,17; thus, food intake can be monitored by detecting the periodic patterns of temporalis muscle activity. Recently, several studies have utilized temporalis muscle activity18,19,20,21, using EMG or piezoelectric strain sensors attached directly to the skin. These approaches, however, were sensitive to the skin location of the EMG electrodes or strain sensors, and the sensors were easily detached from the skin by physical movement or perspiration. Therefore, in our previous study22, we proposed a new and effective method using a pair of glasses that senses temporalis muscle activity through two load cells inserted in both hinges. This method showed great potential for detecting food intake with high accuracy without touching the skin. It was also unobtrusive and non-intrusive, since we used a common glasses-type device.
In this study, we present a series of detailed protocols of how to implement the glasses-type device and how to use the patterns of temporalis muscle activity for monitoring the food intake and physical activity. The protocols include the process of hardware design and fabrication that consists of a 3D-printed frame of the glasses, a circuit module, and a data acquisition module, and include the software algorithms for data processing and analysis. We furthermore examined the classification among several featured activities (e.g., chewing, walking, and winking) to demonstrate the potential as a practical system that can tell a minute difference between the food intake and other physical activity patterns.
NOTE: All the procedures involving human subjects were carried out in a non-invasive manner, by simply wearing a pair of glasses. All the data were acquired by measuring the force signals from load cells inserted in the glasses, which were not in direct contact with the skin. The data were wirelessly transmitted to the data recording module, which in this case was a smartphone designated for the study. None of the protocols involved in vivo/in vitro human studies. No drugs or blood samples were used for the experiments. Informed consent was obtained from all subjects of the experiments.
1. Manufacturing of a Sensor-integrated Circuit Module
2. 3D Printing of a Frame of the Glasses
3. Assembly of All Parts of the Glasses
4. Development of a Data Acquisition System
NOTE: The data acquisition system is composed of a data transmitting module and a data receiving module. The data transmitting module reads the time and the force signals of both sides, and then sends them to the data receiving module, which gathers the received data and writes them to .tsv files.
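As an illustration of the receiving side described in the note above, the following minimal sketch writes time-stamped left and right force samples to tab-separated text. It is not the authors' implementation: the function name, the column names, and the sample values are hypothetical, and the actual receiver in the study ran on a smartphone.

```python
import csv
import io


def write_samples_tsv(stream, samples):
    """Write (time_ms, left, right) samples as tab-separated rows.

    The column names below are hypothetical; the source only states that
    the time and the force signals of both sides are written to .tsv files.
    """
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerow(["time_ms", "force_left", "force_right"])
    for time_ms, left, right in samples:
        writer.writerow([time_ms, left, right])


# Made-up readings, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_samples_tsv(buffer, [(0, 512, 498), (5, 515, 497)])
tsv_text = buffer.getvalue()
```

Passing an open file object (for example `open("block01.tsv", "w", newline="")`, a hypothetical file name) instead of the buffer would produce a file on disk.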
5. Data Collection from a User Study
NOTE: This study collected six featured activity sets: sedentary rest (SR), sedentary chewing (SC), walking (W), chewing while walking (CW), sedentary talking (ST), and sedentary wink (SW).
6. Signal Preprocessing and Segmentation
NOTE: The left and right signals are calculated separately in the following procedures.
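The segmentation in this section can be sketched as follows, using the 2 s frame and 1 s hop given in the caption of Figure 6. This is a hedged illustration in Python rather than the MATLAB code used in the study; the sampling rate, the function name, and the synthetic signals are assumptions made only for the example.

```python
def split_into_frames(signal, sampling_rate_hz, frame_s=2.0, hop_s=1.0):
    """Cut one channel into overlapping frames (2 s long, hopped by 1 s)."""
    frame_len = int(round(frame_s * sampling_rate_hz))
    hop_len = int(round(hop_s * sampling_rate_hz))
    frames = []
    start = 0
    # Keep only complete frames; a trailing partial frame is dropped.
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop_len
    return frames


SAMPLING_RATE_HZ = 100  # assumed value, not stated in this section

# Synthetic 5 s recordings; left and right are segmented separately.
left_signal = [float(i) for i in range(5 * SAMPLING_RATE_HZ)]
right_signal = [-x for x in left_signal]

left_frames = split_into_frames(left_signal, SAMPLING_RATE_HZ)
right_frames = split_into_frames(right_signal, SAMPLING_RATE_HZ)
```

With these settings, consecutive frames overlap by half their length, so every sample (except at the edges of a recording block) contributes to two frames.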
7. Generation of Feature Vectors
NOTE: A feature vector is generated per frame produced in section 6 of the protocol. The left and right frames are calculated separately and combined into a feature vector in the following procedures. All the procedures were implemented in MATLAB.
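To make the left/right combination concrete, the sketch below computes the first six features of Table 1 (standard deviation, coefficient of variation, and zero crossing rate for each side) and interleaves them in the L, R order of the table. It is written in Python for illustration, whereas the study used MATLAB. The exact feature definitions are assumptions: population standard deviation, zero crossings counted on the mean-removed frame, and a coefficient of variation of 0 when the mean is 0.

```python
import statistics


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose mean-removed sign differs."""
    mean = statistics.mean(frame)
    centered = [x - mean for x in frame]
    crossings = sum(
        1 for a, b in zip(centered, centered[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (len(frame) - 1)


def coefficient_of_variation(frame):
    """Standard deviation divided by the mean (0.0 if the mean is zero)."""
    mean = statistics.mean(frame)
    return statistics.pstdev(frame) / mean if mean != 0 else 0.0


def feature_vector(left_frame, right_frame):
    """Interleave per-side features as L, R, L, R, ... like Table 1."""
    features = []
    for func in (statistics.pstdev, coefficient_of_variation, zero_crossing_rate):
        features.append(func(left_frame))
        features.append(func(right_frame))
    return features


# Tiny made-up frames, only to show the shape of the result.
vector = feature_vector([1.0, 3.0, 1.0, 3.0], [2.0, 2.0, 4.0, 4.0])
```

The remaining temporal and spectral features would be appended in the same way to reach the 84 entries described in Table 1 and Table 2.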
8. Classification of the Activities into Classes
NOTE: This step selects the support vector machine (SVM)23 classifier model by determining the parameters that give the best accuracy for the given problem (i.e., the feature vectors). The SVM is a well-known supervised machine learning technique that shows excellent generalization and robustness by using a maximum margin between the classes and a kernel function. We used a grid search with cross-validation to determine the penalty parameter C and the kernel parameter γ of the radial basis function (RBF) kernel. A basic understanding of machine learning techniques and the SVM is required to perform the following procedures. Reference materials23,24,25 are recommended for a better understanding of machine learning techniques and the SVM algorithm. All the procedures in this section were implemented using the LibSVM25 software package.
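The grid search over (C, γ) can be outlined as below. This is a sketch of the search loop only, not the authors' LibSVM workflow: `score_fn` is a placeholder for a routine that would return the cross-validated accuracy of an RBF-kernel SVM, `toy_score` is a made-up stand-in so the example runs without an SVM library, and the exponent ranges are assumptions (the study selected its range heuristically).

```python
import itertools
import math


def grid_search(score_fn, c_exponents, gamma_exponents):
    """Try every (C, gamma) = (2^i, 2^j) pair and keep the best score."""
    best = None
    for c_exp, g_exp in itertools.product(c_exponents, gamma_exponents):
        c, gamma = 2.0 ** c_exp, 2.0 ** g_exp
        accuracy = score_fn(c, gamma)
        if best is None or accuracy > best[0]:
            best = (accuracy, c, gamma)
    return best


def toy_score(c, gamma):
    """Made-up score surface with an arbitrary peak at (2^3, 2^-1).

    In practice this would train and cross-validate an SVM and return
    the mean validation accuracy for the given C and gamma.
    """
    return 1.0 / (1.0 + abs(math.log2(c) - 3) + abs(math.log2(gamma) + 1))


# Assumed exponent grids; both axes grow exponentially in powers of two.
best_accuracy, best_c, best_gamma = grid_search(
    toy_score, range(-5, 16, 2), range(-15, 4, 2)
)
```

Searching on an exponential grid keeps the number of trained models small while still covering several orders of magnitude for both parameters; a finer grid around the best pair can follow if needed.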
Through the procedures outlined in the protocol, we prepared two versions of the 3D-printed frame by varying the length of the head piece, LH (133 and 138 mm), and of the temples, LT (110 and 125 mm), as shown in Figure 4. This allowed us to cover several wearing conditions, which vary with the subjects' head size, shape, etc. Each subject chose the frame that fit their head for the user study. The vertical distance, Lh, between the hinge joint and the hole for the support bolt was set to 7.5 mm so that the amplified force would not exceed 15 N, which is the linear operating range of the load cell. Finally, the head piece should have a thickness, tH, that can resist the bending moment transmitted from both support bolts when the glasses are worn. We heuristically chose a tH of 6 mm, using carbon fiber material. The contact points can be adjusted through the support bolts to fine-tune the tightness of the glasses, as shown in Figure 5.
Table 3 shows the representative results of the classification for all the activity sets. The average F1 score was 80.5%. Considered as a single score, the performance may seem degraded compared to the result of our previous study22. We can, however, extract significant information by comparing the outcomes for the individual activities. The SR was relatively well distinguished from the SC, CW, and SW, but not from the W and ST. The two chewing activities, SC and CW, were difficult to distinguish from each other. On the other hand, both chewing activities were easily distinguished from the SR, W, ST, and SW, which represent the other physical activities. The SW was misclassified only slightly, with its errors spread across all the other activities.
From the results in Table 3, we can observe in-depth details of the classification. First, the two chewing activities, SC and CW, were clearly distinguished from the other activities. In particular, their distinction from the walking activity suggests that food intake, which is the main target of this study, can be easily separated from active physical activity, such as walking, using our system. As shown in Figure 6, the chewing and wink signals, which are driven by temporalis muscle activity, were significantly different from the signals that are not driven by temporalis muscle activity. On the other hand, the distinction between the two chewing activities showed relatively high misclassification. These errors played a dominant role in lowering both the precision and the recall of the chewing activities.
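The per-activity precision, recall, and F1 scores discussed above can be recomputed directly from the counts in Table 3. The sketch below does so in Python; the matrix is copied from Table 3 (rows are predicted activities, columns are actual activities), while the function and variable names are illustrative only.

```python
LABELS = ["SR", "SC", "W", "CW", "ST", "SW"]

# Counts from Table 3: rows = predicted activity, columns = actual activity.
CONFUSION = [
    [1222, 18, 79, 6, 168, 75],
    [10, 1268, 17, 159, 46, 15],
    [55, 19, 1212, 32, 144, 20],
    [3, 158, 34, 1327, 28, 12],
    [192, 75, 185, 19, 1117, 55],
    [78, 22, 33, 17, 57, 1383],
]


def per_class_scores(matrix):
    """Return a (precision, recall, F1) tuple for each class."""
    scores = []
    for k in range(len(matrix)):
        true_positive = matrix[k][k]
        predicted_total = sum(matrix[k])              # row sum
        actual_total = sum(row[k] for row in matrix)  # column sum
        precision = true_positive / predicted_total
        recall = true_positive / actual_total
        f1 = 2 * precision * recall / (precision + recall)
        scores.append((precision, recall, f1))
    return scores


scores = per_class_scores(CONFUSION)
average_f1 = sum(f1 for _, _, f1 in scores) / len(scores)
accuracy = sum(CONFUSION[k][k] for k in range(len(CONFUSION))) / sum(
    sum(row) for row in CONFUSION
)
```

Rounded to one decimal place, `average_f1` reproduces the 80.5% in Table 3 and `accuracy` reproduces the 80.4% listed in the Recall row.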
In terms of chewing detection, the SR, W, and ST can be regarded as unintended noise in daily life. The wink activity, on the other hand, can be considered a meaningful measurement, because it is also driven by temporalis muscle activity. Based on this, the two chewing activities were merged into a single chewing activity (CH), and the other activities except for the wink were grouped into a physical activity (PA). Table 4 shows the classification results for these activities: chewing (CH), physical activity (PA), and sedentary wink (SW). These results are more informative. They indicate whether the system can robustly detect food intake without being affected by other physical activities, and whether it can distinguish food intake from other facial activity such as winking. The results show that the chewing activity can be well distinguished from the other activities, with a high F1 score of 93.4%. In the case of the wink, the recall (85.5%) was slightly lower than that of the other activities. This suggests that the quality of the collected wink data was likely to be low, as the users had to wink at the exact time in 3 s intervals. In fact, it was observed during the user study that the users occasionally missed a wink or that the glasses shifted down.
In order to obtain more meaningful results from the above, we grouped and re-defined the activities into new ones. The two chewing activities, SC and CW, were grouped into one activity, defined as chewing. The SR, W, and ST, which had a large degree of misclassification among themselves, were also grouped into one activity, defined as physical activity. As a result, we obtained new representative results by re-performing the classification on the activities defined as chewing (CH), physical activity (PA), and sedentary wink (SW), as shown in Table 4. The results showed a high prediction score, with an average F1 score of 91.4%.
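The regrouping described here amounts to relabeling each frame before the classifier is re-trained. A minimal sketch of that relabeling is shown below, assuming the labels are stored as short strings; the mapping follows the text, the per-activity count of 1560 frames is taken from the column totals of Table 3, and the names are illustrative.

```python
from collections import Counter

# Mapping from the six recorded activity sets to the three re-defined ones.
GROUPS = {
    "SC": "CH", "CW": "CH",               # chewing
    "SR": "PA", "W": "PA", "ST": "PA",    # physical activity
    "SW": "SW",                           # sedentary wink (unchanged)
}


def regroup(labels):
    """Replace each original activity label with its re-defined group."""
    return [GROUPS[label] for label in labels]


# 1560 frames per original activity, as in the column totals of Table 3.
original_labels = ["SR", "SC", "W", "CW", "ST", "SW"] * 1560
group_counts = Counter(regroup(original_labels))
```

The resulting counts (3120 CH, 4680 PA, 1560 SW) match the column totals of Table 4, which also shows that the re-defined classes are no longer balanced.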
Figure 1: Schematic diagrams of both left and right circuits. (A) Schematic diagram of the left circuit. It contains a battery to supply power to both the left and right circuits. A 3.3 V voltage regulator with a bypass capacitor was provided to supply a stable operating voltage to the system. The load cells presented here were inserted into both sides of the circuit. (B) Schematic diagram of the right circuit. It contains a microcontroller unit (MCU) with Wi-Fi capability. A two-channel multiplexer was provided to process the two force signals from both sides with one analog-to-digital converter (ADC) of the MCU. A universal asynchronous receiver/transmitter (UART) connector was used to flash the MCU.
Figure 2: PCB artworks of both left and right circuits. (A) An artwork of the left PCB. All electronic components are displayed as actual measurements in mm. (B) An artwork of the right PCB.
Figure 3: Representative results of PCBs soldered with all components. (A) The left circuit module. The load cell was integrated into the board. It contains a 2-pin connector for the battery and a 3-pin connector to connect to the right board. (B) The right circuit module. The load cell was also integrated into the board. It contains a 4-pin connector for the flashing mode of the MCU, and a 3-pin connector to connect to the left circuit.
Figure 4: A 3D model design of the frame of the glasses. (A) The design of the head piece. The upper figure shows a front view, and the lower figure shows a top view of the head piece. The length of the head piece, LH, is a design parameter used to cover the various head sizes of subjects. We 3D printed two versions of the head piece by varying it. The thickness of the head piece, tH, was chosen heuristically. The distance between a hinge joint and a hole for a support bolt, Lh, was set from the mechanical amplification factor. (B) The design of the temples. The upper figure shows the left temple, and the lower figure shows the right temple. The PCBs in Figure 3 were inserted into slots and a battery was mounted to a battery holder.
Figure 5: A representative result of the PCB-integrated glasses. The PCBs were inserted into the slots with bolts. The nose pads and the tips of the temples were covered with rubber tape to add friction with the skin. When the glasses are worn, the load cells are pressed by the support bolts on both sides. The tightness of the glasses can be fine-tuned by loosening or tightening the support bolts.
Figure 6: Temporal signals in a recording block of a user for all activities. The y-axis represents the measured force, from which the median of the recording block was subtracted for visualization purposes. The maximum amplitudes of the chewing activities are larger than those of the other activities. The left and right signals of the wink activity are inverted. The figure shows an example of the left wink. A 2 s frame was used to define a feature vector by hopping the signals at 1 s intervals.
Figure 7: Representative results of finding the local maximum accuracy through various pairs of (C, γ). (A) A contour plot of the cross-validated accuracies for all the activities defined in Table 3. Each axis increases exponentially, and the range was heuristically selected. The local maximum accuracy of 80.4% occurred at (C, γ) = (2^5, 2^0). (B) A contour plot of the cross-validated accuracies for the re-defined activities in Table 4. The maximum accuracy of 92.3% occurred at (C, γ) = (2^5, 2^0), and was much higher than the result in (A).
No. | Feature description | No. | Feature description | ||
1 | Standard deviation L | 28 | Skewness R | ||
2 | Standard deviation R | 29 | Kurtosis L | ||
3 | Coefficient of variation L | 30 | Kurtosis R | ||
4 | Coefficient of variation R | 31 | Autocorrelation function coefficients L | ||
5 | Zero crossing rate L | 32 | Autocorrelation function coefficients R | ||
6 | Zero crossing rate R | 33 | Signal energy L | ||
7 | 20th percentile L | 34 | Signal energy R | ||
8 | 20th percentile R | 35 | Log signal energy L | ||
9 | 50th percentile L | 36 | Log signal energy R | ||
10 | 50th percentile R | 37 | Entropy of energy L | ||
11 | 80th percentile L | 38 | Entropy of energy R | ||
12 | 80th percentile R | 39 | Peak-to-peak amplitude L | ||
13 | Interquartile range L | 40 | Peak-to-peak amplitude R | ||
14 | Interquartile range R | 41 | The number of peaks L | ||
15 | Square sum of 20th percentile L | 42 | The number of peaks R | ||
16 | Square sum of 20th percentile R | 43 | Mean of time between peaks L | ||
17 | Square sum of 50th percentile L | 44 | Mean of time between peaks R | ||
18 | Square sum of 50th percentile R | 45 | Std. of time between peaks L | ||
19 | Square sum of 80th percentile L | 46 | Std. of time between peaks R | ||
20 | Square sum of 80th percentile R | 47 | Prediction ratio L | ||
21 | 1st bin of binned distribution L | 48 | Prediction ratio R | ||
22 | 1st bin of binned distribution R | 49 | Harmonic ratio L | ||
23 | 2nd bin of binned distribution L | 50 | Harmonic ratio R | ||
24 | 2nd bin of binned distribution R | 51 | Fundamental frequency L | ||
25 | 3rd bin of binned distribution L | 52 | Fundamental frequency R | ||
26 | 3rd bin of binned distribution R | 53 | Correlation coefficient of L and R | ||
27 | Skewness L | 54 | Signal magnitude area of L and R |
Table 1: Extracted statistical features of a temporal frame. A total of 54 features were extracted. The left and right signals were calculated separately except for the correlation features, 53 and 54.
No. | Feature description | No. | Feature description | ||
1 | Spectral energy L | 16 | Spectral spread R | ||
2 | Spectral energy R | 17 | Spectral entropy L | ||
3 | Spectral zone 1 of energy L | 18 | Spectral entropy R | ||
4 | Spectral zone 1 of energy R | 19 | Spectral entropy of energy L | ||
5 | Spectral zone 2 of energy L | 20 | Spectral entropy of energy R | ||
6 | Spectral zone 2 of energy R | 21 | Spectral flux L | ||
7 | Spectral zone 3 of energy L | 22 | Spectral flux R | ||
8 | Spectral zone 3 of energy R | 23 | Spectral rolloff L | ||
9 | Spectral zone 4 of energy L | 24 | Spectral rolloff R | ||
10 | Spectral zone 4 of energy R | 25 | Maximum spectral crest L | ||
11 | Spectral zone 5 of energy L | 26 | Maximum spectral crest R | ||
12 | Spectral zone 5 of energy R | 27 | Spectral skewness L | ||
13 | Spectral centroid L | 28 | Spectral skewness R | ||
14 | Spectral centroid R | 29 | Spectral kurtosis L | ||
15 | Spectral spread L | 30 | Spectral kurtosis R |
Table 2: Extracted statistical features of a spectral frame. A total of 30 features were extracted. The left and right signals were calculated separately. From the features in Table 1 and Table 2, a feature vector consists of a total of 84 features.
Predicted \ Actual | SR | SC | W | CW | ST | SW | Total | Precision
SR | 1222 | 18 | 79 | 6 | 168 | 75 | 1568 | 77.9%
SC | 10 | 1268 | 17 | 159 | 46 | 15 | 1515 | 83.7%
W | 55 | 19 | 1212 | 32 | 144 | 20 | 1482 | 81.8%
CW | 3 | 158 | 34 | 1327 | 28 | 12 | 1562 | 85.0%
ST | 192 | 75 | 185 | 19 | 1117 | 55 | 1643 | 68.0%
SW | 78 | 22 | 33 | 17 | 57 | 1383 | 1590 | 87.0%
Total | 1560 | 1560 | 1560 | 1560 | 1560 | 1560 | 9360 |
Recall | 78.3% | 81.3% | 77.7% | 85.1% | 71.6% | 88.7% | 80.4% |
F1 score | 78.1% | 82.5% | 79.7% | 85.0% | 69.7% | 87.8% | |
Average F1 score | 80.5%
Table 3: Confusion matrix of all the activities when (C, γ) = (2^5, 2^0) in Figure 7A. Rows are predicted activities and columns are actual activities. This matrix shows all the prediction results for all activities: SR: sedentary rest, SC: sedentary chewing, W: walking, CW: chewing while walking, ST: sedentary talking, SW: sedentary wink.
Predicted \ Actual | CH | PA | SW | Total | Precision
CH | 2898 | 162 | 26 | 3086 | 93.9%
PA | 201 | 4404 | 200 | 4805 | 91.7%
SW | 21 | 114 | 1334 | 1469 | 90.8%
Total | 3120 | 4680 | 1560 | 9360 |
Recall | 92.9% | 94.1% | 85.5% | 92.3% |
F1 score | 93.4% | 92.9% | 88.1% | |
Average F1 score | 91.4%
Table 4: Confusion matrix of all the re-defined activities when (C, γ) = (2^5, 2^0) in Figure 7B. Rows are predicted activities and columns are actual activities. This matrix shows all the prediction results for all re-defined activities: CH: chewing, PA: physical activity, SW: sedentary wink.
In this study, we first proposed the design and manufacturing process of glasses that sense the patterns of food intake and physical activities. As this study mainly focused on the data analysis to distinguish food intake from the other physical activities (such as walking and winking), the sensor and data acquisition system had to support mobile recording. Thus, the system included the sensors, the MCU with wireless communication capability, and the battery. The proposed protocol provides a novel and practical way to measure the patterns of temporalis muscle activity caused by food intake and winking in a non-contact manner: it describes the tools and methodologies for easily detecting food intake in daily life without any cumbersome equipment.
There are important considerations for the procedure of manufacturing the glasses. The temple parts should be designed to integrate the PCB modules fabricated in step 1.2 as shown in Figure 4B and Figure 4C. The load cell should be placed so that it is pressed by a support bolt at a support plate of the head piece when the glasses are worn, as illustrated in the top view of the hinge part in Figure 5. In step 2.4, the degree of bending of the temples does not need to be exact, as the purpose of the curvature is to improve the form factor so that the glasses fit better on a subject's head. Be careful, however, as excessive bending will prevent the temples from touching the temporalis muscle, which would make it impossible to collect significant patterns.
To obtain reliable data reflecting the different head sizes and shapes of subjects, two versions of the glasses were provided by varying the length of the head piece and the temples. In addition, by using the support bolts to fine-tune the wearability, we could adjust the tightness of the glasses. Thus, the data collected through the various glasses, subjects, and wearing conditions can reflect intra- and inter-individual variability and different form factors.
In the user study, the subject took off the glasses during the break, and wore them again when the recording block restarted. This action prevented the data from overfitting to a specific wearing condition because it changed the wearing conditions (e.g., left-and-right balance, preload on the load cells, contact area with the skin, etc.) every time the subject re-wore the glasses.
According to an earlier study of chewing frequency, chewing mainly ranges from 0.94 Hz (5th percentile) to 2.17 Hz (95th percentile)26. Thus, we set the frame size to 2 s so that a frame contains multiple chewing cycles. This frame size is also suitable for containing one or more walking cycles, which generally range from 1.4 Hz to 2.5 Hz27. We conducted the walking activity at a speed of 4.5 km/h on a treadmill because normal walking speed varies from 3.3 km/h to 6.5 km/h27,28. The hop size in Figure 6 was determined from the recorded wink data, for which subjects were instructed to wink at 3 s intervals. We also filtered the data with a cutoff frequency of 10 Hz, because we found in our previous study that signals over 10 Hz carried no significant information for chewing detection22.
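The reasoning behind the 2 s frame can be checked with simple arithmetic: the number of cycles in a frame is the activity frequency multiplied by the frame length. The short sketch below applies this to the frequency ranges quoted above; the function name is illustrative.

```python
FRAME_S = 2.0  # frame size used in the study, in seconds


def cycles_per_frame(frequency_hz, frame_s=FRAME_S):
    """Number of periodic cycles that fit in one frame."""
    return frequency_hz * frame_s


# Chewing: 0.94 Hz (5th percentile) to 2.17 Hz (95th percentile).
chewing_cycles = [cycles_per_frame(f) for f in (0.94, 2.17)]

# Walking: 1.4 Hz to 2.5 Hz.
walking_cycles = [cycles_per_frame(f) for f in (1.4, 2.5)]
```

A 2 s frame therefore holds roughly 1.9 to 4.3 chewing cycles and 2.8 to 5 walking cycles, and the 10 Hz cutoff lies well above both frequency ranges.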
Because the system has load cells on both sides, it is possible to distinguish the left and right events of chewing and winking, as demonstrated in our previous study22. However, unlike the previous study, the aim of this study was to demonstrate that the system could effectively separate food intake from the physical activities. If sufficient data are accumulated through the user study, then further research on the left and right classification can be conducted, utilizing the correlation features included in the feature vector. On the other hand, it is difficult to distinguish between sedentary activity and walking with the current system. Further modifications to the system could provide detailed classification of food intake, such as eating while sitting and eating on the move, with high accuracy. This can be implemented through a sensor fusion technique by adding an inertial measurement unit (IMU) to the system18. The system could then track energy expenditure and energy intake simultaneously. We believe that our approach provides a practical and promising way to detect food intake and physical activities.
Estimation of energy intake is a crucial goal of research on dietary monitoring; for example, it can be estimated by classifying the type of food and then converting it into calories using predefined caloric information. A recent study suggested a method of classifying food types using food images and deep learning algorithms14. It is difficult to separate food types with the force sensors used in this study; however, adding an image sensor to the front of the device could allow food types to be recognized and classified through image processing and machine learning techniques. Through this sensor fusion of force and image sensors, the approach in this study could be extended toward general dietary monitoring applications.
The authors have nothing to disclose.
This work was supported by Envisible, Inc. This study was also supported by a grant of the Korean Health Technology R&D Project, Ministry of Health & Welfare, Republic of Korea (HI15C1027). This research was also supported by the National Research Foundation of Korea (NRF-2016R1A1A1A05005348).
FSS1500NSB | Honeywell, USA | Load cell | |
INA125U | Texas Instruments, USA | Amplifier | |
ESP-07 | Shenzhen Anxinke Technology, China | MCU with Wi-Fi module | |
74LVC1G3157 | Nexperia, The Netherlands | Multiplexer | |
MP701435P | Maxpower, China | LiPo battery | |
U1V10F3 | Pololu, USA | Voltage regulator | |
Ultimaker 2 | Ultimaker, The Netherlands | 3D printer | |
ColorFabb XT-CF20 | ColorFabb, The Netherlands | Carbon fiber filament | |
iPhone 6s Plus | Apple, USA | Data acquisition device | |
Jelly Belly | Jelly Belly Candy Company, USA | Food texture for user study |