Design of real-time cow behavior monitoring system based on wireless sensor networks and K-means clustering algorithm



INTRODUCTION
Livestock production is an essential part of the food supply chain, and animal monitoring is a significant challenge in livestock farming. The primary purpose is to monitor the health of animals regularly. Consequently, animal welfare and product quality could be improved, leading to improved profit [1]. system was considered one of the promising solutions for cow monitoring [4][5][6][7][8][9][10]. The requirements for this kind of system are that it should be economical, perform well, and provide real-time data. High performance means the system can detect and classify cow behaviors accurately.
Due to its small size, low cost, and low power consumption, the sensor is convenient for measuring a variety of information and is widely used in modern agriculture.
In this paper, a wireless sensor node was designed to measure the collar-mounted acceleration data of a cow using an accelerometer. First, the collected data were classified into three classes based on the VeDBA (Vector of Dynamic Body Acceleration) feature using the K-means algorithm. Then, the VeDBA thresholds obtained in that step were used to classify new data. The proposed system, using wireless sensor networks, can be adapted to monitor cows in real-time; the behavior classification can be implemented on the microcontroller with a classification accuracy of about 89%. The proposed system classifies three behaviors: feeding, lying, and standing.

Construction of wireless sensor networks for livestock behavior monitoring
The proposed system uses two wireless sensor networks. The first one is a 2.4GHz
The DS1307 real-time module connects to the Arduino Mega via the I2C protocol. The pulse signal taken from the Arduino Mega makes it possible to improve the synchronization between the internal counter in the Arduino and the counter of the module, which increases timing accuracy and reduces the workload of the Arduino.
An EEPROM is connected to the Arduino via the I2C interface. Because the Arduino controls the clock pulse through the SCL pin, data written to the EEPROM are better synchronized, which keeps the stored data as stable as possible.
The power block supplies power to the entire collar device. It includes protection components that prevent upstream current and battery overcharging, which could otherwise cause a battery explosion and injury. The power block also has a charging port, so the battery can be recharged without removing the outer shell.

Classification of cow behavior using the K-means clustering algorithm
There are three behaviors of the cow in the experiment:
• Feeding: The animal is eating in the eating area.
• Lying: The animal is in a lying down position.
• Standing: The animal stands on its four legs.
The collar device (Figure 2) collected accumulated acceleration data from the accelerometer sensor. Figure 4 shows the accelerometer data samples on three axes corresponding to three behaviors: standing, lying, and feeding.
The K-means algorithm is suitable for distinguishing cow behavior data because each data point (record) in this case represents only one behavior (in other words, there is only one label for a data point). The K-means algorithm is well suited to real-time behavior recognition and is simple to implement. Moreover, the algorithm is guaranteed to converge (although possibly to a local optimum) and adapts easily to new behavior data.
The main ideas are:
• Collecting acceleration data.
• Choosing the number of clusters; 3 in this case for three behaviors.
• Finding centroids and grouping objects (acceleration data) into their closest clusters.
• Finding centroids and grouping objects based on the use of VeDBA.
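The clustering idea above can be sketched in a few lines of Python. This is a minimal illustration of the standard K-means (Lloyd) iteration, not the authors' implementation; the function name `kmeans`, the caller-supplied initial centroids, and the toy points in the usage note are assumptions made for the example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Standard K-means (Lloyd) iteration on tuples of floats.

    `centroids` holds the initial guesses; their count sets the number of clusters.
    Returns the final centroids and the cluster index of every point.
    """
    centroids = [tuple(c) for c in centroids]

    def nearest(p):
        # Index of the centroid closest to point p (Euclidean distance).
        return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))

    for _ in range(max_iter):
        # Assignment step: group every point with its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the iteration has converged
        centroids = new_centroids

    labels = [nearest(p) for p in points]
    return centroids, labels
```

For instance, calling `kmeans` on six made-up points that form three well-separated pairs, with one initial centroid placed near each pair, returns one centroid per pair and labels that group the points pairwise.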
A constant threshold A for VeDBA was calculated; if the VeDBA value is greater than threshold A, the acceleration data indicate high activity (feeding); otherwise, they indicate low activity.
Low activity behavior is lying or standing. We calculated threshold B for the static acceleration component of the y-axis (SCAY: Static Component of the Acceleration in the Y-axis). If the SCAY value is greater than threshold B, the behavior is determined to be standing; otherwise, it is lying.
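The two-threshold rule described above can be written as a short decision function. The sketch below is illustrative only: the function name and the numeric thresholds in the usage note are hypothetical, since the actual values of thresholds A and B come from the training data.

```python
def classify_by_thresholds(vedba, scay, threshold_a, threshold_b):
    """Two-stage rule: VeDBA separates high from low activity, SCAY splits low activity."""
    if vedba > threshold_a:
        return "feeding"   # high activity
    if scay > threshold_b:
        return "standing"  # low activity, large static y-axis component
    return "lying"         # low activity, small static y-axis component
```

With placeholder thresholds A = 0.2 and B = 0.6, a frame with VeDBA = 0.1 and SCAY = 0.9 would be labeled as standing.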
To calculate VeDBA, we need the DBA (Dynamic Body Acceleration). The DBA represents the energy consumed by cattle in one direction. (1)
where:
• $i = x, y, z$ represent the acceleration axes.
• $A_i$ is the static sensor value.
• $A_i^*$ is the value of acceleration data.
• $\mu_i^t$ is the average value of acceleration data and is calculated by (2). (2)
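As a rough illustration of these features, the sketch below computes VeDBA and SCAY for one window of samples. It assumes the definitions commonly used in the accelerometry literature: the static component of an axis is the mean of the raw signal over the window, the DBA of an axis is the absolute deviation from that mean, and VeDBA is the Euclidean norm of the three DBA values. The exact form of equations (1) and (2) in this paper may differ, and the function names and window handling are hypothetical.

```python
import math
from statistics import fmean


def static_component(samples):
    """Mean of one axis over the window (assumed static, gravity-related part)."""
    return fmean(samples)


def vedba_per_sample(ax, ay, az):
    """VeDBA of every sample: norm of the per-axis deviations from the window mean."""
    mx, my, mz = static_component(ax), static_component(ay), static_component(az)
    return [
        math.sqrt((x - mx) ** 2 + (y - my) ** 2 + (z - mz) ** 2)
        for x, y, z in zip(ax, ay, az)
    ]


def window_features(ax, ay, az):
    """Return (mean VeDBA, SCAY) for one window of three-axis samples."""
    return fmean(vedba_per_sample(ax, ay, az)), static_component(ay)
```

Averaging VeDBA over the window is one possible way to obtain a single value per frame; other summaries (for example, the value at the latest sample) would fit the same structure.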

Information about SCAY characteristics is described in Figure 7. With SCAY, the three behaviors show some differences in values. Figure 8 shows the relationship between SCAY and VeDBA (horizontal axis) before using K-means. Figure 9 shows the relationship between SCAY and VeDBA after using K-means to find three different clusters. The three cluster centers are defined by:
Figure 10 shows the calculation of the VeDBA feature with the test data as (5). Figure 11 shows the relationship between SCAY and VeDBA before using K-means to find three different clusters. The three cluster centers, defined by a training process, would be used to estimate three behaviors.
Figure 12. Estimated (up) and actual (down) behavior index (3 is feeding, 2 is standing, and 1 is lying).

Source: own work
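Agreement between an estimated and an actual behavior index sequence, such as the two traces in Figure 12, can be summarized as an overall accuracy. The sketch below is a generic calculation with made-up index values (3 is feeding, 2 is standing, 1 is lying); it is not the evaluation script used in this work, and the function name is an assumption.

```python
def overall_accuracy(estimated, actual):
    """Fraction of frames whose estimated behavior index equals the actual one."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(e == a for e, a in zip(estimated, actual))
    return matches / len(actual)
```

For example, if three of four frames are labeled correctly, the function returns 0.75.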
Thus, we proposed the system to classify behavior on the microcontroller in real-time as follows:
• Step 1: Collect acceleration data on three axes every second (1 frame of data obtained).
• Step 2: Calculate the VeDBA and SCAY values for the frame.
• Step 3: Calculate the distance to the 3 cluster centers (the values of the cluster centers in the space <VeDBA; SCAY> have been pre-loaded onto the microcontroller).
• Step 4: Determine the minimum distance value (closest to the center of a cluster) and label the corresponding behavior.
• Step 5: Send classification results to the gateway.
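The distance and labeling steps above amount to a nearest-centroid rule in the <VeDBA; SCAY> space. The Python sketch below illustrates that rule only; the on-device code would be written for the microcontroller, and the cluster-center values and names used here are hypothetical placeholders rather than the trained centers.

```python
import math

# Hypothetical cluster centers as (VeDBA, SCAY) pairs; real values come from training.
CENTERS = {
    "feeding": (0.30, 0.80),
    "standing": (0.05, 0.90),
    "lying": (0.05, 0.20),
}


def classify_frame(vedba, scay, centers=CENTERS):
    """Return the behavior whose cluster center is closest to the frame's features."""
    return min(centers, key=lambda label: math.dist((vedba, scay), centers[label]))
```

Because only three distances are compared per frame, the rule is cheap enough to run once per second on a small microcontroller.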

DISCUSSION AND CONCLUSIONS
Wang et al. [10] classified various cow behaviors based on the AdaBoost algorithm. The data in their work are leg-mounted acceleration data, sampled at 1 Hz. The average accuracy of their system for predictions of cow behaviors was about 86%, lower than the 89% achieved in this work. However, our system currently classifies three behaviors, fewer than the seven behaviors classified by Wang et al. [10]. If we extend the number of behaviors to classify, the overall performance might change as well.
Phung et al. [13] re-used a dataset from [10]. They proposed a system using a Gradient Boosted Decision Tree algorithm to classify seven behaviors with an overall accuracy of about 96%, higher than our result. The performance of our work was also not better than that of [3,13,14]. In return, the advantage of our work compared to [10,13] is that cow behavior can be classified on the microcontroller in real-time. The trade-off is lower classification performance. Fortunately, the difference is not significant since we reached 89% accuracy. Our results in predicting three behaviors can be applied in real-time systems such as those proposed in [18,19].
To ensure the node's long-term operation, its design should maintain a constant input voltage and reduce energy consumption as much as possible. The node should be as small as possible and easy to install to minimize interference with the animals. In terms of communication, all nodes in the network use the same frequency to transmit and receive, which causes channel contention and transmission failures, reducing the probability of successful transmission. Therefore, the capacity problem of wireless sensor network transmission with a single frequency is more prominent.
Energy management is equally important in software design. Sensor reading, data transmission, and data processing all consume energy. However, each of these stages consumes a different amount, so efficient programming is needed to reduce overall energy consumption further. Dynamic node management should also be implemented: when old nodes fail, new nodes join so that the network keeps operating in an orderly way. Additionally, wireless network reprogramming should be supported to update nodes remotely and to ensure that a single node failure does not affect the reliability of the entire network.
Sensor installation needs further improvement. Sensor nodes may be used in extremely harsh environments (such as direct sunlight, rain, water mist, animal friction damage, etc.), so packaging and protection are essential.
This paper has proposed a real-time cow monitoring system based on wireless sensor networks and the K-means clustering algorithm to monitor three cow behaviors: feeding, lying, and standing. The system uses a wireless sensor node based on an accelerometer together with the K-means algorithm and achieves a classification accuracy of about 89%. In the future, we will classify more behaviors within our system.