Simple Summary: The daily behaviors of Holstein cows, such as standing, grazing, and lying, as well as abnormal behaviors such as estrus, licking, and fighting, are closely related to their physiological health. Accurately identifying these behaviors is of great significance for monitoring the health of dairy cows. For instance, hoof disease generally causes dairy cows to lie down more, while cows in estrus exhibit mounting behavior. This study uses computer-vision-based deep learning to detect dairy cow behavior. The experimental results demonstrate that the method meets the need for accurate and rapid identification of Holstein cow behavior in real agricultural environments, which is crucial for improving the economic benefits of farms.

Abstract: Cow behavior carries important health information. The timely and accurate detection of standing, grazing, lying, estrus, licking, fighting, and other behaviors is crucial for monitoring individual cows and understanding their health status. In this study, a model called CAMLLA-YOLOv8n is proposed for Holstein cow behavior recognition. We use a hybrid data augmentation method to provide the model with rich Holstein cow behavior features, and we improve the YOLOv8n model to optimize Holstein cow behavior detection under challenging conditions. First, we integrate the Coordinate Attention mechanism into the C2f module to form the C2f-CA module, which strengthens the expression of inter-channel feature information and enables the model to identify the spatial relationships between the positions of different Holstein cows more accurately, thereby improving its sensitivity to key areas and its ability to filter out background interference. Second, the MLLAttention mechanism is introduced in the P3, P4, and P5 layers of the Neck of the model to better cope with the difficulty of recognizing Holstein cow behavior under large variations in target scale.
In addition, we modify the SPPF module to form the SPPF-GPE module, which improves small-target recognition by combining global average pooling and global maximum pooling and enhances the model's ability to capture the key parts of Holstein cow behavior in the environment. Given the limitations of traditional IoU loss in cow behavior detection, we replace CIoU loss with Shape-IoU loss, which focuses on the shape and scale of the bounding box itself, thereby improving the match between the predicted box and the ground-truth box. To verify the effectiveness of the proposed CAMLLA-YOLOv8n algorithm, we conducted experiments on a self-constructed dataset containing 23,073 Holstein cow behavior instances. The experimental results show that, compared with YOLOv3-tiny, YOLOv5n, YOLOv5s, YOLOv7-tiny, YOLOv8n, and YOLOv8s, the improved CAMLLA-YOLOv8n model increased Precision by 8.79%, 7.16%, 6.06%, 2.86%, 2.18%, and 2.69%, respectively, when detecting seven classes: grazing, standing, lying, licking, estrus, fighting, and empty bedding. Finally, although the Params and FLOPs of the CAMLLA-YOLOv8n model increased slightly compared with the YOLOv8n model, it achieved improvements of 2.18%, 1.62%, 1.84%, and 1.77% in Precision, Recall, mAP@0.5, and mAP@0.5:0.95, respectively. CAMLLA-YOLOv8n therefore meets the need for accurate and rapid identification of Holstein cow behavior in actual agricultural environments. This research is significant for improving the economic benefits of farms and promoting the transformation of animal husbandry towards digitalization and intelligence. [ABSTRACT FROM AUTHOR]
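
To make the Shape-IoU replacement described in the abstract more concrete, the following is a minimal pure-Python sketch for a single pair of boxes. It follows the general form of the published Shape-IoU formulation as we understand it: an IoU term, a center-distance term weighted by the ground-truth box's shape, and a shape-cost term. The function name `shape_iou_loss`, the `(x1, y1, x2, y2)` box format, and the defaults `scale=0.0`, `theta=4.0`, and the 0.5 weight are assumptions for illustration. The abstract does not state the settings the authors used, and this is not their implementation.

```python
import math


def shape_iou_loss(pred, gt, scale=0.0, theta=4.0, eps=1e-7):
    """Illustrative Shape-IoU-style loss for two boxes given as (x1, y1, x2, y2).

    All defaults are assumptions; they are not taken from the abstract.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU of the two boxes.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Shape weights derived from the ground-truth width and height.
    ww = 2 * gw ** scale / (gw ** scale + gh ** scale)
    hh = 2 * gh ** scale / (gw ** scale + gh ** scale)

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Shape-weighted, normalized distance between box centers.
    dx = (px1 + px2) / 2 - (gx1 + gx2) / 2
    dy = (py1 + py2) / 2 - (gy1 + gy2) / 2
    distance = (hh * dx ** 2 + ww * dy ** 2) / c2

    # Shape cost: penalizes width and height mismatch.
    omega_w = hh * abs(pw - gw) / (max(pw, gw) + eps)
    omega_h = ww * abs(ph - gh) / (max(ph, gh) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + distance + 0.5 * shape_cost


if __name__ == "__main__":
    same = shape_iou_loss((0, 0, 10, 10), (0, 0, 10, 10))
    partial = shape_iou_loss((0, 0, 10, 10), (5, 0, 15, 10))
    apart = shape_iou_loss((0, 0, 10, 10), (20, 20, 30, 30))
    print(same, partial, apart)
```

In this sketch the loss is close to zero for identical boxes and grows as the predicted box drifts away from the ground truth. With `scale` set above zero, the weights `ww` and `hh` depend on the ground-truth aspect ratio, so offsets along the box's shorter side are penalized more heavily.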