Motion perception is crucial in competitive sports such as dance, basketball, and diving. However, evaluation in these sports relies heavily on professionals, which poses two main challenges: subjective assessments are uncertain and vary with the evaluator's experience, making timeliness and accuracy hard to guarantee, and multi-expert voting increases labor costs. Although video analysis methods have relieved some of this pressure, challenges remain in extracting key points and keyframes from videos and in constructing a suitable, quantifiable evaluation method that matches the static–dynamic nature of movement. This study therefore proposes an intelligent evaluation method aimed at improving the accuracy and processing speed of complex video analysis tasks. First, a keyframe extraction method based on musical beat detection is constructed; drawing on prior knowledge, the beat detection is optimized with a perceptually weighted window so that the extracted keyframes are highly correlated with changes in dance movement. Second, OpenPose is employed to detect human joint points in the keyframes, quantifying human movements as a series of numerically expressed nodes and their relationships (i.e., pose descriptions). Combined with the positions of the keyframes in the time sequence, these form a standard pose description sequence that serves as the foundational data for subsequent quantitative evaluation. Finally, an Action Sequence Evaluation method (ASCS) is established based on all action features within a single action frame to assess the overall performance of individual actions. In addition, drawing inspiration from the ROUGE-L evaluation method in natural language processing, a Similarity Measure Approach based on Contextual Relationships (SMACR) is constructed to evaluate the coherence of actions.
By integrating ASCS and SMACR, dancers are evaluated comprehensively along both the static and dynamic dimensions. For validation, 12 representative samples were selected from the popular dance game Just Dance and classified according to the complexity of the dance moves and the level of physical exertion. The experimental results show that the automated evaluation method not only assesses dance movements precisely at the level of individual keyframes but also, through SMACR, significantly improves the evaluation of action coherence and completeness. Across all 12 test samples, the method selects 2 to 5 keyframes per second from the videos, reducing the computational load to 4.1–10.3% of that of traditional full-frame matching methods, while the overall evaluation accuracy decreases by only 3%, demonstrating that the method combines efficiency with precision. Through precise musical beat alignment, efficient keyframe extraction, and intelligent dance motion analysis, this study reduces the subjectivity and inefficiency of traditional manual evaluation and improves the scientific rigor and accuracy of assessments. It provides tool support for fields such as dance education and competition evaluation and has broad application prospects.
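
The ROUGE-L idea that SMACR borrows can be illustrated with a minimal sketch. The abstract does not give the SMACR formula, so the code below is only the standard ROUGE-L F-measure (longest common subsequence divided by sequence lengths) applied to two sequences of pose labels. The names `lcs_length` and `rouge_l_score`, the `match` predicate, and the use of string labels in place of real pose descriptions are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def lcs_length(ref: Sequence[T], cand: Sequence[T],
               match: Callable[[T, T], bool]) -> int:
    """Length of the longest in-order subsequence of matching elements."""
    rows, cols = len(ref), len(cand)
    # table[i][j] = LCS length of ref[:i] and cand[:j]
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if match(ref[i - 1], cand[j - 1]):
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[rows][cols]


def rouge_l_score(ref: Sequence[T], cand: Sequence[T],
                  match: Callable[[T, T], bool] = lambda a, b: a == b,
                  beta: float = 1.0) -> float:
    """Standard ROUGE-L F-measure between a reference and a candidate sequence.

    `match` is a hypothetical hook: with real pose descriptions it could be
    a thresholded per-keyframe pose similarity instead of strict equality.
    """
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand, match)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)       # share of the reference that was reproduced
    precision = lcs / len(cand)   # share of the performance that was on-script
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    reference = ["A", "B", "C", "D"]   # placeholder labels for standard poses
    performed = ["A", "C", "D", "E"]   # dancer skips B and adds an extra pose
    # 3 of 4 poses match in order (A, C, D), so the score is 0.75
    print(rouge_l_score(reference, performed))
```

Because the longest common subsequence keeps matches in order without requiring them to be adjacent, a score of this kind rewards a dancer for preserving the sequence of movements even when individual poses are missed, which is the sense in which it measures coherence rather than per-frame accuracy.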