Deep neural networks must address the dual challenge of delivering high-accuracy predictions and providing user-friendly explanations. While deep models are widely used in the field of time series modeling, deciphering the core principles that govern the models' outputs remains a significant challenge. This is crucial for fostering the development of trusted models and facilitating domain expert validation, thereby empowering users and domain experts to utilize them confidently in high-risk decision-making contexts (e.g., decision-support systems in healthcare). In this work, we put forward a deep prototype learning model that supports interpretable and manipulable modeling and classification of medical time series (i.e., ECG signal). Specifically, we first optimize the representation of single heartbeat data by employing a bidirectional long short-term memory and attention mechanism, and then construct prototypes during the training phase. The final classification outcomes (i.e., normal sinus rhythm, atrial fibrillation, and other rhythm) are determined by comparing the input with the obtained prototypes. Moreover, the proposed model presents a human–machine collaboration mechanism, allowing domain experts to refine the prototypes by integrating their expertise to further enhance the model's performance (contrary to the human-in-the-loop paradigm, where humans primarily act as supervisors or correctors, intervening when required, our approach focuses on a human–machine collaboration, wherein both parties engage as partners, enabling more fluid and integrated interactions). The experimental outcomes presented herein delineate that, within the realm of binary classification tasks—specifically distinguishing between normal sinus rhythm and atrial fibrillation—our proposed model, albeit registering marginally lower performance in comparison to certain established baseline models such as Convolutional Neural Networks (CNNs) and bidirectional long short-term memory with attention mechanisms (Bi-LSTMAttns), evidently surpasses other contemporary state-of-the-art prototype baseline models. Moreover, it demonstrates significantly enhanced performance relative to these prototype baseline models in the context of triple classification tasks, which encompass normal sinus rhythm, atrial fibrillation, and other rhythm classifications. The proposed model manifests a commendable prediction accuracy of 0.8414, coupled with macro precision, recall, and F1-score metrics of 0.8449, 0.8224, and 0.8235, respectively, achieving both high classification accuracy as well as good interpretability. [ABSTRACT FROM AUTHOR]