11,551 results for "False positive paradox"
Search Results
2. Adult stuttering prevalence II: Recalculation, subgrouping and estimate of stuttering community engagement
- Author
- Gattie, Max, Lieven, Elena, and Kluk, Karolina
- Published
- 2025
3. A review of artificial intelligence based malware detection using deep learning
- Author
- Evgeny Kostyuchenko, Al-Ani Mustafa Majid, Ahmed Jamal Alshaibi, and Alexander Alexandrovich Shelupanov
- Subjects
Computer science, Deep learning, Autoencoders, General Medicine, Encoding (memory), Android malware, False positive paradox, Malware, Artificial intelligence, State (computer science), Decoding methods - Abstract
Malware propagation by adversaries has caused problems across the globe, and malware is often released in different countries for monetary gain. The proliferation of malware-spreading activity means that malware samples are now plentiful enough to train machine learning models, making machine learning indispensable for malware detection. Traditional machine learning models have limited performance because their training depth is limited. The emergence of deep learning models paved the way for more training possibilities and improved detection accuracy with fewer false positives. This paper reviews the literature on deep learning techniques used for malware detection, including CNNs, RNNs, LSTMs, and autoencoders. The memory cell of the LSTM offers better modeling possibilities, while autoencoders support an unsupervised approach in which encoding and decoding expose abnormalities (malware). Many contributions apply machine learning and deep learning to Android malware detection. This paper provides knowledge that supports further research in deep learning, which is essential for improving the state of the art.
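The autoencoder approach the review describes reduces to a simple recipe: train a model to reconstruct benign samples only, then flag inputs that reconstruct poorly. A minimal sketch in Python; the feature dimensions, threshold, and data are illustrative stand-ins, not from the paper:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
benign = rng.normal(0.0, 1.0, size=(1000, 32))   # stand-in benign feature vectors

# Autoencoder: compress 32-dim features to 8 dims and decode them back.
ae = MLPRegressor(hidden_layer_sizes=(16, 8, 16), max_iter=2000, random_state=0)
ae.fit(benign, benign)                            # target is the input itself

def anomaly_score(x):
    """Mean squared reconstruction error; malware-like inputs score high."""
    return np.mean((ae.predict(x) - x) ** 2, axis=1)

threshold = np.percentile(anomaly_score(benign), 99)   # calibrate on benign data
suspicious = rng.normal(3.0, 1.0, size=(5, 32))        # stand-in malware features
print(anomaly_score(suspicious) > threshold)           # expect mostly True
```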
- Published
- 2023
4. Gaussian Dropout Based Stacked Ensemble CNN for Classification of Breast Tumor in Ultrasound Images
- Author
- Gugan S. Kathiresan, R. Karthik, M. Nagharjun, R. Menaka, and M. Anirudh
- Subjects
Computer science, Gaussian, Pooling, Biomedical Engineering, Biophysics, Cancer, Pattern recognition, Convolutional neural network, Breast cancer, Feature (computer vision), False positive paradox, Artificial intelligence, Dropout (neural networks) - Abstract
Objective Breast cancer and breast tumors are among the most pervasive cancer diagnoses in medical practice. Breast tumors are life-threatening to women, and early detection with proper treatment could save lives. Physical methods for detecting breast cancer are time-consuming and prone to misdiagnosis when classifying tumors. Recent trends in radiological imaging have significantly improved the efficiency and accuracy of breast tumor classification, and artificial intelligence techniques can serve as an automated detection and classification system. Materials and methods In this research, we propose a novel configuration of a stacking ensemble with custom Convolutional Neural Network architectures to classify breast tumors from ultrasound images into ‘Normal’, ‘Benign’, and ‘Malignant’ categories. Results After thorough experimentation, our ensemble achieved an accuracy, F1-score, precision, and recall of 92.15%, 92.21%, 92.26%, and 92.17%, respectively. Conclusion The presented ensemble leverages three stacked feature extractors coupled with a characteristic meta-learner to provide an overall balanced classification performance, with better accuracy and fewer false positives. The architecture works in association with Gaussian dropout layers to improve computation and an alternative pooling scheme to retain essential features.
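The stacking idea described here pairs several base learners with a meta-learner trained on their outputs. A minimal sketch with scikit-learn's StackingClassifier, using generic classifiers and synthetic features in place of the paper's CNN feature extractors; all names and numbers are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Stand-in for image features and the three-class (normal/benign/malignant) labels.
X, y = make_classification(n_samples=600, n_features=64, n_classes=3,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svc", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),  # the meta-learner
)
print(stack.fit(X_tr, y_tr).score(X_te, y_te))
```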
- Published
- 2022
5. Assisting Example-Based API Misuse Detection via Complementary Artificial Examples
- Author
- Heng Li, Weiyi Shang, and Maxime Lamothe
- Subjects
Source code, Application programming interface, Java, Computer science, Reuse, Misuse detection, Software, Benchmark (computing), False positive paradox, Software engineering - Abstract
Application Programming Interfaces (APIs) allow their users to reuse existing software functionality without implementing it by themselves. However, using external functionality can come at a cost. Because developers are decoupled from the API's inner workings, they face the possibility of misunderstanding, and therefore misusing APIs. Prior research has proposed state-of-the-art example-based API misuse detectors that rely on existing API usage examples mined from existing code bases. Intuitively, without a varied dataset of API usage examples, it is challenging for the example-based API misuse detectors to differentiate between infrequent but correct API usages and API misuses. Such mistakes lead to false positives in the API misuse detection results, which was reported in a recent study as a major limitation of the state-of-the-art. To tackle this challenge, in this paper, we first undertake a qualitative study of 384 falsely detected API misuses. We find that around one third of the false-positives are due to missing alternative correct API usage examples. Based on the knowledge gained from the qualitative study, we uncover five patterns which can be followed to generate artificial examples for complementing existing API usage examples in the API misuse detection. To evaluate the usefulness of the generated artificial examples, we apply a state-of-the-art example-based API misuse detector on 50 open source Java projects. We find that our artificial examples can complement the existing API usage examples by preventing the detection of 55 false API misuses. Furthermore, we conduct a pre-designed experiment in an automated API misuse detection benchmark (MUBench), in order to evaluate the impact of generated artificial examples on recall. We find that the API misuse detector covers the same true positive results with and without the artificial example, i.e., obtains the same recall of 94.7%. Our findings highlight the potential of improving API misuse detection by pattern-guided source code transformation techniques.
- Published
- 2022
6. CBA-Detector: A Self-Feedback Detector Against Cache-Based Attacks
- Author
- Beilei Zheng, Jialun Wang, Chuliang Weng, and Jianan Gu
- Subjects
Computer science, Detector, Cloud computing, Self feedback, Computer security, Information leakage, False positive paradox, Overhead (computing), Isolation (database systems), Cache, Electrical and Electronic Engineering - Abstract
Cloud computing conveniently provides adequate resources to tenants. However, since multiple tenants share the underlying hardware resources, malicious tenants can use the shared processor to launch cache-based attacks. Such attacks can help malicious tenants steal other tenants' private data while bypassing the isolation mechanisms provided by the system, resulting in information leakage. Moreover, the Spectre and Meltdown vulnerabilities can even extract memory contents arbitrarily with the help of cache attacks. Therefore, cache-based attacks pose a serious threat to the security of cloud platforms. To defeat such attacks, many detection methods have been proposed. However, most methods induce high false positives because they rely entirely on hardware performance counters (HPCs) and detect attacks with static criteria. To solve this problem, this paper proposes a self-feedback detector named CBA-Detector to detect cache-based attacks in real time. Specifically, CBA-Detector first uses machine learning technologies to create models for identifying suspicious programs with abnormal hardware behaviors, then analyzes suspicious programs at the instruction level to identify real attacks and provide feedback. Based on the feedback, the models can be updated to further improve their detection accuracy. As our experiments show, CBA-Detector can accurately identify cache-based attacks in real time and introduces little overhead. Moreover, the misjudgment rate decreases with the running time.
- Published
- 2022
7. Unleashing Coveraged-Based Fuzzing Through Comprehensive, Efficient, and Faithful Exploitable-Bug Exposing
- Author
- Qiushi Wu, Kangjie Lu, Bowen Wang, and Aditya Pakki
- Subjects
Computer science, Code coverage, AddressSanitizer, Fuzz testing, Computer security, Software bug, Synchronization (computer science), False positive paradox, N-version programming, Overhead (computing), Electrical and Electronic Engineering - Abstract
Fuzzing has become an essential means of finding software bugs. Bug finding through fuzzing requires two parts: exploring code paths to reach bugs, and exposing bugs when they are reached. Existing fuzzing research has primarily focused on improving code coverage but not on exposing bugs. Sanitizers such as AddressSanitizer (ASAN) and MemorySanitizer (MSAN) have been the dominating tools for exposing bugs. However, sanitizer-based bug exposing has the following limitations: (1) sanitizers are not compatible with each other, (2) sanitizers incur significant runtime overhead, (3) sanitizers may generate false positives, and (4) exposed bugs may not be exploitable. To address these limitations, we propose EXPOZZER, a fuzzing system that can expose bugs comprehensively, efficiently, and faithfully. The intuition of EXPOZZER is to detect bugs through divergences in a properly diversified dual-execution environment, which does not require maintaining or checking execution metadata. We design a practical and deterministic dual-execution engine, a co-design for dual execution and fuzzers, bug-sensitive diversification, and comprehensive and efficient divergence detection to ensure the effectiveness of EXPOZZER. The evaluation results show that EXPOZZER can reliably detect not only CVE-assigned vulnerabilities but also new vulnerabilities in well-tested real-world programs. EXPOZZER is 10 times faster than MemorySanitizer and comparable in speed to AddressSanitizer.
- Published
- 2022
8. Deep Learning Based Vulnerability Detection: Are We There Yet?
- Author
- Saikat Chakraborty, Yangruibo Ding, Rahul Krishna, and Baishakhi Ray
- Subjects
Data collection, Computer science, Deep learning, Machine learning, Data modeling, Computer Science - Software Engineering, Program analysis, Software security assurance, False positive paradox, Data deduplication, Artificial intelligence, Function (engineering), Software - Abstract
Automated detection of software vulnerabilities is a fundamental problem in software security. Existing program analysis techniques suffer from either high false positives or high false negatives. Recent progress in Deep Learning (DL) has resulted in a surge of interest in applying DL to automated vulnerability detection. Several recent studies have demonstrated promising results, achieving an accuracy of up to 95% at detecting vulnerabilities. In this paper, we ask, "how well do the state-of-the-art DL-based techniques perform in a real-world vulnerability prediction scenario?". To our surprise, we find that their performance drops by more than 50%. A systematic investigation of what causes such a precipitous performance drop reveals that existing DL-based vulnerability prediction approaches suffer from challenges with the training data (e.g., data duplication, unrealistic distribution of vulnerable classes, etc.) and with the model choices (e.g., simple token-based models). As a result, these approaches often do not learn features related to the actual cause of the vulnerabilities. Instead, they learn unrelated artifacts from the dataset (e.g., specific variable/function names, etc.). Leveraging these empirical findings, we demonstrate how a more principled approach to data collection and model design, based on realistic settings of vulnerability prediction, can lead to better solutions. The resulting tools perform significantly better than the studied baseline: up to a 33.57% boost in precision and a 128.38% boost in recall compared with the best-performing model in the literature. Overall, this paper elucidates existing DL-based vulnerability prediction systems' potential issues and draws a roadmap for future DL-based vulnerability prediction research. In that spirit, we make available all the artifacts supporting our results: https://git.io/Jf6IA
- Published
- 2022
9. Daedalus: Breaking Nonmaximum Suppression in Object Detection via Adversarial Examples
- Author
- Sheng Wen, Yang Xiang, Chaoran Li, Qing-Long Han, Surya Nepal, Xiangyu Zhang, and Derui Wang
- Subjects
Computer science, Real-time computing, Object detection, Computer Science Applications, Human-Computer Interaction, Adversarial system, Control and Systems Engineering, Distortion, False positive paradox, False positive rate, Electrical and Electronic Engineering, Software, Information Systems - Abstract
This article demonstrates that nonmaximum suppression (NMS), which is commonly used in object detection (OD) tasks to filter redundant detection results, is no longer secure. Considering that NMS has been an integral part of OD systems, thwarting the functionality of NMS can result in unexpected or even lethal consequences for such systems. In this article, an adversarial example attack that triggers malfunctioning of NMS in OD models is proposed. The attack, namely, Daedalus, compresses the dimensions of detection boxes to evade NMS. As a result, the final detection output contains extremely dense false positives. This can be fatal for many OD applications, such as autonomous vehicles and surveillance systems. The attack can be generalized to different OD models, such that the attack cripples various OD applications. Furthermore, a way of crafting robust adversarial examples is developed by using an ensemble of popular detection models as the substitutes. Considering the pervasive nature of model reuse in real-world OD scenarios, Daedalus examples crafted based on an ensemble of substitutes can launch attacks without knowing the parameters of the victim models. The experimental results demonstrate that the attack effectively stops NMS from filtering redundant bounding boxes. As the evaluation results suggest, Daedalus increases the false positive rate in detection results to 99.9% and reduces the mean average precision scores to 0, while maintaining a low cost of distortion on the original inputs. It also demonstrates that the attack can be practically launched against real-world OD systems via printed posters.
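To see why compressing detection boxes defeats NMS, consider a bare-bones greedy NMS: a box survives whenever its overlap with every higher-scoring kept box stays below the IoU threshold, so shrunken boxes stop overlapping and all survive as dense false positives. A toy sketch; coordinates and the threshold are illustrative:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much."""
    order, keep = np.argsort(scores)[::-1], []
    while len(order):
        i, order = order[0], order[1:]
        keep.append(i)
        order = np.array([j for j in order
                          if iou(boxes[i], boxes[j]) < thresh], dtype=int)
    return keep

# Heavily overlapping boxes collapse to one detection...
boxes = np.array([[0, 0, 10, 10], [1, 0, 11, 10], [0, 1, 10, 11]])
print(len(nms(boxes, np.array([0.9, 0.8, 0.7]))))   # -> 1
# ...but compressed boxes no longer overlap, so every one survives NMS.
shrunk = np.array([[0, 0, 2, 2], [4, 4, 6, 6], [8, 8, 10, 10]])
print(len(nms(shrunk, np.array([0.9, 0.8, 0.7]))))  # -> 3
```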
- Published
- 2022
10. Detecting Taxi Trajectory Anomaly Based on Spatio-Temporal Relations
- Author
- Minglu Li, Jian Cao, Guangtao Xue, Yanmin Zhu, Bin Cheng, Shiyou Qian, Jiadi Yu, and Tao Zhang
- Subjects
Similarity (geometry), Computer science, Mechanical Engineering, Anomaly (natural sciences), Work (physics), Pattern recognition, Displacement (vector), Computer Science Applications, Automotive Engineering, Trajectory, False positive paradox, Point (geometry), Artificial intelligence - Abstract
Researchers have proposed many novel methods to detect abnormal taxi trajectories. However, most of the existing methods usually adopt a counting-based strategy, which may cause high false positives due to imprecisely identifying diverse trajectories as anomalies and therefore, they need the support of large-scale historical trajectories to work properly. To improve detection precision and efficiency, in this article, we propose STR, an online abnormal taxi trajectory detection method based on spatio-temporal relations. The basic principle behind STR is that given the displacement from the source point to a testing point, if the driving time and driving distance are not within the normal ranges, the point is identified as anomalous. To learn the two normal ranges for driving time and driving distance, STR defines two spatio-temporal models which characterize the relationship between displacement and driving distance/driving time. To improve detection efficiency, STR reduces the number of models that need to be learned by making full use of the similarity of transportation modes in different time periods and neighboring areas. The effectiveness and performance of STR are evaluated on real-world taxi trajectories. The experiment results show that compared with counting-based methods, STR achieves greater precision by reducing false positives. Furthermore, STR is more efficient than its counterparts and is suitable for online detection.
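The core STR check is easy to state: given the straight-line displacement between the source and a testing point, look up the learned normal ranges for driving distance and driving time, and flag the point when either falls outside. A toy sketch with invented ranges; the paper learns these from historical trajectories:

```python
# Hypothetical learned ranges, keyed by displacement (km):
# displacement -> ((min, max driving distance km), (min, max driving time min)).
NORMAL = {5: ((5.0, 8.0), (8.0, 25.0))}

def is_anomalous(displacement_km, driven_km, minutes):
    """STR-style check: a point is anomalous if distance or time falls
    outside the normal range for this displacement (ranges are invented)."""
    (d_lo, d_hi), (t_lo, t_hi) = NORMAL[round(displacement_km)]
    return not (d_lo <= driven_km <= d_hi and t_lo <= minutes <= t_hi)

print(is_anomalous(5.2, 6.5, 15))   # ordinary detour -> False
print(is_anomalous(5.2, 14.0, 40))  # far longer than expected -> True
```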
- Published
- 2022
11. Improving antibody thermostability based on statistical analysis of sequence and structural consensus data
- Author
- Mani Jain, Yaxiong Sun, and Lei Jia
- Subjects
Biology, Computer science, Melting temperature, Immunology, Consensus sequence, False positive paradox, Immunology and Allergy, Statistical analysis, Computational biology, Antibody, Sequence (medicine), Thermostability - Abstract
Background The use of Monoclonal Antibodies (MAbs) as therapeutics has been increasing over the past 30 years due to their high specificity and strong affinity toward the target. One of the major challenges to their use as drugs is their low thermostability, which impacts efficacy as well as manufacturing and delivery. Methods To aid the design of thermally more stable mutants, consensus sequence-based methods have been widely used. These methods typically have a success rate of about 50%, with maximum melting temperature increments ranging from 10 to 32°C. To improve prediction performance, we developed a new and fast MAb-specific method that adds a 3D structural layer to the consensus sequence method. This is done by analyzing the close-by residue pairs that are conserved in the 3D structures of more than 800 MAbs. Results Combining the consensus sequence and structural residue pair covariance methods, we developed an in-house application for predicting human MAb thermostability to guide protein engineers in designing stable molecules. A major advantage of this structural-level assessment is that it cuts false positives almost in half relative to the consensus sequence method alone. The application has been used successfully to design MAb engineering panels in multiple biologics programs. Conclusions Our data science-based method shows impact in MAb engineering.
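The sequence half of the method rests on the classic consensus computation: at each aligned position, take the most frequent residue. A toy illustration in Python; the sequences are invented:

```python
from collections import Counter

def consensus(aligned):
    """Consensus sequence: the most frequent residue at each aligned
    position (the sequence-based half of the described method)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))

print(consensus(["QVQLVQ", "QVQLVE", "QVKLVQ", "EVQLVQ"]))  # -> QVQLVQ
```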
- Published
- 2022
12. Dermatophytic onychia: Effectiveness of rapid immunochromatographic diagnostic testing directly on samples compared to culture
- Author
- S. Challier and A. Paugam
- Subjects
Microbiological culture, Biology, Dermatology, Trichophyton rubrum, Chromatography, Affinity, Trichophyton interdigitale, Nails, Trichophyton, Onychomycosis, Scopulariopsis, False positive paradox, Nail (anatomy), Dermatophyte, Humans, Medicine, Sampling (medicine), Diagnostic Techniques and Procedures, Retrospective Studies - Abstract
Background Until now, definitive diagnosis of dermatophytic onychia has been made by taking a nail sample and placing it in culture, with the result usually obtained only after 2 to 3 weeks. More recently, diagnosis within a few minutes of sampling has become possible thanks to an immunochromatography technique developed in Japan and now available in France: the Diafactory Tinea Unguium® test strip (Biosynex, France). Methods Over a 12-month period, 80 nail samples from 80 patients giving rise to a positive fungal culture were included in the study. For each patient, part of the removed nail was stored at room temperature and an immunochromatographic test was retrospectively performed according to the supplier's instructions. A small fragment of nail (≥ 1 mg) was mixed with a few drops of reagent in a tube for 1 min; the test strip was then placed in the tube, with the result (control strip, positivity strip) visible to the naked eye after incubation for a few minutes. Results Compared with the culture method for the 51 isolated dermatophytes (42 Trichophyton rubrum, 9 Trichophyton interdigitale), the sensitivity of the rapid test was 96.07% (49/51). For the 29 other fungal cultures (10 Fusarium sp., 3 Scytalidium sp., 3 Scopulariopsis brevicaulis, 3 Aspergillus sp., 1 Alternaria sp., 3 Candida albicans, 1 Candida parapsilosis, 1 Trichosporon sp., 1 Rhodotorula sp., and 3 Corynebacterium sp.), the specificity was 75.86% (22/29). False positives were mainly due to the genera Fusarium and Scopulariopsis (6 of 7 false positives), which were the likely cause of onychomycosis. Discussion This rapid test could be useful in limiting excessive clinical diagnosis of dermatophyte onychomycosis. It has several advantages: ease of application, speed of results, and good performance, which together could improve diagnostic certainty during the actual consultation, limiting prolonged unnecessary prescription of antifungal treatments while waiting for the laboratory culture results (3 weeks for a negative result).
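The reported figures follow directly from the counts in the abstract; a quick arithmetic check:

```python
# Reproducing the reported figures from the counts given above.
sensitivity = 49 / 51   # dermatophyte cultures detected by the rapid test
specificity = 22 / 29   # non-dermatophyte cultures correctly negative
print(f"{sensitivity:.2%}")  # 96.08% (the abstract rounds to 96.07%)
print(f"{specificity:.2%}")  # 75.86%
```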
- Published
- 2022
13. Learning Efficient Binarized Object Detectors With Information Compression
- Author
- Ziwei Wang, Jiwen Lu, Jie Zhou, and Ziyi Wu
- Subjects
Computer science, Applied Mathematics, Pattern recognition, Information bottleneck method, Pascal (programming language), Mutual information, Object (computer science), Object detection, Computational Theory and Mathematics, Artificial Intelligence, Margin (machine learning), Feature (computer vision), False positive paradox, Computer Vision and Pattern Recognition, Software - Abstract
In this paper, we propose a binarized neural network learning method (BiDet) for efficient object detection. Conventional network binarization methods directly quantize the weights and activations in one-stage or two-stage detectors with constrained representational capacity, so that the information redundancy in the networks causes numerous false positives and degrades the performance significantly. On the contrary, our BiDet fully utilizes the representational capacity of the binary neural networks by redundancy removal, through which the detection precision is enhanced with alleviated false positives. Specifically, we generalize the information bottleneck (IB) principle to object detection, where the amount of information in the high-level feature maps is constrained and the mutual information between the feature maps and object detection is maximized. Meanwhile, we learn sparse object priors so that the posteriors are concentrated on informative detection prediction with false positive elimination. Since BiDet employs a fixed IB trade-off to balance the total and relative information contained in the high-level feature maps, the information compression leads to ineffective utilization of the network capacity or insufficient redundancy removal for input in different complexity. To address this, we further present binary neural networks with automatic information compression (AutoBiDet) to automatically adjust the IB trade-off for each input according to the complexity. Moreover, we further propose the class-aware sparse object priors by assigning different sparsity to objects in various classes, so that the false positives are alleviated more effectively without recall decrease. Extensive experiments on the PASCAL VOC and COCO datasets show that our BiDet and AutoBiDet outperform the state-of-the-art binarized object detectors by a sizable margin.
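For context, the information bottleneck principle that BiDet generalizes has the standard form below, trading feature compression against task-relevant information; this is the generic IB formulation, not the paper's exact loss:

```latex
% Generic information-bottleneck objective (standard form, not BiDet's
% exact loss): learn feature maps F that compress the input X while
% staying informative about the detection targets Y; \beta sets the trade-off.
\max_{F}\; I(F;\,Y) \;-\; \beta\, I(X;\,F)
```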
- Published
- 2022
14. Multilaboratory assessment of metagenomic next-generation sequencing for unbiased microbe detection
- Author
- Dongsheng Han, Yanxi Han, Zhenli Diao, Jiehong Xie, Huiying Lai, Jinming Li, and Rui Zhang
- Subjects
Staphylococcus aureus, Multidisciplinary, Routine testing, High-Throughput Nucleotide Sequencing, Reproducibility of Results, Computational biology, Biology, DNA sequencing, Clinical Practice, Metagenomics, Patient information, False positive paradox, Humans, Metagenome, Volume concentration - Abstract
Introduction The metagenomic next-generation sequencing (mNGS) assay for detecting infectious agents is now being translated into clinical practice. With no approved approaches or guidelines available, laboratories adopt customized mNGS assays for clinical samples; however, the accuracy, reliability, and problems of these routinely implemented assays are not clear. Objectives To evaluate the performance of 90 mNGS laboratories under routine testing conditions through analyzing identical samples. Methods Eleven microbial communities were generated using 15 quantitative microbial suspensions. They were used as reference materials to evaluate the false negatives and false positives of participating mNGS protocols, as well as the ability to distinguish genetically similar organisms and to identify true pathogens from other microbes based on fictitious case reports. Results High interlaboratory variability was found in the identification and the quantitative reads per million reads (RPM) values of each microbe in the samples, especially when testing microbes present at low concentrations (1 × 10³ cells/ml or less). 42.2% (38/90) of the laboratories reported unexpected microbes (i.e., a false-positive problem). Only 56.7% (51/90) to 83.3% (75/90) of the laboratories showed a sufficient ability to obtain clear etiological diagnoses for three simulated cases combined with patient information. The analysis of the performance of mNGS in distinguishing genetically similar organisms in three samples revealed that only 56.6% to 63.0% of the laboratories recovered RPM ratios (RPM of S. aureus / RPM of S. epidermidis) within a 2-fold change of the initial input ratios (indicating a relatively low level of bias). Conclusion The high interlaboratory variability found in both identifying microbes and distinguishing true pathogens emphasizes the urgent need for improving the accuracy and comparability of the results generated across different mNGS laboratories, especially in the detection of low-microbial-biomass samples.
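RPM (reads per million) is the normalization behind these comparisons: a taxon's read count scaled by the total sequenced reads. A quick sketch with invented numbers:

```python
def rpm(taxon_reads, total_reads):
    """Reads per million: the normalized abundance measure used in mNGS."""
    return taxon_reads / total_reads * 1_000_000

# Illustrative numbers (not from the study): the RPM ratio used to compare
# genetically similar organisms such as S. aureus vs. S. epidermidis.
total = 20_000_000
print(rpm(3_000, total) / rpm(1_000, total))  # -> 3.0
```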
- Published
- 2022
15. NROI based feature learning for automated tumor stage classification of pulmonary lung nodules using deep convolutional neural networks
- Author
- Subaji Mohan and Supriya Suresh
- Subjects
Ground truth, Lung, General Computer Science, Receiver operating characteristic, Computer science, Pattern recognition, Overfitting, Convolutional neural network, Region of interest, False positive paradox, Artificial intelligence, Feature learning - Abstract
Identifying exact pulmonary nodule boundaries in computed tomography (CT) images is a crucial task for computer-aided detection (CADx) systems. Separating CT images into benign, malignant, and non-cancerous classes is essential for the early detection of lung cancers and improved survival rates. In this paper, a methodology for automated tumor stage classification of pulmonary lung nodules is proposed using an end-to-end learning Deep Convolutional Neural Network (DCNN). The images used in the study were acquired from the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) public repository, comprising 1018 cases. Lung CT images with candidate nodules are segmented into a 52 × 52 pixel nodule region of interest (NROI) rectangle based on four radiologists’ annotations and markings with ground truth (GT) values. The approach aims to analyze and extract self-learned salient features from NROIs containing differently structured nodules. The DCNN is trained with NROI samples, and nodules are then classified according to tumor pattern as non-cancerous, benign, or malignant. Data augmentation and dropout are used to avoid overfitting. The algorithm was compared with state-of-the-art methods and with traditional hand-crafted features such as the statistical, texture, and morphological characteristics of lung CT images. A consistent improvement in DCNN performance was observed using the nodule-grouped dataset, achieving a classification accuracy of 97.8%, specificity of 97.2%, sensitivity of 97.1%, and an area under the receiver operating characteristic curve (AUC) of 0.9956, with few false positives.
- Published
- 2022
16. Classifying cybergrooming for child online protection using hybrid machine learning model
- Author
- Gustavo Isaza, Fabián Muñoz, Felipe Buitrago, and Luis Fernando Castillo
- Subjects
Hybrid machine learning, Artificial neural network, Computer science, Cognitive Neuroscience, Semantic analysis (machine learning), Context (language use), Machine learning, Convolutional neural network, Computer Science Applications, Artificial Intelligence, Classifier (linguistics), False positive paradox, Artificial intelligence, Representation (mathematics) - Abstract
This paper presents a computational model that classifies cybergrooming attacks in the context of child online protection (COP) using Natural Language Processing (NLP) and Convolutional Neural Networks (CNN). The model produces a high number of false positives, and therefore low precision and F-score, but high accuracy. Given that grooming messages are rare compared with the overall number of conversations and messages from other contexts, this is still a consistent and useful result, since the per-message classifier captures a high number of true positives. Training machine learning algorithms with neural networks, semantic analysis, and NLP allows an approximate representation of knowledge, contributing to the discovery of pseudo-intelligent information in these environments and reducing the human intervention needed to characterize underlying abnormal behavior and detect messages that potentially represent these attacks.
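The accuracy/precision gap described here is the class-imbalance effect (closely related to the false positive paradox): when positives are rare, even a modest number of false alarms swamps the true ones. A worked example with invented counts that reproduces the pattern:

```python
# Illustrative confusion-matrix counts (not from the paper): grooming
# messages are rare, so a classifier can look accurate while most of
# its alarms are false positives.
tp, fn = 90, 10          # 100 true grooming messages
fp, tn = 900, 99_000     # 99,900 benign messages
accuracy  = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} f1={f1:.3f}")
# accuracy=0.991 precision=0.091 f1=0.165 -- high accuracy, low precision/F-score
```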
- Published
- 2022
17. Anomaly Detection of Calcifications in Mammography Based on 11,000 Negative Cases
- Author
- Carlo C. Maley, Joseph Y. Lo, Yinhao Ren, Lorraine M. King, Maciej A. Mazurowski, Yifan Peng, Rui Hou, Lars J. Grimm, Jeffrey R. Marks, and Eun-Sil Shelley Hwang
- Subjects
Pixel, Computer science, Biomedical Engineering, Calcinosis, Breast Neoplasms, Pattern recognition, Overfitting, Autoencoder, Machine Learning, Breast cancer, False positive paradox, Humans, Mammography, Female, Anomaly detection, Diagnosis, Computer-Assisted, Artificial intelligence, Calcification - Abstract
In mammography, calcifications are one of the most common signs of breast cancer. Detection of such lesions is an active area of research for computer-aided diagnosis and machine learning algorithms. Due to limited numbers of positive cases, many supervised detection models suffer from overfitting and fail to generalize. We present a one-class, semi-supervised framework using a deep convolutional autoencoder trained with over 50,000 images from 11,000 negative-only cases. Since the model learned from only normal breast parenchymal features, calcifications produced large signals when comparing the residuals between input and reconstruction output images. As a key advancement, a structural dissimilarity index was used to suppress non-structural noises. Our selected model achieved pixel-based AUROC of 0.959 and AUPRC of 0.676 during validation, where calcification masks were defined in a semi-automated process. Although not trained directly on any cancers, detection performance of calcification lesions on 1,883 testing images (645 malignant and 1238 negative) achieved 75% sensitivity at 2.5 false positives per image. Performance plateaued early when trained with only a fraction of the cases, and greater model complexity or a larger dataset did not improve performance. This study demonstrates the potential of this anomaly detection approach to detect mammographic calcifications in a semi-supervised manner with efficient use of a small number of labeled images, and may facilitate new clinical applications such as computer-aided triage and quality improvement.
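The structural-dissimilarity step can be sketched as scoring autoencoder residuals with SSIM rather than raw pixel differences, so unstructured noise is suppressed. A minimal sketch using scikit-image with synthetic images; this illustrates the idea, not the paper's exact pipeline:

```python
import numpy as np
from skimage.metrics import structural_similarity

def dissimilarity_map(image, reconstruction):
    """Score residuals by structural dissimilarity, (1 - SSIM)/2, so
    non-structural noise in the reconstruction is suppressed."""
    _, ssim_map = structural_similarity(
        image, reconstruction, data_range=1.0, full=True)
    return (1.0 - ssim_map) / 2.0   # high where structure differs

rng = np.random.default_rng(0)
normal = rng.normal(0.5, 0.05, size=(64, 64)).clip(0, 1)   # "parenchyma"
recon = (normal + rng.normal(0, 0.01, size=(64, 64))).clip(0, 1)
lesion = normal.copy()
lesion[30:34, 30:34] = 1.0                                  # bright speck
print(dissimilarity_map(lesion, recon)[30:34, 30:34].mean())  # large
print(dissimilarity_map(normal, recon).mean())                # small
```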
- Published
- 2022
18. Oracle-Supported Dynamic Exploit Generation for Smart Contracts
- Author
- Ye Liu, Shang-Wei Lin, Haijun Wang, Cyrille Artho, Lei Ma, Yang Liu, and Yi Li
- Subjects
Exploit, Computer science, Reliability (computer networking), Vulnerability, Computer security, Oracle, False positive paradox, State (computer science), Logic error, Electrical and Electronic Engineering, Database transaction - Abstract
Despite the high stakes involved in smart contracts, they are often developed in an undisciplined manner, leaving the security and reliability of blockchain transactions at risk. In this paper, we introduce ContraMaster, an oracle-supported dynamic exploit generation framework for smart contracts. Existing approaches mutate only single transactions; ContraMaster goes further by mutating transaction sequences. ContraMaster uses data-flow, control-flow, and the dynamic contract state to guide its mutations. It then monitors the executions of target contract programs and validates the results against a general-purpose semantic test oracle to discover vulnerabilities. Being a dynamic technique, it guarantees that each discovered vulnerability is a violation of the test oracle and is able to generate the attack script to exploit this vulnerability. In contrast to rule-based approaches, ContraMaster has not shown any false positives, and it easily generalizes to unknown types of vulnerabilities (e.g., logic errors). We evaluate ContraMaster on 218 vulnerable smart contracts. The experimental results confirm its practical applicability and advantages over the state-of-the-art techniques, and also reveal three new types of attacks.
- Published
- 2022
19. Elastic Bloom Filter: Deletable and Expandable Filter Using Elastic Fingerprints
- Author
- Jintao He, Olivier Ruas, Jianyu Wu, Shen Yan, Yuhan Wu, Tong Yang, Bin Cui, and Gong Zhang
- Subjects
Elastic Bloom filter, Computer science, Fingerprint (computing), Bloom filter, Bloom filter expansion, Theoretical Computer Science, Set (abstract data type), Computational Theory and Mathematics, Hardware and Architecture, False positive paradox, Key (cryptography), Elastic Fingerprints, False positive rate, Algorithm, Software - Abstract
The Bloom filter, which answers whether an item is in a set, has achieved great success in various fields, including networking, databases, and bioinformatics. However, the Bloom filter has two main shortcomings: no support for item deletion and no support for expansion. Existing solutions either support deletion at the cost of additional memory, or support expansion at the cost of a higher false positive rate and a lower query speed. Unlike existing solutions, we propose the Elastic Bloom filter (EBF) to address the two shortcomings simultaneously. Importantly, when EBF expands, the false positives decrease. Our key technique is Elastic Fingerprints, which dynamically absorb and release bits during compression and expansion. To support deletion, EBF deletes the corresponding fingerprint and then updates the corresponding bit in the Bloom filter. To support expansion, Elastic Fingerprints release bits and insert them into the Bloom filter. Our experimental results show that the Elastic Bloom filter significantly outperforms existing works.
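For context, the plain Bloom filter that EBF extends can be written in a few lines; the elastic variant additionally stores per-item fingerprints whose bits are absorbed or released on compression and expansion. A minimal sketch of the baseline structure with illustrative parameters:

```python
import hashlib

class BloomFilter:
    """Plain Bloom filter for context -- the paper's elastic variant adds
    per-item fingerprints to support deletion and expansion."""
    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _positions(self, item):
        for i in range(self.k):  # k independent hash positions
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        # May return false positives, never false negatives.
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter()
bf.add("10.0.0.1")
print("10.0.0.1" in bf, "10.0.0.2" in bf)  # True, (almost surely) False
```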
- Published
- 2022
20. Comprehensive Assessment of Coronary Calcification in Intravascular OCT Using a Spatial-Temporal Encoder-Decoder Network
- Author
- Yubin Gong, Sining Hu, Chong He, Wang Zhao, Chao Li, Fang Lu, Kaiwen Li, Jinwei Tian, Bo Yu, and Haibo Jia
- Subjects
Computer science, Coronary Artery Disease, Convolutional neural network, Coronary artery disease, Percutaneous Coronary Intervention, Robustness (computer science), Image Processing, Computer-Assisted, False positive paradox, Humans, Segmentation, Electrical and Electronic Engineering, Radiological and Ultrasound Technology, Calcinosis, Pattern recognition, Plaque, Atherosclerotic, Computer Science Applications, Coronary artery calcification, Neural Networks, Computer, Artificial intelligence, Software, Calcification - Abstract
Coronary calcification is a strong indicator of coronary artery disease and a key determinant of the outcome of percutaneous coronary intervention. We propose a fully automated method to segment and quantify coronary calcification in intravascular OCT (IVOCT) images based on convolutional neural networks (CNN). All possible calcified plaques were segmented from IVOCT pullbacks using a spatial-temporal encoder-decoder network by exploiting the 3D continuity information of the plaques, which were then screened and classified by a DenseNet network to reduce false positives. A novel data augmentation method based on the IVOCT image acquisition pattern was also proposed to improve the performance and robustness of the segmentation. Clinically relevant metrics including calcification area, depth, angle, thickness, volume, and stent-deployment calcification score, were automatically computed. 13844 IVOCT images with 2627 calcification slices from 45 clinical OCT pullbacks were collected and used to train and test the model. The proposed method performed significantly better than existing state-of-the-art 2D and 3D CNN methods. The data augmentation method improved the Dice similarity coefficient for calcification segmentation from 0.615±0.332 to 0.756±0.222, reaching human-level inter-observer agreement. Our proposed region-based classifier improved image-level calcification classification precision and F1-score from 0.725±0.071 and 0.791±0.041 to 0.964±0.002 and 0.883±0.008, respectively. Bland-Altman analysis showed close agreement between manual and automatic calcification measurements. Our proposed method is valuable for automated assessment of coronary calcification lesions and in-procedure planning of stent deployment.
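The Dice similarity coefficient reported above is defined as 2|A∩B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. A small sketch with invented masks:

```python
import numpy as np

def dice(pred, target):
    """Dice similarity coefficient between two binary masks:
    2|A intersect B| / (|A| + |B|), the segmentation metric reported above."""
    pred, target = pred.astype(bool), target.astype(bool)
    denom = pred.sum() + target.sum()
    return 2.0 * np.logical_and(pred, target).sum() / denom if denom else 1.0

a = np.zeros((8, 8), dtype=int); a[2:6, 2:6] = 1   # 16 px predicted
b = np.zeros((8, 8), dtype=int); b[3:7, 3:7] = 1   # 16 px ground truth
print(dice(a, b))  # 9 overlapping px -> 2*9/32 = 0.5625
```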
- Published
- 2022
21. Semi-Synchronized Non-Blocking Concurrent Kernel Cruising
- Author
- Donghai Tian, Qiang Zeng, Dinghao Wu, Changzhen Hu, and Peng Liu
- Subjects
Computer Networks and Communications, Computer science, Temporal isolation among virtual machines, Hypervisor, Cloud computing, Virtualization, Computer Science Applications, Hardware and Architecture, False positive paradox, Operating system, Software, Information Systems, Heap (data structure), Buffer overflow - Abstract
Kernel heap buffer overflow vulnerabilities have been exposed for decades, but there are few practical countermeasures that can be applied to OS kernels. Previous solutions suffer from either high performance overhead or compatibility problems with mainstream kernels and hardware. In this paper, we present KRUISER, a concurrent kernel heap buffer overflow monitor. Unlike conventional methods, whose security enforcement is usually inlined into the kernel's execution, KRUISER migrates security enforcement from the kernel's normal execution to a concurrent monitor process, leveraging increasingly popular multi-core architectures. To reduce the synchronization overhead between the monitor process and the running kernel, we design a novel semi-synchronized non-blocking monitoring algorithm, which enables efficient runtime detection on live memory without incurring false positives. To prevent the monitor process from being tampered with and to provide guaranteed performance isolation, we use virtualization technology to run the monitor process outside the monitored VM, while heap memory allocation information is collected inside the monitored VM in a secure and efficient way. We have implemented a prototype of KRUISER based on Linux and the Xen/KVM hypervisor. The evaluation shows that KRUISER can detect realistic kernel heap buffer overflow attacks in cloud environments effectively with minimal cost.
- Published
- 2022
22. Implications of Data Anonymization on the Statistical Evidence of Disparity
- Author
- Nan Zhang and Heng Xu
- Subjects
Prima facie, Data anonymization, Computer science, Strategy and Management, False positives and false negatives, Privacy laws of the United States, False positive paradox, Consumer privacy, Differential privacy, Management Science and Operations Research, Data science, Disparate impact - Abstract
Research and practical development of data anonymization techniques have proliferated in recent years. Although the privacy literature has questioned the efficacy of data anonymization at protecting individuals against harms associated with re-identification, this paper raises another new set of questions: whether anonymization techniques themselves can mask statistical disparities and thus conceal evidence of disparate impact that is potentially discriminatory. If so, the choice of data anonymization technique to protect privacy, and the specific technique employed, may pick winners and losers. Examining the implications of these choices on the potentially disparate impact of privacy protection on underprivileged sub-populations is thus a critically important policy question. The paper begins with an interdisciplinary overview of two common mechanisms of data anonymization and two prevalent types of statistical evidence for disparity. In terms of data-anonymization mechanisms, the two common ones are data removal (e.g., k-anonymity), which aims to remove the part of a dataset that could potentially identify an individual; and noise insertion (e.g., differential privacy), which inserts into a dataset carefully designed noise that blocks the identification of individuals yet allows the accurate recovery of certain summary statistics. In terms of the statistical evidence for disparity, the two commonly accepted types are disparity through separation (e.g., the "two or three standard deviations" rule for a prima facie case of discrimination), which is grounded in the idea of detecting the separation between the outcome distributions for different sub-populations; and disparity through variation (e.g., the "more likely than not" rule in toxic tort cases), which concentrates on the magnitude of difference between the mean outcomes of different sub-populations. We develop a conceptual foundation and mathematical formalism demonstrating that the two data anonymization mechanisms have distinctive impacts on the identifiability of disparity, which also vary based on its statistical operationalization. Specifically, under the regime of disparity through separation, data removal tends to produce more false positives (i.e., detecting false disparity when none exists) than false negatives (i.e., failing to detect an existing disparity), while noise insertion rarely produces any false positives at all. Meanwhile, noise insertion does produce false positives (equally likely as false negatives) under the regime of disparity through variation, while the likelihood for data removal to produce false positives and false negatives depends on the underlying data distribution. We empirically validated our findings with an inpatient dataset from one of the five most populated states in the U.S. We examined four data-anonymization techniques (two in the data-removal category and two in noise insertion), ranging from the current rules used by the State of Texas to anonymize their state-wide inpatient discharge dataset to state-of-the-art differential privacy algorithms for regression analysis. After presenting the empirical results, which confirmed our conceptual and mathematical findings, we conclude the paper by discussing the business and policy implications of these findings, highlighting the need for firms and policy makers to balance the protection of privacy against the recognition and rectification of disparate impact.
In sum, our paper identifies an important knowledge gap in both tech and law fields: whether data anonymization technologies themselves can mask statistical disparities and thus conceal the evidence of disparate impact that is potentially discriminatory. The emergence of privacy laws (e.g., GDPR) gives primacy to answering this question, because if such disparate impacts do exist, legislators and regulators would be essentially picking winners and losers by requiring or incentivizing the use of data anonymization techniques. This paper tackles this timely yet complex challenge, especially given the current public discourse in the U.S. about racial discrimination, and the worldwide trend of prioritizing the protection of consumer privacy in legislations and regulations.
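For reference, the "two or three standard deviations" rule is conventionally operationalized with a two-proportion z-statistic of the following standard form; this is a generic form, and the paper's exact operationalization may differ:

```latex
% Standard two-proportion z-statistic behind the "two or three standard
% deviations" rule (generic form; the paper's statistic may differ):
z = \frac{\hat{p}_1 - \hat{p}_2}
         {\sqrt{\hat{p}\,(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}},
\qquad
\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}
```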
- Published
- 2022
23. Cluster-level statistical inference in fMRI datasets: The unexpected behavior of random fields in high dimensions
- Author
- Ravi Bansal and Bradley S. Peterson
- Subjects
Adult, Normal Distribution, Biomedical Engineering, Biophysics, Datasets as Topic, Article, Statistical power, Voxel, Statistics, Statistical inference, False positive paradox, Image Processing, Computer-Assisted, Cluster Analysis, Humans, Radiology, Nuclear Medicine and Imaging, Autistic Disorder, Child, Mathematics, Parametric statistics, Statistical hypothesis testing, Brain Mapping, Nonparametric statistics, Contrast (statistics), Brain, Magnetic Resonance Imaging - Abstract
Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control of false positive findings across these multiple hypothesis tests. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (< 6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis. Nonparametric methods, in contrast, estimated distributions from those large clusters and therefore, by construction, rejected the large clusters as false positives at the nominal FWERs. Those rejected clusters were outlying values in the distribution of cluster size but cannot be distinguished from true positive findings without further analyses, including assessing whether fMRI signal in those regions correlates with other clinical, behavioral, or cognitive measures. Rejecting the large clusters, however, significantly reduced the statistical power of nonparametric methods in detecting true findings compared with parametric methods, which would have detected most true findings that are essential for making valid biological inferences in MRI data. Parametric analyses, in contrast, detected most true findings while generating relatively few false positives: on average, less than one of those very large clusters would be deemed a true finding in each brain-wide analysis. We therefore recommend the continued use of parametric methods that model nonstationary smoothness for cluster-level, familywise control of false positives, particularly when using a cluster-defining threshold of 2.5 or higher, and subsequently assessing rigorously the biological plausibility of the findings, even for large clusters. Finally, because nonparametric methods yielded a large reduction in statistical power to detect true positive findings, we conclude that the modest reduction in false positive findings that nonparametric analyses afford does not warrant a re-analysis of previously published fMRI studies using nonparametric techniques.
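The reason voxelwise testing needs familywise control at all is the standard multiplicity argument: with m independent tests at level α, the chance of at least one false positive grows rapidly with m (real voxels are spatially correlated, which is exactly why random-field and cluster-level corrections are used instead of this worst-case bound):

```latex
% With m independent tests each at level \alpha, the probability of at
% least one false positive (the familywise error rate) is
\mathrm{FWER} = 1 - (1 - \alpha)^{m}
% e.g., \alpha = 0.001 and m = 10{,}000 independent tests gives
% \mathrm{FWER} \approx 1.
```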
- Published
- 2022
24. On the Performance of Machine Learning Models for Anomaly-Based Intelligent Intrusion Detection Systems for the Internet of Things
- Author
- Ghada Abdelmoumin, Abdul Rahman, and Danda B. Rawat
- Subjects
Artificial neural network, Computer Networks and Communications, Computer science, Deep learning, Intrusion detection system, Machine learning, Ensemble learning, Computer Science Applications, Support vector machine, Hardware and Architecture, Signal Processing, Scalability, False positive paradox, Artificial intelligence, Information Systems, Test data - Abstract
Anomaly-based machine learning-enabled intrusion detection systems (AML-IDS) show lower performance and prediction accuracy when detecting intrusions in the Internet of Things (IoT) than deep learning-based intrusion detection systems (DL-IDS). In particular, AML-IDS that employ low-complexity models for IoT, such as the Principal Component Analysis (PCA) method and the One-Class Support Vector Machine (1-SVM) method, are inefficient at detecting intrusions compared with DL-IDS using the two-class Neural Network (2-NN) method, and suffer from low detection rates. The size of the dataset and the number of features or variants in the dataset may influence how well PCA and 1-SVM AML-IDS perform compared with DL-IDS. We attribute the low performance and prediction accuracy of the AML-IDS models to an imbalanced dataset, a low similarity index between the training and testing data, and the use of a single-learner model. The intrinsic limitations of the single-learner model have a direct impact on the accuracy of an intelligent IDS, and the dissimilarity between testing and training data leads to a higher rate of false positives in AML-IDS than in DL-IDS, which have low false alarms and high predictability. In this paper, we examine the use of optimization techniques to enhance the performance of single-learner AML-IDS, such as the PCA and 1-SVM AML-IDS models, for building efficient, scalable, and distributed intelligent IDS for detecting intrusions in IoT. We evaluate these AML-IDS models by tuning hyperparameters and applying ensemble-learning optimization techniques using the Microsoft Azure ML Studio (AMLS) platform and two datasets containing malicious and benign IoT and industrial IoT (IIoT) network traffic. Furthermore, we present a comparative analysis of AML-IDS models for IoT regarding their performance and predictability.
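The 1-SVM referred to here is a one-class model fit on benign traffic only; its ν parameter upper-bounds the fraction of training points treated as outliers, i.e., the training false-alarm rate. A minimal scikit-learn sketch with synthetic stand-in features, not the paper's datasets:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
benign = rng.normal(0, 1, size=(500, 10))    # stand-in benign flow features
attacks = rng.normal(4, 1, size=(20, 10))    # stand-in attack flow features

# One-class SVM learns the boundary of "normal" traffic only; nu bounds
# the fraction of training points flagged as outliers (false alarms).
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(benign)
print((ocsvm.predict(attacks) == -1).mean())  # detection rate on attacks
print((ocsvm.predict(benign) == -1).mean())   # false-alarm rate, ~ nu
```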
- Published
- 2022
25. Algorithm optimization and anomaly detection simulation based on extended Jarvis-Patrick clustering and outlier detection
- Author
- Yao Du, Xiaohui Hu, and Wei Wang
- Subjects
Outlier Detection, Computer science, Anomaly (natural sciences), General Engineering, Process (computing), Jarvis-Patrick Clustering, Pattern recognition, Constant false alarm rate, False positive paradox, Anomaly Detection, Graph (abstract data type), Anomaly detection, Artificial intelligence, Cluster analysis, Extended Shared Nearest Neighbor - Abstract
In this paper, we analyze algorithm optimization and anomaly detection simulation based on extended Jarvis-Patrick clustering and outlier detection. We perform detection using the graph-based Jarvis-Patrick clustering method. To further improve the false alarm rate (FAR) of the algorithm, we then combine an extra outlier detection step with our proposed extended Jarvis-Patrick clustering (EJP) to create a new anomaly detection method called LD-EJP. With LD-EJP, the false alarm rate improves considerably: experiments show it can reach 4.1%, while the best JP clustering achieves 7.4%. We then tested LD-EJP against two other anomaly detection methods based on k-means and LGCCB, showing that our algorithm has a better detection rate and false alarm rate than these two clustering-based anomaly detection methods. The detection rate and false positives of the algorithm still have some room for improvement: in the labeling process, the proportion of anomaly clusters to normal clusters must be adjusted manually to find a better detection rate, and some of the detection rate gained in extended JP clustering can be traded away so that LD-EJP obtains a better FAR. Our future work therefore includes finding or proposing another outlier detection algorithm with better performance for the LD-EJP method.
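For readers unfamiliar with Jarvis-Patrick clustering, the shared-nearest-neighbor rule is compact: link two points when each appears in the other's k-nearest-neighbor list and they share at least kt neighbors, then take connected components. A sketch of the classic algorithm with illustrative parameters and data; the paper's extended variant and its outlier step differ:

```python
import numpy as np
from scipy.spatial.distance import cdist

def jarvis_patrick(X, k=6, kt=3):
    """Classic Jarvis-Patrick: link points that are mutual k-nearest
    neighbors sharing >= kt neighbors; clusters = connected components."""
    n = len(X)
    order = np.argsort(cdist(X, X), axis=1)[:, 1:k + 1]   # k-NN lists (skip self)
    knn = [set(row) for row in order]
    labels, current = [-1] * n, 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i], stack = current, [i]
        while stack:                                       # flood-fill the links
            a = stack.pop()
            for b in range(n):
                if labels[b] == -1 and a in knn[b] and b in knn[a] \
                        and len(knn[a] & knn[b]) >= kt:
                    labels[b] = current
                    stack.append(b)
        current += 1
    return labels

X = np.vstack([np.random.default_rng(0).normal(0, 0.3, (20, 2)),
               np.random.default_rng(1).normal(5, 0.3, (20, 2))])
print(set(jarvis_patrick(X)))   # expect two labels for two separated blobs
```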
- Published
- 2022
- Full Text
- View/download PDF
26. The clinical performance and cost-effectiveness of two psychosocial assessment models in maternity care: The Perinatal Integrated Psychosocial Assessment study
- Author
-
Nicole Reilly, Emma Black, Georgina M. Chambers, Willings Botha, Marie-Paule Austin, and Dawn Kingston
- Subjects
medicine.medical_specialty ,Referral ,Cost effectiveness ,Cost-Benefit Analysis ,Staffing ,Cohort Studies ,03 medical and health sciences ,0302 clinical medicine ,Pregnancy ,Maternity and Midwifery ,False positive paradox ,Humans ,Mass Screening ,Medicine ,Maternal Health Services ,030219 obstetrics & reproductive medicine ,030504 nursing ,business.industry ,Clinical performance ,Obstetrics and Gynecology ,Triage ,Pregnancy Complications ,Family medicine ,Female ,0305 other medical science ,business ,Psychosocial ,Cohort study - Abstract
Problem Although perinatal universal depression and psychosocial assessment is recommended in Australia, its clinical performance and cost-effectiveness remain uncertain. Aim To compare the performance and cost-effectiveness of two models of psychosocial assessment: Usual-Care and Perinatal Integrated Psychosocial Assessment (PIPA). Methods Women attending their first antenatal visit were prospectively recruited to this cohort study. Endorsement of significant depressive symptoms or psychosocial risk generated an ‘at-risk’ flag identifying those needing referral to the Triage Committee. Based on its detailed algorithm, a higher threshold of risk was required to trigger the ‘at-risk’ flag for PIPA than for Usual-Care. Each model’s performance was evaluated using the midwife’s agreement with the ‘at-risk’ flag as the reference standard. Cost-effectiveness was limited to the identification of True Positive and False Positive cases. Staffing costs associated with administering each screening model were quantified using a bottom-up time-in-motion approach. Findings Both models performed well at identifying ‘at-risk’ women (sensitivity: Usual-Care 0.82 versus PIPA 0.78). However, the PIPA model was more effective at eliminating False Positives and correctly identifying ‘at-risk’ women (Positive Predictive Value: PIPA 0.69 versus Usual-Care 0.41). PIPA was associated with small incremental savings for both True Positives detected and False Positives averted. Discussion Overall, PIPA performed better than Usual-Care as a psychosocial screening model and was a cost-saving and relatively effective approach for detecting True Positives and averting False Positives. These initial findings warrant evaluation of longer-term costs and outcomes of women identified by the models as ‘at-risk’ and ‘not at-risk’ of perinatal psychosocial morbidity.
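The two headline metrics in these findings come from simple confusion-matrix arithmetic, illustrated below with made-up counts chosen to land near the reported PIPA values.

```python
def sensitivity(tp, fn):
    return tp / (tp + fn)           # flagged fraction of truly at-risk women

def ppv(tp, fp):
    return tp / (tp + fp)           # truly at-risk fraction of flagged women

tp, fn, fp = 78, 22, 35             # hypothetical counts, not study data
print(f"sensitivity = {sensitivity(tp, fn):.2f}")   # 0.78, cf. PIPA 0.78
print(f"PPV         = {ppv(tp, fp):.2f}")           # 0.69, cf. PIPA 0.69
```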
- Published
- 2022
- Full Text
- View/download PDF
27. False-Positive Rates and Associated Risk Factors on the Vestibular-Ocular Motor Screening and Modified Balance Error Scoring System in US Military Personnel
- Author
-
Anne Mucha, Cyndi L. Holland, Drew Thomas, Hannah B. Bitzer, Maj Katrina Monti, Anthony P. Kontos, Maj Eliot Thomasma, Shawn R. Eagle, and Michael W. Collins
- Subjects
medicine.medical_specialty ,Motion Sickness ,Migraine Disorders ,Population ,Concussion ,Physical Therapy, Sports Therapy and Rehabilitation ,Context (language use) ,Logistic regression ,Risk Factors ,Internal medicine ,medicine ,False positive paradox ,Humans ,Orthopedics and Sports Medicine ,Medical history ,education ,Brain Concussion ,education.field_of_study ,business.industry ,General Medicine ,Odds ratio ,medicine.disease ,Cross-Sectional Studies ,Military Personnel ,Athletic Injuries ,False positive rate ,business - Abstract
Context In 2018, the US military developed the Military Acute Concussion Evaluation-2 (MACE-2) to inform the acute evaluation of mild traumatic brain injury (mTBI). However, researchers have yet to investigate false-positive rates for components of the MACE-2, including the Vestibular-Ocular Motor Screening (VOMS) and modified Balance Error Scoring System (mBESS), in military personnel. Objective To examine factors associated with false-positive results on the VOMS and mBESS in US Army Special Operations Command (USASOC) personnel. Design Cross-sectional study. Setting Military medical clinic. Patients or Other Participants A total of 416 healthy USASOC personnel completed the medical history, VOMS, and mBESS evaluations. Main Outcome Measure(s) False-positive rates for the VOMS (≥2 on VOMS symptom items, >5 cm for near point of convergence [NPC] distance) and mBESS (total score >4) were determined using χ2 analyses and independent-samples t tests. Multivariable logistic regressions (LRs) with adjusted odds ratios (aORs) were performed to identify risk factors for false-positive results on the VOMS and mBESS. The VOMS item false-positive rates ranged from 10.6% (smooth pursuits) to 17.5% (NPC). The mBESS total score false-positive rate was 36.5%. Results The multivariable LR model supported 3 significant predictors of VOMS false-positives: age (aOR = 1.07; 95% CI = 1.02, 1.12; P = .007), migraine history (aOR = 2.49; 95% CI = 1.29, 4.81; P = .007), and motion sickness history (aOR = 2.46; 95% CI = 1.34, 4.50; P = .004). Only a history of motion sickness was a significant predictor of mBESS false-positive findings (aOR = 2.34; 95% CI = 1.34, 4.05; P = .002). Conclusions False-positive rates across VOMS items were low and associated with age and a history of mTBI, migraine, or motion sickness. False-positive results for the mBESS total score were higher (36.5%) and associated only with a history of motion sickness. These risk factors for false-positive findings should be considered when administering and interpreting VOMS and mBESS components of the MACE-2 in this population.
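Adjusted odds ratios like those reported here are typically obtained by exponentiating the coefficients of a multivariable logistic regression. The sketch below uses statsmodels on simulated data, not the USASOC cohort.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 416
df = pd.DataFrame({
    "age": rng.normal(30, 6, n),
    "migraine_hx": rng.integers(0, 2, n),
    "motion_sickness_hx": rng.integers(0, 2, n),
})
# Simulate false-positive outcomes with known effects, then recover them.
logit = -4 + 0.07 * df["age"] + 0.9 * df["migraine_hx"] + 0.9 * df["motion_sickness_hx"]
df["fp"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = sm.add_constant(df[["age", "migraine_hx", "motion_sickness_hx"]])
fit = sm.Logit(df["fp"], X).fit(disp=0)
print(np.exp(fit.params))        # adjusted odds ratios
print(np.exp(fit.conf_int()))    # 95% confidence intervals
```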
- Published
- 2023
28. Robust Benchmark Structural Variant Calls of An Asian Using State-of-the-art Long-read Sequencing Technologies
- Author
-
Sanyang Liu, Yuhui Sun, Fan Liang, Linying Wang, Yuhui Xiao, Shoufang Qu, Jiezhong Zhang, Xiao Du, Xinming Liang, Shuai Sun, Yang Wang, Fei Fan, Jie Huang, Fang Chen, Wenxin Zhang, Depeng Wang, Weifei Yang, Li-Li Li, Ou Wang, Guangyi Fan, and Weijin Qiu
- Subjects
Epstein-Barr Virus Infections ,Herpesvirus 4, Human ,Computer science ,Sequence assembly ,Computational biology ,Biochemistry ,Genome ,Giga ,Structural variation ,03 medical and health sciences ,symbols.namesake ,0302 clinical medicine ,Asian People ,Genetics ,False positive paradox ,Humans ,Molecular Biology ,030304 developmental biology ,Sanger sequencing ,0303 health sciences ,High-Throughput Nucleotide Sequencing ,Sequence Analysis, DNA ,Benchmarking ,Computational Mathematics ,Haplotypes ,Benchmark (computing) ,symbols ,Nanopore sequencing ,030217 neurology & neurosurgery - Abstract
The importance of structural variants (SVs) for human phenotypes and diseases is now recognized. Although a variety of SV detection platforms and strategies that vary in sensitivity and specificity have been developed, few benchmarking procedures are available to confidently assess their performance in biological and clinical research. To facilitate the validation and application of these SV detection approaches, we established an Asian reference material by characterizing the genome of an Epstein-Barr virus (EBV)-immortalized B lymphocyte line along with identified benchmark regions and high-confidence SV calls. We established a high-confidence callset of 8938 SVs by integrating alignment-based SV calls from four data types, namely 109× Pacific Biosciences (PacBio) continuous long reads (CLRs), 22× PacBio circular consensus sequencing (CCS) reads, 104× Oxford Nanopore Technologies (ONT) long reads, and 114× Bionano optical mapping, together with one de novo assembly-based SV callset derived from the CCS reads. A total of 544 randomly selected SVs were validated by PCR amplification and Sanger sequencing, demonstrating the robustness of our SV calls. Combining trio-binning-based haplotype assemblies, we established an SV benchmark for identifying false negatives and false positives by constructing continuous high-confidence regions (CHCRs), which covered 1.46 gigabase pairs (Gb) and 6882 SVs supported by at least one diploid haplotype assembly. Establishing high-confidence SV calls for a benchmark sample characterized by multiple technologies provides a valuable resource for investigating SVs in human biology, disease, and clinical research.
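Integrating callsets from several platforms requires a rule for deciding when two calls describe the same variant. The sketch below shows one common convention, 50% reciprocal overlap with a breakpoint tolerance; it is a generic illustration, not necessarily the exact criterion used in this study.

```python
def same_sv(a, b, min_recip_overlap=0.5, bp_tol=1000):
    """a, b: (chrom, start, end, svtype) tuples describing candidate SVs."""
    if a[0] != b[0] or a[3] != b[3]:                # same chromosome and SV type
        return False
    if abs(a[1] - b[1]) > bp_tol or abs(a[2] - b[2]) > bp_tol:
        return False                                # breakpoints too far apart
    inter = min(a[2], b[2]) - max(a[1], b[1])
    if inter <= 0:
        return False
    return (inter / (a[2] - a[1]) >= min_recip_overlap and
            inter / (b[2] - b[1]) >= min_recip_overlap)

print(same_sv(("chr1", 10_000, 15_000, "DEL"),
              ("chr1", 10_200, 15_300, "DEL")))     # True: same deletion event
```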
- Published
- 2022
- Full Text
- View/download PDF
29. Two-stage lesion detection approach based on dimension-decomposition and 3D context
- Author
-
Jingyi Chen, Haiwei Pan, Tao Jin, Jiacheng Jiao, Yang Dong, and Chunling Chen
- Subjects
Multidisciplinary ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Inference ,Context (language use) ,Pattern recognition ,Convolutional neural network ,Dimension (vector space) ,Bounding overwatch ,Computer-aided diagnosis ,False positive paradox ,Sensitivity (control systems) ,Artificial intelligence ,business - Abstract
Lesion detection in Computed Tomography (CT) images is a challenging task in the field of computer-aided diagnosis, and an important issue is locating the lesion area accurately. As a branch of Convolutional Neural Networks (CNNs), 3D Context-Enhanced (3DCE) frameworks are designed to detect lesions on CT scans, but the False Positives (FPs) they produce are usually caused by inaccurate region proposals, which also slow down inference. To solve these problems, we propose a new method that integrates a dimension-decomposition region proposal network into the 3DCE framework to improve localization accuracy in lesion detection. Freed from the restriction that "anchors" impose on ratios and scales, anchors are decomposed into independent "anchor strings". Anchor strings are dynamically combined according to probability, and anchor strings of different lengths dynamically compose bounding boxes. Experiments show that the accurate region proposals generated by our model improve sensitivity at a given number of FPs and require less inference time than current methods.
- Published
- 2022
- Full Text
- View/download PDF
30. Imaging of Patients Suspected of SLAP Tear: A Cost-Effectiveness Study
- Author
-
Soterios Gyftopoulos, Jordan Conroy, Naveen Subhas, James Koo, Anthony Miniaci, and Morgan H. Jones
- Subjects
Adult ,Male ,medicine.medical_specialty ,Cost effectiveness ,Cost-Benefit Analysis ,Sensitivity and Specificity ,Mr arthrography ,Shoulder pathology ,medicine ,False positive paradox ,Humans ,Radiology, Nuclear Medicine and imaging ,Arthrography ,Sensitivity analyses ,Shoulder Joint ,business.industry ,General Medicine ,Gold standard (test) ,medicine.disease ,Magnetic Resonance Imaging ,Female ,Radiology ,Arthrogram ,Shoulder Injuries ,business ,SLAP tear - Abstract
Background: Superior labral anterior-to-posterior (SLAP) tears are a common shoulder pathology. While MRI is the imaging gold standard for diagnosis of this pathology, the cost-effectiveness of the common MRI strategies is unclear. Objective: The primary objective of our study was to determine the cost-effectiveness of the common MRI-based strategies used for the diagnosis of SLAP tears. Methods: We created decision analytic models from the U.S. health care system perspective over a two-year time horizon for a hypothetical patient population of 25-year-olds with a previous diagnosis of SLAP tear. We used the decision models to compare the differences in incremental cost-effectiveness of the common MRI strategies and resulting treatment applied for this patient type, which included combinations of 1.5T and 3T imaging and unenhanced MRI and MR arthrogram protocols. Input data on cost, probability, and utility estimates were obtained through a comprehensive literature search. The primary effectiveness outcome was quality-adjusted life years (QALY). Costs were estimated in 2017 U.S. dollars. Results: When all imaging strategies were considered, the unenhanced 3T MRI-based imaging strategy was the preferred and dominant option over 3T MR arthrography (MRA) and 1.5T imaging (MRI/MRA). When the model was run without 3T imaging as an option, 1.5T MRA was the favored option. Probabilistic sensitivity analyses confirmed the same preferred imaging strategy results. Conclusion: An unenhanced 3T MRI-based strategy is the most cost-effective imaging option for patients with suspected SLAP tear. When 3T imaging is not available, 1.5T MRA is more cost-effective than 1.5T imaging. The main driver of these results is the fact that 3T MRI and 1.5T MRA are the most specific tests in these respective scenarios, which results in fewer false positives and prevents unnecessary surgeries, leading to decreased costs. Clinical Impact: Our cost-effectiveness model findings complement prior diagnostic accuracy work, helping produce a more comprehensive approach to define imaging utility for the SLAP patient population for radiologists, clinicians, and patients who have access to various types of MRI options.
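Decision models of this kind compare strategies via the incremental cost-effectiveness ratio (ICER). A toy computation with placeholder numbers, not the study's inputs, is shown below.

```python
def icer(cost_a, qaly_a, cost_b, qaly_b):
    """Incremental cost-effectiveness ratio of strategy B versus A ($/QALY)."""
    return (cost_b - cost_a) / (qaly_b - qaly_a)

# Hypothetical inputs: A = 1.5T MRA, B = unenhanced 3T MRI
cost_a, qaly_a = 2400.0, 1.78
cost_b, qaly_b = 2150.0, 1.80
print(f"ICER = {icer(cost_a, qaly_a, cost_b, qaly_b):,.0f} $/QALY")
# Lower cost with higher QALYs means B "dominates" A, which is the sense in
# which the abstract calls unenhanced 3T MRI the dominant strategy.
```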
- Published
- 2022
- Full Text
- View/download PDF
31. Factors Influencing the False Positive Rate in CT Lung Cancer Screening
- Author
-
Mark M. Hammer, Chung Yin Kong, and Suzanne C. Byrne
- Subjects
medicine.medical_specialty ,Lung Neoplasms ,Logistic regression ,Article ,030218 nuclear medicine & medical imaging ,03 medical and health sciences ,0302 clinical medicine ,Internal medicine ,medicine ,False positive paradox ,Humans ,Mass Screening ,Radiology, Nuclear Medicine and imaging ,Lung cancer ,Lung ,Early Detection of Cancer ,Retrospective Studies ,business.industry ,Retrospective cohort study ,Odds ratio ,medicine.disease ,nervous system diseases ,030220 oncology & carcinogenesis ,Income level ,False positive rate ,Tomography, X-Ray Computed ,business ,Lung cancer screening - Abstract
Purpose To identify factors influencing the likelihood of a false positive lung cancer screening (LCS) computed tomography (CT) examination, which may lead to increased costs and patient anxiety. Materials and Methods In this retrospective study, we examined all LCS CTs performed across our healthcare network from 2014 to 2018, recording Lung-RADS category and diagnosis of lung cancer. A false positive was defined by Lung-RADS 3-4X and no diagnosis of lung cancer within 1 year. Patient demographics and smoking history, presence of emphysema, diagnosis of chronic obstructive pulmonary disease, radiologist years of experience and annual volume, income level by patient zip code, and screening institution were evaluated in a multivariate logistic regression model for false positive exams. Results A total of 5835 LCS CTs were included from 3735 patients. Lung cancer was diagnosed in 142 cases (2%). Of the LCS CTs, 905 (16%) were positive by Lung-RADS, and 766 (13%) represented false positives. Logistic regression analysis showed that screening institution (odds ratios [OR] 0.91–2.43), baseline scan (OR 1.43), radiologist experience (OR 0.59), patient age (OR 2.08), diagnosis of chronic obstructive pulmonary disease (OR 1.34), presence of emphysema (OR 1.32), and income level (OR 0.43) were significant predictors of false positives. Conclusion A number of patient-specific and site/radiologist-specific factors influence the false positive rate in CT LCS. In particular, radiologists with less experience had a higher false positive rate. Screening programs may wish to develop quality assurance programs to compare the false positive rates of their radiologists to national benchmarks.
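The false-positive definition above (Lung-RADS 3-4X with no lung cancer diagnosis within one year) is easy to operationalize over a screening table; the column names in this pandas sketch are invented for illustration.

```python
import pandas as pd

scans = pd.DataFrame({
    "lung_rads": ["2", "3", "4A", "1", "4X", "3", "2", "4B"],
    "cancer_within_1yr": [0, 0, 1, 0, 0, 0, 0, 1],
})

positive = scans["lung_rads"].isin(["3", "4A", "4B", "4X"])   # Lung-RADS 3-4X
scans["false_positive"] = positive & (scans["cancer_within_1yr"] == 0)

print("positive rate      :", positive.mean())
print("false-positive rate:", scans["false_positive"].mean())
```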
- Published
- 2022
- Full Text
- View/download PDF
32. Explaining Static Analysis With Rule Graphs
- Author
-
Lisa Nguyen Quang Do and Eric Bodden
- Subjects
Computer science ,business.industry ,020207 software engineering ,Static program analysis ,02 engineering and technology ,Static analysis ,Task (project management) ,Set (abstract data type) ,Taint checking ,Simple (abstract algebra) ,0202 electrical engineering, electronic engineering, information engineering ,Code (cryptography) ,False positive paradox ,Software engineering ,business ,Software - Abstract
As static data-flow analysis becomes able to report increasingly complex bugs, using an ever-growing set of complex internal rules encoded into flow functions, the analysis tools themselves grow more and more complex. As a result, for users to be able to use those tools effectively on specific codebases, the tools require special configurations, a task which in industry is typically performed by individual developers or dedicated teams. To efficiently use and configure static analysis tools, developers need to build a certain understanding of the analysis' rules, i.e., how the underlying analyses interpret the analyzed code and their reasoning for reporting certain warnings. In this article, we explore how to assist developers in understanding the analysis' warnings and finding weaknesses in the analysis' rules. To this end, we introduce the concept of rule graphs, which expose to the developer selected information about the internal rules of data-flow analyses. We have implemented rule graphs on top of a taint analysis and show how the graphs can support the abovementioned tasks. Our user study and empirical evaluation show that using rule graphs helps developers understand analysis warnings more accurately than using simple warning traces, and that rule graphs can help developers identify causes for false positives in analysis rules.
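As a toy illustration of the idea (not the authors' implementation), a rule graph can be as simple as analysis facts connected by the flow-function rules that relate them, letting a developer trace the rule chain behind a warning:

```python
# Nodes are analysis facts; edges carry the taint rule that links them.
rules = {
    ("source:getParameter", "tainted:req"): "R1: HTTP parameters are tainted",
    ("tainted:req", "tainted:query"): "R2: string concatenation propagates taint",
    ("tainted:query", "sink:executeQuery"): "R3: tainted SQL reaching a sink is reported",
}

def explain(path):
    """Print the chain of rules justifying a warning along a fact path."""
    for src, dst in zip(path, path[1:]):
        print(f"{src} -> {dst}  [{rules[(src, dst)]}]")

explain(["source:getParameter", "tainted:req",
         "tainted:query", "sink:executeQuery"])
```

If a developer disagrees with one edge in the chain, that edge points directly at the analysis rule to inspect, which is the kind of weakness-finding the article studies.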
- Published
- 2022
- Full Text
- View/download PDF
33. A New Family of Similarity Measures for Scoring Confidence of Protein Interactions Using Gene Ontology
- Author
-
Madhusudan Paul and Ashish Anand
- Subjects
Gene ontology ,Computer science ,Applied Mathematics ,Low Confidence ,Computational Biology ,Proteins ,Computational biology ,Bottleneck ,Protein–protein interaction ,Gene Ontology ,Similarity (network science) ,Taxonomy (general) ,Protein Interaction Mapping ,Genetics ,False positive paradox ,Cluster Analysis ,Fraction (mathematics) ,Cluster analysis ,Set (psychology) ,Gene ,Algorithms ,Biotechnology - Abstract
The large-scale protein-protein interaction (PPI) data has the potential to play a significant role in the endeavor of understanding cellular processes. However, the presence of a considerable fraction of false positives is a bottleneck in realizing this potential. There have been continuous efforts to utilize complementary resources for scoring the confidence of PPIs in such a way that false positive interactions receive a low confidence score. Gene Ontology (GO), a taxonomy of biological terms representing the properties of gene products and their relations, has been widely used for this purpose. We utilize GO to introduce a new set of specificity measures: Relative Depth Specificity (RDS), Relative Node-based Specificity (RNS), and Relative Edge-based Specificity (RES), leading to a new family of similarity measures. We use these similarity measures to obtain a confidence score for each PPI. We evaluate the new measures using four different benchmarks. We show that all three measures are quite effective. Notably, RNS and RES distinguish true PPIs from false positives more effectively than the existing alternatives. RES also shows a robust set-discriminating power and can be useful for protein functional clustering as well.
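The abstract does not give the formulas for RDS, RNS, and RES, so the sketch below only illustrates the generic ingredient they refine: scoring how specific a GO term is from the ontology graph, here with a node-count-based measure on a toy fragment.

```python
import math

# Toy GO fragment: child term -> parent terms
parents = {
    "binding": ["molecular_function"],
    "protein binding": ["binding"],
    "kinase binding": ["protein binding"],
}

children = {}
for c, ps in parents.items():
    for p in ps:
        children.setdefault(p, []).append(c)

def descendants(term):
    out, stack = set(), [term]
    while stack:
        for c in children.get(stack.pop(), []):
            if c not in out:
                out.add(c)
                stack.append(c)
    return out

total_terms = len(parents) + 1    # all terms, including the root

def node_specificity(term):
    """More descendants -> less specific; leaf terms score 1."""
    return 1 - math.log(len(descendants(term)) + 1) / math.log(total_terms)

for t in ["molecular_function", "binding", "kinase binding"]:
    print(t, round(node_specificity(t), 3))   # 0.0, ~0.21, 1.0
```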
- Published
- 2022
- Full Text
- View/download PDF
34. Deep learning enhancement on mammogram images for breast cancer detection
- Author
-
Ashok Kumar Sahoo, Chaitanya Singla, Pramod K. Singh, and Pradeepta Kumar Sarangi
- Subjects
010302 applied physics ,Pixel ,business.industry ,Computer science ,Process (computing) ,Cancer ,Pattern recognition ,02 engineering and technology ,021001 nanoscience & nanotechnology ,medicine.disease ,01 natural sciences ,Thresholding ,Breast cancer ,0103 physical sciences ,medicine ,False positive paradox ,Segmentation ,Noise (video) ,Artificial intelligence ,0210 nano-technology ,business - Abstract
Breast cancer mortality can be reduced substantially if breast lesions are correctly classified as malignant or benign, but this classification is complicated by errors in which noise pixels are detected as false positives. Mammogram images play a vital role in breast cancer examination, indicating the cancerous regions that need to be targeted, yet mammograms are inherently low-quality images that require enhancement to be better defined. We evaluate the viability of pre-processing techniques for enhancing mammogram images using performance metrics: a good filtering process yields a high PSNR and a low MSE value. The suggested techniques were applied to the Mammographic Image Analysis Society (MIAS) database, which contains 322 images. Thresholding-based segmentation is then applied to the enhanced image, which helps achieve the required results.
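The two image-quality metrics named above, MSE and PSNR, and the thresholding step are straightforward to compute; the sketch below uses synthetic arrays in place of MIAS mammograms.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def psnr(a, b, peak=255.0):
    m = mse(a, b)
    return float("inf") if m == 0 else 10 * np.log10(peak ** 2 / m)

rng = np.random.default_rng(3)
clean = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
noisy = np.clip(clean + rng.normal(0, 10, clean.shape), 0, 255).astype(np.uint8)

print(f"MSE  = {mse(clean, noisy):.1f}")         # lower is better
print(f"PSNR = {psnr(clean, noisy):.1f} dB")     # higher is better

mask = noisy > 128                               # global thresholding segmentation
print("segmented fraction:", round(float(mask.mean()), 3))
```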
- Published
- 2022
- Full Text
- View/download PDF
35. Cosaliency Detection and Region-of-Interest Extraction via Manifold Ranking and MRF in Remote Sensing Images
- Author
-
Libao Zhang and Hanlin Wu
- Subjects
Schema (genetic algorithms) ,Markov random field ,Region of interest ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Fuse (electrical) ,False positive paradox ,General Earth and Planetary Sciences ,Function (mathematics) ,Electrical and Electronic Engineering ,Thresholding ,Energy (signal processing) ,Remote sensing - Abstract
Saliency-based region-of-interest (ROI) extraction is significant for the interpretation of remote sensing images (RSIs). Recently, cosaliency detection has shown its superiority of better extraction of common ROIs by using both intraimage and interimage cues. However, most existing methods still suffer from the complex backgrounds of RSIs, resulting in incomplete ROI extraction, many false positives, and blurred boundaries. In this article, we propose a cosaliency detection framework via manifold ranking and the Markov random field (MRF) for RSIs to address these problems. First, we design a two-stage manifold ranking schema for converting single-image saliency maps (SISMs) to multi-image saliency maps (MISMs). This step takes full advantage of the correlation between images to improve the integrity of ROIs and reduce false positives. Second, we locally fuse saliency proposals by minimizing the energy function in an MRF. The design of the energy function comprehensively considers the global and local performance of saliency proposals to assign appropriate fusion weights. Finally, we generate the ROI masks by thresholding the cosaliency maps. Our approach is evaluated on four RSI datasets and compared to the state-of-the-art methods. Experimental results demonstrate the effectiveness of our model in both cosaliency detection and ROI extraction.
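A minimal sketch of the manifold-ranking ingredient follows: given an affinity matrix W over image regions and query scores y, ranking scores take the standard closed form f = (I - alpha*S)^(-1) y, with S the symmetrically normalized affinity. The paper's two-stage schema and MRF fusion are not reproduced here.

```python
import numpy as np

def manifold_ranking(W, y, alpha=0.99):
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))              # D^(-1/2) W D^(-1/2)
    return np.linalg.solve(np.eye(len(W)) - alpha * S, y)

# Four toy regions: 0-1 are mutually similar, 2-3 are mutually similar.
W = np.array([[0.0, 0.9, 0.1, 0.1],
              [0.9, 0.0, 0.1, 0.1],
              [0.1, 0.1, 0.0, 0.9],
              [0.1, 0.1, 0.9, 0.0]])
y = np.array([1.0, 0.0, 0.0, 0.0])               # region 0 is the salient query
print(manifold_ranking(W, y).round(3))           # region 1 ranks far above 2-3
```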
- Published
- 2022
- Full Text
- View/download PDF
36. Quantifying Benefits and Harms of Lung Cancer Screening in an Underserved Population: Results From a Prospective Study
- Author
-
Freda Patterson, Grace X. Ma, Simran Randhawa, Mark Weir, Rachel Kim, and Cherie P. Erkmen
- Subjects
Pulmonary and Respiratory Medicine ,medicine.medical_specialty ,Lung Neoplasms ,030204 cardiovascular system & hematology ,Vulnerable Populations ,Article ,03 medical and health sciences ,Underserved Population ,0302 clinical medicine ,Internal medicine ,medicine ,False positive paradox ,Humans ,Mass Screening ,Prospective Studies ,Prospective cohort study ,Lung cancer ,Early Detection of Cancer ,Health Equity ,business.industry ,nutritional and metabolic diseases ,Cancer ,General Medicine ,medicine.disease ,Annual Screening ,Treatment Outcome ,030228 respiratory system ,Surgery ,National Lung Screening Trial ,Cardiology and Cardiovascular Medicine ,business ,Lung cancer screening - Abstract
OBJECTIVE: Lung cancer screening with annual Low-Dose CT (LDCT) reduces lung cancer death by 20-26%. However, potential harms of screening include false positive results, procedures arising from false positives, procedural complications and failure to adhere to follow-up recommendations. In diverse, underserved populations, it is unknown if the benefits of early lung cancer detection outweigh the harms. METHODS: We conducted a prospective observational study of lung cancer screening participants in an urban, safety-net institution from September 2014 to June 2020. We measured benefits of screening in terms of cancer diagnosis, stage and treatment. We measured harms of screening by calculating the false positive rate, procedures as a result of false positive screens, procedural complications and failure to follow up with recommended care. For patients with 3-year follow-up, we measured these same outcomes in addition to compliance with annual screening. RESULTS: Of 1509 participants, 55.6% were African American, 35.2% White, 8.1% Hispanic and 0.5% Asian. Screening resulted in cancer detection and treatment in 2.8%. A false positive and a procedure as a result of a false positive occurred in 9.2% and 0.8% of participants, respectively, with no major complications from diagnostic procedures or treatment. Adherence to annual screening was low: 18.7%, 3.7% and 0.4% at 1, 2 and 3 years after baseline screening, respectively. CONCLUSIONS: Multidisciplinary lung cancer screening in a safety-net institution can successfully detect and treat lung cancer with few harms from false positive screens, procedures after false positive screens or major complications. However, adherence to annual screening is poor.
- Published
- 2022
- Full Text
- View/download PDF
37. Novel Computer-Aided Diagnosis Software for the Prevention of Retained Surgical Items
- Author
-
Shun Yamaguchi, Masaaki Hidaka, Shin Hamauzu, Shinichiro Ono, Masahiko Yamada, Susumu Eguchi, Toru Fukuda, Toshiyuki Tsurumoto, Masataka Uetani, and Akihiko Soyama
- Subjects
medicine.medical_specialty ,Training set ,Software ,business.industry ,Computer-aided diagnosis ,Surgical Sponges ,False positive paradox ,Medicine ,Surgery ,Software performance testing ,Radiology ,Retained Surgical Items ,business - Abstract
Background Retained surgical items are a serious human error. Surgical sponges account for 70% of retained surgical items. To prevent retained surgical sponges, it is important to establish a system that can identify errors and avoid the occurrence of adverse events. To date, no computer-aided diagnosis software specialized for detecting retained surgical sponges has been reported. We developed a software program that enables easy and effective computer-aided diagnosis of retained surgical sponges with high sensitivity and specificity using the technique of deep learning, a subfield of artificial intelligence. Study Design In this study, we developed the software by training it through deep learning using a dataset and then validating the software. The dataset consisted of a training set and validation set. We created composite x-rays consisting of normal postoperative x-rays and surgical sponge x-rays for a training set (n = 4,554) and a validation set (n = 470). Phantom x-rays (n = 12) were prepared for software validation. X-rays obtained with surgical sponges inserted into cadavers were used for validation purposes (formalin: Thiel's method = 252:117). In addition, postoperative x-rays without retained surgical sponges were used for the validation of software performance to determine false-positive rates. Sensitivity, specificity, and false positives per image were calculated. Results In the phantom x-rays, both the sensitivity and specificity in software image interpretation were 100%. The software achieved 97.7% sensitivity and 83.8% specificity in the composite x-rays. In the normal postoperative x-rays, 86.6% specificity was achieved. In reading the cadaveric x-rays, the software attained both sensitivity and specificity of >90%. Conclusions Software with high sensitivity for diagnosis of retained surgical sponges was developed successfully.
- Published
- 2021
- Full Text
- View/download PDF
38. Fish forewarning of comprehensive toxicity in water environment based on Bayesian sequential method
- Author
-
Jiang Jie, Heyu Xiang, Li Tang, Mei Ma, Liu Yong, Wei Wang, Xin Zhang, Yiping Xu, Kaifeng Rao, Tang Liang, and Zijian Wang
- Subjects
Environmental Engineering ,Computer science ,Bayesian probability ,Reproducibility of Results ,Water ,Bayes Theorem ,General Medicine ,computer.software_genre ,Wavelet ,Feature (computer vision) ,Outlier ,False positive paradox ,Water environment ,Animals ,Environmental Chemistry ,Anomaly detection ,Data mining ,Environmental noise ,computer ,Algorithms ,Probability ,General Environmental Science - Abstract
The environmental impact of pollutants can be analyzed effectively by acquiring fish behavioral signals in water with biological behavior sensors. However, several factors, such as the complexity of the organisms themselves, device error, and environmental noise, may compromise the accuracy and timeliness of model predictions. Current methods lack prior knowledge about the fish behavioral signals corresponding to characteristic pollutants, so when a pollutant invades, the behavioral signals are poorly discriminated. We therefore propose a novel Bayesian sequential method that utilizes multi-channel prior knowledge to compute an outlier sequence from wavelet features and then calculates the anomaly probability of the observed values. Furthermore, the relationship between anomaly probability and toxicity is analyzed in order to achieve effective forewarning. Finally, our fish toxicity detection algorithm is verified against laboratory data for characteristic pollutants. The results show that only one false positive occurred across six experiments; the algorithm is effective in suppressing false positives and false negatives, which increases the reliability of toxicity detection and gives it applicability and universality in engineering applications.
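A hedged sketch of the sequential-Bayesian idea follows: each observation's likelihood under "normal" versus "exposed" behavior updates the running probability of an anomaly. The Gaussian likelihoods, parameters, and signal are placeholders; the paper's wavelet features and multi-channel priors are not reproduced.

```python
import math

def sequential_posterior(obs, prior=0.01, mu0=0.0, mu1=3.0, sigma=1.0):
    """Posterior P(anomaly) after each observation, Gaussian likelihoods."""
    def pdf(x, mu):
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    p, out = prior, []
    for x in obs:
        num = p * pdf(x, mu1)
        p = num / (num + (1 - p) * pdf(x, mu0))   # Bayes update, carried forward
        out.append(p)
    return out

signal = [0.1, -0.3, 0.2, 2.8, 3.1, 2.9]   # behavior shifts mid-sequence
print([round(p, 3) for p in sequential_posterior(signal)])
```

Requiring the posterior to stay high over several consecutive observations before raising an alarm is one simple way such a scheme suppresses isolated false positives.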
- Published
- 2021
- Full Text
- View/download PDF
39. Predicting risk of satellite collisions using machine learning
- Author
-
Michal Myller, Lukasz Tulczyjew, Jakub Nalepa, Daniel Kostrzewa, and Michal Kawulok
- Subjects
Collision avoidance (spacecraft) ,Spacecraft ,Computer science ,business.industry ,Process (computing) ,Aerospace Engineering ,Space (commercial competition) ,Machine learning ,computer.software_genre ,Collision risk ,Task (project management) ,False positive paradox ,Satellite ,Artificial intelligence ,Safety, Risk, Reliability and Quality ,business ,computer - Abstract
Active collision avoidance has become an important task in space operations nowadays, and hundreds of alerts corresponding to close encounters of a satellite and other space objects are typically issued for a satellite in Low Earth Orbit every week. Such alerts are provided in the form of conjunction data messages, and only about two actionable alerts per spacecraft and week remain to be resolved after analyzing all cases. Therefore, building fully automated techniques for predicting the collision risk can help make the process of avoiding collisions less costly, as the number of false positives could be substantially reduced. In this paper, we present our machine learning techniques for this task which we exploited in the Collision Avoidance Challenge organized by the European Space Agency, in which we took the seventh place as the DunderMifflin Team (out of 97 registered participants).
- Published
- 2021
- Full Text
- View/download PDF
40. An Alert to Possible False Positives With a Commercial Assay for MET Exon 14 Skipping
- Author
-
Takashi Teishikata, Yasushi Yatabe, Yuki Shinno, Takako Ishiyama, Jumpei Kashima, Yoshihisa Kobayashi, Taisuke Mori, Tatsuya Yoshida, and Kouya Shiraishi
- Subjects
Pulmonary and Respiratory Medicine ,Oncology ,medicine.medical_specialty ,Lung Neoplasms ,Capmatinib ,business.industry ,Exons ,Proto-Oncogene Proteins c-met ,Clinical trial ,Exon ,Splice Donor Site ,Carcinoma, Non-Small-Cell Lung ,Internal medicine ,Mutation ,False positive paradox ,Humans ,Medicine ,In patient ,business - Abstract
Introduction Because molecular-targeted drugs against MET exon 14 (METex14) skipping have been approved, molecular testing for the alteration has been added to clinical guidelines. There are several such assays, but methodological issues have been reported. Methods METex14 skipping results from three assays (Oncomine DxTT, ArcherMET, and a laboratory-developed reverse-transcriptase polymerase chain reaction test [LDT RT-PCR]) were compared in a relatively small series of specimens diagnosed as advanced NSCLC (n = 50). Results The ArcherMET and LDT RT-PCR results were identical for all 50 samples, but eight samples had discordant results between Oncomine DxTT and the other two assays. All eight samples had METex14 skipping with Oncomine DxTT and wild-type signals with ArcherMET and LDT RT-PCR. The discordance might be caused by homopolymeric error at the splice donor site with Oncomine DxTT, and the false positives could be distinguished by their relatively low read counts. Conclusions Although cautions about detecting METex14 skipping in the literature focus on false negatives, false positives were first noted at a relatively high frequency (8 of 26, 30.8%) in this study. According to the results of previous clinical trials using other tyrosine kinase inhibitors, it can be surmised that MET inhibitor treatment in patients without METex14 skipping is detrimental. Clinicians need to be alert to false positives that can lead to harmful treatments.
- Published
- 2021
- Full Text
- View/download PDF
41. PLSDB: advancing a comprehensive database of bacterial plasmids
- Author
-
Fabian Kern, Anna Hartung, Rolf Müller, Tobias Fehlmann, Georges Pierre Schmartz, Andreas Keller, and Pascal Hirsch
- Subjects
AcademicSubjects/SCI00010 ,Firmicutes ,Biology ,computer.software_genre ,User-Computer Interface ,Upload ,Databases, Genetic ,Proteobacteria ,Genetics ,False positive paradox ,Database Issue ,Preprocessor ,Relevance (information retrieval) ,computer.programming_language ,Internet ,Bacteria ,Virulence ,Database ,Application programming interface ,Bacteroidetes ,Drug Resistance, Microbial ,Molecular Sequence Annotation ,Python (programming language) ,Actinobacteria ,Workflow ,Metagenomics ,Spirochaetales ,computer ,Tenericutes ,Plasmids - Abstract
Plasmids are known to contain genes encoding virulence factors and antibiotic resistance mechanisms, and their relevance in metagenomic data processing is steadily growing. However, with the increasing popularity and scale of metagenomics experiments, the number of reported plasmids is rapidly growing as well, amassing a considerable number of false positives due to undetected misassemblies. Here, our previously published database PLSDB provides a reliable resource for researchers to quickly compare their sequences against selected and annotated previous findings. Within two years, the size of this resource has more than doubled from the initial 13,789 to now 34,513 entries over the course of eight regular data updates. For this update, we aggregated community feedback for major changes to the database, featuring new analysis functionality as well as performance, quality, and accessibility improvements. New filtering steps, annotations, and preprocessing of existing records improve the quality of the provided data. Additionally, new features implemented in the web server ease user interaction and allow for a deeper understanding of custom uploaded sequences by visualizing similarity information. Lastly, an application programming interface was implemented along with a Python library to allow remote database queries in automated workflows. The latest release of PLSDB is freely accessible under https://www.ccb.uni-saarland.de/plsdb. Graphical Abstract: PLSDB aggregates plasmid and metadata from different public resources. This data passes through several enhanced annotation and filtering steps. Finally, the information is presented in a web server, providing two new functionalities.
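The abstract announces an API and a Python library but does not document the routes, so the sketch below is deliberately generic: a plain HTTP query with requests in which the endpoint path and parameter name are invented placeholders, not the actual PLSDB API.

```python
import requests

BASE = "https://www.ccb.uni-saarland.de/plsdb"    # real site; route below is hypothetical

def search_plasmids(query: str):
    resp = requests.get(f"{BASE}/api/search",     # placeholder route, for illustration
                        params={"q": query}, timeout=30)
    resp.raise_for_status()
    return resp.json()

# search_plasmids("my_plasmid_accession")  # consult the PLSDB docs for the real routes
```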
- Published
- 2021
- Full Text
- View/download PDF
42. Combining First and Second-Tier Newborn Screening in a Single Assay Using High-Throughput Chip-Based Capillary Electrophoresis Coupled to High-Resolution Mass Spectrometry
- Author
-
Samantha L. Isenberg, C. Austin Pickens, Carla D. Cuthbert, and Konstantinos Petritis
- Subjects
Flow injection analysis ,Newborn screening ,Chromatography ,Chemistry ,Biochemistry (medical) ,Clinical Biochemistry ,Infant, Newborn ,Electrophoresis, Capillary ,Tandem mass spectrometry ,Mass spectrometry ,Chip ,Dried blood spot ,Neonatal Screening ,Capillary electrophoresis ,Tandem Mass Spectrometry ,Flow Injection Analysis ,False positive paradox ,Humans ,Dried Blood Spot Testing ,Biomarkers - Abstract
Background Most first-tier newborn screening (NBS) biomarkers are evaluated by a 2-min flow injection analysis coupled to tandem mass spectrometry (FIA-MS/MS) assay. The absence of separation prior to MS/MS analysis can lead to false positives and inconclusive results due to interferences by nominal isobars and isomers. Therefore, many presumptive positive specimens require confirmation by a higher-specificity second-tier assay employing separations, which requires additional time and resources prior to patient follow-up. Methods A 3.2-mm punch was taken from dried blood spot (DBS) specimens and extracted using a solution containing isotopically labeled internal standards for quantification. Analyses were carried out in positive mode using a commercially available microfluidic capillary electrophoresis (CE) system coupled to a high-resolution mass spectrometer (HRMS). Results The CE-HRMS platform quantified 35 first- and second-tier biomarkers from a single injection. Conclusions Our CE-HRMS assay is capable of multiplexing first- and second-tier NBS biomarkers into a single assay with a short acquisition time.
- Published
- 2021
- Full Text
- View/download PDF
43. Differences in lesion interpretation between radiologists in two countries: Lessons from a digital breast tomosynthesis training test set
- Author
-
Phuong Dung Trieu, Tong Li, Ziba Gandomkar, Patrick C. Brennan, and Sarah J. Lewis
- Subjects
medicine.medical_specialty ,education ,Breast Neoplasms ,Normal case ,Lesion ,Breast cancer ,Radiologists ,medicine ,False positive paradox ,Humans ,Mammography ,Breast ,medicine.diagnostic_test ,business.industry ,Australia ,Lesion types ,General Medicine ,Digital Breast Tomosynthesis ,medicine.disease ,Oncology ,Test set ,Female ,Radiology ,medicine.symptom ,business - Abstract
INTRODUCTION In many western countries, there is good evidence documenting the performance of radiologists reading digital breast tomosynthesis (DBT) images. However, the diagnostic efficiency of Chinese radiologists using DBT, particularly the types of errors made and the types of cancers missed, is understudied. This study aims to investigate the pattern of diagnostic errors across different lesion types produced by Chinese radiologists diagnosing from DBT images, with Australian radiologists used as a benchmark. METHODS Twelve Chinese radiologists read a DBT test set and located each perceived cancer lesion. True positives, false positives (FP), true negatives and false negatives (FN) were generated. The same test set was also read by 14 Australian radiologists. Z-scores and Pearson correlations were used to compare the interpretation of lesions and the identification of normal appearances between the two groups of radiologists. RESULTS Architectural distortions (p
- Published
- 2021
- Full Text
- View/download PDF
44. Noninfectious influencers of early-onset sepsis biomarkers
- Author
-
Caterina Tiozzo and Sagori Mukhopadhyay
- Subjects
medicine.medical_specialty ,business.industry ,medicine.drug_class ,Antibiotics ,Infant, Newborn ,Inflammation ,medicine.disease ,Meconium Aspiration Syndrome ,Sepsis ,Early onset sepsis ,Immune system ,Surgical Procedures, Operative ,Hypoxia-Ischemia, Brain ,Pediatrics, Perinatology and Child Health ,medicine ,False positive paradox ,Humans ,Biomarker (medicine) ,Neonatal Sepsis ,medicine.symptom ,business ,Intensive care medicine ,Biomarkers ,Infectious agent - Abstract
Diagnostic tests for sepsis aim to either detect the infectious agent (such as microbiological cultures) or detect host markers that commonly change in response to an infection (such as C-reactive protein). The latter category of tests has advantages compared to culture-based methods, including a quick turnaround time and in some cases lower requirements for blood samples. They also provide information on the immune response of the host, a critical determinant of clinical outcome. However, they do not always differentiate nonspecific host inflammation from true infection and can inadvertently lead to antibiotic overuse. Multiple noninfectious conditions unique to neonates in the first days after birth can lead to inflammatory marker profiles that mimic those seen among infected infants. Our goal was to review noninfectious conditions and patient characteristics that alter host inflammatory markers commonly used for the diagnosis of early-onset sepsis. Recognizing these conditions can focus the use of biomarkers on patients most likely to benefit while avoiding scenarios that promote false positives. We highlight approaches that may improve biomarker performance and emphasize the need to use patient outcomes, in addition to conventional diagnostic performance analysis, to establish clinical utility.
- Published
- 2021
- Full Text
- View/download PDF
45. FA-YOLO: An Improved YOLO Model for Infrared Occlusion Object Detection under Confusing Background
- Author
-
Pin Zhang, Peng Xiang, Shuangjiang Du, Baofu Zhang, and Hong Xue
- Subjects
Technology ,Article Subject ,Computer Networks and Communications ,Infrared ,Computer science ,business.industry ,TK5101-6720 ,Object detection ,Field (computer science) ,Telecommunication ,False positive paradox ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,Focus (optics) ,business ,Transfer of learning ,F1 score ,Information Systems ,Block (data storage) - Abstract
Infrared target detection is a popular application area in object detection, as well as a challenging one. This paper proposes the focus and attention mechanism-based YOLO (FA-YOLO), an improved method for detecting occluded infrared vehicles against the complex backgrounds of remote sensing images. Firstly, we use a GAN to create infrared images from visible-light datasets, producing sufficient data for training, and we apply transfer learning. Then, to mitigate the impact of useless and complex background information, we propose a negative-sample focusing mechanism that concentrates training on confusing negative samples to suppress false positives and increase detection precision. Finally, to enhance the features of small infrared targets, we add the dilated convolutional block attention module (dilated CBAM) to the CSPdarknet53 in the YOLOv4 backbone. To verify the superiority of our model, we carefully select 318 infrared occluded vehicle images from the VIVID-infrared dataset for testing. The detection accuracy (mAP) improves from 79.24% to 92.95%, and the F1 score improves from 77.92% to 88.13%, demonstrating a significant improvement in the detection of small occluded vehicles in infrared imagery.
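The abstract does not spell out how the dilated CBAM is built, so the PyTorch sketch below assumes the usual CBAM layout (channel attention followed by spatial attention) with the spatial convolution made dilated; treat it as an interpretation, not the paper's exact module.

```python
import torch
import torch.nn as nn

class DilatedCBAM(nn.Module):
    def __init__(self, channels, reduction=16, dilation=2):
        super().__init__()
        self.mlp = nn.Sequential(                      # shared channel-attention MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7,  # dilated spatial attention
                                 padding=3 * dilation, dilation=dilation)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))             # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))              # global max pooling branch
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        s = torch.cat([x.mean(dim=1, keepdim=True),    # channel-wise avg and max maps
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

feat = torch.randn(2, 64, 32, 32)
print(DilatedCBAM(64)(feat).shape)                     # torch.Size([2, 64, 32, 32])
```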
- Published
- 2021
- Full Text
- View/download PDF
46. AnnoSINE: a short interspersed nuclear elements annotation tool for plant genomes
- Author
-
Yang Li, Ning Jiang, and Yanni Sun
- Subjects
Genome evolution ,Physiology ,Computer science ,Arabidopsis ,Reproducibility of Results ,Guidelines as Topic ,Oryza ,Plant Science ,Computational biology ,Plant genomes ,Pipeline (software) ,Genome ,Annotation ,Genetics ,False positive paradox ,Sine ,Hidden Markov model ,Data Curation ,Genome, Plant ,Short Interspersed Nucleotide Elements - Abstract
Short interspersed nuclear elements (SINEs) are a widespread type of small transposable element (TE). With increasing evidence for their impact on gene function and genome evolution in plants, accurate genome-scale SINE annotation becomes a fundamental step for studying the regulatory roles of SINEs and their relationship with other components in the genomes. Despite the overall promising progress made in TE annotation, SINE annotation remains a major challenge. Unlike some other TEs, SINEs are short and heterogeneous, and they usually lack well-conserved sequence or structural features. Thus, current SINE annotation tools have either low sensitivity or high false discovery rates. Given the demand and challenges, we aimed to provide a more accurate and efficient SINE annotation tool for plant genomes. The pipeline starts with maximizing the pool of SINE candidates via profile hidden Markov model-based homology search and de novo SINE search using structural features. Then, it excludes the false positives by integrating all known features of SINEs and the features of other types of TEs that can often be misannotated as SINEs. As a result, the pipeline substantially improves the tradeoff between sensitivity and accuracy, with both values close to or over 90%. We tested our tool in Arabidopsis thaliana and rice (Oryza sativa), and the results show that our tool competes favorably against existing SINE annotation tools. The simplicity and effectiveness of this tool would potentially be useful for generating more accurate SINE annotations for other plant species. The pipeline is freely available at https://github.com/yangli557/AnnoSINE.
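At its core, such a pipeline pools candidates from the homology search and the structural search and then filters out implausible ones. The toy sketch below illustrates that pooling-and-filtering shape only; the thresholds and feature flags are illustrative, not AnnoSINE's actual rules.

```python
def plausible_sine(cand):
    """cand: dict with coordinates plus simple SINE-like evidence flags."""
    length = cand["end"] - cand["start"]
    return 80 <= length <= 600 and cand["has_tsd"] and not cand["overlaps_ltr"]

# Candidates from the two search arms (coordinates and flags are made up).
hmm_hits = [{"id": "c1", "start": 100, "end": 350,
             "has_tsd": True, "overlaps_ltr": False}]
structural = [{"id": "c2", "start": 900, "end": 2400,
               "has_tsd": False, "overlaps_ltr": False}]

candidates = {c["id"]: c for c in hmm_hits + structural}   # union of both searches
annotated = [c["id"] for c in candidates.values() if plausible_sine(c)]
print(annotated)   # ['c1']; 'c2' fails the length and TSD filters
```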
- Published
- 2021
- Full Text
- View/download PDF
47. Diagnostic Value of ARFI in Breast Lesions
- Author
-
J. Jayapriya and S. Arul Murugan
- Subjects
Alternative methods ,medicine.medical_specialty ,medicine.diagnostic_test ,business.industry ,medicine.disease ,Fibroadenoma ,Palpation ,Breast cancer ,Biopsy ,medicine ,False positive paradox ,Radiology ,Elastography ,Tissue stiffness ,skin and connective tissue diseases ,business - Abstract
Breast cancer has become the most prominent cancer type in women worldwide. Its prevalence has increased in recent years due to changes in lifestyle, and relapse among patients appears to be higher. Acoustic radiation force impulse (ARFI) imaging is based on the principle of ultrasonic elasticity, and elastography can accurately measure the changes in breast cancer tissue relative to normal tissue. It is a technical alternative to palpation and is able to measure lesions smaller than 10 mm, in contrast to biopsy, where reduced deformability can occur and lead to biopsy failure. In fibroadenoma, owing to its complications, many false positives can be detected, and ARFI elastography serves as an effective alternative method for breast cancer confirmation. The tissue stiffness index value is used to differentiate benign and malignant tissue samples. ARFI further uses B-mode elasticity and helps in recommending biopsy confirmation.
- Published
- 2021
- Full Text
- View/download PDF
48. The utility of high‐risk human papillomavirus in situ hybridization in cytology cell block material from cystic head and neck lesions
- Author
-
Sarah M. Calkins, Tara A. Saunders, and Lucy M. Han
- Subjects
Cancer Research ,Pathology ,medicine.medical_specialty ,In situ hybridization ,Alphapapillomavirus ,Metastasis ,hemic and lymphatic diseases ,Cytology ,Biomarkers, Tumor ,medicine ,False positive paradox ,Humans ,Papillomaviridae ,neoplasms ,Lymph node ,Cyclin-Dependent Kinase Inhibitor p16 ,In Situ Hybridization ,Cell block ,P16 immunohistochemistry ,business.industry ,Papillomavirus Infections ,virus diseases ,medicine.disease ,medicine.anatomical_structure ,Oncology ,Head and Neck Neoplasms ,DNA, Viral ,Carcinoma, Squamous Cell ,Immunohistochemistry ,business - Abstract
BACKGROUND Human papillomavirus-related oropharyngeal squamous cell carcinoma (HPV-OPSCC) frequently presents as metastasis in a neck lymph node that may be cystic or necrotic. Fine-needle aspiration (FNA) biopsies are often first-line diagnostic procedures. p16 immunohistochemistry (IHC) is a surrogate marker for high-risk HPV (hrHPV) infection but can be challenging to interpret. This study evaluated the use of hrHPV in situ hybridization (ISH) in cytology cell blocks of cystic neck lesions. METHODS Twenty-four FNA cases with cell blocks and surgical correlates were evaluated. p16 IHC and hrHPV ISH were assessed on cell blocks (C-p16 and C-hrHPV ISH), and hrHPV ISH on surgical samples (S-hrHPV ISH). All results were classified as negative, positive, or equivocal. RESULTS Two cases were excluded because of insufficient tissue on recut. Of the C-hrHPV ISH cases, 12 were positive, 5 were negative, and 5 were equivocal. All 12 positive C-hrHPV ISH cases had concordant S-hrHPV ISH with no false positives. Of the 5 negative C-hrHPV ISH cases, 4 had concordant S-hrHPV ISH and 1 had discordant S-hrHPV ISH. Of the 5 equivocal C-hrHPV ISH cases, S-hrHPV ISH results included both positive and negative. Fourteen cases were equivocal by C-p16; 9 of these were reliably classified by C-hrHPV ISH (5 positive, 4 negative; 64%). CONCLUSIONS C-hrHPV ISH can be used reliably, especially when positive. A negative or equivocal interpretation of C-hrHPV ISH may warrant repeat testing. Compared to C-p16, C-hrHPV ISH is more frequently diagnostic and could be helpful for HPV-OPSCC diagnosis and management.
- Published
- 2021
- Full Text
- View/download PDF
49. Quantitative Hydrogen/Deuterium Exchange Mass Spectrometry
- Author
-
Yoshitomo Hamuro
- Subjects
Sequence ,Deuterium ,Resolution (mass spectrometry) ,Structural Biology ,Chemistry ,Test data generation ,False positive paradox ,Centroid ,Hydrogen–deuterium exchange ,Scale (descriptive set theory) ,Algorithm ,Spectroscopy - Abstract
This Account describes considerations in the data generation, data analysis, and data interpretation of a hydrogen/deuterium exchange-mass spectrometry (HDX-MS) experiment that support a quantitative argument. Although HDX-MS has gained popularity as a biophysical tool, the arguments drawn from its data often remain qualitative. To generate HDX-MS data that are more suitable for a quantitative argument, the sequence coverage and sequence resolution should be optimized during the feasibility stage, and the time window coverage and time window resolution should be improved during the HDX stage. To extract biophysically meaningful values for a given perturbation from medium-resolution HDX-MS data, there are two major approaches: (i) estimating the area between the two deuterium buildup curves, with and without the perturbation, using centroid values plotted against a log time scale, and (ii) dissecting the data into multiple single-exponential curves using the isotope envelopes. To make an HDX-MS perturbation study more accurate, (i) false negatives due to sequence coverage, (ii) false negatives due to time window coverage, (iii) false positives due to sequence resolution, and (iv) false positives due to allosteric effects should be carefully examined.
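Approach (i) reduces to numerical integration of the uptake difference over log time. The sketch below applies the trapezoid rule to invented uptake values.

```python
import numpy as np

t = np.array([10, 30, 100, 300, 1000, 3000])        # exchange times (s)
d_ref = np.array([1.2, 2.0, 3.1, 4.0, 4.6, 5.0])    # centroid uptake, unperturbed
d_pert = np.array([0.8, 1.3, 2.2, 3.0, 3.8, 4.4])   # centroid uptake, perturbed

x = np.log10(t)                                     # integrate on the log time scale
diff = d_ref - d_pert
area = float(np.sum((diff[1:] + diff[:-1]) / 2 * np.diff(x)))   # trapezoid rule
print(f"area between buildup curves = {area:.2f} (deuterons x log10 s)")
```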
- Published
- 2021
- Full Text
- View/download PDF
50. Comparison of SARS-CoV-2 Antigen Tests in Asymptomatic Testing of Passengers at German Airports under Time Constraints: Application of Three Different Antigen Test Formats
- Author
-
Jennifer Hannen, Axel Schubert, Stephan Schaefer, Robert Knote, Peter Kleinow, Nikenza Viceconte, Christian Schölz, Laura Pradas, Jörg Hartkamp, Andreas Heuer, Volkmar Weckesser, Peter Bauer, and Leon Wille
- Subjects
Antigen ,business.industry ,Antigen assays ,Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) ,False positive paradox ,General Earth and Planetary Sciences ,Medicine ,medicine.symptom ,Antigen test ,business ,Virology ,Asymptomatic ,General Environmental Science - Abstract
People infected asymptomatically with SARS-CoV-2 can spread the virus very efficiently. To break infection chains, massive testing efforts are underway. While the value of RT-PCR in asymptomatic patients is established, point-of-care (POC) antigen tests against SARS-CoV-2 are considered inferior to RT-PCR in terms of sensitivity and specificity but have demonstrated utility, mostly in symptomatic patients. We compared the performance of three different antigen tests with colorimetric (Roche), fluorometric (Quidel Sofia 2), and instrument-based chemiluminescent (Fujirebio Lumipulse® G) readout. Sensitivities for Roche, Quidel, and Fujirebio were 62.5%, 90.9%, 97.5% (≤ct 26); 43.8%, 90.9%, 95.1% (≤ct 30); and 4.3%, 0.0%, 57.6% (>ct 30), respectively. The two assays with increased sensitivity were employed to screen >35,000 passengers at German airports under time constraints. Under real-world conditions, the rate of false positives was low: 0.15% (Quidel) and 0.06% for the instrument-based Fujirebio assay. Our study exemplifies that antigen tests with enhanced detection methods have an acceptable sensitivity of >90% in samples containing SARS-CoV-2 RNA that are considered to be infectious. Therefore, our results support the view of the WHO, which discourages the use of antigen assays with a sensitivity of “only” 80% for screening travelers.
- Published
- 2021
- Full Text
- View/download PDF