75 results for "Scott Levy"
Search Results
2. Figure S1 from Clinical Utility of Plasma Cell-Free DNA in Adult Patients with Newly Diagnosed Glioblastoma: A Pilot Prospective Study
- Author
-
Erica L. Carpenter, Arati S. Desai, Steven Brem, Andrew J. Cucchiara, Donald M. O'Rourke, Zev A. Binder, Jennifer J.D. Morrissette, MacLean P. Nasrallah, Stephanie S. Yee, Theresa Christensen, Samantha Guiry, Timothy Prior, Jasmin Hussain, Whitney Sarchiapone, Scott Levy, Jeffrey B. Ware, Jacob E. Till, Jazmine J. Mays, S. Ali Nabavizadeh, and Stephen J. Bagley
- Abstract
Plasma cell-free DNA (cfDNA) concentration (ng/mL) is correlated with (A) total radiographic tumor burden (contrast-enhancing tumor + T2/FLAIR non-enhancing tumor) and (B) contrast-enhancing tumor burden at the first post-radiation MRI scan in patients with newly diagnosed glioblastoma.
- Published
- 2023
- Full Text
- View/download PDF
3. Table S2 from Clinical Utility of Plasma Cell-Free DNA in Adult Patients with Newly Diagnosed Glioblastoma: A Pilot Prospective Study
- Author
-
Erica L. Carpenter, Arati S. Desai, Steven Brem, Andrew J. Cucchiara, Donald M. O'Rourke, Zev A. Binder, Jennifer J.D. Morrissette, MacLean P. Nasrallah, Stephanie S. Yee, Theresa Christensen, Samantha Guiry, Timothy Prior, Jasmin Hussain, Whitney Sarchiapone, Scott Levy, Jeffrey B. Ware, Jacob E. Till, Jazmine J. Mays, S. Ali Nabavizadeh, and Stephen J. Bagley
- Abstract
Tissue fusion transcript panel gene coverage (55 genes)
- Published
- 2023
- Full Text
- View/download PDF
4. Data from Clinical Utility of Plasma Cell-Free DNA in Adult Patients with Newly Diagnosed Glioblastoma: A Pilot Prospective Study
- Author
-
Erica L. Carpenter, Arati S. Desai, Steven Brem, Andrew J. Cucchiara, Donald M. O'Rourke, Zev A. Binder, Jennifer J.D. Morrissette, MacLean P. Nasrallah, Stephanie S. Yee, Theresa Christensen, Samantha Guiry, Timothy Prior, Jasmin Hussain, Whitney Sarchiapone, Scott Levy, Jeffrey B. Ware, Jacob E. Till, Jazmine J. Mays, S. Ali Nabavizadeh, and Stephen J. Bagley
- Abstract
Purpose: The clinical utility of plasma cell-free DNA (cfDNA) has not been assessed prospectively in patients with glioblastoma (GBM). We aimed to determine the prognostic impact of plasma cfDNA in GBM, as well as its role as a surrogate of tumor burden and substrate for next-generation sequencing (NGS). Experimental Design: We conducted a prospective cohort study of 42 patients with newly diagnosed GBM. Plasma cfDNA was quantified at baseline prior to initial tumor resection and longitudinally during chemoradiotherapy. Plasma cfDNA was assessed for its association with progression-free survival (PFS) and overall survival (OS), correlated with radiographic tumor burden, and subjected to a targeted NGS panel. Results: Prior to initial surgery, GBM patients had higher plasma cfDNA concentration than age-matched healthy controls (mean 13.4 vs. 6.7 ng/mL, P < 0.001). Plasma cfDNA concentration was correlated with radiographic tumor burden on patients' first post-radiation magnetic resonance imaging scan (ρ = 0.77, P = 0.003) and tended to rise prior to or concurrently with radiographic tumor progression. Preoperative plasma cfDNA concentration above the mean (>13.4 ng/mL) was associated with inferior PFS (median 4.9 vs. 9.5 months, P = 0.038). Detection of ≥1 somatic mutation in plasma cfDNA occurred in 55% of patients and was associated with nonstatistically significant decreases in PFS (median 6.0 vs. 8.7 months, P = 0.093) and OS (median 5.5 vs. 9.2 months, P = 0.053). Conclusions: Plasma cfDNA may be an effective prognostic tool and surrogate of tumor burden in newly diagnosed GBM. Detection of somatic alterations in plasma is feasible when samples are obtained prior to initial surgical resection.
- Published
- 2023
- Full Text
- View/download PDF
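For context on the entry above: the ρ = 0.77 reported between cfDNA concentration and radiographic tumor burden is, by convention, a Spearman rank correlation. The paper does not restate the formula or say whether a tie correction was applied; for reference, the standard no-ties form for n paired observations is:

```latex
% Spearman rank correlation for n paired observations without ties;
% d_i is the difference between the ranks of plasma cfDNA concentration
% and radiographic tumor burden for patient i.
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n\,(n^2 - 1)}
```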
5. Table S3 from Clinical Utility of Plasma Cell-Free DNA in Adult Patients with Newly Diagnosed Glioblastoma: A Pilot Prospective Study
- Author
-
Erica L. Carpenter, Arati S. Desai, Steven Brem, Andrew J. Cucchiara, Donald M. O'Rourke, Zev A. Binder, Jennifer J.D. Morrissette, MacLean P. Nasrallah, Stephanie S. Yee, Theresa Christensen, Samantha Guiry, Timothy Prior, Jasmin Hussain, Whitney Sarchiapone, Scott Levy, Jeffrey B. Ware, Jacob E. Till, Jazmine J. Mays, S. Ali Nabavizadeh, and Stephen J. Bagley
- Abstract
Circulating tumor DNA next-generation sequencing gene coverage (74 genes)
- Published
- 2023
- Full Text
- View/download PDF
6. Table S4 from Clinical Utility of Plasma Cell-Free DNA in Adult Patients with Newly Diagnosed Glioblastoma: A Pilot Prospective Study
- Author
-
Erica L. Carpenter, Arati S. Desai, Steven Brem, Andrew J. Cucchiara, Donald M. O'Rourke, Zev A. Binder, Jennifer J.D. Morrissette, MacLean P. Nasrallah, Stephanie S. Yee, Theresa Christensen, Samantha Guiry, Timothy Prior, Jasmin Hussain, Whitney Sarchiapone, Scott Levy, Jeffrey B. Ware, Jacob E. Till, Jazmine J. Mays, S. Ali Nabavizadeh, and Stephen J. Bagley
- Abstract
Complete list of variants detected in 20 subjects who had both tissue and plasma next-generation sequencing
- Published
- 2023
- Full Text
- View/download PDF
7. Special Issue on Hot Interconnects 29
- Author
-
Scott Levy
- Subjects
Hardware and Architecture, Electrical and Electronic Engineering, Software
- Published
- 2023
- Full Text
- View/download PDF
8. Variations of Classical and Nonclassical Ultrasound Nonlinearity Parameters during Heat-Induced Microstructural Evolution in an Iron-Copper Alloy
- Author
-
Laurence J. Jacobs, Katherine Marie Scott Levy, and Jin-Yeon Kim
- Subjects
Applied physics, Heat induced, Microstructural evolution, Iron copper, Materials science, Mechanical Engineering, Ultrasound, Alloy, Nonlinear system, Mechanics of Materials, General Materials Science, Composite material, Acoustics
- Abstract
This research demonstrates and compares the potential of two nonlinear ultrasound techniques: second harmonic generation (SHG) and nonlinear resonant ultrasound spectroscopy (NRUS). It examines a set of thermally aged iron-copper (Fe-1.0% Cu) alloy specimens, which are used as surrogate specimens for radiation damage. Both SHG and NRUS are found to be sensitive to the growth of the copper precipitates, although the changes in the respective nonlinearity parameters are due to different mechanisms.
- Published
- 2021
- Full Text
- View/download PDF
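For context on the entry above: the abstract does not give the measurement relations, but in the SHG literature the classical quadratic nonlinearity parameter is conventionally extracted from the fundamental and second-harmonic amplitudes as below. Conventions vary by a constant factor, so this common form is background, not necessarily the exact one used in the paper.

```latex
% Classical nonlinearity parameter from second harmonic generation:
% A_1 = fundamental amplitude, A_2 = second-harmonic amplitude,
% k = wavenumber, x = propagation distance.
\beta = \frac{8 A_2}{k^2 x A_1^2}
% When absolute calibration is unavailable, the relative quantity
% A_2 / A_1^2 is often tracked instead.
```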
9. Enhanced Fiber Tractography Using Edema Correction: Application and Evaluation in High-Grade Gliomas
- Author
-
Drew Parker, Ragini Verma, Timothy H. Lucas, Michael L. McGarvey, Mark A. Elliott, Fraser Henderson, Steven Brem, Wesley B Hodges, Ronald L. Wolf, Lisa Desiderio, Jessica Harsch, Lauren Karpf, Anupa Ambili Vijayakumari, Eileen Maloney-Wilensky, and Scott Levy
- Subjects
Voxel, Edema, Fractional anisotropy, Humans, Medicine, Diffusion Tractography, Brain Neoplasms, Magnetic resonance imaging, Glioma, Diffusion Tensor Imaging, Research—Human—Clinical Studies, Oncology & carcinogenesis, Surgery, Neurology (clinical), Functional magnetic resonance imaging, Nuclear medicine, Neurology & neurosurgery, Diffusion MRI, Tractography
- Abstract
Background: A limitation of diffusion tensor imaging (DTI)-based tractography is peritumoral edema that confounds traditional diffusion-based magnetic resonance metrics. Objective: To augment fiber-tracking through peritumoral regions by performing novel edema correction on clinically feasible DTI acquisitions and to assess the accuracy of the fiber-tracks using intraoperative stimulation mapping (ISM), task-based functional magnetic resonance imaging (fMRI) activation maps, and postoperative follow-up as reference standards. Methods: Edema correction, using our bi-compartment free water modeling algorithm (FERNET), was performed on clinically acquired DTI data from a cohort of 10 patients presenting with suspected high-grade glioma and peritumoral edema in proximity to and/or infiltrating language or motor pathways. Deterministic fiber-tracking was then performed on the corrected and uncorrected DTI to identify tracts pertaining to the eloquent region involved (language or motor). Tracking results were compared visually and quantitatively using mean fiber count, voxel count, and mean fiber length. The tracts through the edematous region were verified based on overlay with the corresponding motor or language task-based fMRI activation maps and intraoperative ISM points, as well as at time points after surgery when peritumoral edema had subsided. Results: Volume and number of fibers increased with application of edema correction; concordantly, mean fractional anisotropy decreased. Overlay with functional activation maps and ISM points verified the eloquence of the added fibers. Comparison with postsurgical follow-up scans with lower edema further confirmed the accuracy of the tracts. Conclusion: This method of edema correction can be applied to standard clinical DTI to improve visualization of motor and language tracts in patients with glioma-associated peritumoral edema.
- Published
- 2021
- Full Text
- View/download PDF
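For context on the entry above: FERNET's exact formulation is not reproduced here, but bi-compartment free-water models of this family typically describe the diffusion-weighted signal for b-value b and gradient direction g as a mixture of a tissue tensor compartment and an isotropic free-water compartment. The form below is the standard free-water elimination model, stated as an assumption rather than taken from the paper's text.

```latex
% Bi-compartment (free-water) diffusion signal model: f is the tissue
% volume fraction, D the tissue diffusion tensor, and d_iso the
% diffusivity of free water (about 3e-3 mm^2/s at body temperature).
S(b,\mathbf{g}) = S_0 \left[ f\, e^{-b\,\mathbf{g}^{\mathsf{T}} D \mathbf{g}}
                + (1-f)\, e^{-b\, d_{\mathrm{iso}}} \right]
```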
10. Understanding Memory Failures on a Petascale Arm System
- Author
-
Kurt B. Ferreira, Scott Levy, Joshua Hemmert, and Kevin Pedretti
- Published
- 2022
- Full Text
- View/download PDF
11. Characterizing Memory Failures Using Benford’s Law
- Author
-
Kurt B. Ferreira and Scott Levy
- Published
- 2022
- Full Text
- View/download PDF
12. 'Smarter' NICs for faster molecular dynamics: a case study
- Author
-
Sara Karamati, Clayton Hughes, K. Scott Hemmert, Ryan E. Grant, W. Whit Schonbein, Scott Levy, Thomas M. Conte, Jeffrey Young, and Richard W. Vuduc
- Subjects
Performance (cs.PF), Distributed, Parallel, and Cluster Computing (cs.DC)
- Abstract
This work evaluates the benefits of using a "smart" network interface card (SmartNIC) as a compute accelerator for the example of the MiniMD molecular dynamics proxy application. The accelerator is NVIDIA's BlueField-2 card, which includes an 8-core Arm processor along with a small amount of DRAM and storage. We test the networking and data movement performance of these cards compared to a standard Intel server host using microbenchmarks and MiniMD. In MiniMD, we identify two distinct classes of computation, namely core computation and maintenance computation, which are executed in sequence. We restructure the algorithm and code to weaken this dependence and increase task parallelism, thereby making it possible to increase utilization of the BlueField-2 concurrently with the host. We evaluate our implementation on a cluster consisting of 16 dual-socket Intel Broadwell host nodes with one BlueField-2 per host-node. Our results show that while the overall compute performance of BlueField-2 is limited, using them with a modified MiniMD algorithm allows for up to 20% speedup over the host CPU baseline with no loss in simulation accuracy.
- Published
- 2022
- Full Text
- View/download PDF
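To make the restructuring idea in the entry above concrete, here is a minimal sketch, in C with POSIX threads, of running "maintenance" computation concurrently with "core" computation instead of in sequence. All function names are hypothetical placeholders, not MiniMD code; on a BlueField-2 the maintenance task would run on the NIC's Arm cores rather than a host thread.

```c
/* Sketch: overlap "maintenance" work (e.g., neighbor-list rebuilds)
 * with "core" work (e.g., force evaluation), in the spirit of the
 * MiniMD restructuring described above. Compile with -lpthread. */
#include <pthread.h>
#include <stdio.h>

/* Core computation; placeholder body. */
static void core_step(int step) { (void)step; /* force evaluation ... */ }

/* Maintenance computation; placeholder body. In the paper's setting
 * this is the part offloaded to the SmartNIC's processor. */
static void *maintenance(void *arg) { (void)arg; return NULL; }

int main(void)
{
    for (int step = 0; step < 100; step++) {
        pthread_t helper;
        /* Launch maintenance asynchronously instead of running it
         * strictly after the core computation. */
        pthread_create(&helper, NULL, maintenance, NULL);
        core_step(step);             /* core work proceeds concurrently */
        pthread_join(&helper, NULL); /* re-synchronize before next step */
    }
    puts("done");
    return 0;
}
```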
13. MiniMod: A Modular Miniapplication Benchmarking Framework for HPC
- Author
-
William Marts, Matthew Dosanjh, William Schonbein, Scott Levy, Ryan Grant, and Patrick Bridges
- Published
- 2021
- Full Text
- View/download PDF
14. pMEMCPY: a simple, lightweight, and portable I/O library for storing data in persistent memory
- Author
-
Luke Logan, Patrick Widener, Anthony Kougkas, Scott Levy, Jay Lofstead, and Xian-He Sun
- Subjects
Random access memory, Computer science, Hierarchical Data Format, Metadata, Software, Embedded system, Computer cluster, DRAM
- Abstract
Persistent memory (PMEM) devices can achieve comparable performance to DRAM while providing significantly more capacity. This has made the technology compelling as an expansion to main memory. Rethinking PMEM as a storage device can offer a high-performance buffering layer where HPC applications can temporarily, but safely, store data. However, modern parallel I/O libraries, such as HDF5 and pNetCDF, are complicated and introduce significant software and metadata overheads when persisting data to these storage devices, wasting much of their potential. In this work, we explore the potential of PMEM as storage through pMEMCPY: a simple, lightweight, and portable I/O library for storing data in persistent memory. We demonstrate that our approach is up to 2x faster than other popular parallel I/O libraries under real workloads.
- Published
- 2021
- Full Text
- View/download PDF
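A minimal sketch of the idea behind the entry above: persist a buffer by mapping a persistent-memory file and copying into it, skipping the metadata machinery of heavyweight parallel I/O libraries. This is illustrative POSIX code, not pMEMCPY's actual API; the path /mnt/pmem/data is hypothetical, and a production library would use finer-grained persistence primitives (e.g., cache-line flushes via libpmem) rather than msync().

```c
/* Sketch of the store-by-memcpy idea behind a PMEM I/O library. */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int pmem_store(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0 || ftruncate(fd, len) != 0) return -1;
    void *dst = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);   /* MAP_SHARED so stores persist */
    if (dst == MAP_FAILED) { close(fd); return -1; }
    memcpy(dst, buf, len);                 /* the actual "I/O" */
    msync(dst, len, MS_SYNC);              /* force data to the device */
    munmap(dst, len);
    close(fd);
    return 0;
}

int main(void)
{
    double data[1024] = { 3.14 };
    /* /mnt/pmem/data is a hypothetical DAX-mounted PMEM file. */
    return pmem_store("/mnt/pmem/data", data, sizeof data);
}
```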
15. MiniMod: A Modular Miniapplication Benchmarking Framework for HPC
- Author
-
Scott Levy, Matthew G. F. Dosanjh, Patrick G. Bridges, Ryan E. Grant, Whit Schonbein, and W. Pepper Marts
- Subjects
Flexibility (engineering), Computer science, Benchmarking, Modular design, Instruction set, Kernel, Computer architecture, Models of communication, Middleware (distributed applications), Data transmission
- Abstract
The HPC application community has proposed many new application communication structures, middleware interfaces, and communication models to improve HPC application performance. Modifying proxy applications is the standard practice for the evaluation of these novel methodologies. Currently, this requires the creation of a new version of the proxy application for each combination of approaches being tested. In this article, we present a modular proxy-application framework, MiniMod, that enables evaluation of a combination of independently written computation kernels, data transfer logic, communication access, and threading libraries. MiniMod is designed to allow rapid development of individual modules which can be combined at runtime. Through MiniMod, developers only need a single implementation to evaluate application impact under a variety of scenarios. We demonstrate the flexibility of MiniMod's design by using it to implement versions of a heat diffusion kernel and the miniFE finite element proxy application, along with a variety of communication, granularity, and threading modules. We examine how changing communication libraries, communication granularities, and threading approaches impact these applications on an HPC system. These experiments demonstrate that MiniMod can rapidly improve the ability to assess new middleware techniques for scientific computing applications and next-generation hardware platforms.
- Published
- 2021
- Full Text
- View/download PDF
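A minimal sketch of the runtime-composable module idea described above, assuming a hypothetical interface (this is not MiniMod's real API): each concern, here halo data transfer, sits behind a small function-pointer table selected when the program starts, so a single kernel implementation can be evaluated under different communication schemes.

```c
/* Sketch of a runtime-selectable communication module. Interface and
 * module names are illustrative, not MiniMod's actual API. */
#include <stdio.h>
#include <string.h>

struct comm_module {
    const char *name;
    void (*exchange)(double *halo, int n);  /* data-transfer logic */
};

static void exchange_blocking(double *halo, int n)
{ (void)halo; (void)n; /* e.g., MPI_Send/MPI_Recv pairs */ }

static void exchange_nonblocking(double *halo, int n)
{ (void)halo; (void)n; /* e.g., MPI_Isend/MPI_Irecv + MPI_Waitall */ }

static const struct comm_module modules[] = {
    { "blocking",    exchange_blocking    },
    { "nonblocking", exchange_nonblocking },
};

int main(int argc, char **argv)
{
    const int nmod = sizeof modules / sizeof modules[0];
    const struct comm_module *m = &modules[0];

    /* Select the communication module from the command line, so one
     * kernel binary can be evaluated under multiple scenarios. */
    for (int i = 0; argc > 1 && i < nmod; i++)
        if (strcmp(argv[1], modules[i].name) == 0)
            m = &modules[i];

    double halo[64] = { 0 };
    m->exchange(halo, 64);      /* the kernel calls through the table */
    printf("used module: %s\n", m->name);
    return 0;
}
```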
16. Understanding the Effects of DRAM Correctable Error Logging at Scale
- Author
-
Victor Kuhns, Sean Blanchard, Nathan DeBardeleben, Kurt B. Ferreira, and Scott Levy
- Subjects
Random access memory, Computer science, Computer cluster, Fault tolerance, State (computer science), DRAM, System characteristics, Reliability engineering
- Abstract
Fault tolerance poses a major challenge for future large-scale systems. Current research on fault tolerance has been principally focused on mitigating the impact of uncorrectable errors: errors that corrupt the state of the machine and require a restart from a known good state. However, correctable errors occur much more frequently than uncorrectable errors and may be even more common on future systems. Although an application can safely continue to execute when correctable errors occur, recovery from a correctable error requires the error to be corrected and, in most cases, information about its occurrence to be logged. The potential performance impact of these recovery activities has not been extensively studied in HPC. In this paper, we use simulation to examine the relationship between recovery from correctable errors and application performance for several important extreme-scale workloads. Our paper contains what is, to the best of our knowledge, the first detailed analysis of the impact of correctable errors on application performance. Our study shows that correctable errors can have significant impact on application performance for future systems. We also find that although current efforts concentrate on reducing failure rates, reducing the time required to log individual errors may have a greater impact on overheads at scale. Finally, this study outlines the error frequency and duration targets needed to keep correctable-error overheads similar to those of today's systems. This paper provides critical analysis and insight into the overheads of correctable errors and provides practical advice to systems administrators and hardware designers in an effort to fine-tune performance to application and system characteristics.
- Published
- 2021
- Full Text
- View/download PDF
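The abstract above argues that per-error logging time can dominate correctable-error overheads; a first-order model makes the intuition concrete. The sketch below uses illustrative numbers, not measurements from the paper, to compute the relative slowdown when execution stalls for a fixed logging time per correctable error.

```c
/* First-order model of correctable-error logging overhead: if errors
 * arrive at rate lambda (errors/hour) and each stalls execution for
 * t_log seconds, the relative overhead is lambda * t_log / 3600. */
#include <stdio.h>

int main(void)
{
    double lambda = 50.0;          /* correctable errors per hour (made up) */
    double t_log[] = { 1e-4, 1e-3, 1e-2, 1e-1 };  /* seconds per log */

    for (int i = 0; i < 4; i++) {
        double overhead = lambda * t_log[i] / 3600.0;
        printf("t_log = %6.4f s -> %.4f%% slowdown\n",
               t_log[i], 100.0 * overhead);
    }
    return 0;
}
```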
17. Using simulation to examine the effect of MPI message matching costs on application performance
- Author
-
Kurt B. Ferreira, Scott Levy, Whit Schonbein, Matthew G. F. Dosanjh, and Ryan E. Grant
- Subjects
Computer engineering, Artificial Intelligence, Computer Networks and Communications, Hardware and Architecture, Computer science, Computer Graphics and Computer-Aided Design, Queue, Software, Theoretical Computer Science
- Abstract
Attaining high performance with MPI applications requires efficient message matching to minimize message processing overheads and the latency these overheads introduce into application communication. In this paper, we use a validated simulation-based approach to examine the relationship between MPI message matching performance and application time-to-solution. Specifically, we examine how the performance of several important HPC workloads is affected by the time required for matching. Our analysis yields several important contributions: (i) the performance of current workloads is unlikely to be significantly affected by MPI matching unless match queue operations get much slower or match queues get much longer; (ii) match queue designs that provide sublinear performance as a function of queue length are unlikely to yield much benefit unless match queue lengths increase dramatically; and (iii) we provide guidance on how long the mean time per match attempt may be without significantly affecting application performance. The results and analysis in this paper provide valuable guidance on the design and development of MPI message match queues.
- Published
- 2019
- Full Text
- View/download PDF
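For context on the entry above: the match cost being simulated is, in most MPI implementations, a linear walk of a posted-receive queue honoring MPI's wildcard semantics. Below is a self-contained sketch of that search; the wildcard constants are redefined locally, and this illustrates the technique rather than any particular MPI library's code.

```c
/* Sketch of the linear posted-receive match search that the paper's
 * simulations model. */
#include <stdio.h>

#define ANY_SOURCE (-1)   /* cf. MPI_ANY_SOURCE */
#define ANY_TAG    (-2)   /* cf. MPI_ANY_TAG    */

struct posted_recv {
    int source, tag;
    struct posted_recv *next;
};

/* Return the first posted receive matching an incoming (source, tag),
 * counting how many queue entries were searched along the way. */
struct posted_recv *match(struct posted_recv *head, int source, int tag,
                          int *searched)
{
    *searched = 0;
    for (struct posted_recv *p = head; p; p = p->next) {
        (*searched)++;
        if ((p->source == ANY_SOURCE || p->source == source) &&
            (p->tag == ANY_TAG || p->tag == tag))
            return p;
    }
    return NULL;   /* no match: message goes to the unexpected queue */
}

int main(void)
{
    struct posted_recv c = { 3, 7, NULL }, b = { ANY_SOURCE, 7, &c },
                       a = { 1, ANY_TAG, &b };
    int n;
    struct posted_recv *hit = match(&a, 3, 7, &n);
    printf("matched (%d,%d) after %d entries\n", hit->source, hit->tag, n);
    return 0;
}
```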
18. An Initial Examination of the Effect of Container Resource Constraints on Application Perturbation
- Author
-
Scott Levy and Kurt Ferreira
- Published
- 2021
- Full Text
- View/download PDF
19. Low-cost MPI Multithreaded Message Matching Benchmarking
- Author
-
Matthew G. F. Dosanjh, Scott Levy, W. Pepper Marts, Ryan E. Grant, and Whit Schonbein
- Subjects
Matching (statistics), Computer science, Distributed computing, Message Passing Interface, Process (computing), Network interface, Benchmarking, Multithreading, Benchmark (computing)
- Abstract
The Message Passing Interface (MPI) standard allows user-level threads to concurrently call into an MPI library. While this feature is currently rarely used, there is considerable interest from developers in adopting it in the near future. There is reason to believe that multithreaded communication may incur additional message processing overheads in terms of number of items searched during demultiplexing and amount of time spent searching because it has the potential to increase the number of messages exchanged and to introduce non-deterministic message ordering. Therefore, understanding the implications of adding multithreading to MPI applications is important for future application development. One strategy for advancing this understanding is through 'low-cost' benchmarks that emulate full communication patterns using fewer resources. For example, while a complete, 'real-world' multithreaded halo exchange requires 9 or 27 nodes, the low-cost alternative needs only two, making it deployable on systems where acquiring resources is difficult because of high utilization (e.g., busy capacity-computing systems), or impossible because the necessary resources do not exist (e.g., testbeds with too few nodes). While such benchmarks have been proposed, the reported results have been limited to a single architecture or derived indirectly through simulation, and no attempt has been made to confirm that a low-cost benchmark accurately captures features of full (non-emulated) exchanges. Moreover, benchmark code has not been made publicly available. The purpose of the study presented in this paper is to quantify how accurately the low-cost benchmark captures the matching behavior of the full, real-world benchmark. In the process, we also advocate for the feasibility and utility of the low-cost benchmark. We present a 'real-world' benchmark implementing a full multithreaded halo exchange on 9 and 27 nodes, as defined by 5-point and 9-point 2D stencils, and 7-point and 27-point 3D stencils. Likewise, we present a 'low-cost' benchmark that emulates these communication patterns using only two nodes. We then confirm, across multiple architectures, that the low-cost benchmark gives accurate estimates of both number of items searched during message processing, and time spent processing those messages. Finally, we demonstrate the utility of the low-cost benchmark by using it to profile the performance impact of state-of-the-art Mellanox ConnectX-5 hardware support for offloaded MPI message demultiplexing. To facilitate further research on the effects of multithreaded MPI on message matching behavior, the source of our two benchmarks is to be included in the next release version of the Sandia MPI Micro-Benchmark Suite.
- Published
- 2020
- Full Text
- View/download PDF
20. RaDD Runtimes: Radical and Different Distributed Runtimes with SmartNICs
- Author
-
Ryan E. Grant, Whit Schonbein, and Scott Levy
- Subjects
Software, Overhead, Computer science, Distributed computing, Computation, Message passing, Process (computing), Software design, Network interface, Host (network)
- Abstract
As network speeds increase, the overhead of processing incoming messages is becoming onerous enough that many manufacturers now provide network interface cards (NICs) with offload capabilities to handle these overheads. This increase in NIC capabilities creates an opportunity to enable computation on data in-situ on the NIC. These enhanced NICs can be classified into several different categories of SmartNICs. SmartNICs present an interesting opportunity for future runtime software designs. Locating runtime software in the network, as opposed to on the host, enables radically new distributed runtime designs that were not practical prior to SmartNICs. The transition to such designs also involves interesting intermediary steps, such as migrating current runtime software so that it can be offloaded onto a SmartNIC. This paper describes SmartNIC design and how SmartNICs can be leveraged to offload current-generation runtime software and lead to future, radically different, in-network distributed runtime systems.
- Published
- 2020
- Full Text
- View/download PDF
21. Low-cost MPI Multithreaded Message Matching Benchmarking
- Author
-
William Schonbein, Ryan Grant, Scott Levy, Matthew Dosanjh, and William Marts
- Published
- 2020
- Full Text
- View/download PDF
22. Evaluating MPI Message Size Summary Statistics
- Author
-
Kurt B. Ferreira and Scott Levy
- Subjects
Computer engineering, Semantics (computer science), Computer science, Metric (mathematics), Message passing, Synchronization (computer science), Programming paradigm, Trace collection
- Abstract
The Message Passing Interface (MPI) remains the dominant programming model for scientific applications running on today's high-performance computing (HPC) systems. This dominance stems from MPI's powerful semantics for inter-process communication that have enabled scientists to write applications for simulating important physical phenomena. MPI does not, however, specify how messages and synchronization should be carried out. Those details are typically dependent on low-level architecture details and the message characteristics of the application. Therefore, analyzing an application's MPI usage is critical to tuning MPI's performance on a particular platform. The result of this analysis is typically a discussion of average message sizes for a workload or set of workloads. While a discussion of the message average might be the most intuitive summary statistic, it might not be the most useful in terms of representing the entire message size dataset for an application. Using a previously developed MPI trace collector, we analyze the MPI message traces for a number of key MPI workloads. Through this analysis, we demonstrate that the average, while easy and efficient to calculate, may not be a good representation of all subsets of application message sizes, with the median and mode of message sizes being superior choices in most cases. We show that the problems with using the average relate to the multi-modal nature of the distribution of point-to-point messages. Finally, we show that while scaling a workload has little discernible impact on which measures of central tendency are representative of the underlying data, different input descriptions can significantly impact which metric is most effective. The results and analysis in this paper have the potential for providing valuable guidance on how we as a community should discuss and analyze MPI message data for scientific applications.
- Published
- 2020
- Full Text
- View/download PDF
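The abstract's point above, that the arithmetic mean can misrepresent multimodal message-size data, is easy to reproduce on a toy sample. The sketch below computes mean, median, and mode for a made-up bimodal mix of small control messages and large payloads; the mean lands near no actual message class.

```c
/* Mean vs. median vs. mode on a bimodal message-size sample.
 * Data are illustrative, not from the paper. */
#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b)
{
    return (*(const int *)a > *(const int *)b) -
           (*(const int *)a < *(const int *)b);
}

int main(void)
{
    /* 6 small control messages, 4 large halo payloads (bytes) */
    int sz[] = { 8, 8, 8, 8, 8, 8, 1048576, 1048576, 1048576, 1048576 };
    int n = sizeof sz / sizeof sz[0];

    qsort(sz, n, sizeof sz[0], cmp);

    double mean = 0;
    for (int i = 0; i < n; i++) mean += sz[i];
    mean /= n;

    double median = (sz[n / 2 - 1] + sz[n / 2]) / 2.0;  /* n is even */

    /* Mode: longest run of equal values in the sorted array. */
    int mode = sz[0], best = 1, run = 1;
    for (int i = 1; i < n; i++) {
        run = (sz[i] == sz[i - 1]) ? run + 1 : 1;
        if (run > best) { best = run; mode = sz[i]; }
    }

    /* Prints mean=419435, median=8, mode=8: the mean matches nothing. */
    printf("mean=%.0f  median=%.0f  mode=%d\n", mean, median, mode);
    return 0;
}
```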
23. Stress in the Workplace - Implementing Solutions: Preparing the Individual and Organization for When the Worst-Case Scenario is Actualized
- Author
-
Janis Davis-Street, Scott Levy, Christina Stevens, and Brian Walker
- Subjects
Medical and health sciences, Clinical medicine, Risk analysis, Stress, Worst-case scenario, Resilience, Environmental & occupational health
- Abstract
Objectives/Scope: Workplace stress can happen for many different reasons but is very prominent during times of change and uncertainty. Many businesses have access to mental health and emotional well-being resources which they offer widely during normal operations and enhance those capabilities during times of expected stress. In this presentation we will discuss the challenge of what happens when that expected short-term period of difficulty morphs into longer-term uncertainty. Method, Procedures, Process: We will describe risk factors for workplace stress as well as short- and long-term solutions during periods of prolonged uncertainty. The discussion will include key components including baseline resources, gap assessments, and mitigating solutions. By working with business leaders to identify critical time periods and utilizing a validated survey tool, as well as a multitude of educational and informational materials, we were able to better deliver our health and wellness resources and provide long-term support to the business. Results, Observations, Conclusions: Mitigating solutions to address stress can be utilized to limit health and safety risks and optimize human performance. By offering a dynamic and relevant program which can be tailored to the individual workforce, we can maintain support for a prolonged period with positive results. Stress levels in the workplace can be assessed via validated tools, the results of which can be managed across a spectrum of different work environments. Although our business environments differ considerably, by developing fit-for-purpose solutions we can implement high-quality services to meet the needs of the business. Novel/Additive Information: Fostering individual and organizational resilience early in the process and maintaining it throughout is an essential component, especially when a worst-case scenario is highly probable. We will discuss how to best prepare both the organization and the individual to adapt, change, and potentially even thrive during an extended period of uncertainty. We will explore simple methods of screening for mental health concerns as well as developing interventions to optimize worker health and safety over prolonged periods of change and uncertainty.
- Published
- 2020
- Full Text
- View/download PDF
24. The Case for Explicit Reuse Semantics for RDMA Communication
- Author
-
Todd Kordenbrock, Patrick Widener, Scott Levy, and Craig D. Ulmer
- Subjects
Remote direct memory access, Computer science, Registered memory, Networking & telecommunications, Reuse, Allocator, Network interface controller, Synchronization (computer science), Computer network
- Abstract
Remote Direct Memory Access (RDMA) is an increasingly important technology in high-performance computing (HPC). RDMA provides low-latency, high-bandwidth data transfer between compute nodes. Additionally, it does not require explicit synchronization with the destination processor. Eliminating unnecessary synchronization can significantly improve the communication performance of large-scale scientific codes. A long-standing challenge presented by RDMA communication is mitigating the cost of registering memory with the network interface controller (NIC). Reusing memory once it is registered has been shown to significantly reduce the cost of RDMA communication. However, existing approaches for reusing memory rely on implicit memory semantics. In this paper, we introduce an approach that makes memory reuse semantics explicit by exposing a separate allocator for registered memory. The data and analysis in this paper yield the following contributions: (i) managing registered memory explicitly enables efficient reuse of registered memory; (ii) registering large memory regions to amortize the registration cost over multiple user requests can significantly reduce cost of acquiring new registered memory; and (iii) reducing the cost of acquiring registered memory can significantly improve the performance of RDMA communication. Reusing registered memory is key to high-performance RDMA communication. By making reuse semantics explicit, our approach has the potential to improve RDMA performance by making it significantly easier for programmers to efficiently reuse registered memory.
- Published
- 2020
- Full Text
- View/download PDF
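A minimal sketch of the explicit-reuse idea described above (not the paper's implementation): registered buffers are acquired from and released to an explicit pool, so the expensive ibv_reg_mr() call is paid once per slab rather than per communication. The pool assumes fixed-size slabs and abbreviates error handling; link with -libverbs.

```c
/* Sketch of an explicit allocator for RDMA-registered memory using
 * standard libibverbs calls. */
#include <infiniband/verbs.h>
#include <stdio.h>
#include <stdlib.h>

struct reg_block {                 /* one registered slab on a free list */
    struct ibv_mr *mr;
    struct reg_block *next;
};

static struct reg_block *free_list;

/* Acquire a registered buffer, reusing a prior registration when one
 * is available; registration (ibv_reg_mr) is the expensive path. */
static struct reg_block *acquire(struct ibv_pd *pd, size_t len)
{
    if (free_list) {               /* cheap path: reuse registration */
        struct reg_block *b = free_list;
        free_list = b->next;
        return b;                  /* slabs are fixed-size in this sketch */
    }
    struct reg_block *b = malloc(sizeof *b);
    void *buf = malloc(len);
    b->mr = ibv_reg_mr(pd, buf, len, IBV_ACCESS_LOCAL_WRITE);
    b->next = NULL;
    return b;
}

/* Release returns the block to the pool instead of deregistering,
 * making reuse explicit rather than relying on implicit caching. */
static void release(struct reg_block *b)
{
    b->next = free_list;
    free_list = b;
}

int main(void)
{
    int n;
    struct ibv_device **devs = ibv_get_device_list(&n);
    if (!devs || n == 0) { fprintf(stderr, "no RDMA device\n"); return 1; }
    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    struct reg_block *b1 = acquire(pd, 1 << 20);   /* registers */
    release(b1);
    struct reg_block *b2 = acquire(pd, 1 << 20);   /* reuses b1's MR */
    printf("reused registration: %s\n", b1 == b2 ? "yes" : "no");

    ibv_dereg_mr(b2->mr);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```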
25. ALAMO: Autonomous Lightweight Allocation, Management, and Optimization
- Author
-
Jay Lofstead, Ann C. Gentile, Scott Levy, Andrew J. Younge, Kurt B. Ferreira, Ron Brightwell, Jim Brandt, Stephen L. Olivier, Ryan E. Grant, and Kevin Pedretti
- Subjects
Research program, Computer science, Networking & telecommunications, Software, Paradigm shift, Scalability, Resource management, Software engineering
- Abstract
Several recent workshops conducted by the DOE Advanced Scientific Computing Research program have established the fact that the complexity of developing applications and executing them on high-performance computing (HPC) systems is rising at a rate which will make it nearly impossible to continue to achieve higher levels of performance and scalability. Absent an alternative approach to managing this ever-growing complexity, HPC systems will become increasingly difficult to use. A more holistic approach to designing and developing applications and managing system resources is required. This paper outlines a research strategy for managing this increasing complexity by providing the programming environment, software stack, and hardware capabilities needed for autonomous resource management of HPC systems. Developing portable applications for a variety of HPC systems of varying scale requires a paradigm shift from the current approach, where applications are painstakingly mapped to individual machine resources, to an approach where machine resources are automatically mapped and optimized to applications as they execute. Achieving such automated resource management for HPC systems is a daunting challenge that requires significant sustained investment in exploring new approaches and novel capabilities in software and hardware that span the spectrum from programming systems to device-level mechanisms. This paper provides an overview of the functionality needed to enable autonomous resource management and optimization and describes the components currently being explored at Sandia National Laboratories to help support this capability.
- Published
- 2020
- Full Text
- View/download PDF
26. Space-Efficient Reed-Solomon Encoding to Detect and Correct Pointer Corruption
- Author
-
Kurt B. Ferreira and Scott Levy
- Subjects
Correctness, Memory errors, Computer engineering, Reed–Solomon error correction, Computer science, Pointer (computer programming), Encoding, Silent data corruption, Computer hardware & architecture
- Abstract
Concern about memory errors has been widespread in high-performance computing (HPC) for decades. These concerns have led to significant research on detecting and correcting memory errors to improve performance and provide strong guarantees about the correctness of the memory contents of scientific simulations. However, power concerns and changes in memory architectures threaten the viability of current approaches to protecting memory (e.g., Chipkill). Returning to less protective error-correcting codes (ECC), e.g., single-error correction, double-error detection (SECDED), may increase the frequency of memory errors, including silent data corruption (SDC). SDC has the potential to silently cause applications to produce incorrect results and mislead domain scientists. We propose an approach for exploiting unnecessary bits in pointer values to support encoding the pointer with a Reed-Solomon code. Encoding the pointer allows us to provide strong capabilities for correcting and detecting corruption of pointer values.
- Published
- 2020
- Full Text
- View/download PDF
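A sketch of the mechanics described above: on x86-64, user-space pointers occupy 48 bits, leaving the top 16 bits available to carry a check code if they are stripped before dereference. The paper encodes a Reed-Solomon code there, which can also correct errors; this sketch substitutes a simple XOR fold purely to show the encode/verify flow, so it can only detect corruption.

```c
/* Sketch of encoding a check code in the unused high bits of a
 * 48-bit canonical x86-64 user pointer. The XOR fold is a stand-in
 * for the paper's Reed-Solomon symbols. */
#include <stdint.h>
#include <stdio.h>

static uint16_t check16(uint64_t p)        /* XOR-fold the low 48 bits */
{
    return (uint16_t)(p ^ (p >> 16) ^ (p >> 32));
}

static uint64_t encode(void *ptr)
{
    uint64_t p = (uint64_t)(uintptr_t)ptr & 0xFFFFFFFFFFFFull;
    return p | ((uint64_t)check16(p) << 48);   /* code in top 16 bits */
}

static void *decode(uint64_t enc, int *ok)
{
    uint64_t p = enc & 0xFFFFFFFFFFFFull;
    *ok = ((uint16_t)(enc >> 48) == check16(p));
    return (void *)(uintptr_t)p;   /* assumes a user-space (bit 47 == 0) pointer */
}

int main(void)
{
    int x = 42, ok;
    uint64_t enc = encode(&x);
    enc ^= 1ull << 13;                 /* inject a single bit flip */
    int *p = decode(enc, &ok);
    printf("corruption detected: %s\n", ok ? "no" : "yes");
    return ok ? *p : 0;                /* only dereference if clean */
}
```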
27. Ethiopian paediatric oncology registry progress report: documentation practice improvements at tertiary care centre in Addis Ababa, Ethiopia
- Author
-
Sheila Weitzman, Julie Broas, Kaitlyn M Buhlinger, Daniel Hailu, Abdulkadir M Said Gidey, Wondwessen Bekele, Mohammed Mustefa, Vanessa Miller, Stephen M Clark, Thomas B. Alexander, David N. Korones, Tadele Hailu, Haileyesus Adam, Benyam Muluneh, Megan C. Roberts, Michael Chargualaf, Atalay Mulu Fentie, Mulugeta Ayalew Yimer, Scott Levy, Ali Mamude Dinkiye, Diriba Fufa, and Aziza T. Shad
- Subjects
Documentation, Medical Oncology, Pediatrics, Tertiary care, Unmet needs, Tertiary Care Centers, Paediatric cancer, Neoplasms, Humans, Medicine, Patient treatment, Registries, Child, Paediatric oncology, Medical record, Quality Improvement, Family medicine, Pediatrics, Perinatology and Child Health, Ethiopia, Delivery of Health Care, Qualitative research
- Abstract
Limited data are available regarding cancer in low and middle-income countries (LMICs), distorting the true burden of paediatric cancer.1 A sobering statistic based on available data shows that more than 80% of children diagnosed with cancer in high-income countries survive, while fewer than 25% of children in LMICs survive.2 While access to paediatric oncological care in Ethiopia is improving, the establishment of a national paediatric cancer registry remains an unmet need. Building on our previous work, we sought to standardise patient treatment documentation within the paediatric haematology and oncology department at Tikur Anbessa Specialized Hospital (TASH) in Addis Ababa, Ethiopia, to begin formal paediatric cancer registration at TASH.3 We interviewed medical record users and observed that there was a lack of consistency in treatment documentation as well as variability in the collection of data relating to cancer diagnoses. We attempted to address these gaps in documentation through the creation of two separate sets of data …
- Published
- 2021
- Full Text
- View/download PDF
28. Evaluating MPI resource usage summary statistics
- Author
-
Kurt B. Ferreira and Scott Levy
- Subjects
Computer Networks and Communications, Computer science, Message passing, Computer Graphics and Computer-Aided Design, Usage data, Theoretical Computer Science, Resource (project management), Computer engineering, Artificial Intelligence, Hardware and Architecture, Synchronization (computer science), Programming paradigm, Software, Trace collection
- Abstract
The Message Passing Interface (MPI) remains the dominant programming model for scientific applications running on today's high-performance computing (HPC) systems. This dominance stems from MPI's powerful semantics for inter-process communication that have enabled scientists to write applications for simulating important physical phenomena. MPI does not, however, specify how messages and synchronization should be carried out. Those details are typically dependent on low-level architecture details and the message characteristics of the application. Therefore, analyzing an application's MPI resource usage is critical to tuning MPI's performance on a particular platform. The result of this analysis is typically a discussion of the mean message sizes, queue search lengths and message arrival times for a workload or set of workloads. While a discussion of the arithmetic mean in MPI resource usage might be the most intuitive summary statistic, it is not always the most accurate in terms of representing the underlying data. In this paper, we analyze MPI resource usage for a number of key MPI workloads using an existing MPI trace collector and discrete-event simulator. Our analysis demonstrates that the average, while easy and efficient to calculate, is a useful metric for characterizing latency and bandwidth measurements, but may not be a good representation of application message sizes, match list search depths, or MPI inter-operation times. Additionally, we show that the median and mode are superior choices in many cases. We also observe that the arithmetic mean is not the best representation of central tendency for data that are drawn from distributions that are multi-modal or have heavy tails. The results and analysis of our work provide valuable guidance on how we, as a community, should discuss and analyze MPI resource usage data for scientific applications.
- Published
- 2021
- Full Text
- View/download PDF
29. Reform at Risk — Mandating Participation in Alternative Payment Plans
- Author
-
Rahul Rajkumar, Nicholas Bagley, and Scott Levy
- Subjects
Public administration, Medicare, Centers for Medicare and Medicaid Services, U.S., Reimbursement Mechanisms, Health care, Agency (sociology), Health insurance, Medicine, Government, Medicaid, Patient Protection and Affordable Care Act, General Medicine, Payment, United States, Health Care Reform, Government Regulation, United States Dept. of Health and Human Services, Health Services Research
- Abstract
The Center for Medicare and Medicaid Innovation was meant to be the government's innovation laboratory for health care. But HHS has quietly hobbled the agency, imperiling its ability...
- Published
- 2018
- Full Text
- View/download PDF
30. Evaluating tradeoffs between MPI message matching offload hardware capacity and performance
- Author
-
Scott Levy and Kurt B. Ferreira
- Subjects
Matching (statistics), Computer science, Semantics (computer science), Message Passing Interface, Inter-process communication, Wildcard, Programming paradigm, Queue, DRAM, Computer hardware
- Abstract
Although its demise has been frequently predicted, the Message Passing Interface (MPI) remains the dominant programming model for scientific applications running on high-performance computing (HPC) systems. MPI specifies powerful semantics for interprocess communication that have enabled scientists to write applications for simulating important physical phenomena. However, these semantics have also presented several significant challenges. For example, the existence of wildcard values has made the efficient enforcement of MPI message matching semantics challenging. Significant research has been dedicated to accelerating MPI message matching. One common approach has been to offload matching to dedicated hardware. One of the challenges that hardware designers have faced is knowing how to size hardware structures to accommodate outstanding match requests. Applications that exceed the capacity of specialized hardware typically must fall back to storing match requests in bulk memory, e.g. DRAM on the host processor. In this paper, we examine the implications of hardware matching and develop guidance on sizing hardware matching structures to strike a balance between minimizing expensive dedicated hardware resources and maximizing overall matching performance. By examining the message matching behavior of several important HPC workloads, we show that when specialized hardware matching is not dramatically faster than matching in memory, the offload hardware's match queue capacity can be reduced without significantly increasing match time. On the other hand, effectively exploiting the benefits of very fast specialized matching hardware requires sufficient storage resources to ensure that every search completes in the specialized hardware. The data and analysis in this paper provide important guidance for designers of MPI message matching hardware.
- Published
- 2019
- Full Text
- View/download PDF
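The sizing tradeoff analyzed above can be framed with a first-order cost model: a match attempt that fits in offload hardware pays a per-entry hardware search cost, while an overflow falls back to host memory at a higher per-entry cost. The sketch below uses illustrative numbers, not figures from the paper.

```c
/* First-order model of the hardware match-queue sizing tradeoff:
 * expected cost per match = entries * (hit * t_hw + (1 - hit) * t_mem). */
#include <stdio.h>

int main(void)
{
    double t_hw = 5.0, t_mem = 50.0;   /* ns per queue entry searched (made up) */
    double hit[] = { 0.50, 0.90, 0.99, 1.00 };  /* fraction fitting in HW */
    double entries = 32.0;             /* mean entries searched per match */

    for (int i = 0; i < 4; i++) {
        double t = entries * (hit[i] * t_hw + (1.0 - hit[i]) * t_mem);
        printf("hw hit rate %.2f -> %6.1f ns per match\n", hit[i], t);
    }
    return 0;
}
```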
31. The Upcoming Storm: The Implications of Increasing Core Count on Scalable System Software
- Author
-
Matthew G.F. Dosanjh, Ryan E. Grant, Nathan Hjelm, Scott Levy, and Whit Schonbein
- Abstract
As clock speeds have stagnated, the number of cores in a node has been drastically increased to improve processor throughput. Most scalable system software was designed and developed for single-threaded environments. Multithreaded environments are becoming increasingly prominent as application developers optimize their codes to leverage the full performance of the processor; however, these environments are incompatible with a number of assumptions that have driven scalable system software development. This paper highlights a case study of this mismatch, focusing on MPI message matching. MPI message matching has been designed and optimized for traditional serial execution. The reduced determinism in the order of MPI calls can significantly reduce the performance of MPI message matching, potentially exceeding the time-per-iteration targets of many applications. Different proposed techniques attempt to address these issues and enable multithreaded MPI usage. These approaches highlight a number of tradeoffs that make adapting MPI message matching complex. This case study and its proposed solutions highlight a number of general concepts that need to be leveraged in the design of next-generation scalable system software.
- Published
- 2019
- Full Text
- View/download PDF
32. Clinical Utility of Plasma Cell-Free DNA in Adult Patients with Newly Diagnosed Glioblastoma: A Pilot Prospective Study
- Author
-
Arati Desai, Timothy Prior, Erica L. Carpenter, MacLean Nasrallah, Samantha Guiry, Donald M. O'Rourke, Jeffrey B. Ware, Zev A. Binder, S. Ali Nabavizadeh, Theresa Christensen, Whitney Sarchiapone, Steven Brem, Jennifer J.D. Morrissette, Jazmine Mays, Scott Levy, Jasmin Hussain, Jacob Till, Andrew J. Cucchiara, Stephanie S. Yee, and Stephen J Bagley
- Subjects
Basic medicine, Oncology, Adult, Male, Cancer Research, Pilot Projects, Newly diagnosed, Plasma cell, Cell-free DNA, Circulating Tumor DNA, Young Adult, Clinical medicine, Germline mutation, Internal medicine, Biomarkers, Tumor, Humans, Longitudinal Studies, Prospective Studies, Prospective cohort study, Aged, Aged, 80 and over, Adult patients, High-Throughput Nucleotide Sequencing, Middle Aged, Prognosis, Magnetic Resonance Imaging, Tumor Burden, Survival Rate, Oncology & carcinogenesis, Mutation, Female, Glioblastoma, Chemoradiotherapy
- Abstract
Purpose: The clinical utility of plasma cell-free DNA (cfDNA) has not been assessed prospectively in patients with glioblastoma (GBM). We aimed to determine the prognostic impact of plasma cfDNA in GBM, as well as its role as a surrogate of tumor burden and substrate for next-generation sequencing (NGS). Experimental Design: We conducted a prospective cohort study of 42 patients with newly diagnosed GBM. Plasma cfDNA was quantified at baseline prior to initial tumor resection and longitudinally during chemoradiotherapy. Plasma cfDNA was assessed for its association with progression-free survival (PFS) and overall survival (OS), correlated with radiographic tumor burden, and subjected to a targeted NGS panel. Results: Prior to initial surgery, GBM patients had higher plasma cfDNA concentration than age-matched healthy controls (mean 13.4 vs. 6.7 ng/mL, P < 0.001). Plasma cfDNA concentration was correlated with radiographic tumor burden on patients' first post-radiation magnetic resonance imaging scan (ρ = 0.77, P = 0.003) and tended to rise prior to or concurrently with radiographic tumor progression. Preoperative plasma cfDNA concentration above the mean (>13.4 ng/mL) was associated with inferior PFS (median 4.9 vs. 9.5 months, P = 0.038). Detection of ≥1 somatic mutation in plasma cfDNA occurred in 55% of patients and was associated with nonstatistically significant decreases in PFS (median 6.0 vs. 8.7 months, P = 0.093) and OS (median 5.5 vs. 9.2 months, P = 0.053). Conclusions: Plasma cfDNA may be an effective prognostic tool and surrogate of tumor burden in newly diagnosed GBM. Detection of somatic alterations in plasma is feasible when samples are obtained prior to initial surgical resection.
- Published
- 2019
33. Hardware MPI message matching: Insights into MPI matching behavior to inform design
- Author
-
Taylor Groves, Michael J. Levenhagen, Kurt B. Ferreira, Ryan E. Grant, and Scott Levy
- Subjects
Matching (statistics), Computational Theory and Mathematics, Computer Networks and Communications, Computer science, Software, Computer Science Applications, Theoretical Computer Science
- Published
- 2019
- Full Text
- View/download PDF
34. Mediating Data Center Storage Diversity in HPC Applications with FAODEL
- Author
-
Gary J. Templet, Scott Levy, Craig D. Ulmer, Patrick Widener, and Todd Kordenbrock
- Subjects
Service (systems architecture), Computer science, Distributed computing, Data management, Networking & telecommunications, Supercomputer, Data type, Workflow, Computer data storage, Scalability, Data center
- Abstract
Composition of computational science applications into both ad hoc pipelines for analysis of collected or generated data and into well-defined and repeatable workflows is becoming increasingly popular. Meanwhile, dedicated high performance computing storage environments are rapidly becoming more diverse, with both significant amounts of non-volatile memory storage and mature parallel file systems available. At the same time, computational science codes are being coupled to data analysis tools which are not filesystem-oriented. In this paper, we describe how the FAODEL data management service can expose different available data storage options and mediate among them in both application- and FAODEL-directed ways. These capabilities allow applications to exploit their knowledge of the different types of data they may exchange during a workflow execution, and also provide FAODEL with mechanisms to proactively tune data storage behavior when appropriate. We describe the implementation of these capabilities in FAODEL and how they are used by applications, and present preliminary performance results demonstrating the potential benefits of our approach.
- Published
- 2019
- Full Text
- View/download PDF
35. Comparison of changes in nonclassical (α) and classical (β) acoustic nonlinear parameters due to thermal aging of 9Cr–1Mo ferritic martensitic steel
- Author
-
Katherine Marie Scott Levy, Daniel Niklas Fahse, Jin-Yeon Kim, and Laurence J. Jacobs
- Subjects
Nonlinear system, Rockwell scale, Materials science, Mechanical Engineering, Martensite, Nonlinear resonance, Modulus, General Materials Science, Composite material, Condensed Matter Physics, Microstructure, Laser Doppler vibrometer, Carbide
- Abstract
The objective of this research is to demonstrate the sensitivity of the hysteretic, nonclassical acoustic nonlinear parameter, α, for tracking changes in the microstructure of 9Cr–1Mo ferritic martensitic steel due to thermal aging. The α parameter is measured with a non-contact nonlinear resonance ultrasound spectroscopy (NRUS) system with an air-coupled source and a laser Doppler vibrometer (LDV) receiver. This NRUS setup is used to track changes in multiple 9Cr–1Mo specimens subjected to different aging times at the same temperature, 650 °C. These α results are shown to be highly sensitive to the associated changes in the microstructure of the 9Cr–1Mo specimens, and are then compared to three other parameters (Rockwell hardness, Young's modulus E, and the classical acoustic nonlinear parameter β), all measured in the same specimens. These results are then combined to infer microstructure changes such as the removal of dislocations and the formation of carbide precipitates occurring in the 9Cr–1Mo specimens during thermal aging.
- Published
- 2020
- Full Text
- View/download PDF
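For context on the entry above: the abstract does not restate the definition, but in the NRUS literature the hysteretic parameter α is commonly defined by the linear downward shift of the resonance frequency with driving strain amplitude. Sign and normalization conventions vary, so this may differ from the paper's exact form.

```latex
% Nonclassical (hysteretic) nonlinearity parameter from NRUS:
% f_0 is the low-amplitude resonance frequency, Delta epsilon the
% driving strain amplitude, and alpha the slope of the shift.
\frac{f(\Delta\varepsilon) - f_0}{f_0} = -\,\alpha\, \Delta\varepsilon
```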
36. Lessons Learned from Memory Errors Observed Over the Lifetime of Cielo
- Author
-
Kurt B. Ferreira, Elisabeth Baseman, Vilas Sridharan, Taniya Siddiqua, Scott Levy, and Nathan DeBardeleben
- Subjects
Random access memory, Memory errors, Computer science, Reliability, Reliability engineering, Memory management, Cielo, Static random-access memory, DRAM
- Abstract
Maintaining the performance of high-performance computing (HPC) applications as failures increase is a major challenge for next-generation extreme-scale systems. Recent work demonstrates that hardware failures are expected to become more common. Few existing studies, however, have examined failures in the context of the entire lifetime of a single platform. In this paper, we analyze a corpus of empirical failure data collected over the entire five-year lifetime of Cielo, a leadership-class HPC system. Our analysis reveals several important findings about failures on Cielo: (i) its memory (DRAM and SRAM) exhibited no aging effects; detectable, uncorrectable errors (DUE) showed no discernible increase over its five-year lifetime; (ii) contrary to popular belief, correctable DRAM faults are not predictive of future uncorrectable DRAM faults; (iii) the majority of system down events have no identifiable hardware root cause, highlighting the need for more comprehensive logging facilities to improve failure analysis on future systems; and (iv) continued advances will be needed in order for current failure mitigation techniques to be viable on future systems. Our analysis of this corpus of empirical data provides critical analysis of, and guidance for, the deployment of extreme-scale systems.
- Published
- 2018
- Full Text
- View/download PDF
37. Using Simulation to Examine the Effect of MPI Message Matching Costs on Application Performance
- Author
-
Kurt B. Ferreira and Scott Levy
- Subjects
Message processing, Computer engineering, Computer science, Message Passing Interface, Latency (engineering), Queue
- Abstract
Attaining high performance with MPI applications requires efficient message matching to minimize message processing overheads and the latency these overheads introduce into application communication. In this paper, we use a validated simulation-based approach to examine the relationship between MPI message matching performance and application time-to-solution. Specifically, we examine how the performance of several important HPC workloads is affected by the time required for matching. Our analysis yields several important contributions: (i) the performance of current workloads is unlikely to be significantly affected by MPI matching unless match queue operations get much slower or match queues get much longer; (ii) match queue designs that provide sublinear performance as a function of queue length are unlikely to yield much benefit unless match queue lengths increase dramatically; and (iii) we provide guidance on how long the mean time per match attempt may be without significantly affecting application performance. The results and analysis in this paper provide valuable guidance on the design and development of MPI message match queues.
- Published
- 2018
- Full Text
- View/download PDF
38. The unexpected virtue of almost: Exploiting MPI collective operations to approximately coordinate checkpoints
- Author
-
Scott Levy, Patrick Widener, and Kurt B. Ferreira
- Subjects
Computer Networks and Communications, Computer science, Fault tolerance, Parallel computing, Computer Science Applications, Theoretical Computer Science, Computational Theory and Mathematics, Software
- Published
- 2018
- Full Text
- View/download PDF
39. ASC ATDM Level 2 Milestone #6358: Assess Status of Next Generation Components and Physics Models in EMPIRE
- Author
-
Craig D. Ulmer, Edward G. Phillips, Christopher Siefert, Paul Lin, Eric C. Cyr, Gary J. Templet, Matthew Swan, Jonathan Joseph Hu, Christian A. Glusa, Scott Levy, Roger P. Pawlowski, Keith Cartwright, Irina Kalashnikova Tezaur, Curtis C. Ober, Sidafa Conde, Eric T. Phipps, Matthew Tyler Bettencourt, Richard Michael Jack Kramer, Micheal W. Glass, and Todd Kordenbrock
- Published
- 2018
- Full Text
- View/download PDF
40. Open Science on Trinity's Knights Landing Partition
- Author
- Scott Levy, Kurt B. Ferreira, and Kevin Pedretti
- Subjects
Job scheduler, Open science, Computer science, Supercomputer, Partition (database), Data science, Resource management, Xeon Phi
- Abstract
High-performance computing (HPC) systems are critically important to the objectives of universities, national laboratories, and commercial companies. Because of the cost of deploying and maintaining these systems, ensuring their efficient use is imperative. Job scheduling and resource management are central to that goal, and significant research has therefore been conducted on how to effectively schedule user jobs on HPC systems. Developing and evaluating job scheduling algorithms, however, requires a detailed understanding of how users request resources. In this paper, we examine a corpus of job data collected on Trinity, a leadership-class supercomputer. During the stabilization period of Trinity's Intel Xeon Phi (Knights Landing) partition, the partition was made available to users outside of a classified environment for the Trinity Open Science Phase 2 campaign, and we collected information from the resource manager about each user job run during this period. Our analysis reveals several important characteristics of the jobs submitted during the Open Science period and provides critical insight into the use of one of the most powerful supercomputers in existence. In particular, these data provide important guidance for the design, development, and evaluation of job scheduling and resource management algorithms.
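The kind of workload characterization described here can be illustrated with a few lines of Python. The (nodes, walltime) record format and the sample jobs below are invented; the actual Trinity data set has its own schema.

    # Sketch: summarize how users requested resources. Records are invented
    # (nodes, walltime_s) pairs, not Trinity's actual job schema.
    from collections import Counter

    jobs = [(1, 120), (1, 300), (64, 3600), (512, 7200), (1, 90)]

    def summarize(jobs):
        sizes = Counter(n for n, _ in jobs)
        total = len(jobs)
        for n, count in sorted(sizes.items()):
            print(f"{n:>5} nodes: {count/total:6.1%} of jobs")
        # aggregate demand, useful for comparing scheduling policies
        print("node-seconds consumed:", sum(n * w for n, w in jobs))

    summarize(jobs)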
- Published
- 2018
- Full Text
- View/download PDF
41. Faodel
- Author
- Jay Lofstead, Margaret Lawson, Shyamali Mukherjee, Todd Kordenbrock, Gary J. Templet, Patrick Widener, Craig D. Ulmer, and Scott Levy
- Subjects
Computer science, Distributed computing, Data management, Bandwidth (signal processing), Set (abstract data type), Workflow, Scalability, Programming paradigm, Data as a service
- Abstract
Composition of computational science applications, whether into ad hoc pipelines for analysis of simulation data or into well-defined and repeatable workflows, is becoming commonplace. In order to scale well as projected system and data sizes increase, developers will have to address a number of looming challenges. Increased contention for parallel filesystem bandwidth, accommodating in situ and ex situ processing, and the advent of decentralized programming models will all complicate application composition on next-generation systems. In this paper, we introduce Faodel, a set of data services that provides scalable data management for workflows and composed applications. Faodel allows workflow components to exchange data directly and efficiently, in semantically appropriate forms, rather than in forms dictated by the storage hierarchy or the programming model in use. We describe the architecture of Faodel and present preliminary performance results demonstrating its potential for scalability in workflow scenarios.
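The following toy sketch illustrates the general idea of a data service that lets workflow components exchange named objects directly rather than through files. It is an invented, in-memory stand-in; nothing in it is Faodel's actual API.

    # Illustration only: a toy in-memory "data service" in which producers
    # publish objects under semantic keys and consumers fetch them by key,
    # bypassing the filesystem. This is not Faodel's actual interface.
    class DataService:
        def __init__(self):
            self._objects = {}

        def publish(self, key, obj):
            # A real service would place the object in network-accessible
            # memory; here we simply register it under its key.
            self._objects[key] = obj

        def want(self, key):
            return self._objects[key]

    svc = DataService()
    svc.publish(("timestep", 42, "temperature"), [300.1, 301.7, 299.8])
    print(svc.want(("timestep", 42, "temperature")))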
- Published
- 2018
- Full Text
- View/download PDF
42. Obstructive Sleep Apnea (OSA) as a Cause of Resistant Fatigue in the Safety-Sensitive Workforce
- Author
- Leslie Emma, Scott Levy, and Neelum Sanderson
- Subjects
Obstructive sleep apnea, Workforce, Emergency medicine, Apnea, Sleep
- Abstract
Objectives/Scope Fatigue is a known contributor to accidents, and the potential for fatigue-related accidents exists in the oil and energy industry. Fatigue risk management systems commonly involve review and adjustment of employee rosters and job functions to help employees get adequate rest given their work demands. Although this approach is reasonable, it assumes that an employee who is given the opportunity to rest will return refreshed. Certain medical conditions can prevent this. Obstructive sleep apnea (OSA) is a condition in which the upper airway becomes obstructed when its muscles relax during sleep. The obstruction forces the patient to awaken repeatedly and, if untreated, may lead to adverse medical outcomes. For these patients, hours of work may correlate poorly with level of fatigue. Although OSA has many risk factors, the one most relevant here is Body Mass Index (BMI). In the adult population, the prevalence of OSA among obese subjects is estimated at approximately 25% to 45%, and the odds of having OSA increase as BMI rises, particularly for individuals with a BMI greater than 35. Method, Procedures, Process We describe risk factors for OSA, treatment of the condition, and methods to reduce fatigue-related risk. The discussion includes key components of a medical screening program as well as health and wellness programming that can be considered in parallel with any fatigue risk management system. Results, Observations, Conclusions Biometric data can be used to help predict the risk of fatigue-related accidents in the workplace; by addressing these risks and providing solutions, such incidents may decrease. Novel/Additive Information We explore current OSA screening criteria, work-hours limitations, and health and wellness programs as they relate to reducing risk. Most importantly, we discuss a significant shortcoming in the identification of high-risk individuals and a straightforward approach to help mitigate it.
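For reference, BMI is body weight in kilograms divided by the square of height in meters, so the screening threshold mentioned above can be checked with a one-line computation (the example values are arbitrary):

    # BMI = weight (kg) / height (m)^2; the abstract's threshold is BMI > 35.
    def bmi(weight_kg, height_m):
        return weight_kg / height_m ** 2

    print(round(bmi(120, 1.8), 1))  # 37.0, above the >35 screening threshold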
- Published
- 2018
- Full Text
- View/download PDF
43. It’s Not the Heat, It’s the Humidity: Scheduling Resilience Activity at Scale
- Author
- Patrick Widener, Kurt B. Ferreira, and Scott Levy
- Subjects
Risk analysis (engineering), Computer science, Resilience (network), Scheduling (computing)
- Abstract
Maintaining the performance of high-performance computing (HPC) applications in the face of an expected increase in failures is a major challenge for next-generation extreme-scale systems. With increasing scale, resilience activities (e.g., checkpointing) are expected to become more diverse, less tightly synchronized, and more computationally intensive. Few existing studies, however, have examined how decisions about scheduling resilience activities impact application performance. In this work, we examine the relationship between the duration and frequency of resilience activities and application performance. Our study reveals several key findings: (i) the aggregate amount of time consumed by resilience activities is not an effective metric for predicting application performance; (ii) the duration of the interruptions due to resilience activities has the greatest influence on application performance; shorter, but more frequent, interruptions are correlated with better application performance; and (iii) the differential impact of resilience activities across applications is related to the applications' inter-collective frequencies; the performance of applications that perform infrequent collective operations scales better in the presence of resilience activities than that of applications that perform collectives more frequently. This initial study demonstrates the importance of considering how resilience activities are scheduled. We provide critical analysis and direct guidance on how the resilience challenges of future systems can be met while minimizing the impact on application performance.
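Finding (ii) can be reproduced qualitatively with a small simulation of a bulk-synchronous application, in which each compute interval ends at a collective and therefore stalls for the worst interruption suffered by any rank. The sketch below uses invented parameters; both schedules consume the same expected aggregate resilience time per rank.

    # Minimal simulation: for a fixed aggregate amount of resilience work,
    # rare-but-long interruptions slow a bulk-synchronous app more than
    # frequent-but-short ones. All parameters are illustrative.
    import random

    def runtime(ranks, intervals, compute_s, interrupt_s, p_interrupt):
        """Each interval's length is the compute time plus the worst
        interruption delay suffered by any rank before the collective."""
        total = 0.0
        for _ in range(intervals):
            worst = max(interrupt_s if random.random() < p_interrupt else 0.0
                        for _ in range(ranks))
            total += compute_s + worst
        return total

    random.seed(0)
    base = 1000 * 1.0   # ideal runtime: 1000 intervals of 1 s compute
    # Both schedules average 10 s of interruption per rank over the run:
    long_rare  = runtime(ranks=1024, intervals=1000, compute_s=1.0,
                         interrupt_s=1.0,  p_interrupt=0.01)
    short_freq = runtime(ranks=1024, intervals=1000, compute_s=1.0,
                         interrupt_s=0.01, p_interrupt=1.0)
    print(long_rare / base, short_freq / base)  # ~2.0x vs ~1.01x slowdown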
- Published
- 2018
- Full Text
- View/download PDF
44. Empress
- Author
- Jay Lofstead, Todd Kordenbrock, Scott Levy, Shyamali Mukherjee, Margaret Lawson, Gary J. Templet, Patrick Widener, and Craig D. Ulmer
- Subjects
Metadata, Information retrieval, Data element, Computer science, Metadata management, Geospatial metadata, Metadata modeling, Database catalog, Metadata repository
- Abstract
Significant challenges exist in the efficient retrieval of data from extreme-scale simulations. An important and evolving method of addressing these challenges is application-level metadata management. Historically, HDF5 and NetCDF have eased data retrieval by offering rudimentary attribute capabilities that provide basic metadata. ADIOS simplified data retrieval by maintaining metadata for each process's data. EMPRESS provides a simple example of the next step in this evolution: it integrates per-process metadata with the storage system itself, making that metadata more broadly useful than single-file or application-specific formats allow. Additionally, it supports more robust and customizable metadata.
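A toy illustration of per-process metadata is shown below: each rank registers a record describing where its portion of a variable lives, and readers query the catalog instead of scanning the data. The functions and record fields are invented for the example and are not EMPRESS's actual interface.

    # Toy per-process metadata catalog: ranks register what they wrote and
    # where; readers locate data by querying the catalog. Invented API.
    records = []

    def register(rank, var, lo, hi, location):
        records.append({"rank": rank, "var": var, "lo": lo, "hi": hi,
                        "loc": location})

    def find(var, point):
        # return storage locations covering the requested index
        return [r["loc"] for r in records
                if r["var"] == var and r["lo"] <= point <= r["hi"]]

    register(0, "pressure", 0, 499, "obj-store:chunk-0")
    register(1, "pressure", 500, 999, "obj-store:chunk-1")
    print(find("pressure", 742))   # -> ['obj-store:chunk-1']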
- Published
- 2017
- Full Text
- View/download PDF
45. Lifetime memory reliability data from the field
- Author
- Scott Levy, Vilas Sridharan, Nathan DeBardeleben, Elisabeth Baseman, Taniya Siddiqua, Steven Raasch, Qiang Guan, and Kurt B. Ferreira
- Subjects
Computer science, Reliability engineering, Static random-access memory, DRAM, Resilience (network)
- Abstract
In order to provide high system resilience, it is important to understand the nature of the faults that occur in the field. This study analyzes fault rates from a production system that was monitored for five years, capturing data for its entire operational lifetime. The data show that devices in this system did not show any sign of aging during the monitoring period, suggesting that the useful lifetime of a system may exceed five years. In DRAM, the relative incidence of fault modes changed insignificantly over the system's lifetime: the relative rate of each fault mode at the end of the system's lifetime was within 1.4 percentage points of the rate observed during the first year. SRAM caches in the system exhibited several fault modes, including cache-way faults and single-bit faults. Overall, this study provides insight into how fault modes and types evolve over a system's lifetime.
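The "relative incidence" comparison can be expressed in a few lines. The sketch below computes per-year fault-mode rates from invented (year, mode) records and reports the percentage-point drift between the first and last year.

    # Sketch of the relative-incidence comparison on made-up fault records.
    from collections import Counter

    def relative_rates(faults, year):
        modes = Counter(m for y, m in faults if y == year)
        total = sum(modes.values())
        return {m: c / total for m, c in modes.items()}

    faults = [(1, "single-bit"), (1, "single-bit"), (1, "row"),
              (5, "single-bit"), (5, "row"), (5, "single-bit")]
    first, last = relative_rates(faults, 1), relative_rates(faults, 5)
    for mode in set(first) | set(last):
        drift = abs(first.get(mode, 0) - last.get(mode, 0)) * 100
        print(f"{mode}: {drift:.1f} percentage-point change")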
- Published
- 2017
- Full Text
- View/download PDF
46. Characterizing MPI matching via trace-based simulation
- Author
- Kurt B. Ferreira, Kevin Pedretti, Scott Levy, and Ryan E. Grant
- Subjects
Computer Networks and Communications, Computer science, Distributed computing, Theoretical Computer Science, Artificial Intelligence, Hardware and Architecture, Middleware, Scalability, Resource allocation, Trace-based simulation, Software
- Abstract
With the increased scale expected on future leadership-class systems, detailed information about the resource usage and performance of MPI message matching provides important insights into how to maintain application performance on next-generation systems. However, obtaining MPI message matching performance data is often not possible without significant effort. A common approach is to instrument an MPI implementation to collect relevant statistics. While this approach can provide important data, collecting matching data at runtime perturbs the application’s execution, including its matching performance, and is highly dependent on the MPI library’s matchlist implementation. In this paper, we introduce a trace-based simulation approach to obtain detailed MPI message matching performance data for MPI applications without perturbing their execution. Using a number of key parallel workloads and microbenchmarks, we demonstrate that this simulator approach can rapidly and accurately characterize matching behavior. Specifically, we use our simulator to collect several important statistics about the operation of the MPI posted and unexpected queues. For example, we present data about search lengths and the duration that messages spend in the queues waiting to be matched. Data gathered using this simulation-based approach have significant potential to aid hardware designers in determining resource allocation for MPI matching functions and provide application and middleware developers with insight into the scalability issues associated with MPI message matching.
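A minimal trace-driven model of MPI matching is sketched below: arriving messages search the posted-receive queue, misses land in the unexpected-message queue, and later receives search that queue. Real simulators also model timing and full wildcard semantics; this sketch only counts search lengths.

    # Minimal trace-driven model of the MPI posted and unexpected queues.
    posted, unexpected, search_lengths = [], [], []

    def post_recv(src, tag):
        for i, (s, t, _) in enumerate(unexpected):   # search unexpected queue
            if (s == src or src == "ANY") and (t == tag or tag == "ANY"):
                search_lengths.append(i + 1)
                return unexpected.pop(i)
        search_lengths.append(len(unexpected))       # full (failed) search
        posted.append((src, tag))

    def arrive(src, tag, payload):
        for i, (s, t) in enumerate(posted):          # search posted queue
            if (s == src or s == "ANY") and (t == tag or t == "ANY"):
                search_lengths.append(i + 1)
                posted.pop(i)
                return payload
        search_lengths.append(len(posted))           # full (failed) search
        unexpected.append((src, tag, payload))

    arrive(0, 7, "early")   # no matching posted receive yet: unexpected
    post_recv(0, 7)         # matches from the unexpected queue
    print(search_lengths)   # per-operation search lengths, e.g. [0, 1]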
- Published
- 2017
- Full Text
- View/download PDF
47. Evaluating the Viability of Using Compression to Mitigate Silent Corruption of Read-Mostly Application Data
- Author
- Patrick G. Bridges, Scott Levy, and Kurt B. Ferreira
- Subjects
Computer science, Silent data corruption, Computer security, Exascale computing, Memory management, Memory protection, Embedded system, Resource management (computing), Resilience (network)
- Abstract
Aggregating millions of hardware components to construct an exascale computing platform will pose significant resilience challenges. In addition to slowdowns associated with detected errors, silent errors are likely to further degrade application performance. Moreover, silent data corruption (SDC) has the potential to undermine the integrity of the results produced by important scientific applications. In this paper, we propose an application-independent mechanism to efficiently detect and correct SDC in read-mostly memory, where SDC may be most likely to occur. We use memory protection mechanisms to maintain compressed backups of application memory, and we detect SDC by identifying changes in memory contents that occur without explicit write operations. We demonstrate that, for several applications, our approach can potentially protect a significant fraction of application memory pages from SDC with modest overheads. Moreover, the proposed technique can be straightforwardly combined with many other approaches to provide a significant bulwark against SDC.
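A simplified illustration of the detect-and-correct cycle follows: keep a compressed backup and a hash of a read-mostly buffer, detect a silent change by re-hashing, and restore from the backup. The paper's mechanism additionally uses memory-protection hardware to distinguish legitimate writes; that part is omitted here.

    # Simplified detect/correct cycle for silent corruption of read-mostly
    # data: compressed backup + hash, re-hash to detect, decompress to repair.
    import hashlib, zlib

    def protect(buf: bytes):
        return zlib.compress(buf), hashlib.sha256(buf).digest()

    def scrub(buf: bytes, backup, digest):
        """Return (possibly repaired) buffer and whether SDC was detected."""
        if hashlib.sha256(buf).digest() == digest:
            return buf, False
        return zlib.decompress(backup), True     # silent change: restore

    data = b"read-mostly lookup table" * 100
    backup, digest = protect(data)
    corrupted = b"X" + data[1:]                   # simulate a silent bit flip
    repaired, hit = scrub(corrupted, backup, digest)
    print(hit, repaired == data)                  # True True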
- Published
- 2017
- Full Text
- View/download PDF
48. Bringing Regenerative Medicine to Patients: The Coverage, Coding, and Reimbursement Processes
- Author
- Sujata K. Bhatia, Scott Levy, and Khin-Kyemon Aung
- Subjects
Medicine, Coding (therapy), Medical physics, Regenerative medicine, Reimbursement, Biomedical engineering
- Published
- 2017
- Full Text
- View/download PDF
49. Horseshoes and Hand Grenades: The Case for Approximate Coordination in Local Checkpointing Protocols
- Author
- Patrick Widener, Kurt B. Ferreira, and Scott Levy
- Subjects
Computer science, Distributed computing, Message Passing Interface, Synchronization (computer science), Asynchrony (computer programming), Scalability
- Abstract
Fault tolerance poses a major challenge for future large-scale systems. Active research into coordinated, uncoordinated, and hybrid checkpointing systems has explored how introducing asynchrony can address anticipated scalability issues. While fully uncoordinated approaches have been shown to incur significant delays, the degree of synchronization required to keep overheads low has not been carefully examined. In this paper, we use a simulation-based approach to show the impact of synchronization on local checkpoint activity. Specifically, we show that the degree of synchronization needed to keep the impact of local checkpointing low is attainable with current technology for a number of key production HPC workloads. Our work provides a critical analysis and comparison of synchronization and local checkpointing, enabling users and system administrators to fine-tune the checkpointing scheme to the application and system characteristics at hand.
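The synchronization question can be posed concretely: in a bulk-synchronous application, an interval that overlaps checkpointing stalls until the last rank finishes its checkpoint, so the cost grows with how far apart checkpoint start times drift. The sketch below measures that growth under invented parameters.

    # Sketch: per-interval delay as checkpoint start times drift apart across
    # ranks. The last rank to start its checkpoint gates everyone at the next
    # collective. Parameters are illustrative.
    import random

    def interval_delay(ranks, ckpt_s, jitter_s, trials=1000):
        total = 0.0
        for _ in range(trials):
            starts = [random.uniform(0, jitter_s) for _ in range(ranks)]
            total += max(starts) + ckpt_s   # last starter gates the interval
        return total / trials

    random.seed(1)
    for jitter in (0.0, 0.1, 1.0, 10.0):    # seconds of allowed asynchrony
        print(jitter, round(interval_delay(1024, ckpt_s=1.0,
                                           jitter_s=jitter), 2))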
- Published
- 2017
- Full Text
- View/download PDF
50. A study of the viability of exploiting memory content similarity to improve resilience to memory errors
- Author
- Kurt B. Ferreira, Patrick G. Bridges, Aidan P. Thompson, Scott Levy, and Christian Robert Trott
- Subjects
Memory errors, Hardware and Architecture, Computer science, Node (networking), Distributed computing, Fault tolerance, Resilience (network), Supercomputer, Software, Theoretical Computer Science
- Abstract
Building the next generation of extreme-scale distributed systems will require overcoming several challenges related to system resilience. As the number of processors in these systems grows, the failure rate increases proportionally. One of the most common sources of failure in large-scale systems is memory. In this paper, we propose a novel runtime for transparently exploiting memory content similarity to improve system resilience by reducing the rate at which memory errors lead to node failure. We evaluate the viability of this approach by examining memory snapshots collected from eight high-performance computing (HPC) applications and two important HPC operating systems. Based on the characteristics of the similarity uncovered, we conclude that our proposed approach shows promise for addressing system resilience in large-scale systems.
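The core of the snapshot analysis reduces to hashing fixed-size pages and counting duplicates, as in the sketch below; the toy "snapshot" is constructed inline so the example is self-contained.

    # Sketch: split a memory image into pages, hash each, and measure how
    # much content is duplicated (and thus recoverable from another copy).
    import hashlib
    from collections import Counter

    PAGE = 4096

    def duplicate_fraction(snapshot: bytes):
        pages = [snapshot[i:i + PAGE] for i in range(0, len(snapshot), PAGE)]
        counts = Counter(hashlib.sha256(p).digest() for p in pages)
        dup_pages = sum(c for c in counts.values() if c > 1)
        return dup_pages / max(len(pages), 1)

    # toy image: 8 zero pages plus 2 identical non-zero pages
    snapshot = (b"\x00" * PAGE) * 8 + bytes(range(256)) * 16 * 2
    print(f"{duplicate_fraction(snapshot):.0%} of pages have an identical twin")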
- Published
- 2014
- Full Text
- View/download PDF