18 results for "Ye, Yanfang Fanny"
Search Results
2. Aves: A Decision Engine for Energy-efficient Stream Analytics across Low-power Devices
- Author
-
Das, Roshan Bharath, Makkes, Marc X., Uta, Alexandru, Wang, Lin, Bal, Henri, Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua Tony, Ak, Ronay, Tian, Yuanyuan, Barga, Roger, Zaniolo, Carlo, Lee, Kisung, Ye, Yanfang Fanny, Computer Systems, Network Institute, High Performance Distributed Computing
- Subjects
Decision support system, Computer science, Real-time computing, Wearable computer, Networking & telecommunications, Power semiconductor device, SDG 7 - Affordable and Clean Energy, Energy (signal processing), Efficient energy use - Abstract
Today's low-power devices, such as smartphones and wearables, form a very heterogeneous ecosystem. Applications in such a system typically follow a reactive pattern based on stream analytics, i.e., sensing, processing, and actuating. Despite the simplicity of this pattern, deciding where to place the processing tasks of an application to achieve energy efficiency is non-trivial in a heterogeneous system, since application components are distributed across multiple devices. In this paper, we present Aves - a decision-making engine based on a holistic energy-prediction model, with which the processing tasks of applications can be placed automatically in an energy-efficient manner without programmer/user intervention. We validate the effectiveness of the model and reveal several counter-intuitive placement decisions. Our decision engine's improvements are typically 10-30%, and up to a factor of 14 in the most extreme cases. We also show that Aves makes accurate decisions, as validated against real energy measurements for two sensor-based applications.
- Published
- 2020
- Full Text
- View/download PDF
3. Arms Race in Adversarial Malware Detection: A Survey
- Author
-
Li, Deqiang, primary, Li, Qianmu, additional, Ye, Yanfang (Fanny), additional, and Xu, Shouhuai, additional
- Published
- 2021
- Full Text
- View/download PDF
4. WebEvo: Taming Web Application Evolution via Detecting Semantic Structure Change
- Author
-
Shao, Fei, Xu, Rui, Haque, Wasif, Xu, Jingwei, Zhang, Ying, Yang, Wei, Ye, Yanfang (Fanny), and Xiao, Xusheng
- Subjects
Static analysis, Document and text processing, Testing and development processes, Computer graphics - Abstract
Please refer to README.html.
- Published
- 2021
- Full Text
- View/download PDF
5. A Programming Framework for Heterogeneous Stream Analytics
- Author
-
Das, Roshan Bharath, Makkes, Marc X., Uta, Alexandru, Wang, Lin, Bal, Henri, Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua Tony, Ak, Ronay, Tian, Yuanyuan, Barga, Roger, Zaniolo, Carlo, Lee, Kisung, Ye, Yanfang Fanny, Computer Systems, Network Institute, High Performance Distributed Computing
- Subjects
SDG 16 - Peace, Justice and Strong Institutions, Computer science, Big data, Software engineering, Cloud computing, Software framework, Human–computer interaction, Information systems, Use case, Cloudlet - Abstract
Sensor-based applications using Big Data are of increasing importance in various fields. A typical use case is health-care applications [1], [2]. In a typical scenario, a patient's heart rate is monitored by a smartwatch. A smartphone can then analyze the gathered data and identify patterns in the patient's heart rate. However, if the data analysis is too complex to be performed on a smartphone, the computation can be offloaded to a nearby cloudlet or a remote cloud. A decision usually follows the analysis, and actuation is performed accordingly (e.g., a message is sent to either the patient or the doctor). Developing such an application is intrinsically complex, as the programmer needs to reconcile different APIs specific to different platforms.
- Published
- 2019
- Full Text
- View/download PDF
6. Hyperbolic Graph Attention Network
- Author
-
Zhang, Yiding, primary, Wang, Xiao, additional, Shi, Chuan, additional, Jiang, Xunqiang, additional, and Ye, Yanfang Fanny, additional
- Published
- 2021
- Full Text
- View/download PDF
7. Differentially private binary- and matrix-valued data query
- Author
-
Ji, Tianxi, primary, Li, Pan, additional, Yilmaz, Emre, additional, Ayday, Erman, additional, Ye, Yanfang (Fanny), additional, and Sun, Jinyuan, additional
- Published
- 2021
- Full Text
- View/download PDF
8. Heterogeneous Information Network Embedding with Adversarial Disentangler
- Author
-
Wang, Ruijia, primary, Shi, Chuan, additional, Zhao, Tianyu, additional, Wang, Xiao, additional, and Ye, Yanfang Fanny, additional
- Published
- 2021
- Full Text
- View/download PDF
9. An innovative online process mining framework for supporting incremental GDPR compliance of business processes
- Author
-
Zaman, Rashid, Cuzzocrea, Alfredo, Hassani, Marwan, Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua Tony, Ak, Ronay, Tian, Yuanyuan, Barga, Roger, Zaniolo, Carlo, Lee, Kisung, Ye, Yanfang Fanny, and Process Science
- Subjects
Computer science, Business process, Process mining, Compliance (psychology), Business intelligence, Model adaptation, Risk analysis (engineering), General Data Protection Regulation, Compliance checking, European Union - Abstract
GDPR (General Data Protection Regulation) is a new regulation of the European Union that imposes strict privacy constraints on storing, accessing, and processing user data, to ensure that personal user data are neither misused nor disclosed without explicit consent. As a consequence, business processes that interact with large amounts of such data can easily cause GDPR violations, owing to the typical complexity of such processes. Motivated by these considerations, this paper highlights the challenges and critical aspects of the GDPR compliance journey when opting for naïve, straightforward solutions. We propose a business-aware GDPR compliance journey using online process mining. Using several large log files generated from a real scenario, we show that the proposed tool is both effective and efficient. As such, it proves to be a powerful concept for use in incremental GDPR compliance environments.
- Published
- 2019
10. Understanding Data Similarity in Large-Scale Scientific Datasets
- Author
-
Linton, P, Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua, Ak, Ronay, Tian, Yuanyuan, Barga, Roger S, Zaniolo, Carlo, Lee, Kisung, Ye, Yanfang Fanny, Melodia, W, Lazar, A, Agarwal, D, Bianchi, L, Ghoshal, D, Pastorello, G, Ramakrishnan, L, and Wu, K
- Abstract
Today, scientific experiments and simulations produce massive amounts of heterogeneous data that need to be stored and analyzed. Given that these large datasets are stored in many files, formats and locations, how can scientists find relevant data, duplicates or similarities? In this context, we concentrate on developing algorithms to compare similarity of time series for the purpose of search, classification and clustering. For example, generating accurate patterns from climate related time series is important not only for building models for weather forecasting and climate prediction, but also for modeling and predicting the cycle of carbon, water, and energy. We developed the methodology and ran an exploratory analysis of climatic and ecosystem variables from the FLUXNET2015 dataset. The proposed combination of similarity metrics, nonlinear dimension reduction, clustering methods and validity measures for time series data has never been applied to unlabeled datasets before, and provides a process that can be easily extended to other scientific time series data. The dimensionality reduction step provides a good way to identify the optimum number of clusters, detect outliers and assign initial labels to the time series data. We evaluated multiple similarity metrics in terms of the internal cluster validity for driver as well as response variables. While the best metric often depends on a number of factors, the Euclidean distance seems to perform well for most variables and is also favorable in terms of computational expense.
- Published
- 2019
11. HDMF: Hierarchical Data Modeling Framework for Modern Science Data Standards.
- Author
-
Tritt, Andrew J, Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua, Ak, Ronay, Tian, Yuanyuan, Barga, Roger S, Zaniolo, Carlo, Lee, Kisung, Ye, Yanfang Fanny, Rübel, Oliver, Dichter, Benjamin, Ly, Ryan, Kang, Donghe, Chang, Edward F, Frank, Loren M, and Bouchard, Kristofer
- Abstract
A ubiquitous problem in aggregating data across different experimental and observational data sources is a lack of software infrastructure that enables flexible and extensible standardization of data and metadata. To address this challenge, we developed HDMF, a hierarchical data modeling framework for modern science data standards. With HDMF, we separate the process of data standardization into three main components: (1) data modeling and specification, (2) data I/O and storage, and (3) data interaction and data APIs. To enable standards to support the complex requirements and varying use cases throughout the data life cycle, HDMF provides object mapping infrastructure to insulate and integrate these various components. This approach supports the flexible development of data standards and extensions, optimized storage backends, and data APIs, while allowing the other components of the data standards ecosystem to remain stable. To meet the demands of modern, large-scale science data, HDMF provides advanced data I/O functionality for iterative data write, lazy data load, and parallel I/O. It also supports optimization of data storage via support for chunking, compression, linking, and modular data storage. We demonstrate the application of HDMF in practice to design NWB 2.0 [13], a modern data standard for collaborative science across the neurophysiology community.
- Published
- 2019
12. Magnitude and Uncertainty Pruning Criterion for Neural Networks
- Author
-
Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua Tony, Ak, Ronay, Tian, Yuanyuan, Barga, Roger, Zaniolo, Carlo, Lee, Kisung, Ye, Yanfang Fanny, Ko, Vinnie, Oehmcke, Stefan, and Gieseke, Fabian
- Abstract
Neural networks have achieved dramatic improvements in recent years and now represent the state-of-the-art for many real-world tasks. One drawback, however, is that many of these models are overparameterized, which makes them both computationally and memory intensive. Furthermore, overparameterization can also lead to undesired overfitting. Inspired by recently proposed magnitude-based pruning schemes and the Wald test from the field of statistics, we introduce a novel magnitude and uncertainty (MU) pruning criterion that helps to lessen these shortcomings. One important advantage of our MU pruning criterion is that it is scale-invariant, unlike purely magnitude-based pruning criteria. In addition, we present a 'pseudo bootstrap' scheme, which can efficiently estimate the uncertainty of the weights from their update information during training. Our experimental evaluation, based on various neural network architectures and datasets, shows that our new criterion leads to more compressed models than purely magnitude-based pruning criteria while, at the same time, losing less predictive power.
- Published
- 2019
13. Optimization of arable land use towards meat-free and climate-smart agriculture:A case study in food self-sufficiency of Vietnam
- Author
-
Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua Tony, Ak, Ronay, Tian, Yuanyuan, Barga, Roger, Zaniolo, Carlo, Lee, Kisung, Ye, Yanfang Fanny, Kuzmanovski, Vladimir, Larsen, Daniel Ellehammer, and Henriksen, Christian Bugge
- Abstract
The UN Sustainable Development Goals and the Paris climate agreement indicate that a transition to sustainable and healthy diets is necessary. Given that the agricultural sector is responsible for nearly a quarter of global greenhouse gas emissions (IPCC 2019 special report on climate change), this transition will require substantial dietary shifts, including reduced consumption of sugar and red meat. Vietnam, with a population of more than 95 million, faces the challenge of significantly reducing rice consumption and converting some of the land currently used for rice to the production of more legumes. However, correctly allocating arable land to the particular combination of crops that would ease the transition, while complying with recommendations for a healthy nutritional intake, is a societal challenge. We approached the problem of arable land allocation with mathematical optimization, in particular stochastic evolutionary computing. The allocation of arable land to crop combinations is evaluated through three objectives: food self-sufficiency, climate efficiency, and crop diversity. Candidate solutions (crop combinations) were analysed through the non-dominated Pareto front, prioritizing the objective of food self-sufficiency of Vietnam. The results suggest a significant change in the production of certain crops: sugar cane and rice should be reduced in favor of increased production of soybeans, maize, brassicas, and nuts. The current surplus of produced carbohydrates would thereby be reduced while protein production increases, leading to a balanced production of macronutrients.
- Published
- 2019
14. An Identity Privacy Preserving IoT Data Protection Scheme for Cloud Based Analytics
- Author
-
Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua Tony, Ak, Ronay, Tian, Yuanyuan, Barga, Roger, Zaniolo, Carlo, Lee, Kisung, Ye, Yanfang Fanny, Gehrmann, Christian, and Gunnarsson, Martin
- Abstract
Efficient protection of the huge amounts of IoT-produced data is key for wide-scale data analytics services. The most efficient approach is pure symmetric encryption, as it allows both fast decryption on the analytics-engine side and energy-efficient encryption on the IoT side. However, symmetric encryption can only be performed if there is a way to directly map an encrypted object to the correct key. Typically, such a mapping requires a unique IoT identity, which constitutes a privacy problem. In this paper, we present an IoT identity protection scheme for symmetric IoT data encryption. We give basic security definitions for this problem setting, present a new construction, and prove the security level achieved by the construction. Performance figures for a proof-of-concept implementation are also given. The new scheme offers a fair trade-off between identity privacy and complexity.
- Published
- 2019
15. Understanding Data Similarity in Large-Scale Scientific Datasets
- Author
-
Alina Lazar, Deb Agarwal, Lavanya Ramakrishnan, Ludovico Bianchi, Gilberto Pastorello, Payton Linton, Devarshi Ghoshal, William Melodia, Kesheng Wu, Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua, Ak, Ronay, Tian, Yuanyuan, Barga, Roger S, Zaniolo, Carlo, Lee, Kisung, and Ye, Yanfang Fanny
- Subjects
Computer science, Dimensionality reduction, Similarity measure, Euclidean distance, Similarity (network science), Outlier, Metric (mathematics), Data mining, Time series, Cluster analysis, Clustering - Abstract
Today, scientific experiments and simulations produce massive amounts of heterogeneous data that need to be stored and analyzed. Given that these large datasets are stored in many files, formats and locations, how can scientists find relevant data, duplicates or similarities? In this context, we concentrate on developing algorithms to compare similarity of time series for the purpose of search, classification and clustering. For example, generating accurate patterns from climate related time series is important not only for building models for weather forecasting and climate prediction, but also for modeling and predicting the cycle of carbon, water, and energy. We developed the methodology and ran an exploratory analysis of climatic and ecosystem variables from the FLUXNET2015 dataset. The proposed combination of similarity metrics, nonlinear dimension reduction, clustering methods and validity measures for time series data has never been applied to unlabeled datasets before, and provides a process that can be easily extended to other scientific time series data. The dimensionality reduction step provides a good way to identify the optimum number of clusters, detect outliers and assign initial labels to the time series data. We evaluated multiple similarity metrics in terms of the internal cluster validity for driver as well as response variables. While the best metric often depends on a number of factors, the Euclidean distance seems to perform well for most variables and is also favorable in terms of computational expense.
- Published
- 2019
16. HDMF: Hierarchical Data Modeling Framework for Modern Science Data Standards
- Author
-
Kristofer E. Bouchard, Oliver Rubel, Ryan Ly, Loren M. Frank, Benjamin Dichter, Edward F. Chang, Andrew Tritt, Donghe Kang, Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua, Ak, Ronay, Tian, Yuanyuan, Barga, Roger S, Zaniolo, Carlo, Lee, Kisung, and Ye, Yanfang Fanny
- Subjects
Standardization, Computer science, HDF5, Hierarchical Data Format, Hierarchical database model, Data modeling, Metadata, Data standards, Data formats, Computer data storage, Use case, Neurophysiology, Software engineering - Abstract
A ubiquitous problem in aggregating data across different experimental and observational data sources is a lack of software infrastructure that enables flexible and extensible standardization of data and metadata. To address this challenge, we developed HDMF, a hierarchical data modeling framework for modern science data standards. With HDMF, we separate the process of data standardization into three main components: (1) data modeling and specification, (2) data I/O and storage, and (3) data interaction and data APIs. To enable standards to support the complex requirements and varying use cases throughout the data life cycle, HDMF provides object mapping infrastructure to insulate and integrate these various components. This approach supports the flexible development of data standards and extensions, optimized storage backends, and data APIs, while allowing the other components of the data standards ecosystem to remain stable. To meet the demands of modern, large-scale science data, HDMF provides advanced data I/O functionality for iterative data write, lazy data load, and parallel I/O. It also supports optimization of data storage via support for chunking, compression, linking, and modular data storage. We demonstrate the application of HDMF in practice to design NWB 2.0 [13], a modern data standard for collaborative science across the neurophysiology community.
- Published
- 2019
- Full Text
- View/download PDF
17. Towards analyzing large graphs with quantum annealing
- Author
-
Ville Kotovirta, Hannu Reittu, Lasse Leskelä, Tomi Raty, Hannu Rummukainen, Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua Tony, Ak, Ronay, Tian, Yuanyuan, Barga, Roger, Zaniolo, Carlo, Lee, Kisung, and Ye, Yanfang Fanny
- Subjects
Lemma (mathematics), Theoretical computer science, Computer science, Quantum annealing, Graph theory, Quantum computing, Graph community detection, Quantum computer - Abstract
We demonstrate the use of quantum computing in graph community detection and in regularity checking related to Szemerédi's Regularity Lemma (SRL), using D-Wave Systems' quantum annealer and simulations. The results show the capability of quantum computing to solve hard problems relevant to big data. A new community detection algorithm based on SRL is also introduced and tested.
- Published
- 2019
- Full Text
- View/download PDF
18. An Identity Privacy Preserving IoT Data Protection Scheme for Cloud Based Analytics
- Author
-
Christian Gehrmann, Martin Gunnarsson, Baru, Chaitanya, Huan, Jun, Khan, Latifur, Hu, Xiaohua Tony, Ak, Ronay, Tian, Yuanyuan, Barga, Roger, Zaniolo, Carlo, Lee, Kisung, and Ye, Yanfang Fanny
- Subjects
Computer science, IoT security, Cloud computing, Encryption, Computer security, Identity privacy, Symmetric-key algorithm, Analytics, Identity (object-oriented programming), Key (cryptography), Data protection, Security level, Information systems - Abstract
Efficient protection of the huge amounts of IoT-produced data is key for wide-scale data analytics services. The most efficient approach is pure symmetric encryption, as it allows both fast decryption on the analytics-engine side and energy-efficient encryption on the IoT side. However, symmetric encryption can only be performed if there is a way to directly map an encrypted object to the correct key. Typically, such a mapping requires a unique IoT identity, which constitutes a privacy problem. In this paper, we present an IoT identity protection scheme for symmetric IoT data encryption. We give basic security definitions for this problem setting, present a new construction, and prove the security level achieved by the construction. Performance figures for a proof-of-concept implementation are also given. The new scheme offers a fair trade-off between identity privacy and complexity.
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library