Author: "Zheng, Zeyu" / Database: OpenAIRE - Searchworks@Jio Institute Digital Library Search Results

1. Adaptive Pairwise Weights for Temporal Credit Assignment

Author: Zheng, Zeyu, Vuorio, Risto, Lewis, Richard, and Singh, Satinder
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, General Medicine, Machine Learning (cs.LG)
Abstract: How much credit (or blame) should an action taken in a state get for a future reward? This is the fundamental temporal credit assignment problem in Reinforcement Learning (RL). One of the earliest and still most widely used heuristics is to assign this credit based on a scalar coefficient, $\lambda$ (treated as a hyperparameter), raised to the power of the time interval between the state-action and the reward. In this empirical paper, we explore heuristics based on more general pairwise weightings that are functions of the state in which the action was taken, the state at the time of the reward, as well as the time interval between the two. Of course it isn't clear what these pairwise weight functions should be, and because they are too complex to be treated as hyperparameters we develop a metagradient procedure for learning these weight functions during the usual RL training of a policy. Our empirical work shows that it is often possible to learn these pairwise weight functions during learning of the policy to achieve better performance than competing approaches. Comment: AAAI 2022. The first two authors contributed equally.
Published: 2022

2. Best Arm Identification with Fairness Constraints on Subpopulations

Author: Wu, Yuhang, Zheng, Zeyu, and Zhu, Tingyu
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computers and Society, Statistics - Machine Learning, Computers and Society (cs.CY), Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: We formulate, analyze and solve the problem of best arm identification with fairness constraints on subpopulations (BAICS). Standard best arm identification problems aim at selecting an arm that has the largest expected reward where the expectation is taken over the entire population. The BAICS problem requires that a selected arm must be fair to all subpopulations (e.g., different ethnic groups, age groups, or customer types) by satisfying constraints that the expected reward conditional on every subpopulation needs to be larger than some thresholds. The BAICS problem aims at correctly identifying, with high confidence, the arm with the largest expected reward from all arms that satisfy subpopulation constraints. We analyze the complexity of the BAICS problem by proving a best achievable lower bound on the sample complexity with closed-form representation. We then design an algorithm and prove that the algorithm's sample complexity matches the lower bound in terms of order. A brief account of numerical experiments is conducted to illustrate the theoretical findings.
Published: 2023

3. Understanding plasticity in neural networks

Author: Lyle, Clare, Zheng, Zeyu, Nikishin, Evgenii, Pires, Bernardo Avila, Pascanu, Razvan, and Dabney, Will
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
Abstract: Plasticity, the ability of a neural network to quickly change its predictions in response to new information, is essential for the adaptability and robustness of deep reinforcement learning systems. Deep neural networks are known to lose plasticity over the course of training even in relatively simple learning problems, but the mechanisms driving this phenomenon are still poorly understood. This paper conducts a systematic empirical analysis into plasticity loss, with the goal of understanding the phenomenon mechanistically in order to guide the future development of targeted solutions. We find that loss of plasticity is deeply connected to changes in the curvature of the loss landscape, but that it typically occurs in the absence of saturated units or divergent gradient norms. Based on this insight, we identify a number of parameterization and optimization design choices which enable networks to better preserve plasticity over the course of training. We validate the utility of these findings in larger-scale learning problems by applying the best-performing intervention, layer normalization, to a deep RL agent trained on the Arcade Learning Environment. Accepted to ICML 2023.
Published: 2023

4. Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations

Author: Vadori, Nelson, Ardon, Leo, Ganesh, Sumitra, Spooner, Thomas, Amrouni, Selim, Vann, Jared, Xu, Mengda, Zheng, Zeyu, Balch, Tucker, and Veloso, Manuela
Subjects: FOS: Computer and information sciences, FOS: Economics and business, Artificial Intelligence (cs.AI), Quantitative Finance - Computational Finance, Computer Science - Artificial Intelligence, Computer Science - Computer Science and Game Theory, Computational Finance (q-fin.CP), Computer Science - Multiagent Systems, Multiagent Systems (cs.MA), Computer Science and Game Theory (cs.GT)
Abstract: We study a game between liquidity provider and liquidity taker agents interacting in an over-the-counter market, for which the typical example is foreign exchange. We show how a suitable design of parameterized families of reward functions coupled with associated shared policy learning constitutes an efficient solution to this problem. Precisely, we show that our deep-reinforcement-learning-driven agents learn emergent behaviors relative to a wide spectrum of incentives encompassing profit-and-loss, optimal execution and market share, by playing against each other. In particular, we find that liquidity providers naturally learn to balance hedging and skewing as a function of their incentives, where the latter refers to setting their buy and sell prices asymmetrically as a function of their inventory. We further introduce a novel RL-based calibration algorithm which we found performed well at imposing constraints on the game equilibrium, both on toy and real market data.
Published: 2022

5. Adaptive A/B Tests and Simultaneous Treatment Parameter Optimization

Author: Wu, Yuhang, Zheng, Zeyu, Zhang, Guangyu, Zhang, Zuohua, and Wang, Chu
Subjects: Methodology (stat.ME), FOS: Computer and information sciences, Statistics - Methodology
Abstract: Constructing asymptotically valid confidence intervals through a valid central limit theorem is crucial for A/B tests, where a classical goal is to statistically assert whether a treatment plan is significantly better than a control plan. In some emerging applications for online platforms, the treatment plan is not a single plan, but instead encompasses an infinite continuum of plans indexed by a continuous treatment parameter. As such, the experimenter not only needs to provide valid statistical inference, but also desires to effectively and adaptively find the optimal choice of value for the treatment parameter to use for the treatment plan. However, we find that classical optimization algorithms, despite their fast convergence rates under convexity assumptions, do not come with a central limit theorem that can be used to construct asymptotically valid confidence intervals. We fix this issue by providing a new optimization algorithm that on one hand maintains the same fast convergence rate and on the other hand permits the establishment of a valid central limit theorem. We discuss practical implementations of the proposed algorithm and conduct numerical experiments to illustrate the theoretical findings.
Published: 2022

6. Extremal planar graphs with no cycles of particular lengths

Author: Győri, Ervin, Wang, Xianzhi, and Zheng, Zeyu
Subjects: FOS: Mathematics, Mathematics - Combinatorics, Combinatorics (math.CO)
Abstract: In this paper we estimate the planar Turán number $\mathrm{ex}_\mathcal{P}(n,H)$ of some graphs $H$, i.e., the maximum number of edges in a planar graph $G$ of $n$ vertices not containing $H$ as a subgraph. We give a new, short proof when $H=C_5$, and study the cases when $G$ is bipartite or triangle-free and $H$ is a short even cycle. The proofs are mostly new applications or variants of the "contribution method" introduced by Ghosh, Győri, Martin, Paulos and Xiao in arXiv:2004.14094.
Published: 2022

7. Islet β‐cells physiological difference study of old and young mice based on single‐cell transcriptomics

Author: Zheng Zeyu, Gang Chen, Qiufeng Zhan, Ayun Chen, and Zhen Yu
Subjects: 0301 basic medicine, Senescence, Aging, ScRNA‐seq, Endocrinology, Diabetes and Metabolism, 030209 endocrinology & metabolism, Diseases of the endocrine glands. Clinical endocrinology, Transcriptome, 03 medical and health sciences, 0302 clinical medicine, Downregulation and upregulation, Internal medicine, Insulin-Secreting Cells, Internal Medicine, Animals, Gene Regulatory Networks, Gene, Transcription factor, Sequence Analysis, RNA, General Medicine, Original Articles, RC648-665, Islet, Phenotype, Fold change, β‐Cells, Mice, Inbred C57BL, Pancreas aging, 030104 developmental biology, Endocrinology, Original Article, Female, Single-Cell Analysis
Abstract: Aims/Introduction: Body aging is a universal biological process. With aging, cells undergo a series of physiological changes. The main feature is a decline in cell proliferation, although the cells still function normally. Pancreatic β‐cells are no exception. However, the physiological senescence of β‐cells, and the resulting function and transcriptome changes, have rarely attracted attention. The specific senescence phenotype of β‐cells remains unknown. Materials and Methods: Pancreatic samples from three female C57BL/6 mice aged 2.5 months (young) and 20 months (old) were digested to a single‐cell suspension and analyzed with 10× Genomics single‐cell ribonucleic acid sequencing. β‐cells were determined by biosynthesis analysis, and differences between old and young mice were identified. Results: A total of 47 differential genes with statistical significance were screened in β‐cells (fold change >1.5, P …). Transcription factors play an important role in facilitating transitions between pancreatic cells. Comprehensive datasets of aging‐related genes, transcription factors and pathways in almost all pancreatic cell types were provided. With further research, this study could inform new therapies that attenuate aging and alleviate aging‐related diseases.
Published: 2021

8. Inference on the Best Policies with Many Covariates

Author: Wei, Waverly, Zhou, Yuqing, Zheng, Zeyu, and Wang, Jingshen
Subjects: Methodology (stat.ME), FOS: Computer and information sciences, Statistics - Methodology
Abstract: Understanding the impact of the most effective policies or treatments on a response variable of interest is desirable in many empirical works in economics, statistics and other disciplines. Due to the widespread winner's curse phenomenon, conventional statistical inference assuming that the top policies are chosen independent of the random sample may lead to overly optimistic evaluations of the best policies. In recent years, given the increased availability of large datasets, such an issue can be further complicated when researchers include many covariates to estimate the policy or treatment effects in an attempt to control for potential confounders. In this manuscript, to simultaneously address the above-mentioned issues, we propose a resampling-based procedure that not only lifts the winner's curse in evaluating the best policies observed in a random sample, but also is robust to the presence of many covariates. The proposed inference procedure yields accurate point estimates and valid frequentist confidence intervals that achieve the exact nominal level as the sample size goes to infinity for multiple best policy effect sizes. We illustrate the finite-sample performance of our approach through Monte Carlo experiments and two empirical studies, evaluating the most effective policies in charitable giving and the most beneficial group of workers in the National Supported Work program. Accepted by The Journal of Econometrics.
Published: 2022

9. Common kings of a chain of cycles in a strong tournament

Author: Post, Logan and Zheng, Zeyu
Subjects: Computer Science::Discrete Mathematics, FOS: Mathematics, Mathematics - Combinatorics, Combinatorics (math.CO), Physics::History of Physics, Computer Science::Databases
Abstract: It is known that every strong tournament has directed cycles of any length, and thereby strong subtournaments of any size. In this note, we prove that they also can share a common vertex which is a king of all of them. This common vertex can be any king in the whole tournament. Further, the Hamiltonian cycles in them can be recursively constructed by inserting an additional vertex to one directed edge.
Published: 2022

10. Exosomal lncRNA HOTAIR induce macrophages to M2 polarization via PI3K/ p-AKT /AKT pathway and promote EMT and metastasis in laryngeal squamous cell carcinoma

Author: Wang, Jingting, Wang, Nan, Zheng, Zeyu, Che, Yanlu, Suzuki, Masanobu, Kano, Satoshi, Lu, Jianguang, Wang, Peng, Sun, Yanan, and Homma, Akihiro
Subjects: Cancer Research, Epithelial-Mesenchymal Transition, Macrophage, Squamous Cell Carcinoma of Head and Neck, LSCC, Macrophages, EMT, lncRNA HOTAIR, Exosomes, Gene Expression Regulation, Neoplastic, Phosphatidylinositol 3-Kinases, Oncology, Genetics, Tumor Microenvironment, Humans, RNA, Long Noncoding, Laryngeal Neoplasms, Proto-Oncogene Proteins c-akt, Cell Proliferation
Abstract: Exosomes are a new means of communication between tumor cells and macrophages in the microenvironment. Macrophages can be induced to different phenotypes depending on the tumor. In the present study, long-chain noncoding RNA HOTAIR (lncRNA HOTAIR) was highly expressed in LSCC and exosomes. The pathway by which exosomal lncRNA HOTAIR induces macrophages to M2 polarization in LSCC was investigated. Carcinoma tissues and adjacent tissues were collected from 104 LSCC cases, and a positive relationship between CD163-/CD206-M2 macrophage infiltration and clinical phase, lymph node spreading and pathological phase in LSCC was observed. To examine the role of exosomal lncRNA HOTAIR, macrophages were co-cultured with LSCC-exosomes of high lncRNA HOTAIR expression or transfected with HOTAIR mimics. The results suggested that exosomal lncRNA HOTAIR can induce macrophages to M2 polarization via the PI3K/p-AKT/AKT signaling pathway. Furthermore, exo-treated M2 macrophages facilitate the migration, proliferation, and EMT of LSCC.
Published: 2022

11. GrASP: Gradient-Based Affordance Selection for Planning

Author: Veeriah, Vivek, Zheng, Zeyu, Lewis, Richard, and Singh, Satinder
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: Planning with a learned model is arguably a key component of intelligence. There are several challenges in realizing such a component in large-scale reinforcement learning (RL) problems. One such challenge is dealing effectively with continuous action spaces when using tree-search planning (e.g., it is not feasible to consider every action even at just the root node of the tree). In this paper we present a method for selecting affordances useful for planning -- for learning which small number of actions/options from a continuous space of actions/options to consider in the tree-expansion process during planning. We consider affordances that are goal-and-state-conditional mappings to actions/options as well as unconditional affordances that simply select actions/options available in all states. Our selection method is gradient based: we compute gradients through the planning procedure to update the parameters of the function that represents affordances. Our empirical work shows that it is feasible to learn to select both primitive-action and option affordances, and that simultaneously learning to select affordances and planning with a learned value-equivalent model can outperform model-free RL.
Published: 2022

12. A Simple and Optimal Policy Design with Safety against Heavy-tailed Risk for Stochastic Bandits

Author: Simchi-Levi, David, Zheng, Zeyu, and Zhu, Feng
Subjects: Computer Science::Machine Learning, FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, FOS: Mathematics, Mathematics - Statistics Theory, Machine Learning (stat.ML), Statistics Theory (math.ST), Machine Learning (cs.LG)
Abstract: We study the stochastic multi-armed bandit problem and design new policies that enjoy both worst-case optimality for expected regret and light-tailed risk for regret distribution. Starting from the two-armed bandit setting with time horizon $T$, we propose a simple policy and prove that the policy (i) enjoys the worst-case optimality for the expected regret at order $O(\sqrt{T\ln T})$ and (ii) has the worst-case tail probability of incurring a linear regret decay at an exponential rate $\exp(-\Omega(\sqrt{T}))$, a rate that we prove to be best achievable for all worst-case optimal policies. Briefly, our proposed policy achieves a delicate balance between doing more exploration at the beginning of the time horizon and doing more exploitation when approaching the end, compared to the standard Successive Elimination policy and Upper Confidence Bound policy. We then improve the policy design and analysis to work for the general $K$-armed bandit setting. Specifically, the worst-case probability of incurring a regret larger than any $x>0$ is upper bounded by $\exp(-\Omega(x/\sqrt{KT}))$. We then enhance the policy design to accommodate the "any-time" setting where $T$ is not known a priori, and prove equivalently desired policy performances as compared to the "fixed-time" setting with known $T$. A brief account of numerical experiments is conducted to illustrate the theoretical findings. We conclude by extending our proposed policy design to the general stochastic linear bandit setting and proving that the policy leads to both worst-case optimality in terms of expected regret order and light-tailed risk on the regret distribution. Comment: Preliminary version appeared in NeurIPS 2022.
Published: 2022

13. Selecting the Best Optimizing System

Author: Si, Nian and Zheng, Zeyu
Subjects: Methodology (stat.ME), FOS: Computer and information sciences, Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Mathematics - Statistics Theory, Machine Learning (stat.ML), Statistics Theory (math.ST), Mathematics - Optimization and Control, Statistics - Methodology
Abstract: We formulate selecting the best optimizing system (SBOS) problems and provide solutions for those problems. In an SBOS problem, a finite number of systems are contenders. Inside each system, a continuous decision variable affects the system's expected performance. An SBOS problem compares different systems based on their expected performances under their own optimally chosen decision to select the best, without advance knowledge of the expected performances of the systems or the optimizing decision inside each system. We design easy-to-implement algorithms that adaptively choose a system and a choice of decision to evaluate the noisy system performance, sequentially eliminate inferior systems, and eventually recommend a system as the best after spending a user-specified budget. The proposed algorithms integrate the stochastic gradient descent method and the sequential elimination method to simultaneously exploit the structure inside each system and make comparisons across systems. For the proposed algorithms, we prove exponential rates of convergence to zero for the probability of false selection, as the budget grows to infinity. We conduct three numerical examples that represent three practical cases of SBOS problems. Our proposed algorithms demonstrate consistent and stronger performances in terms of the probability of false selection over benchmark algorithms under a range of problem settings and sampling budgets. Comment: Code at https://github.com/nian-si/SelectOptSys
Published: 2022

14. Additional file 1 of Exosomal lncRNA HOTAIR induce macrophages to M2 polarization via PI3K/ p-AKT /AKT pathway and promote EMT and metastasis in laryngeal squamous cell carcinoma

Author: Wang, Jingting, Wang, Nan, Zheng, Zeyu, Che, Yanlu, Suzuki, Masanobu, Kano, Satoshi, Lu, Jianguang, Wang, Peng, Sun, Yanan, and Homma, Akihiro
Abstract: Additional file 1.
Published: 2022

15. A Short Proof of a Convex Representation for Stationary Distributions of Markov Chains with an Application to State Space Truncation

Author: Zheng, Zeyu, Infanger, Alex, and Glynn, Peter W.
Subjects: Probability (math.PR), FOS: Mathematics, Mathematics - Probability
Abstract: In an influential paper, Courtois and Semal (1984) establish that when $G$ is an irreducible substochastic matrix for which $\sum_{n=0}^{\infty}G^n
Published: 2022

16. Advances in Deep Reinforcement Learning: Intrinsic Rewards, Temporal Credit Assignment, State Representations, and Value-equivalent Models

Author: Zheng, Zeyu
Subjects: reinforcement learning, machine learning, Engineering, Computer Science, deep learning, artificial intelligence
Abstract: Reinforcement learning (RL) is a machine learning paradigm concerned with how an agent learns to predict and control its own experience stream so as to maximize long-term cumulative reward. In the past decade, deep reinforcement learning (DeepRL), a subfield that aims to combine the sequential decision-making techniques in RL with the powerful non-linear function approximation tools offered by deep learning, has seen great success such as defeating human champions in the ancient board game Go and achieving expert-level performance in complex strategy games like Dota 2 and Starcraft. It has also had an impact on real-world applications. Examples include robot control, stratospheric balloon navigation, and controlling nuclear fusion plasma. This thesis aims to further advance DeepRL techniques. Concretely, this thesis makes contributions in the following four directions: 1) In reward design, we develop a novel meta-learning algorithm for learning reward functions that facilitate policy optimization. Our algorithm improves the performance of policy-gradient methods and outperforms handcrafted heuristic reward functions. In a follow-up study, we show that the learned reward functions can capture knowledge about long-term exploration and exploitation and can generalize to different RL algorithms and changes in the environment dynamics. 2) In temporal credit assignment, we explore methods based on pairwise weights that are functions of the state in which the action was taken, the state in which the reward was received, and the time elapsed in between. We develop a metagradient algorithm for adapting these weights during policy learning. Our experiments show that our method achieves better performance than competing approaches. 3) In state representation learning, we investigate using random deep action-conditional prediction tasks as auxiliary tasks to help agents learn better state representations. Our experiments show that random deep action-conditional predictions can often yield better performance than handcrafted auxiliary tasks. 4) In model learning and planning, we develop a new method for learning value-equivalent models, a class of models that has recently demonstrated strong empirical performance, that generalizes existing methods. Our experiments show that our method can improve both the model prediction accuracy and the control performance of the downstream planning procedure.
Published: 2022

17. Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization

Author: Lin, Tianyi, Zheng, Zeyu, and Jordan, Michael I.
Subjects: FOS: Computer and information sciences, Computer Science - Computational Complexity, Computer Science - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Computational Complexity (cs.CC), Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: Nonsmooth nonconvex optimization problems broadly emerge in machine learning and business decision making, but two core challenges impede the development of efficient solution methods with finite-time convergence guarantees: the lack of a computationally tractable optimality criterion and the lack of computationally powerful oracles. The contributions of this paper are two-fold. First, we establish the relationship between the celebrated Goldstein subdifferential (Goldstein, 1977) and uniform smoothing, thereby providing the basis and intuition for the design of gradient-free methods that guarantee finite-time convergence to a set of Goldstein stationary points. Second, we propose the gradient-free method (GFM) and stochastic GFM for solving a class of nonsmooth nonconvex optimization problems and prove that both of them can return a $(\delta,\epsilon)$-Goldstein stationary point of a Lipschitz function $f$ at an expected convergence rate of $O(d^{3/2}\delta^{-1}\epsilon^{-4})$ where $d$ is the problem dimension. Two-phase versions of GFM and SGFM are also proposed and proven to achieve improved large-deviation results. Finally, we demonstrate the effectiveness of 2-SGFM on training ReLU neural networks with the MNIST dataset. Comment: Accepted by NeurIPS 2022; 32 pages, 18 figures; Fix a confusing part in the proof of Theorem 3.1: we use Bertsekas [1973, Proposition 2.3] rather than Bertsekas [1973, Proposition 2.4] here and do not assume the convexity of the function f
Published: 2022

18. Note on the Turán number of the $3$-linear hypergraph $C_{13}$

Author: Tang, Chaoliang, Wu, Hehui, Zhang, Shengtong, and Zheng, Zeyu
Subjects: Mathematics - Combinatorics
Abstract: Let the crown $C_{13}$ be the linear $3$-graph on $9$ vertices $\{a,b,c,d,e,f,g,h,i\}$ with edges $$E = \{\{a,b,c\}, \{a, d,e\}, \{b, f, g\}, \{c, h,i\}\}.$$ Proving a conjecture of Gyárfás et al., we show that for any crown-free linear $3$-graph $G$ on $n$ vertices, its number of edges satisfies $$\lvert E(G) \rvert \leq \frac{3(n - s)}{2}$$ where $s$ is the number of vertices in $G$ with degree at least $6$. This result, combined with previous work, essentially completes the determination of the linear Turán number for linear $3$-graphs with at most $4$ edges. Comment: 5 pages, 1 figure. Correct Typos, and add acknowledgement to Professor Gyarfas
Published: 2021

19. Learning State Representations from Random Deep Action-conditional Predictions

Author: Zheng, Zeyu, Veeriah, Vivek, Vuorio, Risto, Lewis, Richard, and Singh, Satinder
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: Our main contribution in this work is an empirical finding that random General Value Functions (GVFs), i.e., deep action-conditional predictions -- random both in what feature of observations they predict as well as in the sequence of actions the predictions are conditioned upon -- form good auxiliary tasks for reinforcement learning (RL) problems. In particular, we show that random deep action-conditional predictions when used as auxiliary tasks yield state representations that produce control performance competitive with state-of-the-art hand-crafted auxiliary tasks like value prediction, pixel control, and CURL in both Atari and DeepMind Lab tasks. In another set of experiments we stop the gradients from the RL part of the network to the state representation learning part of the network and show, perhaps surprisingly, that the auxiliary tasks alone are sufficient to learn state representations good enough to outperform an end-to-end trained actor-critic baseline. We open-sourced our code at https://github.com/Hwhitetooth/random_gvfs. NeurIPS 2021.
Published: 2021

20. Note on the Turán number of the $3$-linear hypergraph $C_{13}$

Author: Tang, Chaoliang, Wu, Hehui, Zhang, Shengtong, and Zheng, Zeyu
Subjects: FOS: Mathematics, Combinatorics (math.CO)
Abstract: Let the crown $C_{13}$ be the linear $3$-graph on $9$ vertices $\{a,b,c,d,e,f,g,h,i\}$ with edges $$E = \{\{a,b,c\}, \{a, d,e\}, \{b, f, g\}, \{c, h,i\}\}.$$ Proving a conjecture of Gyárfás et al., we show that for any crown-free linear $3$-graph $G$ on $n$ vertices, its number of edges satisfies $$\lvert E(G) \rvert \leq \frac{3(n - s)}{2}$$ where $s$ is the number of vertices in $G$ with degree at least $6$. This result, combined with previous work, essentially completes the determination of the linear Turán number for linear $3$-graphs with at most $4$ edges. 5 pages, 1 figure. Correct Typos, and add acknowledgement to Professor Gyarfas
Published: 2021

21. Offline Planning and Online Learning under Recovering Rewards

Author: Simchi-Levi, David, Zheng, Zeyu, and Zhu, Feng
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Discrete Mathematics (cs.DM), Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Computer Science - Discrete Mathematics, Machine Learning (cs.LG)
Abstract: Motivated by emerging applications such as live-streaming e-commerce, promotions and recommendations, we introduce and solve a general class of non-stationary multi-armed bandit problems that have the following two features: (i) the decision maker can pull and collect rewards from up to $K\,(\ge 1)$ out of $N$ different arms in each time period; (ii) the expected reward of an arm immediately drops after it is pulled, and then non-parametrically recovers as the arm's idle time increases. With the objective of maximizing the expected cumulative reward over $T$ time periods, we design a class of "Purely Periodic Policies" that jointly set a period to pull each arm. For the proposed policies, we prove performance guarantees for both the offline and the online problems. For the offline problem when all model parameters are known, the proposed periodic policy obtains an approximation ratio that is at the order of $1-\mathcal O(1/\sqrt{K})$, which is asymptotically optimal when $K$ grows to infinity. For the online problem when the model parameters are unknown and need to be dynamically learned, we integrate the offline periodic policy with the upper confidence bound procedure to construct an online policy. The proposed online policy is proved to approximately have $\widetilde{\mathcal O}(N\sqrt{T})$ regret against the offline benchmark. Our framework and policy design may shed light on broader offline planning and online learning applications with non-stationary and recovering rewards. Comment: v1 accepted by ICML 2021
Published: 2021

22. Continuous Conditional Generative Adversarial Networks (cGAN) with Generator Regularization

Author: Zheng, Yufeng, Zhang, Yunkai, and Zheng, Zeyu
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG)
Abstract: Conditional Generative Adversarial Networks are known to be difficult to train, especially when the conditions are continuous and high-dimensional. To partially alleviate this difficulty, we propose a simple generator regularization term on the GAN generator loss in the form of a Lipschitz penalty. Thus, when the generator is fed with neighboring conditions in the continuous space, the regularization term will leverage the neighbor information and push the generator to generate samples that have similar conditional distributions for each neighboring condition. We analyze the effect of the proposed regularization term and demonstrate its robust performance on a range of synthetic and real-world tasks.
Published: 2021
Full Text: View/download PDF

23. On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification

Author: Lin, Tianyi, Zheng, Zeyu, Chen, Elynn Y., Cuturi, Marco, and Jordan, Michael I.
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Statistics Theory, Statistics Theory (math.ST), Machine Learning (cs.LG)
Abstract: Optimal transport (OT) distances are increasingly used as loss functions for statistical inference, notably in the learning of generative models or supervised learning. Yet, the behavior of minimum Wasserstein estimators is poorly understood, notably in high-dimensional regimes or under model misspecification. In this work we adopt the viewpoint of projection robust (PR) OT, which seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected. Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances, complementing and improving previous literature that has been restricted to one-dimensional and well-specified cases. Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces. Our complexity bounds can help explain why both PRW and IPRW distances outperform Wasserstein distances empirically in high-dimensional inference tasks. Finally, we consider parametric inference using the PRW distance. We provide an asymptotic guarantee of two types of minimum PRW estimators and formulate a central limit theorem for max-sliced Wasserstein estimator under model misspecification. To enable our analysis on PRW with projection dimension larger than one, we devise a novel combination of variational analysis and statistical theory., Accepted by AISTATS 2021; Fix some inaccuracy in the definition and proof; 49 Pages, 41 figures
Published: 2020

24. A Doubly Stochastic Simulator with Applications in Arrivals Modeling and Simulation

Author: Zheng, Yufeng, Zheng, Zeyu, and Zhu, Tingyu
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: We propose a framework that integrates classical Monte Carlo simulators and Wasserstein generative adversarial networks to model, estimate, and simulate a broad class of arrival processes with general non-stationary and multi-dimensional random arrival rates. Classical Monte Carlo simulators have advantages at capturing the interpretable "physics" of a stochastic object, whereas neural-network-based simulators have advantages at capturing less-interpretable complicated dependence within a high-dimensional distribution. We propose a doubly stochastic simulator that integrates a stochastic generative neural network and a classical Monte Carlo Poisson simulator, to utilize both advantages. Such integration brings challenges to both theoretical reliability and computational tractability for the estimation of the simulator given real data, where the estimation is done through minimizing the Wasserstein distance between the distribution of the simulation output and the distribution of real data. Regarding theoretical properties, we prove consistency and convergence rate for the estimated simulator under a non-parametric smoothness assumption. Regarding computational efficiency and tractability for the estimation procedure, we address a challenge in gradient evaluation that arises from the discontinuity in the Monte Carlo Poisson simulator. Numerical experiments with synthetic and real data sets are implemented to illustrate the performance of the proposed framework., Comment: We appreciate a lot the comments and suggestions from anonymous reviewers and editors. This is an updated version, with the title changed from "Doubly Stochastic Generative Arrivals Modeling" to "A Doubly Stochastic Simulator with Applications in Arrivals Modeling and Simulation"
Published: 2020
Full Text: View/download PDF

25. Stochastic Localization Methods for Convex Discrete Optimization via Simulation

Author: Zhang, Haixiang, Zheng, Zeyu, and Lavaei, Javad
Subjects: Optimization and Control (math.OC), FOS: Mathematics, Mathematics - Optimization and Control
Abstract: We develop and analyze a set of new sequential simulation-optimization algorithms for large-scale multi-dimensional discrete optimization via simulation problems with a convexity structure. The "large-scale" notion means that the decision variable has a large number of values to choose from on each dimension. The proposed algorithms are targeted to identify a solution that is close to the optimal solution given any precision level with any given probability. To achieve this target, utilizing the convexity structure, our algorithm design does not need to scan all the choices of the decision variable, but instead sequentially draws a subset of choices of the decision variable and uses them to "localize" potentially near-optimal solutions to an adaptively shrinking region. To show the power of the localization operation, we first consider one-dimensional large-scale problems. We propose the shrinking uniform sampling algorithm, which is proved to achieve the target with an optimal expected simulation cost under an asymptotic criterion. For multi-dimensional problems, we combine the idea of localization with subgradient information and propose a framework to design stochastic cutting-plane methods and the dimension reduction algorithm, whose expected simulation costs have a low dependence on the scale and the dimension of the problems. The proposed algorithms do not require prior information about the Lipschitz constant of the objective function and the simulation costs are upper bounded by a value that is independent of the Lipschitz constant. Finally, we propose an adaptive algorithm to deal with the unknown noise variance case under the assumption that the randomness of the system is Gaussian. We implement the proposed algorithms on both synthetic and queueing simulation optimization problems, and demonstrate better performances compared to benchmark methods.
Published: 2020
Full Text: View/download PDF

26. What Can Learned Intrinsic Rewards Capture?

Author: Zheng, Zeyu, Oh, Junhyuk, Hessel, Matteo, Xu, Zhongwen, Kroiss, Manuel, van Hasselt, Hado, Silver, David, and Singh, Satinder
Subjects: FOS: Computer and information sciences, Computer Science::Machine Learning, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: The objective of a reinforcement learning agent is to behave so as to maximise the sum of a suitable scalar function of state: the reward. These rewards are typically given and immutable. In this paper, we instead consider the proposition that the reward function itself can be a good locus of learned knowledge. To investigate this, we propose a scalable meta-gradient framework for learning useful intrinsic reward functions across multiple lifetimes of experience. Through several proof-of-concept experiments, we show that it is feasible to learn and capture knowledge about long-term exploration and exploitation into a reward function. Furthermore, we show that unlike policy transfer methods that capture "how" the agent should behave, the learned reward functions can generalise to other kinds of agents and to changes in the dynamics of the environment by capturing "what" the agent should strive to do., ICML 2020. The first two authors contributed equally
Published: 2019

27. Ques-Chain: An Ethereum Based E-Voting System

Author: Zhang Sicheng, Zhang Qixuan, Jing Haotian, Zheng Zeyu, and Bowen Xu
Subjects: Authentication, Smart contract, Computer science, Electronic voting, media_common.quotation_subject, ComputerApplications_COMPUTERSINOTHERSYSTEMS, Computer security, computer.software_genre, Voting, Blind signature, Protocol (object-oriented programming), computer, Anonymity, media_common, Block (data storage)
Abstract: Ethereum is an open-source, public, blockchain-based distributed computing platform and operating system featuring smart contract functionality. In this paper, we proposed an Ethereum based electronic voting (e-voting) protocol, Ques-Chain, which can ensure the authentication can be done without hurting confidentiality and the anonymity can be protected without problems of scams at the same time. Furthermore, the authors considered the wider usages Ques-Chain can be applied on, pointing out that it is able to process all kinds of messages and can be used in all fields with similar needs.
Published: 2019

28. Ques-Chain: an Ethereum Based E-Voting System

Author: Zhang, Qixuan, Xu, Bowen, Jing, Haotian, and Zheng, Zeyu
Subjects: FOS: Computer and information sciences, Computer Science - Cryptography and Security, ComputerApplications_COMPUTERSINOTHERSYSTEMS, Cryptography and Security (cs.CR)
Abstract: Ethereum is an open-source, public, blockchain-based distributed computing platform and operating system featuring smart contract functionality. In this paper, we proposed an Ethereum based electronic voting (e-voting) protocol, Ques-Chain, which can ensure the authentication can be done without hurting confidentiality and the anonymity can be protected without problems of scams at the same time. Furthermore, the authors considered the wider usages Ques-Chain can be applied on, pointing out that it is able to process all kinds of messages and can be used in all fields with similar needs.
Published: 2019
Full Text: View/download PDF

29. The Study of Stress Wave Tomography Algorithm for Internal Defects in RL Plane of Wood

Author: Xiaochen Du, Zheng Zeyu, Hailin Feng, Mingyue Hu, and Zheng Qian
Subjects: 040101 forestry, 0106 biological sciences, Materials science, Plane (geometry), Acoustics, 04 agricultural and veterinary sciences, 01 natural sciences, Stress wave, Position (vector), 010608 biotechnology, 0401 agriculture, forestry, and fisheries, Point (geometry), Tomography, Interpolation
Abstract: The shape, size and position of the defect on the RL (radial and longitudinal) plane of the wood cannot be obtained by conventional wood cross-sectional imaging. To overcome such limitations, a novel imaging method for the internal defects on the RL plane of the wood was proposed. The propagation velocity of the stress wave is converted into the value of the estimated point on the RL plane of wood and the two-dimensional imaging of the RL plane of wood is realized by using the velocity correction interpolation method. The proposed method and IDW method are quantitatively analyzed using the method of confusion matrix. The imaging results of 5 samples with defects of different shapes show the effectiveness of the proposed method and the defects of RL plane of wood can be reflected.
Published: 2018

30. Research on the Technology of Searching for Fashion Trend Image Based on ResNet50 Model

Author: Chengrui Xu, Li Ge, Zheng Zeyu, Yujing Tian, Xiaogang Liu, and Shang Wenxiang
Subjects: History, Technical support, Information retrieval, Fashion design, Computer science, Order (business), business.industry, business, Clothing, Image based, Computer Science Applications, Education, Image (mathematics)
Abstract: With the further development of technology, providing technical support for the forecast and development of fashion design trends through artificial intelligence has gradually become a reality. In order to enable users to obtain more accurate design trend materials in a shorter time, this paper presents a ‘searching by image’ model of clothing trends based on the ResNet50 model. Through the targeted collection of data and the rational construction of algorithms, the system can retrieve and output more image materials with relevant design features for the existing image materials for the development of design trends as well as the development and design of styles. By comparing the system with Google’s and Taobao’s ‘searching by image’ system, it is concluded that this system has high efficiency, high accuracy and high correlation.
Published: 2020

31. Approximating Systems Fed by Poisson Processes with Rapidly Changing Arrival Rates

Author: Zheng, Zeyu, Honnappa, Harsha, and Glynn, Peter W.
Subjects: Computer Science::Performance, Probability (math.PR), FOS: Mathematics, Mathematics - Probability
Abstract: This paper introduces a new asymptotic regime for simplifying stochastic models having non-stationary effects, such as those that arise in the presence of time-of-day effects. This regime describes an operating environment within which the arrival process to a service system has an arrival intensity that is fluctuating rapidly. We show that such a service system is well approximated by the corresponding model in which the arrival process is Poisson with a constant arrival rate. In addition to the basic weak convergence theorem, we also establish a first order correction for the distribution of the cumulative number of arrivals over $[0,t]$, as well as the number-in-system process for an infinite-server queue fed by an arrival process having a rapidly changing arrival rate. This new asymptotic regime provides a second regime within which non-stationary stochastic models can be reasonably approximated by a process with stationary dynamics, thereby complementing the previously studied setting within which rates vary slowly in time.
Published: 2018

32. Approximating Performance Measures for Slowly Changing Non-stationary Markov Chains

Author: Zheng, Zeyu, Honnappa, Harsha, and Glynn, Peter W.
Subjects: Probability (math.PR), FOS: Mathematics, Mathematics - Probability
Abstract: This paper is concerned with the development of rigorous approximations to various expectations associated with Markov chains and processes having non-stationary transition probabilities. Such non-stationary models arise naturally in contexts in which time-of-day effects or seasonality effects need to be incorporated. Our approximations are valid asymptotically in regimes in which the transition probabilities change slowly over time. Specifically, we develop approximations for the expected infinite horizon discounted reward, the expected reward to the hitting time of a set, the expected reward associated with the state occupied by the chain at time $n$, and the expected cumulative reward over an interval $[0,n]$. In each case, the approximation involves a linear system of equations identical in form to that which one would need to solve to compute the corresponding quantity for a Markov model having stationary transition probabilities. In that sense, the theory provides an approximation no harder to compute than in the traditional stationary context. While most of the theory is developed for finite state Markov chains, we also provide generalizations to continuous state Markov chains, and finite state Markov jump processes in continuous time. In the latter context, one of our approximations coincides with the uniform acceleration asymptotic due to Massey and Whitt (1998)., Comment: 29 pages
Published: 2018
Full Text: View/download PDF

33. Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters

Author: Zhang, Hao, Zheng, Zeyu, Xu, Shizhen, Dai, Wei, Ho, Qirong, Liang, Xiaodan, Hu, Zhiting, Wei, Jinliang, Xie, Pengtao, and Xing, Eric P.
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Computer Science - Distributed, Parallel, and Cluster Computing, Statistics - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (stat.ML), Distributed, Parallel, and Cluster Computing (cs.DC), Machine Learning (cs.LG)
Abstract: Deep learning models can take weeks to train on a single GPU-equipped machine, necessitating scaling out DL training to a GPU-cluster. However, current distributed DL implementations can scale poorly due to substantial parameter synchronization over the network, because the high throughput of GPUs allows more data batches to be processed per unit time than CPUs, leading to more frequent network synchronization. We present Poseidon, an efficient communication architecture for distributed DL on GPUs. Poseidon exploits the layered model structures in DL programs to overlap communication and computation, reducing bursty network communication. Moreover, Poseidon uses a hybrid communication scheme that optimizes the number of bytes required to synchronize each layer, according to layer properties and the number of machines. We show that Poseidon is applicable to different DL frameworks by plugging Poseidon into Caffe and TensorFlow. We show that Poseidon enables Caffe and TensorFlow to achieve 15.5x speed-up on 16 single-GPU machines, even with limited bandwidth (10GbE) and the challenging VGG19-22K network for image classification. Moreover, Poseidon-enabled TensorFlow achieves 31.5x speed-up with 32 single-GPU machines on Inception-V3, a 50% improvement over the open-source TensorFlow (20x speed-up)., To appear in 2017 USENIX Annual Technical Conference
Published: 2017

34. A Cloud Computing Platform for Data Analysis Based on R Cluster

Author: Yang Fu, Dianzheng Fu, Zheng Zeyu, Shuai Li, and Yiming Tong
Subjects: business.industry, Computer science, Distributed computing, 0211 other engineering and technologies, Process (computing), Cloud computing, 02 engineering and technology, Construct (python library), 021001 nanoscience & nanotechnology, Resource (project management), Server, 021105 building & construction, Algorithm design, 0210 nano-technology, business, Cluster analysis, Data virtualization
Abstract: Translation of data analysis algorithms from data analysis language to high-level programming language is hard work when we construct a cloud computing platform for data analysis. It adds implementation difficulty and maintenance cost of constructing the platform. This paper suggests a new method to use the popular data analysis language R directly on constructing a cloud computing platform. By using resource virtualization techniques, computing resource environment is virtualized into R cluster based on given configurations. By allocating different R machines from R cluster to specified data analysis service, we solve the problem that data analysis algorithms implemented in R should be run in single-user mode only and cannot be customized. After verification, this method simplifies the work of translating data analysis algorithms and speeds up the process of constructing a cloud computing platform for data analysis.
Published: 2016

35. Predicting market instability: New dynamics between volume and volatility

Author: Zheng, Zeyu, Qiao, Zhi, Tenenbaum, Joel N., Stanley, H. Eugene, and Li, Baowen
Subjects: FOS: Economics and business, Statistical Finance (q-fin.ST), Quantitative Finance - Statistical Finance
Abstract: Econophysics and econometrics agree that there is a correlation between volume and volatility in a time series. Using empirical data and their distributions, we further investigate this correlation and discover new ways that volatility and volume interact, particularly when the levels of both are high. We find that the distribution of the volume-conditional volatility is well fit by a power-law function with an exponential cutoff. We find that the volume-conditional volatility distribution scales with volume, and we collapse these distributions to a single curve. We exploit the characteristics of the volume-volatility scatter plot to find a strong correlation between logarithmic volume and a quantity we define as local maximum volatility (LMV), which indicates the largest volatility observed in a given range of trading volumes. This finding supports our empirical analysis showing that volume is an excellent predictor of the maximum value of volatility for both same-day and near-future time periods. We also use a joint conditional probability that includes both volatility and volume to demonstrate that invoking both allows us to better predict the largest next-day volatility than invoking either one alone.
Published: 2014

35 results on '"Zheng, Zeyu"'
