Author: "Wu, Jingfeng" / Search Limiters: Available in Library Collection - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wu, Jingfeng"' showing total 206 results

Start Over Author "Wu, Jingfeng" Search Limiters Available in Library Collection

206 results on '"Wu, Jingfeng"'

1. How Does Critical Batch Size Scale in Pre-training?

Author: Zhang, Hanlin, Morwani, Depen, Vyas, Nikhil, Wu, Jingfeng, Zou, Difan, Ghai, Udaya, Foster, Dean, and Kakade, Sham
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Training large-scale models under given resources requires careful design of parallelism strategies. In particular, the efficiency notion of critical batch size, concerning the compromise between time and compute, marks the threshold beyond which greater data parallelism leads to diminishing returns. To operationalize it, we propose a measure of CBS and pre-train a series of auto-regressive language models, ranging from 85 million to 1.2 billion parameters, on the C4 dataset. Through extensive hyper-parameter sweeps and careful control on factors such as batch size, momentum, and learning rate along with its scheduling, we systematically investigate the impact of scale on CBS. Then we fit scaling laws with respect to model and data sizes to decouple their effects. Overall, our results demonstrate that CBS scales primarily with data size rather than model size, a finding we justify theoretically through the analysis of infinite-width limits of neural networks and infinite-dimensional least squares regression. Of independent interest, we highlight the importance of common hyper-parameter choices and strategies for studying large-scale pre-training beyond fixed training durations.
Published: 2024

2. Context-Scaling versus Task-Scaling in In-Context Learning

Author: Abedsoltan, Amirhesam, Radhakrishnan, Adityanarayanan, Wu, Jingfeng, and Belkin, Mikhail
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Transformers exhibit In-Context Learning (ICL), where these models solve new tasks by using examples in the prompt without additional training. In our work, we identify and analyze two key components of ICL: (1) context-scaling, where model performance improves as the number of in-context examples increases and (2) task-scaling, where model performance improves as the number of pre-training tasks increases. While transformers are capable of both context-scaling and task-scaling, we empirically show that standard Multi-Layer Perceptrons (MLPs) with vectorized input are only capable of task-scaling. To understand how transformers are capable of context-scaling, we first propose a significantly simplified transformer architecture without key, query, value weights. We show that it performs ICL comparably to the original GPT-2 model in various statistical learning tasks including linear regression, teacher-student settings. Furthermore, a single block of our simplified transformer can be viewed as data dependent feature map followed by an MLP. This feature map on its own is a powerful predictor that is capable of context-scaling but is not capable of task-scaling. We show empirically that concatenating the output of this feature map with vectorized data as an input to MLPs enables both context-scaling and task-scaling. This finding provides a simple setting to study context and task-scaling for ICL.
Published: 2024

3. UELLM: A Unified and Efficient Approach for LLM Inference Serving

Author: He, Yiyuan, Xu, Minxian, Wu, Jingfeng, Zheng, Wanyi, Ye, Kejiang, and Xu, Chengzhong
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: In the context of Machine Learning as a Service (MLaaS) clouds, the extensive use of Large Language Models (LLMs) often requires efficient management of significant query loads. When providing real-time inference services, several challenges arise. Firstly, increasing the number of GPUs may lead to a decrease in inference speed due to heightened communication overhead, while an inadequate number of GPUs can lead to out-of-memory errors. Secondly, different deployment strategies need to be evaluated to guarantee optimal utilization and minimal inference latency. Lastly, inefficient orchestration of inference queries can easily lead to significant Service Level Objective (SLO) violations. Lastly, inefficient orchestration of inference queries can easily lead to significant Service Level Objective (SLO) violations. To address these challenges, we propose a Unified and Efficient approach for Large Language Model inference serving (UELLM), which consists of three main components: 1) resource profiler, 2) batch scheduler, and 3) LLM deployer. UELLM minimizes resource overhead, reduces inference latency, and lowers SLO violation rates. Compared with state-of-the-art (SOTA) techniques, UELLM reduces the inference latency by 72.3% to 90.3%, enhances GPU utilization by 1.2X to 4.1X, and increases throughput by 1.92X to 4.98X, it can also serve without violating the inference latency SLO., Comment: 15 pages, 5 figures, ICSOC 2024
Published: 2024

4. CloudNativeSim: a toolkit for modeling and simulation of cloud-native applications

Author: Wu, Jingfeng, Xu, Minxian, He, Yiyuan, Ye, Kejiang, and Xu, Chengzhong
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Cloud-native applications are increasingly becoming popular in modern software design. Employing a microservice-based architecture into these applications is a prevalent strategy that enhances system availability and flexibility. However, cloud-native applications also introduce new challenges, such as frequent inter-service communication and the complexity of managing heterogeneous codebases and hardware, resulting in unpredictable complexity and dynamism. Furthermore, as applications scale, only limited research teams or enterprises possess the resources for large-scale deployment and testing, which impedes progress in the cloud-native domain. To address these challenges, we propose CloudNativeSim, a simulator for cloud-native applications with a microservice-based architecture. CloudNativeSim offers several key benefits: (i) comprehensive and dynamic modeling for cloud-native applications, (ii) an extended simulation framework with new policy interfaces for scheduling cloud-native applications, and (iii) support for customized application scenarios and user feedback based on Quality of Service (QoS) metrics. CloudNativeSim can be easily deployed on standard computers to manage a high volume of requests and services. Its performance was validated through a case study, demonstrating higher than 94.5% accuracy in terms of response time. The study further highlights the feasibility of CloudNativeSim by illustrating the effects of various scaling policies., Comment: 24 pages
Published: 2024

5. Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization

Author: Cai, Yuhang, Wu, Jingfeng, Mei, Song, Lindsey, Michael, and Bartlett, Peter L.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: The typical training of neural networks using large stepsize gradient descent (GD) under the logistic loss often involves two distinct phases, where the empirical risk oscillates in the first phase but decreases monotonically in the second phase. We investigate this phenomenon in two-layer networks that satisfy a near-homogeneity condition. We show that the second phase begins once the empirical risk falls below a certain threshold, dependent on the stepsize. Additionally, we show that the normalized margin grows nearly monotonically in the second phase, demonstrating an implicit bias of GD in training non-homogeneous predictors. If the dataset is linearly separable and the derivative of the activation function is bounded away from zero, we show that the average empirical risk decreases, implying that the first phase must stop in finite steps. Finally, we demonstrate that by choosing a suitably large stepsize, GD that undergoes this phase transition is more efficient than GD that monotonically decreases the risk. Our analysis applies to networks of any width, beyond the well-known neural tangent kernel and mean-field regimes., Comment: Clarify our results on sigmoid neural networks
Published: 2024

6. Scaling Laws in Linear Regression: Compute, Parameters, and Data

Author: Lin, Licong, Wu, Jingfeng, Kakade, Sham M., Bartlett, Peter L., and Lee, Jason D.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: Empirically, large-scale deep learning models often satisfy a neural scaling law: the test error of the trained model improves polynomially as the model size and data size grow. However, conventional wisdom suggests the test error consists of approximation, bias, and variance errors, where the variance error increases with model size. This disagrees with the general form of neural scaling laws, which predict that increasing model size monotonically improves performance. We study the theory of scaling laws in an infinite dimensional linear regression setup. Specifically, we consider a model with $M$ parameters as a linear function of sketched covariates. The model is trained by one-pass stochastic gradient descent (SGD) using $N$ data. Assuming the optimal parameter satisfies a Gaussian prior and the data covariance matrix has a power-law spectrum of degree $a>1$, we show that the reducible part of the test error is $\Theta(M^{-(a-1)} + N^{-(a-1)/a})$. The variance error, which increases with $M$, is dominated by the other errors due to the implicit regularization of SGD, thus disappearing from the bound. Our theory is consistent with the empirical neural scaling laws and verified by numerical simulation.
Published: 2024

7. Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency

Author: Wu, Jingfeng, Bartlett, Peter L., Telgarsky, Matus, and Yu, Bin
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We consider gradient descent (GD) with a constant stepsize applied to logistic regression with linearly separable data, where the constant stepsize $\eta$ is so large that the loss initially oscillates. We show that GD exits this initial oscillatory phase rapidly -- in $\mathcal{O}(\eta)$ steps -- and subsequently achieves an $\tilde{\mathcal{O}}(1 / (\eta t) )$ convergence rate after $t$ additional steps. Our results imply that, given a budget of $T$ steps, GD can achieve an accelerated loss of $\tilde{\mathcal{O}}(1/T^2)$ with an aggressive stepsize $\eta:= \Theta( T)$, without any use of momentum or variable stepsize schedulers. Our proof technique is versatile and also handles general classification loss functions (where exponential tails are needed for the $\tilde{\mathcal{O}}(1/T^2)$ acceleration), nonlinear predictors in the neural tangent kernel regime, and online stochastic gradient descent (SGD) with a large stepsize, under suitable separability conditions., Comment: COLT 2024 camera ready
Published: 2024

8. In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization

Author: Zhang, Ruiqi, Wu, Jingfeng, and Bartlett, Peter L.
Subjects: Statistics - Machine Learning, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: We study the \emph{in-context learning} (ICL) ability of a \emph{Linear Transformer Block} (LTB) that combines a linear attention component and a linear multi-layer perceptron (MLP) component. For ICL of linear regression with a Gaussian prior and a \emph{non-zero mean}, we show that LTB can achieve nearly Bayes optimal ICL risk. In contrast, using only linear attention must incur an irreducible additive approximation error. Furthermore, we establish a correspondence between LTB and one-step gradient descent estimators with learnable initialization ($\mathsf{GD}\text{-}\mathbf{\beta}$), in the sense that every $\mathsf{GD}\text{-}\mathbf{\beta}$ estimator can be implemented by an LTB estimator and every optimal LTB estimator that minimizes the in-class ICL risk is effectively a $\mathsf{GD}\text{-}\mathbf{\beta}$ estimator. Finally, we show that $\mathsf{GD}\text{-}\mathbf{\beta}$ estimators can be efficiently optimized with gradient flow, despite a non-convex training objective. Our results reveal that LTB achieves ICL by implementing $\mathsf{GD}\text{-}\mathbf{\beta}$, and they highlight the role of MLP layers in reducing approximation error., Comment: 39 pages
Published: 2024

9. Risk Bounds of Accelerated SGD for Overparameterized Linear Regression

Author: Li, Xuheng, Deng, Yihe, Wu, Jingfeng, Zhou, Dongruo, and Gu, Quanquan
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Accelerated stochastic gradient descent (ASGD) is a workhorse in deep learning and often achieves better generalization performance than SGD. However, existing optimization theory can only explain the faster convergence of ASGD, but cannot explain its better generalization. In this paper, we study the generalization of ASGD for overparameterized linear regression, which is possibly the simplest setting of learning with overparameterization. We establish an instance-dependent excess risk bound for ASGD within each eigen-subspace of the data covariance matrix. Our analysis shows that (i) ASGD outperforms SGD in the subspace of small eigenvalues, exhibiting a faster rate of exponential decay for bias error, while in the subspace of large eigenvalues, its bias error decays slower than SGD; and (ii) the variance error of ASGD is always larger than that of SGD. Our result suggests that ASGD can outperform SGD when the difference between the initialization and the true weight vector is mostly confined to the subspace of small eigenvalues. Additionally, when our analysis is specialized to linear regression in the strongly convex setting, it yields a tighter bound for bias error than the best-known result., Comment: 85 pages, 5 figures
Published: 2023

10. How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?

Author: Wu, Jingfeng, Zou, Difan, Chen, Zixiang, Braverman, Vladimir, Gu, Quanquan, and Bartlett, Peter L.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Transformers pretrained on diverse tasks exhibit remarkable in-context learning (ICL) capabilities, enabling them to solve unseen tasks solely based on input contexts without adjusting model parameters. In this paper, we study ICL in one of its simplest setups: pretraining a linearly parameterized single-layer linear attention model for linear regression with a Gaussian prior. We establish a statistical task complexity bound for the attention model pretraining, showing that effective pretraining only requires a small number of independent tasks. Furthermore, we prove that the pretrained model closely matches the Bayes optimal algorithm, i.e., optimally tuned ridge regression, by achieving nearly Bayes optimal risk on unseen tasks under a fixed context length. These theoretical findings complement prior experimental research and shed light on the statistical foundations of ICL., Comment: ICLR 2024 Camera Ready
Published: 2023

11. Private Federated Frequency Estimation: Adapting to the Hardness of the Instance

Author: Wu, Jingfeng, Zhu, Wennan, Kairouz, Peter, and Braverman, Vladimir
Subjects: Computer Science - Data Structures and Algorithms, Computer Science - Machine Learning
Abstract: In federated frequency estimation (FFE), multiple clients work together to estimate the frequencies of their collective data by communicating with a server that respects the privacy constraints of Secure Summation (SecSum), a cryptographic multi-party computation protocol that ensures that the server can only access the sum of client-held vectors. For single-round FFE, it is known that count sketching is nearly information-theoretically optimal for achieving the fundamental accuracy-communication trade-offs [Chen et al., 2022]. However, we show that under the more practical multi-round FEE setting, simple adaptations of count sketching are strictly sub-optimal, and we propose a novel hybrid sketching algorithm that is provably more accurate. We also address the following fundamental question: how should a practitioner set the sketch size in a way that adapts to the hardness of the underlying problem? We propose a two-phase approach that allows for the use of a smaller sketch size for simpler problems (e.g., near-sparse or light-tailed distributions). We conclude our work by showing how differential privacy can be added to our algorithm and verifying its superior performance through extensive experiments conducted on large-scale datasets., Comment: NeurIPS 2023 camera ready version
Published: 2023

12. The (un)caring experienced by racialized and/or ethnoculturally diverse residents in supportive living: a qualitative study

Author: Chamberlain, Stephanie A., Salma, Jordana, Tong, Hongmei, Savera, Wu, Jingfeng, and Gruneir, Andrea
Published: 2024
Full Text: View/download PDF

13. Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability

Author: Wu, Jingfeng, Braverman, Vladimir, and Lee, Jason D.
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Recent research has observed that in machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS) [Cohen, et al., 2021], where the stepsizes are set to be large, resulting in non-monotonic losses induced by the GD iterates. This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime. Despite the presence of local oscillations, we prove that the logistic loss can be minimized by GD with \emph{any} constant stepsize over a long time scale. Furthermore, we prove that with \emph{any} constant stepsize, the GD iterates tend to infinity when projected to a max-margin direction (the hard-margin SVM direction) and converge to a fixed vector that minimizes a strongly convex potential when projected to the orthogonal complement of the max-margin direction. In contrast, we also show that in the EoS regime, GD iterates may diverge catastrophically under the exponential loss, highlighting the superiority of the logistic loss. These theoretical findings are in line with numerical simulations and complement existing theories on the convergence and implicit bias of GD for logistic regression, which are only applicable when the stepsizes are sufficiently small., Comment: NeurIPS 2023 camera ready version
Published: 2023

14. Fixed Design Analysis of Regularization-Based Continual Learning

Author: Li, Haoran, Wu, Jingfeng, and Braverman, Vladimir
Subjects: Computer Science - Machine Learning
Abstract: We consider a continual learning (CL) problem with two linear regression tasks in the fixed design setting, where the feature vectors are assumed fixed and the labels are assumed to be random variables. We consider an $\ell_2$-regularized CL algorithm, which computes an Ordinary Least Squares parameter to fit the first dataset, then computes another parameter that fits the second dataset under an $\ell_2$-regularization penalizing its deviation from the first parameter, and outputs the second parameter. For this algorithm, we provide tight bounds on the average risk over the two tasks. Our risk bounds reveal a provable trade-off between forgetting and intransigence of the $\ell_2$-regularized CL algorithm: with a large regularization parameter, the algorithm output forgets less information about the first task but is intransigent to extract new information from the second task; and vice versa. Our results suggest that catastrophic forgetting could happen for CL with dissimilar tasks (under a precise similarity measurement) and that a well-tuned $\ell_2$-regularization can partially mitigate this issue by introducing intransigence., Comment: CoLLAs 2023 camera-ready version
Published: 2023

15. Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron

Author: Wu, Jingfeng, Zou, Difan, Chen, Zixiang, Braverman, Vladimir, Gu, Quanquan, and Kakade, Sham M.
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: This paper considers the problem of learning a single ReLU neuron with squared loss (a.k.a., ReLU regression) in the overparameterized regime, where the input dimension can exceed the number of samples. We analyze a Perceptron-type algorithm called GLM-tron (Kakade et al., 2011) and provide its dimension-free risk upper bounds for high-dimensional ReLU regression in both well-specified and misspecified settings. Our risk bounds recover several existing results as special cases. Moreover, in the well-specified setting, we provide an instance-wise matching risk lower bound for GLM-tron. Our upper and lower risk bounds provide a sharp characterization of the high-dimensional ReLU regression problems that can be learned via GLM-tron. On the other hand, we provide some negative results for stochastic gradient descent (SGD) for ReLU regression with symmetric Bernoulli data: if the model is well-specified, the excess risk of SGD is provably no better than that of GLM-tron ignoring constant factors, for each problem instance; and in the noiseless case, GLM-tron can achieve a small risk while SGD unavoidably suffers from a constant risk in expectation. These results together suggest that GLM-tron might be preferable to SGD for high-dimensional ReLU regression., Comment: ICML 2023 camera ready
Published: 2023

16. The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift

Author: Wu, Jingfeng, Zou, Difan, Braverman, Vladimir, Gu, Quanquan, and Kakade, Sham M.
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: We study linear regression under covariate shift, where the marginal distribution over the input covariates differs in the source and the target domains, while the conditional distribution of the output given the input covariates is similar across the two domains. We investigate a transfer learning approach with pretraining on the source data and finetuning based on the target data (both conducted by online SGD) for this problem. We establish sharp instance-dependent excess risk upper and lower bounds for this approach. Our bounds suggest that for a large class of linear regression instances, transfer learning with $O(N^2)$ source data (and scarce or no target data) is as effective as supervised learning with $N$ target data. In addition, we show that finetuning, even with only a small amount of target data, could drastically reduce the amount of source data required by pretraining. Our theory sheds light on the effectiveness and limitation of pretraining as well as the benefits of finetuning for tackling covariate shift problems., Comment: 32 pages, 1 figure, 1 table
Published: 2022

17. Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime

Author: Zou, Difan, Wu, Jingfeng, Braverman, Vladimir, Gu, Quanquan, and Kakade, Sham M.
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Stochastic gradient descent (SGD) has achieved great success due to its superior performance in both optimization and generalization. Most of existing generalization analyses are made for single-pass SGD, which is a less practical variant compared to the commonly-used multi-pass SGD. Besides, theoretical analyses for multi-pass SGD often concern a worst-case instance in a class of problems, which may be pessimistic to explain the superior generalization ability for some particular problem instance. The goal of this paper is to sharply characterize the generalization of multi-pass SGD, by developing an instance-dependent excess risk bound for least squares in the interpolation regime, which is expressed as a function of the iteration number, stepsize, and data covariance. We show that the excess risk of SGD can be exactly decomposed into the excess risk of GD and a positive fluctuation error, suggesting that SGD always performs worse, instance-wisely, than GD, in generalization. On the other hand, we show that although SGD needs more iterations than GD to achieve the same level of excess risk, it saves the number of stochastic gradient evaluations, and therefore is preferable in terms of computational time., Comment: 28 pages, 2 figures
Published: 2022

18. High entropy Prussian Blue Analogues assisted by reduced graphene oxide for enhancing the lifespan of Sodium-ion batteries

Author: Wu, Jingfeng, Wang, Guiting, Li, Kun, Guo, Xu, Liang, Yongxing, Li, Li, Wang, Lei, Xie, Ying, and Guo, Chenfeng
Published: 2024
Full Text: View/download PDF

19. Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression

Author: Wu, Jingfeng, Zou, Difan, Braverman, Vladimir, Gu, Quanquan, and Kakade, Sham M.
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Stochastic gradient descent (SGD) has been shown to generalize well in many deep learning applications. In practice, one often runs SGD with a geometrically decaying stepsize, i.e., a constant initial stepsize followed by multiple geometric stepsize decay, and uses the last iterate as the output. This kind of SGD is known to be nearly minimax optimal for classical finite-dimensional linear regression problems (Ge et al., 2019). However, a sharp analysis for the last iterate of SGD in the overparameterized setting is still open. In this paper, we provide a problem-dependent analysis on the last iterate risk bounds of SGD with decaying stepsize, for (overparameterized) linear regression problems. In particular, for last iterate SGD with (tail) geometrically decaying stepsize, we prove nearly matching upper and lower bounds on the excess risk. Moreover, we provide an excess risk lower bound for last iterate SGD with polynomially decaying stepsize and demonstrate the advantage of geometrically decaying stepsize in an instance-wise manner, which complements the minimax rate comparison made in prior works., Comment: 35 pages, 2 figures, 1 table. In ICML 2022
Published: 2021

20. Environmentally friendly fluorine-free fire extinguishing agent based on the synergistic effect of silicone, hydrocarbon surfactants and foam stabilizers

Author: Zhang, Guangwen, Jiao, Jinqing, Wu, Jingfeng, Lang, Xuqing, Wang, Chun, Wei, Yuechang, Cui, Pengyu, Shang, Zuzheng, Mu, Xiaodong, Mu, Shanjun, Liu, Linjie, Zhang, Ripeng, and Qi, Lei
Published: 2024
Full Text: View/download PDF

21. Synthesis of SrTiO3-TiO2-CaTiO3/Cu2O composite for stable and efficient H2 generation in deionized water and river water

Author: Yang, Junfeng, Lu, Jinsong, Xie, Liangsheng, Wu, Jingfeng, Wen, Yu, and Zhang, Qin
Published: 2024
Full Text: View/download PDF

22. Gap-Dependent Unsupervised Exploration for Reinforcement Learning

Author: Wu, Jingfeng, Braverman, Vladimir, and Yang, Lin F.
Subjects: Computer Science - Machine Learning
Abstract: For the problem of task-agnostic reinforcement learning (RL), an agent first collects samples from an unknown environment without the supervision of reward signals, then is revealed with a reward and is asked to compute a corresponding near-optimal policy. Existing approaches mainly concern the worst-case scenarios, in which no structural information of the reward/transition-dynamics is utilized. Therefore the best sample upper bound is $\propto\widetilde{\mathcal{O}}(1/\epsilon^2)$, where $\epsilon>0$ is the target accuracy of the obtained policy, and can be overly pessimistic. To tackle this issue, we provide an efficient algorithm that utilizes a gap parameter, $\rho>0$, to reduce the amount of exploration. In particular, for an unknown finite-horizon Markov decision process, the algorithm takes only $\widetilde{\mathcal{O}} (1/\epsilon \cdot (H^3SA / \rho + H^4 S^2 A) )$ episodes of exploration, and is able to obtain an $\epsilon$-optimal policy for a post-revealed reward with sub-optimality gap at least $\rho$, where $S$ is the number of states, $A$ is the number of actions, and $H$ is the length of the horizon, obtaining a nearly \emph{quadratic saving} in terms of $\epsilon$. We show that, information-theoretically, this bound is nearly tight for $\rho < \Theta(1/(HS))$ and $H>1$. We further show that $\propto\widetilde{\mathcal{O}}(1)$ sample bound is possible for $H=1$ (i.e., multi-armed bandit) or with a sampling simulator, establishing a stark separation between those settings and the RL setting., Comment: AISTATS 2022 camera ready version
Published: 2021

23. The Benefits of Implicit Regularization from SGD in Least Squares Problems

Author: Zou, Difan, Wu, Jingfeng, Braverman, Vladimir, Gu, Quanquan, Foster, Dean P., and Kakade, Sham M.
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice, which has been hypothesized to play an important role in the generalization of modern machine learning approaches. In this work, we seek to understand these issues in the simpler setting of linear regression (including both underparameterized and overparameterized regimes), where our goal is to make sharp instance-based comparisons of the implicit regularization afforded by (unregularized) average SGD with the explicit regularization of ridge regression. For a broad class of least squares problem instances (that are natural in high-dimensional settings), we show: (1) for every problem instance and for every ridge parameter, (unregularized) SGD, when provided with logarithmically more samples than that provided to the ridge algorithm, generalizes no worse than the ridge solution (provided SGD uses a tuned constant stepsize); (2) conversely, there exist instances (in this wide problem class) where optimally-tuned ridge regression requires quadratically more samples than SGD in order to have the same generalization performance. Taken together, our results show that, up to the logarithmic factors, the generalization performance of SGD is always no worse than that of ridge regression in a wide range of overparameterized problems, and, in fact, could be much better for some problem instances. More generally, our results show how algorithmic regularization has important consequences even in simpler (overparameterized) convex settings., Comment: 33 pages, 1 figure. In NeurIPS 2021
Published: 2021

24. Effects of nobiletin on intestinal stem cell proliferation in vitro and in vivo

Author: WU Jingfeng, GUO Kenan, and ZHANG Ning
Subjects: nobiletin, intestinal stem cells, cell proliferation, small intestinal organoid, in vivo, Medicine (General), R5-920
Abstract: Objective To investigate the regulatory effect of nobiletin (NOB) at typical effective doses on intestinal stem cells in vivo and in vitro. Methods After a 3D culture model of mouse colorectal tumor cell line MC38 was constructed, the death and survival of the obtained colonies were observed after treatment of different concentrations of NOB. Mouse small intestinal crypts were cultured in the Matrigel to generate organoids. Different concentrations of NOB were added to the culture system, and the sprouting and growth of the organoids were observed. MTT assay was used to calculate the area and absorbance values of the organoids after staining. The organoid growth was observed by removing 50 and 100 μmol/L NOB for different periods. The effects of exogenous R-spondin1 and CHIR99021 (activators of Wnt pathway) on the growth of organoids were observed in the culture system with 50 and 100 μmol/L NOB treatment. C57/B6J mice were infused with different concentrations of NOB by gavage for 4 consecutive days. The ratio of intestinal crypt to villus length, cell apoptosis, and number of OLfm4 and BrdU double positive cells were calculated and observed. Results Compared with the control group, 50 μmol/L NOB significantly promoted the death of MC38 cells after sphere formation and significantly reduced the colony formation rate (P < 0.001). The dose also significantly inhibited the budding of intestinal organoids, and 50~200 μmol/L NOB significantly inhibited the growth of intestinal organoids in a dose-dependent manner when compared with the control group (P < 0.001). After NOB withdrawal, the growth of intestinal organoids in the 50 μmol/L group was partially recovered, but it was difficult to recover to normal level in the 100 μmol/L NOB group. Enhanced Wnt pathway activation could partially rescue the inhibitory effect of NOB on intestinal organoids. In vivo administration of higher concentrations of NOB did not induce apoptosis of intestinal crypt cells, did not affect the crypt to villus ratio, and had no effects on the number and proliferation status of intestinal stem cells. Conclusion NOB possesses the ability to inhibit normal intestinal stem cells by regulating stem cell-related pathways such as Wnt, but this cellular inhibition cannot be phenocopied in vivo. Our study suggests the safety of NOB at regular concentration and also raises a caution to explain the inhibitory role of NOB in cancer models.
Published: 2023
Full Text: View/download PDF

25. Overexpression of IL22 by adeno-associated virus effectively ameliorates dextran sulfate sodium-induced colitis in mice

Author: WU Tianyu, SONG Yalan, WU Jingfeng, and ZHENG Zhiyuan
Subjects: il22, adeno-associated virus, inflammatory bowel disease, dextran sulfate sodium, jak/stat3 pathway, Medicine (General), R5-920
Abstract: Objective To investigate the effects of interleukin22 (IL22) in acute colitis induced by dextran sulfate sodium salt (DSS) in mice and its underlying mechanism. Methods Three IL22 knockout mice (IL22-/-) and 3 control mice (IL22+/+) of 8-week-old males were subjected from the same litter, and then had free-choice drinking of 2.5% DSS water for 6 d followed by normal water for 2 d to construct a mouse model of acute colitis. The changes in body mass weight and disease activity index (DAI) were measured and calculated. After the intervention of 8 d, mouse enteroscopy was performed, and the length of colon and the organ index of major immune organs were measured in both groups. Protein levels of pro-inflammatory cytokines, such as IL1α, IL1β, IL6 and TNFα were detected by Western blotting. Adeno-associated virus (AAV) transfection model was established in 8-week-old male C57BL/6 mice, which were then treated by 2.5%DSS drinking. The mice were randomly divided into PBS group (n=5), AAV-mock+DSS group (n=5) and AAV-IL22+DSS group (n=5). The expression levels of colonic closed small loop protein 1 (ZO-1) and calcium adhesion protein (E-claudin) in different groups were detected by immunofluorescence assay. The protein levels of IL22, Occludin, Stat3 and p-Stat3 were detected by Western blotting. Results Compared with the IL22+/+ group, the mice in the IL22-/- group had significantly decreased body weight on day 8 (P
Published: 2023
Full Text: View/download PDF

26. Lifelong Learning with Sketched Structural Regularization

Author: Li, Haoran, Krishnan, Aditya, Wu, Jingfeng, Kolouri, Soheil, Pilly, Praveen K., and Braverman, Vladimir
Subjects: Computer Science - Machine Learning
Abstract: Preventing catastrophic forgetting while continually learning new tasks is an essential problem in lifelong learning. Structural regularization (SR) refers to a family of algorithms that mitigate catastrophic forgetting by penalizing the network for changing its "critical parameters" from previous tasks while learning a new one. The penalty is often induced via a quadratic regularizer defined by an \emph{importance matrix}, e.g., the (empirical) Fisher information matrix in the Elastic Weight Consolidation framework. In practice and due to computational constraints, most SR methods crudely approximate the importance matrix by its diagonal. In this paper, we propose \emph{Sketched Structural Regularization} (Sketched SR) as an alternative approach to compress the importance matrices used for regularizing in SR methods. Specifically, we apply \emph{linear sketching methods} to better approximate the importance matrices in SR algorithms. We show that sketched SR: (i) is computationally efficient and straightforward to implement, (ii) provides an approximation error that is justified in theory, and (iii) is method oblivious by construction and can be adapted to any method that belongs to the structural regularization class. We show that our proposed approach consistently improves various SR algorithms' performance on both synthetic experiments and benchmark continual learning tasks, including permuted-MNIST and CIFAR-100.
Published: 2021

27. Benign Overfitting of Constant-Stepsize SGD for Linear Regression

Author: Zou, Difan, Wu, Jingfeng, Braverman, Vladimir, Gu, Quanquan, and Kakade, Sham M.
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: There is an increasing realization that algorithmic inductive biases are central in preventing overfitting; empirically, we often see a benign overfitting phenomenon in overparameterized settings for natural learning algorithms, such as stochastic gradient descent (SGD), where little to no explicit regularization has been employed. This work considers this issue in arguably the most basic setting: constant-stepsize SGD (with iterate averaging or tail averaging) for linear regression in the overparameterized regime. Our main result provides a sharp excess risk bound, stated in terms of the full eigenspectrum of the data covariance matrix, that reveals a bias-variance decomposition characterizing when generalization is possible: (i) the variance bound is characterized in terms of an effective dimension (specific for SGD) and (ii) the bias bound provides a sharp geometric characterization in terms of the location of the initial iterate (and how it aligns with the data covariance matrix). More specifically, for SGD with iterate averaging, we demonstrate the sharpness of the established excess risk bound by proving a matching lower bound (up to constant factors). For SGD with tail averaging, we show its advantage over SGD with iterate averaging by proving a better excess risk bound together with a nearly matching lower bound. Moreover, we reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares (minimum-norm interpolation) and ridge regression. Experimental results on synthetic data corroborate our theoretical findings., Comment: 56 pages, 2 figures. A short version is accepted at the 34th Annual Conference on Learning Theory (COLT 2021)
Published: 2021

28. Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning

Author: Wu, Jingfeng, Braverman, Vladimir, and Yang, Lin F.
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In this paper we consider multi-objective reinforcement learning where the objectives are balanced using preferences. In practice, the preferences are often given in an adversarial manner, e.g., customers can be picky in many applications. We formalize this problem as an episodic learning problem on a Markov decision process, where transitions are unknown and a reward function is the inner product of a preference vector with pre-specified multi-objective reward functions. We consider two settings. In the online setting, the agent receives a (adversarial) preference every episode and proposes policies to interact with the environment. We provide a model-based algorithm that achieves a nearly minimax optimal regret bound $\widetilde{\mathcal{O}}\bigl(\sqrt{\min\{d,S\}\cdot H^2 SAK}\bigr)$, where $d$ is the number of objectives, $S$ is the number of states, $A$ is the number of actions, $H$ is the length of the horizon, and $K$ is the number of episodes. Furthermore, we consider preference-free exploration, i.e., the agent first interacts with the environment without specifying any preference and then is able to accommodate arbitrary preference vector up to $\epsilon$ error. Our proposed algorithm is provably efficient with a nearly optimal trajectory complexity $\widetilde{\mathcal{O}}\bigl({\min\{d,S\}\cdot H^3 SA}/{\epsilon^2}\bigr)$. This result partly resolves an open problem raised by \citet{jin2020reward}., Comment: NeurIPS 2021 Camera Ready Version
Published: 2020

29. Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate

Author: Wu, Jingfeng, Zou, Difan, Braverman, Vladimir, and Gu, Quanquan
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Understanding the algorithmic bias of \emph{stochastic gradient descent} (SGD) is one of the key challenges in modern machine learning and deep learning theory. Most of the existing works, however, focus on \emph{very small or even infinitesimal} learning rate regime, and fail to cover practical scenarios where the learning rate is \emph{moderate and annealing}. In this paper, we make an initial attempt to characterize the particular regularization effect of SGD in the moderate learning rate regime by studying its behavior for optimizing an overparameterized linear regression problem. In this case, SGD and GD are known to converge to the unique minimum-norm solution; however, with the moderate and annealing learning rate, we show that they exhibit different \emph{directional bias}: SGD converges along the large eigenvalue directions of the data matrix, while GD goes after the small eigenvalue directions. Furthermore, we show that such directional bias does matter when early stopping is adopted, where the SGD output is nearly optimal but the GD output is suboptimal. Finally, our theory explains several folk arts in practice used for SGD hyperparameter tuning, such as (1) linearly scaling the initial learning rate with batch size; and (2) overrunning SGD with high learning rate even when the loss stops decreasing., Comment: ICLR 2021 Camera Ready
Published: 2020

30. Obtaining Adjustable Regularization for Free via Iterate Averaging

Author: Wu, Jingfeng, Braverman, Vladimir, and Yang, Lin F.
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Regularization for optimization is a crucial technique to avoid overfitting in machine learning. In order to obtain the best performance, we usually train a model by tuning the regularization parameters. It becomes costly, however, when a single round of training takes significant amount of time. Very recently, Neu and Rosasco show that if we run stochastic gradient descent (SGD) on linear regression problems, then by averaging the SGD iterates properly, we obtain a regularized solution. It left open whether the same phenomenon can be achieved for other optimization problems and algorithms. In this paper, we establish an averaging scheme that provably converts the iterates of SGD on an arbitrary strongly convex and smooth objective function to its regularized counterpart with an adjustable regularization parameter. Our approaches can be used for accelerated and preconditioned optimization methods as well. We further show that the same methods work empirically on more general optimization objectives including neural networks. In sum, we obtain adjustable regularization for free for a large class of optimization problems and resolve an open question raised by Neu and Rosasco., Comment: ICML 2020 camera ready
Published: 2020

31. Monodispersed Ni12P5 nanocrystals in situ grown on reduced graphene oxide matrix with enhanced Li-electrochemical properties

Author: Wang, Song, Guo, Xu, Li, Kun, Wang, Guiting, Su, Shaokang, Wu, Jingfeng, Li, Li, Xie, Ying, Guo, Chenfeng, and Pan, Kai
Published: 2023
Full Text: View/download PDF

32. On the Noisy Gradient Descent that Generalizes as SGD

Author: Wu, Jingfeng, Hu, Wenqing, Xiong, Haoyi, Huan, Jun, Braverman, Vladimir, and Zhu, Zhanxing
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: The gradient noise of SGD is considered to play a central role in the observed strong generalization abilities of deep learning. While past studies confirm that the magnitude and the covariance structure of gradient noise are critical for regularization, it remains unclear whether or not the class of noise distributions is important. In this work we provide negative results by showing that noises in classes different from the SGD noise can also effectively regularize gradient descent. Our finding is based on a novel observation on the structure of the SGD noise: it is the multiplication of the gradient matrix and a sampling noise that arises from the mini-batch sampling procedure. Moreover, the sampling noises unify two kinds of gradient regularizing noises that belong to the Gaussian class: the one using (scaled) Fisher as covariance and the one using the gradient covariance of SGD as covariance. Finally, thanks to the flexibility of choosing noise class, an algorithm is proposed to perform noisy gradient descent that generalizes well, the variant of which even benefits large batch SGD training without hurting generalization., Comment: ICML 2020 near camera ready version
Published: 2019

33. Tangent-Normal Adversarial Regularization for Semi-supervised Learning

Author: Yu, Bing, Wu, Jingfeng, Ma, Jinwen, and Zhu, Zhanxing
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: Compared with standard supervised learning, the key difficulty in semi-supervised learning is how to make full use of the unlabeled data. A recently proposed method, virtual adversarial training (VAT), smartly performs adversarial training without label information to impose a local smoothness on the classifier, which is especially beneficial to semi-supervised learning. In this work, we propose tangent-normal adversarial regularization (TNAR) as an extension of VAT by taking the data manifold into consideration. The proposed TNAR is composed by two complementary parts, the tangent adversarial regularization (TAR) and the normal adversarial regularization (NAR). In TAR, VAT is applied along the tangent space of the data manifold, aiming to enforce local invariance of the classifier on the manifold, while in NAR, VAT is performed on the normal space orthogonal to the tangent space, intending to impose robustness on the classifier against the noise causing the observed data deviating from the underlying data manifold. Demonstrated by experiments on both artificial and practical datasets, our proposed TAR and NAR complement with each other, and jointly outperforms other state-of-the-art methods for semi-supervised learning., Comment: CVPR 2019
Published: 2018

34. The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects

Author: Zhu, Zhanxing, Wu, Jingfeng, Yu, Bing, Wu, Lei, and Ma, Jinwen
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Understanding the behavior of stochastic gradient descent (SGD) in the context of deep neural networks has raised lots of concerns recently. Along this line, we study a general form of gradient based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics. Through investigating this general optimization dynamics, we analyze the behavior of SGD on escaping from minima and its regularization effects. A novel indicator is derived to characterize the efficiency of escaping from minima through measuring the alignment of noise covariance and the curvature of loss function. Based on this indicator, two conditions are established to show which type of noise structure is superior to isotropic noise in term of escaping efficiency. We further show that the anisotropic noise in SGD satisfies the two conditions, and thus helps to escape from sharp and poor minima effectively, towards more stable and flat minima that typically generalize well. We systematically design various experiments to verify the benefits of the anisotropic noise, compared with full gradient descent plus isotropic diffusion (i.e. Langevin dynamics)., Comment: ICML 2019 camera ready
Published: 2018

35. Hydrodeoxygenation of lignin-derived phenolics to cycloalkanes over Ni–Co alloy coupled with oxophilic NbOx

Author: Zhang, Chengzhi, Zhang, Xing, Wu, Jingfeng, Zhu, Lingjun, and Wang, Shurong
Published: 2022
Full Text: View/download PDF

36. Selective hydrodeoxygenation of lignin-derived phenolics to cycloalkanes over highly stable NiAl2O4 spinel-supported bifunctional catalysts

Author: Zhang, Xing, Wu, Jingfeng, Li, Tian, Zhang, Chengzhi, Zhu, Lingjun, and Wang, Shurong
Published: 2022
Full Text: View/download PDF

37. A new insight into pyrolysis mechanism of three typical actual biomass: The influence of structural differences on pyrolysis process

Author: Wang, Guanyu, Dai, Gongxin, Ding, Shaoqiu, Wu, Jingfeng, and Wang, Shurong
Published: 2021
Full Text: View/download PDF

38. Intelligent monitoring of EHV transformer bushing based on multi‐parameter composite sensing technology

Author: Zhang, Lu, primary, Sun, Lei, additional, Wang, Wensen, additional, Han, Yanhua, additional, Pu, Lu, additional, Wu, Jingfeng, additional, and Wu, Hao, additional
Published: 2023
Full Text: View/download PDF

39. Unconsolidated fault vertical sealing ability evaluation method and its application during Stationary phase/Metodo de evaluacion de la capacidad de sellado vertical de fallas no consolidadas y su aplicacion durante la fase estacionaria

Author: Sun, Ning, Fu, Guang, Liu, Lili, and Wu, Jingfeng
Published: 2019
Full Text: View/download PDF

40. Study on the relationship between multi-stage strike-slip mechanism and basin evolution in Fangzheng fault depression/Estudio sobre la relacion entre el mecanismo de deslizamiento por etapas y la evolucion de la cuenca en la depresion por falla Fangzheng

Author: Wu, Jingfeng, Meng, Qi'an, Fu, Xiaofei, Ma, Yuling, Sun, Meifeng, Sun, Ning, and Tan, Wancang
Published: 2018
Full Text: View/download PDF

41. Polypeptide Substrate Accessibility Hypothesis: Gain-of-Function R206H Mutation Allosterically Affects Activin Receptor-like Protein Kinase Activity

Author: Groppe, Jay C., primary, Lu, Guorong, additional, Tandang-Silvas, Mary R., additional, Pathi, Anupama, additional, Konda, Shruti, additional, Wu, Jingfeng, additional, Le, Viet Q., additional, Culbert, Andria L., additional, Shore, Eileen M., additional, Wharton, Kristi A., additional, and Kaplan, Frederick S., additional
Published: 2023
Full Text: View/download PDF

42. Inter-laboratory study for the certification of trace elements in seawater certified reference materials NASS-7 and CASS-6

Author: Yang, Lu, Nadeau, Kenny, Meija, Juris, Grinberg, Patricia, Pagliano, Enea, Ardini, Francisco, Grotti, Marco, Schlosser, Christian, Streu, Peter, Achterberg, Eric P., Sohrin, Yoshiki, Minami, Tomoharu, Zheng, Linjie, Wu, Jingfeng, Chen, Gedun, Ellwood, Michael J., Turetta, Clara, Aguilar-Islas, Ana, Rember, Robert, Sarthou, Géraldine, Tonnard, Manon, Planquette, Hélène, Matoušek, Tomáš, Crum, Steven, and Mester, Zoltán
Published: 2018
Full Text: View/download PDF

43. Synthesis of Carboxyl Modified Polyether Polysiloxane Surfactant for the Biodegradable Foam Fire Extinguishing Agents

Author: Jiao, Jinqing, primary, Qi, Lei, additional, Wu, Jingfeng, additional, Lang, Xuqing, additional, Wei, Yuechang, additional, Zhang, Guangwen, additional, Cui, Pengyu, additional, Shang, Zuzheng, additional, Mu, Xiaodong, additional, Mu, Shanjun, additional, Lv, Yuzhuo, additional, and Pan, Weichao, additional
Published: 2023
Full Text: View/download PDF

44. A mesoporous polydopamine-derived nanomedicine for targeted and synergistic treatment of inflammatory bowel disease by pH-Responsive drug release and ROS scavenging

Author: Guan, Haidi, primary, Xu, Zhongwei, additional, Du, Guangsheng, additional, Liu, Qinghua, additional, Tan, Qianshan, additional, Chen, Yihui, additional, Chen, Shuaishuai, additional, Wu, Jingfeng, additional, Wang, Fengchao, additional, Zhang, Jixi, additional, Sun, Lihua, additional, and Xiao, Weidong, additional
Published: 2023
Full Text: View/download PDF

45. Development of a Portable SF6/N2 Mixed Gas Charging Device for On-Site Modification of Gas Insulated Current Transformer

Author: Yang, Dingge, Liu, Tongyu, Zhao, Yuyang, Shang, Yu, Pu, Lu, Niu, Bo, Han, Yanhua, Qi, Weidong, Wu, Jingfeng, Yang, Zihao, Ren, Linyuan, Deng, Zichen, Ding, Weidong, and Wang, Yanan
Abstract: In this article, a portable gas charging device for SF6/N2 gas-insulated current transformer is developed to meet the demands of on-site modification of SF6/N2 gas-filled current transformer. This device has the functions of SF6 retrieving, vacuum pumping, mixed gas filling, and mixed gas ratio detection. This device adopts the partial pressure controlling method to realize the gas mixture ratio. By filling the mixed gas successively, the required pressure value is finally achieved without presetting the total volume of inflating gas. The operation process is simple and intuitive, and the operation efficiency is high. The built-in temperature detection and pressure detection module can carry out real-time detection and temperature correction for the charging pressure of mixed gas, improving the accuracy of the charging device. A buffer cavity is designed to speed up the homogenization of the mixed gas in the equipment cavity. The gas in the equipment cavity is poured into the buffer cavity repeatedly to force it to mix quickly, which reduces the time required for the gas to be set even and improves the charging efficiency. The experiment results show that the mixing ratio accuracy of the charging device can be controlled within 1%, and the pressure deviation can be controlled within 0.02 MPa. With the gas charging device, two current transformer on-site modifications have been successfully conducted.
Published: 2024
Full Text: View/download PDF

46. Development of a Portable SF6/N2 Mixed Gas Charging Device for On-Site Modification of Gas Insulated Current Transformer

Author: Yang, Dingge, primary, Liu, Tongyu, additional, Zhao, Yuyang, additional, Shang, Yu, additional, Pu, Lu, additional, Niu, Bo, additional, Han, Yanhua, additional, Qi, Weidong, additional, Wu, Jingfeng, additional, Yang, Zihao, additional, Ren, Linyuan, additional, Deng, Zichen, additional, Ding, Weidong, additional, and Wang, Yanan, additional
Published: 2023
Full Text: View/download PDF

47. Distribution of potassium during chemical activation of petroleum coke: Electron microscopy evidence and links to phase behaviour

Author: Montes, Vicente, primary, Xiao, Ye, additional, Wu, Jingfeng, additional, and Hill, Josephine M., additional
Published: 2022
Full Text: View/download PDF

48. Soluble and Colloidal Iron in the Oligotrophic North Atlantic and North Pacific

Author: Wu, Jingfeng, Boyle, Edward, Sunda, William, and Wen, Liang-Saw
Published: 2001

49. Distribution of potassium during chemical activation of petroleum coke: Electron microscopy evidence and links to phase behaviour.

Author: Montes, Vicente, Xiao, Ye, Wu, Jingfeng, and Hill, Josephine M.
Subjects: PETROLEUM coke, PETROLEUM chemicals, ELECTRON microscopy, POTASSIUM, PHASE diagrams, ACTIVATED carbon, ACTIVATION (Chemistry), MOLTEN carbonate fuel cells
Abstract: In this study, direct evidence for chemical penetration into petroleum coke particles during activation is presented. In addition, the porosity development was directly related to the sulphur loss and phase behaviour of the species present. Petroleum coke (petcoke, 6 wt.% sulphur) was activated with KOH and NaOH at temperatures between 400 and 800°C. The CS bonds were broken before 400°C in the presence of KOH and before 500°C in the presence of NaOH. Electron microscopy analysis of cross‐sectioned and ultramicrotomed samples revealed that sulphur was still present within the particles and that the hydroxide activation agents had penetrated to the centre of the particles (90–150 μm). After heating to 800°C and washing with a weak acid aqueous solution, essentially all the sulphur was removed, as was any remaining chemical agent. The characterization results, phase diagrams, and complementary experiments with carbonate chemical agents or steam suggest that, during heating, a molten phase formed around the petcoke particles. The composition of this molten phase changed as activation proceeded and both sulphur and ash components were liberated from the petcoke. This better understanding of the activation process will improve the efficiency of preparing activated carbon. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

50. Supplementary document for >10 GHz femtosecond fiber laser system at 2.0 μm - 5721510.pdf

Author: Liang, Zhaoheng, Lin, Wei, Wu, Jingfeng, Chen, Xuewen, Guo, Yuankai, LING, LIN, Wei, Xiaoming, and Yang, Zhongmin
Subjects: Physics::Optics, Physics::Atomic Physics
Abstract: Details about this fiber laser system and the numerical simulation
Published: 2022
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

206 results on '"Wu, Jingfeng"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources