Author: "Jan N. van Rijn" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Jan N. van Rijn"' showing total 76 results

1. Multi-task learning with a natural metric for quantitative structure activity relationship learning

Author: Noureddin Sadawi, Ivan Olier, Joaquin Vanschoren, Jan N. van Rijn, Jeremy Besnard, Richard Bickerton, Crina Grosan, Larisa Soldatova, and Ross D. King
Subjects: Multi-task learning, Quantitative structure activity relationship, Sequence-based similarity, Random forest, Information technology, T58.5-58.64, Chemistry, QD1-999
Abstract: The goal of quantitative structure activity relationship (QSAR) learning is to learn a function that, given the structure of a small molecule (a potential drug), outputs the predicted activity of the compound. We employed multi-task learning (MTL) to exploit commonalities in drug targets and assays. We used datasets containing curated records about the activity of specific compounds on drug targets provided by ChEMBL. In total, 1091 assays were analysed. As a baseline, a single task learning approach that trains random forest to predict drug activity for each drug target individually was considered. We then carried out feature-based and instance-based MTL to predict drug activities. We introduced a natural metric of evolutionary distance between drug targets as a measure of task relatedness. Instance-based MTL significantly outperformed both feature-based MTL and the base learner on 741 drug targets out of 1091. Feature-based MTL won on 179 occasions and the base learner performed best on 171 drug targets. We conclude that MTL QSAR is improved by incorporating the evolutionary distance between targets. These results indicate that QSAR learning can be performed effectively, even if little data is available for specific drug targets, by leveraging what is known about similar drug targets.
Published: 2019
Full Text: View/download PDF

2. Automated Design of Linear Bounding Functions for Sigmoidal Nonlinearities in Neural Networks.

Author: Matthias König 0005, Xiyue Zhang, Holger H. Hoos, Marta Kwiatkowska, and Jan N. van Rijn
Published: 2024
Full Text: View/download PDF

3. A Preliminary Study to Examining Per-class Performance Bias via Robustness Distributions.

Author: Annelot W. Bosman, Anna L. Münz, Holger H. Hoos, and Jan N. van Rijn
Published: 2024
Full Text: View/download PDF

4. Learning Curve Extrapolation Methods Across Extrapolation Settings.

Author: Lionel Kielhöfer, Felix Mohr, and Jan N. van Rijn
Published: 2024
Full Text: View/download PDF

5. Accelerating Adversarially Robust Model Selection for Deep Neural Networks via Racing.

Author: Matthias König 0005, Holger H. Hoos, and Jan N. van Rijn
Published: 2024
Full Text: View/download PDF

6. Hyperparameter Importance of Quantum Neural Networks Across Small Datasets.

Author: Charles Moussa, Jan N. van Rijn, Thomas Bäck, and Vedran Dunjko
Published: 2022
Full Text: View/download PDF

7. Advances in Metalearning: ECML/PKDD Workshop on Meta-Knowledge Transfer.

Author: Pavel Brazdil, Jan N. van Rijn, Henry Gouk, and Felix Mohr
Published: 2022

8. LCDB 1.0: An Extensive Learning Curves Database for Classification Tasks.

Author: Felix Mohr, Tom J. Viering, Marco Loog, and Jan N. van Rijn
Published: 2022
Full Text: View/download PDF

9. Critically Assessing the State of the Art in CPU-based Local Robustness Verification.

Author: Matthias König 0005, Annelot W. Bosman, Holger H. Hoos, and Jan N. van Rijn
Published: 2023

10. Lessons learned from the NeurIPS 2021 MetaDL challenge: Backbone fine-tuning without episodic meta-learning dominates for few-shot learning image classification.

Author: Adrian El Baz, Ihsan Ullah, Edesio Alcobaça, André C. P. L. F. de Carvalho, Hong Chen, Fabio Ferreira, Henry Gouk, Chaoyu Guan, Isabelle Guyon, Timothy M. Hospedales, Shell Hu, Mike Huisman, Frank Hutter, Zhengying Liu, Felix Mohr, Ekrem Öztürk, Jan N. van Rijn, Haozhe Sun, Xin Wang 0019, and Wenwu Zhu 0001
Published: 2021

11. Automatic Human-Like Detection of Code Smells.

Author: Chitsutha Soomlek, Jan N. van Rijn, and Marcello M. Bonsangue
Published: 2021
Full Text: View/download PDF

12. Automated Machine Learning for Satellite Data: Integrating Remote Sensing Pre-trained Models into AutoML Systems.

Author: Nelly Rosaura Palacios Salinas, Mitra Baratchi, Jan N. van Rijn, and Andreas Vollrath
Published: 2021
Full Text: View/download PDF

13. Advances in MetaDL: AAAI 2021 Challenge and Workshop.

Author: Adrian El Baz, Isabelle Guyon, Zhengying Liu, Jan N. van Rijn, Sébastien Treguer, and Joaquin Vanschoren
Published: 2021

14. Eating Sound Dataset for 20 Food Types and Sound Classification Using Convolutional Neural Networks.

Author: Jeannette Shijie Ma, Marcello A. Gómez Maureira, and Jan N. van Rijn
Published: 2020
Full Text: View/download PDF

15. Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification.

Author: Ihsan Ullah, Dustin Carrión-Ojeda, Sergio Escalera, Isabelle Guyon, Mike Huisman, Felix Mohr, Jan N. van Rijn, Haozhe Sun, Joaquin Vanschoren, and Phan Anh Vu
Published: 2022

16. Hyperparameter Importance for Image Classification by Residual Neural Networks.

Author: Abhinav Sharma, Jan N. van Rijn, Frank Hutter, and Andreas Müller 0004
Published: 2019
Full Text: View/download PDF

17. OpenML Benchmarking Suites.

Author: Bernd Bischl, Giuseppe Casalicchio, Matthias Feurer, Pieter Gijsbers, Frank Hutter, Michel Lang, Rafael Gomes Mantovani, Jan N. van Rijn, and Joaquin Vanschoren
Published: 2021

18. Hyperparameter Importance Across Datasets.

Author: Jan N. van Rijn and Frank Hutter
Published: 2018
Full Text: View/download PDF

19. Don't Rule Out Simple Models Prematurely: A Large Scale Benchmark Comparing Linear and Non-linear Classifiers in OpenML.

Author: Benjamin Strang, Peter van der Putten, Jan N. van Rijn, and Frank Hutter
Published: 2018
Full Text: View/download PDF

20. Computing and Predicting Winning Hands in the Trick-Taking Game of Klaverjas.

Author: Jan N. van Rijn, Frank W. Takes, and Jonathan K. Vis
Published: 2018
Full Text: View/download PDF

21. Learning multiple defaults for machine learning algorithms.

Author: Florian Pfisterer, Jan N. van Rijn, Philipp Probst, Andreas C. Müller 0002, and Bernd Bischl
Published: 2021
Full Text: View/download PDF

22. Meta-learning for symbolic hyperparameter defaults.

Author: Pieter Gijsbers, Florian Pfisterer, Jan N. van Rijn, Bernd Bischl, and Joaquin Vanschoren
Published: 2021
Full Text: View/download PDF

23. An Empirical Study of Hyperparameter Importance Across Datasets.

Author: Jan N. van Rijn and Frank Hutter
Published: 2017

24. Open Algorithm Selection Challenge 2017: Setup and Scenarios.

Author: Marius Lindauer, Jan N. van Rijn, and Lars Kotthoff
Published: 2017

25. Does Feature Selection Improve Classification? A Large Scale Experiment in OpenML.

Author: Martijn J. Post, Peter van der Putten, and Jan N. van Rijn
Published: 2016
Full Text: View/download PDF

26. Taking machine learning research online with OpenML.

Author: Joaquin Vanschoren, Jan N. van Rijn, and Bernd Bischl
Published: 2015

27. Having a Blast: Meta-Learning and Heterogeneous Ensembles for Data Streams.

Author: Jan N. van Rijn, Geoffrey Holmes 0001, Bernhard Pfahringer, and Joaquin Vanschoren
Published: 2015
Full Text: View/download PDF

28. Sharing RapidMiner Workflows and Experiments with OpenML.

Author: Jan N. van Rijn and Joaquin Vanschoren
Published: 2015

29. Algorithm Selection via Meta-learning and Sample-based Active Testing.

Author: Salisu Mamman Abdulrahman, Pavel Brazdil, Jan N. van Rijn, and Joaquin Vanschoren
Published: 2015

30. Fast Algorithm Selection Using Learning Curves.

Author: Jan N. van Rijn, Salisu Mamman Abdulrahman, Pavel Brazdil, and Joaquin Vanschoren
Published: 2015
Full Text: View/download PDF

31. Stateless neural meta-learning using second-order gradients

Author: Mike Huisman, Aske Plaat, and Jan N. van Rijn
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Artificial Intelligence, Machine Learning (stat.ML), Software, Machine Learning (cs.LG)
Abstract: Meta-learning can be used to learn a good prior that facilitates quick learning; two popular approaches are MAML and the meta-learner LSTM. These two methods represent important and different approaches in meta-learning. In this work, we study the two and formally show that the meta-learner LSTM subsumes MAML, although MAML, which is in this sense less general, outperforms the other. We suggest the reason for this surprising performance gap is related to second-order gradients. We construct a new algorithm (named TURTLE) to gain more insight into the importance of second-order gradients. TURTLE is simpler than the meta-learner LSTM yet more expressive than MAML and outperforms both techniques at few-shot sine wave regression and 50% of the tested image classification settings (without any additional hyperparameter tuning) and is competitive otherwise, at a computational cost that is comparable to second-order MAML. We find that second-order gradients also significantly increase the accuracy of the meta-learner LSTM. When MAML was introduced, one of its remarkable features was the use of second-order gradients. Subsequent work focused on cheaper first-order approximations. On the basis of our findings, we argue for more attention for second-order gradients.
Published: 2022

32. Algorithm Selection on Data Streams.

Author: Jan N. van Rijn, Geoffrey Holmes 0001, Bernhard Pfahringer, and Joaquin Vanschoren
Published: 2014
Full Text: View/download PDF

33. LCDB 1.0: An Extensive Learning Curves Database for Classification Tasks

Author: Felix Mohr, Tom J. Viering, Marco Loog, and Jan N. van Rijn
Published: 2023

34. OpenML: A Collaborative Science Platform.

Author: Jan N. van Rijn, Bernd Bischl, Luís Torgo, Bo Gao 0002, Venkatesh Umaashankar, Simon Fischer 0001, Patrick Winter, Bernd Wiswedel, Michael R. Berthold, and Joaquin Vanschoren
Published: 2013
Full Text: View/download PDF

35. Dataset Characteristics (Metafeatures)

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter discusses dataset characteristics that play a crucial role in many metalearning systems. Typically, they help to restrict the search in a given configuration space. The basic characteristic of the target variable, for instance, determines the choice of the right approach. If it is numeric, it suggests that a suitable regression algorithm should be used, while if it is categorical, a classification algorithm should be used instead. This chapter provides an overview of different types of dataset characteristics, which are sometimes also referred to as metafeatures. These are of different types, and include so-called simple, statistical, information-theoretic, model-based, complexity-based, and performance-based metafeatures. The last group of characteristics has the advantage that it can be easily defined in any domain. These characteristics include, for instance, sampling landmarkers representing the performance of particular algorithms on samples of data, relative landmarkers capturing differences or ratios of performance values and providing estimates of performance gains. The final part of this chapter discusses the specific dataset characteristics used in different machine learning tasks, including classification, regression, time series, and clustering.
Published: 2022

36. Introduction

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter starts by describing the organization of the book, which consists of three parts. Part I discusses some basic concepts, including, for instance, what metalearning is and how it is related to automatic machine learning (AutoML). This continues with a presentation of the basic architecture of metalearning/AutoML systems, discussion of systems that exploit algorithm selection using prior metadata, methodology used in their evaluation, and different types of meta-level models, while mentioning the respective chapters where more details can be found. This part also includes discussion of methods used for hyperparameter optimization and workflow design. Part II includes the discussion of more advanced techniques and methods. The first chapter discusses the problem of setting up configuration spaces and conducting experiments. Subsequent chapters discuss different types of ensembles, metalearning in ensemble methods, algorithms used for data streams and transfer of meta-models across tasks. One chapter is dedicated to metalearning for deep neural networks. The last two chapters discuss the problem of automating various data science tasks and trying to design systems that are more complex. Part III is relatively short. It discusses repositories of metadata (including experimental results) and exemplifies what can be learned from this metadata by giving illustrative examples. The final chapter presents concluding remarks.
Published: 2022

37. Algorithm Recommendation for Data Streams

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter focuses on metalearning approaches that have been applied to data streams. This is an important area, as many real-world data arrive in the form of a stream of observations. We first review some important aspects of the data stream setting, which may involve online learning, non-stationarity, and concept drift.
Published: 2022

38. Metalearning in Ensemble Methods

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter discusses some approaches that exploit metalearning methods in ensemble learning. It starts by presenting a set of issues, such as the ensemble method used, which affect the process of ensemble learning and the resulting ensemble. In this chapter we discuss various lines of research that were followed. Some approaches seek an ensemble-based solution for the whole dataset, others for individual instances. Regarding the first group, we focus on metalearning in the construction, pruning and integration phase. Modeling the interdependence of models plays an important part in this process. In the second group, the dynamic selection of models is carried out for each instance. A separate section is dedicated to hierarchical ensembles and some methods used in their design. As this area involves potentially very large configuration spaces, recourse to advanced methods, including metalearning, is advantageous. It can be exploited to define the competence regions of different models and the dependencies between them.
Published: 2022

39. Concluding Remarks

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: As metaknowledge has a central role in many approaches discussed in this book, we address the issue of what kind of metaknowledge is used in different metalearning/AutoML tasks, such as algorithm selection, hyperparameter optimization, and workflow generation. We draw attention to the fact that some metaknowledge is acquired (learned) by the systems, while other metaknowledge is given (e.g., different aspects of the given configuration space). This chapter continues by discussing future challenges, such as how to achieve better integration of metalearning and AutoML approaches, and what kind of guidance could be provided by the system when configuring metalearning/AutoML systems to new settings. This task may involve (semi-)automatic reduction of configuration spaces to make the search more effective. The last part of this chapter discusses various challenges encountered when trying to automate different steps of data science.
Published: 2022

40. Evaluating Recommendations of Metalearning/AutoML Systems

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter discusses some typical approaches that are commonly used to evaluate metalearning and AutoML systems. This helps us to establish whether we can trust the recommendations provided by a particular system, and also provides a way of comparing different competing approaches. As the performance of algorithms may vary substantially across different tasks, it is often necessary to normalize the performance values first to make comparisons meaningful. This chapter discusses some common normalization methods used. As often a given metalearning system outputs a sequence of algorithms to test, we can study how similar this sequence is to the ideal sequence. This can be determined by looking at a degree of correlation between the two sequences. This chapter provides more details on this issue. One common way of comparing systems is by considering the effect of selecting different algorithms (workflows) on base-level performance and determining how the performance evolves with time. If the ideal performance is known, it is possible to calculate the value of performance loss. The loss curve shows how the loss evolves with time or what its value is at the maximum available time (i.e., the time budget) given beforehand. This chapter also describes the methodology that is commonly used in comparisons involving several metalearning/AutoML systems with recourse to statistical tests.
Published: 2022

41. Metalearning Approaches for Algorithm Selection II

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter discusses different types of metalearning models, including regression, classification and relative performance models. Regression models use a suitable regression algorithm, which is trained on the metadata and used to predict the performance of given base-level algorithms. The predictions can in turn be used to order the base-level algorithms and hence identify the best one. These models also play an important role in the search for the potentially best hyperparameter configuration discussed in the next chapter. Classification models identify which base-level algorithms are applicable or non-applicable to the target classification task. Probabilistic classifiers can be used to construct a ranking of potentially useful alternatives. Relative performance models exploit information regarding the relative performance of base-level models, which can be either in the form of rankings or pairwise comparisons. This chapter discusses various methods that use this information in the search for the potentially best algorithm for the target task.
Published: 2022

42. Automating Workflow/Pipeline Design

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter discusses the design of workflows (or pipelines), which represent solutions that involve more than one algorithm. This is motivated by the fact that many tasks require such solutions. This problem is non-trivial, as the number of possible workflows (and their configurations) can be rather large. This chapter discusses various methods that can be used to restrict the design options and thus reduce the size of the configuration space. These include, for instance, ontologies and context-free grammars. Each of these formalisms has its merits and shortcomings. Many platforms have resorted to planning systems that use operators. These can be designed to be in accordance with the given ontologies or grammars. As the search space may be rather large, it is important to leverage prior experience. This topic is addressed in one of the sections, which discusses rankings of plans that have proved to be useful in the past. The workflows/pipelines that have proved successful in the past can be retrieved and used as plans in future tasks. Thus, it is possible to exploit both planning and metalearning.
Published: 2022

43. Automating Data Science

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: It has been observed that, in data science, a great part of the effort usually goes into various preparatory steps that precede model-building. The aim of this chapter is to focus on some of these steps. A comprehensive description of a given task to be resolved is usually supplied by the domain expert. Techniques exist that can process natural language description to obtain task descriptors (e.g., keywords), determine the task type, the domain, and the goals. This in turn can be used to search for the required domain-specific knowledge appropriate for the given task. In some situations, the data required may not be available and a plan needs to be elaborated regarding how to get it. Although not much research has been done in this area so far, we expect that progress will be made in the future. In contrast to this, the area of preprocessing and transformation has been explored by various researchers. Methods exist for selection of instances and/or elimination of outliers, discretization and other kinds of transformations. This area is sometimes referred to as data wrangling. These transformations can be learned by exploiting existing machine learning techniques (e.g., learning by demonstration). The final part of this chapter discusses decisions regarding the appropriate level of detail (granularity) to be used in a given task. Although it is foreseeable that further progress could be made in this area, more work is needed to determine how to do this effectively.
Published: 2022

44. Metalearning for Hyperparameter Optimization

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter describes various approaches for the hyperparameter optimization (HPO) and combined algorithm selection and hyperparameter optimization problems (CASH). It starts by presenting some basic hyperparameter optimization methods, including grid search, random search, racing strategies, successive halving and hyperband. Next, it discusses Bayesian optimization, a technique that learns from the observed performance of previously tried hyperparameter settings on the current task. This knowledge is used to build a meta-model (surrogate model) that can be used to predict which unseen configurations may work better on that task. This part includes a description of sequential model-based optimization (SMBO). This chapter also covers metalearning techniques that extend the previously discussed optimization techniques with the ability to transfer knowledge across tasks. This includes techniques such as warm-starting the search, or transferring previously learned meta-models that were trained on prior (similar) tasks. A key question here is how to establish how similar prior tasks are to the new task. This can be done on the basis of past experiments, but can also exploit the information gained from recent experiments on the target task. This chapter presents an overview of some recent methods proposed in this area.
Published: 2022

45. Automating the Design of Complex Systems

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter discusses the issue of whether it is possible to automate the design of rather complex workflows needed when addressing more complex data science tasks. The focus here is on symbolic approaches, which continue to be relevant. The chapter starts by discussing some more complex operators, including, for instance, conditional operators and operators used in iterative processing. Next, we discuss the issue of introduction of new concepts and the changes of granularity that can be achieved as a result. We review various approaches explored in the past, such as constructive induction, propositionalization, reformulation of rules, among others, but also draw attention to some new advances, such as feature construction in deep NNs. It is foreseeable that in the future both symbolic and subsymbolic approaches will coexist in systems exhibiting a kind of functional symbiosis. There are tasks that cannot be learned in one go, but rather require a sub-division into subtasks, a plan for learning the constituents, and joining the parts together. Some of these subtasks may be interdependent. Some tasks may require an iterative process in the process of learning. This chapter discusses various examples that can stimulate both further research and some practical solutions in this rather challenging area.
Published: 2022

46. Setting Up Configuration Spaces and Experiments

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter discusses the issues relative to so-called configuration spaces that need to be set up before initiating the search for a solution. It starts by introducing some basic concepts, such as discrete and continuous subspaces. Then it discusses certain criteria that help us to determine whether the given configuration space is (or is not) adequate for the tasks at hand. One important topic which is addressed here is hyperparameter importance, as it helps us to determine which hyperparameters have a high influence on the performance and should therefore be optimized. This chapter also discusses some methods for reducing the configuration space. This is important as it can speed up the process of finding the potentially best workflow for the new task. One problem that current systems face nowadays is that the number of alternatives in a given configuration space can be so large that it is virtually impossible to gather complete metadata. This chapter discusses the issue of whether the system can still function satisfactorily even when the metadata is incomplete. The final part of this chapter discusses some strategies that can be used for gathering metadata that originated in the area of multi-armed bandits, including, for instance, SoftMax, upper confidence bound (UCB) and pricing strategies.
Published: 2022

47. Metalearning

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Published: 2022

48. Metadata Repositories

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter presents a review of online repositories where researchers can share data, code, and experiments. In particular, it covers OpenML, an online platform for sharing and organizing machine learning data automatically. OpenML contains thousands of datasets and algorithms, and millions of experimental results. We describe the basic philosophy involved, and its basic components: datasets, tasks, flows, setups, runs, and benchmark suites. OpenML has API bindings in various programming languages, making it easy for users to interact with the API in their native language. One important feature of OpenML is the integration into various machine learning toolboxes, such as Scikit-learn, Weka, and mlR. Users of these toolboxes can automatically upload all their results, leading to a large repository of experimental results.
Published: 2022

49. Learning from Metadata in Repositories

Author: Pavel Brazdil, Jan N. van Rijn, Carlos Soares, and Joaquin Vanschoren
Abstract: This chapter describes the various types of experiments that can be done with the vast amount of data, stored in experiment databases. We focus on three types of experiments done with the data stored in OpenML.
Published: 2022

50. Fast and Informative Model Selection using Learning Curve Cross-Validation

Author: Felix Mohr and Jan N. van Rijn
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computational Theory and Mathematics, Artificial Intelligence, Applied Mathematics, Computer Vision and Pattern Recognition, Software, Machine Learning (cs.LG)
Abstract: Common cross-validation (CV) methods like k-fold cross-validation or Monte-Carlo cross-validation estimate the predictive performance of a learner by repeatedly training it on a large portion of the given data and testing on the remaining data. These techniques have two major drawbacks. First, they can be unnecessarily slow on large datasets. Second, beyond an estimation of the final performance, they give almost no insights into the learning process of the validated algorithm. In this paper, we present a new approach for validation based on learning curves (LCCV). Instead of creating train-test splits with a large portion of training data, LCCV iteratively increases the number of instances used for training. In the context of model selection, it discards models that are very unlikely to become competitive. We run a large scale experiment on the 67 datasets from the AutoML benchmark and empirically show that in over 90% of the cases using LCCV leads to similar performance (at most 1.5% difference) as using 5/10-fold CV. However, it yields substantial runtime reductions of over 20% on average. Additionally, it provides important insights, which for example allow assessing the benefits of acquiring more data. These results are orthogonal to other advances in the field of AutoML.
Published: 2021
