In recent years, emerging passively collected data sources have increasingly opened new avenues for research owing to their high spatio-temporal granularity. Datasets captured from GPS, mobile phones, location-enabled social media, smart-card payments, etc., which fall under the general term Big Data, can provide mobility information at an unprecedented volume, velocity and variety, even if they were not initially collected with that purpose in mind. Despite these benefits, however, they also pose new challenges in terms of their analysis and of the skills and knowledge required to derive value from them, which span different scientific fields. On the one hand, this has largely limited their use to aggregate descriptive analyses and the inference of general insights about mobility behaviour. Those studies offer valuable information and insightful comparisons with traditional Revealed Preference data; however, their findings are difficult to generalise and are therefore poorly suited to policy analysis. A significant reason for this is that contextual information is often missing, namely the trip makers' sociodemographic characteristics and the observed mode and trip purpose, all of which are important inputs to any model of disaggregate mobility behaviour. On the other hand, even in the presence of additional semantic information, the overall complexity of those datasets has proven challenging for traditional econometric specifications. This has led to the increasing popularity of Machine Learning, which generally excels at identifying patterns within complex datasets. Nonetheless, Machine Learning methods, typically based on non-parametric algorithms, have limited ability to provide insights useful for policy analysis, which hinders their adoption in real-world policy making.
Furthermore, the few existing case studies of behavioural modelling using emerging datasets do not provide any systematic comparison with traditional data sources that would allow the benefits and drawbacks of both data collection methods to be properly assessed. The current thesis aims to address these three overarching literature gaps using a specific form of emerging dataset, namely a semi-passive GPS-based trip diary collected from a sample of recruited participants. The collected dataset was complemented with a background household survey capturing the individuals' important sociodemographic information, and with minimal input from the trip makers regarding the chosen mode of each trip and the activity purpose at the destination. Further enhancing that GPS trip diary with data derived from APIs and other openly available data sources can be sufficient to make the dataset usable for estimating a behavioural model. The studies presented in this thesis outline a general methodology for transforming the initial GPS traces into useful inputs for behavioural models, which are able to uncover realistic sensitivities and trade-offs consistent with those already proposed in the literature. More specifically, a mode choice model estimated on such a dataset yields behaviourally accurate Value of Travel Time estimates that are similar to the official values currently used for appraisal in the UK, which are derived from large-scale Stated Preference studies. In addition, a range of studies is proposed to address common research questions in spatial choice models, such as sampling of alternatives to reduce computational complexity, accounting for the latent nature of the consideration choice set, and capturing spatial correlation among alternatives; all of these studies also implement geography-derived concepts.
Furthermore, the current thesis focuses on the integration of Machine Learning and Choice Modelling, rather than their comparison. It proposes a combined framework in which Machine Learning is used to identify patterns in the dataset, while Choice Modelling is used to understand individual mode and location choice behaviour. Taking advantage of the strengths of both approaches in this way offers significant improvements over traditional econometric specifications without compromising the microeconomic interpretation of the outputs. The overall purpose of the current thesis is to enhance the confidence of the research community in the use of similar emerging datasets and to promote the integration of data-driven and econometric specifications. This is expected to provide useful insights for extending the discussion towards more advanced combined specifications for capturing individual behaviour.