Automatic analysis of facial images is a problem of paramount importance because of its applications in numerous real-world scenarios, including security, entertainment, medicine, health care, multimedia, human-computer interaction, and human-robot interaction. Arguably the most important step of an automatic face analysis system is the localization of the facial landmarks, since it has a crucial impact on the robustness and accuracy of the overall system. Facial landmarks localization is a very challenging Computer Vision problem, because the face is a highly deformable object whose appearance changes drastically under different poses, expressions, and illumination conditions. Recently, Computer Vision has witnessed great research advances towards automatic facial landmarks localization. Numerous methodologies have been proposed during the last few years that achieve accurate and efficient performance. The most successful methods are based on statistical deformable models. Developing powerful facial deformable models requires massive, annotated facial databases on which techniques can be trained, validated, and tested. Over the past twenty years, the research community has collected and annotated a number of facial databases captured under both constrained and unconstrained (in-the-wild) conditions. However, the existing facial databases cannot be utilised for training the aforementioned models because of several limitations of the provided annotations. More specifically, most databases have been annotated using different mark-ups, and in most cases the accuracy of the provided annotations is low. In addition to these problems, the use of different training/testing sets and different error metrics makes a fair comparison between the existing methodologies almost infeasible.
In this Thesis, we first aim to overcome the aforementioned problems by (a) proposing a semi-automatic annotation technique that was employed to re-annotate most existing facial databases under a unified protocol, and (b) presenting the 300 Faces In-The-Wild Challenge (300-W), the first facial landmark localization challenge, which was organized twice, in 2013 and 2015. This is the first effort towards a unified annotation scheme of massive databases and a fair experimental comparison of existing facial landmarks localization systems. Tracking the papers published in recent Computer Vision conferences shows that the produced annotations have allowed researchers to propose powerful generic facial deformable models for facial landmarks localization in still images. Nevertheless, when it comes to applications that require near-perfect facial landmarks localization and tracking accuracy, such as the analysis of human facial behaviour and facial motion capture, generic facial deformable models may be insufficient. In such cases, person-specific facial deformable models are mainly employed, which require manual annotation of facial landmarks for each person and subsequent person-specific training. In this Thesis, a novel method for the automatic construction of person-specific facial deformable models is proposed. To this end, an orthonormal subspace that is suitable for facial image reconstruction is learned. Next, to correct the erroneous fittings produced by a generic facial deformable model, image congealing (i.e., ensemble image alignment) is performed by employing only the learned orthonormal subspace. The image congealing problem is solved by formulating a suitable sparsity-regularized rank minimization problem. After the fittings are corrected, the person-specific facial deformable model is constructed; it can then be used to localize or track the facial landmarks in images that depict the same subject.
This constitutes another contribution of this Thesis. After a generic or person-specific facial deformable model has been applied to a still image or a sequence of facial images, the next step of an automatic face analysis system is to remove the effect of pose from the faces. To do so, landmark-driven normalization (i.e., warping) of the faces into a common frame (e.g., a frontal-view frame) is performed. However, most face normalization (pose correction) methods can be greatly affected by large poses, illumination variations, occlusions, and poor localization of the facial landmarks. A final, significant contribution of this Thesis is the development of a novel method, robust to the aforementioned problems, for joint face frontalization (i.e., pose correction) and facial landmarks localization. Unlike the state-of-the-art methods for facial landmarks localization and pose correction, which require a large number of manually annotated images or 3D facial models, the proposed method relies only on a small set of frontal images. By observing that the frontal facial image, of both humans and animals, is the one with the minimum rank among all poses, a model that jointly recovers the frontalized version of the face and the facial landmarks is devised. To this end, we solve an optimization problem that minimizes the nuclear norm together with the matrix ℓ1 norm, which accounts for occlusions. The method is assessed on frontal view reconstruction of human and animal faces, landmark localization, pose-invariant face recognition, face verification in unconstrained conditions, and video in-painting by conducting experiments on nine databases. The experimental results demonstrate the effectiveness of the proposed method in comparison with the state-of-the-art methods for the target problems.
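
The combination of a nuclear norm and a matrix ℓ1 norm can be illustrated with a minimal sketch. The Python code below is not the method proposed in this Thesis: it omits warping and landmark estimation entirely and shows only generic robust principal component analysis, used here as a stand-in, which splits a data matrix into a low-rank part and a sparse part by minimizing the nuclear norm of the former plus a weighted ℓ1 norm of the latter. The function names, the default weight lam, and the penalty schedule (mu, rho) are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: the proximal operator of tau * (l1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(x, tau):
    """Shrink the singular values: the proximal operator of tau * (nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def robust_pca(m, lam=None, rho=1.5, n_iter=200, tol=1e-7):
    """Approximately solve  min ||low||_* + lam * ||sparse||_1  s.t.  low + sparse = m.

    Generic augmented-Lagrangian iteration, assumed here for illustration only.
    """
    m = np.asarray(m, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(m.shape))  # common default weight (assumption)
    mu = 1.25 / np.linalg.norm(m, 2)       # initial penalty (assumption); m must be nonzero
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    for _ in range(n_iter):
        # Low-rank update: singular value thresholding.
        low = singular_value_threshold(m - sparse + dual / mu, 1.0 / mu)
        # Sparse update: elementwise soft thresholding.
        sparse = soft_threshold(m - low + dual / mu, lam / mu)
        # Dual ascent on the constraint low + sparse = m, then tighten the penalty.
        residual = m - low - sparse
        dual = dual + mu * residual
        mu *= rho
        if np.linalg.norm(residual) <= tol * np.linalg.norm(m):
            break
    return low, sparse


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))  # rank-2 matrix
    corrupted = clean.copy()
    mask = rng.random(clean.shape) < 0.05  # roughly 5% gross errors
    corrupted[mask] += 10.0
    low, sparse = robust_pca(corrupted)
    print("constraint residual:", np.linalg.norm(corrupted - low - sparse))
    print("relative error of low-rank part:",
          np.linalg.norm(low - clean) / np.linalg.norm(clean))
```

In this toy setting, the low-rank term stands in for a clean ensemble of images and the sparse term absorbs gross errors such as occlusions; how well the two parts are separated depends on the data and on the choice of lam.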