A consistent evaluation approach for ML models

This thesis focuses on identifying a consistent approach to evaluating machine learning (ML) models. With the increasing adoption of machine learning models across domains, evaluating their accuracy and performance is critical to their success; inconsistent evaluation methods, however, can lead to unreliable performance metrics and subpar models. The thesis examines the process of building machine learning models and proposes process flow diagrams for the activities performed while building and evaluating them. It also identifies variations in evaluation techniques and data splitting, common issues, and best practices for model evaluation. A design for an Evaluation service is proposed that accommodates these variations while remaining suitable for inclusion in the machine learning process. The aim is to provide an understanding of the evaluation process that ensures consistency in evaluation across different models and datasets for a given machine learning task.
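
To make the idea of consistent evaluation concrete, the sketch below shows one hypothetical way to apply an identical data split and an identical metric set to several candidate models so their results stay comparable. The function name evaluate_consistently, the chosen models, and the scikit-learn dataset are illustrative assumptions only and do not represent the Evaluation service designed in the thesis.

```python
# Illustrative sketch only: apply the same split and the same metrics to every
# candidate model so the reported numbers are directly comparable.
# The helper name `evaluate_consistently` is a hypothetical example.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score


def evaluate_consistently(models, X, y, seed=42):
    """Evaluate all models on an identical split with an identical metric set."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        preds = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, preds),
            "f1": f1_score(y_test, preds),
        }
    return results


X, y = load_breast_cancer(return_X_y=True)
models = {
    "logreg": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=42),
}
print(evaluate_consistently(models, X, y))
```

Fixing the split seed and the metric set up front, as in this sketch, is one simple way to avoid the inconsistencies the thesis highlights when each model is evaluated with its own ad hoc procedure.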

Project information

Status: Finished

Thesis for degree: Master

Student: Arpan Sur

Supervisor:

Part of research project: SE4ML - Processes, People and Tools

Id: 2023-005