fix: some errors on the fyp report

Andre Henriques 2024-01-03 14:55:38 +00:00
parent f3d8b02114
commit 3834b7b289

@ -67,72 +67,80 @@
The project aims to create a platform where users can create different types of classification models without the users having any knowledge of image classification.
\subsection{Project Objectives}
This project's primary objectives are to:
This project's primary objectives are to create:
\begin{itemize}
\item Create platform where the users can create and manage their models.
\item Create a system to automatically create and train models.
\item Create a system to automatically expand and reduce models without fully retraining the models.
\item Create an API so that users can interact programatically with the system.
\item a platform where the users can create and manage their models.
\item a system to automatically create and train models.
\item a system to automatically expand and reduce models without fully retraining the models.
\item an API so that users can interact programmatically with the system.
\end{itemize}
This project's extended objectives are to:
\begin{itemize}
\item Create a system to automatically merge modules to increase efficiency.
\item Create a system to distribute the load of training the model's among multiple services.
\item Create a system to distribute the load of training the models among multiple services.
\end{itemize}
\section{Literature and Techincal Review}
% 1 page of background and literature review. Here you will need to references things. Gamal et al.~\cite{gamal} introduce the concept of \ldots
\subsection{Intruduction}
This section reviews current existing thechnologies in the market that do image classification. It also reviews current image classification technologies, and which meats the requirements fot the project. This review also analysis methods that are use to distrubute the learning between various machines, and how to spread the load so miminum reloading of the models is required when running the model.
\section{Literature and Technical Review}
\subsection{Introduction}
This section reviews existing technologies in the market that do image classification. It also reviews current image classification technologies and which of them meet the requirements for the project. This review also analyses methods that are used to distribute the learning between various machines, and how to spread the load so that minimum reloading of the models is required when running the model.
\subsection{Current existing classification platforms}
There are currently some existing software as a service(SaaS) platfomrs that do provide similar services to the ones this will project will be providing.
There are currently some software as a service (SaaS) platforms that provide services similar to the ones this project will be providing.
%Amazon provides bespoque machine learning services that if were contacted would be able to provide image classification services. Amazon provides general machine learning services \cite{amazon-machine-learning}.
Amazon provides an image classification service called ''Rekognition`` \cite{amazon-rekognition}. This services provides multiple services from face regonition, celebrity regonition, object regonition and others. One of this services is called custom labels \cite{amazon-rekognition-custom-labels} which provides the most similiar service, to the one this project is about. The custom labels service allows the users to provide custom datasets and labels and using AutoML the rekognition service would generate a model that allows the users to classify images acording to the generated model.
Amazon provides an image classification service called ``Rekognition'' \cite{amazon-rekognition}. This service provides multiple features, including face recognition, celebrity recognition, object recognition and others. One of these features is called custom labels \cite{amazon-rekognition-custom-labels}, which provides the service most similar to the one this project is about. The custom labels service allows users to provide custom datasets and labels and, using AutoML, the Rekognition service generates a model that allows users to classify images according to the generated model.
The models generated using Amazon's rekognition dont provide ways to update the number of labels that were originaly created without generating a new project which will envolve retraining a large part of the model which would envolve large downtime between being able to add new classes. Training models also could take 30 minutes to 24 hours \cite{amazon-rekognition-custom-labels-training} which cloud result in up to 24 hours of lag between the need of creating a new label and beeing able to classify that label. A problem also arrises when the uses needs to add more than one label at the same time, for example the user sees the need to create a new label and starts a new model training, but while the model is traning a new label is also needed the user now either stops the training of the new model and retrains a new one or waits until the one currently running stops and trains a new one. If new classification classes are required with frequency this might not be the best platform to choose.
The models generated using Amazon's Rekognition do not provide ways to update the number of labels that were created without generating a new project. This involves retraining a large part of the model, which would cause a long downtime before new classes can be added. Training models can also take 30 minutes to 24 hours \cite{amazon-rekognition-custom-labels-training}, which could result in up to 24 hours of lag between the need for a new label and being able to classify that label. A problem also arises when the user needs to add more than one label at the same time. For example, the user sees the need to create a new label and starts training a new model, but while the model is training another new label is needed. The user now either stops the training of the new model and trains another one, or waits until the one currently running finishes and then trains a new one. If new classification classes are required frequently, this might not be the best platform to choose.
%https://aws.amazon.com/machine-learning/ml-use-cases/
%https://aws.amazon.com/rekognition/image-features/
Similarly Google also has ''Cloud Vision Api`` \cite{google-vision-api} which provides similiar services to Amazon's Rekognition. But Google's Vision Api apears to be more targetd at videos than images, as indicated by their proce sheet \cite{google-vision-price-sheet}. They have tag and product idetifiers, where every image only has one tag or product. The product identififer system seams to work diferently than the Amazon's regonition and worked based on K neighorings giving the user similar products on not classification labels \cite{google-vision-product-recognizer-guide}.
Similarly, Google also has ``Cloud Vision API'' \cite{google-vision-api}, which provides services similar to Amazon's Rekognition. However, Google's Vision API appears to be more targeted at videos than images, as indicated by their price sheet \cite{google-vision-price-sheet}. They have tag and product identifiers, where every image has only one tag or product. The product identifier system seems to work differently from Amazon's Rekognition: it appears to be based on K-nearest neighbours, giving the user similar products and not classification labels \cite{google-vision-product-recognizer-guide}.
This method is more effective at allowing users to add new types of products but as it does not give defined classes as the output the system does not give the target functionality that this project is hoping to achive.
This method is more effective at allowing users to add new types of products, but as it does not give defined classes as the output, it does not provide the functionality that this project is targeting.
\subsection{Requirements of the Image Classification Models}
The of the main ojectives of this project is to be able to create models that can give a class given an image for anydataset. Which means that there will be no ''one solution fits all to the problem``. While the most complex way to solve a problem would most likely result in success it might not be the most efficient way to achive the problem.
One of the main objectives of this project is to be able to create models that can give a class given an image for any dataset. This means that there will be no ``one solution fits all'' to the problem. While the most complex way to solve a problem would most likely result in success, it might not be the most efficient way to achieve what this project is trying to achieve.
This section will analyse possible models that would obtain the best results. The models for this project have to be the most effiecient as possible while resulting in the best accuracry as possible.
This section will analyse possible models that would obtain the best results. The models for this project have to be as efficient as possible while achieving the best possible accuracy.
A classical example is the MISNT Dataset \cite{mnist}. Models for the classfication of the mnist dataset can be both vary simple or extremely complex and achive diferent levels of complexity.
For example in \cite{mist-high-accuracy} a acurracy $99.91\%$, by combining 3 Convolutional Neural Networks(CNNs), with different kernel sizes and by changing hyperparameters, augmenting the data, and in \cite{lecun-98} an accuracy of $95\%$ was accived using a 2 layer neurual network with 300 hiden nodes. Both these models achive the accuracy that is required for this project but \cite{mist-high-accuracy} is more way more expensice to run. There when deciding when to choose the what models the create the system should chose to create the model that can achive the required accuracy while taking the leas amount of effort to train.
A classical example is the MNIST dataset \cite{mnist}. Models for the classification of the MNIST dataset can be either very simple or extremely complex, and achieve different levels of accuracy.
For example, in \cite{mist-high-accuracy} an accuracy of $99.91\%$ was achieved by combining 3 Convolutional Neural Networks (CNNs) with different kernel sizes, changing hyperparameters and augmenting the data, and in \cite{lecun-98} an accuracy of $95\%$ was achieved using a 2 layer neural network with 300 hidden nodes. Both these models achieve the accuracy that is required for this project, but \cite{mist-high-accuracy} is far more expensive to run. Therefore, when deciding which models to create, the system should choose the model that can achieve the required accuracy while taking the least amount of effort to train.
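The selection rule described above can be sketched as follows. This is only a minimal illustration, not the project's implementation: the \texttt{Candidate} type, the function name and the training costs are hypothetical, and only the two accuracies come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate model with its accuracy and training cost."""
    name: str
    accuracy: float       # validation accuracy in [0, 1]
    training_cost: float  # any consistent unit, e.g. GPU-hours (assumed)


def cheapest_sufficient(candidates: Sequence[Candidate],
                        required_accuracy: float) -> Optional[Candidate]:
    """Return the cheapest candidate that reaches the required accuracy, or None."""
    sufficient = [c for c in candidates if c.accuracy >= required_accuracy]
    if not sufficient:
        return None
    return min(sufficient, key=lambda c: c.training_cost)


# Accuracies are the ones quoted in the text; the costs are made-up placeholders.
candidates = [
    Candidate("cnn-ensemble", accuracy=0.9991, training_cost=100.0),
    Candidate("two-layer-mlp", accuracy=0.95, training_cost=1.0),
]
chosen = cheapest_sufficient(candidates, required_accuracy=0.95)
```

With a required accuracy of $95\%$ the cheaper two-layer network is selected; raising the requirement to $99\%$ would leave only the CNN ensemble.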
% TODO fix the inglish in these sentance
The models for this system to work as indented shold be as small as possible while obtaining the required accuracy required to achive the task of classification the classes.
For this system to work as intended, the models should be as small as possible while obtaining the accuracy required to achieve the task of classifying the classes.
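As a rough indication of what ``small'' means here, the number of trainable parameters of a fully connected network can be counted directly. The sketch below assumes the 2 layer network from \cite{lecun-98} takes flattened $28 \times 28$ MNIST images (784 inputs) and has 10 output classes; the function name is hypothetical.

```python
from typing import Sequence


def dense_parameter_count(layer_sizes: Sequence[int]) -> int:
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first and output last.
    """
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * outputs + outputs  # weight matrix plus bias vector
    return total


# 784 inputs, 300 hidden nodes, 10 output classes.
mlp_parameters = dense_parameter_count([784, 300, 10])
```

Under these assumptions the network has $784 \cdot 300 + 300 + 300 \cdot 10 + 10 = 238510$ parameters.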
\subsection{Method of image classification models}
There all multitple ways of creating of achiving image classification, the requirements of the system are that the system should return the class that an image that belongs to. Which means that we are going to be using superfised classification methods as this ones are the ones that meet the requirements of the system.
There are multiple ways of achieving image classification. The requirements of the system are that the system should return the class that an image belongs to, which means that we will be using supervised classification methods, as these are the ones that meet the requirements of the system.
% TODO find some papers to proff this
The system will use supervised models to classify images, using a combination of different types models, using neural networks, convulution neural networks, deed neural networks and deep convluution neural networks.
The system will use supervised models to classify images, using a combination of different types of models: neural networks, convolutional neural networks, deep neural networks and deep convolutional neural networks.
These types where chosen as they have had a large success in past in other image classification chalanges, for example in the imagenet chanlage \cite{imagenet}, which has ranked various different models in classifiying a 14 million images. The contest has been running since 2010 to 2017.
These types were chosen as they have had great success in the past in other image classification challenges, for example in the ImageNet challenge \cite{imagenet}, which has ranked different models in classifying a dataset of 14 million images. The contest ran from 2010 to 2017.
The models that participated in the contest tended to use more and more Deep convlution neural networks, out of various model that where generated there are a few landmark models that were able to acchive high acurracies, including AlexNet \cite{krizhevsky2012imagenet}, VVG, ResNet-152\cite{resnet-152}, EfficientNet\cite{efficientnet}.
The models that participated in the contest tended to use deep convolutional neural networks more and more. Out of the various models that were generated, there are a few landmark models that were able to achieve high accuracies, including AlexNet \cite{krizhevsky2012imagenet}, ResNet-152 \cite{resnet-152} and EfficientNet \cite{efficientnet}.
% TODO find vgg to cite
These models can used in two ways in the system, they can be used to generate the models via transferlearning and by using the model structure as a basis to generate a complete new model.
These models can be used in two ways in the system: they can be used to generate the models via transfer learning, or their structure can be used as a basis to generate a completely new model.
\subsection{Models as a basis}
% TODO compare the models
This section will compare the different models that did well in the ImageNet challenge.
AlexNet \cite{krizhevsky2012imagenet} is a deep convolutional neural network that was evaluated on the ImageNet LSVRC-2010 dataset, where it achieved a top-1 error rate of $37.5\%$ and a top-5 error rate of $17.0\%$. A variant of this model participated in the ImageNet LSVRC-2012 contest and achieved a top-5 error rate of $15.3\%$. The architecture of AlexNet consists of 5 convolutional layers followed by 3 dense layers, with some of the convolutional layers followed by max pooling. Training was done using two GPUs: each GPU ran part of each layer, and only some layers were connected across GPUs. To reduce overfitting, training also used label-preserving data augmentation and dropout.
While using AlexNet would probably yield the desired results, it would complicate the other parts of the service. As a platform as a service, the system needs to manage the amount of resources available, and requiring 2 GPUs to train a single model would halve the number of models that the system can train at the same time.
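For reference, the top-1 and top-5 error rates quoted for AlexNet measure how often the true label is missing from the $k$ highest-scoring predictions. The following is a minimal sketch of that metric; the function name and the example scores are made up for illustration.

```python
from typing import Sequence


def top_k_error(scores: Sequence[Sequence[float]],
                labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is not among the k highest scores."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and of equal length")
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Made-up scores for three images over four classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: found by top-1
    [0.5, 0.1, 0.3, 0.1],  # true class 2: missed by top-1, found by top-2
    [0.4, 0.3, 0.2, 0.1],  # true class 3: missed by top-1 and top-2
]
labels = [1, 2, 3]
top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
```

The top-$k$ error can only decrease as $k$ grows, which is why a top-5 error rate is always at most the top-1 error rate of the same model.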
% TODO talk more about this
% RestNet-152
% EddicientNet
\subsection{Efficiency of transfer learning}
\subsection{Creation Models}
@ -142,7 +150,7 @@
The models will be created using TensorFlow \cite{tensorflow2015-whitepaper} and Keras \cite{chollet2015keras}. These technologies are chosen since they are both robust and used in industry.
\subsection{Expandable Models}
The current most used approach for expanding a CNN model is to retrain the model. This is done by, recreating an entire new model that does the new task, using the older model as a base for the new model\cite{amazon-rekognition}, or using a pretrained model as a base and training the last few layers.
The most used current approach for expanding a CNN model is to retrain the model. This is done either by recreating an entirely new model that does the new task, using the older model as a base for the new model \cite{amazon-rekognition}, or by using a pretrained model as a base and training the last few layers.
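To make the ``training the last few layers'' option more concrete, the sketch below shows one possible way to extend only the final dense layer with a new class before fine-tuning it. This is an illustrative assumption, not the approach of the cited service and not necessarily the one this project will use; the function name, the initialisation range and the example weights are hypothetical.

```python
import random
from typing import List, Tuple


def expand_output_layer(weights: List[List[float]], biases: List[float],
                        new_classes: int,
                        seed: int = 0) -> Tuple[List[List[float]], List[float]]:
    """Return copies of a dense output layer with extra class rows appended.

    weights has one row per class. Existing rows are copied unchanged, so the
    logits of the old classes stay the same until fine-tuning starts.
    """
    rng = random.Random(seed)
    n_features = len(weights[0])
    new_weights = [row[:] for row in weights]
    new_biases = biases[:]
    for _ in range(new_classes):
        # Small random initialisation for each new class (an assumed choice).
        new_weights.append([rng.uniform(-0.01, 0.01) for _ in range(n_features)])
        new_biases.append(0.0)
    return new_weights, new_biases


# A two-class output layer over three features, extended to three classes.
old_w = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]
old_b = [0.1, -0.1]
new_w, new_b = expand_output_layer(old_w, old_b, new_classes=1)
```

Only the appended rows, and optionally the last few layers before them, would then need training, while the pretrained base stays fixed.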
There are also unsupervised learning methods that do not have a fixed number of classes. While this method would work as an expandable model method, it would not work for the purpose of this project. This project requires that the model has a specific set of labels, which does not work with unsupervised learning, which uses unlabelled data. Some techniques that are used for unsupervised learning might be useful in the process of creating expandable models.