chore: more work done

Andre Henriques 2024-02-29 15:38:41 +00:00
parent 4b48afaffb
commit 3a6220eaf4


@ -135,7 +135,7 @@
%Amazon provides bespoke machine learning services that, if contacted, would be able to provide image classification services. Amazon provides general machine learning services \cite{amazon-machine-learning}.
Amazon provides an image classification service called ``Rekognition'' \cite{amazon-rekognition}. This service provides multiple services from face recognition, celebrity recognition, object recognition and others. One of these services is called custom labels \cite{amazon-rekognition-custom-labels} that provide the most similar service, to the one this project is about. The custom labels service allows the users to provide custom datasets and labels and using AutoML the Rekognition service would generate a model that allows the users to classify images according to the generated model.
Amazon provides an image classification service called ``Rekognition'' \cite{amazon-rekognition}. This service offers multiple features, including face recognition, celebrity recognition, and object recognition. One of these features, called custom labels \cite{amazon-rekognition-custom-labels}, is the most similar to the service this project is about. The custom labels feature allows users to provide custom datasets and labels; using AutoML, the Rekognition service then generates a model that allows the users to classify images.
The models generated using Amazon's Rekognition do not provide a way to change the number of labels without generating a new project. Doing so involves retraining a large part of the model, which means a long wait before new classes can be used. Training a model can also take from 30 minutes to 24 hours \cite{amazon-rekognition-custom-labels-training}, which could result in up to 24 hours of lag between needing a new label and being able to classify that label. A problem also arises when the users need to add more than one label at around the same time. For example, the user sees the need for a new label and starts training a new model, but while that model is training another label is needed. The user must now either stop the current training and start again, or wait until the current training finishes and then train yet another model. If new classes are required frequently, this might not be the best platform to choose.
@ -173,7 +173,7 @@
These types were chosen because they have been highly successful in past image classification challenges, for example the ImageNet challenges \cite{imagenet}, which ranked different models on classifying a dataset of 14 million images. The contest ran from 2010 to 2017.
The models that participated in the contest tended to use more and more Deep convolution neural networks, out of various models that were generated there are a few landmark models that were able to achieve high accuracies, including AlexNet \cite{krizhevsky2012imagenet}, ResNet-152 \cite{resnet-152}, EfficientNet \cite{efficientnet}.
The models that participated in the contest increasingly used deep convolutional neural networks. Out of the various models that were developed, a few landmark models were able to achieve high accuracies, including AlexNet \cite{krizhevsky2012imagenet}, ResNet-152 \cite{resnet-152}, and EfficientNet \cite{efficientnet}.
% TODO find vgg to cite
These models can be used in two ways in the system: to generate models via transfer learning, or as a structural basis for generating a completely new model.
@ -196,9 +196,9 @@
% MobileNet
% EfficientNet
EfficientNet \cite{efficient-net} is a deep convolution neural network that was able to achieve $84.3\%$ top-1 accuracy while ``$8.4x$ smaller and $6.1x$ faster on inference than the best existing ConvNet''. EfficientNets\footnote{the family of models that use the thecniques that described in \cite{efficient-net}} are models that instead of the of just increasing the depth or the width of the model, we increase all the parameters at the same time by a constant value. By not scaling only depth EfficientNets can acquire more information about the images specially the image size is considered.
To test their results, the EfficientNet team created a baseline model which as a building block used the mobile inverted bottleneck MBConv \cite{inverted-bottleneck-mobilenet}. The baseline model was then scaled using the compound method which resulted in better top-1 and top-5 accuracy.
While EfficientNets are smaller than their non-EfficientNet counterparts they are more computational intensive, a ResNet-50 scaled using the EfficientNet compound scaling method is $3\%$ more computational intensive than a ResNet-50 scaled using only depth while only improving the top-1 accuracy by $0.7\%$, and as the model will be trained and run multiple times decreasing the computational cost might be a better overall target for sustainability then being able to offer higher accuracies.
EfficientNet \cite{efficient-net} is a deep convolutional neural network that was able to achieve $84.3\%$ top-1 accuracy while being ``$8.4x$ smaller and $6.1x$ faster on inference than the best existing ConvNet''. EfficientNets \footnote{The family of models that use the techniques described in \cite{efficient-net}.} are models in which, instead of increasing only the depth or the width of the model, all the dimensions (depth, width and image resolution) are increased at the same time by a constant ratio. By not scaling only depth, EfficientNets can capture more information about the images, especially when the image size is taken into account.
To test their results, the EfficientNet team created a baseline model that uses the mobile inverted bottleneck MBConv \cite{inverted-bottleneck-mobilenet} as its building block. The baseline model was then scaled using the compound method, which resulted in better top-1 and top-5 accuracy.
While EfficientNets are smaller than their non-EfficientNet counterparts, they are more computationally intensive. A ResNet-50 scaled using the EfficientNet compound scaling method is $3\%$ more computationally intensive than a ResNet-50 scaled using only depth, while improving the top-1 accuracy by $0.7\%$. As the model will be trained and run multiple times, decreasing the computational cost might be a better overall target for sustainability than being able to offer higher accuracies.
Even though scaling with the EfficientNet compound method might not yield the best results, using some of the EfficientNets that were optimized by the team would be optimal. For example, EfficientNet-B1 is both small and efficient while still obtaining $79.1\%$ top-1 accuracy on ImageNet, and realistically the datasets that this system will process will be smaller and more narrowly scoped than ImageNet.
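The compound scaling rule referred to above can be sketched as follows, based on \cite{efficient-net}. A single compound coefficient $\phi$ scales the network depth $d$, width $w$ and input resolution $r$ together:
\[
d = \alpha^{\phi}, \qquad w = \beta^{\phi}, \qquad r = \gamma^{\phi},
\]
subject to
\[
\alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \qquad \alpha \geq 1,\ \beta \geq 1,\ \gamma \geq 1 .
\]
The constants $\alpha$, $\beta$ and $\gamma$ are determined by a small grid search on the baseline model, and under this constraint the computational cost (FLOPS) grows by approximately $2^{\phi}$. For the EfficientNet-B0 baseline the paper reports $\alpha = 1.2$, $\beta = 1.1$ and $\gamma = 1.15$, which gives $1.2 \cdot 1.1^{2} \cdot 1.15^{2} \approx 1.92$.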
% \subsection{Efficiency of transfer learning}
@ -230,22 +230,22 @@
\end{itemize}
\subsection{Overall structure}
The system needs to have some level of distributivity, this requirement exists because the expensive nature of machine learning training.
The system needs to be distributed to some degree; this requirement exists because of the expensive nature of machine learning training.
It would be unwise to perform machine learning training on the same machine that the main web server is running on, as it would starve that server of resources.
\subsection{Resources}
The system has to manage what servers are available to do machine learning tasks.
The system needs to be aware and manage all GPU servers, servers that have GPUs available, and run the possible models.
The system has to be aware of and manage all GPU servers, that is, servers that have GPUs available and can run the models.
\subsection{Web platform}
The web app is where users manage models and data. The user will access the web app to configure the model and manage its data set.
\subsection{JSON API}
A big part of a SaaS is the ability to communicate with other services, nowadays, the way that systems communicate with each other is using mostly JSON and Rest APIs \cite{json-api-usage-stats}. Since the system will need to be communicated with other services to work as intended.
A big part of a SaaS is the ability to communicate with other services. Nowadays, systems communicate with each other mostly through JSON and REST APIs \cite{json-api-usage-stats}. Since the system will need to communicate with other services to work as intended, it has to provide a JSON API.
\subsection{Server Management}
Since AI training is notoriously expensive, the system cannot run on one server alone, as this would put too much strain in that server.
Since AI training is notoriously expensive, the system cannot run on one server alone, as this would put too much strain on that server.
The system needs to be able to distribute the load between the multiple servers.
For that reason, the service needs to be able to send both training and prediction jobs to servers that have the resources to train models or predict classes from images.
@ -259,7 +259,7 @@
\subsection{Model Management}
Once the model has been created, the system has to keep track of the model, as well as the actual accuracy of the model.
It has to keep track of how much the model used so it can distribute the load from in different GPU servers.
It has to record how much the model is used so it can distribute the load across different GPU servers.
\pagebreak
@ -285,9 +285,9 @@
\item{The user requests the classification or confirmation of an image}
\end{itemize}
\subsection{Webapp}
\subsection{Web app}
The goal of the project is to provide a software as a service platform for classification tasks with that in mind the service needs to have a way of controlling it. This will be achieved with a web interface.
The goal of the project is to provide a software-as-a-service platform for classification tasks. With that in mind, the service needs a way for users to control it. This will be achieved with a web interface.
The web interface will have to manage:
\begin{itemize}
@ -323,7 +323,7 @@
When training the model, the TP decides when the training is finished; this could be when the training time has run out or when the model accuracy has not substantially increased over the last training rounds.
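As an illustrative sketch of such a stopping rule (the symbols below are assumptions introduced here, not part of the system's specification), let $a_t$ be the validation accuracy after training round $t$, $T_t$ the elapsed training time, $T_{\max}$ the time budget, $k$ the number of recent rounds considered, and $\epsilon$ the smallest improvement regarded as substantial. Training would then stop at the first round $t > k$ for which
\[
T_t \geq T_{\max} \quad \mathrm{or} \quad a_t - a_{t-k} < \epsilon .
\]
For example, with $k = 3$ and $\epsilon = 0.005$, accuracies of $0.900$, $0.902$, $0.903$ and $0.904$ over four consecutive rounds give $a_4 - a_1 = 0.004 < \epsilon$, so training would stop after the fourth round.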
During the training process the TP needs to cache the dataset being used, this is because to create one model, the system might have to generate and train more than one model, during this process if the dataset is not cached then time is spent reloading the dataset into memory.
During the training process, the TP needs to cache the dataset being used. This is because, to create one model, the system might have to generate and train more than one model; if the dataset is not cached, time is spent reloading the dataset into memory each time.
\pagebreak