more work done on the report

This commit is contained in:
Andre Henriques 2024-04-29 20:59:12 +01:00
parent 15d4de3b45
commit b5f61c5e20


The design in this section is an ideal design solution, in which no time or engineering limitations were considered.
This section tries to provide a description of a designed solution that would allow for the best user experience possible.
The design proposed in this section can be viewed as a scoped version of this project, and the \hyperref[sec:si]{Service Implementation} section will discuss how the scope was limited so that the service would achieve the primary goals of the project while following the design.
\subsection{Structure of the Service}
This structure was selected because it allows separation of concerns to happen based on the resources required by each layer.
The presentation layer requires interactivity with the user, and therefore it needs to be accessible from the outside and be simple to use.
The presentation layer can either be implemented as a webpage working directly on the server that is running the API, or as a separate web app that uses the API to interface directly.
The API layer is one of the most important parts of the service, as it is going to be the most used way to interact with the service.
The user can use the API to control their entire model process from importing, to classification of images.
The Worker layer consists of a set of servers available to perform GPU workloads.
It makes the most sense for the application to be a web application, since this project is software as a service.
Most of the interactions will be made programmatically by the users' services via the API, which will be an HTTPS REST API.
Independently of the kind of application, it needs to allow users to fully control their data in an easy to use and understandable way.
The application should allow users to:
%TODO add more
\begin{multicols}{2}
\end{multicols}
The API should be implemented to be able to handle large amounts of simultaneous requests, and respond to those requests as fast as possible.
The API should be implemented such that it can be expanded easily, so that future improvements can happen.
The API should be consistent and easy to use; information on how to use the API should also be available to potential users.
\subsection{Models Training}
% The Training process follows % TODO have a flow diagram
Model training should be independent of image classification; training a model should not affect any ongoing classification. The system could use multiple ways to achieve this, such as:
\begin{multicols}{2} % TODO think of more ways
\begin{itemize}
\subsection{Structure of the Service}
The structure of the service matches the designed structure, as can be seen in \ref{fig:simplified_service_diagram}.
\begin{figure}[h!]
\centering
The web application uses the API to control the functionality of the service; this design is advantageous.
It allows users of the application to do everything that the application does with the API, which is ideal in a SaaS project.
\subsubsection*{Service authentication} \label{sec:impl:service-auth}
\begin{figure}[h!]
\centering
The steps are very similar to the normal model management.
The user would follow all the steps that are required for normal model creation and training.
At the end of the process, the user will be able to add new data to the model and retrain it.
To achieve that, the user would simply go to the data tab and create a new class.
Once a new class is added, the webpage will inform the user that the model can be retrained.
The user might choose to retrain the model now, or add more new classes and retrain later.
During the entire process of creating new classes in the model and retraining it, the user can still perform all the classification tasks they desire.
\subsubsection*{Task Management}
Task management is the section of the website where users can manage their tasks, which include training and classification tasks.
In this tab, users can see the progress and results of their tasks.
The webpage also provides easy-to-read statistics on the task results, allowing the user to see how the model is performing.
\textbf{TODO add image}
On the administrator side, users are able to change the status of tasks, as well as see a more comprehensive view of how the tasks are being performed.
Administrator users can see the current status of runners, as well as which task each runner is performing.
\textbf{TODO add image}
\subsection{API}
As a software as a service, one of the main requirements is to be able to communicate with other services.
The API provides the simplest way for other services to interact with this service.
The API was implemented as a multithreaded Go \cite{go} server.
On launch, the application loads a configuration file and connects to the database.
After connecting to the database, the application performs pre-startup checks to make sure that tasks interrupted by a server restart were not left in an unrecoverable state.
Once the checks are done, the application creates workers (which will be covered in the next subsection); when that is completed, the API server is finally started.
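The startup order above can be sketched as follows; the function names (loadConfig, connectDB, recoverInterruptedTasks, createWorkers) are illustrative stand-ins rather than the project's actual identifiers.

```go
package main

import (
	"fmt"
	"log"
)

type Config struct{ Addr, DBURL string }

func loadConfig() Config { return Config{Addr: ":8080", DBURL: "postgres://localhost/svc"} }

// connectDB stands in for opening the real database connection.
func connectDB(url string) error { return nil }

// recoverInterruptedTasks resets tasks that a restart left mid-run so
// they can be picked up again rather than staying in a stuck state.
func recoverInterruptedTasks() int { return 0 }

func createWorkers(n int) []string {
	ws := make([]string, n)
	for i := range ws {
		ws[i] = fmt.Sprintf("worker-%d", i)
	}
	return ws
}

// startup runs the pre-start steps in the order described in the text:
// config, database, recovery checks, then workers.
func startup() (Config, []string, error) {
	cfg := loadConfig()
	if err := connectDB(cfg.DBURL); err != nil {
		return cfg, nil, err
	}
	log.Printf("recovered %d interrupted tasks", recoverInterruptedTasks())
	return cfg, createWorkers(4), nil
}

func main() {
	cfg, workers, err := startup()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("ready on %s with %d workers\n", cfg.Addr, len(workers))
	// http.ListenAndServe(cfg.Addr, mux) would only run at this point,
	// after the checks and workers are in place.
}
```

The point of the ordering is that the HTTP server never accepts requests before the recovery checks and workers exist.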
The API provides various HTTPS REST JSON endpoints that allow any user of the service to fully control their models using only the API.
Information about the API is shown around the web page, so that the user can see information about the API right next to where they would normally perform the action, providing a good user experience.
\textbf{TODO add image}
\subsubsection*{Implementation Details}
A Go \cite{go} HTTP server runs the application.
This server takes JSON and multipart form data requests; the requests are processed and answered with a JSON response.
The multipart requests are required due to JSON's inability to transmit binary data, which would make the uploading of images extremely inefficient.
Those images would have to be transformed into binary data and then uploaded as a byte array, or encoded as base64 and uploaded.
Either of those options is extremely inefficient.
Therefore, multipart form requests are used to allow the easy uploading of binary files.
Go was selected as the language to implement the backend due to several of its advantages.
Go has extremely fast compilation, which allows for rapid development and iteration.
It has a very minimal runtime, which allows it to be faster than heavy-runtime languages such as JavaScript.
It is also a simple language, which helps maintain the codebase.
% TODO cite cgo tensorflow and torch
The Go language integrates well with C libraries, which gives it access to machine learning libraries such as TensorFlow or LibTorch.
\subsubsection*{Authentication}
For a user to be authenticated with the server, they must first log in.
The API allows users to log in, which emits a token, and to manually create tokens.
While using the web application, this is done transparently, but it can also be manually done via the respective API calls.
During the login process, the service checks to see if the user is registered and if the password provided during the login matches the stored hash.
Upon verifying the user, a token is emitted.
The tokens can also be created in the settings page.
The advantage of creating the tokens in the settings page is that they are named, and their lifetime is controllable.
Once a user is logged in, they can then create more tokens as seen in \ref{sec:impl:service-auth}.
While using the API, the user should only use tokens created in the settings page, as those tokens are named and have controllable expiration dates.
This is advantageous from a security perspective, as the user can manage who has access to the API.
If a token gets leaked, the user can then proceed to delete the named token, to guarantee the safety of their access.
The token can then be used in the ``token'' header as proof to the API that the user is authenticated.
\subsection{Generation and Training of Models}
To be able to provide specialized models to the user, the service first needs to generate those models.
Model generation happens in the API server; the server analyses the images provided and generates several model candidates accordingly.
The number of model candidates is user defined.
The model generation subsystem decides the structure of the model candidates based on the image size; it prioritizes smaller models for smaller images and convolutional networks for bigger images.
The depth is controlled both by the image size and the number of outputs; model candidates that need to be expandable are generated with bigger values to account for possible new classes.
It tries to generate the optimal size if only one model is requested.
If more than one is requested, the generator tries to generate models of various types and sizes, so that if a smaller model is viable it will also be tested.
The model generation process can be seen in \ref{fig:expandable_models_generator}.
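The size-based heuristic above can be sketched as follows; the thresholds, the depth rule, and the candidate structure are invented for illustration and are not the service's actual values.

```go
package main

import "fmt"

// Candidate is a simplified stand-in for a generated model structure.
type Candidate struct {
	Kind   string // "dense" or "conv"
	Layers int
}

// generateCandidates proposes n candidate structures for square images
// of the given side length and number of output classes: convolutional
// models for bigger images, extra depth for more outputs.
func generateCandidates(imageSize, classes, n int) []Candidate {
	out := []Candidate{}
	for i := 0; i < n; i++ {
		c := Candidate{Kind: "dense", Layers: 2 + i} // vary sizes across candidates
		if imageSize >= 64 {                         // bigger images favour convolutions
			c.Kind = "conv"
		}
		c.Layers += classes / 10 // more outputs, more depth
		out = append(out, c)
	}
	return out
}

func main() {
	for _, c := range generateCandidates(128, 20, 3) {
		fmt.Printf("%s with %d layers\n", c.Kind, c.Layers)
	}
}
```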
Model training happens in a runner; more information about runners is given in section \ref{impl:runner}.
\subsubsection*{Implementation Details}
% TODO explore this a bit more
Model training was implemented using TensorFlow \cite{tensorflow}.
Normally, when using Go for machine learning, only prediction runs in Go and training happens in Python.
The training system was implemented in that way.
The model definitions are generated in the go API and then stored in the database.
The runner then loads the definition from the API and creates a model based on that.
When the runner needs to perform training, it generates a Python script tailored to the model candidate that needs to be trained, then runs that script and monitors its results.
While the Python script is running, it makes use of the API to inform the runner of epoch and accuracy changes.
\subsubsection*{Model Generation}
During training, the runner takes a round-robin approach.
It trains every model candidate for a few epochs, then compares the different model candidates.
If the gap between the best model and the worst model becomes too large, the system might decide not to continue training a certain candidate, and to focus the training resources on candidates that are performing better.
Once one candidate achieves the target accuracy, which is user defined, the training system stops training the model candidates.
The model candidate that achieved the target accuracy is then promoted to the model, and the other candidates are removed.
The model can now be used to predict the labels for any image that the user decides to upload.
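The round-robin strategy above can be sketched as follows, with a synthetic per-round accuracy gain standing in for real training; the drop threshold and target accuracy are invented for illustration.

```go
package main

import "fmt"

type candidate struct {
	name string
	acc  float64
	gain float64 // accuracy gained per round: a stand-in for real training
}

// trainRoundRobin trains each surviving candidate for one round, drops
// candidates too far behind the current best, and returns the name of
// the first candidate to reach the target accuracy ("" if none does).
func trainRoundRobin(cands []candidate, target, dropGap float64, maxRounds int) string {
	for round := 0; round < maxRounds; round++ {
		best := 0.0
		for i := range cands {
			cands[i].acc += cands[i].gain // "train for a few epochs"
			if cands[i].acc > best {
				best = cands[i].acc
			}
		}
		// Stop spending resources on stragglers.
		kept := cands[:0]
		for _, c := range cands {
			if best-c.acc <= dropGap {
				kept = append(kept, c)
			}
		}
		cands = kept
		for _, c := range cands {
			if c.acc >= target {
				return c.name // promoted to the model
			}
		}
	}
	return ""
}

func main() {
	winner := trainRoundRobin([]candidate{
		{name: "small-dense", gain: 0.05},
		{name: "conv-a", gain: 0.12},
		{name: "conv-b", gain: 0.09},
	}, 0.90, 0.15, 20)
	fmt.Println("promoted:", winner)
}
```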
Generating all models based on one single model would decrease the complexity of the system, but it would not guarantee success.
The system needs to generate successful models; to achieve this, the system performs two approaches:
\begin{itemize}
\item{Database search}
\item{AutoML (secondary goal)}
\end{itemize}
The database search consists of trying previous models that are known to work on similar inputs, either models that were previously generated by the system or known good models, as well as known base architectures that are modified to match the size of the input images.
An example of the first approach would be to try the ResNet model, while the second approach would be to use the architecture of ResNet and configure it so that it is more optimized for the input images.
The AutoML approach consists of using an AutoML system to generate new models that match the task at hand.
Since the AutoML approach is more computationally intensive, it is less desirable to run. Therefore, the database search happens first, where known possibly good models are tested. If a good model is found, the search stops; if no model is found, the system resorts to AutoML to find a suitable model.
\subsubsection*{Expandable Models}
Expandable models follow mostly the same process as the normal models.
First, bigger model candidates are generated.
Then the models are trained using the same technique.
At the end, after the model candidate has been promoted to the full model, the system starts another Python process that loads the just-generated model and splits it into a base model and a head model.
With these two separate models, the system is now ready to start classifying new images.
\subsubsection*{Expanding Expandable Models}
During the expanding process, the generation system creates a new head candidate that matches the newly added classes.
The base model, which was created in the original training process, is used with all available data to train the head candidate to perform the classification tasks.
The training process is similar to the normal training system, but it uses a different training script.
Once the model has finished training and meets the accuracy requirements, the system makes the new head available for classification.
\subsection{Model Training}
% The Training process follows % TODO have a flow diagram
The training of the models happens in a secondary Training Process (TP).
Once a model candidate is generated, the main process informs the TP of the new model.
The TP obtains the dataset and starts training.
Once the model has finished training, it reports the results to the main process.
The main process then decides if the model matches the requirements.
If that is the case, the main process goes on to the next steps; otherwise, the service moves on to the next model that requires training.
When training a model, the TP decides when the training is finished; this could be when the training time has elapsed or when the model accuracy has not been substantially increasing over the last training rounds.
During the training process, the TP needs to cache the dataset being used.
This is because, to create one model, the service might have to generate and train more than one model; during this process, if the dataset is not cached, time is spent reloading the dataset into memory.
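The dataset caching described above can be sketched as a simple in-memory cache keyed by dataset id; the cache type and dataset contents are illustrative, and the load counter exists only to make the behaviour visible.

```go
package main

import "fmt"

type datasetCache struct {
	data  map[string][][]byte
	loads int
}

func newDatasetCache() *datasetCache {
	return &datasetCache{data: map[string][][]byte{}}
}

// get returns the cached dataset, performing the expensive load only on
// the first request for a given id.
func (c *datasetCache) get(id string) [][]byte {
	if d, ok := c.data[id]; ok {
		return d
	}
	c.loads++ // stand-in for loading images from disk or the database
	d := [][]byte{[]byte("img-1"), []byte("img-2")}
	c.data[id] = d
	return d
}

func main() {
	cache := newDatasetCache()
	for i := 0; i < 3; i++ { // three candidate models share one dataset
		cache.get("dataset-42")
	}
	fmt.Println("loads:", cache.loads) // loaded once, reused twice
}
```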
\subsection{Model Inference}
TODO
\subsection{Runner} \label{impl:runner}
\subsection{Conclusion}
This section discussed the design and implementation specifications for the system.