Prototype Report

Image Processing Platform

Team 04

Roles / Team Members
Project Manager / Meiyi Yang
Operational Concept Engineer / Junran Liu
Requirements Engineer & Software Architect / Hao Wu
Prototyper / Yifan Liu
Life Cycle Plan / Xiangchen Zhao
Feasibility Analyst / Xinhui Liu
IIV&V and Quality Focal Point / Vincent DeGenova

10/06/2016

Version History

Date / Author / Version / Changes made / Rationale
10/05/16 / YFL,XCJ / 1.0 / · Created a draft version / · Initial draft

Prototype Report

Version History

Table of Contents

Table of Tables

Table of Figures

1. Introduction

1.1 Purpose of the prototype report

1.2 Status of the prototype

2. Navigation Flow

3. Prototype

Table of Tables

Table 1: Risks and their Prototypes

Table 2: TensorFlow

Table 3: Model Testing

Table 4: Train and Retrain the Model

Table 5: Dataset

Table of Figures

Figure 1: Navigation Flow

Figure 2: Algorithms comparison

Figure 3: Retrain

Figure 4: Flower sample

Figure 5: Classify result

Figure 6: Dataset

1. Introduction

1.1 Purpose of the prototype report

This prototype report introduces the top risks of our project and our approaches to mitigating them. Prototyping is an efficient way for us to demonstrate our understanding of the project and convey our concepts visually to the client, so that the client can see what the product will look like and give us useful feedback to improve it.

This document describes the risks of implementing machine learning algorithms, training detector models, and integrating the graphical user interface with the detection pipeline. It also introduces the prototypes built to mitigate these risks.

Table 1: Risks and their Prototypes

No. / Risk / Prototype / Description
1 / Machine learning algorithm implementation / Algorithm prototype / Provides a clear idea of how the algorithm will operate and its advantages over the alternatives.
2 / Model testing / Testing prototype / Provides a clear idea of the speed and accuracy of the algorithm on the pre-trained model, our own dataset, and the retrained model.
3 / Train and retrain the model / Training prototype / Provides a clear idea of the performance of the model when our training dataset is small.
4 / Dataset / Dataset prototype / Provides evidence that we can find enough image libraries to train our model.

1.2 Status of the prototype

1) Algorithm prototype revision

2) Testing prototype creation

3) Risk avoidance of the API integration

4) Training prototype creation

5) Dataset prototype creation

2. Navigation Flow

Figure 1: Navigation Flow

3. Prototype

3.1 Prototype 1 Algorithm Implementation

3.1.1 Purpose of this prototype

One of the core capabilities of our project is to recognize images and classify them into different classes using machine learning algorithms. Once the user uploads a set of images to our system, the system preprocesses the images and then outputs each image with the class it belongs to.

Since many different machine learning algorithms can accomplish image recognition, we have to ensure that the algorithm we select is easy to use and performs acceptably. In addition, we have to make sure that the algorithm is easy enough for all team members to learn and use.

3.1.2 Result

After prototyping, we found that TensorFlow is a suitable choice for our system compared with the other machine learning options we considered.

Table 2: TensorFlow

Description / This screenshot shows the algorithm prototype, which compares TensorFlow with other machine learning algorithms.
Related Capability / WC_4107: The pipeline shall use a deep learning algorithm.

Figure 2: Algorithms comparison

3.2 Prototype 2 Model Testing

3.2.1 Purpose of this prototype

The testing speed and accuracy of TensorFlow on pre-trained classes are good. In this scenario, classifying an image of a pre-trained class takes only 2 seconds on a GTX 970M GPU, and the accuracy rate is 90%.

However, there is no evidence yet that the algorithm will perform as well on topics trained with our own dataset. We have to ensure that the performance of the algorithm is acceptable when testing images with new topics.
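A measurement like the one described above can be sketched as a small harness that records accuracy and average time per image. This is a minimal sketch, not the team's actual test code: the `classify` callable is a hypothetical stand-in for whatever function wraps the trained model, and the sample file names in the usage example are made up for illustration.

```python
import time
from typing import Callable, Iterable, Tuple


def evaluate(classify: Callable[[str], str],
             samples: Iterable[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (accuracy, mean seconds per image) for a classifier.

    `classify` is a hypothetical callable that maps an image path to a
    predicted topic; `samples` yields (image_path, expected_topic) pairs.
    """
    correct = 0
    total = 0
    elapsed = 0.0
    for path, expected in samples:
        start = time.perf_counter()
        predicted = classify(path)
        elapsed += time.perf_counter() - start
        correct += int(predicted == expected)
        total += 1
    if total == 0:
        raise ValueError("no samples to evaluate")
    return correct / total, elapsed / total


if __name__ == "__main__":
    # Stand-in classifier (assumption): guesses the topic from the file name.
    def fake_classify(path: str) -> str:
        return "rose" if "rose" in path else "daisy"

    labelled = [("rose_1.jpg", "rose"), ("daisy_1.jpg", "daisy")]
    accuracy, seconds = evaluate(fake_classify, labelled)
    print(f"accuracy={accuracy:.0%}, mean time={seconds:.6f}s per image")
```

Running the same harness against the pre-trained model and the retrained model would give directly comparable numbers for both cases.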

3.2.2 Result

After prototyping, we have a better understanding of the testing speed and accuracy of TensorFlow on topics trained with our own dataset.

Table 3: Model Testing

Description / This screenshot shows the testing prototype, which verifies the running speed and accuracy of the algorithm with our own dataset.
Related Capability / WC_4101: The pipeline shall use a detector model to detect the images uploaded by the users.
Pre-condition / A user uploads a set of images.
Post-condition / The user can view images with their topics.

Figure 3: Retrain

Figure 4: Flower sample

Figure 5: Classify result

3.3 Prototype 3 Train and Retrain the Model

3.3.1 Purpose of this prototype

To classify topics other than the pre-trained classes, we have to train or retrain the model so that it scales up to more topics. There is a risk that the training and retraining processes may take a long time. As a result, the topic scope may be restricted to certain topics by the delivery date. There is also a risk that the testing speed and accuracy may fall short of our expectations if the training or retraining dataset is small.

We have to make sure that the algorithm works well on both small and large datasets, and that the training process does not take too much time.

3.3.2 Result

After prototyping, we found that the number of training images is not the key factor in testing speed and accuracy: the results for a small dataset of 500 images and a large dataset of 4000 images are very close. The number of training iterations (steps) is the primary factor affecting running speed and accuracy.
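One way to set up a small-versus-large comparison like this is to draw both training sets from the same pool of images with a fixed random seed, so that the runs differ only in size. The sketch below is an assumption about how such a split could be done, not a record of how the prototype datasets were built; the `subsample` function and the even split across topics are hypothetical.

```python
import random
from typing import Dict, List


def subsample(images_by_topic: Dict[str, List[str]],
              total: int, seed: int = 0) -> Dict[str, List[str]]:
    """Pick about `total` images, split evenly across topics.

    A topic with fewer images than its share contributes all it has.
    """
    if not images_by_topic:
        raise ValueError("no topics to sample from")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    per_topic = total // len(images_by_topic)
    return {
        topic: rng.sample(paths, min(per_topic, len(paths)))
        for topic, paths in images_by_topic.items()
    }


if __name__ == "__main__":
    # Made-up file names for illustration only.
    pool = {
        "rose": [f"rose_{i}.jpg" for i in range(2000)],
        "daisy": [f"daisy_{i}.jpg" for i in range(2000)],
    }
    small = subsample(pool, 500)
    large = subsample(pool, 4000)
    print(sum(len(v) for v in small.values()),
          sum(len(v) for v in large.values()))
```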

Table 4: Train and Retrain the Model

Description / This screenshot shows the training prototype, which compares the testing performance of a model trained on a small dataset with that of a model trained on a large dataset.
Related Capability / WC_4148: As a client I can retrain the pipeline by giving a new topic and a new set of images.
Pre-condition / A user uploads a set of images, and adds a new topic or selects a topic.
Post-condition / The pipeline will be able to classify a new topic.

3.4 Prototype 4 Dataset

3.4.1 Purpose of this prototype

The dataset determines how many topics our pipeline can classify. There is a risk that we cannot find enough images to train our model. In that case, our pipeline will only be able to classify a few topics. We have to make sure that there are enough image libraries to train the model.
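Checking whether each topic has enough images can be automated with a short script. The sketch below assumes a hypothetical layout with one subfolder per topic under a dataset root; the folder layout, the accepted file extensions, and the minimum count are illustrative assumptions, not values taken from the project.

```python
from pathlib import Path
from typing import Dict, List

# Assumed set of image extensions; adjust to match the real dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root: str) -> Dict[str, int]:
    """Count image files in each topic subfolder of `root`."""
    counts: Dict[str, int] = {}
    for topic_dir in sorted(Path(root).iterdir()):
        if topic_dir.is_dir():
            counts[topic_dir.name] = sum(
                1 for f in topic_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def undersized_topics(counts: Dict[str, int], minimum: int) -> List[str]:
    """Return the topics that have fewer than `minimum` images."""
    return sorted(topic for topic, n in counts.items() if n < minimum)
```

For example, `undersized_topics(count_images("dataset"), 100)` would list the topics in a folder named `dataset` that still need more images, where both the folder name and the threshold of 100 are placeholders.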

3.4.2 Result

We have to keep looking for different image libraries. Also, we could ask the client for advice.

Table 5: Dataset

Description / This screenshot shows our current dataset.
Related Capability / WC_4148: As a client I can retrain the pipeline by giving a new topic and a new set of images.
Pre-condition / A trainer uploads a set of images, and adds a new topic or selects a topic.
Post-condition / The dataset increases.

Figure 6: Dataset