Introduction: TensorFlow.js is a machine learning framework for JavaScript applications. It helps us build models that can be used directly in the browser or on a Node.js server. We use this framework to build an application that detects objects in images using a regressor rather than a classifier.
Description: A classifier groups entries based on their similarity to each other. Images can be compared to one another in this way, but a classifier says nothing about the position of an object within an image. A regressor uses a bounding box that spans the image at varying sizes until it finds a portion of the image that matches an object. The object itself can be specified as a bounding box within an image. Both the training and the test data are images; the training data set has bounding boxes and labels, while the test data set does not.
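One way to read the scanning described above is as a sliding window: candidate boxes of several sizes are moved across the image and each one is scored. The sketch below only enumerates the candidate boxes. The function name, the square windows, and the stride are assumptions made for illustration, not something the framework prescribes.

```javascript
// Enumerate square candidate boxes of several sizes across an image.
// All names and parameters here are hypothetical, chosen for illustration.
function* slidingWindows(imageWidth, imageHeight, windowSizes, stride) {
  for (const size of windowSizes) {
    for (let y = 0; y + size <= imageHeight; y += stride) {
      for (let x = 0; x + size <= imageWidth; x += stride) {
        // Each candidate is a top-left / bottom-right pair in pixels.
        yield { xmin: x, ymin: y, xmax: x + size, ymax: y + size };
      }
    }
  }
}

// An 8x8 image scanned with 4x4 and 8x8 windows and a stride of 4.
const windows = [...slidingWindows(8, 8, [4, 8], 4)];
console.log(windows.length);
```

Each candidate box would then be passed to the model, and the best-scoring one kept as the detection.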
The JavaScript application uses the labels from the images to train the model. When enough training images have been processed, the model learns the characteristics of the object to be detected. Then, as it runs through the test data set, it can predict the bounding box and the label whenever a similar object is found in a test image.
As is common in machine learning examples, the data is split into a 70% training set and a 30% test set. The data has no inherent order, and the split is taken over a random shuffle.
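A minimal sketch of such a random 70/30 split is shown here. The helper name `shuffleAndSplit`, the injectable `rng` parameter, and the sample filenames are assumptions for illustration.

```javascript
// Shuffle a copy of the items (Fisher-Yates), then cut at the training fraction.
// `rng` is a hypothetical hook so a seeded generator can be supplied if needed.
function shuffleAndSplit(items, trainFraction = 0.7, rng = Math.random) {
  const shuffled = items.slice();
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const cut = Math.round(shuffled.length * trainFraction);
  return { trainSet: shuffled.slice(0, cut), testSet: shuffled.slice(cut) };
}

// Ten made-up image filenames split into 7 training and 3 test images.
const files = Array.from({ length: 10 }, (_, i) => `image_${i}.jpg`);
const { trainSet, testSet } = shuffleAndSplit(files);
console.log(trainSet.length, testSet.length);
```

Shuffling a copy leaves the original list untouched, so the same collection can be split again with a different fraction.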
The model chosen is an object detection model. This model specifies the bounding box as top-left and bottom-right coordinates, given as horizontal and vertical offsets. The size of the image is known beforehand in terms of width and height, and the bounding boxes are guaranteed to lie within the image. The object name, filename, and image file type can optionally be recorded for each image so that it can be looked up in a collection. The output consists of a label and a bounding box. A label map file specifies the objects to be detected and, in this case, only one object is specified.
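To make the record layout concrete, here is a hedged sketch of one annotated training image and a check that its box lies within the image. The field names (`xmin`, `ymin`, `xmax`, `ymax`, and so on) and the sample values are assumptions for illustration, not the schema of a particular API.

```javascript
// Hypothetical annotation for one training image; field names are illustrative.
const annotation = {
  filename: "image_0.jpg",
  format: "jpeg",
  width: 640,
  height: 480,
  label: "object",
  // Top-left (xmin, ymin) and bottom-right (xmax, ymax) offsets in pixels.
  box: { xmin: 120, ymin: 80, xmax: 360, ymax: 300 },
};

// True when the box is non-empty and lies entirely inside the image.
function isBoxInsideImage({ width, height, box }) {
  const { xmin, ymin, xmax, ymax } = box;
  return xmin >= 0 && ymin >= 0 && xmax <= width && ymax <= height &&
    xmin < xmax && ymin < ymax;
}

// Scale pixel offsets to the 0..1 range so they do not depend on image size.
function normalizeBox({ width, height, box }) {
  return {
    xmin: box.xmin / width,
    ymin: box.ymin / height,
    xmax: box.xmax / width,
    ymax: box.ymax / height,
  };
}

console.log(isBoxInsideImage(annotation), normalizeBox(annotation));
```

Running the check over every annotation before training is a cheap way to catch boxes drawn outside the image.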
TensorFlow makes it easy to construct this model using an API. The model can only present output after it has been trained, and training can only run once the training data has labels assigned, which might be done by hand. The API expects the data to be converted into TFRecord, a simple format for storing a sequence of binary records.
With the model and the training and test sets defined, it is now easy to evaluate the model and run inference. The model can also be saved and restored. It runs faster when a GPU is available.
The model can be trained in batches of a predefined size. The number of passes over the entire training data set, called epochs, can also be set up front. For example, a batch size of 90 and 7000 training steps could be used. These are called hyperparameters. Every model has a speed, a Mean Average Precision, and an output. Usually, the higher the precision, the lower the speed. It is helpful to visualize the training with a chart, such as a Highcharts line chart, that is updated with the loss after each epoch. Usually there will be a downward trend in the loss, which indicates that the model is converging.
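The relationship between batch size, steps, and epochs can be worked out with a little arithmetic, and the downward trend in the loss can be checked numerically as well as visually. The sketch below assumes a data set of 900 training images, which is a made-up figure, and uses a deliberately crude trend check; the function names are hypothetical.

```javascript
// Number of training steps needed to see every example once.
function stepsPerEpoch(numExamples, batchSize) {
  return Math.ceil(numExamples / batchSize);
}

// How many passes over the data a fixed step budget amounts to.
function epochsForSteps(totalSteps, numExamples, batchSize) {
  return totalSteps / stepsPerEpoch(numExamples, batchSize);
}

// Crude trend check: is the mean of the last `window` losses lower than
// the mean of the `window` losses before them?
function isLossDecreasing(losses, window = 3) {
  if (losses.length < 2 * window) return false;
  const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
  const recent = mean(losses.slice(-window));
  const previous = mean(losses.slice(-2 * window, -window));
  return recent < previous;
}

const numExamples = 900; // hypothetical data set size
console.log(stepsPerEpoch(numExamples, 90));        // 10
console.log(epochsForSteps(7000, numExamples, 90)); // 700
```

Under that assumed data set size, a batch size of 90 gives 10 steps per epoch, so 7000 steps correspond to 700 epochs.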
Training the model might take a long time, say about 4 hours. Once the test data has been evaluated, the model’s effectiveness can be measured using precision and recall. Precision is the fraction of the model's positive inferences that were indeed positive, and recall is the fraction of the actual positives that the model found.
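As a rough illustration, precision and recall can be computed from counts of true positives, false positives, and false negatives. For object detection, a predicted box is commonly counted as a true positive when its overlap with the labeled box, measured as intersection over union (IoU), passes a threshold. That criterion is an assumption added here and is not stated above, and the function names and counts are hypothetical.

```javascript
// Intersection over union of two boxes given as {xmin, ymin, xmax, ymax}.
function iou(a, b) {
  const overlapW = Math.max(0, Math.min(a.xmax, b.xmax) - Math.max(a.xmin, b.xmin));
  const overlapH = Math.max(0, Math.min(a.ymax, b.ymax) - Math.max(a.ymin, b.ymin));
  const intersection = overlapW * overlapH;
  const areaA = (a.xmax - a.xmin) * (a.ymax - a.ymin);
  const areaB = (b.xmax - b.xmin) * (b.ymax - b.ymin);
  const union = areaA + areaB - intersection;
  return union > 0 ? intersection / union : 0;
}

// tp: correct detections, fp: spurious detections, fn: missed objects.
function precisionRecall({ tp, fp, fn }) {
  return {
    precision: tp + fp > 0 ? tp / (tp + fp) : 0,
    recall: tp + fn > 0 ? tp / (tp + fn) : 0,
  };
}

const predicted = { xmin: 0, ymin: 0, xmax: 10, ymax: 10 };
const labeled = { xmin: 5, ymin: 0, xmax: 15, ymax: 10 };
console.log(iou(predicted, labeled));
console.log(precisionRecall({ tp: 8, fp: 2, fn: 4 }));
```

With the made-up counts above, 8 of the 10 detections are correct and 8 of the 12 actual objects are found.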
Conclusion: TensorFlow.js is becoming a standard for implementing machine learning models. Its usage is simple, but choosing the model and preparing the data take significantly more time than setting the model up, evaluating it, and using it.