Wednesday, September 19, 2018

Computer Vision Internship in Corvallis with BigML

We are pleased to introduce Dimitrios Trigkakis, a Computer Vision Intern who worked with the BigML Team for the summer of 2018. On the other side of the world from Efe Toros, BigML’s other summer intern, Dimitrios gained industry experience while applying his years of research knowledge, which he shares first-hand here.

BigML Internship Corvallis

I had the opportunity of being introduced to BigML as I searched for summer internships, and I quickly realized this company would be a great fit. I joined the team as a Computer Vision Intern in Corvallis, Oregon, and this role proved to be a nice change of pace between my Master’s degree work last year, and my ongoing research for my Ph.D. at Oregon State University.

BigML Intern Dimitrios Trigkakis

During the interview process, it became apparent that BigML has a team of intelligent and ambitious people who share much of the interest and motivation that originally led me to study data science and Machine Learning at the beginning of my academic career.

For the initial phases of my training, I discussed potential projects with my mentor, Dr. Charles Parker, and we developed a plan for practical and important contributions focused on Computer Vision problems, as well as strengthening the pre-existing foundations for image-based Machine Learning.

All of my projects were aimed at developing methods for identifying the content of images. Typical Machine Learning algorithms can find patterns given the features of a dataset, but in computer vision such features are not present and have to be constructed from lower-level information (the image’s pixels). BigML can provide a platform for re-training image-based models, with great benefits in all areas where images are involved. Some examples of tasks involving computer vision include:

  • Recognizing car plates
  • Reading subtitles in other languages
  • Identifying people in videos
  • Recognizing objects in scenes
  • Identifying medically relevant visual features in x-rays or other scans

Computer vision can provide excellent assistance in automating labor-intensive image identification tasks that are currently assigned to large groups of people, costing a lot of time and money. My internship projects involved implementing or expanding the existing ingredients that enable large-scale image recognition at BigML. All of these projects are aimed at image classification, with future potential for object detection, image segmentation, and other computer vision tasks. More specifically, my projects revolved around the following:

  • Expanding BigML’s infrastructure for employing several pre-trained convolutional neural network models, which form the basis for later fine-tuning on many computer vision datasets.
  • Developing a similar infrastructure to support our models on web pages, for accurate, fast, and user-friendly inference in the browser.
  • Training models on image datasets that do not revolve around fine-grained classification of object categories in photographic images (e.g., the ImageNet dataset). Moving in a different direction, we trained a model for classification of images that occur in an artistic setting, where object classification is challenging.
  • Developing a deep learning model that is capable of identifying structure, clusters, and visualizations for datasets without labels (unsupervised learning).
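
Fine-tuning a pre-trained network often means freezing its convolutional feature extractor and training only the final, predictive layer. A minimal sketch of that last step, using synthetic stand-in data (in practice the feature vectors would come from a pre-trained network’s penultimate layer, not random numbers):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "frozen" features: stand-ins for the activations a
# pre-trained convolutional network would produce for each image.
n_samples, n_features, n_classes = 200, 64, 3
features = rng.normal(size=(n_samples, n_features))
# Synthetic labels correlated with the features so learning is possible.
true_w = rng.normal(size=(n_features, n_classes))
labels = (features @ true_w).argmax(axis=1)

# "Re-training only the last layer" = fitting a softmax classifier
# on the fixed features with plain gradient descent.
W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)
onehot = np.eye(n_classes)[labels]

for _ in range(500):
    logits = features @ W + b
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    grad = (probs - onehot) / n_samples           # cross-entropy gradient
    W -= 0.5 * features.T @ grad
    b -= 0.5 * grad.sum(axis=0)

accuracy = ((features @ W + b).argmax(axis=1) == labels).mean()
```

Because only `W` and `b` are updated, training is cheap and needs far less labeled data than training the full network, which is exactly why the performance gap against full fine-tuning is worth measuring.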

Overview of Projects

  • On the software engineering side of things, we wanted to implement pre-trained models for various architectures, each with different strengths and weaknesses. We now support five different neural network architectures, one of which (namely MobileNet) is very lightweight and was designed to run on mobile phone hardware. All architectures achieve competitive performance on the ImageNet dataset, while their differences come down to the trade-off between accuracy and size/inference time.
  • As an extension of the work on the server side, I developed a similar infrastructure for classifying images online. The JavaScript re-implementation of the above work allows users to submit their own network definition files and then classify images that they upload to the webpage.
  • For expanding our repertoire of provided models, I trained a neural network for classification on BAM (the Behance Artistic Media dataset), an artistic dataset with labels for three categories: content, artistic medium, and emotion. The network learns image features that are relevant for correctly predicting not only the content of an artistic image but also the emotion and style that the artwork represents. The learned network features can be reused by training only the predictive part of the network, enabling re-training for a new dataset that may not contain real-life photographic material. We did notice a respectable performance gap between training the entire network and re-training only the last layers of the network (from image features to prediction). Both networks were pre-trained on ImageNet.

    BAM dataset: examples for content category ‘dog’


    The prediction task includes three categories (content, medium and emotion)

  • I developed a variational autoencoder architecture for unsupervised learning on simple feature datasets. Datasets like Iris or text-based datasets contain patterns that a neural network is able to extract, given the task of reproducing its input at its output. The feature vector that the network assigns to an unlabelled example while reconstructing it reduces the dimensionality of the input without losing much information about its content. Using the t-SNE algorithm, we can further reduce the dimensionality of the input data to two dimensions for easy visualization. Finally, k-means clustering can identify class membership, grouping the input data together and giving hints about the regularities in the dataset, without requiring any labelled examples.

Inspection of unsupervised categories gives insight into the regularities in the input data
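
The final clustering step of that pipeline can be sketched in a few lines. Here the low-dimensional codes a variational autoencoder (or t-SNE) would produce are replaced by synthetic 2-D clusters, since the point is only the k-means mechanics:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for low-dimensional codes from an autoencoder:
# three synthetic clusters of 50 points each in 2-D.
codes = np.concatenate([
    rng.normal(loc=center, scale=0.3, size=(50, 2))
    for center in ([0, 0], [3, 0], [0, 3])
])

def kmeans(points, k, steps=20):
    """Minimal k-means: alternate nearest-centroid assignment and
    centroid updates for a fixed number of steps."""
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(steps):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        assign = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points,
        # leaving it in place if no points were assigned to it.
        centroids = np.array([
            points[assign == i].mean(axis=0) if (assign == i).any()
            else centroids[i]
            for i in range(k)
        ])
    return assign, centroids

assign, centroids = kmeans(codes, k=3)
```

Inspecting which examples share a cluster label, as described above, is what surfaces the regularities in the data without any human-provided labels.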

I am very grateful for this opportunity to work with BigML, as it was a great culture fit, full of vibrant people who guided me through my first contact with industry. Applying my knowledge to real-world problems was very satisfying; I learned a lot about communication, software development, and collaboration, and I gained confidence in myself and in my future. All in all, BigML has provided a great experience, built by people who work very hard to make approachable and intuitive Machine Learning a reality.

Interested in a BigML Internship?

More internship positions will be available at BigML in 2019. Keep an eye on the BigML Internship page and feel free to contact us at internships@bigml.com with any questions or project ideas. We look forward to hearing from you!


