ML

What is model serving?

You can think of a model as a file that is run to produce predictions. Some models are very large, and some are slow to run. Model serving means hosting the model somewhere so that people and applications can run it at scale. For example, some models are invoked thousands of times per second.
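To make the idea concrete, here is a minimal sketch of what a serving process does: load the model file once, then answer many prediction requests. Everything in it is an assumption for illustration: `ToyModel`, `load_model`, `handle_request`, the JSON model format, and the `"features"` request field are hypothetical names, not part of any particular serving framework.

```python
import json


class ToyModel:
    """Hypothetical stand-in for a trained model: a weighted sum plus a bias."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict(self, features):
        return self.bias + sum(w * x for w, x in zip(self.weights, features))


def load_model(serialized: str) -> ToyModel:
    """Rebuild the model from its serialized form (here, a JSON string)."""
    params = json.loads(serialized)
    return ToyModel(params["weights"], params["bias"])


# "A model is a file": this string stands in for the contents of that file.
model_file = json.dumps({"weights": [0.5, -1.0], "bias": 2.0})

# The serving process loads the model once at startup, not once per request.
MODEL = load_model(model_file)


def handle_request(body: str) -> str:
    """Turn one JSON request into one JSON response using the loaded model."""
    payload = json.loads(body)
    prediction = MODEL.predict(payload["features"])
    return json.dumps({"prediction": prediction})


response = handle_request('{"features": [2, 1]}')
```

In a real deployment, a function like `handle_request` would typically sit behind a web server, and several copies of the process would run in parallel to handle the request volume.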


What is multimodal?

Multimodal refers to a model's ability to process more than one type of input, such as vision, audio, and/or motion, at the same time. For example, a model may process both the image frames and the audio from a YouTube video to classify what the video is about.
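One simple way to combine two modalities is sometimes called late fusion: each modality gets its own scores per category, and the scores are then blended. The sketch below assumes this approach purely for illustration; real multimodal models often combine the inputs much earlier. The function name `late_fusion`, the category labels, and the score values are all made up.

```python
def late_fusion(image_scores, audio_scores, image_weight=0.5):
    """Blend per-category scores from two modalities with a weighted average.

    Returns the winning category and the blended scores.
    """
    labels = set(image_scores) | set(audio_scores)
    fused = {
        label: image_weight * image_scores.get(label, 0.0)
        + (1.0 - image_weight) * audio_scores.get(label, 0.0)
        for label in labels
    }
    best_label = max(fused, key=fused.get)
    return best_label, fused


# Hypothetical outputs of an image model and an audio model for one video.
image_scores = {"cooking": 0.6, "music": 0.3, "sports": 0.1}
audio_scores = {"cooking": 0.2, "music": 0.7, "sports": 0.1}

best_label, fused_scores = late_fusion(image_scores, audio_scores)
```

Here the image frames alone point to "cooking", but once the audio is taken into account the blended scores favor "music".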


What companies hire for computer vision?

Several companies hire for computer vision:

  • Microsoft

  • Google

  • Facebook

  • Nvidia

  • Amazon - for their Amazon stores

  • Snap - for their filters

  • Pinterest - for visual search

Specialists such as computer vision engineers usually work on the same team, so once you find one, you can use the LinkedIn "Visitors also looked at" feature to see people with similar profiles.


Case Study: Mortgage Company

Let's say a mortgage company wants to apply natural language processing to make refinancing more efficient. Specifically, people upload their mortgage documents, and the system should automatically extract information such as the name of the mortgage lender.

Let's walk through the end-to-end life cycle for a machine learning engineer.

Prototyping. The ML engineer takes a few sample documents and might try some simple rules (for example, look for the term "Lender") to see whether they work well enough.
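A rule like this can be prototyped in a few lines. The sketch below assumes the documents are already plain text and that the lender appears on a line of the form "Lender: <name>"; both are assumptions, as are the function name `extract_lender` and the sample document.

```python
import re

# Match a line such as "Lender: First National Bank" and capture the name.
# [ \t]* is used instead of \s* so that the match never spills onto the next line.
LENDER_PATTERN = re.compile(
    r"^[ \t]*Lender[ \t]*:[ \t]*(.+?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_lender(document_text):
    """Return the lender name if a 'Lender:' line is found, otherwise None."""
    match = LENDER_PATTERN.search(document_text)
    return match.group(1) if match else None


# Made-up sample document for illustration.
sample_document = """Loan Number: 12345
Lender: First National Bank
Borrower: Jane Doe
"""

lender = extract_lender(sample_document)
```

A rule this simple breaks as soon as a document uses a different label or layout, which is usually what motivates moving from rules to a trained model.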

Collecting training data. A machine learning model is built from data, which is called "training data". The mortgage company probably already has millions of mortgage documents whose information has already been extracted. The engineer can use these as training data.

Training models. The ML engineer trains a model on the data. Training time varies with the task: for small data sets it can take a few minutes, and for large, sophisticated models it can take weeks.
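To show what "training" means mechanically, here is a minimal sketch that trains a perceptron, one of the simplest learning algorithms, to decide whether a line of a document states the lender. It is a toy chosen for brevity, not the approach a production system would necessarily use. The labeled lines and the names `tokenize`, `predict`, and `train_perceptron` are made up for illustration.

```python
def tokenize(line):
    """Lowercase the line, drop colons, and split on whitespace."""
    return line.lower().replace(":", " ").split()


def predict(weights, bias, line):
    """Return 1 if the line looks like a lender line, else 0."""
    score = bias + sum(weights.get(token, 0.0) for token in tokenize(line))
    return 1 if score > 0 else 0


def train_perceptron(examples, epochs=10):
    """Learn one weight per token by correcting each wrong prediction."""
    weights = {}
    bias = 0.0
    for _ in range(epochs):
        for line, label in examples:
            error = label - predict(weights, bias, line)  # -1, 0, or +1
            if error != 0:
                bias += error
                for token in tokenize(line):
                    weights[token] = weights.get(token, 0.0) + error
    return weights, bias


# Made-up training data: (line of text, 1 if it states the lender else 0).
TRAINING_LINES = [
    ("Lender: First National Bank", 1),
    ("Borrower: Jane Doe", 0),
    ("Lender: Acme Mortgage LLC", 1),
    ("Property Address: 12 Oak Street", 0),
]

weights, bias = train_perceptron(TRAINING_LINES)
```

With four short examples, training finishes instantly. The same loop structure (predict, compare with the known answer, adjust) is what takes minutes to weeks on larger data sets and models.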

Model selection. The ML engineer has several options for what type of algorithm to use. Even for a given algorithm, there are many options. It's kind of like buying a car. Different car models have different strengths and weaknesses. Even for a given car, you have options to customize it (e.g., a sunroof). Choosing a machine learning model involves the same kind of choices.

Typically, the model with the highest accuracy on examples held out from training is chosen.
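The selection step can be sketched as scoring each candidate on the same labeled examples and keeping the one with the highest accuracy. In this sketch the candidates are two simple rule-based extractors standing in for trained models; the names `make_extractor`, `accuracy`, and `select_best` and the example data are hypothetical.

```python
import re


def make_extractor(field_names):
    """Build an extractor that captures the value after any of the given labels."""
    alternatives = "|".join(re.escape(name) for name in field_names)
    pattern = re.compile(
        rf"^[ \t]*(?:{alternatives})[ \t]*:[ \t]*(.+?)[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    )

    def extract(text):
        match = pattern.search(text)
        return match.group(1) if match else None

    return extract


def accuracy(extractor, examples):
    """Fraction of examples where the extractor output equals the expected value."""
    correct = sum(1 for text, expected in examples if extractor(text) == expected)
    return correct / len(examples)


def select_best(candidates, examples):
    """Score every candidate and return the best name plus all scores."""
    scores = {name: accuracy(fn, examples) for name, fn in candidates.items()}
    return max(scores, key=scores.get), scores


# Made-up labeled examples that were not used to build the candidates.
VALIDATION_EXAMPLES = [
    ("Lender: First National Bank", "First National Bank"),
    ("Mortgagee: Acme Mortgage LLC", "Acme Mortgage LLC"),
    ("Lender: Oak Credit Union", "Oak Credit Union"),
    ("Borrower: Jane Doe", None),
]

CANDIDATES = {
    "lender_only": make_extractor(["Lender"]),
    "lender_or_mortgagee": make_extractor(["Lender", "Mortgagee"]),
}

best_name, scores = select_best(CANDIDATES, VALIDATION_EXAMPLES)
```

On these examples the broader candidate gets all four right while the stricter one misses the "Mortgagee" document, so the broader one is selected. Accuracy is only one possible criterion; speed or model size may also matter in practice.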

Deployment. After the model is selected, the ML engineer provides the code (in TensorFlow or PyTorch) so that it can be deployed into whatever application needs it.