The Cloud Vision API enables developers to understand the content of an image. It quickly classifies images into thousands of categories - sailboat, lion, Eiffel Tower - detects individual objects within images, and finds and reads printed words contained within images. Like the other APIs I'm describing here, it encapsulates powerful machine learning models behind an easy-to-use API. You can use it to build metadata on your image catalog, moderate offensive content, or even do image sentiment analysis.

The Cloud Speech API enables developers to convert audio to text. Because your user base is increasingly global, the API recognizes over 80 languages and variants. You can transcribe the text of users dictating into an application's microphone, enable command and control through voice, or transcribe audio files.

The Cloud Natural Language API offers a variety of natural language understanding technologies to developers. It can do syntax analysis: breaking sentences supplied by your users down into tokens, identifying the nouns, verbs, adjectives, and other parts of speech, and figuring out the relationships among the words. It can do entity recognition; in other words, it can parse text and flag mentions of people, organizations, locations, events, products, and media. It can also gauge the overall sentiment expressed in a block of text. It offers these capabilities in multiple languages, including English, Spanish, and Japanese.

The Cloud Translation API provides a simple, programmatic interface for translating an arbitrary string into a supported language. When you don't know the source language, the API can detect it.

The Cloud Video Intelligence API lets you annotate videos in a variety of formats. It helps you identify key entities - that is, nouns - within your video and when they occur. You can use it to make video content searchable and discoverable. At the time this video was produced, the Cloud Video Intelligence service was in beta, so check the GCP website for updates.
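For example, here is a minimal sketch of label detection with the Cloud Vision API, assuming the google-cloud-vision Python client library is installed and application default credentials are configured; the file name photo.jpg is just a placeholder.

    # Label detection with the Cloud Vision API (sketch).
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open("photo.jpg", "rb") as f:          # placeholder image file
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    for label in response.label_annotations:    # e.g. "sailboat", "lion", "Eiffel Tower"
        print(label.description, label.score)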
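Transcription with the Cloud Speech API follows the same pattern. This sketch assumes the google-cloud-speech library, credentials, and a short placeholder recording named command.wav; longer audio would typically go through the asynchronous method instead.

    # Transcribing a short audio file with the Cloud Speech API (sketch).
    from google.cloud import speech

    client = speech.SpeechClient()
    with open("command.wav", "rb") as f:        # placeholder audio file
        audio = speech.RecognitionAudio(content=f.read())
    config = speech.RecognitionConfig(
        language_code="en-US",                  # one of the 80+ supported languages and variants
    )
    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        print(result.alternatives[0].transcript)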
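For the Cloud Natural Language API, a minimal sketch of entity and sentiment analysis, assuming the google-cloud-language library; the sample sentence is just illustrative text.

    # Entity and sentiment analysis with the Cloud Natural Language API (sketch).
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content="Google Cloud makes machine learning approachable.",
        type_=language_v1.Document.Type.PLAIN_TEXT,
    )
    for entity in client.analyze_entities(document=document).entities:
        print(entity.name, entity.type_)        # people, organizations, locations, ...
    sentiment = client.analyze_sentiment(document=document).document_sentiment
    print(sentiment.score, sentiment.magnitude) # overall sentiment of the block of text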
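Translation and language detection can be sketched with the basic edition of the google-cloud-translate client library; the French phrase is just an example input.

    # Language detection and translation with the Cloud Translation API (sketch).
    from google.cloud import translate_v2 as translate

    client = translate.Client()
    detection = client.detect_language("Bonjour tout le monde")
    print(detection["language"])                # detected source language, e.g. "fr"
    result = client.translate("Bonjour tout le monde", target_language="en")
    print(result["translatedText"])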
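Finally, a sketch of label annotation with the Cloud Video Intelligence API, assuming the google-cloud-videointelligence library and a placeholder Cloud Storage URI; annotation runs as a long-running operation, so the call blocks until results are ready.

    # Label annotation with the Cloud Video Intelligence API (sketch).
    from google.cloud import videointelligence

    client = videointelligence.VideoIntelligenceServiceClient()
    operation = client.annotate_video(
        request={
            "input_uri": "gs://my-bucket/my-video.mp4",  # placeholder video location
            "features": [videointelligence.Feature.LABEL_DETECTION],
        }
    )
    result = operation.result(timeout=300)      # wait for the long-running operation
    for annotation in result.annotation_results[0].segment_label_annotations:
        print(annotation.entity.description)    # key entities found in the video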