What Is The Computer Vision API? Reference Guide
Page Count: 270
Contents

Computer Vision API Documentation
Overview
    What is Computer Vision?
Quickstarts
    Using the REST API
        Analyze a remote image: cURL, Go, Java, JavaScript, Node.js, PHP, Python, Ruby
        Analyze a local image: C#, Python
        Generate a thumbnail: C#, cURL, Go, Java, JavaScript, Node.js, PHP, Python, Ruby
        Extract printed text: C#, cURL, Go, Java, JavaScript, Node.js, PHP, Python, Ruby
        Extract handwritten text: C#, Java, JavaScript, Python
        Use a domain model: PHP, Python
    Using the .NET SDK
        Analyze an image
        Generate a thumbnail
        Extract text
    Using the Python SDK
Tutorials
    Generate metadata for images
Concepts
    Tagging images
    Detecting objects
    Detecting brands
    Categorizing images
    Describing images
    Detecting faces
    Detecting image types
    Detecting domain-specific content
    Detecting color schemes
    Generating thumbnails
    Recognize printed and handwritten text
    Detecting adult and racy content
How-to guides
    Use Computer Vision: Java, JavaScript, Python
    Call the Computer Vision API
    Use containers
        Install and run containers
        Configure containers
    Use the Computer Vision Connected Service
    Analyze videos in real time
Reference
    Azure CLI
    Azure PowerShell
    Computer Vision API v2.0
    Computer Vision API v1.0
    SDKs: .NET, Node.js, Python, Go, Android (Java), Swift
Resources
    Samples
        Explore an image processing app
        Other Computer Vision samples
    FAQ
    Category taxonomy
    Language support
    Pricing and limits
    UserVoice
    Stack Overflow
    Azure roadmap
    Regional availability
    Compliance

What is Computer Vision?
5/29/2019 • 5 minutes to read

Azure's Computer Vision service provides developers with access to advanced algorithms that process images and return information. To analyze an image, you can either upload an image or specify an image URL. The image processing algorithms can analyze content in several different ways, depending on the visual features you're interested in. For example, Computer Vision can determine if an image contains adult or racy content, or it can find all of the human faces in an image.
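The two ways of supplying an image mentioned above correspond to two kinds of request body. The sketch below is a minimal illustration rather than part of any SDK. The JSON form ({"url": ...} sent as application/json) matches the REST quickstarts later in this guide. Sending the raw bytes of an uploaded image as application/octet-stream is an assumption that this overview does not spell out, and both function names are hypothetical.

```python
import json


def payload_for_url(image_url):
    """Build headers and body for an image that is referenced by URL."""
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url must be an http(s) URL")
    headers = {"Content-Type": "application/json"}
    body = json.dumps({"url": image_url}).encode("utf-8")
    return headers, body


def payload_for_upload(image_bytes):
    """Build headers and body for an uploaded image.

    Assumption: the service accepts the raw bytes of the image with an
    application/octet-stream content type.
    """
    if not image_bytes:
        raise ValueError("image_bytes must not be empty")
    headers = {"Content-Type": "application/octet-stream"}
    return headers, bytes(image_bytes)


if __name__ == "__main__":
    headers, body = payload_for_url(
        "https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg"
    )
    print(headers["Content-Type"], body.decode("utf-8"))
```

Keeping the payload construction separate from the HTTP call makes it easy to check what would be sent before a subscription key is involved.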
You can use Computer Vision in your application by using either a native SDK or invoking the REST API directly. This page broadly covers what you can do with Computer Vision.

Analyze images for insight

You can analyze images to detect and provide insights about their visual features and characteristics. All of the features in the table below are provided by the Analyze Image API.

ACTION: DESCRIPTION

Tag visual features: Identify and tag visual features in an image, from a set of thousands of recognizable objects, living things, scenery, and actions. When the tags are ambiguous or not common knowledge, the API response provides 'hints' to clarify the meaning of the tag in the context of a known setting. Tagging isn't limited to the main subject, such as a person in the foreground, but also includes the setting (indoor or outdoor), furniture, tools, plants, animals, accessories, gadgets, and so on.

Detect objects: Object detection is similar to tagging, but the API returns the bounding box coordinates for each tag applied. For example, if an image contains a dog, cat and person, the Detect operation will list those objects together with their coordinates in the image. You can use this functionality to process further relationships between the objects in an image. It also lets you know when there are multiple instances of the same tag in an image.

Detect brands: Identify commercial brands in images or videos from a database of thousands of global logos. You can use this feature, for example, to discover which brands are most popular on social media or most prevalent in media product placement.

Categorize an image: Identify and categorize an entire image, using a category taxonomy with parent/child hereditary hierarchies. Categories can be used alone, or with our new tagging models. Currently, English is the only supported language for tagging and categorizing images.
Describe an image: Generate a description of an entire image in human-readable language, using complete sentences. Computer Vision's algorithms generate various descriptions based on the objects identified in the image. The descriptions are each evaluated and a confidence score generated. A list is then returned ordered from highest confidence score to lowest.

Detect faces: Detect faces in an image and provide information about each detected face. Computer Vision returns the coordinates, rectangle, gender, and age for each detected face. Computer Vision provides a subset of the functionality that can be found in Face, and you can use the Face service for more detailed analysis, such as facial identification and pose detection.

Detect image types: Detect characteristics about an image, such as whether an image is a line drawing or the likelihood of whether an image is clip art.

Detect domain-specific content: Use domain models to detect and identify domain-specific content in an image, such as celebrities and landmarks. For example, if an image contains people, Computer Vision can use a domain model for celebrities included with the service to determine if the people detected in the image match known celebrities.

Detect the color scheme: Analyze color usage within an image. Computer Vision can determine whether an image is black & white or color and, for color images, identify the dominant and accent colors.

Generate a thumbnail: Analyze the contents of an image to generate an appropriate thumbnail for that image. Computer Vision first generates a high-quality thumbnail and then analyzes the objects within the image to determine the area of interest. Computer Vision then crops the image to fit the requirements of the area of interest. The generated thumbnail can be presented using an aspect ratio that is different from the aspect ratio of the original image, depending on your needs.
Get the area of interest: Analyze the contents of an image to return the coordinates of the area of interest. This is the same function that is used to generate a thumbnail, but instead of cropping the image, Computer Vision returns the bounding box coordinates of the region, so the calling application can modify the original image as desired.

Extract text from images

You can use Computer Vision to extract text from an image into a machine-readable character stream using optical character recognition (OCR). If needed, OCR corrects the rotation of the recognized text and provides the frame coordinates of each word. OCR supports 25 languages and automatically detects the language of the recognized text.

You can also use the Read API to extract both printed and handwritten text from images and text-heavy documents. The Read API uses updated models and works for a variety of objects with different surfaces and backgrounds, such as receipts, posters, business cards, letters, and whiteboards. Currently, English is the only supported language.

Moderate content in images

You can use Computer Vision to detect adult and racy content in an image and return a confidence score for both. The filter for adult and racy content detection can be set on a sliding scale to accommodate your preferences.

Use containers

Use Computer Vision containers to recognize printed and handwritten text locally by installing a standardized Docker container closer to your data.

Image requirements

Computer Vision can analyze images that meet the following requirements:
    The image must be presented in JPEG, PNG, GIF, or BMP format.
    The file size of the image must be less than 4 megabytes (MB).
    The dimensions of the image must be greater than 50 x 50 pixels.
    For OCR, the dimensions of the image must be between 50 x 50 and 4200 x 4200 pixels.

Data privacy and security

As with all of the Cognitive Services, developers using the Computer Vision service should be aware of Microsoft's policies on customer data.
See the Cognitive Services page on the Microsoft Trust Center to learn more.

Next steps

Get started with Computer Vision by following a quickstart guide:
    Quickstart: Analyze an image
    Quickstart: Extract handwritten text
    Quickstart: Generate a thumbnail

Quickstart: Analyze a remote image using the REST API and cURL in Computer Vision
4/18/2019 • 2 minutes to read

In this quickstart, you analyze a remotely stored image to extract visual features using Computer Vision's REST API. With the Analyze Image method, you can extract visual features based on image content.

If you don't have an Azure subscription, create a free account before you begin.

Prerequisites

    You must have cURL.
    You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.

Create and run the sample command

To create and run the sample, do the following steps:
1. Copy the following command into a text editor.
2. Make the following changes in the command where needed:
   a. Set the value of the Ocp-Apim-Subscription-Key header to your subscription key.
   b. Replace the request URL ( https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze ) with the endpoint URL for the Analyze Image method from the Azure region where you obtained your subscription keys, if necessary.
   c. Optionally, change the language parameter of the request URL ( language=en ) to use a different supported language.
   d. Optionally, change the image URL in the request body ( http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg ) to the URL of a different image to be analyzed.
3. Open a command prompt window.
4. Paste the command from the text editor into the command prompt window, and then run the command.

    curl -H "Ocp-Apim-Subscription-Key: " -H "Content-Type: application/json" "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Categories,Description&details=Landmarks&language=en" -d "{\"url\":\"http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg\"}"

Examine the response

A successful response is returned in JSON and displayed in the command prompt window, similar to the following example:

    {
      "categories": [
        {
          "name": "outdoor_water",
          "score": 0.9921875,
          "detail": {
            "landmarks": []
          }
        }
      ],
      "description": {
        "tags": [
          "nature", "water", "waterfall", "outdoor", "rock", "mountain", "rocky",
          "grass", "hill", "covered", "hillside", "standing", "side", "group",
          "walking", "white", "man", "large", "snow", "grazing", "forest", "slope",
          "herd", "river", "giraffe", "field"
        ],
        "captions": [
          {
            "text": "a large waterfall over a rocky cliff",
            "confidence": 0.916458423253597
          }
        ]
      },
      "requestId": "b6e33879-abb2-43a0-a96e-02cb5ae0b795",
      "metadata": {
        "height": 959,
        "width": 1280,
        "format": "Jpeg"
      }
    }

Next steps

Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.

Explore the Computer Vision API

Quickstart: Analyze a remote image using the REST API and Go in Computer Vision
4/18/2019 • 3 minutes to read

In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST API. With the Analyze Image method, you can extract visual features based on image content.

If you don't have an Azure subscription, create a free account before you begin.

Prerequisites

    You must have Go installed.
    You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
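The Analyze Image responses in these quickstarts share one JSON shape, whichever language sends the request. As a minimal sketch of how a client might read that shape, the Python below picks the highest-confidence caption and filters categories by score. The sample data is an abbreviated copy of the response shown in the cURL quickstart. The helper names and the 0.5 score threshold are hypothetical choices made for illustration.

```python
import json

# Abbreviated copy of the sample response from the cURL quickstart.
SAMPLE_RESPONSE = """
{
  "categories": [
    {"name": "outdoor_water", "score": 0.9921875, "detail": {"landmarks": []}}
  ],
  "description": {
    "tags": ["nature", "water", "waterfall", "outdoor", "rock"],
    "captions": [
      {"text": "a large waterfall over a rocky cliff",
       "confidence": 0.916458423253597}
    ]
  },
  "metadata": {"height": 959, "width": 1280, "format": "Jpeg"}
}
"""


def best_caption(analysis):
    """Return (text, confidence) of the highest-confidence caption, or None."""
    captions = analysis.get("description", {}).get("captions", [])
    if not captions:
        return None
    top = max(captions, key=lambda c: c["confidence"])
    return top["text"], top["confidence"]


def category_names(analysis, min_score=0.5):
    """Return the names of categories whose score is at least min_score."""
    return [
        c["name"]
        for c in analysis.get("categories", [])
        if c.get("score", 0.0) >= min_score
    ]


if __name__ == "__main__":
    analysis = json.loads(SAMPLE_RESPONSE)
    print(best_caption(analysis))
    print(category_names(analysis))
```

Both helpers tolerate missing keys, because a response only contains the sections that were requested through the visualFeatures parameter.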
Create and run the sample

To create and run the sample, do the following steps:
1. Copy the below code into a text editor.
2. Make the following changes in code where needed:
   a. Replace the value of subscriptionKey with your subscription key.
   b. Replace the value of uriBase with the endpoint URL for the Analyze Image method from the Azure region where you obtained your subscription keys, if necessary.
   c. Optionally, replace the value of imageUrl with the URL of a different image that you want to analyze.
3. Save the code as a file with a .go extension. For example, analyze-image.go.
4. Open a command prompt window.
5. At the prompt, run the go build command to compile the package from the file. For example, go build analyze-image.go.
6. At the prompt, run the compiled package. For example, analyze-image.

    package main

    import (
        "encoding/json"
        "fmt"
        "io/ioutil"
        "net/http"
        "strings"
        "time"
    )

    func main() {
        // Replace with your valid subscription key.
        const subscriptionKey = " "

        // You must use the same Azure region in your REST API method as you used
        // to get your subscription keys. For example, if you got your subscription
        // keys from the West US region, replace "westcentralus" in the URL below
        // with "westus".
        //
        // Free trial subscription keys are generated in the "westus" region.
        // If you use a free trial subscription key, you shouldn't need to change
        // this region.
        const uriBase = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze"
        const imageUrl = "https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg"

        const params = "?visualFeatures=Description&details=Landmarks&language=en"
        const uri = uriBase + params
        const imageUrlEnc = "{\"url\":\"" + imageUrl + "\"}"

        reader := strings.NewReader(imageUrlEnc)

        // Create the HTTP client
        client := &http.Client{
            Timeout: time.Second * 2,
        }

        // Create the POST request, passing the image URL in the request body
        req, err := http.NewRequest("POST", uri, reader)
        if err != nil {
            panic(err)
        }

        // Add request headers
        req.Header.Add("Content-Type", "application/json")
        req.Header.Add("Ocp-Apim-Subscription-Key", subscriptionKey)

        // Send the request and retrieve the response
        resp, err := client.Do(req)
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()

        // Read the response body
        // Note, data is a byte array
        data, err := ioutil.ReadAll(resp.Body)
        if err != nil {
            panic(err)
        }

        // Parse the JSON data from the byte array
        var f interface{}
        if err := json.Unmarshal(data, &f); err != nil {
            panic(err)
        }

        // Format and display the JSON result
        jsonFormatted, _ := json.MarshalIndent(f, "", " ")
        fmt.Println(string(jsonFormatted))
    }

Examine the response

A successful response is returned in JSON.
The sample application parses and displays a successful response in the command prompt window, similar to the following example:

    {
      "categories": [
        {
          "detail": {
            "landmarks": []
          },
          "name": "outdoor_water",
          "score": 0.9921875
        }
      ],
      "description": {
        "captions": [
          {
            "confidence": 0.916458423253597,
            "text": "a large waterfall over a rocky cliff"
          }
        ],
        "tags": [
          "nature", "water", "waterfall", "outdoor", "rock", "mountain", "rocky",
          "grass", "hill", "covered", "hillside", "standing", "side", "group",
          "walking", "white", "man", "large", "snow", "grazing", "forest", "slope",
          "herd", "river", "giraffe", "field"
        ]
      },
      "metadata": {
        "format": "Jpeg",
        "height": 959,
        "width": 1280
      },
      "requestId": "a92f89ab-51f8-4735-a58d-507da2213fc2"
    }

Next steps

Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.

Explore the Computer Vision API

Quickstart: Analyze a remote image using the Computer Vision REST API and Java
4/18/2019 • 4 minutes to read

In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST API. With the Analyze Image method, you can extract visual features based on image content.

If you don't have an Azure subscription, create a free account before you begin.

Prerequisites

    You must have Java™ Platform, Standard Edition Development Kit 7 or 8 (JDK 7 or 8) installed.
    You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.

Create and run the sample application

To create and run the sample, do the following steps:
1. Create a new Java project in your favorite IDE or editor.
   If the option is available, create the Java project from a command line application template.
2. Import the following libraries into your Java project. If you're using Maven, the Maven coordinates are provided for each library.
       Apache HTTP client (org.apache.httpcomponents:httpclient:4.5.5)
       Apache HTTP core (org.apache.httpcomponents:httpcore:4.4.9)
       JSON library (org.json:json:20180130)
3. Add the following import statements to the file that contains the Main public class for your project.

    import java.net.URI;
    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.entity.StringEntity;
    import org.apache.http.client.utils.URIBuilder;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClientBuilder;
    import org.apache.http.util.EntityUtils;
    import org.json.JSONObject;

4. Replace the Main public class with the following code, then make the following changes in code where needed:
   a. Replace the value of subscriptionKey with your subscription key.
   b. Replace the value of uriBase with the endpoint URL for the Analyze Image method from the Azure region where you obtained your subscription keys, if necessary.
   c. Optionally, replace the value of imageToAnalyze with the URL of a different image that you want to analyze.

    public class Main {
        // **********************************************
        // *** Update or verify the following values. ***
        // **********************************************

        // Replace with your valid subscription key.
        private static final String subscriptionKey = " ";

        // You must use the same Azure region in your REST API method as you used to
        // get your subscription keys. For example, if you got your subscription keys
        // from the West US region, replace "westcentralus" in the URL
        // below with "westus".
        //
        // Free trial subscription keys are generated in the "westus" region.
        // If you use a free trial subscription key, you shouldn't need to change
        // this region.
        private static final String uriBase =
            "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze";

        private static final String imageToAnalyze =
            "https://upload.wikimedia.org/wikipedia/commons/" +
            "1/12/Broadway_and_Times_Square_by_night.jpg";

        public static void main(String[] args) {
            CloseableHttpClient httpClient = HttpClientBuilder.create().build();

            try {
                URIBuilder builder = new URIBuilder(uriBase);

                // Request parameters. All of them are optional.
                builder.setParameter("visualFeatures", "Categories,Description,Color");
                builder.setParameter("language", "en");

                // Prepare the URI for the REST API method.
                URI uri = builder.build();
                HttpPost request = new HttpPost(uri);

                // Request headers.
                request.setHeader("Content-Type", "application/json");
                request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKey);

                // Request body.
                StringEntity requestEntity =
                    new StringEntity("{\"url\":\"" + imageToAnalyze + "\"}");
                request.setEntity(requestEntity);

                // Call the REST API method and get the response entity.
                HttpResponse response = httpClient.execute(request);
                HttpEntity entity = response.getEntity();

                if (entity != null) {
                    // Format and display the JSON response.
                    String jsonString = EntityUtils.toString(entity);
                    JSONObject json = new JSONObject(jsonString);
                    System.out.println("REST Response:\n");
                    System.out.println(json.toString(2));
                }
            } catch (Exception e) {
                // Display error message.
                System.out.println(e.getMessage());
            }
        }
    }

Compile and run the program

1. Save, then build the Java project.
2. If you're using an IDE, run Main. Alternately, if you're running the program from a command line window, run the following commands. These commands presume your libraries are in a folder named libs that is in the same folder as Main.java; if not, you will need to replace libs with the path to your libraries. The commands use the Windows classpath separator ( ; ); on Linux or macOS, use a colon ( : ) instead.
   a. Compile the file Main.java.

          javac -cp ".;libs/*" Main.java

   b. Run the program. It sends the request to the Computer Vision API and prints the response to the command line window.

          java -cp ".;libs/*" Main

Examine the response

A successful response is returned in JSON. The sample application parses and displays a successful response in the console window, similar to the following example:

    REST Response:

    {
      "metadata": {
        "width": 1826,
        "format": "Jpeg",
        "height": 2436
      },
      "color": {
        "dominantColorForeground": "Brown",
        "isBWImg": false,
        "accentColor": "B74314",
        "dominantColorBackground": "Brown",
        "dominantColors": ["Brown"]
      },
      "requestId": "bbffe1a1-4fa3-4a6b-a4d5-a4964c58a811",
      "description": {
        "captions": [{
          "confidence": 0.8241405091548035,
          "text": "a group of people on a city street filled with traffic at night"
        }],
        "tags": [
          "outdoor", "building", "street", "city", "busy", "people", "filled",
          "traffic", "many", "table", "car", "group", "walking", "bunch", "crowded",
          "large", "night", "light", "standing", "man", "tall", "umbrella", "riding",
          "sign", "crowd"
        ]
      },
      "categories": [{
        "score": 0.625,
        "name": "outdoor_street"
      }]
    }

Clean up resources

When no longer needed, delete the Java project, including the compiled class and imported libraries.

Next steps

Explore a Java Swing application that uses Computer Vision to perform optical character recognition (OCR); create smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To rapidly experiment with the Computer Vision API, try the Open API testing console.

Computer Vision API Java Tutorial

Quickstart: Analyze a remote image using the REST API and JavaScript in Computer Vision
4/19/2019 • 3 minutes to read

In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST API. With the Analyze Image method, you can extract visual features based on image content.
If you don't have an Azure subscription, create a free account before you begin.

Prerequisites

    You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.

Create and run the sample

To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
   a. Replace the value of subscriptionKey with your subscription key.
   b. Replace the value of uriBase with the endpoint URL for the Analyze Image method from the Azure region where you obtained your subscription keys, if necessary.
   c. Optionally, replace the value of the value attribute for the inputImage control with the URL of a different image that you want to analyze.
3. Save the code as a file with an .html extension. For example, analyze-image.html.
4. Open a browser window.
5. In the browser, drag and drop the file into the browser window.
6. When the webpage is displayed in the browser, choose the Analyze Image button.

[The source of the sample page is missing from this copy. Its surviving text: title "Analyze Sample"; heading "Analyze image:"; instruction "Enter the URL to an image, then click the Analyze image button."; labels "Image to analyze:", "Response:", "Source image:".]

[Text is missing here. The fragment below is the end of a PHP sample.]

    " . json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "";
    } catch (HttpException $ex) {
        echo "" . $ex . "";
    }
    ?>

Examine the response

A successful response is returned in JSON. The sample website parses and displays a successful response in the browser window, similar to the following example:

    {
      "categories": [
        {
          "name": "outdoor_water",
          "score": 0.9921875,
          "detail": {
            "landmarks": []
          }
        }
      ],
      "description": {
        "tags": [
          "nature", "water", "waterfall", "outdoor", "rock", "mountain", "rocky",
          "grass", "hill", "covered", "hillside", "standing", "side", "group",
          "walking", "white", "man", "large", "snow", "grazing", "forest", "slope",
          "herd", "river", "giraffe", "field"
        ],
        "captions": [
          {
            "text": "a large waterfall over a rocky cliff",
            "confidence": 0.916458423253597
          }
        ]
      },
      "requestId": "ebf5a1bc-3ba2-4c56-99b4-bbd20ba28705",
      "metadata": {
        "height": 959,
        "width": 1280,
        "format": "Jpeg"
      }
    }

Clean up resources

When no longer needed, delete the file, and then uninstall the PHP5 HTTP_Request2 package. To uninstall the package, do the following steps:
1. Open a command prompt window as an administrator.
2. Run the following command:

       pear uninstall HTTP_Request2

3. After the package is successfully uninstalled, close the command prompt window.

Next steps

Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.

Explore the Computer Vision API

Quickstart: Analyze a remote image using the REST API and Python in Computer Vision
4/19/2019 • 3 minutes to read

In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST API. With the Analyze Image method, you can extract visual features based on image content.

You can run this quickstart in a step-by-step fashion using a Jupyter notebook on MyBinder.
To launch Binder, select the launch binder button.

If you don't have an Azure subscription, create a free account before you begin.

Prerequisites

    You must have Python installed if you want to run the sample locally.
    You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
    You must have the following Python packages installed. You can use pip to install Python packages.
        requests
        matplotlib
        pillow

Create and run the sample

To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
   a. Replace the value of subscription_key with your subscription key.
   b. Replace the value of vision_base_url with the endpoint URL for the Computer Vision resource in the Azure region where you obtained your subscription keys, if necessary.
   c. Optionally, replace the value of image_url with the URL of a different image that you want to analyze.
3. Save the code as a file with a .py extension. For example, analyze-image.py.
4. Open a command prompt window.
5. At the prompt, use the python command to run the sample. For example, python analyze-image.py.

    import requests
    # If you are using a Jupyter notebook, uncomment the following line.
    #%matplotlib inline
    import matplotlib.pyplot as plt
    import json
    from PIL import Image
    from io import BytesIO

    # Replace

[The rest of the sample is missing from this copy.]
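The Python sample above breaks off after its imports. As a stand-in, and explicitly not a reconstruction of the original code, the following sketch shows one way an Analyze Image request could be assembled in Python. It assumes the same endpoint pattern, query parameters, headers, and JSON body that the cURL quickstart uses. The function names and defaults are hypothetical, and the standard library replaces the requests package only to keep the sketch free of dependencies.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_analyze_request(subscription_key, image_url, region="westcentralus",
                          visual_features="Categories,Description",
                          language="en"):
    """Build (but do not send) a POST request for the Analyze Image method."""
    # safe="," keeps the comma-separated feature list unescaped in the URL.
    query = urlencode(
        {"visualFeatures": visual_features, "language": language}, safe=","
    )
    uri = "https://{}.api.cognitive.microsoft.com/vision/v2.0/analyze?{}".format(
        region, query
    )
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(uri, data=body, headers=headers, method="POST")


def analyze(request, timeout=10):
    """Send the request and return the decoded JSON (needs network and a key)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling analyze(build_analyze_request(key, image_url)) would perform the actual call. The region argument must match the region in which the subscription key was issued, as the quickstart steps require.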
[The source of the thumbnail sample page is missing from this copy. Its surviving text: heading "Generate thumbnail image:"; instruction "Enter the URL to an image to use in creating a thumbnail image, then click the Generate thumbnail button."; labels "Image for thumbnail:", "Response:", "Source image:", "Thumbnail:".]

[Text is missing here. The fragment below is the end of a PHP sample.]

    " . json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "";
    } catch (HttpException $ex) {
        echo "" . $ex . "";
    }
    ?>

Examine the response

A successful response is returned as binary data, which represents the image data for the thumbnail. If the request fails, the response is displayed in the browser window. The response for the failed request contains an error code and a message to help determine what went wrong.

Clean up resources

When no longer needed, delete the file, and then uninstall the PHP5 HTTP_Request2 package. To uninstall the package, do the following steps:
1. Open a command prompt window as an administrator.
2. Run the following command:

       pear uninstall HTTP_Request2

3. After the package is successfully uninstalled, close the command prompt window.

Next steps

Explore the Computer Vision API to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.

Explore the Computer Vision API

Quickstart: Generate a thumbnail using the REST API and Python in Computer Vision
4/18/2019 • 2 minutes to read

In this quickstart, you will generate a thumbnail from an image using Computer Vision's REST API. With the Get Thumbnail method, you can specify the desired height and width, and Computer Vision uses smart cropping to intelligently identify the area of interest and generate cropping coordinates based on that region.

If you don't have an Azure subscription, create a free account before you begin.

Prerequisites

    You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
    You must have a code editor, such as Visual Studio Code.

Create and run the sample

To create and run the sample, copy the following code into the code editor.

    import requests
    # If you are using a Jupyter notebook, uncomment the following line.
    #%matplotlib inline
    import matplotlib.pyplot as plt
    from PIL import Image
    from io import BytesIO

    # Replace

[The rest of the sample is missing from this copy.]
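Because the thumbnail sample above is cut off, here is a hedged sketch of the two pieces the quickstart describes: a request URI carrying the desired width and height with smart cropping, and handling of a response that is binary on success and an error message on failure. The generateThumbnail path and the width, height, and smartCropping parameter names are assumptions about the v2.0 REST API that this guide does not confirm. Both function names are hypothetical.

```python
from urllib.parse import urlencode


def thumbnail_uri(width, height, region="westcentralus", smart_cropping=True):
    """Build the request URI for a thumbnail of the given size."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    query = urlencode({
        "width": width,
        "height": height,
        "smartCropping": "true" if smart_cropping else "false",
    })
    # Assumption: the Get Thumbnail method is exposed at /generateThumbnail.
    return (
        "https://{}.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail?{}"
        .format(region, query)
    )


def thumbnail_bytes(status_code, body):
    """Return the thumbnail bytes on success, or raise with the error text."""
    if status_code == 200:
        return body
    raise RuntimeError(
        "thumbnail request failed ({}): {}".format(
            status_code, body.decode("utf-8", errors="replace")
        )
    )


if __name__ == "__main__":
    print(thumbnail_uri(50, 50))
```

On success, the returned bytes can be written to a file opened in binary mode or passed to an image library, which is presumably what the original sample's PIL and BytesIO imports were for.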
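[The source of the OCR sample page is missing from this copy. Its surviving text: heading "Optical Character Recognition (OCR):"; instruction "Enter the URL to an image of printed text, then click the Read image button."; labels "Image to read:", "Response:", "Source image:".]

[Text is missing here. The fragment below is the end of a PHP sample.]

    " . json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "";
    } catch (HttpException $ex) {
        echo "" . $ex . "";
    }
    ?>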