What Is The Computer Vision API? Reference Guide


Contents
Computer Vision API Documentation
Overview
What is Computer Vision?
Quickstarts
Using the REST API
Analyze a remote image
cURL
Go
Java
JavaScript
Node.js
PHP
Python
Ruby
Analyze a local image
C#
Python
Generate a thumbnail
C#
cURL
Go
Java
JavaScript
Node.js
PHP
Python
Ruby
Extract printed text
C#
cURL
Go
Java
JavaScript
Node.js
PHP
Python
Ruby
Extract handwritten text
C#
Java
JavaScript
Python
Use a domain model
PHP
Python
Using the .NET SDK
Analyze an image
Generate a thumbnail
Extract text
Using the Python SDK
Tutorials
Generate metadata for images
Concepts
Tagging images
Detecting objects
Detecting brands
Categorizing images
Describing images
Detecting faces
Detecting image types
Detecting domain-specific content
Detecting color schemes
Generating thumbnails
Recognize printed and handwritten text
Detecting adult and racy content
How-to guides
Use Computer Vision
Java
JavaScript
Python
Call the Computer Vision API
Use containers
Install and run containers
Configure containers
Use the Computer Vision Connected Service
Analyze videos in real time
Reference
Azure CLI
Azure PowerShell
Computer Vision API v2.0
Computer Vision API v1.0
SDKs
.NET
Node.js
Python
Go
Android (Java)
Swift
Resources
Samples
Explore an image processing app
Other Computer Vision samples
FAQ
Category taxonomy
Language support
Pricing and limits
UserVoice
Stack Overflow
Azure roadmap
Regional availability
Compliance
What is Computer Vision?

Azure's Computer Vision service provides developers with access to advanced algorithms that process images and return information. To analyze an image, you can either upload an image or specify an image URL. The image processing algorithms can analyze content in several different ways, depending on the visual features you're interested in. For example, Computer Vision can determine if an image contains adult or racy content, or it can find all of the human faces in an image.
You can use Computer Vision in your application by using either a native SDK or invoking the REST API directly. This page broadly covers what you can do with Computer Vision.

Analyze images for insight

You can analyze images to detect and provide insights about their visual features and characteristics. All of the features listed below are provided by the Analyze Image API.

Tag visual features: Identify and tag visual features in an image, from a set of thousands of recognizable objects, living things, scenery, and actions. When the tags are ambiguous or not common knowledge, the API response provides 'hints' to clarify the meaning of the tag in the context of a known setting. Tagging isn't limited to the main subject, such as a person in the foreground, but also includes the setting (indoor or outdoor), furniture, tools, plants, animals, accessories, gadgets, and so on.

Detect objects: Object detection is similar to tagging, but the API returns the bounding box coordinates for each tag applied. For example, if an image contains a dog, cat, and person, the Detect operation will list those objects together with their coordinates in the image. You can use this functionality to process further relationships between the objects in an image. It also lets you know when there are multiple instances of the same tag in an image.

Detect brands: Identify commercial brands in images or videos from a database of thousands of global logos. You can use this feature, for example, to discover which brands are most popular on social media or most prevalent in media product placement.

Categorize an image: Identify and categorize an entire image, using a category taxonomy with parent/child hereditary hierarchies. Categories can be used alone, or with our new tagging models. Currently, English is the only supported language for tagging and categorizing images.
Describe an image: Generate a description of an entire image in human-readable language, using complete sentences. Computer Vision's algorithms generate various descriptions based on the objects identified in the image. The descriptions are each evaluated and a confidence score generated. A list is then returned ordered from highest confidence score to lowest.

Detect faces: Detect faces in an image and provide information about each detected face. Computer Vision returns the coordinates, rectangle, gender, and age for each detected face. Computer Vision provides a subset of the functionality that can be found in Face, and you can use the Face service for more detailed analysis, such as facial identification and pose detection.

Detect image types: Detect characteristics about an image, such as whether an image is a line drawing or the likelihood of whether an image is clip art.

Detect domain-specific content: Use domain models to detect and identify domain-specific content in an image, such as celebrities and landmarks. For example, if an image contains people, Computer Vision can use a domain model for celebrities included with the service to determine if the people detected in the image match known celebrities.

Detect the color scheme: Analyze color usage within an image. Computer Vision can determine whether an image is black & white or color and, for color images, identify the dominant and accent colors.

Generate a thumbnail: Analyze the contents of an image to generate an appropriate thumbnail for that image. Computer Vision first generates a high-quality thumbnail and then analyzes the objects within the image to determine the area of interest. Computer Vision then crops the image to fit the requirements of the area of interest. The generated thumbnail can be presented using an aspect ratio that is different from the aspect ratio of the original image, depending on your needs.

Get the area of interest: Analyze the contents of an image to return the coordinates of the area of interest. This is the same function that is used to generate a thumbnail, but instead of cropping the image, Computer Vision returns the bounding box coordinates of the region, so the calling application can modify the original image as desired.
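Each of the actions above corresponds to a value of the visualFeatures query parameter accepted by the Analyze Image REST method, as the quickstarts later in this guide show in detail. As a rough illustration only (the regional endpoint, subscription key, and image URL below are placeholders you would replace with your own values), a minimal Python call requesting several features at once might look like the following sketch:

import requests

# Placeholders: substitute your own subscription key and regional endpoint.
subscription_key = "<Subscription Key>"
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
analyze_url = vision_base_url + "analyze"

# Any publicly reachable image URL works here.
image_url = "https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg"

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
# Each comma-separated value selects one of the actions described above.
params = {"visualFeatures": "Categories,Tags,Description,Faces,ImageType,Color,Adult"}

response = requests.post(analyze_url, headers=headers, params=params, json={"url": image_url})
response.raise_for_status()
analysis = response.json()

# Print the highest-confidence caption and the tags that were returned.
print(analysis["description"]["captions"][0]["text"])
print([tag["name"] for tag in analysis["tags"]])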
Extract text from images

You can use Computer Vision to extract text from an image into a machine-readable character stream using optical character recognition (OCR). If needed, OCR corrects the rotation of the recognized text and provides the frame coordinates of each word. OCR supports 25 languages and automatically detects the language of the recognized text.
You can also use the Read API to extract both printed and handwritten text from images and text-heavy documents. The Read API uses updated models and works for a variety of objects with different surfaces and backgrounds, such as receipts, posters, business cards, letters, and whiteboards. Currently, English is the only supported language.
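As a rough sketch of the OCR flow described above (the endpoint region, subscription key, and sample image URL are placeholder assumptions rather than fixed values), the following Python snippet posts an image URL to the OCR method and reassembles the recognized words line by line:

import requests

# Placeholders: substitute your own subscription key and regional endpoint.
subscription_key = "<Subscription Key>"
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
ocr_url = vision_base_url + "ocr"

# Substitute the URL of any image that contains printed text.
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png"

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
# 'unk' lets the service detect the language; detectOrientation corrects rotated text.
params = {"language": "unk", "detectOrientation": "true"}

response = requests.post(ocr_url, headers=headers, params=params, json={"url": image_url})
response.raise_for_status()
ocr_result = response.json()

# Reassemble the recognized words into lines of text.
for region in ocr_result["regions"]:
    for line in region["lines"]:
        print(" ".join(word["text"] for word in line["words"]))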
Moderate content in images

You can use Computer Vision to detect adult and racy content in an image and return a confidence score for both. The filter for adult and racy content detection can be set on a sliding scale to accommodate your preferences.
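The following minimal Python sketch shows one way to apply such a sliding scale on top of the raw scores; the endpoint, key, image URL, and threshold values are illustrative assumptions, not service defaults:

import requests

# Placeholders: substitute your own subscription key, regional endpoint, and image URL.
subscription_key = "<Subscription Key>"
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
analyze_url = vision_base_url + "analyze"
image_url = "https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg"

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
params = {"visualFeatures": "Adult"}

response = requests.post(analyze_url, headers=headers, params=params, json={"url": image_url})
response.raise_for_status()
adult = response.json()["adult"]

# The service returns boolean flags plus raw scores; apply your own thresholds
# if the default flags are stricter or looser than your application needs.
ADULT_THRESHOLD = 0.8   # illustrative value, not a service default
RACY_THRESHOLD = 0.8    # illustrative value, not a service default
print("adult:", adult["isAdultContent"], adult["adultScore"])
print("racy: ", adult["isRacyContent"], adult["racyScore"])
print("blocked by custom policy:",
      adult["adultScore"] > ADULT_THRESHOLD or adult["racyScore"] > RACY_THRESHOLD)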
Use containers

Use Computer Vision containers to recognize printed and handwritten text locally by installing a standardized Docker container closer to your data.
Image requirements

Computer Vision can analyze images that meet the following requirements:
The image must be presented in JPEG, PNG, GIF, or BMP format
The file size of the image must be less than 4 megabytes (MB)
The dimensions of the image must be greater than 50 x 50 pixels
For OCR, the dimensions of the image must be between 50 x 50 and 4200 x 4200 pixels
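Before uploading, you may want to verify a local file against these limits. The following Python sketch is a client-side convenience only (it is not part of the service, and the file path is a placeholder reused from the Python quickstarts); it checks the format, file size, and dimensions with Pillow, which those quickstarts already list as a prerequisite:

import os
from PIL import Image

MAX_BYTES = 4 * 1024 * 1024      # images must be smaller than 4 MB
MIN_SIDE = 50                    # dimensions must be greater than 50 x 50 pixels
OCR_MAX_SIDE = 4200              # OCR also requires at most 4200 x 4200 pixels
SUPPORTED_FORMATS = {"JPEG", "PNG", "GIF", "BMP"}

def check_image(path, for_ocr=False):
    """Return a list of reasons the file would not meet the requirements above."""
    problems = []
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file size must be less than 4 MB")
    with Image.open(path) as img:
        if img.format not in SUPPORTED_FORMATS:
            problems.append("unsupported format: %s" % img.format)
        width, height = img.size
        if width <= MIN_SIDE or height <= MIN_SIDE:
            problems.append("dimensions must be greater than 50 x 50 pixels")
        if for_ocr and (width > OCR_MAX_SIDE or height > OCR_MAX_SIDE):
            problems.append("OCR requires dimensions of at most 4200 x 4200 pixels")
    return problems

# Example call; the path is a placeholder used elsewhere in this guide.
print(check_image("C:/Documents/ImageToAnalyze.jpg", for_ocr=True))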
Data privacy and security

As with all of the Cognitive Services, developers using the Computer Vision service should be aware of Microsoft's policies on customer data. See the Cognitive Services page on the Microsoft Trust Center to learn more.
Next steps

Get started with Computer Vision by following a quickstart guide:
Quickstart: Analyze an image
Quickstart: Extract handwritten text
Quickstart: Generate a thumbnail
Quickstart: Analyze a remote image using the REST API and cURL in Computer Vision

In this quickstart, you analyze a remotely stored image to extract visual features using Computer Vision's REST API. With the Analyze Image method, you can extract visual features based on image content.

Prerequisites

If you don't have an Azure subscription, create a free account before you begin.
You must have cURL.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.

Create and run the sample command

To create and run the sample, do the following steps:
1. Copy the following command into a text editor.
2. Make the following changes in the command where needed:
   a. Replace the value of <subscriptionKey> with your subscription key.
   b. Replace the request URL (https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze) with the endpoint URL for the Analyze Image method from the Azure region where you obtained your subscription keys, if necessary.
   c. Optionally, change the language parameter of the request URL (language=en) to use a different supported language.
   d. Optionally, change the image URL in the request body (http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg) to the URL of a different image to be analyzed.
3. Open a command prompt window.
4. Paste the command from the text editor into the command prompt window, and then run the command.

curl -H "Ocp-Apim-Subscription-Key: <subscriptionKey>" -H "Content-Type: application/json" "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Categories,Description&details=Landmarks&language=en" -d "{\"url\":\"http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg\"}"

Examine the response

A successful response is returned in JSON. The sample application parses and displays a successful response in the command prompt window, similar to the following example:
{
"categories": [
{
"name": "outdoor_water",
"score": 0.9921875,
"detail": {
"landmarks": []
}
}
],
"description": {
"tags": [
"nature",
"water",
"waterfall",
"outdoor",
"rock",
"mountain",
"rocky",
"grass",
"hill",
"covered",
"hillside",
"standing",
"side",
"group",
"walking",
"white",
"man",
"large",
"snow",
"grazing",
"forest",
"slope",
"herd",
"river",
"giraffe",
"field"
],
"captions": [
{
"text": "a large waterfall over a rocky cliff",
"confidence": 0.916458423253597
}
]
},
"requestId": "b6e33879-abb2-43a0-a96e-02cb5ae0b795",
"metadata": {
"height": 959,
"width": 1280,
"format": "Jpeg"
}
}
Next steps
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Analyze a remote image using the REST API and Go in Computer Vision

In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST API. With the Analyze Image method, you can extract visual features based on image content.

Prerequisites

If you don't have an Azure subscription, create a free account before you begin.
You must have Go installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.

Create and run the sample

To create and run the sample, do the following steps:
1. Copy the below code into a text editor.
2. Make the following changes in code where needed:
   a. Replace the value of subscriptionKey with your subscription key.
   b. Replace the value of uriBase with the endpoint URL for the Analyze Image method from the Azure region where you obtained your subscription keys, if necessary.
   c. Optionally, replace the value of imageUrl with the URL of a different image that you want to analyze.
3. Save the code as a file with a .go extension. For example, analyze-image.go.
4. Open a command prompt window.
5. At the prompt, run the go build command to compile the package from the file. For example, go build analyze-image.go.
6. At the prompt, run the compiled package. For example, analyze-image.
package main
import (
"encoding/json"
"fmt"
"io/ioutil"
"net/http"
"strings"
"time"
)
func main() {
// Replace <Subscription Key> with your valid subscription key.
const subscriptionKey = "<Subscription Key>"
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
const uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze"
const imageUrl =
"https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg"
const params = "?visualFeatures=Description&details=Landmarks&language=en"
const uri = uriBase + params
const imageUrlEnc = "{\"url\":\"" + imageUrl + "\"}"
reader := strings.NewReader(imageUrlEnc)
// Create the HTTP client
client := &http.Client{
Timeout: time.Second * 2,
}
// Create the POST request, passing the image URL in the request body
req, err := http.NewRequest("POST", uri, reader)
if err != nil {
panic(err)
}
// Add request headers
req.Header.Add("Content-Type", "application/json")
req.Header.Add("Ocp-Apim-Subscription-Key", subscriptionKey)
// Send the request and retrieve the response
resp, err := client.Do(req)
if err != nil {
panic(err)
}
defer resp.Body.Close()
// Read the response body
// Note, data is a byte array
data, err := ioutil.ReadAll(resp.Body)
if err != nil {
panic(err)
}
// Parse the JSON data from the byte array
var f interface{}
json.Unmarshal(data, &f)
// Format and display the JSON result
jsonFormatted, _ := json.MarshalIndent(f, "", " ")
fmt.Println(string(jsonFormatted))
}
Examine the response
A successful response is returned in JSON. The sample application parses and displays a successful response in the
command prompt window, similar to the following example:
{
"categories": [
{
"detail": {
"landmarks": []
},
"name": "outdoor_water",
"score": 0.9921875
}
],
"description": {
"captions": [
{
"confidence": 0.916458423253597,
"text": "a large waterfall over a rocky cliff"
}
],
"tags": [
"nature",
"water",
"waterfall",
"outdoor",
"rock",
"mountain",
"rocky",
"grass",
"hill",
"covered",
"hillside",
"standing",
"side",
"group",
"walking",
"white",
"man",
"large",
"snow",
"grazing",
"forest",
"slope",
"herd",
"river",
"giraffe",
"field"
]
},
"metadata": {
"format": "Jpeg",
"height": 959,
"width": 1280
},
"requestId": "a92f89ab-51f8-4735-a58d-507da2213fc2"
}
Next steps
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Analyze a remote image using the Computer Vision REST API and Java
Prerequisites
Create and run the sample application
In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST
API. With the Analyze Image method, you can extract visual features based on image content.
If you don't have an Azure subscription, create a free account before you begin.
You must have Java™ Platform, Standard Edition Development Kit 7 or 8 (JDK 7 or 8) installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.util.EntityUtils;
import org.json.JSONObject;
1. Create a new Java project in your favorite IDE or editor. If the option is available, create the Java project from
a command line application template.
2. Import the following libraries into your Java project. If you're using Maven, the Maven coordinates are
provided for each library.
Apache HTTP client (org.apache.httpcomponents:httpclient:4.5.5)
Apache HTTP core (org.apache.httpcomponents:httpcore:4.4.9)
JSON library (org.json:json:20180130)
3. Add the following import statements to the file that contains the Main public class for your project.
4. Replace the Main public class with the following code, then make the following changes in code where
needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Analyze Image method from the Azure
region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageToAnalyze with the URL of a different image that you want to
analyze.
public class Main {
// **********************************************
// *** Update or verify the following values. ***
// **********************************************
// Replace <Subscription Key> with your valid subscription key.
private static final String subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
private static final String uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze";
private static final String imageToAnalyze =
"https://upload.wikimedia.org/wikipedia/commons/" +
"1/12/Broadway_and_Times_Square_by_night.jpg";
public static void main(String[] args) {
CloseableHttpClient httpClient = HttpClientBuilder.create().build();
try {
URIBuilder builder = new URIBuilder(uriBase);
// Request parameters. All of them are optional.
builder.setParameter("visualFeatures", "Categories,Description,Color");
builder.setParameter("language", "en");
// Prepare the URI for the REST API method.
URI uri = builder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKey);
// Request body.
StringEntity requestEntity =
new StringEntity("{\"url\":\"" + imageToAnalyze + "\"}");
request.setEntity(requestEntity);
// Call the REST API method and get the response entity.
HttpResponse response = httpClient.execute(request);
HttpEntity entity = response.getEntity();
if (entity != null) {
// Format and display the JSON response.
String jsonString = EntityUtils.toString(entity);
JSONObject json = new JSONObject(jsonString);
System.out.println("REST Response:\n");
System.out.println(json.toString(2));
}
} catch (Exception e) {
// Display error message.
System.out.println(e.getMessage());
}
}
}
Compile and run the program
1. Save, then build the Java project.
2. If you're using an IDE, run Main .
Alternately, if you're running the program from a command line window, run the following commands. These
commands presume your libraries are in a folder named libs that is in the same folder as Main.java ; if not, you
will need to replace libs with the path to your libraries.
javac -cp ".;libs/*" Main.java
java -cp ".;libs/*" Main
1. Compile the file Main.java.
2. Run the program. It sends the request to the Computer Vision API, and the JSON response is printed to the command line window.
Examine the response

A successful response is returned in JSON. The sample application parses and displays a successful response in the console window, similar to the following example:
REST Response:
{
"metadata": {
"width": 1826,
"format": "Jpeg",
"height": 2436
},
"color": {
"dominantColorForeground": "Brown",
"isBWImg": false,
"accentColor": "B74314",
"dominantColorBackground": "Brown",
"dominantColors": ["Brown"]
},
"requestId": "bbffe1a1-4fa3-4a6b-a4d5-a4964c58a811",
"description": {
"captions": [{
"confidence": 0.8241405091548035,
"text": "a group of people on a city street filled with traffic at night"
}],
"tags": [
"outdoor",
"building",
"street",
"city",
"busy",
"people",
"filled",
"traffic",
"many",
"table",
"car",
"group",
"walking",
"bunch",
"crowded",
"large",
"night",
"light",
"standing",
"man",
"tall",
"umbrella",
"riding",
"sign",
"crowd"
]
},
"categories": [{
"score": 0.625,
"name": "outdoor_street"
}]
}
Clean up resources
Next steps
When no longer needed, delete the Java project, including the compiled class and imported libraries.
Explore a Java Swing application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API Java Tutorial
Quickstart: Analyze a remote image using the REST API and JavaScript in Computer Vision

In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST API. With the Analyze Image method, you can extract visual features based on image content.

Prerequisites

If you don't have an Azure subscription, create a free account before you begin.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.

Create and run the sample

To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
   a. Replace the value of subscriptionKey with your subscription key.
   b. Replace the value of uriBase with the endpoint URL for the Analyze Image method from the Azure region where you obtained your subscription keys, if necessary.
   c. Optionally, replace the value of the value attribute for the inputImage control with the URL of a different image that you want to analyze.
3. Save the code as a file with an .html extension. For example, analyze-image.html.
4. Open a browser window.
5. In the browser, drag and drop the file into the browser window.
6. When the webpage is displayed in the browser, choose the Analyze Image button.
<!DOCTYPE html>
<html>
<head>
<title>Analyze Sample</title>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>
<script type="text/javascript">
function processImage() {
// **********************************************
// *** Update or verify the following values. ***
// **********************************************
// Replace <Subscription Key> with your valid subscription key.
var subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
var uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze";
// Request parameters.
var params = {
"visualFeatures": "Categories,Description,Color",
"details": "",
"language": "en",
};
// Display the image.
var sourceImageUrl = document.getElementById("inputImage").value;
document.querySelector("#sourceImage").src = sourceImageUrl;
// Make the REST API call.
$.ajax({
url: uriBase + "?" + $.param(params),
// Request headers.
beforeSend: function(xhrObj){
xhrObj.setRequestHeader("Content-Type","application/json");
xhrObj.setRequestHeader(
"Ocp-Apim-Subscription-Key", subscriptionKey);
},
type: "POST",
// Request body.
data: '{"url": ' + '"' + sourceImageUrl + '"}',
})
.done(function(data) {
// Show formatted JSON on webpage.
$("#responseTextArea").val(JSON.stringify(data, null, 2));
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Display error message.
var errorString = (errorThrown === "") ? "Error. " :
errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" :
jQuery.parseJSON(jqXHR.responseText).message;
alert(errorString);
});
};
</script>
<h1>Analyze image:</h1>
Enter the URL to an image, then click the <strong>Analyze image</strong> button.
<br><br>
Image to analyze:
<input type="text" name="inputImage" id="inputImage"
value="https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg" />
<button onclick="processImage()">Analyze image</button>
<br><br>
<div id="wrapper" style="width:1020px; display:table;">
<div id="jsonOutput" style="width:600px; display:table-cell;">
Response:
<br><br>
<textarea id="responseTextArea" class="UIInput"
style="width:580px; height:400px;"></textarea>
</div>
<div id="imageDiv" style="width:420px; display:table-cell;">
Source image:
<br><br>
<img id="sourceImage" width="400" />
</div>
</div>
</body>
</html>
Examine the response
A successful response is returned in JSON. The sample webpage parses and displays a successful response in the
browser window, similar to the following example:
{
"categories": [
{
"name": "outdoor_water",
"score": 0.9921875,
"detail": {
"landmarks": []
}
}
],
"description": {
"tags": [
"nature",
"water",
"waterfall",
"outdoor",
"rock",
"mountain",
"rocky",
"grass",
"hill",
"covered",
"hillside",
"standing",
"side",
"group",
"walking",
"white",
"man",
"large",
"snow",
"grazing",
"forest",
"slope",
"herd",
"river",
"giraffe",
"field"
],
"captions": [
{
"text": "a large waterfall over a rocky cliff",
"confidence": 0.916458423253597
}
]
},
"color": {
"dominantColorForeground": "Grey",
"dominantColorBackground": "Green",
"dominantColors": [
"Grey",
"Green"
],
"accentColor": "4D5E2F",
"isBwImg": false
},
"requestId": "73ef10ce-a4ea-43c6-aee7-70325777e4b3",
"metadata": {
"height": 959,
"width": 1280,
"format": "Jpeg"
}
}
Next steps
Explore a JavaScript application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API JavaScript Tutorial
Quickstart: Analyze a remote image using the REST API with Node.js in Computer Vision
Prerequisites
Create and run the sample
In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST
API. With the Analyze Image method, you can extract visual features based on image content.
If you don't have an Azure subscription, create a free account before you begin.
You must have Node.js 4.x or later installed.
You must have npm installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Install the npm request package.
npm install request
a. Open a command prompt window as an administrator.
b. Run the following command:
c. After the package is successfully installed, close the command prompt window.
2. Copy the following code into a text editor.
3. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Analyze Image method from the Azure
region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageUrl with the URL of a different image that you want to analyze.
d. Optionally, replace the value of the language request parameter with a different language.
4. Save the code as a file with a .js extension. For example, analyze-image.js .
5. Open a command prompt window.
6. At the prompt, use the node command to run the file. For example, node analyze-image.js .
'use strict';
const request = require('request');
// Replace <Subscription Key> with your valid subscription key.
const subscriptionKey = '<Subscription Key>';
// You must use the same location in your REST call as you used to get your
// subscription keys. For example, if you got your subscription keys from
// westus, replace "westcentralus" in the URL below with "westus".
const uriBase =
'https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze';
const imageUrl =
'https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg';
// Request parameters.
const params = {
'visualFeatures': 'Categories,Description,Color',
'details': '',
'language': 'en'
};
const options = {
uri: uriBase,
qs: params,
body: '{"url": ' + '"' + imageUrl + '"}',
headers: {
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key' : subscriptionKey
}
};
request.post(options, (error, response, body) => {
if (error) {
console.log('Error: ', error);
return;
}
let jsonResponse = JSON.stringify(JSON.parse(body), null, ' ');
console.log('JSON Response\n');
console.log(jsonResponse);
});
Examine the response
A successful response is returned in JSON. The sample parses and displays a successful response in the command
prompt window, similar to the following example:
{
"categories": [
{
"name": "outdoor_water",
"score": 0.9921875,
"detail": {
"landmarks": []
}
}
],
"description": {
"tags": [
"nature",
"water",
"waterfall",
"outdoor",
"rock",
"mountain",
"rocky",
"grass",
"hill",
"covered",
"hillside",
"standing",
"side",
"group",
"walking",
"white",
"man",
"large",
"snow",
"grazing",
"forest",
"slope",
"herd",
"river",
"giraffe",
"field"
],
"captions": [
{
"text": "a large waterfall over a rocky cliff",
"confidence": 0.916458423253597
}
]
},
"color": {
"dominantColorForeground": "Grey",
"dominantColorBackground": "Green",
"dominantColors": [
"Grey",
"Green"
],
"accentColor": "4D5E2F",
"isBwImg": false
},
"requestId": "81b4e400-e3c1-41f1-9020-e6871ad9f0ed",
"metadata": {
"height": 959,
"width": 1280,
"format": "Jpeg"
}
}
Clean up resources
Next steps
When no longer needed, delete the file, and then uninstall the npm request package. To uninstall the package, do
the following steps:
npm uninstall request
1. Open a command prompt window as an administrator.
2. Run the following command:
3. After the package is successfully uninstalled, close the command prompt window.
Explore the Computer Vision APIs used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Analyze a remote image using the REST API and PHP in Computer Vision
Prerequisites
Create and run the sample
In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST
API. With the Analyze Image method, you can extract visual features based on image content.
If you don't have an Azure subscription, create a free account before you begin.
You must have PHP installed.
You must have Pear installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Install the PHP5 HTTP_Request2 package.
pear install HTTP_Request2
a. Open a command prompt window as an administrator.
b. Run the following command:
c. After the package is successfully installed, close the command prompt window.
2. Copy the following code into a text editor.
3. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Analyze Image method from the Azure
region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageUrl with the URL of a different image that you want to analyze.
d. Optionally, replace the value of the language request parameter with a different language.
4. Save the code as a file with a .php extension. For example, analyze-image.php .
5. Open a browser window with PHP support.
6. Drag and drop the file into the browser window.
<html>
<head>
<title>Analyze Image Sample</title>
</head>
<body>
<?php
// Replace <Subscription Key> with a valid subscription key.
$ocpApimSubscriptionKey = '<Subscription Key>';
// You must use the same location in your REST call as you used to obtain
// your subscription keys. For example, if you obtained your subscription keys
// from westus, replace "westcentralus" in the URL below with "westus".
$uriBase = 'https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/';
$imageUrl = 'https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg';
require_once 'HTTP/Request2.php';
$request = new Http_Request2($uriBase . 'analyze');
$url = $request->getUrl();
$headers = array(
// Request headers
'Content-Type' => 'application/json',
'Ocp-Apim-Subscription-Key' => $ocpApimSubscriptionKey
);
$request->setHeader($headers);
$parameters = array(
// Request parameters
'visualFeatures' => 'Categories,Description',
'details' => '',
'language' => 'en'
);
$url->setQueryVariables($parameters);
$request->setMethod(HTTP_Request2::METHOD_POST);
// Request body parameters
$body = json_encode(array('url' => $imageUrl));
// Request body
$request->setBody($body);
try
{
$response = $request->send();
echo "<pre>" .
json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "</pre>";
}
catch (HttpException $ex)
{
echo "<pre>" . $ex . "</pre>";
}
?>
</body>
</html>
Examine the response
A successful response is returned in JSON. The sample website parses and displays a successful response in the
browser window, similar to the following example:
{
"categories": [
{
"name": "outdoor_water",
"score": 0.9921875,
"detail": {
"landmarks": []
}
}
],
"description": {
"tags": [
"nature",
"water",
"waterfall",
"outdoor",
"rock",
"mountain",
"rocky",
"grass",
"hill",
"covered",
"hillside",
"standing",
"side",
"group",
"walking",
"white",
"man",
"large",
"snow",
"grazing",
"forest",
"slope",
"herd",
"river",
"giraffe",
"field"
],
"captions": [
{
"text": "a large waterfall over a rocky cliff",
"confidence": 0.916458423253597
}
]
},
"requestId": "ebf5a1bc-3ba2-4c56-99b4-bbd20ba28705",
"metadata": {
"height": 959,
"width": 1280,
"format": "Jpeg"
}
}
Clean up resources
When no longer needed, delete the file, and then uninstall the PHP5 HTTP_Request2 package. To uninstall the
package, do the following steps:
1. Open a command prompt window as an administrator.
2. Run the following command:
Next steps
pear uninstall HTTP_Request2
3. After the package is successfully uninstalled, close the command prompt window.
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Analyze a remote image using the REST API and Python in Computer Vision
Prerequisites
Create and run the sample
In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST
API. With the Analyze Image method, you can extract visual features based on image content.
You can run this quickstart in a step-by-step fashion using a Jupyter notebook on MyBinder. To launch Binder, select the following button:
launch binder
If you don't have an Azure subscription, create a free account before you begin.
You must have Python installed if you want to run the sample locally.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
You must have the following Python packages installed. You can use pip to install Python packages.
requests
matplotlib
pillow
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
3. Save the code as a file with a .py extension. For example, analyze-image.py.
4. Open a command prompt window.
5. At the prompt, use the python command to run the sample. For example, python analyze-image.py .
a. Replace the value of subscription_key with your subscription key.
b. Replace the value of vision_base_url with the endpoint URL for the Computer Vision resource in the
Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of image_url with the URL of a different image that you want to analyze.
import requests
# If you are using a Jupyter notebook, uncomment the following line.
#%matplotlib inline
import matplotlib.pyplot as plt
import json
from PIL import Image
from io import BytesIO
# Replace <Subscription Key> with your valid subscription key.
subscription_key = "<Subscription Key>"
assert subscription_key
# You must use the same region in your REST call as you used to get your
# subscription keys. For example, if you got your subscription keys from
# westus, replace "westcentralus" in the URI below with "westus".
#
# Free trial subscription keys are generated in the "westus" region.
# If you use a free trial subscription key, you shouldn't need to change
# this region.
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
analyze_url = vision_base_url + "analyze"
# Set image_url to the URL of an image that you want to analyze.
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/1/12/" + \
"Broadway_and_Times_Square_by_night.jpg/450px-Broadway_and_Times_Square_by_night.jpg"
headers = {'Ocp-Apim-Subscription-Key': subscription_key }
params = {'visualFeatures': 'Categories,Description,Color'}
data = {'url': image_url}
response = requests.post(analyze_url, headers=headers, params=params, json=data)
response.raise_for_status()
# The 'analysis' object contains various fields that describe the image. The most
# relevant caption for the image is obtained from the 'description' property.
analysis = response.json()
print(json.dumps(response.json()))
image_caption = analysis["description"]["captions"][0]["text"].capitalize()
# Display the image and overlay it with the caption.
image = Image.open(BytesIO(requests.get(image_url).content))
plt.imshow(image)
plt.axis("off")
_ = plt.title(image_caption, size="x-large", y=-0.1)
plt.show()
Examine the response

A successful response is returned in JSON. The sample application parses and displays a successful response in the command prompt window, similar to the following example:
{
"categories": [
{
"name": "outdoor_",
"score": 0.00390625,
"detail": {
"landmarks": []
}
},
{
"name": "outdoor_street",
"score": 0.33984375,
"detail": {
"landmarks": []
A successful response is returned in JSON. The sample webpage parses and displays a successful response in the
command prompt window, similar to the following example:
"landmarks": []
}
}
],
"description": {
"tags": [
"building",
"outdoor",
"street",
"city",
"people",
"busy",
"table",
"walking",
"traffic",
"filled",
"large",
"many",
"group",
"night",
"light",
"crowded",
"bunch",
"standing",
"man",
"sign",
"crowd",
"umbrella",
"riding",
"tall",
"woman",
"bus"
],
"captions": [
{
"text": "a group of people on a city street at night",
"confidence": 0.9122243847383961
}
]
},
"color": {
"dominantColorForeground": "Brown",
"dominantColorBackground": "Brown",
"dominantColors": [
"Brown"
],
"accentColor": "B54316",
"isBwImg": false
},
"requestId": "c11894eb-de3e-451b-9257-7c8b168073d1",
"metadata": {
"height": 600,
"width": 450,
"format": "Jpeg"
}
}
Next steps
Explore a Python application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API Python Tutorial
Quickstart: Analyze a remote image using the REST API and Ruby in Computer Vision
Prerequisites
Create and run the sample
In this quickstart, you analyze a remotely stored image to extract visual features by using Computer Vision's REST
API. With the Analyze Image method, you can extract visual features based on image content.
If you don't have an Azure subscription, create a free account before you begin.
You must have Ruby 2.4.x or later installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
3. Save the code as a file with a .rb extension. For example, analyze-image.rb.
4. Open a command prompt window.
5. At the prompt, use the ruby command to run the sample. For example, ruby analyze-image.rb .
a. Replace <Subscription Key> with your subscription key.
b. Replace https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze with the endpoint URL
for the Analyze Image method in the Azure region where you obtained your subscription keys, if
necessary.
c. Optionally, replace the value of the language request parameter with a different language.
d. Optionally, replace http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg with the
URL of a different image that you want to analyze.
require 'net/http'
# You must use the same location in your REST call as you used to get your
# subscription keys. For example, if you got your subscription keys from westus,
# replace "westcentralus" in the URL below with "westus".
uri = URI('https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze')
uri.query = URI.encode_www_form({
# Request parameters
'visualFeatures' => 'Categories, Description',
'details' => 'Landmarks',
'language' => 'en'
})
request = Net::HTTP::Post.new(uri.request_uri)
# Request headers
# Replace <Subscription Key> with your valid subscription key.
request['Ocp-Apim-Subscription-Key'] = '<Subscription Key>'
request['Content-Type'] = 'application/json'
request.body =
"{\"url\": \"http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg\"}"
response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
http.request(request)
end
puts response.body
Examine the response

A successful response is returned in JSON. The sample parses and displays a successful response in the command prompt window, similar to the following example:
{
"categories": [
{
"name": "abstract_",
"score": 0.00390625
},
{
"name": "people_",
"score": 0.83984375,
"detail": {
"celebrities": [
{
"name": "Satya Nadella",
"faceRectangle": {
"left": 597,
"top": 162,
"width": 248,
"height": 248
},
"confidence": 0.999028444
}
]
}
}
],
"adult": {
"isAdultContent": false,
"isRacyContent": false,
"adultScore": 0.0934349000453949,
"racyScore": 0.068613491952419281
},
"tags": [
{
"name": "person",
"confidence": 0.98979085683822632
},
{
"name": "man",
"confidence": 0.94493889808654785
},
{
"name": "outdoor",
"confidence": 0.938492476940155
},
{
"name": "window",
"confidence": 0.89513939619064331
}
],
"description": {
"tags": [
"person",
"man",
"outdoor",
"window",
"glasses"
],
"captions": [
{
"text": "Satya Nadella sitting on a bench",
"confidence": 0.48293603002174407
}
]
},
"requestId": "0dbec5ad-a3d3-4f7e-96b4-dfd57efe967d",
"metadata": {
"width": 1500,
"height": 1000,
"format": "Jpeg"
},
"faces": [
{
"age": 44,
"gender": "Male",
"faceRectangle": {
"left": 593,
"top": 160,
"width": 250,
"height": 250
}
}
],
"color": {
"dominantColorForeground": "Brown",
"dominantColorBackground": "Brown",
"dominantColors": [
"Brown",
"Black"
],
"accentColor": "873B59",
"isBWImg": false
},
"imageType": {
"clipArtType": 0,
"lineDrawingType": 0
}
}
Next steps
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Analyze a local image using the REST API and C# in Computer Vision

In this quickstart, you will analyze a locally stored image to extract visual features by using Computer Vision's REST API. With the Analyze Image method, you can extract visual feature information based on image content.

Prerequisites

If you don't have an Azure subscription, create a free account before you begin.
You must have Visual Studio 2015 or later.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.

Create and run the sample application

To create the sample in Visual Studio, do the following steps:
1. Create a new Visual Studio solution in Visual Studio, using the Visual C# Console App (.NET Framework) template.
2. Install the Newtonsoft.Json NuGet package.
   a. On the menu, click Tools, select NuGet Package Manager, then Manage NuGet Packages for Solution.
   b. Click the Browse tab, and in the Search box type "Newtonsoft.Json".
   c. Select Newtonsoft.Json when it displays, then click the checkbox next to your project name, and Install.
3. Replace the code in Program.cs with the following code, and then make the following changes in code where needed:
   a. Replace the value of subscriptionKey with your subscription key.
   b. Replace the value of uriBase with the endpoint URL for the Analyze Image method from the Azure region where you obtained your subscription keys, if necessary.
4. Run the program.
5. At the prompt, enter the path to a local image.
using Newtonsoft.Json.Linq;
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
namespace CSHttpClientSample
{
static class Program
{
// Replace <Subscription Key> with your valid subscription key.
const string subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
const string uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze";
static void Main()
{
// Get the path and filename to process from the user.
Console.WriteLine("Analyze an image:");
Console.Write(
"Enter the path to the image you wish to analyze: ");
string imageFilePath = Console.ReadLine();
if (File.Exists(imageFilePath))
{
// Call the REST API method.
Console.WriteLine("\nWait a moment for the results to appear.\n");
MakeAnalysisRequest(imageFilePath).Wait();
}
else
{
Console.WriteLine("\nInvalid file path");
}
Console.WriteLine("\nPress Enter to exit...");
Console.ReadLine();
}
/// <summary>
/// Gets the analysis of the specified image file by using
/// the Computer Vision REST API.
/// </summary>
/// <param name="imageFilePath">The image file to analyze.</param>
static async Task MakeAnalysisRequest(string imageFilePath)
{
try
{
HttpClient client = new HttpClient();
// Request headers.
client.DefaultRequestHeaders.Add(
"Ocp-Apim-Subscription-Key", subscriptionKey);
// Request parameters. A third optional parameter is "details".
// The Analyze Image method returns information about the following
// visual features:
// Categories: categorizes image content according to a
// taxonomy defined in documentation.
// Description: describes the image content with a complete
// sentence in supported languages.
// Color: determines the accent color, dominant color,
// and whether an image is black & white.
string requestParameters =
"visualFeatures=Categories,Description,Color";
// Assemble the URI for the REST API method.
string uri = uriBase + "?" + requestParameters;
HttpResponseMessage response;
// Read the contents of the specified local image
// into a byte array.
byte[] byteData = GetImageAsByteArray(imageFilePath);
// Add the byte array as an octet stream to the request body.
using (ByteArrayContent content = new ByteArrayContent(byteData))
{
// This example uses the "application/octet-stream" content type.
// The other content types you can use are "application/json"
// and "multipart/form-data".
content.Headers.ContentType =
new MediaTypeHeaderValue("application/octet-stream");
// Asynchronously call the REST API method.
response = await client.PostAsync(uri, content);
}
// Asynchronously get the JSON response.
string contentString = await response.Content.ReadAsStringAsync();
// Display the JSON response.
Console.WriteLine("\nResponse:\n\n{0}\n",
JToken.Parse(contentString).ToString());
}
catch (Exception e)
{
Console.WriteLine("\n" + e.Message);
}
}
/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
// Open a read-only file stream for the specified file.
using (FileStream fileStream =
new FileStream(imageFilePath, FileMode.Open, FileAccess.Read))
{
// Read the file's contents into a byte array.
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}
}
}
}
Examine the response
A successful response is returned in JSON. The sample application parses and displays a successful response in the
console window, similar to the following example:
{
"categories": [
{
"name": "abstract_",
"score": 0.00390625
},
{
"name": "others_",
"score": 0.0234375
},
{
"name": "outdoor_",
"score": 0.00390625
}
],
"description": {
"tags": [
"road",
"building",
"outdoor",
"street",
"night",
"black",
"city",
"white",
"light",
"sitting",
"riding",
"man",
"side",
"empty",
"rain",
"corner",
"traffic",
"lit",
"hydrant",
"stop",
"board",
"parked",
"bus",
"tall"
],
"captions": [
{
"text": "a close up of an empty city street at night",
"confidence": 0.7965622853462756
}
]
},
"requestId": "dddf1ac9-7e66-4c47-bdef-222f3fe5aa23",
"metadata": {
"width": 3733,
"height": 1986,
"format": "Jpeg"
},
"color": {
"dominantColorForeground": "Black",
"dominantColorBackground": "Black",
"dominantColors": [
"Black",
"Grey"
],
"accentColor": "666666",
"isBWImg": true
}
}
Clean up resources
Next steps
When no longer needed, delete the Visual Studio solution. To do so, open File Explorer, navigate to the folder in
which you created the Visual Studio solution, and delete the folder.
Explore a basic Windows application that uses Computer Vision to perform optical character recognition (OCR);
create smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an
image.
Computer Vision API C# Tutorial
Quickstart: Analyze a local image using the REST API and Python in Computer Vision
Prerequisites
Create and run the sample
In this quickstart, you analyze a locally stored image to extract visual features by using Computer Vision's REST
API. With the Analyze Image method, you can extract visual features based on image content.
You can run this quickstart in a step-by-step fashion using a Jupyter notebook on MyBinder. To launch Binder, select the following button:
launch binder
If you don't have an Azure subscription, create a free account before you begin.
You must have Python installed if you want to run the sample locally.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
You must have the following Python packages installed. You can use pip to install Python packages.
requests
matplotlib
pillow
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
3. Save the code as a file with a .py extension. For example, analyze-local-image.py.
4. Open a command prompt window.
5. At the prompt, use the python command to run the sample. For example, python analyze-local-image.py .
a. Replace the value of subscription_key with your subscription key.
b. Replace the value of vision_base_url with the endpoint URL for the Computer Vision resource in the
Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of image_path with the path and file name of a different image that you
want to analyze.
import requests
# If you are using a Jupyter notebook, uncomment the following line.
#%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image
from io import BytesIO
# Replace <Subscription Key> with your valid subscription key.
subscription_key = "<Subscription Key>"
assert subscription_key
# You must use the same region in your REST call as you used to get your
# subscription keys. For example, if you got your subscription keys from
# westus, replace "westcentralus" in the URI below with "westus".
#
# Free trial subscription keys are generated in the "westus" region.
# If you use a free trial subscription key, you shouldn't need to change
# this region.
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
analyze_url = vision_base_url + "analyze"
# Set image_path to the local path of an image that you want to analyze.
image_path = "C:/Documents/ImageToAnalyze.jpg"
# Read the image into a byte array
image_data = open(image_path, "rb").read()
headers = {'Ocp-Apim-Subscription-Key': subscription_key,
'Content-Type': 'application/octet-stream'}
params = {'visualFeatures': 'Categories,Description,Color'}
response = requests.post(
analyze_url, headers=headers, params=params, data=image_data)
response.raise_for_status()
# The 'analysis' object contains various fields that describe the image. The most
# relevant caption for the image is obtained from the 'description' property.
analysis = response.json()
print(analysis)
image_caption = analysis["description"]["captions"][0]["text"].capitalize()
# Display the image and overlay it with the caption.
image = Image.open(BytesIO(image_data))
plt.imshow(image)
plt.axis("off")
_ = plt.title(image_caption, size="x-large", y=-0.1)
plt.show()
Examine the response

A successful response is returned in JSON. The sample application parses and displays a successful response in the command prompt window, similar to the following example:
{
"categories": [
{
"name": "outdoor_",
"score": 0.00390625,
"detail": {
"landmarks": []
}
},
{
"name": "outdoor_street",
"score": 0.33984375,
"detail": {
"landmarks": []
}
}
],
"description": {
"tags": [
"building",
"outdoor",
"street",
"city",
"people",
"busy",
"table",
"walking",
"traffic",
"filled",
"large",
"many",
"group",
"night",
"light",
"crowded",
"bunch",
"standing",
"man",
"sign",
"crowd",
"umbrella",
"riding",
"tall",
"woman",
"bus"
],
"captions": [
{
"text": "a group of people on a city street at night",
"confidence": 0.9122243847383961
}
]
},
"color": {
"dominantColorForeground": "Brown",
"dominantColorBackground": "Brown",
"dominantColors": [
"Brown"
],
"accentColor": "B54316",
"isBwImg": false
},
"requestId": "c11894eb-de3e-451b-9257-7c8b168073d1",
"metadata": {
"height": 600,
"width": 450,
"format": "Jpeg"
}
}
Clean up resources
When no longer needed, delete the file.
Next steps
Explore a Python application that uses Computer Vision to perform optical character recognition (OCR); create smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API Python Tutorial
Quickstart: Generate a thumbnail using the REST API
and C# in Computer Vision
4/18/2019 5 minutes to read Edit Online
Prerequisites
Create and run the sample application
using Newtonsoft.Json.Linq;
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
namespace CSHttpClientSample
{
static class Program
{
// Replace <Subscription Key> with your valid subscription key.
const string subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
In this quickstart, you generate a thumbnail from an image by using Computer Vision's REST API. With the Get
Thumbnail method, you can generate a thumbnail of an image. You specify the height and width, which can differ
from the aspect ratio of the input image. Computer Vision uses smart cropping to intelligently identify the area of
interest and generate cropping coordinates based on that region.
If you don't have an Azure subscription, create a free account before you begin.
You must have Visual Studio 2015 or later.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create the sample in Visual Studio, do the following steps:
1. Create a new Visual Studio solution in Visual Studio, using the Visual C# Console App template.
2. Install the Newtonsoft.Json NuGet package.
a. On the menu, click Tools, select NuGet Package Manager, then Manage NuGet Packages for Solution.
b. Click the Browse tab, and in the Search box type "Newtonsoft.Json".
c. Select Newtonsoft.Json when it displays, then click the checkbox next to your project name, and Install.
3. Replace the code in Program.cs with the following code, and then make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Get Thumbnail method from the Azure region where you obtained your subscription keys, if necessary.
4. Run the program.
5. At the prompt, enter the path to a local image.
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
const string uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail";
static void Main()
{
// Get the path and filename to process from the user.
Console.WriteLine("Thumbnail:");
Console.Write(
"Enter the path to the image you wish to use to create a thumbnail image: ");
string imageFilePath = Console.ReadLine();
if (File.Exists(imageFilePath))
{
// Call the REST API method.
Console.WriteLine("\nWait a moment for the results to appear.\n");
MakeThumbNailRequest(imageFilePath).Wait();
}
else
{
Console.WriteLine("\nInvalid file path");
}
Console.WriteLine("\nPress Enter to exit...");
Console.ReadLine();
}
/// <summary>
/// Gets a thumbnail image from the specified image file by using
/// the Computer Vision REST API.
/// </summary>
/// <param name="imageFilePath">The image file to use to create the thumbnail image.</param>
static async Task MakeThumbNailRequest(string imageFilePath)
{
try
{
HttpClient client = new HttpClient();
// Request headers.
client.DefaultRequestHeaders.Add(
"Ocp-Apim-Subscription-Key", subscriptionKey);
// Request parameters.
// The width and height parameters specify a thumbnail that's
// 200 pixels wide and 150 pixels high.
// The smartCropping parameter is set to true, to enable smart cropping.
string requestParameters = "width=200&height=150&smartCropping=true";
// Assemble the URI for the REST API method.
string uri = uriBase + "?" + requestParameters;
HttpResponseMessage response;
// Read the contents of the specified local image
// into a byte array.
byte[] byteData = GetImageAsByteArray(imageFilePath);
// Add the byte array as an octet stream to the request body.
using (ByteArrayContent content = new ByteArrayContent(byteData))
{
// This example uses the "application/octet-stream" content type.
// The other content types you can use are "application/json"
// and "multipart/form-data".
content.Headers.ContentType =
new MediaTypeHeaderValue("application/octet-stream");
// Asynchronously call the REST API method.
response = await client.PostAsync(uri, content);
}
// Check the HTTP status code of the response. If successful, display
// the response and save the thumbnail.
if (response.IsSuccessStatusCode)
{
// Display the response data.
Console.WriteLine("\nResponse:\n{0}", response);
// Get the image data for the thumbnail from the response.
byte[] thumbnailImageData =
await response.Content.ReadAsByteArrayAsync();
// Save the thumbnail to the same folder as the original image,
// using the original name with the suffix "_thumb".
// Note: This will overwrite an existing file of the same name.
string thumbnailFilePath =
imageFilePath.Insert(imageFilePath.Length - 4, "_thumb");
File.WriteAllBytes(thumbnailFilePath, thumbnailImageData);
Console.WriteLine("\nThumbnail written to: {0}", thumbnailFilePath);
}
else
{
// Display the JSON error data.
string errorString = await response.Content.ReadAsStringAsync();
Console.WriteLine("\n\nResponse:\n{0}\n",
JToken.Parse(errorString).ToString());
}
}
catch (Exception e)
{
Console.WriteLine("\n" + e.Message);
}
}
/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
// Open a read-only file stream for the specified file.
using (FileStream fileStream =
new FileStream(imageFilePath, FileMode.Open, FileAccess.Read))
{
// Read the file's contents into a byte array.
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}
}
}
}
Examine the response
A successful response is returned as binary data, which represents the image data for the thumbnail. If the request
succeeds, the thumbnail is saved to the same folder as the local image, using the original name with the suffix
"_thumb". If the request fails, the response contains an error code and a message to help determine what went
wrong.
The sample application displays a successful response in the console window, similar to the following example:
Response:
StatusCode: 200, ReasonPhrase: 'OK', Version: 1.1, Content: System.Net.Http.StreamContent, Headers:
{
Pragma: no-cache
apim-request-id: 131eb5b4-5807-466d-9656-4c1ef0a64c9b
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
Cache-Control: no-cache
Date: Tue, 06 Jun 2017 20:54:07 GMT
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Content-Length: 5800
Content-Type: image/jpeg
Expires: -1
}
Clean up resources
When no longer needed, delete the Visual Studio solution. To do so, open File Explorer, navigate to the folder in which you created the Visual Studio solution, and delete the folder.
Next steps
Explore a basic Windows application that uses Computer Vision to perform optical character recognition (OCR); create smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To rapidly experiment with the Computer Vision APIs, try the Open API testing console.
Computer Vision API C# Tutorial
Quickstart: Generate a thumbnail using the REST API
and cURL in Computer Vision
4/18/2019 2 minutes to read Edit Online
Prerequisites
Get Thumbnail request
NOTE
Create and run the sample command
In this quickstart, you generate a thumbnail from an image using Computer Vision's REST API. You specify the
desired height and width, which can differ in aspect ratio from the input image. Computer Vision uses smart
cropping to intelligently identify the area of interest and generate cropping coordinates around that region.
If you don't have an Azure subscription, create a free account before you begin.
You must have cURL.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
With the Get Thumbnail method, you can generate a thumbnail of an image.
To run the sample, do the following steps:
1. Copy the following code into an editor.
2. Replace <Subscription Key> with your valid subscription key.
3. Replace <File> with the path and filename to save the thumbnail.
4. Change the Request URL (https://westcentralus.api.cognitive.microsoft.com/vision/v2.0 ) to use the location
where you obtained your subscription keys, if necessary.
5. Optionally, change the image ({\"url\":\"... ) to analyze.
6. Open a command window on a computer with cURL installed.
7. Paste the code in the window and run the command.
You must use the same location in your REST call as you used to obtain your subscription keys. For example, if you obtained
your subscription keys from westus, replace "westcentralus" in the URL below with "westus".
To create and run the sample, do the following steps:
1. Copy the following command into a text editor.
2. Make the following changes in the command where needed:
a. Replace the value of <subscriptionKey> with your subscription key.
b. Replace the value of <thumbnailFile> with the path and name of the file in which to save the thumbnail.
c. Replace the request URL ( https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail ) with the endpoint URL for the Get Thumbnail method from the Azure region where you obtained your subscription keys, if necessary.
d. Optionally, change the image URL in the request body ( https://upload.wikimedia.org/wikipedia/commons/thumb/5/56/Shorkie_Poo_Puppy.jpg/1280px-Shorkie_Poo_Puppy.jpg ) to the URL of a different image from which to generate a thumbnail.
3. Open a command prompt window.
4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -H "Ocp-Apim-Subscription-Key: <subscriptionKey>" -o <thumbnailFile> -H "Content-Type: application/json" "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail?width=100&height=100&smartCropping=true" -d "{\"url\":\"https://upload.wikimedia.org/wikipedia/commons/thumb/5/56/Shorkie_Poo_Puppy.jpg/1280px-Shorkie_Poo_Puppy.jpg\"}"
Examine the response
A successful response writes the thumbnail image to the file specified in <thumbnailFile> . If the request fails, the
response contains an error code and a message to help determine what went wrong. If the request seems to
succeed but the created thumbnail is not a valid image file, it might be that your subscription key is not valid.
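If you want to confirm that the saved file really is an image, you can try to open it with an image library such as Pillow. The following Python lines are a minimal sketch under that assumption; thumbnail.jpg is a placeholder for whatever path you passed as <thumbnailFile>.
from PIL import Image

# Minimal check: a failed request typically writes a JSON error body instead of
# JPEG data, so opening the file as an image will fail in that case.
# "thumbnail.jpg" is a placeholder for the path passed as <thumbnailFile>.
try:
    with Image.open("thumbnail.jpg") as thumbnail:
        print("Valid thumbnail: format={}, size={}".format(thumbnail.format, thumbnail.size))
except OSError:
    print("Not a valid image file; inspect the file contents and check your subscription key.")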
Next steps
Explore the Computer Vision API to learn how to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Generate a thumbnail using the REST API
and Go in Computer Vision
4/18/2019 2 minutes to read Edit Online
Prerequisites
Create and run the sample
package main
import (
"encoding/json"
"fmt"
"io/ioutil"
"net/http"
"strings"
"time"
)
func main() {
// Replace <Subscription Key> with your valid subscription key.
const subscriptionKey = "<Subscription Key>"
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
In this quickstart, you generate a thumbnail from an image using Computer Vision's REST API. You specify the
height and width, which can differ in aspect ratio from the input image. Computer Vision uses smart cropping to
intelligently identify the area of interest and generate cropping coordinates based on that region.
If you don't have an Azure subscription, create a free account before you begin.
You must have Go installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Get Thumbnail method from the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageUrl with the URL of a different image from which you want to generate a thumbnail.
3. Save the code as a file with a .go extension. For example, get-thumbnail.go .
4. Open a command prompt window.
5. At the prompt, run the go build command to compile the package from the file. For example, go build get-thumbnail.go .
6. At the prompt, run the compiled package. For example, get-thumbnail .
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
const uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail"
const imageUrl =
"https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg"
const params = "?width=100&height=100&smartCropping=true"
const uri = uriBase + params
const imageUrlEnc = "{\"url\":\"" + imageUrl + "\"}"
reader := strings.NewReader(imageUrlEnc)
// Create the HTTP client
client := &http.Client{
Timeout: time.Second * 2,
}
// Create the POST request, passing the image URL in the request body
req, err := http.NewRequest("POST", uri, reader)
if err != nil {
panic(err)
}
// Add headers
req.Header.Add("Content-Type", "application/json")
req.Header.Add("Ocp-Apim-Subscription-Key", subscriptionKey)
// Send the request and retrieve the response
resp, err := client.Do(req)
if err != nil {
panic(err)
}
defer resp.Body.Close()
// Read the response body.
// Note, data is a byte array
data, err := ioutil.ReadAll(resp.Body)
if err != nil {
panic(err)
}
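// Note: when the call succeeds, 'data' holds the binary JPEG bytes of the
// thumbnail rather than JSON, so the JSON formatting below prints "null";
// it is mainly useful for inspecting an error response. To keep the thumbnail,
// you could instead write 'data' to a file (for example with ioutil.WriteFile).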
// Parse the JSON data
var f interface{}
json.Unmarshal(data, &f)
// Format and display the JSON result
jsonFormatted, _ := json.MarshalIndent(f, "", " ")
fmt.Println(string(jsonFormatted))
}
Examine the response
A successful response contains the thumbnail image binary data. If the request fails, the response contains an error code and a message to help determine what went wrong.
Next steps
Explore the Computer Vision API to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.
Explore the Computer Vision API
Quickstart: Generate a thumbnail using the REST API
and Java in Computer Vision
4/18/2019 4 minutes to read Edit Online
Prerequisites
Create and run the sample application
In this quickstart, you generate a thumbnail from an image by using Computer Vision's REST API. You specify the
height and width, which can differ from the aspect ratio of the input image. Computer Vision uses smart cropping
to intelligently identify the area of interest and generate cropping coordinates based on that region.
If you don't have an Azure subscription, create a free account before you begin.
You must have Java™ Platform, Standard Edition Development Kit 7 or 8 (JDK 7 or 8) installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
import java.awt.*;
import javax.swing.*;
import java.net.URI;
import java.io.InputStream;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.util.EntityUtils;
import org.json.JSONObject;
1. Create a new Java project in your favorite IDE or editor. If the option is available, create the Java project from
a command line application template.
2. Import the following libraries into your Java project. If you're using Maven, the Maven coordinates are
provided for each library.
Apache HTTP client (org.apache.httpcomponents:httpclient:4.5.5)
Apache HTTP core (org.apache.httpcomponents:httpcore:4.4.9)
JSON library (org.json:json:20180130)
3. Add the following import statements to the file that contains the Main public class for your project.
4. Replace the Main public class with the following code, then make the following changes in code where
needed:
a. Replace the value of subscriptionKey with your subscription key.
// This sample uses the following libraries:
// - Apache HTTP client (org.apache.httpcomponents:httpclient:4.5.5)
// - Apache HTTP core (org.apache.httpcomponents:httpcore:4.4.9)
// - JSON library (org.json:json:20180130).
public class Main {
// **********************************************
// *** Update or verify the following values. ***
// **********************************************
// Replace <Subscription Key> with your valid subscription key.
private static final String subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
private static final String uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail";
private static final String imageToAnalyze =
"https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg";
public static void main(String[] args) {
CloseableHttpClient httpClient = HttpClientBuilder.create().build();
try {
URIBuilder uriBuilder = new URIBuilder(uriBase);
// Request parameters.
uriBuilder.setParameter("width", "100");
uriBuilder.setParameter("height", "150");
uriBuilder.setParameter("smartCropping", "true");
// Prepare the URI for the REST API method.
URI uri = uriBuilder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKey);
// Request body.
StringEntity requestEntity =
new StringEntity("{\"url\":\"" + imageToAnalyze + "\"}");
request.setEntity(requestEntity);
// Call the REST API method and get the response entity.
HttpResponse response = httpClient.execute(request);
HttpEntity entity = response.getEntity();
// Check for success.
b. Replace the value of uriBase with the endpoint URL for the Get Thumbnail method from the Azure
region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageToAnalyze with the URL of a different image for which you want to
generate a thumbnail.
5. Save, then build the Java project.
6. If you're using an IDE, run Main . Otherwise, open a command prompt window and then use the java
command to run the compiled class. For example, java Main .
// Check for success.
if (response.getStatusLine().getStatusCode() == 200) {
// Display the thumbnail.
System.out.println("\nDisplaying thumbnail.\n");
displayImage(entity.getContent());
} else {
// Format and display the JSON error message.
String jsonString = EntityUtils.toString(entity);
JSONObject json = new JSONObject(jsonString);
System.out.println("Error:\n");
System.out.println(json.toString(2));
}
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
// Displays the given input stream as an image.
private static void displayImage(InputStream inputStream) {
try {
BufferedImage bufferedImage = ImageIO.read(inputStream);
ImageIcon imageIcon = new ImageIcon(bufferedImage);
JLabel jLabel = new JLabel();
jLabel.setIcon(imageIcon);
JFrame jFrame = new JFrame();
jFrame.setLayout(new FlowLayout());
jFrame.setSize(100, 150);
jFrame.add(jLabel);
jFrame.setVisible(true);
jFrame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
}
Examine the response
A successful response is returned as binary data, which represents the image data for the thumbnail. If the request succeeds, the thumbnail is generated from the binary data in the response and displayed in a separate window created by the sample application. If the request fails, the response is displayed in the console window. The response for the failed request contains an error code and a message to help determine what went wrong.
Next steps
Explore a Java Swing application that uses Computer Vision to perform optical character recognition (OCR); create smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API Java Tutorial
Quickstart: Generate a thumbnail using the REST API
and JavaScript in Computer Vision
4/18/2019 3 minutes to read Edit Online
Prerequisites
Create and run the sample
<!DOCTYPE html>
<html>
<head>
<title>Thumbnail Sample</title>
</head>
<body>
<script type="text/javascript">
function processImage() {
// **********************************************
// *** Update or verify the following values. ***
// **********************************************
// Replace <Subscription Key> with your valid subscription key.
var subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
In this quickstart, you generate a thumbnail from an image by using Computer Vision's REST API. You specify the
height and width, which can differ in aspect ratio from the input image. Computer Vision uses smart cropping to
intelligently identify the area of interest and generate cropping coordinates based on that region.
If you don't have an Azure subscription, create a free account before you begin.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or,
follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Get Thumbnail method from the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of the value attribute for the inputImage control with the URL of a different image that you want to analyze.
3. Save the code as a file with an .html extension. For example, get-thumbnail.html .
4. Open a browser window.
5. In the browser, drag and drop the file into the browser window.
6. When the webpage is displayed in the browser, choose the Generate thumbnail button.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
var uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail";
// Request parameters.
var params = "?width=100&height=150&smartCropping=true";
// Display the source image.
var sourceImageUrl = document.getElementById("inputImage").value;
document.querySelector("#sourceImage").src = sourceImageUrl;
// Prepare the REST API call:
// Create the HTTP Request object.
var xhr = new XMLHttpRequest();
// Identify the request as a POST, with the URL and parameters.
xhr.open("POST", uriBase + params);
// Add the request headers.
xhr.setRequestHeader("Content-Type","application/json");
xhr.setRequestHeader("Ocp-Apim-Subscription-Key", subscriptionKey);
// Set the response type to "blob" for the thumbnail image data.
xhr.responseType = "blob";
// Process the result of the REST API call.
xhr.onreadystatechange = function(e) {
if(xhr.readyState === XMLHttpRequest.DONE) {
// Thumbnail successfully created.
if (xhr.status === 200) {
// Show response headers.
var s = JSON.stringify(xhr.getAllResponseHeaders(), null, 2);
document.getElementById("responseTextArea").value = s;
// Show thumbnail image.
var urlCreator = window.URL || window.webkitURL;
var imageUrl = urlCreator.createObjectURL(this.response);
document.querySelector("#thumbnailImage").src = imageUrl;
} else {
// Display the error message. The error message is the response
// body as a JSON string. The code in this code block extracts
// the JSON string from the blob response.
var reader = new FileReader();
// This event fires after the blob has been read.
reader.addEventListener('loadend', (e) => {
document.getElementById("responseTextArea").value =
JSON.stringify(JSON.parse(e.srcElement.result), null, 2);
});
// Start reading the blob as text.
reader.readAsText(xhr.response);
}
}
}
// Make the REST API call.
xhr.send('{"url": ' + '"' + sourceImageUrl + '"}');
};
</script>
<h1>Generate thumbnail image:</h1>
Enter the URL to an image to use in creating a thumbnail image,
then click the <strong>Generate thumbnail</strong> button.
<br><br>
Image for thumbnail:
<input type="text" name="inputImage" id="inputImage"
value="https://upload.wikimedia.org/wikipedia/commons/thumb/5/56/Shorkie_Poo_Puppy.jpg/1280px-
Shorkie_Poo_Puppy.jpg" />
<button onclick="processImage()">Generate thumbnail</button>
<br><br>
<div id="wrapper" style="width:1160px; display:table;">
<div id="jsonOutput" style="width:600px; display:table-cell;">
Response:
<br><br>
<textarea id="responseTextArea" class="UIInput"
style="width:580px; height:400px;"></textarea>
</div>
<div id="imageDiv" style="width:420px; display:table-cell;">
Source image:
<br><br>
<img id="sourceImage" width="400" />
</div>
<div id="thumbnailDiv" style="width:140px; display:table-cell;">
Thumbnail:
<br><br>
<img id="thumbnailImage" />
</div>
</div>
</body>
</html>
Examine the response
A successful response is returned as binary data, which represents the image data for the thumbnail. If the request succeeds, the thumbnail is generated from the binary data in the response and displayed in the browser window. If the request fails, the response is displayed in the console window. The response for the failed request contains an error code and a message to help determine what went wrong.
Next steps
Explore a JavaScript application that uses Computer Vision to perform optical character recognition (OCR); create smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API JavaScript Tutorial
Quickstart: Generate a thumbnail using the REST API
and Node.js in Computer Vision
4/18/2019 2 minutes to read Edit Online
Prerequisites
Create and run the sample
In this quickstart, you generate a thumbnail from an image by using Computer Vision's REST API. With the Get
Thumbnail method, you can generate a thumbnail of an image. You specify the height and width, which can differ
from the aspect ratio of the input image. Computer Vision uses smart cropping to intelligently identify the area of
interest and generate cropping coordinates based on that region.
If you don't have an Azure subscription, create a free account before you begin.
You must have Node.js 4.x or later installed.
You must have npm installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Install the npm request package.
a. Open a command prompt window as an administrator.
b. Run the following command:
npm install request
c. After the package is successfully installed, close the command prompt window.
2. Copy the following code into a text editor.
3. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Get Thumbnail method from the Azure
region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageUrl with the URL of a different image that you want to analyze.
4. Save the code as a file with a .js extension. For example, get-thumbnail.js .
5. Open a command prompt window.
6. At the prompt, use the node command to run the file. For example, node get-thumbnail.js .
'use strict';
const request = require('request');
// Replace <Subscription Key> with your valid subscription key.
const subscriptionKey = '<Subscription Key>';
// You must use the same location in your REST call as you used to get your
// subscription keys. For example, if you got your subscription keys from
// westus, replace "westcentralus" in the URL below with "westus".
const uriBase =
'https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail';
const imageUrl =
'https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg';
// Request parameters.
const params = {
'width': '100',
'height': '100',
'smartCropping': 'true'
};
const options = {
uri: uriBase,
qs: params,
body: '{"url": ' + '"' + imageUrl + '"}',
headers: {
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key' : subscriptionKey
}
};
request.post(options, (error, response, body) => {
if (error) {
console.log('Error: ', error);
return;
}
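    // On success, the response body holds the thumbnail image data. To save it,
    // you would typically add "encoding: null" to the options so the request
    // package returns a Buffer, and then write 'body' to a file with
    // fs.writeFileSync (not shown here).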
});
Examine the response
A successful response is returned as binary data, which represents the image data for the thumbnail. If the request fails, the response is displayed in the console window. The response for the failed request contains an error code and a message to help determine what went wrong.
Next steps
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.
Explore the Computer Vision API
Quickstart: Generate a thumbnail using the REST API
and PHP in Computer Vision
4/18/2019 2 minutes to read Edit Online
Prerequisites
Create and run the sample
In this quickstart, you generate a thumbnail from an image by using Computer Vision's REST API. With the Get
Thumbnail method, you can generate a thumbnail of an image. You specify the height and width, which can differ
from the aspect ratio of the input image. Computer Vision uses smart cropping to intelligently identify the area of
interest and generate cropping coordinates based on that region.
If you don't have an Azure subscription, create a free account before you begin.
You must have PHP installed.
You must have Pear installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Install the PHP5 HTTP_Request2 package.
a. Open a command prompt window as an administrator.
b. Run the following command:
pear install HTTP_Request2
c. After the package is successfully installed, close the command prompt window.
2. Copy the following code into a text editor.
3. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Get Thumbnail method from the Azure
region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageUrl with the URL of a different image for which you want to
generate a thumbnail.
4. Save the code as a file with a .php extension. For example, get-thumbnail.php .
5. Open a browser window with PHP support.
6. Drag and drop the file into the browser window.
<html>
<head>
<title>Get Thumbnail Sample</title>
</head>
<body>
<?php
// Replace <Subscription Key> with a valid subscription key.
$ocpApimSubscriptionKey = '<Subscription Key>';
// You must use the same location in your REST call as you used to obtain
// your subscription keys. For example, if you obtained your subscription keys
// from westus, replace "westcentralus" in the URL below with "westus".
$uriBase = 'https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/';
$imageUrl =
'https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg';
require_once 'HTTP/Request2.php';
$request = new Http_Request2($uriBase . 'generateThumbnail');
$url = $request->getUrl();
$headers = array(
// Request headers
'Content-Type' => 'application/json',
'Ocp-Apim-Subscription-Key' => $ocpApimSubscriptionKey
);
$request->setHeader($headers);
$parameters = array(
// Request parameters
'width' => '100', // Width of the thumbnail.
'height' => '100', // Height of the thumbnail.
'smartCropping' => 'true',
);
$url->setQueryVariables($parameters);
$request->setMethod(HTTP_Request2::METHOD_POST);
// Request body parameters
$body = json_encode(array('url' => $imageUrl));
// Request body
$request->setBody($body);
try
{
$response = $request->send();
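    // Note: when the call succeeds, the response body contains binary JPEG data,
    // so json_decode() returns null and the echo below prints "null"; the JSON
    // pretty-printing is mainly useful for inspecting an error response. To keep
    // the thumbnail, you could instead write $response->getBody() to a file.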
echo "<pre>" .
json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "</pre>";
}
catch (HttpException $ex)
{
echo "<pre>" . $ex . "</pre>";
}
?>
</body>
</html>
Examine the response
A successful response is returned as binary data, which represents the image data for the thumbnail. If the request
fails, the response is displayed in the browser window. The response for the failed request contains an error code
and a message to help determine what went wrong.
Clean up resources
When no longer needed, delete the file, and then uninstall the PHP5 HTTP_Request2 package. To uninstall the package, do the following steps:
1. Open a command prompt window as an administrator.
2. Run the following command:
pear uninstall HTTP_Request2
3. After the package is successfully uninstalled, close the command prompt window.
Next steps
Explore the Computer Vision API to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.
Explore the Computer Vision API
Quickstart: Generate a thumbnail using the REST API
and Python in Computer Vision
4/18/2019 2 minutes to read Edit Online
Prerequisites
Create and run the sample
In this quickstart, you will generate a thumbnail from an image using Computer Vision's REST API. With the Get
Thumbnail method, you can specify the desired height and width, and Computer Vision uses smart cropping to
intelligently identify the area of interest and generate cropping coordinates based on that region.
If you don't have an Azure subscription, create a free account before you begin.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
A code editor such as Visual Studio Code
To create and run the sample, copy the following code into the code editor.
import requests
# If you are using a Jupyter notebook, uncomment the following line.
#%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image
from io import BytesIO
# Replace <Subscription Key> with your valid subscription key.
subscription_key = "<Subscription Key>"
assert subscription_key
# You must use the same region in your REST call as you used to get your
# subscription keys. For example, if you got your subscription keys from
# westus, replace "westcentralus" in the URI below with "westus".
#
# Free trial subscription keys are generated in the "westus" region.
# If you use a free trial subscription key, you shouldn't need to change
# this region.
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
thumbnail_url = vision_base_url + "generateThumbnail"
# Set image_url to the URL of an image that you want to analyze.
image_url = "https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg"
headers = {'Ocp-Apim-Subscription-Key': subscription_key}
params = {'width': '50', 'height': '50', 'smartCropping': 'true'}
data = {'url': image_url}
response = requests.post(thumbnail_url, headers=headers, params=params, json=data)
response.raise_for_status()
thumbnail = Image.open(BytesIO(response.content))
# Display the thumbnail.
plt.imshow(thumbnail)
plt.axis("off")
# Verify the thumbnail size.
print("Thumbnail is {0}-by-{1}".format(*thumbnail.size))
Next, do the following:
1. Replace the value of subscription_key with your subscription key.
2. Replace the value of vision_base_url with the endpoint URL for the Computer Vision resource in the Azure region where you obtained your subscription keys, if necessary.
3. Optionally, replace the value of image_url with the URL of a different image for which you want to generate a thumbnail.
4. Save the code as a file with a .py extension. For example, get-thumbnail.py .
5. Open a command prompt window.
6. At the prompt, use the python command to run the sample. For example, python get-thumbnail.py .
Examine the response
A successful response is returned as binary data which represents the image data for the thumbnail. The sample should display this image. If the request fails, the response is displayed in the command prompt window and should contain an error code.
Run in Jupyter (optional)
You can optionally run this quickstart in a step-by-step fashion using a Jupyter notebook on MyBinder. To launch Binder, select the following button:
launch binder
Next steps
Next, learn more detailed information about the thumbnail generation feature.
Generating thumbnails
Quickstart: Generate a thumbnail using the REST API
and Ruby in Computer Vision
4/18/2019 2 minutes to read Edit Online
Prerequisites
Create and run the sample
In this quickstart, you generate a thumbnail from an image by using Computer Vision's REST API. With the Get
Thumbnail method, you can generate a thumbnail of an image. You specify the height and width, which can differ
from the aspect ratio of the input image. Computer Vision uses smart cropping to intelligently identify the area of
interest and generate cropping coordinates based on that region.
If you don't have an Azure subscription, create a free account before you begin.
You must have Ruby 2.4.x or later installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
a. Replace <Subscription Key> with your subscription key.
b. Replace https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail with the endpoint URL for the Get Thumbnail method in the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace https://upload.wikimedia.org/wikipedia/commons/thumb/5/56/Shorkie_Poo_Puppy.jpg/1280px-Shorkie_Poo_Puppy.jpg with the URL of a different image for which you want to generate a thumbnail.
3. Save the code as a file with an .rb extension. For example, get-thumbnail.rb .
4. Open a command prompt window.
5. At the prompt, use the ruby command to run the sample. For example, ruby get-thumbnail.rb .
require 'net/http'
# You must use the same location in your REST call as you used to get your
# subscription keys. For example, if you got your subscription keys from westus,
# replace "westcentralus" in the URL below with "westus".
uri = URI('https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail')
uri.query = URI.encode_www_form({
# Request parameters
'width' => '100',
'height' => '100',
'smartCropping' => 'true'
})
request = Net::HTTP::Post.new(uri.request_uri)
# Request headers
# Replace <Subscription Key> with your valid subscription key.
request['Ocp-Apim-Subscription-Key'] = '<Subscription Key>'
request['Content-Type'] = 'application/json'
request.body =
"{\"url\": \"https://upload.wikimedia.org/wikipedia/commons/thumb/5/56/" +
"Shorkie_Poo_Puppy.jpg/1280px-Shorkie_Poo_Puppy.jpg\"}"
response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
http.request(request)
end
#puts response.body
Examine the response
A successful response is returned as binary data, which represents the image data for the thumbnail. If the request fails, the response is displayed in the console window. The response for the failed request contains an error code and a message to help determine what went wrong.
Next steps
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail, and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing console.
Explore the Computer Vision API
Quickstart: Extract printed text (OCR) using the REST
API and C# in Computer Vision
4/18/2019 4 minutes to read Edit Online
Prerequisites
Create and run the sample application
using Newtonsoft.Json.Linq;
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
namespace CSHttpClientSample
{
static class Program
{
// Replace <Subscription Key> with your valid subscription key.
const string subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
In this quickstart, you will extract printed text with optical character recognition (OCR) from an image by using
Computer Vision's REST API. With the OCR feature, you can detect printed text in an image and extract recognized
characters into a machine-usable character stream.
If you don't have an Azure subscription, create a free account before you begin.
You must have Visual Studio 2015 or later.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create the sample in Visual Studio, do the following steps:
1. Create a new Visual Studio solution in Visual Studio, using the Visual C# Console App template.
2. Install the Newtonsoft.Json NuGet package.
a. On the menu, click Tools, select NuGet Package Manager, then Manage NuGet Packages for Solution.
b. Click the Browse tab, and in the Search box type "Newtonsoft.Json".
c. Select Newtonsoft.Json when it displays, then click the checkbox next to your project name, and Install.
3. Replace the code in Program.cs with the following code, and then make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the OCR method from the Azure region where you obtained your subscription keys, if necessary.
4. Run the program.
5. At the prompt, enter the path to a local image.
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
const string uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr";
static void Main()
{
// Get the path and filename to process from the user.
Console.WriteLine("Optical Character Recognition:");
Console.Write("Enter the path to an image with text you wish to read: ");
string imageFilePath = Console.ReadLine();
if (File.Exists(imageFilePath))
{
// Call the REST API method.
Console.WriteLine("\nWait a moment for the results to appear.\n");
MakeOCRRequest(imageFilePath).Wait();
}
else
{
Console.WriteLine("\nInvalid file path");
}
Console.WriteLine("\nPress Enter to exit...");
Console.ReadLine();
}
/// <summary>
/// Gets the text visible in the specified image file by using
/// the Computer Vision REST API.
/// </summary>
/// <param name="imageFilePath">The image file with printed text.</param>
static async Task MakeOCRRequest(string imageFilePath)
{
try
{
HttpClient client = new HttpClient();
// Request headers.
client.DefaultRequestHeaders.Add(
"Ocp-Apim-Subscription-Key", subscriptionKey);
// Request parameters.
// The language parameter doesn't specify a language, so the
// method detects it automatically.
// The detectOrientation parameter is set to true, so the method detects
// and corrects text orientation before detecting text.
string requestParameters = "language=unk&detectOrientation=true";
// Assemble the URI for the REST API method.
string uri = uriBase + "?" + requestParameters;
HttpResponseMessage response;
// Read the contents of the specified local image
// into a byte array.
byte[] byteData = GetImageAsByteArray(imageFilePath);
// Add the byte array as an octet stream to the request body.
using (ByteArrayContent content = new ByteArrayContent(byteData))
{
// This example uses the "application/octet-stream" content type.
// The other content types you can use are "application/json"
// and "multipart/form-data".
content.Headers.ContentType =
new MediaTypeHeaderValue("application/octet-stream");
// Asynchronously call the REST API method.
response = await client.PostAsync(uri, content);
}
// Asynchronously get the JSON response.
string contentString = await response.Content.ReadAsStringAsync();
// Display the JSON response.
Console.WriteLine("\nResponse:\n\n{0}\n",
JToken.Parse(contentString).ToString());
}
catch (Exception e)
{
Console.WriteLine("\n" + e.Message);
}
}
/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
// Open a read-only file stream for the specified file.
using (FileStream fileStream =
new FileStream(imageFilePath, FileMode.Open, FileAccess.Read))
{
// Read the file's contents into a byte array.
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}
}
}
}
Examine the response
A successful response is returned in JSON. The sample application parses and displays a successful response in the console window, similar to the following example:
{
"language": "en",
"textAngle": -1.5000000000000335,
"orientation": "Up",
"regions": [
{
"boundingBox": "154,49,351,575",
"lines": [
{
"boundingBox": "165,49,340,117",
"words": [
{
"boundingBox": "165,49,63,109",
"text": "A"
},
{
"boundingBox": "261,50,244,116",
"text": "GOAL"
}
]
},
{
"boundingBox": "165,169,339,93",
"words": [
{
"boundingBox": "165,169,339,93",
"text": "WITHOUT"
}
]
},
{
"boundingBox": "159,264,342,117",
"words": [
{
"boundingBox": "159,264,64,110",
"text": "A"
},
{
"boundingBox": "255,266,246,115",
"text": "PLAN"
}
]
},
{
"boundingBox": "161,384,338,119",
"words": [
{
"boundingBox": "161,384,86,113",
"text": "IS"
},
{
"boundingBox": "274,387,225,116",
"text": "JUST"
}
]
},
{
"boundingBox": "154,506,341,118",
"words": [
{
"boundingBox": "154,506,62,111",
"text": "A"
},
{
"boundingBox": "248,508,247,116",
"text": "WISH"
}
]
}
]
}
]
}
Clean up resources
When no longer needed, delete the Visual Studio solution. To do so, open File Explorer, navigate to the folder in which you created the Visual Studio solution, and delete the folder.
Next steps
Explore a basic Windows application that uses Computer Vision to perform optical character recognition (OCR); create smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image.
Computer Vision API C# Tutorial
Quickstart: Extract printed text (OCR) using the REST
API and cURL in Computer Vision
4/18/2019 2 minutes to read Edit Online
In this quickstart, you extract printed text with optical character recognition (OCR) from an image by using Computer Vision's REST API. With the OCR method, you can detect printed text in an image and extract recognized characters into a machine-usable character stream.
Prerequisites
If you don't have an Azure subscription, create a free account before you begin.
You must have cURL.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
Create and run the sample command
To create and run the sample, do the following steps:
1. Copy the following command into a text editor.
2. Make the following changes in the command where needed:
a. Replace the value of <subscriptionKey> with your subscription key.
b. Replace the request URL ( https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr ) with the endpoint URL for the OCR method from the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, change the image URL in the request body ( https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png ) to the URL of a different image to be analyzed.
3. Open a command prompt window.
4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -H "Ocp-Apim-Subscription-Key: <subscriptionKey>" -H "Content-Type: application/json" "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr?language=unk&detectOrientation=true" -d "{\"url\":\"https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png\"}"
Examine the response
A successful response is returned in JSON. The sample application parses and displays a successful response in the command prompt window, similar to the following example:
{
"language": "en",
"orientation": "Up",
"textAngle": 0,
"regions": [
{
"boundingBox": "21,16,304,451",
"lines": [
{
"boundingBox": "28,16,288,41",
"words": [
{
"boundingBox": "28,16,288,41",
"text": "NOTHING"
}
]
},
{
"boundingBox": "27,66,283,52",
"words": [
{
"boundingBox": "27,66,283,52",
"text": "EXISTS"
}
]
},
{
"boundingBox": "27,128,292,49",
"words": [
{
"boundingBox": "27,128,292,49",
"text": "EXCEPT"
}
]
},
{
"boundingBox": "24,188,292,54",
"words": [
{
"boundingBox": "24,188,292,54",
"text": "ATOMS"
}
]
},
{
"boundingBox": "22,253,297,32",
"words": [
{
"boundingBox": "22,253,105,32",
"text": "AND"
},
{
"boundingBox": "144,253,175,32",
"text": "EMPTY"
}
]
},
{
"boundingBox": "21,298,304,60",
"words": [
{
"boundingBox": "21,298,304,60",
"text": "SPACE."
}
]
},
{
"boundingBox": "26,387,294,37",
"words": [
{
"boundingBox": "26,387,210,37",
"text": "Everything"
},
{
"boundingBox": "249,389,71,27",
"boundingBox": "249,389,71,27",
"text": "else"
}
]
},
{
"boundingBox": "127,431,198,36",
"words": [
{
"boundingBox": "127,431,31,29",
"text": "is"
},
{
"boundingBox": "172,431,153,36",
"text": "opinion."
}
]
}
]
}
]
}
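The OCR result is a hierarchy of regions, lines, and words, each with its own boundingBox. If you want the recognized text as plain strings, you can walk that hierarchy. The following Python lines are a minimal sketch; they assume the JSON shown above has already been parsed into a dictionary named result, for example with json.loads on the saved cURL output.
# Minimal sketch: flatten the OCR response (regions -> lines -> words) into text lines.
# 'result' is assumed to be the parsed JSON response shown above.
for region in result.get("regions", []):
    for line in region.get("lines", []):
        print(" ".join(word["text"] for word in line.get("words", [])))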
Next steps
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Extract printed text (OCR) using the REST
API and Go in Computer Vision
4/18/2019 3 minutes to read Edit Online
Prerequisites
Create and run the sample
package main
import (
"encoding/json"
"fmt"
"io/ioutil"
"net/http"
"strings"
"time"
)
func main() {
// Replace <Subscription Key> with your valid subscription key.
const subscriptionKey = "<Subscription Key>"
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
In this quickstart, you extract printed text with optical character recognition (OCR) from an image by using
Computer Vision's REST API. With the OCR method, you can detect printed text in an image and extract
recognized characters into a machine-usable character stream.
If you don't have an Azure subscription, create a free account before you begin.
You must have Go installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the OCR method from the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageUrl with the URL of a different image that you want to analyze.
3. Save the code as a file with a .go extension. For example, get-printed-text.go .
4. Open a command prompt window.
5. At the prompt, run the go build command to compile the package from the file. For example, go build get-printed-text.go .
6. At the prompt, run the compiled package. For example, get-printed-text .
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
const uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr"
const imageUrl = "https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/" +
"Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png"
const params = "?language=unk&detectOrientation=true"
const uri = uriBase + params
const imageUrlEnc = "{\"url\":\"" + imageUrl + "\"}"
reader := strings.NewReader(imageUrlEnc)
// Create the Http client
client := &http.Client{
Timeout: time.Second * 2,
}
// Create the Post request, passing the image URL in the request body
req, err := http.NewRequest("POST", uri, reader)
if err != nil {
panic(err)
}
// Add headers
req.Header.Add("Content-Type", "application/json")
req.Header.Add("Ocp-Apim-Subscription-Key", subscriptionKey)
// Send the request and retrieve the response
resp, err := client.Do(req)
if err != nil {
panic(err)
}
defer resp.Body.Close()
// Read the response body.
// Note, data is a byte array
data, err := ioutil.ReadAll(resp.Body)
if err != nil {
panic(err)
}
// Parse the Json data
var f interface{}
json.Unmarshal(data, &f)
// Format and display the Json result
jsonFormatted, _ := json.MarshalIndent(f, "", " ")
fmt.Println(string(jsonFormatted))
}
Examine the response
A successful response is returned in JSON. The sample application parses and displays a successful response in the command prompt window, similar to the following example:
{
"language": "en",
"orientation": "Up",
"regions": [
{
"boundingBox": "21,16,304,451",
"lines": [
{
"boundingBox": "28,16,288,41",
"words": [
{
"boundingBox": "28,16,288,41",
"text": "NOTHING"
}
]
},
{
"boundingBox": "27,66,283,52",
"words": [
{
"boundingBox": "27,66,283,52",
"text": "EXISTS"
}
]
},
{
"boundingBox": "27,128,292,49",
"words": [
{
"boundingBox": "27,128,292,49",
"text": "EXCEPT"
}
]
},
{
"boundingBox": "24,188,292,54",
"words": [
{
"boundingBox": "24,188,292,54",
"text": "ATOMS"
}
]
},
{
"boundingBox": "22,253,297,32",
"words": [
{
"boundingBox": "22,253,105,32",
"text": "AND"
},
{
"boundingBox": "144,253,175,32",
"text": "EMPTY"
}
]
},
{
"boundingBox": "21,298,304,60",
"words": [
{
"boundingBox": "21,298,304,60",
"text": "SPACE."
}
]
},
{
"boundingBox": "26,387,294,37",
"words": [
{
"boundingBox": "26,387,210,37",
"text": "Everything"
},
{
"boundingBox": "249,389,71,27",
"text": "else"
}
]
},
{
"boundingBox": "127,431,198,36",
"words": [
{
"boundingBox": "127,431,31,29",
"text": "is"
},
{
"boundingBox": "172,431,153,36",
"text": "opinion."
}
]
}
]
}
],
"textAngle": 0
}
Next steps
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Extract printed text (OCR) using the REST
API and Java in Computer Vision
4/18/2019 3 minutes to read Edit Online
Prerequisites
Create and run the sample application
In this quickstart, you extract printed text with optical character recognition (OCR) from an image by using
Computer Vision's REST API. With the OCR method, you can detect printed text in an image and extract
recognized characters into a machine-usable character stream.
If you don't have an Azure subscription, create a free account before you begin.
You must have Java™ Platform, Standard Edition Development Kit 7 or 8 (JDK 7 or 8) installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Create a new Java project in your favorite IDE or editor. If the option is available, create the Java project from
a command line application template.
2. Import the following libraries into your Java project. If you're using Maven, the Maven coordinates are
provided for each library.
Apache HTTP client (org.apache.httpcomponents:httpclient:4.5.5)
Apache HTTP core (org.apache.httpcomponents:httpcore:4.4.9)
JSON library (org.json:json:20180130)
3. Add the following import statements to the file that contains the Main public class for your project.
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.util.EntityUtils;
import org.json.JSONObject;
4. Replace the Main public class with the following code, then make the following changes in code where
needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the OCR method from the Azure region where
you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageToAnalyze with the URL of a different image from which you want
to extract printed text.
5. Save, then build the Java project.
6. If you're using an IDE, run Main . Otherwise, open a command prompt window and then use the java
command to run the compiled class. For example, java Main .
public class Main {
// **********************************************
// *** Update or verify the following values. ***
// **********************************************
// Replace <Subscription Key> with your valid subscription key.
private static final String subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
private static final String uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr";
private static final String imageToAnalyze =
"https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/" +
"Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png";
public static void main(String[] args) {
CloseableHttpClient httpClient = HttpClientBuilder.create().build();
try {
URIBuilder uriBuilder = new URIBuilder(uriBase);
uriBuilder.setParameter("language", "unk");
uriBuilder.setParameter("detectOrientation", "true");
// Request parameters.
URI uri = uriBuilder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKey);
// Request body.
StringEntity requestEntity =
new StringEntity("{\"url\":\"" + imageToAnalyze + "\"}");
request.setEntity(requestEntity);
// Call the REST API method and get the response entity.
HttpResponse response = httpClient.execute(request);
HttpEntity entity = response.getEntity();
if (entity != null) {
// Format and display the JSON response.
String jsonString = EntityUtils.toString(entity);
JSONObject json = new JSONObject(jsonString);
System.out.println("REST Response:\n");
System.out.println(json.toString(2));
}
} catch (Exception e) {
// Display error message.
System.out.println(e.getMessage());
}
}
}
Examine the response
A successful response is returned in JSON. The sample application parses and displays a successful response in the console window, similar to the following example:
REST Response:
{
"orientation": "Up",
"regions": [{
"boundingBox": "21,16,304,451",
"lines": [
{
"boundingBox": "28,16,288,41",
"words": [{
"boundingBox": "28,16,288,41",
"text": "NOTHING"
}]
},
{
"boundingBox": "27,66,283,52",
"words": [{
"boundingBox": "27,66,283,52",
"text": "EXISTS"
}]
},
{
"boundingBox": "27,128,292,49",
"words": [{
"boundingBox": "27,128,292,49",
"text": "EXCEPT"
}]
},
{
"boundingBox": "24,188,292,54",
"words": [{
"boundingBox": "24,188,292,54",
"text": "ATOMS"
}]
},
{
"boundingBox": "22,253,297,32",
"words": [
{
"boundingBox": "22,253,105,32",
"text": "AND"
},
{
"boundingBox": "144,253,175,32",
"text": "EMPTY"
}
]
},
{
"boundingBox": "21,298,304,60",
"words": [{
"boundingBox": "21,298,304,60",
"text": "SPACE."
}]
},
{
"boundingBox": "26,387,294,37",
"words": [
{
"boundingBox": "26,387,210,37",
"text": "Everything"
},
{
"boundingBox": "249,389,71,27",
"text": "else"
}
]
},
{
"boundingBox": "127,431,198,36",
"words": [
{
"boundingBox": "127,431,31,29",
"text": "is"
},
{
"boundingBox": "172,431,153,36",
"text": "opinion."
}
]
}
]
}],
"textAngle": 0,
"language": "en"
}
Clean up resources
Next steps
When no longer needed, delete the Java project, including the compiled class and imported libraries.
Explore a Java Swing application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API Java Tutorial
Quickstart: Extract printed text (OCR) using the REST
API and JavaScript in Computer Vision
4/18/2019 3 minutes to read Edit Online
In this quickstart, you extract printed text with optical character recognition (OCR) from an image by using Computer Vision's REST API. With the OCR method, you can detect printed text in an image and extract recognized characters into a machine-usable character stream.
Prerequisites
If you don't have an Azure subscription, create a free account before you begin.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
Create and run the sample
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the OCR method from the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of the value attribute for the inputImage control with the URL of a different image that you want to analyze.
3. Save the code as a file with an .html extension. For example, get-printed-text.html .
4. Open a browser window.
5. In the browser, drag and drop the file into the browser window.
6. When the webpage is displayed in the browser, choose the Read image button.
<!DOCTYPE html>
<html>
<head>
<title>OCR Sample</title>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>
<script type="text/javascript">
function processImage() {
// **********************************************
// *** Update or verify the following values. ***
// **********************************************
// Replace <Subscription Key> with your valid subscription key.
var subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
var uriBase =
"https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr";
// Request parameters.
var params = {
"language": "unk",
"detectOrientation": "true",
};
// Display the image.
var sourceImageUrl = document.getElementById("inputImage").value;
document.querySelector("#sourceImage").src = sourceImageUrl;
// Perform the REST API call.
$.ajax({
url: uriBase + "?" + $.param(params),
// Request headers.
beforeSend: function(jqXHR){
jqXHR.setRequestHeader("Content-Type","application/json");
jqXHR.setRequestHeader("Ocp-Apim-Subscription-Key", subscriptionKey);
},
type: "POST",
// Request body.
data: '{"url": ' + '"' + sourceImageUrl + '"}',
})
.done(function(data) {
// Show formatted JSON on webpage.
$("#responseTextArea").val(JSON.stringify(data, null, 2));
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Display error message.
var errorString = (errorThrown === "") ?
"Error. " : errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" :
(jQuery.parseJSON(jqXHR.responseText).message) ?
jQuery.parseJSON(jqXHR.responseText).message :
jQuery.parseJSON(jqXHR.responseText).error.message;
alert(errorString);
});
};
</script>
<h1>Optical Character Recognition (OCR):</h1>
Enter the URL to an image of printed text, then
click the <strong>Read image</strong> button.
<br><br>
Image to read:
<input type="text" name="inputImage" id="inputImage"
    value="https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png" />
<button onclick="processImage()">Read image</button>
<br><br>
<div id="wrapper" style="width:1020px; display:table;">
<div id="jsonOutput" style="width:600px; display:table-cell;">
Response:
<br><br>
<textarea id="responseTextArea" class="UIInput"
style="width:580px; height:400px;"></textarea>
</div>
<div id="imageDiv" style="width:420px; display:table-cell;">
Source image:
<br><br>
<img id="sourceImage" width="400" />
</div>
</div>
</body>
</html>
Examine the response
A successful response is returned in JSON. The sample webpage parses and displays a successful response in the browser window, similar to the following example:
{
"language": "en",
"orientation": "Up",
"textAngle": 0,
"regions": [
{
"boundingBox": "21,16,304,451",
"lines": [
{
"boundingBox": "28,16,288,41",
"words": [
{
"boundingBox": "28,16,288,41",
"text": "NOTHING"
}
]
},
{
"boundingBox": "27,66,283,52",
"words": [
{
"boundingBox": "27,66,283,52",
"text": "EXISTS"
}
]
},
{
"boundingBox": "27,128,292,49",
"words": [
{
"boundingBox": "27,128,292,49",
"text": "EXCEPT"
}
]
},
{
"boundingBox": "24,188,292,54",
"words": [
{
"boundingBox": "24,188,292,54",
"text": "ATOMS"
}
]
},
{
"boundingBox": "22,253,297,32",
"words": [
{
"boundingBox": "22,253,105,32",
"text": "AND"
},
{
"boundingBox": "144,253,175,32",
"text": "EMPTY"
}
]
},
{
"boundingBox": "21,298,304,60",
"words": [
{
"boundingBox": "21,298,304,60",
"text": "SPACE."
}
]
},
{
"boundingBox": "26,387,294,37",
"words": [
{
"boundingBox": "26,387,210,37",
"text": "Everything"
},
{
"boundingBox": "249,389,71,27",
"text": "else"
}
]
},
{
"boundingBox": "127,431,198,36",
"words": [
{
"boundingBox": "127,431,31,29",
"text": "is"
},
{
"boundingBox": "172,431,153,36",
"text": "opinion."
}
]
}
]
}
]
}
Next steps
Explore a JavaScript application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API JavaScript Tutorial
Quickstart: Extract printed text (OCR) using the REST
API and Node.js in Computer Vision
4/18/2019 3 minutes to read Edit Online
Prerequisites
Create and run the sample
In this quickstart, you extract printed text with optical character recognition (OCR) from an image by using
Computer Vision's REST API. With the OCR method, you can detect printed text in an image and extract
recognized characters into a machine-usable character stream.
If you don't have an Azure subscription, create a free account before you begin.
You must have Node.js 4.x or later installed.
You must have npm installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Install the npm request package.
a. Open a command prompt window as an administrator.
b. Run the following command:
npm install request
c. After the package is successfully installed, close the command prompt window.
2. Copy the following code into a text editor.
3. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the OCR method from the Azure region where
you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageUrl with the URL of a different image from which you want to
extract printed text.
4. Save the code as a file with a .js extension. For example, get-printed-text.js .
5. Open a command prompt window.
6. At the prompt, use the node command to run the file. For example, node get-printed-text.js .
'use strict';
const request = require('request');
// Replace <Subscription Key> with your valid subscription key.
const subscriptionKey = '<Subscription Key>';
// You must use the same location in your REST call as you used to get your
// subscription keys. For example, if you got your subscription keys from
// westus, replace "westcentralus" in the URL below with "westus".
const uriBase =
'https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr';
const imageUrl = 'https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/' +
'Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png';
// Request parameters.
const params = {
'language': 'unk',
'detectOrientation': 'true',
};
const options = {
uri: uriBase,
qs: params,
body: '{"url": ' + '"' + imageUrl + '"}',
headers: {
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key' : subscriptionKey
}
};
request.post(options, (error, response, body) => {
if (error) {
console.log('Error: ', error);
return;
}
let jsonResponse = JSON.stringify(JSON.parse(body), null, ' ');
console.log('JSON Response\n');
console.log(jsonResponse);
});
Examine the response
A successful response is returned in JSON. The sample parses and displays a successful response in the command prompt window, similar to the following example:
{
"language": "en",
"orientation": "Up",
"textAngle": 0,
"regions": [
{
"boundingBox": "21,16,304,451",
"lines": [
{
"boundingBox": "28,16,288,41",
"words": [
{
"boundingBox": "28,16,288,41",
"text": "NOTHING"
}
]
},
{
"boundingBox": "27,66,283,52",
"words": [
{
"boundingBox": "27,66,283,52",
"text": "EXISTS"
}
]
},
{
"boundingBox": "27,128,292,49",
"words": [
{
"boundingBox": "27,128,292,49",
"text": "EXCEPT"
}
]
},
{
"boundingBox": "24,188,292,54",
"words": [
{
"boundingBox": "24,188,292,54",
"text": "ATOMS"
}
]
},
{
"boundingBox": "22,253,297,32",
"words": [
{
"boundingBox": "22,253,105,32",
"text": "AND"
},
{
"boundingBox": "144,253,175,32",
"text": "EMPTY"
}
]
},
{
"boundingBox": "21,298,304,60",
"words": [
{
"boundingBox": "21,298,304,60",
"text": "SPACE."
}
]
},
{
"boundingBox": "26,387,294,37",
"words": [
{
"boundingBox": "26,387,210,37",
"text": "Everything"
},
{
"boundingBox": "249,389,71,27",
"text": "else"
}
]
},
{
"boundingBox": "127,431,198,36",
"words": [
{
"boundingBox": "127,431,31,29",
"text": "is"
},
{
"boundingBox": "172,431,153,36",
"text": "opinion."
}
]
}
]
}
]
}
Clean up resources
Next steps
When no longer needed, delete the file, and then uninstall the npm request package. To uninstall the package, do
the following steps:
1. Open a command prompt window as an administrator.
2. Run the following command:
npm uninstall request
3. After the package is successfully uninstalled, close the command prompt window.
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Extract printed text (OCR) using the REST
API and PHP in Computer Vision
4/18/2019 3 minutes to read Edit Online
Prerequisites
Create and run the sample
In this quickstart, you extract printed text with optical character recognition (OCR) from an image by using
Computer Vision's REST API. With the OCR method, you can detect printed text in an image and extract
recognized characters into a machine-usable character stream.
If you don't have an Azure subscription, create a free account before you begin.
You must have PHP installed.
You must have Pear installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Install the PHP5 HTTP_Request2 package.
a. Open a command prompt window as an administrator.
b. Run the following command:
pear install HTTP_Request2
c. After the package is successfully installed, close the command prompt window.
2. Copy the following code into a text editor.
3. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the OCR method from the Azure region where
you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageUrl with the URL of a different image from which you want to
extract printed text.
4. Save the code as a file with a .php extension. For example, get-printed-text.php .
5. Open a browser window with PHP support.
6. Drag and drop the file into the browser window.
<html>
<head>
<title>OCR Sample</title>
</head>
<body>
<?php
// Replace <Subscription Key> with a valid subscription key.
$ocpApimSubscriptionKey = '<Subscription Key>';
// You must use the same location in your REST call as you used to obtain
// your subscription keys. For example, if you obtained your subscription keys
// from westus, replace "westcentralus" in the URL below with "westus".
$uriBase = 'https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/';
$imageUrl = 'https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/' .
'Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png';
require_once 'HTTP/Request2.php';
$request = new Http_Request2($uriBase . 'ocr');
$url = $request->getUrl();
$headers = array(
// Request headers
'Content-Type' => 'application/json',
'Ocp-Apim-Subscription-Key' => $ocpApimSubscriptionKey
);
$request->setHeader($headers);
$parameters = array(
// Request parameters
'language' => 'unk',
'detectOrientation' => 'true'
);
$url->setQueryVariables($parameters);
$request->setMethod(HTTP_Request2::METHOD_POST);
// Request body parameters
$body = json_encode(array('url' => $imageUrl));
// Request body
$request->setBody($body);
try
{
$response = $request->send();
echo "<pre>" .
json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "</pre>";
}
catch (HttpException $ex)
{
echo "<pre>" . $ex . "</pre>";
}
?>
</body>
</html>
Examine the response
A successful response is returned in JSON. The sample website parses and displays a successful response in the browser window, similar to the following example:
{
"language": "en",
"orientation": "Up",
"textAngle": 0,
"regions": [
{
"boundingBox": "21,16,304,451",
"lines": [
{
"boundingBox": "28,16,288,41",
"words": [
{
"boundingBox": "28,16,288,41",
"text": "NOTHING"
}
]
},
{
"boundingBox": "27,66,283,52",
"words": [
{
"boundingBox": "27,66,283,52",
"text": "EXISTS"
}
]
},
{
"boundingBox": "27,128,292,49",
"words": [
{
"boundingBox": "27,128,292,49",
"text": "EXCEPT"
}
]
},
{
"boundingBox": "24,188,292,54",
"words": [
{
"boundingBox": "24,188,292,54",
"text": "ATOMS"
}
]
},
{
"boundingBox": "22,253,297,32",
"words": [
{
"boundingBox": "22,253,105,32",
"text": "AND"
},
{
"boundingBox": "144,253,175,32",
"text": "EMPTY"
}
]
},
{
"boundingBox": "21,298,304,60",
"words": [
{
"boundingBox": "21,298,304,60",
"text": "SPACE."
}
]
},
{
"boundingBox": "26,387,294,37",
"words": [
{
"boundingBox": "26,387,210,37",
"text": "Everything"
},
{
"boundingBox": "249,389,71,27",
"text": "else"
}
]
},
{
"boundingBox": "127,431,198,36",
"words": [
{
"boundingBox": "127,431,31,29",
"text": "is"
},
{
"boundingBox": "172,431,153,36",
"text": "opinion."
}
]
}
]
}
]
}
Clean up resources
Next steps
When no longer needed, delete the file, and then uninstall the PHP5 HTTP_Request2 package. To uninstall the
package, do the following steps:
1. Open a command prompt window as an administrator.
2. Run the following command:
pear uninstall HTTP_Request2
3. After the package is successfully uninstalled, close the command prompt window.
Explore the Computer Vision API to analyze an image, detect celebrities and landmarks, create a thumbnail, and
extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing
console.
Explore the Computer Vision API
Quickstart: Extract printed text (OCR) using the REST
API and Python in Computer Vision
4/18/2019 3 minutes to read Edit Online
Prerequisites
Create and run the sample
In this quickstart, you extract printed text with optical character recognition (OCR) from an image by using
Computer Vision's REST API. With the OCR method, you can detect printed text in an image and extract
recognized characters into a machine-usable character stream.
You can run this quickstart in a step-by-step fashion using a Jupyter notebook on MyBinder. To launch Binder, select the launch binder button on the published page.
If you don't have an Azure subscription, create a free account before you begin.
You must have Python installed if you want to run the sample locally.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
a. Replace the value of subscription_key with your subscription key.
b. Replace the value of vision_base_url with the endpoint URL for the Computer Vision resource in the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of image_url with the URL of a different image from which you want to extract printed text.
3. Save the code as a file with a .py extension. For example, get-printed-text.py .
4. Open a command prompt window.
5. At the prompt, use the python command to run the sample. For example, python get-printed-text.py .
import requests
# If you are using a Jupyter notebook, uncomment the following line.
#%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from PIL import Image
from io import BytesIO
# Replace <Subscription Key> with your valid subscription key.
subscription_key = "<Subscription Key>"
assert subscription_key
# You must use the same region in your REST call as you used to get your
# subscription keys. For example, if you got your subscription keys from
# westus, replace "westcentralus" in the URI below with "westus".
#
# Free trial subscription keys are generated in the "westus" region.
# If you use a free trial subscription key, you shouldn't need to change
# this region.
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
ocr_url = vision_base_url + "ocr"
# Set image_url to the URL of an image that you want to analyze.
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/" + \
"Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png"
headers = {'Ocp-Apim-Subscription-Key': subscription_key}
params = {'language': 'unk', 'detectOrientation': 'true'}
data = {'url': image_url}
response = requests.post(ocr_url, headers=headers, params=params, json=data)
response.raise_for_status()
analysis = response.json()
# Extract the word bounding boxes and text.
line_infos = [region["lines"] for region in analysis["regions"]]
word_infos = []
for line in line_infos:
for word_metadata in line:
for word_info in word_metadata["words"]:
word_infos.append(word_info)
word_infos
# Display the image and overlay it with the extracted text.
plt.figure(figsize=(5, 5))
image = Image.open(BytesIO(requests.get(image_url).content))
ax = plt.imshow(image, alpha=0.5)
for word in word_infos:
bbox = [int(num) for num in word["boundingBox"].split(",")]
text = word["text"]
origin = (bbox[0], bbox[1])
patch = Rectangle(origin, bbox[2], bbox[3], fill=False, linewidth=2, color='y')
ax.axes.add_patch(patch)
plt.text(origin[0], origin[1], text, fontsize=20, weight="bold", va="top")
plt.axis("off")
Examine the response
A successful response is returned in JSON. The sample application parses and displays a successful response in the command prompt window, similar to the following example:
{
"language": "en",
"orientation": "Up",
"textAngle": 0,
"regions": [
{
"boundingBox": "21,16,304,451",
"lines": [
{
"boundingBox": "28,16,288,41",
"words": [
{
"boundingBox": "28,16,288,41",
"text": "NOTHING"
}
]
},
{
"boundingBox": "27,66,283,52",
"words": [
{
"boundingBox": "27,66,283,52",
"text": "EXISTS"
}
]
},
{
"boundingBox": "27,128,292,49",
"words": [
{
"boundingBox": "27,128,292,49",
"text": "EXCEPT"
}
]
},
{
"boundingBox": "24,188,292,54",
"words": [
{
"boundingBox": "24,188,292,54",
"text": "ATOMS"
}
]
},
{
"boundingBox": "22,253,297,32",
"words": [
{
"boundingBox": "22,253,105,32",
"text": "AND"
},
{
"boundingBox": "144,253,175,32",
"text": "EMPTY"
}
]
},
{
"boundingBox": "21,298,304,60",
"words": [
{
"boundingBox": "21,298,304,60",
"text": "SPACE."
}
]
},
{
"boundingBox": "26,387,294,37",
"words": [
{
"boundingBox": "26,387,210,37",
"text": "Everything"
},
{
"boundingBox": "249,389,71,27",
"text": "else"
}
]
},
{
"boundingBox": "127,431,198,36",
"words": [
{
"boundingBox": "127,431,31,29",
"text": "is"
},
{
"boundingBox": "172,431,153,36",
"text": "opinion."
}
]
}
]
}
]
}
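The sample overlays each recognized word on the source image but doesn't reassemble the text itself. If you also want the recognized text as plain strings, the nested regions/lines/words structure can be flattened. The following is a minimal sketch (not part of the quickstart) that reuses the analysis dictionary produced by response.json() in the sample above.
# Minimal sketch: rebuild each recognized line of text from the parsed OCR response.
lines_of_text = []
for region in analysis["regions"]:
    for line in region["lines"]:
        lines_of_text.append(" ".join(word["text"] for word in line["words"]))
print("\n".join(lines_of_text))
# Prints: NOTHING / EXISTS / EXCEPT / ATOMS / AND EMPTY / SPACE. / Everything else / is opinion.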
Next steps
Explore a Python application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API Python Tutorial
Quickstart: Extract printed text (OCR) using the REST
API and Ruby in Computer Vision
4/18/2019 2 minutes to read Edit Online
Prerequisites
Create and run the sample
In this quickstart, you extract printed text with optical character recognition (OCR) from an image by using
Computer Vision's REST API. With the OCR method, you can detect printed text in an image and extract
recognized characters into a machine-usable character stream.
If you don't have an Azure subscription, create a free account before you begin.
You must have Ruby 2.4.x or later installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
a. Replace <Subscription Key> with your subscription key.
b. Replace https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr with the endpoint URL for the OCR method in the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png with the URL of a different image from which you want to extract printed text.
3. Save the code as a file with an .rb extension. For example, get-printed-text.rb .
4. Open a command prompt window.
5. At the prompt, use the ruby command to run the sample. For example, ruby get-printed-text.rb .
require 'net/http'
# You must use the same location in your REST call as you used to get your
# subscription keys. For example, if you got your subscription keys from westus,
# replace "westcentralus" in the URL below with "westus".
uri = URI('https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr')
uri.query = URI.encode_www_form({
# Request parameters
'language' => 'unk',
'detectOrientation' => 'true'
})
request = Net::HTTP::Post.new(uri.request_uri)
# Request headers
# Replace <Subscription Key> with your valid subscription key.
request['Ocp-Apim-Subscription-Key'] = '<Subscription Key>'
request['Content-Type'] = 'application/json'
request.body =
"{\"url\": \"https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/" +
"Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png\"}"
response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
http.request(request)
end
puts response.body
Examine the response
A successful response is returned in JSON. The sample parses and displays a successful response in the command
prompt window, similar to the following example:
{
"language": "en",
"textAngle": -2.0000000000000338,
"orientation": "Up",
"regions": [
{
"boundingBox": "462,379,497,258",
"lines": [
{
"boundingBox": "462,379,497,74",
"words": [
{
"boundingBox": "462,379,41,73",
"text": "A"
},
{
"boundingBox": "523,379,153,73",
"text": "GOAL"
},
{
"boundingBox": "694,379,265,74",
"text": "WITHOUT"
}
]
},
{
"boundingBox": "565,471,289,74",
"words": [
{
"boundingBox": "565,471,41,73",
"text": "A"
},
{
"boundingBox": "626,471,150,73",
"text": "PLAN"
},
{
"boundingBox": "801,472,53,73",
"text": "IS"
}
]
},
{
"boundingBox": "519,563,375,74",
"words": [
{
"boundingBox": "519,563,149,74",
"text": "JUST"
},
{
"boundingBox": "683,564,41,72",
"text": "A"
},
{
"boundingBox": "741,564,153,73",
"text": "WISH"
}
]
}
]
}
]
}
Next steps
Explore the Computer Vision API to analyze an image, detect celebrities and landmarks, create a thumbnail, and
extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API testing
console.
Explore the Computer Vision API
Quickstart: Extract handwritten text using the REST
API and C# in Computer Vision
5/29/2019 6 minutes to read Edit Online
In this quickstart, you will extract handwritten text from an image by using Computer Vision's REST API. With the Batch Read API and the Read Operation Result API, you can detect handwritten text in an image and extract recognized characters into a machine-readable character stream.
IMPORTANT
Unlike the OCR method, the Batch Read method runs asynchronously. This method does not return any information in the body of a successful response. Instead, the Read method returns a URI in the Operation-Location response header field. You can then call this URI, which represents the Read Operation Result method, in order to check the status and return the results of the Batch Read method call.
Prerequisites
If you don't have an Azure subscription, create a free account before you begin.
You must have Visual Studio 2015 or later.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
Create and run the sample application
To create the sample in Visual Studio, do the following steps:
1. Create a new Visual Studio solution in Visual Studio, using the Visual C# Console App template.
2. Install the Newtonsoft.Json NuGet package.
a. On the menu, click Tools, select NuGet Package Manager, then Manage NuGet Packages for Solution.
b. Click the Browse tab, and in the Search box type "Newtonsoft.Json".
c. Select Newtonsoft.Json when it displays, then click the checkbox next to your project name, and Install.
3. Replace the code in Program.cs with the following code, and then make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Batch Read method from the Azure region where you obtained your subscription keys, if necessary.
4. Run the program.
5. At the prompt, enter the path to a local image.
using Newtonsoft.Json.Linq;
using System;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
namespace CSHttpClientSample
{
static class Program
{
// Replace <Subscription Key> with your valid subscription key.
const string subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
const string uriBase =
"https://westus.api.cognitive.microsoft.com/vision/v2.0/read/core/asyncBatchAnalyze";
static void Main()
{
// Get the path and filename to process from the user.
Console.WriteLine("Handwriting Recognition:");
Console.Write(
"Enter the path to an image with handwritten text you wish to read: ");
string imageFilePath = Console.ReadLine();
if (File.Exists(imageFilePath))
{
// Call the REST API method.
Console.WriteLine("\nWait a moment for the results to appear.\n");
ReadHandwrittenText(imageFilePath).Wait();
}
else
{
Console.WriteLine("\nInvalid file path");
}
Console.WriteLine("\nPress Enter to exit...");
Console.ReadLine();
}
/// <summary>
/// Gets the handwritten text from the specified image file by using
/// the Computer Vision REST API.
/// </summary>
/// <param name="imageFilePath">The image file with handwritten text.</param>
static async Task ReadHandwrittenText(string imageFilePath)
{
try
{
HttpClient client = new HttpClient();
// Request headers.
client.DefaultRequestHeaders.Add(
"Ocp-Apim-Subscription-Key", subscriptionKey);
// Assemble the URI for the REST API method.
string uri = uriBase;
HttpResponseMessage response;
// Two REST API methods are required to extract handwritten text.
// One method to submit the image for processing, the other method
// to retrieve the text found in the image.
// operationLocation stores the URI of the second REST API method,
// returned by the first REST API method.
string operationLocation;
// Reads the contents of the specified local image
// into a byte array.
byte[] byteData = GetImageAsByteArray(imageFilePath);
// Adds the byte array as an octet stream to the request body.
using (ByteArrayContent content = new ByteArrayContent(byteData))
{
// This example uses the "application/octet-stream" content type.
// The other content types you can use are "application/json"
// and "multipart/form-data".
content.Headers.ContentType =
new MediaTypeHeaderValue("application/octet-stream");
// The first REST API method, Batch Read, starts
// the async process to analyze the written text in the image.
response = await client.PostAsync(uri, content);
}
// The response header for the Batch Read method contains the URI
// of the second method, Read Operation Result, which
// returns the results of the process in the response body.
// The Batch Read operation does not return anything in the response body.
if (response.IsSuccessStatusCode)
operationLocation =
response.Headers.GetValues("Operation-Location").FirstOrDefault();
else
{
// Display the JSON error data.
string errorString = await response.Content.ReadAsStringAsync();
Console.WriteLine("\n\nResponse:\n{0}\n",
JToken.Parse(errorString).ToString());
return;
}
// If the first REST API method completes successfully, the second
// REST API method retrieves the text written in the image.
//
// Note: The response may not be immediately available. Handwriting
// recognition is an asynchronous operation that can take a variable
// amount of time depending on the length of the handwritten text.
// You may need to wait or retry this operation.
//
// This example checks once per second for ten seconds.
string contentString;
int i = 0;
do
{
System.Threading.Thread.Sleep(1000);
response = await client.GetAsync(operationLocation);
contentString = await response.Content.ReadAsStringAsync();
++i;
}
while (i < 10 && contentString.IndexOf("\"status\":\"Succeeded\"") == -1);
if (i == 10 && contentString.IndexOf("\"status\":\"Succeeded\"") == -1)
{
Console.WriteLine("\nTimeout error.\n");
return;
}
// Display the JSON response.
Console.WriteLine("\nResponse:\n\n{0}\n",
JToken.Parse(contentString).ToString());
}
catch (Exception e)
{
Console.WriteLine("\n" + e.Message);
}
}
/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
// Open a read-only file stream for the specified file.
using (FileStream fileStream =
new FileStream(imageFilePath, FileMode.Open, FileAccess.Read))
{
// Read the file's contents into a byte array.
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}
}
}
}
Examine the response
A successful response is returned in JSON. The sample application parses and displays a successful response in the console window, similar to the following example:
{
"status": "Succeeded",
"recognitionResults": [
{
"page": 1,
"clockwiseOrientation": 349.59,
"width": 3200,
"height": 3200,
"unit": "pixel",
"lines": [
{
"boundingBox": [202,618,2047,643,2046,840,200,813],
"text": "Our greatest glory is not",
"words": [
{
"boundingBox": [204,627,481,628,481,830,204,829],
"text": "Our"
},
{
"boundingBox": [519,628,1057,630,1057,832,518,830],
"text": "greatest"
},
{
"boundingBox": [1114,630,1549,631,1548,833,1114,832],
"text": "glory"
},
{
"boundingBox": [1586,631,1785,632,1784,834,1586,833],
"text": "is"
},
{
"boundingBox": [1822,632,2115,633,2115,835,1822,834],
"text": "not"
}
]
},
{
"boundingBox": [420,1273,2954,1250,2958,1488,422,1511],
"text": "but in rising every time we fall",
"words": [
{
"boundingBox": [423,1269,634,1268,635,1507,424,1508],
"text": "but"
},
{
"boundingBox": [667,1268,808,1268,809,1506,668,1507],
"text": "in"
},
{
"boundingBox": [874,1267,1289,1265,1290,1504,875,1506],
"text": "rising"
},
{
"boundingBox": [1331,1265,1771,1263,1772,1502,1332,1504],
"text": "every"
},
{
"boundingBox": [1812, 1263, 2178, 1261, 2179, 1500, 1813, 1502],
"text": "time"
},
{
"boundingBox": [2219, 1261, 2510, 1260, 2511, 1498, 2220, 1500],
"text": "we"
},
{
"boundingBox": [2551, 1260, 3016, 1258, 3017, 1496, 2552, 1498],
"text": "fall"
}
]
},
{
"boundingBox": [1612, 903, 2744, 935, 2738, 1139, 1607, 1107],
"text": "in never failing ,",
"words": [
{
"boundingBox": [1611, 934, 1707, 933, 1708, 1147, 1613, 1147],
"text": "in"
},
{
"boundingBox": [1753, 933, 2132, 930, 2133, 1144, 1754, 1146],
"text": "never"
},
{
"boundingBox": [2162, 930, 2673, 927, 2674, 1140, 2164, 1144],
"text": "failing"
},
{
"boundingBox": [2703, 926, 2788, 926, 2790, 1139, 2705, 1140],
"text": ",",
"confidence": "Low"
}
]
}
]
}
]
}
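For comparison only, the same submit-then-poll flow that the C# sample implements can be sketched in a few lines of Python. This sketch assumes the requests library, the asyncBatchAnalyze endpoint shown above, and a handwriting sample image; it is not part of the quickstart.
import time
import requests

subscription_key = "<Subscription Key>"  # replace with your key
endpoint = "https://westus.api.cognitive.microsoft.com/vision/v2.0/read/core/asyncBatchAnalyze"
headers = {"Ocp-Apim-Subscription-Key": subscription_key, "Content-Type": "application/json"}
image_url = ("https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/"
             "Cursive_Writing_on_Notebook_paper.jpg/800px-Cursive_Writing_on_Notebook_paper.jpg")

# Step 1: submit the image. A successful call returns no body; the URI of the
# Read Operation Result method is in the Operation-Location response header.
submit = requests.post(endpoint, headers=headers, json={"url": image_url})
submit.raise_for_status()
operation_url = submit.headers["Operation-Location"]

# Step 2: poll that URI until the status is Succeeded (here, once per second for up to ten seconds).
result = {}
for _ in range(10):
    result = requests.get(operation_url, headers={"Ocp-Apim-Subscription-Key": subscription_key}).json()
    if result.get("status") == "Succeeded":
        break
    time.sleep(1)
print(result)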
Clean up resources
Next steps
When no longer needed, delete the Visual Studio solution. To do so, open File Explorer, navigate to the folder in
which you created the Visual Studio solution, and delete the folder.
Explore a basic Windows application that uses Computer Vision to perform optical character recognition (OCR).
Create smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an
image.
Computer Vision API C# Tutorial
Quickstart: Extract handwritten text using the REST
API and Java in Computer Vision
5/29/2019 6 minutes to read Edit Online
IMPORTANT
Prerequisites
Create and run the sample application
In this quickstart, you extract handwritten text from an image by using Computer Vision's REST API. With the
Batch Read API and the Read Operation Result API, you can detect handwritten text in an image, then extract
recognized characters into a machine-usable character stream.
Unlike the OCR method, the Batch Read method runs asynchronously. This method does not return any information in the
body of a successful response. Instead, the Batch Read method returns a URI in the value of the Operation-Location
response header field. You can then call this URI, which represents the Read Operation Result method, in order to check the
status and return the results of the Batch Read method call.
If you don't have an Azure subscription, create a free account before you begin.
You must have Java™ Platform, Standard Edition Development Kit 7 or 8 (JDK 7 or 8) installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Create a new Java project in your favorite IDE or editor. If the option is available, create the Java project from
a command line application template.
2. Import the following libraries into your Java project. If you're using Maven, the Maven coordinates are
provided for each library.
Apache HTTP client (org.apache.httpcomponents:httpclient:4.5.5)
Apache HTTP core (org.apache.httpcomponents:httpcore:4.4.9)
JSON library (org.json:json:20180130)
3. Add the following import statements to the file that contains the Main public class for your project.
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.util.EntityUtils;
import org.apache.http.Header;
import org.json.JSONObject;
4. Replace the Main public class with the following code, then make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Batch Read method from the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageToAnalyze with the URL of a different image from which you want to extract handwritten text.
5. Save, then build the Java project.
6. If you're using an IDE, run Main . Otherwise, open a command prompt window and then use the java command to run the compiled class. For example, java Main .
public class Main {
// **********************************************
// *** Update or verify the following values. ***
// **********************************************
// Replace <Subscription Key> with your valid subscription key.
private static final String subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
private static final String uriBase =
"https://westus.api.cognitive.microsoft.com/vision/v2.0/read/core/asyncBatchAnalyze";
private static final String imageToAnalyze =
"https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/" +
"Cursive_Writing_on_Notebook_paper.jpg/800px-Cursive_Writing_on_Notebook_paper.jpg";
public static void main(String[] args) {
CloseableHttpClient httpTextClient = HttpClientBuilder.create().build();
CloseableHttpClient httpResultClient = HttpClientBuilder.create().build();
try {
// This operation requires two REST API calls. One to submit the image
// for processing, the other to retrieve the text found in the image.
URIBuilder builder = new URIBuilder(uriBase);
// Prepare the URI for the REST API method.
URI uri = builder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKey);
// Request body.
StringEntity requestEntity =
new StringEntity("{\"url\":\"" + imageToAnalyze + "\"}");
request.setEntity(requestEntity);
// Two REST API methods are required to extract handwritten text.
// One method to submit the image for processing, the other method
// to retrieve the text found in the image.
// Call the first REST API method to detect the text.
HttpResponse response = httpTextClient.execute(request);
// Check for success.
if (response.getStatusLine().getStatusCode() != 202) {
// Format and display the JSON error message.
HttpEntity entity = response.getEntity();
String jsonString = EntityUtils.toString(entity);
JSONObject json = new JSONObject(jsonString);
System.out.println("Error:\n");
System.out.println(json.toString(2));
return;
}
// Store the URI of the second REST API method.
// This URI is where you can get the results of the first REST API method.
String operationLocation = null;
// The 'Operation-Location' response header value contains the URI for
// the second REST API method.
Header[] responseHeaders = response.getAllHeaders();
for (Header header : responseHeaders) {
if (header.getName().equals("Operation-Location")) {
operationLocation = header.getValue();
break;
}
}
if (operationLocation == null) {
System.out.println("\nError retrieving Operation-Location.\nExiting.");
System.exit(1);
}
// If the first REST API method completes successfully, the second
// REST API method retrieves the text written in the image.
//
// Note: The response may not be immediately available. Handwriting
// recognition is an asynchronous operation that can take a variable
// amount of time depending on the length of the handwritten text.
// You may need to wait or retry this operation.
System.out.println("\nHandwritten text submitted.\n" +
"Waiting 10 seconds to retrieve the recognized text.\n");
Thread.sleep(10000);
// Call the second REST API method and get the response.
HttpGet resultRequest = new HttpGet(operationLocation);
resultRequest.setHeader("Ocp-Apim-Subscription-Key", subscriptionKey);
HttpResponse resultResponse = httpResultClient.execute(resultRequest);
HttpEntity responseEntity = resultResponse.getEntity();
if (responseEntity != null) {
// Format and display the JSON response.
String jsonString = EntityUtils.toString(responseEntity);
JSONObject json = new JSONObject(jsonString);
System.out.println("Text recognition result response: \n");
System.out.println(json.toString(2));
}
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
}
Examine the response
A successful response is returned in JSON. The sample application parses and displays a successful response in the console window, similar to the following example:
Handwritten text submitted. Waiting 10 seconds to retrieve the recognized text.
Text recognition result response:
{
"status": "Succeeded",
"recognitionResults": [
{
"page": 1,
"clockwiseOrientation": 349.59,
"width": 3200,
"height": 3200,
"unit": "pixel",
"lines": [
{
"boundingBox": [202,618,2047,643,2046,840,200,813],
"text": "Our greatest glory is not",
"words": [
{
"boundingBox": [204,627,481,628,481,830,204,829],
"text": "Our"
},
{
"boundingBox": [519,628,1057,630,1057,832,518,830],
"text": "greatest"
},
{
"boundingBox": [1114,630,1549,631,1548,833,1114,832],
"text": "glory"
},
{
"boundingBox": [1586,631,1785,632,1784,834,1586,833],
"text": "is"
},
{
"boundingBox": [1822,632,2115,633,2115,835,1822,834],
"text": "not"
}
]
},
{
"boundingBox": [420,1273,2954,1250,2958,1488,422,1511],
"text": "but in rising every time we fall",
"words": [
{
"boundingBox": [423,1269,634,1268,635,1507,424,1508],
"text": "but"
},
{
"boundingBox": [667,1268,808,1268,809,1506,668,1507],
"text": "in"
},
{
"boundingBox": [874,1267,1289,1265,1290,1504,875,1506],
"text": "rising"
},
{
"boundingBox": [1331,1265,1771,1263,1772,1502,1332,1504],
"text": "every"
},
{
"boundingBox": [1812, 1263, 2178, 1261, 2179, 1500, 1813, 1502],
"text": "time"
},
{
"boundingBox": [2219, 1261, 2510, 1260, 2511, 1498, 2220, 1500],
"text": "we"
},
{
"boundingBox": [2551, 1260, 3016, 1258, 3017, 1496, 2552, 1498],
"text": "fall"
}
]
},
{
"boundingBox": [1612, 903, 2744, 935, 2738, 1139, 1607, 1107],
"text": "in never failing ,",
"words": [
{
"boundingBox": [1611, 934, 1707, 933, 1708, 1147, 1613, 1147],
"text": "in"
},
{
"boundingBox": [1753, 933, 2132, 930, 2133, 1144, 1754, 1146],
"text": "never"
},
{
"boundingBox": [2162, 930, 2673, 927, 2674, 1140, 2164, 1144],
"text": "failing"
},
{
"boundingBox": [2703, 926, 2788, 926, 2790, 1139, 2705, 1140],
"text": ",",
"confidence": "Low"
}
]
}
]
}
]
}
Clean up resources
Next steps
When no longer needed, delete the Java project, including the compiled class and imported libraries.
Explore a Java Swing application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API Java Tutorial
Quickstart: Extract handwritten text using the REST
API and JavaScript in Computer Vision
5/29/2019 5 minutes to read Edit Online
In this quickstart, you extract handwritten text from an image by using Computer Vision's REST API. With the Batch Read API and the Read Operation Result API, you can detect handwritten text in an image, then extract recognized characters into a machine-usable character stream.
IMPORTANT
Unlike the OCR method, the Batch Read method runs asynchronously. This method does not return any information in the body of a successful response. Instead, the Batch Read method returns a URI in the value of the Operation-Location response header field. You can then call this URI, which represents the Read Operation Result method, to both check the status and return the results of the Batch Read method call.
Prerequisites
If you don't have an Azure subscription, create a free account before you begin.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
Create and run the sample
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Batch Read method from the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of the value attribute for the inputImage control with the URL of a different image from which you want to extract handwritten text.
3. Save the code as a file with an .html extension. For example, get-handwriting.html .
4. Open a browser window.
5. In the browser, drag and drop the file into the browser window.
6. When the webpage is displayed in the browser, choose the Read image button.
<!DOCTYPE html>
<html>
<head>
<title>Handwriting Sample</title>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>
<script type="text/javascript">
function processImage() {
// **********************************************
// *** Update or verify the following values. ***
// **********************************************
// Replace <Subscription Key> with your valid subscription key.
var subscriptionKey = "<Subscription Key>";
// You must use the same Azure region in your REST API method as you used to
// get your subscription keys. For example, if you got your subscription keys
// from the West US region, replace "westcentralus" in the URL
// below with "westus".
//
// Free trial subscription keys are generated in the "westus" region.
// If you use a free trial subscription key, you shouldn't need to change
// this region.
var uriBase =
"https://westus.api.cognitive.microsoft.com/vision/v2.0/read/core/asyncBatchAnalyze";
// Display the image.
var sourceImageUrl = document.getElementById("inputImage").value;
document.querySelector("#sourceImage").src = sourceImageUrl;
// This operation requires two REST API calls. One to submit the image
// for processing, the other to retrieve the text found in the image.
//
// Make the first REST API call to submit the image for processing.
$.ajax({
url: uriBase,
// Request headers.
beforeSend: function(jqXHR){
jqXHR.setRequestHeader("Content-Type","application/json");
jqXHR.setRequestHeader("Ocp-Apim-Subscription-Key", subscriptionKey);
},
type: "POST",
// Request body.
data: '{"url": ' + '"' + sourceImageUrl + '"}',
})
.done(function(data, textStatus, jqXHR) {
// Show progress.
$("#responseTextArea").val("Handwritten text submitted. " +
"Waiting 10 seconds to retrieve the recognized text.");
// Note: The response may not be immediately available. Handwriting
// recognition is an asynchronous operation that can take a variable
// amount of time depending on the length of the text you want to
// recognize. You may need to wait or retry the GET operation.
//
// Wait ten seconds before making the second REST API call.
setTimeout(function () {
// "Operation-Location" in the response contains the URI
// to retrieve the recognized text.
var operationLocation = jqXHR.getResponseHeader("Operation-Location");
// Make the second REST API call and get the response.
$.ajax({
url: operationLocation,
// Request headers.
beforeSend: function(jqXHR){
jqXHR.setRequestHeader("Content-Type","application/json");
jqXHR.setRequestHeader(
"Ocp-Apim-Subscription-Key", subscriptionKey);
},
type: "GET",
})
.done(function(data) {
// Show formatted JSON on webpage.
$("#responseTextArea").val(JSON.stringify(data, null, 2));
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Display error message.
var errorString = (errorThrown === "") ? "Error. " :
errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" :
(jQuery.parseJSON(jqXHR.responseText).message) ?
jQuery.parseJSON(jqXHR.responseText).message :
jQuery.parseJSON(jqXHR.responseText).error.message;
alert(errorString);
});
}, 10000);
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Put the JSON description into the text area.
$("#responseTextArea").val(JSON.stringify(jqXHR, null, 2));
// Display error message.
var errorString = (errorThrown === "") ? "Error. " :
errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" :
(jQuery.parseJSON(jqXHR.responseText).message) ?
jQuery.parseJSON(jqXHR.responseText).message :
jQuery.parseJSON(jqXHR.responseText).error.message;
alert(errorString);
});
};
</script>
<h1>Read handwritten image:</h1>
Enter the URL to an image of handwritten text, then click
the <strong>Read image</strong> button.
<br><br>
Image to read:
<input type="text" name="inputImage" id="inputImage"
value="https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Cursive_Writing_on_Notebook_paper.jpg/800px-
Cursive_Writing_on_Notebook_paper.jpg" />
<button onclick="processImage()">Read image</button>
<br><br>
<div id="wrapper" style="width:1020px; display:table;">
<div id="jsonOutput" style="width:600px; display:table-cell;">
Response:
<br><br>
<textarea id="responseTextArea" class="UIInput"
style="width:580px; height:400px;"></textarea>
</div>
<div id="imageDiv" style="width:420px; display:table-cell;">
Source image:
<br><br>
<img id="sourceImage" width="400" />
</div>
</div>
</body>
</html>
Examine the response
A successful response is returned in JSON. The sample webpage parses and displays a successful response in the
browser window, similar to the following example:
{
"status": "Succeeded",
"recognitionResults": [
{
"page": 1,
"clockwiseOrientation": 349.59,
"width": 3200,
"height": 3200,
"unit": "pixel",
"lines": [
{
"boundingBox": [202,618,2047,643,2046,840,200,813],
"text": "Our greatest glory is not",
"words": [
{
"boundingBox": [204,627,481,628,481,830,204,829],
"text": "Our"
},
{
"boundingBox": [519,628,1057,630,1057,832,518,830],
"text": "greatest"
},
{
"boundingBox": [1114,630,1549,631,1548,833,1114,832],
"text": "glory"
},
{
"boundingBox": [1586,631,1785,632,1784,834,1586,833],
"text": "is"
},
{
"boundingBox": [1822,632,2115,633,2115,835,1822,834],
"text": "not"
}
]
},
{
"boundingBox": [420,1273,2954,1250,2958,1488,422,1511],
"text": "but in rising every time we fall",
"words": [
{
"boundingBox": [423,1269,634,1268,635,1507,424,1508],
"text": "but"
},
{
"boundingBox": [667,1268,808,1268,809,1506,668,1507],
"text": "in"
},
{
"boundingBox": [874,1267,1289,1265,1290,1504,875,1506],
"text": "rising"
},
{
"boundingBox": [1331,1265,1771,1263,1772,1502,1332,1504],
"text": "every"
},
{
"boundingBox": [1812, 1263, 2178, 1261, 2179, 1500, 1813, 1502],
"text": "time"
},
{
"boundingBox": [2219, 1261, 2510, 1260, 2511, 1498, 2220, 1500],
"text": "we"
},
{
"boundingBox": [2551, 1260, 3016, 1258, 3017, 1496, 2552, 1498],
"text": "fall"
}
]
},
{
"boundingBox": [1612, 903, 2744, 935, 2738, 1139, 1607, 1107],
"text": "in never failing ,",
"words": [
{
"boundingBox": [1611, 934, 1707, 933, 1708, 1147, 1613, 1147],
"text": "in"
},
{
"boundingBox": [1753, 933, 2132, 930, 2133, 1144, 1754, 1146],
"text": "never"
},
{
"boundingBox": [2162, 930, 2673, 927, 2674, 1140, 2164, 1144],
"text": "failing"
},
{
"boundingBox": [2703, 926, 2788, 926, 2790, 1139, 2705, 1140],
"text": ",",
"confidence": "Low"
}
]
}
]
}
]
}
Clean up resources
Next steps
When no longer needed, delete the file.
Explore a JavaScript application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API JavaScript Tutorial
Quickstart: Extract handwritten text using the REST
API and Python in Computer Vision
5/29/2019 5 minutes to read Edit Online
IMPORTANT
Prerequisites
Create and run the sample
import requests
import time
# If you are using a Jupyter notebook, uncomment the following line.
#%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
In this quickstart, you extract handwritten text from an image by using Computer Vision's REST API. With the
Batch Read API and the Read Operation Result API, you can detect handwritten text in an image, then extract
recognized characters into a machine-usable character stream.
Unlike the OCR method, the Batch Read method runs asynchronously. This method does not return any information in the
body of a successful response. Instead, the Batch Read method returns a URI in the value of the Operation-Location
response header field. You can then call this URI, which represents the Read Operation Result API, to both check the status
and return the results of the Batch Read method call.
You can run this quickstart in a step-by-step fashion using a Jupyter notebook on MyBinder. To launch Binder,
select the following button:
launch binder
If you don't have an Azure subscription, create a free account before you begin.
You must have Python installed if you want to run the sample locally.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
3. Save the code as a file with a .py extension. For example, get-handwritten-text.py.
4. Open a command prompt window.
5. At the prompt, use the python command to run the sample. For example, python get-handwritten-text.py .
a. Replace the value of subscription_key with your subscription key.
b. Replace the value of vision_base_url with the endpoint URL for the Computer Vision resource in the
Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of image_url with the URL of a different image from which you want to
extract handwritten text.
from PIL import Image
from io import BytesIO
# Replace <Subscription Key> with your valid subscription key.
subscription_key = "<Subscription Key>"
assert subscription_key
# You must use the same region in your REST call as you used to get your
# subscription keys. For example, if you got your subscription keys from
# westus, replace "westcentralus" in the URI below with "westus".
#
# Free trial subscription keys are generated in the "westus" region.
# If you use a free trial subscription key, you shouldn't need to change
# this region.
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
text_recognition_url = vision_base_url + "read/core/asyncBatchAnalyze"
# Set image_url to the URL of an image that you want to analyze.
image_url = "https://upload.wikimedia.org/wikipedia/commons/d/dd/Cursive_Writing_on_Notebook_paper.jpg"
headers = {'Ocp-Apim-Subscription-Key': subscription_key}
data = {'url': image_url}
response = requests.post(
text_recognition_url, headers=headers, json=data)
response.raise_for_status()
# Extracting handwritten text requires two API calls: One call to submit the
# image for processing, the other to retrieve the text found in the image.
# Holds the URI used to retrieve the recognized text.
operation_url = response.headers["Operation-Location"]
# The recognized text isn't immediately available, so poll to wait for completion.
analysis = {}
poll = True
while (poll):
    response_final = requests.get(
        response.headers["Operation-Location"], headers=headers)
    analysis = response_final.json()
    print(analysis)
    time.sleep(1)
    if ("recognitionResults" in analysis):
        poll = False
    if ("status" in analysis and analysis['status'] == 'Failed'):
        poll = False
polygons = []
if ("recognitionResults" in analysis):
    # Extract the recognized text, with bounding boxes.
    polygons = [(line["boundingBox"], line["text"])
                for line in analysis["recognitionResults"][0]["lines"]]
# Display the image and overlay it with the extracted text.
plt.figure(figsize=(15, 15))
image = Image.open(BytesIO(requests.get(image_url).content))
ax = plt.imshow(image)
for polygon in polygons:
    vertices = [(polygon[0][i], polygon[0][i+1])
                for i in range(0, len(polygon[0]), 2)]
    text = polygon[1]
    patch = Polygon(vertices, closed=True, fill=False, linewidth=2, color='y')
    ax.axes.add_patch(patch)
    plt.text(vertices[0][0], vertices[0][1], text, fontsize=20, va="top")
Examine the response
A successful response is returned in JSON. The sample prints a successful response to the
command prompt window, similar to the following example:
{
"status": "Succeeded",
"recognitionResult": {
"lines": [
{
"boundingBox": [
2,
52,
65,
46,
69,
89,
7,
95
],
"text": "dog",
"words": [
{
"boundingBox": [
0,
59,
63,
43,
77,
86,
3,
102
],
"text": "dog"
}
]
},
{
"boundingBox": [
6,
2,
771,
13,
770,
75,
5,
64
],
"text": "The quick brown fox jumps over the lazy",
"words": [
{
"boundingBox": [
0,
4,
92,
5,
77,
71,
0,
71
],
"text": "The"
},
{
"boundingBox": [
74,
4,
189,
5,
174,
72,
60,
71
],
"text": "quick"
},
{
"boundingBox": [
176,
5,
321,
6,
306,
73,
161,
72
],
"text": "brown"
},
{
"boundingBox": [
308,
6,
387,
6,
372,
73,
293,
73
],
"text": "fox"
},
{
"boundingBox": [
382,
6,
506,
7,
491,
74,
368,
73
],
"text": "jumps"
},
{
"boundingBox": [
492,
7,
607,
8,
592,
75,
478,
74
],
"text": "over"
},
{
"boundingBox": [
589,
8,
673,
8,
658,
75,
575,
75
],
"text": "the"
},
{
"boundingBox": [
660,
8,
783,
9,
768,
76,
645,
75
],
"text": "lazy"
}
]
},
{
"boundingBox": [
2,
84,
783,
96,
782,
154,
1,
148
],
"text": "Pack my box with five dozen liquor jugs",
"words": [
{
"boundingBox": [
0,
86,
94,
87,
72,
151,
0,
149
],
"text": "Pack"
},
{
"boundingBox": [
76,
87,
164,
88,
142,
152,
54,
150
],
"text": "my"
},
{
"boundingBox": [
155,
88,
243,
89,
222,
152,
134,
151
],
"text": "box"
},
{
"boundingBox": [
226,
89,
344,
90,
323,
154,
204,
152
],
"text": "with"
},
{
"boundingBox": [
336,
90,
432,
91,
411,
154,
314,
154
],
"text": "five"
},
{
"boundingBox": [
419,
91,
538,
92,
516,
154,
398,
154
],
"text": "dozen"
},
{
"boundingBox": [
547,
92,
701,
94,
679,
154,
525,
154
],
"text": "liquor"
},
{
"boundingBox": [
696,
94,
800,
95,
780,
154,
675,
154
],
"text": "jugs"
}
]
}
]
}
}
Next steps
Explore a Python application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API Python Tutorial
Quickstart: Recognize domain-specific content using
the REST API and PHP with Computer Vision
4/19/2019 2 minutes to read Edit Online
Prerequisites
Create and run the sample
In this quickstart, you use a domain model to identify landmarks or, optionally, celebrities in a remotely stored
image by using Computer Vision's REST API. With the Recognize Domain Specific Content method, you can apply
a domain-specific model to recognize content within an image.
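The PHP sample below walks through the full request. As a rough sketch only, the same REST call made with Python's requests library looks like the following; the subscription key is a placeholder and the endpoint region may need to change.
import requests

subscription_key = "<Subscription Key>"  # placeholder
uri_base = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
domain = "landmarks"  # or "celebrities"
image_url = "https://upload.wikimedia.org/wikipedia/commons/2/23/Space_Needle_2011-07-04.jpg"

response = requests.post(
    uri_base + "models/" + domain + "/analyze",
    headers={"Ocp-Apim-Subscription-Key": subscription_key},
    params={"model": domain},
    json={"url": image_url})
response.raise_for_status()
print(response.json())  # for example: {"result": {"landmarks": [...]}, ...}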
If you don't have an Azure subscription, create a free account before you begin.
You must have PHP installed.
You must have Pear installed.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the sample, do the following steps:
1. Install the PHP5 HTTP_Request2 package.
pear install HTTP_Request2
a. Open a command prompt window as an administrator.
b. Run the following command:
c. After the package is successfully installed, close the command prompt window.
2. Copy the following code into a text editor.
3. Make the following changes in code where needed:
a. Replace the value of subscriptionKey with your subscription key.
b. Replace the value of uriBase with the endpoint URL for the Recognize Domain Specific Content method
from the Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of imageUrl with the URL of a different image that you want to analyze.
d. Optionally, replace the value of the domain request parameter with celebrities if you want to use the
celebrities domain model instead of the landmarks domain model.
4. Save the code as a file with a .php extension. For example, use-domain-model.php .
5. Open a browser window with PHP support.
6. Drag and drop the file into the browser window.
<html>
<head>
<title>Analyze Domain Model Sample</title>
</head>
<body>
<?php
// Replace <Subscription Key> with a valid subscription key.
$ocpApimSubscriptionKey = '<Subscription Key>';
// You must use the same location in your REST call as you used to obtain
// your subscription keys. For example, if you obtained your subscription keys
// from westus, replace "westcentralus" in the URL below with "westus".
$uriBase = 'https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/';
// Change 'landmarks' to 'celebrities' to use the Celebrities model.
$domain = 'landmarks';
$imageUrl =
'https://upload.wikimedia.org/wikipedia/commons/2/23/Space_Needle_2011-07-04.jpg';
require_once 'HTTP/Request2.php';
$request = new Http_Request2($uriBase . 'models/' . $domain . '/analyze');
$url = $request->getUrl();
$headers = array(
// Request headers
'Content-Type' => 'application/json',
'Ocp-Apim-Subscription-Key' => $ocpApimSubscriptionKey
);
$request->setHeader($headers);
$parameters = array(
// Request parameters
'model' => $domain
);
$url->setQueryVariables($parameters);
$request->setMethod(HTTP_Request2::METHOD_POST);
// Request body parameters
$body = json_encode(array('url' => $imageUrl));
// Request body
$request->setBody($body);
try
{
$response = $request->send();
echo "<pre>" .
json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "</pre>";
}
catch (HttpException $ex)
{
echo "<pre>" . $ex . "</pre>";
}
?>
</body>
</html>
Examine the response
A successful response is returned in JSON. The sample website parses and displays a successful response in the
browser window, similar to the following example:
{
"result": {
"landmarks": [
{
"name": "Space Needle",
"confidence": 0.9998177886009216
}
]
},
"requestId": "4d26587b-b2b9-408d-a70c-1f8121d84b0d",
"metadata": {
"height": 4132,
"width": 2096,
"format": "Jpeg"
}
}
Clean up resources
Next steps
When no longer needed, delete the file, and then uninstall the PHP5 HTTP_Request2 package. To uninstall the
package, do the following steps:
pear uninstall HTTP_Request2
1. Open a command prompt window as an administrator.
2. Run the following command:
3. After the package is successfully uninstalled, close the command prompt window.
Explore the Computer Vision API used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text. To rapidly experiment with the Computer Vision API, try the Open API
testing console.
Explore the Computer Vision API
Quickstart: Use a domain model using the REST API
and Python in Computer Vision
4/19/2019 4 minutes to read Edit Online
Prerequisites
Create and run the landmarks sample
In this quickstart, you use a domain model to identify landmarks or, optionally, celebrities in a remotely stored
image by using Computer Vision's REST API. With the Recognize Domain Specific Content method, you can apply
a domain-specific model to recognize content within an image.
You can run this quickstart in a step-by-step fashion using a Jupyter notebook on MyBinder. To launch Binder,
select the following button:
launch binder
If you don't have an Azure subscription, create a free account before you begin.
You must have Python installed if you want to run the sample locally.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
To create and run the landmark sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
3. Save the code as a file with a .py extension. For example, get-landmarks.py.
4. Open a command prompt window.
5. At the prompt, use the python command to run the sample. For example, python get-landmarks.py .
a. Replace the value of subscription_key with your subscription key.
b. Replace the value of vision_base_url with the endpoint URL for the Computer Vision resource in the
Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of image_url with the URL of a different image in which you want to detect
landmarks.
import requests
# If you are using a Jupyter notebook, uncomment the following line.
#%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image
from io import BytesIO
# Replace <Subscription Key> with your valid subscription key.
subscription_key = "<Subscription Key>"
assert subscription_key
# You must use the same region in your REST call as you used to get your
# subscription keys. For example, if you got your subscription keys from
# westus, replace "westcentralus" in the URI below with "westus".
#
# Free trial subscription keys are generated in the "westus" region.
# If you use a free trial subscription key, you shouldn't need to change
# this region.
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
landmark_analyze_url = vision_base_url + "models/landmarks/analyze"
# Set image_url to the URL of an image that you want to analyze.
image_url = "https://upload.wikimedia.org/wikipedia/commons/f/f6/" + \
"Bunker_Hill_Monument_2005.jpg"
headers = {'Ocp-Apim-Subscription-Key': subscription_key}
params = {'model': 'landmarks'}
data = {'url': image_url}
response = requests.post(
landmark_analyze_url, headers=headers, params=params, json=data)
response.raise_for_status()
# The 'analysis' object contains various fields that describe the image. The
# most relevant landmark for the image is obtained from the 'result' property.
analysis = response.json()
assert analysis["result"]["landmarks"]  # fails if no landmarks were detected
print(analysis)
landmark_name = analysis["result"]["landmarks"][0]["name"].capitalize()
# Display the image and overlay it with the landmark name.
image = Image.open(BytesIO(requests.get(image_url).content))
plt.imshow(image)
plt.axis("off")
_ = plt.title(landmark_name, size="x-large", y=-0.1)
Examine the response for the landmarks sample
A successful response is returned in JSON. The sample prints a successful response to the
command prompt window, similar to the following example:
{
"result": {
"landmarks": [
{
"name": "Bunker Hill Monument",
"confidence": 0.9768505096435547
}
]
},
"requestId": "659a10cd-44bb-44db-9147-a295b853b2b8",
"metadata": {
"height": 1600,
"width": 1200,
"format": "Jpeg"
}
}
Create and run the celebrities sample
To create and run the celebrities sample, do the following steps:
1. Copy the following code into a text editor.
2. Make the following changes in code where needed:
3. Save the code as a file with a .py extension. For example, get-celebrities.py.
4. Open a command prompt window.
5. At the prompt, use the python command to run the sample. For example, python get-celebrities.py .
a. Replace the value of subscription_key with your subscription key.
b. Replace the value of vision_base_url with the endpoint URL for the Computer Vision resource in the
Azure region where you obtained your subscription keys, if necessary.
c. Optionally, replace the value of image_url with the URL of a different image in which you want to detect
celebrities.
import requests
# If you are using a Jupyter notebook, uncomment the following line.
#%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image
from io import BytesIO
# Replace <Subscription Key> with your valid subscription key.
subscription_key = "<Subscription Key>"
assert subscription_key
vision_base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/"
celebrity_analyze_url = vision_base_url + "models/celebrities/analyze"
# Set image_url to the URL of an image that you want to analyze.
image_url = "https://upload.wikimedia.org/wikipedia/commons/d/d9/" + \
"Bill_gates_portrait.jpg"
headers = {'Ocp-Apim-Subscription-Key': subscription_key}
params = {'model': 'celebrities'}
data = {'url': image_url}
response = requests.post(
celebrity_analyze_url, headers=headers, params=params, json=data)
response.raise_for_status()
# The 'analysis' object contains various fields that describe the image. The
# most relevant celebrity for the image is obtained from the 'result' property.
analysis = response.json()
assert analysis["result"]["celebrities"]  # fails if no celebrities were detected
print(analysis)
celebrity_name = analysis["result"]["celebrities"][0]["name"].capitalize()
# Display the image and overlay it with the celebrity name.
image = Image.open(BytesIO(requests.get(image_url).content))
plt.imshow(image)
plt.axis("off")
_ = plt.title(celebrity_name, size="x-large", y=-0.1)
Examine the response for the celebrities sample
A successful response is returned in JSON. The sample prints a successful response to the
command prompt window, similar to the following example:
{
"result": {
"celebrities": [
{
"faceRectangle": {
"top": 123,
"left": 156,
"width": 187,
"height": 187
},
"name": "Bill Gates",
"confidence": 0.9993845224380493
}
]
},
"requestId": "f14ec1d0-62d4-4296-9ceb-6b5776dc2020",
"metadata": {
"height": 521,
"width": 550,
"format": "Jpeg"
}
}
Clean up resources
Next steps
When no longer needed, delete the files for both samples.
Explore a Python application that uses Computer Vision to perform optical character recognition (OCR); create
smart-cropped thumbnails; plus detect, categorize, tag, and describe visual features, including faces, in an image. To
rapidly experiment with the Computer Vision API, try the Open API testing console.
Computer Vision API Python Tutorial
Prerequisites
Create and run the sample application
In this quickstart, you will analyze both a local and a remote image to extract visual features using the Computer
Vision client library for C#. If you wish, you can download the code in this guide as a complete sample app from
the Cognitive Services Csharp Vision repo on GitHub.
A Computer Vision subscription key. You can get a free trial subscription key from Try Cognitive Services. Or,
follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
Any edition of Visual Studio 2015 or 2017.
The Microsoft.Azure.CognitiveServices.Vision.ComputerVision client library NuGet package. It isn't necessary
to download the package. Installation instructions are provided below.
To run the sample, do the following steps:
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
namespace ImageAnalyze
{
class Program
{
// subscriptionKey = "0123456789abcdef0123456789ABCDEF"
private const string subscriptionKey = "<SubscriptionKey>";
// localImagePath = @"C:\Documents\LocalImage.jpg"
private const string localImagePath = @"<LocalImage>";
1. Create a new Visual C# Console App in Visual Studio.
2. Install the Computer Vision client library NuGet package.
a. On the menu, click Tools, select NuGet Package Manager, then Manage NuGet Packages for
Solution.
b. Click the Browse tab, and in the Search box type
"Microsoft.Azure.CognitiveServices.Vision.ComputerVision".
c. Select Microsoft.Azure.CognitiveServices.Vision.ComputerVision when it displays, then click the
checkbox next to your project name, and Install.
3. Replace the contents of Program.cs with the following code. The AnalyzeImageAsync and
AnalyzeImageInStreamAsync methods wrap the Analyze Image REST API for remote and local images,
respectively.
private const string remoteImageUrl =
"https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg";
// Specify the features to return
private static readonly List<VisualFeatureTypes> features =
new List<VisualFeatureTypes>()
{
VisualFeatureTypes.Categories, VisualFeatureTypes.Description,
VisualFeatureTypes.Faces, VisualFeatureTypes.ImageType,
VisualFeatureTypes.Tags
};
static void Main(string[] args)
{
ComputerVisionClient computerVision = new ComputerVisionClient(
new ApiKeyServiceClientCredentials(subscriptionKey),
new System.Net.Http.DelegatingHandler[] { });
// You must use the same region as you used to get your subscription
// keys. For example, if you got your subscription keys from westus,
// replace "westcentralus" with "westus".
//
// Free trial subscription keys are generated in the "westus"
// region. If you use a free trial subscription key, you shouldn't
// need to change the region.
// Specify the Azure region
computerVision.Endpoint = "https://westcentralus.api.cognitive.microsoft.com";
Console.WriteLine("Images being analyzed ...");
var t1 = AnalyzeRemoteAsync(computerVision, remoteImageUrl);
var t2 = AnalyzeLocalAsync(computerVision, localImagePath);
Task.WhenAll(t1, t2).Wait(5000);
Console.WriteLine("Press ENTER to exit");
Console.ReadLine();
}
// Analyze a remote image
private static async Task AnalyzeRemoteAsync(
ComputerVisionClient computerVision, string imageUrl)
{
if (!Uri.IsWellFormedUriString(imageUrl, UriKind.Absolute))
{
Console.WriteLine(
"\nInvalid remoteImageUrl:\n{0} \n", imageUrl);
return;
}
ImageAnalysis analysis =
await computerVision.AnalyzeImageAsync(imageUrl, features);
DisplayResults(analysis, imageUrl);
}
// Analyze a local image
private static async Task AnalyzeLocalAsync(
ComputerVisionClient computerVision, string imagePath)
{
if (!File.Exists(imagePath))
{
Console.WriteLine(
"\nUnable to open or read localImagePath:\n{0} \n", imagePath);
return;
}
using (Stream imageStream = File.OpenRead(imagePath))
{
ImageAnalysis analysis = await computerVision.AnalyzeImageInStreamAsync(
imageStream, features);
Examine the response
https://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg
a large waterfall over a rocky cliff
Next steps
DisplayResults(analysis, imagePath);
}
}
// Display the most relevant caption for the image
private static void DisplayResults(ImageAnalysis analysis, string imageUri)
{
Console.WriteLine(imageUri);
if (analysis.Description.Captions.Count != 0)
{
Console.WriteLine(analysis.Description.Captions[0].Text + "\n");
}
else
{
Console.WriteLine("No description generated.");
}
}
}
}
4. Replace <Subscription Key> with your valid subscription key.
5. Change computerVision.Endpoint to the Azure region associated with your subscription keys, if necessary.
6. Replace <LocalImage> with the path and file name of a local image.
7. Optionally, set remoteImageUrl to a different image URL.
8. Run the program.
A successful response displays the most relevant caption for each image. You can change the DisplayResults
method to output different image data. See the AnalyzeLocalAsync method to learn more.
See API Quickstarts: Analyze a local image with C# for an example of a raw JSON output.
Explore the Computer Vision APIs used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text.
Explore Computer Vision APIs
Quickstart: Generate a thumbnail using the Computer
Vision SDK and C#
4/18/2019 3 minutes to read Edit Online
Prerequisites
GenerateThumbnailAsync method
In this quickstart, you will generate a smart-cropped thumbnail from an image using the Computer Vision SDK for
C#. If you wish, you can download the code in this guide as a complete sample app from the Cognitive Services
Csharp Vision repo on GitHub.
A Computer Vision subscription key. You can get a free trial key from Try Cognitive Services. Or, follow the
instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
Any edition of Visual Studio 2015 or 2017.
The Microsoft.Azure.CognitiveServices.Vision.ComputerVision client library NuGet package. It isn't necessary
to download the package. Installation instructions are provided below.
You can use these methods to generate a thumbnail of an image. You specify the height and width, which can differ
from the aspect ratio of the input image. Computer Vision uses smart cropping to intelligently identify the area of
interest and generate cropping coordinates based on that region.
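The GenerateThumbnailAsync and GenerateThumbnailInStreamAsync methods shown below wrap the Get Thumbnail REST operation. As a rough point of reference only (it is not part of this C# sample), a direct REST call from Python might look like the following sketch; the subscription key and endpoint region are placeholders, and the response body is the binary thumbnail image.
import requests

subscription_key = "<Subscription Key>"  # placeholder
endpoint = "https://westcentralus.api.cognitive.microsoft.com"  # placeholder region
image_url = "https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg"

response = requests.post(
    endpoint + "/vision/v2.0/generateThumbnail",
    headers={"Ocp-Apim-Subscription-Key": subscription_key},
    params={"width": 100, "height": 100, "smartCropping": "true"},
    json={"url": image_url})
response.raise_for_status()

# The response body is the smart-cropped thumbnail itself.
with open("Bloodhound_Puppy_thumb.jpg", "wb") as f:
    f.write(response.content)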
To run the sample, do the following steps:
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using System;
using System.IO;
using System.Threading.Tasks;
namespace ImageThumbnail
{
class Program
{
private const bool writeThumbnailToDisk = false;
// subscriptionKey = "0123456789abcdef0123456789ABCDEF"
private const string subscriptionKey = "<SubscriptionKey>";
1. Create a new Visual C# Console App in Visual Studio.
2. Install the Computer Vision client library NuGet package.
a. On the menu, click Tools, select NuGet Package Manager, then Manage NuGet Packages for
Solution.
b. Click the Browse tab, and in the Search box type
"Microsoft.Azure.CognitiveServices.Vision.ComputerVision".
c. Select Microsoft.Azure.CognitiveServices.Vision.ComputerVision when it displays, then click the
checkbox next to your project name, and Install.
3. Replace Program.cs with the following code. The GenerateThumbnailAsync and
GenerateThumbnailInStreamAsync methods wrap the Get Thumbnail API for remote and local images,
respectively.
// localImagePath = @"C:\Documents\LocalImage.jpg"
private const string localImagePath = @"<LocalImage>";
private const string remoteImageUrl =
"https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg";
private const int thumbnailWidth = 100;
private const int thumbnailHeight = 100;
static void Main(string[] args)
{
ComputerVisionClient computerVision = new ComputerVisionClient(
new ApiKeyServiceClientCredentials(subscriptionKey),
new System.Net.Http.DelegatingHandler[] { });
// You must use the same region as you used to get your subscription
// keys. For example, if you got your subscription keys from westus,
// replace "westcentralus" with "westus".
//
// Free trial subscription keys are generated in the "westus"
// region. If you use a free trial subscription key, you shouldn't
// need to change the region.
// Specify the Azure region
computerVision.Endpoint = "https://westcentralus.api.cognitive.microsoft.com";
Console.WriteLine("Images being analyzed ...\n");
var t1 = GetRemoteThumbnailAsync(computerVision, remoteImageUrl);
var t2 = GetLocalThumbnailAsync(computerVision, localImagePath);
Task.WhenAll(t1, t2).Wait(5000);
Console.WriteLine("Press ENTER to exit");
Console.ReadLine();
}
// Create a thumbnail from a remote image
private static async Task GetRemoteThumbnailAsync(
ComputerVisionClient computerVision, string imageUrl)
{
if (!Uri.IsWellFormedUriString(imageUrl, UriKind.Absolute))
{
Console.WriteLine(
"\nInvalid remoteImageUrl:\n{0} \n", imageUrl);
return;
}
Stream thumbnail = await computerVision.GenerateThumbnailAsync(
thumbnailWidth, thumbnailHeight, imageUrl, true);
string path = Environment.CurrentDirectory;
string imageName = imageUrl.Substring(imageUrl.LastIndexOf('/') + 1);
string thumbnailFilePath =
path + "\\" + imageName.Insert(imageName.Length - 4, "_thumb");
// Save the thumbnail to the current working directory,
// using the original name with the suffix "_thumb".
SaveThumbnail(thumbnail, thumbnailFilePath);
}
// Create a thumbnail from a local image
private static async Task GetLocalThumbnailAsync(
ComputerVisionClient computerVision, string imagePath)
{
if (!File.Exists(imagePath))
{
Console.WriteLine(
"\nUnable to open or read localImagePath:\n{0} \n", imagePath);
return;
}
Examine the response
Thumbnail written to: C:\Documents\LocalImage_thumb.jpg
Thumbnail written to: ...\bin\Debug\Bloodhound_Puppy_thumb.jpg
Next steps
}
using (Stream imageStream = File.OpenRead(imagePath))
{
Stream thumbnail = await computerVision.GenerateThumbnailInStreamAsync(
thumbnailWidth, thumbnailHeight, imageStream, true);
string thumbnailFilePath =
localImagePath.Insert(localImagePath.Length - 4, "_thumb");
// Save the thumbnail to the same folder as the original image,
// using the original name with the suffix "_thumb".
SaveThumbnail(thumbnail, thumbnailFilePath);
}
}
// Save the thumbnail locally.
// NOTE: This will overwrite an existing file of the same name.
private static void SaveThumbnail(Stream thumbnail, string thumbnailFilePath)
{
if (writeThumbnailToDisk)
{
using (Stream file = File.Create(thumbnailFilePath))
{
thumbnail.CopyTo(file);
}
}
Console.WriteLine("Thumbnail {0} written to: {1}\n",
writeThumbnailToDisk ? "" : "NOT", thumbnailFilePath);
}
}
}
4. Replace <Subscription Key> with your valid subscription key.
5. Change computerVision.Endpoint to the Azure region associated with your subscription keys, if necessary.
6. Optionally, replace <LocalImage> with the path and file name of a local image (will be ignored if not set).
7. Optionally, set remoteImageUrl to a different image.
8. Optionally, set writeThumbnailToDisk to true to save the thumbnail to disk.
9. Run the program.
A successful response saves the thumbnail for each image locally and displays the thumbnail's location, for
example:
Explore the Computer Vision APIs used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text.
Explore Computer Vision APIs
Quickstart: Extract handwritten text using the
Computer Vision C# SDK
5/29/2019 3 minutes to read Edit Online
Prerequisites
Create and run the sample app
In this quickstart, you will extract handwritten or printed text from an image using the Computer Vision SDK for
C#. If you wish, you can download the code in this guide as a complete sample app from the Cognitive Services
Csharp Vision repo on GitHub.
A Computer Vision subscription key. You can get a free trial key from Try Cognitive Services. Or, follow the
instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your key.
Any edition of Visual Studio 2015 or 2017.
The Microsoft.Azure.CognitiveServices.Vision.ComputerVision client library NuGet package. It isn't necessary
to download the package. Installation instructions are provided below.
To run the sample, do the following steps:
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
using System;
using System.IO;
using System.Threading.Tasks;
namespace ExtractText
{
class Program
{
// subscriptionKey = "0123456789abcdef0123456789ABCDEF"
private const string subscriptionKey = "<Subscription key>";
// localImagePath = @"C:\Documents\LocalImage.jpg"
private const string localImagePath = @"<LocalImage>";
private const string remoteImageUrl =
"https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Cursive_Writing_on_Notebook_paper.jpg/800px-Cursive_Writing_on_Notebook_paper.jpg";
1. Create a new Visual C# Console App in Visual Studio.
2. Install the Computer Vision client library NuGet package.
a. On the menu, click Tools, select NuGet Package Manager, then Manage NuGet Packages for
Solution.
b. Click the Browse tab, and in the Search box type
"Microsoft.Azure.CognitiveServices.Vision.ComputerVision".
c. Select Microsoft.Azure.CognitiveServices.Vision.ComputerVision when it displays, then click the
checkbox next to your project name, and Install.
3. Replace Program.cs with the following code. The BatchReadFileAsync and BatchReadFileInStreamAsync
methods wrap the Batch Read API for remote and local images, respectively. The
GetReadOperationResultAsync method wraps the Get Read Operation Result API.
"https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Cursive_Writing_on_Notebook_paper.jpg/800px-
Cursive_Writing_on_Notebook_paper.jpg";
private const int numberOfCharsInOperationId = 36;
static void Main(string[] args)
{
ComputerVisionClient computerVision = new ComputerVisionClient(
new ApiKeyServiceClientCredentials(subscriptionKey),
new System.Net.Http.DelegatingHandler[] { });
// You must use the same region as you used to get your subscription
// keys. For example, if you got your subscription keys from westus,
// replace "westcentralus" with "westus".
//
// Free trial subscription keys are generated in the "westus"
// region. If you use a free trial subscription key, you shouldn't
// need to change the region.
// Specify the Azure region
computerVision.Endpoint = "https://westus.api.cognitive.microsoft.com";
Console.WriteLine("Images being analyzed ...");
var t1 = ExtractRemoteTextAsync(computerVision, remoteImageUrl);
var t2 = ExtractLocalTextAsync(computerVision, localImagePath);
Task.WhenAll(t1, t2).Wait(5000);
Console.WriteLine("Press ENTER to exit");
Console.ReadLine();
}
// Read text from a remote image
private static async Task ExtractRemoteTextAsync(
ComputerVisionClient computerVision, string imageUrl)
{
if (!Uri.IsWellFormedUriString(imageUrl, UriKind.Absolute))
{
Console.WriteLine(
"\nInvalid remoteImageUrl:\n{0} \n", imageUrl);
return;
}
// Start the async process to read the text
BatchReadFileHeaders textHeaders =
await computerVision.BatchReadFileAsync(
imageUrl);
await GetTextAsync(computerVision, textHeaders.OperationLocation);
}
// Recognize text from a local image
private static async Task ExtractLocalTextAsync(
ComputerVisionClient computerVision, string imagePath)
{
if (!File.Exists(imagePath))
{
Console.WriteLine(
"\nUnable to open or read localImagePath:\n{0} \n", imagePath);
return;
}
using (Stream imageStream = File.OpenRead(imagePath))
{
// Start the async process to recognize the text
BatchReadFileInStreamHeaders textHeaders =
await computerVision.BatchReadFileInStreamAsync(
imageStream);
await GetTextAsync(computerVision, textHeaders.OperationLocation);
Examine the response
}
}
// Retrieve the recognized text
private static async Task GetTextAsync(
ComputerVisionClient computerVision, string operationLocation)
{
// Retrieve the URI where the recognized text will be
// stored from the Operation-Location header
string operationId = operationLocation.Substring(
operationLocation.Length - numberOfCharsInOperationId);
Console.WriteLine("\nCalling GetHandwritingRecognitionOperationResultAsync()");
ReadOperationResult result =
await computerVision.GetReadOperationResultAsync(operationId);
// Wait for the operation to complete
int i = 0;
int maxRetries = 10;
while ((result.Status == TextOperationStatusCodes.Running ||
result.Status == TextOperationStatusCodes.NotStarted) && i++ < maxRetries)
{
Console.WriteLine(
"Server status: {0}, waiting {1} seconds...", result.Status, i);
await Task.Delay(1000);
result = await computerVision.GetReadOperationResultAsync(operationId);
}
// Display the results
Console.WriteLine();
var recResults = result.RecognitionResults;
foreach (TextRecognitionResult recResult in recResults)
{
foreach (Line line in recResult.Lines)
{
Console.WriteLine(line.Text);
}
}
Console.WriteLine();
}
}
}
4. Replace <Subscription Key> with your valid subscription key.
5. Change computerVision.Endpoint to the Azure region associated with your subscription keys, if necessary.
6. Replace <LocalImage> with the path and file name of a local image.
7. Optionally, set remoteImageUrl to a different image.
8. Run the program.
A successful response prints the lines of recognized text for each image.
Calling GetHandwritingRecognitionOperationResultAsync()
Calling GetHandwritingRecognitionOperationResultAsync()
Server status: Running, waiting 1 seconds...
Server status: Running, waiting 1 seconds...
dog
The quick brown fox jumps over the lazy
Pack my box with five dozen liquor jugs
Next steps
See Quickstart: Extract handwritten text - REST, C# for an example of the raw JSON output from the API call.
Explore the Computer Vision APIs used to analyze an image, detect celebrities and landmarks, create a thumbnail,
and extract printed and handwritten text.
Explore Computer Vision APIs
Azure Cognitive Services Computer Vision SDK for
Python
5/29/2019 6 minutes to read Edit Online
Prerequisites
If you don't have an Azure Subscription
If you have an Azure Subscription
The Computer Vision service provides developers with access to advanced algorithms for processing images and
returning information. Computer Vision algorithms analyze the content of an image in different ways, depending
on the visual features you're interested in.
Analyze an image
Get subject domain list
Analyze an image by domain
Get text description of an image
Get handwritten text from image
Generate thumbnail
For more information about this service, see What is Computer Vision?.
Looking for more documentation?
SDK reference documentation
Cognitive Services Computer Vision documentation
Python 3.6+
Free Computer Vision key and associated endpoint. You need these values when you create the instance of the
ComputerVisionClient client object. Use one of the following methods to get these values.
Create a free key valid for 7 days with the Try It experience for the Computer Vision service. When the key is
created, copy the key and endpoint name. You will need this to create the client.
Keep the following after the key is created:
Key value: a 32 character string with the format of xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Key endpoint: the base endpoint URL, https://westcentralus.api.cognitive.microsoft.com
The easiest method to create a resource in your subscription is to use the following Azure CLI command. This
creates a Cognitive Service key that can be used across many cognitive services. You need to choose the existing
resource group name, for example, "my-cogserv-group" and the new computer vision resource name, such as "my-
computer-vision-resource".
RES_REGION=westeurope
RES_GROUP=<resourcegroup-name>
ACCT_NAME=<computervision-account-name>
az cognitiveservices account create \
--resource-group $RES_GROUP \
--name $ACCT_NAME \
--location $RES_REGION \
--kind CognitiveServices \
--sku S0 \
--yes
Install the SDK
pip install azure-cognitiveservices-vision-computervision
Authentication
export ACCOUNT_ENDPOINT=<endpoint>
export ACCOUNT_KEY=<key>
For Azure subscription users, get credentials for key and endpoint
RES_GROUP=<resourcegroup-name>
ACCT_NAME=<computervision-account-name>
export ACCOUNT_ENDPOINT=$(az cognitiveservices account show \
--resource-group $RES_GROUP \
--name $ACCT_NAME \
--query endpoint \
--output tsv)
export ACCOUNT_KEY=$(az cognitiveservices account keys list \
--resource-group $RES_GROUP \
--name $ACCT_NAME \
--query key1 \
--output tsv)
Create client
Install the Azure Cognitive Services Computer Vision SDK for Python package with pip:
Once you create your Computer Vision resource, you need its endpoint, and one of its account keys to
instantiate the client object.
Use these values when you create the instance of the ComputerVisionClient client object.
For example, use the Bash terminal to set the environment variables:
If you do not remember your endpoint and key, you can use the following method to find them. If you need to
create a key and endpoint, you can use the method for Azure subscription holders or for users without an Azure
subscription.
Use the Azure CLI snippet below to populate two environment variables with the Computer Vision account
endpoint and one of its keys (you can also find these values in the Azure portal). The snippet is formatted for the
Bash shell.
Get the endpoint and key from environment variables then create the ComputerVisionClient client object.
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials
# Get endpoint and key from environment variables
import os
endpoint = os.environ['ACCOUNT_ENDPOINT']
key = os.environ['ACCOUNT_KEY']
# Set credentials
credentials = CognitiveServicesCredentials(key)
# Create client
client = ComputerVisionClient(endpoint, credentials)
Examples
Analyze an image
url = "https://upload.wikimedia.org/wikipedia/commons/thumb/1/12/Broadway_and_Times_Square_by_night.jpg/450px-
Broadway_and_Times_Square_by_night.jpg"
image_analysis = client.analyze_image(url,visual_features=[VisualFeatureTypes.tags])
for tag in image_analysis.tags:
print(tag)
Get subject domain list
models = client.list_models()
for x in models.models_property:
    print(x)
Analyze an image by domain
You need a ComputerVisionClient client object before using any of the following tasks.
You can analyze an image for certain features with analyze_image . Use the visual_features property to set the
types of analysis to perform on the image. Common values are VisualFeatureTypes.tags and
VisualFeatureTypes.description .
Review the subject domains used to analyze your image with list_models . These domain names are used when
analyzing an image by domain. An example of a domain is landmarks .
You can analyze an image by subject domain with analyze_image_by_domain . Get the list of supported subject
domains in order to use the correct domain name.
# type of prediction
domain = "landmarks"
# Public domain image of Eiffel tower
url = "https://images.pexels.com/photos/338515/pexels-photo-338515.jpeg"
# English language response
language = "en"
analysis = client.analyze_image_by_domain(domain, url, language)
for landmark in analysis.result["landmarks"]:
    print(landmark["name"])
    print(landmark["confidence"])
Get text description of an image
domain = "landmarks"
url = "http://www.public-domain-photos.com/free-stock-photos-4/travel/san-francisco/golden-gate-bridge-in-san-
francisco.jpg"
language = "en"
max_descriptions = 3
analysis = client.describe_image(url, max_descriptions, language)
for caption in analysis.captions:
    print(caption.text)
    print(caption.confidence)
Get text from image
You can get a language-based text description of an image with describe_image . Request several descriptions with
the max_description property if you are doing text analysis for keywords associated with the image. Examples of a
text description for the following image include a train crossing a bridge over a body of water ,
a large bridge over a body of water , and a train crossing a bridge over a large body of water .
You can get any handwritten or printed text from an image. This requires two calls to the SDK: batch_read_file
and get_read_operation_result . The call to batch_read_file is asynchronous. In the results of the
get_read_operation_result call, you need to check if the first call completed with TextOperationStatusCodes before
extracting the text data. The results include the text as well as the bounding box coordinates for the text.
# import models
from azure.cognitiveservices.vision.computervision.models import TextOperationStatusCodes
import time
url = "https://azurecomcdn.azureedge.net/cvt-
1979217d3d0d31c5c87cbd991bccfee2d184b55eeb4081200012bdaf6a65601a/images/shared/cognitive-services-demos/read-
text/read-1-thumbnail.png"
raw = True
custom_headers = None
numberOfCharsInOperationId = 36
# Async SDK call
rawHttpResponse = client.batch_read_file(url, custom_headers, raw)
# Get ID from returned headers
operationLocation = rawHttpResponse.headers["Operation-Location"]
idLocation = len(operationLocation) - numberOfCharsInOperationId
operationId = operationLocation[idLocation:]
# SDK call
while True:
    result = client.get_read_operation_result(operationId)
    if result.status not in ['NotStarted', 'Running']:
        break
    time.sleep(1)

# Get data
if result.status == TextOperationStatusCodes.succeeded:
    for textResult in result.recognition_results:
        for line in textResult.lines:
            print(line.text)
            print(line.bounding_box)
Generate thumbnail
pip install Pillow
# Pillow package
from PIL import Image
# IO package to create local image
import io
width = 50
height = 50
url = "http://www.public-domain-photos.com/free-stock-photos-4/travel/san-francisco/golden-gate-bridge-in-san-
francisco.jpg"
thumbnail = client.generate_thumbnail(width, height, url)
for x in thumbnail:
    image = Image.open(io.BytesIO(x))

image.save('thumbnail.jpg')
You can generate a thumbnail (JPG) of an image with generate_thumbnail . The thumbnail does not need to be in
the same proportions as the original image.
Install Pillow to use this example:
Once Pillow is installed, use the package in the following code example to generate the thumbnail image.
Troubleshooting
General
domain = "landmarks"
url = "http://www.public-domain-photos.com/free-stock-photos-4/travel/san-francisco/golden-gate-bridge-in-san-
francisco.jpg"
language = "en"
max_descriptions = 3
from azure.cognitiveservices.vision.computervision.models import ComputerVisionErrorException

try:
    analysis = client.describe_image(url, max_descriptions, language)
    for caption in analysis.captions:
        print(caption.text)
        print(caption.confidence)
except ComputerVisionErrorException as e:
    if e.response.status_code == 401:
        print("Error unauthorized. Make sure your key and endpoint are correct.")
    else:
        raise
Handle transient errors with retries
Next steps
When you interact with the ComputerVisionClient client object using the Python SDK, the
ComputerVisionErrorException class is used to return errors. Errors returned by the service correspond to the same
HTTP status codes returned for REST API requests.
For example, if you try to analyze an image with an invalid key, a 401 error is returned. In the following snippet,
the error is handled gracefully by catching the exception and displaying additional information about the error.
While working with the ComputerVisionClient client, you might encounter transient failures caused by rate limits
enforced by the service, or other transient problems like network outages. For information about handling these
types of failures, see Retry pattern in the Cloud Design Patterns guide, and the related Circuit Breaker pattern.
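The SDK does not retry failed calls for you. As a minimal sketch of the retry pattern described above (the status codes and backoff values here are illustrative assumptions, not SDK behavior), you could wrap a call like this:
import time
from azure.cognitiveservices.vision.computervision.models import ComputerVisionErrorException

def analyze_with_retries(client, url, max_attempts=3):
    # Retry transient failures (for example, HTTP 429 rate limiting) with
    # exponential backoff; re-raise anything that is not transient.
    for attempt in range(max_attempts):
        try:
            return client.analyze_image(url, visual_features=["Tags"])
        except ComputerVisionErrorException as error:
            status = getattr(error.response, "status_code", None)
            if status not in (429, 500, 503) or attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)  # wait 1, 2, 4, ... seconds between attempts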
Applying content tags to images
Tutorial: Use Computer Vision to generate image
metadata in Azure Storage
5/10/2019 5 minutes to read Edit Online
Prerequisites
Create a Computer Vision resource
In this tutorial, you will learn how to integrate the Azure Computer Vision service into a web app to generate
metadata for uploaded images. A full app guide can be found in the Azure Storage and Cognitive Services Lab on
GitHub, and this tutorial essentially covers Exercise 5 of the lab. You may wish to create the end-to-end application
by following every step, but if you'd just like to see how Computer Vision can be integrated into an existing web
app, read along here.
This tutorial shows you how to:
Create a Computer Vision resource in Azure
Perform image analysis on Azure Storage images
Attach metadata to Azure Storage images
Check image metadata using Azure Storage Explorer
If you don't have an Azure subscription, create a free account before you begin.
Visual Studio 2017 Community edition or higher, with the "ASP.NET and web development" and "Azure
development" workloads installed.
An Azure Storage account with a blob container allocated for images (follow Exercise 1 of the Azure Storage
Lab if you need help with this step).
The Azure Storage Explorer tool (follow Exercise 2 of the Azure Storage Lab if you need help with this step).
An ASP.NET web application with access to Azure Storage (follow Exercise 3 of the Azure Storage Lab to create
such an app quickly).
You will need to create a Computer Vision resource for your Azure account; this resource manages your access to
Azure's Computer Vision service.
1. Follow the instructions in Create an Azure Cognitive Services resource to create a Computer Vision
resource.
2. Then go to the menu for your resource group and click the Computer Vision API subscription that you just
created. Copy the URL under Endpoint to somewhere you can easily retrieve it in a moment. Then click
Show access keys.
Add Computer Vision credentials
<add key="SubscriptionKey" value="VISION_KEY" />
<add key="VisionEndpoint" value="VISION_ENDPOINT" />
Add metadata generation code
3. In the next window, copy the value of KEY 1 to the clipboard.
Next, you will add the required credentials to your app so that it can access Computer Vision resources.
Open your ASP.NET web application in Visual Studio and navigate to the Web.config file at the root of the project.
Add the following statements to the <appSettings> section of the file, replacing VISION_KEY with the key you
copied in the previous step, and VISION_ENDPOINT with the URL you saved in the step before that.
Then in the Solution Explorer, right-click the project and use the Manage NuGet Packages command to install
the package Microsoft.Azure.CognitiveServices.Vision.ComputerVision. This package contains the types
needed to call the Computer Vision API.
Next, you will add the code that actually leverages the Computer Vision service to create metadata for images.
These steps will apply to the ASP.NET app in the lab, but you can adapt them to your own app. What's important is
that at this point you have an ASP.NET web application that can upload images to an Azure Storage container, read
images from it, and display them in the view. If you're unsure about this, it's best to follow Exercise 3 of the Azure
Storage Lab.
1. Open the HomeController.cs file in the project's Controllers folder and add the following using statements at the top of the file:

using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

2. Then, go to the Upload method; this method converts and uploads images to blob storage. Add the following code immediately after the block that begins with // Generate a thumbnail (or at the end of your image-blob-creation process). This code takes the blob containing the image (photo) and uses Computer Vision to generate a description for that image. The Computer Vision API also generates a list of keywords that apply to the image. The generated description and keywords are stored in the blob's metadata so that they can be retrieved later on.

// Submit the image to Azure's Computer Vision API
ComputerVisionClient vision = new ComputerVisionClient(
    new ApiKeyServiceClientCredentials(ConfigurationManager.AppSettings["SubscriptionKey"]),
    new System.Net.Http.DelegatingHandler[] { });
vision.Endpoint = ConfigurationManager.AppSettings["VisionEndpoint"];

VisualFeatureTypes[] features = new VisualFeatureTypes[] { VisualFeatureTypes.Description };
var result = await vision.AnalyzeImageAsync(photo.Uri.ToString(), features);

// Record the image description and tags in blob metadata
photo.Metadata.Add("Caption", result.Description.Captions[0].Text);

for (int i = 0; i < result.Description.Tags.Count; i++)
{
    string key = String.Format("Tag{0}", i);
    photo.Metadata.Add(key, result.Description.Tags[i]);
}

await photo.SetMetadataAsync();

3. Next, go to the Index method in the same file; this method enumerates the stored image blobs in the targeted blob container (as IListBlobItem instances) and passes them to the application view. Replace the foreach block in this method with the following code. This code calls CloudBlockBlob.FetchAttributes to get each blob's attached metadata. It extracts the computer-generated description (caption) from the metadata and adds it to the BlobInfo object, which gets passed to the view.

foreach (IListBlobItem item in container.ListBlobs())
{
    var blob = item as CloudBlockBlob;

    if (blob != null)
    {
        blob.FetchAttributes(); // Get blob metadata
        var caption = blob.Metadata.ContainsKey("Caption") ? blob.Metadata["Caption"] : blob.Name;

        blobs.Add(new BlobInfo()
        {
            ImageUri = blob.Uri.ToString(),
            ThumbnailUri = blob.Uri.ToString().Replace("/photos/", "/thumbnails/"),
            Caption = caption
        });
    }
}

Test the app
Save your changes in Visual Studio and press Ctrl+F5 to launch the application in your browser. Use the app to upload a few images, either from the "photos" folder in the lab's resources or from your own folder. When you hover the cursor over one of the images in the view, a tooltip window should appear and display the computer-generated caption for the image.
To view all of the attached metadata, use the Azure Storage Explorer to view the storage container you're using for images. Right-click any of the blobs in the container and select Properties. In the dialog, you'll see a list of key-value pairs. The computer-generated image description is stored in the item "Caption," and the search keywords are stored in "Tag0," "Tag1," and so on. When you're finished, click Cancel to close the dialog.
Clean up resources
If you'd like to keep working on your web app, see the Next steps section. If you don't plan to continue using this application, you should delete all app-specific resources. To do this, you can simply delete the resource group that contains your Azure Storage subscription and Computer Vision resource. This will remove the storage account, the blobs uploaded to it, and the App Service resource needed to connect with the ASP.NET web app.
To delete the resource group, open the Resource groups blade in the portal, navigate to the resource group you used for this project, and click Delete resource group at the top of the view. You will be asked to type the resource group's name to confirm that you want to delete it, because once deleted, a resource group can't be recovered.
Next steps
In this tutorial, you integrated Azure's Computer Vision service into an existing web app to automatically generate captions and keywords for blob images as they're uploaded. Next, refer to the Azure Storage Lab, Exercise 6, to learn how to add search functionality to your web app. This takes advantage of the search keywords that the Computer Vision service generates.
Add search to your app
Applying content tags to images
2/15/2019 2 minutes to read Edit Online
Image tagging example
Computer Vision returns tags based on thousands of recognizable objects, living beings, scenery, and actions.
When tags are ambiguous or not common knowledge, the API response provides 'hints' to clarify the meaning of
the tag in the context of a known setting. Tags are not organized as a taxonomy, and no inheritance hierarchies exist. A collection of content tags forms the foundation for an image 'description' displayed as human-readable language formatted in complete sentences. Note that, at this point, English is the only supported language for image description.
After uploading an image or specifying an image URL, Computer Vision algorithms output tags based on the
objects, living beings, and actions identified in the image. Tagging is not limited to the main subject, such as a
person in the foreground, but also includes the setting (indoor or outdoor), furniture, tools, plants, animals,
accessories, gadgets etc.
The following JSON response illustrates what Computer Vision returns when tagging visual features detected in
the example image.
{
"tags": [
{
"name": "grass",
"confidence": 0.9999995231628418
},
{
"name": "outdoor",
"confidence": 0.99992108345031738
},
{
"name": "house",
"confidence": 0.99685388803482056
},
{
"name": "sky",
"confidence": 0.99532157182693481
},
{
"name": "building",
"confidence": 0.99436837434768677
},
{
"name": "tree",
"confidence": 0.98880356550216675
},
{
"name": "lawn",
"confidence": 0.788884699344635
},
{
"name": "green",
"confidence": 0.71250593662261963
},
{
"name": "residential",
"confidence": 0.70859086513519287
},
{
"name": "grassy",
"confidence": 0.46624681353569031
}
],
"requestId": "06f39352-e445-42dc-96fb-0a1288ad9cf1",
"metadata": {
"height": 200,
"width": 300,
"format": "Jpeg"
}
}
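If you want to request these tags programmatically, the following is a minimal, illustrative sketch using the .NET SDK package referenced elsewhere in this guide. The subscription key, endpoint, and image URL are placeholders, and the result property names (Tags, Name, Confidence) are assumptions based on the JSON fields shown above.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

class TagImageSketch
{
    static async Task Main()
    {
        // Placeholder key and endpoint; use the values from your own Computer Vision resource.
        var vision = new ComputerVisionClient(
            new ApiKeyServiceClientCredentials("YOUR_SUBSCRIPTION_KEY"),
            new System.Net.Http.DelegatingHandler[] { });
        vision.Endpoint = "https://westcentralus.api.cognitive.microsoft.com";

        // Ask the Analyze Image operation for the Tags visual feature only.
        var result = await vision.AnalyzeImageAsync(
            "https://example.com/house.jpg",                 // placeholder image URL
            new[] { VisualFeatureTypes.Tags });

        // Each tag carries a name and a confidence score, mirroring the JSON above.
        foreach (var tag in result.Tags)
            Console.WriteLine($"{tag.Name} ({tag.Confidence:0.000})");
    }
}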
Next steps
Learn concepts about categorizing images and describing images.
Detect common objects in images
4/19/2019 2 minutes to read Edit Online
Object detection example
Object detection is similar to tagging, but the API returns the bounding box coordinates (in pixels) for each object
found. For example, if an image contains a dog, cat and person, the Detect operation will list those objects together
with their coordinates in the image. You can use this functionality to process the relationships between the objects
in an image. It also lets you determine whether there are multiple instances of the same tag in an image.
The Detect API applies tags based on the objects or living things identified in the image. There is currently no
formal relationship between the tagging taxonomy and the object detection taxonomy. At a conceptual level, the
Detect API only finds objects and living things, while the Tag API can also include contextual terms like "indoor",
which can't be localized with bounding boxes.
The following JSON response illustrates what Computer Vision returns when detecting objects in the example
image.
{
"objects":[
{
"rectangle":{
"x":730,
"y":66,
"w":135,
"h":85
},
"object":"kitchen appliance",
"confidence":0.501
},
{
"rectangle":{
"x":523,
"y":377,
"w":185,
"h":46
},
"object":"computer keyboard",
"confidence":0.51
},
{
"rectangle":{
"x":471,
"y":218,
"w":289,
"h":226
},
"object":"Laptop",
"confidence":0.85,
"parent":{
"object":"computer",
"confidence":0.851
}
},
{
"rectangle":{
"x":654,
"y":0,
"w":584,
"h":473
},
"object":"person",
"confidence":0.855
}
],
"requestId":"a7fde8fd-cc18-4f5f-99d3-897dcd07b308",
"metadata":{
"width":1260,
"height":473,
"format":"Jpeg"
}
}
Limitations
It's important to note the limitations of object detection so you can avoid or mitigate the effects of false negatives
(missed objects) and limited detail.
Objects are generally not detected if they're small (less than 5% of the image).
Objects are generally not detected if they're arranged closely together (a stack of plates, for example).
Objects are not differentiated by brand or product names (different types of sodas on a store shelf, for
example). However, you can get brand information from an image by using the Brand detection feature.
Use the API
The object detection feature is part of the Analyze Image API. You can call this API through a native SDK or
through REST calls. Include Objects in the visualFeatures query parameter. Then, when you get the full JSON
response, simply parse the string for the contents of the "objects" section.
Quickstart: Analyze an image (.NET SDK)
Quickstart: Analyze an image (REST API)
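As a hedged sketch of the call described above (native SDK variant), the snippet below requests only the Objects feature and prints each detection. The key, endpoint, and image URL are placeholders, and the DetectedObject property names (ObjectProperty, Rectangle, Confidence) are assumptions based on the JSON fields shown earlier.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

class DetectObjectsSketch
{
    static async Task Main()
    {
        // Placeholder key and endpoint; use the values from your own Computer Vision resource.
        var vision = new ComputerVisionClient(
            new ApiKeyServiceClientCredentials("YOUR_SUBSCRIPTION_KEY"),
            new System.Net.Http.DelegatingHandler[] { });
        vision.Endpoint = "https://westcentralus.api.cognitive.microsoft.com";

        // Request only the Objects visual feature from Analyze Image.
        var result = await vision.AnalyzeImageAsync(
            "https://example.com/desk.jpg",                  // placeholder image URL
            new[] { VisualFeatureTypes.Objects });

        // Each detected object has a label, a confidence score, and a bounding box in pixels.
        foreach (var obj in result.Objects)
            Console.WriteLine($"{obj.ObjectProperty} ({obj.Confidence:0.00}) at " +
                              $"x={obj.Rectangle.X}, y={obj.Rectangle.Y}, w={obj.Rectangle.W}, h={obj.Rectangle.H}");
    }
}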
Brand detection example
{
"brands":[
{
"name":"Microsoft",
"confidence":0.706,
"rectangle":{
"x":470,
"y":862,
"w":338,
"h":327
}
}
],
"requestId":"5fda6b40-3f60-4584-bf23-911a0042aa13",
"metadata":{
"width":2286,
"height":1715,
"format":"Jpeg"
}
}
Brand detection is a specialized mode of object detection that uses a database of thousands of global logos to
identify commercial brands in images or video. You can use this feature, for example, to discover which brands are
most popular on social media or most prevalent in media product placement.
The Computer Vision service detects whether there are brand logos in a given image; if so, it returns the brand
name, a confidence score, and the coordinates of a bounding box around the logo.
The built-in logo database covers popular brands in consumer electronics, clothing, and more. If you find that the
brand you're looking for is not detected by the Computer Vision service, you may be better served creating and
training your own logo detector using the Custom Vision service.
The following JSON responses illustrate what Computer Vision returns when detecting brands in the example
images.
In some cases, the brand detector will pick up both the logo image and the stylized brand name as two separate
logos.
{
"brands":[
{
"name":"Microsoft",
"confidence":0.657,
"rectangle":{
"x":436,
"y":473,
"w":568,
"h":267
}
},
{
"name":"Microsoft",
"confidence":0.85,
"rectangle":{
"x":101,
"y":561,
"w":273,
"h":263
}
}
],
"requestId":"10dcd2d6-0cf6-4a5e-9733-dc2e4b08ac8d",
"metadata":{
"width":1286,
"height":1715,
"format":"Jpeg"
}
}
Use the API
The brand detection feature is part of the Analyze Image API. You can call this API through a native SDK or
through REST calls. Include Brands in the visualFeatures query parameter. Then, when you get the full JSON
response, simply parse the string for the contents of the "brands" section.
Quickstart: Analyze an image (.NET SDK)
Quickstart: Analyze an image (REST API)
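To make the REST variant concrete, here is a minimal sketch that includes Brands in the visualFeatures query parameter and prints the raw JSON, which you can then parse for the "brands" section. The endpoint, key, image URL, and the v2.0 path are placeholders or assumptions.

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class BrandDetectionSketch
{
    static async Task Main()
    {
        // Placeholder endpoint and key; the v2.0 analyze path is assumed here.
        var uri = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Brands";

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR_SUBSCRIPTION_KEY");

            var body = new StringContent("{\"url\":\"https://example.com/shirt.jpg\"}",
                                         Encoding.UTF8, "application/json");
            var response = await client.PostAsync(uri, body);

            // Parse the "brands" section of this JSON for names, scores, and rectangles.
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}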
Categorize images by subject matter
4/19/2019 2 minutes to read Edit Online
The 86-category concept
Image categorization examples
In addition to tags and a description, Computer Vision returns the taxonomy-based categories detected in an
image. Unlike tags, categories are organized in a parent/child hereditary hierarchy, and there are fewer of them
(86, as opposed to thousands of tags). All category names are in English. Categorization can be done by itself or
alongside the newer tags model.
Computer Vision can categorize an image broadly or specifically, using the list of 86 categories in the following
diagram. For the full taxonomy in text format, see Category Taxonomy.
The following JSON response illustrates what Computer Vision returns when categorizing the example image
based on its visual features.
{
"categories": [
{
"name": "people_",
"score": 0.81640625
}
],
"requestId": "bae7f76a-1cc7-4479-8d29-48a694974705",
"metadata": {
"height": 200,
"width": 300,
"format": "Jpeg"
}
}
The following table illustrates a typical image set and the category returned by Computer Vision for each image.
IMAGE CATEGORY
people_group
animal_dog
outdoor_mountain
food_bread
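If you want to retrieve these categories programmatically, a brief sketch with the .NET SDK might look like the following; the key, endpoint, and image URL are placeholders, and the Category property names (Name, Score) are assumptions based on the JSON response above.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

class CategorizeImageSketch
{
    static async Task Main()
    {
        // Placeholder key and endpoint; use the values from your own Computer Vision resource.
        var vision = new ComputerVisionClient(
            new ApiKeyServiceClientCredentials("YOUR_SUBSCRIPTION_KEY"),
            new System.Net.Http.DelegatingHandler[] { });
        vision.Endpoint = "https://westcentralus.api.cognitive.microsoft.com";

        // Request only the Categories visual feature.
        var result = await vision.AnalyzeImageAsync(
            "https://example.com/crowd.jpg",                 // placeholder image URL
            new[] { VisualFeatureTypes.Categories });

        // Category names come from the 86-category taxonomy, each with a score.
        foreach (var category in result.Categories)
            Console.WriteLine($"{category.Name} ({category.Score:0.000})");
    }
}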
Next steps
Learn concepts about tagging images and describing images.
Describe images with human-readable language
2/15/2019 2 minutes to read Edit Online
Computer Vision can analyze an image and generate a human-readable sentence that describes its contents. The algorithm actually returns several descriptions based on different visual features, and each description is given a confidence score. The final output is a list of descriptions ordered from highest to lowest confidence.
Image description example
The following JSON response illustrates what Computer Vision returns when describing the example image based on its visual features.
{
"description": {
"tags": ["outdoor", "building", "photo", "city", "white", "black", "large", "sitting", "old", "water",
"skyscraper", "many", "boat", "river", "group", "street", "people", "field", "tall", "bird", "standing"],
"captions": [
{
"text": "a black and white photo of a city",
"confidence": 0.95301952483304808
},
{
"text": "a black and white photo of a large city",
"confidence": 0.94085190563213816
},
{
"text": "a large white building in a city",
"confidence": 0.93108362931954824
}
]
},
"requestId": "b20bfc83-fb25-4b8d-a3f8-b2a1f084b159",
"metadata": {
"height": 300,
"width": 239,
"format": "Jpeg"
}
}
Next steps
Learn concepts about tagging images and categorizing images.
Face detection with Computer Vision
4/19/2019 2 minutes to read Edit Online
Computer Vision can detect human faces within an image and generate the age, gender, and rectangle for each detected face.
NOTE
This feature is also offered by the Azure Face service. See this alternative for more detailed face analysis, including face identification and pose detection.
Face detection examples
The following example demonstrates the JSON response returned by Computer Vision for an image containing a single human face.
{
"faces": [
{
"age": 23,
"gender": "Female",
"faceRectangle": {
"top": 45,
"left": 194,
"width": 44,
"height": 44
}
}
],
"requestId": "8439ba87-de65-441b-a0f1-c85913157ecd",
"metadata": {
"height": 200,
"width": 300,
"format": "Png"
}
}
The next example demonstrates the JSON response returned for an image containing multiple human faces.
{
"faces": [
{
"age": 11,
"gender": "Male",
"faceRectangle": {
"top": 62,
"left": 22,
"width": 45,
"height": 45
}
},
{
"age": 11,
"gender": "Female",
"faceRectangle": {
"top": 127,
"left": 240,
"width": 42,
"height": 42
}
},
{
"age": 37,
"gender": "Female",
"faceRectangle": {
"top": 55,
"left": 200,
"width": 41,
"height": 41
}
},
{
"age": 41,
"gender": "Male",
"faceRectangle": {
"top": 45,
"left": 103,
"width": 39,
"height": 39
}
}
],
"requestId": "3a383cbe-1a05-4104-9ce7-1b5cf352b239",
"metadata": {
"height": 230,
"width": 300,
"format": "Png"
}
}
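As a short, hedged illustration of requesting this data with the .NET SDK, the snippet below asks for the Faces feature and prints each face. The key, endpoint, and image URL are placeholders, and the FaceDescription property names are assumptions based on the JSON fields shown above.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

class DetectFacesSketch
{
    static async Task Main()
    {
        // Placeholder key and endpoint; use the values from your own Computer Vision resource.
        var vision = new ComputerVisionClient(
            new ApiKeyServiceClientCredentials("YOUR_SUBSCRIPTION_KEY"),
            new System.Net.Http.DelegatingHandler[] { });
        vision.Endpoint = "https://westcentralus.api.cognitive.microsoft.com";

        // Request only the Faces visual feature from Analyze Image.
        var result = await vision.AnalyzeImageAsync(
            "https://example.com/family.png",                // placeholder image URL
            new[] { VisualFeatureTypes.Faces });

        // Each face has an estimated age, a gender, and a rectangle in pixel coordinates.
        foreach (var face in result.Faces)
            Console.WriteLine($"{face.Gender}, age {face.Age}, at ({face.FaceRectangle.Left}, " +
                              $"{face.FaceRectangle.Top}), {face.FaceRectangle.Width}x{face.FaceRectangle.Height}");
    }
}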
Next steps
See the Analyze Image reference documentation to learn more about how to use the face detection feature.
Detecting image types with Computer Vision
3/12/2019 2 minutes to read Edit Online
Detecting clip art
VALUE MEANING
0 Non-clip-art
1 Ambiguous
2 Normal-clip-art
3 Good-clip-art
Clip art detection examples
{
"imageType": {
"clipArtType": 3,
"lineDrawingType": 0
},
"requestId": "88c48d8c-80f3-449f-878f-6947f3b35a27",
"metadata": {
"height": 225,
"width": 300,
"format": "Jpeg"
}
}
With the Analyze Image API, Computer Vision can analyze the content type of images, indicating whether an
image is clip art or a line drawing.
Computer Vision analyzes an image and rates the likelihood of the image being clip art on a scale of 0 to 3, as
described in the following table.
The following JSON responses illustrate what Computer Vision returns when rating the likelihood of the example images being clip art.
{
"imageType": {
"clipArtType": 0,
"lineDrawingType": 0
},
"requestId": "a9c8490a-2740-4e04-923b-e8f4830d0e47",
"metadata": {
"height": 200,
"width": 300,
"format": "Jpeg"
}
}
Detecting line drawings
Line drawing detection examples
{
"imageType": {
"clipArtType": 2,
"lineDrawingType": 1
},
"requestId": "6442dc22-476a-41c4-aa3d-9ceb15172f01",
"metadata": {
"height": 268,
"width": 300,
"format": "Jpeg"
}
}
Computer Vision analyzes an image and returns a boolean value indicating whether the image is a line drawing.
The following JSON responses illustrate what Computer Vision returns when indicating whether the example images are line drawings.
{
"imageType": {
"clipArtType": 0,
"lineDrawingType": 0
},
"requestId": "98437d65-1b05-4ab7-b439-7098b5dfdcbf",
"metadata": {
"height": 200,
"width": 300,
"format": "Jpeg"
}
}
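A minimal REST sketch for this feature includes ImageType in the visualFeatures query parameter and reads the clipArtType and lineDrawingType values out of the returned JSON. The endpoint, key, image URL, and v2.0 path below are placeholders or assumptions.

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class ImageTypeSketch
{
    static async Task Main()
    {
        // Placeholder endpoint and key; the v2.0 analyze path is assumed here.
        var uri = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=ImageType";

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR_SUBSCRIPTION_KEY");

            var body = new StringContent("{\"url\":\"https://example.com/drawing.png\"}",
                                         Encoding.UTF8, "application/json");
            var response = await client.PostAsync(uri, body);

            // The "imageType" section of the JSON carries clipArtType (0-3) and lineDrawingType.
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}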
Next steps
See the Analyze Image reference documentation to learn how to detect image types.
Detect domain-specific content
4/19/2019 2 minutes to read Edit Online
Scoped analysis
In addition to tagging and high-level categorization, Computer Vision also supports further domain-specific
analysis using models that have been trained on specialized data.
There are two ways to use the domain-specific models: by themselves (scoped analysis) or as an enhancement to
the categorization feature.
You can analyze an image using only the chosen domain-specific model by calling the Models/<model>/Analyze
API.
The following is a sample JSON response returned by the models/celebrities/analyze API for the given image:
{
"result": {
"celebrities": [{
"faceRectangle": {
"top": 391,
"left": 318,
"width": 184,
"height": 184
},
"name": "Satya Nadella",
"confidence": 0.99999856948852539
}]
},
"requestId": "8217262a-1a90-4498-a242-68376a4b956b",
"metadata": {
"width": 800,
"height": 1200,
"format": "Jpeg"
}
}
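As a hedged sketch of such a scoped call, the snippet below posts an image URL to the celebrities domain model and prints the raw JSON, which mirrors the sample response above. The endpoint, key, and image URL are placeholders, and the exact v2.0 path is an assumption.

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class CelebrityModelSketch
{
    static async Task Main()
    {
        // Placeholder endpoint and key; the v2.0 path for the celebrities model is assumed here.
        var uri = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/models/celebrities/analyze";

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR_SUBSCRIPTION_KEY");

            var body = new StringContent("{\"url\":\"https://example.com/ceo.jpg\"}",
                                         Encoding.UTF8, "application/json");
            var response = await client.PostAsync(uri, body);

            // The "result.celebrities" section lists each recognized person with a confidence score.
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}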
Enhanced categorization analysis
"categories":[
{
"name":"abstract_",
"score":0.00390625
},
{
"name":"people_",
"score":0.83984375,
"detail":{
"celebrities":[
{
"name":"Satya Nadella",
"faceRectangle":{
"left":597,
"top":162,
"width":248,
"height":248
},
"confidence":0.999028444
}
],
"landmarks":[
{
"name":"Forbidden City",
"confidence":0.9978346
}
]
}
}
]
You can also use domain-specific models to supplement general image analysis. You do this as part of high-level
categorization by specifying domain-specific models in the details parameter of the Analyze API call.
In this case, the 86-category taxonomy classifier is called first. If any of the detected categories have a matching
domain-specific model, the image is passed through that model as well and the results are added.
The following JSON response shows how domain-specific analysis can be included as the detail node in a
broader categorization analysis.
List the domain-specific models
NAME DESCRIPTION
celebrities Celebrity recognition, supported for images classified in the
people_ category
landmarks Landmark recognition, supported for images classified in the
outdoor_ or building_ categories
{
"models":[
{
"name":"celebrities",
"categories":[
"people_",
"_",
"pessoas_",
"gente_"
]
},
{
"name":"landmarks",
"categories":[
"outdoor_",
"_",
"屋外_",
"aoarlivre_",
"alairelibre_",
"building_",
"建筑_",
"建物_",
"edifício_"
]
}
]
}
Next steps
Currently, Computer Vision supports the following domain-specific models:
Calling the Models API will return this information along with the categories to which each model can apply:
Learn concepts about categorizing images.
Detect color schemes in images
4/19/2019 2 minutes to read Edit Online
Color scheme detection examples
{
"color": {
"dominantColorForeground": "Black",
"dominantColorBackground": "Black",
"dominantColors": ["Black", "White"],
"accentColor": "BB6D10",
"isBwImg": false
},
"requestId": "0dc394bf-db50-4871-bdcc-13707d9405ea",
"metadata": {
"height": 202,
"width": 300,
"format": "Jpeg"
}
}
Dominant color examples
Computer Vision analyzes the colors in an image to provide three different attributes: the dominant foreground
color, the dominant background color, and the set of dominant colors for the image as a whole. Returned colors
belong to the set: black, blue, brown, gray, green, orange, pink, purple, red, teal, white, and yellow.
Computer Vision also extracts an accent color, which represents the most vibrant color in the image, based on a
combination of dominant colors and saturation. The accent color is returned as a hexadecimal HTML color code.
Computer Vision also returns a boolean value indicating whether an image is in black and white.
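A minimal sketch of requesting these attributes with the .NET SDK is shown below; the key, endpoint, and image URL are placeholders, and the ColorInfo property names are assumptions based on the JSON fields in the example that follows.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

class ColorSchemeSketch
{
    static async Task Main()
    {
        // Placeholder key and endpoint; use the values from your own Computer Vision resource.
        var vision = new ComputerVisionClient(
            new ApiKeyServiceClientCredentials("YOUR_SUBSCRIPTION_KEY"),
            new System.Net.Http.DelegatingHandler[] { });
        vision.Endpoint = "https://westcentralus.api.cognitive.microsoft.com";

        // Request only the Color visual feature.
        var result = await vision.AnalyzeImageAsync(
            "https://example.com/garden.jpg",                // placeholder image URL
            new[] { VisualFeatureTypes.Color });

        // Print the dominant colors, the accent color, and the black-and-white flag.
        var color = result.Color;
        Console.WriteLine($"Foreground: {color.DominantColorForeground}");
        Console.WriteLine($"Background: {color.DominantColorBackground}");
        Console.WriteLine($"Dominant: {string.Join(", ", color.DominantColors)}");
        Console.WriteLine($"Accent: #{color.AccentColor}");
        Console.WriteLine($"Black and white image: {color.IsBWImg}");
    }
}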
The following example illustrates the JSON response returned by Computer Vision when detecting the color
scheme of the example image. In this case, the example image is not a black and white image, but the dominant
foreground and background colors are black, and the dominant colors for the image as a whole are black and
white.
The following table shows the returned foreground, background, and image colors for each sample image.
IMAGE DOMINANT COLORS
Foreground: Black
Background: White
Colors: Black, White, Green
Foreground: Black
Background: Black
Colors: Black
Accent color examples
The following table shows the returned accent color, as a hexadecimal HTML color value, for each example image.
IMAGE ACCENT COLOR
#BB6D10
#C6A205
#474A84
Black & white detection examples
The following table shows Computer Vision's black and white evaluation in the sample images.
IMAGE BLACK & WHITE?
true
false
Next steps
Learn concepts about detecting image types.
Area of interest
Examples
A thumbnail is a reduced-size representation of an image. Thumbnails are used to represent images and other
data in a more economical, layout-friendly way. The Computer Vision API uses smart cropping, together with
resizing the image, to create intuitive thumbnails for a given image.
The Computer Vision thumbnail generation algorithm works as follows:
1. Remove distracting elements from the image and identify the area of interest, the area of the image in which the main object(s) appears.
2. Crop the image based on the identified area of interest.
3. Change the aspect ratio to fit the target thumbnail dimensions.
When you upload an image, the Computer Vision API analyzes it to determine the area of interest. It can then use
this region to determine how to crop the image. The cropping operation, however, will always match the desired
aspect ratio if one is specified.
You can also get the raw bounding box coordinates of this same area of interest by calling the areaOfInterest API
instead. You can then use this information to modify the original image however you wish.
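As a hedged illustration, the following sketch calls the thumbnail endpoint with a target width, height, and smart cropping enabled, and saves the binary response to disk. The endpoint, key, image URL, and v2.0 path are placeholders or assumptions.

using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class ThumbnailSketch
{
    static async Task Main()
    {
        // Placeholder endpoint and key; width, height, and smartCropping are query parameters.
        var uri = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail" +
                  "?width=100&height=100&smartCropping=true";

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR_SUBSCRIPTION_KEY");

            var body = new StringContent("{\"url\":\"https://example.com/photo.jpg\"}",
                                         Encoding.UTF8, "application/json");
            var response = await client.PostAsync(uri, body);
            response.EnsureSuccessStatusCode();

            // The response body is the binary thumbnail image rather than JSON.
            File.WriteAllBytes("thumbnail.jpg", await response.Content.ReadAsByteArrayAsync());
        }
    }
}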
The generated thumbnail can vary widely depending on what you specify for height, width, and smart cropping, as
shown in the following image.
The following table illustrates typical thumbnails generated by Computer Vision for the example images. The
thumbnails were generated for a specified target height and width of 50 pixels, with smart cropping enabled.
IMAGE THUMBNAIL
Next steps
Learn about tagging images and categorizing images.
Recognize printed and handwritten text
5/29/2019 3 minutes to read Edit Online
Read API
NOTE
Image requirements
Limitations
OCR (optical character recognition) API
Computer Vision provides a number of services that detect and extract printed or handwritten text that appears in
images. This is useful in a variety of scenarios such as note taking, medical records, security, and banking. The
following three sections detail three different text recognition APIs, each optimized for different use cases.
The Read API detects text content in an image using our latest recognition models and converts the identified text
into a machine-readable character stream. It's optimized for text-heavy images (such as documents that have been
digitally scanned) and for images with a lot of visual noise. It will determine which recognition model to use for
each line of text, supporting images with both printed and handwritten text. The Read API executes asynchronously
because larger documents can take several minutes to return a result.
The Read operation maintains the original line groupings of recognized words in its output. Each line comes with
bounding box coordinates, and each word within the line also has its own coordinates. If a word was recognized
with low confidence, that information is conveyed as well. See the Read API reference docs to learn more.
This feature is only available for English text.
The Read API works with images that meet the following requirements:
The image must be presented in JPEG, PNG, BMP, PDF, or TIFF format.
The dimensions of the image must be between 50 x 50 and 4200 x 4200 pixels. PDF pages must be 17 x 17
inches or smaller.
The file size of the image must be less than 20 megabytes (MB).
If you are using a free-tier subscription, the Read API will only process the first two pages of a PDF or TIFF
document. With a paid subscription, it will process up to 200 pages. Also note that the API will detect a maximum
of 300 lines per page.
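To make the asynchronous flow concrete, here is a rough sketch of the submit-then-poll pattern: the first request returns an Operation-Location header, which you then poll until the operation finishes. The endpoint, key, and image URL are placeholders, and the exact v2.0 Read path and status strings are assumptions.

using System;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class ReadApiSketch
{
    static async Task Main()
    {
        // Placeholder endpoint and key; the v2.0 Read path below is an assumption.
        var baseUri = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0";

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR_SUBSCRIPTION_KEY");

            // Step 1: submit the document. The service accepts the request and returns an
            // Operation-Location header pointing at the result of the asynchronous operation.
            var body = new StringContent("{\"url\":\"https://example.com/scan.jpg\"}",
                                         Encoding.UTF8, "application/json");
            var submit = await client.PostAsync($"{baseUri}/read/core/asyncBatchAnalyze", body);
            var operationLocation = submit.Headers.GetValues("Operation-Location").First();

            // Step 2: poll the operation URL until the reported status is no longer queued or running.
            string resultJson;
            do
            {
                await Task.Delay(1000); // wait one second between polls
                resultJson = await client.GetStringAsync(operationLocation);
            }
            while (resultJson.Contains("\"NotStarted\"") || resultJson.Contains("\"Running\""));

            // The final JSON contains the recognized lines and words with bounding boxes.
            Console.WriteLine(resultJson);
        }
    }
}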
Computer Vision's optical character recognition (OCR) API is similar to the Read API, but it executes
synchronously and is not optimized for large documents. It uses an earlier recognition model but works with more
languages; see Language support for a full list of the supported languages.
If necessary, OCR corrects the rotation of the recognized text by returning the rotational offset in degrees about
the horizontal image axis. OCR also provides the frame coordinates of each word, as seen in the following
illustration.
Image requirements
Limitations
Recognize Text API
NOTE
Image requirements
Limitations
See the OCR reference docs to learn more.
The OCR API works on images that meet the following requirements:
The image must be presented in JPEG, PNG, GIF, or BMP format.
The size of the input image must be between 50 x 50 and 4200 x 4200 pixels.
The text in the image can be rotated by any multiple of 90 degrees plus a small angle of up to 40 degrees.
On photographs where text is dominant, false positives may come from partially recognized words. On some
photographs, especially photos without any text, precision can vary depending on the type of image.
The Recognize Text API is being deprecated in favor of the Read API. The Read API has similar capabilities and is updated to
handle PDF, TIFF, and multi-page files.
The Recognize Text API is similar to OCR, but it executes asynchronously and uses updated recognition models.
See the Recognize Text API reference docs to learn more.
The Recognize Text API works with images that meet the following requirements:
The image must be presented in JPEG, PNG, or BMP format.
The dimensions of the image must be between 50 x 50 and 4200 x 4200 pixels.
The file size of the image must be less than 4 megabytes (MB).
The accuracy of text recognition operations depends on the quality of the images. The following factors may cause
an inaccurate reading:
Blurry images.
Handwritten or cursive text.
Artistic font styles.
Small text size.
Complex backgrounds, shadows, or glare over text or perspective distortion.
Oversized or missing capital letters at the beginnings of words.
Subscript, superscript, or strikethrough text.
Next steps
Follow the Extract printed text (OCR) quickstart to implement text recognition in a simple C# app.
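Alternatively, as a minimal hedged sketch of a raw REST call to the OCR API, the snippet below lets the service auto-detect the language (language=unk) and correct rotation (detectOrientation=true), then prints the returned JSON of regions, lines, and words. The endpoint, key, image URL, and v2.0 path are placeholders or assumptions.

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class OcrSketch
{
    static async Task Main()
    {
        // Placeholder endpoint and key; language=unk asks the service to detect the language.
        var uri = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr" +
                  "?language=unk&detectOrientation=true";

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YOUR_SUBSCRIPTION_KEY");

            var body = new StringContent("{\"url\":\"https://example.com/sign.jpg\"}",
                                         Encoding.UTF8, "application/json");
            var response = await client.PostAsync(uri, body);

            // The JSON describes regions, lines, and words with their bounding boxes.
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}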
Detect adult and racy content
2/15/2019 2 minutes to read Edit Online
NOTE
Content flag definitions
Identify adult and racy content
Next steps
Computer Vision can detect adult material in images so that developers can restrict the display of such images in
their software. Content flags are applied with a score between zero and one so that developers can interpret the
results according to their own preferences.
This feature is also offered by the Azure Content Moderator service. See this alternative for solutions to more rigorous
content moderation scenarios, such as text moderation and human review workflows.
Adult images are defined as those which are pornographic in nature and often depict nudity and sexual acts.
Racy images are defined as images that are sexually suggestive in nature and often contain less sexually explicit
content than images tagged as Adult.
Adult and racy content detection is part of the Analyze Image API. The Analyze Image method returns two boolean properties, isAdultContent and isRacyContent, in the JSON response of the method to indicate adult and racy content respectively. The method also returns two properties, adultScore and racyScore, which represent the confidence scores for identifying adult and racy content respectively.
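A minimal, hedged sketch of reading these flags with the .NET SDK follows; the key, endpoint, and image URL are placeholders, and the AdultInfo property names are assumptions based on the JSON fields named above.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;

class AdultContentSketch
{
    static async Task Main()
    {
        // Placeholder key and endpoint; use the values from your own Computer Vision resource.
        var vision = new ComputerVisionClient(
            new ApiKeyServiceClientCredentials("YOUR_SUBSCRIPTION_KEY"),
            new System.Net.Http.DelegatingHandler[] { });
        vision.Endpoint = "https://westcentralus.api.cognitive.microsoft.com";

        // Request only the Adult visual feature.
        var result = await vision.AnalyzeImageAsync(
            "https://example.com/photo.jpg",                 // placeholder image URL
            new[] { VisualFeatureTypes.Adult });

        // Print the boolean flags and the confidence scores described above.
        var adult = result.Adult;
        Console.WriteLine($"isAdultContent: {adult.IsAdultContent} (score {adult.AdultScore:0.000})");
        Console.WriteLine($"isRacyContent: {adult.IsRacyContent} (score {adult.RacyScore:0.000})");
    }
}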
Learn concepts about detecting domain-specific content and detecting faces.
Use Computer Vision features with the REST API and
Java
5/7/2019 22 minutes to read Edit Online
Prerequisites
Platform requirements
Subscribe to Computer Vision API and get a subscription key
Acquire incomplete tutorial project
Download the project
Import the tutorial project
This tutorial shows the features of the Azure Cognitive Services Computer Vision REST API.
Explore a Java Swing application that uses the Computer Vision REST API to perform optical character recognition
(OCR), create smart-cropped thumbnails, plus detect, categorize, tag, and describe visual features, including faces,
in an image. This example lets you submit an image URL for analysis or processing. You can use this open source
example as a template for building your own app in Java to use the Computer Vision REST API.
This tutorial will cover how to use Computer Vision to:
Analyze an image
Identify a natural or artificial landmark in an image
Identify a celebrity in an image
Create a quality thumbnail from an image
Read printed text in an image
Read handwritten text in an image
The Java Swing form application has already been written but has no functionality. In this tutorial, you add the code
specific to the Computer Vision REST API to complete the application's functionality.
This tutorial has been developed using the NetBeans IDE. Specifically, the Java SE version of NetBeans, which you
can download here.
Before creating the example, you must subscribe to Computer Vision API which is part of Azure Cognitive Services.
For subscription and key management details, see Subscriptions. Both the primary and secondary keys are valid to
use in this tutorial.
1. Go to the Cognitive Services Java Computer Vision Tutorial repository.
2. Click the Clone or download button.
3. Click Download ZIP to download a .zip file of the tutorial project.
There is no need to extract the contents of the .zip file because NetBeans imports the project from the .zip file.
Import the cognitive-services-java-computer-vision-tutorial-master.zip file into NetBeans.
1. In NetBeans, click File > Import Project > From ZIP.... The Import Project(s) from ZIP dialog box appears.
2. In the ZIP File: field, click the Browse button to locate the cognitive-services-java-computer-vision-
tutorial-master.zip file, then click Open.
Build and run the tutorial project
Add tutorial code to the project
Analyze an image
Add the event handler code for the analyze button
NOTE
3. Click Import from the Import Project(s) from ZIP dialog box.
4. In the Projects panel, expand ComputerVision > Source Packages > <default package>. Some versions of
NetBeans use src instead of Source Packages > <default package>. In that case, expand src.
5. Double-click MainFrame.java to load the file into the NetBeans editor. The Design tab of the
MainFrame.java file appears.
6. Click the Source tab to view the Java source code.
1. Press F6 to build and run the tutorial application.
In the tutorial application, click a tab to bring up the pane for that feature. The buttons have empty methods,
so they do nothing.
At the bottom of the window are the fields Subscription Key and Subscription Region. These fields must
be filled with a valid subscription key and the correct region for that subscription key. To obtain a
subscription key, see Subscriptions. If you obtained your subscription key from the free trial at that link, then
the default region westcentralus is the correct region for your subscription keys.
2. Exit the tutorial application.
The Java Swing application is set up with six tabs. Each tab demonstrates a different function of Computer Vision
(analyze, OCR, and so on). The six tutorial sections do not have interdependencies, so you can add one section, all
six sections, or any subset. You can add the sections in any order.
The Analyze feature of Computer Vision scans an image for more than 2,000 recognizable objects, living things,
scenery, and actions. Once the analysis is complete, Analyze returns a JSON object that describes the image with
descriptive tags, color analysis, captions, and more.
To complete the Analyze feature of the tutorial application, perform the following steps:
The analyzeImageButtonActionPerformed event handler method clears the form, displays the image specified
in the URL, then calls the AnalyzeImage method to analyze the image. When AnalyzeImage returns, the method
displays the formatted JSON response in the Response text area, extracts the first caption from the JSONObject,
and displays the caption and the confidence level that the caption is correct.
Copy and paste the following code into the analyzeImageButtonActionPerformed method.
NetBeans won't let you paste to the method definition line (private void ) or to the closing curly brace of that method. To
copy the code, copy the lines between the method definition and the closing curly brace, and paste them over the contents of
the method.
private void analyzeImageButtonActionPerformed(java.awt.event.ActionEvent evt) {
URL analyzeImageUrl;
// Clear out the previous image, response, and caption, if any.
analyzeImage.setIcon(new ImageIcon());
analyzeCaptionLabel.setText("");
analyzeResponseTextArea.setText("");
// Display the image specified in the text box.
try {
analyzeImageUrl = new URL(analyzeImageUriTextBox.getText());
BufferedImage bImage = ImageIO.read(analyzeImageUrl);
scaleAndShowImage(bImage, analyzeImage);
} catch(IOException e) {
analyzeResponseTextArea.setText("Error loading Analyze image: " + e.getMessage());
return;
}
// Analyze the image.
JSONObject jsonObj = AnalyzeImage(analyzeImageUrl.toString());
// A return of null indicates failure.
if (jsonObj == null) {
return;
}
// Format and display the JSON response.
analyzeResponseTextArea.setText(jsonObj.toString(2));
// Extract the text and confidence from the first caption in the description object.
if (jsonObj.has("description") && jsonObj.getJSONObject("description").has("captions")) {
JSONObject jsonCaption =
jsonObj.getJSONObject("description").getJSONArray("captions").getJSONObject(0);
if (jsonCaption.has("text") && jsonCaption.has("confidence")) {
analyzeCaptionLabel.setText("Caption: " + jsonCaption.getString("text") +
" (confidence: " + jsonCaption.getDouble("confidence") + ").");
}
}
}
Add the wrapper for the REST API call
The AnalyzeImage method wraps the REST API call to analyze an image. The method returns a JSONObject
describing the image, or null if there was an error.
Copy and paste the AnalyzeImage method to just underneath the analyzeImageButtonActionPerformed
method.
/**
* Encapsulates the Microsoft Cognitive Services REST API call to analyze an image.
* @param imageUrl: The string URL of the image to analyze.
* @return: A JSONObject describing the image, or null if a runtime error occurs.
*/
private JSONObject AnalyzeImage(String imageUrl) {
try (CloseableHttpClient httpclient = HttpClientBuilder.create().build())
{
// Create the URI to access the REST API call for Analyze Image.
String uriString = uriBasePreRegion +
String.valueOf(subscriptionRegionComboBox.getSelectedItem()) +
uriBasePostRegion + uriBaseAnalyze;
URIBuilder builder = new URIBuilder(uriString);
// Request parameters. All of them are optional.
builder.setParameter("visualFeatures", "Categories,Description,Color,Adult");
builder.setParameter("language", "en");
// Prepare the URI for the REST API call.
URI uri = builder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKeyTextField.getText());
// Request body.
StringEntity reqEntity = new StringEntity("{\"url\":\"" + imageUrl + "\"}");
request.setEntity(reqEntity);
// Execute the REST API call and get the response entity.
HttpResponse response = httpclient.execute(request);
HttpEntity entity = response.getEntity();
// If we got a response, parse it and display it.
if (entity != null)
{
// Return the JSONObject.
String jsonString = EntityUtils.toString(entity);
return new JSONObject(jsonString);
} else {
// No response. Return null.
return null;
}
}
catch (Exception e)
{
// Display error message.
System.out.println(e.getMessage());
return null;
}
}
Run the Analyze function
Recognize a landmark
Press F6 to run the application. Put your subscription key into the Subscription Key field and verify that you are
using the correct region in Subscription Region. Enter a URL to an image to analyze, then click the Analyze
Image button to analyze an image and see the result.
The Landmark feature of Computer Vision analyzes an image for natural and artificial landmarks, such as
mountains or famous buildings. Once the analysis is complete, Landmark returns a JSON object that identifies the
landmarks found in the image.
To complete the Landmark feature of the tutorial application, perform the following steps:
Add the event handler code for the form button
NOTE
private void landmarkImageButtonActionPerformed(java.awt.event.ActionEvent evt) {
URL landmarkImageUrl;
// Clear out the previous image, response, and caption, if any.
landmarkImage.setIcon(new ImageIcon());
landmarkCaptionLabel.setText("");
landmarkResponseTextArea.setText("");
// Display the image specified in the text box.
try {
landmarkImageUrl = new URL(landmarkImageUriTextBox.getText());
BufferedImage bImage = ImageIO.read(landmarkImageUrl);
scaleAndShowImage(bImage, landmarkImage);
} catch(IOException e) {
landmarkResponseTextArea.setText("Error loading Landmark image: " + e.getMessage());
return;
}
// Identify the landmark in the image.
JSONObject jsonObj = LandmarkImage(landmarkImageUrl.toString());
// A return of null indicates failure.
if (jsonObj == null) {
return;
}
// Format and display the JSON response.
landmarkResponseTextArea.setText(jsonObj.toString(2));
// Extract the text and confidence from the first caption in the description object.
if (jsonObj.has("result") && jsonObj.getJSONObject("result").has("landmarks")) {
JSONObject jsonCaption =
jsonObj.getJSONObject("result").getJSONArray("landmarks").getJSONObject(0);
if (jsonCaption.has("name") && jsonCaption.has("confidence")) {
landmarkCaptionLabel.setText("Caption: " + jsonCaption.getString("name") +
" (confidence: " + jsonCaption.getDouble("confidence") + ").");
}
}
}
Add the wrapper for the REST API call
The landmarkImageButtonActionPerformed event handler method clears the form, displays the image
specified in the URL, then calls the LandmarkImage method to analyze the image. When LandmarkImage
returns, the method displays the formatted JSON response in the Response text area, then extracts the first
landmark name from the JSONObject and displays it on the window along with the confidence level that the
landmark was identified correctly.
Copy and paste the following code into the landmarkImageButtonActionPerformed method.
NetBeans won't let you paste to the method definition line (private void ) or to the closing curly brace of that method. To
copy the code, copy the lines between the method definition and the closing curly brace, and paste them over the contents of
the method.
The LandmarkImage method wraps the REST API call to analyze an image. The method returns a JSONObject
describing the landmarks found in the image, or null if there was an error.
/**
* Encapsulates the Microsoft Cognitive Services REST API call to identify a landmark in an image.
* @param imageUrl: The string URL of the image to process.
* @return: A JSONObject describing the image, or null if a runtime error occurs.
*/
private JSONObject LandmarkImage(String imageUrl) {
try (CloseableHttpClient httpclient = HttpClientBuilder.create().build())
{
// Create the URI to access the REST API call to identify a Landmark in an image.
String uriString = uriBasePreRegion +
String.valueOf(subscriptionRegionComboBox.getSelectedItem()) +
uriBasePostRegion + uriBaseLandmark;
URIBuilder builder = new URIBuilder(uriString);
// Request parameters. All of them are optional.
builder.setParameter("visualFeatures", "Categories,Description,Color");
builder.setParameter("language", "en");
// Prepare the URI for the REST API call.
URI uri = builder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKeyTextField.getText());
// Request body.
StringEntity reqEntity = new StringEntity("{\"url\":\"" + imageUrl + "\"}");
request.setEntity(reqEntity);
// Execute the REST API call and get the response entity.
HttpResponse response = httpclient.execute(request);
HttpEntity entity = response.getEntity();
// If we got a response, parse it and display it.
if (entity != null)
{
// Return the JSONObject.
String jsonString = EntityUtils.toString(entity);
return new JSONObject(jsonString);
} else {
// No response. Return null.
return null;
}
}
catch (Exception e)
{
// Display error message.
System.out.println(e.getMessage());
return null;
}
}
Run the landmark function
Recognize celebrities
Copy and paste the LandmarkImage method to just underneath the landmarkImageButtonActionPerformed
method.
Press F6 to run the application. Put your subscription key into the Subscription Key field and verify that you are
using the correct region in Subscription Region. Click the Landmark tab, enter a URL to an image of a landmark,
then click the Analyze Image button to analyze an image and see the result.
The Celebrities feature of Computer Vision analyzes an image for famous people. Once the analysis is complete,
Celebrities returns a JSON object that identifies the Celebrities found in the image.
Add the event handler code for the celebrities button
NOTE
private void celebritiesImageButtonActionPerformed(java.awt.event.ActionEvent evt) {
URL celebritiesImageUrl;
// Clear out the previous image, response, and caption, if any.
celebritiesImage.setIcon(new ImageIcon());
celebritiesCaptionLabel.setText("");
celebritiesResponseTextArea.setText("");
// Display the image specified in the text box.
try {
celebritiesImageUrl = new URL(celebritiesImageUriTextBox.getText());
BufferedImage bImage = ImageIO.read(celebritiesImageUrl);
scaleAndShowImage(bImage, celebritiesImage);
} catch(IOException e) {
celebritiesResponseTextArea.setText("Error loading Celebrity image: " + e.getMessage());
return;
}
// Identify the celebrities in the image.
JSONObject jsonObj = CelebritiesImage(celebritiesImageUrl.toString());
// A return of null indicates failure.
if (jsonObj == null) {
return;
}
// Format and display the JSON response.
celebritiesResponseTextArea.setText(jsonObj.toString(2));
// Extract the text and confidence from the first caption in the description object.
if (jsonObj.has("result") && jsonObj.getJSONObject("result").has("celebrities")) {
JSONObject jsonCaption =
jsonObj.getJSONObject("result").getJSONArray("celebrities").getJSONObject(0);
if (jsonCaption.has("name") && jsonCaption.has("confidence")) {
celebritiesCaptionLabel.setText("Caption: " + jsonCaption.getString("name") +
" (confidence: " + jsonCaption.getDouble("confidence") + ").");
}
}
}
Add the wrapper for the REST API call
To complete the Celebrities feature of the tutorial application, perform the following steps:
The celebritiesImageButtonActionPerformed event handler method clears the form, displays the image
specified in the URL, then calls the CelebritiesImage method to analyze the image. When CelebritiesImage
returns, the method displays the formatted JSON response in the Response text area, then extracts the first
celebrity name from the JSONObject and displays the name on the window along with the confidence level that
the celebrity was identified correctly.
Copy and paste the following code into the celebritiesImageButtonActionPerformed method.
NetBeans won't let you paste to the method definition line (private void ) or to the closing curly brace of that method. To
copy the code, copy the lines between the method definition and the closing curly brace, and paste them over the contents of
the method.
The CelebritiesImage method wraps the REST API call to analyze an image. The method returns a JSONObject describing the celebrities found in the image, or null if there was an error.
/**
* Encapsulates the Microsoft Cognitive Services REST API call to identify celebrities in an image.
* @param imageUrl: The string URL of the image to process.
* @return: A JSONObject describing the image, or null if a runtime error occurs.
*/
private JSONObject CelebritiesImage(String imageUrl) {
try (CloseableHttpClient httpclient = HttpClientBuilder.create().build())
{
// Create the URI to access the REST API call to identify celebrities in an image.
String uriString = uriBasePreRegion +
String.valueOf(subscriptionRegionComboBox.getSelectedItem()) +
uriBasePostRegion + uriBaseCelebrities;
URIBuilder builder = new URIBuilder(uriString);
// Request parameters. All of them are optional.
builder.setParameter("visualFeatures", "Categories,Description,Color");
builder.setParameter("language", "en");
// Prepare the URI for the REST API call.
URI uri = builder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKeyTextField.getText());
// Request body.
StringEntity reqEntity = new StringEntity("{\"url\":\"" + imageUrl + "\"}");
request.setEntity(reqEntity);
// Execute the REST API call and get the response entity.
HttpResponse response = httpclient.execute(request);
HttpEntity entity = response.getEntity();
// If we got a response, parse it and display it.
if (entity != null)
{
// Return the JSONObject.
String jsonString = EntityUtils.toString(entity);
return new JSONObject(jsonString);
} else {
// No response. Return null.
return null;
}
}
catch (Exception e)
{
// Display error message.
System.out.println(e.getMessage());
return null;
}
}
Run the celebrities function
Intelligently generate a thumbnail
Copy and paste the CelebritiesImage method to just underneath the celebritiesImageButtonActionPerformed method.
Press F6 to run the application. Put your subscription key into the Subscription Key field and verify that you are
using the correct region in Subscription Region. Click the Celebrities tab, enter a URL to an image of a celebrity,
then click the Analyze Image button to analyze an image and see the result.
The Thumbnail feature of Computer Vision generates a thumbnail from an image. By using the Smart Crop feature, the Thumbnail feature will identify the area of interest in an image and center the thumbnail on this area, to generate more aesthetically pleasing thumbnail images.
Add the event handler code for the thumbnail button
NOTE
private void thumbnailImageButtonActionPerformed(java.awt.event.ActionEvent evt) {
URL thumbnailImageUrl;
JSONObject jsonError[] = new JSONObject[1];
// Clear out the previous image, response, and thumbnail, if any.
thumbnailSourceImage.setIcon(new ImageIcon());
thumbnailResponseTextArea.setText("");
thumbnailImage.setIcon(new ImageIcon());
// Display the image specified in the text box.
try {
thumbnailImageUrl = new URL(thumbnailImageUriTextBox.getText());
BufferedImage bImage = ImageIO.read(thumbnailImageUrl);
scaleAndShowImage(bImage, thumbnailSourceImage);
} catch(IOException e) {
thumbnailResponseTextArea.setText("Error loading image to thumbnail: " + e.getMessage());
return;
}
// Get the thumbnail for the image.
BufferedImage thumbnail = getThumbnailImage(thumbnailImageUrl.toString(), jsonError);
// A non-null value indicates error.
if (jsonError[0] != null) {
// Format and display the JSON error.
thumbnailResponseTextArea.setText(jsonError[0].toString(2));
return;
}
// Display the thumbnail.
if (thumbnail != null) {
scaleAndShowImage(thumbnail, thumbnailImage);
}
}
Add the wrapper for the REST API call
To complete the Thumbnail feature of the tutorial application, perform the following steps:
The thumbnailImageButtonActionPerformed event handler method clears the form, displays the image
specified in the URL, then calls the getThumbnailImage method to create the thumbnail. When
getThumbnailImage returns, the method displays the generated thumbnail.
Copy and paste the following code into the thumbnailImageButtonActionPerformed method.
NetBeans won't let you paste to the method definition line (private void ) or to the closing curly brace of that method. To
copy the code, copy the lines between the method definition and the closing curly brace, and paste them over the contents of
the method.
The getThumbnailImage method wraps the REST API call to analyze an image. The method returns a
BufferedImage that contains the thumbnail, or null if there was an error. The error message will be returned in
the first element of the jsonError string array.
Copy and paste the following getThumbnailImage method to just underneath the
thumbnailImageButtonActionPerformed method.
/**
* Encapsulates the Microsoft Cognitive Services REST API call to create a thumbnail for an image.
* @param imageUrl: The string URL of the image to process.
* @return: A BufferedImage containing the thumbnail, or null if a runtime error occurs. In the case
* of an error, the error message will be returned in the first element of the jsonError string array.
*/
private BufferedImage getThumbnailImage(String imageUrl, JSONObject[] jsonError) {
try (CloseableHttpClient httpclient = HttpClientBuilder.create().build())
{
// Create the URI to access the REST API call to identify celebrities in an image.
String uriString = uriBasePreRegion +
String.valueOf(subscriptionRegionComboBox.getSelectedItem()) +
uriBasePostRegion + uriBaseThumbnail;
URIBuilder uriBuilder = new URIBuilder(uriString);
// Request parameters.
uriBuilder.setParameter("width", "100");
uriBuilder.setParameter("height", "150");
uriBuilder.setParameter("smartCropping", "true");
// Prepare the URI for the REST API call.
URI uri = uriBuilder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKeyTextField.getText());
// Request body.
StringEntity requestEntity = new StringEntity("{\"url\":\"" + imageUrl + "\"}");
request.setEntity(requestEntity);
// Execute the REST API call and get the response entity.
HttpResponse response = httpclient.execute(request);
HttpEntity entity = response.getEntity();
// Check for success.
if (response.getStatusLine().getStatusCode() == 200)
{
// Return the thumbnail.
return ImageIO.read(entity.getContent());
}
else
{
// Format and display the JSON error message.
String jsonString = EntityUtils.toString(entity);
jsonError[0] = new JSONObject(jsonString);
return null;
}
}
catch (Exception e)
{
String errorMessage = e.getMessage();
System.out.println(errorMessage);
jsonError[0] = new JSONObject(errorMessage);
return null;
}
}
Run the thumbnail function
Read printed text (OCR)
Press F6 to run the application. Put your subscription key into the Subscription Key field and verify that you are
using the correct region in Subscription Region. Click the Thumbnail tab, enter a URL to an image, then click the
Generate Thumbnail button to analyze an image and see the result.
Add the event handler code for the OCR button
NOTE
private void ocrImageButtonActionPerformed(java.awt.event.ActionEvent evt) {
URL ocrImageUrl;
// Clear out the previous image, response, and caption, if any.
ocrImage.setIcon(new ImageIcon());
ocrResponseTextArea.setText("");
// Display the image specified in the text box.
try {
ocrImageUrl = new URL(ocrImageUriTextBox.getText());
BufferedImage bImage = ImageIO.read(ocrImageUrl);
scaleAndShowImage(bImage, ocrImage);
} catch(IOException e) {
ocrResponseTextArea.setText("Error loading OCR image: " + e.getMessage());
return;
}
// Read the text in the image.
JSONObject jsonObj = OcrImage(ocrImageUrl.toString());
// A return of null indicates failure.
if (jsonObj == null) {
return;
}
// Format and display the JSON response.
ocrResponseTextArea.setText(jsonObj.toString(2));
}
Add the wrapper for the REST API call
The Optical Character Recognition (OCR) feature of Computer Vision analyzes an image of printed text. After the
analysis is complete, OCR returns a JSON object that contains the text and the location of the text in the image.
To complete the OCR feature of the tutorial application, perform the following steps:
The ocrImageButtonActionPerformed event handler method clears the form, displays the image specified in the
URL, then calls the OcrImage method to analyze the image. When OcrImage returns, the method displays the
detected text as formatted JSON in the Response text area.
Copy and paste the following code into the ocrImageButtonActionPerformed method.
NetBeans won't let you paste to the method definition line (private void ) or to the closing curly brace of that method. To
copy the code, copy the lines between the method definition and the closing curly brace, and paste them over the contents of
the method.
The OcrImage method wraps the REST API call to analyze an image. The method returns a JSONObject of the
JSON data returned from the call, or null if there was an error.
Copy and paste the following OcrImage method to just underneath the ocrImageButtonActionPerformed
method.
/**
* Encapsulates the Microsoft Cognitive Services REST API call to read text in an image.
* @param imageUrl: The string URL of the image to process.
* @return: A JSONObject describing the image, or null if a runtime error occurs.
*/
private JSONObject OcrImage(String imageUrl) {
try (CloseableHttpClient httpclient = HttpClientBuilder.create().build())
{
// Create the URI to access the REST API call to read text in an image.
String uriString = uriBasePreRegion +
String.valueOf(subscriptionRegionComboBox.getSelectedItem()) +
uriBasePostRegion + uriBaseOcr;
URIBuilder uriBuilder = new URIBuilder(uriString);
// Request parameters.
uriBuilder.setParameter("language", "unk");
uriBuilder.setParameter("detectOrientation", "true");
// Prepare the URI for the REST API call.
URI uri = uriBuilder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKeyTextField.getText());
// Request body.
StringEntity reqEntity = new StringEntity("{\"url\":\"" + imageUrl + "\"}");
request.setEntity(reqEntity);
// Execute the REST API call and get the response entity.
HttpResponse response = httpclient.execute(request);
HttpEntity entity = response.getEntity();
// If we got a response, parse it and display it.
if (entity != null)
{
// Return the JSONObject.
String jsonString = EntityUtils.toString(entity);
return new JSONObject(jsonString);
} else {
// No response. Return null.
return null;
}
}
catch (Exception e)
{
// Display error message.
System.out.println(e.getMessage());
return null;
}
}
Run the OCR function
Read handwritten text (handwriting recognition)
Press F6 to run the application. Put your subscription key into the Subscription Key field and verify that you are
using the correct region in Subscription Region. Click the OCR tab, enter a URL to an image of printed text, then
click the Read Image button to analyze an image and see the result.
The Handwriting Recognition feature of Computer Vision analyzes an image of handwritten text. After the analysis
is complete, Handwriting Recognition returns a JSON object that contains the text and the location of the text in the
image.
To complete the Handwriting Recognition feature of the tutorial application, perform the following steps:
Add the event handler code for the handwriting button
NOTE
private void handwritingImageButtonActionPerformed(java.awt.event.ActionEvent evt) {
URL handwritingImageUrl;
// Clear out the previous image, response, and caption, if any.
handwritingImage.setIcon(new ImageIcon());
handwritingResponseTextArea.setText("");
// Display the image specified in the text box.
try {
handwritingImageUrl = new URL(handwritingImageUriTextBox.getText());
BufferedImage bImage = ImageIO.read(handwritingImageUrl);
scaleAndShowImage(bImage, handwritingImage);
} catch(IOException e) {
handwritingResponseTextArea.setText("Error loading Handwriting image: " + e.getMessage());
return;
}
// Read the text in the image.
JSONObject jsonObj = HandwritingImage(handwritingImageUrl.toString());
// A return of null indicates failure.
if (jsonObj == null) {
return;
}
// Format and display the JSON response.
handwritingResponseTextArea.setText(jsonObj.toString(2));
}
Add the wrapper for the REST API call
/**
* Encapsulates the Microsoft Cognitive Services REST API call to read handwritten text in an image.
* @param imageUrl: The string URL of the image to process.
* @return: A JSONObject describing the image, or null if a runtime error occurs.
*/
private JSONObject HandwritingImage(String imageUrl) {
try (CloseableHttpClient textClient = HttpClientBuilder.create().build();
CloseableHttpClient resultClient = HttpClientBuilder.create().build())
The handwritingImageButtonActionPerformed event handler method clears the form, displays the image
specified in the URL, then calls the HandwritingImage method to analyze the image. When HandwritingImage
returns, the method displays the detected text as formatted JSON in the Response text area.
Copy and paste the following code into the handwritingImageButtonActionPerformed method.
NetBeans won't let you paste to the method definition line (private void ) or to the closing curly brace of that method. To
copy the code, copy the lines between the method definition and the closing curly brace, and paste them over the contents of
the method.
The HandwritingImage method wraps the two REST API calls needed to analyze an image. Because handwriting
recognition is a time consuming process, a two step process is used. The first call submits the image for processing;
the second call retrieves the detected text when the processing is complete.
After the text is retrieved, the HandwritingImage method returns a JSONObject describing the text and the
locations of the text, or null if there was an error.
Copy and paste the following HandwritingImage method to just underneath the
handwritingImageButtonActionPerformed method.
{
// Create the URI to access the REST API call to read text in an image.
String uriString = uriBasePreRegion +
String.valueOf(subscriptionRegionComboBox.getSelectedItem()) +
uriBasePostRegion + uriBaseHandwriting;
URIBuilder uriBuilder = new URIBuilder(uriString);
// Request parameters.
uriBuilder.setParameter("handwriting", "true");
// Prepare the URI for the REST API call.
URI uri = uriBuilder.build();
HttpPost request = new HttpPost(uri);
// Request headers.
request.setHeader("Content-Type", "application/json");
request.setHeader("Ocp-Apim-Subscription-Key", subscriptionKeyTextField.getText());
// Request body.
StringEntity reqEntity = new StringEntity("{\"url\":\"" + imageUrl + "\"}");
request.setEntity(reqEntity);
// Execute the REST API call and get the response.
HttpResponse textResponse = textClient.execute(request);
// Check for success.
if (textResponse.getStatusLine().getStatusCode() != 202) {
// An error occurred. Return the JSON error message.
HttpEntity entity = textResponse.getEntity();
String jsonString = EntityUtils.toString(entity);
return new JSONObject(jsonString);
}
String operationLocation = null;
// The 'Operation-Location' in the response contains the URI to retrieve the recognized text.
Header[] responseHeaders = textResponse.getAllHeaders();
for(Header header : responseHeaders) {
if(header.getName().equals("Operation-Location"))
{
// This string is the URI where you can get the text recognition operation result.
operationLocation = header.getValue();
break;
}
}
// NOTE: The response may not be immediately available. Handwriting recognition is an
// async operation that can take a variable amount of time depending on the length
// of the text you want to recognize. You may need to wait or retry this operation.
//
// This example checks once per second for ten seconds.
JSONObject responseObj = null;
int i = 0;
do {
// Wait one second before checking again, and count this attempt against the retry limit.
Thread.sleep(1000);
i++;
// Check to see if the operation completed.
HttpGet resultRequest = new HttpGet(operationLocation);
resultRequest.setHeader("Ocp-Apim-Subscription-Key", subscriptionKeyTextField.getText());
HttpResponse resultResponse = resultClient.execute(resultRequest);
HttpEntity responseEntity = resultResponse.getEntity();
if (responseEntity != null)
{
// Get the JSON response.
String jsonString = EntityUtils.toString(responseEntity);
responseObj = new JSONObject(jsonString);
}
}
while (i < 10 && responseObj != null &&
!responseObj.getString("status").equalsIgnoreCase("Succeeded"));
// If the operation completed, return the JSON object.
if (responseObj != null) {
return responseObj;
} else {
// Return null for timeout error.
System.out.println("Timeout error.");
return null;
}
}
catch (Exception e)
{
// Display error message.
System.out.println(e.getMessage());
return null;
}
}
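The submit-then-poll pattern in HandwritingImage is not specific to Java. As a minimal sketch under stated assumptions (the requests library, an endpoint of the shape the tutorial builds from uriBaseHandwriting, and placeholder key and image values), the same flow might look like this in Python:
import time
import requests

# Assumed endpoint and parameter shape; the tutorial builds its URI from uriBaseHandwriting.
endpoint = "https://westus.api.cognitive.microsoft.com/vision/v2.0/recognizeText"
headers = {"Ocp-Apim-Subscription-Key": "<subscription key>", "Content-Type": "application/json"}

# First call: submit the image. The URI for the result comes back in the Operation-Location header.
submit = requests.post(endpoint, headers=headers,
                       params={"handwriting": "true"}, json={"url": "<image URL>"})
operation_location = submit.headers["Operation-Location"]

# Second call: poll once per second, up to ten times, until the operation finishes.
result = None
for _ in range(10):
    time.sleep(1)
    result = requests.get(operation_location,
                          headers={"Ocp-Apim-Subscription-Key": "<subscription key>"}).json()
    if result.get("status") not in ("NotStarted", "Running"):
        break
print(result)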
Run the handwriting function
To run the application, press F6. Put your subscription key into the Subscription Key field and verify that you are
using the correct region in Subscription Region. Click the Read Handwritten Text tab, enter a URL to an image
of handwritten text, then click the Read Image button to analyze an image and see the result.
Next steps
In this guide, you used the Computer Vision REST API with Java to test many of the available image analysis
features. Next, see the reference documentation to learn more about the APIs involved.
Computer Vision REST API
Use Computer Vision features with the REST API and
JavaScript
5/7/2019 18 minutes to read
Prerequisites
Platform requirements
Subscribe to Computer Vision API and get a subscription key
Acquire incomplete tutorial project
Download the project
Add tutorial code to the project
Analyze an image
Add the event handler code for the analyze button
This guide shows the features of the Azure Cognitive Services Computer Vision REST API.
Explore a JavaScript application that uses the Computer Vision REST API to perform optical character recognition
(OCR), create smart-cropped thumbnails, plus detect, categorize, tag, and describe visual features, including faces,
in an image. This example lets you submit an image URL for analysis or processing. You can use this open source
example as a template for building your own JavaScript app to use the Computer Vision REST API.
The JavaScript form application has already been written, but has no Computer Vision functionality. In this guide,
you add the code specific to the Computer Vision REST API to complete the application's functionality.
You can follow the steps in this guide using a simple text editor.
Before creating the example, you must subscribe to Computer Vision API which is part of the Azure Cognitive
Services. For subscription and key management details, see Subscriptions. Both the primary and secondary keys
are valid to use in this guide.
Clone the Cognitive Services JavaScript Computer Vision Tutorial, or download the .zip file and extract it to an
empty directory.
If you would prefer to use the finished project with all tutorial code added, you can use the files in the Completed
folder.
The JavaScript application is set up with six .html files, one for each feature. Each file demonstrates a different
function of Computer Vision (analyze, OCR, etc.). The six sections do not have interdependencies, so you can add
the tutorial code to one file, all six files, or only a couple of files. And you can add the tutorial code to the files in any
order.
The Analyze feature of Computer Vision scans an image for thousands of recognizable objects, living things,
scenery, and actions. Once the analysis is complete, Analyze returns a JSON object that describes the image with
descriptive tags, color analysis, captions, and more.
To complete the Analyze feature of the application, perform the following steps:
Open the analyze.html file in a text editor and locate the analyzeButtonClick function near the bottom of the
file.
function analyzeButtonClick() {
// Clear the display fields.
$("#sourceImage").attr("src", "#");
$("#responseTextArea").val("");
$("#captionSpan").text("");
// Display the image.
var sourceImageUrl = $("#inputImage").val();
$("#sourceImage").attr("src", sourceImageUrl);
AnalyzeImage(sourceImageUrl, $("#responseTextArea"), $("#captionSpan"));
}
Add the wrapper for the REST API call
The analyzeButtonClick event handler function clears the form, displays the image specified in the URL, then
calls the AnalyzeImage function to analyze the image. Copy and paste the following code into the
analyzeButtonClick function.
The AnalyzeImage function wraps the REST API call to analyze an image. Upon a successful return, the formatted
JSON analysis will be displayed in the specified textarea, and the caption will be displayed in the specified span.
Copy and paste the AnalyzeImage function code to just underneath the analyzeButtonClick function.
/* Analyze the image at the specified URL by using Microsoft Cognitive Services Analyze Image API.
* @param {string} sourceImageUrl - The URL to the image to analyze.
* @param {<textarea> element} responseTextArea - The text area to display the JSON string returned
* from the REST API call, or to display the error message if there was
* an error.
* @param {<span> element} captionSpan - The span to display the image caption.
*/
function AnalyzeImage(sourceImageUrl, responseTextArea, captionSpan) {
// Request parameters.
var params = {
"visualFeatures": "Categories,Description,Color",
"details": "",
"language": "en",
};
// Perform the REST API call.
$.ajax({
url: common.uriBasePreRegion +
$("#subscriptionRegionSelect").val() +
common.uriBasePostRegion +
common.uriBaseAnalyze +
"?" +
$.param(params),
// Request headers.
beforeSend: function(jqXHR){
jqXHR.setRequestHeader("Content-Type","application/json");
jqXHR.setRequestHeader("Ocp-Apim-Subscription-Key",
encodeURIComponent($("#subscriptionKeyInput").val()));
},
type: "POST",
// Request body.
data: '{"url": ' + '"' + sourceImageUrl + '"}',
})
.done(function(data) {
// Show formatted JSON on webpage.
responseTextArea.val(JSON.stringify(data, null, 2));
// Extract and display the caption and confidence from the first caption in the description object.
if (data.description && data.description.captions) {
var caption = data.description.captions[0];
if (caption.text && caption.confidence) {
captionSpan.text("Caption: " + caption.text +
" (confidence: " + caption.confidence + ").");
}
}
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Prepare the error string.
var errorString = (errorThrown === "") ? "Error. " : errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" : (jQuery.parseJSON(jqXHR.responseText).message) ?
jQuery.parseJSON(jqXHR.responseText).message : jQuery.parseJSON(jqXHR.responseText).error.message;
// Put the error JSON in the response textarea.
responseTextArea.val(JSON.stringify(jqXHR, null, 2));
// Show the error message.
alert(errorString);
});
}
Run the Analyze function
Recognize a landmark
Add the event handler code for the landmark button
function landmarkButtonClick() {
// Clear the display fields.
$("#sourceImage").attr("src", "#");
$("#responseTextArea").val("");
$("#captionSpan").text("");
// Display the image.
var sourceImageUrl = $("#inputImage").val();
$("#sourceImage").attr("src", sourceImageUrl);
IdentifyLandmarks(sourceImageUrl, $("#responseTextArea"), $("#captionSpan"));
}
Add the wrapper for the REST API call
Save the analyze.html file and open it in a Web browser. Put your subscription key into the Subscription Key
field and verify that you are using the correct region in Subscription Region. Enter a URL to an image to analyze,
then click the Analyze Image button to analyze an image and see the result.
The Landmark feature of Computer Vision analyzes an image for natural and artificial landmarks, such as
mountains or famous buildings. Once the analysis is complete, Landmark returns a JSON object that identifies the
landmarks found in the image.
To complete the Landmark feature of the application, perform the following steps:
Open the landmark.html file in a text editor and locate the landmarkButtonClick function near the bottom of
the file.
The landmarkButtonClick event handler function clears the form, displays the image specified in the URL, then
calls the IdentifyLandmarks function to analyze the image. Copy and paste the following code into the
landmarkButtonClick function.
The IdentifyLandmarks function wraps the REST API call to analyze an image. Upon a successful return, the
formatted JSON analysis will be displayed in the specified textarea, and the caption will be displayed in the
specified span.
Copy and paste the IdentifyLandmarks function code to just underneath the landmarkButtonClick function.
/* Identify landmarks in the image at the specified URL by using Microsoft Cognitive Services
* Landmarks API.
* @param {string} sourceImageUrl - The URL to the image to analyze for landmarks.
* @param {<textarea> element} responseTextArea - The text area to display the JSON string returned
* from the REST API call, or to display the error message if there was
* an error.
* @param {<span> element} captionSpan - The span to display the image caption.
*/
function IdentifyLandmarks(sourceImageUrl, responseTextArea, captionSpan) {
// Request parameters.
var params = {
"model": "landmarks"
};
// Perform the REST API call.
$.ajax({
url: common.uriBasePreRegion +
$("#subscriptionRegionSelect").val() +
common.uriBasePostRegion +
common.uriBaseLandmark +
"?" +
$.param(params),
// Request headers.
beforeSend: function(jqXHR){
jqXHR.setRequestHeader("Content-Type","application/json");
jqXHR.setRequestHeader("Ocp-Apim-Subscription-Key",
encodeURIComponent($("#subscriptionKeyInput").val()));
},
type: "POST",
// Request body.
data: '{"url": ' + '"' + sourceImageUrl + '"}',
})
.done(function(data) {
// Show formatted JSON on webpage.
responseTextArea.val(JSON.stringify(data, null, 2));
// Extract and display the caption and confidence from the first caption in the description object.
if (data.result && data.result.landmarks) {
var landmark = data.result.landmarks[0];
if (landmark.name && landmark.confidence) {
captionSpan.text("Landmark: " + landmark.name +
" (confidence: " + landmark.confidence + ").");
}
}
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Prepare the error string.
var errorString = (errorThrown === "") ? "Error. " : errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" : (jQuery.parseJSON(jqXHR.responseText).message) ?
jQuery.parseJSON(jqXHR.responseText).message : jQuery.parseJSON(jqXHR.responseText).error.message;
// Put the error JSON in the response textarea.
responseTextArea.val(JSON.stringify(jqXHR, null, 2));
// Show the error message.
alert(errorString);
});
}
Run the landmark function
Recognize celebrities
Add the event handler code for the celebrities button
function celebritiesButtonClick() {
// Clear the display fields.
$("#sourceImage").attr("src", "#");
$("#responseTextArea").val("");
$("#captionSpan").text("");
// Display the image.
var sourceImageUrl = $("#inputImage").val();
$("#sourceImage").attr("src", sourceImageUrl);
IdentifyCelebrities(sourceImageUrl, $("#responseTextArea"), $("#captionSpan"));
}
Add the wrapper for the REST API call
Save the landmark.html file and open it in a Web browser. Put your subscription key into the Subscription Key
field and verify that you are using the correct region in Subscription Region. Enter a URL to an image to analyze,
then click the Analyze Image button to analyze an image and see the result.
The Celebrities feature of Computer Vision analyzes an image for famous people. Once the analysis is complete,
Celebrities returns a JSON object that identifies the Celebrities found in the image.
To complete the Celebrities feature of the application, perform the following steps:
Open the celebrities.html file in a text editor and locate the celebritiesButtonClick function near the bottom of
the file.
The celebritiesButtonClick event handler function clears the form, displays the image specified in the URL, then
calls the IdentifyCelebrities function to analyze the image. Copy and paste the following code into the
celebritiesButtonClick function.
/* Identify celebrities in the image at the specified URL by using Microsoft Cognitive Services
* Celebrities API.
* @param {string} sourceImageUrl - The URL to the image to analyze for celebrities.
* @param {<textarea> element} responseTextArea - The text area to display the JSON string returned
* from the REST API call, or to display the error message if there was
* an error.
* @param {<span> element} captionSpan - The span to display the image caption.
*/
function IdentifyCelebrities(sourceImageUrl, responseTextArea, captionSpan) {
// Request parameters.
var params = {
"model": "celebrities"
};
// Perform the REST API call.
$.ajax({
url: common.uriBasePreRegion +
$("#subscriptionRegionSelect").val() +
common.uriBasePostRegion +
common.uriBaseCelebrities +
"?" +
$.param(params),
// Request headers.
beforeSend: function(jqXHR){
jqXHR.setRequestHeader("Content-Type","application/json");
jqXHR.setRequestHeader("Ocp-Apim-Subscription-Key",
encodeURIComponent($("#subscriptionKeyInput").val()));
},
type: "POST",
// Request body.
data: '{"url": ' + '"' + sourceImageUrl + '"}',
})
.done(function(data) {
// Show formatted JSON on webpage.
responseTextArea.val(JSON.stringify(data, null, 2));
// Extract and display the caption and confidence from the first caption in the description object.
if (data.result && data.result.celebrities) {
var celebrity = data.result.celebrities[0];
if (celebrity.name && celebrity.confidence) {
captionSpan.text("Celebrity name: " + celebrity.name +
" (confidence: " + celebrity.confidence + ").");
}
}
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Prepare the error string.
var errorString = (errorThrown === "") ? "Error. " : errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" : (jQuery.parseJSON(jqXHR.responseText).message) ?
jQuery.parseJSON(jqXHR.responseText).message : jQuery.parseJSON(jqXHR.responseText).error.message;
// Put the error JSON in the response textarea.
responseTextArea.val(JSON.stringify(jqXHR, null, 2));
// Show the error message.
alert(errorString);
});
}
Run the celebrities function
Save the celebrities.html file and open it in a Web browser. Put your subscription key into the Subscription Key
field and verify that you are using the correct region in Subscription Region. Enter a URL to an image to analyze,
then click the Analyze Image button to analyze an image and see the result.
Intelligently generate a thumbnail
The Thumbnail feature of Computer Vision generates a thumbnail from an image. By using the Smart Crop
feature, the Thumbnail feature will identify the area of interest in an image and center the thumbnail on this area, to
generate more aesthetically pleasing thumbnail images.
To complete the Thumbnail feature of the application, perform the following steps:
Add the event handler code for the thumbnail button
Open the thumbnail.html file in a text editor and locate the thumbnailButtonClick function near the bottom of
the file.
The thumbnailButtonClick event handler function clears the form, displays the image specified in the URL, then
calls the getThumbnail function twice to create two thumbnails, one smart cropped and one without smart crop.
Copy and paste the following code into the thumbnailButtonClick function.
function thumbnailButtonClick() {
// Clear the display fields.
document.getElementById("sourceImage").src = "#";
document.getElementById("thumbnailImageSmartCrop").src = "#";
document.getElementById("thumbnailImageNonSmartCrop").src = "#";
document.getElementById("responseTextArea").value = "";
document.getElementById("captionSpan").textContent = "";
// Display the image.
var sourceImageUrl = document.getElementById("inputImage").value;
document.getElementById("sourceImage").src = sourceImageUrl;
// Get a smart cropped thumbnail.
getThumbnail (sourceImageUrl, true, document.getElementById("thumbnailImageSmartCrop"),
document.getElementById("responseTextArea"));
// Get a non-smart-cropped thumbnail.
getThumbnail (sourceImageUrl, false, document.getElementById("thumbnailImageNonSmartCrop"),
document.getElementById("responseTextArea"));
}
Add the wrapper for the REST API call
The getThumbnail function wraps the REST API call to generate a thumbnail of an image. Upon a successful
return, the thumbnail will be displayed in the specified img element.
Copy and paste the following getThumbnail function to just underneath the thumbnailButtonClick function.
/* Get a thumbnail of the image at the specified URL by using Microsoft Cognitive Services
* Thumbnail API.
* @param {string} sourceImageUrl URL to image.
* @param {boolean} smartCropping Set to true to use the smart cropping feature which crops to the
* more interesting area of an image; false to crop for the center
* of the image.
* @param {<img> element} imageElement The img element in the DOM which will display the thumbnail image.
* @param {<textarea> element} responseTextArea - The text area to display the Response Headers returned
* from the REST API call, or to display the error message if there was
* an error.
*/
function getThumbnail (sourceImageUrl, smartCropping, imageElement, responseTextArea) {
// Create the HTTP Request object.
var xhr = new XMLHttpRequest();
// Request parameters.
var params = "width=100&height=150&smartCropping=" + smartCropping.toString();
// Build the full URI.
var fullUri = common.uriBasePreRegion +
document.getElementById("subscriptionRegionSelect").value +
common.uriBasePostRegion +
common.uriBaseThumbnail +
"?" +
params;
// Identify the request as a POST, with the URI and parameters.
xhr.open("POST", fullUri);
// Add the request headers.
xhr.setRequestHeader("Content-Type","application/json");
xhr.setRequestHeader("Ocp-Apim-Subscription-Key",
encodeURIComponent(document.getElementById("subscriptionKeyInput").value));
// Set the response type to "blob" for the thumbnail image data.
xhr.responseType = "blob";
// Process the result of the REST API call.
xhr.onreadystatechange = function(e) {
if(xhr.readyState === XMLHttpRequest.DONE) {
// Thumbnail successfully created.
if (xhr.status === 200) {
// Show response headers.
var s = JSON.stringify(xhr.getAllResponseHeaders(), null, 2);
responseTextArea.value = s;
// Show thumbnail image.
var urlCreator = window.URL || window.webkitURL;
var imageUrl = urlCreator.createObjectURL(this.response);
imageElement.src = imageUrl;
} else {
// Display the error message. The error message is the response body as a JSON string.
// The code in this code block extracts the JSON string from the blob response.
var reader = new FileReader();
// This event fires after the blob has been read.
reader.addEventListener('loadend', (e) => {
responseTextArea.value = JSON.stringify(JSON.parse(e.srcElement.result), null, 2);
});
// Start reading the blob as text.
reader.readAsText(xhr.response);
}
}
}
// Execute the REST API call.
xhr.send('{"url": ' + '"' + sourceImageUrl + '"}');
}
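Unlike the other operations, the thumbnail call returns binary image data rather than JSON, which is why the code above reads the response as a blob. Outside the browser the same idea is simpler; the following is a hedged Python sketch, assuming the requests library, the v2.0 generateThumbnail endpoint, and placeholder key and image values:
import requests

url = "https://westus.api.cognitive.microsoft.com/vision/v2.0/generateThumbnail"  # assumed endpoint
headers = {"Ocp-Apim-Subscription-Key": "<subscription key>", "Content-Type": "application/json"}
params = {"width": 100, "height": 150, "smartCropping": "true"}

response = requests.post(url, headers=headers, params=params, json={"url": "<image URL>"})
response.raise_for_status()

# The response body is the thumbnail image itself, so write the raw bytes to a file.
with open("thumbnail.jpg", "wb") as thumbnail_file:
    thumbnail_file.write(response.content)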
Run the thumbnail function
Read printed text (OCR)
Save the thumbnail.html file and open it in a Web browser. Put your subscription key into the Subscription Key
field and verify that you are using the correct region in Subscription Region. Enter a URL to an image to analyze,
then click the Generate Thumbnails button to analyze an image and see the result.
The Optical Character Recognition (OCR) feature of Computer Vision analyzes an image of printed text. After the
analysis is complete, OCR returns a JSON object that contains the text and the location of the text in the image.
To complete the OCR feature of the application, perform the following steps:
Add the event handler code for the OCR button
function ocrButtonClick() {
// Clear the display fields.
$("#sourceImage").attr("src", "#");
$("#responseTextArea").val("");
$("#captionSpan").text("");
// Display the image.
var sourceImageUrl = $("#inputImage").val();
$("#sourceImage").attr("src", sourceImageUrl);
ReadOcrImage(sourceImageUrl, $("#responseTextArea"));
}
Add the wrapper for the REST API call
Open the ocr.html file in a text editor and locate the ocrButtonClick function near the bottom of the file.
The ocrButtonClick event handler function clears the form, displays the image specified in the URL, then calls the
ReadOcrImage function to analyze the image. Copy and paste the following code into the ocrButtonClick
function.
The ReadOcrImage function wraps the REST API call to analyze an image. Upon a successful return, the
formatted JSON describing the text and the location of the text will be displayed in the specified textarea.
Copy and paste the following ReadOcrImage function to just underneath the ocrButtonClick function.
/* Recognize and read printed text in an image at the specified URL by using Microsoft Cognitive
* Services OCR API.
* @param {string} sourceImageUrl - The URL to the image to analyze for printed text.
* @param {<textarea> element} responseTextArea - The text area to display the JSON string returned
* from the REST API call, or to display the error message if there was
* an error.
*/
function ReadOcrImage(sourceImageUrl, responseTextArea) {
// Request parameters.
var params = {
"language": "unk",
"detectOrientation ": "true",
};
// Perform the REST API call.
$.ajax({
url: common.uriBasePreRegion +
$("#subscriptionRegionSelect").val() +
common.uriBasePostRegion +
common.uriBaseOcr +
"?" +
$.param(params),
// Request headers.
beforeSend: function(jqXHR){
jqXHR.setRequestHeader("Content-Type","application/json");
jqXHR.setRequestHeader("Ocp-Apim-Subscription-Key",
encodeURIComponent($("#subscriptionKeyInput").val()));
},
type: "POST",
// Request body.
data: '{"url": ' + '"' + sourceImageUrl + '"}',
})
.done(function(data) {
// Show formatted JSON on webpage.
responseTextArea.val(JSON.stringify(data, null, 2));
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Put the JSON description into the text area.
responseTextArea.val(JSON.stringify(jqXHR, null, 2));
// Display error message.
var errorString = (errorThrown === "") ? "Error. " : errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" : (jQuery.parseJSON(jqXHR.responseText).message) ?
jQuery.parseJSON(jqXHR.responseText).message : jQuery.parseJSON(jqXHR.responseText).error.message;
alert(errorString);
});
}
Run the OCR function
Read handwritten text (handwriting recognition)
Save the ocr.html file and open it in a Web browser. Put your subscription key into the Subscription Key field and
verify that you are using the correct region in Subscription Region. Enter a URL to an image of text to read, then
click the Read Image button to analyze an image and see the result.
The Handwriting Recognition feature of Computer Vision analyzes an image of handwritten text. After the analysis
is complete, Handwriting Recognition returns a JSON object that contains the text and the location of the text in the
image.
To complete the Handwriting Recognition feature of the application, perform the following steps:
Add the event handler code for the handwriting button
Open the handwriting.html file in a text editor and locate the handwritingButtonClick function near the bottom
of the file.
The handwritingButtonClick event handler function clears the form, displays the image specified in the URL,
then calls the ReadHandwrittenImage function to analyze the image.
Copy and paste the following code into the handwritingButtonClick function.
function handwritingButtonClick() {
// Clear the display fields.
$("#sourceImage").attr("src", "#");
$("#responseTextArea").val("");
// Display the image.
var sourceImageUrl = $("#inputImage").val();
$("#sourceImage").attr("src", sourceImageUrl);
ReadHandwrittenImage(sourceImageUrl, $("#responseTextArea"));
}
Add the wrapper for the REST API call
The ReadHandwrittenImage function wraps the two REST API calls needed to analyze an image. Because
Handwriting Recognition is a time-consuming process, a two-step process is used. The first call submits the image
for processing; the second call retrieves the detected text when the processing is complete.
After the text is retrieved, the formatted JSON describing the text and the location of the text will be displayed in
the specified textarea.
Copy and paste the following ReadHandwrittenImage function to just underneath the handwritingButtonClick
function.
/* Recognize and read text from an image of handwriting at the specified URL by using Microsoft
* Cognitive Services Recognize Handwritten Text API.
* @param {string} sourceImageUrl - The URL to the image to analyze for handwriting.
* @param {<textarea> element} responseTextArea - The text area to display the JSON string returned
* from the REST API call, or to display the error message if there was
* an error.
*/
function ReadHandwrittenImage(sourceImageUrl, responseTextArea) {
// Request parameters.
var params = {
"handwriting": "true",
};
// This operation requires two REST API calls. One to submit the image for processing,
// the other to retrieve the text found in the image.
//
// Perform the first REST API call to submit the image for processing.
$.ajax({
url: common.uriBasePreRegion +
$("#subscriptionRegionSelect").val() +
common.uriBasePostRegion +
common.uriBaseHandwriting +
"?" +
$.param(params),
// Request headers.
beforeSend: function(jqXHR){
jqXHR.setRequestHeader("Content-Type","application/json");
jqXHR.setRequestHeader("Ocp-Apim-Subscription-Key",
encodeURIComponent($("#subscriptionKeyInput").val()));
},
type: "POST",
// Request body.
data: '{"url": ' + '"' + sourceImageUrl + '"}',
})
.done(function(data, textStatus, jqXHR) {
// Show progress.
responseTextArea.val("Handwritten image submitted.");
// Note: The response may not be immediately available. Handwriting Recognition is an
// async operation that can take a variable amount of time depending on the length
// of the text you want to recognize. You may need to wait or retry this GET operation.
//
// Try once per second for up to ten seconds to receive the result.
var tries = 10;
var waitTime = 1000;
var taskCompleted = false;
var timeoutID = setInterval(function () {
// Limit the number of calls.
if (--tries <= 0) {
window.clearTimeout(timeoutID);
responseTextArea.val("The response was not available in the time allowed.");
return;
}
// The "Operation-Location" in the response contains the URI to retrieve the recognized text.
var operationLocation = jqXHR.getResponseHeader("Operation-Location");
// Perform the second REST API call and get the response.
$.ajax({
url: operationLocation,
// Request headers.
beforeSend: function(jqXHR){
jqXHR.setRequestHeader("Content-Type","application/json");
jqXHR.setRequestHeader("Ocp-Apim-Subscription-Key",
encodeURIComponent($("#subscriptionKeyInput").val()));
},
type: "GET",
})
.done(function(data) {
// If the result is not yet available, return.
if (data.status && (data.status === "NotStarted" || data.status === "Running")) {
return;
}
// Show formatted JSON on webpage.
responseTextArea.val(JSON.stringify(data, null, 2));
// Indicate the task is complete and clear the timer.
taskCompleted = true;
window.clearTimeout(timeoutID);
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Indicate the task is complete and clear the timer.
taskCompleted = true;
window.clearTimeout(timeoutID);
// Display error message.
var errorString = (errorThrown === "") ? "Error. " : errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" :
(jQuery.parseJSON(jqXHR.responseText).message) ?
jQuery.parseJSON(jqXHR.responseText).message :
jQuery.parseJSON(jqXHR.responseText).error.message;
alert(errorString);
});
}, waitTime);
})
.fail(function(jqXHR, textStatus, errorThrown) {
// Put the JSON description into the text area.
responseTextArea.val(JSON.stringify(jqXHR, null, 2));
// Display error message.
var errorString = (errorThrown === "") ? "Error. " : errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" : (jQuery.parseJSON(jqXHR.responseText).message) ?
jQuery.parseJSON(jqXHR.responseText).message : jQuery.parseJSON(jqXHR.responseText).error.message;
alert(errorString);
});
}
Run the handwriting function
Save the handwriting.html file and open it in a Web browser. Put your subscription key into the Subscription
Key field and verify that you are using the correct region in Subscription Region. Enter a URL to an image of text
to read, then click the Read Image button to analyze an image and see the result.
Next steps
In this guide, you used the Computer Vision REST API with JavaScript to test many of the available image analysis
features. Next, see the reference documentation to learn more about the APIs involved.
Computer Vision REST API
Tutorial: Computer Vision API Python
2/15/2019 2 minutes to read
Prerequisites
Open the Tutorial Notebook in Jupyter
Run the Tutorial
# Variables
_region = 'westcentralus' #Here you enter the region of your subscription
_url = 'https://{}.api.cognitive.microsoft.com/vision/v2.0/analyze'.format(_region)
_key = None #Here you have to paste your primary key
_maxNumRetries = 10
This tutorial shows you how to use the Computer Vision API in Python and how to visualize your results using
popular libraries. You will use Jupyter to run the tutorial. To learn how to get started with interactive Jupyter
notebooks, refer to the Jupyter Documentation.
Python 2.7+ or 3.5+
pip tool
Jupyter Notebook installed
1. Go to the Cognitive Vision Python GitHub repo.
2. Click on the green button to clone or download the repo.
3. Open a command prompt and navigate to the folder Cognitive-Vision-Python\Jupyter Notebook.
4. Ensure you have all the required libraries by running the command
pip install requests opencv-python numpy matplotlib from the command prompt.
5. Start Jupyter by running the command jupyter notebook from the command prompt.
6. In the Jupyter window, click on Computer Vision API Example.ipynb to open the tutorial notebook.
To use this notebook, you will need a subscription key for the Computer Vision API. Visit the Subscription page to
sign up. On the Sign in page, use your Microsoft account to sign in and you will be able to subscribe and get free
keys. After completing the sign-up process, paste your key into the Variables section of the notebook (reproduced
below). Either the primary or the secondary key will work. Be sure to enclose the key in quotes to make it a string.
You will also need to make sure the _region field matches the region that corresponds to your subscription.
When you run the tutorial, you will be able to add images to analyze, both from a URL and from local storage. The
script will display the images and analysis information in the notebook.
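For orientation, the variables above might be used along the following lines. This is a simplified sketch rather than the notebook's actual code; it assumes the requests library installed earlier and an image URL of your choosing.
import requests

# Assumes _url and _key from the Variables section have been filled in.
headers = {'Ocp-Apim-Subscription-Key': _key, 'Content-Type': 'application/json'}
params = {'visualFeatures': 'Categories,Description,Color'}

response = requests.post(_url, headers=headers, params=params, json={'url': '<image URL>'})
analysis = response.json()

# Print the first caption, if the service returned one.
captions = analysis.get('description', {}).get('captions', [])
if captions:
    print(captions[0]['text'], captions[0]['confidence'])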
Example: How to call the Computer Vision API
4/18/2019 5 minutes to read
Prerequisites
Authorize the API call
Upload an image to the Computer Vision API service and get back tags,
descriptions and celebrities
This guide demonstrates how to call Computer Vision API using REST. The samples are written both in C# using
the Computer Vision API client library, and as HTTP POST/GET calls. We will focus on:
How to get "Tags", "Description" and "Categories".
How to get "Domain-specific" information (celebrities).
Image URL or path to locally stored image.
Supported input methods: Raw image binary in the form of an application/octet-stream, or an image URL
Supported image formats: JPEG, PNG, GIF, BMP
Image file size: Less than 4MB
Image dimension: Greater than 50 x 50 pixels
In the examples below, the following features are demonstrated:
1. Analyzing an image and getting an array of tags and a description returned.
2. Analyzing an image with a domain-specific model (specifically, the "celebrities" model) and getting the
corresponding result returned as JSON.
The features are broken down into:
Option One: Scoped Analysis - Analyze only a given model
Option Two: Enhanced Analysis - Analyze to provide additional details with 86-categories taxonomy
Every call to the Computer Vision API requires a subscription key. This key needs to be either passed through a
query string parameter or specified in the request header.
To obtain a free trial key, see Try Cognitive Services. Or, follow the instructions in Create a Cognitive Services
account to subscribe to Computer Vision and get your key.
1. Passing the subscription key through a query string parameter, as in the following Computer Vision API example:
https://westus.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Description,Tags&subscription-key=
<Your subscription key>
2. The subscription key can also be specified in the HTTP request header:
ocp-apim-subscription-key: <Your subscription key>
3. When using the client library, the subscription key is passed in through the constructor of VisionServiceClient:
var visionClient = new VisionServiceClient("Your subscriptionKey");
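As a hedged Python sketch of the two REST options (placeholder key and image values, requests library assumed):
import requests

key = '<Your subscription key>'
url = 'https://westus.api.cognitive.microsoft.com/vision/v2.0/analyze'
body = {'url': '<image URL>'}

# Key passed as a query string parameter.
requests.post(url, params={'visualFeatures': 'Description,Tags', 'subscription-key': key}, json=body)

# Key passed in the request header instead.
requests.post(url, params={'visualFeatures': 'Description,Tags'},
              headers={'Ocp-Apim-Subscription-Key': key}, json=body)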
The basic way to perform the Computer Vision API call is by uploading an image directly. This is done by sending a
"POST" request like the following:
POST https://westus.api.cognitive.microsoft.com/vision/v2.0/analyze?
visualFeatures=Description,Tags&subscription-key=<Your subscription key>
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;
using System.IO;
AnalysisResult analysisResult;
var features = new VisualFeature[] { VisualFeature.Tags, VisualFeature.Description };
using (var fs = new FileStream(@"C:\Vision\Sample.jpg", FileMode.Open))
{
analysisResult = await visionClient.AnalyzeImageAsync(fs, features);
}
Tags only:
POST https://westus.api.cognitive.microsoft.com/vision/v2.0/tag&subscription-key=<Your subscription key>
var analysisResult = await visionClient.GetTagsAsync("http://contoso.com/example.jpg");
Description only:
POST https://westus.api.cognitive.microsoft.com/vision/v2.0/describe&subscription-key=<Your subscription key>
using (var fs = new FileStream(@"C:\Vision\Sample.jpg", FileMode.Open))
{
analysisResult = await visionClient.DescribeAsync(fs);
}
Get domain-specific analysis (celebrities)
POST https://westus.api.cognitive.microsoft.com/vision/v2.0/models/celebrities/analyze
var celebritiesResult = await visionClient.AnalyzeImageInDomainAsync(url, "celebrities");
GET https://westus.api.cognitive.microsoft.com/vision/v2.0/models
var models = await visionClient.ListModelsAsync();
"POST" request with application/octet-stream content type together with the data read from the image. For "Tags"
and "Description", this upload method will be the same for all the Computer Vision API calls. The only difference
will be the query parameters the user specifies.
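A hedged Python sketch of the raw-binary upload (requests library assumed; the key and file path are placeholders):
import requests

url = 'https://westus.api.cognitive.microsoft.com/vision/v2.0/analyze'
headers = {'Ocp-Apim-Subscription-Key': '<Your subscription key>',
           'Content-Type': 'application/octet-stream'}
params = {'visualFeatures': 'Description,Tags'}

# Send the image bytes directly instead of a JSON body containing a URL.
with open('Sample.jpg', 'rb') as image_file:
    response = requests.post(url, headers=headers, params=params, data=image_file.read())
print(response.json())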
Here’s how to get "Tags" and "Description" for a given image:
Option One: Get list of "Tags" and one "Description"
Option Two: Get list of "Tags" only, or list of "Description" only:
Option One: Scoped Analysis - Analyze only a given model
For this option, all other query parameters {visualFeatures, details} are not valid. If you want to see all supported
models, use:
Option Two: Enhanced Analysis - Analyze to provide additional details with 86-categories taxonomy
For applications where you want to get generic image analysis in addition to details from one or more domain-
specific models, we extend the v1 API with the models query parameter.
POST https://westus.api.cognitive.microsoft.com/vision/v2.0/analyze?details=celebrities
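As a hedged Python sketch of the two options side by side (placeholder key and image values, requests library assumed):
import requests

base = 'https://westus.api.cognitive.microsoft.com/vision/v2.0'
headers = {'Ocp-Apim-Subscription-Key': '<Your subscription key>', 'Content-Type': 'application/json'}
body = {'url': '<image URL>'}

# Option One: scoped analysis against a single domain-specific model.
scoped = requests.post(base + '/models/celebrities/analyze', headers=headers, json=body)

# Option Two: enhanced analysis; generic analysis plus domain-specific details.
enhanced = requests.post(base + '/analyze', headers=headers,
                         params={'details': 'celebrities'}, json=body)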
Retrieve and understand the JSON output for analysis
{
"tags":[
{
"name":"outdoor",
"score":0.976
},
{
"name":"bird",
"score":0.95
}
],
"description":{
"tags":[
"outdoor",
"bird"
],
"captions":[
{
"text":"partridge in a pear tree",
"confidence":0.96
}
]
}
}
FIELD TYPE CONTENT
Tags object Top-level object for array of tags
tags[].Name string Keyword from tags classifier
tags[].Score number Confidence score, between 0 and 1.
description object Top-level object for a description.
description.tags[] string List of tags. If there is insufficient
confidence in the ability to produce a
caption, the tags may be the only
information available to the caller.
description.captions[].text string A phrase describing the image.
When this method is invoked, we will call the 86-category classifier first. If any of the categories match that of a
known/matching model, a second pass of classifier invocations will occur. For example, if "details=all", or "details"
includes 'celebrities', we will call the celebrities model after the 86-category classifier is called and the result includes
the category person. This will increase latency for users interested in celebrities, compared to Option One.
All v1 query parameters will behave the same in this case. If visualFeatures=categories is not specified, it will be
implicitly enabled.
Here's an example:
description.captions[].confidence number Confidence for the phrase.
Retrieve and understand the JSON output of domain-specific models
{
"result":[
{
"name":"golden retriever",
"score":0.98
},
{
"name":"Labrador retriever",
"score":0.78
}
]
}
{
"requestId":"87e44580-925a-49c8-b661-d1c54d1b83b5",
"metadata":{
"width":640,
"height":430,
"format":"Jpeg"
},
"result":{
"celebrities":[
{
"name":"Richard Nixon",
"faceRectangle":{
"left":107,
"top":98,
"width":165,
"height":165
},
"confidence":0.9999827
}
]
}
}
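As a small hedged sketch, once a response shaped like the example above has been parsed into a Python dictionary, the domain-specific details can be read out of its result object; the helper name and the analysis variable are hypothetical:
# Hypothetical helper, assuming analysis is a dict parsed from JSON shaped like the example above.
def list_celebrities(analysis):
    for celebrity in analysis.get('result', {}).get('celebrities', []):
        # Each entry carries a name, a face rectangle, and a confidence score.
        print(celebrity['name'], celebrity['confidence'], celebrity['faceRectangle'])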
FIELD TYPE CONTENT
categories object Top-level object
Option One: Scoped Analysis - Analyze only a given model
The output will be an array of tags, as shown in the first example above.
Option Two: Enhanced Analysis - Analyze to provide additional details with 86-categories taxonomy
For domain-specific models using Option Two (Enhanced Analysis), the categories return type is extended, as
shown in the second example above.
The categories field is a list of one or more of the 86-categories in the original taxonomy. Note also that categories
ending in an underscore will match that category and its children (for example, people_ as well as people_group, for
celebrities model).
categories[].name string Name from 86-category taxonomy
categories[].score number Confidence score, between 0 and 1
categories[].detail object? Optional detail object
Note that if multiple categories match (for example, 86-category classifier returns a score for both people_ and
people_young when model=celebrities), the details are attached to the most general level match (people_ in that
example.)
Error Responses
These are identical to vision.analyze, with the addition of the NotSupportedModel error (HTTP 400), which may
be returned in both Option One and Option Two scenarios. For Option Two (Enhanced Analysis), if any of the
models specified in details are not recognized, the API will return a NotSupportedModel, even if one or more of
them are valid. Users can call listModels to find out what models are supported.
Next steps
To use the REST API, go to Computer Vision API Reference.
IMPORTANT
Prerequisites
REQUIRED PURPOSE
Docker Engine You need the Docker Engine installed on a host computer.
Docker provides packages that configure the Docker
environment on macOS, Windows, and Linux. For a primer on
Docker and container basics, see the Docker overview.
Docker must be configured to allow the containers to connect
with and send billing data to Azure.
On Windows, Docker must also be configured to support
Linux containers.
Familiarity with Docker You should have a basic understanding of Docker concepts,
like registries, repositories, containers, and container images,
as well as knowledge of basic docker commands.
Azure Cognitive Services resource In order to use the container, you must have:
A Cognitive Services Azure resource and the associated billing
key and the billing endpoint URI. Both values are available on the
Overview and Keys pages for the resource and are required to
start the container. You need to add the vision/v2.0
routing to the endpoint URI as shown in the following
BILLING_ENDPOINT_URI example.
{BILLING_KEY}: resource key
{BILLING_ENDPOINT_URI}: endpoint URI example is:
https://westus.api.cognitive.microsoft.com/vision/v2.0
Request access to the private container registry
The Recognize Text portion of Computer Vision is also available as a Docker container. It allows you to detect and
extract printed text from images of various objects with different surfaces and backgrounds, such as receipts,
posters, and business cards.
The Recognize Text container currently works only with English.
If you don't have an Azure subscription, create a free account before you begin.
You must meet the following prerequisites before using Recognize Text containers:
Fill out and submit the Cognitive Services Vision Containers Request form to request access to the container. The
form requests information about you, your company, and the user scenario for which you'll use the container. After
you submit the form, the Azure Cognitive Services team reviews it to make sure that you meet the criteria for
access to the private container registry.
IMPORTANT
You must use an email address associated with either a Microsoft Account (MSA) or an Azure Active Directory (Azure AD)
account in the form.
If your request is approved, you receive an email with instructions that describe how to obtain your credentials and
access the private container registry.
Log in to the private container registry
There are several ways to authenticate with the private container registry for Cognitive Services containers. We
recommend that you use the command-line method by using the Docker CLI.
Use the docker login command, as shown in the following example, to log in to containerpreview.azurecr.io,
which is the private container registry for Cognitive Services containers. Replace <username> with the user name
and <password> with the password provided in the credentials you received from the Azure Cognitive Services
team.
docker login containerpreview.azurecr.io -u <username> -p <password>
If you secured your credentials in a text file, you can concatenate the contents of that text file to the docker login
command. Use the cat command, as shown in the following example. Replace <passwordFile> with the path and
name of the text file that contains the password. Replace <username> with the user name provided in your
credentials.
cat <passwordFile> | docker login containerpreview.azurecr.io -u <username> --password-stdin
The host computer
The host is an x64-based computer that runs the Docker container. It can be a computer on your premises or a
Docker hosting service in Azure, such as:
Azure Kubernetes Service.
Azure Container Instances.
A Kubernetes cluster deployed to Azure Stack. For more information, see Deploy Kubernetes to Azure Stack.
Container requirements and recommendations
The following table describes the minimum and recommended CPU cores and memory to allocate for each
Recognize Text container.
CONTAINER MINIMUM RECOMMENDED TPS (MINIMUM, MAXIMUM)
Recognize Text 1 core, 8 GB memory, 0.5 TPS 2 cores, 8 GB memory, 1 TPS 0.5, 1
Each core must be at least 2.6 gigahertz (GHz) or faster.
TPS - transactions per second
Core and memory correspond to the --cpus and --memory settings, which are used as part of the docker run
command.
Get the container image with docker pull
CONTAINER REPOSITORY
Recognize Text containerpreview.azurecr.io/microsoft/cognitive-services-recognize-text:latest
Docker pull for the Recognize Text container
docker pull containerpreview.azurecr.io/microsoft/cognitive-services-recognize-text:latest
TIP
docker images --format "table {{.ID}}\t{{.Repository}}\t{{.Tag}}"
IMAGE ID REPOSITORY TAG
ebbee78a6baa <container-name> latest
How to use the container
Run the container with docker run
PLACEHOLDER VALUE
{BILLING_KEY} This key is used to start the container, and is available on the
Azure Cognitive Services Keys page.
{BILLING_ENDPOINT_URI} The billing endpoint URI value. Example is:
https://westus.api.cognitive.microsoft.com/vision/v2.0
Container images for Recognize Text are available.
Use the docker pull command to download a container image.
You can use the docker images command to list your downloaded container images. For example, the following command
lists the ID, repository, and tag of each downloaded container image, formatted as a table:
Once the container is on the host computer, use the following process to work with the container.
1. Run the container, with the required billing settings. More examples of the docker run command are available.
2. Query the container's prediction endpoint.
Use the docker run command to run the container. The command uses the following parameters:
You need to add the vision/v2.0 routing to the endpoint URI as shown in the following
BILLING_ENDPOINT_URI example.
Replace these parameters with your own values in the following example docker run command.
docker run --rm -it -p 5000:5000 --memory 4g --cpus 1 \
containerpreview.azurecr.io/microsoft/cognitive-services-recognize-text \
Eula=accept \
Billing={BILLING_ENDPOINT_URI} \
ApiKey={BILLING_KEY}
IMPORTANT
Run multiple containers on the same host
Query the container's prediction endpoint
Asynchronous text recognition
Synchronous text recognition
Validate that a container is running
REQUEST PURPOSE
http://localhost:5000/ The container provides a home page.
This command:
Runs a recognize container from the container image
Allocates one CPU core and 4 gigabytes (GB) of memory
Exposes TCP port 5000 and allocates a pseudo-TTY for the container
Automatically removes the container after it exits. The container image is still available on the host computer.
More examples of the docker run command are available.
The Eula , Billing , and ApiKey options must be specified to run the container; otherwise, the container won't start. For
more information, see Billing.
If you intend to run multiple containers with exposed ports, make sure to run each container with a different
exposed port. For example, run the first container on port 5000 and the second container on port 5001.
You can have this container and a different Azure Cognitive Services container running on the HOST together. You
also can have multiple containers of the same Cognitive Services container running.
The container provides REST-based query prediction endpoint APIs.
Use the host, https://localhost:5000 , for container APIs.
You can use the POST /vision/v2.0/recognizeText and GET /vision/v2.0/textOperations/*{id}* operations in
concert to asynchronously recognize printed text in an image, similar to how the Computer Vision service uses
those corresponding REST operations. The Recognize Text container only recognizes printed text, not handwritten
text, at this time, so the mode parameter normally specified for the Computer Vision service operation is ignored
by the Recognize Text container.
You can use the POST /vision/v2.0/recognizeTextDirect operation to synchronously recognize printed text in an
image. Because this operation is synchronous, the request body for this operation is the same as that for the
POST /vision/v2.0/recognizeText operation, but the response body for this operation is the same as that returned
by the GET /vision/v2.0/textOperations/*{id}* operation.
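As a hedged Python sketch of the synchronous variant against a locally running container (requests library assumed, default port from the docker run example above, placeholder image URL):
import requests

# Placeholder image URL; the container listens on the port exposed in the docker run example above.
response = requests.post('http://localhost:5000/vision/v2.0/recognizeTextDirect',
                         headers={'Content-Type': 'application/json'},
                         json={'url': '<image URL>'})
print(response.json())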
There are several ways to validate that the container is running.
http://localhost:5000/status Requested with GET, to validate that the container is running
without causing an endpoint query. This request can be used
for Kubernetes liveness and readiness probes.
http://localhost:5000/swagger The container provides a full set of documentation for the
endpoints and a Try it now feature. With this feature, you
can enter your settings into a web-based HTML form and
make the query without having to write any code. After the
query returns, an example CURL command is provided to
demonstrate the HTTP headers and body format that's
required.
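For example, a minimal liveness check in Python (requests library assumed):
import requests

# Returns HTTP 200 when the container is running, without causing an endpoint query.
print(requests.get('http://localhost:5000/status').status_code)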
Stop the container
Troubleshooting
Billing
Connect to Azure
To shut down the container, in the command-line environment where the container is running, select Ctrl+C.
If you run the container with an output mount and logging enabled, the container generates log files that are
helpful to troubleshoot issues that happen while starting or running the container.
The Recognize Text containers send billing information to Azure, using a Recognize Text resource on your Azure
account.
Queries to the container are billed at the pricing tier of the Azure resource that's used for the <ApiKey> .
Azure Cognitive Services containers aren't licensed to run without being connected to the billing endpoint for
metering. You must enable the containers to communicate billing information with the billing endpoint at all times.
Cognitive Services containers don't send customer data, such as the image or text that's being analyzed, to
Microsoft.
Billing arguments
OPTION DESCRIPTION
ApiKey The API key of the Cognitive Services resource that's used to
track billing information.
The value of this option must be set to an API key for the
provisioned resource that's specified in Billing .
Billing The endpoint of the Cognitive Services resource that's used to
track billing information.
The value of this option must be set to the endpoint URI of a
provisioned Azure resource.
Eula Indicates that you accepted the license for the container.
The value of this option must be set to accept.
Blog posts
Developer samples
View webinar
Summary
The container needs the billing argument values to run. These values allow the container to connect to the billing
endpoint. The container reports usage about every 10 to 15 minutes. If the container doesn't connect to Azure
within the allowed time window, the container continues to run but doesn't serve queries until the billing endpoint
is restored. The connection is attempted 10 times at the same time interval of 10 to 15 minutes. If it can't connect
to the billing endpoint within the 10 tries, the container stops running.
For the docker run command to start the container, all three of the following options must be specified with valid
values:
For more information about these options, see Configure containers.
Running Cognitive Services Containers
Getting started with Cognitive Services Language Understanding container
Developer samples are available at our GitHub repository.
Join the webinar to learn about:
How to deploy Cognitive Services to any machine using Docker
How to deploy Cognitive Services to AKS
In this article, you learned concepts and workflow for downloading, installing, and running Recognize Text
containers. In summary:
Recognize Text provides a Linux container for Docker, encapsulating Recognize Text.
Container images are downloaded from the Microsoft Container Registry (MCR) in Azure.
Container images run in Docker.
You can use either the REST API or SDK to call operations in Recognize Text containers by specifying the host
URI of the container.
You must specify billing information when instantiating a container.
IMPORTANT
Next steps
Cognitive Services containers are not licensed to run without being connected to Azure for metering. Customers need to
enable the containers to communicate billing information with the metering service at all times. Cognitive Services containers
do not send customer data (for example, the image or text that is being analyzed) to Microsoft.
Review Configure containers for configuration settings
Review Computer Vision overview to learn more about recognizing printed and handwritten text
Refer to the Computer Vision API for details about the methods supported by the container.
Refer to Frequently asked questions (FAQ) to resolve issues related to Computer Vision functionality.
Use more Cognitive Services Containers
Configure Recognize Text Docker containers
6/11/2019 • 7 minutes to read
Configuration settings
The Recognize Text container runtime environment is configured using the docker run command arguments. This container has several required
settings, along with a few optional settings. Several examples of the command are available. The container-specific settings are the billing settings.
The container has the following configuration settings:
REQUIRED SETTING PURPOSE
Yes ApiKey Tracks billing information.
No ApplicationInsights Enables adding Azure Application Insights telemetry
support to your container.
Yes Billing Specifies the endpoint URI of the service resource on
Azure.
Yes Eula Indicates that you've accepted the license for the
container.
No Fluentd Writes log and, optionally, metric data to a Fluentd
server.
No Http Proxy Configures an HTTP proxy for making outbound
requests.
No Logging Provides ASP.NET Core logging support for your
container.
No Mounts Reads and writes data from the host computer to the
container and from the container back to the host
computer.
IMPORTANT
The ApiKey , Billing , and Eula settings are used together, and you must provide valid values for all three of them; otherwise your container won't start. For
more information about using these configuration settings to instantiate a container, see Billing.
ApiKey configuration setting
The ApiKey setting specifies the Azure Cognitive Services resource key used to track billing information for the container. You must specify a value
for the ApiKey and the value must be a valid key for the Cognitive Services resource specified for the Billing configuration setting.
This setting can be found in the following place:
Azure portal: Cognitive Services Resource Management, under Keys
ApplicationInsights setting
The ApplicationInsights setting allows you to add Azure Application Insights telemetry support to your container. Application Insights provides in-
depth monitoring of your container. You can easily monitor your container for availability, performance, and usage. You can also quickly identify and
diagnose errors in your container.
The following table describes the configuration settings supported under the ApplicationInsights section.
REQUIRED NAME DATA TYPE DESCRIPTION
No InstrumentationKey String The instrumentation key of the
Application Insights instance to which
telemetry data for the container is
sent. For more information, see
Application Insights for ASP.NET Core.
Example:
InstrumentationKey=123456789
Billing configuration setting
The Billing setting specifies the endpoint URI of the Cognitive Services resource on Azure used to meter billing information for the container. You
must specify a value for this configuration setting, and the value must be a valid endpoint URI for a Cognitive Services resource on Azure. The
container reports usage about every 10 to 15 minutes.
This setting can be found in the following place:
Azure portal: Cognitive Services Overview, labeled Endpoint
Remember to add the vision/v1.0 routing to the endpoint URI as shown in the following table.
REQUIRED NAME DATA TYPE DESCRIPTION
Yes Billing String Billing endpoint URI
Example:
Billing=https://westcentralus.api.cognitive.microsoft.com/vision/v1.0
Eula setting
The Eula setting indicates that you've accepted the license for the container. You must specify a value for this configuration setting, and the value
must be set to accept .
Cognitive Services containers are licensed under your agreement governing your use of Azure. If you do not have an existing agreement governing
your use of Azure, you agree that your agreement governing use of Azure is the Microsoft Online Subscription Agreement, which incorporates the
Online Services Terms. For previews, you also agree to the Supplemental Terms of Use for Microsoft Azure Previews. By using the container you
agree to these terms.
REQUIRED NAME DATA TYPE DESCRIPTION
Yes Eula String License acceptance
Example:
Eula=accept
Fluentd settings
Fluentd is an open-source data collector for unified logging. The Fluentd settings manage the container's connection to a Fluentd server. The
container includes a Fluentd logging provider, which allows your container to write logs and, optionally, metric data to a Fluentd server.
The following table describes the configuration settings supported under the Fluentd section.
NAME DATA TYPE DESCRIPTION
Host String The IP address or DNS host name of the Fluentd
server.
Port Integer The port of the Fluentd server.
The default value is 24224.
HeartbeatMs Integer The heartbeat interval, in milliseconds. If no event
traffic has been sent before this interval expires, a
heartbeat is sent to the Fluentd server. The default
value is 60000 milliseconds (1 minute).
SendBufferSize Integer The network buffer space, in bytes, allocated for send
operations. The default value is 32768 bytes (32
kilobytes).
TlsConnectionEstablishmentTimeoutMs Integer The timeout, in milliseconds, to establish a SSL/TLS
connection with the Fluentd server. The default value
is 10000 milliseconds (10 seconds).
If UseTLS is set to false, this value is ignored.
UseTLS Boolean Indicates whether the container should use SSL/TLS
for communicating with the Fluentd server. The
default value is false.
Http proxy credentials settings
If you need to configure an HTTP proxy for making outbound requests, use these two arguments:
NAME DATA TYPE DESCRIPTION
HTTP_PROXY string The proxy to use, for example, http://proxy:8888
HTTP_PROXY_CREDS string Any credentials needed to authenticate against the
proxy, for example, username:password.
<proxy-user> string The user for the proxy.
<proxy-password> string The password associated with <proxy-user> for the
proxy.
docker run --rm -it -p 5000:5000 \
--memory 2g --cpus 1 \
--mount type=bind,src=/home/azureuser/output,target=/output \
<registry-location>/<image-name> \
Eula=accept \
Billing=<billing-endpoint> \
ApiKey=<api-key> \
HTTP_PROXY=<proxy-url> \
HTTP_PROXY_CREDS=<proxy-user>:<proxy-password>
Logging settings
The Logging settings manage ASP.NET Core logging support for your container. You can use the same configuration settings and values for your
container that you use for an ASP.NET Core application.
The following logging providers are supported by the container:
PROVIDER PURPOSE
Console The ASP.NET Core Console logging provider. All of the ASP.NET Core
configuration settings and default values for this logging provider are supported.
Debug The ASP.NET Core Debug logging provider. All of the ASP.NET Core configuration
settings and default values for this logging provider are supported.
Disk The JSON logging provider. This logging provider writes log data to the output
mount.
This container command stores logging information in the JSON format to the output mount:
docker run --rm -it -p 5000:5000 \
--memory 2g --cpus 1 \
--mount type=bind,src=/home/azureuser/output,target=/output \
<registry-location>/<image-name> \
Eula=accept \
Billing=<billing-endpoint> \
ApiKey=<api-key> \
Logging:Disk:Format=json
This container command shows debugging information, prefixed with dbug , while the container is running:
docker run --rm -it -p 5000:5000 \
--memory 2g --cpus 1 \
<registry-location>/<image-name> \
Eula=accept \
Billing=<billing-endpoint> \
ApiKey=<api-key> \
Logging:Console:LogLevel:Default=Debug
Disk logging
The Disk logging provider supports the following configuration settings:
NAME DATA TYPE DESCRIPTION
Format String The output format for log files.
Note: This value must be set to json to enable the
logging provider. If this value is specified without also
specifying an output mount while instantiating a
container, an error occurs.
MaxFileSize Integer The maximum size, in megabytes (MB), of a log file.
When the size of the current log file meets or exceeds
this value, a new log file is started by the logging
provider. If -1 is specified, the size of the log file is
limited only by the maximum file size, if any, for the
output mount. The default value is 1.
For more information about configuring ASP.NET Core logging support, see Settings file configuration.
Mount settings
Use bind mounts to read and write data to and from the container. You can specify an input mount or output mount by specifying the --mount option
in the docker run command.
The Computer Vision containers don't use input or output mounts to store training or service data.
The exact syntax of the host mount location varies depending on the host operating system. Additionally, the host computer's mount location may not
be accessible due to a conflict between permissions used by the Docker service account and the host mount location permissions.
OPTIONAL NAME DATA TYPE DESCRIPTION
Not allowed Input String Computer Vision containers do not
use this.
Optional Output String The target of the output mount. The
default value is /output . This is the
location of the logs. This includes
container logs.
Example:
--mount
type=bind,src=c:\output,target=/output
Example docker run commands
The following examples use the configuration settings to illustrate how to write and use docker run commands. Once running, the container
continues to run until you stop it.
Line-continuation character: The Docker commands in the following sections use the back slash, \, as a line continuation character. Replace or
remove this based on your host operating system's requirements.
Argument order: Do not change the order of the arguments unless you are very familiar with Docker containers.
Replace {argument_name} with your own values:
Remember to add the vision/v1.0 routing to the endpoint URI as shown in the following table.
PLACEHOLDER VALUE FORMAT OR EXAMPLE
{BILLING_KEY} The endpoint key of the Cognitive Services resource. xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
{BILLING_ENDPOINT_URI} The billing endpoint value including region. https://westcentralus.api.cognitive.microsoft.com/vision/v1.0
IMPORTANT
The Eula , Billing , and ApiKey options must be specified to run the container; otherwise, the container won't start. For more information, see Billing. The
ApiKey value is the Key from the Azure Cognitive Services Resource keys page.
Recognize text container Docker examples
The following Docker examples are for the recognize text container.
Basic example
docker run --rm -it -p 5000:5000 --memory 4g --cpus 1 \
containerpreview.azurecr.io/microsoft/cognitive-services-recognize-text \
Eula=accept \
Billing={BILLING_ENDPOINT_URI} \
ApiKey={BILLING_KEY}
Logging example
docker run --rm -it -p 5000:5000 --memory 4g --cpus 1 \
containerpreview.azurecr.io/microsoft/cognitive-services-recognize-text \
Eula=accept \
Billing={BILLING_ENDPOINT_URI} \
ApiKey={BILLING_KEY} \
Logging:Console:LogLevel:Default=Information
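After the container starts, you can send Recognize Text requests to the container's own endpoint instead of the Azure endpoint; no Ocp-Apim-Subscription-Key header is needed because the container itself reports billing. The following C# snippet is only a sketch of that idea: it assumes the port mapping used in the examples above (-p 5000:5000) and assumes the container exposes the Recognize Text route at /vision/v2.0/recognizeText. Verify the exact route for your container image version before relying on it.
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
class RecognizeTextContainerClient
{
    // Assumed local route; adjust the host, port, and path to match your container.
    const string ContainerUri = "http://localhost:5000/vision/v2.0/recognizeText?mode=Printed";
    static async Task Main()
    {
        byte[] imageBytes = File.ReadAllBytes("printed-text.jpg");
        using (var client = new HttpClient())
        using (var content = new ByteArrayContent(imageBytes))
        {
            content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
            // The container is assumed to mirror the cloud Recognize Text API:
            // it returns 202 Accepted plus an Operation-Location header to poll for results.
            HttpResponseMessage response = await client.PostAsync(ContainerUri, content);
            Console.WriteLine($"Status: {response.StatusCode}");
            if (response.Headers.TryGetValues("Operation-Location", out var locations))
            {
                foreach (var location in locations)
                    Console.WriteLine($"Poll for results at: {location}");
            }
        }
    }
}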
Next steps
Review How to install and run containers
Use Connected Services in Visual Studio to connect
to the Computer Vision API
5/16/2019 5 minutes to read Edit Online
This article and its companion articles provide details for using the Visual Studio Connected Service feature for
Cognitive Services Computer Vision API. The capability is available in Visual Studio 2017 version 15.7 or later, with
the Cognitive Services extension installed.
Prerequisites
An Azure subscription. If you do not have one, you can sign up for a free account.
Visual Studio 2017 version 15.7 or later with the Web Development workload installed. Download it now.
Install the Cognitive Services VSIX Extension
1. With your web project open in Visual Studio, choose the Connected Services tab. The tab is available on
the welcome page that appears when you open a new project. If you don't see the tab, select Connected
Services in your project in Solution Explorer.
2. Scroll down to the bottom of the list of services, and select Find more services.
The Extensions and Updates dialog box appears.
3. In the Extensions and Updates dialog box, search for Cognitive Services, and then download and install
the Cognitive Services VSIX package.
Installing an extension requires a restart of the integrated development environment (IDE).
4. Restart Visual Studio. The extension installs when you close Visual Studio, and is available next time you
launch the IDE.
Add support to your project for Cognitive Services Computer Vision API
1. Create a new ASP.NET Core web project. Use the Empty project template.
2. In Solution Explorer, choose Add > Connected Service. The Connected Service page appears with
services you can add to your project.
3. In the menu of available services, choose Cognitive Services Computer Vision API.
If you've signed into Visual Studio, and have an Azure subscription associated with your account, a page
appears with a dropdown list with your subscriptions.
4. Select the subscription you want to use, and then choose a name for the Computer Vision API, or choose
the Edit link to modify the automatically generated name, choose the resource group, and the Pricing Tier.
Follow the link for details on the pricing tiers.
5. Choose Add to add support for the Connected Service. Visual Studio modifies your project to add the
NuGet packages, configuration file entries, and other changes to support a connection to the Computer Vision
API. The Output Window shows the log of what is happening to your project. You should see something like
the following:
[4/26/2018 5:15:31.664 PM] Adding Computer Vision API to the project.
[4/26/2018 5:15:32.084 PM] Creating new ComputerVision...
[4/26/2018 5:15:32.153 PM] Creating new Resource Group...
[4/26/2018 5:15:40.286 PM] Installing NuGet package
'Microsoft.Azure.CognitiveServices.Vision.ComputerVision' version 2.1.0.
[4/26/2018 5:15:44.117 PM] Retrieving keys...
[4/26/2018 5:15:45.602 PM] Changing appsettings.json setting: ComputerVisionAPI_ServiceKey=<service key>
[4/26/2018 5:15:45.606 PM] Changing appsettings.json setting:
ComputerVisionAPI_ServiceEndPoint=https://australiaeast.api.cognitive.microsoft.com/vision/v2.0
[4/26/2018 5:15:45.609 PM] Changing appsettings.json setting: ComputerVisionAPI_Name=WebApplication-
Core-ComputerVision_ComputerVisionAPI
[4/26/2018 5:15:46.747 PM] Successfully added Computer Vision API to the project.
Use the Computer Vision API to detect attributes of an image
1. Add the following using statements in Startup.cs.
using System.IO;
using System.Text;
using Microsoft.Extensions.Configuration;
using System.Net.Http;
using System.Net.Http.Headers;
2. Add a configuration field, and add a constructor that initializes the configuration field in the Startup class to
enable configuration in your program.
private IConfiguration configuration;
public Startup(IConfiguration configuration)
{
this.configuration = configuration;
}
3. In the wwwroot folder in your project, add an images folder, and add an image file to your wwwroot folder.
As an example, you can use one of the images on this Computer Vision API page. Right-click on one of the
images, save to your local hard drive, then in Solution Explorer, right-click on the images folder, and choose
Add > Existing Item to add it to your project. Your project should look something like this in Solution
Explorer:
4. Right-click on the image file, choose Properties, and then choose Copy if newer.
5. Replace the Configure method with the following code to access the Computer Vision API and test an
image.
// This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
public void Configure(IApplicationBuilder app, IHostingEnvironment env)
{
// TODO: Change this to your image's path on your site.
string imagePath = @"images/subway.png";
// Enable static files such as image files.
app.UseStaticFiles();
string visionApiKey = this.configuration["ComputerVisionAPI_ServiceKey"];
string visionApiEndPoint = this.configuration["ComputerVisionAPI_ServiceEndPoint"];
HttpClient client = new HttpClient();
// Request headers.
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", visionApiKey);
// Request parameters. A third optional parameter is "details".
string requestParameters = "visualFeatures=Categories,Description,Color&language=en";
// Assemble the URI for the REST API Call.
string uri = visionApiEndPoint + "/analyze" + "?" + requestParameters;
HttpResponseMessage response;
// Request body. Posts an image you've added to your site's images folder.
var fileInfo = env.WebRootFileProvider.GetFileInfo(imagePath);
byte[] byteData = GetImageAsByteArray(fileInfo.PhysicalPath);
string contentString = string.Empty;
using (ByteArrayContent content = new ByteArrayContent(byteData))
{
// This example uses content type "application/octet-stream".
// The other content types you can use are "application/json" and "multipart/form-data".
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
// Execute the REST API call.
response = client.PostAsync(uri, content).Result;
// Get the JSON response.
contentString = response.Content.ReadAsStringAsync().Result;
}
if (env.IsDevelopment())
{
app.UseDeveloperExceptionPage();
}
app.Run(async (context) =>
{
await context.Response.WriteAsync("<h1>Cognitive Services Demo</h1>");
await context.Response.WriteAsync($"<p><b>Test Image:</b></p>");
await context.Response.WriteAsync($"<div><img src=\"" + imagePath + "\" /></div>");
await context.Response.WriteAsync($"<p><b>Computer Vision API results:</b></p>");
await context.Response.WriteAsync("<p>");
await context.Response.WriteAsync(JsonPrettyPrint(contentString));
await context.Response.WriteAsync("<p>");
});
}
The code here constructs an HTTP request with the URI and the image as binary content for a call to the
Computer Vision REST API.
6. Add the helper functions GetImageAsByteArray and JsonPrettyPrint.
/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
FileStream fileStream = new FileStream(imageFilePath, FileMode.Open, FileAccess.Read);
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}
/// <summary>
/// Formats the given JSON string by adding line breaks and indents.
/// </summary>
/// <param name="json">The raw JSON string to format.</param>
/// <returns>The formatted JSON string.</returns>
static string JsonPrettyPrint(string json)
{
if (string.IsNullOrEmpty(json))
return string.Empty;
json = json.Replace(Environment.NewLine, "").Replace("\t", "");
string INDENT_STRING = " ";
var indent = 0;
var quoted = false;
var sb = new StringBuilder();
for (var i = 0; i < json.Length; i++)
{
var ch = json[i];
switch (ch)
{
case '{':
case '[':
sb.Append(ch);
if (!quoted)
{
sb.AppendLine();
}
break;
case '}':
case ']':
if (!quoted)
{
sb.AppendLine();
}
sb.Append(ch);
break;
case '"':
sb.Append(ch);
bool escaped = false;
var index = i;
while (index > 0 && json[--index] == '\\')
escaped = !escaped;
if (!escaped)
quoted = !quoted;
break;
case ',':
sb.Append(ch);
if (!quoted)
{
sb.AppendLine();
}
break;
case ':':
sb.Append(ch);
if (!quoted)
sb.Append(" ");
break;
default:
sb.Append(ch);
break;
}
}
return sb.ToString();
}
7. Run the web application and see what Computer Vision API found in your image.
Clean up resources
When no longer needed, delete the resource group. This deletes the cognitive service and related resources. To
delete the resource group through the portal:
1. Enter the name of your resource group in the Search box at the top of the portal. When you see the resource
group used in this quickstart in the search results, select it.
2. Select Delete resource group.
3. In the TYPE THE RESOURCE GROUP NAME: box type in the name of the resource group and select Delete.
Next steps
Learn more about the Computer Vision API by reading the Computer Vision API documentation.
How to analyze videos in real time
4/19/2019 6 minutes to read Edit Online
This guide will demonstrate how to perform near-real-time analysis on frames taken from a live video stream. The
basic components in such a system are:
Acquire frames from a video source
Select which frames to analyze
Submit these frames to the API
Consume each analysis result that is returned from the API call
These samples are written in C# and the code can be found on GitHub here:
https://github.com/Microsoft/Cognitive-Samples-VideoFrameAnalysis.
The Approach
There are multiple ways to solve the problem of running near-real-time analysis on video streams. We will start by
outlining three approaches in increasing levels of sophistication.
A Simple Approach
The simplest design for a near-real-time analysis system is an infinite loop, where in each iteration we grab a frame,
analyze it, and then consume the result:
while (true)
{
Frame f = GrabFrame();
if (ShouldAnalyze(f))
{
AnalysisResult r = await Analyze(f);
ConsumeResult(r);
}
}
If our analysis consisted of a lightweight client-side algorithm, this approach would be suitable. However, when our
analysis is happening in the cloud, the latency involved means that an API call might take several seconds, during
which time we are not capturing images, and our thread is essentially doing nothing. Our maximum frame-rate is
limited by the latency of the API calls.
Parallelizing API Calls
While a simple single-threaded loop makes sense for a lightweight client-side algorithm, it doesn't fit well with the
latency involved in cloud API calls. The solution to this problem is to allow the long-running API calls to execute in
parallel with the frame-grabbing. In C#, we could achieve this using Task-based parallelism, for example:
while (true)
{
Frame f = GrabFrame();
if (ShouldAnalyze(f))
{
var t = Task.Run(async () =>
{
AnalysisResult r = await Analyze(f);
ConsumeResult(r);
});
}
}
This approach launches each analysis in a separate Task, which can run in the background while we continue
grabbing new frames. It avoids blocking the main thread while waiting for an API call to return, however we have
lost some of the guarantees that the simple version provided -- multiple API calls might occur in parallel, and the
results might get returned in the wrong order. This approach could also cause multiple threads to enter
the ConsumeResult() function simultaneously, which could be dangerous, if the function is not thread-safe. Finally,
this simple code does not keep track of the Tasks that get created, so exceptions will silently disappear. Thus, the
final ingredient for us to add is a "consumer" thread that will track the analysis tasks, raise exceptions, kill long-
running tasks, and ensure the results get consumed in the correct order, one at a time.
A Producer-Consumer Design
In our final "producer-consumer" system, we have a producer thread that looks similar to our previous infinite loop.
However, instead of consuming analysis results as soon as they are available, the producer simply puts the tasks
into a queue to keep track of them.
// Queue that will contain the API call tasks.
var taskQueue = new BlockingCollection<Task<ResultWrapper>>();
// Producer thread.
while (true)
{
// Grab a frame.
Frame f = GrabFrame();
// Decide whether to analyze the frame.
if (ShouldAnalyze(f))
{
// Start a task that will run in parallel with this thread.
var analysisTask = Task.Run(async () =>
{
// Put the frame, and the result/exception into a wrapper object.
var output = new ResultWrapper(f);
try
{
output.Analysis = await Analyze(f);
}
catch (Exception e)
{
output.Exception = e;
}
return output;
});
// Push the task onto the queue.
taskQueue.Add(analysisTask);
}
}
We also have a consumer thread, that is taking tasks off the queue, waiting for them to finish, and either displaying
the result or raising the exception that was thrown. By using the queue, we can guarantee that results get consumed
one at a time, in the correct order, without limiting the maximum frame-rate of the system.
// Consumer thread.
while (true)
{
// Get the oldest task.
Task<ResultWrapper> analysisTask = taskQueue.Take();
// Await until the task is completed.
var output = await analysisTask;
// Consume the exception or result.
if (output.Exception != null)
{
throw output.Exception;
}
else
{
ConsumeResult(output.Analysis);
}
}
Implementing the Solution
Getting Started
To get your app up and running as quickly as possible, we have implemented the system described above,
intending it to be flexible enough to implement many scenarios, while being easy to use. To access the code, go to
https://github.com/Microsoft/Cognitive-Samples-VideoFrameAnalysis.
The library contains the class FrameGrabber, which implements the producer-consumer system discussed above to
process video frames from a webcam. The user can specify the exact form of the API call, and the class uses events
to let the calling code know when a new frame is acquired, or a new analysis result is available.
To illustrate some of the possibilities, there are two sample apps that use the library. The first is a simple console
app, and a simplified version of this is reproduced below. It grabs frames from the default webcam, and submits
them to the Face API for face detection.
using System;
using VideoFrameAnalyzer;
using Microsoft.ProjectOxford.Face;
using Microsoft.ProjectOxford.Face.Contract;
namespace VideoFrameConsoleApplication
{
class Program
{
static void Main(string[] args)
{
// Create grabber, with analysis type Face[].
FrameGrabber<Face[]> grabber = new FrameGrabber<Face[]>();
// Create Face API Client. Insert your Face API key here.
FaceServiceClient faceClient = new FaceServiceClient("<subscription key>");
// Set up our Face API call.
grabber.AnalysisFunction = async frame => await
faceClient.DetectAsync(frame.Image.ToMemoryStream(".jpg"));
// Set up a listener for when we receive a new result from an API call.
grabber.NewResultAvailable += (s, e) =>
{
if (e.Analysis != null)
Console.WriteLine("New result received for frame acquired at {0}. {1} faces detected",
e.Frame.Metadata.Timestamp, e.Analysis.Length);
};
// Tell grabber to call the Face API every 3 seconds.
grabber.TriggerAnalysisOnInterval(TimeSpan.FromMilliseconds(3000));
// Start running.
grabber.StartProcessingCameraAsync().Wait();
// Wait for keypress to stop
Console.WriteLine("Press any key to stop...");
Console.ReadKey();
// Stop, blocking until done.
grabber.StopProcessingAsync().Wait();
}
}
}
The second sample app is a bit more interesting, and allows you to choose which API to call on the video frames.
On the left-hand side, the app shows a preview of the live video, on the right-hand side it shows the most recent
API result overlaid on the corresponding frame.
In most modes, there will be a visible delay between the live video on the left, and the visualized analysis on the
right. This delay is the time taken to make the API call. The exception to this is in the
"EmotionsWithClientFaceDetect" mode, which performs face detection locally on the client computer
using OpenCV, before submitting any images to Cognitive Services. By doing this, we can visualize the detected
face immediately, and then update the emotions later once the API call returns. This demonstrates the possibility of
a "hybrid" approach, where some simple processing can be performed on the client, and then Cognitive Services
APIs can be used to augment this with more advanced analysis when necessary.
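The hybrid pattern can be sketched roughly as follows, reusing the grabber and faceClient objects from the console sample above. DetectFacesLocally is a hypothetical placeholder for whatever client-side detector you use (the sample app uses OpenCV); only frames that pass the cheap local check are sent to the cloud.
// Hypothetical sketch of the hybrid approach: run a cheap local detector first,
// and only call the cloud API when the local detector finds something.
// DetectFacesLocally is a placeholder, not part of the VideoFrameAnalyzer library.
grabber.AnalysisFunction = async frame =>
{
    var localFaces = DetectFacesLocally(frame.Image);   // fast, client-side check
    if (localFaces.Count == 0)
    {
        return null;                                     // skip the API call entirely
    }
    // Heavier analysis in the cloud, as in the console sample.
    return await faceClient.DetectAsync(frame.Image.ToMemoryStream(".jpg"));
};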
To get started with this sample, follow these steps:
1. Get API keys for the Vision APIs from Subscriptions. For video frame analysis, the applicable APIs are:
Computer Vision API
Emotion API
Face API
2. Clone the Cognitive-Samples-VideoFrameAnalysis GitHub repo
3. Open the sample in Visual Studio 2015, build and run the sample applications:
For BasicConsoleSample, the Face API key is hard-coded directly in BasicConsoleSample/Program.cs.
For LiveCameraSample, the keys should be entered into the Settings pane of the app. They will be
persisted across sessions as user data.
Integrating into your codebase
When you're ready to integrate, simply reference the VideoFrameAnalyzer library from your own projects.
The image, voice, video, or text understanding capabilities of VideoFrameAnalyzer use Azure Cognitive Services.
Microsoft will receive the images, audio, video, and other data that you upload (via this app) and may use them for
service improvement purposes. We ask for your help in protecting the people whose data your app sends to Azure
Cognitive Services.
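If you want to call Computer Vision rather than the Face API from the frame grabber, the console sample above can be adapted along these lines. This is only a sketch: it assumes the Microsoft.Azure.CognitiveServices.Vision.ComputerVision client library referenced elsewhere in this documentation, uses a placeholder key and endpoint, and the exact overloads can differ between SDK versions.
// Sketch: plugging the Computer Vision client library into the FrameGrabber from the
// console sample, instead of the Face client. Key and endpoint are placeholders.
var grabber = new FrameGrabber<ImageAnalysis>();
var visionClient = new ComputerVisionClient(new ApiKeyServiceClientCredentials("<subscription key>"))
{
    Endpoint = "https://westcentralus.api.cognitive.microsoft.com"
};
grabber.AnalysisFunction = async frame =>
    await visionClient.AnalyzeImageInStreamAsync(
        frame.Image.ToMemoryStream(".jpg"),
        new[] { VisualFeatureTypes.Tags, VisualFeatureTypes.Description });
grabber.NewResultAvailable += (s, e) =>
{
    if (e.Analysis != null)
        Console.WriteLine($"{e.Analysis.Tags?.Count ?? 0} tags detected");
};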
Summary
In this guide, you learned how to run near-real-time analysis on live video streams using the Face, Computer
Vision, and Emotion APIs, and how you can use our sample code to get started. You can get started building your
app with free API keys at the Azure Cognitive Services sign-up page.
Please feel free to provide feedback and suggestions in the GitHub repository, or for more broad API feedback, on
our UserVoice site.
Sample: Explore an image processing app with C#
4/19/2019 18 minutes to read Edit Online
Explore a basic Windows application that uses Computer Vision to perform optical character recognition (OCR),
create smart-cropped thumbnails, and detect, categorize, tag, and describe visual features, including faces, in an
image. The example below lets you submit an image URL or a locally stored file. You can use this open source
example as a template for building your own app for Windows using the Computer Vision API and Windows
Presentation Foundation (WPF), a part of .NET Framework.
Get the sample app from GitHub
Open and build the sample app in Visual Studio
Run the sample app and interact with it to perform various scenarios
Explore the various scenarios included with the sample app
Prerequisites
Before exploring the sample app, ensure that you've met the following prerequisites:
You must have Visual Studio 2015 or later.
You must have a subscription key for Computer Vision. You can get a free trial key from Try Cognitive Services.
Or, follow the instructions in Create a Cognitive Services account to subscribe to Computer Vision and get your
key.
Get the sample app
The Computer Vision sample app is available on GitHub from the Microsoft/Cognitive-Vision-Windows repository.
This repository also includes the Microsoft/Cognitive-Common-Windows repository as a Git submodule. You can
recursively clone this repository, including the submodule, either by using the git clone --recurse-submodules
command from the command line, or by using GitHub Desktop.
For example, to recursively clone the repository for the Computer Vision sample app from a command prompt, run
the following command:
git clone --recurse-submodules https://github.com/Microsoft/Cognitive-Vision-Windows.git
IMPORTANT
Do not download this repository as a ZIP. Git doesn't include submodules when downloading a repository as a ZIP.
Get optional sample images
You can optionally use the sample images included with the Face sample app, available on GitHub from the
Microsoft/Cognitive-Face-Windows repository. That sample app includes a folder, /Data , which contains multiple
images of people. You can recursively clone this repository, as well, by the methods described for the Computer
Vision sample app.
For example, to recursively clone the repository for the Face sample app from a command prompt, run the
following command:
git clone --recurse-submodules https://github.com/Microsoft/Cognitive-Face-Windows.git
Open and build the sample app in Visual Studio
You must build the sample app first, so that Visual Studio can resolve dependencies, before you can run or explore
the sample app. To open and build the sample app, do the following steps:
1. Open the Visual Studio solution file, /Sample-WPF/VisionAPI-WPF-Samples.sln , in Visual Studio.
2. Ensure that the Visual Studio solution contains two projects:
SampleUserControlLibrary
VisionAPI-WPF-Samples
If the SampleUserControlLibrary project is unavailable, confirm that you've recursively cloned the
Microsoft/Cognitive-Vision-Windows repository.
3. In Visual Studio, either press Ctrl+Shift+B or choose Build from the ribbon menu and then choose Build
Solution to build the solution.
Run and interact with the sample app
You can run the sample app, to see how it interacts with you and with the Computer Vision client library when
performing various tasks, such as generating thumbnails or tagging images. To run and interact with the sample
app, do the following steps:
1. After the build is complete, either press F5 or choose Debug from the ribbon menu and then choose Start
debugging to run the sample app.
2. When the sample app is displayed, choose Subscription Key Management from the navigation pane to
display the Subscription Key Management page.
3. Enter your subscription key in Subscription Key.
4. Enter the endpoint URL, omitting the /vision/v1.0 , of the Computer Vision resource for your subscription
key in Endpoint.
For example, if you're using the subscription key from the Computer Vision free trial, enter the following
endpoint URL for the West Central US Azure region: https://westcentralus.api.cognitive.microsoft.com
5. If you don't want to enter your subscription key and endpoint URL the next time you run the sample app,
choose Save Setting to save the subscription key and endpoint URL to your computer. If you want to delete
your previously-saved subscription key and endpoint URL, choose Delete Setting.
NOTE
The sample app uses isolated storage, and System.IO.IsolatedStorage , to store your subscription key and
endpoint URL.
6. Under Select a scenario in the navigation pane, select one of the scenarios currently included with the
sample app:
SCENARIO DESCRIPTION
Analyze Image Uses the Analyze Image operation to analyze a local or
remote image. You can choose the visual features and
language for the analysis, and see both the image and the
results.
Analyze Image with Domain Model Uses the List Domain Specific Models operation to list the
domain models from which you can select, and the
Recognize Domain Specific Content operation to analyze a
local or remote image using the selected domain model.
You can also choose the language for the analysis.
Describe Image Uses the Describe Image operation to create a human-
readable description of a local or remote image. You can
also choose the language for the description.
Generate Tags Uses the Tag Image operation to tag the visual features of
a local or remote image. You can also choose the language
used for the tags.
Recognize Text (OCR) Uses the OCR operation to recognize and extract printed
text from an image. You can either choose the language to
use, or let Computer Vision auto-detect the language.
Recognize Text V2 (English) Uses the Recognize Text and Get Recognize Text Operation
Result operations to asynchronously recognize and extract
printed or handwritten text from an image.
Get Thumbnail Uses the Get Thumbnail operation to generate a
thumbnail for a local or remote image.
The following screenshot illustrates the page provided for the Analyze Image scenario, after analyzing a
sample image.
Explore the sample app
The Visual Studio solution for the Computer Vision sample app contains two projects:
SampleUserControlLibrary
The SampleUserControlLibrary project provides functionality shared by multiple Cognitive Services samples.
The project contains the following:
SampleScenarios
A UserControl that provides a standardized presentation, such as the title bar, navigation pane, and
content pane, for samples. The Computer Vision sample app uses this control in the MainWindow.xaml
window to display scenario pages and access information shared across scenarios, such as the
subscription key and endpoint URL.
SubscriptionKeyPage
A Page that provides a standardized layout for entering a subscription key and endpoint URL for the
sample app. The Computer Vision sample app uses this page to manage the subscription key and
endpoint URL used by the scenario pages.
VideoResultControl
A UserControl that provides a standardized presentation for video information. The Computer Vision
sample app doesn't use this control.
VisionAPI-WPF-Samples
The main project for the Computer Vision sample app, this project contains all of the interesting functionality for
Computer Vision. The project contains the following:
AnalyzeInDomainPage.xaml
The scenario page for the Analyze Image with Domain Model scenario.
AnalyzePage.xaml
The scenario page for the Analyze Image scenario.
DescribePage.xaml
The scenario page for the Describe Image scenario.
ImageScenarioPage.cs
The ImageScenarioPage class, from which all of the scenario pages in the sample app are derived. This
class manages functionality, such as providing credentials and formatting output, shared by all of the
scenario pages.
MainWindow.xaml
The main window for the sample app, it uses the SampleScenarios control to present the
SubscriptionKeyPage and scenario pages.
OCRPage.xaml
The scenario page for the Recognize Text (OCR) scenario.
RecognizeLanguage.cs
The RecognizeLanguage class, which provides information about the languages supported by the various
methods in the sample app.
TagsPage.xaml
The scenario page for the Generate Tags scenario.
TextRecognitionPage.xaml
The scenario page for the Recognize Text V2 (English) scenario.
ThumbnailPage.xaml
The scenario page for the Get Thumbnail scenario.
Explore the sample code
Key portions of sample code are framed with comment blocks that start with KEY SAMPLE CODE STARTS HERE and end
with KEY SAMPLE CODE ENDS HERE , to make it easier for you to explore the sample app. These key portions of sample
code contain the code most relevant to learning how to use the Computer Vision API client library to do various
tasks. You can search for KEY SAMPLE CODE STARTS HERE in Visual Studio to move between the most relevant sections
of code in the Computer Vision sample app.
For example, the UploadAndAnalyzeImageAsync method, shown following and included in AnalyzePage.xaml,
demonstrates how to use the client library to analyze a local image by invoking the
ComputerVisionClient.AnalyzeImageInStreamAsync method.
private async Task<ImageAnalysis> UploadAndAnalyzeImageAsync(string imageFilePath)
{
// -----------------------------------------------------------------------
// KEY SAMPLE CODE STARTS HERE
// -----------------------------------------------------------------------
//
// Create Cognitive Services Vision API Service client.
//
using (var client = new ComputerVisionClient(Credentials) { Endpoint = Endpoint })
{
Log("ComputerVisionClient is created");
using (Stream imageFileStream = File.OpenRead(imageFilePath))
{
//
// Analyze the image for all visual features.
//
Log("Calling ComputerVisionClient.AnalyzeImageInStreamAsync()...");
VisualFeatureTypes[] visualFeatures = GetSelectedVisualFeatures();
string language = (_language.SelectedItem as RecognizeLanguage).ShortCode;
ImageAnalysis analysisResult = await client.AnalyzeImageInStreamAsync(imageFileStream,
visualFeatures, null, language);
return analysisResult;
}
}
// -----------------------------------------------------------------------
// KEY SAMPLE CODE ENDS HERE
// -----------------------------------------------------------------------
}
Explore the client library
This sample app uses the Computer Vision API client library, a thin C# client wrapper for the Computer Vision API
in Azure Cognitive Services. The client library is available from NuGet in the
Microsoft.Azure.CognitiveServices.Vision.ComputerVision package. When you built the Visual Studio application,
you retrieved the client library from its corresponding NuGet package. You can also view the source code for the
client library in the /ClientLibrary folder of the Microsoft/Cognitive-Vision-Windows repository.
The client library's functionality centers around the ComputerVisionClient class, in the
Microsoft.Azure.CognitiveServices.Vision.ComputerVision namespace, while the models used by the
ComputerVisionClient class when interacting with Computer Vision are found in the
Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models namespace. In the various XAML scenario pages
included with the sample app, you'll find the following using directives for those namespaces:
// -----------------------------------------------------------------------
// KEY SAMPLE CODE STARTS HERE
// Use the following namespace for ComputerVisionClient.
// -----------------------------------------------------------------------
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
// -----------------------------------------------------------------------
// KEY SAMPLE CODE ENDS HERE
// -----------------------------------------------------------------------
You'll learn more about the various methods included with the ComputerVisionClient class as you explore the
scenarios included with the Computer Vision sample app.
Explore the Analyze Image scenario
This scenario is managed by the AnalyzePage.xaml page. You can choose the visual features and language for the
analysis, and see both the image and the results. The scenario page does this by using one of the following
methods, depending on the source of the image:
UploadAndAnalyzeImageAsync
This method is used for local images, in which the image must be encoded as a Stream and sent to Computer
Vision by calling the ComputerVisionClient.AnalyzeImageInStreamAsync method.
AnalyzeUrlAsync
This method is used for remote images, in which the URL for the image is sent to Computer Vision by calling
the ComputerVisionClient.AnalyzeImageAsync method.
The UploadAndAnalyzeImageAsync method creates a new ComputerVisionClient instance, using the specified
subscription key and endpoint URL. Because the sample app is analyzing a local image, it has to send the contents
of that image to Computer Vision. It opens the local file specified in imageFilePath for reading as a Stream , then
gets the visual features and language selected in the scenario page. It calls the
ComputerVisionClient.AnalyzeImageInStreamAsync method, passing the Stream for the file, the visual features, and
the language, then returns the result as an ImageAnalysis instance. The methods inherited from the
ImageScenarioPage class present the returned results in the scenario page.
The AnalyzeUrlAsync method creates a new ComputerVisionClient instance, using the specified subscription key
and endpoint URL. It gets the visual features and language selected in the scenario page. It calls the
ComputerVisionClient.AnalyzeImageInStreamAsync method, passing the image URL, the visual features, and the
language, then returns the result as an ImageAnalysis instance. The methods inherited from the ImageScenarioPage
class present the returned results in the scenario page.
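As a rough illustration of the URL path, a standalone call similar to what AnalyzeUrlAsync wraps might look like the sketch below. The key, endpoint, and image URL are placeholders, and the exact overload can differ between versions of the client library.
using System;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
// Sketch: analyze a remote image by URL (placeholder key, endpoint, and URL).
// Call this from an async method.
var client = new ComputerVisionClient(new ApiKeyServiceClientCredentials("<subscription key>"))
{
    Endpoint = "https://westcentralus.api.cognitive.microsoft.com"
};
ImageAnalysis analysis = await client.AnalyzeImageAsync(
    "https://example.com/image.jpg",
    new[] { VisualFeatureTypes.Categories, VisualFeatureTypes.Description, VisualFeatureTypes.Tags },
    language: "en");
Console.WriteLine(analysis.Description?.Captions?.Count + " caption(s) returned");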
Explore the Analyze Image with Domain Model scenario
This scenario is managed by the AnalyzeInDomainPage.xaml page. You can choose a domain model, such as
celebrities or landmarks , and language to perform a domain-specific analysis of the image, and see both the
image and the results. The scenario page uses the following methods, depending on the source of the image:
GetAvailableDomainModelsAsync
This method gets the list of available domain models from Computer Vision and populates the
_domainModelComboBox ComboBox control on the page, using the ComputerVisionClient.ListModelsAsync method.
UploadAndAnalyzeInDomainImageAsync
This method is used for local images, in which the image must be encoded as a Stream and sent to Computer
Vision by calling the ComputerVisionClient.AnalyzeImageByDomainInStreamAsync method.
AnalyzeInDomainUrlAsync
This method is used for remote images, in which the URL for the image is sent to Computer Vision by calling
the ComputerVisionClient.AnalyzeImageByDomainAsync method.
The UploadAndAnalyzeInDomainImageAsync method creates a new ComputerVisionClient instance, using the specified
subscription key and endpoint URL. Because the sample app is analyzing a local image, it has to send the contents
of that image to Computer Vision. It opens the local file specified in imageFilePath for reading as a Stream , then
gets the language selected in the scenario page. It calls the ComputerVisionClient.AnalyzeImageByDomainInStreamAsync
method, passing the Stream for the file, the name of the domain model, and the language, then returns the result
as an DomainModelResults instance. The methods inherited from the ImageScenarioPage class present the returned
results in the scenario page.
The AnalyzeInDomainUrlAsync method creates a new ComputerVisionClient instance, using the specified
subscription key and endpoint URL. It gets the language selected in the scenario page. It calls the
ComputerVisionClient.AnalyzeImageByDomainAsync method, passing the image URL, the visual features, and the
language, then returns the result as an DomainModelResults instance. The methods inherited from the
ImageScenarioPage class present the returned results in the scenario page.
Explore the Describe Image scenario
This scenario is managed by the DescribePage.xaml page. You can choose a language to create a human-readable
description of the image, and see both the image and the results. The scenario page uses the following methods,
depending on the source of the image:
UploadAndDescribeImageAsync
This method is used for local images, in which the image must be encoded as a Stream and sent to Computer
Vision by calling the ComputerVisionClient.DescribeImageInStreamAsync method.
DescribeUrlAsync
This method is used for remote images, in which the URL for the image is sent to Computer Vision by calling
the ComputerVisionClient.DescribeImageAsync method.
The UploadAndDescribeImageAsync method creates a new ComputerVisionClient instance, using the specified
subscription key and endpoint URL. Because the sample app is analyzing a local image, it has to send the contents
of that image to Computer Vision. It opens the local file specified in imageFilePath for reading as a Stream , then
gets the language selected in the scenario page. It calls the ComputerVisionClient.DescribeImageInStreamAsync
method, passing the Stream for the file, the maximum number of candidates (in this case, 3), and the language,
then returns the result as an ImageDescription instance. The methods inherited from the ImageScenarioPage class
present the returned results in the scenario page.
The DescribeUrlAsync method creates a new ComputerVisionClient instance, using the specified subscription key
and endpoint URL. It gets the language selected in the scenario page. It calls the
ComputerVisionClient.DescribeImageAsync method, passing the image URL, the maximum number of candidates (in
this case, 3), and the language, then returns the result as an ImageDescription instance. The methods inherited from
the ImageScenarioPage class present the returned results in the scenario page.
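A corresponding sketch of the URL path for this scenario, reusing the client instance from the Analyze Image sketch above, might look like the following; maxCandidates mirrors the value of 3 mentioned above, and the image URL is a placeholder.
// Sketch: describe a remote image with up to 3 caption candidates (placeholder URL).
ImageDescription description = await client.DescribeImageAsync(
    "https://example.com/image.jpg", maxCandidates: 3, language: "en");
foreach (var caption in description.Captions)
{
    Console.WriteLine($"{caption.Text} (confidence {caption.Confidence:P1})");
}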
Explore the Generate Tags scenario
This scenario is managed by the TagsPage.xaml page. You can choose a language to tag the visual features of an
image, and see both the image and the results. The scenario page uses the following methods, depending on the
source of the image:
UploadAndGetTagsForImageAsync
This method is used for local images, in which the image must be encoded as a Stream and sent to Computer
Vision by calling the ComputerVisionClient.TagImageInStreamAsync method.
GenerateTagsForUrlAsync
This method is used for remote images, in which the URL for the image is sent to Computer Vision by calling
the ComputerVisionClient.TagImageAsync method.
The UploadAndGetTagsForImageAsync method creates a new ComputerVisionClient instance, using the specified
subscription key and endpoint URL. Because the sample app is analyzing a local image, it has to send the contents
of that image to Computer Vision. It opens the local file specified in imageFilePath for reading as a Stream , then
gets the language selected in the scenario page. It calls the ComputerVisionClient.TagImageInStreamAsync method,
passing the Stream for the file and the language, then returns the result as a TagResult instance. The methods
inherited from the ImageScenarioPage class present the returned results in the scenario page.
The GenerateTagsForUrlAsync method creates a new ComputerVisionClient instance, using the specified
subscription key and endpoint URL. It gets the language selected in the scenario page. It calls the
ComputerVisionClient.TagImageAsync method, passing the image URL and the language, then returns the result as a
TagResult instance. The methods inherited from the ImageScenarioPage class present the returned results in the
scenario page.
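The URL path for tagging, again reusing the client instance and a placeholder image URL from the earlier sketches, can be sketched as:
// Sketch: tag a remote image (placeholder URL).
TagResult tagResult = await client.TagImageAsync("https://example.com/image.jpg", language: "en");
foreach (var tag in tagResult.Tags)
{
    Console.WriteLine($"{tag.Name}: {tag.Confidence:P1}");
}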
Explore the Recognize Text (OCR) scenario
This scenario is managed by the OCRPage.xaml page. You can choose a language to recognize and extract printed
text from an image, and see both the image and the results. The scenario page uses the following methods,
depending on the source of the image:
UploadAndRecognizeImageAsync
This method is used for local images, in which the image must be encoded as a Stream and sent to Computer
Vision by calling the ComputerVisionClient.RecognizePrintedTextInStreamAsync method.
RecognizeUrlAsync
This method is used for remote images, in which the URL for the image is sent to Computer Vision by calling
the ComputerVisionClient.RecognizePrintedTextAsync method.
The UploadAndRecognizeImageAsync method creates a new ComputerVisionClient instance, using the specified
subscription key and endpoint URL. Because the sample app is analyzing a local image, it has to send the contents
of that image to Computer Vision. It opens the local file specified in imageFilePath for reading as a Stream , then
gets the language selected in the scenario page. It calls the ComputerVisionClient.RecognizePrintedTextInStreamAsync
method, indicating that orientation is not detected and passing the Stream for the file and the language, then
returns the result as an OcrResult instance. The methods inherited from the ImageScenarioPage class present the
returned results in the scenario page.
The RecognizeUrlAsync method creates a new ComputerVisionClient instance, using the specified subscription key
and endpoint URL. It gets the language selected in the scenario page. It calls the
ComputerVisionClient.RecognizePrintedTextAsync method, indicating that orientation is not detected and passing the
image URL and the language, then returns the result as an OcrResult instance. The methods inherited from the
ImageScenarioPage class present the returned results in the scenario page.
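A sketch of the URL path for OCR, reusing the client from the earlier sketches, might look like the following; the first argument turns orientation detection off to match the behavior described above, and the image URL is a placeholder.
// Sketch: printed-text OCR on a remote image with orientation detection turned off
// (placeholder URL). The first argument is detectOrientation.
// Requires using System.Linq for the Select call.
OcrResult ocrResult = await client.RecognizePrintedTextAsync(
    false, "https://example.com/sign.jpg", OcrLanguages.En);
foreach (var region in ocrResult.Regions)
{
    foreach (var line in region.Lines)
    {
        Console.WriteLine(string.Join(" ", line.Words.Select(word => word.Text)));
    }
}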
Explore the Recognize Text V2 (English) scenario
This scenario is managed by the TextRecognitionPage.xaml page. You can choose the recognition mode and a
language to asynchronously recognize and extract either printed or handwritten text from an image, and see both
the image and the results. The scenario page uses the following methods, depending on the source of the image:
UploadAndRecognizeImageAsync
This method is used for local images, in which the image must be encoded as a Stream and sent to Computer
Vision by calling the RecognizeAsync method and passing a parameterized delegate for the
ComputerVisionClient.RecognizeTextInStreamAsync method.
RecognizeUrlAsync
This method is used for remote images, in which the URL for the image is sent to Computer Vision by calling
the RecognizeAsync method and passing a parameterized delegate for the
ComputerVisionClient.RecognizeTextAsync method.
RecognizeAsync This method handles the asynchronous calling for both the UploadAndRecognizeImageAsync and
RecognizeUrlAsync methods, as well as polling for results by calling the
ComputerVisionClient.GetTextOperationResultAsync method.
Unlike the other scenarios included in the Computer Vision sample app, this scenario is asynchronous, in that one
method is called to start the process, but a different method is called to check on the status and return the results of
that process. The logical flow in this scenario is somewhat different from that in the other scenarios.
The UploadAndRecognizeImageAsync method opens the local file specified in imageFilePath for reading as a Stream ,
then calls the RecognizeAsync method, passing:
A lambda expression for a parameterized asynchronous delegate of the
ComputerVisionClient.RecognizeTextInStreamAsync method, with the Stream for the file and the recognition
mode as parameters, in GetHeadersAsyncFunc .
A lambda expression for a delegate to get the Operation-Location response header value, in
GetOperationUrlFunc .
The RecognizeUrlAsync method calls the RecognizeAsync method, passing:
A lambda expression for a parameterized asynchronous delegate of the
ComputerVisionClient.RecognizeTextAsync method, with the URL of the remote image and the recognition mode
as parameters, in GetHeadersAsyncFunc .
A lambda expression for a delegate to get the Operation-Location response header value, in
GetOperationUrlFunc .
When the RecognizeAsync method is completed, both UploadAndRecognizeImageAsync and RecognizeUrlAsync
methods return the result as a TextOperationResult instance. The methods inherited from the ImageScenarioPage
class present the returned results in the scenario page.
The RecognizeAsync method calls the parameterized delegate for either the
ComputerVisionClient.RecognizeTextInStreamAsync or ComputerVisionClient.RecognizeTextAsync method passed in
GetHeadersAsyncFunc and waits for the response. The method then calls the delegate passed in GetOperationUrlFunc
to get the Operation-Location response header value from the response. This value is the URL used to retrieve the
results of the method passed in GetHeadersAsyncFunc from Computer Vision.
The RecognizeAsync method then calls the ComputerVisionClient.GetTextOperationResultAsync method, passing the
URL retrieved from the Operation-Location response header, to get the status and result of the method passed in
GetHeadersAsyncFunc . If the status doesn't indicate that the method completed, successfully or unsuccessfully, the
RecognizeAsync method calls ComputerVisionClient.GetTextOperationResultAsync 3 more times, waiting 3 seconds
between calls. The RecognizeAsync method returns the results to the method that called it.
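In simplified form, the submit-then-poll flow that RecognizeAsync implements looks roughly like the sketch below, reusing the client from the earlier sketches with a placeholder image URL. The retry count and 3-second delay mirror the values described above; the real sample wraps this logic in delegates so the same polling loop serves both the stream and URL variants, and exact type and member names can vary between SDK versions.
// Sketch: submit a Recognize Text request, then poll for the result.
RecognizeTextHeaders submitResult = await client.RecognizeTextAsync(
    "https://example.com/handwritten-note.jpg", TextRecognitionMode.Handwritten);
// The operation ID is the last segment of the Operation-Location URL.
string operationLocation = submitResult.OperationLocation;
string operationId = operationLocation.Substring(operationLocation.LastIndexOf('/') + 1);
TextOperationResult result = await client.GetTextOperationResultAsync(operationId);
for (int attempt = 0;
     attempt < 3 && (result.Status == TextOperationStatusCodes.NotStarted ||
                     result.Status == TextOperationStatusCodes.Running);
     attempt++)
{
    await Task.Delay(TimeSpan.FromSeconds(3));
    result = await client.GetTextOperationResultAsync(operationId);
}
if (result.Status == TextOperationStatusCodes.Succeeded)
{
    foreach (var line in result.RecognitionResult.Lines)
    {
        Console.WriteLine(line.Text);
    }
}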
Explore the Get Thumbnail scenario
This scenario is managed by the ThumbnailPage.xaml page. You can indicate whether to use smart cropping, and
specify desired height and width, to generate a thumbnail from an image, and see both the image and the results.
The scenario page uses the following methods, depending on the source of the image:
UploadAndThumbnailImageAsync
This method is used for local images, in which the image must be encoded as a Stream and sent to Computer
Vision by calling the ComputerVisionClient.GenerateThumbnailInStreamAsync method.
ThumbnailUrlAsync
This method is used for remote images, in which the URL for the image is sent to Computer Vision by calling
the ComputerVisionClient.GenerateThumbnailAsync method.
The UploadAndThumbnailImageAsync method creates a new ComputerVisionClient instance, using the specified
subscription key and endpoint URL. Because the sample app is analyzing a local image, it has to send the contents
of that image to Computer Vision. It opens the local file specified in imageFilePath for reading as a Stream . It calls
the ComputerVisionClient.GenerateThumbnailInStreamAsync method, passing the width, height, the Stream for the file,
and whether to use smart cropping, then returns the result as a Stream . The methods inherited from the
ImageScenarioPage class present the returned results in the scenario page.
The ThumbnailUrlAsync method creates a new ComputerVisionClient instance, using the specified subscription key
and endpoint URL. It calls the ComputerVisionClient.GenerateThumbnailAsync method, passing the width, height, the
URL for the image, and whether to use smart cropping, then returns the result as a Stream . The methods inherited
from the ImageScenarioPage class present the returned results in the scenario page.
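A sketch of the URL path for thumbnails, reusing the client from the earlier sketches, might look like this; the dimensions, image URL, and output file name are placeholders.
// Sketch: generate a 100x100 smart-cropped thumbnail from a remote image and save it
// to disk (placeholder URL and file name). Requires using System.IO.
using (Stream thumbnailStream = await client.GenerateThumbnailAsync(
    100, 100, "https://example.com/image.jpg", smartCropping: true))
using (FileStream outputFile = File.Create("thumbnail.jpg"))
{
    await thumbnailStream.CopyToAsync(outputFile);
}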
Clean up resources
Next steps
When no longer needed, delete the folder into which you cloned the Microsoft/Cognitive-Vision-Windows
repository. If you opted to use the sample images, also delete the folder into which you cloned the
Microsoft/Cognitive-Face-Windows repository.
Get started with Face API
Computer Vision API Frequently Asked Questions
4/18/2019 2 minutes to read Edit Online
TIP
If you can't find answers to your questions in this FAQ, try asking the Computer Vision API community on StackOverflow or
contact Help and Support on UserVoice.
Question: Can I train the Computer Vision API to use custom tags? For example, I would like to feed in pictures of cat
breeds to 'train' the AI, then receive the breed value in an API response.
Answer: This functionality is not currently available. However, our engineers are working to bring it to
Computer Vision.
Question: Can Computer Vision be used locally without an internet connection?
Answer: We currently do not offer an on-premises or local solution.
Question: Can Computer Vision be used to read license plates?
Answer: The Computer Vision API offers good text detection with OCR, but it is not currently optimized for license
plates. We are constantly working to improve our services and have added OCR for automatic license plate
recognition to our list of feature requests.
Question: What types of writing surfaces are supported for handwriting recognition?
Answer: The technology works with different kinds of surfaces, including whiteboards, white paper, and yellow
sticky notes.
Question: How long does the handwriting recognition operation take?
Answer: The amount of time that it takes depends on the length of the text. For longer texts, it can take up to
several seconds. Therefore, after the Recognize Handwritten Text operation completes, you may need to wait before
you can retrieve the results using the Get Handwritten Text Operation Result operation.
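For illustration, here is a minimal sketch of submitting an image and waiting for the result over REST, assuming the v2.0 recognizeText and textOperations endpoints; the region, subscription key, image URL, polling interval, and the Newtonsoft.Json dependency are illustrative choices, not requirements.

// A minimal sketch of waiting for the Recognize Handwritten Text operation before
// fetching its result, as described above. Assumes the v2.0 REST endpoints;
// the region, subscription key, and image URL are placeholders.
using System;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class HandwritingPollingSketch
{
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<subscription-key>");

            // Submit the image; the 202 response carries an Operation-Location header
            // pointing at the Get Handwritten Text Operation Result URL.
            var uri = "https://westus.api.cognitive.microsoft.com/vision/v2.0/recognizeText?mode=Handwritten";
            var body = new StringContent("{\"url\":\"https://example.com/note.jpg\"}",
                                         Encoding.UTF8, "application/json");
            HttpResponseMessage submit = await client.PostAsync(uri, body);
            string operationLocation = submit.Headers.GetValues("Operation-Location").First();

            // Longer texts can take several seconds, so poll until the status is terminal.
            JObject result;
            do
            {
                await Task.Delay(TimeSpan.FromSeconds(1));
                result = JObject.Parse(await client.GetStringAsync(operationLocation));
            }
            while ((string)result["status"] == "Running" || (string)result["status"] == "NotStarted");

            Console.WriteLine(result);
        }
    }
}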
Question: How does the handwriting recognition technology handle text that was inserted using a caret in the
middle of a line?
Answer: Such text is returned as a separate line by the handwriting recognition operation.
Question: How does the handwriting recognition technology handle crossed-out words or lines?
Answer: If the words are crossed out with multiple lines to render them unrecognizable, the handwriting
recognition operation doesn't pick them up. However, if the words are crossed out using a single line, that crossing
is treated as noise, and the words still get picked up by the handwriting recognition operation.
Question: What text orientations are supported by the handwriting recognition technology?
Answer: Text oriented at angles of up to around 30 to 40 degrees may be picked up by the handwriting recognition
operation.
Computer Vision 86-category taxonomy
4/19/2019 2 minutes to read Edit Online
abstract_
abstract_net
abstract_nonphoto
abstract_rect
abstract_shape
abstract_texture
animal_
animal_bird
animal_cat
animal_dog
animal_horse
animal_panda
building_
building_arch
building_brickwall
building_church
building_corner
building_doorwindows
building_pillar
building_stair
building_street
dark_
drink_
drink_can
dark_fire
dark_fireworks
sky_object
food_
food_bread
food_fastfood
food_grilled
food_pizza
indoor_
indoor_churchwindow
indoor_court
indoor_doorwindows
indoor_marketstore
indoor_room
indoor_venue
dark_light
others_
outdoor_
outdoor_city
outdoor_field
outdoor_grass
outdoor_house
outdoor_mountain
outdoor_oceanbeach
outdoor_playground
outdoor_railway
outdoor_road
outdoor_sportsfield
outdoor_stonerock
outdoor_street
outdoor_water
outdoor_waterside
people_
people_baby
people_crowd
people_group
people_hand
people_many
people_portrait
people_show
people_tattoo
people_young
plant_
plant_branch
plant_flower
plant_leaves
plant_tree
object_screen
object_sculpture
sky_cloud
sky_sun
people_swimming
outdoor_pool
text_
text_mag
text_map
text_menu
text_sign
trans_bicycle
trans_bus
trans_car
trans_trainstation
Language support for Computer Vision
4/19/2019 2 minutes to read Edit Online
Some features of Computer Vision support multiple languages; any features not mentioned here only support
English.
Text recognition
Computer Vision can recognize text in many languages. Specifically, the OCR API supports a variety of languages,
whereas the Read API and Recognize Text API only support English. See Recognize printed and handwritten text
for more information on this functionality and the advantages of each API.
OCR automatically detects the language of the input material, so there is no need to specify a language code in the
API call. However, language codes are always returned as the value of the "language" node in the JSON response.
LANGUAGE                LANGUAGE CODE    OCR API
Arabic                  ar               ✔
Chinese (Simplified)    zh-Hans          ✔
Chinese (Traditional)   zh-Hant          ✔
Czech                   cs               ✔
Danish                  da               ✔
Dutch                   nl               ✔
English                 en               ✔
Finnish                 fi               ✔
French                  fr               ✔
German                  de               ✔
Greek                   el               ✔
Hungarian               hu               ✔
Italian                 it               ✔
Japanese                ja               ✔
Korean                  ko               ✔
Norwegian               nb               ✔
Polish                  pl               ✔
Portuguese              pt               ✔
Romanian                ro               ✔
Russian                 ru               ✔
Serbian (Cyrillic)      sr-Cyrl          ✔
Serbian (Latin)         sr-Latn          ✔
Slovak                  sk               ✔
Spanish                 es               ✔
Swedish                 sv               ✔
Turkish                 tr               ✔
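As noted above, OCR detects the language automatically and reports it in the "language" node of the JSON response. The following is a minimal sketch of reading that node from a raw REST call, assuming the v2.0 OCR endpoint; the region, subscription key, image URL, and the Newtonsoft.Json dependency are illustrative choices rather than requirements.

// A minimal sketch of an OCR request that relies on automatic language detection.
// Assumes the v2.0 REST endpoint; the region, subscription key, and image URL
// are placeholders.
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class OcrLanguageSketch
{
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<subscription-key>");

            // No language code is passed; OCR detects the language from the image itself.
            var uri = "https://westus.api.cognitive.microsoft.com/vision/v2.0/ocr?detectOrientation=true";
            var body = new StringContent("{\"url\":\"https://example.com/sign.jpg\"}",
                                         Encoding.UTF8, "application/json");
            HttpResponseMessage response = await client.PostAsync(uri, body);

            // The detected language code comes back in the "language" node of the JSON.
            JObject result = JObject.Parse(await response.Content.ReadAsStringAsync());
            Console.WriteLine($"Detected language: {result["language"]}");
        }
    }
}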
Image analysis
Some actions of the Analyze - Image API can return results in other languages, specified with the language query
parameter. Other actions return results in English regardless of what language is specified, and others throw an
exception for unsupported languages. Actions are specified with the visualFeatures and details query
parameters; see the Overview for a list of all the actions you can do with image analysis. A request sketch follows
the table below.
LANGUAGE     LANGUAGE CODE   CATEGORIES   TAGS   DESCRIPTION   ADULT   BRANDS   COLOR   FACES   IMAGETYPE   OBJECTS   CELEBRITIES   LANDMARKS
Chinese      zh              ✔            ✔      ✔             -       -        -       -       -           -         ✔             ✔
English      en              ✔            ✔      ✔             ✔       ✔        ✔       ✔       ✔           ✔         ✔             ✔
Japanese     ja              ✔            ✔      ✔             -       -        -       -       -           -         ✔             ✔
Portuguese   pt              ✔            ✔      ✔             -       -        -       -       -           -         ✔             ✔
Spanish      es              ✔            ✔      ✔             -       -        -       -       -           -         ✔             ✔
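The following is a minimal sketch of an Analyze - Image request that combines the visualFeatures selection with the language query parameter, assuming the v2.0 REST endpoint; the region, subscription key, image URL, and the particular features requested are illustrative placeholders.

// A minimal sketch of an Analyze - Image request using visualFeatures and the
// language query parameter. Assumes the v2.0 REST endpoint; the region,
// subscription key, and image URL are placeholders.
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class AnalyzeLanguageSketch
{
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<subscription-key>");

            // visualFeatures selects the actions to run; language=es asks for Spanish
            // results where an action supports it (see the table above).
            var uri = "https://westus.api.cognitive.microsoft.com/vision/v2.0/analyze" +
                      "?visualFeatures=Categories,Tags,Description&language=es";
            var body = new StringContent("{\"url\":\"https://example.com/photo.jpg\"}",
                                         Encoding.UTF8, "application/json");

            HttpResponseMessage response = await client.PostAsync(uri, body);
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}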
Next steps
Get started using the Computer Vision features mentioned in this guide.
Analyze a local image (REST)
Extract printed text (REST)
