Tutorial: Azure Vision OCR Agent
This agent is a text recognition system that extracts printed text from images and converts it into a machine-usable character stream.
RapidAPI Registration
RapidAPI provides a variety of endpoints for various purposes. This agent uses the Microsoft Computer Vision endpoint.
Follow these steps to get your RapidAPI key:
- Go to https://rapidapi.com/
- If you do not have an account, proceed to create one by following the registration process.
- After completing the sign-up process, you are assigned a default API key, which you can access here.

- Copy your API key using the copy icon. You can also click the eye icon to see your API key.

Microsoft Computer Vision Subscription
- Navigate to this link and go to the pricing tab.

- Select a subscription plan from the available options: Basic (free), Pro, or Ultra. The Basic plan offers 5,000 requests/month.

- Click the Subscribe button.

- Subscription created successfully.
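
Once the subscription is active, the key can also be tried outside SmythOS. The following is a minimal sketch, not the agent's actual implementation: the host name, the `/ocr` path, its query parameters, and the `regions` / `lines` / `words` response layout are assumptions to confirm on the endpoint's RapidAPI page, and `build_ocr_request` and `extract_text` are hypothetical helper names. Only the `X-RapidAPI-Key` and `X-RapidAPI-Host` headers are standard RapidAPI conventions.

```python
import json
import urllib.request

# Assumed host and path; confirm both on the endpoint's RapidAPI page.
RAPIDAPI_HOST = "microsoft-computer-vision3.p.rapidapi.com"
OCR_URL = f"https://{RAPIDAPI_HOST}/ocr?language=unk&detectOrientation=true"


def build_ocr_request(api_key: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for OCR of an image URL."""
    headers = {
        "X-RapidAPI-Key": api_key,         # the key copied from RapidAPI
        "X-RapidAPI-Host": RAPIDAPI_HOST,  # tells RapidAPI which API to route to
        "Content-Type": "application/json",
    }
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(OCR_URL, data=body, headers=headers, method="POST")


def extract_text(ocr_response: dict) -> str:
    """Flatten an assumed regions -> lines -> words response into plain text."""
    lines = []
    for region in ocr_response.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(word["text"] for word in line.get("words", [])))
    return "\n".join(lines)
```

To send the request, pass the result to `urllib.request.urlopen` and decode the JSON body, for example `with urllib.request.urlopen(req) as resp: data = json.load(resp)`, then call `extract_text(data)`. Each call counts against the plan's monthly request quota.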

Azure Vision OCR Agent Setup
- In SmythOS, go to the Templates tab.

- Find the Azure OCR Agent, hover over it, click the Remix button, and allow the template to initialize and configure.

LLM Prompt Setup
- In SmythOS, locate the LLM Prompt component, then click the gear icon to access its settings.

- Enter your RapidAPI key in the Prompt section, then click the check mark icon to confirm and save your configuration.

- You’re all set!
Test the Agent
Follow the steps below to test the Azure Vision OCR Agent.
- Open the ChatBot embodiment and click on the chat icon.

- Ask the agent to analyze your image URL.
- Supported Extensions: .png, .jpeg, .jpg, or .webp
- URL Format: https://www.example.com/image.png
- Let’s use the image below as an example.

- Ask the agent to analyze your image.
- Prompt: Analyze this image https://i.stack.imgur.com/sFPWe.png

- Here’s the result!
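
Before giving the agent an image URL, it can help to check the URL against the supported extensions and URL format listed in this section. The sketch below is one way to do that; `is_supported_image_url` is a hypothetical helper, and accepting `http` as well as `https` and ignoring letter case in the extension are assumptions, not documented agent behavior.

```python
import posixpath
from urllib.parse import urlparse

# Extensions listed as supported in this tutorial.
SUPPORTED_EXTENSIONS = {".png", ".jpeg", ".jpg", ".webp"}


def is_supported_image_url(url: str) -> bool:
    """Return True if url is an http(s) URL whose path ends in a supported extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Look only at the path so query strings and fragments are ignored.
    extension = posixpath.splitext(parsed.path)[1].lower()
    return extension in SUPPORTED_EXTENSIONS
```

For instance, `is_supported_image_url("https://i.stack.imgur.com/sFPWe.png")` passes the check, while a `.gif` URL or a bare file name such as `image.png` does not.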
