[Workshop] Google Cloud Vision API and Colab: A Beginner’s Guide to Image Analysis
Introduction
Google Cloud Vision API analyzes images and extracts information from them, with features such as object detection, face detection, text extraction (OCR), and more. This tutorial shows how to call the API from Google Colab to detect labels in an image, and it is written to be accessible even for programming beginners.
I will use this image as an example:
Prerequisites
Ensure you have the following ready:
- A Google account
- Access to Google Colab
- A GCP account with available credit
Colab with Full Code
https://colab.research.google.com/drive/1JxtaGIuBMPgDC1Nh_Ld5vicoHhM7mWLM?usp=sharing
Step 1: Create a Project in GCP
- Visit the GCP Console Project Selector page: https://console.cloud.google.com/projectselector2/
- Sign in with your Google account if prompted.
- Click on the “New Project” button.
- Enter a project name and click “Create” to initialize your new project.
Step 2: Enable the Cloud Vision API
- In your GCP project dashboard, navigate to the “APIs & Services” dashboard.
- Click “Enable APIs and Services”.
- Search for “Cloud Vision API” and select it.
- Click “Enable” to activate the API for your project.
Preparing Google Colab
Step 1: Authentication
- Open Google Colab: https://colab.research.google.com/
- Install the Cloud Vision client library for Python by running the following command in a cell:
!pip install google-cloud-vision
- Authenticate your session with the following code:
from google.colab import auth
# Replace 'your-project-id' with your actual GCP project ID.
# The project ID is listed next to the project name in the GCP Console project selector.
auth.authenticate_user(project_id='your-project-id')
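The analysis script later in this guide expects the image to already be on the Colab machine. One way to get it there is Colab's upload dialog. The following is a minimal sketch, not part of the original notebook: the helper names `pick_uploaded_file` and `upload_image` are made up for illustration, and the only Colab call used is `files.upload()`, which returns a dictionary mapping each uploaded file name to its bytes.

```python
def pick_uploaded_file(uploaded):
    """Return the name of the first uploaded file, or raise if there is none.

    `uploaded` is expected to look like the dict returned by
    google.colab.files.upload(): {file_name: file_bytes}.
    """
    if not uploaded:
        raise ValueError("No file was uploaded")
    return next(iter(uploaded))


def upload_image():
    """Open Colab's upload dialog and return the uploaded file's name."""
    # Imported inside the function because google.colab only exists in Colab.
    from google.colab import files

    uploaded = files.upload()  # dict: file name -> file bytes
    return pick_uploaded_file(uploaded)
```

In a Colab cell you could then write `file_path = upload_image()` and pass `file_path` to the script instead of the hard-coded "image.jpg".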
Image Recognition with Python
Import the necessary library:
from google.cloud import vision
- Set up your GCP credentials. In Colab this is usually handled automatically after the authentication step.
- Prepare your image file (image.jpg) by uploading it to Colab.
- Use the following script to analyze the image:
# Create a Vision client
client = vision.ImageAnnotatorClient()
# The path to the image file you want to analyze
file_path = "image.jpg"
# Load the image
with open(file_path, 'rb') as image_file:
    content = image_file.read()
image = vision.Image(content=content)
# Request the API to detect labels in the image
response = client.label_detection(image=image)
# Print the detected labels
labels = response.label_annotations
print('Labels: ')
for label in labels:
    print(label.description)
Result:
This code initializes the Vision client, loads the desired image, requests the API to detect labels, and displays the results.
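Each label the API returns also carries a confidence `score` between 0 and 1, and the response has an `error` field whose `message` is non-empty when the request failed. The sketch below shows one possible way to use both. The helper names `summarize_labels` and `check_response` and the 0.5 threshold are assumptions for illustration, and the labels in `fake_labels` are made-up stand-in values so the block runs without calling the API.

```python
from types import SimpleNamespace


def summarize_labels(labels, min_score=0.5):
    """Return (description, score) pairs at or above min_score, best first."""
    kept = [(label.description, label.score)
            for label in labels if label.score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def check_response(response):
    """Raise if the API reported an error inside the response body."""
    if response.error.message:
        raise RuntimeError(response.error.message)


# Made-up stand-in data (not real API output) so the sketch runs offline.
fake_labels = [
    SimpleNamespace(description="Dog", score=0.97),
    SimpleNamespace(description="Grass", score=0.42),
    SimpleNamespace(description="Pet", score=0.88),
]

for description, score in summarize_labels(fake_labels):
    print(f"{description}: {score:.2f}")
```

With the script above, you would call `check_response(response)` right after `client.label_detection(...)` and then pass `response.label_annotations` to `summarize_labels`.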
Conclusion
By following these steps, you can start exploring the potential of Google Cloud Vision API for image analysis in your projects. This tutorial provides you with the basics to integrate advanced computer vision capabilities simply and effectively.
Congratulations on completing the tutorial! 😊