[Workshop] Google Cloud Vision API and Colab: A Beginner’s Guide to Image Analysis

Mar 23, 2024

Introduction

Google Cloud Vision API analyzes images and extracts information from them through features such as object detection, face detection, and text extraction. This tutorial shows how to use the API in Google Colab to detect labels in an image, at a pace suited to programming beginners.

I will use this image as an example:

Prerequisites

Ensure you have the following ready:

A Google account
Access to Google Colab
A GCP account with available credit

Colab with Full Code

https://colab.research.google.com/drive/1JxtaGIuBMPgDC1Nh_Ld5vicoHhM7mWLM?usp=sharing

Step 1: Create a Project in GCP

Visit the GCP Console Project Selector page: https://console.cloud.google.com/projectselector2/
Sign in with your Google account if prompted.
Click on the “New Project” button.
Enter a project name and click “Create” to initialize your new project.

Step 2: Enable the Cloud Vision API

In your GCP project, navigate to the “APIs & Services” dashboard.
Click “Enable APIs and Services”.
Search for “Cloud Vision API” and select it.
Click “Enable” to activate the API for your project.

Preparing Google Colab

Step 1: Authentication

Open Google Colab: https://colab.research.google.com/
Install the Cloud Vision client library by running the following command in a cell:

!pip install google-cloud-vision

Authenticate your session with the following code:

from google.colab import auth
# Replace 'your-project-id' with your actual GCP project ID.
# the first image shows an example of project id
auth.authenticate_user(project_id='your-project-id')
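A mistyped project ID is an easy mistake at this step. The sketch below is one possible way to catch it early: it checks the ID against the format GCP documents for project IDs (6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen) before authenticating. The helper names looks_like_project_id and authenticate are hypothetical and not part of any Google library, and the check only covers the format, not whether the project actually exists.

```python
import re

# Project ID format as documented by GCP: 6-30 characters, lowercase letters,
# digits and hyphens, starting with a letter and not ending with a hyphen.
_PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def looks_like_project_id(project_id: str) -> bool:
    """Return True if the string has the shape of a GCP project ID.

    Hypothetical helper: it validates the format only and cannot tell
    whether the project really exists.
    """
    return bool(_PROJECT_ID_RE.fullmatch(project_id))


def authenticate(project_id: str) -> None:
    """Authenticate the Colab session after a basic format check."""
    if not looks_like_project_id(project_id):
        raise ValueError(f"{project_id!r} does not look like a GCP project ID")
    # Imported here because google.colab is only available inside Colab.
    from google.colab import auth
    auth.authenticate_user(project_id=project_id)


print(looks_like_project_id("my-vision-demo-123"))
print(looks_like_project_id("My Project"))
```

In Colab you would call authenticate with your own project ID in place of the auth.authenticate_user line. The placeholder 'your-project-id' passes the format check, so it still has to be replaced by hand.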

Image Recognition with Python

Import the necessary library:

from google.cloud import vision

Set up your GCP credentials. In Colab this is usually handled automatically after authentication.

Prepare your image file (image.jpg) by uploading it to Colab, for example through the Files panel in the left sidebar.
Use the following script to analyze the image:

# Create a Vision client
client = vision.ImageAnnotatorClient()

# The path to the image file you want to analyze
file_path = "image.jpg"

# Load the image
with open(file_path, 'rb') as image_file:
    content = image_file.read()
image = vision.Image(content=content)

# Request the API to detect labels in the image
response = client.label_detection(image=image)

# Print the detected labels
labels = response.label_annotations
print('Labels: ')
for label in labels:
    print(label.description)

Result:

This code initializes the Vision client, loads the desired image, requests the API to detect labels, and displays the results.
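The script prints only the label descriptions. Each label annotation also carries a confidence score, and the response has an error field that is worth checking before reading the results. The following is a minimal sketch of both ideas, not part of the original notebook: format_labels is a hypothetical helper, and fake_response is a stand-in object with invented values, used here so the example runs without calling the API.

```python
from types import SimpleNamespace


def format_labels(response, min_score=0.0):
    """Return 'description: score' strings for labels at or above min_score.

    Hypothetical helper. It expects an object shaped like the response of
    client.label_detection: an `error.message` string and a list of
    `label_annotations`, each with `description` and `score`.
    """
    if response.error.message:
        raise RuntimeError(f"Vision API error: {response.error.message}")
    return [
        f"{label.description}: {label.score:.2f}"
        for label in response.label_annotations
        if label.score >= min_score
    ]


# Stand-in response with made-up values, for illustration only.
fake_response = SimpleNamespace(
    error=SimpleNamespace(message=""),
    label_annotations=[
        SimpleNamespace(description="Dog", score=0.97),
        SimpleNamespace(description="Grass", score=0.62),
    ],
)

for line in format_labels(fake_response, min_score=0.7):
    print(line)
```

In the notebook, you would pass the real response from client.label_detection(image=image) instead of fake_response. Choosing min_score is a judgment call: a higher threshold returns fewer labels, keeping only the ones the API is more confident about.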

Conclusion

By following these steps, you can start exploring the potential of Google Cloud Vision API for image analysis in your projects. This tutorial provides you with the basics to integrate advanced computer vision capabilities simply and effectively.

Congratulations on completing the tutorial! 😊