The Indexing API is an application programming interface that lets website owners notify Google when pages are added or removed. This allows Google to crawl and index those pages much more quickly. It is primarily intended for short-lived content, such as job postings and livestreamed videos.
The Indexing API can be used to instruct Google to add or remove pages from its index. Each request must specify the location of a web page. You can also check the status of the notifications you have sent. For now, the Indexing API can only be used for pages that contain either JobPosting structured data or BroadcastEvent embedded in a VideoObject.
To build this script in Python we will use Google Colab, and we will also need the oauth2client and google-api-python-client libraries.
Use the command below to install these libraries in Google Colab:
!pip install oauth2client google-api-python-client
Alternatively, you can run the following command in the Windows Command Prompt or the macOS Terminal to install the libraries:
pip install oauth2client google-api-python-client
After installation, import the required libraries with the following code:
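A minimal set of imports for this script might look like this (the google.colab import is only needed when running in Colab):

from oauth2client.service_account import ServiceAccountCredentials
from googleapiclient.discovery import build
from googleapiclient.http import BatchHttpRequest
import httplib2
from google.colab import files  # only needed when running in Google Colab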
The next step is to put our URLs into a text file so that we can notify Google of any changes, new pages, or deleted pages. Remember that a single batch request to the Indexing API can contain at most 100 URLs.
When using Google Colab, you can upload the text file containing your URLs with the following code:
uploaded_file = files.upload()
The next step is to build a dictionary so that the URLs are ready for sending requests. The following code achieves that:
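A minimal sketch, assuming the uploaded file is named urls.txt and contains one URL per line:

# Read the uploaded text file (assumed to be named urls.txt)
with open("urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

# Map each URL to the notification type we want to send
requests_dict = {url: "URL_UPDATED" for url in urls}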
Note that this code creates the dictionary needed for updating or publishing new content. If you need to remove URLs instead, use URL_DELETED in place of URL_UPDATED.
Update a URL: Follow the steps below to inform Google that a new URL has been submitted or that content at an existing URL has been updated:
Send an HTTP POST request to https://indexing.googleapis.com/v3/urlNotifications:publish
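The body of the request contains the URL and the notification type; for example (with a placeholder URL):

{
  "url": "https://example.com/my-updated-page",
  "type": "URL_UPDATED"
}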
Remove a URL:
When you delete a page from your servers or add a <meta name="robots" content="noindex" /> tag to the <head> section of a page, notify Google so that it can remove the page from its index and avoid trying to crawl and index it again. Before you request removal, the URL must return a 404 or 410 status code, or the page must contain the <meta name="robots" content="noindex" /> tag.
Follow the steps below to request removal from Google's index:
Send an HTTP POST request to https://indexing.googleapis.com/v3/urlNotifications:publish
For example, the body of a removal request looks like this (with a placeholder URL):
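{
  "url": "https://example.com/my-deleted-page",
  "type": "URL_DELETED"
}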
To create and activate the API, go to the Google Developer Console, click "Select a project," and then choose "New project" to create a brand-new project.
After that, you must choose a name for your project and click "Create."
After creating the project, select it from the project section of the menu, then open "IAM & Admin" in the left menu and select "Service Accounts."
Next, click "Create service account" to begin account creation. Choose a name for your account in the first field, then click "Create and continue." In the "Grant this service account access to project" step, choose a role for your account: select "Owner," listed under "Basic" in the Quick access menu. After that, simply click "Done" to complete the process without making any further changes.
Save the email address shown in the "Email" field on the newly opened page; we'll need it later. Then click "Actions" and select "Manage keys."
Click on "Create new key" in the "Add Key" area of the newly opened page, then build a JSON file, and save the downloaded file.
It's time to enable the Indexing API. To do this, go to the "APIs & Services" section and click "Enable APIs & Services."
Search for "Indexing API" on the following page. After choosing it, press "Enable" to start using your API.
Before you can use the Indexing API, you must set up an access account for it in Google Search Console. Open your Google Search Console, go to the "Settings" tab, click "Users and permissions," and then select "Add user." On the page that appears, enter the email you saved earlier and set its permission to "Owner."
If you forgot to save the email, simply return to your service accounts and copy it from there. You can also find it by opening the JSON file you downloaded earlier and searching for "client_email."
Use the following code to upload the JSON file to Google Colab:
json_key = files.upload()
The next step is to verify where the uploaded files are located, which can be done with the os library. For example, you can check that the key file has been uploaded with an "if" statement like the following:
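import os

# Assuming the downloaded key file is named key.json; adjust to your own file name
if "key.json" in os.listdir():
    print("The JSON key file has been uploaded.")
else:
    print("The JSON key file was not found in the working directory.")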
As Google states, software and scripts that use the Indexing API must use OAuth 2.0 to authorize requests. The scope needed to send requests with OAuth 2.0 is given below; see the Authorize Requests page of the Indexing API documentation for further details.
SCOPES = ["https://www.googleapis.com/auth/indexing"]
According to Google's explanation on the Using the Indexing API page, requests must be sent with the POST method to one of the following endpoints:
If you only want to make one request:
ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"
To send many requests at once:
ENDPOINT = "https://indexing.googleapis.com/batch"
The requests are added one at a time, but because we previously created a dictionary of up to 100 URLs, they are sent together as a batch. Now we need to create the ServiceAccountCredentials object and then authorize the credentials; to do this, we use the oauth2client library.
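A minimal sketch, assuming the key file is named key.json and that SCOPES was defined above:

credentials = ServiceAccountCredentials.from_json_keyfile_name("key.json", scopes=SCOPES)

# Authorize the credentials with an httplib2.Http object
http = credentials.authorize(httplib2.Http())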
The next stage is to build the service, after which we will create a callback function and then send the final batch request. You can find more details about how these scripts operate on the Class BatchHttpRequest page. The final code in this section appears as follows:
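A sketch of those final steps, assuming the requests_dict dictionary and the authorized http object created earlier:

# Build the Indexing API service using the authorized http object
service = build("indexing", "v3", http=http)

def insert_event(request_id, response, exception):
    # Callback that prints the result of each request in the batch
    if exception is not None:
        print(exception)
    else:
        print(response)

# Create a batch request pointed at the batch endpoint defined above
batch = BatchHttpRequest(callback=insert_event, batch_uri=ENDPOINT)

# Add one publish request per URL in the dictionary
for url, api_type in requests_dict.items():
    batch.add(service.urlNotifications().publish(body={"url": url, "type": api_type}))

batch.execute(http=http)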
Code updated on November 21st, 2023.
If everything was done correctly, the results will be printed after the requests are sent.
Google scans a page's text, images, and video files and stores the information in its massive database, the Google Index; when serving search results, Google displays only the pages that are relevant to the user's query. Separately, Python's built-in index() method lets you find the position of a character in a string or an item in a list. It returns the lowest index at which the specified element appears; if the requested item isn't present, a ValueError is raised.
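For instance:

fruits = ["apple", "banana", "cherry"]
print(fruits.index("banana"))  # prints 1, the position of "banana"
# fruits.index("mango") would raise a ValueError because "mango" is not in the list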
Google's Indexing API permits you to update a URL, remove a URL, get the status of a request, and send batch indexing requests.
Google's Indexing API is a tool that allows website owners to directly notify Google about updates or changes made to their website's content. This is done by sending a request to the API, which will then trigger Google's indexing process for that particular page.
Using the Indexing API in bulk with Python allows website owners to submit multiple URLs at once, which can save a lot of time and effort compared to submitting each URL individually. Python is a popular programming language for working with APIs and web scraping, making it a natural choice for automating the process of submitting URLs to the Indexing API.
Submitting URLs in bulk can be particularly useful for large websites or websites that frequently update their content, as it ensures that Google is aware of all changes promptly. It can also be helpful for websites that are newly launched or have undergone significant updates, as it can help to get the site indexed and ranked more quickly.
Overall, using Google's Indexing API in bulk with Python can streamline the process of keeping a website's content up-to-date and ensuring that it is properly indexed by Google.
There are several benefits to using Google's Indexing API in bulk with Python: it saves time and effort compared to submitting each URL individually, it ensures Google is promptly aware of all changes on large or frequently updated sites, and it can help new or significantly updated sites get indexed and ranked more quickly.