
How Can I Get The Contents Of The "feedback" Box From Google Searches?

When you ask a question or request the definition of a word in a Google search, Google gives you a summary of the answer in the 'feedback' box. For example, when you search for def

Solution 1:

This can be done with requests and bs4: you just need to pull the text from the div with the class lr_dct_ent.

import requests
from bs4 import BeautifulSoup

h = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36"}
r = requests.get("https://www.google.ie/search?q=define+apple", headers=h).text
soup = BeautifulSoup(r, "lxml")  # name the parser explicitly so bs4 does not have to guess

print("\n".join(soup.select_one("div.lr_dct_ent").text.split(";")))
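Google changes its markup often, so select_one may return None, and calling .text on that raises AttributeError. Below is a minimal sketch of a guard, run on a made-up HTML fragment rather than a live response. The function name extract_definition and the sample markup are assumptions for illustration, and html.parser is used only so the sketch does not depend on lxml.

```python
from bs4 import BeautifulSoup


def extract_definition(page_html, selector="div.lr_dct_ent"):
    """Return the definition box text split on ';', or None if the box is absent."""
    soup = BeautifulSoup(page_html, "html.parser")
    box = soup.select_one(selector)
    if box is None:  # class name changed, or a different page was served
        return None
    return [part.strip() for part in box.get_text().split(";") if part.strip()]


# Hypothetical fragment standing in for a real results page.
sample = '<div class="lr_dct_ent">apple; noun; the round fruit of a tree</div>'
print(extract_definition(sample))
print(extract_definition("<p>nothing</p>"))  # no box on this page
```

Returning None leaves it to the caller to decide whether a missing box is an error or simply "no definition found".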

The main text is in an ordered list, and the part of speech (here, noun) is in the div with the lr_dct_sf_h class:

In [11]: r = requests.get("https://www.google.ie/search?q=define+apple", headers=h).text
In [12]: soup = BeautifulSoup(r,"lxml")    
In [13]: div = soup.select_one("div.lr_dct_ent")    
In [14]: n_v = div.select_one("div.lr_dct_sf_h").text   
In [15]: expl = [li.text for li in div.select("ol.lr_dct_sf_sens li")]    
In [16]: print(n_v)
noun

In [17]: print("\n".join(expl))
1. the round fruit of a tree of the rose family, which typically has thin green or red skin and crisp flesh.used in names of unrelated fruits or other plant growths that resemble apples in some way, e.g. custard apple, oak apple.
used in names of unrelated fruits or other plant growths that resemble apples in some way, e.g. custard apple, oak apple.
2. the tree bearing apples, with hard pale timber that is used in carpentry and to smoke food.
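The sub-sense appears twice in the output above, most likely because the descendant selector also matches li elements nested inside the first sense, and a parent's .text includes its children's text. Below is a sketch of one way to keep only the top-level senses, using the CSS child combinator. The sample markup is invented to mimic that nesting and is not Google's real HTML.

```python
from bs4 import BeautifulSoup

# Invented markup: one top-level sense with a nested sub-sense, then a second sense.
sample = """
<ol class="lr_dct_sf_sens">
  <li>first sense.<ol><li>sub-sense.</li></ol></li>
  <li>second sense.</li>
</ol>
"""
soup = BeautifulSoup(sample, "html.parser")

# Descendant selector: matches the nested <li> too, so the sub-sense shows up twice.
all_items = [li.get_text(strip=True) for li in soup.select("ol.lr_dct_sf_sens li")]

# Child combinator: only <li> elements directly under the outer list.
top_items = [li.get_text(" ", strip=True) for li in soup.select("ol.lr_dct_sf_sens > li")]

print(len(all_items), len(top_items))
print(top_items)
```

Passing a separator to get_text also stops a sense and its sub-sense from running together with no space between them.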

Solution 2:

The question is a nice idea.

The program can be started with python3 defineterm.py apple

#! /usr/bin/env python3.5
# defineterm.py

import requests
from bs4 import BeautifulSoup
import sys
import html
import codecs

searchterm = ' '.join(sys.argv[1:])

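# Let requests URL-encode the query so that multi-word terms work.
res = requests.get('https://www.google.com/search', params={'q': 'define ' + searchterm})
try:
    res.raise_for_status()
except requests.HTTPError as exc:
    # Stop here: there is no page to parse if the request failed.
    sys.exit('error occurred while loading page: ' + str(exc))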

text = html.unescape(res.text)
soup = BeautifulSoup(text, 'lxml')
prettytext = soup.prettify()

# The next lines save the raw page for analysis; comment them out if not needed.
with open('rawpage.txt', 'w', encoding='utf-8') as frawpage:
    frawpage.write(prettytext)

firsttag = soup.find('h3', class_="r")
if firsttag is not None:
    print(firsttag.get_text())
    print()

# Google's markup changes over time, so check this tag (and the others searched for here) if the result is wrong.
secondtag = soup.find('div', {'style': 'color:#666;padding:5px 0'})
if secondtag is not None:
    print(secondtag.get_text())
    print()

termtags = soup.find_all("li", {"style": "list-style-type:decimal"})

# Number the definitions starting from 1.
for count, tag in enumerate(termtags, start=1):
    print(str(count) + '. ' + tag.get_text())
    print()
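A search term can contain spaces or characters such as & and #, which need to be percent-encoded before they go into a query string. Below is a small sketch of that encoding using quote_plus from the standard library's urllib.parse. The helper name build_define_url is made up for illustration.

```python
from urllib.parse import quote_plus


def build_define_url(term, base="https://www.google.com/search"):
    """Build a 'define <term>' search URL with the term safely encoded."""
    return base + "?q=" + quote_plus("define " + term)


print(build_define_url("apple"))
print(build_define_url("rock & roll"))  # the ampersand becomes %26
```

Without the encoding, the & in the second term would be read as the start of a new query parameter.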

Make the script executable (for example, with chmod +x defineterm.py).

Then this line can be added to ~/.bashrc:

alias defterm="/data/Scrape/google/defineterm.py "

replacing the path with the script's location on your machine.

After executing

source ~/.bashrc

the program can be started with:

defterm apple (or another term)

Solution 3:

One easy way is to find the CSS selectors for this text with the SelectorGadget browser extension.

from bs4 import BeautifulSoup
import requests, lxml

headers = {
    'User-agent':
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

html = requests.get('https://www.google.de/search?q=define apple', headers=headers)
soup = BeautifulSoup(html.text, 'lxml')

syllables = soup.select_one('.frCXef span').text
phonetic = soup.select_one('.g30o5d span span').text
noun = soup.select_one('.h3TRxf span').text
print(f'{syllables}\n{phonetic}\n{noun}')

# Output:
'''
ap·ple
ˈapəl
the round fruit of a tree of the rose family, which typically has thin red or green skin and crisp flesh. Many varieties have been developed as dessert or cooking fruit or for making cider.
'''
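Class names such as .frCXef look auto-generated and may change without notice, in which case the selectors above stop matching. Below is a sketch of trying several candidate selectors in order. The helper first_text is hypothetical, and both the sample markup and the second selector are invented for the example.

```python
from bs4 import BeautifulSoup


def first_text(soup, selectors):
    """Return the text of the first selector that matches, or None if none do."""
    for selector in selectors:
        tag = soup.select_one(selector)
        if tag is not None:
            return tag.get_text(strip=True)
    return None


# Invented fragment: only the second (made-up) class is present.
soup = BeautifulSoup('<div class="new-name"><span>ap·ple</span></div>', "html.parser")
syllables = first_text(soup, [".frCXef span", ".new-name span"])
print(syllables)
```

Keeping the selectors in a list means a markup change can be handled by adding one entry instead of rewriting the extraction code.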

Alternatively, you can do the same thing with the Google Direct Answer Box API from SerpApi. It's a paid API with a free trial of 5,000 searches.

Code to integrate:

from serpapi import GoogleSearch

params = {
  "api_key": "YOUR_API_KEY",
  "engine": "google",
  "q": "define apple",
  "google_domain": "google.com",
}

search = GoogleSearch(params)
results = search.get_dict()

syllables = results['answer_box']['syllables']
phonetic = results['answer_box']['phonetic']
noun = results['answer_box']['definitions'][0] # array output
print(f'{syllables}\n{phonetic}\n{noun}')

# Output:
'''
ap·ple
ˈapəl
the round fruit of a tree of the rose family, which typically has thin red or green skin and crisp flesh. Many varieties have been developed as dessert or cooking fruit or for making cider.
'''
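Not every query is likely to produce an answer box, and indexing results['answer_box'] directly would then raise KeyError. Below is a sketch of reading the same fields defensively. The results dict is a hand-written stand-in shaped like the fields used above, not real API output, and summarize_answer_box is a hypothetical helper.

```python
def summarize_answer_box(results):
    """Pull syllables, phonetic and first definition, tolerating missing keys."""
    box = results.get("answer_box") or {}
    definitions = box.get("definitions") or []
    return {
        "syllables": box.get("syllables"),
        "phonetic": box.get("phonetic"),
        "definition": definitions[0] if definitions else None,
    }


# Hand-written stand-in for search.get_dict(); values are illustrative only.
fake_results = {
    "answer_box": {
        "syllables": "ap·ple",
        "phonetic": "ˈapəl",
        "definitions": ["the round fruit of a tree of the rose family"],
    }
}
print(summarize_answer_box(fake_results))
print(summarize_answer_box({}))  # every field is None when there is no answer box
```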

Disclaimer: I work for SerpApi.

