Thursday, January 27, 2022

[FIXED] How can I get the JSON data automatically instead of copying and pasting it manually?

January 27, 2022 python-3.x, scrapy No comments

Issue

I want to get the JSON data at the target URL:
https://xueqiu.com/stock/cata/stocktypelist.json?page=1&size=300

To get it manually, I open it in a browser, then copy and paste. I want a smarter way, programmatic and automatic. I have tried several approaches, and all of them failed.
Method 1--traditional way with wget or curl:

wget  https://xueqiu.com/stock/cata/stocktypelist.json?page=1&size=300
--2021-02-09 11:55:44--  https://xueqiu.com/stock/cata/stocktypelist.json?page=1
Resolving xueqiu.com (xueqiu.com)... 39.96.249.191
Connecting to xueqiu.com (xueqiu.com)|39.96.249.191|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2021-02-09 11:55:44 ERROR 403: Forbidden.
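
One detail in the log above: the requested URL ends at page=1. An unquoted & is a control operator in the shell, so everything after it is dropped from the wget command. The sketch below builds the full URL and a shell-safe command line using only the standard library; the names BASE_URL and build_url are hypothetical, chosen for illustration. Quoting only repairs the truncation and is not expected to cure the 403 response by itself.

```python
import shlex
from urllib.parse import urlencode

# Endpoint taken from the question; BASE_URL and build_url are illustrative names.
BASE_URL = "https://xueqiu.com/stock/cata/stocktypelist.json"


def build_url(page, size):
    """Return the endpoint URL with the query string properly encoded."""
    return f"{BASE_URL}?{urlencode({'page': page, 'size': size})}"


url = build_url(1, 300)

# shlex.quote wraps the URL in single quotes, so the shell passes the
# '&size=300' part to wget instead of treating '&' as a control operator.
command = f"wget {shlex.quote(url)}"
print(command)
```

The printed command can be pasted into a terminal as is; the same quoting applies to curl.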

Method 2--scrapy with selenium:

>>> from selenium import webdriver
>>> browser = webdriver.Chrome()
>>> url="https://xueqiu.com/stock/cata/stocktypelist.json?page=1&size=300"
>>> browser.get(url)

This is what I get in the browser:

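{"error_description":"遇到错误，请刷新页面或者重新登录帐号后再试","error_uri":"/stock/cata/stocktypelist.json","error_code":"400016"}

(In English, the error_description reads: "An error occurred. Please refresh the page, or log in to your account again, and retry.")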

Method 3--set up a mitmproxy proxy:

mitmweb   --listen-host  127.0.0.1  -p  8080

Set the proxy in the browser, then open the target URL in the browser.

Error info in terminal:

Web server listening at http://127.0.0.1:8081/
Opening in existing browser session.
Proxy server listening at http://127.0.0.1:8080
127.0.0.1:41268: clientconnect
127.0.0.1:41270: clientconnect
127.0.0.1:41268: HTTP/2 connection terminated by client: error code: 0, last stream id: 0, additional data: None

Error info in browser:

error_description   "遇到错误，请刷新页面或者重新登录帐号后再试"
error_uri   "/stock/cata/stocktypelist.json"
error_code  "400016"

The site protects its data very effectively. Is there no way to get the data automatically?

Solution

You could use the requests module, passing the cookies and headers that the browser sends:

import requests

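# Cookies copied from a browser session on xueqiu.com.
# These values expire, so replace them with fresh ones from your own browser.
cookies = {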
    'xq_a_token': '176b14b3953a7c8a2ae4e4fae4c848decc03a883',
    'xqat': '176b14b3953a7c8a2ae4e4fae4c848decc03a883',
    'xq_r_token': '2c9b0faa98159f39fa3f96606a9498edb9ddac60',
    'xq_id_token': 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOi0xLCJpc3MiOiJ1YyIsImV4cCI6MTYxMzQ0MzE3MSwiY3RtIjoxNjEyODQ5MDY2ODI3LCJjaWQiOiJkOWQwbjRBWnVwIn0.VuyNicSjIvVkp9FrCzIlRyx8487XM4HH1C3X9KsFA2FipFiilSifBhux9pMNRyziHHiEifhX-xOgccc8IG1mn8cOylOVy3b-L1YG2T5Hs8MKgx7qm4gnV5Mzm_5_G5BiNtO44aczUcmp0g53dp7-0_Bvw3RlwXzT1DTvCKTV-s_zfBsOPyFTfiqyDUxU-oBRvkz1GpgVJzJL4EmZ8zDE2PBqeW00ueLLC7qPW50WeDCsEFS4ZPAvd2SbX9JPk-lU2WzlcMck2S9iFYmpDwuTeQuPbSeSl6jt5suwTImSgJDIUP9o2TX_Z7nNRDTYxvbP8XlejSt8X0pRDPDd_zpbMQ',
    'u': '661612849116563',
    'device_id': '24700f9f1986800ab4fcc880530dd0ed',
    'Hm_lvt_1db88642e346389874251b5a1eded6e3': '1612849123',
    's': 'c111f3y1kn',
    'Hm_lpvt_1db88642e346389874251b5a1eded6e3': '1612849252',
}

# Request headers copied from a Chrome 88 browser session.
headers = {
    'Connection': 'keep-alive',
    'Cache-Control': 'no-cache',
    'sec-ch-ua': '"Chromium";v="88", "Google Chrome";v="88", ";Not A Brand";v="99"',
    'sec-ch-ua-mobile': '?0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36',
    'Accept': 'image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-Mode': 'no-cors',
    'Sec-Fetch-User': '?1',
    'Sec-Fetch-Dest': 'image',
    'Accept-Language': 'en-US,en;q=0.9',
    'Pragma': 'no-cache',
    'Referer': '',
}

params = (
    ('page', '1'),
    ('size', '300'),
)

response = requests.get(
    'https://xueqiu.com/stock/cata/stocktypelist.json',
    headers=headers,
    params=params,
    cookies=cookies,
)
print(response.status_code)
json_data = response.json()  # parse the response body as JSON
print(json_data)
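
The cookie values above come from one browser session and stop working once they expire. The xq_id_token value has the three-part shape of a JSON Web Token, so its payload can be decoded (without verifying the signature) to see when it lapses. Below is a minimal sketch using only the standard library. The helper names jwt_payload and jwt_expiry are hypothetical, and EXAMPLE_TOKEN reuses the header and payload segments of the token above with a placeholder in place of the signature.

```python
import base64
import json
from datetime import datetime, timezone

# Header and payload segments of the xq_id_token shown above; the signature
# segment is replaced by a placeholder because it is not needed for decoding.
EXAMPLE_TOKEN = (
    "eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9"
    ".eyJ1aWQiOi0xLCJpc3MiOiJ1YyIsImV4cCI6MTYxMzQ0MzE3MSwiY3RtIjoxNjEyODQ5"
    "MDY2ODI3LCJjaWQiOiJkOWQwbjRBWnVwIn0"
    ".signature-placeholder"
)


def jwt_payload(token):
    """Decode the payload segment of a JWT into a dict (signature not checked)."""
    payload_b64 = token.split(".")[1]
    # base64url segments omit their '=' padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def jwt_expiry(token):
    """Return the token's 'exp' claim as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(jwt_payload(token)["exp"], tz=timezone.utc)


def is_expired(token, now=None):
    """Report whether the token's expiry time has already passed."""
    now = now or datetime.now(timezone.utc)
    return jwt_expiry(token) <= now


print(jwt_payload(EXAMPLE_TOKEN))
print(jwt_expiry(EXAMPLE_TOKEN).isoformat())
```

For the token in this answer the exp claim is 1613443171, which falls on 16 February 2021 (UTC), about a week after the cookies were captured. A script that relies on hard-coded cookies could call is_expired first and ask for fresh values when it returns True.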

Answered By - Samsul Islam

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
