2016-12-23

Trying to optimize a Python API read script

So I am writing a script to communicate with our asset-management API server and pull some information. I have found that the part of the script with the longest total time is:

{method 'read' of '_ssl._SSLSocket' objects}

Currently we are pulling information on 25 assets or so, and that particular part takes 18.89 seconds.
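
For reference, a profile line like the one above can be produced with Python's standard cProfile module; a minimal way to reproduce it from inside the script (assuming the main() entry point shown below):

    import cProfile
    import pstats

    cProfile.run('main()', 'profile.out')  # profile one full run and save the stats
    pstats.Stats('profile.out').sort_stats('tottime').print_stats(10)  # top 10 calls by total time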

Is there a way to optimize this so that it doesn't take 45 minutes to do all 2,700 computers we have?

I can provide a copy of the actual code if that would be helpful.

import urllib2
import base64
import json
import csv

# Count number so that the process only runs for 25 assets at a time; will be
# replaced with a variable determined by the number of computers added
# to the list
Count_Stop = 25

final_output_list = []


def get_creds():
    # Credentials function that retrieves username:pw from a dotfile
    with open('.cred') as cred_file:
        cred_string = cred_file.read().rstrip()
        return cred_string


def get_all_assets():
    # Function to retrieve computer IDs + computer names and store the IDs
    # in a new list called computer_ids
    request = urllib2.Request('jss'
                              'JSSResource/computers')
    creds = get_creds()
    request.add_header('Authorization', 'Basic ' + base64.b64encode(creds))
    response = urllib2.urlopen(request).read()
    # At this point the ID + name data has been retrieved; parse it as JSON
    parsed_ids_json = json.loads(response)
    # Assign the parsed list (which has nested lists) at key 'computers'
    # to a new list variable called computer_set
    computer_set = parsed_ids_json['computers']
    # New list to store just the computer IDs obtained in the loop below
    computer_ids = []
    # Count variable; when equal to the max # of computers in Count_Stop it stops
    count = 0
    # This for loop iterates over ID + name in computer_set and appends the ID
    # to the list computer_ids
    for computers in computer_set:
        count += 1
        computer_ids.append(computers['id'])
        # This IF condition allows the script to be tested at 25 assets
        # instead of all 2,000+ (comment out the other announce_all_assets call)
        if count == Count_Stop:
            announce_all_assets(computer_ids, count)
    # announce_all_assets(computer_ids, count)


def announce_all_assets(computer_ids, count):
    print('Final list of IDs for review: ' + str(computer_ids))
    print('Total number of computers to check against JSS: ' +
          str(count))
    extension_attribute_request(computer_ids, count)


def extension_attribute_request(computer_ids, count):
    # Build the URL used in the loop to get extension attributes
    # for each computer ID in computer_ids
    base_url = 'jss'
    what_we_want = '/subset/extensionattributes'
    creds = get_creds()
    print('Extension attribute function starts now:')
    for ids in computer_ids:
        request_url = base_url + str(ids) + what_we_want
        request = urllib2.Request(request_url)
        request.add_header('Authorization', 'Basic ' + base64.b64encode(creds))
        response = urllib2.urlopen(request).read()
        parsed_ext_json = json.loads(response)
        ext_att_json = parsed_ext_json['computer']['extension_attributes']
        # Process each asset's attributes inside the loop so that every
        # asset is recorded, not just the last one
        retrieve_all_ext(ext_att_json)


def retrieve_all_ext(ext_att_json):
    new_computer = {}
    # new_computer['original_id'] = ids['id']
    # new_computer['original_name'] = ids['name']
    for computer in ext_att_json:
        new_computer[str(computer['name'])] = computer['value']
    # Append the completed record once, after all attributes are collected
    add_to_master_list(new_computer)


def add_to_master_list(new_computer):
    final_output_list.append(new_computer)
    print(final_output_list)


def main():
    # Function to run the get_all_assets function
    get_all_assets()


if __name__ == '__main__':
    # Run the functions in order: main > get_all_assets > ...
    main()

That wouldn't just be helpful, it would be _necessary_. Why else tag the question as "Python"?


Thanks for the feedback! The code has been added, minus the URLs, since they are internal.


I'll let someone else try to answer. At least now it's possible.

Answer


I would strongly recommend using the 'requests' module over 'urllib2'. It handles a lot of things for you and will save you many headaches.

I believe it will also give you better performance, but I would love to hear your feedback.

Here is your code using requests. (I have inserted blank lines to mark my changes. Note the built-in .json() decoder.):

# Requires the requests module to be installed:
# `pip install requests` or `pip3 install requests`
# https://pypi.python.org/pypi/requests/
import requests

import base64
import json
import csv

# Count number so that the process only runs for 25 assets at a time; will be
# replaced with a variable determined by the number of computers added
# to the list
Count_Stop = 25

final_output_list = []

def get_creds():
    # Credentials function that retrieves username:pw from a dotfile
    with open('.cred') as cred_file:
        cred_string = cred_file.read().rstrip()
        return cred_string

def get_all_assets():
    # Function to retrieve computer IDs + computer names and store the IDs
    # in a new list called computer_ids

    base_url = 'jss'
    what_we_want = 'JSSResource/computers'
    request_url = base_url + what_we_want

    # NOTE the request_url is constructed based on your request assignment
    # just below. As such, it is malformed as a URL, and I assume anonymized
    # for your posting on SO.
    # request = urllib2.Request('jss'
    #                           'JSSResource/computers')

    creds = get_creds()

    headers = {
        'Authorization': 'Basic ' + base64.b64encode(creds),
    }
    response = requests.get(request_url, headers=headers)
    parsed_ids_json = response.json()

    # [NO NEED FOR THE FOLLOWING. requests HANDLES JSON DECODING. SEE ABOVE.]
    # At this point the request for ID + name has been retrieved and now to be
    # formatted in json
    # parsed_ids_json = json.loads(response)

    # Then assign the parsed list (which has nested lists) at key 'computers'
    # to a new list variable called computer_set
    computer_set = parsed_ids_json['computers']
    # New list to store just the computer IDs obtained in the loop below
    computer_ids = []
    # Count variable; when equal to the max # of computers in Count_Stop it stops
    count = 0
    # This for loop iterates over ID + name in computer_set and appends the ID
    # to the list computer_ids
    for computers in computer_set:
        count += 1
        computer_ids.append(computers['id'])
        # This IF condition allows the script to be tested at 25 assets
        # instead of all 2,000+ (comment out the other announce_all_assets call)
        if count == Count_Stop:
            announce_all_assets(computer_ids, count)
    # announce_all_assets(computer_ids, count)

def announce_all_assets(computer_ids, count):
    print('Final list of IDs for review: ' + str(computer_ids))
    print('Total number of computers to check against JSS: ' +
          str(count))
    extension_attribute_request(computer_ids, count)

def extension_attribute_request(computer_ids, count):
    # Build the URL used in the loop to get extension attributes
    # for each computer ID in computer_ids
    base_url = 'jss'
    what_we_want = '/subset/extensionattributes'
    creds = get_creds()
    print('Extension attribute function starts now:')
    for ids in computer_ids:
        request_url = base_url + str(ids) + what_we_want

        headers = {
            'Authorization': 'Basic ' + base64.b64encode(creds),
        }
        response = requests.get(request_url, headers=headers)
        parsed_ext_json = response.json()

        ext_att_json = parsed_ext_json['computer']['extension_attributes']
        # Process each asset's attributes inside the loop so that every
        # asset is recorded, not just the last one
        retrieve_all_ext(ext_att_json)

def retrieve_all_ext(ext_att_json):
    new_computer = {}
    # new_computer['original_id'] = ids['id']
    # new_computer['original_name'] = ids['name']
    for computer in ext_att_json:
        new_computer[str(computer['name'])] = computer['value']
    # Append the completed record once, after all attributes are collected
    add_to_master_list(new_computer)

def add_to_master_list(new_computer):
    final_output_list.append(new_computer)
    print(final_output_list)

def main():
    # Function to run the get_all_assets function
    get_all_assets()

if __name__ == '__main__':
    # Run the functions in order: main > get_all_assets > ...
    main()

Please let me know the relative performance time against your 25 assets in 18.89 seconds! I am very curious.
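
One note beyond the code above: each requests.get() opens a fresh HTTPS connection, so part of the time charged to the _ssl read method is likely TLS handshake overhead. A requests.Session reuses the underlying connection across calls. A minimal sketch, assuming the same anonymized 'jss' base URL and the get_creds() helper from the code above:

    import requests
    import base64

    session = requests.Session()  # keeps the TLS connection open between requests

    creds = get_creds()  # same helper as in the code above
    session.headers.update({
        'Authorization': 'Basic ' + base64.b64encode(creds.encode()).decode(),
    })

    def fetch_extension_attributes(computer_id):
        # 'jss' is the same anonymized base URL used throughout this post
        url = 'jss' + str(computer_id) + '/subset/extensionattributes'
        return session.get(url).json()['computer']['extension_attributes']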


If performance really is a very big concern, I would suggest considering PyCurl. I haven't used it myself, but it comes up frequently in discussions of very high-performance environments.
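
Independent of the HTTP library, the per-asset fetches above run one at a time; issuing them concurrently is a common complementary optimization. A minimal sketch using the standard library's ThreadPoolExecutor, purely illustrative and not proposed in the thread (fetch_extension_attributes is the hypothetical helper sketched earlier):

    from concurrent.futures import ThreadPoolExecutor

    def fetch_all(computer_ids, max_workers=10):
        # Run up to max_workers requests at a time; results return in input order
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(fetch_extension_attributes, computer_ids))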

Answer

I would still recommend my other answer below (?) regarding use of the requests module, from a pure cleanliness perspective (requests is very clean to work with), but I recognize it may or may not address your original question.

If you want to try pycurl, which likely will affect your original question, here is the same code implemented with that approach:

# Requires the pycurl module to be installed:
# `pip install pycurl` or `pip3 install pycurl`
# https://pypi.python.org/pypi/pycurl/7.43.0
# NOTE: The syntax used herein for pycurl is Python 3 compliant,
# not Python 2 compliant.
import pycurl

import base64
import json
import csv
from io import BytesIO

def pycurl_data(url, headers):
    # Perform a GET request with pycurl, collecting the body in a buffer
    buffer = BytesIO()
    connection = pycurl.Curl()
    connection.setopt(connection.URL, url)
    connection.setopt(pycurl.HTTPHEADER, headers)
    connection.setopt(connection.WRITEDATA, buffer)
    connection.perform()
    connection.close()

    body = buffer.getvalue()
    # NOTE: The following assumes a byte string and a utf8 format. Change as desired.
    return json.loads(body.decode('utf8'))

# Count number so that the process only runs for 25 assets at a time; will be
# replaced with a variable determined by the number of computers added
# to the list
Count_Stop = 25

final_output_list = []

def get_creds():
    # Credentials function that retrieves username:pw from a dotfile
    with open('.cred') as cred_file:
        cred_string = cred_file.read().rstrip()
        return cred_string

def get_all_assets():
    # Function to retrieve computer IDs + computer names and store the IDs
    # in a new list called computer_ids
    base_url = 'jss'
    what_we_want = 'JSSResource/computers'
    request_url = base_url + what_we_want

    # NOTE the request_url is constructed based on your request assignment
    # just below. As such, it is malformed as a URL, and I assume anonymized
    # for your posting on SO.
    # request = urllib2.Request('jss'
    #                           'JSSResource/computers')

    creds = get_creds()

    # b64encode needs bytes in Python 3; decode back to str for the header
    headers = ['Authorization: Basic ' + base64.b64encode(creds.encode()).decode()]
    # pycurl_data already returns the decoded JSON as a Python dict
    parsed_ids_json = pycurl_data(request_url, headers)

    # Then assign the parsed list (which has nested lists) at key 'computers'
    # to a new list variable called computer_set
    computer_set = parsed_ids_json['computers']
    # New list to store just the computer IDs obtained in the loop below
    computer_ids = []
    # Count variable; when equal to the max # of computers in Count_Stop it stops
    count = 0
    # This for loop iterates over ID + name in computer_set and appends the ID
    # to the list computer_ids
    for computers in computer_set:
        count += 1
        computer_ids.append(computers['id'])
        # This IF condition allows the script to be tested at 25 assets
        # instead of all 2,000+ (comment out the other announce_all_assets call)
        if count == Count_Stop:
            announce_all_assets(computer_ids, count)
    # announce_all_assets(computer_ids, count)

def announce_all_assets(computer_ids, count):
    print('Final list of IDs for review: ' + str(computer_ids))
    print('Total number of computers to check against JSS: ' +
          str(count))
    extension_attribute_request(computer_ids, count)

def extension_attribute_request(computer_ids, count):
    # Build the URL used in the loop to get extension attributes
    # for each computer ID in computer_ids
    base_url = 'jss'
    what_we_want = '/subset/extensionattributes'
    creds = get_creds()
    print('Extension attribute function starts now:')
    for ids in computer_ids:
        request_url = base_url + str(ids) + what_we_want

        headers = ['Authorization: Basic ' + base64.b64encode(creds.encode()).decode()]
        # pycurl_data returns the decoded JSON as a Python dict
        parsed_ext_json = pycurl_data(request_url, headers)

        ext_att_json = parsed_ext_json['computer']['extension_attributes']
        # Process each asset's attributes inside the loop so that every
        # asset is recorded, not just the last one
        retrieve_all_ext(ext_att_json)

def retrieve_all_ext(ext_att_json):
    new_computer = {}
    # new_computer['original_id'] = ids['id']
    # new_computer['original_name'] = ids['name']
    for computer in ext_att_json:
        new_computer[str(computer['name'])] = computer['value']
    # Append the completed record once, after all attributes are collected
    add_to_master_list(new_computer)

def add_to_master_list(new_computer):
    final_output_list.append(new_computer)
    print(final_output_list)

def main():
    # Function to run the get_all_assets function
    get_all_assets()

if __name__ == '__main__':
    # Run the functions in order: main > get_all_assets > ...
    main()

Just a thought. I don't want to assume you're unfamiliar with your API, but on the off chance... Could there be a mechanism through which you can request all computers _with_ extension attributes in a single JSON return? That would go a long way toward your solution. It might require fetching multiple pages, but even that would involve far fewer network interactions than retrieving each computer record individually. If no such mechanism exists, it might be worth implementing it, or taking it to whoever owns the API as a suggestion.
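
To illustrate what such a mechanism might look like, here is a hedged sketch of a paged bulk fetch. The endpoint, paging scheme, and response shape are all hypothetical; the real API may or may not offer anything similar:

    import requests

    def fetch_all_computers_bulk(session):
        # HYPOTHETICAL paged endpoint returning computers with their extension
        # attributes inlined; one request per page instead of one per computer.
        computers = []
        page = 1
        while True:
            response = session.get('jss' + 'JSSResource/computers/page/' + str(page))
            batch = response.json().get('computers', [])
            if not batch:
                break
            computers.extend(batch)
            page += 1
        return computers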