2016-05-18

Answer


You should be able to do that with the Python BigQuery API.

First you need to connect to the BigQuery service. Here is the code I use to do so:

import httplib2
from oauth2client.client import SignedJwtAssertionCredentials
from apiclient.discovery import build


class BigqueryAdapter(object): 
    def __init__(self, **kwargs): 
        self._project_id = kwargs['project_id'] 
        self._key_filename = kwargs['key_filename'] 
        self._account_email = kwargs['account_email'] 
        self._dataset_id = kwargs['dataset_id'] 
        self.connector = None 
        self.start_connection() 

    def start_connection(self): 
        # Read the service account's private key file.
        with open(self._key_filename) as key_file: 
            key = key_file.read() 
        credentials = SignedJwtAssertionCredentials(
            self._account_email, 
            key, 
            'https://www.googleapis.com/auth/bigquery') 
        authorization = credentials.authorize(httplib2.Http()) 
        self.connector = build('bigquery', 'v2', http=authorization) 

After that you can run jobs with self.connector (you can find some examples in this answer).
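As a rough illustration, running a simple synchronous query through the connector might look like the sketch below (the `run_query` helper and the `timeoutMs` value are my own choices, not part of the original answer):

```python
def run_query(connector, project_id, sql):
    """Run a synchronous query job and return the raw result rows."""
    # jobs().query() starts a query job and waits (up to timeoutMs)
    # for it to finish before returning the first page of results.
    response = connector.jobs().query(
        projectId=project_id,
        body={"query": sql, "timeoutMs": 10000},
    ).execute()
    return response.get("rows", [])
```

With the adapter above you would call it as `run_query(adapter.connector, adapter._project_id, "SELECT ...")`.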

To load backups from Google Cloud Storage, you would define the configuration like this:

body = {
    "configuration": {
        "load": {
            "sourceFormat": "CSV",  # Either "CSV", "DATASTORE_BACKUP", "NEWLINE_DELIMITED_JSON" or "AVRO".
            "fieldDelimiter": ",",  # (if it's comma separated)
            "destinationTable": {
                "projectId": "",  # your_project_id
                "tableId": "",  # your_table_to_save_the_data
                "datasetId": "",  # your_dataset_id
            },
            "writeDisposition": "WRITE_TRUNCATE",  # or "WRITE_APPEND"
            "sourceUris": [
                # The path to your backup in Google Cloud Storage. It could be
                # something like "gs://bucket_name/filename*". Notice you can
                # use the '*' wildcard.
            ],
            "schema": {  # [Optional] The schema for the destination table. The schema can be omitted if the destination table already exists, or if you're loading data from Google Cloud Datastore.
                "fields": [  # Describes the fields in a table.
                    {
                        "fields": [  # [Optional] Describes the nested schema fields if the type property is set to RECORD.
                            # Object with schema name: TableFieldSchema
                        ],
                        "type": "A String",  # [Required] The field data type. Possible values include STRING, BYTES, INTEGER, FLOAT, BOOLEAN, TIMESTAMP or RECORD (where RECORD indicates that the field contains a nested schema).
                        "description": "A String",  # [Optional] The field description. The maximum length is 16K characters.
                        "name": "A String",  # [Required] The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.
                        "mode": "A String",  # [Optional] The field mode. Possible values include NULLABLE, REQUIRED and REPEATED. The default value is NULLABLE.
                    },
                ],
            },
        },
    },
}
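If you load backups regularly, you can assemble that body with a small helper instead of writing the dictionary by hand each time. This is only a sketch; the `make_load_body` name is mine, not something from the BigQuery client:

```python
def make_load_body(project_id, dataset_id, table_id, source_uris,
                   source_format="CSV", write_disposition="WRITE_TRUNCATE",
                   schema_fields=None):
    """Build the request body for a bigquery jobs().insert() load job."""
    load = {
        "sourceFormat": source_format,
        "destinationTable": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        "writeDisposition": write_disposition,
        "sourceUris": list(source_uris),
    }
    if source_format == "CSV":
        # fieldDelimiter only applies to CSV sources.
        load["fieldDelimiter"] = ","
    if schema_fields is not None:
        # Optional: omit the schema if the destination table already exists
        # or if you're loading a Datastore backup.
        load["schema"] = {"fields": schema_fields}
    return {"configuration": {"load": load}}
```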

And then run:

self.connector.jobs().insert(projectId=self._project_id, body=body).execute() 
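Note that jobs().insert() returns as soon as the job is created, not when it finishes. A minimal polling sketch (the `wait_for_job` helper is my own assumption, not part of the BigQuery client):

```python
import time


def wait_for_job(jobs, project_id, job_id, poll_interval=1.0):
    """Poll jobs().get() until the job reaches the DONE state."""
    while True:
        job = jobs.get(projectId=project_id, jobId=job_id).execute()
        status = job["status"]
        if status["state"] == "DONE":
            # A DONE job may still have failed; surface the error if so.
            if status.get("errorResult"):
                raise RuntimeError("Job failed: %s" % status["errorResult"])
            return job
        time.sleep(poll_interval)
```

Here `jobs` would be `self.connector.jobs()`, and the job id comes from the insert response, e.g. `response['jobReference']['jobId']`.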

Hopefully that's what you're looking for. Let us know if you run into problems.
