
Exporting Firestore data to BigQuery via Cloud Storage

Exporting from Firestore to Storage
const functions = require('firebase-functions');
const firestore = require('@google-cloud/firestore');
const client = new firestore.v1.FirestoreAdminClient();

// Replace BUCKET_NAME
const bucket = 'gs://****';

exports.scheduledFirestoreExport = functions.pubsub
  .schedule('every 24 hours')
  .onRun((context) => {

  const projectId = process.env.GCP_PROJECT || process.env.GCLOUD_PROJECT;
  const databaseName =
    client.databasePath(projectId, '(default)');

  return client.exportDocuments({
    name: databaseName,
    outputUriPrefix: bucket,
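    // Leave collectionIds empty to export all collections,
    // or set it to a list of collection IDs to export.
    // Parent collection IDs and subcollection IDs go in the same flat list
    // (this form works fine). Replace the placeholders with your own IDs.
    collectionIds: [
      '<collection id1>',
      '<sub collection id of id1>',
      '<another sub collection id of id1>',
      '<collection id2>',
    ],
  })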
  .then(responses => {
    const response = responses[0];
    console.log(`Operation Name: ${response['name']}`);
  })
  .catch(err => {
    console.error(err);
    throw new Error('Export operation failed');
  });
});
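I struggled quite a bit with how to specify collectionIds. Apparently the IDs of the top-level collections and the IDs of the collections beneath them are written side by side in the same list. If you specify only a lower-level ID without the ID of the collection above it, nothing is exported. The export itself also works without collectionIds, but in that case the result cannot be loaded from Storage into BigQuery, so specifying it is mandatory here. https://cloud.google.com/bigquery/docs/loading-data-cloud-firestore?hl=ja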
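
The "side by side" rule described above can be sketched as a small helper that flattens a parent-to-subcollection mapping into one list, so that a subcollection ID never appears without its parent. This is only an illustration: `flatten_collection_ids` and the collection names are hypothetical and are not part of any Firebase or Google Cloud library.

```python
from typing import Dict, List


def flatten_collection_ids(hierarchy: Dict[str, List[str]]) -> List[str]:
    """Return parent and subcollection IDs as one flat, de-duplicated list."""
    flat: List[str] = []
    for parent_id, sub_ids in hierarchy.items():
        # The parent ID goes first, followed by its subcollection IDs.
        for collection_id in [parent_id, *sub_ids]:
            if collection_id not in flat:
                flat.append(collection_id)
    return flat


# Hypothetical collection names, for illustration only.
example = {"users": ["orders", "addresses"], "products": []}
print(flatten_collection_ids(example))
# -> ['users', 'orders', 'addresses', 'products']
```

The resulting list has the same shape as the value passed to collectionIds in the Cloud Function above.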

Loading from Storage into BigQuery


import os
from google.cloud import bigquery
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = './credential.json'
# TODO(developer): Set table_id to the ID of the table to create.

table_id = "[project id].[dataset id].[table id]"

# TODO(developer): Set uri to the path of the kind export metadata
uri = (
    "gs://****-backup-all/2021-09-09T11:50:09_53192/all_namespaces/kind_***/all_namespaces_kind_****.export_metadata"
)

# TODO(developer): Set projection_fields to a list of document properties
#                  to import. Leave unset or set to `None` for all fields.
projection_fields = []

# Construct a BigQuery client object.
client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.DATASTORE_BACKUP,
    projection_fields=projection_fields,
    write_disposition='WRITE_TRUNCATE'
)
'''
WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data and uses the schema from the load.
WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result.
'''

load_job = client.load_table_from_uri(
    uri, table_id, job_config=job_config
)  # Make an API request.

load_job.result()  # Waits for the job to complete.

destination_table = client.get_table(table_id)
print("Loaded {} rows.".format(destination_table.num_rows))
Whether existing data is overwritten is controlled by the write_disposition argument of bigquery.LoadJobConfig. With WRITE_TRUNCATE, if the table already exists, BigQuery overwrites the table data and uses the schema from the load. With WRITE_APPEND, if the table already exists, BigQuery appends the data to the table. With WRITE_EMPTY, if the table already exists and contains data, a 'duplicate' error is returned in the job result. projection_fields lets you choose which fields are loaded.
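
The uri in the script above points at one export_metadata file per collection. As a rough sketch, that path can be assembled from the export prefix and a collection ID. The layout is copied from the uri shown in the script and is assumed to hold for other collections; `kind_metadata_uri`, the bucket name, and the collection ID below are hypothetical.

```python
def kind_metadata_uri(export_prefix: str, collection_id: str) -> str:
    """Build the per-collection export_metadata URI under an export prefix."""
    prefix = export_prefix.rstrip("/")  # tolerate a trailing slash
    return (
        f"{prefix}/all_namespaces/kind_{collection_id}/"
        f"all_namespaces_kind_{collection_id}.export_metadata"
    )


# Hypothetical bucket and collection ID, for illustration only.
print(kind_metadata_uri("gs://example-bucket/2021-09-09T11:50:09_53192", "users"))
```

The returned string would take the place of uri when calling load_table_from_uri, once per exported collection.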
Developer of Meditation Tools
絹田 雅
Developer of Meditation Tools, where you can learn several kinds of meditation. Proceeds are donated to help make society better.