Exporting from Firestore to Storage
const functions = require('firebase-functions');
const firestore = require('@google-cloud/firestore');
const client = new firestore.v1.FirestoreAdminClient();

// Replace BUCKET_NAME
const bucket = 'gs://****';

exports.scheduledFirestoreExport = functions.pubsub
  .schedule('every 24 hours')
  .onRun((context) => {
    const projectId = process.env.GCP_PROJECT || process.env.GCLOUD_PROJECT;
    const databaseName = client.databasePath(projectId, '(default)');

    return client
      .exportDocuments({
        name: databaseName,
        outputUriPrefix: bucket,
        // Leave collectionIds empty to export all collections,
        // or set to a list of collection IDs to export.
        collectionIds: ['collection-id-1', 'sub-collection-id-of-1', 'collection-id-2'] // works fine
      })
      .then((responses) => {
        const response = responses[0];
        console.log(`Operation Name: ${response['name']}`);
      })
      .catch((err) => {
        console.error(err);
        throw new Error('Export operation failed');
      });
  });
I struggled quite a bit with how to specify collectionIds. It turns out you list top-level collection IDs and the IDs of collections nested beneath them side by side in the same flat array. If you specify only a nested collection ID without also specifying its parent collection's ID, nothing is exported.
You can run the export without specifying collectionIds at all, but such an export cannot be loaded from Storage into BigQuery, so specifying them is required here.
https://cloud.google.com/bigquery/docs/loading-data-cloud-firestore?hl=ja
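As a sketch of the flattening rule above (the collection names here are hypothetical, not from the original code): for a layout like `users/{userId}/orders` plus a top-level `logs`, every collection ID on the path, parent and child alike, ends up in one flat list.

```python
# Sketch of the collectionIds rule: collection IDs sit at even depths of a
# Firestore path (collection/document/collection/...), and the export API
# wants all of them, at every level, in a single flat list.

def collection_ids_for_export(paths):
    """Collect every collection ID appearing at any depth of the given paths."""
    ids = []
    for path in paths:
        for depth, segment in enumerate(path.split('/')):
            if depth % 2 == 0 and segment not in ids:  # even depth = collection
                ids.append(segment)
    return ids

print(collection_ids_for_export(['users/{userId}/orders', 'logs']))
# → ['users', 'orders', 'logs']
```

Note that 'users' must appear in the list for 'orders' to be exported at all, which is exactly the pitfall described above.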
Loading from Storage into BigQuery
import os

from google.cloud import bigquery

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = './credential.json'

# TODO(developer): Set table_id to the ID of the table to create.
table_id = "[project id].[db of bq].[table id]"

# TODO(developer): Set uri to the path of the kind export metadata
uri = (
    "gs://****-backup-all/2021-09-09T11:50:09_53192/all_namespaces/kind_***/all_namespaces_kind_****.export_metadata"
)

# TODO(developer): Set projection_fields to a list of document properties
# to import. Leave unset or set to `None` for all fields.
projection_fields = []

# Construct a BigQuery client object.
client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.DATASTORE_BACKUP,
    projection_fields=projection_fields,
    write_disposition='WRITE_TRUNCATE'
)
'''
WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data and uses the schema from the load.
WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result.
'''

load_job = client.load_table_from_uri(
    uri, table_id, job_config=job_config
)  # Make an API request.

load_job.result()  # Waits for the job to complete.

destination_table = client.get_table(table_id)
print("Loaded {} rows.".format(destination_table.num_rows))
Whether to overwrite the table is controlled by the write_disposition argument in job_config = bigquery.LoadJobConfig(...):
WRITE_TRUNCATE: If the table already exists, BigQuery overwrites the table data and uses the schema from the load.
WRITE_APPEND: If the table already exists, BigQuery appends the data to the table.
WRITE_EMPTY: If the table already exists and contains data, a 'duplicate' error is returned in the job result.
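The three dispositions can be sketched as plain-Python semantics (this is a toy model of the behavior, not the BigQuery client itself):

```python
# Toy model of BigQuery write_disposition semantics for a load job
# against a table that may already contain rows.

def load_rows(existing_rows, new_rows, write_disposition):
    if write_disposition == 'WRITE_TRUNCATE':
        return list(new_rows)                        # existing data replaced
    if write_disposition == 'WRITE_APPEND':
        return list(existing_rows) + list(new_rows)  # new rows appended
    if write_disposition == 'WRITE_EMPTY':
        if existing_rows:
            raise ValueError('duplicate')            # job fails on non-empty table
        return list(new_rows)
    raise ValueError('unknown write_disposition: ' + write_disposition)

table = [{'id': 1}]
print(load_rows(table, [{'id': 2}], 'WRITE_APPEND'))    # → [{'id': 1}, {'id': 2}]
print(load_rows(table, [{'id': 2}], 'WRITE_TRUNCATE'))  # → [{'id': 2}]
```

Since this pipeline re-exports the whole database every 24 hours, WRITE_TRUNCATE is the natural choice here.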
projection_fields specifies which document properties are loaded into the table; leaving it unset (or empty, as above) loads all fields.
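The effect of projection_fields can be sketched as a simple filter over documents (property names here are hypothetical; empty/None is treated as "load everything", matching how the script above passes `[]`):

```python
# Toy model of projection_fields: only the listed document properties
# survive into the loaded table; unset/empty means all properties.

def project(documents, projection_fields=None):
    if not projection_fields:  # None or [] → load every field
        return documents
    return [{k: v for k, v in doc.items() if k in projection_fields}
            for doc in documents]

docs = [{'name': 'alice', 'age': 30, 'email': 'a@example.com'}]
print(project(docs, ['name', 'age']))  # → [{'name': 'alice', 'age': 30}]
print(project(docs))                   # → all three fields unchanged
```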