Amazon S3 to SQL

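Use the S3ToSqlOperator transfer operator to copy data from an Amazon Simple Storage Service (S3) file into an existing SQL table. Because you supply a parser function that is applied to the downloaded file, this operator can accept a variety of file formats.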

Prerequisite Tasks

To use these operators, you must do a few things.

Operators

Amazon S3 to SQL Transfer Operator

For more information about this operator, see: S3ToSqlOperator

Example usage with a parser for a CSV file. This parser loads the file into memory and returns a list of rows:

tests/system/amazon/aws/example_s3_to_sql.py

#
# This operator requires a parser method. The Parser should take a filename as input
# and return an iterable of rows.
# This example parser uses the builtin csv library and returns a list of rows
#
def parse_csv_to_list(filepath):
    import csv

    with open(filepath, newline="") as file:
        return list(csv.reader(file))

transfer_s3_to_sql = S3ToSqlOperator(
    task_id="transfer_s3_to_sql",
    s3_bucket=s3_bucket_name,
    s3_key=s3_key,
    table=SQL_TABLE_NAME,
    column_list=SQL_COLUMN_LIST,
    parser=parse_csv_to_list,
    sql_conn_id=conn_id_name,
)
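The parser can be exercised on its own, without Airflow, to check which rows it would hand to the operator. The following is a minimal sketch and not part of the example file above: the helper name `parse_csv_skip_header` and the sample data are hypothetical, and it assumes the CSV file begins with a header line that should not be inserted into the table.

```python
import csv
import os
import tempfile


def parse_csv_skip_header(filepath):
    """Hypothetical parser: return all CSV rows except the first (header) line."""
    with open(filepath, newline="") as file:
        reader = csv.reader(file)
        next(reader, None)  # discard the header row, if the file has one
        return list(reader)


# Write a small sample file so the parser can be tried locally.
with tempfile.NamedTemporaryFile("w", suffix=".csv", newline="", delete=False) as tmp:
    csv.writer(tmp).writerows([["id", "name"], ["1", "alice"], ["2", "bob"]])
    sample_path = tmp.name

rows = parse_csv_skip_header(sample_path)
os.remove(sample_path)
print(rows)  # each row is a list of strings
```

By default `csv.reader` yields every value as a string, so any type conversion the target columns need would have to happen in the parser or be left to the database.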

Example usage with a parser function that returns a generator:

tests/system/amazon/aws/example_s3_to_sql.py

#
# As the parser can return any kind of iterable, a generator is also allowed.
# This example parser returns a generator, which prevents Python from loading
# the whole file into memory.
#

def parse_csv_to_generator(filepath):
    import csv

    with open(filepath, newline="") as file:
        yield from csv.reader(file)

transfer_s3_to_sql_generator = S3ToSqlOperator(
    task_id="transfer_s3_to_sql_paser_to_generator",
    s3_bucket=s3_bucket_name,
    s3_key=s3_key,
    table=SQL_TABLE_NAME,
    column_list=SQL_COLUMN_LIST,
    parser=parse_csv_to_generator,
    sql_conn_id=conn_id_name,
)
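Because the parser only has to take a file path and return an iterable of rows, formats other than CSV can be handled in the same way. The following is a minimal sketch for a newline-delimited JSON file, not taken from the example file above: the function name `parse_jsonl_to_generator`, the `COLUMNS` list, and the sample data are all hypothetical.

```python
import json
import os
import tempfile

# Hypothetical column order; it would need to match the column_list given to the operator.
COLUMNS = ["id", "name"]


def parse_jsonl_to_generator(filepath):
    """Hypothetical parser: yield one row per non-empty JSON line."""
    with open(filepath, encoding="utf-8") as file:
        for line in file:
            if line.strip():
                record = json.loads(line)
                # Missing keys become None so every row has the same length.
                yield [record.get(column) for column in COLUMNS]


# Write a small sample file so the parser can be tried locally.
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False, encoding="utf-8") as tmp:
    tmp.write('{"id": 1, "name": "alice"}\n{"id": 2, "name": "bob"}\n')
    sample_path = tmp.name

jsonl_rows = list(parse_jsonl_to_generator(sample_path))
os.remove(sample_path)
print(jsonl_rows)
```

Such a function could then be passed as `parser=parse_jsonl_to_generator` in place of the CSV parsers. As with the CSV generator, rows are produced one at a time, so the whole file is not held in memory.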
