Amazon Web Services S3

Transfer to AWS S3 as destination dataset

You can transfer data to AWS S3 as the destination from the following source datasets:

  1. Tables

        SQLITE = "sqlite"
        BIGQUERY = "bigquery"
        SNOWFLAKE = "snowflake"
    
  2. Files

        LOCAL = "local"
        GS = "gs"  # Google Cloud Storage
        S3 = "s3"  # Amazon S3
        SFTP = "sftp"
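
The file locations listed above read like members of an enumeration. As a minimal sketch, assuming a hypothetical `FileLocation` enum and a hypothetical helper `location_from_path` (neither name is taken from the library), a path's URI scheme could be mapped to one of these values as follows:

```python
from enum import Enum
from urllib.parse import urlparse


class FileLocation(Enum):
    """Hypothetical mirror of the file location values listed above."""

    LOCAL = "local"
    GS = "gs"  # Google Cloud Storage
    S3 = "s3"  # Amazon S3
    SFTP = "sftp"


def location_from_path(path: str) -> FileLocation:
    """Map a path's URI scheme to a location; paths without a scheme count as local."""
    scheme = urlparse(path).scheme
    if not scheme:
        return FileLocation.LOCAL
    # Enum lookup by value raises ValueError for schemes that are not listed.
    return FileLocation(scheme)
```

Under these assumptions, `location_from_path("s3://astro-sdk-test/example_uto/")` resolves to `FileLocation.S3`, and a plain filesystem path resolves to `FileLocation.LOCAL`.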
    

The following transfer modes are supported:

  1. Non-native transfer

    The following example shows a non-native transfer from Google Cloud Storage to AWS S3:

        transfer_non_native_gs_to_s3 = UniversalTransferOperator(
            task_id="transfer_non_native_gs_to_s3",
            source_dataset=input_file,
            destination_dataset=File(path=f"{s3_bucket}/example_uto/", conn_id="aws_default"),
        )
    

Examples

  1. GCS to S3 transfers
    • Non-native transfer

      The following example shows a non-native transfer from Google Cloud Storage to S3:

          transfer_non_native_gs_to_s3 = UniversalTransferOperator(
              task_id="transfer_non_native_gs_to_s3",
              source_dataset=input_file,
              destination_dataset=File(path=f"{s3_bucket}/example_uto/", conn_id="aws_default"),
          )
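
The snippets above use `s3_bucket` without showing where it comes from. One possible approach, sketched here under the assumption that the bucket is supplied through an environment variable named `S3_BUCKET` (a hypothetical name, as is the helper `bucket_path`), is to build the destination path with a fallback value:

```python
import os


def bucket_path(env_var: str, default: str, prefix: str) -> str:
    """Build '<bucket>/<prefix>/' from an environment variable with a fallback."""
    # Strip trailing slashes from the bucket so the join never doubles them.
    bucket = os.getenv(env_var, default).rstrip("/")
    return f"{bucket}/{prefix.strip('/')}/"


# Assumed variable name and default; adjust both to your environment.
s3_destination_path = bucket_path("S3_BUCKET", "s3://astro-sdk-test", "example_uto")
```

The resulting string could then be passed as the `path` argument of the destination `File` in place of the inline f-string.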
      

Transfer from AWS S3 as source dataset

You can transfer data from AWS S3 to the following destination datasets:

  1. Tables

        SQLITE = "sqlite"
        BIGQUERY = "bigquery"
        SNOWFLAKE = "snowflake"
    
  2. Files

        LOCAL = "local"
        GS = "gs"  # Google Cloud Storage
        S3 = "s3"  # Amazon S3
        SFTP = "sftp"
    

The following transfer modes are supported:

  1. Non-native transfer

    The following example shows a non-native transfer from AWS S3 to Google Cloud Storage:

        transfer_non_native_s3_to_gs = UniversalTransferOperator(
            task_id="transfer_non_native_s3_to_gs",
            source_dataset=File(path=f"{s3_bucket}/example_uto/", conn_id="aws_default"),
            destination_dataset=File(
                path=f"{gcs_bucket}/example_uto/",
                conn_id="google_cloud_default",
            ),
        )
    
  2. Transfer using third-party tool

    The following example shows a transfer from AWS S3 to Snowflake using Fivetran with a connector ID passed:

        transfer_fivetran_with_connector_id = UniversalTransferOperator(
            task_id="transfer_fivetran_with_connector_id",
            source_dataset=File(path=f"{s3_bucket}/uto/", conn_id="aws_default"),
            destination_dataset=Table(name="fivetran_test", conn_id="snowflake_default"),
            transfer_mode=TransferMode.THIRDPARTY,
            transfer_params=FiveTranOptions(conn_id="fivetran_default", connector_id="filing_muppet"),
        )
    

    The following example shows a transfer from AWS S3 to Snowflake using Fivetran without a connector passed:

        transfer_fivetran_without_connector_id = UniversalTransferOperator(
            task_id="transfer_fivetran_without_connector_id",
            source_dataset=File(path=f"{s3_bucket}/uto/", conn_id="aws_default"),
            destination_dataset=Table(
                name="fivetran_test",
                conn_id="snowflake_conn",
                metadata=Metadata(database=snowflake_database, schema=snowflake_schema),
            ),
            transfer_mode=TransferMode.THIRDPARTY,
            transfer_params=FiveTranOptions(
                conn_id="fivetran_default",
                connector_id="filing_muppet",
                group=Group(name="test_group"),
                connector=Connector(
                    service="s3",
                    config=connector_config,
                    connector_id=None,
                    connect_card_config={"connector_val": "test_connector"},
                ),
                destination=Destination(
                    service="snowflake",
                    time_zone_offset="-5",
                    region="GCP_US_EAST4",
                    config=destination_config,
                ),
            ),
        )
    

Examples

  1. AWS S3 to GCS transfers
    • Non-native transfer

      The following example shows a non-native transfer from AWS S3 to Google Cloud Storage:

          transfer_non_native_s3_to_gs = UniversalTransferOperator(
              task_id="transfer_non_native_s3_to_gs",
              source_dataset=File(path=f"{s3_bucket}/example_uto/", conn_id="aws_default"),
              destination_dataset=File(
                  path=f"{gcs_bucket}/example_uto/",
                  conn_id="google_cloud_default",
              ),
          )
      
  2. AWS S3 to Snowflake transfers
    • Non-native transfer

      The following example shows a non-native transfer from AWS S3 to Snowflake:

          transfer_non_native_s3_to_snowflake = UniversalTransferOperator(
              task_id="transfer_non_native_s3_to_snowflake",
              source_dataset=File(
                  path="s3://astro-sdk-test/example_uto/csv_files/", conn_id="aws_default", filetype=FileType.CSV
              ),
              destination_dataset=Table(name="uto_s3_table_to_snowflake", conn_id="snowflake_conn"),
          )
      
    • Transfer using third-party tool

      The following example shows a transfer from AWS S3 to Snowflake using Fivetran with a connector ID passed:

          transfer_fivetran_with_connector_id = UniversalTransferOperator(
              task_id="transfer_fivetran_with_connector_id",
              source_dataset=File(path=f"{s3_bucket}/uto/", conn_id="aws_default"),
              destination_dataset=Table(name="fivetran_test", conn_id="snowflake_default"),
              transfer_mode=TransferMode.THIRDPARTY,
              transfer_params=FiveTranOptions(conn_id="fivetran_default", connector_id="filing_muppet"),
          )
      

      The following example shows a transfer from AWS S3 to Snowflake using Fivetran without a connector passed:

          transfer_fivetran_without_connector_id = UniversalTransferOperator(
              task_id="transfer_fivetran_without_connector_id",
              source_dataset=File(path=f"{s3_bucket}/uto/", conn_id="aws_default"),
              destination_dataset=Table(
                  name="fivetran_test",
                  conn_id="snowflake_conn",
                  metadata=Metadata(database=snowflake_database, schema=snowflake_schema),
              ),
              transfer_mode=TransferMode.THIRDPARTY,
              transfer_params=FiveTranOptions(
                  conn_id="fivetran_default",
                  connector_id="filing_muppet",
                  group=Group(name="test_group"),
                  connector=Connector(
                      service="s3",
                      config=connector_config,
                      connector_id=None,
                      connect_card_config={"connector_val": "test_connector"},
                  ),
                  destination=Destination(
                      service="snowflake",
                      time_zone_offset="-5",
                      region="GCP_US_EAST4",
                      config=destination_config,
                  ),
              ),
          )
      
  3. AWS S3 to BigQuery transfers
    • Non-native transfer

      The following example shows a non-native transfer from AWS S3 to BigQuery:

          transfer_non_native_s3_to_bigquery = UniversalTransferOperator(
              task_id="transfer_non_native_s3_to_bigquery",
              source_dataset=File(
                  path="s3://astro-sdk-test/example_uto/csv_files/", conn_id="aws_default", filetype=FileType.CSV
              ),
              destination_dataset=Table(
                  name="uto_s3_to_bigquery_destination_table",
                  conn_id="google_cloud_default",
                  metadata=Metadata(schema="astro"),
              ),
          )
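
Several of the source paths above, such as `s3://astro-sdk-test/example_uto/csv_files/`, point at a key prefix rather than a single object and are paired with a CSV file type. The sketch below illustrates how such a URI could be split into bucket and prefix, and how object keys could be narrowed to CSV files. It is illustrative only and not the library's implementation; the helper names `split_s3_uri` and `select_csv_keys` are hypothetical.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def select_csv_keys(keys: Iterable[str], prefix: str) -> List[str]:
    """Keep only keys under the prefix that end in '.csv' (case-insensitive)."""
    return [
        key
        for key in keys
        if key.startswith(prefix) and key.lower().endswith(".csv")
    ]


bucket, prefix = split_s3_uri("s3://astro-sdk-test/example_uto/csv_files/")
```

Here `bucket` is `"astro-sdk-test"` and `prefix` is `"example_uto/csv_files/"`. The key list passed to `select_csv_keys` would come from whatever S3 listing call your environment provides.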