Ask AI

You are viewing an unreleased or outdated version of the documentation

dlt (dagster-dlt)

This library provides a Dagster integration with dlt.

For more information on getting started, see the dlt & Dagster documentation.

Assets

@dagster_dlt.dlt_assets(*, dlt_source, dlt_pipeline, name=None, group_name=None, dagster_dlt_translator=None, partitions_def=None, op_tags=None)[source]

Asset Factory for using data load tool (dlt).

Parameters:
  • dlt_source (DltSource) – The DltSource to be ingested.

  • dlt_pipeline (Pipeline) – The dlt Pipeline defining the destination parameters.

  • name (Optional[str], optional) – The name of the op.

  • group_name (Optional[str], optional) – The name of the asset group.

  • dagster_dlt_translator (DagsterDltTranslator, optional) – Customization object for defining asset parameters from dlt resources.

  • partitions_def (Optional[PartitionsDefinition]) – Optional partitions definition.

  • op_tags (Optional[Mapping[str, Any]]) – The tags for the underlying op.

Examples

Loading Hubspot data to Snowflake with an auto materialize policy using the dlt verified source:

from dagster_dlt import DagsterDltResource, DagsterDltTranslator, dlt_assets


class HubspotDagsterDltTranslator(DagsterDltTranslator):
    @public
    def get_auto_materialize_policy(self, resource: DltResource) -> Optional[AutoMaterializePolicy]:
        return AutoMaterializePolicy.eager().with_rules(
            AutoMaterializeRule.materialize_on_cron("0 0 * * *")
        )


@dlt_assets(
    dlt_source=hubspot(include_history=True),
    dlt_pipeline=pipeline(
        pipeline_name="hubspot",
        dataset_name="hubspot",
        destination="snowflake",
        progress="log",
    ),
    name="hubspot",
    group_name="hubspot",
    dagster_dlt_translator=HubspotDagsterDltTranslator(),
)
def hubspot_assets(context: AssetExecutionContext, dlt: DagsterDltResource):
    yield from dlt.run(context=context)

Loading Github issues to snowflake:

from dagster_dlt import DagsterDltResource, dlt_assets


@dlt_assets(
    dlt_source=github_reactions(
        "dagster-io", "dagster", items_per_page=100, max_items=250
    ),
    dlt_pipeline=pipeline(
        pipeline_name="github_issues",
        dataset_name="github",
        destination="snowflake",
        progress="log",
    ),
    name="github",
    group_name="github",
)
def github_reactions_dagster_assets(context: AssetExecutionContext, dlt: DagsterDltResource):
    yield from dlt.run(context=context)
dagster_dlt.build_dlt_asset_specs(dlt_source, dlt_pipeline, dagster_dlt_translator=None)[source]

Build a list of asset specs from a dlt source and pipeline.

Parameters:
  • dlt_source (DltSource) – dlt source object

  • dlt_pipeline (Pipeline) – dlt pipeline object

  • dagster_dlt_translator (Optional[DagsterDltTranslator]) – Allows customizing how to map dlt project to asset keys and asset metadata.

Returns:

List[AssetSpec] list of asset specs from dlt source and pipeline

class dagster_dlt.DagsterDltTranslator[source]
get_asset_key(resource)[source]

Defines asset key for a given dlt resource key and dataset name.

This method can be overriden to provide custom asset key for a dlt resource.

Parameters:

resource (DltResource) – dlt resource

Returns:

AssetKey of Dagster asset derived from dlt resource

get_auto_materialize_policy(resource)[source]

Defines resource specific auto materialize policy.

This method can be overriden to provide custom auto materialize policy for a dlt resource.

Parameters:

resource (DltResource) – dlt resource

Returns:

The automaterialize policy for a resource

Return type:

Optional[AutoMaterializePolicy]

get_automation_condition(resource)[source]

Defines resource specific automation condition.

This method can be overridden to provide custom automation condition for a dlt resource.

Parameters:

resource (DltResource) – dlt resource

Returns:

The automation condition for a resource

Return type:

Optional[AutomationCondition]

get_deps_asset_keys(resource)[source]

Defines upstream asset dependencies given a dlt resource.

Defaults to a concatenation of resource.source_name and resource.name.

Parameters:

resource (DltResource) – dlt resource

Returns:

The Dagster asset keys upstream of dlt_resource_key.

Return type:

Iterable[AssetKey]

get_description(resource)[source]

A method that takes in a dlt resource returns the Dagster description of the resource.

This method can be overriden to provide a custom description for a dlt resource.

Parameters:

resource (DltResource) – dlt resource

Returns:

The Dagster description for the dlt resource.

Return type:

Optional[str]

get_group_name(resource)[source]

A method that takes in a dlt resource and returns the Dagster group name of the resource.

This method can be overriden to provide a custom group name for a dlt resource.

Parameters:

resource (DltResource) – dlt resource

Returns:

A Dagster group name for the dlt resource.

Return type:

Optional[str]

get_kinds(resource, destination)[source]

A method that takes in a dlt resource and returns the kinds which should be attached. Defaults to the destination type and “dlt”.

This method can be overriden to provide custom kinds for a dlt resource.

Parameters:

resource (DltResource) – dlt resource

Returns:

The kinds of the asset.

Return type:

Set[str]

get_metadata(resource)[source]

Defines resource specific metadata.

Parameters:

resource (DltResource) – dlt resource

Returns:

The custom metadata entries for this resource.

Return type:

Mapping[str, Any]

get_owners(resource)[source]

A method that takes in a dlt resource and returns the Dagster owners of the resource.

This method can be overriden to provide custom owners for a dlt resource.

Parameters:

resource (DltResource) – dlt resource

Returns:

A sequence of Dagster owners for the dlt resource.

Return type:

Optional[Sequence[str]]

get_tags(resource)[source]

A method that takes in a dlt resource and returns the Dagster tags of the structure.

This method can be overriden to provide custom tags for a dlt resource.

Parameters:

resource (DltResource) – dlt resource

Returns:

A dictionary representing the Dagster tags for the

dlt resource.

Return type:

Optional[Mapping[str, str]]

Resources

class dagster_dlt.DagsterDltResource[source]

experimental This API may break in future versions, even between dot releases.

run(context, dlt_source=None, dlt_pipeline=None, dagster_dlt_translator=None, **kwargs)[source]

Runs the dlt pipeline with subset support.

Parameters:
  • context (Union[OpExecutionContext, AssetExecutionContext]) – Asset or op execution context

  • dlt_source (Optional[DltSource]) – optional dlt source if resource is used from an @op

  • dlt_pipeline (Optional[Pipeline]) – optional dlt pipeline if resource is used from an @op

  • dagster_dlt_translator (Optional[DagsterDltTranslator]) – optional dlt translator if resource is used from an @op

  • **kwargs (dict[str, Any]) – Keyword args passed to pipeline run method

Returns:

An iterator of MaterializeResult or AssetMaterialization

Return type:

DltEventIterator[DltEventType]