Questions and Answers

Question qHHJA9X6TZbWVXs5sxtj

Question

A dataset has been defined using Delta Live Tables and includes an expectations clause: CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW. What is the expected behavior when a batch containing records that violate this constraint is processed?

Choices

  • A: Records that violate the expectation are dropped from the target dataset and loaded into a quarantine table.
  • B: Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.
  • C: Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.
  • D: Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.
  • E: Records that violate the expectation cause the job to fail.
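For context, a minimal DLT SQL sketch (table and column names are hypothetical) of an expectation with ON VIOLATION DROP ROW; in this mode, violating rows are excluded from the target table and the drop counts are reported as data quality metrics in the pipeline's event log:

```sql
-- Hypothetical DLT table with a drop-on-violation expectation.
CREATE OR REFRESH STREAMING LIVE TABLE events_clean (
  CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW
)
AS SELECT * FROM STREAM(LIVE.events_raw);
```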

Question 7TMIYRliPFNxzXswLgmw

Question

Which of the following describes when to use the CREATE STREAMING LIVE TABLE (formerly CREATE INCREMENTAL LIVE TABLE) syntax over the CREATE LIVE TABLE syntax when creating Delta Live Tables (DLT) tables using SQL?

Choices

  • A: CREATE STREAMING LIVE TABLE should be used when the subsequent step in the DLT pipeline is static.
  • B: CREATE STREAMING LIVE TABLE should be used when data needs to be processed incrementally.
  • C: CREATE STREAMING LIVE TABLE is redundant for DLT and it does not need to be used.
  • D: CREATE STREAMING LIVE TABLE should be used when data needs to be processed through complicated aggregations.
  • E: CREATE STREAMING LIVE TABLE should be used when the previous step in the DLT pipeline is static.
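As a sketch (all table names hypothetical), CREATE STREAMING LIVE TABLE reads its source incrementally as a stream, processing only newly arrived rows on each update, whereas CREATE LIVE TABLE defines a materialized view that is recomputed from its full input:

```sql
-- Incremental: processes only new source rows on each pipeline update.
CREATE OR REFRESH STREAMING LIVE TABLE orders_bronze
AS SELECT * FROM STREAM(LIVE.orders_raw);

-- Full recompute: materialized from the complete input on each update.
CREATE OR REFRESH LIVE TABLE orders_summary
AS SELECT customer_id, COUNT(*) AS order_count
FROM LIVE.orders_bronze
GROUP BY customer_id;
```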

Question W6NTZ2EikiTdVVnfQAuQ

Question

A data engineer is designing a data pipeline. The source system generates files in a shared directory that is also used by other processes. As a result, the files should be kept as is and will accumulate in the directory. The data engineer needs to identify which files are new since the previous run in the pipeline, and set up the pipeline to only ingest those new files with each run. Which of the following tools can the data engineer use to solve this problem?

Choices

  • A: Unity Catalog
  • B: Delta Lake
  • C: Databricks SQL
  • D: Data Explorer
  • E: Auto Loader
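A minimal sketch (the directory path and file format are hypothetical) of Auto Loader in DLT SQL via cloud_files, which checkpoints which files have already been ingested so that each run picks up only new arrivals in the shared directory:

```sql
-- Auto Loader tracks previously ingested files; existing files are left as is.
CREATE OR REFRESH STREAMING LIVE TABLE ingested_files
AS SELECT * FROM cloud_files('/shared/landing/', 'json');
```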

Question b1t1aijUmn7cwVZsueFv

Question

Which of the following Structured Streaming queries is performing a hop from a Silver table to a Gold table?

Choices

  • A: [code snippet not reproduced]
  • B: [code snippet not reproduced]
  • C: [code snippet not reproduced]
  • D: [code snippet not reproduced]
  • E: [code snippet not reproduced]
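The answer choices for this question were code snippets that did not survive extraction. For reference, a Silver-to-Gold hop typically aggregates cleaned Silver data into a business-level summary; a hedged sketch in DLT SQL terms (all names hypothetical):

```sql
-- Gold: aggregated business-level view built from the Silver table.
CREATE OR REFRESH LIVE TABLE sales_gold
AS SELECT region, SUM(amount) AS total_sales
FROM LIVE.sales_silver
GROUP BY region;
```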

Question 30ntmlnPGYfBKJRuMwga

Question

A data engineer has three tables in a Delta Live Tables (DLT) pipeline. They have configured the pipeline to drop invalid records at each table. They notice that some data is being dropped due to quality concerns at some point in the DLT pipeline. They would like to determine at which table in their pipeline the data is being dropped. Which of the following approaches can the data engineer take to identify the table that is dropping the records?

Choices

  • A: They can set up separate expectations for each table when developing their DLT pipeline.
  • B: They cannot determine which table is dropping the records.
  • C: They can set up DLT to notify them via email when records are dropped.
  • D: They can navigate to the DLT pipeline page, click on each table, and view the data quality statistics.
  • E: They can navigate to the DLT pipeline page, click on the “Error” button, and review the present errors.