Questions and Answers

Question Irf8QUGT0NLGS3dFBbHC

Question

Which of the following commands can be used to write data into a Delta table while avoiding the writing of duplicate records?

Choices

  • A: DROP
  • B: IGNORE
  • C: MERGE
  • D: APPEND
  • E: INSERT

Question xi85iH0ejppDMhZGUq77

Question

A data organization leader is upset about the data analysis team’s reports being different from the data engineering team’s reports. The leader believes the siloed nature of their organization’s data engineering and data analysis architectures is to blame.

Which of the following describes how a data lakehouse could alleviate this issue?

Choices

  • A: Both teams would respond more quickly to ad-hoc requests
  • B: Both teams would use the same source of truth for their work
  • C: Both teams would reorganize to report to the same department
  • D: Both teams would be able to collaborate on projects in real-time

Question JoxonN9nMR9NV4NYHAbp

Question

A data analyst has developed a query that runs against Delta table. They want help from the data engineering team to implement a series of tests to ensure the data returned by the query is clean. However, the data engineering team uses Python for its tests rather than SQL.

Which of the following operations could the data engineering team use to run the query and operate with the results in PySpark?

Choices

  • A: SELECT * FROM sales
  • B: spark.delta.table
  • C: spark.sql
  • D: spark.table

Question ORcvACP85uckLE1bLbvZ

Question

A data engineer has a Job that has a complex run schedule, and they want to transfer that schedule to other Jobs.

Rather than manually selecting each value in the scheduling form in Databricks, which of the following tools can the data engineer use to represent and submit the schedule programmatically?

Choices

  • A: pyspark.sql.types.DateType
  • B: datetime
  • C: pyspark.sql.types.TimestampType
  • D: Cron syntax

Question XyQAuPXru07NhesFdreT

Question

A data engineer and data analyst are working together on a data pipeline. The data engineer is working on the raw, bronze, and silver layers of the pipeline using Python, and the data analyst is working on the gold layer of the pipeline using SQL. The raw source of the pipeline is a streaming input. They now want to migrate their pipeline to use Delta Live Tables.

Which of the following changes will need to be made to the pipeline when migrating to Delta Live Tables?

Choices

  • A: The pipeline will need to be written entirely in Python
  • B: The pipeline will need to stop using the medallion-based multi-hop architecture
  • C: The pipeline will need to be written entirely in SQL
  • D: The pipeline will need to use a batch source in place of a streaming source