Questions and Answers

Question reXOQqFKJTSZp4JyxxnC

Question

A data engineer runs a statement every day to copy the previous day’s sales into the table transactions. Each day’s sales are in their own file in the location “/transactions/raw”. Today, the data engineer runs the following command to complete this task: //IMG//

After running the command today, the data engineer notices that the number of records in table transactions has not changed. Which of the following describes why the statement might not have copied any new records into the table?

Choices

  • A: The format of the files to be copied was not included with the FORMAT_OPTIONS keyword.
  • B: The names of the files to be copied were not included with the FILES keyword.
  • C: The previous day’s file has already been copied into the table.
  • D: The PARQUET file format does not support COPY INTO.
  • E: The COPY INTO statement requires the table to be refreshed to view the copied rows.

Question Grxjpjl4QfSviyfjQukJ

Question

Which of the following describes a scenario in which a data team will want to utilize cluster pools?

Choices

  • A: An automated report needs to be refreshed as quickly as possible.
  • B: An automated report needs to be made reproducible.
  • C: An automated report needs to be tested to identify errors.
  • D: An automated report needs to be version-controlled across multiple collaborators.
  • E: An automated report needs to be runnable by all stakeholders.

Question Lun3yC4gVvMguLqFpsCY

Question

A data engineer needs to create a table in Databricks using data from their organization’s existing SQLite database. They run the following command: //IMG//

Which of the following lines of code fills in the above blank to successfully complete the task?

Choices

  • A: org.apache.spark.sql.jdbc
  • B: autoloader
  • C: DELTA
  • D: sqlite
  • E: org.apache.spark.sql.sqlite

Question usQT2u1IBpTGLIzGOQdO

Question

A data engineering team has two tables. The first table march_transactions is a collection of all retail transactions in the month of March. The second table april_transactions is a collection of all retail transactions in the month of April. There are no duplicate records between the tables. Which of the following commands should be run to create a new table all_transactions that contains all records from march_transactions and april_transactions without duplicate records?

Choices

  • A: CREATE TABLE all_transactions AS SELECT * FROM march_transactions INNER JOIN SELECT * FROM april_transactions;
  • B: CREATE TABLE all_transactions AS SELECT * FROM march_transactions UNION SELECT * FROM april_transactions;
  • C: CREATE TABLE all_transactions AS SELECT * FROM march_transactions OUTER JOIN SELECT * FROM april_transactions;
  • D: CREATE TABLE all_transactions AS SELECT * FROM march_transactions INTERSECT SELECT * FROM april_transactions;
  • E: CREATE TABLE all_transactions AS SELECT * FROM march_transactions MERGE SELECT * FROM april_transactions;
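To see how the set operators named in these choices behave, here is a minimal sketch using Python's built-in sqlite3 module. This is an assumption made for illustration: the question targets Databricks SQL, and the columns (id, amount) and row values are hypothetical. SQLite is used only because it runs anywhere and supports the same UNION and INTERSECT syntax.

```python
import sqlite3

# In-memory database standing in for the two monthly tables.
# Column names and values are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE march_transactions (id INTEGER, amount REAL);
CREATE TABLE april_transactions (id INTEGER, amount REAL);
INSERT INTO march_transactions VALUES (1, 10.0), (2, 20.0);
INSERT INTO april_transactions VALUES (3, 30.0), (4, 40.0);
""")

# UNION stacks the rows of both queries and drops duplicate rows.
conn.execute("""
CREATE TABLE all_transactions AS
SELECT * FROM march_transactions
UNION
SELECT * FROM april_transactions
""")
union_rows = conn.execute(
    "SELECT id, amount FROM all_transactions ORDER BY id"
).fetchall()

# INTERSECT keeps only rows present in both queries; with no
# overlap between the months, nothing survives.
intersect_rows = conn.execute("""
SELECT * FROM march_transactions
INTERSECT
SELECT * FROM april_transactions
""").fetchall()

print(union_rows)
print(intersect_rows)
```

The JOIN and MERGE forms in the other choices are not valid as written between two SELECT statements, so they are left out of the sketch.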

Question 6CN3dYdBb7shQXDE7g6N

Question

A data engineer only wants to execute the final block of a Python program if the Python variable day_of_week is equal to 1 and the Python variable review_period is True. Which of the following control flow statements should the data engineer use to begin this conditionally executed code block?

Choices

  • A: if day_of_week = 1 and review_period:
  • B: if day_of_week = 1 and review_period = "True":
  • C: if day_of_week 1 and review_period "True":
  • D: if day_of_week == 1 and review_period:
  • E: if day_of_week = 1 & review_period: = "True":
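The difference between = and == in these choices can be checked directly in Python. The sketch below is illustrative only: the helper name should_run_final_block is hypothetical, and compile() is used simply to test whether a candidate line parses without running it.

```python
def should_run_final_block(day_of_week, review_period):
    """Return True only when day_of_week equals 1 and review_period is truthy."""
    # == compares values; a boolean variable can be tested directly.
    if day_of_week == 1 and review_period:
        return True
    return False


# A single = inside an if condition is assignment syntax, which Python
# rejects at parse time. compile() lets us observe that safely.
try:
    compile("if day_of_week = 1 and review_period:\n    pass", "<choice>", "exec")
    assignment_form_compiles = True
except SyntaxError:
    assignment_form_compiles = False

print(should_run_final_block(1, True))
print(should_run_final_block(2, True))
print(assignment_form_compiles)
```

Note also that comparing a boolean to the string "True" would test against text rather than the boolean value True, so the two are not interchangeable.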