Questions and Answers
Question jslH6mRYoMD15DyFXn1Z
Question
A data engineer needs access to a table new_table, but they do not have the correct permissions. They can ask the table owner for permission, but they do not know who the table owner is.
Which of the following approaches can be used to identify the owner of new_table?
Choices
- A: Review the Permissions tab in the table’s page in Data Explorer
- B: There is no way to identify the owner of the table
- C: Review the Owner field in the table’s page in Data Explorer
- D: Review the Owner field in the table’s page in the cloud storage solution
Answer: C
Community answer: C (100%)
Discussion
Comment 1338643 by Sd1988
- Upvotes: 1
Selected Answer: C. C is the best choice.
Question J9knXs2ep9qUKhlYORyA
Question
In which of the following scenarios should a data engineer use the MERGE INTO command instead of the INSERT INTO command?
Choices
- A: When the location of the data needs to be changed
- B: When the target table is an external table
- C: When the source is not a Delta table
- D: When the target table cannot contain duplicate records
Answer: D
Community answer: D (100%)
Discussion
Comment 1366382 by Kayceetalks
- Upvotes: 1
Selected Answer: D. Correct answer.
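The reason D is correct: MERGE INTO performs an upsert, so a source row whose key already exists in the target updates that row instead of appending a second copy, while INSERT INTO always appends. A minimal sketch of that semantics in plain Python (a toy model of the behavior, not the Spark API; the `id` key and row shapes are invented for illustration):

```python
# Toy model of INSERT INTO vs MERGE INTO semantics (not the Spark API).
# The target "table" is a list of dicts keyed by "id".

def insert_into(target, source_rows):
    """INSERT INTO: blindly appends, so repeated keys create duplicates."""
    target.extend(source_rows)

def merge_into(target, source_rows, key="id"):
    """MERGE INTO: update the matching row if the key exists, insert otherwise."""
    index = {row[key]: i for i, row in enumerate(target)}
    for row in source_rows:
        if row[key] in index:
            target[index[row[key]]] = row        # WHEN MATCHED THEN UPDATE
        else:
            index[row[key]] = len(target)        # WHEN NOT MATCHED THEN INSERT
            target.append(row)

target = [{"id": 1, "name": "a"}]
merge_into(target, [{"id": 1, "name": "a2"}, {"id": 2, "name": "b"}])
# id 1 was updated in place; id 2 was inserted; no duplicates exist
```

In Databricks SQL the equivalent is `MERGE INTO target USING source ON target.id = source.id WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT *`.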
Question k3ewF4JP1laHH8MTrckB
Question
A data engineer is designing a data pipeline. The source system generates files in a shared directory that is also used by other processes. As a result, the files should be kept as is and will accumulate in the directory. The data engineer needs to identify which files are new since the previous run in the pipeline, and set up the pipeline to only ingest those new files with each run.
Which of the following tools can the data engineer use to solve this problem?
Choices
- A: Unity Catalog
- B: Delta Lake
- C: Databricks SQL
- D: Auto Loader
Answer: D
Community answer: D (100%)
Discussion
Comment 1388574 by e872ce8
- Upvotes: 1
Selected Answer: D. Auto Loader is a Databricks feature that automatically ingests new data files as they appear in a specified directory and efficiently handles large volumes of data. It tracks which files are new since the previous run and processes only those files, which fits the use case exactly.
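Conceptually, Auto Loader solves this by checkpointing which files it has already discovered and ingesting only the unseen ones on each run. That discovery step can be sketched in plain Python (a toy illustration only, not Auto Loader's actual implementation; the checkpoint file and function name are invented):

```python
import json
import os

def new_files_since_last_run(directory, checkpoint_path):
    """Return files not seen on previous runs, then update the checkpoint.

    Toy sketch of incremental file discovery; Auto Loader does this
    (plus schema inference and scalable file notification) internally.
    """
    seen = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            seen = set(json.load(f))

    current = {name for name in os.listdir(directory)
               if os.path.isfile(os.path.join(directory, name))}

    new = sorted(current - seen)          # only files added since the last run
    with open(checkpoint_path, "w") as f:
        json.dump(sorted(seen | current), f)
    return new
```

In Databricks itself this is handled declaratively with `spark.readStream.format("cloudFiles")`, which manages the checkpoint state for you.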
Question IyyWKo0V1youUofPtAzS
Question
What is stored in the Databricks customer’s cloud account?
Choices
- A: Databricks web application
- B: Cluster management metadata
- C: Notebooks
- D: Data
Answer: D
Community answer: D (100%)
Discussion
Comment 1388584 by e872ce8
- Upvotes: 1
Selected Answer: D. In Databricks, the customer's cloud account primarily stores data, in cloud storage services (e.g., AWS S3, Azure Blob Storage, or Google Cloud Storage) linked to the Databricks environment. Databricks manages and processes the data using its clusters, but the data itself resides in the cloud storage chosen by the customer.
Question I4QtsNpFMjL2XXbiTYon
Question
A data engineer wants to create a relational object by pulling data from two tables. The relational object does not need to be used by other data engineers in other sessions. In order to save on storage costs, the data engineer wants to avoid copying and storing physical data.
Which of the following relational objects should the data engineer create?
Choices
- A: Spark SQL Table
- B: View
- C: Delta Table
- D: Temporary view
Answer: D
Community answer: D (100%)
Discussion
Comment 1388659 by e872ce8
- Upvotes: 1
Selected Answer: D. A temporary view in Databricks lets a data engineer create a relational object that pulls data from other tables without requiring physical storage. It is session-scoped: it exists only for the duration of the session and is never persisted, which saves on storage costs.
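The two properties the question tests, no physical copy of the data and visibility limited to one session, can be demonstrated outside Databricks as well. SQLite's `CREATE TEMP VIEW` behaves analogously (a standalone illustration of the concept, not Databricks syntax; in Databricks the equivalent statement is `CREATE TEMPORARY VIEW ... AS SELECT ...`). The table names and rows below are invented for the demo:

```python
import os
import sqlite3
import tempfile

# A file-backed database stands in for shared storage; each connection is a "session".
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

session_a = sqlite3.connect(db_path)
session_a.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
session_a.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
session_a.execute("INSERT INTO orders VALUES (1, 9.5)")
session_a.execute("INSERT INTO customers VALUES (1, 'ada')")
session_a.commit()

# The temp view joins the two tables without copying or storing any data...
session_a.execute("""
    CREATE TEMP VIEW order_report AS
    SELECT c.name, o.amount FROM orders o JOIN customers c ON o.id = c.id
""")
rows = session_a.execute("SELECT * FROM order_report").fetchall()

# ...and is invisible to any other session, like a Databricks temporary view.
session_b = sqlite3.connect(db_path)
try:
    session_b.execute("SELECT * FROM order_report")
    visible_elsewhere = True
except sqlite3.OperationalError:   # "no such table: order_report"
    visible_elsewhere = False
```

A plain `View` (choice B) would also avoid physical storage, but it is persisted in the metastore and visible to other sessions, which the scenario says is unnecessary.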