Questions and Answers
Question Irf8QUGT0NLGS3dFBbHC
Question
Which of the following commands can be used to write data into a Delta table while avoiding the writing of duplicate records?
Choices
- A: DROP
- B: IGNORE
- C: MERGE
- D: APPEND
- E: INSERT
Answer: C
Community answer: C (94%), other (6%)
Discussion
Comment 1262402 by 80370eb
- Upvotes: 3
Selected Answer: C
The MERGE command allows you to perform upserts (update and insert) into a Delta table, effectively avoiding duplicates by updating existing records and inserting new ones as needed.
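As an illustration of that upsert pattern, a minimal MERGE sketch (the table and column names here are hypothetical):

```sql
-- Hypothetical tables: upsert new_records into target on a key column,
-- updating matches and inserting the rest, so no duplicate keys are written
MERGE INTO target AS t
USING new_records AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
```

The ON condition is what prevents duplicates: rows already present are updated in place rather than inserted a second time.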
Comment 1213173 by BharaniRaj
- Upvotes: 1
Selected Answer: C
C is the right answer
Comment 1203174 by benni_ale
- Upvotes: 1
Selected Answer: C
MERGE
Comment 1113196 by SerGrey
- Upvotes: 1
Selected Answer: C
The correct answer is C
Comment 1064802 by awofalus
- Upvotes: 1
Selected Answer: C
C is correct
Comment 1047724 by J_1_2
- Upvotes: 1
Selected Answer: C
MERGE is correct
Comment 1028518 by DavidRou
- Upvotes: 2
MERGE INTO is the one to choose if you want to avoid duplicates.
Comment 1020507 by chris_mach
- Upvotes: 1
Selected Answer: C
MERGE is correct
Comment 1017354 by KalavathiP
- Upvotes: 1
Selected Answer: C
MERGE avoids duplicates by matching incoming records against existing ones on key columns
Comment 997923 by vctrhugo
- Upvotes: 3
Selected Answer: C
The MERGE command is used to write data into a Delta table while avoiding the writing of duplicate records. It allows you to perform an “upsert” operation, which means that it will insert new records and update existing records in the Delta table based on a specified condition. This helps maintain data integrity and avoid duplicates when adding new data to the table.
Comment 946000 by Atnafu
- Upvotes: 2
C. MERGE
To write data into a Delta table while avoiding the writing of duplicate records, you can use the MERGE command. The MERGE command in Delta Lake allows you to combine the ability to insert new records and update existing records in a single atomic operation.
The MERGE command compares the data being written with the existing data in the Delta table based on specified matching criteria, typically using a primary key or unique identifier. It then performs conditional actions, such as inserting new records or updating existing records, depending on the comparison results.
By using the MERGE command, you can handle the prevention of duplicate records in a more controlled and efficient manner. It allows you to synchronize and reconcile data from different sources while avoiding duplication and ensuring data integrity.
Therefore, option C, MERGE, is the correct command to use when writing data into a Delta table while avoiding the writing of duplicate records.
Comment 889315 by softthinkers
- Upvotes: 2
Answer is C. DROP is used to remove a table or database; IGNORE is used to skip errors while executing a query; INSERT will add new records but will not avoid duplication. So MERGE is the right answer.
Comment 876213 by Varma_Saraswathula
- Upvotes: 2
Answer: C. See https://docs.databricks.com/sql/language-manual/delta-merge-into.html
Comment 875872 by naxacod574
- Upvotes: 1
Option C
Comment 861295 by XiltroX
- Upvotes: 1
Selected Answer: D
Wrong answer. The correct answer is D.
Comment 858873 by knivesz
- Upvotes: 3
Selected Answer: C
The only possible option
Question xi85iH0ejppDMhZGUq77
Question
A data organization leader is upset about the data analysis team’s reports being different from the data engineering team’s reports. The leader believes the siloed nature of their organization’s data engineering and data analysis architectures is to blame.
Which of the following describes how a data lakehouse could alleviate this issue?
Choices
- A: Both teams would respond more quickly to ad-hoc requests
- B: Both teams would use the same source of truth for their work
- C: Both teams would reorganize to report to the same department
- D: Both teams would be able to collaborate on projects in real-time
Answer: B
Community answer: B (100%)
Discussion
Comment 1322554 by Manish_Kum
- Upvotes: 2
Selected Answer: B
B is correct
Question JoxonN9nMR9NV4NYHAbp
Question
A data analyst has developed a query that runs against a Delta table. They want help from the data engineering team to implement a series of tests to ensure the data returned by the query is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which of the following operations could the data engineering team use to run the query and operate with the results in PySpark?
Choices
- A: SELECT * FROM sales
- B: spark.delta.table
- C: spark.sql
- D: spark.table
Answer: C
Community answer: C (100%)
Discussion
Comment 1322555 by Manish_Kum
- Upvotes: 1
Selected Answer: C
C is correct
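As a sketch of the spark.sql approach (this assumes a Databricks notebook, where spark is the active SparkSession, and a hypothetical sales table with a price column):

```python
# Run the analyst's SQL as-is and get the result back as a PySpark DataFrame
df = spark.sql("SELECT * FROM sales")

# Tests can then be written in Python against the DataFrame,
# e.g. a simple cleanliness check for negative prices
assert df.filter(df.price < 0).count() == 0, "found negative prices"
```

This is why spark.sql fits the scenario: it runs the existing SQL unchanged and hands the result to Python, whereas spark.table only reads a table by name without running an arbitrary query.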
Question ORcvACP85uckLE1bLbvZ
Question
A data engineer has a Job that has a complex run schedule, and they want to transfer that schedule to other Jobs.
Rather than manually selecting each value in the scheduling form in Databricks, which of the following tools can the data engineer use to represent and submit the schedule programmatically?
Choices
- A: pyspark.sql.types.DateType
- B: datetime
- C: pyspark.sql.types.TimestampType
- D: Cron syntax
Answer: D
Community answer: D (100%)
Discussion
Comment 1335986 by duzi
- Upvotes: 1
Selected Answer: D
This question is repeated. See details at https://learn.microsoft.com/en-us/azure/databricks/jobs/scheduled
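Databricks Jobs accept Quartz cron expressions, so a complex schedule can be captured as a single string and reused across Jobs. The expression below is a made-up example:

```
# Quartz cron fields: seconds minutes hours day-of-month month day-of-week
0 30 7 ? * MON-FRI    # hypothetical: run at 07:30 every weekday
```

When submitting a Job programmatically via the Jobs API, this string goes in the schedule's quartz_cron_expression field alongside a timezone_id, which is what makes the schedule easy to copy to other Jobs.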
Question XyQAuPXru07NhesFdreT
Question
A data engineer and data analyst are working together on a data pipeline. The data engineer is working on the raw, bronze, and silver layers of the pipeline using Python, and the data analyst is working on the gold layer of the pipeline using SQL. The raw source of the pipeline is a streaming input. They now want to migrate their pipeline to use Delta Live Tables.
Which of the following changes will need to be made to the pipeline when migrating to Delta Live Tables?
Choices
- A: The pipeline will need to be written entirely in Python
- B: The pipeline will need to stop using the medallion-based multi-hop architecture
- C: The pipeline will need to be written entirely in SQL
- D: The pipeline will need to use a batch source in place of a streaming source
Answer: D
Community answer: D (56%), A (33%), other (11%)
Discussion
Comment 1409787 by Billybob0604
- Upvotes: 1
Selected Answer: C
Delta Live Tables (DLT) currently requires SQL or Python for defining data pipelines. However, for streaming data, SQL has become the primary language for defining Delta Live Tables pipelines in Databricks.
Comment 1388415 by e872ce8
- Upvotes: 1
Selected Answer: A
A & C. The pipeline will need to be written entirely in Python (if using the Python APIs for Delta Live Tables) or entirely in SQL (if using SQL-based Delta Live Tables). Delta Live Tables (DLT) is a declarative ETL framework built on Databricks, designed to simplify pipeline development and management. It supports Python-based pipelines using @dlt.table decorators and SQL-based pipelines using CREATE LIVE TABLE. Since the data engineer is using Python and the data analyst is using SQL, the pipeline will need to be rewritten in one of the two supported languages.
Comment 1366381 by Kayceetalks
- Upvotes: 2
Selected Answer: D
None of these options are correct
Comment 1328533 by MultiCloudIronMan
- Upvotes: 2
Selected Answer: A
The correct response is A. None of these changes will need to be made. Delta Live Tables supports both Python and SQL, as well as streaming and batch sources. This means that the existing medallion-based multi-hop architecture can be maintained, and the pipeline can continue to use both Python and SQL for different layers. Therefore, no changes are necessary when migrating to Delta Live Tables.
Comment 1322557 by Manish_Kum
- Upvotes: 3
Selected Answer: D
The best choice in this question is D
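For context on the language question debated above: a minimal sketch of a bronze streaming table defined with the DLT Python API (the source path and table name are hypothetical, and the code assumes it runs inside a Delta Live Tables pipeline, where the dlt module and spark session are provided):

```python
import dlt

@dlt.table(comment="Bronze layer: raw streaming events (hypothetical source)")
def bronze_events():
    # Auto Loader (cloudFiles) incrementally ingests the streaming input
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/raw/events"))
```

A gold-layer table could equally be defined in SQL with CREATE OR REFRESH LIVE TABLE, since, as the comments note, DLT pipelines support both languages.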