Questions and Answers
Question QHc4A85C4hythhACPdEM
Question
A data engineer has been given a new record of data:
id STRING = 'a1'  rank INTEGER = 6  rating FLOAT = 9.4
Which of the following SQL commands can be used to append the new record to an existing Delta table my_table?
Choices
- A: INSERT INTO my_table VALUES ('a1', 6, 9.4)
- B: my_table UNION VALUES ('a1', 6, 9.4)
- C: INSERT VALUES ('a1', 6, 9.4) INTO my_table
- D: UPDATE my_table VALUES ('a1', 6, 9.4)
- E: UPDATE VALUES ('a1', 6, 9.4) my_table
answer?
Answer: A Answer_ET: A Community answer A (100%) Discussion
Comment 1263467 by 80370eb
- Upvotes: 1
Selected Answer: A INSERT INTO my_table is the correct command
Comment 1203822 by benni_ale
- Upvotes: 1
Selected Answer: A A is correct
Comment 1127372 by azure_bimonster
- Upvotes: 2
Selected Answer: A A is correct because syntax is correct
Comment 1117568 by Annelijn
- Upvotes: 2
Selected Answer: A A is correct
Comment 1050931 by meow_akk
- Upvotes: 3
Ans A : check the correct syntax for insert into
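A minimal sketch of the accepted answer (A), assuming a Delta table named my_table with the schema given in the question:

```sql
-- Create the target Delta table (schema taken from the question).
CREATE TABLE IF NOT EXISTS my_table (
  id STRING,
  rank INT,
  rating FLOAT
) USING DELTA;

-- Option A: append the new record with INSERT INTO ... VALUES.
INSERT INTO my_table VALUES ('a1', 6, 9.4);
```

The other options fail because UNION is a query operator, not a write command, and UPDATE modifies existing rows rather than appending new ones; option C reverses the required INSERT INTO clause order.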
Question lBgbN1YHqmWQOjBzxobo
Question
A data engineer has realized that the data files associated with a Delta table are incredibly small. They want to compact the small files to form larger files to improve performance.
Which of the following keywords can be used to compact the small files?
Choices
- A: REDUCE
- B: OPTIMIZE
- C: COMPACTION
- D: REPARTITION
- E: VACUUM
answer?
Answer: B Answer_ET: B Community answer B (100%) Discussion
Comment 1048842 by kishanu
- Upvotes: 5
Selected Answer: B OPTIMIZE can be used to combine small files into larger ones and improve performance.
Comment 1263464 by 80370eb
- Upvotes: 1
Selected Answer: B The OPTIMIZE command in Databricks is used to compact small files into larger ones, improving the performance of Delta tables.
Comment 1262789 by 80370eb
- Upvotes: 1
Selected Answer: B The OPTIMIZE command in Delta Lake merges small files into larger files, which can help improve query performance and manage storage more efficiently.
Comment 1203823 by benni_ale
- Upvotes: 1
Selected Answer: B B is correct
Comment 1132184 by UGOTCOOKIES
- Upvotes: 2
Selected Answer: B OPTIMIZE is the correct answer. Compacting small files with the OPTIMIZE command improves table performance by combining multiple small files into larger ones.
Comment 1127373 by azure_bimonster
- Upvotes: 2
Selected Answer: B OPTIMIZE would help in this scenario
Comment 1094881 by nedlo
- Upvotes: 2
Selected Answer: B Its B https://docs.databricks.com/en/delta/optimize.html
Comment 1050932 by meow_akk
- Upvotes: 3
Ans B : optimize is used to compact small files which in turn improves perf
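A short sketch of the accepted answer (B); my_table is a hypothetical Delta table name, and the ZORDER BY column is illustrative:

```sql
-- Compact the Delta table's small data files into larger ones.
OPTIMIZE my_table;

-- Optionally co-locate related data by a column while compacting
-- (Z-ordering), which can further improve data skipping on reads.
OPTIMIZE my_table ZORDER BY (id);
```

VACUUM, by contrast, removes data files no longer referenced by the table's transaction log; it cleans up storage but does not compact files.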
Question OjcWDz5SyNJlXdIMDH7e
Question
In which of the following file formats is data from Delta Lake tables primarily stored?
Choices
- A: Delta
- B: CSV
- C: Parquet
- D: JSON
- E: A proprietary, optimized format specific to Databricks
answer?
Answer: C Answer_ET: C Community answer C (90%) 10% Discussion
Comment 1263465 by 80370eb
- Upvotes: 1
Selected Answer: C Delta Lake builds on top of the Parquet file format, adding features like ACID transactions, versioning, and more, while leveraging Parquet’s efficient columnar storage capabilities.
Comment 1262790 by 80370eb
- Upvotes: 1
Selected Answer: C Delta Lake tables use the Parquet file format for storing data, which is a columnar storage format optimized for performance and efficient data processing.
Comment 1203824 by benni_ale
- Upvotes: 1
Selected Answer: C Parquet for data and JSON for metadata
Comment 1127374 by azure_bimonster
- Upvotes: 1
Selected Answer: C Parquet
Comment 1088506 by nedlo
- Upvotes: 1
Selected Answer: C Parquet format because its columnar format, much faster alternative to CSV because it supports partition pruning for example. No such file format as “Delta”
Comment 1057636 by kishore1980
- Upvotes: 3
Selected Answer: C Parquet format is correct
Comment 1055975 by kishanu
- Upvotes: 1
Selected Answer: C Parquet it is
Comment 1052553 by SD5713
- Upvotes: 1
Selected Answer: B parquet format
Comment 1050939 by meow_akk
- Upvotes: 1
So the data from a Delta Lake table is stored in Parquet format, while the table format itself is Delta, which is confusing. Some notes: "What format does Delta Lake use to store data? Delta Lake uses versioned Parquet files to store your data in your cloud storage. Apart from the versions, Delta Lake also stores a transaction log to keep track of all the commits made to the table or blob store directory to provide ACID transactions." https://docs.delta.io/latest/delta-faq.html
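The storage layout discussed above can be inspected directly; a sketch assuming a Delta table named my_table:

```sql
-- DESCRIBE DETAIL reports the table's format and storage location.
-- For a Delta table the `format` column is 'delta', but the data
-- files under `location` are versioned Parquet files, alongside a
-- _delta_log/ directory holding the JSON transaction log.
DESCRIBE DETAIL my_table;
```

This is why answer C (Parquet) is correct: "Delta" names the table format and its transaction-log protocol, not the file format of the data files themselves.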
Question UNUNi5c0rbqur26Cy4GD
Question
Which of the following is stored in the Databricks customer’s cloud account?
Choices
- A: Databricks web application
- B: Cluster management metadata
- C: Repos
- D: Data
- E: Notebooks
answer?
Answer: D Answer_ET: D Community answer D (100%) Discussion
Comment 1263468 by 80370eb
- Upvotes: 2
Selected Answer: D Data stored in Delta Lake or other formats on cloud storage is managed within the customer’s own cloud account, while other components like the Databricks web application and cluster management metadata are managed by Databricks itself.
Comment 1203827 by benni_ale
- Upvotes: 2
Selected Answer: D Data is in Data plane
Comment 1156133 by Bob123456
- Upvotes: 1
Answer should be B, because when the customer sets up a Spark cluster, the cluster virtual machines are deployed in the data plane in the customer's cloud account.
Comment 1127376 by azure_bimonster
- Upvotes: 2
Selected Answer: D D is correct
Comment 1117391 by bartfto
- Upvotes: 2
Selected Answer: D D. Data
Comment 1050118 by meow_akk
- Upvotes: 4
D. Data
Question PLgn53CU3deo2sk63LtA
Question
Which of the following can be used to simplify and unify siloed data architectures that are specialized for specific use cases?
Choices
- A: None of these
- B: Data lake
- C: Data warehouse
- D: All of these
- E: Data lakehouse
answer?
Answer: E Answer_ET: E Community answer E (100%) Discussion
Comment 1203828 by benni_ale
- Upvotes: 1
Selected Answer: E E is correct
Comment 1127377 by azure_bimonster
- Upvotes: 2
Selected Answer: E Lakehouse, so E is correct
Comment 1048844 by kishanu
- Upvotes: 4
Selected Answer: E Data Lakehouse can be used as a single source of truth for multiple specific use cases