Questions and Answers
Question mJflUaButPk6H4EANrPu
Question
Which of the following code blocks will remove the rows where the value in column age is greater than 25 from the existing Delta table my_table and save the updated table?
Choices
- A: SELECT * FROM my_table WHERE age > 25;
- B: UPDATE my_table WHERE age > 25;
- C: DELETE FROM my_table WHERE age > 25;
- D: UPDATE my_table WHERE age <= 25;
- E: DELETE FROM my_table WHERE age <= 25;
answer?
Answer: C (community answer: C 93%, other 7%)
Discussion
Comment 1558182 by sunil01
- Upvotes: 1
Selected Answer: C Answer is C
Comment 1411225 by devbila
- Upvotes: 1
Selected Answer: C Response is C
Comment 1262387 by 80370eb
- Upvotes: 1
Selected Answer: C to remove the data from a table we can use delete from table with condition.
Comment 1227525 by mascarenhaslucas
- Upvotes: 1
Selected Answer: C The answer is C!
Comment 1198176 by Svengance
- Upvotes: 1
Selected Answer: A there is not delete history option just the vacuum with its parameters of time retention.
Comment 1182679 by bettermakeme
- Upvotes: 3
Answer is C. Just finished exam-got 100% [Databricks Associate Exam Practice Exams] All questions came from Databricks Certified Data Engineer Associate https://www.udemy.com/share/10aEFa3@9M_uT6vrKbnl68tOK96kfy-YWitjwzLTlVCrzPs-0hGUu8fyX8V4Tn_x_y65bwLm/
Comment 1177169 by Itmma
- Upvotes: 1
Selected Answer: C C is correct
Comment 1104702 by SerGrey
- Upvotes: 1
Selected Answer: C C. DELETE FROM my_table WHERE age > 25;
Comment 1028761 by VijayKula
- Upvotes: 1
Selected Answer: C Answer is C
Comment 1019619 by DavidRou
- Upvotes: 2
Selected Answer: C C is the correct answer: the SELECT statement queries a table, and the UPDATE statement modifies values in columns. To remove the rows that match a specific condition, you must use DELETE.
Comment 1017341 by KalavathiP
- Upvotes: 1
Selected Answer: C C is correct
Comment 1005345 by ArindamNath
- Upvotes: 1
C is correct
Comment 997866 by vctrhugo
- Upvotes: 1
Selected Answer: C C. DELETE FROM my_table WHERE age > 25;
Comment 941041 by nb1000
- Upvotes: 1
C is correct
Comment 895828 by prasioso
- Upvotes: 2
C is correct. Use DELETE FROM to delete existing records from the table. UPDATE is used to modify existing records. SELECT only returns a result set; it does not alter the table records.
Comment 876193 by Varma_Saraswathula
- Upvotes: 1
C - is correct answer
Comment 860268 by knivesz
- Upvotes: 2
Selected Answer: C C is correct
Comment 859619 by surrabhi_4
- Upvotes: 1
Selected Answer: C option c
Comment 857960 by XiltroX
- Upvotes: 3
C is the correct answer
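To make the difference between the choices concrete, here is a minimal sketch. It assumes an in-memory SQLite database (Python's standard sqlite3 module) as a stand-in for the Delta table, and the sample rows are made up. On Databricks the same DELETE statement would be run in a SQL cell or through spark.sql; only the row-removal semantics are illustrated here.

```python
import sqlite3

# Assumption: SQLite stands in for the Delta table; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("ana", 22), ("bo", 25), ("cy", 31), ("di", 40)],
)

# Choice A: SELECT only reads; the table still holds every row afterwards.
selected = conn.execute("SELECT * FROM my_table WHERE age > 25").fetchall()
count_after_select = conn.execute("SELECT count(*) FROM my_table").fetchone()[0]

# Choices B and D: UPDATE without a SET clause is not valid SQL.
try:
    conn.execute("UPDATE my_table WHERE age > 25")
    update_error = None
except sqlite3.OperationalError as exc:
    update_error = str(exc)

# Choice C: DELETE removes the rows that match the condition.
conn.execute("DELETE FROM my_table WHERE age > 25")
conn.commit()
remaining = conn.execute(
    "SELECT name, age FROM my_table ORDER BY age"
).fetchall()

print(count_after_select, update_error, remaining)
```

Choice E uses the same statement with the opposite condition, so it would remove the rows that should be kept.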
Question jbRGBPb2V1JDh4FE6DDj
Question
A data analyst has developed a query that runs against a Delta table. They want help from the data engineering team to implement a series of tests to ensure the data returned by the query is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which of the following operations could the data engineering team use to run the query and operate with the results in PySpark?
Choices
- A: SELECT * FROM sales
- B: spark.delta.table
- C: spark.sql
- D: There is no way to share data between PySpark and SQL.
- E: spark.table
answer?
Answer: C (community answer: C 89%, other 11%)
Discussion
Comment 1048855 by kishanu
- Upvotes: 8
Selected Answer: C spark.sql() should be used to execute a SQL query with PySpark. spark.table() can only be used to load a table, not to run a query.
Comment 1203837 by benni_ale
- Upvotes: 1
Selected Answer: E I am not sure whether it is C or E. I see the majority went for E, but you can still query your data with spark.table by using pure PySpark syntax. I don't see any part of the question specifying that you HAVE to use SQL syntax.
Comment 1050154 by meow_akk
- Upvotes: 3
C is correct. Example:
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
df = spark.sql("SELECT * FROM sales")
print(df.count())
Question L6ZRNnpBjF0O0oa3hv9w
Question
Which of the following commands will return the number of null values in the member_id column?
Choices
- A: SELECT count(member_id) FROM my_table;
- B: SELECT count(member_id) - count_null(member_id) FROM my_table;
- C: SELECT count_if(member_id IS NULL) FROM my_table;
- D: SELECT null(member_id) FROM my_table;
- E: SELECT count_null(member_id) FROM my_table;
answer?
Answer: C (community answer: C 100%)
Discussion
Comment 1203835 by benni_ale
- Upvotes: 1
Selected Answer: C C is correct
Comment 1117402 by bartfto
- Upvotes: 1
Selected Answer: C C: There are no ‘null’ and ‘count_null’ functions in SparkSQL
Comment 1083352 by 55f31c8
- Upvotes: 4
Selected Answer: C https://docs.databricks.com/en/sql/language-manual/functions/count_if.html
Comment 1050156 by meow_akk
- Upvotes: 3
Ans C : https://docs.databricks.com/en/sql/language-manual/functions/count.html
Returns A BIGINT.
If * is specified also counts row containing NULL values.
If expr are specified counts only rows for which all expr are not NULL.
If DISTINCT duplicate rows are not counted.
Comment 1048860 by kishanu
- Upvotes: 3
Selected Answer: C count_if() can be used in this scenario
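The count() behaviour quoted above is why choice A returns the number of non-null values rather than the number of nulls. A minimal sketch follows, assuming an in-memory SQLite table with made-up rows. SQLite has no count_if, so the null count is written as a portable SUM(CASE ...) expression; in Spark SQL the equivalent would be count_if(member_id IS NULL).

```python
import sqlite3

# Assumption: SQLite stands in for the real table; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (member_id INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?)",
    [(1,), (None,), (3,), (None,), (None,)],
)

total_rows, non_null, nulls = conn.execute(
    """
    SELECT count(*),           -- counts every row, NULLs included
           count(member_id),   -- counts only rows where member_id is not NULL
           sum(CASE WHEN member_id IS NULL THEN 1 ELSE 0 END)  -- portable count_if
    FROM my_table
    """
).fetchone()

print(total_rows, non_null, nulls)
```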
Question N03FznZNv8TYlQ4aW8xV
Question
A data engineer needs to apply custom logic to identify employees with more than 5 years of experience in array column employees in table stores. The custom logic should create a new column exp_employees that is an array of all of the employees with more than 5 years of experience for each row. In order to apply this custom logic at scale, the data engineer wants to use the FILTER higher-order function.
Which of the following code blocks successfully completes this task?
Choices
- A:
- B:
- C:
- D:
- E:
answer?
Answer: A (community answer: A 100%)
Discussion
Comment 1203838 by benni_ale
- Upvotes: 1
Selected Answer: A A is correct
Comment 1101170 by AndreFR
- Upvotes: 2
Selected Answer: A B & E incorrect : source is employees not exp_employees
D incorrect : does not use the FILTER higher-order function
C incorrect : syntax error
A : correct by elimination & based on https://docs.databricks.com/en/sql/language-manual/functions/filter.html#examples
Comment 1089733 by kz_data
- Upvotes: 2
Selected Answer: A A is correct
Comment 1083361 by 55f31c8
- Upvotes: 3
Selected Answer: A https://docs.databricks.com/en/sql/language-manual/functions/filter.html
Comment 1050157 by meow_akk
- Upvotes: 4
A is correct.
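The answer choices did not survive in this copy, so the following is only a sketch of the general shape such a query might take, not a reconstruction of choice A. The struct field name years_exp, the lambda variable e, and the sample rows are all assumptions. The SQL text follows the documented filter(expr, func) form; the pure-Python function mirrors what FILTER does to each row's array so the idea can be checked without a Spark cluster.

```python
# Hypothetical query text: years_exp is an assumed field of each employee struct.
FILTER_QUERY = """
SELECT *,
       FILTER(employees, e -> e.years_exp > 5) AS exp_employees
FROM stores
"""


def filter_experienced(employees, min_years=5):
    """Pure-Python mirror of FILTER: keep elements whose predicate is true."""
    return [e for e in employees if e["years_exp"] > min_years]


# Invented sample rows standing in for the stores table.
stores = [
    {"store_id": 1, "employees": [{"name": "Ana", "years_exp": 7},
                                  {"name": "Bo", "years_exp": 2}]},
    {"store_id": 2, "employees": [{"name": "Cy", "years_exp": 5}]},
]

# One output row per input row, with the new array column added.
result = [
    {**row, "exp_employees": filter_experienced(row["employees"])}
    for row in stores
]
print(result)
```

Note that the source array is employees and the new column is exp_employees, which is the distinction the elimination of B and E above relies on.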
Question NFQ87hcaQNJ4s5b4wiI0
Question
A data engineer has a Python variable table_name that they would like to use in a SQL query. They want to construct a Python code block that will run the query using table_name.
They have the following incomplete code block:
____(f"SELECT customer_id, spend FROM {table_name}")
Which of the following can be used to fill in the blank to successfully complete the task?
Choices
- A: spark.delta.sql
- B: spark.delta.table
- C: spark.table
- D: dbutils.sql
- E: spark.sql
answer?
Answer: E (community answer: E 100%)
Discussion
Comment 1360126 by sandbar_dorados_09
- Upvotes: 1
Selected Answer: E spark.sql() is a PySpark function allowing the caller to execute a SQL query from Python
Comment 1203839 by benni_ale
- Upvotes: 2
Selected Answer: E E is correct
Comment 1127401 by azure_bimonster
- Upvotes: 2
Selected Answer: E E is correct
Comment 1050159 by meow_akk
- Upvotes: 4
E is correct. You use spark.sql to execute SQL queries from Python.
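As a rough sketch of the completed code block: the helper names build_spend_query and run_spend_query are hypothetical, and the identifier check is an extra precaution that is not part of the question. The spark argument is assumed to be an existing SparkSession, as in a Databricks notebook.

```python
import re

# Assumption: table names are plain identifiers with up to three dot-separated parts.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}")


def build_spend_query(table_name: str) -> str:
    """Interpolate the Python variable into the SQL text with an f-string."""
    if not _IDENTIFIER.fullmatch(table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    return f"SELECT customer_id, spend FROM {table_name}"


def run_spend_query(spark, table_name: str):
    """Run the query; spark.sql returns a DataFrame on a real SparkSession."""
    return spark.sql(build_spend_query(table_name))
```

In a notebook this would be called as run_spend_query(spark, table_name). Checking the name first matters because an f-string pastes the value straight into the SQL text.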