Databricks-Certified-Professional-Data-Scientist Questions and Answers

Question # 6

RMSE measures the error of a predicted

A.

Numerical Value

B.

Categorical values

C.

For both numerical and categorical values

Question # 7

Select the problems that can be solved using SVMs

A.

SVMs are helpful in text and hypertext categorization

B.

Classification of images can also be performed using SVMs

C.

SVMs are also useful in medical science to classify proteins with up to 90% of the compounds classified correctly

D.

Hand-written characters can be recognized using SVM

Question # 8

In which lifecycle stage are appropriate analytical techniques determined?

A.

Model planning

B.

Model building

C.

Data preparation

D.

Discovery

Question # 9

Which of the following is not a classification algorithm?

A.

Logistic Regression

B.

Support Vector Machine

C.

Neural Network

D.

Hidden Markov Models

E.

None of the above

Question # 10

Select the correct statements regarding naive Bayes classification

A.

It only requires a small amount of training data to estimate the parameters

B.

Independent variables can be assumed

C.

Only the variances of the variables for each class need to be determined

D.

For each class, the entire covariance matrix needs to be determined

Question # 11

Suppose A, B, and C are events. The probability of A given B, relative to P(·|C), is the same as the probability of A given B and C (relative to P). That is,

A.

P(A,B|C) / P(B|C) = P(A|B,C)

B.

P(A,B|C) / P(B|C) = P(B|A,C)

C.

P(A,B|C) / P(B|C) = P(C|B,C)

D.

P(A,B|C) / P(B|C) = P(A|C,B)
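
The identity described in the question follows from the definition of conditional probability. A short derivation, assuming P(B, C) > 0 so that every conditional probability is defined:

```latex
\begin{aligned}
P(A \mid B, C)
  &= \frac{P(A, B, C)}{P(B, C)} \\
  &= \frac{P(A, B \mid C)\, P(C)}{P(B \mid C)\, P(C)} \\
  &= \frac{P(A, B \mid C)}{P(B \mid C)} .
\end{aligned}
```

Because conditioning on "B and C" does not depend on the order in which the events are written, P(A | B, C) and P(A | C, B) denote the same quantity.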

Question # 12

Regularization is an important technique in machine learning for preventing overfitting. Mathematically, it adds a penalty term to the loss so that the coefficients cannot fit the training data so closely that the model overfits. The difference between L1 and L2 regularization is...

A.

L2 is the sum of the squares of the weights, while L1 is the sum of the absolute values of the weights

B.

L1 is the sum of the squares of the weights, while L2 is the sum of the absolute values of the weights

C.

L1 gives non-sparse outputs while L2 gives sparse outputs

D.

None of the above
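
The two penalty terms can be compared with a minimal sketch. The function names and the weight vector are hypothetical, chosen only for illustration. The sketch uses the usual definition of the L1 penalty, which sums the absolute values of the weights so that negative weights do not cancel positive ones.

```python
def l1_penalty(weights):
    """L1 (lasso-style) penalty: sum of the absolute values of the weights."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """L2 (ridge-style) penalty: sum of the squared weights."""
    return sum(w * w for w in weights)


# Hypothetical weight vector, used only for illustration.
weights = [0.5, -2.0, 0.0, 3.0]

l1 = l1_penalty(weights)  # 0.5 + 2.0 + 0.0 + 3.0
l2 = l2_penalty(weights)  # 0.25 + 4.0 + 0.0 + 9.0
print(l1, l2)
```

In practice the penalty is multiplied by a regularization strength and added to the training loss. The L1 form tends to push some weights to exactly zero, which is why it is associated with sparse solutions.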

Question # 13

Which statements correctly apply to unsupervised learning?

A.

It does not have a target variable

B.

Instead of telling the machine "Predict Y for our data X", we're asking "What can you tell me about X?"

C.

We are telling the machine "Predict Y for our data X"

Question # 14

Suppose a man told you he had a nice conversation with someone on the train. Not knowing anything about this conversation, the probability that he was speaking to a woman is 50% (assuming the train had an equal number of men and women and the speaker was as likely to strike up a conversation with a man as with a woman). Now suppose he also told you that his conversational partner had long hair. It is now more likely he was speaking to a woman, since women are more likely to have long hair than men. ____________ can be used to calculate the probability that the person was a woman.

A.

SVM

B.

MLE

C.

Bayes' theorem

D.

Logistic Regression
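
The update described in the question can be sketched numerically. The question gives only the 50% prior. The two likelihoods below (75% of women and 15% of men having long hair) are assumptions made up for this illustration, and the function name is hypothetical.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(H | E) from P(H), P(E | H), and P(E | not H) using Bayes' theorem."""
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence


p_woman = 0.5               # prior stated in the question
p_long_given_woman = 0.75   # assumed value, not from the question
p_long_given_man = 0.15     # assumed value, not from the question

p_woman_given_long = posterior(p_woman, p_long_given_woman, p_long_given_man)
print(round(p_woman_given_long, 3))  # 0.833
```

Under these assumed likelihoods, the long-hair evidence raises the probability from 0.5 to about 0.83.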

Question # 15

Which of the following are point estimation methods?

A.

MAP

B.

MLE

C.

MMSE

Question # 16

RMSE is a useful metric for evaluating which types of models?

A.

Logistic regression

B.

Naive Bayes classifier

C.

Linear regression

D.

All of the above

Question # 17

Google AdWords studies the number of men and women who click an advertisement on its search engine during one hour around midnight each day.

Google finds that the number of men who click can be modeled as a random variable with distribution Poisson(X), and likewise the number of women who click as Poisson(Y).

What is likely to be the best model of the total number of advertisement clicks during that hour?

A.

Binomial(X+Y,X+Y)

B.

Poisson(X/Y)

C.

Normal(X+Y, (X+Y)^(1/2))

D.

Poisson(X+Y)
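
The sum of two Poisson counts can be checked numerically. The sketch below assumes the two counts are independent, which the question does not state explicitly, and uses hypothetical rates of 3 and 5 clicks per hour. It convolves the two probability mass functions and compares the result with a single Poisson distribution whose rate is the sum.

```python
import math


def poisson_pmf(k, rate):
    """P(N = k) for a Poisson random variable with the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def sum_pmf(k, rate_x, rate_y):
    """P(men + women = k) for independent Poisson counts, by direct convolution."""
    return sum(
        poisson_pmf(i, rate_x) * poisson_pmf(k - i, rate_y)
        for i in range(k + 1)
    )


# Hypothetical rates, used only for illustration.
rate_x, rate_y = 3.0, 5.0

# Largest difference between the convolution and Poisson(rate_x + rate_y).
max_gap = max(
    abs(sum_pmf(k, rate_x, rate_y) - poisson_pmf(k, rate_x + rate_y))
    for k in range(20)
)
print(max_gap)
```

The gap is at the level of floating-point rounding, consistent with the total being Poisson with rate X + Y when the two counts are independent.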

Question # 18

A website is opened 3 times by a user. The probability that the user clicks the advertisement exactly 2 times is best calculated using which distribution?

A.

Binomial

B.

Poisson

C.

Normal

D.

Any of the above
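
A fixed number of independent trials with two outcomes each is the setting of the binomial distribution. As a rough sketch, assume a hypothetical click probability of 0.1 per visit (the question does not give one) and that the three visits are independent:

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


p_click = 0.1  # hypothetical per-visit click probability, not from the question

# Probability of exactly 2 clicks in 3 visits: C(3, 2) * 0.1^2 * 0.9
prob_two_clicks = binomial_pmf(2, 3, p_click)
print(prob_two_clicks)
```

With these assumed numbers the result is approximately 0.027.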

Question # 19

Consider flipping a coin for which the probability of heads is p, where p is unknown, and our goal is to estimate p. The obvious approach is to count how many times the coin came up heads and divide by the total number of coin flips. If we flip the coin 1000 times and it comes up heads 367 times, it is very reasonable to estimate p as approximately 0.367. However, suppose we flip the coin only twice and we get heads both times. Is it reasonable to estimate p as 1.0? Intuitively, given that we only flipped the coin twice, it seems a bit rash to conclude that the coin will always come up heads, and ____________ is a way of avoiding such rash conclusions.

A.

Naive Bayes

B.

Laplace Smoothing

C.

Logistic Regression

D.

Linear Regression
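
The coin example can be worked through with add-one (Laplace) smoothing, which adds one pseudo-count to each of the two outcomes before dividing. This is a minimal sketch with hypothetical function names, not a reference implementation:

```python
def mle_estimate(heads, flips):
    """Plain relative frequency of heads."""
    return heads / flips


def laplace_estimate(heads, flips, pseudo_count=1):
    """Smoothed estimate: add pseudo_count to both outcomes (heads and tails)."""
    return (heads + pseudo_count) / (flips + 2 * pseudo_count)


# Two flips, both heads: the raw estimate is 1.0, the smoothed one is 0.75.
print(mle_estimate(2, 2), laplace_estimate(2, 2))

# 1000 flips, 367 heads: smoothing barely changes the estimate.
print(mle_estimate(367, 1000), laplace_estimate(367, 1000))
```

With little data the pseudo-counts pull the estimate away from the extreme value, while with 1000 flips their effect is negligible.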

Question # 20

Suppose you have the following two cases for movie ratings:

1. You recommend a four-star movie to a user, but he really doesn't like it and would rate it two stars.

2. You recommend a three-star movie, but the user loves it and would rate it five stars.

Which statement correctly applies?

A.

In both cases, the contribution to the RMSE is the same

B.

The contribution to the RMSE is different in the two cases

C.

In both cases, the contribution to the RMSE could vary

D.

None of the above
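
The two cases can be checked directly. RMSE squares each error before averaging, so the sign of the error does not matter. The sketch below treats the recommended star value as the prediction and the rating the user would give as the actual value, which is an interpretation of the question's wording.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


case_1 = (4, 2)  # recommended at four stars, user would rate two
case_2 = (3, 5)  # recommended at three stars, user would rate five

# Squared-error contribution of each case.
contribution_1 = (case_1[0] - case_1[1]) ** 2  # (+2)^2 = 4
contribution_2 = (case_2[0] - case_2[1]) ** 2  # (-2)^2 = 4
print(contribution_1, contribution_2)

print(rmse([4, 3], [2, 5]))  # 2.0
```

An overestimate of two stars and an underestimate of two stars contribute the same squared error.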
