MASTERING THE DATA SCIENCE BASICS _2 - Home Teachers India

Thursday, 1 December 2022

MASTERING THE DATA SCIENCE BASICS _2

 

Probability

Data scientists and machine learning engineers rely on probability theory for the statistical analysis of their data. Because probability is often strikingly unintuitive, organizations also use probability questions as a proxy for assessing a candidate's analytical thinking.

Probability theory applies to many situations, including coin flips, drawing random numbers, and determining the likelihood that a patient tests positive for a disease. If you're a data scientist, understanding probability might mean the difference between landing your ideal job and going back to square one.

Interview Questions on Probability Concepts

These probability questions are meant to test your understanding of probability theory on a conceptual level. You might be tested on the different forms of distributions, the Central Limit Theorem, or the application of Bayes' Theorem. Answering them requires a solid understanding of probability theory and the ability to explain it to a layperson.

1.   How do you distinguish between the Bernoulli and binomial distributions?

The Bernoulli distribution models a single trial of an experiment with just two possible outcomes, whereas the binomial distribution models the number of successes in n independent trials of that experiment.
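The relationship is easy to check numerically. The sketch below is only an illustration: the function names and the success probability 0.3 are assumptions chosen for the example, not part of the question.

```python
from math import comb

def bernoulli_pmf(k, p):
    """P(X = k) for a single trial; k is 1 (success) or 0 (failure)."""
    return p if k == 1 else 1 - p

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p = 0.3  # illustrative success probability (an assumption)

# With n = 1 the binomial distribution reduces to the Bernoulli distribution.
print(binomial_pmf(1, 1, p), bernoulli_pmf(1, p))

# Probability of exactly 2 successes in 5 trials.
print(binomial_pmf(2, 5, p))
```

A binomial variable can equally be viewed as the sum of n independent Bernoulli variables with the same p.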


2.    Describe how a probability distribution might be non-normal and provide an example.

A probability distribution is non-normal if its observations do not cluster symmetrically around the mean in the shape of a bell curve. A uniform probability distribution is one example: all values within a particular range are equally likely to occur.

4.   How can you tell the difference between correlation and covariance? Give a specific example.

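Covariance can take any numeric value and depends on the units of the variables, whereas correlation is covariance scaled by the two standard deviations and always lies between -1 (perfect negative correlation) and 1 (perfect positive correlation). As a result, two variables can have a large covariance yet only a weak correlation. For example, measuring height in centimeters instead of meters multiplies its covariance with weight by 100 but leaves the correlation unchanged.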
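A small sketch of the unit dependence, using made-up height and weight figures (the data and the helper names are assumptions for illustration only):

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance (divisor n - 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical measurements, invented for this example.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [52, 58, 69, 74, 88]
heights_cm = [h * 100 for h in heights_m]

# Changing the unit rescales the covariance but not the correlation.
print(covariance(heights_m, weights_kg), covariance(heights_cm, weights_kg))
print(correlation(heights_m, weights_kg), correlation(heights_cm, weights_kg))
```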

5.   How are the Central Limit Theorem and the Law of Large Numbers different?

The Law of Large Numbers states that the sample mean converges to the population mean as the sample size grows, so the error of that estimate shrinks. In contrast, the Central Limit Theorem states that as the sample size n grows large, the distribution of the sample mean can be approximated by a normal distribution, whatever the shape of the underlying distribution (provided its variance is finite).
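Both statements can be seen in a quick simulation. This is a rough sketch using rolls of a fair die; the seed, the sample sizes, and the number of repetitions are arbitrary choices made for the illustration.

```python
import random
from math import sqrt

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n fair die rolls (the population mean is 3.5)."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

# Law of Large Numbers: the error of a single sample mean tends to shrink as n grows.
for n in (10, 1_000, 100_000):
    print(n, abs(sample_mean(n) - 3.5))

# Central Limit Theorem: sample means of size 30 are roughly normally distributed,
# so roughly 68% of them should fall within one standard error of 3.5.
n, reps = 30, 2_000
std_error = sqrt(35 / 12) / sqrt(n)  # the variance of one die roll is 35/12
means = [sample_mean(n) for _ in range(reps)]
within = sum(abs(m - 3.5) <= std_error for m in means) / reps
print(within)
```

The first loop is about one estimate getting closer to the truth; the second part is about the shape of the distribution of many such estimates.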

6.   What is the definition of an unbiased estimator? Give a layperson an example.

An unbiased estimator is a statistic whose expected value equals the population parameter it estimates: it is right on average, with no systematic over- or underestimation. An example is using a random sample of 1000 voters in a political poll to estimate the preference of the overall voting population. In practice, though, no poll is entirely free of bias.
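"Right on average" can be checked exactly on a toy case. The sketch below assumes a made-up population of four values and lists every possible sample of size 2 drawn with replacement; it then averages each estimator over all samples.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]  # toy population (an assumption for illustration)
pop_mean = Fraction(sum(population), len(population))
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

n = 2
samples = list(product(population, repeat=n))  # all 16 equally likely samples

def mean(sample):
    return Fraction(sum(sample), len(sample))

def variance(sample, divisor):
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / divisor

# Average each estimator over every possible sample.
avg_mean = sum(mean(s) for s in samples) / len(samples)
avg_var_biased = sum(variance(s, n) for s in samples) / len(samples)
avg_var_unbiased = sum(variance(s, n - 1) for s in samples) / len(samples)

print(pop_mean, avg_mean)          # 5/2 5/2
print(pop_var, avg_var_unbiased)   # 5/4 5/4
print(avg_var_biased)              # 5/8
```

The sample mean and the variance with divisor n - 1 match the population values on average, while the variance with divisor n systematically underestimates.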

7.   Assume that the chance of finding a particular object X at location A is 0.6 and that finding it at location B is 0.8. What is the likelihood of finding item X in places A or B?

Let us begin by defining our probabilities:

P(Item at location A) = P(A) = 0.6

P(Item at location B) = P(B) = 0.8

We want the likelihood that item X is found at location A or at location B. Because the two events are not mutually exclusive, we cannot simply add the probabilities; we must subtract the probability that both events occur:

P(A or B) = P(A ∪ B) = P(A) + P(B) - P(A and B)
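To get a number out of P(A or B), the overlap P(A and B) is needed, and the question does not give it. A minimal sketch, assuming the two events are independent (an assumption that is not stated in the question):

```python
def p_union(p_a, p_b, p_both):
    """Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

p_a, p_b = 0.6, 0.8

# Assumption: A and B are independent, so P(A and B) = P(A) * P(B).
p_both = p_a * p_b
p_either = p_union(p_a, p_b, p_both)
print(round(p_either, 2))  # 0.92
```

Without the independence assumption, the overlap can only be bounded between 0.4 and 0.6, which puts P(A or B) somewhere between 0.8 and 1.0.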

8.   Assume you have a deck of 500 cards numbered from 1 to 500. If the cards are shuffled randomly and you draw three cards one at a time, what is the likelihood that each card is larger than the one drawn before it?

Treat this as a sample space problem and ignore the other specifics. If someone draws three distinct cards at random without replacement, there will always be a low, a medium, and a high card.

To make things easier, let's pretend we drew the numbers 1, 2, and 3. The winning scenario is drawing (1, 2, 3) in exactly that order. But what is the complete set of possible outcomes? Three cards can be arranged in 3! = 6 equally likely orders, and only one of them is increasing, so the probability is 1/6.
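The outcome space is small enough to list in full. A quick sketch that enumerates every order in which the low, medium, and high card (labelled 1, 2, 3 as above) can be drawn:

```python
from fractions import Fraction
from itertools import permutations

# Every order in which three distinct cards can be drawn.
orders = list(permutations([1, 2, 3]))

# Keep only the orders in which each card beats the previous one.
increasing = [o for o in orders if o[0] < o[1] < o[2]]

prob_increasing = Fraction(len(increasing), len(orders))
print(len(orders), increasing, prob_increasing)  # 6 [(1, 2, 3)] 1/6
```

The size of the deck (500) does not matter; only the relative order of the three drawn cards does.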

9.   Assume you have a function that returns a random number between a minimum value N and a maximum value M. You then take the output of that function and use it as the maximum value of another random number generator with the same minimum value N. How would the resulting sample distribution be spread? What would the expected value of the second function be?

Let X be the outcome of the first run and Y the result of the second. Because the integer output is "random" and no other information is provided, we may assume that every integer between N and M has an equal chance of being chosen. As a result, X and Y are discrete uniform random variables with limits N and M, and N and X, respectively.
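From this setup the expected value follows by averaging over X: given X = x, Y is uniform on N..x with mean (N + x)/2, so E[Y] = (N + E[X])/2 = (3N + M)/4. The sketch below checks this exactly, assuming integer outputs with both bounds inclusive; the function names and the example values N = 1, M = 9 are illustrative assumptions.

```python
from fractions import Fraction

def expected_second_draw(n_min, m_max):
    """E[Y], where X is uniform on n_min..m_max and Y is uniform on n_min..X."""
    xs = range(n_min, m_max + 1)
    # E[Y | X = x] is the midpoint (n_min + x) / 2; average it over all x.
    return sum(Fraction(n_min + x, 2) for x in xs) / len(xs)

def second_draw_pmf(n_min, m_max):
    """P(Y = y) for each y: sum over x >= y of P(X = x) * P(Y = y | X = x)."""
    count = m_max - n_min + 1
    return {
        y: sum(Fraction(1, x - n_min + 1) for x in range(y, m_max + 1)) / count
        for y in range(n_min, m_max + 1)
    }

print(expected_second_draw(1, 9))  # 3, which equals (3*1 + 9) / 4
pmf = second_draw_pmf(1, 9)
print(pmf[1], pmf[9])  # small values of Y are far more likely than large ones
```

The distribution of Y is therefore not uniform: it is skewed toward the minimum N, because small values of Y are reachable from every value of X.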

10.   Three zebras stand at the corners of an equilateral triangle, one at each corner. Each zebra chooses a direction at random and sprints along the triangle's outline toward one of the two other corners. What is the chance that none of the zebras collide?

Each zebra sprints along the outline toward one of the two neighboring corners, so it has two possible directions. Let's compute the chance that they don't collide, given that each choice is random. There are only two ways to avoid a collision: all the zebras run clockwise, or all of them run counter-clockwise.

Let's see what the probability is for each one. The likelihood that all three zebras choose clockwise is the product of the individual probabilities. With two equally likely options (clockwise or counter-clockwise), that is 1/2 * 1/2 * 1/2 = 1/8.

The chance that all three zebras run counter-clockwise is the same 1/8. Adding the two probabilities gives 1/8 + 1/8 = 1/4, or 25%.
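The same result can be confirmed by brute force. A small sketch that lists all eight combinations of directions and keeps the ones where every zebra runs the same way (the labels "cw" and "ccw" are just names chosen for the example):

```python
from fractions import Fraction
from itertools import product

# Each of the three zebras independently picks one of two directions.
outcomes = list(product(["cw", "ccw"], repeat=3))

# No collision happens only when all three run in the same direction.
no_collision = [o for o in outcomes if len(set(o)) == 1]

prob_no_collision = Fraction(len(no_collision), len(outcomes))
print(len(outcomes), no_collision, prob_no_collision)
# 8 [('cw', 'cw', 'cw'), ('ccw', 'ccw', 'ccw')] 1/4
```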

11.  You call three random friends in Seattle and ask each of them independently whether it's raining. Each friend has a two-thirds probability of telling you the truth and a one-third probability of lying. All three say "Yes, it is raining." What are the chances that it's actually raining in Seattle right now?

Under the frequentist approach, imagine repeating the experiment: each friend's answer has three equally likely cases (two truthful, one a lie), which gives 3 * 3 * 3 = 27 equally likely combinations. In exactly one of those 27 combinations, all three friends lie.

However, because your friends all gave the same answer, you are not interested in all 27 combinations, most of which involve answers that differ. Only the 1 combination in which all three lie and the 8 combinations in which all three tell the truth remain relevant, so your friends are telling the truth in 8 of the 9 relevant cases.
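The counting argument implicitly treats rain and no rain as equally likely beforehand. A Bayesian version makes that prior explicit. The sketch below is one way to set it up; the function name and the prior values are assumptions for illustration, since the question does not state how often it rains in Seattle.

```python
from fractions import Fraction

def p_rain_given_all_yes(prior, p_truth=Fraction(2, 3), friends=3):
    """Posterior P(rain | every friend says yes) via Bayes' theorem."""
    p_lie = 1 - p_truth
    yes_if_rain = p_truth ** friends  # all friends tell the truth
    yes_if_dry = p_lie ** friends     # all friends lie
    return prior * yes_if_rain / (prior * yes_if_rain + (1 - prior) * yes_if_dry)

# Assumed 50/50 prior: reproduces the counting argument.
print(p_rain_given_all_yes(Fraction(1, 2)))  # 8/9

# Assumed prior of 1/4: the answer moves with the prior.
print(p_rain_given_all_yes(Fraction(1, 4)))  # 8/11
```

In an interview it is worth stating which prior you are assuming, since the final number depends on it.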

12.   You flip a fair coin 576 times. Without using a calculator, calculate the chance of flipping at least 312 heads.

This question relies on recalling a few facts. Because we are counting the number of heads in a fixed number of trials, we can recognize it as a binomial distribution problem: n trials, each with probability of success p. The expected number of heads for a binomial distribution is the probability of success (0.5 for a fair coin) multiplied by the total number of trials (576). As a result, we expect our coin flips to come up heads 288 times.
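The usual next step is the standard deviation, sqrt(n * p * (1 - p)) = sqrt(144) = 12, which puts 312 exactly two standard deviations above the mean of 288. By the 68-95-99.7 rule, about 5% of a normal distribution lies more than two standard deviations from the mean, so roughly 2.5% lies above. The sketch below compares that mental estimate with the exact binomial tail; it is a check on the shortcut, not something you would do in the interview itself.

```python
from math import comb, erf, sqrt

n, p, threshold = 576, 0.5, 312

mean = n * p                   # 288 expected heads
std = sqrt(n * p * (1 - p))    # sqrt(144) = 12
z = (threshold - mean) / std   # 2 standard deviations above the mean

# Exact binomial tail P(X >= 312) for a fair coin.
exact = sum(comb(n, k) for k in range(threshold, n + 1)) / 2**n

# Normal approximation: upper tail beyond z standard deviations.
normal_tail = 0.5 * (1 - erf(z / sqrt(2)))

print(mean, std, z)
print(exact, normal_tail)
```

Both numbers land close to the 2.5% rule-of-thumb answer.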

13.   You are handed a fair coin and toss it until the sequence Heads Heads Tails (HHT) or Heads Tails Tails (HTT) appears. Is one more likely to appear first? If so, which one, and how likely is it?

Both sequences need an H to come first. Once that H occurs, the next flip is another H with probability 1/2, and from that point HHT is certain to appear first: all it needs is a single T, and any further H keeps the HH prefix intact. If the next flip is a T instead, HTT needs one more T, but an H at that point does not reset the game; it returns us to the state of having just flipped an H. Because we keep flipping in series until HHT or HTT appears, that leading H makes HHT more likely to appear first than HTT.
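To put a number on it, let p be the chance that HHT wins once the first H has appeared. The next flip is H with probability 1/2 (HHT then wins for sure). Otherwise we are at HT, where a T ends the game in favor of HTT and an H puts us back in the "just flipped H" state, so p = 1/2 + (1/2)(1/2)p, which gives p = 2/3. The sketch below solves that equation and checks it with a seeded simulation; the seed and the number of trials are arbitrary choices.

```python
import random
from fractions import Fraction

# Solve p = 1/2 + (1/4) * p for p.
p_hht_first = Fraction(1, 2) / (1 - Fraction(1, 4))
print(p_hht_first)  # 2/3

def first_pattern(rng):
    """Flip a fair coin until HHT or HTT appears; return the winner."""
    history = ""
    while True:
        history += rng.choice("HT")
        if history.endswith("HHT"):
            return "HHT"
        if history.endswith("HTT"):
            return "HTT"

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 20_000
wins = sum(first_pattern(rng) == "HHT" for _ in range(trials))
estimate = wins / trials
print(estimate)  # should be close to 2/3
```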
