(For Problems 1 and 2, no other package except numpy and matplotlib should be used for the programming questions. For problem 3 you can use the packages of your choice.)


**Problem 1.**

- In this problem we will analyze logistic regression as learned in class. The sigmoid function can be written as

  $$S(z) = \frac{1}{1 + e^{-z}}$$

  - For a given variable $X$, assume $P(Y = +1 \mid X)$ is modeled as $P(Y = +1 \mid X) = S(\beta_0 + \beta_1 X)$. Plot a 3D figure showing the relation between this output and the variables $\beta_0$ and $\beta_1$ when $X = 1$. Take values in $[-2, 2]$ for both $\beta_0$ and $\beta_1$ with a step size of 0.1.
- In class, we did binary classification with labels $Y \in \{0, 1\}$. In this problem we use the labels $Y \in \{-1, 1\}$, which makes the likelihood of $P(Y \mid X)$ easier to derive.
  - Show that if $Y \in \{-1, 1\}$, the probability of $Y$ given $X$ can be written as

    $$P(Y \mid X) = \frac{1}{1 + \exp\bigl(-Y(\beta_0 + \beta_1 X)\bigr)}$$

  - We have learned that the coefficients $\beta_0$ and $\beta_1$ can be found using maximum likelihood estimation (MLE). Show that the log-likelihood function for $m$ data points $(x_i, y_i)$ can be written as

    $$\ell(\beta_0, \beta_1) = -\sum_{i=1}^{m} \log\Bigl(1 + \exp\bigl(-y_i(\beta_0 + \beta_1 x_i)\bigr)\Bigr)$$

  - Plot a 3D figure showing the relation between the log-likelihood function and the variables $\beta_0$, $\beta_1$ when $X = 1, Y = -1$ and $X = 1, Y = 1$. Take values in $[-2, 2]$ for both $\beta_0$ and $\beta_1$ with a step size of 0.1.
  - Based on the graph, is it possible to maximize this function?
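The two surfaces requested in Problem 1 can be evaluated on a grid with numpy alone. The following is a minimal sketch, not a reference solution. It assumes the standard sigmoid $S(z) = 1/(1 + e^{-z})$ and the log-likelihood $\sum_i \log S(y_i(\beta_0 + \beta_1 x_i))$ for labels in $\{-1, +1\}$, and it treats the two points $(X = 1, Y = -1)$ and $(X = 1, Y = 1)$ as one two-point dataset, which is only one reading of the problem statement. The names `sigmoid`, `log_likelihood`, and `plot_surface` are illustrative.

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid S(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def log_likelihood(b0, b1, xs, ys):
    """Sum over data points of log S(y_i * (b0 + b1 * x_i)), labels in {-1, +1}."""
    total = np.zeros_like(b0, dtype=float)
    for x, y in zip(xs, ys):
        # log S(t) = -log(1 + exp(-t)); logaddexp(0, -t) evaluates
        # log(1 + exp(-t)) without overflow.
        total -= np.logaddexp(0.0, -y * (b0 + b1 * x))
    return total


# Grid over [-2, 2] with step 0.1, i.e. 41 points per axis.
betas = np.linspace(-2.0, 2.0, 41)
B0, B1 = np.meshgrid(betas, betas)

# Model output P(Y = +1 | X = 1) on the grid.
P = sigmoid(B0 + B1 * 1.0)

# Log-likelihood on the grid for the assumed two-point dataset.
LL = log_likelihood(B0, B1, xs=[1.0, 1.0], ys=[-1.0, 1.0])


def plot_surface(B0, B1, Z, zlabel):
    """Draw a 3D surface of Z over the (beta_0, beta_1) grid and return the figure."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot_surface(B0, B1, Z, cmap="viridis")
    ax.set_xlabel("beta_0")
    ax.set_ylabel("beta_1")
    ax.set_zlabel(zlabel)
    return fig
```

Calling `plot_surface(B0, B1, P, "P(Y=+1 | X=1)")` or `plot_surface(B0, B1, LL, "log-likelihood")` followed by `matplotlib.pyplot.show()` displays the corresponding figure. To plot a single data point instead, pass one-element lists such as `xs=[1.0], ys=[1.0]` to `log_likelihood`.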

Hints for the plots to be produced using Python in Problem 1, part (b):

**Problem 2.**

**Hint for the derived classification rule:**

**Hints for the three plots to be produced using Python:**

**Problem 3.** Recall that in classification we assume that each data point is an i.i.d. sample from an (unknown) distribution $P(X = x, Y = y)$. In this question, we design the data distribution $P$ and evaluate the performance of logistic regression on data generated using $P$. Keep in mind that we would like to make $P$ as simple as possible. In the following, we assume $x \in \mathbb{R}$ and $y \in \{0, 1\}$, i.e. the data is one-dimensional and the label is binary. Write $P(X = x, Y = y) = P(X = x)\,P(Y = y \mid X = x)$. We will generate $X = x$ according to the uniform distribution on the interval $[0, 1]$ (thus $P(X = x)$ is just the pdf of the uniform distribution).

1. Design $P(Y = y \mid X = x)$ such that (i) $P(y = 0) = P(y = 1) = 0.5$; (ii) the classification accuracy of any classifier is at most 0.9; and (iii) the accuracy of the Bayes optimal classifier is at least 0.8.
2. Using Python, generate $n = 100$ training data points according to the distribution you designed above and train a binary classifier using logistic regression on the training data.
3. Generate $n = 100$ test data points according to the distribution you designed in part 1 and compute the prediction accuracy (on the test data) of the classifier that you trained in part 2. Also, compute the accuracy of the Bayes optimal classifier on the test data. Why do you think the Bayes optimal classifier is performing better?
4. Redo parts 2 and 3 with $n = 1000$. Are the results any different from those in part 3? Why?
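The workflow in Problem 3 can be sketched as follows. This is one possible design, not the intended answer: it assumes $P(Y = 1 \mid X = x) = 0.85$ for $x > 0.5$ and $0.15$ otherwise, which gives $P(y = 1) = 0.5$ and a Bayes optimal accuracy of 0.85 (between 0.8 and 0.9). The problem permits any package here, but the sketch stays with numpy and fits logistic regression by plain gradient descent. All function names, the learning rate, and the step count are illustrative choices.

```python
import numpy as np


def p_y1_given_x(x):
    """Assumed design: P(Y=1 | X=x) is 0.85 on (0.5, 1] and 0.15 on [0, 0.5]."""
    return np.where(x > 0.5, 0.85, 0.15)


def sample(n, rng):
    """Draw n points with X ~ Uniform[0, 1] and Y ~ Bernoulli(P(Y=1 | X))."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = (rng.uniform(size=n) < p_y1_given_x(x)).astype(int)
    return x, y


def bayes_predict(x):
    """Bayes optimal rule: predict the label with the larger conditional probability."""
    return (p_y1_given_x(x) > 0.5).astype(int)


def fit_logistic(x, y, lr=0.5, steps=5000):
    """Fit P(Y=1 | x) = S(b0 + b1 * x) by gradient descent on the mean log-loss."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
        # Gradient of the mean negative log-likelihood for labels in {0, 1}.
        b0 -= lr * np.mean(p - y)
        b1 -= lr * np.mean((p - y) * x)
    return b0, b1


def predict(b0, b1, x):
    """Predict 1 where the fitted probability exceeds 0.5, i.e. b0 + b1 * x > 0."""
    return (b0 + b1 * x > 0).astype(int)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(y_true == y_pred))


rng = np.random.default_rng(0)
for n in (100, 1000):
    x_train, y_train = sample(n, rng)
    x_test, y_test = sample(n, rng)
    b0, b1 = fit_logistic(x_train, y_train)
    acc_lr = accuracy(y_test, predict(b0, b1, x_test))
    acc_bayes = accuracy(y_test, bayes_predict(x_test))
    print(f"n={n}: logistic regression {acc_lr:.3f}, Bayes optimal {acc_bayes:.3f}")
```

The printed accuracies depend on the random seed, so they are not quoted here. On a finite test set, either classifier can land above or below 0.85 by chance.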
