Markov Modeling in Decision Trees

 DAT 520 Problem Set 6
Markov Modeling in Decision Trees

Module Six introduces the Markov model. A Markov model has a defined set of states and, for each state, the probability of moving to each possible next state. Think of the states as the nodes of the decision tree and the transition probabilities as the chances of moving from one node to the next. A Markov chain is simply another way to define the probabilities needed for moving from state to state in the decision tree.

This Module Six assignment will take you through the process of using real data to generate the values that feed a Markov chain. We will also build a transition matrix in this assignment. The steps are simple and explicit so that you can focus on learning how to build a Markov model. Make sure to take the time to understand the steps you are executing.

In the Module Five ProblemSet 5 Tree, we calculated X% by filtering the Mopps with Commas dataset to find a numerator and a denominator. In this assignment, we will use a Markov model to recalculate a more accurate X% for the same dataset. To achieve this, we will explore this question: “How do you measure success?”

The steps needed to complete this assignment are as follows:

  1. From Blackboard, download the following files to your Documents folder. They are all in the DAT 520 Data Files folder in Course Information.
    1. Mopps with Commas: This is the data set for the problem.
    2. ProblemSet 6 Tree: This is the tree whose values we will fill in during the exercise.
    3. Module6_Problem3: This is the R code needed for problem 3 below.
  2. Open Mopps with Commas in Excel.
  3. Determine d and f using the dataset.

Note: A company counts as a success if, on at least two occasions, its profit rose over the previous year while the market index was also rising. These occurrences are counted in the tot_success variable. A company counts as a non-success if, on two or more occasions, its year-to-year profit fell while the market index was also falling. These occurrences are counted in the total_nsuccess variable.

  a. How many businesses have tot_success of at least two?
    • What percentage is that of the entire data set? This is the d in the Markov model.
  b. How many businesses have two or more occurrences of non-success (total_nsuccess)?
    • What percentage is that of the entire data set? This is the f in the Markov model.
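The two percentages can also be computed in R rather than by filtering in Excel. The sketch below is only an illustration: the data frame `toy` and its values are made up, and the column names `tot_success` and `tot_nsuccess` are assumed to match the ones in your copy of the data set.

```r
# Toy stand-in for the real data set; these eight rows are invented for illustration.
toy <- data.frame(
  tot_success  = c(0, 2, 3, 1, 2, 0, 4, 1),
  tot_nsuccess = c(3, 0, 1, 2, 0, 2, 0, 1)
)

# Share of businesses with at least two occurrences of each outcome.
d_toy <- sum(toy$tot_success >= 2) / nrow(toy)
f_toy <- sum(toy$tot_nsuccess >= 2) / nrow(toy)

print(d_toy)
print(f_toy)
```

With the real data, replace `toy` with the data frame you imported; the resulting shares play the roles of d and f.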

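You now have enough information to construct a Markov model. You have the transition matrix:

    | d      1-d |
    | 1-f    f   |

The letters d and f here refer to the values from questions a and b above. The first row holds the probabilities of moving from Success to Success (d) and from Success to non-success (1-d); the second row holds the probabilities of moving from non-success to Success (1-f) and from non-success to non-success (f).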
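As a minimal sketch, the transition matrix can be built in R in the same d, 1-d, 1-f, f layout that the Module6_Problem3 hint describes. The values `d <- 0.6` and `f <- 0.7` are placeholders chosen for illustration only, and the row and column labels are an added convenience, not part of the course file.

```r
# Placeholder values for illustration; substitute your own d and f.
d <- 0.6
f <- 0.7

# Rows are the current state, columns are the next state.
P <- matrix(
  c(d, 1 - d,
    1 - f, f),
  nrow = 2, byrow = TRUE,
  dimnames = list(from = c("Success", "NonSuccess"),
                  to   = c("Success", "NonSuccess"))
)

# Each row of a transition matrix must sum to 1.
stopifnot(all(abs(rowSums(P) - 1) < 1e-12))

print(P)
```

The row-sum check is a quick way to catch a mistyped entry before the matrix is used in later problems.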

Now complete the following problems. Provide an answer for each problem and submit your answers in a Word document:

  1. Assuming previous success [1 0], what is the probability of having a successful year?

Hint: Use R and enter the matrix using the d and f values from the assignment questions a and b, found above. Refer to Module One homework for assistance with R matrix math.
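The one-year calculation is a single matrix product of the starting state and the transition matrix. The sketch below uses placeholder values for d and f (0.6 and 0.7, chosen only for illustration) and the hypothetical names `start`, `P` and `year1`.

```r
# Placeholder values for illustration; substitute your own d and f.
d <- 0.6
f <- 0.7

# Starting state: previous year was a success.
start <- matrix(c(1, 0), nrow = 1)

# Transition matrix in the d, 1-d, 1-f, f layout.
P <- matrix(c(d, 1 - d, 1 - f, f), nrow = 2, byrow = TRUE)

# State probabilities after one year: column 1 is Success, column 2 is 1-Success.
year1 <- start %*% P
print(year1)
```

Because the starting state is [1 0], the product simply picks out the first row of the matrix.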

  2. What is the probability of having two successful years?

Hint: Use R and enter the matrix using the d and f values from the assignment questions a and b, above. Refer to Module One homework for assistance with R matrix math. 
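One way to approach the two-year question is to read it as the probability of being in the Success state after two transitions, which means applying the transition matrix twice. The sketch below assumes that reading and uses placeholder values for d and f (0.6 and 0.7, for illustration only); it uses base R multiplication, so no extra package is needed.

```r
# Placeholder values for illustration; substitute your own d and f.
d <- 0.6
f <- 0.7

start <- matrix(c(1, 0), nrow = 1)
P <- matrix(c(d, 1 - d, 1 - f, f), nrow = 2, byrow = TRUE)

# Two transitions: multiply by P twice (the same as using the matrix squared).
year2 <- start %*% P %*% P
print(year2)
```

The first entry of `year2` combines both paths into Success: success followed by success, and non-success followed by success.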

  3. Using the strategy outlined in the document “How to Iterate Markov Processes in R,” show the table of values for 10 years of success:

Hint: Use R and enter the matrix using the d and f values from the assignment questions a and b, above. Refer to Module One homework for assistance with R matrix math. Use the Module6_Problem3 file to execute in R. Note: You will have to replace the d, 1-d, 1-f, f in mat2 with the values you determined in questions a and b. Copy the printed matrix and paste to Word as your answer.
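The 10-year table can also be sketched without the expm package by carrying the state forward one year at a time. This is an alternative illustration, not the contents of Module6_Problem3; the values of d and f are placeholders, and the names `state` and `results` are hypothetical.

```r
# Placeholder values for illustration; substitute your own d and f.
d <- 0.6
f <- 0.7

P <- matrix(c(d, 1 - d, 1 - f, f), nrow = 2, byrow = TRUE)

# Start from previous success and step forward one year per iteration.
state <- matrix(c(1, 0), nrow = 1)
n_years <- 10

results <- matrix(
  NA_real_, nrow = n_years, ncol = 2,
  dimnames = list(paste0("Year", seq_len(n_years)), c("Success", "1-Success"))
)

for (k in seq_len(n_years)) {
  state <- state %*% P      # apply one more transition
  results[k, ] <- state     # store the probabilities for year k
}

print(round(results, 4))
```

Row k of `results` equals the starting state multiplied by the k-th power of the transition matrix, which is the quantity that the `%^%` operator from expm computes directly.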

  4. Using the Success and 1-Success matrix output from problem 3, produce a line graph in Excel showing how the probabilities stabilize over time. It will look something like this:

Hint: Copy the values from the printed matrix in Problem 3 and paste them into cells B2:C11 on the Markov model tab of the ProblemSet 6 Tree file. The graph should adjust accordingly.

  5. At the end of 10 years, what are the values of the blue and red lines? These are your new X% and 1-X% in the Tree Model. They represent the long-term probability of success or non-success, given the starting state of a business.

Hint: These are the values found in row 10 of your matrix output. Column 1 is Success and Column 2 is 1-Success. 
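For a two-state chain, the long-term probabilities can be cross-checked against a closed form: the long-run share of Success is (1-f) / ((1-d) + (1-f)). The sketch below compares that value with the 10-year result, using placeholder values for d and f (illustration only) and hypothetical variable names.

```r
# Placeholder values for illustration; substitute your own d and f.
d <- 0.6
f <- 0.7

P <- matrix(c(d, 1 - d, 1 - f, f), nrow = 2, byrow = TRUE)

# Closed-form long-run probabilities for a two-state chain.
pi_success <- (1 - f) / ((1 - d) + (1 - f))
stationary <- c(pi_success, 1 - pi_success)

# State after 10 years, starting from previous success.
state <- matrix(c(1, 0), nrow = 1)
for (k in 1:10) {
  state <- state %*% P
}

print(round(as.numeric(state), 6))   # 10-year probabilities
print(round(stationary, 6))          # long-run probabilities
print(max(abs(as.numeric(state) - stationary)))  # remaining gap
```

If the gap printed on the last line is small, the row 10 values are a reasonable stand-in for the long-term X% and 1-X%.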

  6. Should you employ Dustin to do the research or not? State the new EVs, and explain your decision and what the tree is telling you. Use the new X% and 1-X% in your ProblemSet 6 Tree model in Excel and recalculate the tree.

Hint: These are the values found in row 10 of your matrix output. Column 1 is Success and Column 2 is 1-Success. Enter the Success value in D9 and the 1-Success value in D10.

Solution 

code used.R 

# First, read the Excel file into R as a data frame named Mopps_with_commas.
x <- Mopps_with_commas

# Businesses with at least two success occurrences (needed for d)
y <- x$tot_success[x$tot_success >= 2]

# Businesses with at least two non-success occurrences (needed for f)
z <- x$tot_nsuccess[x$tot_nsuccess >= 2]

d <- length(y) / length(x$tot_success)    # answer to 3a
print(d)

f <- length(z) / length(x$tot_nsuccess)   # answer to 3b
print(f)

# Q1, 2, 3

# Install the matrix exponent package once if needed: install.packages("expm")
library(expm)

# Create matrix 1: the starting state (previous success)
mat1 <- matrix(c(1, 0), 1, byrow = TRUE)

# Create matrix 2 using the d, 1-d, 1-f and f values
mat2 <- matrix(c(d, 1 - d, 1 - f, f), 2, byrow = TRUE)
# mat2 <- matrix(c(.33, .67, .81, .19), 2, byrow = TRUE)

# Create a null vector to hold results
xrez <- NULL

# Iterate through each exponent 1-10 for this problem
# (the loop index is k so that the data frame x is not overwritten)
for (k in 1:10) {
  xrez <- c(xrez, mat1 %*% (mat2 %^% k))
}

# Convert the vector to a 10 x 2 matrix
matrez <- matrix(xrez, 10, byrow = TRUE)

# List matrix results, rounded to 2 decimal places for convenience
print(round(matrez, 2))

Answers

  1. 0.33 is the probability of having a successful year.
  2. 0.651 is the probability of having 2 successful years.
  3. See the Excel file.
  4. See the Excel file.
  5. X% = 54.75912% and 1-X% = 45.24088% are the long-term probabilities in this case.