Homework 5
CSE 630
Jason Kim

Problem 1: Compute Bellman Update Utility

State 1: Vermont
Best: Scarlet
U(1) = -.04 + (1*1) = .96
Result: Kinkakuji, Reward +1, Restart

State 2: Topiary Garden
Best: Black
U(2) = -.04 + (1*1) = .96
Result: Ohio_Stadium, Reward +1, Restart

State 3: Burning Man
Best: Scarlet
U(3) = -.04 + (1*1) = .96
Result: Kinkakuji, Reward +1, Restart

State 2: Topiary Garden
Best: Black
U(2) = -.04 + (1*1) = .96
Result: Ohio_Stadium, Reward +1, Restart

... Converges.

State 1: Utility = .96, Policy = Scarlet
State 2: Utility = .96, Policy = Black
State 3: Utility = .96, Policy = Scarlet

Problem 2:

State 1: Argentina
Best: Black
Black has the lowest chance of going to Prison (-1 reward).
U(1) = -.04 + (1*.4) = .36 (Countryside)
U(1) = -.04 + (1*.3) = .26 (Topiary Garden)
U(1) = -.04 + (1*.3) = .26 (Prison)

State 2: Countryside
Best: Scarlet
Scarlet has the highest chance of reaching OSU Stadium for the reward.
U(2) = -.04 + (1*.6) = .56 (OSU Stadium)
U(2) = -.04 + (1*.3) = .26 (Argentina)
U(2) = -.04 + (1*.1) = .06 (Vermont)

State 3: Topiary Garden (from Argentina)
Best: Gray
Gray has the lowest chance of a Daleks Invasion (-1 reward).
U(3) = -.04 + (1*.3) = .26 (Daleks)
U(3) = -.04 + (1*.4) = .36 (Countryside)
U(3) = -.04 + (1*.3) = .26 (OSU Stadium)

Converges...

State 1: Utility = .36, .26, .26, Policy: Black
State 2: Utility = .56, .26, .06, Policy: Scarlet
State 3: Utility = .26, .36, .26, Policy: Gray
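
The arithmetic above can be checked with a short Python sketch. It is only a sketch: the function name backup and the constants STEP_REWARD and GAMMA are made up here for illustration, with the values -.04 and 1 taken from the calculations above. In Problem 2 the successor utility is set to a placeholder of 1 purely so the output matches the per-outcome terms as written; that value is an assumption, not something given in the problem.

```python
STEP_REWARD = -0.04  # per-move reward used in every calculation above
GAMMA = 1.0          # discount factor; the calculations above multiply by 1


def backup(outcomes, step_reward=STEP_REWARD, gamma=GAMMA):
    """Value of one action: step reward plus discounted expected utility.

    outcomes is a list of (probability, successor_utility) pairs.
    """
    expected = sum(p * u for p, u in outcomes)
    return step_reward + gamma * expected


# Problem 1: each best action reaches the +1 reward with certainty.
problem1 = round(backup([(1.0, 1.0)]), 2)
print(problem1)

# Problem 2, State 1 (Black): per-outcome terms as written above.
# ASSUMPTION: successor utility 1.0 is a placeholder to match that arithmetic.
state1_terms = [round(backup([(p, 1.0)]), 2) for p in (0.4, 0.3, 0.3)]
print(state1_terms)

# Problem 2, State 2 (Scarlet): same pattern with probabilities .6, .3, .1.
state2_terms = [round(backup([(p, 1.0)]), 2) for p in (0.6, 0.3, 0.1)]
print(state2_terms)
```

Passing all of an action's outcomes in one list, each with its real successor utility, would give a single expected value per action instead of one term per outcome.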