抱歉,您的浏览器无法访问本站
本页面需要浏览器支持(启用)JavaScript
了解详情 >

题解:So you want to be a 2n-aire

1.题意

地址:https://vjudge.net/problem/UVA-10900

The player starts with a prize of $1, and is asked a sequence of n questions. For each question, he may • quit and keep his prize. • answer the question. If wrong, he quits with nothing. If correct, the prize is doubled, and he continues with the next question. After the last question, he quits with his prize. The player wants to maximize his expected prize. Once each question is asked, the player is able to assess the probability p that he will be able to answer it. For each question, we assume that p is a random variableniformly distributed over the range t..1.

大致意思是:参加有奖竞赛,初始有1$,之后每答对一题奖金翻倍,对于每个题答对的概率为[t,1],如果答对,奖金翻倍,答错当场退赛且没有奖金,放弃答题则拿走已经得到的奖金退赛。求玩家在最优策略下奖金期望的最大值

2.解法

这个题最大的问题就在于答对的概率p不是一个定值。

我们不妨先看看p一定的情况。

设 d[i] 为已经答对了i道题的情况下,这场比赛在最优策略下的期望奖金,很显然答案为 d[0]

那么转移方程为 $d[i]=max(2^i , p\times d[i+1])$ 也就是放弃答题和继续答题两种策略取最大值

那么如果概率不是一个定值应该怎么办

根据连续概率

$$d_i = \int_{t}^{1}max(2^i , p \times d[i+1]) ,dp$$

很显然,这是个分段函数,所以积分的结果等于两段函数分别的积分的和

$$F(p) = max(2^i , p \times d_{i+1}) \Leftrightarrow \begin{cases}{\int_t^{p_0}2^i \, dp} & p\leqslant p^0\\ {\int_{p_0}^1d[i+1]\,dp} &p^0<p<1\end{cases}$$

当 $p=p_0$ 时,${\int_{p_0}^1d[i+1],dp} = {\int_t^{p_0}2^i , dp}$

求解之后

$$d[i] = 2^i \times \frac{p_0-t}{1-t} + \frac{1+p_0}{2} \times d[i+1] \times (1-\frac{p_0-t}{1-t})$$

我认为这个题的难点在于dp方程中状态的设计

3.代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
#include <iostream>
#include <cstdio>
using namespace std;
const int MAXN = 5e1+7;
double d[MAXN];
template <typename _TP>
inline _TP qread(_TP &X){
char ch=0;int w;X=0;
while(!isdigit(ch)){w=ch=='-';ch=getchar();}
while(isdigit(ch)){X=(X<<1)+(X<<3)+(ch^48);ch=getchar();}
X=w?-X:X;
return X;
}
inline double Max(const double a,const double b){
return a>b?a:b;
}
void solve(const int n,const double t){
d[n]=1<<n;
for(int i=n-1;i>=0;i--){
double p0=Max(t,(double)(1<<i)/d[i+1]);
double p=(p0-t)/(1-t);
d[i]=(1<<i)*p+(1+p0)/2*d[i+1]*(1-p); //积分
}
printf("%.3lf\n", d[0]);
}
int main(){
int n;
double t;
while(true){
scanf("%d%lf",&n,&t);
if(n==0 && t==0)break;
solve(n,t);
}
return 0;
}

评论