Estimation of Finite Population Properties when Sampling is without Replacement and Proportional to Magnitude
ANDREATTA, GIOVANNI;
1986
Abstract
Geologists often evaluate aggregate volumes of discovered plus undiscovered oil and/or gas in a petroleum basin by use of geologic-volumetric methods. Although sophisticated geological reasoning may be employed, the essential idea behind these methods is simple: estimate (a) the volume of hydrocarbon-bearing sediment in the basin, (b) the amount of hydrocarbons present per unit volume of sediment, and (c) the fraction of hydrocarbons present per unit volume that is technologically recoverable. The product of (a) and (b) is interpretable as a point estimate of the sum of amounts of oil and gas in place in individual oil and gas deposits (fields) in the basin. The product of (a), (b), and (c) is a point estimate of amounts of oil and gas recoverable from all individual deposits in the basin. When a basin is partially explored, additional information is available in the form of a sample composed of magnitudes of discovered deposits. Geologists sometimes ask, "Given a point estimate of the sum of magnitudes of undiscovered as well as discovered fields generated by a geologic-volumetric analysis and given a sample of magnitudes of discovered deposits, what inferences can be made about the 'size distribution' (empirical distribution of magnitudes) of all fields in the basin?" This question motivates our study of a variant of Murthy's (1957) unbiased estimator of a finite population total when sampling is successive. Here is how Murthy's estimator comes into play. We assume the existence of $N$ fields, labeled $1, 2, \ldots, N$, and associate a magnitude $y_k$ with the field labeled $k$ ($k = 1, 2, \ldots, N$). The sampling scheme adopted as a characterization of discovery is "successive sampling"; that is, the order in which fields are discovered is governed by sampling proportional to field magnitude and without replacement.
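As a rough illustration of successive sampling as just described, the following Python sketch draws a discovery order in which each new field is chosen with probability proportional to its magnitude among the fields not yet found. The function name `successive_sample` and the five magnitudes are illustrative assumptions, not taken from the paper.

```python
import random

def successive_sample(y, n, rng):
    """Draw n distinct field labels, one at a time, each with probability
    proportional to magnitude among the fields not yet discovered."""
    remaining = list(range(len(y)))
    order = []
    for _ in range(n):
        weights = [y[k] for k in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        order.append(pick)
        remaining.remove(pick)  # without replacement
    return order

# Hypothetical magnitudes for a five-field basin (illustrative only).
y = [10.0, 5.0, 3.0, 1.0, 1.0]
rng = random.Random(0)
print(successive_sample(y, 3, rng))
```

Over many repeated draws, the largest field should appear first in a fraction of runs close to its share of the total magnitude, which is the "big fields are found first" tendency in miniature.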
This particular sampling scheme conforms to empirically based industry folklore ("on the average, the big fields are found first") and is representable as a permutation distribution with domain the set of all permutations of field labels; for example, the probability that all $N$ fields are discovered in the order $1, 2, 3, \ldots, N$ is $\prod_{j=1}^{N} y_j / [y_j + \cdots + y_N]$. With $y = (y_1, \ldots, y_N)$, $s$ = the set of labels of fields discovered in a sample of size $n \le N$, $P(s \mid y)$ = the probability of an unordered sample $s$, and $P(s \mid k; y)$ = the probability of observing $s$ given that the field labeled $k$ appears first in $s$, $\sum_{k \in s} h(y_k) P(s \mid k; y) / P(s \mid y)$ is Murthy's unbiased estimate of the sum $\sum_{k=1}^{N} h(y_k)$ of a function $h$ of individual field magnitudes. To calculate Murthy's estimator, at least $R = y_1 + \cdots + y_N$ must be known [check the form of $P(s \mid y)$]. If a geologist provides us with a point estimate $R_e$ of $R$, we can then (in principle) compute each term $P(s \mid k; y) / P(s \mid y)$ in the aforementioned sum to arrive at an unbiased estimate of $\sum_{k=1}^{N} h(y_k)$. Choice of $h(x) = 1$ if $x \le X$ and $h(x) = 0$ otherwise allows us to calculate an unbiased estimate of the empirical frequency function generated by $\{y_1, \ldots, y_N\}$ conditional on $R = R_e$. Setting $X = \infty$ yields an unbiased estimate of $N$, the number of fields in the basin, conditional on $R_e = R$. The remaining fly in the ointment is that for even moderate sample sizes, computation of $P(s \mid y)$ is a formidable task. To overcome this computational difficulty we use an integral representation of $P(s \mid y)$ to develop asymptotic expansions of Murthy's estimator, the first few terms of which are easily computable. Properties of the leading term are examined by application to two examples. The first example consists of an experiment in which $N$ is estimated for a large number of Monte Carlo successive samples from a fixed finite population whose parameter $y$ is known. The second example uses data gathered and analyzed by Smith and Ward (1981).
They estimated the amount of recoverable oil remaining to be discovered in the North Sea, employing a discretized version of a model of discovery like that described at the outset, and gave maximum likelihood estimators (MLEs) of the ultimate number of oil and gas fields in each of seven magnitude classes computed by numerical grid search. We compare their estimates with estimates computed by use of an approximation to Murthy's estimator. A rationale for the close match between unbiased and maximum likelihood estimates for these data is provided in Section 7, and the approximation to Murthy's estimator that we propose is shown to be closely related to Chapman's (1951) MLE for the (unknown) size of a binomial sample.
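For very small samples, Murthy's estimator as defined above can be evaluated by brute-force enumeration of orderings. This makes its unbiasedness easy to check numerically, and it also shows why direct computation becomes formidable as the sample size grows, since each sample probability is a sum over all orderings of the sample. The Python sketch below is an illustration under stated assumptions, not the asymptotic expansion developed in the paper: the function names and the five magnitudes are hypothetical, and the total magnitude R is treated as known exactly.

```python
from itertools import combinations, permutations

def order_prob(mags, R):
    """Probability of discovering fields with magnitudes `mags` in that
    order, when the total magnitude still undiscovered starts at R."""
    p, remaining = 1.0, R
    for m in mags:
        p *= m / remaining
        remaining -= m
    return p

def sample_prob(mags, R):
    """P(s | y): sum of order probabilities over all orderings of s."""
    return sum(order_prob(o, R) for o in permutations(mags))

def murthy_estimate(mags, R, h):
    """Sum over k in s of h(y_k) * P(s | k first; y) / P(s | y)."""
    total = 0.0
    for i, m in enumerate(mags):
        rest = mags[:i] + mags[i + 1:]
        # Given that this field is found first, the rest of s is a
        # successive sample from a population with total R - m.
        total += h(m) * sample_prob(rest, R - m)
    return total / sample_prob(mags, R)

def expected_estimate(y, n, h):
    """Exact expectation of the estimator over all samples of size n."""
    R = sum(y)
    expectation = 0.0
    for s in combinations(range(len(y)), n):
        mags = tuple(y[k] for k in s)
        expectation += sample_prob(mags, R) * murthy_estimate(mags, R, h)
    return expectation

# Hypothetical magnitudes for a five-field basin (illustrative only).
y = (10.0, 5.0, 3.0, 1.0, 1.0)
# h = 1 everywhere: the expectation equals N = 5 up to rounding.
print(expected_estimate(y, 3, lambda x: 1.0))
# h = indicator of magnitude <= 3: the expectation equals 3 up to rounding.
print(expected_estimate(y, 3, lambda x: 1.0 if x <= 3.0 else 0.0))
```

Note that `murthy_estimate` uses only the sampled magnitudes and R, which is consistent with the remark that at least R must be known to compute the estimator.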