| 3.1 FUNCTIONS OF RANDOM VARIABLES: 
 DERIVED DISTRIBUTIONS
 Often when examining a system we know by hypothesis or 
measurement the probability law of one or more random
variables, and wish to obtain the probability laws of other 
random variables that can be expressed in terms of the
original random variables. The random variables in the second set 
are functions of the random variables in the first
set. We call this a problem of derived distributions, since we 
must derive the joint probability distribution(s) for the
random variables in the second set. Derived distribution problems 
can arise with discrete, continuous, or mixed
random variables.
 There are many special techniques for deriving distributions, but 
we will focus on a "never-fail" method.
Virtually all of the work associated with this method occurs in 
the joint sample space of the original random
variables; the never-fail method is simply a systematic procedure 
for carrying out Step 4 ("working in the sample
space") in a probabilistic modeling analysis.
 Suppose that the original set of random variables is given by 
{X1, X2, ..., XN} with joint cdf FX1, X2,...,XN(·). 
Suppose that there are M random variables 
Y1, Y2, ..., YM, each of which can be 
expressed as a function of X1, X2, ..., XN, 
namely Yi = gi(X1, X2, ..., 
XN), i = 1, 2, ..., M.  Then the never-fail method, called the 
cumulative distribution method, allows computation of the joint 
cumulative distribution function for the Yi's,

   F_{Y1,Y2,...,YM}(y1, y2, ..., yM) = P{Y1 ≤ y1, Y2 ≤ y2, ..., YM ≤ yM}

 as follows:
 
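a. Identify the set of points in the original 
(X1, X2, ..., XN) sample space that 
corresponds to the joint event

   {Y1 ≤ y1, Y2 ≤ y2, ..., YM ≤ yM} = {g1(X1, ..., XN) ≤ y1, g2(X1, ..., XN) ≤ y2, ..., gM(X1, ..., XN) ≤ yM}
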
 b. For each set of values for the yi's, [y1, y2, ..., yM], 
determine by summation or integration the probability in the 
(X1, X2, ..., XN) sample space of this joint event, thereby obtaining
F_{Y1,Y2,...,YM}(y1, y2, ..., yM) for -∞ < y1, y2, ..., yM < +∞.
 If the random variables are continuous, we can find the joint pdf 
for {Y1, Y2, ..., YM} by taking partial derivatives of
F_{Y1,Y2,...,YM}(·) with respect to each of its arguments,

   f_{Y1,Y2,...,YM}(y1, y2, ..., yM) = ∂^M F_{Y1,Y2,...,YM}(y1, y2, ..., yM) / (∂y1 ∂y2 ... ∂yM)

 If they are discrete, the pmf is found simply by using the cdf 
and subtracting appropriate successive values.
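
 For a small discrete problem, steps a and b can be carried out by direct enumeration. The following is a minimal sketch, assuming a hypothetical experiment (two independent fair dice, with the derived variable being their sum) that does not appear in the text; the function and variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product


def derived_cdf(joint_pmf, g, y):
    """Steps a and b: sum the joint pmf over the sample points x with g(x) <= y."""
    return sum(p for x, p in joint_pmf.items() if g(x) <= y)


def total(point):
    """Derived random variable: the sum of the two original random variables."""
    return point[0] + point[1]


# Hypothetical original sample space: two independent fair dice.
dice = {point: Fraction(1, 36) for point in product(range(1, 7), repeat=2)}

# cdf of the derived variable at each integer value, then the pmf
# recovered by subtracting successive values of the cdf.
cdf = {y: derived_cdf(dice, total, y) for y in range(1, 13)}
pmf = {y: cdf[y] - cdf[y - 1] for y in range(2, 13)}
```

 The same two functions would serve for any finite joint pmf and any function g; only the dictionary describing the original sample space changes.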
 While the method described in its full generality may appear 
intimidating, applying it carefully in a step-by-step
manner makes problems much easier to solve. Fortunately, for many 
problems of interest the number of variables
involved is small, often with neither M nor N exceeding 2. 
Gaining proficiency in this aspect of probabilistic
modeling seems to require study of numerous examples, to uncover 
potential pitfalls that await the unwary analyst.
Thus, we will analyze many examples, most of which are of 
independent interest in the analysis of urban service
systems. Continuous random variables appear to give the greatest 
difficulty to those first learning use of the
method, and thus our focus will be on continuous random 
variables. Examples involving discrete random variables
are given in the problems. (See Problem 3.2 for strictly discrete 
random variables and Problems 3.24 and 3.30 for
"mixed" random variables.)
 Example 1: Response Distance of an Ambulette
 
 This first example will provide a framework for demonstrating 
several characteristics of "derived distribution" problems.
Suppose that a public safety vehicle travels back and forth along 
a straight highway, the traveling perhaps to find
motorists in need of assistance. Also, along this highway 
accidents can occur that create a need for on-scene assistance by
the vehicle. The vehicle is dispatched by radio to these 
accidents. Because of its limited on-board emergency medical
equipment, we call the vehicle an ambulette. We are interested in 
determining the probability law of the travel distance for
the ambulette to reach a random medical emergency.
 Solution
 
 
 Following the general discussion above, a derived distribution 
problem is like any other probabilistic modeling problem;
it requires that we do four things to model the experiment:
 
 
 STEP 1: Define the random variables of interest.
 STEP 2: Identify the joint sample space.
 STEP 3: Determine the joint probability distribution over the sample space.
 STEP 4: Work within the sample space to determine the answers to any questions about the experiment.

 As discussed above, the activity specific to derived distributions (functions of 
random variables) occurs in Step 4.
 1. Random variables. Suppose that the highway is of unit length. Then the two key 
random variables would be
 
 X1 = location of the medical emergency, 0 ≤ X1 ≤ 1
 X2 = location of the ambulette at the moment of dispatch, 0 ≤ X2 ≤ 1
 Later, when we are interested in travel distance, assuming U-turns are possible and 
permissible everywhere, the travel
distance D can be expressed as a function of X1 and X2, D = |X1 - X2|
 2. Joint sample space. The joint sample space is the unit square in the positive 
quadrant (0 ≤ X1 ≤ 1, 0 ≤ X2 ≤ 1).
 3. Joint probability distribution. We will assume that the locations of the ambulette 
and the medical emergency
are uniformly, independently distributed over the highway. In practice, the three 
assumptions entailed in such a statement
would have to be argued for plausibility and measurements might have to be taken. 
Naturally, the analysis could also
proceed with an alternative set of assumptions. Since we are now dealing with strictly 
continuous random variables, we will
work with the joint probability density function, which is

   f_{X1,X2}(x1, x2) = 1   for 0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1
                     = 0   otherwise

 4. Work in the sample space. This is the point at which the never-fail method for 
deriving distributions comes
into play. We want the probability law of

   D = |X1 - X2|

 Here, in our general notation, N = 2 and M = 1 and we are confronted with what is 
sometimes called a 2-to-1 transformation.
 To apply the never-fail method for finding the cdf of D, F_D(y), we first locate the 
region in the (X1, X2) sample space
corresponding to the event {D ≤ y}. Formally, the steps are written as follows:

   F_D(y) = P{D ≤ y} = P{|X1 - X2| ≤ y}

 To remove the absolute value operator, we consider two cases 
separately: case 1: X1 ≥ X2; case 2: X1 < X2. For the first case,
D = X1 - X2 and experimental values x1 and x2 of X1 and X2, 
respectively, must lie between the line x2 = x1 and x2 = x1 - y (Figure
3.1). For the second case, D = X2 - X1, and experimental values 
of X1 and X2 must lie between the line x2 = x1 and x2 = x1 + y.
Consideration of these two cases gives rise to the shaded region 
in the sample space in Figure 3.1. Once we have
determined such a region, we have identified the set of points 
corresponding to the event of interest, {D ≤ y}, thereby
completing step a of the never-fail method. This is often the 
most difficult part of a derived distribution problem. Note that
determination of this region in no way depended on the joint pdf 
for X1 and X2; thus, the "work" invested to this point could
be applied to several alternative models, each with its own joint 
pdf for X1 and X2. 
 
   
 Step b of the never-fail method requires that we integrate 
f_{X1,X2}(·) over the set of points in the shaded region to obtain
F_D(y). Since the joint X1, X2 pdf is uniform over the unit 
square, we can perform the integration by computing areas in the
sample space. (Conceptually, each area is multiplied by "1," the 
height of the pdf at that point, to yield a probability
measured as a volume.) By computing areas of the two triangles not in 
the shaded region, each of which has area (1 - y)^2 / 2,

   F_D(y) = 1 - (1 - y)^2,   0 ≤ y ≤ 1   (3.1)

 We have now completed step b of the never-fail method and we are 
"done." [What do we know about F_D(y) for values of y outside [0, 1], such as F_D(-2)?] 
 Should we desire the pdf of D, we differentiate, obtaining

   f_D(y) = dF_D(y)/dy = 2(1 - y)   for 0 ≤ y ≤ 1
                       = 0          otherwise

 From the pdf (or cdf) we can determine anything that is desired 
concerning
D. For instance, the expected value (or mean value) of D is

   E[D] = ∫_0^1 y · 2(1 - y) dy = 1/3

 These results will be of use in our further work.
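
 As a rough numerical cross-check of Example 1, the sketch below estimates the area of the region {|x1 - x2| ≤ y} by counting cell midpoints on a grid over the unit square and compares it with the closed form 1 - (1 - y)^2. The function names and the grid resolution are illustrative assumptions, not part of the original analysis.

```python
def cdf_distance(y):
    """Closed-form cdf of D = |X1 - X2| for X1, X2 independent and uniform on [0, 1]."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    return 1.0 - (1.0 - y) ** 2


def cdf_distance_grid(y, n=400):
    """Approximate P{|X1 - X2| <= y} by counting midpoints of an n-by-n grid of cells."""
    inside = 0
    for i in range(n):
        x1 = (i + 0.5) / n
        for j in range(n):
            x2 = (j + 0.5) / n
            if abs(x1 - x2) <= y:
                inside += 1
    return inside / (n * n)


def mean_distance(n=100000):
    """Midpoint-rule estimate of E[D], the integral of y * 2(1 - y) over [0, 1]."""
    h = 1.0 / n
    return sum(((k + 0.5) * h) * 2.0 * (1.0 - (k + 0.5) * h) for k in range(n)) * h
```

 Because step a (locating the region) is separate from step b (accumulating probability), the grid loop could be reused with a nonuniform joint pdf by weighting each cell instead of counting it.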
 A system administrator may be interested in knowing the effects 
on travel distance of prepositioning the ambulette at
the center of the interval depicting the highway, thus fixing X2 = 1/2. 
Then the joint sample space is the straight line indicated in
Figure 3.2. If the new travel distance is D' = |X1 - 1/2|, the
region for which {D' ≤ y} is the line segment of length 2y 
centered at X1 = 1/2. Integrating the (uniform) pdf of X1, we have
F_D'(y) = P{D' ≤ y} = P{|X1 - 1/2| ≤ y} = 2y (0 ≤ y ≤ 1/2). Thus, the pdf of D' is 

   f_D'(y) = 2   for 0 ≤ y ≤ 1/2
           = 0   otherwise

 How could this result also be obtained by inspection? The mean 
and variance are

   E[D'] = 1/4
   σ^2_D' = (1/2)^2 / 12 = 1/48

 Thus, a change in deployment policy resulting in an ambulette 
prepositioned at the center of its service area rather
than randomly patrolling its service area reduces mean travel 
distance by 25 percent, the variance of the travel
distance by 62.5 percent, and, perhaps important in "worst-case" 
analyses, the maximum possible travel distance by
50 percent.
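
 The percentage reductions quoted above can be checked with exact rational arithmetic. This is a small sketch, assuming the pdf 2(1 - y) on [0, 1] for the patrolling ambulette and the pdf 2 on [0, 1/2] for the prepositioned one; the helper names are hypothetical.

```python
from fractions import Fraction


def moment(coeffs, upper, k):
    """k-th moment of the pdf sum(c_i * y**i) supported on [0, upper], computed exactly."""
    upper = Fraction(upper)
    return sum(Fraction(c) * upper ** (i + k + 1) / (i + k + 1)
               for i, c in enumerate(coeffs))


def mean_and_variance(coeffs, upper):
    """Mean and variance of a polynomial pdf on [0, upper]."""
    mean = moment(coeffs, upper, 1)
    return mean, moment(coeffs, upper, 2) - mean ** 2


patrol_mean, patrol_var = mean_and_variance([2, -2], 1)            # pdf 2(1 - y) on [0, 1]
center_mean, center_var = mean_and_variance([2], Fraction(1, 2))   # pdf 2 on [0, 1/2]

mean_reduction = 1 - center_mean / patrol_mean        # fraction by which the mean drops
variance_reduction = 1 - center_var / patrol_var      # fraction by which the variance drops
```

 Here mean_reduction evaluates to 1/4 and variance_reduction to 5/8, matching the 25 percent and 62.5 percent figures.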
 
Question: How would one determine (or estimate) the joint 
distribution function for X1 and X2 in practice?
 Further work: Problems 3.2-3.4.
 
 Extension: Scaling
 
 We often select the scale of a probabilistic modeling problem for 
analytical convenience. For instance, if the
length of highway analyzed in Example 1 had been 13.72 
kilometers, the factor of 13.72 would have occurred in
numerous places (making the analysis obviously less attractive). 
Thus, after performing the analysis for a
conveniently scaled problem, we often rescale it to suit the 
real-world situation at hand. Scaling can also occur when
switching systems of measurement, say from British units to 
metric units.
 Suppose that we have derived the probability law for W, given one 
scale, and we wish to find the moments and
the probability law of

   V = aW + b

where a and b are constants. Taking expectations,

   E[V] = aE[W] + b
   σ^2_V = a^2 σ^2_W

 In words, multiplying a random variable by a constant results in 
its variance being multiplied by the square of that
constant.
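 We can also derive the probability law of V (assumed to be 
continuous) using the never-fail method. The
analysis proceeds as follows:

   F_V(v) = P{V ≤ v} = P{aW + b ≤ v}

 To proceed further, we must distinguish two cases: case 1: a > 0; case 2: a < 0.

   Case 1 (a > 0):  F_V(v) = P{W ≤ (v - b)/a} = F_W((v - b)/a),       f_V(v) = (1/a) f_W((v - b)/a)
   Case 2 (a < 0):  F_V(v) = P{W ≥ (v - b)/a} = 1 - F_W((v - b)/a),   f_V(v) = -(1/a) f_W((v - b)/a)
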
 These equations constitute the answer to our problem. For 
instance, in the ambulette example, if a = 13.72
kilometers and b = 71.09 kilometers, we would be modeling a 
13.72-kilometer stretch of highway starting 71.09
kilometers from the origin. Returning to the patrolling ambulette 
example, the cdf for X1, expressed on the new scale as V = 13.72 X1 + 71.09, becomes

   F_V(v) = 0                       v < 71.09
          = (v - 71.09) / 13.72     71.09 ≤ v ≤ 84.81
          = 1                       v > 84.81

 You might find it helpful to sketch several different 
applications of this result.
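
 One such application can be sketched in code. The helper below builds the cdf of V = aW + b from the cdf of a continuous W, treating a > 0 and a < 0 separately as in the two cases of the derivation; the names scaled_cdf, uniform_cdf, and highway_cdf are illustrative assumptions.

```python
def scaled_cdf(cdf_w, a, b):
    """Return the cdf of V = a*W + b for a continuous W with cdf cdf_w and a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero")
    if a > 0:
        # {V <= v} is the event {W <= (v - b)/a}
        return lambda v: cdf_w((v - b) / a)
    # For a < 0 the inequality reverses: {V <= v} is the event {W >= (v - b)/a}
    return lambda v: 1.0 - cdf_w((v - b) / a)


def uniform_cdf(x):
    """cdf of a random variable uniformly distributed on [0, 1]."""
    return min(1.0, max(0.0, x))


# Emergency location on a 13.72-kilometer stretch starting 71.09 kilometers from the origin.
highway_cdf = scaled_cdf(uniform_cdf, 13.72, 71.09)
```

 For example, highway_cdf evaluated at the midpoint of the stretch (71.09 + 6.86 kilometers) should return approximately 0.5.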
 
 Exercise 3.1: Rectangular Response Area
 Suppose that we have an X0-by-Y0 rectangular response area for the
ambulette (Figure 3.3), with sides of the rectangle parallel to 
the coordinate axes. The location of the medical
emergency (X1, Y1) and of the ambulette (X2, Y2) are 
independently uniformly distributed over the response area.
Travel distance occurs according to the "right-angle" metric,
 
 D=|X1 -X2| + |Y1 - Y2|	(3.11)
 
 
 
 
 
 
   
 Example 1: Revisited (Min and Max)
 
 Suppose we are interested in the coordinates that determine a 
dispatch incident, X1 and X2, without regard to which location
represents the ambulette and which the medical emergency. 
Instead, we may be concerned with the rightmost coordinate R
and the leftmost coordinate L. For instance all points between R 
and L may be exposed to siren and lights as the ambulette
passes at high speed. Thus, the joint probability law of R and L 
would be of interest. We will ignore scaling and assume
that all locations, as before, occur in the interval [0, 1].
 
 Solution
 
 Since we have already performed Steps 1-3 in describing the 
experiment, we are ready to go to Step 4 (work in the sample
space) and employ the never-fail method. The random variables that 
are functions of the original random variables are

   R = Max(X1, X2)
   L = Min(X1, X2)

 We wish to derive the joint probability law for R and L. This is 
sometimes called an N = 2-to-M = 2 transformation. To
execute step a of the never-fail method, we proceed formally as follows:

   F_{R,L}(r, l) = P{R ≤ r, L ≤ l} = P{Max(X1, X2) ≤ r, Min(X1, X2) ≤ l}

 To proceed from here, we consider separately each of the two events 
in braces and "merge" these later by intersection. That is, we can write

   {Max(X1, X2) ≤ r, Min(X1, X2) ≤ l} = {Max(X1, X2) ≤ r} ∩ {Min(X1, X2) ≤ l}

 To determine the set of points in the (X1, X2) sample space 
corresponding to Max(X1, X2) ≤ r, we again consider two
cases: case 1: X1 ≥ X2; case 2: X1 ≤ X2. For case 1, Max(X1, 
X2) = X1 and the event Max(X1, X2) ≤ r corresponds to
the set of points to the left of the line x1 = r (Figure 3.4). 
Similarly, for case 2, Max(X1, X2) = X2 and the event Max(X1,
X2) ≤ r corresponds to the set of points below the line x2 = 
r. Combining these two cases, the event Max(X1, X2) ≤ r
corresponds to the square of area r^2 shown in Figure 3.4.
 Proceeding in a similar manner for Min(X1, X2) ≤ l, we again 
consider case 1: X1 ≥ X2, and case 2: X1 ≤ X2. For
case 1, Min(X1, X2) = X2 and the event Min(X1, X2) ≤ l 
corresponds to the set of points below the line x2 = l (Figure
3.4). For case 2, Min(X1, X2) = X1 and the event Min(X1, X2) ≤ l 
corresponds to the set of points to the left of the line x1
= l. Combining these two cases, the event Min(X1, X2) ≤ l 
corresponds to the L-shaped region shown in Figure 3.4. 
 
   
 The intersection of the two events found above yields the event 
of interest, {R ≤ r, L ≤ l}, shown in the crosshatched
region in Figure 3.4. We have now completed step a of the 
never-fail method.
 To carry out step b all we need to do is to integrate the joint 
pdf f_{X1,X2}(·) over the region (event) found in step a. Again,
because of the special nature of this sample space and its 
probability assignment, we can do this by working directly with
areas in the sample space. By computing the relevant areas, we 
obtain

   F_{R,L}(r, l) = r^2 - (r - l)^2 = 2rl - l^2,   0 ≤ l ≤ r ≤ 1

Taking the mixed partial derivative gives the joint pdf

   f_{R,L}(r, l) = ∂^2 F_{R,L}(r, l) / (∂r ∂l) = 2,   0 ≤ l ≤ r ≤ 1

 The joint R, L sample space is shown in Figure 3.5. The joint pdf 
of R, L over this triangular region is uniform. Does this
make sense intuitively?
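
 The area argument for the joint cdf of R and L can be checked numerically. The sketch below assumes the closed form r^2 - (r - l)^2 for 0 ≤ l ≤ r ≤ 1 (the square of side r minus the corner in which both coordinates exceed l) and compares it with a midpoint count over the unit square; the function names are illustrative.

```python
def joint_cdf_max_min(r, l):
    """P{Max(X1, X2) <= r, Min(X1, X2) <= l} for X1, X2 independent uniform on [0, 1]."""
    r = min(1.0, max(0.0, r))
    l = min(1.0, max(0.0, l))
    if l >= r:
        return r * r                   # Min <= l is implied once Max <= r <= l
    return r * r - (r - l) ** 2        # square of side r minus the corner where both exceed l


def joint_cdf_grid(r, l, n=400):
    """Grid estimate of the same probability, counting cell midpoints of the unit square."""
    hits = 0
    for i in range(n):
        x1 = (i + 0.5) / n
        for j in range(n):
            x2 = (j + 0.5) / n
            if max(x1, x2) <= r and min(x1, x2) <= l:
                hits += 1
    return hits / (n * n)


def mixed_difference(r, l, h=0.01):
    """Finite-difference estimate of the joint pdf, the mixed partial derivative of the cdf."""
    f = joint_cdf_max_min
    return (f(r + h, l + h) - f(r + h, l) - f(r, l + h) + f(r, l)) / (h * h)
```

 At an interior point of the triangular region such as (r, l) = (0.8, 0.3), mixed_difference should return a value close to 2, consistent with a uniform joint pdf over a triangle of area 1/2.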
 
 Example 2: Travel Time
 
 Suppose that it is not travel distance we are interested in but 
rather travel time. If we define random variables

   D = travel distance
   S = travel speed
   T = travel time

 time is related to distance and speed by the familiar equation

   T = D / S

 Solution
 
 In general, to obtain the pdf of T we would require the joint pdf 
of D and S, say f_{D,S}(x, s). The never-fail method would
proceed as follows:

   F_T(t) = P{T ≤ t} = P{D/S ≤ t} = P{D ≤ tS}

 The event corresponding to {D ≤ tS} in the (D, S) sample space is 
shown in Figure 3.6. In principle, all we need do is
integrate the joint D, S pdf over this region for each value of t 
to obtain the cdf for T, F_T(t) (see footnote 1). 

 
   
 As a simple example, suppose that the speed of response could 
assume only two values, S = 1 or S = 2, with equal
probability. Assume that distance is distributed as the ambulette 
response distance of Example 1, independently of the
speed of response. Then

   f_{D,S}(x, s) = f_D(x) f_S(s) = 2(1 - x) [(1/2) δ(s - 1) + (1/2) δ(s - 2)],   0 ≤ x ≤ 1

where δ(·) denotes a unit impulse.

 This formidable-looking expression represents the pdf of two 
random variables, one continuous and the other discrete. As
long as we keep in mind that pdf's have no probabilistic meaning 
until we integrate them and that the integration
properties of impulses are well defined, we will be in fine 
shape. (Recall Problem 2.2.)
 The joint (D, S) sample space is shown in Figure 3.7. We now 
proceed with the never-fail method.
 
   
 Examining Figure 3.7, we see that the straight line x = ts 
intersects both "lines" of the sample space for 0 < t < 1/2. So,
for those values of t, we have

   F_T(t) = (1/2) F_D(t) + (1/2) F_D(2t)

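 The "1/2"'s arise from integrating left to right across the 
impulses; the F_D(·) terms arise from integrating from x = 0 to x 
= ts at
s = 1 and s = 2. Since from Example 1, (3.1), we know that

   F_D(y) = 1 - (1 - y)^2,   0 ≤ y ≤ 1

we obtain

   F_T(t) = (1/2)[1 - (1 - t)^2] + (1/2)[1 - (1 - 2t)^2],   0 ≤ t ≤ 1/2
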
 Once t exceeds 1/2 in value, the sweep of the line x = ts no longer 
picks up additional probability from the "line impulse" at s =
2. So, for 1/2 < t ≤ 1, F_T(t) = (1/2)[1 - (1 - t)^2] + 1/2. Thus, combining 
results and differentiating, the answer to our problem is 

   f_T(t) = 3 - 5t    for 0 ≤ t ≤ 1/2
          = 1 - t     for 1/2 < t ≤ 1
          = 0         otherwise

 This pdf is sketched in Figure 3.8. Note the discontinuity in 
slope at t=1/2.
This is not unusual in practice; in fact, one often comes across 
problems in which the derived pdf is discontinuous (in
value) at one or more points. Points of discontinuity, either in 
value or slope, usually correspond to "switchover points"
in the original sample space in which the summation or integral 
for accumulating probability for the cdf switches over to
some new functional form. Switchovers often occur when the region 
of accumulated probability changes in geometric form,
such as occurred at t=1/2 in the example.
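
 The switchover at t = 1/2 can be seen directly in a short sketch. It assumes the Example 1 cdf 1 - (1 - x)^2 for D and the two-valued speed S in {1, 2}; the piecewise pdf coded here (3 - 5t, then 1 - t) is what differentiating that cdf yields, and the function names are hypothetical.

```python
def cdf_distance(x):
    """cdf of the Example 1 response distance, 1 - (1 - x)^2 on [0, 1]."""
    x = min(1.0, max(0.0, x))
    return 1.0 - (1.0 - x) ** 2


def cdf_travel_time(t, speeds=(1.0, 2.0), probs=(0.5, 0.5)):
    """cdf of T = D / S for discrete S independent of D: sum over s of P{S = s} * F_D(t * s)."""
    return sum(p * cdf_distance(t * s) for s, p in zip(speeds, probs))


def pdf_travel_time(t):
    """Piecewise pdf of T for S equal to 1 or 2 with equal probability."""
    if 0.0 <= t <= 0.5:
        return 3.0 - 5.0 * t
    if 0.5 < t <= 1.0:
        return 1.0 - t
    return 0.0


def numerical_slope(t, h=1e-5):
    """Central-difference derivative of the cdf, for comparison with pdf_travel_time."""
    return (cdf_travel_time(t + h) - cdf_travel_time(t - h)) / (2.0 * h)
```

 The pdf is continuous in value at t = 1/2 (both pieces give 1/2 there), while its slope changes from -5 to -1, which is the discontinuity in slope noted above.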
 While we have completed our derived distribution work on this 
problem, there is one additional issue that we wish to
address and that deals with expected values of random variables. 
Here the expected value of T is

   E[T] = ∫_0^1 t f_T(t) dt = ∫_0^{1/2} t (3 - 5t) dt + ∫_{1/2}^1 t (1 - t) dt = 1/6 + 1/12 = 1/4

 We may wish to calculate the expected value simply by working in 
the (D, S) sample space. Because of independence, if T =
h1(D)h2(S), then

   E[T] = E[h1(D)] E[h2(S)] = E[D] E[1/S] = (1/3)[(1/2)(1) + (1/2)(1/2)] = (1/3)(3/4) = 1/4

 as calculated previously. This is an illustration of the 
following general principle:
 
If one only desires expected values and not the complete 
probability law of a function of random variables, it is usually
computationally easier to work directly in the original sample 
space to compute the expected values.
 There is a second general principle we can illustrate with this 
example. When asked to calculate E[T], one may be
tempted to say that

   E[T] = E[D] / E[S] = (1/3) / (3/2) = 2/9

 Clearly, this is not correct, the answer being about 11 percent 
less than the correct answer. The error lies in assuming that
E[1/S] = 1/E[S].
 In general, the expected value of a function of a random variable 
is not equal to the function evaluated at the expected value
of the random variable.
 In this case one can prove mathematically that for any 
nonnegative random variable S,

   E[1/S] ≥ 1 / E[S]

 Hence, using (E[D]/E[S]) to estimate E[T] in such a case results 
in an optimistically low estimate of average travel time. In
a practical sense these relations imply that an urban service 
agency cannot infer that, say, a 20-mile/hr average response
speed and a 1-mile average travel distance imply a 3-minute 
average travel time. On the contrary, the average inverse speed
could be, say, 0.10 hour/mile; in such a case if travel distance 
and travel speed are independent, the average travel time is 6
minutes, not 3 minutes.
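
 The two principles can be illustrated with exact arithmetic for the two-speed example. This is a minimal sketch, assuming E[D] = 1/3 and S equal to 1 or 2 with equal probability; the variable names are illustrative.

```python
from fractions import Fraction

expected_distance = Fraction(1, 3)                    # E[D] for the patrolling ambulette
speed_pmf = {1: Fraction(1, 2), 2: Fraction(1, 2)}    # P{S = s}

mean_speed = sum(s * p for s, p in speed_pmf.items())           # E[S]
mean_inverse_speed = sum(p / s for s, p in speed_pmf.items())   # E[1/S]

# Working directly in the (D, S) sample space, using independence of D and S.
correct_mean_time = expected_distance * mean_inverse_speed

# The tempting but incorrect shortcut E[D] / E[S].
naive_mean_time = expected_distance / mean_speed

# Fraction by which the shortcut understates the true mean travel time.
shortfall = 1 - naive_mean_time / correct_mean_time
```

 The shortfall works out to 1/9, which is the "about 11 percent" underestimate mentioned in the text.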
 
 Further work: Problem 3.5.
 
 Example 3: Rayleigh Distribution
 
 To this point our derived distribution examples have dealt with 
sample spaces in which all random variables had finite
maximum and minimum values. This is not a necessary requirement, 
and many derived distribution problems, such as the
case considered here, allow one or more random variables to 
assume infinitely large (positive or negative) values.
 Suppose an urban vehicle is located at (X0, Y0). An automatic 
vehicle location (AVL) system utilizes one of the several
available technologies (see footnote 2) to estimate the location of the vehicle. 
Such an application is relevant in police departments,
taxicab services, maintenance services, and numerous other urban 
services. Suppose that the estimated position of the
vehicle is given by

   X = X0 + Xe
   Y = Y0 + Ye

where (X0, Y0) represent the true position coordinates of the 
vehicle and (Xe, Ye) are the additive error terms due to imperfect
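resolution. For certain AVL technologies it makes sense to assume 
that Xe and Ye are independent, zero-mean Gaussian
random variables:

   f_{Xe}(x) = (1 / (σ √(2π))) exp(-x^2 / (2σ^2)),   -∞ < x < +∞
   f_{Ye}(y) = (1 / (σ √(2π))) exp(-y^2 / (2σ^2)),   -∞ < y < +∞

 where the standard deviation σ specifies the resolution of the 
system. It now makes sense to examine properties of the
"radius of error" 

   R = √(Xe^2 + Ye^2)
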
 Solution
 
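 To derive the probability law of R we work in the (Xe, Ye) sample 
space, which is the entire plane (Figure 3.9), and utilize
the joint (Xe, Ye) pdf, which is (by independence)

   f_{Xe,Ye}(x, y) = (1 / (2πσ^2)) exp(-(x^2 + y^2) / (2σ^2)),   -∞ < x, y < +∞

The cdf of R is then

   F_R(r) = P{R ≤ r} = P{√(Xe^2 + Ye^2) ≤ r} = ∫∫_{x^2 + y^2 ≤ r^2} f_{Xe,Ye}(x, y) dx dy,   r ≥ 0
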
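 Because of the circular symmetry of the situation, we find it 
easier to evaluate this integral by changing to polar
coordinates ρ and θ, where 

   ρ = √(x^2 + y^2),   θ = tan^{-1}(y / x)

 These relationships are shown in Figure 3.9. Since the 
infinitesimal area to be integrated changes from dx dy to ρ dρ dθ,
we can write 

   F_R(r) = ∫_0^{2π} dθ ∫_0^r (1 / (2πσ^2)) exp(-ρ^2 / (2σ^2)) ρ dρ = ∫_0^r (ρ / σ^2) exp(-ρ^2 / (2σ^2)) dρ
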
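 Carrying out the final integration, we find that

   F_R(r) = 1 - exp(-r^2 / (2σ^2)),   r ≥ 0

Differentiating, we obtain the Rayleigh pdf with parameter σ,

   f_R(r) = (r / σ^2) exp(-r^2 / (2σ^2)),   r ≥ 0
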
 Notice that this pdf behaves as we might expect intuitively: it 
starts at zero at r = 0 and grows monotonically to a maximum
(which occurs at r = σ) and then decreases monotonically in an 
exponential way according to r^2.
 Among other applications, the Rayleigh probability law arises in 
physics in various scattering experiments and in
communication theory in the modeling of noise over a 
communication channel. We have now seen how it arises as a
derived distribution in an urban vehicle location context.
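
 A quick simulation can be used to check the Rayleigh result. The sketch below assumes the closed forms 1 - exp(-r^2 / (2σ^2)) for the cdf and (r / σ^2) exp(-r^2 / (2σ^2)) for the pdf, and compares the cdf with the fraction of simulated Gaussian error vectors that fall inside a circle of radius r; the sample size and seed are arbitrary choices.

```python
import math
import random


def rayleigh_cdf(r, sigma):
    """cdf of R = sqrt(Xe^2 + Ye^2) for independent zero-mean Gaussians with std dev sigma."""
    if r <= 0.0:
        return 0.0
    return 1.0 - math.exp(-r * r / (2.0 * sigma * sigma))


def rayleigh_pdf(r, sigma):
    """Derivative of rayleigh_cdf with respect to r."""
    if r <= 0.0:
        return 0.0
    return (r / sigma ** 2) * math.exp(-r * r / (2.0 * sigma * sigma))


def empirical_cdf_of_radius(r, sigma, n=100000, seed=12345):
    """Fraction of n simulated error vectors (Xe, Ye) whose length is at most r."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        xe = rng.gauss(0.0, sigma)
        ye = rng.gauss(0.0, sigma)
        if math.hypot(xe, ye) <= r:
            hits += 1
    return hits / n
```

 With σ = 2, for instance, empirical_cdf_of_radius(2.0, 2.0) should land near rayleigh_cdf(2.0, 2.0) = 1 - exp(-1/2), and rayleigh_pdf is largest at r = σ.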
	 	There is an alternative way of deriving the Rayleigh pdf directly 
without first finding the cdf. The method is useful in
other applications, as well, in which it is easy to make 
infinitesimal probability arguments. However, when in doubt, we
always prefer to resort to the never-fail cdf method. The direct 
method proceeds as follows : since a pdf has a probability
meaning only if it is integrated, we "integrate" f_R(r) over the 
infinitesimal interval [r, r + dr),

   f_R(r) dr = P{r ≤ R < r + dr} = ∫∫_{r ≤ √(x^2 + y^2) < r + dr} f_{Xe,Ye}(x, y) dx dy

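 Again because of circular symmetry, we change to polar 
coordinates ρ and θ, with ρ = r and dρ = dr, thereby obtaining 

   f_R(r) dr = ∫_0^{2π} (1 / (2πσ^2)) exp(-r^2 / (2σ^2)) r dr dθ = (r / σ^2) exp(-r^2 / (2σ^2)) dr
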
 as previously derived. We used such an infinitesimal argument 
when showing in Section 2.12 that the Ith-order interarrival
time of a Poisson process has an Ith-order Erlang pdf. However, 
again we caution those computing derived distributions
that this "infinitesimal" method for finding the pdf directly is 
fraught with potential pitfalls and difficulties for all but the
simplest problems. Thus, the never-fail cdf method remains our 
primary tool for deriving distributions.
 
 Further work on AVL position estimation errors: Problems 3.6 and 
3.7.
 
 Example 4: Ratio of Right Angle to Euclidean Distance Metrics
 
 As another example of deriving distributions of random variables, 
we consider a problem that arises in transportation
systems (e.g., "dial-a-ride" systems, taxicab systems), emergency 
services (fire, police, and ambulance), and other
municipal systems having mobile units. The problem deals with the 
"penalty" in travel distance incurred by a mobile unit
while traveling a grid of streets, compared to a helicopter or 
other unit that could travel "as the crow flies."
 If the mobile unit is located at (x1, y1) and is traveling along 
a shortest-distance path to (x2, y2), perhaps to pick up a
passenger, then the right-angle distance between the points is

   d = |x1 - x2| + |y1 - y2|

 If street directions are parallel to the coordinate axes, the 
right-angle distance (also called Manhattan, metropolitan, or
rectangular distance) is a good approximation for the actual 
travel distance covered (see footnote 3).
 Of interest in designing computer dispatching algorithms and in 
developing planning models, the ratio of the right
angle to the Euclidean distance provides insight as to the extra 
distance traveled because of the requirement of driving on
streets. For instance, if one knew the average value of this 
ratio, then in a computer dispatching algorithm it might be
acceptable to estimate the travel distance as the product of this 
average value and the Euclidean distance, the latter being
obtained easily from a file of (x, y) coordinates.
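 Consider two points (X1, Y1) and (X2, Y2), corresponding to the 
trip origin and destination, respectively, defined
relative to any fixed coordinate system. Let ψ (0 ≤ ψ ≤ π/2) be 
the angle at which the directions of travel are rotated with
respect to the straight line connecting the two points (see 
Figure 3.10). Given ψ, the right-angle travel distance between
(X1, Y1) and (X2, Y2) is 

   D_r = D_e (cos ψ + sin ψ)

where D_e is the Euclidean distance between the two points. The ratio of the
right-angle to the Euclidean distance is therefore

   R = D_r / D_e = cos ψ + sin ψ = √2 cos(ψ - π/4)
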
 We wish to derive the cdf of R using the never-fail method, 
making reasonable assumptions about the probabilistic
behavior of ψ. 
 Solution
 
 Here we are deriving the distribution of one continuous random 
variable which is expressed as a function of another
continuous random variable (i.e., a "one-to-one" transformation). 
The cdf of R is

   F_R(r) = P{R ≤ r} = P{cos ψ + sin ψ ≤ r}

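 The event corresponding to {R ≤ r} in the ψ sample space is shown 
in Figure 3.11. Now in a large, uniform city it makes
sense to assume that ψ is uniformly distributed over [0, π/2]. 
(Why?) We call this an isotropy assumption, meaning
sameness regardless of direction. Given the isotropy assumption, 
we can integrate the pdf of ψ over the event indicated in
Figure 3.11 to obtain 

   F_R(r) = 0                                r < 1
          = 1 - (4/π) cos^{-1}(r / √2)       1 ≤ r ≤ √2
          = 1                                r > √2

Differentiating and taking moments,

   f_R(r) = 4 / (π √(2 - r^2)),   1 ≤ r ≤ √2
   E[R] = 4/π ≈ 1.273
   σ^2_R = 1 + 2/π - 16/π^2 ≈ 0.0155
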
 Thus, "on the average" the mobile unit travels about 1.273 times 
the Euclidean
distance (given the model assumptions). Since σ^2_R = 0.0155, the ratio σ_R/E[R], 
the coefficient of variation, is only 0.098, meaning 
that the estimate of 4/π for E[R] is quite robust.
		A reasonable "test" of the right-angle distance metric would be 
to compare the empirical distribution of ratios of
recorded travel distances and corresponding Euclidean distances 
to FR(·) and to compare the empirically found average R to
1.273. 
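
 The quoted values 1.273, 0.0155, and 0.098 can be reproduced numerically. The sketch below assumes R = cos ψ + sin ψ with the rotation angle (written ψ here, a symbol chosen for this sketch) uniform on [0, π/2], and the cdf 1 - (4/π) cos^{-1}(r / √2) on [1, √2] that follows from that assumption; the moments are computed by integrating over the angle rather than over r.

```python
import math


def ratio_cdf(r):
    """cdf of R = cos(psi) + sin(psi) with psi uniform on [0, pi/2]."""
    if r <= 1.0:
        return 0.0
    if r >= math.sqrt(2.0):
        return 1.0
    return 1.0 - (4.0 / math.pi) * math.acos(r / math.sqrt(2.0))


def ratio_moments(n=100000):
    """Midpoint-rule mean and variance of R, averaging over the uniform angle psi."""
    h = (math.pi / 2.0) / n
    first = 0.0
    second = 0.0
    for k in range(n):
        psi = (k + 0.5) * h
        value = math.cos(psi) + math.sin(psi)
        first += value
        second += value * value
    mean = first / n
    return mean, second / n - mean * mean


mean_ratio, variance_ratio = ratio_moments()
coefficient_of_variation = math.sqrt(variance_ratio) / mean_ratio
```

 Computing the moments in the original (angle) sample space avoids the integrable singularity of the pdf of R at r = √2, in line with the general principle stated in Example 2.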
 Further work: Problem 3.8 (deriving E[R] and σ^2_R without F_R(·)); 
Problems 3.9 and 3.10 (alternatives to the isotropy assumption). 
 Example 5: Quantization Model
 
 As a final detailed example of a derived distribution problem, we 
consider a situation in which two continuous random
variables give rise to one discrete random variable. This 2-to-1 
transformation arises due to quantization of odometer
readings in urban vehicles. The same analysis applies in other 
quantization settings, for instance in cases where successive
event times are quantized.
 Assume that we are running an experiment to estimate the 
distribution of distance traveled by taxicabs, where distance

   D = miles traveled from the moment of dispatch to arrival at the address of the caller

 All we have available experimentally are recorded travel 
distances, which are quantized as 0 miles, 1 mile, 2 miles, and so
on. We wish to examine the quantitative effects of such 
truncation. Quite clearly, the same model could be used for studying
response distances of emergency vehicles, "paid" trips of 
taxicabs, trips of dial-a-ride vehicles, etc.
 	For a journey of length D, the recorded travel distance equals 
the sum of D and the accumulated odometer mileage at the
moment of dispatch since the last odometer reading change, the 
sum truncated to the largest integer not exceeding the sum.
For instance, if the vehicle had traveled 0.9 mile since the last 
reading change and then traveled 1.2 miles to the address of
the caller (following dispatch), the recorded mileage would be 
the largest integer not exceeding (0.9 + 1.2) = 2.1, which is
2 (miles). If, however, the noninteger accumulated odometer 
mileage at the moment of dispatch had been 0.6 rather than
0.9, the recorded mileage would be the largest integer not 
exceeding (0.6 + 1.2) = 1.8, which is 1 mile. In the first case, the
odometer's mileage reading had changed twice; in the second, 
once. As examples will clearly demonstrate, the recorded
travel distance can either underestimate or overestimate the 
actual travel distance by as much as 1 mile.
 Solution
 
 Random Variables
 
 There are two key random variables that give rise to the 
quantized distance random variable:

 D = actual travel distance
 θ = accumulated noninteger odometer mileage at the moment of dispatch 
     (a random variable distributed over [0, 1))

 If we let the quantized distance random variable be

 K = recorded mileage for the journey

then K is a function of D and θ: 

   K = [D + θ], the largest integer not exceeding D + θ

 Here we have a discrete random variable expressed as a function 
of two continuous random variables. If we have the joint
probability law for D and θ, we would like the probability law 
for K. 
 Joint Sample Space
 
 The (D, θ) sample space is the infinite strip of 
width 1 (0 < D < ∞, 0 ≤ θ ≤ 1), shown in Figure 3.12. Without yet
assigning a probability law over this sample space, we have 
performed in Figure 3.12 the "work" required to find the sets of
points in the sample space that give rise to different values 
of the random variable K. We illustrate the derivation of one of 
the "45° lines" partitioning the sample space. Suppose that
the experimental value for D lies between 1 and 2 (i.e., 1 ≤ d ≤ 2). 
Then, for "sufficiently small" θ, K will equal 1;
otherwise, K will equal 2. The switch from K = 1 to K = 2 will 
occur at the point at which d + θ = 2. Thus, the switch occurs
along the line 

   θ = 2 - d

 Joint Probability Distribution
 
 Without knowing the exact distribution for D, we can make some 
further progress in our analysis of the effects of
quantization. From physical considerations, the following 
assumptions seem reasonable:

 1. The random variables D and θ are independent.
 2. θ is uniformly distributed over [0, 1]. (Why?)

Thus, we will limit our knowledge of the joint (D, θ) pdf to say 
that it takes the following form: 

   f_{D,θ}(d, θ) = f_D(d)   for d ≥ 0, 0 ≤ θ ≤ 1

 Working in the Joint Sample Space
 
 
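 Since entire subregions of the (D, θ) sample space give rise to 
exactly one value of K, we can deal directly with the pmf
for K, not the cdf. Given the assumptions regarding f_{D,θ}(d, θ) 
above, if the cdf for D is known, say F_D(·), the probability
mass function for K is readily computed: 

   P{K = k} = ∫_k^{k+1} F_D(x) dx - ∫_{k-1}^{k} F_D(x) dx,   k = 0, 1, 2, ...

(with F_D(x) = 0 for x < 0). It can also be shown that

   E[K] = E[D]   (3.31)
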
 Thus, any statistical procedure using experimental data to 
estimate E[K] should also yield an (unbiased) estimate of E[D].
For such a procedure to remain unbiased, it is necessary that 
zero-mileage journeys be recorded and used in the statistical
tabulations.
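
 The claim that quantization leaves the mean unchanged can be checked numerically for a particular distance distribution. The sketch below assumes that K is the largest integer not exceeding D + θ, with θ uniform on [0, 1) and independent of D, so that P{K ≤ k} is the integral of the cdf of D over [k, k + 1]. The exponential distribution with mean 1.7 miles is a hypothetical choice made only for illustration.

```python
import math


def recorded_mileage_pmf(cdf_d, k, n=2000):
    """P{K = k} for K = floor(D + theta), theta uniform on [0, 1) and independent of D."""
    def integral(lo, hi):
        # Midpoint-rule integral of the cdf of D over [lo, hi]; zero when hi <= 0.
        if hi <= 0.0:
            return 0.0
        h = (hi - lo) / n
        return sum(cdf_d(lo + (i + 0.5) * h) for i in range(n)) * h

    # P{K <= k} minus P{K <= k - 1}
    return integral(k, k + 1) - integral(k - 1, k)


def exponential_cdf(mean):
    """cdf of a hypothetical exponentially distributed travel distance."""
    return lambda x: 1.0 - math.exp(-x / mean) if x > 0 else 0.0


pmf = [recorded_mileage_pmf(exponential_cdf(1.7), k) for k in range(60)]
mean_recorded = sum(k * p for k, p in enumerate(pmf))
```

 Note that P{K = 0} is positive here, so dropping zero-mileage journeys from the data would bias the estimate upward, as the text warns.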
 
 Question: Given the foregoing analysis, can one lump together 
recorded mileages quantized in tenths of miles with those
quantized in miles?
 
 Further work: Problem 3.11 [for a proof of (3.31)]; Problem 3.12 
(for an application of these ideas to time measurements).
 
 1 Typical empirical relationships found among speed, distance, and 
time are described later in this chapter.
 
 2 See, for example, R. C. Larson, K. W. Colton, and G. C. Larson, 
"Evaluating an Implemented AVM System: The St. Louis
Experience (Phase I)," Public Systems Evaluation, Inc., 
Cambridge, Mass., 1976.
 
 3 See Problems 3.24 and 3.25 for realistic variations to the 
right-angle distance (due to discreteness of streets and one-way
streets).
 |