Darren, if you build your framework once, you can draw multiple samples from it in a variety of ways. That is the only point I was making. To guarantee that no record is duplicated, divide your data into groups of almost equal size (there are ways to get exactly equal sizes, but we will go with an easy one).
Simple cases:
I just need 10% of the sample ... just do it once
I need to do it again, and duplicate observations don't matter ... use either of the two proposed methods
I need 10%, but with no duplicate observations ... your proposal requires checking that a record hasn't already been selected ... no big deal
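For the third case, one way to avoid the "has this record been picked already" check is to let NumPy draw without replacement. This is only a sketch: the table size of 1000 and the seed are assumptions I made up for illustration, not something from your data.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen only so the sketch is repeatable
n_records = 1000                 # hypothetical number of records in the table

# Draw 10% of the record indices; replace=False means no index can repeat
sample_idx = rng.choice(n_records, size=n_records // 10, replace=False)
```

The result is an array of row indices, so you would use it to select records from your table rather than treat it as the sample itself.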
Let's set up some scenarios:
>>> import numpy as np
>>> a = np.random.randint(0, 10, 1000)  # assign each of 1000 records to a group 0..9
>>> size = np.array([len(a[a==i]) for i in range(10)])
>>> size
array([ 93, 99, 100, 115, 108, 115, 86, 93, 94, 97])
>>> size.mean()
100.0
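As a side note, np.bincount should give the same group counts in one call. A minimal sketch, assuming the labels are non-negative integers; the seeded generator is my own choice for repeatability, so the counts will differ from the ones printed in the session.

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded only for repeatability
a = rng.integers(0, 10, 1000)         # group label 0..9 for each of 1000 records

# Count how many records fell into each of the 10 groups
size = np.bincount(a, minlength=10)
```

The mean of the counts is always 100.0 here, because 1000 records are split across 10 groups; the individual counts are what vary from run to run.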
For a 10% sample (100 records), draw half (50) from one of our 10 groups and the other half from a different group ... since the groups don't overlap, this ensures no duplicates
>>> d = np.concatenate((a[a==1][:50], a[a==3][:50])) # draw 50 from a==1 and a==3 or
>>> d
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3])
or any combination of
np.concatenate((a[a==X][:50], a[a==Y][:50])) # where X !=Y lots of combinations, absolutely no duplication
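Note that a[a==X] returns the group labels themselves, which is why d is all 1s and 3s. To pull the actual records you would probably want their row positions instead. A sketch of the same draw using np.flatnonzero; the variable names x, y and idx are mine, and the seed is only there for repeatability.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 10, 1000)   # group label for each record

x, y = 1, 3                     # any two different groups
# First 50 row positions from each group; the groups are disjoint,
# so no position can appear twice
idx = np.concatenate((np.flatnonzero(a == x)[:50],
                      np.flatnonzero(a == y)[:50]))
```

idx can then be used to index whatever table the labels belong to.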
You could also vary the proportions if you want. Here is an example that uses three proportions:
>>> #pull 30,60,10 out of a == 1, 3, 5 proportionally
>>> # each slice needs its colon ([:60], not [60]); np.concatenate joins pieces of unequal length
>>> np.concatenate((a[a==1][:30], a[a==3][:60], a[a==5][:10]))
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5])
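If you do this often, the slicing can be wrapped in a small helper. The function below is hypothetical (proportional_draw and its parameters are names I invented), and it differs from the session in two ways that are my assumptions: it returns row positions rather than labels, and it picks randomly within each group instead of taking the first n.

```python
import numpy as np

def proportional_draw(labels, shares, total, rng):
    """Draw `total` row positions; `shares` maps group -> fraction of total."""
    pieces = []
    for group, share in shares.items():
        members = np.flatnonzero(labels == group)   # rows belonging to this group
        n = int(round(share * total))               # how many to take from it
        if n > members.size:
            raise ValueError(f"group {group} has only {members.size} records")
        # Random pick inside the group, without repeats
        pieces.append(rng.choice(members, size=n, replace=False))
    return np.concatenate(pieces)

rng = np.random.default_rng(0)
a = rng.integers(0, 10, 1000)
idx = proportional_draw(a, {1: 0.3, 3: 0.6, 5: 0.1}, total=100, rng=rng)
```

Because the dictionary keys are distinct groups, the pieces cannot overlap, which is the same no-duplicates guarantee as before.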
Now if you want replacement (duplicates allowed), you can just replace the == with >=, <=, or combinations thereof, so that the groups overlap and the same record can be drawn more than once. In both cases you can vary your proportions by changing your slices. All of these examples used a uniform distribution, but the same principles apply to other distributions as well.
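To see the overlap idea in terms of row positions, here is a rough sketch. The two conditions (a <= 3 and a >= 2) are just an example I picked; they share groups 2 and 3, so a record from those groups can land in both pieces.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 10, 1000)

# The two conditions overlap on groups 2 and 3
first = np.flatnonzero(a <= 3)[:50]
second = np.flatnonzero(a >= 2)[:50]
idx = np.concatenate((first, second))

# Number of row positions that were drawn twice
n_repeats = idx.size - np.unique(idx).size
```

This allows duplicates but is not the same thing as independent random sampling with replacement, so treat it as an illustration of the slicing trick only.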