By using process substitution (thanks Tom Fenech), both commands are seen as files. Then using cat we can concatenate these "files" together and output to STDOUT. cat <(awk '/^#/' file) <(awk '!/^#/' file | shuf -n 10) Input #blah de blah 1 2 3 4 5 6 7 8 9...

language-agnostic,iterator,random-sample

Encryption is reversible, hence an encryption is a one-to-one mapping from a set onto itself. Pick a block cypher with a large enough block size to cover the number of items you have. Encrypt the numbers 0, 1, 2, 3, 4, ... This will give you a non-repeating ordered list...
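A minimal Python sketch of the idea, using a toy 4-round Feistel network as the "block cipher" (the round function and key constant below are arbitrary stand-ins, not a real cipher): any Feistel construction is a bijection on fixed-width integers, so encrypting 0, 1, 2, ... yields a non-repeating sequence.

```python
def feistel_encrypt(value, rounds=4, key=0x9E3779B9):
    # Split a 32-bit value into two 16-bit halves.
    left, right = value >> 16, value & 0xFFFF
    for r in range(rounds):
        # Any deterministic round function works; this one is arbitrary.
        f = (right * key + r) & 0xFFFF
        left, right = right, left ^ f
    return (left << 16) | right

# Encrypting consecutive integers gives distinct outputs,
# because a Feistel network is invertible for any round function.
sequence = [feistel_encrypt(i) for i in range(1000)]
assert len(set(sequence)) == 1000
```

A real application would use an established cipher with an appropriate block size instead of this toy network.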

algorithm,hadoop,mapreduce,sample,random-sample

I'm not sure what "elegant" means, but perhaps you're interested in something analogous to reservoir sampling. Let k be the size of the sample and initialize a k-element array with nulls. The elements from which we are sampling arrive one by one. When the jth (counting from 1) element arrives,...
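The reservoir sampling step described above can be sketched in Python like this (a standard Algorithm R implementation, not code from the question):

```python
import random

def reservoir_sample(iterable, k):
    # Keep the first k items; when the jth item (counting from 1)
    # arrives, it replaces a random slot with probability k/j.
    reservoir = []
    for j, item in enumerate(iterable, start=1):
        if j <= k:
            reservoir.append(item)
        else:
            slot = random.randrange(j)  # uniform in [0, j)
            if slot < k:
                reservoir[slot] = item
    return reservoir

sample = reservoir_sample(range(10_000), 5)
```

Each element of the stream ends up in the final sample with probability k/n, without knowing n in advance.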

python,numpy,random,pandas,random-sample

The number of items in your resulting sample (n attempts, each independently with probability p) has a binomial distribution and thus can rapidly be randomly generated, e.g. with numpy: sample_size = numpy.random.binomial(len(population), p) Now, the_sample = random.sample(population, sample_size) gives you exactly what you desire -- the equivalent of randomly, independently...
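A self-contained sketch of those two steps, substituting a sum of Bernoulli trials for numpy.random.binomial so only the standard library is needed (the population and p below are made up for illustration):

```python
import random

population = list(range(1000))
p = 0.05

# The sample size follows Binomial(len(population), p); summing n
# independent Bernoulli trials is a stdlib stand-in for
# numpy.random.binomial(len(population), p).
sample_size = sum(random.random() < p for _ in range(len(population)))

# Now draw exactly that many distinct items uniformly.
the_sample = random.sample(population, sample_size)
```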

Initially give each track a weight w, e.g. 10 - a vote up increases this, down reduces it (but never to 0). Then when deciding which track to play next: Calculate the total of all the weights, generate a random number between 0 and this total, and step through the...
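A Python sketch of that selection step, with made-up track names and weights:

```python
import random

# Hypothetical track weights; a vote up/down would adjust these values.
weights = {"track_a": 10, "track_b": 15, "track_c": 5}

def pick_track(weights):
    # Draw a number in [0, total) and step through the cumulative sum.
    total = sum(weights.values())
    r = random.uniform(0, total)
    cumulative = 0
    for track, w in weights.items():
        cumulative += w
        if r < cumulative:
            return track
    return track  # guard against floating-point edge cases

print(pick_track(weights))
```

Each track is chosen with probability proportional to its current weight.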

If you don't have a hard dependency on mySeq really being a lazy sequence, you can just make it an array instead. let ran = System.Random(10001100) let mySeq = Array.init 10 (fun i -> ran.Next()) for time in 0..4 do for element in mySeq do printf "%O " element printf...

python,algorithm,random-sample

If you know in advance the total number of items that will be yielded by an iterable population, it is possible to yield the items of a sample of population as you come to them (not only after reaching the end). If you don't know the population size ahead of...
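When the population size is known up front, the "yield items as you come to them" approach can be sketched with Knuth's selection sampling (Algorithm S); this is an illustration of the technique, not the answerer's actual code:

```python
import random

def ordered_sample(iterable, population_size, k):
    # Knuth's Algorithm S: take each item with probability
    # (still needed) / (still remaining), which yields exactly
    # k items, in their original order.
    needed, remaining = k, population_size
    for item in iterable:
        if random.random() * remaining < needed:
            yield item
            needed -= 1
            if needed == 0:
                return
        remaining -= 1

sample = list(ordered_sample(range(100), 100, 10))
```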

matlab,distribution,sampling,random-sample

So, you can use this, for example: y = 0.8 + rand*0.4; this will generate a random number between 0.8 and 1.2, because rand creates a uniform distribution on [0, 1], so 0.8 + rand*0.4 is uniform on [0.8, 1.2] ;) ...

java,statistics,boxplot,random-sample

Suppose that the values min, a, median, b, max separate the quartiles of the distribution (http://en.wikipedia.org/wiki/Quartile): static public double next(Random rnd, double median, double a, double b, double min, double max) { double d = -3; while (d > 2.698 || d < -2.698) { d = rnd.nextGaussian(); } if (Math.abs(d) < 0.6745)...

A simple/easy way to do this is to create an array of integers from 0 to n - 1 where n is the length of the first array. Shuffle this array, and then use the values in it as indices for iteration over the original array. There's no standard shuffle...
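A Python sketch of that index-shuffle idea (the original answer targets a language it says lacks a standard shuffle; Python's random.shuffle stands in here, and the two arrays are made up for illustration):

```python
import random

first = ["a", "b", "c", "d"]
second = [10, 20, 30, 40]

# Shuffle the indices 0..n-1, then use them to visit the
# original arrays in the same random order.
indices = list(range(len(first)))
random.shuffle(indices)
for i in indices:
    print(first[i], second[i])
```

Shuffling indices rather than the data itself keeps parallel arrays aligned.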

The problem lies in hist, not in sample. You can check that doing: > table(sample(0:15, 10000, replace=T)) 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 634 642 664 654 628 598 633 642 647 625 587 577 618 645 615 591 From...

function,macros,sas,distribution,random-sample

What you need is the inverse cumulative distribution function. This is the function that is the inverse of the normalized integral of the distribution over the entire domain. So 0% maps to your most negative possible value and 100% to your most positive. Practically though you would clamp to something...
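For a distribution with a closed-form inverse CDF, the idea looks like this in Python (the exponential distribution is chosen purely as an example):

```python
import math
import random

# Inverse-transform sampling: for Exp(lam) the CDF is
# F(x) = 1 - exp(-lam*x), so its inverse is F^-1(u) = -ln(1 - u) / lam.
# Feeding uniform u in [0, 1) through F^-1 gives Exp(lam) samples.
def sample_exponential(lam):
    u = random.random()
    return -math.log(1.0 - u) / lam

samples = [sample_exponential(2.0) for _ in range(100_000)]
# The mean of Exp(lam) is 1/lam, so this should be close to 0.5.
```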

I think you may be misunderstanding what rdirichlet(...) does (BTW: you do have to spell it correctly...). rdirichlet(n,alpha) returns a matrix with n rows, and length(alpha) columns. Each row corresponds to a random deviate taken from the gamma distribution with scale parameter given by the corresponding element of alpha, normalized...

sql,sql-server,sample,random-sample

You want a stratified sample. I would recommend doing this by sorting the data by course code and doing an nth sample. Here is one method that works best if you have a large population size: select d.* from (select d.*, row_number() over (order by coursecode, newid()) as seqnum, count(*)...

If I understand your question correctly, I don't think the randi function is the way to start here. I would suggest the following procedure: Start with a list with 500*500 elements, with 7000 elements set to 1 and the rest to 0 Randomize the order of elements in the list...
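The same procedure sketched in Python (translating the MATLAB outline; the 500 x 500 grid and 7000 ones come from the question):

```python
import random

# Build a flat list with 7000 ones and the rest zeros,
# shuffle it, then reshape to 500 x 500.
rows, cols, n_ones = 500, 500, 7000
flat = [1] * n_ones + [0] * (rows * cols - n_ones)
random.shuffle(flat)
grid = [flat[r * cols:(r + 1) * cols] for r in range(rows)]
```

Every arrangement of the 7000 ones is equally likely, which is exactly the uniform placement the question asks for.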

python,string,python-2.7,random,random-sample

numbers = ["1", "2", "3"] letters = ["X", "Y", "Z"] from random import sample, shuffle samp = sample(letters,2)+sample(numbers*3,8) shuffle(samp) print("".join(samp)) 113332X2Z2 Or use choice and range: from random import sample, shuffle,choice samp = sample(letters,2)+[choice(numbers) for _ in range(8)] shuffle(samp) print("".join(samp)) 1212ZX1131 ...

java,algorithm,sorting,random,random-sample

Modified Fisher-Yates algorithm The shuffle solution can be improved, since you only have to shuffle the first k elements of the array. But that's still O(n) because the naïve shuffle implementation requires an array of size n, which needs to be initialized to the n values from 0 to n-1....
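The O(n) initialization can itself be avoided with a sparse variant of that modified Fisher-Yates shuffle, storing only the displaced slots in a hash map; here is a Python sketch of the idea (the answer itself is about Java):

```python
import random

def partial_fisher_yates(n, k):
    # Shuffle only the first k slots of the virtual array [0, 1, ..., n-1]:
    # at step i, swap slot i with a random slot j in [i, n). A dict records
    # displaced values, so the full n-element array is never materialised.
    swapped = {}
    result = []
    for i in range(k):
        j = random.randrange(i, n)
        result.append(swapped.get(j, j))  # value currently in slot j
        swapped[j] = swapped.get(i, i)    # move slot i's value into slot j
    return result

# k distinct values from a domain far too large to materialise.
sample = partial_fisher_yates(10**9, 5)
```

This runs in O(k) time and space regardless of n.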

copy,sample,julia-lang,deep-copy,random-sample

Presumably you only need to copy if the different copies will later mutate in different ways. If there's just breeding and selection with no mutation, then a reference to the "copied" individual would be sufficient. FYI deepcopy is (in current Julia releases) slow; if you need performance, you should write...

Here is some sample data: A <- seq_len(75) B <- rpois(75, 3) B <- B / sum(B) So now B is a probability vector for each element in A. To pull 25 samples, simply use sample(A, size = 25, replace = FALSE, prob = B). Fill the matrix as usual...

This doesn't answer your question about how to do this with the "sampling" package, but I've written a function called stratified that will do this for you. If you have "devtools" installed, you can load it like this: library(devtools) source_gist(6424112) Otherwise, just copy the code of the function from the...

I would sample once and turn the result into a data.frame, which can be passed to paste0: set.seed(42) do.call(paste0, as.data.frame(matrix(sample(LETTERS, 50, TRUE), ncol = 5))) #[1] "XLXTJ" "YSDVL" "HYZKA" "VGYRZ" "QMCAL" "NYNVY" "TZKAX" "DDXFQ" "RMLXZ" "SOVPQ" ...

matlab,random-sample,deterministic

As I stated in my comment, I don't really understand what you're asking. But, I will answer this as if you had asked it on codereview. The following is not good practice in MATLAB: A1=24; A2=23; A3=23; A4=23; A5=10; There are very few cases (if any), where you actually...

To make these match, you need two things: the seed used to generate the random number, and the formula used to generate it. For rannor (and I think also for rand, but I haven't seen confirmation of this), SAS uses the following algorithm (found in Pseudo-Random Numbers: Out of...

c++,algorithm,random-sample,primality-test

You might want to take a look at the Miller-Rabin primality test. In this test you use a series of "witness" values and perform some calculations. Each witness calculation gives a result of "composite" or "possibly prime". If you use k witnesses and they all give "possibly prime" results, the...
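A compact Python sketch of Miller-Rabin with k random witnesses (the standard textbook construction, not the answerer's code; the answer itself is about C++):

```python
import random

def is_probably_prime(n, k=20):
    # Miller-Rabin: write n - 1 = d * 2^r, then test k random witnesses.
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    r, d = 0, n - 1
    while d % 2 == 0:
        r += 1
        d //= 2
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True  # "possibly prime" for all k witnesses

print(is_probably_prime(2**61 - 1))  # a Mersenne prime
```

With k witnesses, a composite slips through with probability at most 4^-k.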

You can try n <- 2 df[with(df, transactionID %in% sample(unique(transactionID),n, replace=FALSE)),] # transactionID desc #1 1 a #2 1 d #3 1 a #17 8 f #18 8 d data df <- structure(list(transactionID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 6L, 7L, 7L, 7L,...

sql,postgresql,indexing,random-sample,postgresql-performance

After initializing max_id as max(id) - 1000 to leave room for 1000 rows, this should be using the index: UPDATE table SET test = true FROM (SELECT (random() * max_id)::bigint AS lower_bound) t WHERE id BETWEEN t.lower_bound AND t.lower_bound + 999; No need for the complicated structure with a CTE...

r,data.frame,plyr,random-sample

dat$x3 <- ave( dat$x2, dat$x1, FUN=sample) The way you have constructed the output (to have the same number of entries as there were rows of the dataframe) you will get permutations of x2 values within distinct values of x1. (Edited your code to make it run.)...

java,filewriter,random-sample,bufferedwriter,file-writing

In this section of your code while ((line = buf.readLine()) != null) { lineCopy=line; String [] LineArray=lineCopy.split(","); lineCopy=LineArray[0]; if (word.equals(lineCopy)) { System.out.println(line); writer1.write(line); } writer1.newLine(); } writer.close(); You should probably replace writer.close() with writer1.close(); since the only other writer variable that appears in your code is a local variable in your...

I'll throw my proposed solution in here as well: # for example a = np.random.random_integers(0, 500, size=(200,1000)) N = 200 result = np.zeros((200,1000)) ia = np.arange(result.size) tw = float(np.sum(a.ravel())) result.ravel()[np.random.choice(ia, p=a.ravel()/tw, size=N, replace=False)]=1 where a is the array of weights: that is, pick the indexes for the items to change...

opencv,random-sample,probability-density

As far as I know, OpenCV has no functions for your task, but using RNG::uniform you can generate samples as you want; take a look at this paper.

r,statistics,distribution,random-sample

x <- lapply(c(1:20000), function(x){ lapply(c(1:2), function(y) rnorm(50,2.5,3)) }) This produces 20000 paired samples, where each sample is composed of 50 observations from a N(2.5,3^2) distribution. Note that x is a list where each slot is a list of two vector of length 50. To t-test the samples, you'll need to...

In principle, you want to do this using expand.grid I believe. Using your example data, I worked out the basics here: dat <- data.frame(A = c(1, 4, 5, 3, NA, 5), B = c(6, 5, NA, 5, 3, 5), C = c(5, 3, 1, 5, 3, 7), D = c(5,...

arrays,random,julia-lang,random-sample

Use the StatsBase.jl package, i.e. Pkg.add("StatsBase") # Only do this once, obviously using StatsBase items = ["a", 2, 5, "h", "hello", 3] weights = [0.1, 0.1, 0.2, 0.2, 0.1, 0.3] sample(items, WeightVec(weights)) Or if you want to sample many: # With replacement my_samps = sample(items, WeightVec(weights), 10) # Without replacement...

Yeah, you should use random.sample(). It will make your code cleaner and as a bonus you get a performance increase. Performance issues with your loop solution: a) it has to check the output list before choosing any number, and b) it does rejection sampling, so the expected time will be higher...
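For comparison, the random.sample version in full (the population and k below are arbitrary):

```python
import random

population = range(10_000)

# random.sample draws k distinct items in one pass, with no rejection
# loop and no membership checks against the output list.
picked = random.sample(population, 100)
```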

One way is to assign a cumulative sum column to mtcars so you're not having to recalculate that all the time. mtcars$cumsum <- cumsum(mtcars$count) Car.ID <- function(x) { if (x < mtcars$cumsum[1]) { return(paste(rownames(mtcars)[1], x, sep = ":")) } else { row <- tail(which(mtcars$cumsum < x), n = 1) return(paste(rownames(mtcars)[row...

Here's a straightforward implementation of the rejection sampling. There may be a faster way to do the adjacency check than the query_pairs thing (which in this case also will check for collisions), since you only want to test if there is at least one pair within this distance threshold. import...

The only way to figure out what is fastest for you is to do a comparison of the different methods. In fact the loop appears to be very fast in this case! pop = randn(1,100); n = [1 3 10 6 2]; tic sr = @(n) sum(randsample(pop,n)); sum_sample = arrayfun(sr,n);...

python-2.7,numpy,random-sample

The way you're doing it is sound. However, you could use the more intuitive nonzero function: random.sample(visited.nonzero(), k) EDIT: As to the second question in your comment, you can invert the "zeroness" of your array: visited==0. You get: random.sample((visited==0).nonzero(), k) ...

I think your problem can be solved by generating the distribution in a reactive function like so: get_observations <- reactive( { return(rnorm(input$observations,mean=0,sd=1)) }) if (input$individual_obs) { rug(get_observations(), col = "red") } if (input$density) { dens <- density(get_observations(), kernel = input$kernel, adjust = input$bw_adjust) lines(dens, col = "blue") } get_observations will...

r,for-loop,web-scraping,random-sample

Use something like this. Loop over all the product index randomly. for (i in sample(1:x)){ <Your code here> # Sleep for 120 seconds Sys.sleep(120) } And if you want to do 10 at a time. Sleep for 120 seconds every 10 executions. n = 1 for (i in sample(1:x)){ #...

r,data.frame,repeat,random-sample

Basically your question boils down to how to replace randomly selected elements of your data with 0. You can do this pretty simply with runif, in this case replacing each value with 0 with probability 0.1: set.seed(144) data[-1] <- sapply(data[-1], function(x) ifelse(runif(length(x)) < 0.1, 0, x)) data # id X1...