python,random,frequency-distribution
It turns out you can use random.triangular(0, 1, 0) for this. From the documentation (https://docs.python.org/2/library/random.html): random.triangular(low, high, mode): Return a random floating point number N such that low <= N <= high and with the specified mode between those bounds. Histogram made with matplotlib: bins = [0.1 * i for...
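To see what random.triangular(0, 1, 0) produces without plotting, here is a minimal sketch that bins the samples by hand. The function name triangular_histogram, the bin count, and the seed are illustrative assumptions, not part of the original answer.

```python
import random

def triangular_histogram(n_samples, n_bins=10, seed=None):
    """Draw from triangular(0, 1, 0) and count samples per equal-width bin."""
    rng = random.Random(seed)  # private generator so the seed is local
    counts = [0] * n_bins
    for _ in range(n_samples):
        x = rng.triangular(0, 1, 0)  # low=0, high=1, mode=0
        # Clamp so a value of exactly 1.0 falls into the last bin
        index = min(int(x * n_bins), n_bins - 1)
        counts[index] += 1
    return counts

counts = triangular_histogram(100_000, seed=42)
for i, c in enumerate(counts):
    print(f"[{i / 10:.1f}, {(i + 1) / 10:.1f}): {c}")
```

With the mode at the lower bound, the counts should fall off roughly linearly from the first bin to the last.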
r,table,data.table,frequency-distribution
To directly get the counts for each group, using data.table 1.9.3 (in earlier versions you have to omit the by argument and add allow.cartesian=TRUE): setkey(test, Month, Complaint) # may need to also add allow.cartesian, depending on actual data test[CJ(unique(Month), unique(Complaint)), .N, by = .EACHI] # Month Complaint N # 1:...
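For comparison, the same idea (counting every Month/Complaint combination, including ones that never occur) can be sketched in plain Python. The rows below are hypothetical sample data, and count_all_combinations is an illustrative name, not something from the answer.

```python
from collections import Counter
from itertools import product

def count_all_combinations(rows):
    """Count (month, complaint) pairs, reporting 0 for pairs that never occur."""
    observed = Counter(rows)
    months = sorted({month for month, _ in rows})
    complaints = sorted({complaint for _, complaint in rows})
    # product() plays the role of the cross join over the unique values
    return {pair: observed[pair] for pair in product(months, complaints)}

# Hypothetical data, not taken from the question
rows = [("Jan", "noise"), ("Jan", "noise"), ("Feb", "billing")]
table = count_all_combinations(rows)
for pair, n in table.items():
    print(pair, n)
```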
string,r,perl,frequency-distribution
For the record, for x = "TGAGGTAGTAGTTTGTGCTGTTAT TAGTAGTTTGTGCTGTTA TGAGGTAGTAGTTTGTAC TGAGAACTGAATTCCATAGG" a Biostrings solution is library(Biostrings) consensusMatrix(DNAStringSet(strsplit(x, "\n")[[1]])) which will be fast for millions of sequences....
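As a rough illustration of what a consensus matrix holds (per-position letter counts across sequences), here is a plain-Python sketch. It is not the Biostrings implementation; the function name and the restriction to the alphabet "ACGT" are assumptions made for the example.

```python
def consensus_matrix(sequences, alphabet="ACGT"):
    """Return {letter: [count at position 0, count at position 1, ...]}."""
    width = max(len(seq) for seq in sequences)
    matrix = {letter: [0] * width for letter in alphabet}
    for seq in sequences:
        for pos, letter in enumerate(seq):
            if letter in matrix:  # ignore letters outside the alphabet
                matrix[letter][pos] += 1
    return matrix

sequences = [
    "TGAGGTAGTAGTTTGTGCTGTTAT",
    "TAGTAGTTTGTGCTGTTA",
    "TGAGGTAGTAGTTTGTAC",
    "TGAGAACTGAATTCCATAGG",
]
matrix = consensus_matrix(sequences)
print({letter: row[:3] for letter, row in matrix.items()})
```

Shorter sequences simply contribute nothing to the trailing positions.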
javascript,arrays,frequency-distribution
var dif = max - min; var a = dif/bucked; for (var i=min; i<=max;i=i+a){ i; } ...
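The same bucketing idea can be sketched in Python. One difference from the loop above, stated as an assumption about intent: edges are computed by multiplication rather than repeated addition, so floating-point error does not accumulate and the last edge is pinned to the maximum. The names bucket_edges and bucket_counts are illustrative.

```python
def bucket_edges(lo, hi, n_buckets):
    """Return n_buckets + 1 evenly spaced boundaries from lo to hi."""
    if n_buckets <= 0 or hi <= lo:
        raise ValueError("need n_buckets > 0 and hi > lo")
    width = (hi - lo) / n_buckets
    edges = [lo + i * width for i in range(n_buckets)]
    edges.append(hi)  # pin the final edge exactly to hi
    return edges

def bucket_counts(values, lo, hi, n_buckets):
    """Count how many values fall into each bucket; hi lands in the last one."""
    if n_buckets <= 0 or hi <= lo:
        raise ValueError("need n_buckets > 0 and hi > lo")
    width = (hi - lo) / n_buckets
    counts = [0] * n_buckets
    for v in values:
        if lo <= v <= hi:
            index = min(int((v - lo) / width), n_buckets - 1)
            counts[index] += 1
    return counts

print(bucket_edges(0, 10, 5))
print(bucket_counts([0, 1, 2, 9, 10], 0, 10, 5))
```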
r,ggplot2,rstudio,frequency-distribution,stackedbarseries
The easiest place to drop them is when you set the data set for the plot: p <- ggplot(subset(ds, !is.na(attendF)), aes(x=yearF, fill=attendF)) Here I've created some sample data (which would have been helpful in the initial question) and re-run your plotting commands after subsetting: ds<-data.frame( id=rep(1:100, each=4), yearF=factor(rep(2001:2004, 100)), attendF=sample(1:8,...
r,frequency,variance,frequency-distribution
One option is using data.table. Convert the data.frame to a data.table (setDT), then get the var of "Value" (expanded by "Count") and the sum of "Count" by "Group": library(data.table) setDT(df1)[, list(GroupVariance=var(rep(Value, Count)), TotalCount=sum(Count)) , by = Group] # Group GroupVariance TotalCount #1: A 2.7 5 #2: B 4.0 4 A similar way using dplyr is...
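The key step above is var(rep(Value, Count)): each value is repeated by its count before the sample variance is taken. A minimal Python sketch of that step follows; the input rows are hypothetical (the question's data is not shown here), and group_variance is an illustrative name.

```python
from collections import defaultdict
from statistics import variance

def group_variance(rows):
    """rows: (group, value, count) tuples -> {group: (sample variance, total count)}."""
    expanded = defaultdict(list)
    for group, value, count in rows:
        # Same idea as rep(Value, Count): repeat each value `count` times
        expanded[group].extend([value] * count)
    # statistics.variance is the sample variance (n - 1 denominator), like R's var
    return {g: (variance(vals), len(vals)) for g, vals in expanded.items()}

# Hypothetical data, not the data from the question
rows = [("A", 1, 2), ("A", 4, 1), ("B", 2, 1), ("B", 6, 1)]
result = group_variance(rows)
print(result)
```

Note that each group needs at least two expanded values, otherwise statistics.variance raises an error.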
r,matrix,probability,apply,frequency-distribution
Here's an attempt, but on a dataframe instead of a matrix: df <- data.frame(replicate(100,sample(1:10, 10e4, rep=TRUE))) I tried a dplyr approach: library(dplyr) df %>% mutate(rs = rowSums(.)) %>% mutate_each(funs(. / rs), -rs) %>% select(-rs) Here are the results: library(microbenchmark) mbm = microbenchmark( dplyr = df %>% mutate(rs = rowSums(.)) %>%...
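The operation being benchmarked is dividing every entry by its row sum. A plain-Python sketch of that operation is below; the handling of all-zero rows is an assumption added for the example, and row_proportions is an illustrative name.

```python
def row_proportions(matrix):
    """Divide each entry by its row sum; all-zero rows are returned as zeros."""
    result = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            result.append([0.0] * len(row))  # avoid division by zero
        else:
            result.append([value / total for value in row])
    return result

proportions = row_proportions([[1, 3], [2, 2]])
print(proportions)
```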