python,machine-learning,scikit-learn,classification,probability

I don't know much about how SVC works, so you may consider what is said in the comment to complete this answer. You have to consider that predict_proba will give you the categories in lexicographical order, as they appear in the classes_ attribute. You have this in the doc....

c++,c++11,random,boolean,probability

See std::bernoulli_distribution in the <random> header, aptly named after the Bernoulli distribution. std::random_device device; std::mt19937 gen(device()); std::bernoulli_distribution coin_flip(0.5); bool outcome = coin_flip(gen); ...

p1 + p2 + ... + pn = 1 p1 = p2 * x p2 = p3 * x ... p_n-1 = pn * x Solving this gives you: p1 + p2 + ... + pn = 1 (p2 * x) + (p3 * x) + ... + (pn *...

arrays,variables,random,permutation,probability

You've got an unsorted Array array with n elements. You've got two possible positions for where the local maxima could be. The local maxima could be either on the end or between the first and last element. Case 1: If you're looking at the element in either the first or...

Answer from @twalberg: If you're stuck with the existing algorithm, you probably want to set the probabilities at 33, 50 and 100% for each of the actions. In that case the first action has a 33% chance, but if it doesn't happen, then the second one will have a 50%...
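
A quick way to check the 33/50/100% figures is to multiply each conditional chance by the probability that no earlier action fired. This is only a sketch; the function name is illustrative and not part of the original answer.

```python
from fractions import Fraction

def overall_chances(conditional):
    """Turn sequential 'try each action in order' chances into overall chances."""
    remaining = Fraction(1)  # probability that no earlier action has fired
    overall = []
    for p in conditional:
        overall.append(remaining * p)
        remaining *= 1 - p
    return overall

# Conditional chances of 1/3, 1/2 and 1 for the three actions.
chances = overall_chances([Fraction(1, 3), Fraction(1, 2), Fraction(1)])
print(chances)
```

With these inputs each action ends up with the same overall chance of one third.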

Using a normalized gaussian distribution I'd do something like this: public class ClusterRandom{ Random dice = new Random(); int mRange; int mWidth = 1; int mMean; public ClusterRandom(int range, int startingMean, int...width){ mRange = range; mMean = startingMean; if(width.length > 0) mWidth = width[0]; } public int nextInt(){ int pick;...

You can't; std::discrete_distribution has no such function. You can get the probabilities but not set them, so the only way is to reinitialize the discrete_distribution (you could probably keep a vector of predefined distributions).

A simple approach could be as follows. No need to *2, because probabilities will be the same. String giantRat []={"Bandage", "Healing Potion", "Minor Healing Potion", "Rat Teeth", "Fur", "Rat Tail", ""}; int[] a = {1,1,1,6,8,3,5}; int sum = 0; for(int i: a) sum += i; Random r = new Random();...

Creating a "Dicebag" is quite simple; it takes little to no effort at all to write it. However, there are a few things that need to be described before continuing. A die(plural "dice") can never be zero or negative, thus when writing the class we need to prepare for that...

php,key,probability,probability-theory

Probability of a collision is very low in your case (though possible). Counting all possible values of image name: (9999999-1000000+1)^3 == 7.29 * 10^20. Hint: you may increase this value by generating numbers between 0 and 9999999 and left-padding them with zeros while converting to strings, e.g.: sprintf("%07d", $number) mt_rand...

Put 70 zeros (NxN - M) and 30 ones (M) into a vector. Shuffle the vector. Iterate through and map each index k to 2-d indices via i = k / 10 and j = k % 10 for your example (use N as the divisor more generally). ADDENDUM After...
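
The shuffle-and-map idea above can be sketched in Python as follows. The function name and the `rng` parameter are assumptions made for illustration, and slicing the flat list into rows is equivalent to mapping index k to (k // N, k % N).

```python
import random

def random_binary_grid(n, m, rng=random):
    """Return an n x n grid with exactly m ones placed in uniformly random cells."""
    cells = [1] * m + [0] * (n * n - m)
    rng.shuffle(cells)
    # Flat index k lands in row k // n, column k % n.
    return [cells[row * n:(row + 1) * n] for row in range(n)]

grid = random_binary_grid(10, 30)
for row in grid:
    print(row)
```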

From what I understand, you want a function that takes a population size, and a given probability of an individual event, and delivers a random outcome? So something like: public int Outcome(int p, int n){ var random = new Random(); int count = 0; int rnd = 0; for (var...

You need to construct a deterministic function that generates p_diabetes as a function of your predictors. The safest way to do this is via a logit-linear transformation. For example: intercept = pymc.Normal('intercept', 0, 0.01, value=0) beta_race = pymc.Normal('beta_race', 0, 0.01, value=np.zeros(4)) beta_bmi = pymc.Normal('beta_bmi', 0, 0.01, value=0) @pymc.deterministic def p_diabetes(b0=intercept,...

matlab,vector,probability,entropy

You can generate a vector p of 3 real numbers between zero and one p = rand(1,3); then normalize p p = p / sum(p); Then p(1) + p(2) + p(3) is 1. EDIT: to respond to OP's comment N = 100; p = rand(N, 3); for k = 1:...
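
For comparison, the same draw-then-normalize step looks like this in Python. This is a sketch only; the helper name is hypothetical, and it ignores the vanishingly unlikely case where every draw is exactly zero.

```python
import random

def random_probability_vector(k, rng=random):
    """Draw k uniform numbers in [0, 1) and rescale them so they sum to 1."""
    raw = [rng.random() for _ in range(k)]
    total = sum(raw)
    return [value / total for value in raw]

p = random_probability_vector(3)
print(p, sum(p))
```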

javascript,probability,bayesian-networks

The following seems to do the job up to marginalise. It creates a table with rows that have members and a value. Each row in a table must have the same number of members. It doesn't implement multiplication yet (real work has to be done sometime…). Note that in ECMAScript,...

You could use the moment function from scipy. It calculates the n-th central moment of your data. You could also define your own function, which could look something like this: def nmoment(x, counts, c, n): return np.sum((x-c)**n*counts) / np.sum(counts) In that function, c is meant to be the point around...

What you want is calculate the expected value of the function. This can be done recursively. I assume you have the rules in a tree-like data structure. Then the initial call would just be root.CalculateExpectedValue(). There are three kinds of nodes: Leaf nodes (that specify an actual value). CalculateExpectedValue() should...

There are probably a lot of different (and good) answers, but in my humble opinion, the common characteristic of probabilistic data structures is that they give you an approximate answer, not a precise one. How many items are here? About 1523425 with probability of 99% Update: Quick search produced link to decent...

r,matrix,probability,apply,frequency-distribution

Here's an attempt, but on a dataframe instead of a matrix: df <- data.frame(replicate(100,sample(1:10, 10e4, rep=TRUE))) I tried a dplyr approach: library(dplyr) df %>% mutate(rs = rowSums(.)) %>% mutate_each(funs(. / rs), -rs) %>% select(-rs) Here are the results: library(microbenchmark) mbm = microbenchmark( dplyr = df %>% mutate(rs = rowSums(.)) %>%...

javascript,random,probability,entropy

You're up against something called the birthday problem: even though there are 366 possible birthdays, when you get only 26 people in a room, the chance that some pair will have the same birthday is better than 50-50. In general, collisions are likely when your numbers approach the square root...
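
The birthday figure is easy to verify numerically. The sketch below assumes every day is equally likely, which is a simplification, and the function name is illustrative.

```python
def collision_probability(people, days=366):
    """Chance that at least two of `people` share one of `days` equally likely values."""
    no_collision = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k values already taken.
        no_collision *= (days - k) / days
    return 1.0 - no_collision

print(collision_probability(26))
```

The same function can be called with a much larger `days` value to estimate collisions among random identifiers.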

Another entry for the brute force method, using a list of Integers(dice sides) to handle multiple die types. The advantage is that if you want a lot of probabilities, you can run it once then just query the various probabilities. The disadvantage is that as a brute force method it's...

random,numbers,distribution,probability

The most straightforward way that I can see is this. Assuming that you have a large number of points {f(X1),--,f(Xn)}, plot them as a distribution and fit a generalized Gaussian distribution curve through them. After this, you can use rejection sampling to generate further numbers from the same distribution.

rand() is a very bad RNG; it would be OK for implementing tic-tac-toe, but not for any serious business. If you can't use the C++11 random module, you can still take advantage of Boost.Random, which works with C++03 too. Browse the generators page and look for the best fit....

algorithm,unique,probability,cardinality,hyperloglog

HyperLogLog itself is quite simple, once you understand it (and already have a hash function). I have implemented a didactic version in Python with 5 lines for element insertion and another 6 lines for estimation: from mmhash import mmhash from math import log from base64 import b64encode class HyperLogLog: def...

python,numpy,random,generator,probability

Let me take a small example first In [1]: import numpy as np In [2]: import math In [3]: alpha = 0.1 In [4]: n = 5 In [5]: tmp = [1. / (math.pow(float(i), alpha)) for i in range(1, n+1)] In [6]: zeta = reduce(lambda sums, x: sums + [sums[-1]...

Add up the total number of items in _cachedLoot, then divide the amount of the item in question by the total. double total = _cachedLoot.Values.Sum(n => (int)n); return _cachedLoot[name] / total; (Note the cast to double - this is required to prevent integer division.) If you want to return a...

phyper(5, 8, 92, 30) gives the probability of drawing five or fewer red marbles. 1 - phyper(5, 8, 92, 30) thus returns the probability of getting six or more red marbles Since you want the probability of getting five or more (i.e. more than 4) red marbles, you should use...
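
To see the off-by-one concretely, here is a small Python sketch of the hypergeometric CDF. The argument order is chosen to mirror the R call as read above (q, number of red, number of non-red, number drawn); the function name is an assumption, not an existing library call.

```python
from math import comb

def hypergeom_cdf(q, m, n, k):
    """P(X <= q) when drawing k marbles without replacement from m red and n non-red."""
    total = comb(m + n, k)
    upper = min(q, m, k)
    return sum(comb(m, x) * comb(n, k - x) for x in range(upper + 1)) / total

at_least_6 = 1 - hypergeom_cdf(5, 8, 92, 30)  # six or more red marbles
at_least_5 = 1 - hypergeom_cdf(4, 8, 92, 30)  # five or more, i.e. more than 4
print(at_least_6, at_least_5)
```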

binary,classification,probability,naivebayes

Yes, it is normal to mix continuous and discrete variables within the same model. Consider the following example. Suppose I have two random variables: T - the temperature today D - the day of the week Note T is continuous and D is discrete. Suppose I want to predict whether...

math,probability,sha1,hash-collision

@Teepeemm has correctly answered the related question ‘given a particular sequence of 8 hex digits, what is the chance of another SHA-1 hash appearing with the same 8 digits?’ It's a very small number. What's at stake in this question, though, is a different question: ‘given a large number of...

matlab,probability,markov-chains

Your code A(S,S)*A(S,C) + A(S,R)*A(R,C) + A(S,C)*A(C,C) (i.e. sum over all possible intermediate states, or Chapman-Kolmogorov equation) is just matrix multiplication: A(S,:)*A(:,C) In general, A2 = A^2 gives the probability of all such double transitions, and An = A^n is the probability of n-order transitions (see for example here). So...
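
The equivalence between the hand-written sum and the matrix product can be checked with a few lines of Python. The transition matrix below is made up purely for illustration, and `mat_mul` is a hypothetical helper.

```python
def mat_mul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Hypothetical 3-state chain; the probabilities are invented for this example.
S, R, C = 0, 1, 2
A = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.4, 0.5]]

A2 = mat_mul(A, A)  # two-step transition probabilities
by_hand = A[S][S] * A[S][C] + A[S][R] * A[R][C] + A[S][C] * A[C][C]
print(A2[S][C], by_hand)
```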

The following should work. non.part2$p_x1 <- predict(probit, yourDataToPredictOn, type = "response") ...

sql,select,random,mysqli,probability

let's say your rank column has these three values: 3 top 2 normal 1 low Then what you can do is: $rand = rand()%3 +1; $result = mysqli_query($link, " SELECT * FROM ( SELECT * FROM objects ORDER BY RAND() ) as t WHERE rank >= {$rand} LIMIT 1" );...

probability,scilab,cumulative-sum

The sum of all the probabilities of PMF is not 1 You could not possibly add all the probabilities, because the geometric distribution assigns nonzero probabilities to all positive integers. If you run the sum up to n=10, the sum of probabilities is appreciably less than 1. If you...
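
A short numeric check makes the point about the truncated sum. The success probability p = 0.2 is an assumed value (the original does not state one), and the support is taken to be the positive integers as described above.

```python
def geometric_partial_sum(p, n):
    """Sum of P(X = k) = (1 - p)**(k - 1) * p for k = 1..n."""
    return sum((1 - p) ** (k - 1) * p for k in range(1, n + 1))

p = 0.2  # assumed success probability, for illustration only
partial = geometric_partial_sum(p, 10)
closed_form = 1 - (1 - p) ** 10  # the missing mass is (1 - p)**n
print(partial, closed_form)
```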

java,random,double,bit-manipulation,probability

Because nextDouble works like this: (source) public double nextDouble() { return (((long) next(26) << 27) + next(27)) / (double) (1L << 53); } next(x) makes x random bits. Now why does this matter? Because about half the numbers generated by the first part (before the division) are less than 1L...

This may be done in PDL (at scale) using the vsearch function. use strict; use warnings; use PDL; my @x = qw( a b c d ); my $pdf = pdl( 0.1, 0.4, 0.4, 0.1 ); # vsearch requires a CDF, my $cdf = $pdf->dcumusumover; $cdf /= $cdf->max; # $sample...

sql,sql-server,distribution,probability

SQL-Server does not incorporate a lot of statistical functions. tinv is not present in SQL-Server. The only way to add a tinv function is to use a CLR-Function. Thus, the problem reduces to "How do I calculate tinv with the subset of C# allowed in SQL-Server?". If you're...

r,algorithm,function,random,probability

A quick google search found this: http://blog.revolutionanalytics.com/2009/02/how-to-choose-a-random-number-in-r.html random_value <- sample(values, 1) ...

This is the approach I would take, using a for() loop and if statements for clarity (these could be collapsed and vectorized if efficiency is of utmost importance): df <- "Sam1 Sam2 Sam3 Sam4 Sam5 Prb1 0 0 1 2 3 Prb2 0 0 1 2 2 Prb3 0 1...

python,combinations,permutation,probability

A step by step way to do this would be pairs = {} for first in range(1,7): for second in range(1,7): total = first + second if total in pairs: # If sum exists, add this tuple to the list for this key. pairs[total] += [(first,second)] else: # If sum...

Using numpy and the built-in collections module you can greatly simplify the code: import numpy as np import collections sample_ips = [ "131.084.001.031", "131.084.001.031", "131.284.001.031", "131.284.001.031", "131.284.001.000", ] C = collections.Counter(sample_ips) counts = np.array(list(C.values()),dtype=float) prob = counts/counts.sum() shannon_entropy = (-prob*np.log2(prob)).sum() print (shannon_entropy) ...

math,machine-learning,probability,bayesian

What you want is the probability of having the right fish given that the detector beeps, which is P(A|B). The P(B|A) = 9999/10000 is the probability of the detector beeping given you have the right fish. However, we don't know if the fish you have is the right one. All...

Suppose my population has n marbles, and only 1% of them are red. In a sample of 30 draws, what's the probability that I draw at least 1 red marble? You're right that the probability of at least 1 red marble is 1-Pr(no marbles); for a binomial, it's actually...
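
As a numeric illustration of the complement rule, the sketch below treats the 30 draws as independent with a 1% chance of red each. That is the binomial assumption; the function name is illustrative.

```python
def at_least_one(p, draws):
    """P(at least one success) = 1 - P(no successes) for independent draws."""
    return 1 - (1 - p) ** draws

prob = at_least_one(0.01, 30)
print(round(prob, 4))
```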

You could use a hit and miss approach (often used in probability simulations to randomly choose elements which satisfy certain constraints). The performance is acceptable unless np is too close to 1. Here is a Python implementation, which should be easy enough to translate to other languages: from random import...

python,algorithm,permutation,probability

import itertools mystring = 'abcde' for i in range(1,len(mystring)+1): for combo in itertools.combinations(mystring, i): print(''.join(combo)) Output: a b c d e ab ac ad ae bc bd be cd ce de abc abd abe acd ace ade bcd bce bde cde abcd abce abde acde bcde abcde ...

Just add the probabilities into sample using prob sample(c("sp.1", "sp.2", "sp.3"), 2, prob=c(1,2,3)) to repeat, you could wrap in replicate, e.g. 100 times: replicate(100, sample(c("sp.1", "sp.2", "sp.3"), 2, prob=c(1,2,3))) ...

Thanks for the clarification. From what I can tell, you don't have enough information to solve this problem properly. Specifically, you need to have some estimate of the dependence of power from one time step to the next. The longer the time step, the less the dependence; if the steps...

matlab,probability,truncate,cdf

I would suggest first using the Matlab truncate function to adjust your distribution: pd = makedist('poiss') trunc = truncate(pd,1,3) for Poisson, it can only be positive. set a discrete range: x = 0:.1:4; distribution = pdf(trunc,x); cummulative = cdf(trunc,x); alternatively, you could integrate the pdf function using matlab integrate...

As I stated in the comments; generate random number with RAND from 0 to 1, compare with the probability. If it is bigger then it is 0, else 1. =IF(RAND()>=A1,0,1) ...

python,matlab,numpy,histogram,probability

The bins are equally spaced already. To get probabilities out of a histogram you have to normalize (i.e. divide by the sum over all histogram values): probs = probs / np.sum(probs) ...

r,statistics,probability,prediction,calibration

The warning is telling you that predict.gam doesn't recognize the value you passed to the type parameter. Since it didn't understand, it decided to use the default value of type, which is "terms". Note that predict.gam with type="terms" returns information about the model terms, not probabilities. Hence the output values...

python,python-2.7,random,probability

As @jgritty said earlier, your assumption is wrong. The probability would not be 1/10000 because you are selecting from two different sets of numbers at the same time, which doesn't mean that you are picking a number from a set of numbers twice. You can easily find the solution like...

python,machine-learning,scikit-learn,probability,prediction

Per the SVC documentation, it looks like you need to change how you construct the SVC: model = SVC(probability=True) and then use the predict_proba method: class_probabilities = model.predict_proba(sub_main) ...

If I understand this correctly, it is a nice example to demonstrate some of the differences in thinking between Anglican and PyMC. Here is a tweaked version of your PyMC code that I think captures your intention: def make_model(): a = pymc.Poisson("a", 100) # better to have the stochastics themselves...

r,io,probability,sparse-matrix,entropy

Sample data Creating a data frame with only non-zero z values (suppose we can remove all of the zero lines from the flat file before importing data). N <- 50000 S <- N * 0.8 df_input <- data.frame( x = sample(1:N, S), y = sample(1:N, S), z = runif(S)) #...

arrays,matlab,random,probability

You can take advantage of the fact that rows/columns with a single non-zero entry in A automatically give you results for that same entry in A_rand. If A(2,5) = w and it is the only non-zero entry in its column, then A_rand(2,5) = w as well. What else could it...

python,list,dictionary,probability

I am surprised no one's given this a shot. The following code does not perform exactly as the example you described, but is (in my opinion) a better way of expressing the data. probabilityDict = {} for i in valueList: if i != ' ' : y = i.split() typeKey...

matlab,math,histogram,probability

The function hist() will normalise for you with its 3rd parameter: x = rand(1000, 1)*360-180; [probas, angles] = hist(x, -180:10:180, 1.0); bar(angles, probas); You might want to combine values of bin -180 and +180 Now angles and probas are available for other plots....

string,algorithm,probability,combinatorics,discrete-mathematics

Suppose you build up palindrome-free strings one letter at a time. For the first letter, you have M choices, and for the second, you have M-1, since you can't use the first letter. This much is obvious. For every letter after the first two, you can't use the previous letter,...

machine-learning,probability,mle,language-model

The likelihood function describes the probability of generating a set of training data given some parameters and can be used to find those parameters which generate the training data with maximum probability. You can create the likelihood function for a subset of the training data, but that wouldn't be represent...

r,statistics,simulation,probability

If you inspect the body of the function ks.test you will see the following line somewhere in the body: if (length(unique(x)) < n) { warning("ties should not be present for the Kolmogorov-Smirnov test") TIES <- TRUE } This tells you that when the number of unique elements in x is...

python,string,count,probability,markov-chains

Two things: This isn't related to Markov Chains. At All. Python actually has some really nice builtins that will make this more or less trivial. I won't spoon-feed an answer, but I don't want to leave you high and dry on this one. The gist is that depending on your...

javascript,combinations,probability

Nevermind, I managed to figure it out after some good hours: function getChance(numbers, out_of) { return numbers>0?out_of/numbers*getChance(numbers-1,out_of-1):1; } var np = 6; //numbers picked var tn = 49; //total numbers var ntm = 6; //numbers to match var picks = getChance(np-ntm, tn-ntm); var combs = getChance(np, tn); var probs =...

python,probability,probability-theory

So, if I'm understanding your comment correctly, what you are having trouble with is the concept of calculating the conditional probability when there are two or more "conditions" as opposed to a single condition. It's been quite a while since I last took a probability/statistics class, but I think what...

The solution is quite simple: you are adding too many elements to the arrays. for (int i = 0; i < 33; i++) { numbers1.Add(i); numbers2.Add(i); } Adds 33 elements to the arrays not 32. Random.Next(a, b) generates a random number between [a, b) (half-open interval). So random.Next(0, 33) will...

Let a1 = 25 and a4 = a1*r^3 = 1. Then a4 / a1 gives r^3 = 1/25, so r = cube root of (1/25). The three intermediate terms are not needed. Hope this solves it!...

The tasks can be ordered by t1/(1-p1) This will give the minimum test time....

Since your code didn't work by copy & paste, I changed it a little bit. It's better if you define a function that calculates the probability for given data, function p = prob(data) n = size(data,1); uniquedata = unique(data); p = zeros(length(uniquedata),2); p(:,2) = uniquedata; for i = 1 :...

math,probability,markov-chains,random-walk

Your state has to contain the last position, so that you have transitions (-1,-1) --> (-1,-1) (+1,+1) --> (+1,+1) with 70% probability and (-1,+1) --> (+1,-1) (+1,-1) --> (-1,+1) with 30% probability each....

If you have a function random() that returns doubles in the interval [0, 1), then you look at pages 1 to floor(1 / (1 - random())). Page n is examined if and only if the output of random() is in the interval [1 - 1/n, 1), which has length 1/n....
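
The rule above can be sketched in Python like this. Passing the uniform draw in as an argument is a choice made here so the behaviour can be checked; `last_page` is a hypothetical name.

```python
import math
import random

def last_page(u):
    """Deepest page examined for a uniform draw u in [0, 1)."""
    return math.floor(1 / (1 - u))

# Page n is examined exactly when u >= 1 - 1/n, an interval of length 1/n.
pages = last_page(random.random())
print(pages)
```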

python,math,statistics,probability

Since a couple of people have asked to see the mathematical solution, I'll give it. This is one of the Project Euler problems that can be done in a reasonable amount of time with pencil and paper. The answer is 7(1 - (60 choose 20)/(70 choose 20)) To get this...
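
The closed form quoted above can be evaluated exactly with the standard library; this sketch only computes the stated expression and makes no claim about how it is derived.

```python
from fractions import Fraction
from math import comb

# 7 * (1 - C(60, 20) / C(70, 20)), kept exact until the final conversion.
answer = 7 * (1 - Fraction(comb(60, 20), comb(70, 20)))
value = float(answer)
print(value)
```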

python,statistics,scipy,probability

(1) "Is it from distribution X" is generally a question which can be answered a priori, if at all; a statistical test for it will only tell you "I have a large sample / not a large sample", which may be true but not too useful. If you are trying...

r,parallel-processing,statistics,probability,montecarlo

The default random number generator in R is Mersenne-Twister. You can change between them using setRNG('Wichmann-Hill') setRNG('default')#or setRNG('Mersenne-Twister') If you want to generate numbers in parallel, you can use the foreach package. require(foreach) require(doParallel) c1 <- makeCluster(2) registerDoParallel(c1) generateRandom <- function(rng='default',n) { setRNG(rng) runif(n) } result = foreach(i = 1:2,rng...

java,random,probability,effective-java,non-uniform-distribution

Question 1: if n is a small power of 2, the sequence of random numbers that are generated will repeat itself after a short period of time. This is not a corollary of anything Josh is saying; rather, it is simply a known property of linear congruential generators. Wikipedia...

matlab,random,probability,probability-density

If you recall from probability theory, you know that the Cumulative Distribution Function sums up probabilities from -infinity up until a certain point x. Specifically, the CDF F(x) for a probability distribution P with random variable X evaluated at a certain point x is defined as: Note that I am...

java,for-loop,increment,probability

p is a local variable for the method dProb, and it's a different variable than p in iProb. Meaning that px, pz and py are not affected (Java passes by value, always). When you enter the method, a temporary variable is created, and will be destroyed as soon as you...

matlab,syntax,probability,combinatorics

brute force solution: [d1,d2,d3,d4]=ndgrid(0:4,0:4,0:4,0:4); d = d1+d2+d3+d4; i = find(d==4); [d1(i),d2(i),d3(i),d4(i)] ...

language-agnostic,byte,probability,magic-numbers

Capture real-world data from a diverse set of inputs that would be used by applications of your library. Write a quick and dirty program to analyze dataset. It sounds like what you want to know is which bytes are most frequently totally excluded. So the output of the program would...

java,algorithm,math,probability

Would this be ok? If you use the additive version, you'll end up having the same probabilities always. I'm using the updated multiplicative version. Also, use x<1 for lower chance of getting higher values. And x>1 otherwise. import java.util.Arrays; import java.util.Random; public class Main { private static Random random =...

c,probability,normal-distribution,quantitative-finance,probability-theory

The standard normal cumulative distribution function is exactly (1/2)*(1 + erf(z/sqrt(2))) where erf is the Gaussian error function, which is found in many C programming libraries. Check the development environment you are using -- chances are good that erf is already in one of its libraries.
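
The formula translates directly into code. Here is a Python sketch using the standard library's `math.erf`; the function name is illustrative, and the same expression carries over to C wherever `erf` is available.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF: (1/2) * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(std_normal_cdf(0.0), std_normal_cdf(1.96))
```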

matlab,statistics,probability,kernel-density,probability-density

You have two issues: A 1-unit displacement between blue and red plots. The blue spikes are wider and less tall than the red ones. How to solve each issue: This is caused by a possible confusion between the data range 0,...,255 and the indexing interval 1,...,256. Since your data represents...

r,matlab,statistics,distribution,probability

It appears that R's qt may use a completely different algorithm than Matlab's tinv. I think that you and others should report this deficiency to The MathWorks by filing a service request. By the way, in R2014b and R2015a, -Inf is returned instead of NaN for small values (about eps/8...

javascript,jquery,random,probability

Just accumulate the probabilities and return an item for which current_sum >= random_number: probs = [0.41, 0.29, 0.25, 0.05]; function item() { var r = Math.random(), s = 0; for(var i = 0; i < probs.length; i++) { s += probs[i]; if(r <= s) return i; } } // generate...
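
For readers working in Python, the same running-total idea might look like the sketch below. The optional `r` argument and the final fallback are additions made here (the fallback covers floating-point round-off in the total); neither appears in the original snippet.

```python
import random

def pick_index(probs, r=None):
    """Return index i with probability probs[i] by walking the running total."""
    if r is None:
        r = random.random()
    running = 0.0
    for i, p in enumerate(probs):
        running += p
        if r < running:
            return i
    return len(probs) - 1  # only reached if round-off leaves the total below r

probs = [0.41, 0.29, 0.25, 0.05]
print(pick_index(probs))
```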

TTT also counts as an outcome with an even number of H.

You could just plot the results and see that it gives something very similar: # slightly improved version of my.ecdf my.ecdf<-function(x,t) { out<-numeric(length(t)) for(i in 1:length(t)) { indicator <- as.numeric(x<=t[i]) out[i] <- sum(indicator)/length(x) } out } # test 1 x <- rnorm(1000) plot(ecdf(x)) lines(seq(-4, 4, length=1000), my.ecdf(x, seq(-4, 4, length=1000)),...

python,statistics,probability,computation

Let's label the students according to their round-1 team: 0000 1111 2222 3333 4444 5555 6666 7777 8888 9999 AAAA BBBB CCCC DDDD EEEE FFFF The number of ways to assign round-2 teams, without restrictions, is 64! / ((4! ** 16) * (4! ** 16)) == 864285371844932000669608560277139342915848604 ^ ^ ^...

python,recursion,probability,solver

If each board in the children variable is a state after a random tile has been spawned, wouldn't you need to add 1 to the number of empty tiles since the spot where the new tile spawned was empty before it spawned there? weights[section]*(1/(num_empty(board)+1))) That being said, calling a function...

Cheat. Decide first if the player wins or not: const bool winner = ( rand() % 100 ) < 40; // 40 % odds (roughly) Then invent an outcome that supports your decision. if ( winner ) { // Pick the one winning fruit. this->slotOne = this->slotTwo = this->slotThree =...

I don't think the test case is wrong, but the statement If you just dialed a 6, the next number must be a 1 or 7. is wrong. Not being able to get from 6 to 0 makes the problem not make any sense. As to your comment, blanks aren't...

algorithm,loops,time-complexity,complexity-theory,probability

Let the length of the string be written as n. n = str.length(); Now, the outer for-loop iterates for the whole length of the String,i.e., n times. Hence,outer for-loop complexity is O(n). Talking about child loop, the inner while loop executes (val)^(1/62) times. So, you can consider the inner while-loop...

If you're looking for a solution that uses built-in MATLAB functions, you can look into random, which allows you to supply parameters to many types of well-known distributions. For example, if you want to draw a M x N matrix of values from a binomial distribution with n trials and...

sas,probability,prediction,logistic-regression

2 ways to get predicted values: 1. Using Score method in proc logistic 2. Adding the data to the original data set, minus the response variable and getting the prediction in the output dataset. Both are illustrated in the code below: *Create an dataset with the values you want predictions...