As you noted correctly, the net can get stuck in local minima. Due to the random initialization of the weights the final results can differ a lot. One way of minimizing the generalization error is early stopping (i.e. different parameter values for maxit, abstol or reltol). Another way that is...

Backpropagation can be implemented in two ways: as a batch or as an online training algorithm. Initially you described the online training algorithm. Then you found and tried to implement the batch training algorithm, which sometimes has the side effect you described. In your case it can be a good idea to split the learning samples into smaller chunks...

I think the short answer is yes, but, as so often with neural networks, it depends on your problem. The type of architecture you're describing in your question is called a "skip-layer" model. For a brief discussion of skip-layer connections, you might want to check these online resources: http://stats.stackexchange.com/questions/56950/neural-network-with-skip-layer-connections http://www.iro.umontreal.ca/~bengioy/ift6266/H12/html.old/mlp_en.html...

python,machine-learning,neural-network

OK, so, first, here's the amended code to make yours work. #! /usr/bin/python import numpy as np def sigmoid(x): return 1.0 / (1.0 + np.exp(-x)) vec_sigmoid = np.vectorize(sigmoid) # Binesh - just cleaning it up, so you can easily change the number of hiddens. # Also, initializing with a heuristic...

In artificial neural networks, the cost function returns a number representing how well the neural network performed in mapping training examples to the correct output. See here and here. In other words, after you train a neural network, you have a math model that was trained to adjust its...

There are (at least) two nnet-methods. You are using the formula method and supplying an argument for linout using partial matching. This would test whether any of the argument names started with "lin": fit <- nnet(c ~ ., data = df, size = 3, lin=TRUE) # weights: 13 # snipped...

matlab,machine-learning,neural-network

Let's break your question into parts: First he says that he uses a subset of the MNIST dataset, which contains 5000 training examples, and each training example is an image in a 20x20 grayscale format. With that he says that we have a vector of 400 elements of length...

From the source: if (value < -45) { value = 0; } else if (value > 45) { value = 1; } else { value = 1 / (1 + Math.exp(-value)); } return value; Pretty simple sigmoid with a clamp using the logistic function. That said, you're very likely not...
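
For comparison, the same clamp could be sketched in Python as follows. This only mirrors the logic quoted above; the function name `clamped_sigmoid` and the `limit` parameter are my own assumptions, not part of the library.

```python
import math

def clamped_sigmoid(value, limit=45.0):
    """Logistic function with the input clamped, mirroring the quoted source."""
    if value < -limit:
        return 0.0   # far left tail: treated as exactly 0
    if value > limit:
        return 1.0   # far right tail: treated as exactly 1
    return 1.0 / (1.0 + math.exp(-value))
```

The clamp mainly avoids calling `exp` on very large arguments; inside the limits the function is the plain logistic curve.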

Why not have a look at my implementation at https://github.com/zizhaozhang/simple_neutral_network/blob/master/nn.py The derivatives are actually here: def dCostFunction(self, theta, in_dim, hidden_dim, num_labels, X, y): #compute gradient t1, t2 = self.uncat(theta, in_dim, hidden_dim) a1, z2, a2, z3, a3 = self._forward(X, t1, t2) # p x s matrix # t1 = t1[1:, :]...

machine-learning,neural-network,normalization

During standard SGD training of a network, the distribution of inputs to a hidden layer will change because the hidden layer before it is constantly changing as well. This is known as internal covariate shift and can be a problem; see, for instance, here. It is known that neural networks converge...

java,machine-learning,neural-network

First and most important, regardless of how you code it, a feed-forward multilayer neural network won't learn x*y, especially when the data are presented in the form of two continuous inputs. Reasons: 1). x * y output is unbounded and a normal MLP is not suited for learning such functions. At best,...

Aside from a few classical problems, there is no single right way to feed complex data into an NN. It is sort of an art, and in fact recent progress in deep learning owes a lot to advances in ways of representing complex data. Thus, without knowing the nature of your...

When working with neural networks you usually take your data and divide it into 3 subsets: Training, Validation and Test. There is a quite good question and answer on this site about what these are and why: whats is the difference between train, validation and test set, in neural networks? In...

When you use Matlab's neural network toolbox you have the option of choosing the percentage of your Training, Validation and Testing data (the default is 70% for training and 15-15% for validation and testing). The toolbox divides your data randomly, which is why you get different results. You can...

2 is the worst option because "2 or more networks recognize an image as "their own"" will definitely happen many times, and how do you discriminate between them after that? 1 will work reasonably well. 3 is the basic idea behind the softmax output function, and softmax usually works best for classification...
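
For reference, a softmax output turns the raw scores of the competing outputs into probabilities that sum to one, so picking a single winner is unambiguous. A minimal sketch in plain Python; the function name and the example scores are illustrative assumptions only:

```python
import math

def softmax(scores):
    """Convert raw output scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing and does not
    # change the resulting probabilities.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])           # made-up scores for three classes
predicted_class = probs.index(max(probs))  # index of the most probable class
```

Because the outputs compete for a shared probability mass, two classes cannot both claim an image with high confidence.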

java,machine-learning,neural-network

AI is being set to the output value from the leftNeuron of the previous connection (whatever node that is connecting to the current one). The way the back propagation algorithm works is by going through every layer in the ANN, and every node in it, then summing up all of...

matlab,neural-network,linear-regression,backpropagation,perceptron

A neural network will generally not find or encode a formula like t = a + b*X1 + c*X2, unless you built a really simple one with no hidden layers and linear output. If you did then you could read the values [a,b,c] from the weights attached to bias, input...

python,algorithm,neural-network,perceptron

The problem is that you are not recomputing the output after the weights change so the error signal remains constant and the weights will change in the same way on every iteration. Change the code as follows: def update(theta,x,n,target,output): for i in range(0,len(x)): output[i] = evaluate(theta,x[i]) # This line is...

c,artificial-intelligence,neural-network,fann

You need to use the fann_get_connection_array() function. It gives you an array of struct fann_connection, and struct fann_connection has a weight field, which is what you want. You can do something like this to print your weight matrix: int main(void) { struct fann *net; /* your trained neural network */ struct fann_connection *con;...

Install from PyPI: pip install Lasagne Here are the official docs for Lasagne....

python,neural-network,data-mining

The answer on our Piazza page (UC Berkeley, INFO290t) seems correct: I believe you're supposed to round the output to zero or one and then compare the rounded results with your target to determine the accuracy. "If the output of your classifier is a continuous value between 0 and 1,...
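
A minimal sketch of that rounding-then-comparing step, in plain Python. The 0.5 threshold, the function name and the sample numbers are my assumptions; an explicit threshold is used instead of `round()` because Python rounds exact halves to the nearest even number.

```python
def accuracy(outputs, targets, threshold=0.5):
    """Fraction of outputs that match the 0/1 targets after thresholding."""
    predictions = [1 if o >= threshold else 0 for o in outputs]
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    return correct / len(targets)

outputs = [0.91, 0.12, 0.55, 0.40]   # made-up continuous classifier outputs
targets = [1, 0, 0, 0]               # made-up ground-truth labels
score = accuracy(outputs, targets)   # third sample is misclassified
```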

r,machine-learning,neural-network

Try using this to predict instead: res = compute(r, m2[,c("Pclass", "Sexmale", "Age", "SibSp")]) That worked for me and you should get some output. What appears to have happened: model.matrix creates additional columns ((Intercept)) which aren't part of the data that was used to build the neural net, as such in...

My understanding is that this is not possible for any new labels. We can only continue training when the new data has the same labels as the old data. As a result, we are training or retuning the weights of the already learned vocabulary, but are not able to learn...

As of version 3.2 it was not implemented for .NET (that was one of the reasons why I quit using Encog). I don't know about 3.3 for sure, but it seems that things are still the same. Java RegularizationStrategy seems to be community contribution (see https://github.com/encog/encog-java-core/issues/28). If you absolutely want...

computer-vision,neural-network,feature-detection,deep-learning

That's an open problem in image recognition. Besides sliding windows, existing approaches include predicting the object location in the image as CNN output, predicting borders (classifying pixels as belonging to an image boundary or not) and so on. See for example this paper and references therein. Also note that with CNN using max-pooling,...

artificial-intelligence,neural-network,pybrain

There are a number of ways, but one is to calculate the average value of that column, and pass that in any case where there is missing data. You can also add a column that is True/False for whether or not the data is present, so that the network has...
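
Both ideas (mean substitution plus a present/missing flag) can be sketched in a few lines of plain Python. The function name, the use of `None` for missing entries and the sample column are assumptions for illustration:

```python
def impute_with_mean(column):
    """Replace None with the column mean and return a presence indicator."""
    present = [v for v in column if v is not None]
    mean = sum(present) / len(present)
    filled = [mean if v is None else v for v in column]
    # 1.0 where the original value existed, 0.0 where it was filled in
    is_present = [0.0 if v is None else 1.0 for v in column]
    return filled, is_present

ages = [20.0, None, 40.0, None]   # hypothetical column with missing entries
filled, is_present = impute_with_mean(ages)
```

Both `filled` and `is_present` would then be fed to the network as separate input columns.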

machine-learning,artificial-intelligence,neural-network,convolution

After the last convolutional layer, you have N feature maps, with WxH resolution. This can be seen as a feature vector X of size NxWxH if you concatenate all the values. This is how you connect it to an MLP: i.e X acts as an input of a linear transformation...
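
As a rough sketch of that flatten-then-linear step, here is a plain Python version with tiny, made-up sizes (N=2 maps at 2x3 resolution, 2 output units). The helper names and all numbers are illustrative assumptions:

```python
def flatten(feature_maps):
    """Concatenate N feature maps of H rows and W columns into one flat list."""
    return [v for fmap in feature_maps for row in fmap for v in row]

def linear(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

feature_maps = [[[1, 2, 3], [4, 5, 6]],
                [[7, 8, 9], [10, 11, 12]]]
x = flatten(feature_maps)                  # feature vector of length N*W*H = 12
weights = [[1] * len(x), [0] * len(x)]     # illustrative weight matrix (2 x 12)
bias = [0.5, -1.0]
y = linear(x, weights, bias)               # input to the rest of the MLP
```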

The point you get a heap corruption error/crash is typically just the symptom of an actual heap overflow/underflow or other memory error at some other time/point in the past. This is why heap corruptions can be difficult to track down. You have a lot of code and all the double-pointers...

c#,unit-testing,neural-network

Testing backpropagation algorithms is really important because it's really common for them to have subtle bugs. E.g. the gradients are off, but close enough the network can still sort of learn. You just use the finite differences method. For each parameter (or a random parameter or whatever) you add a...
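
Here is a small sketch of such a finite-differences check in plain Python. It uses a central difference and a toy loss with a hand-derived gradient standing in for backpropagation; the function names, `eps` and the toy loss are all assumptions for illustration:

```python
def numerical_gradient(f, params, eps=1e-5):
    """Central-difference estimate of df/dparams[i] for every i."""
    grad = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad

def loss(w):
    # Toy stand-in for a network loss: (2*w0 + 3*w1 - 1)^2
    return (2 * w[0] + 3 * w[1] - 1) ** 2

def analytic_gradient(w):
    # Hand-derived gradient, playing the role of the backprop output
    residual = 2 * w[0] + 3 * w[1] - 1
    return [2 * residual * 2, 2 * residual * 3]

w = [0.3, -0.2]
numeric = numerical_gradient(loss, w)
analytic = analytic_gradient(w)
max_diff = max(abs(n - a) for n, a in zip(numeric, analytic))
```

In a unit test you would assert that `max_diff` stays below a small tolerance; a gradient that is "off but close" shows up as a difference well above floating-point noise.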

matlab,neural-network,vectorization,bsxfun

There is a parallelogram-like structure of blocks you are creating inside the nested loops with iLowMax:iHighMax,jLowMax:jHighMax which won't lead to any easy vectorizable codes. But you can go full-throttle vectorization on that if performance is paramount for your case and seems like convolution would be of good use there. Listing...

neural-network,genetic-algorithm,encog,simulated-annealing,particle-swarm

It seems logical, however it will not work. With the default parameters of the RPROP, this sequence will not likely work. The reason why is that after your previous training the weights of the neural network will be near a local optimum. Because of the nearness to a local optimum...

neural-network,deep-learning,dimensionality-reduction,autoencoder

You should take a look at some of the tutorials over at deeplearning.net. They have a Stacked Denoising Autoencoder example with code. All of the tutorials are written in Theano which is a scientific computing library that will generate GPU code for you. Here's an example of a visualization of...

java,arrays,multidimensional-array,neural-network

int[] sizes = { layer1, layer2, layer3 }; int k = sizes.length - 1; So, k is equal to 2. int i; for (i = 0; i < k; i++) net[i] = new double[sizes[i]][]; After that loop i is equal to 2. for (int j = 0; j < sizes[i];...

machine-learning,neural-network

From your questions, it seems that you are on the right track. Anyhow, the Liquid State Machine and Echo State machine are complex topics that deal with computational neuroscience and physics, topics like chaos, dynamic action system, and feedback system and machine learning. So it’s ok if you feel like it’s...

matlab,neural-network,matlab-figure

I am not sure about the format of Neural Networks Toolbox, but I can help you with the drawing part. There is a command called gplot that draws adjacency matrices (taken directly from Matlab help). You can adjust it to show circles instead of points: k = 1:30; [B,XY] =...

machine-learning,artificial-intelligence,neural-network

Any model can act like that on those two instances. Your question is very broad, so I'll just list a few things that you should consider. Data normalization and scaling You might have better luck by applying feature scaling or mean normalization to your data. Detect overfitting Use a method...
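
As a sketch of the mean normalization mentioned above, each feature column can be rescaled to `(x - mean) / (max - min)`. The function name and the sample column are assumptions; a constant column would need special handling because its spread is zero.

```python
def mean_normalize(column):
    """Rescale a feature column to (x - mean) / (max - min)."""
    mean = sum(column) / len(column)
    spread = max(column) - min(column)   # assumed non-zero here
    return [(v - mean) / spread for v in column]

normalized = mean_normalize([10.0, 20.0, 30.0])   # made-up feature values
```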

sdk,opencl,neural-network,gpgpu,deep-learning

I have been in the same situation as yourself as I have a MacBook Pro with Intel Iris graphics. I have spent the best part of a week looking through all possible workarounds and I would more than welcome alternatives to those that I offer. The best solution...

machine-learning,neural-network,genetic-algorithm,evolutionary-algorithm

You can include as many hidden layers as you want, starting from zero (that case is called a perceptron). The ability to represent unknown functions, however, does -- in principle -- not increase. Single-hidden layer neural networks already possess a universal representation property: by increasing the number of hidden neurons, they can...

matlab,neural-network,workspace,matlab-guide

From the wording of your problem it sounds like finalnet is a previously stored workspace such that finalnet.mat is located in some directory. Let's assume the current directory. In this case you need to load the workspace into your GUI. Assuming that's in some random callback function, you want to...

neural-network,convolution,theano,conv-neural-network

I deduce from this that you intend to have tied weights, i.e. if the first operation were a matrix multiplication with W, then the output would be generated with W.T, the adjoint matrix. In your case you would thus be looking for the adjoint of the convolution operator followed by...

neural-network,deep-learning,caffe

Caffe doesn't determine the number of neurons--the user does. This is pulled straight from Caffe's website, here: http://caffe.berkeleyvision.org/tutorial/layers.html For example, this is a convolution layer of 96 nodes (or neurons): layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1" # learning rate and decay multipliers for the filters param...

It is usually perfectly ok to have an OTHER (NOT FRAUD) class along with those you are interested in. But I understand your concern. Basically, it's the job of the NN to learn the "case/switch", and in most cases it will learn the right one, assuming that most samples belong to the NOT FRAUD class. In...

machine-learning,computer-vision,neural-network,deep-learning,pylearn

moving the comment to an answer; modifying my previous answer seemed wrong The full dataset may not be properly shuffled so the examples in the test set may be easier to classify. Doing the experiment again with examples redistributed among the train / valid / test subsets would show if...

From this site: Try to replace doublefann.h by fann.h.

artificial-intelligence,neural-network

If anything, this is based on intuition and empirical results. I've seen people use recursive neural networks. With a feedforward neural network, it makes sense to connect all neurons from layer n to all neurons in layer n+1. Here is an example from my latest usage (to demonstrate the enormous...

c++,machine-learning,neural-network,genetic-algorithm

Here is the best answer I can give, based on my interpretation of your question. Apologies if it is not what you were asking for, but you did ask for the most basic explanation. I don't see exactly how the tank track values relate to the ability of the tank...

machine-learning,artificial-intelligence,neural-network,classification,backpropagation

The "units" are just floating point values. All computations happening there are vector multiplications, and thus can be parallelized well using matrix multiplications and GPU hardware. The general computation looks like this: double v phi(double[] x, double[] w, double theta) { double sum = theta; for(int i = 0; i...

"Equivalent" is too generalizing but you can roughly say that in terms of architecture (at least regarding their original proposal - there have been more modifications like the MS-TDNN which is even more different from a MLP). The correct phrasing would be that TDNN is an extended MLP architecture [1]....

machine-learning,artificial-intelligence,neural-network

You can either have NxM boolean inputs or have N inputs where each one is a float that goes from 0 to 1. In the latter case the float values would be: {A/M, B/M, C/M, ... 1}. For example if you have 4 inputs each one with discrete values: {1,2,3,4}...
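
The two encodings can be sketched like this in plain Python, assuming each of the N inputs takes a discrete value in 1..M. The function names and the sample values are made up for illustration:

```python
def encode_one_hot(values, m):
    """N discrete values in 1..m -> N*m boolean inputs (one-hot per value)."""
    inputs = []
    for v in values:
        inputs.extend(1 if level == v else 0 for level in range(1, m + 1))
    return inputs

def encode_scaled(values, m):
    """N discrete values in 1..m -> N floats in (0, 1], i.e. value / m."""
    return [v / m for v in values]

values = [1, 4, 2, 3]                  # hypothetical sample: N=4 inputs, M=4 levels
one_hot = encode_one_hot(values, 4)    # 16 boolean inputs
scaled = encode_scaled(values, 4)      # 4 float inputs
```

The one-hot form costs more inputs but does not impose an ordering on the levels, whereas the scaled form implies that level 4 is "more" than level 1.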

The features are the elements of your input vectors. The number of features is equal to the number of nodes in the input layer of the network. If you were using a neural network to classify people as either men or women, the features would be things like height, weight,...

What you describe is called a recurrent neural network. Note that it needs quite different type of structure, input data, and training algorithms to work well. There is the rnn library for Torch to work with recurrent neural networks....

machine-learning,neural-network,deep-learning,dbn,conv-neural-network

I don't know if you still need an answer but anyway I hope you will find this useful. A CDBN adds the complexity of a DBN, but if you already have some background it's not that much. If you are worried about computational complexity instead, it really depends on how...

python,optimization,neural-network,theano

I don't know whether this is faster, but it may be a little more concise. See if it is useful for your case. import numpy as np import theano import theano.tensor as T minibatchsize = 2 numfilters = 3 numsamples = 4 upsampfactor = 5 totalitems = minibatchsize * numfilters...

machine-learning,neural-network,deep-learning,caffe,matcaffe

You should look for the file 'synset_words.txt'. It has 1000 lines, each of which provides a description of a different class. For more information on how to get this file (and some others you might need) you can read this. If you want all the labels to be ready-for-use in Matlab,...

java,machine-learning,artificial-intelligence,neural-network

This is the standard backpropagation algorithm where it is backpropagating the error through all the hidden layers. Unless we are in the output layer, the error for a neuron in a hidden layer is dependent on the succeeding layer. Let's assume that we have a particular neuron a with synapses...

Nearly all machine learning models, neural networks included, accept a vector (one dimension) input. The only way to represent such 2D (or higher dimensional) data to the BasicNetwork (in Encog) is to flatten the matrix to a vector. An 8x8 matrix would be a 64-element vector. For a traditional feedforward...

c++,machine-learning,neural-network

Finally I have realized what's wrong. The problem is not in the code itself. The thing is that the cost function for such a network configuration with XOR has a local minimum. So I ended up there and got stuck. The solution is to take a step in a random direction until you make it out...

unit-testing,neural-network,backpropagation

I'm in the middle of doing something similar for my degree. What you are looking for is integration tests, not unit tests. Unit tests only tell you if the code works the way you want it to. To check if the algorithm actually works correctly, you should write integration tests...

matlab,neural-network,perceptron

The tansig activation function essentially makes it possible that a neuron becomes inactive due to saturation. A linear neuron is always active. Therefore if one linear neuron has bad parameters, it will always affect the outcome of the classification. A higher number of neurons yields a higher probability of bad...

As Kelu stated, that part of the equation is based on derivatives of your transfer function (in this case sigmoid). To understand why you need derivatives, you need to understand how the delta rule works(*): Your overall goal is to minimize the error in the network's output using gradient descent....

TL;DR: net.trainParam.max_fail = 8; I've used the example provided in the page you linked to get a working instance of nntraintool. When you open nntraintool.m you see a small piece of documentation that says (among other things): % net.<a href="matlab:doc nnproperty.net_trainParam">trainParam</a>.<a href="matlab:doc nnparam.showWindow">showWindow</a> = false; This hinted that some properties are...

javascript,neural-network,conv-neural-network

The sample new convnetjs.Vol([1.3, 0.5]) has label 0. The sample new convnetjs.Vol([0.1, 0.7]) has label 1. In general, in machine learning, you'd usually have samples which can be quite high-dimensional (here they are only two-dimensional), but you'd have a single label per sample which tells you which "class" it belongs...

Similar to your other question, this is an error that stems from matrix multiplication. In essence, the following error: Error in neurons[[i]] %*% weights[[i]] : non-conformable arguments means that your matrices don't have dimensions that match for matrix multiplication. It's like trying to multiply a 4x4 matrix by a 10x10...
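
The rule behind that error is simply that the number of columns of the left matrix must equal the number of rows of the right one. A tiny Python sketch of the check (the helper name is hypothetical, and shapes are given as `(rows, columns)` tuples):

```python
def can_multiply(shape_a, shape_b):
    """True if an (r1 x c1) matrix can be multiplied by an (r2 x c2) matrix."""
    return shape_a[1] == shape_b[0]

def product_shape(shape_a, shape_b):
    """Shape of A %*% B, or None when the arguments are non-conformable."""
    if not can_multiply(shape_a, shape_b):
        return None
    return (shape_a[0], shape_b[1])
```

Checking the shapes of each layer's activations and weights this way before multiplying usually points straight at the offending layer.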

You probably want something like NetLab. The authors state that most of it works in Octave. There is a page with examples that comes from a book about the software. From the Intro on the site it looks like netopt is similar to fitnet in functionality, if not necessarily in...

The key to avoiding "destroying" the network's current knowledge is to set the learning rate to a sufficiently low value. Let's take a look at the mathematics for a perceptron: The learning rate is always specified to be < 1. This forces the backpropagation algorithm to take many small steps...

neural-network,backpropagation,gradient-descent

If you have 5 hidden layers (assuming with 100 nodes each) you have 5 * 100^2 weights (assuming the bias node is included in the 100 nodes), not 100^5 (because there are 100^2 weights between two consecutive layers). If you use gradient descent, you'll have to calculate the contribution of...

python,debugging,neural-network,theano

Use .eval() to evaluate the symbolic expression Use Test Values ...

math,machine-learning,neural-network,linear-algebra,perceptron

A linear function is f(x) = a x + b. If we take another linear function g(z) = c z + d, and apply g(f(x)) (which would be the equivalent of feeding the output of one linear layer as the input to the next linear layer) we get g(f(x)) =...
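
The collapse of two stacked linear functions into one can also be checked numerically. In this sketch the coefficients are arbitrary and the helper name is my own; expanding g(f(x)) = c(a x + b) + d gives slope c*a and intercept c*b + d.

```python
def compose_linear(a, b, c, d):
    """Return (slope, intercept) of g(f(x)) for f(x) = a*x + b and g(z) = c*z + d."""
    return c * a, c * b + d

a, b, c, d = 2.0, 1.0, -3.0, 0.5   # arbitrary illustrative coefficients
slope, intercept = compose_linear(a, b, c, d)

for x in (-1.0, 0.0, 2.5):
    stacked = c * (a * x + b) + d        # two linear layers applied in turn
    collapsed = slope * x + intercept    # the single equivalent linear layer
    assert abs(stacked - collapsed) < 1e-12
```

However many linear layers are stacked, the result is still a single slope and a single intercept, which is why a non-linear activation is needed between layers.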

machine-learning,neural-network,backpropagation

Note: it makes little sense to ask for the best method here. Those are two different mathematical notations for exactly the same thing. However, fitting the bias as just another weight allows you to rewrite the sum as a scalar product of an observed feature vector x_d with the weight...
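
To make the equivalence concrete, here is a small check in plain Python: the bias is either added explicitly or folded into the weight vector as the weight of a constant input 1. The numbers and the helper `dot` are illustrative assumptions:

```python
def dot(u, v):
    """Scalar product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))

w = [0.4, -0.7]      # illustrative weights
bias = 0.25
x_d = [1.5, 2.0]     # one observed feature vector

explicit = dot(w, x_d) + bias               # bias kept as a separate term
augmented = dot([bias] + w, [1.0] + x_d)    # bias as the weight of a constant 1
```

Both notations produce the same activation (up to floating-point rounding), so the choice is purely one of convenience.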

machine-learning,neural-network

Neural network is fine for this. Your output would be the 10 coefficients. Comparing them "two by two" is nothing that influences the net architecture. Standard neural net training procedure takes care of "comparing the items" (if you want to call it that) itself. At last, make sure to know...

c#,artificial-intelligence,neural-network

A simple fix for the problem is to change this : public delegate void ChangeHandler (System.Object sender, EventArgs nne); to this : public delegate void ChangeHandler (System.Object sender, NeuralNetworkEventArgs nne); This fixes the problem....

r,neural-network,deep-learning,r-package

Although I am unfamiliar with the deepnet package, it appears it is structured the same as other neural net packages. After looking at the documentation (?sae.dnn.train) you will see: hidden: vector for number of units of hidden layers.Default is c(10). Now this isn't the clearest description but I believe it...

You can reduce the feedback factor. Then the network may require more time to learn but is less likely to oscillate. Another common technique is to add a decay, i.e. reducing the factor each iteration. In general neural networks have the same stability rules as control systems have (because as...
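
A rough sketch of the decay idea, reading the "feedback factor" as the learning rate (that reading, the multiplicative schedule and all constants are my assumptions). It minimizes the toy function f(w) = w^2 with a step size that shrinks every iteration:

```python
def decayed_rate(initial_rate, decay, iteration):
    """Learning rate after `iteration` steps of multiplicative decay."""
    return initial_rate * (decay ** iteration)

w = 5.0                      # arbitrary starting point
for step in range(50):
    rate = decayed_rate(0.4, 0.95, step)
    gradient = 2 * w         # derivative of f(w) = w^2
    w -= rate * gradient     # steps get smaller as the rate decays
```

Large early steps make quick progress, while the shrinking later steps reduce the chance of bouncing back and forth around the minimum.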

machine-learning,artificial-intelligence,neural-network,backpropagation

I agree with the comments that this model is probably not the best for your classification problem but if you are interested in trying to get this to work I will give you the reason I think this oscillates and the way that I would try and tackle this problem....

The problem is the convolutional neural network from this tutorial has been made to work with a fixed size input resolution of 32x32 pixels. Right after the 2 convolutional / pooling layers you obtain 64 feature maps with a 5x5 resolution. This gives an input of 64x5x5 = 1,600 elements...

machine-learning,neural-network,gpu

Almost all ML software that uses GPU works (best) with CUDA, thus Nvidia's GPUs are preferable. Take a look at this discussion. And, there's an article about which GPU to get for deep learning (modern neural networks). Relevant quote: So what kind of GPU should I get? NVIDIA or AMD?...

machine-learning,neural-network,point-clouds

An RBF network essentially involves fitting data with a linear combination of functions that obey a set of core properties -- chief among these is radial symmetry. The parameters of each of these functions are learned by incremental adjustment based on errors generated through repeated presentation of inputs. If I...

machine-learning,neural-network,deep-learning,caffe

If you got caffe from git you should find in the data/ilsvrc12 folder a shell script get_ilsvrc_aux.sh. This script should download several files used for ilsvrc (the subset of imagenet used for the large scale image recognition challenge) training. The most interesting file (for you) that will be downloaded is synset_words.txt,...

The sigmoid The sigmoid activation function outputs values in the range (0, 1). It seems that you are trying to teach the sigmoid function to output values from 1 to 10000, which is impossible. The best fitness the network can achieve is thus to always output 1's. Alternative approach You can still...

There are many possibilities. For example: 1) use whole words as input, encoded either as one-hot input vectors or pre-trained word embeddings 2) use bi-directional RNN that is aware both of previous and next characters at the same time...

c++,opencv,machine-learning,neural-network,weight

I've only done a little bit of poking around so far, but what I've seen confirms my first suspicion... It looks as though each time you start the program, the random number generator is seeded to a fixed value: rng = RNG((uint64)-1); So each time you run the program you're...

java,neural-network,backpropagation

If you move from binary classification to multiclass classification, you'll have to generalize your backpropagation algorithm to properly handle more than two classes. The main differences to binary classification are that the update changes to: with: being the new score where the argument y (output) is chosen that yields the...

c,lua,neural-network,luajit,torch

I can't find a way to convert or write to a file the Torch Tensors to make them readable in C. Ideally, I want to convert the Tensors into arrays of double in C. The most basic (and direct) way is to directly fread in C the data you...

machine-learning,computer-vision,neural-network

If you refer to VGG Net with 16-layer (table 1, column D) then 138M refers to the total number of parameters of this network, i.e including all convolutional layers, but also the fully connected ones. Looking at the 3rd convolutional stage composed of 3 x conv3-256 layers: the first one...
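
The 138M figure can be reproduced with a short count. This sketch assumes the usual configuration D setup (3x3 convolutions with biases, a 224x224 RGB input, five 2x2 poolings, then fully connected layers of 4096, 4096 and 1000 units); the function and variable names are mine:

```python
# Numbers are output channels of 3x3 convolutions; "M" is a 2x2 max-pooling
# layer, which has no parameters.
CONFIG_D = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
            512, 512, 512, "M", 512, 512, 512, "M"]

def count_vgg_parameters(config, in_channels=3, image_size=224, num_classes=1000):
    """Return (convolutional parameters, total parameters)."""
    total = 0
    channels, size = in_channels, image_size
    for layer in config:
        if layer == "M":
            size //= 2                                   # pooling halves the resolution
        else:
            total += 3 * 3 * channels * layer + layer    # weights + biases
            channels = layer
    conv_total = total
    features = channels * size * size                    # flattened input to the first FC layer
    for units in (4096, 4096, num_classes):
        total += features * units + units                # weights + biases
        features = units
    return conv_total, total

conv_params, all_params = count_vgg_parameters(CONFIG_D)
```

Under these assumptions the convolutional layers account for only about 14.7M of the roughly 138M parameters; the bulk sits in the fully connected layers.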

java,algorithm,neural-network,genetic-algorithm,encog

Not all trainers in Encog support the simple pause/resume. If they do not support it, they return null, like this one. The genetic algorithm trainer is much more complex than a simple propagation trainer that supports pause/resume. To save the state of the genetic algorithm, you must save the entire...

machine-learning,neural-network,backpropagation,feed-forward

In short, yes it is a good approach to use a single network with multiple outputs. The first hidden layer describes decision boundaries (hyperplanes) in your feature space and multiple digits can benefit from some of the same hyperplanes. While you could create one ANN for each digit, that kind...

lua,neural-network,backpropagation,training-data,torch

why the predictionValue variable is always the same? Why doesn't it get updates? First of all you should perform the backward propagation only if predictionValue*targetValue < 1 to make sure you back-propagate only if the pairs need to be pushed together (targetValue = 1) or pulled apart (targetValue =...

c++,neural-network,genetic-algorithm,temporary-objects

First consider this subset of the CNeuralNetwork class: class CNeuralNetwork { // ... public: std::vector<CSynapticConnection *> getm_vListofSynaptics() { return m_vListofSynaptics; } std::vector<CSynapticConnection*> m_vListofSynaptics; // ... }; Here you have a getter (getm_vListofSynaptics()) that returns a temporary value: a copy of the public data member m_vListofSynaptics. In the CGeneticEngine::NewPopulation() function you're...

machine-learning,neural-network,perceptron

Changing the learning rate to 0.075 fixed the issue.

javascript,machine-learning,neural-network

Arrays in JavaScript are zero based. Therefore you have to use document.write(output[0]);. Maybe it would be helpful to use a console.log or even better a debugger; statement. This way you can inspect your variables through the JS Console. More info on debugging can be found here....

machine-learning,artificial-intelligence,neural-network

This is a good application of convolutional neural networks. There are a number of libraries and services available for doing this. Caffe is a tool for doing this, though I don't have any experience with it. Do some googling for other tools, search for "convolutional neural networks". For services there's...

machine-learning,neural-network

Let me answer your question with some mathematical notations that will make it easier to understand than just random images. First, remember the Perceptron. The task of the Perceptron is to find a decision function that will classify some points in a given set into n classes. So, for a...

machine-learning,neural-network

If you are talking about the session-based course (which I have passed previously): https://www.coursera.org/learn/machine-learning then it uses a batch-learning approach in exercise 4 (which covers ANN). If you carefully study the cost function you will see that it is calculated using all of the available examples, not just one randomly chosen....

machine-learning,integration,neural-network,implementation,calculus

My personal opinion is that it is not possible to feed enough rules for integration into an NN. Why? Because NNs are good for linear regression (AKA approximation) or logistic regression (AKA classification). Integration is neither of them. It is a calculation task that follows strict algorithms. So from...

Just pass the test data to the test function: n.test([[1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0]]) ...

To whom it may concern. The code which is written above is correct: in a case where the network has been properly trained, it manages to output different values for different inputs. The main problem was to train the network. As it gave me a correct answer (14.7, which was...

machine-learning,neural-network,classification,backpropagation

A Feed-Forward Neural Network is a type of Neural Network architecture where the connections are "fed forward", i.e. do not form cycles (like in recurrent nets). The term "Feed forward" is also used when you input something at the input layer and it travels from input to hidden and...