
Fitting a subset model with just one lag, using R package FitAR

r,time-series

Use GetFitARpMLE(z,4). You will get:

    > GetFitARpMLE(z,4)
    $loglikelihood
    [1] -2350.516

    $phiHat
          ar1       ar2       ar3        ar4
    0.0000000 0.0000000 0.0000000 -0.9262513

    $constantTerm
    [1] 0.05388392
    ...

Changing tick intervals when x axis values are dates

r,ggplot2,time-series

Upgrade comment You can change the x-axis labels using scale_x_date and formats from the scales package. Using the code from the ggplot2 scale_x_date help pages library(ggplot2) library(scales) # to access breaks/formatting functions # Change Date to date format aaci$dt <- as.Date(aaci$Date) # Plot # You can change the format to...

Trying to check data frequency with Pandas Series of datetime64 objects

python,pandas,time-series

One method, after converting to datetime64: if the sampling rate is constant, we can call diff() to calculate the difference between consecutive rows, which should all be the same, and compare this with a np.timedelta64 value. So for your sample data this would be: In [277]: all(df.datetime.diff()[1:] == np.timedelta64(1,...
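The diff-and-compare idea can be sketched without pandas. This is only an illustration under assumptions: the helper name `has_constant_step` and the one-minute step are made up for the example and are not taken from the original answer.

```python
from datetime import datetime, timedelta


def has_constant_step(stamps, step):
    """Return True if every consecutive pair of timestamps is exactly `step` apart."""
    diffs = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return all(d == step for d in diffs)


# Hypothetical sample data: one reading per minute, then the same series with a gap.
start = datetime(2015, 1, 1)
regular = [start + timedelta(minutes=i) for i in range(5)]
irregular = regular[:3] + [regular[4]]  # the fourth reading is missing

print(has_constant_step(regular, timedelta(minutes=1)))
print(has_constant_step(irregular, timedelta(minutes=1)))
```

Skipping the first element of the differences, as the pandas version does with `[1:]`, is implicit here because `zip` only produces pairs of neighbours.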

Measure the STD of RMSE

matlab,time-series,forecasting

The term y_real-y_pred is the vector of errors. The expression squares each element of it, and then sqrts each element of it, thus having the effect of abs(). Then std() is run on the vector of errors. Thus, this is computing the S.D. of the (absolute) error. That is a...
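A small sketch of that point, using made-up numbers: squaring and then taking the square root element by element is just the absolute value, so the standard deviation that follows describes the absolute errors and is not the RMSE. The variable names are illustrative only.

```python
import math
import statistics

# Hypothetical observations and predictions.
y_real = [3.0, 5.0, 2.0, 8.0]
y_pred = [2.5, 6.0, 2.0, 6.0]

errors = [r - p for r, p in zip(y_real, y_pred)]

# Element-wise sqrt(e^2) is the same as abs(e).
abs_via_sqrt = [math.sqrt(e ** 2) for e in errors]

# Sample standard deviation (n - 1 denominator) of the absolute errors.
sd_abs_error = statistics.stdev(abs_via_sqrt)

# The RMSE is a single number: root of the mean squared error.
rmse = math.sqrt(statistics.fmean([e ** 2 for e in errors]))

print(abs_via_sqrt, sd_abs_error, rmse)
```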

R: Volatility function that interprets NAs

r,data.frame,time-series,na

We can remove the NA elements with !is.na(x), but lag(x) will return NA as the first element, which can be dropped by using na.rm=TRUE in sd:

    volcalc <- function(x) {
      x <- x[!is.na(x)]
      returns <- log(x) - log(lag(x))
      vol <- sd(returns, na.rm = TRUE) * sqrt(252)
      return(vol)
    }
    apply(dataexample, 2, volcalc)
    #        x        y
    # 3.012588 1.030484
    ...
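The same steps can be sketched in plain Python: drop missing prices, take log returns between the remaining neighbours, and scale the standard deviation by the square root of 252. The function name `annualised_vol` and the price list are assumptions made for the example.

```python
import math
import statistics


def annualised_vol(prices, periods_per_year=252):
    """Annualised volatility of log returns, ignoring None and NaN prices."""
    clean = [p for p in prices if p is not None and not math.isnan(p)]
    # Log return between each remaining pair of neighbours.
    returns = [math.log(b) - math.log(a) for a, b in zip(clean, clean[1:])]
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


# Hypothetical price series with two kinds of missing value.
prices = [100.0, float("nan"), 101.0, None, 99.0, 102.0]
print(annualised_vol(prices))
```

Note that dropping the missing prices first means a return can span more than one period, which is also what the R version above does.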

Matlab's VARMAX regression parameters/coefficients nX & b

matlab,time-series

On Saturday, May 16, 2015 at 6:09:20 AM UTC-4, Rick wrote: Your assessment is generally correct, "nX" and "b" parameters do indeed correspond to the exogenous input data "x(t)". The number of columns (i.e., time series) in x(t) is "nX" and is what SAS calls "r", and the coefficient vector...

pandas shift time series with missing values

python,pandas,time-series,shift

    In [588]: df = pd.DataFrame({
                  'date': [2000, 2001, 2003, 2004, 2005, 2007],
                  'value': [5, 10, 8, 72, 12, 13]
              })

    In [589]: df['previous_value'] = df.value.shift()[df.date == df.date.shift() + 1]

    In [590]: df
    Out[590]:
       date  value  previous_value
    0  2000      5             NaN
    1  2001     10               5
    2  2003      8             NaN
    3  2004     72               8
    4  2005     12              72
    5  2007     13             NaN
    ...
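The logic of that one-liner, spelled out as a plain-Python sketch on the same data: a row only receives the previous value when the previous row's date is exactly one year earlier. `None` stands in for pandas' NaN here.

```python
dates = [2000, 2001, 2003, 2004, 2005, 2007]
values = [5, 10, 8, 72, 12, 13]

# The first row has no predecessor at all.
previous_value = [None]
for i in range(1, len(dates)):
    # Keep the shifted value only when the dates are consecutive.
    if dates[i] == dates[i - 1] + 1:
        previous_value.append(values[i - 1])
    else:
        previous_value.append(None)

print(previous_value)
```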

ggplot2: arranging multiple boxplots as a time series

r,ggplot2,time-series,boxplot

Is this what you want? Code: p <- ggplot(data = dtm, aes(x = asDate, y = mortes, group=interaction(date, trmt))) p + geom_boxplot(aes(fill = factor(dtm$trmt))) The key is to group by interaction(date, trmt) so that you get all of the boxes, and not cast asDate to a factor, so that ggplot...

Creating a running counting variable in R?

r,time-series,running-total

Here's a very straightforward solution that isn't pretty but does the job. First, just a change to your data to make comparisons easier: mtable<-data.frame(date,t.1,t.2,m.result, stringsAsFactors = FALSE) Edited in: If you want to ensure the matches are ordered by date, you can use order as pointed out by @eipi10: mtable...

Arima.sim issues in R

r,math,statistics,time-series,forecasting

You seem to be confused between modelling and simulation. You are also wrong about auto.arima(). auto.arima() does allow exogenous variables via the xreg argument. Read the help file. You can include the exogenous variables for future periods using forecast.Arima(). Again, read the help file. It is not clear at all...

Imputing missing values using ARIMA model

r,time-series,missing-data

fitted gives in-sample one-step forecasts. The "right" way to do what you want is via a Kalman smoother. A rough approximation good enough for most purposes is obtained using the average of the forward and backward forecasts for the missing section. Like this: x <- AirPassengers x[90:100] <- NA fit...

R, how to use pch with time series plot

r,time-series,lattice

By default, time series plots in R use type = "l", which means that you get a line but no point characters. To get both, you can change your type to "b". xyplot(a1, col = "red", pch = 2, type = "b") This yields: The same logic applies to the...

Deleting duplicates in a time series

sql-server,duplicates,time-series,sql-delete

You can do this using a CTE and ROW_NUMBER: SQL Fiddle WITH CteGroup AS( SELECT *, grp = ROW_NUMBER() OVER(ORDER BY MS) - ROW_NUMBER() OVER(PARTITION BY Value ORDER BY MS) FROM YourTable ), CteFinal AS( SELECT *, RN_FIRST = ROW_NUMBER() OVER(PARTITION BY grp, Value ORDER BY MS), RN_LAST = ROW_NUMBER()...
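The `grp` expression is the usual "gaps and islands" trick: the overall row number minus the per-value row number stays constant within a run of identical consecutive values. A rough Python sketch of just that step, with a hypothetical helper `island_keys` and made-up rows:

```python
from collections import defaultdict


def island_keys(rows):
    """For (ms, value) rows sorted by ms, return a (value, grp) key per row.

    grp = overall row number - row number within the same value,
    mirroring the two ROW_NUMBER() calls in the SQL.
    """
    seen = defaultdict(int)
    keys = []
    for overall, (_ms, value) in enumerate(rows, start=1):
        seen[value] += 1
        keys.append((value, overall - seen[value]))
    return keys


# Hypothetical data: 'a' repeats, is interrupted by 'b', then repeats again.
rows = [(1, "a"), (2, "a"), (3, "b"), (4, "a"), (5, "a")]
print(island_keys(rows))
```

Rows that share a key belong to the same run, so the first and last row of each run can then be picked out, which is what the RN_FIRST and RN_LAST columns are for.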

Munging Time Series in Excel

vba,excel-vba,time-series

You don't need any code to change the text that looks like dates into real dates. Select the column of dates, then click Data > Text To Columns > Next > Next In this dialog select Date as the data type and choose the order of MDY if the text...

Fitted values in R forecast missing date / time component

r,time-series,forecasting

Do not use the dates in your plot, use a numeric sequence as x axis. You can use the dates as labels. Try something like this: y=GED$Mfg.Shipments.Total..USA. n=length(y) model_a1 <- auto.arima(y) plot(x=1:n,y,xaxt="n",xlab="") axis(1,at=seq(1,n,length.out=20),labels=index(y)[seq(1,n,length.out=20)], las=2,cex.axis=.5) lines(fitted(model_a1), col = 2) The result depending on your data will be something similar: ...

Calculate days since last event in R

r,time-series

You could try something like this: # make an index of the latest events last_event_index <- cumsum(df$event) + 1 # shift it by one to the right last_event_index <- c(1, last_event_index[1:length(last_event_index) - 1]) # get the dates of the events and index the vector with the last_event_index, # added an...

How to best compress timeseries into a different duration?

r,time-series

Use aggregate.ts with sum, mean or whatever summary function is desired. See ?aggregate.ts

    > aggregate(tser, 4, sum)
              Qtr1      Qtr2      Qtr3      Qtr4
    2010  10.21558  15.22923  21.98924  30.94460
    2011  39.81982  45.00208  61.26129  73.03194
    2012  87.63780  97.27455 104.69757 115.09325
    2013 126.71070 138.39925 145.47344 159.00137
    ...

Plotting multiple time series after a groupby in pandas

python,pandas,group-by,time-series

(Am a bit amused, as this question caught me doing the exact same thing.) You could do something like valgdata\ .groupby([valgdata.dato_uden_tid.name, valgdata.news_site.name])\ .mean()\ .unstack() which would reverse the groupby and unstack the news sites to be columns. To plot, just do the previous snippet immediately followed by .plot(): valgdata\ .groupby([valgdata.dato_uden_tid.name, valgdata.news_site.name])\...

Read Data into Time Series Object in R

r,data,time-series

Try this (assuming your data is called df) ts(df$Number, start = c(2010, 01), frequency = 12) ## Jan Feb Mar ## 2010 1 19 1 Edit: this will work only if you don't have missing dates and your data is in correct order. For a more general solution see @Anandas...

geom_vlines multiple vlines per plot

r,ggplot2,time-series,timeserieschart

This is the solution: library(ggplot2) library(reshape2) library(ecp) synthetic_control.data <- read.table("/Users/geoHeil/Dropbox/6.Semester/BachelorThesis/rResearch/data/synthetic_control.data.txt", quote="\"", comment.char="") n <- 2 s <- sample(1:100, n) idx <- c(s, 100+s, 200+s, 300+s, 400+s, 500+s) sample2 <- synthetic_control.data[idx,] df = as.data.frame(t(as.matrix(sample2))) #calculate the change points changeP <- e.divisive(as.matrix(df[1]), k=8, R = 400, alpha = 2, min.size = 3)...

R conditionally matching date-time from one dataframe to closest date-time field in second dataframe

r,datetime,merge,data.frame,time-series

It is sometimes hard to avoid loops, especially when you have conditions like you do. Sometimes we end up spending a lot of effort avoiding them when they are probably either the best we can do, or not too far behind in terms of performance and/or readability. Having said that, this...

Plotting Probability Density Heatmap Over Time in R

r,plot,time-series,kriging

Here is one solution to what I think you are after. Generate data. myData <- mapply(rnorm, 1000, 200, mean=seq(-50,50,0.5)) This is a matrix with 1000 rows (observations) and 201 time points. At each time point the mean of the data shifts gradually from -50 to 50, in steps of 0.5 each time....

timeseries fitted values from trend python

python,pandas,time-series,statsmodels,trend

Quick and dirty ... # get some data import pandas.io.data as web import datetime start = datetime.datetime(2015, 1, 1) end = datetime.datetime(2015, 4, 30) df=web.DataReader("F", 'yahoo', start, end) # a bit of munging - better column name - Day as integer df = df.rename(columns={'Adj Close':'AdjClose'}) dayZero = df.index[0] df['Day'] =...

small line x-axis for years

r,time-series

Now I see what you mean. One way to handle this would be to create two time series, and use one for your calculations and plotting your data, and the other for the tick marks. Like this: library(xts) n <- 1000 d1 <- seq(as.Date("2001-01-01"),as.Date("2021-01-01"),length.out=n) d1y <- seq(as.Date("2001-01-01"),as.Date("2021-01-01"),length.out=21) d2 <- rnorm(n,10,1)...

plotting a graph with 3 curves time series data

r,plot,time-series

You could use matplot as follows: matplot(cbind(xtsplot1, xtsplot2, xtsplot3), xaxt = "n", xlab = "Time", ylab = "Value", col = 1:3, ann = FALSE, type = 'l') ...

Matlab Reintroduction of AR and GARCH processes

matlab,for-loop,return,time-series,volatility

I think you are a bit confused about how matrix indexing works in Matlab. If I understood correctly, you have a variable TR_t in which you want to store the value for time t. You then try to do the following: TR_t = TR_{t-1} * exp(R_t); I will try to explain...

R, lag( ) has inconsistent behavior for xts and ts objects

r,time-series,xts

As was pointed out in the documentation from ?lag.xts, this is the intended behavior.

how to plot multiple time series in the same graph with customized x axis

javascript,django,python-2.7,highcharts,time-series

This is a good case for the pointStart and pointInterval properties, on a datetime type x axis. Example: http://jsfiddle.net/jlbriggs/92gkjwo3/ You can use the axis label formatter function, and the tickInterval property to define the placement and format of the labels. References: http://api.highcharts.com/highcharts#xAxis.labels.formatter http://api.highcharts.com/highcharts#plotOptions.series.pointInterval http://api.highcharts.com/highcharts#plotOptions.series.pointStart ...

Calculate the maximum price fluctuation in a 24 hour window

r,time-series

This seemed to work. I got tripped up by trying to use the date as the bottom window cutoff, but then the duplicates add 0's as a min where there wouldn't be otherwise. as.POSIXct might not be necessary depending on what format your date is. I also used 60 seconds...

Remove values which are surrounded by a certain number of NAs

r,time-series,missing-data

First, we will define the two thresholds you specified. (I set the second one to 4 so we can work consistently with "<" and ">", instead of the error-prone "<" and ">="). threshold.data <- 10 threshold.NA <- 4 Now, the key is to work with run length encoding on is.na(y)....
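Run-length encoding of a missingness mask, which is what R's rle(is.na(y)) produces, can be sketched in Python with itertools.groupby. The helper name `run_lengths` and the sample vector are assumptions for illustration; the thresholds from the answer are not applied here.

```python
from itertools import groupby


def run_lengths(flags):
    """Collapse a sequence into (value, length) pairs for each consecutive run."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(flags)]


# Hypothetical series where None marks a missing observation.
y = [1.0, None, None, 2.0, 3.0, None]
is_na = [v is None for v in y]

print(run_lengths(is_na))
```

With the runs in this form, a run of data can be compared against the lengths of the missing runs on either side of it.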

Cassandra storage internal

cassandra,apache-spark,time-series,cql

I'm trying to understand what exactly happens internally in storage engine level when a row(columns) is inserted in a CQL style table. Let's say that I build tables with both of your PRIMARY KEYs, and INSERT some data: [email protected]:stackoverflow2> SELECT userid, time, dateof(time), category, subcategory, itemid, count, price FROM...

Forecasting an Arima Model in R Returning Strange Error

r,time-series,shiny,forecasting

The issue wound up being that I was using the arima(...) function instead of Arima(...). It turns out that they are two different functions. The issue that I was experiencing was a result of differences in how the functions store their data. More information about this can be found in...

Convert data frame with epoch timestamps to time-series with milliseconds in R

r,time-series,xts,zoo

Your timestamps are in milliseconds. You need to convert them to seconds to be able to use them with as.POSIXct. And there's no point in calling strptime on a POSIXct vector. Also, it's good practice to explicitly set the timezone, rather than leave it set to "". df$datetime <- as.POSIXct(df$timestamp/1000,...
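The same two points, dividing by 1000 and naming the time zone explicitly, look like this in a Python sketch. The helper name `from_epoch_ms` and the choice of UTC are assumptions for the example.

```python
from datetime import datetime, timezone


def from_epoch_ms(ms):
    """Convert a Unix timestamp in milliseconds to a timezone-aware UTC datetime."""
    # Divide by 1000 to get seconds; the fractional part keeps the milliseconds.
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


print(from_epoch_ms(1500))
```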

SQL Server Time Series Modelling Huge datacollection

sql,.net,sql-server,time-series

1) You probably want to explore the use of partitions. This will allow very effective inserts (it's a meta operation if you do the partitioning correctly) and very fast (2). You may want to explore columnstore indexes because the data (once collected) will never change and you will have very...

R: compute waldtest using dynlm

r,time-series

The easiest approach is to fit a nested model with interactions rather than two separate models. So you can first generate a factor that encodes the two segments: fac <- factor(as.numeric(time(zoop) > as.Date("2005-01-24"))) fac <- zoo(fac, time(zoop)) And then you can fit a model where all coefficients are constrained to...

python freezes with many repeated pandas calls

python,pandas,time-series,freeze

Your approach looks a little complicated ... I hope my simplification is what you need ... # get an index of pandas Timestamps df.index = pd.to_datetime(df.Date + ' ' + df.Time) # get the column we want as a pandas Series called price price = df['Close'] Update # use...

Matlab: trying to estimate multifractal spectrum from time series by histogram box-counting

matlab,statistics,time-series,histogram,fractals

Your code seems to be generally bug-free but I made some changes since you perform needless repetitions over loops (I moved the outer loop inside and "vectorized" it since all moment calculations can be performed simultaneously for a given histogram. Also, it is building the histogram that takes longest). intel...

time series data in mongodb - how to query embedded document

mongodb,time-series

Since you are specifying the time and date in the keys, you can do this by projecting the keys you want displayed. So if you wanted the week from 16 to 22 February, you could do something like this: db.servers.find( { "_id": "i-09484d47_201502" }, { "values.16": 1, "values.17": 1, "values.18":...

Enlarge time series and fill with -9999 R

r,merge,time-series

Try this (short is the name of your 2nd matrix): res <- as.matrix(merge(long.date.col, short, all.x = T)) res[is.na(res)] <- "-9999" ...

Converting time series to data frame, matrix, or table

r,time-series,tapply

You do not need time series, just tapply: res=tapply(AVG_LOSCAT2$AVG_LOSCAT, list(year = AVG_LOSCAT2$YEAR, month = AVG_LOSCAT2$MONTH), round,2) res month year 1 2 3 4 5 6 7 8 9 10 11 12 2012 NA NA NA NA NA 7.51 7.31 8.33 7.66 5.36 6.46 8.30 2013 5.74 7.89 6.49 7.09 5.91...

R Nested For Loop for a Sensitivity Analysis

r,for-loop,nested,time-series

I took the liberty of defining a simple fnc function. The idea is to loop over the indices of n_lens and not over the values of n_lens. Nested for loops may be (will be?) slower in R compared to other approaches in R. It produces the required output. fnc <-...

Using Cassandra for time series data

cassandra,time-series,composite-key

My intention with this question was more like this one: Cassandra storage internal. Check it out....

Relationship between LinearModel & GeneralizedLinearMixedModel classes

matlab,oop,time-series,linear-regression,superclass

To determine this, you can use the superclasses function: superclasses('LinearModel') superclasses('GeneralizedLinearMixedModel') This will return the names of the visible superclasses for each case. As you'll see, both inherit from the abstract superclass classreg.regr.ParametricRegression. You can also view the actual classdef files and look at the inheritances. In your Command Window,...

Calculating Active dates based on gap length using Pandas Dataframes

python,date,datetime,pandas,time-series

After understanding what you want, this is much simpler: we calculate whether the difference between the current and previous rows is larger than 5 days, giving us a boolean series; we use this to filter the df and then use the index value to perform slicing: In [57]: inactive_index =...
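A plain-Python sketch of the gap test described above, assuming the dates are already sorted. The helper name `gap_starts` and the sample dates are made up; only the 5-day threshold comes from the answer.

```python
from datetime import date, timedelta


def gap_starts(days, max_gap=timedelta(days=5)):
    """Return the positions whose distance to the previous date exceeds max_gap."""
    return [i for i in range(1, len(days)) if days[i] - days[i - 1] > max_gap]


# Hypothetical activity dates with two long pauses.
days = [
    date(2015, 1, 1),
    date(2015, 1, 3),
    date(2015, 1, 12),
    date(2015, 1, 14),
    date(2015, 1, 30),
]
print(gap_starts(days))
```

The returned positions play the role of the index values used for slicing in the pandas version.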

value from a past, potentially missing month in dataframe

python,pandas,time-series

I found a way of doing this, not too happy about it tho: full_index = [] for g in all_genders: for s in all_states: for m in all_months: full_index.append((g, s, m)) df = df.set_index(['Gender', 'State', 'Month']) df = df.reindex(full_index) # fill in all missing values So basically, instead of dealing...

Pandas Dataframe Plot

python,pandas,time-series,ipython-notebook

Your problem (as spotted by @J Richard Snape) is that your dates are in fact strings, so they are ordered lexicographically. You should convert to datetime dtype: df1['Ship_date'] = pd.to_datetime(df1['Ship_date']) After which it should maintain the expected order....
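To see why string dates sort badly, here is a small standard-library sketch. The month/day/year format and the three sample dates are assumptions, since the original data is not shown.

```python
from datetime import datetime

# Hypothetical ship dates stored as text.
ship_dates = ["10/02/2015", "2/15/2015", "1/30/2015"]

# Sorting the raw strings compares character by character.
as_strings = sorted(ship_dates)

# Sorting on parsed dates gives chronological order.
as_dates = sorted(ship_dates, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

print(as_strings)
print(as_dates)
```

In the string ordering, October lands between January and February because "10" is compared character by character against "2".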

How to get a column of lists of contract payout pairs (date:amounts)?

r,time-series,dplyr

Not really sure why you want this, but here you go: library(data.table) dt = as.data.table(E) # or convert in place using setDT dt[, .(contract_len = as.numeric(difftime(Date[.N], Date[1], unit = 'days')), first_pay = Date[1], last_pay = Date[.N], num_payments = .N, payment = sum(Amt), summary = list(data.table(Date, Amt))) , by = ID]...

Adding date tick marks to a Matlab plot

matlab,plot,time-series

There is an answer on the Mathworks website that I think you will find helpful: http://www.mathworks.com/matlabcentral/answers/92565-how-do-i-control-axis-tick-labels-limits-and-axes-tick-locations. Basically what you want to do is manipulate the XTick or XTickLabel attributes of the current axis handle. Let's say I have a plot that spans 100 years from 1900 - 2000. After creating...

Calculating the difference in dates in a Pandas GroupBy object

python,pandas,time-series

Here's something that's almost your dataframe (I avoided copying the dates): df = pd.DataFrame({ 'col1': [1, 1, 1, 2, 2, 2], 'col2': [1, 2, 3, 1, 2, 3], 'date': [1, 9, 10, 10, 10, 25] }) With this, define: def max_diff_date(g): g = g.sort(columns=['date']) return g.col2.ix[(g.date.ix[1: ] - g.date.shift(1)).argmax() -...

i10test for integration/stationary time series

matlab,time-series

I think the answer is that you can specify the alpha for each test via the iParams and sparams arguments. Without such a user specification, each test has a default alpha. The button to "Answer Your Question" doesn't seem to be working, so here it is, in the Comments.

Python Pandas: Calculations with Two Different Size Dataframes

python,pandas,time-series

Give this a try. Using map to pull directly from your series of averages df["diff"] = df["snow_depth"] - df["month"].map(nameofyourseries) year month snow_depth diff 0 1979 1 18.322581 3.937382 1 1979 2 11.535714 -3.776587 2 1979 3 5.322581 -1.187855 3 1979 4 0.300000 0.031092 4 1979 5 0.000000 -0.005819 5 1979...

Plotting multivariate time-series data in R

r,graph,plot,ggplot2,time-series

Two ways of doing this: If sample data created as follows: Full.df <- data.frame(Date = as.Date("2006-01-01") + as.difftime(0:364, units = "days")) Full.df$Month <- as.Date(format(Full.df$Date, "%Y-%m-01")) Full.df[paste0("Count.", c("S", "G", "W", "F"))] <- matrix(sample(100, 365 * 4, replace = TRUE), ncol = 4) Optimal way using reshape2 package: molten <- melt(Full.df, id.vars...

ggplot vertical line with date axis R

r,ggplot2,time-series,vline

Create a new data table with the sundays data: MUSEUS_PLOT_SUNDAYS <- MUSEUS_PLOT[weekdays(MUSEUS_PLOT$VisitDate) == "Sunday"] And change the geom_vline for this: geom_vline(data = MUSEUS_PLOT_SUNDAYS,aes(xintercept = as.numeric(VisitDate)),colour = "black") ...

how to convert data frame into time series in R

r,time-series

R has multiple ways of representing time series. Since you're working with daily prices of stocks, you may wish to consider that financial markets are closed on weekends and business holidays so that trading days and calendar days are not the same. However, you may need to work with your...

Unable to pass xreg values to hts ARIMA forecast

r,time-series,forecasting

I solved the direct question, so this is technically the answer, although I don't completely understand why. I read through the HTS code using the trace() function and found the line causing issues: else if (fmethod == "arima") { models <- auto.arima(x, lambda = lambda, xreg = xreg, parallel...

How to use parameters from data frame in R and loop through time holding them constant

r,nested,time-series,lapply,sapply

Using plyr: As a matrix (time in cols, rows corresponding to rows of df): aaply(df, 1, function(x) weisurv(t, x$sc, x$shp), .expand = FALSE) As a list: alply(df, 1, function(x) weisurv(t, x$sc, x$shp)) As a data frame (structure as per matrix above): adply(df, 1, function(x) setNames(weisurv(t, x$sc, x$shp), t)) As a...

Auto-regressive model prediction decays to flatline

python,time-series,statsmodels,autoregressive-models

There is nothing wrong. That's the behavior of a stationary ARMA process where predictions converge to the mean. If you have fixed seasonality, then you could difference the time series at the seasonal lag, i.e. use a SARIMA, and the prediction would converge to a fixed seasonal structure. If you...
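The flattening is easy to reproduce by hand for the simplest case. This sketch iterates the forecast recursion of a stationary AR(1) process; the function name and all parameter values are made up, and it does not use statsmodels.

```python
def ar1_forecast(last_value, mean, phi, steps):
    """Multi-step forecasts of an AR(1) process with |phi| < 1.

    Each forecast is the mean plus phi times the previous deviation from the mean,
    so the deviation shrinks geometrically.
    """
    forecasts = []
    current = last_value
    for _ in range(steps):
        current = mean + phi * (current - mean)
        forecasts.append(current)
    return forecasts


# Hypothetical process: mean 2.0, coefficient 0.8, last observation far above the mean.
path = ar1_forecast(last_value=10.0, mean=2.0, phi=0.8, steps=50)
print(path[0], path[-1])
```

The first forecast is still close to the last observation, while the fiftieth is practically the mean, which is the flat line seen in the plot.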

Fitting a polynomial curve to time series data

r,ggplot2,time-series,lm

Try using mean instead of sum, like this: ggplot(data = df, aes(x = Month, y = Count.V)) + stat_summary(fun.y = mean, geom ="line")+ stat_smooth(method = "lm", formula = y ~ poly(x, 3), size = 1) + geom_point()+ scale_x_date(labels = date_format("%m-%y"), breaks = "3 months") ...

How to create a function and a loop to calculate growth rates of variables in a data frame in R

r,time-series,growth-rate

You can use data.table too. data.table is a very powerful data manipulation package. You can get started here. library("data.table") as.data.table(testdata)[, lapply(.SD, function(x)x/shift(x) - 1), .SDcols = 2:4] gdp cpi_index rpi_index 1: NA NA NA 2: 0.006427064 0.0072281257 0.009296686 3: 0.007166400 0.0030245056 0.004805767 4: 0.004061822 0.0061020008 0.006377043 5: 0.006772674 0.0009282349 0.005544554...
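The expression x/shift(x) - 1 is the period-over-period growth rate. A plain-Python sketch of the same calculation, with a hypothetical helper name and made-up figures:

```python
def growth_rates(series):
    """Growth rate of each value relative to the previous one; the first entry has none."""
    return [None] + [cur / prev - 1 for prev, cur in zip(series, series[1:])]


# Hypothetical index values.
gdp = [100.0, 102.0, 96.9]
rates = growth_rates(gdp)
print(rates)
```

As in the data.table output, the first row has no predecessor and is therefore empty.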

Why does R give me a time series of row numbers instead of values?

r,time-series

Your values variable is a factor (usually used for categorical values). Convert values to numeric before creating time series: values <- as.numeric(levels(values))[values] ...

How to create a time series plot in the style of a horizontal stacked bar plot in r

r,plot,time-series,bar-chart

For a ggplot2 plot first convert df to long form (using melt from the reshape2 package), convert the date column to "Date" class and the value column to a factor and then use geom_tile: library(ggplot2) library(reshape2) long <- melt(df, measure.var = 2:4) long <- transform(long, date = as.Date(long$date, "%d/%m/%Y"), value...

vgxset command: Q parameter for resulting model object?

matlab,time-series

I will try to iterate toward an answer, but with so many branches of discussion, I prefer to post directly in this format. In any case, this is a constructive process, as is the purpose of this forum... Some preliminary "clarifications": The output covariance from EstSpec.Q after and before running the...

Different regression output using dynlm and lm

r,time-series,lm

You are not allowing dynlm to use the same amount of data as in lm. The latter model contains two fewer observations. dim(model.frame(reg1)) # [1] 24 7 dim(model.frame(lmx)) # [1] 22 7 The reason is that with lm you are transforming the variables (differencing) with the entire data set (31 observations),...

seas start year error R

r,time-series

If your data has the value 4199, this means that you included the date column when trying to form your ts object. Since you specified the start and frequency of your time series in your ts function, you no longer need the date values as it will be generated by...

Direct forecast using epsilon-SVR

matlab,time-series,libsvm,forecasting

A Support-Vector-Regression based predictor is used for exactly that. It shall stand for PH >= 1. The value of epsilon in the epsilon-SVR model specifies the epsilon-tube, within which no penalty is associated in the training loss function with points predicted within a distance epsilon from the actual value Y(t)....

Use Python/Pandas to Match Sample Pairs Yearly Data

python,pandas,match,time-series,multi-index

This method is a little messy, but I am trying to make it more robust to account for missing data. First, we'll remove duplicates in the data and then convert the dates to Pandas Timestamps: df = df.drop_duplicates() df.SampleDate = [pd.Timestamp(ts) for ts in df.SampleDate] Then let's arrange your DataFrame...

R seasonal decomposition

r,time-series

There must be more than 2 periods, so frequency must be less than n/2 n = 1000 x = ts(0.1*rnorm(n) + sin(6*pi*(1:n)/n) + (1:n)/n, frequency=n/2.1) plot(x) stl(x,"per") ...

plotting; adding own x-axis does not work

r,plot,time-series

set.seed(1) r <- rnorm(20,0,1) z <- c(1,1,1,1,1,-1,-1,-1,1,-1,1,1,1,-1,1,1,-1,-1,1,-1) data <- as.data.frame(na.omit(cbind(z, r))) series1 <- ts(cumsum(c(1,data[,2]*data[,1]))) series2 <- ts(cumsum(c(1,data[,2]))) d1y <- seq(as.Date("1991-01-01"),as.Date("2015-01-01"),length.out=24) matplot(cbind(series1, series2), xaxt = "n", xlab = "Time", ylab = "Value", col = 1:3, ann = TRUE, type = 'l', lty = 1) axis(1, at=seq(2,20,2), labels=format(d1y[seq(2,20,2)],"%Y")) ...

R function to return start & end date of a time series ts() object?

r,datetime,time-series,forecasting

Here is a simple example assuming weekly data: x <- ts(rnorm(200), frequency=52) endx <- end(x) window(x, end=c(endx[1],endx[2]-3)) Of course, there are not actually 52 weeks in a year, but that is probably a complication that can be overlooked for most analyses....

Replace -inf, NaN and NA values with zero in a dataset in R

r,time-series,nan,zoo

As per ?zoo: Subscripting by a zoo object whose data contains logical values is undefined. So you need to wrap the subsetting in a which call: log_ret[which(!is.finite(log_ret))] <- 0 log_ret x y z s p t 2005-01-01 0.234 -0.012 0 0 0.454 0 ...
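The underlying operation, setting every non-finite entry to zero, can be sketched in Python as follows. The helper name is hypothetical, and `None` is used as a stand-in for R's NA.

```python
import math


def zero_non_finite(values):
    """Replace None, NaN, inf and -inf with 0.0 and leave finite numbers untouched."""
    return [v if v is not None and math.isfinite(v) else 0.0 for v in values]


# Hypothetical log returns with the three kinds of bad value.
log_ret = [0.234, float("-inf"), float("nan"), None, -0.012]
print(zero_non_finite(log_ret))
```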

Timeline just with years that shall be ordered and just showing per year how many values are available

d3.js,time-series,timeline,timeserieschart

Here is an augmentation of your JS fiddle, Demo: http://jsfiddle.net/robschmuecker/c8txLxo9/ It takes the data you have and then parses it to get a collection of years so that we only insert one dom element per year rather than several. Then we can conditionally add events for years which have more...

How to aggregate time series documents in mongodb

mongodb,mapreduce,time-series,mongodb-query,nosql-aggregation

This will be hard to achieve using the aggregation framework. But it "works" well with MapReduce. Something along the lines of that (untested): // collect *individual* values map = function() { for (var min in this.values) for (sec in this.values[min]) data = {value: {}, count: {}} data.value[this.name] = this.values[min][sec] data.count[this.name]...

Plot monthly Time series from a data frame with daily data

r,plot,time-series,legend

Mannat, here is an answer using the data.table package to help you aggregate. Use install.packages("data.table") to first install it in R. library(data.table) # For others # I copied your data into a csv file, Mannat you will not need this step, # other helpers look at data in DATA section...

Pandas Time-Series: Find previous value for each ID based on year and semester

python,pandas,time-series

I think there are two critical points: (1) sorting by Year and Term so that the order corresponds to temporal order; and (2) using groupby to collect on IDs before selecting and shifting the Rating. So, from a frame like >>> df ID Year Term Rating 0 1 2010 0...

Rolling average pairwise correlation - code doesn't work as expected

r,statistics,time-series,correlation,xts

What about using rollapply in a different way? As you don't supply the complete dataset, here is a demonstration of what I mean: set.seed(123) m <- matrix(rnorm(100), ncol = 10) rollapply(1:nrow(m), 5, function(x) cor.mean(m[x,])) [1] -0.080029692 -0.038168840 -0.058443824 0.005699772 -0.014459878 -0.021569173 As I just figured out, you can also use the function...

Issue with setting up time series correctly in R

r,time-series

I don't see the xts frequency argument doing the same thing as the ts frequency argument. So, I assume you need to convert your data into a ts object before you use decompose. The way I got it to work is the following: Using the following data: data(sample_matrix) df <-...

ts.intersect does not work with xts objects

r,time-series,xts

ts.intersect determines whether the object is a ts object by looking for the tsp attribute. as.xts.ts removes the tsp attribute, which is why it is not coerced back to a ts object. This looks like a bug in xts->ts->xts conversion, but I need to take a closer look. As a...

SensorEvent.timestamp and Location.getElapsedRealtimeNanos() Timestamp Delay Offset

java,android,gps,time-series,kalman-filter

The answer is simple, the SensorEvent.timestamp has an arbitrary zero reference: It turns out after a bit of Googling (tip o' the hat to StackOverflow, as usual) that the timestamp one receives isn't based off of any particular 0-point defined in the Android OS or the API; it's an arbitrary...

Estimating change of a cyclic boolean variable

time-series,sampling,measurement,probability-theory

I'm going to approach this problem as if it were on a test. First, let's name the variables. Bx is value of the boolean variable after x opportunities to flip (and B0 is the initial state). P is the chance of changing to a different value every opportunity. Given that...

Lag dependent variable [closed]

r,time-series

Use the dynlm package. Here is an example using the data you supplied: library(dynlm) dfX = read.table( textConnection( "Date YY XX ZZ MM 03.01.2005 2.154 2.089 0.001 344999 04.01.2005 2.151 2.084 0.006 344999 05.01.2005 2.151 2.087 -0.007 333998 06.01.2005 2.15 2.085 -0.005 333998 07.01.2005 2.146 2.086 -0.006 333998 10.01.2005 2.146...

Replace list of permutations with getSymbols data in R

r,time-series,permutation,quantmod

If you wanted to get a list of data frames, one for each pair, you could try: dfs <- lapply(seq_len(ncol(perm)), function(x) close[,paste0(perm[,x], ".Close")]) Now you can get the 2-column data frames for each pair with dfs[[1]], dfs[[2]], etc. You can perform statistical analyses on each pair using the lapply function....

Convert Daily Data into Weekly in R Week Starts on Saturday

r,data.frame,time-series,weekend

Find the first Saturday in your data, then assign a week ID to all dates in your data set based on that : library(lubridate) # for the wday() and ymd() functions daily_FWIH$Date <- ymd(daily_FWIH$Date) saturdays <- daily_FWIH[wday(daily_FWIH$Date) == 7, ] # filter for Saturdays startDate <- min(saturdays$Date) # select first...
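The week-ID idea can be sketched with the standard library: find the first Saturday, then count whole weeks from it. The helper name `week_ids` and the sample dates are assumptions; note that Python's weekday() numbers Saturday as 5, whereas lubridate's wday() numbers it as 7.

```python
from datetime import date, timedelta


def week_ids(days):
    """Assign each date the number of whole weeks since the first Saturday in `days`.

    Dates before the first Saturday get a negative ID.
    """
    saturdays = [d for d in days if d.weekday() == 5]  # Monday is 0, so Saturday is 5
    start = min(saturdays)
    return [(d - start).days // 7 for d in days]


# Hypothetical daily data: Friday 1 May 2015 through Sunday 10 May 2015.
days = [date(2015, 5, 1) + timedelta(days=i) for i in range(10)]
print(week_ids(days))
```

Grouping the daily values by this ID and summarising each group then gives weekly figures for weeks that run Saturday to Friday.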

R: faster alternative of period.apply

r,time-series,apply

Using rowsum seems to be faster (at least for this small example dataset) than the data.table approach: sgibb <- function(datframe) { data.frame(Group = unique(datframe$Group), Avg = rowsum(datframe$Weighted_Value, datframe$Group)/rowsum(datframe$SumVal, datframe$Group)) } Adding the rowsum approach to @platfort's benchmark: library(microbenchmark) library(dplyr) library(data.table) microbenchmark( Nader = df %>% group_by(Group) %>% summarise(res = sum(Weighted_Value)...

ETS multiplicative trend model written in state space form

time-series,forecasting,state-space

The state vector is exactly the same in the multiplicative case as in the additive case. All the equations are given here: https://www.otexts.org/fpp/7/7 For the ETS(M,Md,N) model, ...

Setting limits with scale_x_datetime and time data

r,ggplot2,time-series

The error message says that you should use as.POSIXct on lims. You also need to add the date (year, month and day) to lims, because by default the date will fall in 2015, which is outside the range of your data. lims <- as.POSIXct(strptime(c("2011-01-01 03:00","2011-01-01 16:00"), format = "%Y-%m-%d %H:%M")) ggplot(df, aes(x=dates, y=times)) + geom_point() +...

Dates with month and day in time series plot in ggplot2 with facet for years

r,ggplot2,time-series

You are very close. You want the x-axis to be a measure of where in the year you are, but you have it as a character vector and so are getting every single point labelled. If you instead make a continuous variable represent this, you could have better results. One...

Combining time series data into a single data frame

r,date,data.frame,time-series

You can try Reduce(function(...) merge(..., by=c('Date', 'Month', 'Week', 'Year'), all=TRUE), list(Standard.df, Guardian.df, Welt.df)) ...

Time Series Oriented IoT Platform

database,rest,time-series,publish-subscribe,iot

If you want a single solution, try ATSD; it does all of the above.

Cassandra Time-Series: Allow Filtering, Buckets, or Other

database,cassandra,time-series,data-modeling,cql

Writing this question has helped me sort out some of my problems. I've come up with an alternative solution which I am more-or-less happy with but will need some fine-tuning. There is the possibility of calculating all of the time buckets we need to access, making a query for each...
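Calculating the time buckets that a query range touches can be sketched as follows. This is only an illustration: the excerpt does not show the schema or the bucket size, so the one-day default and the name bucket_keys are assumptions, and the per-bucket query itself is left out.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bucket_keys(start, end, bucket=timedelta(days=1)):
    """Return the start time of every bucket that overlaps [start, end]."""
    # Floor the range start to the nearest bucket boundary at or before it.
    current = start - (start - EPOCH) % bucket
    keys = []
    while current <= end:
        keys.append(current)
        current += bucket
    return keys


if __name__ == "__main__":
    for key in bucket_keys(datetime(2015, 6, 1, 22, 30), datetime(2015, 6, 3, 1, 0)):
        print(key.isoformat())
```

Each returned key would then drive one query against the partition for that bucket, and the application would concatenate the results.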

Time series forecasting using SVM

python,time-series

So (X,y) is your training set (356 data instances with their labels). To forecast the first month of the next year, your SVR model needs a data set X_nextMonth (30 data instances with the same features as those of X) to pass as an argument to its .predict() method that he...

R: HAC by NeweyWest using dynlm

r,time-series,regression

NeweyWest calculates the 'lag' with this code: lag <- floor(bwNeweyWest(x, order.by = order.by, prewhite = prewhite, ar.method = ar.method, data = data)) ... and when called with the default arguments it replicates your (and my replication of it) error: >bwNeweyWest(m2,lag = NULL, order.by = NULL, prewhite = TRUE, adjust =...

Dates on x-axis, time series

r,date,time-series,as.date

I had to augment your example to get something to play with, but here is something that works. And I just changed it to eliminate lubridate... library(xts) d1 <- seq(as.Date("2001-01-01"),as.Date("2021-01-01"),"years") d2 <- rnorm(21,10,1) Dollar <- data.frame(d1,d2) dates <- as.Date(Dollar[,1], "%d.%m.%Y",tz="GMT") xtsplot <- as.xts(Dollar[,2], dates) plot(xtsplot, xaxt = "n", main="SMA", ann...

150x150 crosstab in stata, showing timeseries movement between categories

time-series,stata,crosstab

tabout from SSC may work for you: clear set more off *----- example data set ----- input /// id year occup 1 1999 1 1 2000 1 1 2001 1 2 1999 1 2 2000 2 2 2001 1 3 1999 1 3 2000 2 3 2001 2 4 1999...

multi-monthly mean with pandas' Series

python,pandas,time-series

The following worked for me: # create some random data with datetime index spanning 17 months s = pd.Series(index=pd.date_range(start=dt.datetime(2014,1,1), end = dt.datetime(2015,6,1)), data = np.random.randn(517)) In [25]: # now calc the mean for each month s.groupby(s.index.month).mean() Out[25]: 1 0.021974 2 -0.192685 3 0.095229 4 -0.353050 5 0.239336 6 -0.079959 7...

R - how to calculate “global” monthly means of a zoo object

r,time-series,mean,zoo

You can aggregate with a data.table library(data.table) # This turns all Jans to 1 and Decs to 12 for example mth <- month(as.Date(df$date)) dt2 <- as.data.table(df) # turn df into data table dt dt2[, mth := mth] # pop month into your data frame setkey(dt2, "mth") # data tables...

R: Handling subsets using dynlm

r,time-series,subset

If you do the subsetting yourself via data = zooX[...,], then dynlm() doesn't see the full sample and hence has to lose two observations. If you supply the full data = zooX and then set end = 14 and start = 15 respectively, then dynlm() can first put together the...

Matlab's VAR[X] coefficient constraints for vector time series

matlab,time-series

On Friday, May 29, 2015 at 2:05:06 PM UTC-4, Rick wrote: (1) No, not necessarily. Turning off a flag (i.e., setting a particular element of an input "solve" flag to logical FALSE) holds the corresponding parameter value fixed throughout the estimation. For example, if, say, the 3rd element of the...

Difference between multi year timeseries and its 'standard year'

python,datetime,numpy,pandas,time-series

You can do this using the groupby, just subtract each group's mean from the values for that group: average_diff = ts.groupby([ts.index.month, ts.index.day]).apply( lambda g: g - g.mean() ) ...
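To make the grouping logic explicit, here is a sketch of the same computation without pandas: the "standard year" is taken to be the mean of each (month, day) pair across all years, and each observation has that mean subtracted. The function name and the dict-based input are assumptions made for illustration and are not part of the answer.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def deviation_from_standard_year(series):
    """series maps date -> value; returns date -> value minus the (month, day) mean."""
    groups = defaultdict(list)
    for d, value in series.items():
        groups[(d.month, d.day)].append(value)
    # One mean per calendar day, pooled over every year in the data.
    standard_year = {key: mean(values) for key, values in groups.items()}
    return {d: value - standard_year[(d.month, d.day)] for d, value in series.items()}


ts = {date(2013, 1, 1): 1.0, date(2014, 1, 1): 3.0, date(2013, 1, 2): 5.0}
diff = deviation_from_standard_year(ts)
```

A calendar day that occurs only once in the data, such as 2 January in this sample, always gets a deviation of zero, which is worth keeping in mind for 29 February.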

SciKit-learn for data driven regression of oscillating data

python,time-series,scikit-learn,regression,prediction

Here is my guess about what is happening in your two types of results: .days does not convert your index into a form that repeats itself between your train and test samples. So it becomes a unique value for every date in your dataset. As a consequence your models either...
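One way to obtain a feature that does repeat between the training and test periods is to encode the position within the year instead of the elapsed number of days. The excerpt is truncated, so this is not necessarily the remedy the answer goes on to propose; it is a sketch that assumes a yearly cycle, and the name seasonal_features is hypothetical.

```python
import math
from datetime import date


def seasonal_features(d):
    """Encode the position of a date within its year as a point on the unit circle."""
    day_of_year = d.timetuple().tm_yday  # 1 on 1 January
    angle = 2.0 * math.pi * (day_of_year - 1) / 365.25
    return math.sin(angle), math.cos(angle)


if __name__ == "__main__":
    origin = date(2014, 1, 1)
    for d in (date(2014, 7, 1), date(2015, 7, 1)):
        # Elapsed days keep growing, while the seasonal encoding repeats.
        print((d - origin).days, seasonal_features(d))
```

Using the sine and cosine together keeps 31 December and 1 January close to each other, which a raw day-of-year number would not.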