Group by GROUPING SETS with empty grouping: ... Group by Grouping Sets((datename(month,arvd), datepart(month,arvd), year(arvd)), ()) order by year(arvd), datepart(month,arvd) ...
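For reference, a minimal sketch of the same technique, assuming SQL Server and a hypothetical bookings table with an arvd date column (the empty () grouping set is what produces the extra grand-total row):

SELECT DATENAME(month, arvd) AS month_name,
       YEAR(arvd)            AS yr,
       COUNT(*)              AS total        -- per-month counts, plus one overall row
FROM bookings
GROUP BY GROUPING SETS ((DATENAME(month, arvd), DATEPART(month, arvd), YEAR(arvd)), ())
ORDER BY YEAR(arvd), DATEPART(month, arvd);

In the grand-total row the grouped expressions come back as NULL, which is how it can be told apart from the monthly rows.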
group-by,sas,date-formatting,proc-sql
Try converting OCC_DATE to text like this: put(OCC_DATE, year4.) as MyOCC_DATE. I think the original code is only displaying the date in the format you assigned, but it is still the actual numerical value of OCC_DATE. If you change it to character in the format you want then it should...
python,pandas,group-by,dataframes
To use multiple conditions you need to use the bit-wise & and not and; you also need to enclose the conditions in parentheses due to operator precedence: DF[(DF['Iteration'] == CURRENTLOG_ID) & (DF['Feature Enabled'] == 1)].groupby(['Feature Active'])[['Value1','Value2']].mean() should work...
sql-server,sql-server-2008,group-by,distinct,database-performance
Both queries have exactly the same execution plan. select distinct a,b,c from bigTable and select a,b,c from bigTable group by a,b,c are processed in the same way, so you can choose whichever syntax you prefer. To get better at query tuning, just use the "Show execution plan" button in SQL Server...
mongodb,group-by,nested,aggregate,multiple-columns
Please try with one additional (intermediate) "group-by" step: db.tasks.aggregate([ {"$group": { "_id": { "account_id": "$account_id", "task_id": "$task_id", "workweek": "$workweek" }, "total_hours": {$sum: "$hours"} }}, {"$group": { "_id": { "account_id": "$_id.account_id", "task_id": "$_id.task_id" }, "workweeks": { "$push": { "workweek": "$_id.workweek", "total_hours": "$total_hours" } } }}, {"$group": { "_id": "$_id.account_id", "tasks": { "$push": {...
Change your query from: SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, [AsOfDate])) as DateOnly ,Min([Value]) FROM Table1 group by DateOnly To SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, [AsOfDate])) as DateOnly ,Min([Value]) FROM Table1 group by DATEADD(dd, 0, DATEDIFF(dd, 0, [AsOfDate])) the alias DateOnly is known to this level of query ( only...
The problem here is not "how to pass a variable to the group-by attribute", but what is the content of the passed variable. The group-by attribute must contain an XPath expression - you are passing it a string value. If there are only two ways to group the nodes, I...
sql,postgresql,group-by,subquery
You can achieve this with a window function, which doesn't require everything to be grouped: select * from ( select addresses.phone, addresses.name, orders.order_number, count(orders.order_number) over (partition by addresses.phone) as cnt from orders inner join carts on orders.order_number = carts.id inner join addresses on carts.address_id = addresses.id ) t where cnt...
To solve a problem like this, you first need to find the youngest male/female age for each representative. You can do that by using the MIN() function and grouping by sex and rep id like this: SELECT rep_id, sex, MIN(AGE) AS youngestAge FROM myTable GROUP BY rep_id, sex; Once you...
SQL Fiddle Oracle 11g R2 Schema Setup: CREATE TABLE FILE_USAGE_LOG (TESTID, SITE, LATEST_READ, READ_COUNT, FILE_ORIGIN_ID ) AS SELECT 'File1', 'Site1', DATE '2013-05-02', 2, 1 FROM DUAL UNION ALL SELECT 'File1', 'Site2', DATE '2014-01-22', 3, 2 FROM DUAL UNION ALL SELECT 'File2', 'Site1', DATE '2014-06-02', 8, 0 FROM DUAL UNION ALL...
sql-server,group-by,group-concat
Just needed to add DISTINCT to the STUFF. select stuff((select DISTINCT ', ' + Hospital from A where (InsuranceName = i.InsuranceName) for xml path(''),type).value('(./text())[1]','varchar(max)') ,1,2,'') as Hospitals, i.InsuranceName, sum(i.PatientCount) from A i group by i.InsuranceName; ...
One way would be to work with the indices of the minimum values. For example: >>> imin = df.groupby("Symbol")["x"].transform("idxmin") >>> df["yminx"] = df.loc[imin, "y"].values >>> df Symbol x y yminx 0 IBM 12 27 58 1 IBM 1 58 58 2 IBM 13 39 58 3 IBM 4 45 58...
> CREATE TABLE mytable(foo, boo, [...]); > CREATE INDEX bfi ON mytable(boo, foo COLLATE NOCASE); > EXPLAIN QUERY PLAN SELECT foo, boo FROM mytable WHERE foo LIKE 'hi%' GROUP BY boo; 0|0|0|SCAN TABLE mytable USING COVERING INDEX bfi ...
You are using the Last_day() function, which is not an aggregate function, so you need to include that expression in the GROUP BY: SELECT ID, LAST_DAY(EXPDATE) EXPDATE, SUM(BASEUNITS) AS BASEUNITS, SUM(BONUSUNITS) AS BONUSUNITS FROM TABLE WHERE ID= '10' GROUP BY ID, Last_Day(EXPDATE) ...
You can do this by finding the minimum time stamp and then choosing all the logins associated with that. This would be much easier with window/analytic functions, but in MySQL: select t.* from mytable t join (select t2.userid, substring_index(group_concat(t2.loginid order by timestamp), ',', 1) as firstlogin from mytable t2 group...
Add a new field to group on, as below. SELECT [GroupName], AVG(s.dlyspeed) FROM swa_intervention AS i INNER JOIN swa_speeding AS s ON i.policy_id = s.policy_id AND s.date_taken BETWEEN DATEADD(WEEK,-4,i.note_date) AND DATEADD(WEEK,4,i.note_date) INNER JOIN swa_policy AS p ON i.policy_id = p.id CROSS APPLY (SELECT CASE WHEN s.date_taken > i.note_date THEN 'After'...
First, a note on normalization: you shouldn't store job_id and user_id in both the applicants and application table. Likely, you only need them in the 'application' table, since I can go from applicant => application to determine that information. By storing those relationships in two tables, you open yourself up...
sql-server,join,group-by,sum,sql-update
Try this: update w set hist_amt_due = d.s from @work w join (select pmt_customer_no, sum(isnull(due_amt, 0)) s from @due_cte group by pmt_customer_no)d on w.pmt_customer_no = d.pmt_customer_no ...
I was facing a similar issue as the OP and found this question while looking for solutions. A simple hack that worked for me after going through the pandas documentation for categorical variables was to change the type of the categorical variable before grouping. Since vol_B is the categorical variable...
Try df$Type <- c('B', 'A')[(df$ID %in% c(101:103, 401:403))+1L] Or df$Type <- c('A', 'B')[(df$ID>103 & df$ID<401)+1L] df <- df[order(df$Type),] row.names(df) <- NULL df ID Point_A Type 1 101 10 A 2 102 20 A 3 103 30 A 4 401 100 A 5 402 110 A 6 403 120 A 7...
mysql,sql,database,group-by,group
Gordon should have credit for writing this out. But I think you probably just want to append a column with the appropriate descriptor and probably sort them in the order you'd like to see them. select title, case when date = curdate() then 'today' when date >= curdate() - interval...
mysql,group-by,left-join,right-join
SQL deals in tables. By definition a table has a bunch of rows, each of which has the same columns as each other. Your query is going to yield a result set that duplicates the client's information for each course she took. Your presentation layer is going to format that...
This is what MoreLinq calls DistinctBy. But that method works on IEnumerable, so you can't use it in an EF query. You can use the same approach, though: var query = from r in db.SURV_Question_Ext_Model join s in db.SURV_Question_Model on r.Qext_Question_ID equals s.Question_ID where s.Question_Survey_ID == Survey_ID...
Maybe something like this: SELECT order_id, MAX(CASE WHEN state='created' THEN time_stamp ELSE NULL END) AS created_ts, MAX(CASE WHEN state='ack' THEN time_stamp ELSE NULL END) AS ack_ts, MAX(CASE WHEN state='shipped' THEN time_stamp ELSE NULL END) AS shipped_ts, MAX(CASE WHEN state='delivered' THEN time_stamp ELSE NULL END) AS delivered_ts FROM Table1 GROUP BY...
If you want mongodb to handle the query internally you could use the aggregation framework. In mongodb it looks like: db.users.aggregate( [{ $group: { _id: '$firstName', // similar to SQL group by, a field value for '_id' will be returned with the firstName values count: {$sum: 1} // creates a...
php,mysql,count,group-by,having-clause
You can do it by joining on the total category counts, and then using conditional aggregation: select modulecategory, count(case when requireall = 'yes' then if(s = t, 1, null) else s end) from ( select modulecategory,empname, requireall, count(*) s, min(q.total) t from employeeskill e inner join modulecategoryskill mcs on e.skillid...
I think this achieves your desired result, going at it slightly differently: def infect_new_people(group): if (group['Status'] == 'Infected').any(): # Only affect people not already infected group.loc[group['Status'] != 'Infected', 'Status'] = 'Infected2' return group['Status'] # Need group_keys=False so that each group has the same index # as the original dataframe df['Status']...
python,pandas,group-by,dataframes
You can use GroupBy.transform: >>> f = df.groupby(['City', 'Date'])['Weight'].transform >>> df['Wt_Diff'] = f('max') - f('min') >>> df City Date Sex Weight Wt_Diff 0 A 6/12/2015 M 185 65 1 A 6/12/2015 F 120 65 2 A 7/12/2015 M 210 105 3 A 7/12/2015 F 105 105 4 B 6/12/2015 M...
just break it into two steps objs = StoreVideoEventSummary.objects.filter(Timestamp__range=(start_time, end_time), Customer__id=customer_id, Store__StoreName=store)\ .order_by("Timestamp") def date_hour(timestamp): return datetime.datetime.fromtimestamp(timestamp).strftime("%x %H") groups = itertools.groupby(objs, lambda x:date_hour(x.Timestamp)) #since groups is an iterator and not a list you have not yet traversed the list for group,matches in groups: #now you are traversing the list ......
A good idea would be to first normalize your data. You could, for example, try it this way (I assume your source table is named Files): Create a simple table called PermissionCodes with a single column named Code (type string). Put r, w, and x as values into PermissionCodes (three rows total)....
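A minimal sketch of those first steps, assuming a SQL Server-style dialect (the Files table is the questioner's; PermissionCodes is the new lookup table described above):

-- lookup table holding one row per permission code
CREATE TABLE PermissionCodes (Code varchar(1) NOT NULL);
INSERT INTO PermissionCodes (Code) VALUES ('r'), ('w'), ('x');

The rest of the normalization (relating Files rows to these codes) would follow the truncated steps of the answer.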
sql-server,tsql,group-by,max,aggregate-functions
You are grouping by the aggregate in your query. You need to group by the scalar columns instead. In this case, group by f.fizz_name, fu.foo_name
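A rough sketch of that shape (the aliases f and fu come from the answer; the table names, the join, and the aggregated column are assumptions for illustration):

SELECT f.fizz_name, fu.foo_name, MAX(fu.amount) AS max_amount  -- the aggregate stays out of the GROUP BY
FROM fizz f
JOIN foo fu ON fu.fizz_id = f.fizz_id
GROUP BY f.fizz_name, fu.foo_name;                             -- group only by the scalar columns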
javascript,angularjs,database,group-by
For grouping you can use the reduce function: angular.module('myApp.view2', []) .controller('View2Ctrl', ['$scope', '$filter', function($scope, $filter) { $scope.securities = ['Preferred Stock', 'Common Stock', 'Options']; $scope.transactions = [{ security: 'Preferred Stock', name: 'Robert', value: 5, date: '2014-1-3' }, { security: 'Preferred Stock', name: 'Robert', value: 5, date: '2014-1-5' }, { security: 'Common Stock',...
Let's try this example of ranking without skipping a rank: set @number:=0; set @balance:=0; select customer, balance, rank from ( select *, @number:=if(@balance=balance, @number, @number+1) as rank, @balance:=balance from account order by balance ) as rankeddata; Result customer balance rank S 300 1 Q 400 2 R 400 2 P 500...
mysql,oracle,group-by,sql-order-by,aggregate-functions
Yes, it's possible. Return id_article in the SELECT list, instead of title, and wrap that whole query in parens to make it an inline view, and then select from that, and a join to the articles table to get the associated title. For example: SELECT b.title , c.tag_count FROM (...
sql-server,sql-server-2008,group-by,group,aggregates
I'm assuming that shift 1 is supposed to start at midnight and shift 2 starts at noon, but it's sometimes off by some amount of time. This query should work if those assumptions are true. I've made a variable called @ShiftFudgeHours to account for how off the shift start times...
If you want to first take the mean on the ['cluster', 'org'] combination and then again take the mean on cluster groups In [59]: (df.groupby(['cluster', 'org'], as_index=False).mean() .groupby('cluster')['time'].mean()) Out[59]: cluster 1 15 2 54 3 6 Name: time, dtype: int64 If you want mean values by cluster only, then you could In [58]:...
If you could take the values on separate rows, then you could do something like this: (select a.* from archivetable a order by maxtemp limit 1) union (select a.* from archivetable a order by maxtemp desc limit 1) union . . . Otherwise, you can do something like this: select atmint.mintemp,...
"Same address" is expressed by "groupby". import pandas as pd df=pd.DataFrame({'First': [ 'Sam', 'Greg', 'Steve', 'Sam', 'Jill', 'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'], 'Last': [ 'Stevens', 'Hamcunning', 'Strange', 'Stevens', 'Vargas', 'Simon', 'Purple', 'Green', 'Simon', 'Simon'], 'Address': ['112 Fake St','13 Crest St','14 Main St','112 Fake St','2 Morningwood','7 Cotton Dr','14 Main St','20 Main...
mysql,group-by,heatmap,weekday
Those look like "counts" of rows. One of the issues is "sparse" data; we can address that later. To get the day of the week ('Sunday','Monday', etc.) returned, you can use the DATE_FORMAT function. To get those ordered, we need to include an integer value 0 through 6, or 1 through...
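A minimal sketch of that idea in MySQL (the events table and its created_at column are assumptions for illustration):

SELECT DATE_FORMAT(created_at, '%W') AS day_name,   -- 'Sunday', 'Monday', ...
       DAYOFWEEK(created_at)         AS day_num,    -- 1..7, used only for ordering
       COUNT(*)                      AS row_count
FROM events
GROUP BY day_name, day_num
ORDER BY day_num;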
mysql,group-by,sum,sql-order-by,sql-limit
Using one of the answers from ROW_NUMBER() in MySQL for row counts, and then modifying to get the top. SELECT ParticipantId, SUM(Points) FROM ( SELECT a.participantid, a.points, a.id, count(*) as row_number FROM scores a JOIN scores b ON a.participantid = b.participantid AND cast(concat(a.points,'.', a.id) as decimal) <= cast(concat(b.points,'.', b.id) as...
string,python-2.7,pandas,group-by,aggregate
Sorry are you after this: In [14]: df['new_string'] = df.groupby('type')['string'].transform(lambda x: '+'.join(x)) df Out[14]: type item string new_string 0 1 0 aa aa+bb+cc 1 1 1 bb aa+bb+cc 2 1 2 cc aa+bb+cc 3 2 0 dd dd+ee+ff 4 2 1 ee dd+ee+ff 5 2 2 ff dd+ee+ff The above...
Is this what you are looking for? SELECT <choose your columns here> FROM Orders o LEFT JOIN XrefOrdersStatuses x ON x.xos_order_id = o.order_id LEFT JOIN (SELECT xos_order_id, MAX(xos_datetime) AS maxdate FROM XrefOrdersStatuses GROUP BY xos_order_id ) xmax ON xmax.xos_order_id = x.xos_order_id AND xmax.maxdate = x.xos_datetime; The LEFT JOIN is only...
The error you've run into In Oracle, it's best to always name each column in each UNION subquery the same way. In your case, the following should work: select count(*) as theCount, COMP_IDENTIFIER from CORDYS_NCB_LOG where AUDIT_CONTEXT='FAULT' group by COMP_IDENTIFIER -- don't forget this union select count(*) as theCount, COMP_IDENTIFIER...
sql,oracle,count,group-by,aggregate-functions
Use a derived table? select sum(cnt) from ( select count(*) as cnt from t_object union all select count(*) as cnt from t_diagram ) dt ...
python,pandas,group-by,time-series
(Am a bit amused, as this question caught me doing the exact same thing.) You could do something like valgdata\ .groupby([valgdata.dato_uden_tid.name, valgdata.news_site.name])\ .mean()\ .unstack() which would reverse the groupby unstack the new sites to be columns To plot, just do the previous snippet immediately followed by .plot(): valgdata\ .groupby([valgdata.dato_uden_tid.name, valgdata.news_site.name])\...
SELECT username ,LISTAGG(colour , ',') WITHIN GROUP (ORDER BY colour ) AS colour ,age FROM t GROUP BY username,age ; ...
You could do something like this: SELECT Student.SID, Student.SName, Student.SEmail, SUM(CASE WHEN Fees_Type.FName='Chess' THEN 1 ELSE 0 END) AS Total_Chess_Played, SUM(CASE WHEN Fees_Type.FName='Cricket' THEN 1 ELSE 0 END) AS Total_Cricket_Played, SUM(Fees_Type.FPrice) AS ToTal_Fees FROM Student JOIN StudentFees ON Student.sId = StudentFees.EId JOIN Fees_Type ON Fees_Type.fId = StudentFees.fId WHERE MONTH(StudentFees.TDDate) =...
Is the below the correct and most narrow scope for the produced result? Well, IGrouping<TKey,TElement> implements IEnumerable<TElement>, so you could do: IEnumerable<IEnumerable<TimeInfo>> grouping = array.GroupBy(element => element.DayOfWeek); Note that the data (including the key values) is still in the concrete data structure; you're just interfacing with it as...
What you are looking for is affectionately known as a groupwise max: http://jan.kneschke.de/projects/mysql/groupwise-max/ https://dev.mysql.com/doc/refman/5.0/en/example-maximum-column-group-row.html...
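For reference, one common shape of a groupwise max, shown as a sketch only (the table and column names are placeholders, not from the question):

-- keep the row(s) holding the maximum val within each grp
SELECT t.*
FROM mytable t
JOIN (SELECT grp, MAX(val) AS max_val
      FROM mytable
      GROUP BY grp) m
  ON m.grp = t.grp
 AND t.val = m.max_val;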
// Get data $query = "SELECT * FROM games WHERE player = '$thePlayer' ORDER BY whenPlayed"; $result = $mysqli->query($query); $data = array(); // Iterate over data, use 'whenPlayed' as a key to group your data while ($row = $result->fetch_array(MYSQLI_ASSOC)) { if (!array_key_exists($row['whenPlayed'], $data)) { $data[$row['whenPlayed']] = array(); } $data[$row['whenPlayed']][] =...
sql,sql-server,sql-server-2008-r2,group-by,sql-server-2012
Using Pivot - can look like this: SELECT [TimeStamp], [1] AS [No of records for Id=1], [2] AS [No of records for Id=2], [1]+[2] AS Total FROM dbo.YourTable PIVOT ( COUNT(ID) FOR ID IN ([1],[2]) ) pvt ORDER BY [TimeStamp] ...
Assuming that you have put all categories from a complete file into a data frame called categories.df categories <- c(1,2,3,4,5) # create data frame categories.df <- data.frame(categories) # rename column name colnames(categories.df)[colnames(categories.df)=="categories"] <- "mode" > categories.df mode 1 2 3 4 5 Below is the sample code to merge categories.df...
Here is one method: In [19]: infected = df[df['Status']=='Infected'].set_index('Address') df.loc[df['Address'].isin(infected.index),'Status'] = df['Address'].map(infected['Status']).fillna('') df Out[19]: Address Players Status 0 112 Fake St Sam Infected 1 13 Crest St Greg 2 14 Main St Steve Dead 3 112 Fake St Sam Infected 4 2 Morningwood Jill 5 7 Cotton Dr Bill Infected...
Well, I don't see the need for the group by, judging from your Expected sorted Data. I would do results = results.OrderBy(r => r.Date) .ThenBy(r=>r.TransParentType) //just add an order criterion, checking if TransType == the value that you want at the end //as order by a boolean returns false results first,...
You seem to want the cheapest product for each product name. The query is a bit complicated because the price and name information are separated. One way of writing the query is as: select p.Product_ID, ppd.Product_Name, p.price from product_description pd join products p on pd.product_id = p.product_id join (select pd2.Product_name,...
This has nothing to do with the nvarchar type. The issue is that you include the column you aggregate in the group by clause. Remove [FOC NET AMOUNT] from the group by clause. It shouldn't be there as it is being used in an aggregate function and as such isn't...
postgresql,count,group-by,sql-order-by
Sometimes, it is easier to use subqueries: select p.*, (select count(*) from scores s where p.id_players in (s.winner, s.loser) ) as GamesPlayed, (select count(*) from scores s where p.id_players in (s.winner) ) as GamesWon from players p order by GamesWon desc; If the maximum of the round is the number...
sql,database,group-by,partition-by
I think you want this: select i.* from info i where type = 'B' union all select i.* from info i where not exists (select 1 from info i2 where i2.name = i.name and i2.type = 'B'); ...
python,pandas,count,group-by,pivot-table
You could use pd.crosstab() In [27]: df Out[27]: Col X Col Y 0 class 1 cat 1 1 class 2 cat 1 2 class 3 cat 2 3 class 2 cat 3 In [28]: pd.crosstab(df['Col X'], df['Col Y']) Out[28]: Col Y cat 1 cat 2 cat 3 Col X class...
sql,sql-server,ms-access,group-by
As long as the table doesn't contain multiple entries for Company, Team, Balance, your SQL should work just fine. But given the issue you describe, I presume there are more values than what is shown, which can cause more rows with the same information to be shown more than once, which...
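If that is indeed the cause, a sketch of the usual fix is to select and group only the columns you actually want to show (the table name here is a placeholder):

-- collapses the duplicate rows introduced by the extra, unselected columns
SELECT Company, Team, Balance
FROM Accounts
GROUP BY Company, Team, Balance;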
sql,sql-server-2008,group-by,common-table-expression,for-xml-path
You can use FOR XML PATH like this for concatenation ;with cte as ( select Data , SUBSTRING(Data,1,1) as Chars,1 as startpos from @t union all select Data, SUBSTRING(Data, startpos+1,1) as char,startpos+1 from cte where startpos+1<=LEN(data) ), CTE2 AS ( select Data,Chars,Cast(Chars as varchar(1)) + ' appears (' + cast(COUNT(*)...
sql-server,tsql,sql-server-2008-r2,group-by
You can use the nullif function, as in nullif(Birthday,'1/1/1900'), to return NULL when the date is equal to 1/1/1900, and use that to your advantage. This query can get you started to see all the records with their possible matches: select p1.person_id from core_person p1 join core_person p2 on p1.person_id <> p2.person_id and LEFT(p1.first_name,5)...
sql,regex,postgresql,group-by,order
Not sure if this would be faster than your solution: select f.* from fruits f join orders o on string_to_array(f.name, ' ') @> string_to_array(o.name, ' ') and cardinality(string_to_array(f.name, ' ')) = cardinality(string_to_array(o.name, ' ')); The idea is to split both values into array and check if they overlap. But because...
You can use conditional aggregation: SELECT game_id, COUNT(*) AS roundsCount, COUNT(CASE WHEN winner = player1_id THEN 1 END) As p1WinsCount FROM games AS g INNER JOIN rounds AS r ON g.Id = r.game_id GROUP BY game_id Demo here...
python,loops,dictionary,group-by
You consume the group iterator in the first sum; call list on the group, group = list(group), to store the contents in a list so you can use them twice: for key, group in groupby(sorted(products['product'], key=grouper), grouper): temp_dict = dict(zip(["id", "stock_id", "lot_id"], key)) group = list(group) temp_dict["qty"] = sum(item["qty"] for item in...
You can try this to see whether it works out. I assume if the clip has been triggered, then NaN will be put. You can replace it by your customized choice. import pandas as pd import numpy as np # use np.where(criterion, x, y) to do a vectorized statement like...
reporting-services,group-by,border,hidden,row-number
For your TOP Border Style, use the expression: =IIF(Fields!TourDate.Value = PREVIOUS(Fields!TourDate.Value), "None", "Solid") This just checks to see if your TourDate field is different from the Previous row. The default would need to be solid. You can use it with the border color (black) instead if you want the grey...
mysql,sql,group-by,case,amazon-redshift
Aggregate the table in the from clause to get the limits you want. Join those results back to your query and use those values for the query: select substring(fj.fruit, 5, 5), sum(fj.value <= fmm.minv + (fmm.maxv - fmm.minv) * 0.1) from fruit_juice fj join (select substring(fruit, 5, 5) as fruit5,...
Try to SUM/AVG (depending on what you need) PM.MTM * PG.HOURS AS ADJ_MTM and (PG.UNITS / (PM.MTM * PG.HOURS)) AS PERC_STANDARD, not group by them: SELECT FD.YEAR, FD.WEEK, PG.DT, PG.CAT, SUM(PG.UNITS) AS UNITS, SUM(PG.HOURS) AS HOURS, PM.MTM, SUM(PM.MTM * PG.HOURS )AS ADJ_MTM, SUM((PG.UNITS / (PM.MTM * PG.HOURS))) AS PERC_STANDARD, SUM(CASE...
select word, id, count(*) from your_table group by word, id ...
You can use a correlated subquery: SELECT id, name, (SELECT COUNT(*) FROM mytable AS t1 WHERE t1.name = t2.name) AS cnt FROM mytable AS t2 Demo here ...
sql-server,select,group-by,inner-join,distinct
You can use ROW_NUMBER with a PARTITION BY clause to identify duplicates: ;WITH CTE AS ( SELECT ROW_NUMBER() OVER (PARTITION BY ITEMNUMBER ORDER BY ROWNUMBER DESC) AS rn, INVENTABLE.ITEMNUMBER, INVENTABLE.ITEMNAME1, INVENTABLE.ITEMNAME2, INVENTABLE.W_TILBUD, INVENTABLE.COSTPRICE, INVENTABLE.VENDITEMNUMBER, INVENTABLE.A_PRODUCENT, INVENTABLE.GROUP_, INVENTABLE.A_GROSSISTLAGER, INVENTABLE.SupplementaryUnits FROM INVENTRANS INNER JOIN INVENTABLE ON INVENTABLE.ITEMNUMBER=INVENTRANS.ITEMNUMBER WHERE INVENTRANS.ACCOUNT='xxx' AND...
Skip the GROUP BY, return a row if no other row with same Employee_Number and Cap_Id but a later date exists! SELECT Employee_Number, Cap_Id, Score, Date_Added FROM Scores s1 WHERE NOT EXISTS (select 1 from Scores s2 where s1.Employee_Number = s2.Employee_Number and s1.Cap_Id = s2.Cap_Id and s1.Date_Added < s2.Date_Added) Will...
sorting,elasticsearch,group-by,order
Edit to reflect clarification in comments: To sort an aggregation by string value use an intrinsic sort, however sorting on non numeric metric aggregations is not currently supported. "aggs" : { "order_by_title" : { "terms" : { "field" : "title", "order": { "_term" : "asc" } } } } ...
algorithm,scala,group-by,apache-spark,filtering
This kind of thing is a lot easier if you convert the original RDD to a DataFrame: val df = sc.parallelize( Array((1,30),(2,10),(3,20),(1,10), (2,30)) ).toDF("books","readers") Once you do that, just do a self-join on the DataFrame to make book pairs, then count how many readers have read each book pair: val...
sql,sqlite,group-by,sql-order-by,greatest-n-per-group
You could look up the three most recent dates for each ID: SELECT ID, Date, Value FROM MyTable WHERE Date IN (SELECT Date FROM MyTable AS T2 WHERE T2.ID = MyTable.ID ORDER BY Date DESC LIMIT 3) Alternatively, look up the third most recent date for each ID, and use...
Here is another option: combs <- combn(unique(books), 2)# Generate combos of books setkey(bt, books) both.read <-bt[ # Cartesian join all combos to our data data.table(books=c(combs), combo.id=c(col(combs))), allow.cartesian=T ][, .( # For each combo, figure out how many readers show up twice, meaning they've read both books read.both=sum(duplicated(readers)), book1=min(books), book2=max(books) ),...
Try using MIN(): SELECT ... MIN(s.IsArchived) as IsArchived, ... FROM ... Assuming that s.IsArchived is always either 0 or 1, it will return 0 if at least one of the rows contains the value 0 in this column, otherwise 1....
mongodb,group-by,aggregate-functions
Use the following aggregation pipeline to get the desired results: db.collection.aggregate([ { "$sort": { "updatedAt": -1 } }, { "$group": { "_id": { "to": "$to", "from": "$from" }, "id": { "$first": "$_id" }, "message": { "$first": "$message" }, "createdAt": { "$first": "$createdAt" }, "updatedAt": { "$first": "$updatedAt" } } },...
Your code groups by the A values and then, for each such value, groups the entire dataframe again by B; that's why you're getting too many combinations. To do what you want, your double loop should group by the B values only on the result of the first groupby: for k1, gp1 in...
The good news is that your query is perfectly legal (and straightforward!) ANSI SQL that will work on any sensible database. The bad news is that MS Access, which you're using, does not support this syntax. It can be worked around with a subquery, though: SELECT t.date, COUNT(*) FROM...
With help from this question and its answers: SELECT gid, capt, row_number() OVER (PARTITION BY capt ORDER BY gid) AS rnum FROM your_table_here ORDER BY gid; The row_number window function provides the count. The PARTITION BY statement in the OVER clause tells the database to restart its numbering with each...
GROUP BY implies distinct values based on subscription.id, so if you take out the GROUP BY you will probably get something like 8,8,8,8,8,7,7,7,7,7 due to the joins and such. With the GROUP BY you only get the distinct values 8 and 7. When you do sum with the...
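A small illustration of that fan-out effect, as a sketch only (the subscription and payment tables and columns are assumptions, not from the question):

-- without GROUP BY: each subscription id repeats once per joined row (8,8,8,...,7,7,...)
SELECT s.id
FROM subscription s
JOIN payment p ON p.subscription_id = s.id;

-- with GROUP BY: one row per distinct id, and SUM() runs over each group's joined rows
SELECT s.id, SUM(p.amount) AS total_paid
FROM subscription s
JOIN payment p ON p.subscription_id = s.id
GROUP BY s.id;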
sql,sql-server,group-by,subquery,aggregate-functions
You can use ROW_NUMBER(): SELECT id, subject, content, moreContent, modified FROM ( SELECT id, subject, content, moreContent, modified, ROW_NUMBER() OVER (PARTITION BY subject ORDER BY modified DESC) AS rn FROM [CareManagement].[dbo].[Careplans] ) t WHERE rn = 1 rn = 1 will return each record having the latest modified date per...
You need to include the month in the second join condition. SELECT * FROM product_data A LEFT JOIN (SELECT prod_id, sale_date, sum(sales) FROM sales_data group by prod_id, MONTH(sale_date) ) AS B ON A.prod_id=B.prod_id RIGHT JOIN (SELECT prod_id, exp_date, media_expenditure --don't need to SUM since it's only 1 row per month...
r,group-by,datediff,date-difference
You can try library(dplyr) df1 %>% group_by(Customer, Item, Zip) %>% filter(n()>1) %>% summarise(AvgDays=mean(diff(Date)),TotOrd= n(), TotAmt=sum(NetSales)) # Customer Item Zip AvgDays TotOrd TotAmt #1 ABC123 GHTH123 76137 14 2 2700 #2 XYZ999 ZZZZZZZ 68106 59 2 550 #3 XYZ999 YYYYYYY 68106 60 3 1250 Or library(data.table) setDT(df1)[, if(.N>1) list(AvgDays= mean(c(diff(Date))), TotOrd=.N,...
If condition you used in your query works for your data then you can use this query: select t2.class, sum(case when regno like 'ABC%13-14%' then 1 else 0 end) c2013, sum(case when regno like 'ABC%14-15%' then 1 else 0 end) c2014 from table_1 t1 join table_2 t2 on t1.class_id =...
c#,arrays,string,linq,group-by
The problem is due to grouping by new []; it should be simply new. Also, you have to specify field names for the anonymous type used in the group, like: IEnumerable<string[]> query = from row in data where row[0] == "1000200034" group row by new { FirstKey = row[0], SecondKey =...
Using get() and attach() isn't really consistent with dplyr because it's really messing up the environments in which the functions are evaluated. It would be better to use the standard-evaluation equivalent of mutate here, as described in the NSE vignette (vignette("nse", package="dplyr")): for(i in spec){ output<-iris %>% group_by(Size)%>% mutate_(.dots=list(out=lazyeval::interp(~mean(x), x=as.name(i)))) #...
You have: SUM(C.RATE * (COUNT(*))) AS TRANSIT_CHARGE Nesting aggregation functions (generally) doesn't work. You probably just want: SUM(C.RATE) AS TRANSIT_CHARGE This adds up all the rates on the rows generated before the group by....
To filter out some rows, we need the 'filter' function instead of 'apply'. by = df.groupby(['Symbol', 'Date', 'Strike']) # this is used as filter function, returns a boolean type selector. # pandas.groupby.filter() function would be smart enough to keep all those # entry with True def equal_to_45(group): # return True...
ID is unique, so GROUP BY ID works just like a plain select. The createdAt column is not unique, so rows with the same createdAt value must be grouped. You should specify how they will be grouped - use an aggregate function, remove them from the select clause, or add them to...
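A minimal sketch of the aggregate-function option, assuming a hypothetical posts table grouped by createdAt:

-- every column that is not in the GROUP BY gets an aggregate,
-- so each createdAt value yields exactly one row
SELECT createdAt, COUNT(*) AS row_count, MAX(id) AS latest_id
FROM posts
GROUP BY createdAt;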
c#,linq,group-by,many-to-many,left-join
Getting the list of all the projects with the assigned users: var projects = from p in db.Projects select new{ project_Id = p.Project_Id, projectName = p.ProjectName, userList = p.Projects_Users.Select(pu=>pu.Users.UserName).ToList()}; ...
Fix: adjust the dynamic SQL string to add a GROUP BY on the fields ItemNumber, Item_Name, as in the static query below: SELECT ItemNumber,Item_Name,sum(Qunatity) as Qunatity1 ,sum(FOCQty) as FOCQty1 from toCSV5 where ItemNumber = 'I1' group by ItemNumber,Item_Name ; output +------------+-----------+-----------+---------+ | ItemNumber | Item_Name | Qunatity1 | FOCQty1 | +------------+-----------+-----------+---------+ | I1 | ABC |...
sql,sql-server,group-by,sql-order-by,group
One solution could be to use windowed aggregate functions: select *, case when date = MIN(date) over (partition by groupid order by groupid) then 'TRUE' else 'FALSE' end isFirst, case when date = MAX(date) over (partition by groupid order by groupid) then 'TRUE' else 'FALSE' end isLast, count(*) over (partition...
ruby,json,api,ruby-on-rails-4,group-by
you could just render back without the arcs e.g. format.json { render :json => @results[:arcs].to_json } That being said you could also just change the controller method as well to this but you will have to change how the html response handles @results: def index @publisher = Publisher.find(params[:publisher_id]) @books =...
sql,group-by,subquery,aggregate
First, filter out all brushes. This can be done in the where clause. Then you need to process each order as a bunch of records. This can be achieved with group by OrderNumber, Product_Type which breaks your table into order groups. Then you can filter these groups in the having...
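A rough sketch of that overall shape, with heavy assumptions (the Orders table, the 'Brush' value, and the HAVING condition are placeholders, since the original question isn't shown):

SELECT OrderNumber, Product_Type, COUNT(*) AS item_count
FROM Orders
WHERE Product_Type <> 'Brush'        -- 1. filter out all brushes in the WHERE clause
GROUP BY OrderNumber, Product_Type   -- 2. break the table into order groups
HAVING COUNT(*) > 1;                 -- 3. filter the groups themselves in the HAVING clause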