The regex needs some work. I've changed the regex based on your requirement, i.e. !, _ and % need to be replaced by a space. %let st1 = a b c; %let st2 = a_b_c; %let st3 = %nrstr(a%b%c); %let st4 = a!!b!!c; %put %sysfunc(prxchange(s/[\_\!\%]/ /,-1,%bquote(&st2))); %put %sysfunc(prxchange(s/[\_\!\%]/ /,-1,%bquote(&st3))); %put %sysfunc(prxchange(s/[\_\!\%]/ /,-1,%bquote(&st4)));...
You could sum up the counts for an overall total and put it into a macro variable: proc sql ; select sum(d)+sum(i)+sum(s) into :N from input ;quit ; You can then reference this in your code: data out.calculate_age; set out.calculate_age ; if age = "" then age = "All"; if...
When no order is specified in SQL, you run the risk of the rows coming out in an order you don't want, as has happened in your case. How about adding an order variable (that you then ignore): data curr ; input ordvar currency $ ; cards ; 1 USD...
Date values in SAS (when saved correctly) are stored as an integer, namely the integer number of days since 1/1/1960. So today is 20230, for example. Then formats tell SAS what it should look like when printed neatly; and informats tell SAS how to translate neatly printed dates to this...
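As a small illustration of the storage model described above, the following sketch prints one date value raw and through two formats, then reads a printed date back with an informat. The variable names d and n are made up for the example.

```sas
data _null_;
    d = '22MAY2015'd;                      /* date literal, stored as days since 01JAN1960 */
    put d=;                                /* d=20230 */
    put d= date9.;                         /* d=22MAY2015 */
    put d= yymmdd10.;                      /* d=2015-05-22 */
    n = input('05/22/2015', mmddyy10.);    /* informat: text to date value */
    put n=;                                /* n=20230 */
run;
```

The stored value never changes; only the format applied at print time does.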
What you're doing is fine (limiting the maximum number of records taken from any table to 100), but there are a few alternatives. To avoid any execution at all, use the noexec option: proc sql noexec; select * from sashelp.class; quit; To restrict the obs from a specific dataset, you...
You can use the scan function to get the desired result. By altering the example you have in the link to fit your example: data one; input id name :$10. age PG_86xt AG_86xt IG_86xt; datalines; 1 George 10 85 90 89 2 Mary 11 99 98 91 3 John 12...
If you want to count the number of observations that are 0, you'd want to use proc tabulate or proc freq, and do a frequency count. If you have a lot of values and you just want "0/not 0", that's easy to do with a format. data have; input a...
The data step processes as an implicit loop. So when you consider your data step... data third; set first; output; set second; output; run; ...your two set statements both act as a drip feed, each providing one observation from its corresponding dataset on each iteration through the data step loop. If you wanted...
You can use the Data Step automatic variable _n_. This is the iteration count of the Data Step loop. Data want; set have; ID = _n_; run; ...
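A self-contained version of the same idea, as a sketch only: the input here is an assumed three-row subset of SASHELP.CLASS rather than the asker's data.

```sas
/* Hypothetical input: first three rows of SASHELP.CLASS */
data have;
    set sashelp.class(obs=3 keep=name age);
run;

data want;
    set have;
    ID = _n_;    /* 1, 2, 3: one data step iteration per row read */
run;

proc print data=want noobs;
run;
```

Because _n_ counts iterations rather than rows written, the IDs stay consecutive only as long as every iteration outputs exactly one row.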
As others have said, hash tables really would be better (and probably easier to manage). Still, how about this? Test data: data dummy ; input A $ B $ C $ D $ v1 v2 v3 v4 ; cards ; ab ba cf dm 1 2 3 4 ab bc...
arrays,sas,global-variables,sas-macro
What you're trying to do is basically to use a data driven programming approach to drive your macros. Good for you! However, you can't do it directly the way you are trying to. While you could use a macro array the way Yukclam9 mentions, there's an easier way. SAS doesn't...
You should use max(I) as your condition. Something like: having I=max(I); You can also do this in a single proc sql step: proc sql; select * from (select time,sum(volume) as total from have group by time) having total=max(total); quit; ...
This question is probably a bit too broad for SO, since there are probably infinite ways to solve the actual problem. But I'll give my way to address the general assignment. One way to solve this is by rearranging your managers to a more overcoming dataset, where you have every...
A few steps of preprocessing are needed as far as I can tell.... Load your data: data a1 ; input a b c ; cards ; 2 3 4 1 2 3 ;run ; data a2 ; input a b d ; cards ; 0 0.3 1 0 0.2 0...
They are identical in terms of performance. And yes, the where will only allow matching results to be loaded to the PDV. According to the documentation (http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000202951.htm): The WHERE statement selects observations before they are brought into the program data vector, making it a more efficient programming technique. The above...
Depends on the version of SAS, and what you mean by default. There's the user default - which you get when you open SAS, and there's the factory installed default which is what it comes set as in the installation. This is an easily changed option so I wouldn't worry...
A simple macro would do: %macro monthly_table(year, month); proc sql; create table try_&year.&month. as select t1.var1 as var&year.&month. from t1 left join .... on... where t1.var&year.&month. is not missing; quit; %mend monthly_table; %macro reporting(year, month); %do month = 1 %to &month; %monthly_table(&year, &month) %end; %mend reporting; In case...
There's a handful of ways to deal with this. Search for "Data driven programming", for example. I'll show two: my preferred one, and the one a lot of people would suggest. First, the popular solution would be to do a macro loop. Properly, I would write a macro with min...
Well done on having a go yourself and welcome to the site. My answer is not the most elegant, but it solves your problem. You don't need macros or _N_, but retain comes in handy. Here is my code. You retain the average, and have a running count to make sure...
You need to use a set statement to access data from another SAS dataset. data all_ssr; set work.ttest; /*Dataset containing column of values*/ df=25; p=(1-probt(abs(x),df))*2; run; Removing the put statement avoids clogging up the log....
Data l1.MD; infile 'E:\Sasfile\f1.txt' recfm=f lrecl=32767 pad; input text $char32767.; run; That would do it. RECFM=F tells SAS to have fixed line lengths (ignoring line feeds) and the other options set the line length to the maximum for a single variable (lines can be longer, but one variable is limited...
You're nearly there: proc sql ; create table v3 as select *, case when time>mean(time) then 1 else 0 end as time_group from v2; quit; ...
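The same pattern can be tried on a shipped dataset. This is only a sketch using SASHELP.CLASS; the names class_grouped and tall_group are invented for the example.

```sas
proc sql;
    create table class_grouped as
    select name,
           height,
           /* mean(height) is computed over the whole table and merged back to each row */
           case when height > mean(height) then 1 else 0 end as tall_group
    from sashelp.class;
quit;
```

PROC SQL writes a note to the log that the query requires remerging summary statistics back with the original data; that note is expected here and is what makes the single-step comparison possible.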
Since you want to loop over PROC IMPORT, you will have to use macros for that. Also, since your numbers 97, 98, 99, 00, 01, 02 are not consecutive, you will have to use a workaround. %let files=97,98,99,00,01,02; %macro loop_over; %do i=1 %to %sysfunc(countw("&files.")); PROC IMPORT OUT= WORK.data%sysfunc(scan("&files.",&i.,",")) DATAFILE= "\...\file%sysfunc(scan("&files.",&i.,",")).csv" DBMS=CSV...
Assuming this is provided as a macro variable, this is pretty easily done with a side to side merge-ahead. Certainly faster than a transpose for K much larger than the total record count, and probably faster than looping POINTs. Basically you merge the original dataset to itself, and use FIRSTOBS...
You can do the below and just add any other values of wait_time you need to change: data input ; set input ; select (wait_time) ; when ('Within the next 6 months') wait_time='<6 Months' ; when ('Between 6 months and a year') wait_time='6 to 12 Months' ; otherwise ;...
Create a dictionary table containing your standards: name label format USERID Username $20. ORDERNO Order Number 8. Join to dictionary table containing column names in your library: proc sql; create table standards2 as select d.memname, s.name, s.label, s.format from sashelp.vcolumn d inner join standards s on d.name = s.name where...
Assuming you're familiar with SQL-left join syntax, you can use the coalesce() function to achieve this. It simply returns the first non-missing value. Using @user667489's sample datasets: proc sql noprint; create table want as select coalesce(b.newname,a.name) as name from original a left join current b on b.name = a.name ;...
Macro variables do not use quotations. %macro test(var); %if &var = %str(Sub Prime) %then %do; %let var2 = Sub_Prime; %put &=var2; %end; %mend; %test(Sub%str( )Prime); You'd be better off using %str around the whole thing, though, rather than inserting the %str in just the space. %test(%str(Sub Prime)); ...
stored-procedures,sas,sas-macro
filename() is restricted in environments with OPTION NOXCMD, which is set by default for server environments. This is for security reasons (as XCMD allows shell access). You can lift the restriction by enabling OPTION XCMD, though your server admin (if this is not you) would have to enable it on...
You can use this simplified regex: /^[\s0]*11\s*$/ ...
If this is the structure of your data, I would transpose: proc transpose data=parm2 out=parmt; var _all_; run; Then reference the two columns to create all the macro variables and their corresponding values: data _null_; set parmt; call symput(_name_,col1); run; ...
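An end-to-end sketch of the two steps, assuming a one-row parameter dataset with made-up numeric columns start_year and end_year. It swaps in call symputx for call symput, which converts numeric values and strips blanks without conversion notes in the log.

```sas
/* Hypothetical one-row parameter dataset */
data parm2;
    start_year = 2010;
    end_year   = 2015;
run;

/* One row per original column: _NAME_ holds the column name, COL1 its value */
proc transpose data=parm2 out=parmt;
    var _all_;
run;

data _null_;
    set parmt;
    call symputx(_name_, col1);   /* macro variable named after the column */
run;

%put &=start_year &=end_year;
```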
proc surveyselect is the general tool of choice for random sampling in SAS. The code is very simple, I would just sample 4000 of each group, then assign a new subgroup every 2000 rows, since the data is in a random order anyway (although sorted by group). The default sampling...
This was my answer to solve the user's problem. In general, I loaded Stat1-Stat3 into an array, sorted the array with the call sortc routine and then summed it up by a temporary ID constructed from the sorted Stat1-Stat3 array. /* Loading the data into SAS dataset */ /*...
I came up with an approximate solution exploiting plot2 and the classification syntax y*x=i. IMHO (after an extensive process of RTFM and technical paper search), your original request of putting all the plots into one graph cannot be simply done, since the by statement is designed for producing DISTINCT graphs....
The "Publish to Email transformation" uses ODS HTML to generate the output so you'll get a HTML output. If you want an XLS output then there is a way. You could change the extension of the output file to xls to generate xls file from the ODS HTML. This is...
sql,performance,sas,large-data
To replace one column, a format (roughly similar to Bendy's approach) is easiest. To replace ten columns, always coming from the same row, I recommend a hash table. Around the same speed as a single format, typically. (Formats actually can be a bit slow at the 10MM rows mark, so...
Try this: %include '/saswrk/go/scripts/envsetup.sas'; filename mymail email "&emaillist" subject = "&env Records Transferred on %sysfunc(date(), yymmdd10.)"; data _null_; length id_list $ 3000; retain id_list ''; set workgo.recds_processed nobs = nobs end = eof; file mymail; if _n_ = 1 then do; put 'Number of records processed=' nobs; put 'The IDs...
DATA Out.Temp_order; SET Out.HSI_income_range; if income_range="Above Rs. 1 Crore" then dummy_column= 6; if income_range="Between Rs. 10-20 lakh" then dummy_column= 2; if income_range="Between Rs. 21-30 lakh" then dummy_column= 3 ; if income_range="Between Rs. 31-50 Lakh" then dummy_column= 4 ; if income_range="Between Rs. 51 Lakh – Rs. 1 Crore" then dummy_column= 5;...
How about you reorder your input dataset into a random order and then calculate the distance for every second observation? proc sql ; create table random as select *, ranuni(0) as randorder from have order by randorder ;quit ; data want ; set random; dist=abs(dif1(x)) ; if _n_/2=int(_n_/2) ; run;...
You can use SAS dictionary tables to find out the last modified date of the tables residing in a certain library. Library names are stored in upper case in the dictionary tables, so write the libref in upper case: proc sql; create table dataset_1 as select libname,memname, modate from dictionary.tables where libname="YOUR_LIBRARY"; quit; ...
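A runnable variant of that query against the shipped SASHELP library, as a sketch; the memname filter is only there to keep the output short.

```sas
proc sql;
    select libname,
           memname,
           modate format=datetime20.   /* last modified timestamp */
    from dictionary.tables
    where libname = 'SASHELP'          /* librefs are stored in upper case */
      and memname like 'CLASS%';
quit;
```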
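data test(rename=(MM=Month)); set test2; MM = input(Month, monyy5.); /* character Month to a numeric SAS date */ format MM monyy5.; drop Month; /* drop the original character column; MM is renamed to Month on output */ run; ...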
This should work with custom month ranges (although assumes the months are all present between min and max for each ID): Dummy data with different month ranges: data input ; do ID=1 to 2 ; do month_id=3+ID to 20-ID*2 ; parm1=int(ranuni(1)*100) ; parm2=int(ranuni(1)*100) ; output ; end ; end ;...
One way to do this would be to use dictionary.columns: proc sql; create table Attribute as select * from dictionary.columns; quit; Read through the table and check what attributes you are interested in. For your case you might be interested in the column "NAME" <- consists of the name of all...
Your dataset with character YYYYMM dates: data input ; input date $ ; cards ; 201201 ;run ; You can create a new variable as below: proc sql ; create table output as select date, input(date, yymmn6.)+14 as newdate from input ;quit ; ...
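To check what that expression produces, here is a minimal data step sketch with the same informat. YYMMN6. reads 201201 as the first day of the month, so adding 14 lands on the 15th.

```sas
data _null_;
    date = '201201';
    newdate = input(date, yymmn6.) + 14;   /* 01JAN2012 + 14 days */
    put newdate= date9.;                   /* newdate=15JAN2012 */
run;
```

In the PROC SQL version, adding format=date9. after the expression would make newdate print as a date rather than as a raw number.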
NOPRINT prevents anything from being written to the lst file. The lst file is the batch equivalent of the output window. Anything that would get written to the output window in an interactive session (because ODS LISTING is active) will be written to the lst file. I always run the...
Similar one: proc sort data = orig_data(drop = Other_field); by salesman_id day_id; run; data test (drop = total); retain salesman_id day_id; set orig_data ; by salesman_id day_id notsorted; if first.day_id then sum = total; else sum + total; if last.day_id then output; run; proc transpose data = test out =...
If you want the number of observations in a dataset, you can put this count into a macro variable, as in your case: proc sql ; select count(*) into :N from out.datafile ; quit ; You can then call this in your final...
Use variable list. data have; input (shiyas1-shiyas3) (:$1.); cards; 1 2 3 ; data want; set have; length cat_shiyas $ 100 /*large enough to hold the content*/ ; cat_shiyas=cats(of shiyas:); run; ...
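If a separator is wanted between the pieces, the same variable-list syntax works with catx. A minimal sketch with made-up values (the variable name joined is hypothetical):

```sas
data _null_;
    shiyas1 = '1';
    shiyas2 = '2';
    shiyas3 = '3';
    length joined $100;                        /* large enough to hold the content */
    joined = catx('-', of shiyas1-shiyas3);    /* strips blanks, inserts the delimiter */
    put joined=;                               /* joined=1-2-3 */
run;
```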
This becomes a one-sided test, so you use the one-tailed p-value. Please see the free SAS training course via the SAS University Home Page, which has the first statistical course, https://communities.sas.com/community/sas-analytics-u; see the training widget on the left hand side of the page. If you have a SAS Communities...
proc sql is one way to solve this kind of situation. I'll break down your original requirements with explanations in how to interpret them in sql. Since you want to group your observations on date, you can use the having clause to filter on the max date per month. data...
If the var<xx> variables are all multiples of ten, i.e. there are no other variables beginning with var, you can use the colon-operator, which acts as a wildcard, e.g. drop var: ; /* drop all variables beginning with 'var' */ Alternatively, you can dynamically generate a list of all the...
There are two informats that can read the text "06/25/2015 03:02:01" and convert it to correct SAS datetime value: ANYDTDTM and MDYAMPM
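A short sketch of both informats applied to that text. The width of 19 matches the length of the string; the variable names are invented for the example, and the ANYDTDTM result assumes the session's DATESTYLE does not force a day-month-year reading (here 25 cannot be a month, so the value is unambiguous).

```sas
data _null_;
    text = '06/25/2015 03:02:01';
    dt1 = input(text, anydtdtm19.);   /* general-purpose datetime informat */
    dt2 = input(text, mdyampm19.);    /* month/day/year with optional AM or PM */
    put dt1= datetime20. dt2= datetime20.;
run;
```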
The macro %IF statement implicitly calls the %EVAL() function. %EVAL() understands integers (whether positive or negative), but not decimal values. When %EVAL() compares two values, if one of them is a decimal it will do a CHARACTER comparison. So %IF (3.1>10) returns true. If you give %EVAL a decimal with...
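The difference can be seen side by side in a small sketch. The macro name compare_demo is made up; %SYSEVALF is the standard macro function for floating-point evaluation.

```sas
%macro compare_demo;
    /* implicit integer evaluation: 3.1 is not an integer, so the operands are compared as text */
    %if 3.1 > 10 %then %put Implicit EVAL: 3.1 > 10 is true (character comparison);
    %else %put Implicit EVAL: 3.1 > 10 is false;

    /* SYSEVALF evaluates with floating-point arithmetic */
    %if %sysevalf(3.1 > 10) %then %put SYSEVALF: 3.1 > 10 is true;
    %else %put SYSEVALF: 3.1 > 10 is false (numeric comparison);
%mend compare_demo;

%compare_demo
```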
I don't think there is a direct function for that; you would have to write logical expressions. You can use either of the two below; both take the same amount of time. One uses the mod function, the other uses the int function: data _NULL_; x=236893953323.1235433; if mod(x,1) = 0 then /**If you want to...
Macro does NOT like unnecessary quotes: %let z = %sysfunc(quantile(normal, 0.975)); ...
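A quick way to see the result is a %put after the assignment. The second call below is an optional variation: %sysfunc accepts a format as a second argument, which is handy for rounding the value. The name z3 is invented for the example.

```sas
%let z = %sysfunc(quantile(normal, 0.975));
%put &=z;                                             /* roughly 1.96 */

%let z3 = %sysfunc(quantile(normal, 0.975), 8.3);     /* same value, formatted with 8.3 */
%put &=z3;
```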
So I think you're trying to do something like the below. Note the use of the double @ symbol. That line of code will read in the input into a temporary variable called _infile_. The double @ symbol will prevent the input cursor from progressing. This lets us 'look ahead'...
To get rid of leading spaces before your string: left(trouble_maker); (the one you need) To get rid of trailing spaces after your string: trim(trouble_maker); To get rid of consecutive spaces within your string: compbl(trouble_maker); To get rid of all spaces in your string: compress(trouble_maker);...
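To compare the four functions on one value, a sketch like the following wraps each result in brackets so the remaining blanks are visible in the log. The sample string and the variable names l, t, b and c are made up.

```sas
data _null_;
    trouble_maker = '  a   b  ';
    l = '[' || left(trouble_maker)     || ']';   /* leading blanks moved to the end */
    t = '[' || trim(trouble_maker)     || ']';   /* trailing blanks removed */
    b = '[' || compbl(trouble_maker)   || ']';   /* runs of blanks reduced to one */
    c = '[' || compress(trouble_maker) || ']';   /* all blanks removed */
    put l= / t= / b= / c=;
run;
```

Note that left() only shifts the blanks to the end of the value; it does not shorten it.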
group-by,sas,date-formatting,proc-sql
Try converting OCC_DATE to text like this: put(OCC_DATE, year4.) as MyOCC_DATE I think the original code is only displaying the date in the format you assigned but it is still the actual numerical value of OCC_DATE. If you change it to character in the format you want then it should...
Basically, you have a misunderstanding of how macros work and timing. You need to compile the macro list previous to the proc report, but you can't use call execute because that actually executes code. You need to create a macro variable. Easiest way to do it is like so: proc...
I would do this in a bit different manner. You can do it in a few ways, but maybe one SQL step and one datastep would be easiest. proc sql; create table lookup_lastdate as select customer_id as start, max(transaction_Date) as label, 'LASTDATEF' as fmtname from transaction_vw group by customer_id; quit;...
Weighted averages are supported in a number of SAS PROCs. One of the more common, all-around useful ones is PROC SUMMARY: PROC SUMMARY NWAY DATA = my_data_set ; CLASS Date ; VAR Return / WEIGHT = MarketCap ; OUTPUT OUT = my_result_set MEAN (Return) = MarketReturn ; RUN; The NWAY...
Your condition is backwards. TableB has the longer column, so you want it on the left side of LIKE: PROC SQL; select i.* from tableA i where exists (select * from tableB b where b.column1 like '%' || i.column1 || '%' ) ; quit; I encourage you to use table...
It seems there is no permutation issue, you could try this: proc sort data=have; by order_id item; run; data temp; set have; by order_id; retain comb; length comb $4; comb=cats(comb,item); if last.order_id then do; output; call missing(comb); end; run; proc freq data=temp; table comb/norow nopercent nocol nocum; run; ...
proc summary allows you to control exactly which ways you want to cross the data. %macro report(year); proc import datafile="/path/to/report-&year..xls" out= salary_data dbms=csv replace ; proc summary data = salary_data; class married gender; types married gender married*gender; var sal; output out = salary_results mean(sal) = mean_salary std(sal) = std_salary; *...
Create a view that appends all the tables with the date variable and select the max date from the variable. If your tables don't have the same structure you can modify the set statement to keep only the date variable. You may want to anyways to speed up the process....
WHERE statement is allowed in the datastep, but only works on a SET statement. You can't use it on an infile-sourced data step. You can either: Use an IF statement in the data step. Use a WHERE statement in PROC DOWNLOAD The former is probably more efficient, but the latter...
This should work: data want; set have; array Bins{*} Bin:; array freq_Bin{4}; do k=1 to dim(Bins); if Bins(k) ne . then freq_Bin(Bins(k))=1; end; drop k; run; I tweaked your code somewhat, but really the only problem was that you need to check that Bins(k) isn't missing before trying to use...
To my knowledge you cannot dynamically set the size of the array at the compile time. One possibility to get this done is to use proc contents and proc sql to figure out how many threshold parameters there are in the parameters data set and then pass that information to...
sql,macros,sas,append,concatenation
Answered my own question. This does what I need: DATA TEST1; SET TEST; STRING=COMPRESS(%NRSTR("%APPEND""("||COMPRESS(YEAR)||","||COMPRESS(CONDITION)||","|| COMPRESS(QRTS)||","|| COMPRESS(SAMPLE)||")"),""""); RUN; %APPEND can then be replaced by any macro name created, so this could be very useful. ...
I am not sure what version of SAS EG you are using, but here are the screenshots for SAS EG 5.1 start building a basic filter in query wizard click on the drop down triangle for the filter value select "columns" tab select the actual column make sure "Enclose values...
CALL EXECUTE has tricky timing issues. When it invokes a macro, if that macro generates macro variables from data set variables, it's a good idea to wrap the macro call in %NRSTR(). That way call execute generates the macro call, but doesn't actually execute the macro. So try changing your...
Primarily your problem is R1C1 formulas use [ ] brackets. So, your formula is probably: =SUM(R[-67]C:R[-4]C)-R[-3]C ...
You could do this through SQL by merging your prices onto your dataset by var1/var2: proc sql ; create table output as select a.var1, a.var2, b.price from input a left join (select distinct var2, price from input where not missing(var2)) as b on (a.var1=b.var2 or a.var2=b.var2) ;quit ; ...
As Gordon says in the comments, I would use a Data Step and a RETAIN statement First, create your data set. Second, sort it in ascending order by DATE Third, use the Data Step and RETAIN to create your values. Use the BY statement and the subsetting IF to output...
I realized the file I was using was meant for Windows. When I use a Linux version of the same file it works fine. Thanks for the help
To make these match, you need two things: the seed used to generate the random number, and the formula used to generate the random number. SAS uses for rannor (and I think also for rand, but I haven't seen confirmation of this), the following algorithm (found in Pseudo-Random Numbers: Out of...
PROC RANK will do this for you pretty easily. If you know the total count of observations, it's trivial - it's slightly harder if you don't. proc rank data=sashelp.class out=class_ranks(where=(height_r>4 and weight_r>4)); ranks height_r weight_r; var height weight; run; That removes any observation that is in the 4 smallest heights...
It's always risky to say "SAS can't do that," but I'm going to go out on a limb and say "EG can't do that." While EG looks similar to old-fashioned Display Manager SAS, it's not really an Interactive SAS session. It's more like doing a series of batch submits connected...
The problem seems to be that when you have asynchronous processes running they are completely disconnected from each other. Even though you're only waiting for the fast task2 to complete before continuing: rsubmit task2 wait=no; data _null_; put "Bye World!"; run; endrsubmit; SAS needs task1 to complete as well for...
The short answer is to use the %scan function: %put %scan(&tablelist,4,%str( )); The third argument specifies that %scan should count only spaces as delimiters. Otherwise, it will also treat all of the following characters as delimiters by default: . < ( + & ! $ * ) ; ^ -...
Macro variables are not resolved inside single quotation marks. You should rewrite it as "G:\Interns\Shiyas\&cityname..csv" The double period is needed because the first one is consumed as the terminator of the macro variable reference. Check out this link...
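The three cases can be compared directly in the log. This is a sketch with a hypothetical value for cityname.

```sas
%let cityname = Chennai;   /* hypothetical value */

%put 'G:\Interns\Shiyas\&cityname..csv';   /* single quotes: the reference is left unresolved */
%put "G:\Interns\Shiyas\&cityname..csv";   /* resolves to ...\Chennai.csv */
%put "G:\Interns\Shiyas\&cityname.csv";    /* the only period is consumed: ...\Chennaicsv */
```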
Months are 0-based, so you're setting your calendar to February, not January. This should fix the issue: Calendar cal = Calendar.getInstance(); cal.set(Calendar.DAY_OF_MONTH, 1); cal.set(Calendar.MONTH, Calendar.JANUARY); // ... ...
Try this: data want; set account; array vars c:; sum=0; do i=1 to index; sum+vars(i); end; drop i; run; ...
Another option could be Proc SQL, no pre-sort needed: proc sql; create table want as select * from have group by a, b, c, d having count(*)>1; /*This is to tell SAS only to keep those dups*/ quit; ...
sas,sas-macro,enterprise-guide
You could build on the following code Step1 : Creating the dataset which contains all the 12 dates. Not sure how are you calculating the dates for all the 12 months, So I have assumed dataset All_dates contains all your dates with variables - R_act_beg, R_act_end ,name_m,name_y_act,nameR_act. This dataset contains...
EDIT-new answer that actually works: You could read your entire query into a macro variable through a data step, but you'd be limited to 32,767 total characters in the query, as that is the most a character variable would hold. I'd suggest using a data step to read your query...
SQL is certainly a fine way to do it. Here's a data step solution with a double DOW loop. You can also calculate the mean with this method in the same step if you want. Sort the data in order to use by groups. proc sort data=raw_data; by cust_id category;...
You can retain the non-missing value and apply it to any missing values, e.g. data want ; set have ; length last $10. ; retain last '' ; if not missing(PARENT) then last = PARENT ; else PARENT = last ; drop last ; run ; ...
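To try the step above, a small input dataset such as the following can be used. The values are made up; in list input a single period is read as a missing value for a character variable.

```sas
/* Hypothetical input with gaps in PARENT */
data have ;
    input PARENT $ ;
    datalines ;
A
.
.
B
.
;
run ;

data want ;
    set have ;
    length last $10 ;
    retain last '' ;                          /* keeps its value across iterations */
    if not missing(PARENT) then last = PARENT ;
    else PARENT = last ;                      /* fill from the last non-missing value */
    drop last ;
run ;

proc print data=want noobs ;
run ;
```

The printed PARENT column should read A, A, A, B, B.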
It's probably quite a bit more work than using g3d/g3grid, but you could create a scatter plot with separate series for each band of your z-axis variable, giving each series a different colour. You could use proc sgplot or proc sgscatter to do this.
Another different approach to get a multi-delimiter-containing word is to use call scan. It will tell you the position (and length, which we ignore) of the nth word (and can be forwards OR backwards, so this gives you some ability to search in the middle of the string). Implemented for...
eq is the comparison equals operator, not the assignment equals operator. = performs both roles. So, if total eq 140 then status='works'; would be perfectly legal....
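A minimal sketch showing both roles of = next to eq; the variable flag is added purely for illustration.

```sas
data _null_;
    total = 140;
    length status $10;
    if total eq 140 then status = 'works';   /* eq compares, = assigns */
    flag = (total = 140);                    /* first = assigns, second = compares */
    put status= flag=;                       /* status=works flag=1 */
run;
```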
TODAY() is a base SAS function and works in any of the "normal" SAS contexts (including PROC SQL). CURRENT_DATE() is a FedSQL function. FedSQL is new as of 9.4 and is basically a more-compatible version of SQL that includes data types found in other databases (like Hadoop, in particular) for...
If you start off by calculating the mean for each student, you can use that table to assign the macro variables with the call symput routine. Like: data height; input name $ var $ value; datalines; John test1 175 Peter test1 180 Chris test1 140 John test2 178 Peter test2...
This is because your if statement is faulty in this case: else if visits=1 or 2 then band='Low'; You are mistakenly assuming that this is effectively: if visits is 1, or visits is 2 then ... In fact, this is actually: if visits is 1, or 2 is true then...
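The effect is easy to reproduce. In this sketch the variables wrong and right are invented names holding the band from the faulty and the corrected condition; the in operator is one common way to write the intended test.

```sas
data _null_;
    length wrong right $4;
    do visits = 0 to 3;
        /* faulty: "2" alone is a non-zero value, so the condition is always true */
        if visits = 1 or 2 then wrong = 'Low';
        else wrong = 'High';

        /* intended: true only when visits is 1 or 2 */
        if visits in (1, 2) then right = 'Low';
        else right = 'High';

        put visits= wrong= right=;
    end;
run;
```

The log shows wrong=Low on every iteration, while right is Low only for visits 1 and 2.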
sas,market-basket-analysis,enterprise-miner
Can you check the highest frequency count of any product in the productid column and take 5% of that frequency count? The number you get should be the threshold transaction count up to which Enterprise Miner should be showing you results. Support percentage is based on the maximum...