I have been creating a .csv file using pandas to connect to Tableau. I would like to do it purely in Tableau if possible so I can outsource the daily updates to a coworker without having to teach him python/pandas.
File A: 1.2 mil rows per month, each row represents an interaction with a customer and it has the following columns which I merge together for a unique value (which is not perfect because sometimes there are multiple rows with different timestamps for the same interaction - I typically drop duplicates on these three fields combined (countd(Date+PersonID+CustomerID)):
Date | PersonID | CustomerID
File B: 300k rows per month. This represents if the PersonID while interacting with the CustomerID has an issue, they reach out to another Person. I also create a field of (Date+PersonID+CustomerID) to join with File A.
Date | PersonID | CustomerID | HelperID
File C: 200k rows per month. This is a subset of File C with some different information but it does not have PersonID. I normally do a (Date+CustomerID) and then lookup who the PersonID from File A was and and that information.
Date | CustomerID | HelperID
In the end I have an overall file (outerjoins) then a smaller file where there are direct matches about the same interaction where I can say this PersonID interacted with this CustomerID and needed help from this HelperID on this date about a certain issue.
How do I approach this in Tableau directly? Should I load all three files separately and Blend on the CustomerID? Should I create the unique concatenated fields on each file and then do joins? It's not perfect because there is no index or key which directly links these files 1 to 1. I am afraid of doing a left join and inflating numbers (I notice there are duplicates sometimes and I don't know how to address it when I do joins in Tableau)