I have a dataframe with 19M rows covering ~10K customers and their daily consumption over different date ranges. I resampled this data to weekly consumption, and the resulting dataframe has 2M rows. For each customer, I want to find the ranges of consecutive dates and select the longest one (the max range). Any ideas? Thank you!
How to select a range of consecutive dates of a dataframe with many users in pandas
111 views Asked by dogo At
There is 1 answer
It would be great if you could post some example code, so the replies will be more specific.
You probably want to do something like
    earliest = df.groupby('Customer_ID')['Consumption_date'].min()
to get the earliest consumption date per customer, and
    latest = df.groupby('Customer_ID')['Consumption_date'].max()
for the latest consumption date, and then take the difference
    time_span = latest - earliest
to get the time span per customer. Knowing the specific df and variable names would be great.
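Note that the min/max difference gives the overall span per customer, not the longest stretch without gaps. A minimal sketch of the consecutive-range part follows, under some assumptions: the column names `Customer_ID` and `Consumption_date` are the guesses used above, the data has one row per customer per week, and "consecutive" means exactly one week between rows. The helper name `longest_consecutive_run` and the toy data are made up for illustration. If the weekly resample left empty weeks in as NaN or zero rows, those would need to be dropped first.

```python
import pandas as pd


def longest_consecutive_run(df, id_col="Customer_ID", date_col="Consumption_date",
                            step=pd.Timedelta(weeks=1)):
    """Return the longest run of consecutive dates per customer (hypothetical helper)."""
    df = df.sort_values([id_col, date_col])
    # Gap to the same customer's previous row; NaT on each customer's first row.
    gap = df.groupby(id_col)[date_col].diff()
    # A new run starts wherever the gap is not exactly one step (NaT counts as a new run).
    df = df.assign(run=(gap != step).cumsum())
    # Summarise each run: first date, last date, number of rows in it.
    runs = (
        df.groupby([id_col, "run"])[date_col]
          .agg(start="min", end="max", n_periods="count")
          .reset_index()
    )
    # Keep the longest run per customer (the earliest one wins on ties).
    best = runs.loc[runs.groupby(id_col)["n_periods"].idxmax()]
    return best.drop(columns="run").reset_index(drop=True)


# Toy data: customer A has a gap after 3 weeks, customer B after 1 week.
toy = pd.DataFrame({
    "Customer_ID": ["A"] * 5 + ["B"] * 5,
    "Consumption_date": pd.to_datetime([
        "2024-01-07", "2024-01-14", "2024-01-21", "2024-02-04", "2024-02-11",
        "2024-01-07", "2024-01-21", "2024-01-28", "2024-02-04", "2024-02-11",
    ]),
})
result = longest_consecutive_run(toy)
print(result)
```

Everything here is vectorised groupby work rather than a Python loop over customers, which should matter at 2M rows, though it has not been timed on data of that size.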